Response Format and Response Behavior 1

Chair: Mr Jan Karem Höhne (University of Göttingen)
Coordinator 1: Dr Timo Lenzner (GESIS – Leibniz Institute for the Social Sciences)
Coordinator 2: Dr Natalja Menold (GESIS – Leibniz Institute for the Social Sciences)
Survey response scales can take many different formats, which depend on the design decisions questionnaire designers make. For instance, decisions need to be made regarding the number of answer options or the kind of verbal labels to use. Previous research has shown that the way response scales are designed affects measurement quality. Moreover, it has been shown that, within questionnaire design, the format of the response scale has the largest impact on responses and measurement quality.
Although the impact of different response scale formats on measurement quality has been studied in the literature, it is often difficult to extract best practices for questionnaire design. Conclusions from a single experiment on a set of measures, using specific variations in the format of the response scales, cannot easily be extrapolated to other surveys or measures.
Few studies to date have assessed the impact of each design feature of survey instruments across a wide range of experimental data. Following previous research, I conduct a meta-analysis of multitrait-multimethod (MTMM) experimental studies, focusing on the different types of response scales evaluated.
Compared to previous studies, this meta-analysis provides evidence based, on the one hand, on new experimental data and, on the other hand, on a larger variety of decisions about which response scale features to use.
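As a rough illustration of the kind of pooling such a meta-analysis involves, the sketch below shows a generic random-effects (DerSimonian–Laird) aggregation of study-level quality estimates. The function, variable names, and numbers are purely illustrative assumptions, not data or methods from the study itself.

```python
import numpy as np

def random_effects_pool(estimates, variances):
    """Pool study-level estimates with a DerSimonian-Laird random-effects model."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    w_fixed = 1.0 / variances                                 # inverse-variance weights
    theta_fixed = np.sum(w_fixed * estimates) / np.sum(w_fixed)
    q = np.sum(w_fixed * (estimates - theta_fixed) ** 2)      # Cochran's Q
    df = len(estimates) - 1
    c = np.sum(w_fixed) - np.sum(w_fixed ** 2) / np.sum(w_fixed)
    tau2 = max(0.0, (q - df) / c)                              # between-study variance
    w_rand = 1.0 / (variances + tau2)                          # random-effects weights
    theta = np.sum(w_rand * estimates) / np.sum(w_rand)
    se = np.sqrt(1.0 / np.sum(w_rand))
    return theta, se, tau2

# Hypothetical measurement-quality estimates (e.g., from MTMM experiments) and their variances
quality = [0.72, 0.65, 0.80, 0.58]
var = [0.004, 0.006, 0.003, 0.008]
print(random_effects_pool(quality, var))
```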
A large and ever-growing literature investigating the optimal length of rating scales for the measurement of subjective phenomena in surveys continues to perplex researchers designing questionnaires. While there is now an emerging consensus that fully labeled scales with 5 points (for unipolar constructs) or 7 points (for bipolar constructs) maximize both the validity and reliability of measurement, a number of large-scale repeated and/or longitudinal social surveys continue to make use of 11-point scales with only end-point labeling. In the European Social Survey (ESS), where 11-point scales have been used widely, three arguments have been put forward in favour of their continued use, each relating to the need to limit different sources of measurement error. The first reflects the idea that ongoing use of 11-point scales is partly a legacy of decisions made in the earliest waves and the desire to preserve the continuity of measurements over time. The second argument concerns the findings of multitrait-multimethod (MTMM) experiments testing the reliability and validity of different question formats cross-nationally, which have provided empirical support for the 11-point format over selected alternatives. The third argument concerns the challenges involved in translating scale point labels and the potential impact of scale design on measurement invariance across countries. In the present study we review existing data and evidence from the ESS pertaining to these three arguments in order to draw conclusions about the ongoing suitability of 11-point scales for future surveys, and in particular for multilingual surveys. Using data from experiments run in Rounds 1 through 7 of the ESS, we assess to what extent 11-point scales improve comparability across countries, and whether the emphasis on measurement continuity over time and measurement invariance across countries affects measurement quality.
Characteristics of response scales are important factors guiding the cognitive processes underlying the choice of a response category when answering an attitude item. While agree/disagree (A/D) questions enjoy great popularity and continue to be the standard question format for the assessment of attitudes in surveys, researchers have recently tended to recommend the use of item-specific (IS) questions. In contrast to A/D questions, the response categories of IS questions address the dimension of interest directly and thereby – as is often postulated – reduce the burden on respondents when answering survey questions. Accordingly, it is hypothesized that the measurement attributes of IS questions outperform those of A/D questions, especially with respect to different forms of response bias. However, the empirical evidence to date is mixed (Liu et al., 2015; Hanson, 2015).
This paper contributes to this discussion by comparing the measurement attributes of both question formats with respect to (1) response behavior (preference for the first answer categories appearing on the left side of the scale, or selection of the middle answer category) on five-point decremental as well as incremental rating scales, and (2) the reliability of composite scores computed for the A/D and the IS question format, using structural equation modeling as the analytical tool. The study is based on a split-ballot design with repeated measurement and random assignment of respondents to either question format. For both question formats the items correspond in content but differ in response categories. Content areas are job and achievement motivation as well as anomie and xenophobia.
Literature
Hanson, Tim (2015). Comparing agreement and item-specific response scales: results from an experiment. Social Research Practice, 1, Winter 2015, pp. 17-25.
Liu, Mingnan, Lee, Sunghee and Conrad, Frederick G. (2015). Comparing extreme response styles between agree-disagree and item-specific response scales. Public Opinion Quarterly, 79 (4), pp. 952-975.
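To make the reliability comparison described in the abstract above more concrete, the following is a minimal sketch of computing composite-score reliability (McDonald's omega) from the standardized loadings of a single-factor measurement model. The loadings, item counts, and the A/D-versus-IS contrast shown here are invented for illustration and are not results from the study.

```python
import numpy as np

def composite_reliability(loadings, error_variances):
    """McDonald's omega for a unit-weighted composite of congeneric items."""
    loadings = np.asarray(loadings, dtype=float)
    error_variances = np.asarray(error_variances, dtype=float)
    true_var = loadings.sum() ** 2                 # variance attributable to the common factor
    return true_var / (true_var + error_variances.sum())

# Hypothetical standardized loadings for the same items asked in A/D and IS format;
# with standardized loadings, each item's error variance is 1 - loading**2.
ad_loadings = np.array([0.55, 0.60, 0.58, 0.52])
is_loadings = np.array([0.68, 0.72, 0.70, 0.66])
print("omega A/D:", composite_reliability(ad_loadings, 1 - ad_loadings**2))
print("omega IS: ", composite_reliability(is_loadings, 1 - is_loadings**2))
```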
The objective of this study was to test the effects of the number of response options and verbal labelling on acquiescence responding. A split-ballot experiment with 3,009 respondents, randomly selected from a probability-based web panel of the general population of Iceland, was undertaken. Respondents were randomly assigned to one of eight experimental conditions. The independent variables were the number of response options (5, 7, 9 and 11 options) and verbal labelling (fully labelled versus numerical). Acquiescence responding was measured with a pair of logically inconsistent attitude questions. The results showed that as the number of response options increased from seven to nine, the likelihood of acquiescence responding increased. The effect of verbal labelling and the interaction between the number of response options and verbal labelling did not reach statistical significance. The results suggest that when the number of response options increases beyond seven, the task of selecting a response option becomes more cognitively demanding, causing some respondents to resort to acquiescence responding. This applies to both fully labelled and numerical scales. Thus, to decrease the risk of acquiescence responding in attitude measurements, it is recommended to use five- or seven-point rating scales.
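As a simple illustration of the measurement approach described above (flagging respondents who agree with both of two logically inconsistent attitude items), here is a minimal sketch on hypothetical data. The item wordings, recoding to agreement, and condition codes are assumptions made for the example, not the study's actual instruments or data.

```python
import pandas as pd

# Hypothetical respondent-level data: answers to two logically inconsistent attitude items,
# recoded so that True means the respondent chose a response above the scale midpoint
# (i.e., agreed). Agreeing with both contradictory statements is treated as acquiescence.
df = pd.DataFrame({
    "agrees_pro": [True, False, True, True, False],   # e.g., "X is good for society"
    "agrees_con": [True, True, False, True, False],   # e.g., "X is bad for society"
    "n_options":  [5, 7, 9, 11, 9],                   # experimental condition: scale length
    "labelling":  ["full", "numeric", "full", "numeric", "full"],
})

df["acquiescent"] = df["agrees_pro"] & df["agrees_con"]

# Share of acquiescent respondents per experimental condition (scale length x labelling)
print(df.groupby(["n_options", "labelling"])["acquiescent"].mean())
```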
Motivating respondents to repeatedly participate in survey waves and complete questionnaires is one of the key challenges for panel survey designers. Unlike other modes, online data collection allows researchers to address, and thus motivate, respondents directly while they fill in questionnaires. However, to date there are only a few studies including immediate feedback in (longitudinal) online surveys (for example, Scherpenzeel & Toepoel, 2014; Kühne & Kroh, 2016).
Providing immediate feedback by relating respondents’ answers to the outcomes of public opinion polls may keep them interested in both the current and future surveys. However, informing respondents about polling results may influence their response behavior and thus bias survey measurements.
We present the results of a series of survey experiments in which a random subsample of respondents received in-survey information on the outcomes of a public opinion poll, specifically on other people's behaviors. The experiments were part of the first online interview after respondents were recruited for the German Internet Panel (http://reforms.uni-mannheim.de/internet_panel/home/). Our key research questions are: Does feeding back polling results affect respondents' reported behavior? Are there short-term effects on respondents' survey satisfaction and long-term effects on participation in subsequent waves?
Our results show that providing polling results did not affect response distributions. However, in the subsample receiving feedback, response times were, with one exception, significantly higher, suggesting that these respondents invested more cognitive effort in answering the questions.
In addition, feedback resulted in a more positive survey evaluation. In particular, respondents receiving feedback evaluated the questionnaire as more interesting, more varied, less difficult and less personal than respondents answering the questions only. However, we did not find any effects on retention in any of the five following survey waves.
Literature
Kühne, S., & Kroh, M. (2016). Personalized Feedback in Web Surveys: Does it Affect Respondents’ Motivation and Data Quality? Social Science Computer Review. Advance online publication. doi: 10.1177/0894439316673604
Scherpenzeel, A., & Toepoel, V. (2014). Informing panel members about study results. Effects of traditional and innovative forms of feedback on participation. In M. Callegaro, R. Baker, J. Bethlehem, A. S. Göritz, J. A. Krosnick & P. J. Lavrakas (Eds.), Online Panel Research. A Data Quality Perspective (pp. 192-213). Chichester, UK: John Wiley & Sons.