Construction of Response Scales in Questionnaires 3

Convenor: Dr Natalja Menold (GESIS)
Coordinator 1: Mrs Kathrin Bogner (GESIS)
Researchers are invited to submit papers dealing with the design of response scales for questions/items measuring opinions or behaviour in surveys. Papers could address design aspects of response scales such as the number of categories, a middle category, unipolar versus bipolar scales, numerical and/or verbal labels, ascending or descending order of categories, or the scale's visual design. Of particular interest are the effects of these design aspects on respondents' answers as well as on data reliability and validity. Effects of cognitive or motivational factors could also be the focus of studies. Further topics of interest include the specifics of response scale design in different survey modes, their comparability in mixed-mode surveys, and their intercultural comparability.
In the German General Social Survey (GGSS), attitudes towards abortion are measured with a bipolar response scale that allows respondents either to disapprove or to approve of the legal use of abortion in varying circumstances. This response scale has been in use since the early 1980s. However, trends over time show that fewer and fewer respondents approve of abortion without reservation. We observe this especially for circumstances in which the health of the woman or the child is endangered. This trend has reached a point where it hinders the analysis of the responses, since there is not enough variation in the answers. A pretest showed that a 3-point response scale can solve this problem to some extent, since respondents answer in a more differentiated way. As a result, we developed a slightly adjusted 3-point response scale for the GGSS 2012, where respondents can choose between disapproving of, approving to a certain extent of, and approving entirely of abortion under different specific circumstances. In the GGSS 2012, a split-ballot experiment was conducted in which we fielded both the new version and the old bipolar version of the response scale.
In the presentation I will show results of the pretest as well as first results of the GGSS 2012, demonstrating to what extent the new, enlarged version of the response scale leads to different results than the old version, and will give a preliminary assessment of the new response scale.
In the Anglophone world, the dominant form of ideology measurement in surveys is a single liberal-conservative Likert item. This measure has many advantages: it is short and generally understandable to survey respondents. But it also comes with the disadvantages of all single-item Likert-type measures: distributional issues, the inability to separate true score from error variance, etc. The Wilson-Patterson (WP) scale, despite its popularity in psychology, has rarely been used in political science. Some recent studies that used the WP instead of the liberal-conservative Likert item received criticism for deviating from the status quo. The simplistic nature of the scale, single-word issue descriptions with trichotomous yes/no/uncertain responses, bothers political scientists. Promoting the WP is also a difficult sell, as political scientists are rarely willing to place a 20-30-item battery on a survey. The goal of our study is twofold. First, we wish to demonstrate the validity of the WP and argue that it is not only a good measure of ideology but a better one than the single liberal-conservative Likert item. Second, we wish to demonstrate that the administration of the WP is relatively quick and, given its simplistic nature, cannot be compared to other 20-30-item scales. This study presents a rigorous comparison of the two measurement techniques to better understand the costs and benefits of both approaches.
Due to the growth of cross-cultural surveys, questionnaires developed in Europe and North America are often translated and used in developing countries in Africa, Asia, and Latin America. These surveys often include rating scales to measure opinions. Although respondents in developing countries likely understand rating scales differently due to their cultural and socioeconomic contexts, there is little empirical research on how they interpret different types of rating scales. Studying how respondents in developing countries understand rating scales is important for designing culturally appropriate scales for these contexts.
To address this gap, we conducted 20 semi-structured, in-depth interviews in Ethiopia. Interviewers probed respondents about how they understood and responded to three rating scale designs: (1) Verbal scale (e.g., Completely Disagree, Somewhat Disagree, Neutral, Somewhat Agree, Completely Agree); (2) Numeric scale (1-5, with verbal labels at the anchors); (3) Branched or unfolding scale that first asked about direction (Agree, Disagree, Neutral) and then asked about extremity (Completely, Somewhat).
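The two-step logic of the third design can be sketched in code. The following is an illustrative mapping of a branched (unfolding) agree/disagree question onto a single 5-point score; the category labels and numeric codes are assumptions for illustration, not the study's actual coding scheme:

```python
# Hypothetical scoring of a branched (unfolding) agree/disagree scale:
# step 1 asks direction, step 2 asks extremity (skipped for "Neutral").
# Codes 1-5 are illustrative, not taken from the study.

def score_branched(direction, extremity=None):
    """Combine the two branched answers into one 1-5 score."""
    if direction == "Neutral":
        return 3  # no extremity follow-up is asked
    mapping = {
        ("Disagree", "Completely"): 1,
        ("Disagree", "Somewhat"): 2,
        ("Agree", "Somewhat"): 4,
        ("Agree", "Completely"): 5,
    }
    return mapping[(direction, extremity)]

print(score_branched("Agree", "Completely"))  # 5
print(score_branched("Neutral"))              # 3
```

This makes explicit why respondents may experience the design as repetitive: every non-neutral answer requires two questions to yield one scale point.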
Preliminary analysis suggests that respondents most preferred scales with verbal labels. Due to the lack of descriptive labels in numeric scales, most respondents had significant difficulties understanding those scales. Respondents viewed branched scales favorably, but thought the two-step questioning process was repetitive. Some respondents also reacted negatively to the branched scales because these questions were seen as providing fewer opportunities to give nuanced responses. Our results shed light on the cognitive processes respondents use to answer rating scale questions in developing countries, and provide guidance to questionnaire designers.
When measuring frequencies, it has to be determined what time period should be covered and how time intervals should be allocated. Besides purely methodological considerations, such as the equidistance of intervals, the content covered also has to be taken into account. Here, a trade-off between the parsimony of response scales and a meaningful time frame for the single items of a scale becomes necessary. For example, when measuring parent-child activities within the National Educational Panel Study, these activities can be expected to have different frequency distributions but should nevertheless be surveyed with the same response scale (an 8-point scale ranging from "several times a day" to "never"). As expected, the corresponding items show very different means, variances, skewness and kurtosis in a sample of 2,349 parents of 4-year-old children for activities like "reading to the child" vs. "going to a library". The paper will discuss possibilities for post hoc modification of scales by means of optimal scaling.
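The distributional statistics mentioned above can be illustrated with a minimal sketch. The response data below are invented for illustration (not the NEPS sample), assuming the 8-point coding 1 = "several times a day" ... 8 = "never": a frequent activity piles up at the low codes and a rare one at the high codes, producing opposite skew on the same scale.

```python
# Mean, variance, skewness and excess kurtosis of responses on a shared
# 8-point frequency scale, computed from raw moments (population versions).

def moments(xs):
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    sd = var ** 0.5
    skew = sum((x - mean) ** 3 for x in xs) / (n * sd ** 3)
    kurt = sum((x - mean) ** 4 for x in xs) / (n * var ** 2) - 3
    return mean, var, skew, kurt

# Invented responses: "reading to the child" (frequent) vs.
# "going to a library" (rare), both on the same 1-8 scale.
reading = [1, 1, 2, 1, 2, 3, 1, 2, 1, 2]
library = [8, 7, 8, 6, 8, 7, 8, 8, 5, 7]

for name, xs in [("reading", reading), ("library", library)]:
    m, v, s, k = moments(xs)
    print(f"{name}: mean={m:.2f} var={v:.2f} skew={s:.2f} kurt={k:.2f}")
```

The opposite signs of the two skewness values show why a single fixed scale yields very different item distributions, which is what motivates the post hoc optimal-scaling adjustment discussed in the paper.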