All time references are in CEST
Restarting the debate on unipolar vs. bipolar rating scales 2
Session Organisers | Dr Mario Callegaro (Google Cloud), Dr Yongwei Yang (Google)
Time | Wednesday 19 July, 14:00 - 15:00 |
Room | U6-05 |
With few exceptions (e.g., Höhne, Krebs, & Kühnel, 2022), scale polarity has received little attention in recent survey research on rating scales. The choice of scale polarity when writing a questionnaire “is theoretical, empirical, and practical” (Schaeffer & Dykema, 2020, p. 40). A second decision is how many scale points a unipolar or bipolar scale should have. Further decisions concern labeling (fully labeled vs. endpoint labeled) and whether numbers are attached to the scale points.
In this session we want to restart the debate on scale polarity and its effect on data quality (DeCastellarnau, 2018).
More specifically, we are looking for contributions on this topic such as:
Didactical or empirical studies aiming at clarifying the polarity nature of key constructs
Empirical studies on the data quality and/or practical utility of using bipolar vs. unipolar question and scale design
Impact of question design (e.g., balanced wording) and answer scale design choices (number of scale points, choice of labeling, scale orientation, etc.)
Understanding the “why” (e.g., through asking or observing respondents)
Cultural/language generalizability or moderators/mediators
Studies using samples other than opt-in online panels
Mode effects, if any, on visual vs. auditory presentation of the scales
Validity and reliability of the two scale formats
Systematic reviews
Meta-analytic studies
DeCastellarnau, A. (2018). A classification of response scale characteristics that affect data quality: A literature review. Quality & Quantity, 52, 1523–1559.
Höhne, J. K., Krebs, D., & Kühnel, S.-M. (2022). Measuring income (in)equality: Comparing survey questions with unipolar and bipolar scales in a probability-based online panel. Social Science Computer Review, 40, 108–123.
Schaeffer, N. C., & Dykema, J. (2020). Advances in the science of asking questions. Annual Review of Sociology, 46.
Keywords: rating scales, unipolar, bipolar
Professor Randall K. Thomas (Ipsos Public Affairs) - Presenting Author
Dr Frances M. Barlas (Ipsos Public Affairs)
Ms Megan A. Hendrich (Ipsos Public Affairs)
Designing surveys that can be completed easily on smartphones requires an increased focus on short response formats. Besides semantic labels, two alternatives are emojis and numbers. For our study, we developed emoji scales using thumbs up or down with increasing size to represent gradations (the bipolar scale ran from a large thumb down to a large thumb up; the unipolar scale had only thumbs up, from small to large), while the numeric scales used negative and positive numbers (bipolar: negative to positive numbers with a 0 midpoint; unipolar: positive numbers only, starting at 1). We conducted an online study with 10,664 non-probability respondents to compare emoji and numeric scales to semantically labeled scales. We randomly assigned respondents to label type (semantic, numeric, emoji), gradation level (e.g., 3, 4, or 5 response options), and scale polarity (bipolar vs. unipolar). Neither the emoji nor the numeric scales had semantic labels. We found that both the emoji and numeric scales produced point estimates and variances comparable to the semantically labeled scales. Across label types, unipolar scales had lower point estimates than bipolar scales, whether semantic, numeric, or emoji. The validity of the unipolar scales was also higher than that of the bipolar scales for semantic, numeric, and emoji formats, replicating prior findings obtained with semantic scales only. While the emoji and semantic scales took about the same amount of time to answer, numeric scales took less time. Overall, response formats with numeric or emoji response options and no corresponding semantic labels yielded results as reliable and valid as semantic scales. Our findings suggest that numeric and emoji scales are good options for making surveys smartphone friendly, though we discuss some caveats for each scale type. We further discuss the conceptual expansion of scale polarity to formats beyond semantic scales.
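To make the factorial design concrete, the following is a minimal illustrative sketch (not the authors' code; all function names, seeds, and values are assumptions) of how respondents could be randomly assigned to label type, number of scale points, and polarity, and how the numeric scale values described above could be generated:

```python
import random

# Illustrative sketch of the 3 (label type) x 3 (scale length) x 2 (polarity)
# random assignment described in the abstract. Names and values are assumptions.
LABEL_TYPES = ["semantic", "numeric", "emoji"]
N_POINTS = [3, 4, 5]
POLARITIES = ["unipolar", "bipolar"]

def scale_values(n_points: int, polarity: str) -> list[int]:
    """Numeric values shown for a scale of the given length and polarity."""
    if polarity == "unipolar":
        # Positive numbers only, starting at 1 (e.g., 1..5).
        return list(range(1, n_points + 1))
    # Bipolar: negative to positive; a 0 midpoint only when the length is odd.
    half = n_points // 2
    return [v for v in range(-half, half + 1) if n_points % 2 == 1 or v != 0]

def assign_condition(rng: random.Random) -> dict:
    """Randomly assign one respondent to a cell of the factorial design."""
    return {
        "label_type": rng.choice(LABEL_TYPES),
        "n_points": rng.choice(N_POINTS),
        "polarity": rng.choice(POLARITIES),
    }

if __name__ == "__main__":
    rng = random.Random(2023)
    cond = assign_condition(rng)
    print(cond, scale_values(cond["n_points"], cond["polarity"]))
```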
Professor Randall K. Thomas (Ipsos Public Affairs) - Presenting Author
Ms Megan A. Hendrich (Ipsos Public Affairs)
Dr Frances M. Barlas (Ipsos Public Affairs)
Response categories may be used differently as a result of ethnic background or country of residence. Many researchers believe that respondents from some countries or ethnic groups are less likely to use extreme response categories, while those from other countries are more likely to use them. When making comparisons between countries or ethnicities, we need to ensure that we do not confound country/ethnicity with other factors before we can attribute differences to culture rather than to other factors, such as scale polarity (bipolar vs. unipolar) and the extent of verbal labeling of response categories. In this study, we present three web-based survey experiments, two from the U.S. and one international, in which we compared scale variants (e.g., unipolar vs. bipolar), the extent of semantic anchoring (fully anchored scales give a semantic label for each response; end-anchored scales label only the extremes of the scale), and response order, examining how these choices affect conclusions and relate to differences between countries. We found significant differences in endorsement proportions for the response categories as a function of scale type. The fully anchored unipolar scale showed lower endorsement of the highest response categories across countries/ethnicities. Across the experiments, there were mean differences in the evaluations of the issues as a function of country/ethnicity. However, after controlling for familiarity with the topic and demographic factors, these differences between groups and countries were eliminated or reduced for most of the activities we examined. We also found significant differences in response patterns (extreme vs. middling) as a function of scale type, especially for end-anchored scales, regardless of scale polarity.
Dr Jon Krosnick (Stanford University) - Presenting Author
When designing rating scales for questionnaires, researchers must decide how many points to include, yet there appears to be no consensus among researchers about optimal scale length. This paper offers a theoretical perspective on the influence of scale length on measurement quality and tests the theory with two studies. Study 1 is a meta-analysis of 39 prior scale-length experiments. Study 2 is the largest experiment to date, conducted with a nationally representative panel of American adults who are experienced at completing online surveys (N = 6,055). In the experiment, respondents were randomly assigned to one of ten scale lengths (2 to 11 points), twenty opinions were measured, and the influence on several data quality measures was evaluated. In both studies, measurement quality improved as scales lengthened, up to 7 points for bipolar constructs and up to 4 points for unipolar constructs, with no substantial benefit from lengthening scales further.