All time references are in CEST
Satisficing in Self-Completion Modes: Theoretical Understanding, Assessment, Prevention and Consequences for Data Quality
Session Organisers | Dr Daniil Lebedev (GESIS - Leibniz Institute for the Social Sciences, Germany), Dr May Doušak (University of Ljubljana, Slovenia)
Time | Tuesday 18 July, 09:00 - 10:30 |
Room |
Self-completion surveys, which are increasingly preferred over face-to-face modes, present unique challenges. Rising costs, declining response rates, and interviewer effects make face-to-face surveys less viable. However, self-completion modes (web, mail, or mixed) introduce their own data-quality challenges. Without an interviewer, respondents face a higher cognitive load, which can lead to satisficing – providing suboptimal answers – especially among those with lower cognitive ability or motivation. This behaviour increases measurement error and lowers data quality.
Survey methodologists have developed various indicators to assess response quality by detecting undesired respondent behaviour, such as straightlining and acquiescence, along with paradata-based measures of behaviours like speeding, multitasking, and motivated misreporting. Questions assessing respondents' subjective enjoyment, cognitive effort, and time investment also help identify satisficing propensity. These tools can be used for detection and for prevention through immediate feedback or adaptive survey designs based on survey data, paradata, or probing questions.
This session focuses on theoretical advances in understanding satisficing and response styles in self-completion surveys, whether computer-assisted or paper-based, as well as research on survey-data-based indicators, paradata, and probing questions for assessing and preventing satisficing. Developing and implementing satisficing propensity models and tools, and evaluating satisficing's impact on data quality in self-completion modes, are also key topics. Contributions may address these areas and related research topics.
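As an illustration of the indicators mentioned above, the following minimal sketch (in Python, using hypothetical column names and an assumed speeding cut-off) flags straightlining within an item battery and speeding derived from paradata; it is a sketch of the general idea, not a recommended procedure.

```python
# Minimal sketch of two common data-quality indicators: straightlining within an
# item battery and speeding derived from paradata.
# Column names (item1..item4, page_seconds) and the cut-off are hypothetical.
import pandas as pd

# Hypothetical responses to a four-item grid plus per-page response time (paradata)
df = pd.DataFrame({
    "item1": [4, 2, 5, 3],
    "item2": [4, 3, 5, 3],
    "item3": [4, 2, 5, 3],
    "item4": [4, 4, 5, 3],
    "page_seconds": [38.0, 55.2, 9.4, 41.7],
})

battery = ["item1", "item2", "item3", "item4"]

# Straightlining: the respondent gives the identical answer to every item in the grid
df["straightlining"] = df[battery].nunique(axis=1).eq(1)

# Speeding: page time below a threshold, here half the median page time (an assumption;
# studies use various absolute or relative cut-offs)
threshold = 0.5 * df["page_seconds"].median()
df["speeding"] = df["page_seconds"] < threshold

print(df[["straightlining", "speeding"]])
```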
Keywords: satisficing, response styles, self-completion, web surveys, mail surveys, paradata, data quality indicators, probing questions, experiments, motivation, cognitive effort, cognitive ability, respondent engagement
Dr Diana Zavala-Rojas (RECSM - Pompeu Fabra University)
Mr David Moreno-Alameda (Complutense University of Madrid) - Presenting Author
Ms Hannah Schwarz (RECSM - Pompeu Fabra University)
Many surveys are currently dealing with the switch from face-to-face to self-completion modes, the European Social Survey being an important example. There is extensive literature assessing whether there are mode effects on measurement, for example comparing face-to-face and self-completion modes. Our research aims to go beyond this and assess whether measurement could be substantively altered because response styles are elicited to differing extents in face-to-face versus self-completion modes. We study six types of response styles: acquiescence, mid-point and extreme response styles, primacy and recency response styles, and straightlining. We use experimental data from the European Social Survey Round 10, available for two countries: Finland and the UK. Preliminary results show that acquiescence is lower in self-completion modes.
Dr Laura Fumagalli (ISER, University of Essex) - Presenting Author
Professor Peter Lynn (ISER, University of Essex)
Web surveys are becoming more and more popular. However, without interaction with an interviewer, answering survey questions can be a cognitively demanding task, and even participants who initially agree to take part in a survey may drop out before completing it or engage in satisficing behavior to reduce the cognitive load. Using a large online sample from the Understanding Society study, we experimentally test the impact of providing encouraging messages to respondents on the likelihood of survey completion and the occurrence of satisficing behavior. Respondents were randomly assigned to either a treatment group or a control group. Those in the treatment group received four different encouraging messages throughout various sections of the questionnaire; respondents in the control group received no messages. The first message appeared at the start of the household finance module: “We're going to ask you some important questions about money. We know it's a sensitive subject so we want to remind you that all your answers will be kept safe and anonymous.” The second message appeared at the end of the household finance module: “Thank you for all the information you've shared so far. We will keep it safe.” The third message was displayed early in the self-completion module: “We're going to ask some sensitive questions, so we just want to remind you that your responses will be anonymised, and no one will know it's you.” The final message appeared toward the end of the self-completion module: “We appreciate some of those questions may have been difficult to answer, so thank you for sharing. Now keep going - you're not too far from the end of the survey.” Preliminary results suggest that the effects depend on how the messages are implemented.
Mr Kaidar Nurumov (University of Michigan) - Presenting Author
Dr Sunghee Lee (University of Michigan)
Multi-item measurement scales with agree-disagree response categories are one of the major tools for measuring latent constructs such as attitudes, beliefs, and personality. While popular, they are prone to response styles (RS), systematic tendencies to choose certain response options on agree-disagree scales irrespective of the intended measurement constructs. Response styles may distort univariate and multivariate statistics and contribute to measurement error as well as to reduced comparability. Models stemming from Item Response Theory and Factor Analysis have been suggested to adjust for various types of response styles, such as acquiescent response style (ARS), mid-point response style (MRS), and extreme response style (ERS). However, their applications to different groups of respondents (e.g., racial/ethnic groups in cross-cultural surveys) with varying tendencies towards different types of RS have been limited.
We address this gap with a simulation study that reflects a population with multiple subgroups, where each subgroup exhibits a different level of response styles (low, medium, high). These levels, in turn, are defined by the size of the group with RS tendencies. We will apply the following adjustment methods: random intercept item factor analysis (RIIFA), item response tree (IRtree), and the multidimensional nominal response model (MNRM). With these models we correct for the separate presence of MRS, ARS, and ERS. We compare their effectiveness for individual-level response style adjustment and examine the comparability of the adjustment across MRS, ARS, and ERS. This study will be particularly valuable for survey practitioners who use statistical measurement models with secondary data that may be contaminated by response styles. More importantly, the results may help to understand the measurement error, and its comparability, caused by various response styles in a multi-group context, which is particularly useful when working with cross-cultural, self-completion surveys.
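To illustrate the kind of simulation design described here, the sketch below generates agree-disagree responses for subgroups with different shares of extreme response style; the 5-point scale, the contamination mechanism, and all parameter values are illustrative assumptions, and the adjustment models themselves (RIIFA, IRtree, MNRM) require specialised estimation software not shown here.

```python
# Minimal sketch of a response-style simulation: a population with subgroups whose
# members exhibit an extreme response style (ERS) at different rates.
# All parameter values are illustrative assumptions, not those used in the study.
import numpy as np

rng = np.random.default_rng(42)
n_items, n_per_group = 10, 1000

def simulate_group(ers_share):
    """Simulate 5-point agree-disagree responses; a share of respondents shows ERS."""
    # Substantive responding: draw a latent normal trait and cut it into 5 categories
    latent = rng.normal(size=(n_per_group, n_items))
    responses = np.digitize(latent, bins=[-1.5, -0.5, 0.5, 1.5]) + 1  # categories 1..5
    # ERS respondents push non-neutral answers to the scale endpoints
    ers_flag = rng.random(n_per_group) < ers_share
    responses[ers_flag] = np.where(responses[ers_flag] >= 4, 5,
                                   np.where(responses[ers_flag] <= 2, 1,
                                            responses[ers_flag]))
    return responses, ers_flag

# Low-, medium- and high-ERS subgroups, defined by the share of respondents with the style
for share in (0.05, 0.20, 0.50):
    resp, _ = simulate_group(share)
    extreme_rate = np.isin(resp, [1, 5]).mean()
    print(f"ERS share {share:.0%}: proportion of extreme answers = {extreme_rate:.2f}")
```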
Dr Marek Muszyński (Institute of Philosophy and Sociology, Polish Academy of Sciences) - Presenting Author
Careless/insufficient effort responding (C/IER) is a major concern for self-report data quality. If left uncorrected, it can significantly distort research conclusions (Arias et al., 2022; DeSimone et al., 2018; Maniaci & Rogge, 2014; Woods, 2006).
Attention checks, such as instructed response items or bogus items, have been employed to detect C/IER. Respondents must select a specific response (e.g., “Please select ‘Strongly agree’”) or avoid agreeing with illogical claims (e.g., “I have been to the moon”) to pass the check.
However, such checks’ validity is low, resulting in too many false positives (attentive participants failing the checks) and false negatives (inattentive participants passing the checks; Gummer et al., 2021). It appears that respondents fail checks for reasons other than inattentiveness, e.g., purposeful noncompliance or strategic responding (monitoring the survey for checks).
Recent research suggests that subtle attention checks, which are less conspicuous and more integrated into the survey's context, could improve validity by reducing the likelihood that participants identify them as checks (Kay & Saucier, 2023; Curran & Hauser, 2019).
This project aims to develop and validate a database of such subtle attention checks, employing the idea of frequency items generally agreed upon and infrequency items typically disagreed with (Kay, 2024). The items will be created to align with various research themes such as individual differences, political studies or market research.
Validation will occur through a series of empirical studies analyzing item endorsement rates, participant reactions in cognitive labs (think-aloud protocols), and correlations with “old” attention checks and established C/IER indicators (straightlining, person-fit, response times). Data will be collected across diverse sample pools differing in survey experience and survey motivation, including online panels, social media recruits, and university students (Shamon & Berning, 2020; Daikeler et al., 2024).
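By way of illustration, the sketch below shows how instructed response items and bogus items are commonly scored and combined with established C/IER indicators; the column names, thresholds, and combination rule are hypothetical and are not the instruments proposed in this project.

```python
# Minimal sketch of how instructed response items and bogus items are typically scored,
# together with two common C/IER indicators (straightlining and response time).
# Column names and the time cut-off are hypothetical assumptions.
import pandas as pd

df = pd.DataFrame({
    "iri":   [5, 2, 5, 5],   # instructed response item: "Please select 'Strongly agree'" (= 5)
    "bogus": [1, 4, 1, 2],   # bogus item: "I have been to the moon" (agreement is suspicious)
    "q1": [3, 3, 4, 2], "q2": [3, 3, 1, 2], "q3": [3, 3, 5, 2], "q4": [3, 3, 2, 2],
    "item_seconds_median": [2.8, 0.9, 3.5, 1.1],   # paradata: median seconds per item
})

df["fail_iri"] = df["iri"] != 5                       # did not follow the instruction
df["fail_bogus"] = df["bogus"] >= 4                   # agreed with an implausible claim
df["straightlining"] = df[["q1", "q2", "q3", "q4"]].nunique(axis=1).eq(1)
df["too_fast"] = df["item_seconds_median"] < 1.5      # illustrative threshold only

# Combining several weak signals is usually preferred over relying on a single check
df["cier_flags"] = df[["fail_iri", "fail_bogus", "straightlining", "too_fast"]].sum(axis=1)
print(df[["fail_iri", "fail_bogus", "straightlining", "too_fast", "cier_flags"]])
```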
Dr Joss Roßmann (GESIS - Leibniz Institute for the Social Sciences) - Presenting Author
Dr Sebastian Lundmark (SOM Institute, University of Gothenburg)
Professor Henning Silber (Survey Research Center, Institute for Social Research, University of Michigan)
Professor Tobias Gummer (GESIS – Leibniz Institute for the Social Sciences and University of Mannheim)
Survey research depends on respondents' cooperation and attentiveness during interviews as inattentive respondents often provide non-optimal responses due to satisficing response strategies. To address varying levels of attention, researchers increasingly implement attentiveness measures, yet there is limited experimental evidence comparing different types of attention checks, particularly regarding their failure rates and the risk of false positives (e.g., Berinsky et al., 2016; Curran & Hauser, 2019). False positives may occur when respondents deliberately disregard instructions, leading to misclassification of attentiveness.
To explore these issues, we conducted experiments in the German Internet Panel (GIP), a probability-based online survey (N=2900), and the non-probability online Swedish Citizen Panel (SCP; N=3800). Data were collected during summer and winter 2022. Respondents were randomly assigned to attentiveness measures, including instructional manipulation checks (IMC), instructed response items (IRI), bogus items, numeric counting tasks, and seriousness checks. These measures varied in complexity and effort required. The SCP study extended the GIP experiment by implementing two attentiveness measures per respondent, one early and one late in the survey.
Results show significant variation in failure rates across attentiveness measures. Despite generally low failure rates, IMC and IRI checks exhibited higher failure rates due to their difficulty, design flaws, and instances of purposeful non-compliance. The findings were consistent across the GIP and SCP.
We conclude that while many attentiveness checks effectively identify inattentive respondents, IMC and IRI checks may overestimate inattentiveness due to design challenges and respondent behavior. Future research should focus on refining attentiveness measures to balance accuracy and respondent engagement, ultimately improving the quality of data collected in web-based surveys.
Professor Randall K. Thomas (Ipsos Public Affairs) - Presenting Author
Ms Megan A. Hendrich (Ipsos Public Affairs)
Ms Jennifer Durow (Ipsos Public Affairs)
Acquiescence bias occurs when survey respondents select ‘agreeable’ responses instead of responses that more accurately reflect their views. Questions with an agreement response format are believed to be more prone to eliciting acquiescence bias than item-specific question types (cf. Krosnick & Presser, 2010). However, experiments that compare agreement and item-specific question types appear to have conflated response polarity (agreement scales are typically bipolar scales with higher means, while item-specific scales are typically unipolar scales with lower means; see Dykema et al., 2021). Our previous studies found no meaningful differences in distributions and validity between agreement and item-specific question types when controlling for scale polarity (e.g., unipolar agreement items had means similar to unipolar item-specific items). We expanded the topics examined to replicate and extend these findings. This study employed a 2 × 2 factorial design, comparing unipolar and bipolar response formats and agreement and item-specific question types using ten questions about community issues (quality of schools, convenience of medical care, diversity of restaurants, etc.). We had 5,339 respondents from a well-established probability-based panel (Ipsos’ KnowledgePanel). We randomly assigned them to one of the four conditions (bipolar agreement, unipolar agreement, bipolar item-specific, or unipolar item-specific). Respondents also completed overall evaluations of their community to evaluate the criterion-related validity of the experimental items. Response distributions and means for agreement and item-specific question types were similar when using the same polarity – bipolar formats had higher means than unipolar formats. We also found little difference in criterion-related validity between the two question types (agreement vs. item-specific). These results suggest that prior findings of higher acquiescence bias in agreement question types failed to account for the typical response patterns that occur with bipolar scales more generally.
Miss Dörte Naber (University of Granada) - Presenting Author
Dr Patricia Hadler (GESIS - Leibniz Institute for the Social Sciences)
Professor Jose-Luis Padilla (University of Granada; Mind, Brain and Behavior Research Center (CIMCYC))
Satisficing Theory has been extensively applied in survey methodology to investigate how question and respondent characteristics influence satisficing behavior and data quality. Specifically, respondent motivation has emerged as a key predictor of data quality, with survey fatigue and topic interest commonly used as proxies. However, beyond these context-specific indicators, broader motivational traits, such as Need for Cognition (NFC), may provide deeper insights into the role of motivation in response behavior. While NFC is well-studied in other fields, its impact on survey responses remains underexplored. In this study, we address a critical gap in the literature by investigating the effect of NFC on data quality, with a specific focus on question difficulty – a dimension Satisficing Theory predicts will amplify the role of motivation but has not yet been explored in this context. To address this, we investigate how NFC impacts data quality across varying levels of question difficulty at different stages of the question-and-answer process. Specifically, we analyze variations at the response stage by comparing open-ended and closed-ended questions, and at the retrieval stage by contrasting concurrent and retrospective recall tasks, as implemented in Web Probing.
We analyze data from a 2020 experimental Web Probing study using the access panel of respondi/bilendi, involving a sample of 2,184 respondents from Germany. Participants answered three specific probes on quality-of-life aspects presented in one of four experimental conditions: open-ended or closed-ended format and concurrent or retrospective design, reflecting varying levels of question difficulty. Additionally, respondent characteristics such as education and age were included. Data quality was assessed through nonresponse and speeding, enabling comparability between open-ended and closed-ended questions. Our findings contribute to a more nuanced understanding of respondent motivation and its implications for survey design and data quality.
Dr Yfke Ongena (University of Groningen) - Presenting Author
Dr Marieke Haan (University of Groningen)
Agree-disagree (AD) items are assumed to evoke more satisficing behavior than construct-specific (CS) items (i.e., items with a different response scale for each item, depending on the response dimension being evaluated). In this study we assess the effects of AD versus CS items on response patterns that are assumed to be indicators of satisficing, such as straightlining. In earlier research, straightlining has been shown to be more prevalent in surveys that are completed on a PC than on smartphones. However, effects of device use for self-completion have not been researched extensively in previous studies comparing AD and CS items. Our survey was conducted in November 2024, with 3,500 flyers distributed across a neighborhood in a large Dutch city with subsequent face-to-face recruitment by students. The flyers included a QR code and URL for survey access, and respondents were incentivized with a locally produced cake for their participation. The survey was filled out by 543 individuals (completing at least 50% of the questions), yielding a 13% response rate at the household level. A smartphone was used by 85% of the participants, whereas 15% used a PC.
Respondents were randomly assigned to four blocks of either five AD items or five CS items. Straightlining occurred more frequently in AD items than in CS items, with 18% of respondents in the AD condition showing straightlining, as opposed to 5% of respondents in the CS condition. PC respondents were more likely than smartphone respondents to straightline in battery items phrased as AD items, but this effect was not found when items were phrased as CS items. This suggests that using CS items may be particularly beneficial when the questionnaire is filled out on a computer rather than on a smartphone.
Ms Hannah Schwärzel (TU Darmstadt) - Presenting Author
Professor Marek Fuchs (TU Darmstadt)
Satisficing behavior is a threat to data quality in surveys. Web surveys offer the technical possibility to detect satisficing behavior while respondents answer the questions and to prompt them to improve their answers. Previous studies have demonstrated that prompts in the questionnaire can be used to mitigate, for example, speeding, non-differentiation, and item nonresponse. However, some respondents ignore the prompts. In this study, we ask whether this unresponsiveness to feedback particularly applies to respondents with a general tendency to satisfice. Therefore, we compare the effectiveness of interactive feedback that aims at a reduction of don’t know answers for respondents with a low, intermediate, or high tendency to satisfice. To determine this tendency, we use the satisficing scale of the Maximization Inventory (Turner et al. 2012), which provides a measure of satisficing as a personality trait.
Results of a randomized field-experimental web survey indicate that interactive feedback in reaction to initial don’t know answers reduces the prevalence of final don’t know answers. Interestingly, the mitigating effect of the interactive feedback shrinks for respondents whose personality exhibits high levels of satisficing.
In the discussion we propose that don’t know responses as an indication of satisficing response behavior are caused in part by fluctuating respondent motivation and also by varying levels of task difficulty depending on the questions posed to respondents. Using prompts has the potential to reduce the prevalence of don’t know answers due to low motivation or high question difficulty and is thus able to mitigate the negative effects of situational satisficing on data quality. By contrast, satisficing response behavior in surveys caused by a rather stable personality trait resists the influence of interactive feedback.
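As a schematic illustration of the interactive feedback mechanism studied here, the sketch below triggers a one-time prompt after an initial don’t know answer and records both the initial and the final response; the wording and function names are illustrative assumptions, not the instrument used in the survey.

```python
# Minimal sketch of interactive feedback: an initial "don't know" triggers a one-time
# prompt, and both the initial and the final answer are stored so the effect of the
# prompt can be analysed later. Wording and names are illustrative assumptions.
DONT_KNOW = "DK"

def ask_with_feedback(ask_once, prompt_text="Your answer is important to us. "
                                             "Could you give your best estimate?"):
    """Ask a question; if the first answer is 'don't know', prompt once and ask again."""
    initial = ask_once()
    final = initial
    prompted = False
    if initial == DONT_KNOW:
        prompted = True
        print(prompt_text)        # in a web survey this would be rendered as a dialog
        final = ask_once()        # the respondent may revise or repeat the DK answer
    return {"initial": initial, "final": final, "prompted": prompted}

# Usage example with scripted answers standing in for a real respondent
answers = iter([DONT_KNOW, "7"])
record = ask_with_feedback(lambda: next(answers))
print(record)   # {'initial': 'DK', 'final': '7', 'prompted': True}
```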
Ms Çağla E. Yildiz (GESIS - Leibniz Institute for Social Sciences) - Presenting Author
Professor Henning Silber (University of Michigan)
Dr Jessica Daikeler (GESIS - Leibniz Institute for Social Sciences)
Ms Fabienne Kraemer (GESIS - Leibniz Institute for Social Sciences)
Dr Evgenia Kapousouz (NORC at the University of Chicago)
Satisficing response behavior, including straightlining, can threaten the reliability and validity of survey data. Straightlining refers to selecting (nearly) identical response options across multiple items within a question, potentially compromising data quality. While straightlining is often interpreted as a sign of low-quality responses, there is a need to distinguish between plausible and implausible straightlining (see Schonlau and Toepoel, 2015; Reuning and Plutzer, 2020). We introduce a model that classifies straightlining into plausible and implausible patterns, offering a nuanced understanding of the conditions under which straightlining indicates optimized response behavior (plausible straightlining) vs. satisficing response behavior (implausible straightlining). For instance, straightlining is plausible when answering attitudinal questions with items worded in the same direction, but it becomes implausible when items are reverse-worded. This study examines how question characteristics (grid size, design, straightlining plausibility) influence straightlining behavior. For our analyses, we use the German GESIS Panel, a mixed-mode (mail and online), probability-based panel study, leveraging a change in the panel’s layout strategy in 2017 that shifted grid questions from matrix to single-item designs, offering a unique quasi-experimental set-up. Our initial multilevel regression analyses, using data from 1,917 respondents and 19 grid questions from the wave before and after the design switch, show that matrix designs are associated with higher levels of straightlining compared to single-item designs. Our preliminary analyses, based on coding by five survey methodology experts, classify 26.3% of these questions as exhibiting plausible straightlining, with the remainder showing implausible patterns. Further analyses investigate how these classifications correspond to conditions under which straightlining reflects optimized versus satisficing response behavior, offering deeper insights into the role of question characteristics. This research enhances questionnaire design and the accurate identification of low-quality responses, addressing gaps in linking question characteristics to straightlining plausibility.
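To illustrate the proposed distinction, the sketch below flags straightlining per grid and labels it as plausible or implausible depending on whether the grid contains reverse-worded items; the data, grid definitions, and reverse-coding information are hypothetical and stand in for the expert coding used in the study.

```python
# Minimal sketch: straightlining on a grid is treated as implausible when the grid
# contains reverse-worded items, and as plausible otherwise.
# Data, grid definitions, and reverse-coding information are hypothetical.
import pandas as pd

responses = pd.DataFrame({
    "g1_1": [4, 2], "g1_2": [4, 5], "g1_3": [4, 3],   # grid 1: items worded in the same direction
    "g2_1": [5, 3], "g2_2": [5, 2], "g2_3": [5, 4],   # grid 2: contains a reverse-worded item
})
grids = {
    "grid1": {"items": ["g1_1", "g1_2", "g1_3"], "has_reverse_item": False},
    "grid2": {"items": ["g2_1", "g2_2", "g2_3"], "has_reverse_item": True},
}

for name, spec in grids.items():
    straightlined = responses[spec["items"]].nunique(axis=1).eq(1)
    kind = "implausible" if spec["has_reverse_item"] else "plausible"
    print(f"{name}: {straightlined.sum()} respondent(s) straightlining ({kind} if it occurs)")
```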