ESRA 2025 Preliminary Program

All time references are in CEST

Assessing the Quality of Survey Data 2

Session Organiser: Professor Jörg Blasius (University of Bonn)
Time: Tuesday 15 July, 11:00 - 12:30
Room: Ruppert paars - 0.44

This session will provide a series of original investigations on data quality in both national and international contexts. The starting premise is that all survey data contain a mixture of substantive and methodologically-induced variation. Most current work focuses primarily on random measurement error, which is usually treated as normally distributed. However, there are many different kinds of systematic measurement error or, more precisely, many different sources of methodologically-induced variation, and all of them may have a strong influence on the “substantive” solutions. These sources include response sets and response styles, misunderstandings of questions, translation and coding errors, uneven standards among the research institutes involved in data collection (especially in cross-national research), item and unit nonresponse, and faked interviews. We consider data to be of high quality when methodologically-induced variation is low, i.e. when differences in responses can be interpreted on the basis of theoretical assumptions in the given area of research. The aim of the session is to discuss different sources of methodologically-induced variation in survey research, how to detect them, and the effects they have on substantive findings.

Keywords: Quality of data, task simplification, response styles, satisficing

Papers

Using generative AI applications to look up answers to political knowledge questions in web surveys

Professor Tobias Gummer (GESIS - Leibniz Institute for the Social Sciences) - Presenting Author
Dr Tanja Kunz (GESIS - Leibniz Institute for the Social Sciences)
Mr Oscar Martinez (GESIS - Leibniz Institute for the Social Sciences)
Mr Matthias Roth (GESIS - Leibniz Institute for the Social Sciences)

Political knowledge questions are frequently used in political science surveys, yet their validity is increasingly challenged in web survey contexts where respondents can look up answers. When respondents look up answers, measures of declarative memory become confounded with procedural memory. Previous studies have focused on detecting and deterring such lookup behavior, often assuming respondents use search engines or databases like Wikipedia. However, the rise of generative AI applications, such as ChatGPT and Microsoft Copilot, introduces new dimensions to the information search process. These tools, designed to generate human-like text, offer an alternative means of retrieving information, potentially influencing how political knowledge scores reflect procedural memory.
This study explores the evolving landscape of lookup behavior by examining respondents’ use of generative AI to find answers to political knowledge questions. Through a two-study design, we first identify respondents’ distinct strategies for using these tools (Study I), drawing on a web survey among respondents recruited from an online access panel. We then evaluate the textual information generated by AI applications under these strategies, analyzing its complexity and content (Study II): we collected texts generated by four generative AI applications, using five prompting strategies identified in Study I, for 12 political knowledge questions from renowned election studies. Our findings highlight key differences in the search processes and in the utility of the information retrieved via generative AI for answering political knowledge questions. Specifically, we address two research questions:
RQ1. How do respondents use generative AI applications to answer political knowledge questions?
RQ2. How useful is the textual information obtained when using generative AI to answer political knowledge questions, and how does it differ depending on prompting strategies?
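
As a rough illustration of the data collection described for Study II, the sketch below queries a single generative AI application with one political knowledge question under several prompting strategies. The model name, the example question, and the strategies themselves are illustrative assumptions (the abstract does not specify them), and the openai Python client is used only as one convenient interface; this is not the authors' instrument.

```python
# Illustrative sketch only: collecting generative-AI answers to a political
# knowledge question under different prompting strategies (cf. Study II).
# The model name, example question, and strategies are assumptions, not the
# ones used in the study.
from openai import OpenAI  # assumes the `openai` Python package (v1.x)

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical knowledge item and prompting strategies a respondent might use
QUESTION = "Who is the current President of the German Bundestag?"
PROMPTING_STRATEGIES = {
    "verbatim": QUESTION,                                   # paste the question as asked
    "keywords": "President German Bundestag",               # terse search-style query
    "short_answer": f"{QUESTION} Answer in one sentence.",  # constrain the output
}


def collect_texts(model: str = "gpt-4o-mini") -> dict[str, str]:
    """Query one model with each prompting strategy and return the generated texts."""
    texts = {}
    for strategy, prompt in PROMPTING_STRATEGIES.items():
        completion = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        texts[strategy] = completion.choices[0].message.content
    return texts


if __name__ == "__main__":
    for strategy, text in collect_texts().items():
        print(f"--- {strategy} ---\n{text}\n")
```

Texts gathered in this way could then be scored for length, readability, or factual content, in the spirit of the complexity and content analyses mentioned for Study II.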


Identifying Careless Survey Respondents through Machine Learning and Response Patterns

Mrs Leah Bloy (Hebrew University Business School) - Presenting Author
Dr Yehezkel Resheff (Hebrew University Business School)
Professor Avraham N. Kluger (The Hebrew University)
Dr Nechumi Malovicki-Yaffe (Tel Aviv University)

Invalid responses pose a significant risk of distorting survey data, compromising statistical inferences, and introducing errors in the conclusions drawn from surveys. Given the pivotal role of surveys in research, development, and decision-making, it is imperative to identify careless survey respondents. The existing literature comprises two primary categories of approaches: methods reliant on survey items and methods involving post-hoc analyses. The latter, which do not demand preemptive preparation, predominantly incorporate statistical techniques aimed at identifying distinct response patterns associated with careless responding. However, several inherent limitations hinder the precise identification of careless respondents. One notable challenge is the lack of consensus concerning the thresholds to use for the various measures. Furthermore, each method is designed to detect a specific response pattern associated with carelessness, which can lead to conflicting outcomes. This paper assesses the efficacy of the existing methods using a novel survey methodology encompassing responses to both meaningful scales and meaningless gibberish scales, where the latter compel respondents to answer without considering item content. Building on this approach, we propose the application of machine learning to identify careless survey respondents. Our findings underscore the efficacy of supervised machine learning combined with a unique gibberish-data methodology (GibML) as a potent method for identifying careless respondents, aligning with and outperforming other approaches in terms of effectiveness and versatility.
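
To make the GibML idea more concrete, here is a minimal sketch, under assumptions the abstract does not state, of how response-pattern features and labels derived from gibberish scales could feed a supervised classifier. The specific features, the placeholder data, and the choice of a random forest are illustrative only and do not reproduce the authors' implementation.

```python
# Illustrative sketch only: supervised detection of careless respondents from
# response-pattern features, in the spirit of the GibML approach described above.
# Features, labels, and classifier choice are assumptions, not the authors' method.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def longstring(row: np.ndarray) -> int:
    """Length of the longest run of identical consecutive answers (straightlining)."""
    best = run = 1
    for prev, curr in zip(row[:-1], row[1:]):
        run = run + 1 if curr == prev else 1
        best = max(best, run)
    return best


def response_pattern_features(items: pd.DataFrame) -> pd.DataFrame:
    """Per-respondent indicators commonly used in careless-responding research."""
    feats = pd.DataFrame(index=items.index)
    feats["longstring"] = items.apply(lambda r: longstring(r.to_numpy()), axis=1)
    feats["irv"] = items.std(axis=1)           # intra-individual response variability
    feats["mean_response"] = items.mean(axis=1)
    return feats


# Placeholder data: `items` stands for Likert-type answers to the meaningful scales;
# `careless` stands for labels derived from how respondents answered the gibberish scales.
rng = np.random.default_rng(0)
items = pd.DataFrame(rng.integers(1, 6, size=(500, 20)))
careless = pd.Series(rng.integers(0, 2, size=500))

X = response_pattern_features(items)
clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("Cross-validated AUC:", cross_val_score(clf, X, careless, cv=5, scoring="roc_auc").mean())
```

With real gibberish-derived labels (rather than the random placeholders above), the cross-validated AUC would indicate how well response-pattern features separate careless from attentive respondents.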


Does survey introduction language impact data quality?

Dr Nicholas Yeh (Internal Revenue Service) - Presenting Author
Dr Gwen Gardiner (Internal Revenue Service)
Dr Scott Leary (Internal Revenue Service)
Ms Brenda Schafer (Internal Revenue Service)

The United States Internal Revenue Service (IRS) annually administers the Individual Taxpayer Burden Survey to gather data about the time and money that individuals spend to comply with federal tax reporting requirements. Triennially, individuals who filed their return after December of the year the return was due are also surveyed. These late filers often respond at lower rates, which could cause data validity issues. The current survey protocol involves mailing an invitation to complete a web-only survey. This invitation directs recipients to a survey landing page on IRS.gov, where they read information about the purpose of the study and click a link to begin. This study focuses on testing modifications to the introductory language on the survey landing page and the online survey introduction pages. These modifications emphasize benefits to the participants, enhance readability (e.g., by making critical information more salient), and apply other best practices aimed at improving the number and quality of survey responses. Individuals were randomly assigned to receive either the modified language (experimental condition; N = 7,850) or the original language (control condition; N = 7,850). Preliminary analysis revealed that the experimental condition significantly increased response rates compared to the control condition (21%). The current analysis examines whether the experimental condition also increased measures of data quality (e.g., completeness/missing data, consistency of responses across survey items). We also explore the potential impact on the quality of the critical time and money questions.


Understanding item nonresponse patterns in cross-national self-administered web and paper surveys

Ms Victoria Salinero-Bevins (European Social Survey HQ (City St George's, University of London)) - Presenting Author
Mr Nathan Reece (European Social Survey HQ (City St George's, University of London))

As the European Social Survey transitions from collecting data through face-to-face interviews to using web and paper self-completion modes, it is expected that item nonresponse will increase in the absence of interviewers. Among the 12 countries that implemented self-completion surveys during or alongside ESS Round 10, item nonresponse was generally higher than in their most recent face-to-face survey. Within self-completion modes, completions on paper generally have more item nonresponse than on web. This paper describes in detail which parts of the ESS questionnaire have been particularly susceptible to higher item nonresponse when switching modes from face-to-face to self-completion and investigates other possible factors that may be correlated with the propensity to skip questions in self-completion.

In general, we find that item nonresponse is most prevalent for open response questions and for questions that involve complex routing instructions in the paper questionnaire. Nonresponse to open response questions tends to be higher among paper completions, although this is not universally the case. Evidence also suggests that nonresponse patterns can be sensitive to features of the layout and graphic design of the paper questionnaire. Among web completions, we investigate whether nonresponse patterns can be explained by device type and certain demographic characteristics. Nonresponse tends to be higher among respondents answering on mobile phones, and nonresponse patterns differ according to operating system, both on mobile devices and on personal computers and tablets. The findings presented will help inform how cross-national social surveys can reduce item nonresponse when transitioning from face-to-face interviewing to self-administered web and paper questionnaires.


Attention Checks in Online Access Panels: The Role of Feedback

Mr Sebastian Vogler (Leibniz Institute for Educational Trajectories) - Presenting Author
Ms Anika Bela (Leibniz Institute for Educational Trajectories (LIfBi))
Ms Jaqueline Kroh (Leibniz Institute for Educational Trajectories (LIfBi))
Ms Elisabeth Nowak (Leibniz Institute for Educational Trajectories (LIfBi))

The increasing problem of low participation rates in large-scale surveys necessitates the re-evaluation of traditional methodologies. Self-administered online surveys have emerged as a promising alternative, yet concerns about data quality, particularly in online access panels, persist. Attention checks in online surveys play a crucial role in ensuring high-quality data by identifying inattentive respondents. While the implementation of online surveys has the potential to reduce costs, enhance the composition of respondents and facilitate greater access, surveys using online access panels are frequently accompanied by significant disadvantages: the prospect of financial or other forms of compensation in exchange for completing a questionnaire may distort respondents' actual response behaviour, leading them to answer questions in an inadequate, untruthful or hasty manner simply to finish the survey with minimal effort. At the same time, high failure rates in attention checks can result in significant data exclusion, raising the need for supplementary strategies to improve attentiveness.

This study investigates the potential of feedback-based interventions as a solution, providing participants with feedback about failed attention checks and emphasizing the importance of high-quality responses. We present results from an online survey with a randomized controlled trial (RCT) on the effect of a feedback intervention on respondents' attentiveness, response behaviour, and overall data quality. We deployed an online survey (N = 600) with 48 items covering work satisfaction, career opportunities and health behaviour. An attention check is incorporated as an instructed manipulation check halfway through the questionnaire. The control group received only the attention check, whilst the treatment group received the feedback intervention followed by the same attention check again. We will present first results on the impact of feedback on the enhancement or deterioration of attentiveness, response behaviour and the overall quality of survey data.


An Examination on the Performance of Variance Estimators in International Large-Scale Assessments

Mr Umut Atasever (IEA Hamburg) - Presenting Author

The primary objective of this study is to compare the relative performance of the most widely used sampling variance estimators in international large-scale assessments (ILSAs): the Balanced Repeated Replication (BRR) method and the (paired) Jackknife Repeated Replication (JK2) method. Additionally, this study examines the impact of a Fay modification factor on BRR and JK2, assessing its effect on the precision of sampling variance estimation. A Monte Carlo simulation approach is employed, simulating a TIMSS student population with a realistic achievement score distribution, standard deviation, intra-class correlation coefficient (ICC), and background characteristics. Probability samples are repeatedly drawn following a two-stage stratified cluster sampling design, in which schools are primary sampling units (PSUs) and classes are secondary sampling units. For each sample, the population parameter of interest and its sampling variance are estimated, with the true sampling variance approximated from the variability of estimates across samples. The performance of each variance estimator is evaluated by comparing the estimated sampling variance to this approximated variance. Results suggest that JK2 and BRR without Fay modification yield the most precise variance estimates for smooth statistics such as means, with JK2 demonstrating the highest stability across conditions. Fay-modified JK2 does not significantly affect variance precision, while BRR with a Fay factor enhances performance for non-smooth statistics, particularly with Fay factors of 10% and 30%, compared to a 50% Fay factor and BRR without Fay. Under non-response conditions, JK2 Original Strata (OrgStr) underestimates variance, while BRR Original Strata (OrgStr) inflates variance estimates. The OrgStr approach retains the initial variance strata structure, even when some PSUs do not respond, rather than reconstructing strata based only on participating units. This study advances discussions on variance estimation in ILSAs, providing insights into the optimal application of resampling methods, including the trade-offs of Bootstrapping, BRR, and JK2 under varying statistical conditions.
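
For readers less familiar with the estimators being compared, the sketch below computes a sampling variance from a set of replicate estimates using the standard formulas for BRR, Fay-modified BRR, and the paired jackknife (JK2). The replicate estimates and the number of replicates are placeholders; this is not the simulation code used in the study.

```python
# Illustrative sketch only: replication-based variance estimation.
# theta_reps holds the statistic recomputed with each replicate weight set;
# theta_full is the full-sample estimate. Values below are placeholders.
import numpy as np


def brr_variance(theta_reps: np.ndarray, theta_full: float, fay: float = 0.0) -> float:
    """BRR variance: V = sum((theta_r - theta)^2) / (R * (1 - fay)^2).
    fay=0 gives classic BRR; 0 < fay < 1 gives Fay's modification."""
    R = len(theta_reps)
    return float(np.sum((theta_reps - theta_full) ** 2) / (R * (1.0 - fay) ** 2))


def jk2_variance(theta_reps: np.ndarray, theta_full: float) -> float:
    """Paired jackknife (JK2) variance: V = sum((theta_r - theta)^2)."""
    return float(np.sum((theta_reps - theta_full) ** 2))


theta_full = 500.0  # e.g. a mean achievement score
theta_reps = theta_full + np.random.default_rng(1).normal(0.0, 2.5, size=75)

print("BRR:          ", brr_variance(theta_reps, theta_full))
print("BRR, Fay 0.5: ", brr_variance(theta_reps, theta_full, fay=0.5))
print("JK2:          ", jk2_variance(theta_reps, theta_full))
```

The comparison in the paper then asks which of these estimators, applied to repeated samples drawn under the two-stage design, comes closest to the variance actually observed across those samples.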