Tuesday 14th July, 14:00 - 15:30 Room: HT-103


Using Paradata to Improve Survey Data Quality 2

Convenor: Professor Volker Stocké (University of Kassel, Germany)
Coordinator 1: Professor Jochen Mayerl (TU Kaiserslautern, Germany)
Coordinator 2: Dr Oliver Lipps (Swiss Centre of Expertise in the Social Sciences (FORS), Lausanne, Switzerland)

Session Details

“Paradata” are measures generated as a by-product of the survey data collection process. Prominent examples of paradata are data available from the sampling frame, call-record data in CATI surveys, keystroke information from CAI, timestamp files, observations of interviewer behavior, or respondents’ response latencies (see Kreuter 2013 for an overview). These data can potentially be used to enrich questionnaire responses or to provide additional information about the survey (non-)participation process. In many cases paradata are available at no (or little) additional cost, but the theoretical basis for using paradata as indicators of survey data quality is still underdeveloped. Some examples of the use of paradata are:

Paradata in fieldwork monitoring and nonresponse research: Paradata are often used in survey management. With control charts, survey practitioners can monitor fieldwork progress and interviewer performance. Paradata are also indispensable in responsive designs, where they provide real-time information about fieldwork and survey outcomes that affects costs and errors. However, their role as indicators of interviewer or fieldwork effects, as well as predictors of nonresponse, remains unclear.

Paradata to understand respondent behavior: Paradata might aid in assessing the quality of survey responses, e.g., by means of response latencies or back-tracking. Research has used paradata to identify uncertainty in respondents’ answers, for example when respondents frequently alter their answers. In this new strand of research, however, indicators might still be confounded and tap into multiple dimensions of the response process (e.g., response latencies may indicate retrieval problems and/or satisficing).
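As a minimal illustration of a latency-based quality indicator of this kind (a sketch for this session description, not drawn from any of the papers below), the following Python snippet flags potential “speeders” from item-level timestamp paradata. The field names and the 300 ms-per-word threshold are hypothetical choices for demonstration only.

```python
# Illustrative sketch: flag potential "speeders" from item-level timestamp paradata.
# Field names and the 300 ms/word threshold are hypothetical assumptions.
import pandas as pd

def flag_speeders(paradata: pd.DataFrame, words_per_item: dict,
                  ms_per_word: float = 300.0) -> pd.Series:
    """Return one boolean flag per respondent: True if the median per-word
    response time across items falls below the chosen threshold."""
    words = paradata["item_id"].map(words_per_item)          # reading burden per item
    per_word_ms = paradata["response_time_ms"] / words       # ms spent per word
    median_speed = per_word_ms.groupby(paradata["respondent_id"]).median()
    return median_speed < ms_per_word

# Example usage with toy data
paradata = pd.DataFrame({
    "respondent_id": [1, 1, 2, 2],
    "item_id": ["q1", "q2", "q1", "q2"],
    "response_time_ms": [4200, 5100, 900, 700],
})
print(flag_speeders(paradata, {"q1": 12, "q2": 15}))
```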

Paper Details

1. Predicting Response Times in Web Surveys
Mr Alexander Wenz (University of Essex)

Although survey length has been shown to affect data quality and costs, researchers are often uncertain about the length of their survey. The present study investigates the influence of question properties and respondent characteristics on item-level response times in web surveys. Using client-side response times from the GESIS Online Panel Pilot, a probability-based online panel, the study replicates the majority of findings from previous research. Beyond replication, the analysis indicates that respondents speed up across the waves of the panel survey and that questionnaire navigation paradata can explain variation in response times.
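A minimal sketch of the kind of model this abstract describes (the predictors, the log transformation, and the simulated data are illustrative assumptions, not the author's actual specification):

```python
# Sketch: OLS regression of log item-level response time on hypothetical
# question properties, respondent characteristics, and panel wave.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "resp_time_ms": rng.lognormal(mean=8.5, sigma=0.5, size=n),
    "question_length_words": rng.integers(5, 40, size=n),   # question property
    "open_ended": rng.integers(0, 2, size=n),                # question property
    "age": rng.integers(18, 80, size=n),                     # respondent characteristic
    "wave": rng.integers(1, 4, size=n),                      # panel wave 1-3
})

model = smf.ols(
    "np.log(resp_time_ms) ~ question_length_words + open_ended + age + C(wave)",
    data=df,
).fit()
print(model.summary())
```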


2. Using response latencies in measurement-error models to account for the social desirability bias in surveys
Dr Robert Neumann (Technische Universität Dresden)
Mr Hagen Von Hermanni (Technische Universität Dresden)

We propose an approach that uses response times as prior information about the distribution of errors when answers to sensitive questions are the subject of interest. By applying different prior distributions about the nature of errors made while answering sensitive items, we are able to shed light on open questions about the cognitive answering process while avoiding complex and reactive randomization strategies for investigating sensitive topics.
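As a rough, non-Bayesian illustration of the underlying idea (a point-estimate simplification, not the authors' model), the sketch below corrects the prevalence estimate for a sensitive yes/no item under different assumed false-denial rates for fast and slow responders; the latency cut-off and the error rates are hypothetical.

```python
# Toy misclassification correction: the assumed probability of a false "no"
# (denial) differs between fast and slow responders. All parameters are
# hypothetical assumptions for illustration only.
import numpy as np
import pandas as pd

def corrected_prevalence(df: pd.DataFrame,
                         fast_cutoff_ms: float = 2000.0,
                         false_no_fast: float = 0.30,
                         false_no_slow: float = 0.10) -> float:
    """Assume P(observe 'yes' | true 'yes') = 1 - false_no within each latency
    group and no false positives; then true prevalence = observed / (1 - false_no)."""
    fast = df["response_time_ms"] < fast_cutoff_ms
    estimates, weights = [], []
    for mask, false_no in [(fast, false_no_fast), (~fast, false_no_slow)]:
        observed_yes = df.loc[mask, "answer_yes"].mean()
        estimates.append(min(observed_yes / (1.0 - false_no), 1.0))
        weights.append(mask.sum())
    return float(np.average(estimates, weights=weights))

# Toy data
df = pd.DataFrame({
    "response_time_ms": [1500, 1800, 2500, 3000, 900, 4000],
    "answer_yes": [0, 1, 0, 1, 0, 1],
})
print(corrected_prevalence(df))
```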


3. Moderation Effects of Response Latencies and Meta-Judgments on Response-Effects and Attitude-Behavior Consistency
Professor Jochen Mayerl (University of Kaiserslautern)
Professor Volker Stocké (University of Kassel)

The aim of the paper is to test whether the cognitive accessibility of the target information in respondents’ memory moderates a) attitude-behavior consistency and b) the occurrence of response effects. The paper compares two alternative indicators of cognitive accessibility: self-reported response certainty and the time needed to answer the question.
A second issue addressed in this paper is how response certainties and response latencies should be transformed prior to data analysis.
Statistical analyses of data from two surveys include multiple-group comparisons and interaction effects in linear as well as logistic regression models.
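A minimal sketch of this type of moderation analysis (the variable names, the log transformation of latencies, and the simulated data are illustrative assumptions, not the authors' specification):

```python
# Sketch: attitude-behavior consistency moderated by two alternative
# accessibility indicators, estimated via interaction terms in logistic regression.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 400
df = pd.DataFrame({
    "behavior": rng.integers(0, 2, size=n),                     # reported behavior (0/1)
    "attitude": rng.normal(size=n),                              # attitude score
    "latency_ms": rng.lognormal(mean=7.5, sigma=0.6, size=n),    # response latency
    "certainty": rng.integers(1, 6, size=n),                     # self-reported certainty
})
df["log_latency"] = np.log(df["latency_ms"])                     # transform prior to analysis

# Moderation by response latency
m1 = smf.logit("behavior ~ attitude * log_latency", data=df).fit(disp=False)
# Moderation by self-reported response certainty (alternative indicator)
m2 = smf.logit("behavior ~ attitude * C(certainty)", data=df).fit(disp=False)
print(m1.params, m2.params, sep="\n")
```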


4. Non-observation bias in an address-register-based CATI/CAPI mixed mode survey
Dr Oliver Lipps (FORS, Lausanne, Switzerland)

We study bias from nonobservation in a survey with a sample drawn from an address-based population register. While the primary survey mode was the landline telephone, households with no matched telephone number received a personal visit. We distinguish bias from (telephone) undercoverage, noncontact, and noncooperation.
The strongest composition bias of the telephone sample is due to undercoverage. In the combined telephone/face-to-face sample, bias from noncooperation reduces the advantage of adding the face-to-face mode. We give recommendations on which groups of persons in the telephone sample should be prioritized for a personal visit.
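As an illustrative sketch of such a stepwise bias decomposition (the register variable, field names, and toy data are hypothetical assumptions, not the author's code), one could compare subsample means with the full-sample mean at each observation stage:

```python
# Sketch: composition bias of a register variable at successive observation
# stages (telephone coverage, contact, cooperation), measured as the difference
# between the subsample mean and the full-sample mean.
import pandas as pd

def composition_bias(sample: pd.DataFrame, frame_var: str = "age") -> pd.Series:
    full_mean = sample[frame_var].mean()
    stages = {
        "telephone coverage": sample["has_phone_number"],
        "+ contact": sample["has_phone_number"] & sample["contacted"],
        "+ cooperation": sample["has_phone_number"] & sample["contacted"] & sample["cooperated"],
    }
    return pd.Series({name: sample.loc[mask, frame_var].mean() - full_mean
                      for name, mask in stages.items()})

# Toy register sample
sample = pd.DataFrame({
    "age": [25, 34, 47, 58, 63, 71, 29, 52],
    "has_phone_number": [0, 1, 1, 1, 1, 1, 0, 1],
    "contacted": [0, 1, 0, 1, 1, 1, 0, 1],
    "cooperated": [0, 1, 0, 0, 1, 1, 0, 0],
}).astype({"has_phone_number": bool, "contacted": bool, "cooperated": bool})
print(composition_bias(sample))
```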