Using paradata to assess and improve survey data quality 3
Chair | Dr Caroline Vandenplas (KU Leuven)
Coordinator 1 | Professor Geert Loosveldt (KU Leuven)
Coordinator 2 | Dr Koen Beullens (KU Leuven)
Non-response bias can arise from variation between individuals in their likelihood of being available and willing to participate in the survey, as well as from planned or unplanned variation between countries or interviewers in how the survey is implemented. With response rates falling, the assessment of and adjustment for non-response bias are of increasing importance for survey research. Recognising the issue of non-response bias is especially relevant in the context of cross-national surveys, in which response rates may differ considerably from one country to another. Non-response bias not only produces error in country-level estimates; variation in response rates and non-response bias across countries may also jeopardise the comparability of results.
The European Working Conditions Survey (EWCS) is a cross-sectional face-to-face survey of workers in Europe designed and commissioned by Eurofound, an agency of the European Union. The survey is repeated every five years. In the sixth edition of the EWCS, conducted in 2015, paradata were collected, including contact process information and interviewer observations. In this paper we use these data to estimate the propensity to (1) be successfully contacted and (2) participate in the survey. To estimate the propensity of being contacted we apply multilevel survival models, which allow us to take characteristics of all contact attempts as well as country-specific characteristics of the contacting process into account. We use multilevel logistic regression models to estimate the propensity to participate conditional on the propensity to be successfully contacted. The estimated propensities can subsequently be used to estimate scores on key substantive variables in the survey, showing the extent to which the sample can be assumed to be biased. In addition, these models provide insight into the extent to which process characteristics are associated with response propensity and bias, providing lessons for future survey design.
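For illustration, a minimal single-level sketch of this two-step propensity estimation is given below. It assumes hypothetical paradata fields (contacted, participated, n_attempts, urban, country, ewcs_contact_paradata.csv) and uses simple logit models; the actual analysis relies on multilevel survival and multilevel logistic models.

```python
# Simplified single-level sketch of the two-step propensity approach described
# above (the paper uses multilevel survival and multilevel logistic models).
# All file and variable names here are hypothetical placeholders.
import pandas as pd
import statsmodels.formula.api as smf

paradata = pd.read_csv("ewcs_contact_paradata.csv")  # hypothetical extract

# Step 1: propensity to be successfully contacted, based on contact-attempt
# and country characteristics recorded in the paradata.
contact_model = smf.logit(
    "contacted ~ n_attempts + urban + C(country)", data=paradata
).fit()
paradata["p_contact"] = contact_model.predict(paradata)

# Step 2: propensity to participate, conditional on having been contacted.
contacted = paradata[paradata["contacted"] == 1]
coop_model = smf.logit(
    "participated ~ p_contact + urban + C(country)", data=contacted
).fit()

# Overall response propensity = P(contact) * P(participate | contact);
# comparing key substantive variables across propensity strata indicates
# the extent of potential non-response bias.
paradata["p_response"] = paradata["p_contact"] * coop_model.predict(paradata)
```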
Management of panel participation and prevention of attrition could be greatly facilitated if researchers knew sufficiently long in advance which panelists are likely to attrite. Clearly, resources could then be directed efficiently towards those individuals at high risk. These efforts are particularly relevant in probability-based online panels, which are costly to recruit but where attrition may still be high.
Statistical models usually fall short in offering strong predictive validity for panel attrition, making them of little use for effective panel management. However, such models also usually rely on a relatively limited set of data, such as socio-demographics. This is surprising given that panels generate vast amounts of content and metadata that could be exploited more effectively. Machine learning techniques are designed to deal with such amounts of data and offer the potential of improved predictive validity. This paper illustrates the enormous potential of these techniques using the example of the LISS online panel.
The LISS panel was established in the Netherlands in 2008 and has been maintained to this day. Our data (n = 9,405) comprise participation behavior for each panelist in each month of the panel (2008-2014), alongside a set of 38 background characteristics (demographics, economic information, health behavior, psychological traits, political traits, and survey attitudes).
We evaluated predictive performance using cross-validated LASSO models to predict short-term panel attrition by the second year (2009) and long-term attrition by the seventh year (2014). LASSO is a state-of-the-art method for selecting the set of predictors with the best cross-validated fit. We entered the 38 background characteristics with second-order polynomial terms as predictors and then evaluated by how much predictions improved when past participation behavior was included. The models were trained with 10-fold cross-validation on n = 8,405 panelists and tested on n = 1,000 randomly selected independent panelists.
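A minimal sketch of such a cross-validated LASSO setup is shown below, assuming a hypothetical data extract and column names (bg_* for background characteristics, part_2008* for monthly participation indicators, attrited_2009 as the outcome). For brevity the polynomial expansion is applied to the full predictor set, which simplifies the design described above.

```python
# Sketch of a cross-validated LASSO attrition model with scikit-learn;
# the file and column names are hypothetical placeholders.
import pandas as pd
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

panel = pd.read_csv("liss_panel.csv")             # hypothetical extract
background = panel.filter(like="bg_")             # 38 background characteristics
participation = panel.filter(like="part_2008")    # monthly participation, Jan-Jun 2008
y = panel["attrited_2009"]                        # short-term attrition indicator

X = pd.concat([background, participation], axis=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1000, random_state=1          # 1,000 independent test panelists
)

# Second-order polynomial expansion of the predictors, then an L1-penalised
# (LASSO) logistic regression tuned by 10-fold cross-validation.
model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    StandardScaler(),
    LogisticRegressionCV(penalty="l1", solver="saga", cv=10, max_iter=5000),
)
model.fit(X_train, y_train)

# Out-of-sample discrimination on the held-out test panelists.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"test AUC = {auc:.3f}")
```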
For predicting short-term attrition (2009), the baseline model without participation behavior performed unsatisfactorily, with an area under the ROC curve (test AUC) of .605 (sensitivity = .481, specificity = .689, classification accuracy = 70.1%). However, including only the panel participation information of January 2008 already increased the AUC to .814. Including data up to March increased the AUC to .886, and up to June to .930 (spec. = .864, sens. = .879, acc. = 88.6%). These results demonstrate that, by including past participation behavior, panel managers could accurately predict attrition 6 to 9 months in advance.
For predicting long-term attrition (2014), the model without participation information again showed insufficient fit (AUC = .641). Adding past participation behavior from 2008 increased the AUC strongly to .841 (sens. = .723, spec. = .831, acc. = 76%). Adding the behavior of the second (2009) and third (2010) year increased fit further, to AUC = .896 (acc. = 86%).
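Continuing the sketch above, the reported sensitivity, specificity and accuracy can be derived from the predicted attrition probabilities as follows; the 0.5 classification threshold is an assumption, not necessarily the cut-off used in the study.

```python
# Deriving sensitivity, specificity and accuracy from predicted probabilities;
# the 0.5 threshold is an assumption for illustration only.
from sklearn.metrics import confusion_matrix

y_pred = (model.predict_proba(X_test)[:, 1] >= 0.5).astype(int)
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

sensitivity = tp / (tp + fn)            # share of attriters correctly flagged
specificity = tn / (tn + fp)            # share of stayers correctly identified
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(sensitivity, specificity, accuracy)
```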
In conclusion, the accuracy of our predictions offers practitioners very valuable information in panel retention efforts. Moreover, our approach demonstrates the strong potential of machine learning when applied to panel (meta-)data which are commonly too large to be handled in traditional models. Stronger appreciation of this potential will improve effective panel management in the future.
The response rate is an important quality attribute of surveys (Schöneck/Voß 2013). For the quality management of a higher education institution, a high response rate in its graduate tracer studies is necessary to guarantee a sufficient sample size for analyses at the level of study programmes (Schomburg 2003). In general, there is a tendency towards decreasing response rates (Aust/Schröder 2009). Systematic non-response of certain populations can lead to a biased sample (Zillmann 2016). Process control data are of growing importance in survey research for monitoring the survey process (Kreuter et al. 2010).
The response rate is influenced by administrative measures (e.g. the use of incentives or additional contacts) (Kropf et al. 2015). Furthermore, individual characteristics of the respondents, in our case graduates of higher education institutions (HEIs), as well as characteristics of the HEIs themselves influence the response rate (Porter/Umbach 2006).
Our analyses are based on data from KOAB (German Cooperative Graduate Survey project), with about 60 participating HEIs every year. Every graduate of these HEIs (full census) is invited to answer questions about their studies and early career. The questionnaire is administered online.
In every HEI, there are practitioners who are responsible for conducting the surveys. So far, it has been unknown whether the involvement of these practitioners is related to the response rate. The process control data of the study include data about the activities of these HEI practitioners. Our assumption is that the more the HEI practitioners invest in these activities, the higher the response rate.
The project delivers data that can be used to analyse the influence of these activities on the response rate:
• Graduate database: Data on the graduates to be surveyed
• Administrative data: Data documenting the administration of the survey (e.g. type of invitation letter, use of incentives)
• Administrative paradata: Automatically produced data about the involvement of the HEI practitioners
• Other process control data: Other data about the involvement of the practitioners (e.g. contributions in the online project network, conference participation)
As a first step, we analyse which individual characteristics of the respondents influence participation in the survey. For this, we use the data of the KOAB graduate surveys from graduation years 2010 to 2014 (n = 680,000). Some of the results: Germans have a higher probability of participating than non-Germans. Furthermore, the probability of participating is lower if graduation took place earlier in the graduation year.
The results of the first step are then used to analyse the core question. To this end, we use process control data from graduation years 2007 to 2014: 444 surveys conducted at 87 different higher education institutions. We analyse whether the involvement of the HEI practitioners influences the response rate, controlling for the composition of individual characteristics at the HEI, characteristics of the HEI, and the administrative measures used by the HEI practitioners. Preliminary results show an influence of some of the involvement variables, which we will discuss in full.
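A rough sketch of these two analysis steps is given below, assuming hypothetical file and variable names (participated, german, grad_month, involvement_score, response_rate, share_german, incentives, n_contacts); the actual model specification may differ.

```python
# Sketch of the two analysis steps on the KOAB data; all file and variable
# names are hypothetical placeholders.
import pandas as pd
import statsmodels.formula.api as smf

# Step 1: individual-level determinants of participation (logit on graduates).
graduates = pd.read_csv("koab_graduates_2010_2014.csv")
step1 = smf.logit("participated ~ german + grad_month", data=graduates).fit()

# Step 2: survey-level response rates regressed on practitioner involvement,
# controlling for sample composition and administrative measures.
surveys = pd.read_csv("koab_surveys_2007_2014.csv")
step2 = smf.ols(
    "response_rate ~ involvement_score + share_german + incentives + n_contacts",
    data=surveys,
).fit()
print(step2.summary())
```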
Key performance indicators (KPIs) derived from paradata can be used to monitor the quality of data collection. This presentation will describe how KPIs can be used to monitor a split-ballot experiment, highlight continuities between our findings and paradata-related best practices, and outline a program for further research.
The presentation will focus on the use of a set of KPIs in a split-ballot experiment embedded in a CAPI survey of 1,500 adults. The experiment asked parallel scale questions of each subsample: respondents were randomized to one of two versions of the test questions. Randomization was carried out within each sampling point, with the two questionnaire versions alternated. The maximum number of interviews per sampling point was eight.
The researchers identified several KPIs to monitor during data collection: the overall number of interviews per sampling point; the number of interviews for each questionnaire version per sampling point; and the date and time of each interview within each sampling point, to confirm that questionnaire versions were alternated. In addition, there were separate variables for stratum, PSU number, sampling point, original or replacement sampled unit, interviewer ID and characteristics, supervisor ID, and length of interviews.
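As an illustration, the checks behind these KPIs could be implemented on the interview-level paradata roughly as follows; the file and column names (sampling_point, version, interview_datetime) are hypothetical.

```python
# Sketch of paradata-based KPI checks for the split-ballot experiment;
# column and file names are hypothetical placeholders.
import pandas as pd

para = pd.read_csv("capi_paradata.csv", parse_dates=["interview_datetime"])

# KPI 1: no sampling point exceeds the maximum of eight interviews.
per_point = para.groupby("sampling_point").size()
over_quota = per_point[per_point > 8]

# KPI 2: interviews per questionnaire version within each sampling point.
version_counts = (
    para.groupby(["sampling_point", "version"]).size().unstack(fill_value=0)
)

# KPI 3: within each sampling point, versions should alternate when
# interviews are ordered by date and time.
def alternates(group):
    ordered = group.sort_values("interview_datetime")["version"]
    return (ordered.shift() != ordered).iloc[1:].all()

non_alternating = (
    para.groupby("sampling_point").apply(alternates).pipe(lambda s: s[~s])
)
print(over_quota, version_counts.head(), non_alternating, sep="\n\n")
```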
Due to budget limitations, it was impossible to ensure daily or weekly monitoring of the data. Instead, the researchers identified two points during fieldwork at which to review the survey paradata. The first review took place after the completion of a total of 630 interviews across the two subsamples, and the second after the completion of 1,270 interviews. After the second review, deviations from the survey protocol were identified in about 20% of the sampling points, requiring remedial action.
The presentation will detail the data-quality checks that were conducted at each review, describe the process by which remedial action was deemed necessary, and propose a program for the monitoring of fieldwork through paradata analysis.