NEW DEVELOPMENTS IN ADAPTIVE SURVEY DESIGN
Session Organiser | Professor Barry Schouten (Statistics Netherlands and Utrecht University)
Time | Friday 2 July, 13:15 - 14:45
Adaptive survey designs (ASD) have been researched extensively over the last ten years and are now seen as standard options at various statistical institutes. This session presents papers on new developments in this area. These concern open research questions such as how to optimally stratify the target population, how to learn efficiently from historic survey data, how to optimally define stopping rules and phase capacity, how to include measurement error, how to extend ASD to sensor surveys, how to assess interviewer effects without interpenetration, and how robust ASD is to pandemics such as COVID-19.
The session is a closed session consisting of two parts, with nine papers in total from authors at four different institutions.
Keywords: Nonresponse; measurement error; tailoring; survey design
Dr Stephanie Coffey (US Census Bureau) - Presenting Author
All aspects of a survey design, from the length of the survey period, to the mode of data collection, to individual data collection features like incentives or mailings, affect both who responds to a survey and how much it costs to obtain their response. To conduct data collection successfully in a budget-conscious environment, survey design decisions require balancing data quality and costs. Recently, responsive survey designs have emerged as a way to tailor data collection features to specific subgroups or specific cases within a data collection period in order to save costs and/or improve survey outcomes. For the most part, however, responsive designs in the survey methodological literature do not incorporate actual survey response data into their decision framework. We report on a responsive design experiment in the National Survey of College Graduates that incorporates optimization as a way to minimize data collection costs in exchange for a small increase in the root mean squared error (RMSE) of a key survey estimate. We used a Bayesian framework for our optimization in order to incorporate historical data and current accumulating data to make predictions and data collection decisions during live data collection. Results demonstrate significant cost savings with no significant changes in the RMSE of a key survey estimate or the unweighted response rate.
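As a concrete picture of this kind of cost-quality trade-off, the sketch below shows, under simplifying assumptions, how a Bayesian stop/continue decision might be coded. It is not the authors' implementation; the names (posterior draws for each action, a benchmark value, cost figures, the inflation cap) are hypothetical placeholders.

import numpy as np

def choose_action(draws_continue, draws_stop, benchmark,
                  cost_continue, cost_stop, rmse_inflation_cap=1.05):
    """Pick the cheaper action whose predicted RMSE for the key estimate
    stays within a small multiple of the RMSE expected under full effort."""
    rmse = lambda draws: np.sqrt(np.mean((np.asarray(draws) - benchmark) ** 2))
    if (rmse(draws_stop) <= rmse_inflation_cap * rmse(draws_continue)
            and cost_stop < cost_continue):
        return "stop"      # accept a small RMSE increase in exchange for lower cost
    return "continue"

# Toy usage with made-up numbers, purely for illustration:
# rng = np.random.default_rng(0)
# choose_action(rng.normal(0.50, 0.010, 1000), rng.normal(0.50, 0.012, 1000),
#               benchmark=0.50, cost_continue=120_000, cost_stop=90_000)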
Dr James Wagner (University of Michigan) - Presenting Author
Surveys face difficult choices in managing cost-error tradeoffs. Stopping rules have been proposed as a method for managing these tradeoffs. A stopping rule will limit effort on cases in order to reduce costs with minimal harm to quality. Previously proposed stopping rules have focused on quality with an implicit assumption that all cases have the same cost. While this may be true for mail or web surveys, this assumption is unlikely to be true for either telephone or face-to-face surveys. We propose a new rule that looks at both predicted costs and quality. This rule is tested experimentally against another rule that focuses on stopping cases that are expected to be difficult to recruit. We test both Bayesian and non-Bayesian versions of the rule. The Bayesian version of the prediction models uses historical data to establish prior information. We find that the new rule produces higher quality estimates for about the same cost regardless of the use of prior information.
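The following minimal sketch illustrates the general idea of a stopping rule that weighs predicted quality against predicted cost; it is an assumption-laden approximation, not the paper's exact rule, and the fitted propensity and cost models (scikit-learn-style interfaces) are hypothetical.

import numpy as np

def flag_cases_to_stop(X_active, propensity_model, cost_model, threshold=0.01):
    """Flag active cases whose expected quality gain per unit of predicted
    remaining cost falls below a threshold."""
    p_hat = propensity_model.predict_proba(X_active)[:, 1]  # predicted response propensity
    c_hat = cost_model.predict(X_active)                     # predicted remaining cost
    value_per_cost = p_hat / np.maximum(c_hat, 1e-8)         # crude quality-for-cost index
    return value_per_cost < threshold                        # True = stop working the case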
Mr Yongchao Ma (Utrecht University) - Presenting Author
Dr Nino Mushkudiani (Statistics Netherlands)
Professor Barry Schouten (Statistics Netherlands and Utrecht University)
Adaptive survey designs are based on the rationale that any population is heterogeneous both in its response and answering behaviour to surveys and in the costs of recruiting and interviewing its members. Different survey design features may be effective for different members of the population. Adaptive survey designs acknowledge these differences by allowing survey design features to be differentiated across population subgroups based on auxiliary data about the sample; the auxiliary data are linked from frame data, registry data or paradata. The resulting strata receive different treatments.
The main components of adaptive survey designs are candidate treatments, a stratification of the population into subgroups, quality and cost criteria that need to be optimized, and a strategy to find the optimal allocation of subgroups to treatments.
A complicated but crucial choice is the stratification of the population into subgroups. Such a stratification can be oriented towards explaining heterogeneity in response propensities, explaining differences in survey costs, or explaining the answers to the main questions of the survey. Each choice leads, in general, to a different stratification. We develop a strategy for stratification and evaluate the sensitivity of the performance of the resulting designs to this strategy.
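A minimal illustration of how different orientations lead to different strata, assuming simple quantile cuts on model predictions (the authors' strategy is more elaborate; variable names here are hypothetical):

import pandas as pd

def quantile_strata(predictions, n_strata=4):
    """Assign each sample unit to a stratum based on quantiles of a prediction
    (e.g. predicted response propensity, predicted cost, or a predicted key answer)."""
    return pd.qcut(predictions, q=n_strata, labels=False, duplicates="drop")

# Strata oriented at response propensities vs. at survey costs generally differ:
# strata_propensity = quantile_strata(p_hat_response)
# strata_cost = quantile_strata(predicted_cost)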
Mrs Shiya Wu (Utrecht University) - Presenting Author
Dr Harm Jan Boonstra (Statistics Netherlands)
Dr Mirjam Moerbeek (Utrecht University)
Professor Barry Schouten (Statistics Netherlands and Utrecht University)
Precise and unbiased estimates of response propensities play a decisive role in data collection monitoring, analysis and adaptation. In a fixed survey climate, these parameters are stable and their estimates ultimately converge. In survey practice, however, response propensities gradually vary over time, for example through seasonal variation and downward trends. Understanding time-dependent variation is therefore key when predicting response propensities and adapting survey designs.
This paper investigates and interprets time-dependent variation in response propensities through multilevel time series models. Reliable predictions can be generated by learning from historic series and updating old knowledge with new data through this time series approach in a Bayesian framework. The utility of the method is demonstrated by predicting web response propensities for the Dutch Health Survey.
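To make the updating idea concrete, the sketch below shows a deliberately simplified Bayesian filtering step for a drifting (local-level) logit response propensity, with the prior learned from historic series; it is an illustration under strong assumptions, not the paper's full multilevel time series model.

import numpy as np

def logit(p):
    return np.log(p / (1 - p))

def update_propensity(prior_mean, prior_var, observed_rate, n_sampled,
                      evolution_var=0.01):
    """One filtering step: predict (random-walk drift of the logit propensity),
    then update with the newly observed response rate for the period."""
    pred_mean, pred_var = prior_mean, prior_var + evolution_var
    # Approximate observation variance of the logit rate via the delta method.
    obs_var = 1.0 / (n_sampled * observed_rate * (1 - observed_rate))
    gain = pred_var / (pred_var + obs_var)
    post_mean = pred_mean + gain * (logit(observed_rate) - pred_mean)
    post_var = (1 - gain) * pred_var
    return post_mean, post_var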
Dr Brady West (University of Michigan) - Presenting Author
Methodological studies of the effects that human interviewers can have on the quality of survey data have long been limited by a critical assumption: that interviewers in a given survey are assigned completely random subsets of the larger overall sample that is being measured (also known as interpenetrated assignment). In the absence of this type of study design, estimates of interviewer effects on survey measures of interest may simply reflect differences between interviewers in the characteristics of their assigned sample members, rather than recruitment or measurement effects specifically introduced by the interviewers. We introduce a new Bayesian approach for overcoming this lack of interpenetrated assignment when estimating interviewer effects. This approach, which we refer to as the “anchoring” method, leverages correlations between observed variables that are unlikely to be affected by interviewers (“anchors”) and variables that may be prone to interviewer effects (e.g., sensitive or complex factual questions) to statistically remove components of within-interviewer correlations that a lack of interpenetrated assignment may introduce. The improved estimates of interviewer effects on survey measures will enable survey managers to more effectively manage a data collection in real time and intervene when particular interviewers are producing survey outcomes that vary substantially from expectations. We evaluate this new methodology empirically using a simulation study, and then illustrate its application using real survey data from the Behavioral Risk Factor Surveillance System (BRFSS), where interviewer IDs are systematically provided on public-use data files.
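For intuition only, the sketch below approximates the anchoring idea in a non-Bayesian way: condition on anchor variables before estimating the remaining between-interviewer variance share. Column names are hypothetical and this is not the authors' method, which is Bayesian and more general.

import statsmodels.formula.api as smf

def interviewer_variance_share(df, outcome="sensitive_item",
                               anchors=("age", "region")):
    """Fit a random-intercept model for interviewers, adjusting for anchors,
    and return the share of remaining variance that lies between interviewers."""
    formula = f"{outcome} ~ " + " + ".join(anchors)
    result = smf.mixedlm(formula, data=df, groups=df["interviewer_id"]).fit()
    var_between = float(result.cov_re.iloc[0, 0])   # interviewer variance component
    var_within = result.scale                       # residual variance
    return var_between / (var_between + var_within)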