Incorporating machine learning into the response process

Coordinator 1: Dr Erica Yu (U.S. Bureau of Labor Statistics)
Most applications of machine learning to surveys begin after the data have already been collected, for example analyzing open-ended text responses, assigning classifications, and imputing missing values. Looking forward, incorporating machine learning methods directly into the response process has the potential to reduce both measurement error and the burden imposed on respondents and data collectors. For example, survey instruments can assist respondents in coding their own open-ended responses, suggest probes or follow-up questions conditional on patterns of responses and paradata, or process documents to automatically identify responses to survey questions. Technology can enable survey designers to customize the instrument so that it seamlessly collects more, and more targeted, information from respondents in the moment during the survey.
These applications of machine learning to the response process affect not only the programming of the instrument but also the nature of the respondent’s task and their interaction with the instrument. Do these new response tasks require respondent training to be successful? Will respondents find targeted probes for detail burdensome? Can machine learning-driven probing reduce measurement error and respondent satisficing? Will respondents correct machine learning errors? Will there be any carryover effects on other items within the survey? Can these methods be incorporated into the response process for both establishment and household surveys? This session will explore the new concepts that survey designers must consider when incorporating machine learning into the response process.