All time references are in CEST
Recent advances in collecting and analyzing open-ended survey responses

Session Organisers: Dr Ruben Bach (University of Mannheim), Dr Anna-Carolina Haensch (Ludwig Maximilian University of Munich), Professor Matthias Schonlau (University of Waterloo)
Time: Tuesday 18 July, 09:00 - 10:30
Room:
Collecting and analyzing responses to open-ended survey questions used to be a tedious task. In recent years, however, the steadily growing availability and popularity of web surveys, combined with powerful machine learning and natural language processing techniques, has greatly lowered the burden of collecting and analyzing open-ends. For example, respondents in a web survey may type their response to an open-ended question directly or provide an audio recording of their answer, which can later be transcribed automatically using speech-to-text algorithms. Natural language processing techniques and (un)supervised machine learning approaches make it possible to identify and analyze topics, sentiments, and stances in respondents' answers at unprecedented scale and speed. For unsupervised approaches, no manual coding of responses is necessary; clustering techniques, for example, are used to group answers by similarity. For supervised approaches, a sample of manually coded examples is used to train a model that can then code the remaining data automatically, often with high accuracy. With powerful large language models such as GPT-4o and Llama 3, few-shot and zero-shot learning are now easily accessible to all researchers, including those with limited programming skills.
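To illustrate the supervised workflow sketched above, the following minimal example trains a classifier on a small set of manually coded answers and uses it to code further responses automatically. All response texts, category labels, and the choice of scikit-learn's TfidfVectorizer and LogisticRegression are illustrative assumptions, not a method prescribed by the session organisers.

```python
# Minimal sketch: supervised coding of open-ended survey responses.
# All data below are invented; in practice the manually coded sample
# would be much larger and drawn from the actual survey.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# A handful of manually coded examples (label = topic assigned by a human coder).
coded_texts = [
    "The cost of living keeps rising and wages do not keep up.",
    "I worry about climate change and extreme weather.",
    "Rents in my city have become unaffordable.",
    "We need faster action on cutting carbon emissions.",
]
coded_labels = ["economy", "environment", "economy", "environment"]

# Uncoded responses to be classified automatically.
uncoded_texts = [
    "Groceries and energy bills are eating up my income.",
    "Protecting forests and oceans should be a priority.",
]

# Bag-of-words features plus a linear classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(coded_texts, coded_labels)

# Predicted topic codes for the uncoded answers.
print(model.predict(uncoded_texts))
```

The same pattern scales to thousands of responses; in practice, part of the manually coded sample would be held out to estimate accuracy before the automated codes are trusted.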
This session aims to bring together researchers from a variety of disciplines, such as survey research, statistics, and computer science, to discuss recent advances in the collection and analysis of responses to open-ended survey questions using machine learning tools. We welcome submissions that address topics such as:
- Statistical-learning / machine-learning analysis of open-ends
- Speech-to-text algorithms for open-ends
- Large language models for analyzing open-ends
- Novel ways to collect open-ends
- Applied studies using novel analysis methods of open-ends
Keywords: Open-ended survey responses, machine learning
Miss Leah von der Heyde (LMU Munich, Munich Center for Machine Learning) - Presenting Author
Dr Anna-Carolina Haensch (LMU Munich, University of Maryland)
Dr Bernd Weiß (GESIS – Leibniz Institute for the Social Sciences)
The recent development and wider accessibility of large language models (LLMs) have spurred discussions about how these models can be used in survey research, including for classifying open-ended survey responses. Given their linguistic capabilities, LLMs may offer an efficient alternative, potentially eliminating the need for time-consuming manual coding and the pre-training of supervised machine learning models. As most existing research on this topic has focused on English-language responses or single LLMs, it is unclear whether those findings generalize and how the quality of these classifications compares to established methods. In this study, we investigate to what extent different LLMs can be used to code German open-ended survey responses. We compare several state-of-the-art LLMs and prompting approaches, including zero- and few-shot prompting and fine-tuning, and evaluate the LLMs' performance relative to classifications assigned by human experts, as well as their reliability. Preliminary results suggest that while LLMs appear reliable across iterations, their predictive accuracy is subpar, both in absolute terms and compared to more conservative methods. Performance differences between prompting approaches are conditional on the LLM used, as overall performance differs greatly between LLMs. Finally, LLMs' unequal classification performance across categories results in differing categorical distributions. Performance might improve with fine-tuning. We discuss the implications of these findings, both for methodological research on coding open-ended responses and for their substantive analysis, as well as the many trade-offs researchers need to consider when choosing automated methods for open-ended response classification in the age of LLMs. In doing so, our study contributes to the growing body of research on the conditions under which LLMs can be efficiently and reliably leveraged in survey research to improve data quality.
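The study's own evaluation code is not part of this abstract; as a rough, hypothetical illustration of the kind of comparison it describes (agreement with human expert codes and reliability across LLM iterations), one might compute accuracy and Cohen's kappa as below. All codes and category labels are invented.

```python
# Illustrative sketch: comparing automated codes against human expert codes
# and checking run-to-run reliability of an LLM coder. All data are made up.
from sklearn.metrics import accuracy_score, cohen_kappa_score

human_codes = ["economy", "health", "economy", "migration", "health"]

# Codes assigned by an LLM in two independent runs (e.g. two prompting iterations).
llm_run_1 = ["economy", "health", "migration", "migration", "health"]
llm_run_2 = ["economy", "health", "migration", "migration", "economy"]

# Predictive accuracy and chance-corrected agreement with the human gold standard.
print("Accuracy vs. human coders:", accuracy_score(human_codes, llm_run_1))
print("Cohen's kappa vs. human coders:", cohen_kappa_score(human_codes, llm_run_1))

# Reliability: agreement of the LLM with itself across iterations.
print("Run-to-run kappa:", cohen_kappa_score(llm_run_1, llm_run_2))
```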