ESRA 2025 Preliminary Program
All time references are in CEST
New Data Spaces for the Social Sciences - An Interdisciplinary Program for Survey Innovation in Germany
Session Organisers: Professor Cordula Artelt (Leibniz Institute for Educational Trajectories), Dr Anika Schenck-Fontaine (Leibniz Institute for Educational Trajectories), Professor Corinna Kleinert (Leibniz Institute for Educational Trajectories)
Time: Friday 18 July, 09:00 - 10:30
Room: Ruppert 042
To better understand and help address the major social challenges of the coming decades, including digitization, climate change, growing diversification, pandemics, and war-induced societal changes, the social sciences need to unlock new opportunities for collecting and analyzing data. Many countries have a set of well-established longitudinal survey programs, but surveys are plagued by fundamental challenges related to validity, cost, and sustainability. Therefore, systematic and far-sighted social science research needs to explore the potential of recent technological advances, including new forms of data, new methods of data acquisition, and new measures of data quality.
Developing and utilizing such new data sources and data infrastructures necessitates the bundling and orchestrating of skills, knowledge, and expertise across different fields of empirical social sciences and computer sciences, which can only be managed by large-scale research programs. To achieve these goals, the German Research Foundation (DFG) has established the long-term infrastructure priority program “New Data Spaces for the Social Sciences” to open up and develop such new data spaces (https://www.new-data-spaces.de/en-us/). Within this program, a series of highly innovative research projects in four main research areas were funded: exploration and integration of different data types, respondent-driven designs, instrument validity, and multimodal data acquisition. The purpose of this session is to introduce this program, present first results of research projects funded within this program, and foster exchange with initiators and researchers who are active in similar programs in other countries.
Keywords: Survey innovation, New Data Spaces, data infrastructure, Germany
Papers
Evaluating ASR for Social Science Research: A Comparison of Semantic Metrics (Authors are Members of SPP 2431)
Mr Nicolas Ruth (Leipzig University)
Mr Andreas Niekler (Leipzig University)
Ms Leonie Steinbrinker (Leipzig University) - Presenting Author
Mr Stephan Poppe (Leipzig University)
Automatic Speech Recognition (ASR) is an essential technology for automating the transcription of qualitative data in social science research, particularly with large interview datasets. Recent advancements in ASR have introduced powerful new tools to the field, but their implementation requires careful and thoughtful consideration to ensure reliability and accuracy. Since outcomes vary significantly depending on the model and its (hyper-)parametrization, it is crucial to evaluate the generalization capabilities of ASR models on specific research data using a meaningful and comparable metric. Addressing these challenges will enable social scientists to effectively leverage these technologies in their research.
The most commonly used metric for this purpose is the Word Error Rate (WER). WER depends on language-specific text transformations and focuses on surface-level accuracy, making it inadequate for evaluating transcript quality in social sciences and downstream NLP tasks. To address these limitations, modern, semantics-oriented metrics have been developed in recent years. Metrics such as Embedding Error Rate (EmbER) and Semantic-WER apply penalties for different types of errors, while methods like BERTScore, SeMaScore, SemDist, and Aligned Semantic Distance (ASD) improve evaluation by utilizing contextual embeddings and advanced matching techniques to assess semantic similarity.
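As an illustration of why WER is a surface-level measure, the following minimal Python sketch computes WER as the word-level edit distance divided by the number of reference words. It is not the authors' evaluation code: the function names `normalize` and `word_error_rate`, the simple lowercase-and-strip-punctuation normalization, and the example sentences are all assumptions made for this example.

```python
import re


def normalize(text: str) -> list[str]:
    """Assumed, deliberately simple normalization: lowercase, drop punctuation, split on whitespace."""
    return re.sub(r"[^\w\s]", "", text.lower()).split()


def word_error_rate(reference: str, hypothesis: str, normalized: bool = True) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = normalize(reference) if normalized else reference.split()
    hyp = normalize(hypothesis) if normalized else hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words (standard Levenshtein recurrence).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


reference = "I do not trust the government."

# A missing full stop counts as one word error when no normalization is applied.
punctuation_only = word_error_rate(reference, "I do not trust the government", normalized=False)

# Dropping "not" reverses the meaning, yet it is also exactly one word error.
meaning_reversed = word_error_rate(reference, "I do trust the government.")

print(punctuation_only, meaning_reversed)
```

In this sketch both hypotheses receive the same score, although only the second changes what the respondent said. The first score also disappears once normalization is switched on, which shows how strongly WER depends on the chosen text transformations.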
Our research centers on comparing the usability of these semantic metrics for ASR in social sciences and developing an evaluation for ASR transcriptions using aligned window-based semantic comparison, as opposed to relying on traditional single-value metrics. The proposed talk is not only designed to improve the quality of individual research projects but also to contribute to the creation of new data spaces for the social sciences, where ASR is a fundamental technology. Robust ASR evaluation and utilization methods, particularly those addressing semantic validity crucial for qualitative datasets, are essential for unlocking the potential of large-scale qualitative data and enabling new research directions.
New Data Spaces Cross-National Synergies at the Example of Respondent-Driven Sampling
Dr Carina Cornesse (GESIS - Leibniz Institute for the Social Sciences) - Presenting Author
Dr Jean-Yves Gerlitz (University of Bremen)
Professor Olaf Groh-Samberg (University of Bremen)
Mr Curtis Jessop (The National Centre for Social Research (NatCen))
Professor Olga Maslovskaya (University of Southampton)
Professor Sabine Zinn (German Institute for Economic Research)
The German Infrastructure Priority Program "New Data Spaces for the Social Sciences," funded by the German Research Foundation, was established to facilitate collaboration among research infrastructures, leverage synergies, and drive innovation in the field of data collection. This initiative shares a similar vision with the UK’s "Survey Futures," a program funded by the Economic and Social Research Council to advance survey data collection methods. To propel the development of data collection methodologies, it is essential to expand the perspective on infrastructure innovation from a national to an international level, fostering synergies between countries and infrastructure programs.
This presentation contributes to this dialogue by focusing on two projects that explore an innovative data collection methodology, respondent-driven sampling (RDS), within the German and UK programs. It will outline the two projects, including the distinct study designs planned for 2025 in both countries, discuss how the projects mutually benefit each other, and explore strategies for strengthening and expanding international collaboration between infrastructure programs. Additionally, the presentation will highlight key similarities and differences between the German and UK infrastructure programs, providing insights from the perspective of researchers funded under these initiatives.
The Behavioral Measurement Toolbox
Dr Julian Detemple (University of Mainz)
Professor Florian Hett (University of Mainz)
Professor Michael Kosfeld (University of Frankfurt) - Presenting Author
Dr David Poensgen (University of Frankfurt)
Understanding human preferences (e.g., related to time, risk, or social considerations) has been central to advancing research in economics and other social sciences. These preferences influence decision-making in diverse contexts, from individual choices to social interactions and aggregate outcomes. However, existing behavioral methods that measure preferences through incentivized choices in stylized decision situations are not standardized and are often difficult to implement in survey studies and field settings. Further, there is insufficient research on which measures are the most valid.
This paper introduces the "Behavioral Measurement Toolbox" (BMT), a user-friendly platform designed to measure individual traits like time, risk, and social preferences through incentivized and controlled decision situations. With just a few clicks, researchers can implement these measures, with BMT maintaining comparability and transparency across different implementations. Details regarding past measurements are documented and easily traceable, which facilitates meta-studies and helps fulfill open science requirements. Flexible usability, online and offline, enables standardized measurement across different field contexts, in surveys as well as in lab studies. BMT’s flexible technical architecture also explicitly allows integration with existing survey platforms and hence complements other methods within the quantitative social sciences. BMT is therefore ideally suited to allow researchers from all related disciplines to incorporate behavioral preference measures into their work, while it also promises methodological advancement in terms of standardization and establishing validity.
Opportunities and Challenges for the Future of Surveys
Professor Pamela Davis-Kean (University of Michigan) - Presenting Author
Surveys have long been the primary avenue for gathering data on the beliefs and behaviors of individuals across the world. However, in the past decades, the number of individuals willing to give their time to answer questions about their lives has dwindled to the point where it is becoming difficult to obtain sample sizes large enough to analyze the data, which is especially problematic for sub-groups of interest. Response rates have been declining for a while but took a particularly dramatic hit during the COVID-19 pandemic years and have rebounded only slightly over the last few years. These lowered response rates threaten to reduce the effectiveness of surveys in helping researchers make inferences about the population. This has led to a "crisis" in survey research over how to conduct future surveys that represent the populations of interest.
For decades, the “gold standard” for creating new randomized population surveys has been to randomize address information and connect with respondents by knocking on doors and asking those who live at these addresses to participate in a given survey. However, this method is no longer garnering the response rates of 20 years ago, when one could expect around 80% of the sample to participate in a study. Response rates using this method are now considered very good at 50% and are generally lower. Web surveys have been used for decades to try to find another “entrance” into the household but have consistently low response rates, with many national panels obtaining agreement to take the survey from only around 5-6% of those invited. This presentation will discuss new ideas and opportunities for increasing response rates in surveys as well as the importance of sustaining data infrastructure for societal challenges in the future.