ESRA 2025 Preliminary Program

All time references are in CEST

New Data Spaces for the Social Sciences - An Interdisciplinary Program for Survey Innovation in Germany 2
Session Organisers	Professor Cordula Artelt (Leibniz Institute for Educational Trajectories), Dr Anika Schenck-Fontaine (Leibniz Institute for Educational Trajectories), Professor Corinna Kleinert (Leibniz Institute for Educational Trajectories)
Time	Friday 18 July, 11:00 - 12:30
Room	Ruppert 042

To expand our understanding of and have an impact on the major social challenges of the coming decades, including digitization, climate change, growing diversification, pandemics, and war-induced societal changes, the social sciences need to unlock new opportunities for collecting and analyzing data. Many countries have a set of well-established longitudinal survey programs, but surveys are plagued with fundamental challenges related to validity, cost, and sustainability. Therefore, systematic and far-sighted social science research needs to explore the potential of recent technological advances and explore new forms of data, new methods of data acquisition, and new measures of data quality.

Developing and utilizing such new data sources and data infrastructures necessitates the bundling and orchestrating of skills, knowledge, and expertise across different fields of empirical social sciences and computer sciences, which can only be managed by large-scale research programs. To achieve these goals, the German Research Foundation (DFG) has established the long-term infrastructure priority program “New Data Spaces for the Social Sciences” to open up and develop such new data spaces (https://www.new-data-spaces.de/en-us/). Within this program, a series of highly innovative research projects in four main research areas were funded: exploration and integration of different data types, respondent-driven designs, instrument validity, and multimodal data acquisition. The purpose of this session is to introduce this program, present first results of research projects funded within this program, and foster exchange with initiators and researchers who are active in similar programs in other countries.

Keywords: Survey innovation, New Data Spaces, data infrastructure, Germany

Papers

Exploring the Impact of Virtual Avatars in Survey Interviews

Mr Patrick Schrottenbacher (Goethe University Frankfurt) - Presenting Author
Dr Lydia Kleine (Leibniz Institute for Educational Trajectories)
Professor Alexander Mehler (Goethe University Frankfurt)
Professor Corinna Kleinert (Leibniz Institute for Educational Trajectories)
Professor Christian Aßmann (Leibniz Institute for Educational Trajectories)

The field of representative survey studies has evolved significantly, enabling surveys to be conducted through live video interviewing, a promising alternative to face-to-face interviewing. Recent technological advances have facilitated innovations, including the representation of both the interviewer and the interviewee as virtual avatars. A significant factor in enabling this has been the improvements made to Virtual Reality (VR), which have enhanced accessibility to features such as hand, face, and eye tracking. Representing the interviewer via a virtual avatar has both positive and negative effects on user comfort and experience, which, in turn, influence the overall quality of the interview. For example, the so-called Other Avatar Effect arises partly from the discrepancy between the perceived voice (and the associated self-imposed image of the speaker's appearance) and the actual appearance of the avatar.
To date, there has been little research into how avatars representing interviewers in VR affect the data quality of interviews. The FACES project will investigate the influence of avatar features through a series of smaller, specialized experiments, each designed to isolate particular factors. These experiments will also examine situational factors, such as the interview environment, to explore the interaction space of features as broadly and efficiently as possible. Initial hypotheses are presented alongside the virtual environment setup, and the design of the interview situation for the larger study is discussed.

The Future of Survey Data Collection in the UK

Professor Peter Lynn (University of Essex) - Presenting Author

This presentation will provide an overview of the work, achievements, and vision of Survey Futures, a major initiative in the UK designed to future-proof social survey data collection (www.surveyfutures.net). Survey Futures addresses a number of challenges that are also of interest to New Data Spaces for the Social Sciences, including those related to the costs and sustainability of survey data collection, data quality and new modes and methods. Tools and guidance will be provided for survey commissioners, survey agencies and survey data users, and we also hope to influence the debate about the future role of surveys within a changing data landscape.

CIRCLET: Unified Solutions for Data Processing and Visualization in Computational Social Sciences

Mr Kevin Bönisch (Text Technology Lab, Goethe-University) - Presenting Author
Professor Alexander Mehler (Text Technology Lab, Goethe-University)

Recent technological advances, particularly in natural language processing (NLP) and the computational humanities, are leading to novel applications and corresponding evaluations in the social sciences, especially in the context of large corpora. To bridge the gap between NLP and the social sciences, systems are needed that can be continuously extended by the integration of new research results and at the same time offer horizontal and vertical scalability of the underlying (mostly textual) resources. This means that systems are needed that support both the processing and visualization of multimodal data while ensuring interoperability.
To address these needs, we present a synthesis of two recent developments: for scalable data processing, we use the Docker Unified UIMA Interface (DUUI) as an NLP pipeline, followed by a novel system for visualizing text-based data, the Unified Corpus Explorer (UCE). In this study, we focus on the latter, demonstrating UCE's versatility in making UIMA-annotated data searchable, visually accessible, and tangible. Our approach leverages a dynamic and customizable microservice architecture, incorporating various annotations such as named entities and semantic roles, as well as Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) techniques. Together, these methods enable the generation of embedding spaces, facilitate corpus-based interactions through chat interfaces, and provide a range of search and visualization functionalities for the underlying data.
Finally, we highlight the genericity of both DUUI and UCE, demonstrating their integration across multiple domains and use cases, facilitating both primary and secondary data analysis, all without the need for re-implementation or costly model redevelopment. We exemplify UCE using two datasets relevant to the social and educational sciences: one based on Twitter and the other based on experimental data from Critical Online Reasoning (COR) experiments.
