
ESRA 2025 Preliminary Program

All time references are in CEST

Harnessing AI and machine learning techniques for survey data collection

Session Organisers Professor Gabriele Durrant (University of Southampton)
Professor David Bann (University College London)
Dr Liam Wright (University College London)
Time Tuesday 15 July, 09:00 - 10:30
Room Ruppert Blauw - 0.53

Recent innovations in Artificial Intelligence and Machine Learning (hereafter ‘AI’) are projected to transform society and influence how we conduct research and, more specifically, how we perform survey data collection and analyses. This session will focus on recent developments in the use of AI for survey data collection. The use of such techniques may have significant impacts on the quality of the resulting data and the speed of production and may open up new survey design opportunities.

For this session, topic areas of interest include, but are not limited to: the use of Large Language Models (LLMs) for survey research, data collection and data usage; the use of LLMs for questionnaire design and optimisation and for tailoring questions to respondents based on profiles or previous answers; the use of AI for survey cost efficiencies, error detection, evaluation and testing, variable coding, analysis of open-ended text data, cognitive testing in surveys, and improvement of measurement quality; the role of LLMs as (qualitative) interviewers; and the use of AI-driven chatbots. Quality implications and the challenges arising from the use of AI will be carefully discussed.

At present, AI techniques to handle and analyse existing and new forms of data are being developed at speed, but skills development from such research is limited, not easily accessible, and not always targeted at those most in need. The session will discuss these opportunities and challenges and will feature the latest developments from a research, training and capacity building project in the UK run as part of the National Centre for Research Methods (NCRM). The project will drive forward research under three AI-related themes and will deliver an innovative Training and Capacity Building (TCB) programme to ensure survey researchers stay at the forefront of digital skills concerning AI.

Keywords: AI, machine learning, large language models, survey data collection, questionnaire design

Papers

Adapting Van Westendorp Price Sensitivity Meter (VW PSM) and Newton, Miller, and Smith (NSM) Extension to Gauge Users' Expected Accuracy of AI-Assisted HR Support Technologies

Mr Shao Wei Chia (Google) - Presenting Author

Given the rapid development of AI, particularly conversational AI, there is strong interest in applying AI to enhance the experience of HR support. Because conversational AI (powered by large language models, LLMs) is statistical, it can never be 100% accurate. While there are many different frameworks for measuring LLMs' information quality (e.g., accuracy, factuality, groundedness), to date there is no established method to capture users' expected accuracy.

One might assume that users expect perfect accuracy, especially when they are interacting with AI-assisted technologies to troubleshoot HR-related issues. However, given the nature of LLMs, perfect accuracy is currently unattainable. Hence, we need to find the “ideal” accuracy from the users’ perspective.

In the present study, we adapted the Van Westendorp Price Sensitivity Meter (VW PSM) to estimate users' expectations regarding the accuracy of AI technologies in HR support. Further, we adapted the Newton, Miller, and Smith (NSM) extension to estimate expected engagement with such AI technologies. Triangulating the VW PSM results, the NSM extension, and the current limitations of LLMs, we derived a minimum accuracy threshold. We also observe regional differences in expected accuracy, which suggest the need for region-specific approaches to boost user adoption and engagement.
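
The abstract does not spell out how the VW PSM questions were rephrased for an accuracy scale, so the following is only a minimal sketch of the crossing-point logic behind the method, reduced to two illustrative questions per respondent (the minimum accuracy they would accept, and the accuracy beyond which further gains no longer change their behaviour); all names and numbers below are placeholders, not the authors' instrument.

import numpy as np

def psm_crossing_point(min_acceptable, more_than_needed, grid=None):
    """Locate the accuracy level at which the share of users who would reject
    the tool equals the share whose needs are already exceeded."""
    if grid is None:
        grid = np.linspace(0, 100, 101)
    min_acceptable = np.asarray(min_acceptable, dtype=float)
    more_than_needed = np.asarray(more_than_needed, dtype=float)
    # Share who would reject each candidate accuracy level (falls as accuracy rises)
    reject_curve = np.array([(min_acceptable > a).mean() for a in grid])
    # Share for whom each candidate level already exceeds their needs (rises with accuracy)
    exceed_curve = np.array([(more_than_needed <= a).mean() for a in grid])
    # The crossing point is where the two curves are closest
    idx = int(np.argmin(np.abs(reject_curve - exceed_curve)))
    return grid[idx], reject_curve[idx]

# Illustrative (simulated) respondent data on a 0-100 accuracy scale
rng = np.random.default_rng(42)
min_acceptable = rng.normal(85, 5, size=500).clip(0, 100)
more_than_needed = rng.normal(95, 3, size=500).clip(0, 100)
threshold, share = psm_crossing_point(min_acceptable, more_than_needed)
print(f"Estimated accuracy threshold: {threshold:.0f}% (share at crossing: {share:.2f})")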

Although we were able to estimate the “ideal” accuracy based on users’ expectations, we were not able to validate that threshold in the present study. Further studies should explore methods to validate the threshold by examining actual user engagement with AI-assisted technologies. There is also a need to explore the cross-cultural generalizability of findings obtained with such methods.


Synthetic data fidelity: how less can be more

Dr Jools Kasmire (UK Data Service / University of Manchester) - Presenting Author

Synthetic data is generated rather than observed; it includes values that someone makes up on the spot, random numbers generated by simple code, predictions made by complex machine learning models, the output of sophisticated digital twin simulations and much more. An important related concept, fidelity, captures how “faithful” a synthetic data set is to its real-world counterpart. As such, fidelity is often seen as an important, if not the most important, feature of synthetic data.

Yet fidelity is not binary; a synthetic data set can be very faithful in some ways while wildly unfaithful in others, and the specifics of its fidelity determine its usefulness. For example, if synthetic data is intended to fix gaps or biases in real-world data sets, then it must be deliberately unfaithful to the original in at least some specific ways. At the same time, not all synthetic data sets aim to mimic, replicate or augment existing real-world data sets, and some do not use any real-world data in the generation process at all. As such, fidelity (and especially high fidelity) is not always as important as might be assumed.

This talk introduces and defines what synthetic data is and is not, examines the role of fidelity, and then highlights common use cases, generation methods and concerns around synthetic data at varying levels of fidelity. When used appropriately to link a method, data set and research question, synthetic data can provide a valuable alternative to real-world data in situations where real-world data is unavailable, restricted, or unknown. Importantly, synthetic data is especially useful for enhancing reproducibility and transparency in research by balancing data utility against privacy protection, as well as for facilitating hypothesis testing and method development.
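
As a minimal illustration of fidelity being multi-dimensional rather than binary (not taken from the talk; all variables and numbers are placeholders), the sketch below generates synthetic data that reproduces the marginal distributions of a “real” data set while deliberately discarding its correlation structure.

import numpy as np

rng = np.random.default_rng(7)

# "Real" data: two correlated variables in arbitrary units
real = rng.multivariate_normal(mean=[40, 30], cov=[[100, 60], [60, 100]], size=2000)

# Synthetic data generated from the marginals only: means and variances match,
# but the joint structure (correlation) is intentionally not reproduced.
synthetic = np.column_stack([
    rng.normal(real[:, 0].mean(), real[:, 0].std(), size=2000),
    rng.normal(real[:, 1].mean(), real[:, 1].std(), size=2000),
])

def marginal_gaps(a, b):
    """Absolute differences in means and standard deviations, per column."""
    return np.abs(a.mean(axis=0) - b.mean(axis=0)), np.abs(a.std(axis=0) - b.std(axis=0))

mean_gap, std_gap = marginal_gaps(real, synthetic)
corr_real = np.corrcoef(real, rowvar=False)[0, 1]
corr_syn = np.corrcoef(synthetic, rowvar=False)[0, 1]

print("Mean gaps:", mean_gap.round(2), "SD gaps:", std_gap.round(2))   # high marginal fidelity
print(f"Correlation: real={corr_real:.2f}, synthetic={corr_syn:.2f}")  # low joint fidelity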


Constructing Modern Ads for Viewers

Mrs Cathy Zhao (UX Data Scientist, YouTube Ads UX @Google) - Presenting Author

Consumers and viewers prefer high-quality ads. Social ads perceived as higher quality and more appealing are more likely to connect with viewers and promote engagement. Research has shown that aesthetic appeal and emotional value drive overall perception and click intent more strongly than informative and clear messaging.

How can ads be made more modern and more appealing, with better design quality and higher emotional value? In this paper, we start by defining the construct of modernity for social ads and the levers for improving it, then build a measurement framework for modernity and validate it through experimentation.

Modern ads should capture and retain viewer attention and elevate brand image and perception, and this dimension matters to viewers' experiences. Modernity can be characterized by four levers: Contemporary Aesthetics, Emotional Inspiration, Cultural Relevance, and Trend Awareness.

We build a question bank to evaluate the levers of modernity. For each lever, we have a quantitatively validated set of sentiment questions, limited to 1-3 questions per lever. Users report their agreement with the questions on a 5-point scale. The questions are calibrated and improved through user studies and interviews.

We then run the evaluations at scale with the help of human raters, ensuring raters answer the questions the way viewers would experience an ad. The answers for each lever contribute to a metric: a total modernity score. This metric should be measurable in experiments, where changes are reported, while keeping subjectivity within the modernity calculation to a minimum. The metric can also be validated in randomized experiments and analyzed against viewer perceptions such as ad relevance and user trust, as well as viewer engagement and performance.
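
The abstract does not specify how rater answers are aggregated into the total modernity score; the following is a minimal sketch under the assumption that each lever's 1-5 agreement ratings are averaged and rescaled, and that the total score is an unweighted mean of the four levers. The question bank shown is a placeholder, not the validated instrument.

import numpy as np

# Illustrative question bank: lever -> question ids (placeholders)
LEVERS = {
    "contemporary_aesthetics": ["q1", "q2"],
    "emotional_inspiration":   ["q3", "q4"],
    "cultural_relevance":      ["q5"],
    "trend_awareness":         ["q6", "q7"],
}

def modernity_score(ratings):
    """ratings: dict question_id -> list of 1-5 agreement ratings from raters.
    Returns per-lever means and a total score, both rescaled to 0-100."""
    lever_scores = {}
    for lever, questions in LEVERS.items():
        values = np.concatenate([np.asarray(ratings[q], dtype=float) for q in questions])
        lever_scores[lever] = (values.mean() - 1) / 4 * 100   # map 1-5 onto 0-100
    total = float(np.mean(list(lever_scores.values())))       # unweighted average of levers
    return lever_scores, total

# Example: five raters answering each question for one ad (simulated ratings)
rng = np.random.default_rng(0)
ratings = {q: rng.integers(1, 6, size=5).tolist() for qs in LEVERS.values() for q in qs}
per_lever, total = modernity_score(ratings)
print(per_lever, round(total, 1))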


Qualitative Data Meets Quantitative Analysis: Bridging the Gap with Large Language Models

Dr Georg Wittenburg (Inspirient) - Presenting Author
Dr Josef Hartmann (Verian)

We explore the application of Generative AI models, particularly Large Language Models (LLMs), to survey research, emphasizing their potential to quantify qualitative information from open-ended responses. In the field of Generative AI, textual data, including Internet content, can be conceptualized as a vast set of implicit question-answer pairs. These pairs, when used as training data, are compressed into an LLM’s weight parameters, enabling the model to probabilistically return context-appropriate responses to a given prompt. Repeated prompting, especially when constrained to numeric answers (“On a scale from 1 to 10, tell me…?”), yields a sample that can be statistically analysed to infer information embedded in the model.
Two examples are provided to illustrate this method. First, we prompt an LLM to assess hypothetical income levels associated with various job titles, generating statistically interpretable estimates of income distribution. Second, we prompt the model to estimate the likelihood of voting for a political candidate based on verbatim feedback, using an output scale from 1 to 10 indicating propensity to vote. This repeated-sampling approach allows quantitative insights to be derived from qualitative information through multiple iterations, transforming open-text responses into data suitable for statistical analysis, while at the same time compensating for the well-documented propensity of LLMs to “hallucinate” answers in the absence of actual information.
The method is valuable due to its flexibility in setting evaluative criteria (e.g., income, political alignment) and input data types (e.g., job roles). Furthermore, this approach enables low-cost, coding-free statistical analysis of open-text data. Applied to real-world samples, this method could supplement survey data, thus addressing item-nonresponse issues. This research presents a novel approach to integrating LLMs into survey research, expanding the potential for statistical analysis of open-ended qualitative data.
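
A minimal sketch of the repeated-sampling idea described above: the abstract does not name a specific model or client, so query_llm below is a placeholder to be replaced with an actual API call, and the prompt wording is illustrative.

import re
import statistics

def query_llm(prompt: str) -> str:
    """Placeholder for a call to whichever LLM client is in use; the abstract
    does not specify one. Replace with a real API call."""
    raise NotImplementedError

def sample_numeric_rating(verbatim: str, n_samples: int = 50) -> list[int]:
    """Repeatedly prompt the model for a constrained 1-10 answer and keep
    only replies that parse as a valid integer on that scale."""
    prompt = (
        "On a scale from 1 to 10, how likely is this respondent to vote for the "
        f"candidate, based only on this feedback? Answer with a single number.\n\n{verbatim}"
    )
    samples = []
    for _ in range(n_samples):
        reply = query_llm(prompt)
        match = re.search(r"\b(10|[1-9])\b", reply)
        if match:
            samples.append(int(match.group(1)))
    return samples

def summarise(samples: list[int]) -> dict:
    """Treat the repeated answers as a sample and report distributional statistics."""
    return {
        "n": len(samples),
        "mean": statistics.mean(samples),
        "sd": statistics.stdev(samples) if len(samples) > 1 else 0.0,
    }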


Survey2Persona: Facilitating Survey Insight Generation Through Automatic Segmentation

Mr Soon-gyo Jung (Qatar Computing Research Institute)
Dr Joni Salminen (University of Vaasa) - Presenting Author
Dr Bernard J. Jansen (Qatar Computing Research Institute)

Surveys are a widely adopted research method in HCI and UX research in many organizations. Surveys help researchers and practitioners assess usability, identify design opportunities, gauge user perceptions, and understand the relationships between human factors and technology use.

However, many practitioners find analyzing survey data challenging because it involves steps such as data cleaning, exploratory data analysis, statistical and machine learning analysis, and reporting the results in an engaging manner. To facilitate this process, we present Survey2Persona, a system and methodology that helps researchers and practitioners map their data, automatically performs principled transformation steps, analyzes the data using machine learning algorithms, and outputs a set of persona profiles that represent distinct respondent groups in the data.

Survey2Persona can help researchers and practitioners who have data on a certain population but lack the sophisticated data science and analytic skills to segment it. Persona generation, in particular, consists of many steps and requires specialized skills, which can make it infeasible for people who would nevertheless like to create meaningful segments from their data. Personas are fictitious individuals that represent groups of people; they are created to make user representation more empathetic and memorable than “nameless and faceless” segments. The personas created by Survey2Persona represent different respondent groups in the input survey data, which can be virtually any category of people: system users, customers, citizens, students, stakeholder groups, and so on.

Survey2Persona has an easy-to-use interface and broad applicability across different survey datasets (the requirements are the inclusion of Likert scale statements and demographic information about each respondent), increasing its potential value for survey researchers and practitioners.
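
The actual Survey2Persona pipeline is not detailed in the abstract; as a rough sketch of the general idea (standardize the Likert items, cluster respondents, and summarise each cluster as a persona-like profile), one might write something like the following. Function, column names and the choice of k-means are placeholders, not the system itself.

import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

def personas_from_survey(df: pd.DataFrame, likert_cols: list[str],
                         demo_cols: list[str], k: int = 4) -> pd.DataFrame:
    """Cluster respondents on standardized Likert items and summarise each
    cluster as a simple persona-like profile."""
    X = StandardScaler().fit_transform(df[likert_cols])
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    grouped = df.assign(cluster=labels).groupby("cluster")

    profiles = grouped[likert_cols].mean()                         # attitudinal profile per cluster
    for c in demo_cols:
        profiles[c] = grouped[c].agg(lambda s: s.mode().iloc[0])   # most common demographic value
    profiles["n_respondents"] = grouped.size()                     # cluster size
    return profiles

# Usage: personas_from_survey(survey_df, likert_cols=["q1", ..., "q12"], demo_cols=["age_group", "gender"])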

We believe that Survey2Persona would be of interest to the conference participants, particularly those attending the “Surveys and HCI and UX Research” session.