All time references are in CEST
The use of Artificial Intelligence (AI) in Data Collection
Session Organiser: Mrs Joanne Groves (Office for National Statistics)
Time: Tuesday 18 July, 09:00 - 10:30
Room:
Artificial Intelligence (AI) is a field of computer science that aims to create systems that perform tasks requiring human intelligence.
AI has moved into the domain of data collection, including the development and application of questionnaires, chatbot interviewers, and the generation and analysis of data. Its use is perceived as a time- and resource-saving tool, or as having another expert in the room who can discover new perspectives and topics. However, the risks associated with it include:
• Bias, error and uncertainty.
• Lack of transparency in the data collection process.
• Ethical, legal, and social issues, such as privacy, consent, ownership, and accountability.
• Negative impact on the human role, interaction, and trust.
This session is for researchers working in data collection and survey design/testing. It aims to:
• Explore the benefits and risks of using AI in:
o Creating a data collection tool such as a survey.
o Collecting the data itself.
o Summarising and analysing findings.
• Share best practice in the use of AI, including quality assuring the results it generates.
We are particularly interested in abstracts about applications of AI at any stage of the data collection process; examples of where AI might have been used are listed below:
• To generate survey questions or questionnaires.
• To assist participants in completing online questionnaires.
• In data collection, e.g. chatbots as interviewers, or chatbots as participants answering questions based on assumed characteristics.
• Producing transcripts and/or interview or focus group summaries.
• Analysing qualitative data.
• Comparing the results of using AI at any stage of the qualitative data collection process with traditional qualitative data collection methods.
• Quality assurance of AI processes/results.
Please consider this as a session. I am aware of other researchers I can approach for abstracts and have an abstract (possibly two) of my own to submit.
Keywords: artificial intelligence, AI, comparisons, traditional methods, quality assurance, risks, benefits, data collection, transparency, trust, bias, ethics, chatbots, transcribing, summarising, analysis, qualitative methods, questionnaire design
Professor Noreen Naranjos Velazquez (IU International University of Applied Sciences) - Presenting Author
In network research on childhood sexual abuse, researchers face significant challenges due to the emotionally distressing nature of the sensitive data (Naranjos Velazquez, in press). The use of ChatGPT (OpenAI 2023) offers potential support in this regard. This artificial intelligence (AI) tool enables efficient data processing while maintaining emotional distance by converting qualitative content into quantifiable formats. This facilitates statistical analysis and reduces the emotional burden on researchers (Naranjos Velazquez, in press). The inter-coder reliability of this method has been evaluated, with varying degrees of confirmation. In these evaluations, the results of a human coder were compared to those of ChatGPT (Naranjos Velazquez 2024; Naranjos Velazquez, in press). Furthermore, the results of different AI models were compared using ChatGPT. The level of agreement varied, which highlights both the reliability of ChatGPT in processing sensitive information and its limitations. This presentation explores the ethical and practical implications of using AI in network research and discusses the limitations of this AI tool (Naranjos Velazquez, in press). Particularly relevant is the question of how these findings can be applied to egocentric network research in the context of child sexual abuse. The analyses presented are therefore based on reports from survivors of child sexual abuse (UKASK 2024).
References:
Naranjos Velazquez, N. (2024). ChatGPT als KI-Assistent in der Forschung zu sexuellem Kindesmissbrauch: Wie hoch ist die Übereinstimmung zwischen Mensch und Künstlicher Intelligenz? https://doi.org/10.13140/RG.2.2.30949.41448
Naranjos Velazquez, N. (in press). ChatGPT als KI-Assistent in der Aufbereitung traumabezogener Daten: Ein Forschungsbericht.
OpenAI. (2023). GPT-4 Technical Report. https://arxiv.org/pdf/2303.08774.pdf
UKASK. (2024). Geschichten die zählen. https://www.geschichten-die-zaehlen.de/
Mr Oscar Cariceo (Kintu) - Presenting Author
The use of Artificial Intelligence (AI), particularly Large Language Models (LLMs), has transformed data collection and analysis processes across various domains. This paper presents a framework for leveraging LLMs to analyze user-generated content, specifically mobile application reviews. The proposed system automates the retrieval, analysis, and reporting of app reviews to provide insights into user satisfaction and key discussion topics. By employing AI-driven techniques, such as natural language processing (NLP) and sentiment analysis, the system offers a scalable, efficient, and expert-level solution for extracting actionable insights. Our study explores the potential of AI as a time-saving tool and its implications for understanding user feedback in greater depth.
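The retrieval, analysis and reporting steps described above could be wired together roughly as follows. This is only an illustrative sketch, not the authors' system: the `Review` type, the `keyword_sentiment` function and the word lists are hypothetical, and the keyword scorer is a toy stand-in for the LLM call so that the example runs without any external service.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Review:
    """Hypothetical container for one retrieved app review."""
    text: str
    rating: int


def keyword_sentiment(text: str) -> str:
    """Toy stand-in for an LLM sentiment call (assumption, not the real model)."""
    positive = {"great", "love", "fast", "easy"}
    negative = {"crash", "slow", "bug", "broken"}
    words = set(re.findall(r"[a-z]+", text.lower()))
    score = len(words & positive) - len(words & negative)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def summarise(reviews: Iterable[Review],
              classify: Callable[[str], str] = keyword_sentiment) -> dict:
    """Analysis and reporting step: count reviews per sentiment label."""
    return dict(Counter(classify(r.text) for r in reviews))


# Retrieval is mocked here with a hard-coded list of example reviews.
sample_reviews = [
    Review("Love the app, fast and easy", 5),
    Review("Keeps crashing, very slow", 1),
    Review("It is okay", 3),
]
report = summarise(sample_reviews)
```

Passing a different `classify` function (for example, a wrapper around an LLM API) would leave the rest of the pipeline unchanged, which is the main point of the design.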
Dr Susanne Weber (Institute of Medical Biometry and Statistics) - Presenting Author
Mr Jochen Knaus (Weizenbaum-Institut)
Dr Erika Graf (Institute of Medical Biometry and Statistics)
Dr Jörg Sahlmann (Institute of Medical Biometry and Statistics)
Mr Dominikus Stelzer (Institute of Medical Biometry and Statistics)
Professor Martin Wolkewitz (Institute of Medical Biometry and Statistics)
Professor Harald Binder (Institute of Medical Biometry and Statistics)
Mr Urs Fichtner (Institute of Medical Biometry and Statistics)
Background:
At many universities and medical centers, experts offer statistical consulting to clinicians and scientists who plan and conduct studies involving statistical procedures. At the Institute of Medical Biometry and Statistics in Freiburg, more than 24 scientists operate a central consulting service. Since March 2024, a license for the commercial Large Language Model (LLM) ChatGPT has been offered to the consultants to support them with their statistical consulting tasks. Within the EXPOLS project, we aimed to gather insights into how statistical consultants use LLMs, what barriers and advantages they have experienced, and what perceptions they have of the use and proliferation of LLMs for their (future) work as scientists and consultants. For this purpose, we conducted n=6 semi-structured interviews with a purposive sample of the statistical consultants. The average length of the interviews was 48 minutes. The audio data were transcribed verbatim. For this contribution, we aim to explore the performance of proprietary LLMs (ChatGPT, Claude Sonnet, Llama, Gemini) for deductive coding of qualitative text data.
Methods:
Once data collection is finished at the end of 2024, we will develop a code system deduced from the interview guideline. The code system will be used to manually code the text data using MAXQDA software. Coding will be applied iteratively, creating subcodes or new codes inductively where necessary. Afterwards, a parallel coding approach will be implemented, applying the updated code system with the four LLMs. The results will be exported to MAXQDA to estimate inter-coder reliability between human coders and the LLMs. Furthermore, we will explore systematic differences across the LLMs and the manual coding.
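One common way to quantify inter-coder reliability between a human coder and an LLM is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The abstract does not say which statistic will be used, so the following is a minimal sketch under that assumption; the code labels and the two example codings are invented for illustration.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders assigning one code per text segment."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must code the same non-empty set of segments.")
    n = len(coder_a)
    # Observed agreement: share of segments given the same code by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of each coder's marginal code frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codings of six interview segments (labels are illustrative only).
human = ["barrier", "advantage", "barrier", "usage", "advantage", "usage"]
llm = ["barrier", "advantage", "usage", "usage", "advantage", "barrier"]
kappa = cohen_kappa(human, llm)
```

In this invented example the coders agree on four of six segments, and chance agreement is one third, so kappa is about 0.5. With more than two coders (for instance, several LLMs plus a human), a multi-rater statistic such as Krippendorff's alpha may be more suitable.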
Results:
The results will be presented at the conference, and the applicability of the LLMs for deductive text coding will be discussed with a focus on consistency, precision and inter-coder reliability.
Mrs Joanne Groves (Office for National Statistics) - Presenting Author
This study, carried out at the Office for National Statistics (ONS), UK, investigates whether AI can generate questionnaires that compare in clarity, respondent understanding and accuracy to those generated by human experts in the field of questionnaire design.
The aim is to evaluate the quality of AI-generated questionnaires, highlight potential risks and explore the possibility of integrating AI into the process of survey design to improve efficiency. The study compares two questionnaires intended to collect data to meet identical data requirements. One is generated through AI by an expert AI prompt writer who is not an expert in questionnaire design. The comparison questionnaire is written by questionnaire design experts. The AI prompt writer and the questionnaire design experts worked separately from each other, and the time it took to produce each questionnaire was measured.
The questionnaires will be compared, using an assessment tool based on good practice questionnaire design principles, by experts who were not involved in the initial stages. These experts will not know which questionnaire is which. Each questionnaire will be evaluated against various criteria, and the better design for each criterion will be identified. A report will be drafted to inform wider work at the ONS on AI efficacy in the field of questionnaire design.
At the ESRA conference we will present the results of the comparison and comment on the differences between the AI-generated questionnaire and the one generated by the questionnaire design experts.