
ESRA 2023 Glance Program


All time references are in CEST

Quality Assurance in Digital Trace Data gathered through Data Donations: Frameworks, Tools, and Best Practices

Session Organisers Mr Yannik Peters (GESIS - Leibniz Institute for the Social Sciences)
Mrs Fiona Draxler (University of Mannheim)
Mrs Laura Young (University of Mannheim)
Mrs Jessica Daikeler (GESIS - Leibniz Institute for the Social Sciences)
Time Tuesday 18 July, 09:00 - 10:30
Room

In the evolving landscape of survey research, new data sources such as digital trace data (e.g., online interaction, social media, and browsing behavior data) have garnered significant attention. Combining survey data with digital behavioral data collected through data donation offers great potential for social science research, but its practical implementation from a survey research perspective presents unique challenges, particularly concerning data quality and reliability.
This session will delve into the critical aspects of quality assurance in the context of collecting digital trace data through data donations. We will explore comprehensive frameworks, state-of-the-art tools, and best practices tailored to ensure the integrity and usability of these data sources.
Key topics will include:
1. Frameworks for Quality Assurance: An overview of frameworks designed to evaluate the quality of digital trace data through data donations, including criteria for assessing reliability, validity, and representativeness.
2. Tools and Platforms for Data Validation: A discussion on tools, technologies, and platforms (e.g., the KODAQS toolbox) for validating the quality of digital trace data collected through data donations.
3. Best Practices and Case Studies: Case studies on data donation for collecting and processing online interaction data, focusing on assessing data quality and providing examples of how to measure and improve it. Real-world case studies will illustrate successful integration of these data types into survey research, highlighting challenges and solutions.
4. Didactics of Data Quality Issues: Strategies for teaching data quality assurance for digital trace data through data donations. This segment will focus on educational approaches.
This session aims to foster a deeper understanding of the methodological challenges and practical solutions in assuring the quality of data donations and digital trace data in survey research.

Keywords: Digital Trace Data - Data Donation - Data Quality - Data Linkage

Papers

WhatsApp Data Donations for Interpersonal Relationship Research: First Insights on Data Quality and Relationship-level Chatting Differences

Mr Julian Kohne (GESIS - Leibniz Institute for the Social Sciences) - Presenting Author
Professor Christian Montag (Ulm University)

We combined a questionnaire with data donations to explore the feasibility of using WhatsApp data donations to investigate interpersonal relationships. To do so, we investigated the factors that contribute to participants not donating or restricting their data donations (dropout, non-consent to data donation, non-compliance with stated donation intention, and self-censoring the data), as well as the predictive potential of anonymized chat log characteristics for relationship-specific survey responses (relationship type and interpersonal closeness). We examined differences in willingness to donate, actual donations, and self-censorship across a range of demographic, psychological, and relationship-relevant characteristics. In our opt-in study (N = 357), after non-consent and dropout from the survey (N = 60), about 70% of the remaining participants (N = 206) stated that they were willing to donate, with younger participants being more willing, and willing participants exhibiting fewer privacy concerns than unwilling participants. We found some evidence pointing to women being more willing to donate than men. We observed an intention-behavior gap, with only ~68% (N = 140) of willing participants actually donating. We did not find significant differences between donors and non-donors based on individual tests (e.g., gender, age, personality, relationship status) but found some evidence for personality and sexual orientation as potentially influential factors in a logistic regression model. With respect to relationship characteristics, first descriptive analyses point to observable differences in the number of messages, number of words, number of emoji, number of URLs, and average reply times by relationship type and interpersonal closeness scores in the last 30 days prior to answering our survey. The final presentation will include LASSO or Ridge Regression model results to quantify the predictive potential of these variables.
We discuss implications for conducting WhatsApp data donation studies, limitations, and directions for future research with respect to our findings.
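
As a reading aid, the participation funnel reported in the abstract can be retraced with a few lines of arithmetic. This is a minimal sketch, not the authors' analysis code: the variable names are hypothetical, and it assumes that "remaining participants" means the opt-in sample minus the 60 cases lost to non-consent and dropout. The overall donation rate at the end is derived here and is not a figure reported in the abstract.

```python
# Counts as reported in the abstract.
n_enrolled = 357   # opt-in sample
n_lost = 60        # non-consent and survey dropout
n_willing = 206    # stated willingness to donate
n_donated = 140    # actually donated

# Assumption: "remaining participants" = enrolled minus lost.
n_remaining = n_enrolled - n_lost


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


willing_rate = share(n_willing, n_remaining)   # reported as "about 70%"
donation_rate = share(n_donated, n_willing)    # reported as "~68%"
overall_rate = share(n_donated, n_enrolled)    # derived, not reported

print(f"Remaining after non-consent/dropout: {n_remaining}")
print(f"Willing among remaining: {willing_rate}%")
print(f"Donated among willing: {donation_rate}%")
print(f"Donated among all enrolled: {overall_rate}%")
```

Under that assumption, the computed shares are consistent with the rounded percentages given in the abstract.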