All time references are in CEST
Data skills for analysts in contemporary social survey research
Session Organiser | Dr Vanessa Higgins (UK Data Service/University of Manchester)
Time | Tuesday 18 July, 09:00 - 10:30
Room |
Survey analysts are experiencing a rapidly advancing data skills landscape. There is a need to develop traditional survey data skills for contemporary research needs, while recognising the growing potential for integrating survey data with an expanding array of other sources such as web data, streaming data and administrative data. More data are moving into Trusted Research Environments (TREs), where access requires skills beyond those needed for traditional modes of data sharing. Increased computing power makes new types of analysis possible, with computational social science and machine learning developing at pace. New conversations are being had and cutting-edge work is being done on synthetic data, and while AI for coding and analysis is bringing new uses, there are concerns about verifiability and reproducibility. All this means new skills are needed, and these are constantly evolving.
The aim of this session is to create a space to share ideas around the development of data skills for analysts in contemporary social science research. We invite submissions on the development of data skills on topics such as:
- managing, accessing and analysing survey data
- new or alternative sources of data
- linking survey data with new or alternative sources of data
- teaching/training in contemporary data skills
- the use of learning technologies and software
Papers need not be restricted to these specific examples.
Dr Deb Wiltshire (GESIS-Leibniz Institute for the Social Sciences) - Presenting Author
Dr Wiebke Weber (LMU)
Dr Simon Parker (DKFZ)
Dr Vanessa Gonzalez Ribao (DKFZ)
Mr Markus Herklotz (LMU)
Germany has a large Research Data Centre infrastructure that facilitates the analysis of complex and highly sensitive data. There is therefore a growing need to foster awareness of the importance of using sensitive, potentially disclosive data responsibly. Researchers and data professionals must acquire special skills to handle these data in an ethical and efficient way. However, the data protection training available at institutions is typically generalist and does not cover important specialist topics such as lawful data sharing or statistical disclosure control. Research Data Centres may offer more specialist training for researchers seeking access to the data they hold, but such courses are not recognised across services.
To address this situation, researchers from GHGA, KonsortSWD and BERD@NFDI are working collaboratively to develop ASSURED: an adaptable E-Learning training and accreditation system that enables researchers to follow a training pathway tailored to their individual needs and to achieve a widely-rec
Professor Giovanni Busetta (University of Messina)
Professor Maria Gabriella Campolo (University of Messina) - Presenting Author
Professor Antonia Cava (University of Messina)
Dr Debora Maria Pizzimenti (University of Messina)
Digital ethnography is gaining prominence as a crucial methodological approach for exploring workplace discrimination, providing access to authentic narratives and unfiltered perspectives shared in online spaces. By analyzing user-generated content from platforms such as Reddit and other social forums, researchers can uncover complex dynamics of bias, stigma, and inequality that traditional methods, such as surveys or interviews, may fail to capture.
This paper explores the methodological contributions of digital ethnography to labor market research, emphasizing its capacity to:
- Capture real-time narratives: Online platforms offer immediate and candid accounts of discriminatory experiences, free from the constraints of structured data collection.
- Reach diverse populations: The anonymity and accessibility of digital spaces facilitate engagement with marginalized groups often underrepresented in conventional studies.
- Analyze social dynamics: Interactions within online communities reveal collective strategies of resistance, shared language, and evolving norms around discrimination and equity.
The study critically evaluates the strengths and limitations of digital ethnography in labor market research. While it allows unprecedented access to rich, organic data, challenges such as self-selection bias, platform-specific cultures, and ethical concerns must be addressed. The paper underscores the importance of ethical safeguards, including informed consent, privacy protection, and the contextual interpretation of public content, advocating for the responsible application of this approach.
Through illustrative case studies, the paper demonstrates how digital ethnography complements traditional methodologies by adding depth and context to quantitative findings. This integration fosters a comprehensive understanding of workplace discrimination, capturing the complexity of discriminatory behaviors and their broader implications for policy and organizational practices.
The study concludes by arguing for the integration of digital ethnography into mixed-method research designs. By leveraging its potential, researchers can advance the study of workplace discrimination and contribute to the development of equitable labor market policies in an increasingly digital world.
Mrs Jen Buckley (UK Data Service, University of Manchester)
Ms Alle Bloom (UK Data Service, University of Manchester) - Presenting Author
To address the growing demand for accessible data skills among postgraduate researchers, the UK Data Service partnered with three Doctoral Training Partnerships (DTPs), which provide postgraduate training for the social sciences, to develop the online course 'Introduction to Finding and Using Data'.
The goal was to create high-quality data training that would equip students with skills in key areas relating to the use of data for social research. It focused on upskilling students beyond the hard statistical training taught in their university courses, and covered five main topic areas: the types and sources of data for social research, the UK Data Service and the support and training available, practical information and tools to help researchers find suitable data for a project, ethical issues when sourcing data, and what to consider when starting data analysis.
The outline of the online modules was developed in collaboration with training providers to ensure it addressed knowledge gaps and worked alongside existing training. We trialled the modules in an initial pilot phase, where we recruited student consultants to participate in focus groups tasked with recommending changes to the course design and content. The feedback from students then informed a redesign of the course.
The course uses existing survey data as a key example throughout. It also showcases newer forms of data, demonstrating the need for researchers to understand how these can be used alongside each other in the changing data landscape.
This presentation will discuss the question: What data skills do we need? Using the course as a case study, it will highlight the need to move beyond purely statistical skills, and provide data training that helps researchers to develop broader skill sets - particularly for working with large-scale survey data.
Dr Pierre Walthéry (UK Data Service, University of Manchester) - Presenting Author
Dr Sarah King-Hele (UK Data Service, University of Manchester)
This presentation will introduce the Survey Skills Pathway, a new, skills-focused series of asynchronous learning material currently under development by the UK Data Service (UKDS) and an initial implementation of the Data Skills Framework (DSF).
Over the past few decades, the landscape of training in foundational survey methods—particularly for non-academic users—has become more fragmented. This shift reflects the emergence of new disciplines such as computational and data science, as well as the rise of independent users of social survey data who often lack an academic background.
In response to the increasingly diverse training needs in survey data analysis, the UKDS published the Data Skills Framework (DSF) in 2023. This comprehensive framework provides a conceptual map of the discrete skills needed in this new data-driven landscape and is meant to be used as a roadmap for content creation. The Survey Skills Pathway is the first curriculum design based on the DSF and consists of a series of asynchronous modules and lessons, mapped to the framework, that can be followed either sequentially, in a similar fashion to a traditional undergraduate course, or as standalone modules. In its final form, it will consist of 12 core modules covering social survey design, data exploration and foundations of data analysis, alongside broader topics such as the role of secondary data analysis in the research process and the ethics of data analysis.
What sets this approach apart is its modular, flexible structure, which supports user-centered learning while retaining conceptual coherence.
Dr Riza Battista-Navarro (University of Manchester)
Dr Marta Cantijoch-Cunhill (University of Manchester)
Dr Alex Cernat (University of Manchester)
Mr Conor Gaughan (University of Manchester) - Presenting Author
Professor Rachel Gibson (University of Manchester)
This paper will demonstrate the conceptual and methodological value of, and the challenges in, producing anonymised and standardised variables from survey respondents’ digital trace data (DTD). We will do this using existing YouGov datasets: two collected in the US, in 2020 and 2024, and a third collected in the UK in 2022. The US datasets link individual survey responses to respondents’ Twitter feeds and the UK dataset links them to their browsing history. All three datasets were designed to address research questions about the effects of digital media consumption and exposure on citizen attitudes and behaviours. The paper will proceed in three main stages. First, we will identify a range of new anonymised variables that can be created from the DTD and that can address important new substantive questions about the impact of web and social-media content on individuals’ political engagement. We will also specify a set of more methodologically interesting variables that can be extracted from the observational trace data and used to validate the survey responses. Second, having identified the range of ‘ideal’ variables that could be generated, we will select a subset of these variables to show how they can be operationalised and discuss the technical challenges faced in doing so, focusing particularly on comparing Twitter to browser data. We will select the variables by rating them on two core criteria: scientific value and ease of computation. In the final stage, we will reflect on the ethical issues raised in this process of linking survey data with digital trace data, and on the key ‘take homes’ that our research has identified for future projects of this type to consider prior to data collection.
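As a purely illustrative sketch of what an anonymised, standardised variable derived from trace data might look like, the Python example below aggregates hypothetical browsing events into a per-respondent count of news-site visits, replaces the respondent identifier with a keyed hash, and links the result to a survey record. Everything here is an assumption made for illustration: the field names (`respondent_id`, `self_reported_news_use`, `news_visits`), the toy data, the secret key and the choice of HMAC-SHA256 are not taken from the paper or from the YouGov datasets.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret key; in practice it would be held by the data custodian
# and never distributed with the linked file.
SECRET_KEY = b"replace-with-a-real-secret"


def pseudonymise(respondent_id: str) -> str:
    """Return a keyed hash so the raw identifier never appears in the output."""
    return hmac.new(SECRET_KEY, respondent_id.encode("utf-8"), hashlib.sha256).hexdigest()


def derive_news_visits(trace_events, news_domains):
    """Count visits to news domains per respondent, discarding the raw domains."""
    counts = Counter()
    for respondent_id, domain in trace_events:
        if domain in news_domains:
            counts[respondent_id] += 1
    return counts


def link(survey_rows, trace_events, news_domains):
    """Attach the derived count to each survey row under a pseudonymous ID."""
    counts = derive_news_visits(trace_events, news_domains)
    linked = []
    for row in survey_rows:
        rid = row["respondent_id"]
        linked.append({
            "pid": pseudonymise(rid),
            "self_reported_news_use": row["self_reported_news_use"],
            "news_visits": counts.get(rid, 0),  # 0 when no matching trace events
        })
    return linked


# Invented toy data, for illustration only.
survey_rows = [
    {"respondent_id": "r1", "self_reported_news_use": "daily"},
    {"respondent_id": "r2", "self_reported_news_use": "never"},
]
trace_events = [
    ("r1", "news.example"),
    ("r1", "shop.example"),
    ("r1", "news.example"),
    ("r2", "shop.example"),
]
NEWS_DOMAINS = {"news.example"}

linked = link(survey_rows, trace_events, NEWS_DOMAINS)
for record in linked:
    print(record)
```

Placing the derived count next to the self-reported measure is one simple way such a variable could be used to check survey responses against observed behaviour. A keyed hash alone does not make the data anonymous, and aggregate counts can still be disclosive, so a real project would need further disclosure control.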