All time references are in CEST
Ways Forward: Social Science Research Data Infrastructures in Europe
Session Organisers | Professor Christof Wolf (GESIS Leibniz-Institute for the Social Sciences & KonsortSWD); Professor Daniel Oberski (Utrecht University & ODISSEI); Professor Nicolas Sauger (Sciences Po & Prodego); Mikael Hjerm (Umeå University & CORS) |
Time | Tuesday 18 July, 09:00 - 10:30 |
Room |
Research data infrastructures (RDIs) for the social sciences have developed substantially over the last decades and have become indispensable for social research. RDIs include survey programs, such as ESS, EVS, ISSP, WVS, but go beyond these: RDIs support researchers through all phases of the research data lifecycle, for example by helping to write data management plans, select measurement instruments, and archive data, and by providing tools for data collection or for coding open-ended questions. In several countries, initiatives aim to bundle RDIs for social research, such as KonsortSWD (DE), ODISSEI (NL), Prodego (FR), CORS (SE). Yet, the field of RDIs faces several challenges, for example:
How can RDIs cooperate across national borders and develop European perspectives? Currently the European Open Science Cloud (EOSC) is assembling national and thematic nodes. Are they a way forward? National RDIs, targeting mostly national researchers, do not focus on exchange and learning across borders. Therefore, we lack the ability to develop best practice or set standards. Social science data archives are one exception: they have networked for almost five decades and now operate CESSDA ERIC, which organizes opportunities for exchange, discusses standards, and speaks with one voice vis-a-vis funders like the European Commission.
How can RDIs support and incubate innovative initiatives at the local/national level and leverage them to develop international standards?
(How) can AI help RDIs to be more productive, to (semi-)automate services that still consist mainly of manual tasks?
How can RDIs be funded sustainably? Funding of RDIs is often organized similarly to research projects, i.e., funds are only granted for a very limited period. Additionally, funding is often not sufficient because reviewers (and sometimes applicants) have difficulty estimating accurately the effort needed to operate an RDI.
We invite contributions addressing these and related questions, including success stories of implementing or operating RDIs.
Keywords: Research Data Infrastructure, Survey Data, Digital Data, EOSC
Professor Rory Fitzgerald (European Social Survey HQ (City St George's, University of London))
Dr Eric Harrison (European Social Survey HQ (City St George's, University of London)) - Presenting Author
The ESS was established in 2002 and quickly became one of the most frequently used infrastructures for social scientists in Europe. Since cross-national data collection is complex and resource intensive, it is essential for social scientists to collaborate to organise survey data collection that is relevant to many researchers.
In this presentation, the different stages of development of the infrastructure are outlined. The formative period saw the demand for the ESS identified by the scholarly community, leading to the development of a Blueprint. The second period saw the first survey operations, where the ESS was run as a series of separate research grants funded by the EC and national funders, delivering high-quality data accordingly. The third phase saw the ESS benefit from EC infrastructure support, allowing the core team to develop additional services and facilitate more extensive data provision. The fourth phase saw the ESS become an independent legal entity (a European Research Infrastructure Consortium). This has been followed by a fifth phase, in which membership increased to become the highest of any ERIC. Finally, in its current phase, the ESS is in a period of innovation, changing its core data collection methodology to self-completion and seeking to develop its cross-national web panel and extend that beyond Europe. Links with RIs outside of the social sciences are also being established.
The presentation will conclude by discussing how the successful development of the ESS has been based on a careful balance between bottom-up and top-down elements, complemented by high-quality methodology, relevant questionnaire content and, crucially, very accessible data. This combination of organisational and operational elements has led to over 230,000 users of ESS data since it began delivering data, over 7000 publications, extensive use in university teaching and frequent reference in policy documents.
Professor Franciska de Jong (CLARIN ERIC / Utrecht University) - Presenting Author
This contribution aims to outline the dynamics in the landscape of European infrastructures relevant for social survey research.
In Europe, several initiatives are contributing to enhanced synergy in collecting, curating and studying data that reflect societal dynamics, through a focus on the interoperability of their infrastructural service offer. The common platform for this endeavour goes under the name of SSH Open Cluster (SSHOC), which emerged in 2022 from several collaborative EC-funded infrastructural projects. The scope of SSHOC goes beyond social data and also covers cultural/humanities resources. Combined with the rise of machine learning and large language models (in short: AI), enhanced data interoperability across the boundaries of disciplines, languages, data types, periods and regions is likely to stimulate multidisciplinary approaches and the integration of data sources and formats in the study of societal phenomena. In particular, the RDIs CESSDA (social survey data) and CLARIN (language resources) are now in a position to support comparative research with workflows that combine data types and analysis methods in innovative ways. Federated services for the translation of survey questions, response data, and spoken interview data will help increase the speed of generating research data as well as the harmonisation/standardisation of data formats (including data published/archived by third parties, such as publishers and public sector institutes). The resulting support for comparative studies at a wider scale is likely to stimulate the development of innovative multidisciplinary research agendas and the potential for societal impact of the research findings. Interoperability with disciplines beyond the SSH domain is also likely to become more advanced due to the role of SSH RIs in the wider EOSC landscape. With CLARIN as a candidate node in the EOSC Federation, the SSH Open Cluster at large is well aligned with any new interoperability approaches that may emerge.
Dr Bonnie Wolff-Boenisch (CESSDA ERIC) - Presenting Author
Data archives and infrastructures in the social sciences have a long history of managing research data. The first networks, such as CESSDA and IASSIST, dedicated to data curation, archiving, and dissemination, were established in the early 1970s, long before the advent of Open Science initiatives and the formal recognition of the FAIR principles.
Despite the increasing digitisation of society and research, the core mission of research data archives in the social sciences has remained consistent. However, these archives have expanded their responsibilities and diversified their tasks over time. The expert knowledge in research data management and data curation for long-term preservation, accumulated over the past 50 years, has enhanced the visibility of national and European data infrastructures, such as CESSDA ERIC, among decision-makers. This expertise has also fostered increased collaboration between social science research and other disciplines at the European level.
Additionally, the Social Sciences and Humanities Open Cluster (SSHOC), one of the five EC-funded Science Clusters, offers a framework for more collaborative development and knowledge-sharing in practices, tools, policies, and standards. This approach to strategically leveraging assets and resources aligns with the concept of the European Open Science Cloud (EOSC) Federation and serves as a model for future local or national nodes. Cross-border and cross-disciplinary collaboration is particularly vital as we enter the era of artificial intelligence (AI), given the ethical, technological, and methodological challenges it poses for data management.
There is a need for well-curated and interoperable data to maximise the potential of AI in providing researchers with innovative tools. Adopting common best practices and standards across local, national, and European infrastructures in the social sciences would strengthen the discipline and accelerate the development of trustworthy AI applications.
Professor Christof Wolf (GESIS)
Dr Bernhard Miller (GESIS) - Presenting Author
The National Research Data Infrastructure (NFDI) in Germany represents a significant advancement in supporting researchers throughout the research data lifecycle. Within this framework, KonsortSWD serves as the consortium for social, behavioral, educational, and economic sciences, developing comprehensive services to facilitate and enhance data management practices – for survey data and beyond.
This contribution presents two key aspects of KonsortSWD: First, KonsortSWD extends beyond traditional survey methodology by implementing comprehensive research data management (RDM) solutions. Through a network of 41 accredited research data centers, it provides researchers with access to sensitive data while ensuring ethical and legal compliance.
Second, KonsortSWD has developed value-added services that significantly enhance researchers' capabilities. Examples to be presented include the Open Data Format, which offers an innovative solution for data processing and exchange, enabling efficient workflows across different software environments while maintaining FAIR principles. Additionally, specialized tools like STAMP provide standardized data management planning.
These developments illustrate how national research data infrastructures can effectively support researchers while addressing critical challenges in data management and accessibility. KonsortSWD's experience demonstrates the potential for scaling solutions internationally and contributing to standardized practices across borders. By sharing our experiences we hope to help initiate cross-border exchange and cooperation on the development of joint services with other national initiatives.
Dr Adrienne Mendrik (Eyra) - Presenting Author
Mr Melle Lieuwes (Eyra)
The growing complexity of research data management in the social sciences requires innovative and sustainable infrastructures to support researchers throughout the data lifecycle. We introduce the Next platform, a collaborative, open-source web platform developed by Eyra (eyra.co), co-created with researchers. The platform acts as an online operating system (OS) for research, hosting innovative modular software services (akin to applications on a traditional OS) that support the social sciences, such as data donation. The Next platform is easy to use for researchers and readily available online after sign-in.
The Next platform exemplifies a sustainable approach to RDI development. Built on reusable modules within the shared Next mono codebase, each software service benefits from and contributes to a collaborative ecosystem. This modular design enables the efficient use of resources across services, ensuring long-term support and adaptability while fostering interoperability with third-party tools such as Qualtrics. The Next platform and the software services on the platform are governed by the Nova software foundation with representatives from various universities and Eyra co-creating policies based on co-determination.
This presentation highlights the platform's role in addressing critical challenges in the field of RDIs, including fostering cross-border cooperation, leveraging innovative technologies, and ensuring sustainable funding. For example, software services like data donation on the platform are supported with funding from various countries, such as the Netherlands (e.g. universities and ODISSEI), Germany (e.g. Mannheim University, GESIS), and the UK (e.g. Cambridge), so that cooperation across borders also broadens the funding base.
We will showcase real-world applications of the platform and discuss its potential to set new standards for RDIs in Europe and beyond. Since the Next platform functions as an integration hub, it contributes to a more interconnected and efficient research ecosystem, advancing the social sciences.
Dr Steven McEachern (UK Data Service) - Presenting Author
The development of social science research infrastructure has a long-established history. From the earliest work on development of comparative social surveys by collaborations such as the International Social Survey Program and World Values Survey, to the networks of social science data archives established in Europe and across many developed economies, there has been a sustained effort to build national infrastructure to support the social sciences.
This investment, however, varies in both level and complexity, particularly outside of Europe. Countries such as the US, Canada, Australia, New Zealand and the Philippines have had long-term programs of survey data collection and data support, but vary significantly in the level and scope of the investments made, particularly as larger national and international infrastructures have developed within Europe. This appears to be a function of both the national recognition of the social sciences, and the embeddedness of the social science community within the research infrastructure landscape.
This paper seeks to consider how such institutional arrangements have developed, with a focus on the author's experience in developing infrastructure in both Australia and the UK. A comparison of the historical and recent strategic and operational arrangements in the social science archives in the two countries provides insight into how such institutional arrangements developed within the social sciences in each country. This is then followed by an analysis of arrangements across domains, using insights from the recently completed WorldFAIR project led by CODATA. The WorldFAIR review provides the basis for comparing how social science is positioned relative to other domains by looking at data management practices within each domain. The paper then concludes with suggestions for sustaining and growing infrastructures in the social sciences based on reflections from the case studies in the paper.
Dr Michele Santurro (National Research Council - Institute for Research on Population and Social Policies) - Presenting Author
Professor Cristiano Vezzoni (University of Milan)
Professor Ferruccio Biolcati Rinaldi (University of Milan)
Ms Loredana Cerbara (National Research Council - Institute for Research on Population and Social Policies)
Mr Frank Heins (National Research Council - Institute for Research on Population and Social Policies)
Mr Nicolò Marchesini (Italian National Institute of Statistics)
Dr Angela Paparusso (National Research Council - Institute for Research on Population and Social Policies)
Ms Claudia Pennacchiotti (National Research Council - Institute for Research on Population and Social Policies)
Dr Francesco Piacentini (University of Milan)
Dr Luciana Taddei (National Research Council - Institute for Research on Population and Social Policies)
Italian social science research has historically been hindered by fragmented research infrastructure and a critical lack of longitudinal data, with profound consequences for understanding and addressing social change. The Italian Online Probability Panel (IOPP) will be the first initiative in Italy to produce high-quality longitudinal survey data monitoring social change. Based on a probability sample of approximately 10,000 individuals aged 18–74, drawn from the national population register, IOPP will enable robust tracking of social transformations. Offline participants will be included via postal questionnaires, ensuring comprehensive representation. Recruitment will start in early 2025, with data collection spanning five annual survey waves.
IOPP surveys will combine a core questionnaire, measuring enduring constructs such as family and housing, education, work, income, inequality, vulnerability, and political attitudes, with rotating modules developed via open calls to the research community. IOPP adheres to state-of-the-art standards in probability sampling, question testing, translation for international comparability, coding, response rate optimization, and incentive design.
Notably, IOPP is developed within the Fostering Open Science in Social Science Research (FOSSR) project, which addresses Italy’s infrastructural gap by establishing an innovative Research Data Infrastructure (RDI) aligned with the European Open Science Cloud. Grounded in Open Science and FAIR principles (Findable, Accessible, Interoperable, Reusable), FOSSR integrates advanced tools for collecting, managing, and analysing economic and social change data. It also connects IOPP with leading RDIs, such as CESSDA ERIC, SHARE ERIC, and RISIS, and international surveys like GUIDE and GGS, included in the ESFRI Roadmap 2021. IOPP thus represents a crucial step in embedding Italy within the global ecosystem of social science research.
Professor David Richter (SHARE Berlin Institute)
Dr Michael Bergmann (SHARE Berlin Institute)
Ms Theresa Fabel (SHARE Berlin Institute)
Dr Stefan Gruber (SHARE Berlin Institute)
Ms Magdalena Hecher (SHARE Berlin Institute)
Ms Stephanie Stuck (SHARE Berlin Institute) - Presenting Author
International cooperation is part of the daily business of SHARE. Collecting microdata on health, socio-economic status, and social networks of individuals aged 50 or older, it offers researchers a rich "natural laboratory" to explore the challenges of population ageing. SHARE covers all continental EU countries, Switzerland, and Israel, with over 616,000 interviews conducted with more than 160,000 individuals since 2004. SHARE demonstrates how research data infrastructures (RDIs) can facilitate social science research by setting standards and providing comprehensive support across countries as well as across the data lifecycle.
SHARE applies a concept of ex-ante harmonisation to ensure data comparability across borders: for example, a model contract and generic survey instruments are developed and used by all countries involved. Data processing, curation and data releases are conducted in very close cooperation between SHARE Central and SHARE country teams. The regular SHARE data collection is based on computer-assisted personal interviewing (CAPI). In 2020 and 2021, SHARE conducted two additional surveys by phone (CATI) to gain information on the life situation of the 50+ population during the pandemic (SHARE Corona Surveys), showcasing the potential of RDIs to adapt quickly to emerging research needs. Furthermore, SHARE incorporates retrospective life history information (SHARELIFE).
SHARE’s integration into the European research landscape shows how RDIs can network across national borders, develop international standards, and drive innovation. For example, SHARE cooperates with other European studies, such as ESS, GGP or GUIDE, in projects funded by the European Union to exploit synergies, exchange knowledge and develop innovative tools. This contribution explores SHARE as a case study addressing key RDI challenges, including cross-border cooperation and innovative initiatives. SHARE's experiences highlight the value of harmonised, open-access RDIs for advancing social science research.