Wednesday 17th July 2013, 09:00 - 10:30, Room: No. 16

Research Data Management for Re-use: Bringing Researchers and Archivists closer 1

Convenor: Dr Alexia Katsanidou (GESIS - Leibniz Institute for the Social Sciences)
Coordinator 1: Mr Laurence Horton (GESIS - Leibniz Institute for the Social Sciences)
Coordinator 2: Dr Christina Eder (GESIS - Leibniz Institute for the Social Sciences)

Session Details

Research data management includes organizing, documenting and validating data to produce long-term re-usable data. Good research data management practice fulfills the requirement of King et al. (1994: 8) that social science "procedures are public", so that quality can be verified and replication permitted. Accordingly, the research community requires access to data and contextual documentation to the fullest extent possible. Funding bodies impose archiving requirements on researchers, and data archives establish standards and procedures to ensure data are preservable, discoverable, and comprehensible. Following these practices, large survey programs increasingly make data management plans, collect metadata and document every stage of research, from conception to analysis.

However, in practice surveys can be inadequately documented: miscommunication between researchers and archivists results in poor planning, which brings time and resource pressures and leads to poor-quality data. Data and contextual information can remain hidden and vulnerable: stored on researchers' hard drives or websites, metadata may be incomplete or non-existent, variable and value labels may be cryptic, and do- or syntax-files nowhere to be found. Thus, despite suggestions and standards, effective implementation of data management plans (where they exist at all) remains unfulfilled.

This session brings together two audiences: researchers designing and/or implementing data management plans in survey research, and archivists involved in the digital preservation and dissemination of survey data. It is a forum to discuss and evaluate approaches to research data management, promote a common understanding of the problems encountered, and consider the means to the end product: reusable data. We have already received expressions of interest in participating from at least seven researchers and archivists from relevant institutions.

We welcome papers from data creators, principal investigators or data managers dealing with theoretical, methodological, and practical problems in research data management in cross-sectional, repeated or longitudinal surveys, as well as papers from archive personnel.


Paper Details

1. From the survey design to the data archive - supporting the survey workflow with open standards and tools

Mr Ingo Barkow (DIPF)
Mr David Schiller (IAB)

A typical challenge for research infrastructures and data archives is the ingest of foreign datasets and metadata from data producers into their own systems. As data producers are not rewarded for preparing their data for archiving, the ingest process is often time-consuming and tied to manual work. This paper shows how the survey process, from creating a study from scratch, designing the instruments, performing the data collection, handling the administrative processes, curating the data, disseminating the data and publication, through to data archiving for secondary use, can be handled with open standards connecting the individual tools. As a result, metadata is produced during the process without any further effort, making the handling of research data much easier for the data producer and the data archive. This session will show the researcher's perspective and demands on such a tool chain from the methodology side, as well as a set of available tools (e.g. Rogatus, Colectica) and standards (e.g. DDI Lifecycle, SDMX) which can be used to put this process into practice. Furthermore, it will also show this process for different kinds of studies (e.g. longitudinal studies and panels) as well as methods (e.g. qualitative studies or mixed mode).


2. Inclusion of Data Archives in the Data Management Plan

Ms Irena Vipavc Brvar (Slovene Social Science Data Archives, Faculty of Social Sciences, UL, Slovenia)

In past years it was common for data archives not to be involved in the data collection process; their involvement began only at data distribution and long-term preservation. This resulted in additional problems, tensions and extra work on both sides, for archives and researchers alike.

Recently, many funders have begun asking researchers to prepare a Data Management Plan, and this is becoming an important professional standard. A Data Management Plan is a scheme that, among other things, regulates the process of handing over research data to data archives. Not only archives but also researchers themselves benefit from good data management. When research data are well organized, documented and accessible, they can easily be reused, which adds value to surveys and raises the visibility of researchers' work.

In this presentation we will discuss the inclusion of data archives in the data management plan and in data collection itself. We will present issues that we have encountered in our everyday work and highlight the parts of a data management plan where extra care is needed. As a case study we will present a survey whose project proposal already included depositing the data in a data archive. Researchers should be aware that preparing materials for distribution and long-term preservation does not take much additional time or incur additional costs if the whole process of data collection is properly managed and planned ahead.



3. Integrating User Feedback to Improve Data Management

Mr Stefan Friedhoff (Bielefeld University)

Due to the demands of progressively more sophisticated data management, many researchers face problems when adapting existing data management strategies to their own research processes. Previous research in this field was conducted mainly from an archive's point of view, lacking real-world adaptation and inclusion of the researcher's view. The INF project (Information and Data Infrastructure), which assists data documentation across 17 projects within a Collaborative Research Center (SFB882) in the field of the social sciences, identified three main problems for implementing data management strategies: (1) methodological problems, (2) acceptance problems and (3) problems of granularity. Based on researcher interviews, focus groups, and surveys, we were able to identify specific problems in these areas and to develop both technical and methodological solutions. In this presentation we show to what extent documentation can be standardized in a research center holding heterogeneous data, and where it becomes necessary to adapt specific solutions to overcome methodological differences. This must be done while still maintaining a high degree of standardization between the projects. Thus, we will systematically present the problems that emerged, as well as the strategies required to deal with them and to make them transferable in a generalized way to other research projects.


4. Academic Writing and Data Management

Dr Alexia Katsanidou (GESIS-Leibniz Institute for the Social Sciences)
Laurence Horton (GESIS-Leibniz Institute for the Social Sciences)
Uwe Jensen (GESIS-Leibniz Institute for the Social Sciences)

Publishing in top journals is demanding and requires the highest levels of clarity. Empirical papers traditionally have a methodology section explaining to the reader the exact steps taken by the authors when operationalizing and testing their hypotheses. The role of this section is to ensure the replicability of the results presented in the paper, allowing the scientific community to monitor the quality, challenge the findings and thus promote good scientific practice. The information included in this section starts with the operationalization of dependent and independent variables, the way they were measured and the possible re-codings they underwent. It then moves on to explaining the statistical methods used and the rationale behind them.
The quality of the methodology sections of papers using secondary data depends heavily on the quality of the data documentation provided with the data used. Following the principle “garbage in, garbage out”, we test whether the formal standards needed for a good methodology section follow the standards for high-quality data documentation. We do this by examining a set of published papers using datasets with state-of-the-art documentation and a set of papers using badly documented data. We find striking differences, demonstrating that good data management is an integral part of good science.