ESRA logo

Wednesday 17th July 2013, 14:00 - 15:30, Room: No. 7

Data Quality Management in Cross-National Surveys 2

Convenor: Dr Jessica Fortin-Rittberger (GESIS)
Coordinator 1: Dr Christina Eder (GESIS)
Coordinator 2: Dr Manuela Kulick (GESIS)

Session Details

Achieving data of the highest quality is the goal of all survey programmes. While a growing literature offers guidelines on standards for cross-national research, most of the recommendations are directed at the design phase of surveys. Much less attention has so far been paid to the post-data-collection stage.

Information about data processing, from data cleaning to quality checks, is rarely available in detail and is thus not always clear to researchers and analysts. Since little is known about the steps taken in data processing, we have reason to believe that practices differ between survey programmes. This represents a unique opportunity for collaboration. Contrasting different approaches to data cleaning and quality control will help shed light on each approach's particularities and, by the same token, help harness the strengths of each method by comparing their individual advantages and drawbacks. For some survey programmes, this might provide the first steps towards a more systematic approach to data cleaning and quality control.

The proposed session aims to address this important gap by setting out in detail a series of cross-national models of data cleaning and quality control processes, clarifying which procedures are used, and highlighting commonalities and differences. By comparing the methodologies and experiences of various approaches to data management, it will be possible to engage in a discussion and, most importantly, a critical evaluation of the processes employed by each approach: highlighting strengths, finding ways to minimize weaknesses, and suggesting strategies for possible improvements. This discussion could contribute to establishing guidelines of best practice in this field.

While the session is primarily geared towards the process of post-collection data management in cross-national programmes, we also welcome papers that focus on national or subnational surveys.


Paper Details

1. Data coding and harmonization: How DataCoH and Charmstats are transforming social science data

Dr Kristi Winters (GESIS - Leibniz Institute for the Social Sciences)
Mr Martin Friedrichs (GESIS - Leibniz Institute for the Social Sciences)

In 2012 GESIS will launch two electronic resources to assist social researchers. The website DataCoH (Data Coding and Harmonization) will provide a centralized online library of data coding and harmonization for existing variables to increase transparency and variable replication. The software program Charmstats (Coding and Harmonizing Statistics) will provide a structured approach to data harmonization by allowing researchers to: 1) download harmonization examples; 2) document variable coding and harmonization processes; 3) access variables from existing datasets for harmonization; and 4) create harmonization projects for publication and citation. This paper explains DataCoH and Charmstats and demonstrates how they work.
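To illustrate what documenting a variable harmonization can look like in practice, here is a minimal sketch in Python. It is not taken from DataCoH or Charmstats and does not reflect how either tool works; the variable (education level), the source codings and the names `SURVEY_A_TO_TARGET`, `SURVEY_B_TO_TARGET` and `harmonize` are all hypothetical.

```python
# Hypothetical codings of an education variable in two source surveys,
# each mapped explicitly onto one shared target scheme.
SURVEY_A_TO_TARGET = {1: "low", 2: "low", 3: "medium", 4: "high"}
SURVEY_B_TO_TARGET = {"primary": "low", "secondary": "medium", "tertiary": "high"}


def harmonize(values, mapping):
    """Recode source values to the target scheme.

    Values without a documented mapping become None, so they stay
    visible instead of being silently assigned to a category.
    """
    return [mapping.get(value) for value in values]


survey_a = [1, 3, 4, 9]              # 9 is a code with no documented mapping
survey_b = ["primary", "tertiary"]

pooled = harmonize(survey_a, SURVEY_A_TO_TARGET) + harmonize(survey_b, SURVEY_B_TO_TARGET)
print(pooled)
```

Keeping each mapping as explicit data rather than burying it in ad hoc recode statements is one way to make the harmonization step inspectable and repeatable by other researchers.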


2. Manufacturing content? Principles for cross-sectional data processing in the European Social Survey: The case of ESS5

Mr Ole-Petter Øvrebø (Norwegian Social Science Data Services)

In cross-national surveys like the European Social Survey, national resource asymmetries at the post-collection stage represent potential challenges to comparability: the time, personnel and skills allocated to data control and data editing vary from country to country. Moreover, data collectors and data producers have different traditions of processing data. Thus, data deposited to the archive fall short of being homogeneous with respect to control and editing. This is one of the main challenges to the overarching goal of providing end-users with data that are as standardised and harmonised as possible, while reflecting the original reliability and quality of the data.

Ideally, then, in repeated cross-national surveys, processing activities should be integrated both vertically (between surveys) and horizontally (between countries within one survey). This presentation will focus on the horizontally integrated data processing of the European Social Survey: the ESS Data Archive's standardised data processing procedures will be presented in detail, using the previous round of the survey, ESS round 5, as a use case.


3. Do you know your data? Some observations of hidden stumbling stones in comparative analysis

Professor Peter Ph Mohler (Mannheim University & COMPASS)

Modern national and cross-national surveys are complex products that do not lend themselves to shallow analyses. Here are some examples, past and recent. Analysing data from the ESS, this author found a dubious result. The outcomes of separate analyses for men and women were somewhat inconclusive. The situation became clear after checking the data collection mode: respondents had a choice between self-completion and interviewer-assisted completion. Women chose self-completion much more often than men. Thus the inconsistencies were due to a characteristic of these data that was "hidden in the documentation".
Second example: some years ago, researchers presented analyses of developments in East Germany right after the changes of 1989/90. They found a huge difference between the first and the second study (one conducted right after the events and the other one year later). However, the two studies were not comparable because the first used a quota sample design while the second used a full probability design. The presenters were not aware of this because, again, these facts were documented but obviously not noticed by the researchers.
This paper emphasises two issues: a. proper documentation of surveys according to modern documentation standards; and b. increasing users' awareness and knowledge, so that they understand the complexity of surveys and manage the data for their analyses accordingly, for instance by applying appropriate weights, adjusting for mode differences, or applying proper missing data definitions.
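The data management steps named above (weights, mode differences, missing data definitions) can be sketched in a few lines of Python. This is an illustrative assumption rather than anything from the paper: the records, the missing-value codes in `MISSING_CODES`, the field names and the helper `weighted_mean` are all hypothetical, and splitting the estimate by mode is only a first check, not a full mode adjustment.

```python
# Hypothetical missing-value codes as they might appear in a survey codebook.
MISSING_CODES = {77, 88, 99}

# Hypothetical respondent records: a score, a design weight and the collection mode.
records = [
    {"score": 4, "weight": 1.5, "mode": "self"},
    {"score": 2, "weight": 0.5, "mode": "interviewer"},
    {"score": 99, "weight": 1.0, "mode": "self"},   # 99 = missing, not a real score
    {"score": 3, "weight": 1.0, "mode": "interviewer"},
]


def weighted_mean(rows, var="score", weight="weight"):
    """Weighted mean of `var`, excluding rows whose value is a missing code."""
    valid = [row for row in rows if row[var] not in MISSING_CODES]
    total_weight = sum(row[weight] for row in valid)
    if total_weight == 0:
        return None
    return sum(row[var] * row[weight] for row in valid) / total_weight


overall = weighted_mean(records)

# Reporting the estimate separately per mode makes a mode difference visible.
by_mode = {
    mode: weighted_mean([row for row in records if row["mode"] == mode])
    for mode in sorted({row["mode"] for row in records})
}

print(overall, by_mode)
```

Had the code 99 been treated as a valid score, it would have dominated the mean, which is the kind of error that reading the documentation before analysis is meant to prevent.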