Data Quality Management in Cross-National Surveys 1
Convenor: Dr Jessica Fortin-Rittberger (GESIS)
Coordinator 1: Dr Christina Eder (GESIS)
Coordinator 2: Dr Manuela Kulick (GESIS)
Achieving data of the highest quality is the goal of all survey programmes. While a growing literature offers guidelines on standards for cross-national research, most of the recommendations are directed at the design phase of surveys. Much less attention has so far been paid to the post-data-collection stage.
Information about data processing, from data cleaning to quality checks, is rarely available in detail and is thus not always clear to researchers and analysts. Since little is known about the steps taken in data processing, there is reason to believe that practices differ between survey programmes. This represents a unique opportunity for collaboration. Contrasting different approaches to data cleaning and quality control will help shed light on each approach's particularities and, by the same token, help harness the strengths of each method by comparing their individual advantages and drawbacks. For some survey programmes, this might provide the first steps towards a more systematic approach to data cleaning and quality control.
The proposed session aims to fill this important gap by presenting in detail a series of cross-national models of data cleaning and quality control processes, clarifying which procedures are used, and highlighting commonalities and differences. Comparing the methodologies and experiences of various approaches to data management will make it possible to engage in a discussion and, most importantly, a critical evaluation of the processes employed by each approach: highlighting strengths, finding ways to minimize weaknesses, and suggesting strategies for possible improvements. This discussion could contribute to establishing guidelines of best practice in this field.
While the session is primarily geared towards the process of post-collection data management in cross-national programmes, we also welcome papers that focus on national or subnational surveys.
Principles, standardised routines and programs are cornerstones in the production of accurate, harmonised and standardised files in cross-national surveys. For repeated surveys, comparability across time is an additional concern.
The ESS is now in its 6th wave and has experience with several of the challenges that processing data and metadata over time poses for consistency. One major challenge in this respect is changing societies: the evolution of educational programmes, occupational structures, political parties etc., and how these changes affect comparability over time. Another challenge is changes in the research instruments: question wording, answer scales etc.
As the official data archive for the ESS, we would like to share and discuss our experiences with these kinds of challenges with other survey programmes, users of survey data, and data managers alike.
The presentation will highlight the importance of consistent metadata, outline some of the issues relating to "variable genealogy" (the evolution of questions and variables), and illustrate how some of these challenges have been dealt with in the ESS. For some of these issues, the ESS Data Download Wizard will be used to illustrate how metadata applied in data management can also be reused for the benefit of end users.
Established in the early 1990s, the Comparative Study of Electoral Systems (CSES) is a cross-national research program whose design, among other things, allows research on the impact of varying electoral systems on individual behavior, especially voting and turnout. Every five years, a Planning Committee selected from among the CSES collaborators develops a questionnaire to address a specific substantive theme. This questionnaire module is included in post-election surveys from over sixty countries around the world. The micro-level surveys are deposited with a central Secretariat and merged together, along with macro-level electoral system variables, into a single file for comparative analysis. However, before the data are released, they undergo at least two rounds of processing by the CSES Secretariat, which include a series of quality checks performed on each dataset as well as on the merged cross-national dataset. In this presentation, the data processing and quality checks for the CSES will be described, and changes and improvements in the process over time will be identified.
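As a rough illustration of the kind of workflow described above, the following minimal sketch merges micro-level respondent records with macro-level variables and flags a few common problems. It does not reproduce the CSES Secretariat's actual procedures: the country codes, variable names, codebook values and the `check_and_merge` helper are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical micro-level records from two national post-election studies.
micro = [
    {"country": "A", "resp_id": 1, "turnout": 1},
    {"country": "A", "resp_id": 2, "turnout": 0},
    {"country": "A", "resp_id": 2, "turnout": 1},  # duplicate respondent ID
    {"country": "B", "resp_id": 1, "turnout": 7},  # code not in the codebook
]

# Hypothetical macro-level electoral system variables, one entry per country.
macro = {
    "A": {"electoral_formula": "proportional"},
    "B": {"electoral_formula": "majoritarian"},
}

# Assumed codebook for the turnout item: 0 = no, 1 = yes, 9 = missing.
VALID_TURNOUT = {0, 1, 9}


def check_and_merge(micro, macro):
    """Merge micro and macro data; return (merged_rows, problems)."""
    problems = []

    # Check 1: respondent IDs must be unique within each country.
    id_counts = Counter((r["country"], r["resp_id"]) for r in micro)
    for key, n in id_counts.items():
        if n > 1:
            problems.append(("duplicate_id", key))

    merged = []
    for r in micro:
        key = (r["country"], r["resp_id"])
        # Check 2: every value must be a documented code.
        if r["turnout"] not in VALID_TURNOUT:
            problems.append(("invalid_code", key))
        # Check 3: every respondent must match a macro-level record.
        if r["country"] not in macro:
            problems.append(("no_macro_data", key))
            continue
        merged.append({**r, **macro[r["country"]]})
    return merged, problems


merged, problems = check_and_merge(micro, macro)
for kind, key in problems:
    print(kind, key)
```

In this sketch the problems are only reported, not corrected; in practice, how a flagged case is resolved would depend on each programme's own documentation and rules.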
In the last three years the set of ISSP background variables underwent a profound revision to improve data quality. One of the changes is the category "civil partnerships" in the variable measuring marital status. Civil partnerships, in internationally varying forms, have become a common phenomenon across the world, but are still ignored by some international surveys.
Now that the ISSP has collected information on civil partnerships since 2010, it is possible to analyze its value for data quality. Comparisons across international survey data and metadata from ISSP, ESS and EVS shed some light on questions such as: What phenomena can be measured in the individual countries due to national legal frameworks? Are the same phenomena measured the same way across surveys? If this is not the case, what could be the reasons and how can we change that? Do different wordings generate different data outcomes? And last but not least: do the sheer case numbers justify the effort?
Comparative analyses of respondents' attitudes reveal that in certain respects it makes a great difference whether respondents are coded as "in a civil partnership" or as "single", the category in which these respondents ended up before and still do in some surveys. For example, in terms of happiness or subjective general health they show more similarities with married people. By means of a few examples the presentation will show that inadequate coding weakens correlations between items and distorts analysis results.
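The attenuation effect described above can be illustrated with a small sketch. The data below are entirely synthetic and are not taken from the ISSP, ESS or EVS; the status labels, the 0-10 happiness scale and the two recoding functions are assumptions made for illustration only. The sketch assumes, as the abstract suggests, that respondents in civil partnerships resemble married respondents.

```python
from statistics import mean

# Synthetic respondents: (legal status, happiness on an assumed 0-10 scale).
respondents = (
    [("married", 8)] * 6
    + [("civil_partnership", 8)] * 2
    + [("single", 6)] * 4
)


def happiness_gap(rows, recode):
    """Mean happiness of 'partnered' minus mean happiness of 'single'."""
    partnered = [h for status, h in rows if recode(status) == "partnered"]
    single = [h for status, h in rows if recode(status) == "single"]
    return mean(partnered) - mean(single)


def adequate(status):
    # Civil partners are grouped with married respondents.
    return "single" if status == "single" else "partnered"


def inadequate(status):
    # Civil partners end up in the "single" category.
    return "partnered" if status == "married" else "single"


gap_adequate = happiness_gap(respondents, adequate)
gap_inadequate = happiness_gap(respondents, inadequate)
print(gap_adequate, round(gap_inadequate, 2))
```

With these made-up numbers, coding civil partners as single raises the mean of the "single" group and so shrinks the observed difference between the two groups, even though no respondent's answer has changed.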