Linking survey data and auxiliary data sources: statistical aspects and substantive applications 1
Convenor | Ms Chiara Peroni (STATEC) |
Coordinator 1 | Mr Francesco Sarracino (STATEC, HSE-LCSR) |
Coordinator 2 | Mr Wladimir Raymond (STATEC) |
EU-SILC in Austria is a voluntary sample survey on income and living conditions. From EU-SILC 2012 onwards, income registers have been used to collect information on most components of household income by linking administrative data to the households in the sample. The presence of more comprehensive administrative data in the EU-SILC sample also makes it possible to use this information in the course of the weighting procedure. Changes to the weighting procedure due to administrative data, especially in terms of calibration, as well as a unit nonresponse analysis based on income register data, will be presented.
Information from different survey sources can serve as auxiliary information for one another to produce composite estimates of increased accuracy. Such auxiliary sources may be other surveys, subsamples of a single survey, past data from a repeated survey, or administrative data. Opportunities to link survey data for improved estimation are increasing in contemporary survey practice. For each case of linked survey micro-data, auxiliary information can be incorporated into the sample weighting structure by a suitable weight calibration scheme, which is equivalent to a regression procedure based on the principle of best linear unbiased estimation.
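As an illustration of the calibration idea described above, the following is a minimal sketch of linear calibration, in which design weights are adjusted so that weighted sample totals of the auxiliary variables match known totals. It is not the scheme presented in the paper. The function name `linear_calibration`, the toy data, and the benchmark totals are all assumptions made for this example.

```python
import numpy as np

def linear_calibration(d, X, totals):
    """Return weights w = d * (1 + X @ lam) such that X.T @ w equals totals.

    d      : design weights, shape (n,)
    X      : auxiliary variables for the sample, shape (n, p)
    totals : known totals of the auxiliary variables, shape (p,)
    """
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    totals = np.asarray(totals, dtype=float)
    # Weighted cross-product matrix: sum_i d_i * x_i x_i^T
    M = X.T @ (d[:, None] * X)
    # Gap between the known totals and the design-weighted estimates
    gap = totals - X.T @ d
    # Solve M @ lam = gap for the calibration multipliers
    lam = np.linalg.solve(M, gap)
    return d * (1.0 + X @ lam)

# Toy sample (hypothetical): intercept, a binary indicator, a continuous variable
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    np.ones(n),
    rng.integers(0, 2, n),
    rng.normal(40.0, 10.0, n),
])
d = np.full(n, 50.0)                            # equal design weights
totals = np.array([10000.0, 5200.0, 410000.0])  # hypothetical known totals

w = linear_calibration(d, X, totals)
```

After calibration, `X.T @ w` reproduces `totals` up to floating-point error. Note that linear calibration can yield negative weights when the gap is large; bounded variants exist but are not shown here.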
This paper addresses how to link survey data with auxiliary sources. Currently, no data set incorporates individual, coethnic community, and group level factors. I construct coethnic community and group variables from different data sources and append them to individuals in nationally representative surveys for the US, Canada, and the UK. My approach for obtaining data is similar in all three countries: (a) individual data are retrieved from nationally representative surveys; (b) coethnic community data are created for small geographic areas using aggregated survey or Census data; and (c) group characteristics are coded from public sources (e.g., World Bank,
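The three-level linkage described in this abstract can be sketched with table joins: community-level variables are attached by area and group, and group-level variables by group alone. This is only an illustrative sketch. All column names, identifiers, and values below are hypothetical and are not taken from the paper's data.

```python
import pandas as pd

# Hypothetical individual-level survey records
individuals = pd.DataFrame({
    "person_id": [1, 2, 3, 4],
    "area_code": ["A01", "A01", "B07", "C12"],
    "origin_group": ["G1", "G2", "G1", "G3"],
})

# Hypothetical coethnic community measures, aggregated by small area and group
community = pd.DataFrame({
    "area_code": ["A01", "A01", "B07"],
    "origin_group": ["G1", "G2", "G1"],
    "coethnic_share": [0.12, 0.03, 0.25],
})

# Hypothetical group-level characteristics coded from public sources
groups = pd.DataFrame({
    "origin_group": ["G1", "G2", "G3"],
    "origin_gdp_per_capita": [1500.0, 9800.0, 4200.0],
})

# Left joins keep every individual; validate guards against duplicate keys
# in the auxiliary tables, which would silently duplicate survey records.
linked = (
    individuals
    .merge(community, on=["area_code", "origin_group"],
           how="left", validate="many_to_one")
    .merge(groups, on="origin_group", how="left", validate="many_to_one")
)
```

Individuals whose area and group have no community record (person 4 here) keep a missing `coethnic_share` rather than being dropped, so unmatched cases stay visible for later inspection.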
The Austrian Microcensus, which incorporates the Austrian Labour Force Survey, is one of the most important sources for labour market statistics. Although the non-response rate is only 5.7% (in 2013), this non-response results in a small bias in estimates of employment status. By incorporating the employment status from administrative sources into the weighting scheme of the Austrian Microcensus, the bias was reduced.
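One simple way to bring a register variable into a weighting scheme is post-stratification: respondent weights are rescaled within each register class so that they sum to the class total known from the register. The sketch below shows that idea only and is not the Microcensus weighting procedure. The column names and all figures are hypothetical.

```python
import pandas as pd

# Hypothetical respondents with design weights and register employment status
respondents = pd.DataFrame({
    "design_weight": [100.0, 100.0, 100.0, 100.0, 100.0],
    "register_status": ["employed", "employed", "employed",
                        "unemployed", "inactive"],
})

# Hypothetical population totals per status, taken from the register
register_totals = pd.Series(
    {"employed": 360.0, "unemployed": 60.0, "inactive": 180.0}
)

# Sum of design weights among respondents within each register class
class_sums = (
    respondents.groupby("register_status")["design_weight"].transform("sum")
)

# Scale each weight so that its class sums to the register total
factor = respondents["register_status"].map(register_totals) / class_sums
respondents["calibrated_weight"] = respondents["design_weight"] * factor
```

If non-response differs by register status, the classes are rescaled by different factors, which is how the adjustment can reduce the bias described above.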