Wednesday 17th July 2013, 11:00 - 12:30, Room: No. 22

Weighting issues in panel surveys 1

Convenor Ms Nicole Watson (University of Melbourne)
Coordinator 1 Dr Olena Kaminska (University of Essex)

Session Details

A range of issues arise when constructing weights for longitudinal surveys. In addition to the issues typically faced in cross-sectional surveys, we also need to consider how populations are defined over time, how to treat deaths and other out-of-scopes and how to best adjust for attrition. Household panel surveys have additional layers of complexity due to changes in the household structure, such as births, deaths, household splits, household mergers, and people moving into or out of the household. Further, when refreshment or top-up samples are added to household panel surveys, how should the additional sample be incorporated into the weights for researchers who want to use the combined sample?

This session seeks to bring together survey methodologists who are involved in constructing weights for panel surveys to explore the approaches taken for longitudinal weights and (where relevant) cross-sectional weights. Some discussion of this topic has occurred in the Panel Survey Methods Workshops in recent years, but we are looking to broaden the scope of these discussions.

Papers submitted to this session might include comparisons of alternative methods, analysis of the impact of particular components of the weights, or suggestions for new methods.


Paper Details

1. Generating stand-alone weights for a sub-population with time-varying characteristics: a SOEP case study

Dr Mathis Schroeder (DIW, Berlin)
Mr Rainer Siegers (DIW, Berlin)

The project "Familien in Deutschland" (FiD) is a longitudinal panel study related to the "Socio-economic Panel" (SOEP). FiD's main purpose is to provide researchers with data on specific sub-populations: low-income families, large families, single parents and families with young children. In total, there are three samples: two drawn in 2010, one in 2011. Our weighting approach allows analyzing FiD in combination with the SOEP or as a stand-alone dataset.
The construction of FiD-SOEP-integrated weights follows the procedure used when integrating new samples into the SOEP: FiD and SOEP are combined with a joint calibration step based on marginal distributions obtained from German population statistics. Using this procedure to create cross-sectional weights is feasible in 2010, but not for the integrated FiD-samples in 2011: because our sample is defined by characteristics in 2010, any marginal distribution of a following year needs to be restricted to a population defined by its characteristics in 2010.
We propose a three-step procedure: firstly, we integrate all FiD-cases into the SOEP. Secondly, within this combined sample, we define which cases belong to the "FiD-like" population to estimate the sub-population total in 2011. Thirdly, a logit model is used to estimate the probabilities of being a FiD case, which in combination with the integrated weights lead to the cross-sectional weights for the FiD sample.
The presentation will provide details on the procedures sketched above as well as possible alternatives to our proposed solution.
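
As an illustration of the kind of calibration to marginal distributions mentioned above, the following is a minimal sketch of raking (iterative proportional fitting) on two margins. It is a generic textbook technique, not the procedure actually used for FiD or SOEP; the function name `rake`, the variables and the target totals are all hypothetical.

```python
def rake(weights, rows, cols, row_targets, col_targets, iterations=50):
    """Scale unit weights until weighted totals match two sets of marginal targets."""
    w = list(weights)
    for _ in range(iterations):
        # Adjust to the first margin, then to the second, and repeat.
        for labels, targets in ((rows, row_targets), (cols, col_targets)):
            totals = {}
            for wi, lab in zip(w, labels):
                totals[lab] = totals.get(lab, 0.0) + wi
            w = [wi * targets[lab] / totals[lab] for wi, lab in zip(w, labels)]
    return w


# Hypothetical toy sample: four units, each starting with weight 1.
sex = ["f", "f", "m", "m"]
region = ["east", "west", "east", "west"]
calibrated = rake(
    [1.0, 1.0, 1.0, 1.0],
    sex,
    region,
    row_targets={"f": 60.0, "m": 40.0},      # assumed population totals
    col_targets={"east": 30.0, "west": 70.0},
)
```

The two sets of targets must sum to the same population total, and every category needs at least one sample unit, otherwise the scaling factor is undefined.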


2. Handling attrition and non-response in the 1970 British Birth Cohort Study (BCS70)

Dr Tarek Mostafa (Institute of Education - University of London)
Dr Richard Wiggins (Institute of Education - University of London)

The 1970 British Birth Cohort Study (BCS70) is a continuing multi-purpose, multidisciplinary longitudinal study based on a sample of over 17,000 babies born in England, Wales and Scotland. The study has collected detailed information from cohort members on various aspects of their family circumstances at birth, their education, employment, housing and partnership histories over eight sweeps of data collection at ages 5, 10, 16, 26, 30, 34, 38 and, most recently, 42 (2012). The paper consists of three successive analyses. First, we analyse attrition patterns in the BCS70 data in relation to key childhood characteristics and trace the evolution of the sample over time. The main objective is to ascertain whether the sample is losing respondents of a particular type. Second, we model non-response for each wave using birth characteristics as explanatory variables and generate non-response weights from logit response models. Third, using a simulation study, we illustrate the impact of different methods of tackling non-response on the efficiency of statistical inference. The methods we compare are deletion, weighting and multiple imputation.
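
To make the idea of non-response weights from a logit response model concrete, here is a minimal sketch that assumes a single binary covariate and fits the model by plain gradient ascent. The data, the covariate and the function name `fit_logit` are hypothetical and are not taken from BCS70; a real application would use many birth characteristics and a statistical package.

```python
import math


def fit_logit(x, y, steps=5000, lr=0.5):
    """Fit P(y=1) = 1 / (1 + exp(-(a + b*x))) by gradient ascent on the mean log-likelihood."""
    a = b = 0.0
    n = len(x)
    for _ in range(steps):
        p = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        a += lr * sum(yi - pi for yi, pi in zip(y, p)) / n
        b += lr * sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p)) / n
    return a, b


# Hypothetical wave: 10 cohort members per group; 8 respond in group 0, 5 in group 1.
x = [0] * 10 + [1] * 10
responded = [1] * 8 + [0] * 2 + [1] * 5 + [0] * 5

a, b = fit_logit(x, responded)
propensity = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]

# Respondents are weighted by the inverse of their estimated response probability.
nr_weights = [1.0 / p for p, r in zip(propensity, responded) if r == 1]
```

With one binary covariate, the fitted probabilities reproduce the group response rates (0.8 and 0.5), so respondents from the group with more attrition receive the larger weight and the weights sum back to the original sample size of 20.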



3. From PISA to LSAY: Weighting the Australian Longitudinal Surveys of Australian Youth

Mr Patrick Lim (National Centre for Vocational Education Research)

The Longitudinal Surveys of Australian Youth (LSAY) track 15-year-olds as they move from school into further study, work and other destinations.

The 2003 and 2006 LSAY cohorts are derived from the 2003 and 2006 Programme for International Student Assessment (PISA). PISA uses a stratified sampling scheme to sample individuals, with sample weights created in PISA to ensure that the resultant sample represents the underlying population of 15-year-olds. The longitudinal nature of LSAY means that individuals drop out of the sample over time, and different groups drop out at different rates. This differential attrition can lead to bias in estimates and analyses based on the longitudinal data. One way to reduce this bias is for data analysts to use weights when undertaking analysis or producing tables from LSAY.

The original sample weights must therefore be adjusted to account for differential attrition to ensure that the LSAY sample in each wave continues to represent the underlying population.

This presentation outlines the methodology used in deriving longitudinal and sample weights for the Longitudinal Surveys of Australian Youth.
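
One simple way to adjust original sample weights for differential attrition is a weighting-class adjustment, sketched below. This is an illustrative assumption and not necessarily the LSAY methodology; the function name `attrition_adjust`, the classes and the weights are hypothetical.

```python
from collections import defaultdict


def attrition_adjust(base_weights, classes, responded):
    """Inflate respondents' base weights so each class keeps its original weighted total.

    Assumes every class has at least one respondent.
    """
    total = defaultdict(float)
    resp = defaultdict(float)
    for w, c, r in zip(base_weights, classes, responded):
        total[c] += w
        if r:
            resp[c] += w
    return [
        w * total[c] / resp[c] if r else 0.0
        for w, c, r in zip(base_weights, classes, responded)
    ]


# Hypothetical wave: class A loses half its members, class B loses none.
base = [100.0, 100.0, 100.0, 100.0, 50.0, 50.0]
group = ["A", "A", "A", "A", "B", "B"]
stayed = [True, True, False, False, True, True]
adjusted = attrition_adjust(base, group, stayed)
```

Remaining class A members have their weights doubled, class B weights are unchanged, and the adjusted weights still sum to the original weighted population of 500.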



4. What is the extent and impact of population dynamics we desperately try to adjust for?

Mr Tobias Gummer (GESIS - Leibniz-Institute for the Social Sciences)

Attrition due to unit nonresponse is a crucial problem in panel surveys. Constructing longitudinal weights is one strategy to tackle this and to adjust for the resulting bias. Driven by birth, death and migration, the composition of a population changes over time. As a result of these population dynamics, longitudinal weights have to take cross-sectional adjustment into account. Several panel surveys use refreshment samples to address population dynamics through the sample design.
While there are different approaches to dealing with population dynamics, e.g. weighting and refreshment samples, little has been written about the extent and impact of these dynamics. Given the methodological discussion on accounting for dynamics through such strategies, the paper addresses this shortcoming using electoral populations in Germany as an example.
The paper applies two different methods. First, official population statistics are used to estimate the extent of population change. Second, cross-sectional datasets are pooled to estimate the impact of population change on specific variables, e.g. political interest. To this end, decomposition methods are used to disentangle within-cohort and between-cohort change. In this way, the impact of birth and death is determined and conclusions are drawn on the need to adjust and correct for population dynamics.
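
The abstract does not say which decomposition method is used. As one possible illustration, the sketch below applies a simple algebraic (Kitagawa-style) decomposition that splits the change in an overall mean into a part due to change within cohorts and a part due to shifts in cohort composition. The function name `decompose_change` and all figures are hypothetical.

```python
def decompose_change(shares_t1, means_t1, shares_t2, means_t2):
    """Split the change in an overall mean into within- and between-cohort parts.

    Assumes the same cohorts are present at both time points.
    """
    within = 0.0   # cohort means change, weighted by the average cohort share
    between = 0.0  # cohort shares change, weighted by the average cohort mean
    for cohort in shares_t1:
        avg_share = (shares_t1[cohort] + shares_t2[cohort]) / 2
        avg_mean = (means_t1[cohort] + means_t2[cohort]) / 2
        within += avg_share * (means_t2[cohort] - means_t1[cohort])
        between += avg_mean * (shares_t2[cohort] - shares_t1[cohort])
    return within, between


# Hypothetical example: mean political interest by cohort at two time points.
within, between = decompose_change(
    shares_t1={"older": 0.6, "younger": 0.4},
    means_t1={"older": 3.0, "younger": 2.0},
    shares_t2={"older": 0.5, "younger": 0.5},
    means_t2={"older": 3.2, "younger": 2.0},
)
```

In this toy case the overall mean is 2.6 at both time points, yet the within-cohort component is about +0.11 and the composition component about -0.11, which shows how population turnover can mask real change.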


5. Evaluation of weighting methods to integrate a new top-up sample with an ongoing longitudinal sample

Ms Nicole Watson (University of Melbourne)

Long-running household-based panels sometimes top up their samples over time. This may be done for several reasons, including to increase the sample size, to improve coverage of the population, or to target specific groups in the population. In 2011, a general top-up sample was added to the Household, Income and Labour Dynamics in Australia (HILDA) Survey, a decade after the original sample was selected. The top-up sample added 2150 responding households to the main sample of 7400 responding households, representing nearly a 30 per cent increase in the overall sample size.

While the main HILDA sample and the top-up sample were selected via similar area-based designs, the main sample has evolved over time as household structures change due to births, deaths, splits and mergers. For the most part, the main sample can mimic the changes to the population but not with respect to immigration. As a result, the populations that these two samples now represent overlap to a large degree but not completely.

Drawing on the experience of other large household-based panels, this paper evaluates four options for integrating the HILDA samples. The evaluation considers the variability in the weights, the root mean square error of a range of key estimates, and a comparison to estimates from two large-scale national cross-sectional surveys.
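
Two of the evaluation criteria named above can be summarised with standard formulas. The sketch below uses Kish's approximate design effect as one common measure of weight variability, and the usual root mean square error computed from bias and variance. Whether the paper uses exactly these measures is an assumption, and the function names are hypothetical.

```python
import math


def kish_deff(weights):
    """Kish's approximate design effect from unequal weights: n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


def rmse(bias, variance):
    """Root mean square error of an estimator with the given bias and variance."""
    return math.sqrt(bias ** 2 + variance)


equal = kish_deff([1.0, 1.0, 1.0, 1.0])   # equal weights give no variance inflation
unequal = kish_deff([1.0, 3.0])           # unequal weights inflate the variance
error = rmse(bias=3.0, variance=16.0)
```

A design effect of 1.0 means the weights do not inflate variance at all, and larger values indicate more variable weights. An integration option with lower weight variability may still have a higher RMSE if it introduces bias.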