Panel and Survival Techniques for Complex Survey Data
Chair | Dr Arne Bethmann (German Youth Institute)
Coordinator 1 | Dr Ulrich Pötter (German Youth Institute)
Ignoring the sample selection process when fitting models to survey data can severely affect inference. Some studies address the panel or longitudinal setup, in which repeated observations are collected from individuals selected according to a sampling design, but they are confined to linear correlated models for continuous observations. In this talk, we consider dynamic models for repeated count and multinomial data in a finite population setup and develop likelihood estimating equations based on sampling design weights for estimating the survey population parameters. Properties of the estimators are discussed.
Background
Mobility is a prerequisite to participation in civic life and an important component of quality of life. The future development of mobility limitations will largely depend on modifiable risk factors, including excess weight, smoking and physical inactivity, but also on structural changes in the population, such as ageing and rising levels of education. This study aimed to project the prevalence and number of people with severe mobility limitations up to 2044, based on scenarios for the development of risk factors.
Methods
We applied a multistate model to repeated measures from the Health 2000 and Health 2011 Surveys (BRIF8901), which represent the Finnish population, to account for individual risk factors and their changes over time. Unit nonresponse and sampling variability in the Health 2000 Survey were handled using the weighted bootstrap with poststratification weights. Item nonresponse in the Health 2000 and Health 2011 Surveys was handled using multiple imputation (MI) based on chained equations and regression trees. Projections of both the outcome and the risk factor values were generated using the same MI technique, assuming the same transition probabilities as between 2000 and 2011.
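As a rough illustration of the weighted bootstrap step only, the following Python sketch resamples respondents with probability proportional to their weights and reports a percentile interval for a prevalence. This is one common reading of "weighted bootstrap" and is an assumption, not the authors' implementation; the function name, the toy data and the weights are hypothetical, and the multistate model and the imputation step are not shown.

```python
import random


def weighted_bootstrap_prevalence(outcome, weights, n_boot=500, seed=0):
    """Percentile bootstrap interval for a weighted prevalence.

    Assumption: each bootstrap sample draws respondents with replacement,
    with selection probability proportional to the (poststratification)
    weight, and the prevalence is the plain mean of the resample.
    """
    rng = random.Random(seed)
    n = len(outcome)
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(outcome, weights=weights, k=n)
        estimates.append(sum(resample) / n)
    estimates.sort()
    lower = estimates[int(0.025 * n_boot)]
    upper = estimates[int(0.975 * n_boot) - 1]
    # Full-sample weighted prevalence as the point estimate.
    point = sum(w * y for w, y in zip(weights, outcome)) / sum(weights)
    return point, lower, upper


# Hypothetical toy data: 1 = severe mobility limitation, 0 = none.
_rng = random.Random(1)
toy_outcome = [1 if _rng.random() < 0.2 else 0 for _ in range(300)]
toy_weights = [_rng.uniform(0.5, 2.0) for _ in range(300)]

point, lower, upper = weighted_bootstrap_prevalence(toy_outcome, toy_weights)
print(f"prevalence {point:.3f}, 95% interval ({lower:.3f}, {upper:.3f})")
```

In a full analysis, each bootstrap sample would be imputed and modelled separately before the estimates are combined.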
Results
The number of people with severe mobility limitations was projected to double by the year 2044 in Finland, due to the rapid ageing of the population. Excess weight was the most important modifiable risk factor predicting severe mobility limitations. Eliminating half of the excess weight would reduce the number of persons with severe mobility limitations by one fifth. Reductions in the prevalence of smoking and physical inactivity would have only a small impact on the prevalence of severe mobility limitations. Even if excess weight, smoking and physical inactivity were completely eliminated, the number of persons with severe mobility limitations was projected to increase.
Conclusions
Designing and implementing strategies to promote healthy weight and weight reduction are top priorities for public policy to slow down the rapid increase in mobility limitations due to population ageing. MI using chained equations seemed to be a plausible approach to handling changes over time not only in the outcome but also in the risk factors, which are often categorical and can involve interactions and nonlinearities that are generally time-consuming to model with parametric MI techniques.
In this paper we analyse the difference between variance estimation using bootstrap replicate weights and Taylor series linearization (TSL). We are particularly interested in the impact of calibration on the difference between the two variance estimation techniques. To gauge the size of the effects, we start with a set of weights from an actual study conducted in 2010 by the Bundesbank (“Panel on Household Finance”). This study provides both sample design information and a set of 1,000 bootstrap replicate weights. To assess the impact of calibration, we simulate a large set of study variables that are more or less related to the calibration variables and estimate their variance in Stata using both bootstrap and TSL. We find that the linearization method yields systematically higher variances than the bootstrap method, and that the gap between the two widens as the correlation between the variable under study and the indicators used in the calibration of the weights increases. At very high levels of correlation, the bootstrap replicate method underestimates the true variance, while the linearization method always yields a conservative estimate.
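To make the two estimators concrete, here is a minimal Python sketch; it is not the authors' Stata analysis. It assumes a single-stage with-replacement design in which each unit is its own PSU, builds replicate weights with a Rao-Wu style rescaled bootstrap, and leaves out calibration entirely, so it does not reproduce the gap described above. All function names and the toy data are hypothetical.

```python
import random


def weighted_mean(y, w):
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


def linearization_variance(y, w):
    """Taylor series linearization variance of the weighted mean.

    Assumption: single stratum, with-replacement sampling, one unit per PSU.
    """
    n = len(y)
    total_w = sum(w)
    m = weighted_mean(y, w)
    # Linearized scores of the ratio estimator sum(w*y) / sum(w).
    z = [wi * (yi - m) / total_w for wi, yi in zip(w, y)]
    z_bar = sum(z) / n
    return n / (n - 1) * sum((zi - z_bar) ** 2 for zi in z)


def make_bootstrap_weights(w, n_reps, rng):
    """Rescaled bootstrap: draw n-1 units with replacement per replicate."""
    n = len(w)
    replicates = []
    for _ in range(n_reps):
        counts = [0] * n
        for _ in range(n - 1):
            counts[rng.randrange(n)] += 1
        replicates.append([wi * n / (n - 1) * c for wi, c in zip(w, counts)])
    return replicates


def replicate_variance(y, w, replicate_weights):
    """Mean squared deviation of replicate estimates from the full estimate."""
    theta = weighted_mean(y, w)
    reps = [weighted_mean(y, rw) for rw in replicate_weights]
    return sum((r - theta) ** 2 for r in reps) / len(reps)


# Hypothetical toy data: 200 units with uncalibrated weights.
rng = random.Random(42)
y = [rng.gauss(50.0, 10.0) for _ in range(200)]
w = [rng.uniform(1.0, 3.0) for _ in range(200)]

var_tsl = linearization_variance(y, w)
var_boot = replicate_variance(y, w, make_bootstrap_weights(w, 500, rng))
print(f"TSL variance: {var_tsl:.4f}, bootstrap variance: {var_boot:.4f}")
```

Without calibration the two estimates should be close. Studying the effect described in the abstract would require recalibrating each set of replicate weights and using calibration residuals in the linearization.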