Weighting issues in complex cross-sectional and longitudinal surveys 1 |
Convenor | Ms Nicole Watson (University of Melbourne) |
Coordinator 1 | Dr Olena Kaminska (University of Essex) |
Regression analysis of survey data often incorporates the survey weights to allow for a complex sample design. Often the design weights are used for this purpose. Other possibilities include calibrated weights, the product of the design weights and a function of the covariates, smoothed weights given the values of the response and covariates, or unweighted approaches. The statistical and practical properties of these options are evaluated using the New Zealand Health Survey, a complex, multi-stage, dual-frame health survey that oversamples some ethnic groups.
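To make the contrast between weighted and unweighted regression concrete, here is a minimal sketch of a weighted least-squares fit with a single covariate. It is not the authors' method: the function name `weighted_linear_fit` and the toy data are assumptions for illustration only, and the sketch gives point estimates only. Variance estimation under a complex design needs the stratum and cluster information and is not shown.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    total_w = sum(w)
    # Weighted means of the covariate and the response.
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    # Weighted cross-product and sum of squares around the weighted means.
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy data (not from the New Zealand Health Survey).
x = [0.0, 1.0, 2.0]
y = [0.0, 1.0, 4.0]
design_weights = [1.0, 1.0, 2.0]  # assumed: third unit represents twice as many people

unweighted = weighted_linear_fit(x, y, [1.0] * len(x))  # equal weights = ordinary least squares
weighted = weighted_linear_fit(x, y, design_weights)
```

With equal weights the function reduces to ordinary least squares. Swapping in calibrated or smoothed weights changes only the `w` argument, which is why the choice of weights can be compared on the same model.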
Recent integration of representative survey data and genome-wide data provides new opportunities to generate population health estimates. Currently, sampling weights are not applied to genome-wide data and analyses. Using the New Soldier Study and the Pre/Post Deployment Study (both from Army STARRS) and the Health and Retirement Study, we examine the effects of using sampling and non-response weights on generalizability in genome-wide data analyses such as genome-wide association studies (GWAS) and genomic-relatedness-matrix restricted maximum likelihood (GREML). As biological data are often collected under separate consents and are sometimes subsamples of the larger study, these weights are unique and challenging to create.
Data for all countries across 6 rounds were weighted and analysed. An inventory of 23 variables that appear in all rounds was used to compute non-response biases, i.e. the differences between the weighted and the unweighted estimates. Absolute and relative, standardized and non-standardized biases were computed for the selected variables and compared to response rates. The results vary across countries: the larger the non-response rate, the larger the biases. In addition, a sensitivity analysis was performed to assess the effect of variations in the weighting procedure.
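The bias measures described above can be sketched as follows. The abstract defines the bias only as the difference between the weighted and unweighted estimates. The particular definitions of the relative bias (divided by the weighted estimate) and the standardized bias (divided by the unweighted standard deviation) are assumptions made here for illustration, as are the function names and the toy data.

```python
from statistics import pstdev


def weighted_mean(values, weights):
    """Mean of values with each observation scaled by its weight."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def nonresponse_bias(values, weights):
    """Compare the weighted and unweighted means of one variable."""
    unweighted = sum(values) / len(values)
    weighted = weighted_mean(values, weights)
    bias = weighted - unweighted
    return {
        "bias": bias,                             # signed difference, as in the abstract
        "absolute": abs(bias),                    # magnitude only
        "relative": bias / weighted,              # assumed definition; undefined if weighted mean is 0
        "standardized": bias / pstdev(values),    # assumed definition; undefined if values are constant
    }


# Hypothetical responses and final weights for four respondents.
result = nonresponse_bias([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 3.0])
```

Repeating this for each variable and country gives a set of biases that can then be set against the corresponding response rates.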
Longitudinal surveys follow people over time and some deaths will occur during the life of the panel. Through fieldwork efforts, some deaths will be known but others will go unobserved due to sample members no longer being issued to field or having inconclusive fieldwork outcomes. Using the Household, Income and Labour Dynamics in Australia (HILDA) Survey, three methods are used to examine the implications for non-response correction: i) matching to the national death registry; ii) employing life expectancy tables; and iii) modelling deaths as part of the attrition process.
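As a rough illustration of the second method, the probability that a sample member of unknown status has died can be approximated from a life table by chaining annual death probabilities. This is only a sketch of the general idea, not the procedure used with HILDA: the function name, the `qx` lookup and the mortality rates below are hypothetical.

```python
def prob_death_within(age, years, qx):
    """Probability of dying within `years` years, starting at `age`.

    qx maps an integer age to the probability of dying before the next birthday.
    """
    survival = 1.0
    for a in range(age, age + years):
        survival *= 1.0 - qx[a]  # survive each successive year
    return 1.0 - survival


# Hypothetical annual death probabilities (not taken from any real life table).
qx = {70: 0.02, 71: 0.03}

# Chance that a person last observed at age 70 has died two years later.
p_dead = prob_death_within(70, 2, qx)
```

One plausible use, stated here as an assumption, is to discount sample members with unknown outcomes by their estimated probability of death before applying a non-response adjustment, so that people who have died are not treated as non-respondents.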