Robust Methods in Survey Design and Analysis with Applications |
Convenor | Dr Marco Geraci (University of South Carolina) |
Coordinator 1 | Dr James Hardin (University of South Carolina) |
Coordinator 2 | Dr Andrew Ortaglia (University of South Carolina) |
The National Survey of College Graduates (NSCG) is conducted by the US Census Bureau on behalf of the National Science Foundation. The NSCG's primary focus is on the science and engineering workforce. Frame information for sampling is obtained from the American Community Survey (ACS). A graduate may fall in an ACS stratum of relatively low importance while belonging to an important NSCG domain. The high sampling weight of such a "stratum jumper" leads to unstable point estimates and variance estimates. Options for addressing such stratum jumpers at both the design and estimation stages are considered, theoretically and via a detailed simulation.
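To illustrate why a single stratum jumper can destabilize estimates, the following minimal sketch computes Kish's approximate design effect due to unequal weighting, n Σw² / (Σw)². The weights are invented for illustration and are not NSCG data, and the function name `kish_deff` is hypothetical. It does not represent any of the design or estimation options examined in the paper.

```python
def kish_deff(weights):
    """Kish's approximate design effect from unequal weighting:
    n * sum(w^2) / (sum(w))^2. Equal weights give exactly 1."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


# Illustrative weights only (assumption, not NSCG data):
# 99 graduates with weight 10 and one "stratum jumper" with weight 500.
equal = [10.0] * 100
with_jumper = [10.0] * 99 + [500.0]

deff_equal = kish_deff(equal)
deff_jumper = kish_deff(with_jumper)

# Approximate effective sample size once the jumper is included.
n_eff = len(with_jumper) / deff_jumper

print(f"deff without jumper: {deff_equal:.2f}")
print(f"deff with jumper:    {deff_jumper:.2f}")
print(f"effective n:         {n_eff:.1f}")
```

In this toy example one large weight raises the design effect from 1 to about 11.7, so 100 respondents carry roughly the information of 8 or 9 equally weighted ones.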
One way to compensate for item nonresponse is a multiple imputation routine that relies on the assumption of joint normality. Because research data rarely follow a normal distribution, normality can be approximated by transforming the data before imputation. We therefore study how different transformation approaches, based on either the method of moments or maximum likelihood, handle skewed distributions. The aim is to obtain correct inference in current survey data analyses. The paper also addresses the criticism raised in recent years.
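The idea of transforming a skewed variable toward normality before imputation can be sketched as follows. The abstract does not name a specific transformation, so this example assumes a Box-Cox transformation with the parameter chosen by maximum likelihood over a grid. The simulated lognormal data, the grid, and all function names are illustrative assumptions, and the imputation step itself is not shown.

```python
import math
import random


def boxcox(x, lam):
    """Box-Cox transform of a positive value x (log transform when lam is 0)."""
    return math.log(x) if abs(lam) < 1e-12 else (x ** lam - 1.0) / lam


def profile_loglik(data, lam):
    """Box-Cox profile log-likelihood under normality (additive constants dropped)."""
    n = len(data)
    y = [boxcox(x, lam) for x in data]
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n  # ML variance estimate
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(x) for x in data)


def skewness(values):
    """Moment-based sample skewness m3 / m2^(3/2)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Simulated right-skewed data (assumption: lognormal, strictly positive).
random.seed(0)
data = [random.lognormvariate(0.0, 0.75) for _ in range(500)]

# Grid search for the lambda that maximizes the profile log-likelihood.
grid = [i / 20.0 for i in range(-40, 41)]  # -2.0 to 2.0 in steps of 0.05
lam_hat = max(grid, key=lambda lam: profile_loglik(data, lam))

transformed = [boxcox(x, lam_hat) for x in data]
print(f"lambda estimate: {lam_hat:.2f}")
print(f"skewness before: {skewness(data):.2f}")
print(f"skewness after:  {skewness(transformed):.2f}")
```

After imputing on the transformed scale, imputed values would be mapped back with the inverse transformation, which is one of the steps where such approaches need care.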
When addressing complex survey data, the estimation of population parameters requires statistical modeling that accounts for design features. A substantial complication arises when data are affected by unit and item nonresponse. We address estimation issues for models which target the conditional quantile of a continuous outcome. Survey design variables are properly included in the analysis, and we propose a bootstrap variance estimator. The proposed imputation method preserves conditional skewness and kurtosis, and successfully handles bounded outcomes. Stata and R code implementations will be demonstrated and made available.
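As a minimal illustration of a quantile target under survey weights, the sketch below finds a weighted quantile and confirms that it minimizes the weighted check loss over the observed values. It is not the estimator proposed in the paper: it has no covariates, no nonresponse adjustment, and no bootstrap variance step. The data, the weights, and the function names are hypothetical.

```python
def check_loss(u, tau):
    """Quantile (check) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def weighted_objective(q, y, w, tau):
    """Survey-weighted check loss of candidate quantile q."""
    return sum(wi * check_loss(yi - q, tau) for yi, wi in zip(y, w))


def weighted_quantile(y, w, tau):
    """Smallest observed value whose cumulative weight share reaches tau."""
    pairs = sorted(zip(y, w))
    total = sum(w)
    cum = 0.0
    for yi, wi in pairs:
        cum += wi
        if cum >= tau * total:
            return yi
    return pairs[-1][0]


# Toy data (assumption): the largest outcome carries a large sampling weight.
y = [2.0, 3.0, 5.0, 8.0, 13.0]
w = [1.0, 1.0, 1.0, 1.0, 6.0]

q_med = weighted_quantile(y, w, 0.5)
best = min(y, key=lambda q: weighted_objective(q, y, w, 0.5))
print(q_med, best)
```

Here the unweighted median is 5, while the weighted median is 13 because the last unit represents more than half of the weighted population. Ignoring the weights would therefore target a different quantity.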
The National Diet and Nutrition Survey (NDNS) uses a representative sample to assess the diet of the UK population. Selecting individuals becomes challenging when sampling over a large geographic area; the NDNS therefore uses a complex survey design. Participants provide clustered dietary data, which can be examined using mixed-effects models. These models were developed to estimate the conditional mean, however, and so fail to provide a complete picture of the relationships between dietary intake and explanatory variables. Quantile regression provides a comprehensive description of the intake distribution. A novel method of quantile regression of clustered data collected using a complex survey design is