ESRA logo

Wednesday 15th July, 09:00 - 10:30 Room: HT-103


Enhancing survey data with geocoded auxiliary data 1

Convenor: Dr Sarah Butt (City University London)
Coordinator 1: Mr Rory Fitzgerald (City University London)
Coordinator 2: Ms Kaisa Lahtinen (City University London)

Session Details

Combining survey data with auxiliary data from other sources gives researchers many opportunities to improve survey data collection and the quality of the inferences that can be drawn from survey data. One type of auxiliary data that is increasingly widely available is geocoded data, i.e. data that can be linked to survey data based on the geographic location of sampled addresses. This includes census data, administrative data from government agencies and other public sector bodies, commercial databases and geospatial maps. Such data can be used to answer substantive research questions about the effect of location on attitudes and behaviour. Because they provide information about all sample units, respondents and non-respondents alike, geocoded data are also a potentially valuable tool for aiding data collection and overcoming non-response bias.

However, using auxiliary data from pre-existing sources presents a number of challenges. Identifying suitable auxiliary variables that are correlated with the survey variables of interest (and, in the case of non-response analysis, with response propensity) can be difficult. There are concerns over the coverage, accuracy and timeliness of external databases, the extent to which data that are often highly aggregated can characterise sampled households, and the increased likelihood of deductive disclosure as a result of combining different data sources.

This session invites studies that have combined survey data with geocoded auxiliary data to share their learning regarding the opportunities and challenges associated with this approach. We are interested in papers that provide insights into any of the following:
• The pros and cons of using different sources of geocoded auxiliary data
• Strategies for linking geocoded auxiliary data to survey data
• Modelling item or unit non-response using auxiliary data
• Combining auxiliary data and survey data cross-nationally
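
As an illustration of the linkage idea described above, here is a minimal sketch in Python, assuming each sampled address carries a hypothetical `area_code` that also keys an area-level auxiliary table. All names and values (`sample`, `area_data`, `pct_owner_occupied`) are invented for illustration and do not come from any of the studies in this session. The sketch attaches auxiliary values to every sample unit, including non-respondents, and reports how many units could be linked, since coverage of the external source is one of the concerns raised above.

```python
# Hypothetical sample frame: every sampled address, respondent or not.
sample = [
    {"address_id": 1, "area_code": "A01", "responded": True},
    {"address_id": 2, "area_code": "A01", "responded": False},
    {"address_id": 3, "area_code": "B02", "responded": True},
    {"address_id": 4, "area_code": "C03", "responded": False},
]

# Hypothetical area-level auxiliary data keyed by the same area code.
# Area "C03" is deliberately missing to mimic incomplete coverage.
area_data = {
    "A01": {"pct_owner_occupied": 62.0},
    "B02": {"pct_owner_occupied": 35.5},
}


def link_auxiliary(sample, area_data):
    """Attach area-level values to each sample unit and flag unlinked units."""
    linked = []
    for unit in sample:
        row = dict(unit)  # copy so the original sample is left unchanged
        aux = area_data.get(unit["area_code"])
        row["linked"] = aux is not None
        if aux is not None:
            row.update(aux)
        linked.append(row)
    return linked


linked = link_auxiliary(sample, area_data)

# Share of sample units for which auxiliary data were found.
coverage = sum(row["linked"] for row in linked) / len(linked)
```

Note that units sharing an area code receive identical auxiliary values, which is the aggregation issue mentioned in the session description: area-level data may not characterise an individual household well.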

Paper Details

1. Using Geo-coded Data as Part of the Multi-level, Multi-Source Approach to Improve Surveys
Dr Tom W. Smith (NORC at the University of Chicago)
Dr Jibum Kim (Sungkyunkwan University)

The best way to improve social science research is not to replace surveys with big data, but to use auxiliary data to enhance surveys both substantively and methodologically. As proposed in the Multi-Level, Multi-Source approach, surveys can be enriched by adding geo-coded data at multiple levels of aggregation and from multiple sources to augment the data collected by surveys. The aggregate geo-coded data can help to detect and reduce non-response bias and can also improve substantive models by including context variables measuring neighborhood-, community-, and other-level effects.



2. Using geocoded auxiliary data to predict nonresponse in address-based samples: Are household-level commercial data any better than aggregate-level census data?
Dr Sarah Butt (City University London)
Ms Kaisa Lahtinen (City University London)
Mr Rory Fitzgerald (City University London)

Geocoded auxiliary data are a potentially valuable resource for understanding the problem of nonresponse bias. However, identifying data sources and variables predictive of both response propensity and survey outcomes is not straightforward. This paper uses auxiliary data from multiple sources and different levels of aggregation to investigate patterns of nonresponse in the European Social Survey in the UK. It examines whether household-level measures of socio-demographic characteristics such as employment status and family type, obtained from commercial sources, are any more effective at predicting response behaviour than the same characteristics measured at aggregate level via the 2011 Census.


3. The use of geo-coded data to develop an interview performance management system
Mr Joel Williams (TNS BMRB)

In many surveys, it is typical for an interviewer's performance to be judged against a uniform response rate target, regardless of how challenging it is to meet this target in the allocated assignment area. This can lead to misjudgment of an interviewer's performance over time.

This paper describes the development of a more sensitive - and more timely - performance monitor for the Crime Survey of England & Wales, based on seven years of field data and more than half a million addresses. It led to reassessment of who the top performers were and who needed more coaching to meet expectations.
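
The abstract does not describe how the monitor is built, so the following is only a rough sketch of the general idea of judging interviewers against an area-adjusted expectation rather than a uniform target. It assumes each issued address already has a hypothetical expected response probability (`expected`), for example one estimated from geo-coded area characteristics; the interviewer labels and all figures are invented for illustration.

```python
from collections import defaultdict

# Hypothetical issued addresses. "expected" is an assumed, pre-estimated
# probability of response reflecting how difficult the area is.
addresses = [
    {"interviewer": "int_A", "expected": 0.80, "responded": 1},
    {"interviewer": "int_A", "expected": 0.80, "responded": 1},
    {"interviewer": "int_A", "expected": 0.80, "responded": 0},
    {"interviewer": "int_A", "expected": 0.80, "responded": 1},
    {"interviewer": "int_B", "expected": 0.50, "responded": 1},
    {"interviewer": "int_B", "expected": 0.50, "responded": 1},
    {"interviewer": "int_B", "expected": 0.50, "responded": 0},
    {"interviewer": "int_B", "expected": 0.50, "responded": 1},
]


def performance(addresses):
    """Return the raw response rate and the area-adjusted gap per interviewer."""
    totals = defaultdict(lambda: {"n": 0, "observed": 0, "expected": 0.0})
    for address in addresses:
        t = totals[address["interviewer"]]
        t["n"] += 1
        t["observed"] += address["responded"]
        t["expected"] += address["expected"]
    return {
        name: {
            "raw_rate": t["observed"] / t["n"],
            # Positive gap: more responses than the assignment would predict.
            "adjusted_gap": (t["observed"] - t["expected"]) / t["n"],
        }
        for name, t in totals.items()
    }


scores = performance(addresses)
```

In this invented example both interviewers achieve the same raw response rate, but the interviewer working the harder assignment exceeds expectation while the other falls slightly short, which is the kind of reassessment the abstract refers to.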