Quantitative Spatial Analysis of Micro and Macro Data: Methodological Challenges and Solutions 1
Session Organisers: Professor Henning Best (TU Kaiserslautern), Dr Tobias Rüttenauer (TU Kaiserslautern)
Time: Thursday 18th July, 09:00 - 10:30
Room: D25
The session aims to bring together methodological experience gained from working with spatial data in quantitative empirical social research. On the one hand, spatial data offer the opportunity to investigate relationships between regional characteristics at the macro level. On the other hand, spatial data can be used to enrich survey data with structural information at a given regional level, either to control for context effects or to explicitly analyse these effects and their interplay with mechanisms at the individual level. Using GIS, the addresses of survey participants can be linked with objective measures of their neighbourhood (e.g. pollution data) or of their proximity to institutions (e.g. educational institutions or workplaces). These data thus make it possible to investigate the relevance of distances to infrastructure for social action as well as processes of spatial spillover and diffusion.
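As a rough illustration of the linkage step described above, the following sketch computes the great-circle distance from a geocoded respondent to the nearest facility. The coordinates, the facility names and the helper names `haversine_km` and `nearest_facility` are hypothetical and chosen only for illustration; a real project would typically rely on a GIS package and on network or travel-time distances rather than straight-line distances.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_facility(respondent, facilities):
    """Return (name, distance_km) of the facility closest to the respondent."""
    lat, lon = respondent
    return min(
        ((name, haversine_km(lat, lon, f_lat, f_lon))
         for name, (f_lat, f_lon) in facilities.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical geocoded respondent and facilities (illustrative coordinates only).
respondent = (49.4401, 7.7491)
facilities = {
    "school_a": (49.4447, 7.7690),
    "school_b": (49.4000, 7.6000),
}

name, dist_km = nearest_facility(respondent, facilities)
```

The resulting distance can then be merged onto the survey record as a contextual covariate.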
In doing so, several methodological questions arise: Which regional level is adequate for which kind of question, and how does the choice of administrative borders influence the conclusions drawn (the modifiable areal unit problem, “MAUP”)? Can we enrich survey data with information on actual travelling times and means of transportation to account for the movement or action space of participants? What are the challenges and limitations of these approaches, and how can they be applied reliably?
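The border question can be made concrete with a toy example. The sketch below uses invented values for eight units along a line and aggregates them under two hypothetical zonings; the function names and data are assumptions made purely for illustration. The same individual-level data yield different aggregate correlations depending on where the borders are drawn.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def zone_means(values, zones):
    """Average the values within each zone; each zone is a list of unit indices."""
    return [sum(values[i] for i in zone) / len(zone) for zone in zones]


# Invented unit-level data along a line (illustration only).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 6, 5, 8, 7]

# Two hypothetical ways of drawing "administrative" borders.
zoning_a = [[0, 1], [2, 3], [4, 5], [6, 7]]
zoning_b = [[0], [1, 2], [3, 4], [5, 6], [7]]

r_individual = pearson(x, y)
r_zoning_a = pearson(zone_means(x, zoning_a), zone_means(y, zoning_a))
r_zoning_b = pearson(zone_means(x, zoning_b), zone_means(y, zoning_b))
```

Here the correlation rises from the unit level to both aggregations, and the two zonings disagree with each other, which is the zoning component of the MAUP in miniature.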
Furthermore, innovative statistical methods are necessary to adequately analyse spatial data. Various regression models (e.g. SAR, SARAR, SLX, Durbin and others) address spatial dependence in different ways and offer alternative approaches to identifying different types of spatial spillover or spatial interdependence, in cross-sectional and longitudinal data. Which types of models are adequate for which types of questions? Which models can be used to analyse individual and aggregate data simultaneously?
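For orientation, the model families named above are commonly written as follows in the spatial econometrics literature. The notation is an assumption of this sketch and is not taken from the session text: W denotes an n × n spatial weights matrix, ρ and λ are spatial autoregressive parameters, and θ captures the effects of neighbours' covariates.

```latex
\begin{align}
\text{SAR:} \quad & y = \rho W y + X\beta + \varepsilon \\
\text{SLX:} \quad & y = X\beta + W X \theta + \varepsilon \\
\text{Spatial Durbin:} \quad & y = \rho W y + X\beta + W X \theta + \varepsilon \\
\text{SARAR:} \quad & y = \rho W y + X\beta + u, \qquad u = \lambda W u + \varepsilon
\end{align}
```

Read this way, the models differ in where the spatial dependence enters: in the outcome (ρ), in the covariates (θ), in the error term (λ), or in a combination of these.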
In sum, in this session we are especially interested in methodological and applied studies dealing with topics of:
1. Choice of adequate regional level and handling of borders when using administrative data
2. Linking individual data with spatially aggregated and infrastructural data
3. Spatial analysis of time-series and cross-sectional data
4. Modelling spatial relationships (e.g. commuting flows, distances, travelling times, social interactions)
5. Modelling spatial interaction, spillover or diffusion processes
6. Further challenges and solutions when using georeferenced data
Keywords: spatial data, geodata, georeferencing, GIS
Professor Steffen Hillmert (University of Tübingen) - Presenting Author
Spatial context effects are an issue in many substantive applications. It is also well known that context effects depend on the specific operationalization and demarcation of the spatial contexts, a phenomenon known as the modifiable areal unit problem (MAUP).
Besides this primarily empirical observation, there are at least some theoretical arguments that can be put forward to decide on the (likely) spatial extent of relevant contexts. Probably the most popular is the idea of ‘distance decay’, i.e. the assumption that objects or persons that are relatively close to someone are typically more important than objects or persons that are more distant. When substantiated, an assumption like this allows, for example, these objects or contacts to be weighted accordingly. However, such considerations are often not sufficient to derive distinct predictions for the adequate scaling of relevant contexts in (hierarchical) context analyses.
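A minimal sketch of such weighting, assuming an exponential decay function, could look as follows. The functional form, the `scale_km` parameter and the example values are assumptions made for illustration and are not taken from the paper.

```python
import math


def decay_weight(distance_km, scale_km=1.0):
    """Exponential distance decay: weight 1 at distance 0, approaching 0 far away."""
    return math.exp(-distance_km / scale_km)


def weighted_context(values, distances_km, scale_km=1.0):
    """Distance-decay-weighted mean of a context attribute."""
    weights = [decay_weight(d, scale_km) for d in distances_km]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical attribute values of three context elements and their distances.
values = [10.0, 20.0, 30.0]
distances_km = [0.2, 1.0, 5.0]

context_score = weighted_context(values, distances_km)
```

With these inputs the score is pulled towards the value of the nearest element; as `scale_km` grows, the result approaches the unweighted mean.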
This conceptual paper examines various steps necessary for determining the extension, or the appropriate scale, of relevant spatial environments. A central strategy is to analytically disaggregate contexts into corresponding elements. Following this idea, adequate models of context scale need to specify not only the distance-dependent relevance of the context elements but also their spatial distribution as well as a rule of aggregation. Corresponding theoretical models result from substantive considerations but may also contain empirical components. The stepwise procedure makes it plausible, for example, why scale-related context effects have repeatedly shown non-monotonic patterns, in spite of the common assumption of monotonic distance decay. A distance-related maximum of (aggregate) context relevance may indicate an ‘optimal’ range for the measurement of context effects. The conceptual considerations also help to identify unrealistic assumptions when theorizing and interpreting spatial patterns of context effects.
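One way to see how a non-monotonic pattern can arise from monotonic distance decay is the following sketch. It rests on two simplifying assumptions that are not taken from the paper: context elements are spread uniformly over the plane, and each element's relevance decays exponentially with distance. The number of elements in a thin ring then grows in proportion to its radius, while the relevance per element shrinks, so the aggregate relevance per ring first rises and then falls.

```python
import math


def ring_relevance(r_km, density=1.0, scale_km=2.0):
    """Aggregate relevance of a thin ring at radius r_km (per unit of ring width)."""
    elements = 2 * math.pi * r_km * density   # uniform spatial distribution
    per_element = math.exp(-r_km / scale_km)  # monotonic distance decay
    return elements * per_element             # aggregation rule: simple sum


# Evaluate rings from 0.01 km to 10 km in 0.01 km steps.
radii = [i / 100 for i in range(1, 1001)]
peak_radius = max(radii, key=ring_relevance)
```

Under these assumptions the maximum lies at a radius equal to the decay scale (2 km in this example), even though the relevance of each single element declines monotonically.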
Mr Robert Vief (Humboldt University of Berlin) - Presenting Author
Dr Henrik Schultze (Humboldt University of Berlin)
Ms Daniela Krüger (Humboldt University of Berlin)
Our research paper expands existing sociological network approaches in survey studies by adding a spatial perspective. We criticize, first (1), that existing survey instruments focus on specific forms of social support, most of them relying on ‘strong’ ties and interactions in spatial proximity. We support Mario Small’s (2017) approach of taking the long-neglected ‘weak’ ties more seriously in research on social networks. To overcome the methodological presuppositions of the existing literature regarding specific, pre-defined scenarios of social support, we let respondents define their own topics of relevant events, resources and support. Furthermore, (2) most name-generator survey approaches underemphasize the spatial diffusion within network structures. We argue that existing network studies scarcely develop a spatial perspective at all: where respondents talk and interact with other people remains unclear. Our approach fills this research gap and discusses important theoretical questions for urban and migration sociology: Where do neighborhoods create spaces for productive support? Which social categories (class, ethnicity, gender etc.) are more likely to show greater spatial variability within their networks, and which rely more on neighborhood institutions and local face-to-face interactions? Do specific neighborhoods create different patterns in the spatial use of social support? Are migrants more likely to use translocal patterns of interaction than native long-term residents?
Our project integrates existing survey software (LimeSurvey) and offline applications (Offline Surveys App) with mapping software such as MapsMe in order to combine survey data with GIS data sets. This opens up opportunities to calculate measures of actual spatial distance and spatial clustering for ego-based social networks. Moreover, it creates possibilities to analyze the overlapping spatial use of physical and digital interactions within neighborhoods.
In July, we will present first results from a representative survey conducted in four Berlin neighborhoods, selected by a most-different-cases design. We further discuss implications of technical integration.
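The kind of distance and clustering measures mentioned in this abstract might be sketched as follows. This is not the project's actual procedure: the function name, the 1,000 m radius, the use of the share of alters within that radius as a crude stand-in for spatial clustering, and the coordinates (assumed to be in a projected, metre-based system) are all hypothetical.

```python
import math


def ego_network_spatial_summary(ego_xy, alters_xy, local_radius_m=1000.0):
    """Summarise how far an ego's contacts (alters) are located from the ego."""
    distances = [math.dist(ego_xy, alter) for alter in alters_xy]
    return {
        "mean_distance_m": sum(distances) / len(distances),
        "max_distance_m": max(distances),
        # Share of alters within the local radius: a simple proxy for how
        # strongly the network is concentrated around the ego's location.
        "local_share": sum(d <= local_radius_m for d in distances) / len(distances),
    }


# Hypothetical projected coordinates in metres (illustration only).
ego = (0.0, 0.0)
alters = [(300.0, 400.0), (600.0, 800.0), (3000.0, 4000.0)]

summary = ego_network_spatial_summary(ego, alters)
```

In this toy network two of the three contacts fall within the local radius, while the mean distance is dominated by the single distant tie.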
Dr Aleksey Oshchepkov (National Research University Higher School of Economics, Moscow, Centre for Labour Market Studies)
Dr Anna Shirokanova (National Research University Higher School of Economics, Saint-Petersburg, Laboratory for Comparative Social Research) - Presenting Author
Modelling contextual effects is a common task for many survey analysts. However, their choice of methods often depends on disciplinary conventions. Multilevel modelling offers the estimation of level-specific errors and generalisation of the model to the population. Its known drawbacks are a larger set of assumptions and little attention to causal estimation, which deters many economists from using it. This paper reviews the comparative advantages and disadvantages of standard regression techniques vs. multilevel (hierarchical) regression modelling in estimating the effects of social context. We review both classical and recent literature on the models, simulation results, and typical errors made by practitioners in applying both approaches. Our review shows, first, that the comparative performance of multilevel modelling in the case of an omitted variable depends on the source of endogeneity. When it results from a variable omitted at the lowest level, multilevel modelling is inferior to standard regression with instrumental variables or quasi-experimental techniques. However, when endogeneity results from a variable omitted at a higher level, multilevel modelling outperforms the standard regression techniques. Second, in modelling parameter heterogeneity across groups or time points, multilevel modelling is generally more efficient and convenient than estimated dependent variable models or models with cross-level interactions within the standard regression framework. Our analysis demonstrates that multilevel modelling loses out to the standard regression techniques only in causal inference and only when the endogeneity is caused by a variable omitted at the lowest level.
However, recent papers have argued that multilevel models can be interpreted causally in the presence of sorting (Agrawal et al., 2018), or that they yield more efficient estimates than classical regression for longitudinal, panel, or time-series cross-sectional data (Feller and Gelman, 2015). In the case of a variable omitted at the contextual level, and in modelling parameter heterogeneity across groups, multilevel modelling is more efficient.
Mr Raman Mishra (International Institute for Population Sciences) - Presenting Author
This study shows distinct geographic variation in the use of sterilization among women in India at the state and district levels. Data from the fourth round of the National Family Health Survey (2016) for women aged 15-49 years are used. Integrated Nested Laplace Approximation (INLA), an approximate method for Bayesian inference based on the marginals of the model parameters, is employed. A Bayesian hierarchical model is used to estimate the spatial pattern of factors affecting sterilization over an extended geographical region. The Besag-York-Mollié (BYM) model is used within INLA to calculate the posterior structured spatial random effects. Further, the posterior means of the random effects are mapped. Findings reveal that women in districts of the north-eastern and eastern regions of India are inclined to adopt sterilization; a similar pattern was found for southern and western districts. The decision to adopt sterilization is driven by the following factors: a cesarean last birth, no knowledge of other methods of contraception, the side effects of sterilization, lucrative incentives, and insurance. Bivariate Local Indicators of Spatial Association (LISA) maps show that “last birth cesarean”, which is high in the southern region, is one factor contributing significantly to the high prevalence of sterilization. Similarly, “no knowledge about other methods” and “lucrative incentive” were found to be significantly correlated with sterilization in the north-eastern region. Sterilization is found to be high in rural areas and among women belonging to Scheduled Castes and Scheduled Tribes. In contrast, women belonging to higher income classes and literate women prefer opting for sterilization. Significant spatial clustering, especially in the southern and north-eastern regions, is evident, revealing that India’s family planning program is still skewed towards female sterilization. A gender-transformative health strategy is required.
This would strengthen the National Policy for Women 2016, which aims at recognizing women’s reproductive rights by shifting the focus towards male sterilization or other reliable modern methods of contraception.
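To illustrate the idea behind Local Indicators of Spatial Association, the following sketch computes a univariate local Moran's I and its global counterpart for six hypothetical districts arranged in a chain. This is a simplification of the bivariate LISA used in the study: the data, the neighbour structure and the function name are invented for illustration, the weights are assumed to be row-standardised contiguity weights, and the permutation-based significance tests normally used with LISA are omitted.

```python
def morans_i(values, neighbours):
    """Return (global Moran's I, list of local Moran's I values).

    neighbours maps each unit index to the indices of its contiguous units;
    weights are row-standardised, i.e. each neighbour of i gets 1 / len(neighbours[i]).
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]        # deviations from the mean
    m2 = sum(d * d for d in z) / n        # second moment of the deviations
    # Spatial lag: average deviation among each unit's neighbours.
    lag = [sum(z[j] for j in neighbours[i]) / len(neighbours[i]) for i in range(n)]
    local = [z[i] * lag[i] / m2 for i in range(n)]
    # With row-standardised weights the global statistic is the mean of the local ones.
    return sum(local) / n, local


# Hypothetical prevalence values for six districts along a line.
prevalence = [10, 9, 8, 2, 1, 0]
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

global_i, local_i = morans_i(prevalence, chain)
```

Positive local values mark districts that resemble their neighbours (high next to high, or low next to low), which is the pattern that appears as clusters on a LISA map.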