Thursday 20th July, 11:00 - 12:30, Room: F2 108


Different methods, same results? Comparing the consequences of alternative methods of data collection and analysis 1

Chair Professor Elmar Schlueter (Justus-Liebig-University Giessen)
Coordinator Professor Jochen Mayerl (University of Kaiserslautern)

Session Details

No doubt about it – recent years have seen an ever-increasing proliferation of methods for survey data collection and analysis. Think of the growing administration of surveys via the internet and mobile devices, the combination of large-scale surveys with experimental designs, the multiple approaches available for examining data from respondents nested in different levels of analysis, or the wider application of Bayesian statistics. Such methodological innovations certainly open up important novel avenues for research. However, a central yet somewhat understudied question raised by this plurality of methods is: to what extent do different strategies of survey data collection and analysis, applied to the same research question, lead to converging conclusions?

Specifically, this session starts from the observation that for most research problems no single appropriate strategy of data collection or analysis exists. Instead, researchers typically face several defensible alternatives, which may or may not converge in their results. The aim of this session is therefore to stimulate debate on the methodological as well as substantive issues that arise when applying multiple methods of survey data collection or analysis. Does the application of alternative research designs or statistical methods lead to converging results? Are social science findings replicable across methods? We invite researchers to submit papers discussing the consequences of applying alternative methods of survey data collection or analysis in the following two scenarios:

A. Same research question, comparing at least two different methods of data collection
B. Same research question, comparing at least two methods of data analysis

Please send your paper proposals (no more than 500 words in length) to:

JProf. Dr. Jochen Mayerl, jochen.mayerl@sowi.uni-kl.de
Prof. Dr. Elmar Schlüter, elmar.schlueter@sowi.uni-giessen.de

Paper Details

1. Taking stock: Twenty years of research on conversational interviewing
Dr Frederick Conrad (University of Michigan)
Dr Michael Schober (New School for Social Research)

It has been observed for several decades that standardized interviewing, the prevailing approach to collecting survey data in the social sciences and government research, may not always lead to uniform interpretation (e.g., Suchman & Jordan, 1990). If some respondents interpret questions differently than intended by the question authors, the accuracy of individual responses and possibly of the resulting population estimates may be compromised. Standardized interviewers read the question as worded and then use only nondirective (i.e., largely content-free) probing to address comprehension problems. To address this potential weakness with standardized methods, an alternative approach to survey interviewing has been proposed that encourages interviewers to clarify the survey concepts, using whatever words they judge will be most effective, when they determine there is misalignment between the respondents’ interpretation and how the question was intended (e.g., Schober & Conrad, 1997; Conrad & Schober, 2000). The logic behind this approach is that successful everyday communication often involves back and forth between speaker and listener to assure they are on the same page – at least sufficiently to accomplish their current conversational task. This process of “conversational grounding” (e.g., Clark, 1996) is at the heart of the proposed alternative interviewing technique and has led to its being called “conversational interviewing.” In contrast, standardized interviewers cannot ground question meaning because this could involve substantive wording that might differ between respondents.

At least ten studies have been conducted to evaluate the pros and cons of standardized and conversational interviewing. This paper reviews and synthesizes several of these studies, describing what has been learned and what is still unknown. Among the clear findings is that conversational interviewing can dramatically improve response accuracy for factual questions when the meaning of key concepts in the questions is ambiguous. But the improvement requires additional interviewing time, as clarification necessarily involves additional words; this is the case not only in interviewer-administered interviews but also in automated interviews carried out by animated virtual interviewers. Across studies, the improvement in response accuracy is greater when the interviewer can provide clarification both when respondents ask for it and when the interviewer judges that the respondent needs it, even without an explicit request. The same holds in web questionnaires with clickable definitions that can also offer clarification when respondents are slow to answer. Respondents seem to be sensitive to whether interviewers are able to provide clarification in this way, using more disfluent speech and, in face-to-face interviews, averting their gaze from the interviewer more often than respondents in standardized interviews do. Conversational interviewers with greater interpersonal sensitivity are more efficient, whether clarifying concepts in factual or opinion questions. And these benefits accrue without increasing interviewer variance. One question currently being investigated is whether conversational interviewing can help improve quality in other ways, such as reducing acquiescence and straightlining by better communicating the meaning of response scale values. More also remains to be learned about the practical tradeoffs involved in administering conversational interviews in production surveys, but the evidence so far suggests they are promising.


2. Comparing 2016 Election Results from Traditional Phone Studies with Web-based Methodologies
Ms Stephanie Marken (Gallup)
Mr Zac Auter (Gallup)
Dr Jennifer Dineen (University of Connecticut)

Declining response rates and the increased costs associated with traditional sampling and data collection approaches have led many researchers to explore alternative methodologies more rigorously, web-based methodologies in particular. Many researchers turned to these methods again in the 2016 presidential race. In the final week preceding the 2016 presidential vote, Gallup conducted an experiment comparing these newer web-based methodologies with its traditional phone survey methodology. The experiment was designed to compare data from several data collection approaches, including a phone survey using a random-digit-dialing (RDD) dual-frame sampling approach and several web-based methodologies.

These comparisons allowed Gallup to explore differences in election attitudes and likely voter models across methodologies and across representative and non-representative sampling frames, enabling researchers to identify bias associated with each frame. In this presentation, Gallup will provide details about the accuracy of these approaches compared with the final election results. Gallup will also share data about the actual turnout of opt-in panel members, comparing their self-reports and likely voter model predictions with their actual turnout based on state election board records.


3. Must We All Become Bayesians Now? Accurate Frequentist Inference for Multilevel Analyses Based on Small Cluster Samples
Professor Merlin Schaeffer (University of Cologne)
Professor Martin Elff (Zeppelin University)
Dr Jan Paul Heisig (WZB)
Professor Susumu Shikano (University of Konstanz)

In a highly influential paper published in the American Journal of Political Science, Stegmueller (2013) claims that mixed effects multilevel models estimated by frequentist methods provide biased parameter estimates and severely anti-conservative inference for context effects when the number of upper-level units is small. Stegmueller recommends using Bayesian estimation instead, which he finds to be more accurate. In this paper, we reassess and refute Stegmueller's claims. First, we present analytical proof that frequentist mixed effects models provide unbiased estimates of context effects, and we illustrate that the apparent bias in Stegmueller's simulations is simply a ramification of Monte Carlo error. Second, we show that the reported inferential problems of frequentist estimation arise from using full maximum likelihood estimation and from relying on the standard normal (Gaussian) distribution to construct confidence limits and p-values. Using restricted maximum likelihood estimation together with a t-distribution with approximately correct degrees of freedom yields accurate inference, even for very small upper-level samples. Hence, concerns about the minimum number of clusters necessary for multilevel analyses, which have long haunted comparative social science, are unjustified. While there may be compelling reasons to favor Bayesian estimation, it is not necessary for achieving accurate inference in multilevel analyses based on small upper-level samples.
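To make the recommended frequentist recipe concrete, here is a minimal sketch (not the authors' code) of fitting a two-level model with REML and basing inference for a context effect on a t-distribution instead of the standard normal. The simulated data, variable names, and the simple m - l - 1 degrees-of-freedom rule (m clusters, l cluster-level predictors) are illustrative assumptions; the "approximately correct" degrees of freedom discussed in the paper would typically come from a Satterthwaite-type approximation.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical small upper-level sample: m = 15 countries, 200 respondents each
m, n = 15, 200
country = np.repeat(np.arange(m), n)
z = rng.normal(size=m)                         # country-level (context) predictor
u = rng.normal(scale=0.5, size=m)              # random country intercepts
x = rng.normal(size=m * n)                     # individual-level predictor
y = 1 + 0.5 * x + 0.3 * z[country] + u[country] + rng.normal(size=m * n)
data = pd.DataFrame({"y": y, "x": x, "z": z[country], "country": country})

# Restricted maximum likelihood (statsmodels' default) rather than full ML
fit = smf.mixedlm("y ~ x + z", data, groups=data["country"]).fit(reml=True)

# t-based 95% confidence interval for the context effect, using the crude
# rule df = m - l - 1 with l = 1 country-level predictor
beta, se = fit.params["z"], fit.bse["z"]
dof = m - 1 - 1
t_crit = stats.t.ppf(0.975, dof)
print(f"context effect {beta:.3f}, 95% CI [{beta - t_crit * se:.3f}, "
      f"{beta + t_crit * se:.3f}]")

With only 15 clusters, the t critical value (about 2.16 at 13 degrees of freedom) is noticeably larger than the normal value of 1.96, which is exactly the anti-conservatism the paper attributes to Gaussian-based inference.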


4. Female labor force participation in cross-country comparison: Applying multilevel regression analysis and Qualitative Comparative Analysis
Dr Thomas Laux (University of Bamberg)
Mrs Stefanie Heyne (University of Bamberg)

Methods of data analysis in the social sciences can be classified as either "effects-of-causes" or "causes-of-effects" approaches (Goertz and Mahoney 2012: 41). Quantitative methods follow the effects-of-causes approach, which aims to estimate the causal effects of variables of interest. Conversely, causes-of-effects approaches seek the causes of a given outcome, a procedure that, according to Goertz and Mahoney (2012: 42), is characteristic of the "qualitative culture" in social research. This fundamental distinction between quantitative and qualitative research serves as the starting point for our study. We apply both multilevel regression analysis and Qualitative Comparative Analysis (QCA) in order to study the similarities and differences between these methods of data analysis. The aim is to show how, depending on the method applied, cross-country comparison opens up different perspectives on the same object.
The object of the analysis is the rise in female labor force participation. The increase in women's labor force participation has been one of the most important social changes since World War II, driven mainly by mothers' stronger attachment to the labor market. Yet despite an almost global trend of rising female labor force participation rates, there are noticeable differences between countries in the strength and patterns of women's labor market integration. Previous research explaining those differences points to country differences in modernization, institutions, or culture, and most studies use quantitative methods of cross-country comparison such as multilevel or time-series analysis. This quantitative approach to international comparison has faced several critiques in recent years (e.g. Ebbinghaus 2005). Most importantly, the restricted variation of independent variables across countries and the resulting multicollinearity, as well as path dependency and geographical autocorrelation (which violate a core assumption of regression analysis), have called into question the applicability of this approach to cross-country comparison in general.
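The multicollinearity part of this critique is easy to check in practice. The following is a small illustrative sketch (hypothetical indicators, not the paper's data) that computes variance inflation factors (VIFs) for macro-level predictors that, as is typical across countries, move closely together.

import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(7)
n = 80                                 # roughly the paper's country sample size

# Hypothetical macro indicators that tend to move together across countries
gdp = rng.normal(size=n)
education = 0.8 * gdp + rng.normal(scale=0.4, size=n)
gender_norms = 0.7 * education + rng.normal(scale=0.5, size=n)

X = pd.DataFrame({"const": 1.0, "gdp": gdp,
                  "education": education, "gender_norms": gender_norms})

# A VIF above roughly 10 is a common rule of thumb for problematic collinearity
for i, name in enumerate(X.columns):
    if name != "const":
        print(f"VIF({name}) = {variance_inflation_factor(X.values, i):.1f}")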
This raises the question of whether an alternative method like QCA is better suited to cross-country comparison. QCA is a diversity-oriented method that captures different cultural and institutional contexts (Ragin 2008: 109). Three fundamental principles, 'equifinality, conjunctural causation and asymmetry' (Schneider and Wagemann 2012: 8), define QCA's specific approach to comparative research. It makes it possible to distinguish between necessary and sufficient conditions and identifies equifinal explanations for a social phenomenon. In the case of female labor force participation, conjunctural causation and the asymmetry of conditions are likely well suited to capturing the complex patterns that explain the differences between countries.
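To illustrate what "necessary and sufficient conditions" mean operationally, here is a minimal fuzzy-set QCA sketch (hypothetical membership scores, not the paper's measures) computing Ragin's (2008) consistency and coverage for a single candidate sufficient condition.

import numpy as np

def consistency(x, y):
    # Degree to which condition X is a fuzzy subset of outcome Y (sufficiency)
    return np.minimum(x, y).sum() / x.sum()

def coverage(x, y):
    # Share of the outcome Y accounted for by condition X
    return np.minimum(x, y).sum() / y.sum()

# Hypothetical fuzzy-set membership scores for six countries:
# x = "egalitarian gender culture", y = "high female labor force participation"
x = np.array([0.9, 0.8, 0.6, 0.3, 0.2, 0.7])
y = np.array([0.95, 0.7, 0.65, 0.4, 0.1, 0.8])

print(f"consistency = {consistency(x, y):.2f}, coverage = {coverage(x, y):.2f}")

A consistency close to 1 supports treating the condition as sufficient for the outcome; the corresponding necessity test simply swaps the denominator.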
This paper compares these two approaches to international comparative research in order to test the explanatory power of structural and cultural factors for country differences in female labor force participation. Using macro-level indicators for more than 80 countries, we perform both cross-country regression analysis and QCA to answer the question of whether the application of different methods leads to the same results and to new insights.