Thursday 20th July, 11:00 - 12:30 Room: N AUD5


Deviations and Fraud in Surveys - the Impact of Motivation and Incentives 1

Chair: Professor Katrin Auspurg (LMU Munich)
Coordinator 1: Professor Thomas Hinz (University of Konstanz)
Coordinator 2: Dr Natalja Menold (GESIS)
Coordinator 3: Professor Peter Winker (University of Giessen)

Session Details

The credibility of social science has repeatedly been jeopardized by recent, spectacular cases of deviant behavior in conducting surveys or fraud in presenting survey-based research results. Several times researchers have published path-breaking results that turned out to be ‘too good to be true.’ Because the incentive system in science commonly rewards originality more highly than accuracy, the detected cases of fabricating data or trimming results are most probably only the tip of the iceberg.

What makes the situation in survey research even more complex is that several actors are involved, each with manifold incentives to manipulate data: researchers, survey institutes, survey supervisors, interviewers, and respondents. Contributions to the session will discuss the motivation, prevalence, and implications of misbehavior by actors in survey research. Of interest are theoretical approaches and empirical studies on the motivation, detection, and prevention of data manipulation. Strategies to detect fraud deserve specific attention, but we also welcome empirical work on causal mechanisms: Which conditions are most likely to trigger fraud? Which interventions could accordingly work?

Some examples along the survey process highlight possible topics for the session:

(1) Respondents often share an interest with interviewers in saving time by taking inaccurate shortcuts through the questionnaire. Additionally, they are prone to provide false answers, for instance when questions are sensitive. Both kinds of behavior yield inaccurate measurements.

(2) Interviewers often have considerable discretion over many decisions in the process of conducting a survey (e.g. when selecting households in a random-walk sample, by shortening the interview time through steering respondents to filter options in questionnaires, or by making up interviews partly or entirely from scratch). The motivation for deviant behavior can be influenced by factors such as task difficulty, interviewers’ ability and experience, but also by the quality of questionnaires and instructions and other administrative characteristics of the survey.

(3) Survey institutes often operate commercially under high cost and time pressure. In order to fulfill their contractual obligations to their clients, they might have incentives, for instance, to change complex screening procedures without documenting the change, to manipulate statistics on non-response, or even to produce (near) duplicates to satisfy quotas.

(4) Finally, researchers in survey research can engage in questionable practices as well, for instance when they select cases and statistical models purposely to obtain the most sensational results.

Paper Details

1. Assuring the quality of survey data: Incentives, detection and documentation of deviant behavior
Professor Peter Winker (Justus-Liebig-University Giessen)

Research data are fragile and subject to classical measurement error as well as to the risk of manipulation. This also applies to survey data, which might be affected by deviant behavior at different stages of the data collection process. Assuring data quality requires focusing on the incentives to which all actors in the process are exposed. Relevant actors and some specific incentives are presented. The role of data-based methods for the detection of deviant behavior is highlighted, as well as their limitations when actors are aware of them. Conclusions are drawn on how settings can be improved to provide positive incentives. Furthermore, it is stressed that proper documentation of data quality issues in survey data is required both to increase trust in the data eventually used for analysis and to provide input for the development of new methods for detecting deviant behavior.
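
As an illustration of the data-based detection methods the abstract refers to (without naming a specific one), a common screen of this kind is the maximum percent match between interviews, i.e. the highest share of identical answers an interview shares with any other interview. The Python sketch below is a minimal version of that idea; the toy data, variable names, and the 0.85 flagging threshold are assumptions of this sketch, not part of the paper.

import numpy as np

def max_percent_match(responses):
    """For each interview (row), the highest share of identical answers it
    shares with any other interview; values near 1 suggest (near-)duplication."""
    n = responses.shape[0]
    best = np.zeros(n)
    for i in range(n):
        share = (responses == responses[i]).mean(axis=1)  # share of matching items per pair
        share[i] = 0.0                                     # ignore the self-comparison
        best[i] = share.max()
    return best

# Toy data (hypothetical): 200 interviews, 50 categorical items, one disguised copy.
rng = np.random.default_rng(0)
data = rng.integers(0, 5, size=(200, 50))
data[199] = data[0]                      # copy interview 0 ...
data[199, :3] = (data[0, :3] + 1) % 5    # ... and alter three items to disguise it

scores = max_percent_match(data)
print(np.where(scores > 0.85)[0])        # 0.85 is an arbitrary screening threshold

High scores flag interviews for manual review rather than prove fraud; as the abstract notes, such data-based checks lose power once actors are aware of them.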


2. Bias and efficiency loss in regression estimates due to duplicated observations: a Monte Carlo simulation
Dr Francesco Sarracino (STATEC and HSE)
Dr Malgorzata Mikucka (Université Catholique de Louvain and HSE)

Some recent studies have documented that survey data contain duplicate records. We assess how duplicate records affect regression estimates, and we evaluate the effectiveness of solutions for dealing with them. Results show that duplicate records affect regression estimates: the chances of obtaining unbiased estimates when the data contain 40 doublets (about 5% of the sample) range between 3.5% and 11.5%, depending on the distribution of the duplicates. If 7 quintuplets are present in the data (2% of the sample), then the probability of obtaining biased estimates ranges between 11% and 20%. Weighting the duplicate records by the inverse of their multiplicity, or dropping superfluous duplicates, are the solutions that perform best in all considered circumstances. Our results illustrate the risk of using data in the presence of duplicate records and call for further research on strategies to analyze affected data.
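
To illustrate the mechanics the abstract describes, a minimal Monte Carlo sketch in Python might look as follows; the sample size, true slope, contamination pattern, and number of replications are assumptions chosen for illustration and do not reproduce the paper's actual design.

import numpy as np

rng = np.random.default_rng(42)

def ols_slope(X, y, w=None):
    """(Weighted) least-squares fit; returns the slope coefficient only."""
    if w is None:
        w = np.ones(len(y))
    Xw = X * w[:, None]
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)[1]

n, n_doublets, reps = 800, 40, 2000
shift_naive, shift_weighted = [], []

for _ in range(reps):
    x = rng.normal(size=n)
    y = 1.0 + 0.5 * x + rng.normal(size=n)          # true slope = 0.5
    X = np.column_stack([np.ones(n), x])
    clean = ols_slope(X, y)

    # Contaminate the sample: copy 40 randomly chosen records once each (doublets, ~5%).
    dup = rng.choice(n, size=n_doublets, replace=False)
    Xd, yd = np.vstack([X, X[dup]]), np.concatenate([y, y[dup]])
    shift_naive.append(ols_slope(Xd, yd) - clean)

    # Remedy named in the abstract: weight every record by the inverse of its multiplicity.
    w = np.ones(len(yd))
    w[dup] = 0.5
    w[n:] = 0.5
    shift_weighted.append(ols_slope(Xd, yd, w) - clean)

print("mean |shift| with duplicates  :", np.mean(np.abs(shift_naive)))
print("mean |shift| after reweighting:", np.mean(np.abs(shift_weighted)))   # ~0

With exact doublets, weighting each record by the inverse of its multiplicity reproduces the clean-data estimate exactly, which gives an intuition for why the abstract reports this remedy performing well.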


3. Exit polls in Georgia: achieving accurate measurements in a challenging environment
Mr Nicolas Becuwe (Kantar Public)
Mr Christopher Hanley (Kantar Public)

In emerging democracies in the post-Soviet region, survey measurements related to elections suffer from a lack of accuracy and from significant biases. This is largely due to tensions between the country's political forces, as well as to external pressures faced by pollsters. The lack of reliable and valid polling data in turn undermines trust in election results and the legitimacy of elected representatives. We examine ways in which the quality of survey data may be improved, using the example of a large-scale exit poll held in Georgia in October 2016.

In October 2016, the Republic of Georgia held parliamentary elections to choose 150 Members of Parliament. Brussels-based Kantar Public, together with GORBI (Georgian Opinion Research Business International), was commissioned by a consortium of television companies (led by Imedi TV, the second most popular TV station in Georgia) to conduct an exit poll for the first round of the elections.

One of the main difficulties we faced when conducting the exit poll was the lack of existing reliable data from previous exit polls conducted in Georgia and doubts surrounding the validity of previous election results. The exercise was further complicated by the tensions and allegations between political parties and media throughout the campaign. Our exit poll as well as our pre-election survey systematically underestimated the opposition party ratings and overestimated those of the governing party.

In discussing the organisation of the study, we will focus on how national and international research companies worked together to develop it. Laying the groundwork for achieving accurate measurements will be central to the discussion. This paper will propose solutions to ensure better measurement in this context.

The paper will focus on the following aspects:

- The lack of valid data as a major problem when conducting political research in Georgia
- Sampling design in Georgia using past vote
- Measurement and validity of results of the exit poll (differences with actual results)
- Repeated underestimation of the opposition party and overestimation of the governing party


4. Time trends and risk-factors in publication bias: An analysis of the caliper test in the Quarterly Journal of Economics 1960 – 2015
Ms Julia Jerke (University of Zurich)
Professor Heiko Rauhut (University of Zurich)

Publication bias is present when the publication of a manuscript depends (directly or indirectly) on the results reported therein, with significant results being assigned a higher chance of publication. This bias can have two sources. First, publishers and reviewers may prefer manuscripts with novel and significant results, expecting them to attract more citations. Second, in anticipation of journals’ potential preference for significant results, researchers may either refrain from submitting manuscripts reporting negative or insignificant effects, or manipulate their results to make them publishable. Such selection biases the published scientific literature toward overly significant and hypothesis-confirming results.
The caliper test is a widespread method for estimating the existence and extent of publication bias. This test focuses on reported test statistics, analysing their distribution at the critical thresholds of significance. It is based on the assumption that, in the absence of publication bias, the frequency of estimates in narrow, equal-sized intervals just above and just below the critical threshold should follow a uniform distribution. A substantial overrepresentation of estimates in the interval just above the critical threshold is thus taken as evidence of publication bias. Previous studies applying the caliper test have shown that publication bias is a substantial problem in American and German sociology, social psychology, political science, and several other disciplines. However, there are few studies examining the mechanisms that promote or reduce publication bias. It is also unknown whether this is a rather new phenomenon or whether it has evolved over recent decades. Most studies on publication bias were published in recent years, so the answer to this question remains unclear.
The objective of the study is twofold. First, we examine several factors potentially influencing the magnitude of publication bias, such as author group size, number of citations as a proxy for originality, explicit vs. implicit hypotheses, experiment vs. field study, and funding. Second, we investigate a time trend. We collected data from all volumes of the Quarterly Journal of Economics, one of the leading economics journals, published between 1960 and 2015. The sample consists of all quantitative articles reporting empirical studies. To test for publication bias we screen these articles, extract the reported z- or t-values, and subsequently analyse their distribution at the common levels of significance. Spanning more than fifty years, this study is unique in presenting the longest time trend of publication bias to date.
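
As an illustration of the caliper test described in the abstract above, a minimal version counts the reported |z| statistics in narrow windows just below and just above the 1.96 threshold and tests the split against the uniform (fifty-fifty) benchmark. The caliper width, the binomial test, and the toy z-values in this Python sketch are assumptions made for illustration, not the authors' exact procedure.

import numpy as np
from scipy.stats import binomtest

def caliper_test(z_values, threshold=1.96, width=0.10):
    """Count |z| statistics just above vs. just below the threshold; in the absence
    of publication bias the two counts should be about equal, so the surplus above
    is assessed with a one-sided binomial test against p = 0.5."""
    z = np.abs(np.asarray(z_values))
    below = int(np.sum((z >= threshold - width) & (z < threshold)))
    above = int(np.sum((z >= threshold) & (z < threshold + width)))
    return above, below, binomtest(above, above + below, p=0.5, alternative="greater").pvalue

# Toy z-values (made up, not data from the study): mostly unremarkable results
# plus a suspicious pile-up just above the 5% significance threshold.
rng = np.random.default_rng(1)
z_reported = np.concatenate([rng.normal(1.0, 1.0, 400), rng.normal(2.05, 0.03, 30)])
above, below, p = caliper_test(z_reported)
print(f"just above: {above}, just below: {below}, binomial p = {p:.3f}")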