Deviations and Fraud in Surveys - the Impact of Motivation and Incentives 1
Chair | Professor Katrin Auspurg (LMU Munich)
Coordinator 1 | Professor Thomas Hinz (University of Konstanz) |
Coordinator 2 | Dr Natalja Menold (GESIS) |
Coordinator 3 | Professor Peter Winker (University of Giessen) |
Research data are fragile and subject to classical measurement error as well as to the risk of manipulation. This also applies to survey data, which may be affected by deviant behavior at different stages of the data collection process. Assuring data quality requires focusing on the incentives to which all actors in the process are exposed. Relevant actors and some of their specific incentives are presented. The role of data-based methods for detecting deviant behavior is highlighted, as well as their limitations when actors are aware of them. Conclusions are drawn on how settings can be improved to provide positive incentives. Furthermore, it is stressed that proper documentation of data quality issues in survey data is required, both to increase trust in the data eventually used for analysis and to provide input for the development of new methods for detecting deviant behavior.
Some recent studies have documented that survey data contain duplicate records. We assess how duplicate records affect regression estimates, and we evaluate the effectiveness of solutions for dealing with them. Results show that duplicate records affect regression estimates: the chances of obtaining unbiased estimates when data contain 40 doublets (about 5% of the sample) range between 3.5% and 11.5%, depending on the distribution of duplicates. If 7 quintuplets are present in the data (2% of the sample), the probability of obtaining biased estimates ranges between 11% and 20%. Weighting the duplicate records by the inverse of their multiplicity, or dropping superfluous duplicates, are the solutions that perform best in all considered circumstances. Our results illustrate the risk of using data in the presence of duplicate records and call for further research on strategies to analyze affected data.
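As a rough illustration (not the authors' code), the two best-performing remedies can be sketched in Python with pandas and statsmodels; the column names (resp_id, x, y) and the toy data are assumptions made for the example:

```python
import pandas as pd
import statsmodels.api as sm

# Toy survey data in which some respondent records appear more than once.
df = pd.DataFrame({
    "resp_id": [1, 2, 2, 3, 4, 4, 4, 5],
    "x":       [0.5, 1.2, 1.2, 0.3, 2.0, 2.0, 2.0, 1.1],
    "y":       [2.1, 3.4, 3.4, 1.9, 4.0, 4.0, 4.0, 2.8],
})

# Remedy 1: weight each record by the inverse of its multiplicity,
# so that every underlying respondent contributes a total weight of one.
df["weight"] = 1.0 / df.groupby("resp_id")["resp_id"].transform("size")
wls = sm.WLS(df["y"], sm.add_constant(df["x"]), weights=df["weight"]).fit()

# Remedy 2: drop superfluous duplicates, keeping one record per respondent.
dedup = df.drop_duplicates(subset="resp_id", keep="first")
ols = sm.OLS(dedup["y"], sm.add_constant(dedup["x"])).fit()

print(wls.params, ols.params, sep="\n")
```

With exact doublets and quintuplets, as in this toy example, both remedies recover the same regression as the unduplicated data; the abstract's point is that which remedy performs better in practice depends on how the duplicates are distributed.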
In emerging democracies in the post-Soviet region, survey measurements related to elections suffer from a lack of accuracy and from significant biases. This is largely due to tensions between competing political forces, as well as to external pressures faced by pollsters. The lack of reliable and valid polling data in turn undermines trust in election results and the legitimacy of elected representatives. We examine ways in which the quality of survey data may be improved, using the example of a large-scale exit poll held in Georgia in October 2016.
In October 2016, the Republic of Georgia held parliamentary elections to elect 150 Members of Parliament. Brussels-based Kantar Public, together with GORBI (Georgian Opinion Research Business International), was commissioned by a consortium of television companies (led by Imedi TV, the second most popular TV station in Georgia) to conduct an exit poll for the first round of the elections.
One of the main difficulties we faced when conducting the exit poll was the lack of reliable data from previous exit polls conducted in Georgia and the doubts surrounding the validity of previous election results. The exercise was further complicated by the tensions and allegations between political parties and the media throughout the campaign. Our exit poll, as well as our pre-election survey, systematically underestimated the ratings of the opposition party and overestimated those of the governing party.
In discussing the organisation of the study, we will focus on how national and international research companies worked together to develop it. Laying the groundwork for accurate measurement will be central to the discussion. The paper will propose solutions to ensure better measurement in this context.
The paper will focus on the following aspects:
- The lack of valid data as a major problem when conducting political research in Georgia
- Sampling design in Georgia using past vote
- Measurement and validity of results of the exit poll (differences with actual results)
- Repeated underestimation of the opposition party and overestimation of the governing party
Publication bias is present when the publication of a manuscript depends (directly or indirectly) on the results reported therein, with significant results having a higher chance of being published. This bias can have two sources. First, publishers and reviewers may prefer manuscripts with novel and significant results, expecting them to attract more citations. Second, anticipating journals' potential preference for significant results, researchers may either refrain from submitting manuscripts reporting negative or insignificant effects or manipulate their results to make them publishable. Such selection biases the published scientific literature toward overly significant and hypothesis-confirming results.
The caliper test is a widespread method for estimating the existence and extent of publication bias. This test focuses on reported test statistics, analysing their distribution at the critical thresholds of significance. It is based on the assumption that, in the absence of publication bias, the frequency of estimates in narrow, equal-sized intervals just above and just below the critical threshold should follow a uniform distribution. A substantial overrepresentation of estimates in the interval just above the critical threshold is therefore taken as evidence of publication bias. Previous studies applying the caliper test have shown that publication bias is a substantial problem in American and German sociology, social psychology, political science, and several other disciplines. However, there are few studies examining the mechanisms that promote or reduce publication bias. It is also unknown whether publication bias is a relatively new phenomenon or has evolved over the last decades; most studies on publication bias were published in recent years, so the answer to this question remains unclear.
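To make the test's logic concrete, a minimal sketch in Python is given below. This is a generic illustration, not the authors' implementation; the threshold of 1.96 and the caliper width of 0.05 are illustrative choices, and the z-values in the usage line are hypothetical:

```python
import numpy as np
from scipy.stats import binomtest

def caliper_test(z_values, threshold=1.96, width=0.05):
    """Count |z| values just above vs. just below the significance threshold.

    Under no publication bias, estimates should fall into the two equal-sized
    intervals with roughly equal probability (p = 0.5), so an excess of values
    just above the threshold is evidence of publication bias.
    """
    z = np.abs(np.asarray(z_values, dtype=float))
    below = int(np.sum((z >= threshold - width) & (z < threshold)))
    above = int(np.sum((z >= threshold) & (z < threshold + width)))
    # One-sided binomial test: are "just above" counts overrepresented?
    result = binomtest(above, above + below, p=0.5, alternative="greater")
    return above, below, result.pvalue

# Example with hypothetical z-statistics harvested from a set of articles.
above, below, p = caliper_test([1.90, 1.97, 2.01, 1.94, 2.03, 1.92, 1.98])
print(above, below, p)
```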
The objective of this study is twofold. First, we examine several factors potentially influencing the magnitude of publication bias, such as author group size, number of citations as a proxy for originality, explicit vs. implicit hypotheses, experiment vs. field study, and funding. Second, we investigate a time trend. We collected data from all volumes of the Quarterly Journal of Economics, one of the leading economics journals, published between 1960 and 2015. The sample consists of all quantitative articles reporting empirical studies. To test for publication bias, we screen these articles, extracting z- or t-values, and subsequently analyse their distribution at the common levels of significance. Spanning a timeframe of more than fifty years, this study is unique in presenting the longest time trend of publication bias.