All time references are in CEST
Applications, Potentials, and Challenges when Using Google Trends in Combination or as Substitute for Surveys 2
Session Organisers | Professor Florian Keusch (University of Mannheim), Ms Johanna Mehltretter (University of Mannheim), Dr Christoph Sajons (University of Mannheim)
Time | Wednesday 19 July, 14:00 - 15:00
Room | U6-01c
Aggregated Internet search data from Google Trends are increasingly used as a supplement or alternative to survey data. Proponents of Google Trends argue that anonymous search queries of Internet users are a good reflection of true interests, behaviors, and attitudes, particularly for sensitive topics, where surveys suffer from measurement error due to social desirability. In addition, Google Trends allows researchers to study changes in topic salience, attitudes, and behaviors across time and geographic areas at much finer granularity than is possible in surveys. On the downside, using Google Trends data raises several problems. First, not everybody uses the Google search function, potentially leading to selection bias. Second, Google Trends only provides search volumes based on a sample of all search queries, so questions of reliability arise. Third, it is often unclear how validly the selected search terms measure the constructs of interest.
In this session, we aim to bring together empirical evidence on the state-of-the-art use of Google Trends data in combination with or as an alternative to self-reports from surveys. Submissions can be methodological in orientation or can be substantive applications that demonstrate the usefulness and assess the quality of Google Trends data. Potential topics for submissions include, but are not limited to:
- Validation of Google Trends data
- Comparison of different approaches to select appropriate keywords
- Approaches to overcome reliability issues of Google Trends data
- Triangulation through joint use of Google Trends with surveys
- Analysis strategies for Google Trends data
- Best practices for transparent documentation when working with Google Trends
- Social science applications of the use of Google Trends data to measure specific attitudes, behavior, and topic salience
Ms Anne-Sophie Oehrlein (GESIS - Leibniz Institute for the Social Sciences) - Presenting Author
Dr Tobias Gummer (GESIS - Leibniz Institute for the Social Sciences)
Google Trends makes aggregated search engine data available, enabling researchers to investigate trends in search term usage on Google Search. The data obtained via Google Trends are relative search volumes for a selected search term across a predefined period of time and location. Depending on when a data retrieval query is issued and which period is specified, data are drawn from different samples: a real-time sample and a non-realtime sample. The real-time sample offers data of high granularity, whereas the non-realtime sample is a sample of search engine data starting from 2004. Previous research has questioned the reliability of non-realtime data and suggested that combining multiple samples (i.e., re-sampling) may help to mitigate these reliability issues. However, search terms appear to differ in how well such procedures perform. It remains an open question how to implement re-sampling in practice when aiming to reduce differences between non-realtime and real-time data, and specifically how many samples to combine for a given search term. With the present study, we address the reliability of Google Trends data by investigating two research questions: (i) How does combining multiple samples reduce differences between non-realtime and real-time data? (ii) Does the performance of re-sampling differ between search terms?
To answer these research questions, we will collect real-time data for a one-week period. We will then collect daily re-samples of non-realtime data for the same search terms and periods for at least two months (i.e., 60 samples per search term). In our analyses, we will investigate how re-sampling reduces differences from the real-time sample, conditional on search terms.
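The re-sampling idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function names, the point-wise averaging rule, the mean absolute difference as the error measure, and the synthetic series are all assumptions made for illustration. The sketch averages the first k non-realtime samples and reports how far that average lies from a reference (real-time) series as k grows.

```python
import random
from statistics import mean


def average_samples(samples):
    """Point-wise mean of several equally long series (one value per day)."""
    return [mean(values) for values in zip(*samples)]


def mean_abs_diff(series_a, series_b):
    """Mean absolute difference between two equally long series."""
    return mean(abs(a - b) for a, b in zip(series_a, series_b))


def resampling_curve(samples, reference):
    """Error of the averaged series after combining the first k samples, k = 1..n."""
    return [
        mean_abs_diff(average_samples(samples[:k]), reference)
        for k in range(1, len(samples) + 1)
    ]


# Synthetic illustration only: these numbers are invented, not Google Trends data.
rng = random.Random(42)
reference = [40, 45, 50, 60, 70, 55, 48]  # hypothetical one-week real-time series
samples = [
    [value + rng.gauss(0, 5) for value in reference]  # noise stands in for sampling error
    for _ in range(60)  # 60 daily re-samples, as in the abstract
]

curve = resampling_curve(samples, reference)
for k in (1, 5, 20, 60):
    print(f"k={k:2d}  mean absolute difference={curve[k - 1]:.2f}")
```

Running the same function separately per search term would give one curve per term, which is one possible way to compare how quickly re-sampling stabilises for different terms.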
Miss Johanna Mehltretter (University of Mannheim) - Presenting Author
Professor Florian Keusch (University of Mannheim)
Dr Christoph Sajons (University of Mannheim)
Researchers increasingly use aggregated Internet search data, in particular from Google Trends, as a supplement or alternative to survey data. These data are assumed to be less prone to recall bias or social desirability bias for sensitive topics, can be accessed almost in real time, and allow researchers to study changes in interests, attitudes, and behaviors across time and geographic areas at much finer granularity than in traditional surveys. However, using this kind of data comes with important challenges with respect to construct validity, sample stability, and representativeness that may severely restrict the meaningfulness of the obtained results. In this paper, we describe and assess the state of the art of research with Google Trends data in the social sciences. We first identify and discuss the most important issues for valid and reliable measurement of topic salience, attitudes, and behaviors. Next, we conduct a systematic literature review of 365 studies using Google Trends data in the social sciences to (1) illustrate habits and trends over the past decade and (2) assess whether researchers take the identified challenges into account. The results show that the large majority of studies fail to assess the validity of their Google Trends measure, do not consider whether the retrieved data are consistent across samples, and show no awareness of the lack of representativeness of their data. We conclude with a set of guidelines that will help researchers reduce these problems and work properly with Google Trends data.
Ms Anna Meisser (FORS) - Presenting Author
Mr Max Felder (FORS)
Mr Nicolas Pekari (FORS)
The use of big data, such as Google Trends data, as an alternative to traditional survey data has increased dramatically in the past decade. Despite the advantages of low cost and ease of collection, validation, replicability, and complexity remain challenging. Systematic reviews show that most studies have been carried out in monolingual, short-term settings and are often not validated with corresponding survey data. This study aims to test the ability of Google Trends data to substitute for traditional survey data in specific cases or to support it. Different party-affiliated words are used as a proxy to measure the strength of party preferences in the different Swiss cantons between 2019 and 2023. The collected Google data will be compared to survey data from the Swiss Election Study (Selects) as well as official election results from Swiss governmental elections. This serves to validate the party-strength proxy measurements over time and across regions. While internet users, voters, and survey respondents overlap, not all internet users participate in elections and vice versa. Therefore, the accuracy and plausibility of Google Trends data over time are tested with controls, such as party affiliation or left-right placement, from representative survey and election data. Lastly, the validated Google Trends model will be compared to traditional survey data models in terms of how accurately each predicts cantonal party strength for the October 2023 national parliamentary elections in Switzerland. Testing the validity and plausibility of Google Trends data in a multilingual country serves to scope out the potential of using such data to measure more complex constructs, such as issue attitudes, in the future.
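One simple way to picture the validation step described in this abstract is to correlate a search-based party-strength proxy with official vote shares across regions. The sketch below is an illustration under stated assumptions, not the authors' model: the helper names, the conversion of search volumes into shares, the use of Pearson's correlation, and every number are hypothetical. It also assumes that the volumes for the parties in one region come from the same retrieval, so that they are on a comparable scale.

```python
from math import sqrt


def search_share(volumes):
    """Convert per-party search volumes in one region into shares that sum to 1."""
    total = sum(volumes.values())
    if total == 0:
        raise ValueError("no search volume recorded for this region")
    return {party: volume / total for party, volume in volumes.items()}


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder values for illustration only; not real search or election data.
volumes_by_region = {
    "region_1": {"party_a": 30, "party_b": 70},
    "region_2": {"party_a": 55, "party_b": 45},
    "region_3": {"party_a": 80, "party_b": 20},
}
vote_share_party_a = {"region_1": 0.25, "region_2": 0.50, "region_3": 0.70}

regions = sorted(volumes_by_region)
proxy = [search_share(volumes_by_region[r])["party_a"] for r in regions]
official = [vote_share_party_a[r] for r in regions]
print(f"correlation between proxy and vote share: {pearson(proxy, official):.3f}")
```

A high correlation on its own would not establish validity, since internet users and voters only partly overlap; the abstract's additional controls from survey data address that gap.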