Validity of Vignette-Designs

Chair: Mrs Lena Verneuer (University of Bielefeld, Faculty of Sociology)
Coordinator 1: Professor Stefanie Eifler (Catholic University Eichstätt, Faculty of History and Social Science)
In recent decades, factorial survey experiments (FSEs) have become an increasingly widespread and successful method for measuring and analyzing attitudes, judgments, beliefs, opinions, preferences and behavioral intentions. An FSE is a type of survey experiment consisting of scenarios, typically textual ones (called vignettes), that combine several treatments (called dimensions) with controlled, varying doses (called levels). Besides the experimental design, a further strength of FSEs is the possibility of collecting several vignette evaluations (typically 5 to 12) per respondent in a single survey. These repeated measurements enable differentiated analyses of how experimental treatments and respondent-specific factors influence the vignette evaluations.
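The combination of dimensions and levels into vignettes can be sketched in a few lines of Python. This is only an illustration: the dimension names, their levels, the deck size and the helper functions `vignette_universe` and `draw_deck` are hypothetical and are not taken from any of the studies in this session.

```python
from itertools import product
import random

# Hypothetical dimensions (treatments) and their levels (doses).
dimensions = {
    "gender": ["female", "male"],
    "age": [30, 45, 60],
    "earnings": [1500, 2500, 3500, 4500],
}

def vignette_universe(dims):
    """Return every combination of levels, one dict per vignette."""
    names = list(dims)
    return [dict(zip(names, combo)) for combo in product(*dims.values())]

def draw_deck(universe, size, seed=0):
    """Draw a random deck of distinct vignettes for one respondent."""
    rng = random.Random(seed)
    return rng.sample(universe, size)

universe = vignette_universe(dimensions)  # 2 * 3 * 4 combinations
deck = draw_deck(universe, size=8)        # 8 evaluations per respondent
```

Simple random sampling from the full universe is just one way to assign vignettes; real designs may instead use fractional or otherwise optimized samples.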
But such repeated measurements are also a possible threat to the internal validity of survey designs. For repeated evaluation tasks, it has to be expected that respondents first learn about the structure of the task, leading to more consistent and differentiated evaluations (learning-effect), and then, after reaching a turning point, focus on the more salient aspects of the task, leading to less consistent and differentiated evaluations (fading-effect). Salient aspects in the context of an FSE are dimensions with comparatively many levels or a comparatively noticeable position in the presentation of the vignettes. Possible mechanisms creating fading-effects are the physical strain (i.e., workload) that the task imposes on respondents and the related satisficing (instead of maximizing) response behavior. Satisficing implies that respondents try to find a behavioral heuristic to reduce the workload of a task. Past research finds evidence of learning- and fading-effects for repeated evaluations in FSEs as well as in other types of survey experiments. These studies use either the effect sizes of experimental treatments or response time records as indicators. Response time-indicators do not allow learning- and fading-effects to be distinguished, because a decline in response times over vignette evaluations is predicted for both effects. In contrast, effect size-indicators are expected to increase over the learning phase and decrease during the fading phase of an evaluation task. But effect size-indicators are rather detached from the physical behavioral correlates of the proposed satisficing mechanism.
An alternative that addresses this problem is to use indicators based on eye-tracking data consisting of gaze-fixation measurements. For fixation-indicators, a decrease in clustering on salient aspects of the evaluation task is expected during the learning phase and an increase in clustering on salient aspects over the fading phase. Our study is the first to assess this hypothesis in the context of an FSE. For our test, we use data from an FSE on justice attitudes regarding earnings, constructed using 8 dimensions. The data were collected in a computer laboratory from a sample of tertiary students. Each of the 79 participants evaluated 20 vignettes. The large number of vignettes per respondent was chosen to ensure enough repeated measurements per participant for possible fading-effects to occur. The related eye-tracking data were recorded using a remote eye-tracker (EAS Binocular from LC Technologies), which enables separate gaze-fixation measurements for each eye.
In survey methodology, experimental designs such as vignette studies (also called factorial surveys) have gained broad attention in the social sciences. The advantages are obvious: due to controlled settings, experimental designs allow causal interpretations and offer higher internal validity. Experiments can also easily be conducted as an add-on to established surveys, and the data can be collected online, which minimizes costs. Furthermore, because of the multidimensionality of the vignettes, factorial surveys are considered less prone to provoking socially desirable responses on sensitive issues (such as discrimination).
However, there are also critical voices. In vignette studies, respondents react to hypothetical descriptions of given scenarios (e.g. situations or persons). It is therefore debatable whether this experimental design is externally valid.
This paper compares responses to single-item questions with vignette judgements concerning strategies for recruiting immigrants on the German labour market. For this purpose, I use a merged dataset from a telephone survey (first stage) and a follow-up online survey including a factorial survey (second stage), covering more than 5000 companies (human resource managers) in Germany. Both survey stages concentrated on recruitment strategies, with a special focus on migrants and on possible skill shortages or hiring problems in the companies.
The analysis focuses both on determinants that influence the actual employment of foreign skilled workers and on those that influence hypothetical recruitment decisions. Because the same companies were surveyed in both stages, it is possible to compare their responses and in this way to gain information about the external validity of their answers.
Previous methodological research on factorial survey (FS) designs has examined the consistency of vignette judgements within respondents (Sauer, Auspurg, Hinz & Liebig 2011; Teti, Gross, Knoll & Blüher 2015). Both approaches delivered helpful insights into how much complexity FS respondents can cope with, depending on age and other sociodemographic determinants. However, the concept of consistency does not account for the number of vignette dimensions that have been evaluated. That is, respondents with low cognitive capacities may have very high consistency values while having evaluated only a few vignette dimensions. This contribution examines both respondents' consistency and the number of significant dimensions per respondent, as a proxy for the dimensions that have been taken into account. We focus on the question of whether respondents who are familiar with the topic of the FS show more consistent answer patterns and whether consistency decreases with increasing age.
The data come from a factorial survey of a sample of residents of Chemnitz (Germany) aged 18 to 65. The respondents were asked to place fictitious organ recipients on a waiting list. The attributes of these fictitious recipients were experimentally varied with regard to gender, age, family status, having children, being employed, urgency and probability of success of the transplantation, and being a potential donor themselves (the fictitious recipient has an organ donor card: yes or no). At the respondent level, information on gender, age, family status, SES, self-rated health and being a potential donor is collected as well. Familiarity with the topic is measured by the respondent's involvement with the subject of organ donation (whether they collected information on organ procurement via internet searches, by talking to experts or friends, etc.).
For the data collection, a quota sample was used, so that all combinations of being a potential organ donor, age and gender of the respondents are represented in the sample. The data collection is almost complete; 120 cases have been realized so far. Consistency will be estimated (following Teti, Gross, Knoll and Blüher 2015) in terms of the unexplained variance (error term) in the vignette judgments of each respondent. In a first step, we will calculate the absolute value of the residuals per respondent at the vignette level by estimating a regression for each respondent, using the rank assigned on the fictitious waiting list of organ recipients as the dependent variable and all vignette variables as independent variables. In a second step, we will use the respondent-specific consistency values as the dependent variable and respondent characteristics as covariates.
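The two-step procedure described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are simulated, the function names `respondent_consistency` and `second_step` are hypothetical, plain OLS via NumPy is used in both steps, and averaging the absolute residuals into one value per respondent is an assumed aggregation that the abstract does not specify.

```python
import numpy as np

def respondent_consistency(X, y):
    """Step 1: OLS of one respondent's ratings y on the vignette
    variables X; returns the mean absolute residual
    (lower values indicate more consistent judgments)."""
    X1 = np.column_stack([np.ones(len(y)), X])  # add intercept
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    residuals = y - X1 @ beta
    return float(np.mean(np.abs(residuals)))

def second_step(consistency, covariates):
    """Step 2: OLS of the respondent-level consistency values on
    respondent characteristics; returns intercept and slopes."""
    Z = np.column_stack([np.ones(len(consistency)), covariates])
    gamma, *_ = np.linalg.lstsq(Z, consistency, rcond=None)
    return gamma

# Simulated data (assumption): 40 respondents, 20 vignettes each,
# 3 binary vignette variables, rating noise growing with age.
rng = np.random.default_rng(0)
n_resp, n_vig, n_dims = 40, 20, 3
ages = rng.integers(18, 66, size=n_resp)
scores = np.empty(n_resp)
for i in range(n_resp):
    X = rng.integers(0, 2, size=(n_vig, n_dims)).astype(float)
    noise_sd = 0.5 + 0.02 * ages[i]
    y = 5 + X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, noise_sd, n_vig)
    scores[i] = respondent_consistency(X, y)

gamma = second_step(scores, ages.astype(float))  # [intercept, age slope]
```

With only a handful of vignettes per respondent, the per-respondent regressions have few degrees of freedom, so the residual-based measure is sensitive to the number of vignette variables included in the first step.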