Question Pretesting and Evaluation: Challenges and Current Practices 2

Chair | Dr Cornelia Neuert (GESIS – Leibniz Institute for the Social Sciences)
Coordinator 1 | Dr Ellen Ebralidze (LIfBi – Leibniz Institute for Educational Trajectories)
Coordinator 2 | Kerstin Hoenig (LIfBi – Leibniz Institute for Educational Trajectories)
The UK Office for National Statistics (ONS) is moving its business and social surveys, and the Census, to electronic modes of collection. This has led to a departure from traditional cognitive interviewing methods towards a combined approach of cognitive and usability testing. Using the UK Innovation Survey (UKIS) as an example, the paper will explore the challenges of redesigning a business survey from paper to an online collection mode.
The collaborative approach to questionnaire prototyping and testing between social researchers, user experience designers, and developers will be outlined. Emphasis will be given to the complexities and challenges of developing an online questionnaire: ensuring methodological rigour while maximising accessibility for respondents with additional needs (e.g., users of screen readers and zoom-text software).
The ONS is developing a bespoke electronic questionnaire design system for all surveys and the Census. In addition to pre-testing UKIS online with non-incentivised business respondents, designs for generic and re-usable question ‘patterns’ are being tested simultaneously. The aim is to develop a library of standardised question patterns that are responsive to device type and meet government accessibility requirements. Once the research and build phases are complete, questionnaire designers and researchers will be able to use the tool to author questionnaires.
In the presentation, examples will be given to demonstrate how research questions were developed to meet the dual aims described above for UKIS. The iterative design, testing, and reporting approach will be explained, with detail given about how user experience professionals were involved in testing as observers.
Recent years have seen a shift away from interviewer-administered towards self-administered modes of survey data collection. In addition to the relative cost benefits this shift offers, it can also, in part, be attributed to the technological advancement of online instrument features and the opportunities that online methods offer survey researchers. It is imperative that survey instruments keep pace with technology, offering respondents the easiest and most up-to-date ways of completing survey questionnaires in order to encourage response and discourage drop-out.
Evaluating the validity of survey questionnaires has been an important concern since the Cognitive Aspects of Survey Methodology (CASM) movement of the early 1980s, which saw the emergence of ‘cognitive testing’, a technique now widely used within the field of survey research to test the validity of survey instruments. Derived from similar principles is a newer technique termed ‘usability testing’, which has grown quickly in popularity in recent years. Usability testing is related to cognitive testing: it is alike in that it is inherently qualitative in nature, using the techniques of think-aloud and probing, and different in that, rather than focusing on participants’ interpretation of question wording, it focuses primarily on the self-completion instrument and how ‘user-friendly’ it is.
This paper will begin by outlining the variety of methods and administration modes used in a number of recently conducted usability testing projects at Kantar Public in the UK. These range from testing a new paper questionnaire to testing two online instruments across different types of device, including smartphones. We also used specialist observation software to watch remotely as respondents filled in an online questionnaire, which proved helpful in further shaping our findings. The usability testing of the online instruments focused on special features such as grids, sliders, and look-up and drop-down functions, features that are especially challenging when designing and optimising surveys for smartphone users.
The paper will next detail some of the key findings from the usability testing. Chief among these is that, somewhat ironically, at a time when opportunities to make survey instruments more ‘exciting’ to complete (for example through ‘gamification’) are on the rise, our findings, and those in the supporting literature, show that respondents are now looking for something simpler.
With these findings in mind, this paper will finally (a) put forward a set of recommendations for best practice when designing self-completion instruments (both paper and online) and (b) make suggestions for what we as survey researchers and methodologists might focus on next in the field of usability testing.
Survey methodologists have a broad set of methods at their disposal to evaluate survey questions. Thus, a key question with which any pretester is confronted is which methods are maximally productive in detecting potential problems with survey items. This study contributes to the understanding of this vital issue by presenting the results of two experiments on the productivity of eye tracking in survey pretesting, used in conjunction with the standard method of cognitive interviewing.
Eye tracking enables the researcher to see where and for how long respondents look when reading and answering survey items. This feature can be used to detect questions that are difficult to understand or that are otherwise flawed.
The first study reports on a method comparison experiment that was designed to test whether a joint implementation of eye tracking and cognitive interviewing is more productive in pretesting self-administered questionnaires than standard cognitive interviews alone by comparing both the total number of problems detected and the number of questions identified as flawed. The results show that cognitive interviewing and eye tracking complement each other effectively. The hybrid method detected more problems and identified more questions as problematic than applying cognitive interviewing alone.
The second study builds upon the first by examining how eye tracking assists cognitive interviewing most effectively. To this end, two retrospective probing techniques are compared: retrospective probing based on observed eye movements and gaze-video-cued retrospective probing, in which a video of their own eye movements is shown to respondents during the cognitive interview. The two conditions are compared with regard to the number and types of problems identified and the way they stimulate respondents when commenting on their behaviour. The results show that the two techniques did not differ in the total number of problems identified; however, gaze-video-cued retrospective probing identified fewer unique problems and fewer types of problems than pure retrospective probing.
Calendar methods have primarily been used in interviewer-administered surveys to collect retrospective life history data. However, adding an interviewer-administered life history component to an existing survey may not be the best option in terms of length and cost. While a web-based questionnaire may be a viable option for some populations (Glasner, van der Vaart, & Dijkstra, 2015), a mail questionnaire is the better option when studying an older population, for which web access is fairly limited. However, little work has been done to establish how best to collect this kind of information in a self-administered format and whether calendar methods are effective for this survey mode. This study will examine the use of question pretesting to evaluate the cognitive burden and usability of a self-administered, retrospective life history questionnaire. In addition, we conduct a split-ballot experiment to test the effectiveness of including a life history calendar as a memory aid.
Ninety-five participants aged 50 to 90 were recruited through multiple methods and offered a $25 incentive. Participants were mailed the questionnaire and asked to complete it before coming in for a semi-structured, in-person interview. We will examine the effects of the various recruitment strategies, participants’ perceptions of the proposed questions, and the stated and latent effects of the life history calendar on data quality and recall. We will also discuss the challenges and difficulties our research team faced in completing the pretest, along with the impact of the pretest on the development of the final questionnaire.