
ESRA 2025 Preliminary Program

              



All time references are in CEST

Assessing and improving survey data quality in low- and middle-income countries (LMICs)

Session Organisers Professor Timothy Johnson (University of Illinois at Chicago)
Mrs P. Linh Nguyen (University of Essex, University of Mannheim)
Dr Yfke Ongena (University of Groningen)
Time Tuesday 15 July, 13:30 - 15:00
Room Ruppert rood - 0.51

Researchers working in low- and middle-income countries from diverse disciplines, such as development economics, demography, and other social sciences, are increasingly engaged in investigating different aspects of the Total Survey Error to improve data quality. They are especially concerned with how survey data quality affects substantive results, as well as poverty and demographic rates.

This session aims to mainstream research on all aspects of survey data quality stemming from LMICs, where the historical development, the conditions, and the implementation of survey methodology differ from other contexts. We see this session as a unique opportunity to foster the network of like-minded researchers and practitioners, as well as to promote research results focusing on LMICs in preparation for the ESRA conference in 2027, organised jointly with the World Association for Public Opinion Research (WAPOR), and the first WAPOR conference in a Sub-Saharan African country, Kenya, in 2028.

Researchers may present their work on any issue(s) encountered along the full survey lifecycle from questionnaire development and testing, including scale development; translation, adaptation, and assessment of questionnaires into local languages; sampling innovations using unconventional sample frames; survey participation, data collection challenges and solutions through innovative uses of technology; minimizing measurement error; interviewer effects; survey data quality control; respondent comprehension and burden; etc.

There is no specific regional focus and papers may cover a variety of topics. Nevertheless, the studies to be considered should rely on data coming from LMICs. Cross-national comparisons in these contexts are also welcome.

Keywords: cross-cultural survey methods

Papers

Studying language switching in multilingual survey interviews – Evidence from a Zambian Face-to-Face Survey

Mrs P. Linh Nguyen (French Institute for Demographic Studies (INED), University of Essex, University of Mannheim) - Presenting Author

Most low- and middle-income countries are multilingual, so both interviewers and respondents speak multiple local languages to varying degrees. Because multilingual respondents, especially those with lower education, differ in their proficiency in the survey language, some will exhibit more cognitive processing problems, evidenced by audible manifestations of problematic interactional behaviours during the interview (i.e., by seeking clarification or repetition of the question).
Using interactional analysis of the recordings of ten selected questions from a survey on financial behaviour and attitudes in Zambia, based on a sample of more than 800 interviews in two local languages (Bemba and Chewa), we analyze the relationship between six indicators of problematic interactional behaviours and interviewer effects (1. language switches by either interviewers or respondents; 2. exact reading of the question; 3. any pre-emptive and follow-up behaviours by interviewers to obtain a codable answer (such as providing explanations without being asked, feedback, or probing); 4. seeking clarification; 5. indicators of uncertainty (including pauses, fillers, and repairs, as well as verbal expressions of uncertainty)).
This study’s objective is to document this process of language switching and its implications for survey data quality. Based on the analysis of 10 recorded questions for more than 1,000 respondents from a probabilistic household survey in Zambia, we estimate that about 2 to 7 percent of interviews exhibit some form of language switching. The results show for both provinces that language switching does not occur as a single phenomenon but always co-occurs with other problematic interactional behaviours. Thus, we can categorise language switching as another problematic behaviour indicating a breakdown of the cognitive answer process and a disruption of the ideal question-answer sequence.


Evaluating the Impact of Mode Transition from CAPI to CATI on Data Quality: Evidence from TURKSTAT’s Life Satisfaction Survey

Dr Hilal Arslan (Hacettepe University Institute of Population Studies) - Presenting Author
Mrs Aslıhan Kabadayı (Hacettepe University Institute of Population Studies)

As in many countries, the COVID-19 pandemic forced the Turkish Statistical Institute (TURKSTAT) to change its data collection and fieldwork strategy because opportunities to conduct face-to-face interviews were limited. As a result, for the majority of its surveys, face-to-face CAPI data collection was suddenly replaced by CATI. Under the Total Survey Error approach, the main types of measurement error stemming from the data collection mode are due to nonresponse, social desirability bias, satisficing behavior, differences in the handling of “don’t know” or refusal response options, contextual information, the use of visual or auditory cues, difficulty in maintaining attention, the presence of others during the interview, questionnaire design, and interview length, all of which may influence respondents’ answers. Against this background, our study investigates how interviewing by telephone instead of face-to-face changes the survey estimates of key target variables, i.e., life satisfaction, happiness, and satisfaction with subdomains of life. To our knowledge, this is the first study to assess data quality for the CAPI-to-CATI mode transition in TURKSTAT surveys. We were particularly interested in whether these differences are due to mode-specific measurement errors, including nonresponse, rather than coverage errors. To this end, we compared subjective well-being indicators derived from data collected in person (CAPI) with those collected over the phone (CATI), using descriptive statistics and multivariate statistical models. To explore the potential impact of the mode difference on our survey estimates, we analyzed survey data for the years 2003-2023, taking the complex sample design (stratification, clustering, and weighting) into account.
Preliminary findings of the study show that there are statistically significant changes in the distribution of the response categories for the selected attitudinal questions and subjective measures.


Task Driven CAPI Methodology

Mr Glen Heller (ICF)
Mr Alexander Izmukhambetov (Data Experts Consulting International) - Presenting Author
Ms Lindsey Anna (United States Agency for International Development)
Ms Monica Kothari (ICF)

In the mid-2000s, with the decreasing cost and increasing power and availability of mobile devices, field surveys transitioned from paper-based data collection to direct capture on digital devices. This transition triggered a paradigm shift in the development of software systems to accommodate this new approach. Unlike the linear, centralized, and pipeline-like nature of paper data entry, CAPI systems need to distribute, decentralize, and scale functionality across multiple endpoints and locations. With evolving digital platforms and increasingly accessible mobile internet, it became feasible to envision, develop, and mass-adopt a robust CAPI methodology, ensuring adequate functionality, efficiency, quality, and safety.

The Surveys for Monitoring in Resilience and Food Security (SMRFS) has chosen a data management methodology that revolves around task-driven data structures. Tasks are simple data records containing information about specific activities required from the user. Each task has a well-defined lifecycle: creation, transmission, transformation, and termination. The SMRFS CAPI data management system encodes a template for generating, moving, and tracking tasks over the underlying device-based and cloud-based infrastructure. This centralizes the enforcement of workflow rules and requirements while distributing and decentralizing task execution and progression. The methodology works alongside a role-based user identity system, with each role having a well-defined set of tasks within the workflow and specified interactions with other users.

In practical terms, this methodology streamlines complex data collection operations by enforcing protocol rules and boundaries and by automating data management. Users at all levels are only required to review and complete tasks as they are encountered, with the system handling the rest. This reduces the time and effort needed for the technical part of interviewer training, eliminates manual workflow management, reduces human error in the field, and provides comprehensive data collection quality feedback to the survey headquarters.
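
The task-driven idea described above can be illustrated with a minimal sketch. It is not the SMRFS system: the stage names follow the lifecycle given in the abstract (creation, transmission, transformation, termination), while the transition table, the `Task` fields, the role names, and the `advance` function are all hypothetical choices made for illustration. The point it tries to show is that the workflow rules live in one place while any user or device can move a task forward.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    CREATED = 1
    TRANSMITTED = 2
    TRANSFORMED = 3
    TERMINATED = 4


# Hypothetical transition rules, defined once and enforced centrally.
ALLOWED = {
    Stage.CREATED: {Stage.TRANSMITTED},
    Stage.TRANSMITTED: {Stage.TRANSFORMED, Stage.TERMINATED},
    Stage.TRANSFORMED: {Stage.TRANSMITTED, Stage.TERMINATED},
    Stage.TERMINATED: set(),
}


@dataclass
class Task:
    activity: str                      # what the user is asked to do
    role: str                          # role allowed to act on the task
    stage: Stage = Stage.CREATED
    history: list = field(default_factory=list)


def advance(task, new_stage, actor_role):
    """Move a task to new_stage if the actor's role and the transition are valid."""
    if actor_role != task.role:
        raise PermissionError(f"{actor_role} may not act on a {task.role} task")
    if new_stage not in ALLOWED[task.stage]:
        raise ValueError(f"{task.stage.name} -> {new_stage.name} is not allowed")
    task.history.append((task.stage, new_stage))  # audit trail for headquarters
    task.stage = new_stage
    return task
```

A usage example under the same assumptions: a task created for the role `"interviewer"` can be advanced with `advance(task, Stage.TRANSMITTED, "interviewer")`, whereas an attempt by a different role, or an attempt to skip directly from `CREATED` to `TERMINATED`, raises an error instead of silently corrupting the workflow.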


Using data on the presence of others during interviews to improve survey estimates

Mr M Moinuddin Haider (International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b)) - Presenting Author
Mr Md Mahabubur Rahman (icddr,b)
Mr Md Tazvir Amin (icddr,b)
Dr Nurul Alam (icddr,b)

Background: Household surveys often collect data on emotionally and socio-culturally sensitive topics. Conducting interviews in the absence of third persons is recommended, but ensuring privacy is often challenging, which may influence the responses and result in biased estimates. This study aims to estimate the effect of the presence of others during the interview (POODI) on survey estimates and to derive correction factors to adjust for the bias.
Method: We used data from the Sexual and Reproductive Health Survey 2024 conducted among Forcibly Displaced Myanmar Nationals (Rohingya refugees) in Bangladesh. In this survey, the enumerators recorded additional data on the POODI after each section of the questionnaire. The analytical sample includes 3205 married women aged 15-49. We examined the effect of POODI on reported current contraceptive use, unintended pregnancy and birth, child death, and physical violence. We estimated the prevalence of POODI and identified its correlates using logistic regression. The correction factor was estimated using augmented inverse probability weighting, which has the double-robust property. Age, education, and household size specify the logistic treatment model.
Results: The prevalence of POODI was 70%. Interestingly, the prevalence of POODI was 1.3 times higher among women who ever attended school than among women without schooling. The survey underestimated contraceptive use by 4.8% and unintended pregnancy by 20% in the POODI group compared to the no-POODI group. Reporting of unwanted birth, child death, and physical violence did not differ by POODI. Post-stratification correction factors to minimize the indicator-specific underestimation will be elaborated in the main paper.
Conclusion: The high prevalence of POODI can lead to underestimation of some sensitive indicators in densely populated or refugee communities, which is also likely to happen in other conservative populations. Adding question(s) about the POODI and incorporating the correction factor may improve the estimates of socially sensitive indicators.
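
For readers unfamiliar with augmented inverse probability weighting (AIPW), the following is a minimal sketch of the estimator mentioned in the Method section, not the authors' implementation. It assumes a binary indicator `t` (1 = others present), a binary outcome `y`, and a covariate matrix `X`. The abstract specifies a logistic treatment model; the logistic outcome models, the clipping of propensities, the function names, and the simulated data are assumptions made here for illustration only and do not reproduce the survey's data or results.

```python
import numpy as np


def fit_logistic(X, y, n_iter=50, ridge=1e-8):
    """Fit logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)
        hess = (X * w[:, None]).T @ X + ridge * np.eye(X.shape[1])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta


def predict_prob(X, beta):
    return 1.0 / (1.0 + np.exp(-X @ beta))


def aipw_difference(X, t, y):
    """AIPW estimate of E[Y(1)] - E[Y(0)] for binary t and binary y."""
    Xd = np.column_stack([np.ones(len(t)), X])
    # Treatment (propensity) model, clipped to avoid extreme weights (assumption).
    e = np.clip(predict_prob(Xd, fit_logistic(Xd, t)), 0.01, 0.99)
    # Outcome models fitted separately within each group, predicted for everyone.
    m1 = predict_prob(Xd, fit_logistic(Xd[t == 1], y[t == 1]))
    m0 = predict_prob(Xd, fit_logistic(Xd[t == 0], y[t == 0]))
    # Outcome-model prediction plus an inverse-probability-weighted residual.
    mu1 = np.mean(m1 + t * (y - m1) / e)
    mu0 = np.mean(m0 + (1 - t) * (y - m0) / (1 - e))
    return mu1 - mu0


def simulate(n=20000, seed=0):
    """Simulated data (hypothetical): true difference in reporting is -0.10."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 2))
    p_t = 1.0 / (1.0 + np.exp(-(0.5 + 0.8 * X[:, 0])))
    t = (rng.random(n) < p_t).astype(float)
    p_y = 0.40 - 0.10 * t + 0.05 * np.tanh(X[:, 0])
    y = (rng.random(n) < p_y).astype(float)
    return X, t, y


X, t, y = simulate()
estimate = aipw_difference(X, t, y)
naive = y[t == 1].mean() - y[t == 0].mean()
```

In this simulation the first covariate raises both the chance that others are present and the chance of reporting the outcome, so the naive difference in means is confounded, while the AIPW estimate remains consistent if either the treatment model or the outcome model is correctly specified. That is the double-robust property referred to in the abstract.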