Q: Each year, Pew Research Center conducts polls in dozens of countries. How does the Center conduct surveys in so many different places?

To conduct its international polls, Pew Research Center works closely with experienced social and political research organizations to identify and manage local polling firms in each country included in our surveys. We carefully evaluate local vendors and select only those firms that meet our requirements for rigorous fieldwork procedures and thorough quality control.

Our top commitment is to data quality, a multifaceted concept that involves careful attention at every phase of the survey process, from drawing the sample, to conducting the interviews, to processing a dataset and beyond. This also means constantly staying abreast of new technologies and new best practices.

Q: Once you have selected a local polling partner, what role do you play in conducting the survey?

As a practical matter, Pew Research Center has to rely on local vendors to organize and manage field operations. But the Center is involved in every phase of the research. We develop the survey questionnaire, evaluate and approve the sampling plan and carefully evaluate the quality of the data produced. We have found that the protocols governing how interviewers conduct a survey are especially important for ensuring that a survey accurately represents the attitudes of the public and key subgroups, and so we pay a good deal of attention to designing rigorous protocols.

Q: What kind of information do you need to gather to monitor data quality from afar?

We remain in regular contact with our research partners — the professional research firms we hire to help identify, coordinate and oversee the extensive network of local vendors — as well as with the local vendors themselves. We also have a comprehensive process for confirming that a local firm understands and can execute the required sampling plan and fieldwork procedures.

The information we collect to do this includes:

  • a detailed account of the method for choosing respondents (the sample design) and the challenges faced during fieldwork;
  • extensive documentation of the population statistics used to evaluate sample performance, or in less formal terms, an understanding of the population targets that the fieldwork team is trying to hit (e.g., the share of men and women) and how successful they were in that effort. In the U.S., for comparison, these population statistics would be based on the work of the U.S. Census Bureau;
  • records for each interview of the interviewer who conducted it, as well as the supervisor overseeing that interviewer and the person who entered the resulting data. This allows for retrospective investigation of unusual patterns in the data;
  • extensive logistical details for each interview, including time of day, location, duration and presence of others; and
  • when possible, data on all contacts made to complete the survey, including households that did not respond to the survey invitation and people who refused to participate.
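Taken together, the items above amount to a per-interview metadata record. A minimal sketch of what such a record might contain is shown below; the field names and values are hypothetical illustrations, not Pew Research Center's actual schema:

```python
# Hypothetical per-interview metadata ("paradata") record; all field
# names and values are illustrative, not an actual Pew data format.
interview_record = {
    "interview_id": "XX-2016-00412",
    "interviewer_id": "INT-07",      # who conducted the interview
    "supervisor_id": "SUP-02",       # who oversaw that interviewer
    "data_entry_id": "DE-11",        # who entered the resulting data
    "start_time": "2016-03-01T18:40:00",
    "duration_min": 42,
    "location": "Region North / District 3",
    "others_present": True,
    "contact_attempts": 3,           # includes refusals and non-contacts
    "disposition": "completed",
}
```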

Q: How involved is Pew Research Center in training the interviewers conducting the surveys?

Over time, we have become more involved in this phase of a project. Mainly, we want to be certain that interviewers are familiar and comfortable with fieldwork protocols, and that they understand why they must follow these guidelines. It is logistically challenging for us to directly observe interviewer training in multiple countries. But we are working to develop innovative solutions. For a multinational survey in Latin America, for example, we observed training sessions via video link. For a project in Eastern Europe, we arranged for a centralized gathering of field supervisors from different countries so that our staff could observe the training in person.

Q: How does Pew Research Center oversee a survey project once it is in the field?

During fieldwork, we require local vendors to use several quality control measures for monitoring interviews. First, a set percentage of the interviews must be supervised. Second, in face-to-face surveys, a random subset of each interviewer’s interviews must be “back-checked.” This requires a colleague in the survey firm to check with a respondent – in person or by phone – and verify that the interview took place. Several survey questions are asked again to confirm answers. Finally, we do not allow one interviewer to conduct more than 5% of all interviews. This helps minimize the impact any single interviewer may have on the quality of the final dataset.
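The 5% cap described above is straightforward to verify once per-interview interviewer IDs are recorded. Here is a minimal sketch of such a check, assuming a hypothetical list of interview records (the data format is an assumption for illustration):

```python
from collections import Counter

def check_interviewer_caps(interviews, max_share=0.05):
    """Flag interviewers whose share of all completed interviews
    exceeds the cap (5% by default, per the rule described above)."""
    counts = Counter(rec["interviewer_id"] for rec in interviews)
    total = len(interviews)
    return {iid: n / total for iid, n in counts.items() if n / total > max_share}

# Hypothetical field data: 100 interviews, one interviewer completed 8 of them.
interviews = ([{"interviewer_id": "INT-07"}] * 8
              + [{"interviewer_id": f"INT-{i}"} for i in range(10, 102)])
print(check_interviewer_caps(interviews))  # {'INT-07': 0.08}
```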

One newer practice is to ask local polling firms to report, at predetermined times during the field period, on the total number of interviews completed, the regional distribution of interviews and the mix of respondents by age and gender. When possible, we ask for this information to be broken down by interviewer. Each data point provides information on how well interviewers are following the survey plan and the instructions for randomly selecting respondents.
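A progress report of this kind is essentially a set of tallies over the interview records. The following sketch shows one way such a per-interviewer breakdown could be computed; the record fields are hypothetical:

```python
from collections import defaultdict

def progress_report(interviews):
    """Tally completed interviews by region and respondent gender,
    broken down per interviewer. Field names are illustrative."""
    report = defaultdict(lambda: {"total": 0,
                                  "by_region": defaultdict(int),
                                  "by_gender": defaultdict(int)})
    for rec in interviews:
        row = report[rec["interviewer_id"]]
        row["total"] += 1
        row["by_region"][rec["region"]] += 1
        row["by_gender"][rec["gender"]] += 1
    return report

interviews = [
    {"interviewer_id": "A", "region": "North", "gender": "F"},
    {"interviewer_id": "A", "region": "North", "gender": "M"},
    {"interviewer_id": "B", "region": "South", "gender": "F"},
]
rep = progress_report(interviews)
print(rep["A"]["total"], dict(rep["A"]["by_gender"]))  # 2 {'F': 1, 'M': 1}
```

Comparing these tallies against the sampling plan's regional and demographic targets is what reveals whether interviewers are following the selection instructions.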

Q: Have you ever encountered instances where interviewers did not follow the plan? If so, what did you do?

There have been cases where fieldwork updates suggest that one or more interviewers were not following proper protocols. In such cases, we pause that person’s fieldwork and request additional information about the interviewer’s performance, such as length of interviews and the time of and between interviews. We try to recreate an interviewer’s daily work log and determine if he or she is conducting interviews appropriately, given the survey’s length and the country’s geography.
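Reconstructing a daily work log in this way largely means checking that the gaps between consecutive interviews are physically plausible. A minimal sketch, assuming hypothetical timestamped records and an illustrative minimum-gap threshold (not Pew's actual rule):

```python
from datetime import datetime, timedelta

def flag_tight_schedules(interviews, min_gap=timedelta(minutes=20)):
    """Sort one interviewer's day by start time and flag consecutive
    interviews whose gap (travel plus household selection) is
    implausibly short. Field names and threshold are illustrative."""
    ordered = sorted(interviews, key=lambda r: r["start"])
    flags = []
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur["start"] - (prev["start"] + prev["duration"])
        if gap < min_gap:
            flags.append((prev["id"], cur["id"], gap))
    return flags

day = [
    {"id": 1, "start": datetime(2016, 3, 1, 9, 0),  "duration": timedelta(minutes=40)},
    {"id": 2, "start": datetime(2016, 3, 1, 9, 45), "duration": timedelta(minutes=40)},
    {"id": 3, "start": datetime(2016, 3, 1, 11, 0), "duration": timedelta(minutes=40)},
]
# Flags the 5-minute gap between interviews 1 and 2.
print(flag_tight_schedules(day))
```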

In a few instances, the evidence suggests that an interviewer has fabricated responses or not randomly selected respondents as required. We require that interviewers suspected of misconduct be dismissed and the data they collected be discarded. In most instances, our investigations reveal well-intentioned interviewers who encountered challenges in the field. In one Latin American survey, for example, fieldwork updates alerted us that we had fewer interviews with young males than expected. This was due to female interviewers’ uneasiness about venturing into certain neighborhoods in the evening, the prime time to contact working-age men. In this case, we corrected the problem by extending interviews to daylight hours on weekends and re-emphasizing the importance of repeatedly contacting randomly selected respondents to gain their cooperation.

Q: After fieldwork is completed, what steps does Pew Research Center take to ensure the quality of survey data?

The dataset is first reviewed by our research partners, firms that have been hired specifically for their considerable experience in polling in relevant countries and for their established relationships with local firms. Our research partners alert us if they suspect, or if reviews suggest, that fieldwork procedures were not followed. In that case, we work with the local firm to address the issue. Our partners are our first line of review.

Once we receive a dataset, we then implement a range of quality control procedures. We pay special attention to the “paradata” for all completed interviews: We look specifically for suspicious patterns in interview length, location and time, as well as per-interviewer workload and success rate. We take a closer look at the responses recorded by a specific interviewer if we find anomalies that cannot be logically explained, and search for inconsistent responses, extreme values and duplicate records. This review process is iterative. Even if a first analysis of the paradata raises no concerns, we go back and delve more deeply into the paradata if we subsequently find curious patterns in the responses to survey questions.
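One simple paradata screen of the kind described above is to flag interviews that were completed much faster than is plausible for the questionnaire. The sketch below compares each duration to the sample median; the cutoff factor is an illustrative assumption, not Pew's actual rule:

```python
import statistics

def flag_short_interviews(durations_min, factor=0.5):
    """Return the indices of interviews shorter than `factor` times the
    sample median duration -- one simple paradata screen for rushed or
    fabricated interviews. The threshold is illustrative."""
    med = statistics.median(durations_min)
    return [i for i, d in enumerate(durations_min) if d < factor * med]

durations = [38, 41, 35, 44, 12, 39, 40]  # minutes; interview 4 looks rushed
print(flag_short_interviews(durations))  # [4]
```

In practice a flag like this only triggers a closer look at that interviewer's responses; it is not, by itself, evidence of misconduct.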

Q: Do you look for duplicate records, and what does their presence indicate?

One of many ways survey data quality can be impacted is by the inclusion of duplicate records – where one set of responses is included more than once. This could be unintentional, an accident of programming or data entry. Alternatively, it could be intentional fraud: a case of an employee – whether interviewer, supervisor or company executive – deliberately cutting corners to meet quotas, avoid hassles or lower costs.

We currently check for interviews in which there is a 100% match across all variables (including the demographic questions) and cases in which there is a 100% match across only the substantive variables. We do detect duplicate records in some cases, though they normally account for 1% or less of a sample. Our standard assumption is that these are an indication of an error, and thus we tend to discard them.
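The two checks described above – a full match across all variables, and a match across only the substantive variables – can be sketched as follows. The field names and the split between ID, demographic and substantive variables are assumptions for illustration:

```python
def find_duplicates(records, id_fields=("resp_id",), demo_fields=("age", "gender")):
    """Return (full_dups, substantive_dups): indices of records that
    fully match an earlier record, and those matching an earlier record
    on only the substantive (non-ID, non-demographic) answers.
    Field names are illustrative."""
    skip = set(id_fields)
    full_seen, subst_seen = set(), set()
    full_dups, subst_dups = [], []
    for i, rec in enumerate(records):
        full_key = tuple(sorted((k, v) for k, v in rec.items() if k not in skip))
        subst_key = tuple(sorted((k, v) for k, v in rec.items()
                                 if k not in skip and k not in demo_fields))
        if full_key in full_seen:
            full_dups.append(i)
        elif subst_key in subst_seen:
            subst_dups.append(i)
        full_seen.add(full_key)
        subst_seen.add(subst_key)
    return full_dups, subst_dups

records = [
    {"resp_id": 1, "age": 30, "gender": "F", "q1": "yes", "q2": "no"},
    {"resp_id": 2, "age": 30, "gender": "F", "q1": "yes", "q2": "no"},  # full duplicate
    {"resp_id": 3, "age": 52, "gender": "M", "q1": "yes", "q2": "no"},  # substantive duplicate
]
print(find_duplicates(records))  # ([1], [2])
```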

It’s difficult to determine if a duplicate record is due to intentional fraud, even after extensive investigation. The biggest problem with identifying falsified data of any kind is that there is usually no smoking gun in the dataset. Falsified data could take any form, not just a straight (or near) duplication of answers. Interviewers looking to cut corners could invent respondents altogether, or interview friends rather than the selected respondents. Because of this, when we suspect a problem with a dataset, we analyze all available data, including paradata, substantive data, data on interviewers and contact data, to evaluate what might have gone wrong. We then work with our research partners and local firms to investigate what happened and how best to address the issue.

Q: Can Pew Research Center ever be certain that its international surveys are free from false data?

Data falsification is a longstanding concern in the survey research community. It’s definitely a concern for us as we conduct surveys around the globe, often through face-to-face interviews that we cannot directly monitor. Our experience suggests that we can never be fully certain that our datasets are 100% free of false or subpar data. But we are confident that we have adopted sound measures to ensure that the data we release is of the highest quality.

Q: What is next in terms of improving data quality in international surveys?

We are constantly looking for ways to improve our quality control. We are excited about the potential of replacing paper questionnaires with computer tablets in face-to-face surveys to enable detailed tracking in our upcoming rounds of international polling.

One especially promising innovation is the measurement of time throughout the survey in face-to-face studies with tablets or other handheld devices. Such measurement can capture the overall length of a survey, from start to finish, as well as the time it takes to complete sections of a questionnaire or to answer a specific question. The timing of subsections of the questionnaire can be used to evaluate whether the respondent or interviewer may have had unusual difficulties with a particular section, or whether the interviewer may not have taken the appropriate amount of time to ask certain questions.
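Given per-section timestamps from a tablet's event log, computing the time spent in each section is a simple difference of consecutive marks. A minimal sketch, assuming a hypothetical log format of ordered (section, start-time-in-seconds) pairs with a final end mark:

```python
def section_times(timestamps):
    """Given ordered (section_name, start_seconds) marks, ending with a
    final ('end', t) mark, return seconds spent in each section.
    The log format is a hypothetical illustration."""
    return {name: timestamps[i + 1][1] - t
            for i, (name, t) in enumerate(timestamps[:-1])}

log = [("intro", 0), ("politics", 90), ("media", 480),
       ("demographics", 700), ("end", 820)]
print(section_times(log))
# {'intro': 90, 'politics': 390, 'media': 220, 'demographics': 120}
```

Unusually short section times (relative to other interviews of the same questionnaire) are what would prompt a closer review of that interview.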

Another interesting avenue with handheld computer devices is the use of voice recordings at random points in the interview. This allows the researcher to confirm that questions were read to the respondent properly and that the same respondent answered the questions throughout the survey.

As with any measure of data quality, these new approaches need to be evaluated along with a variety of other indicators to understand how they contribute to analyzing the quality of our survey data. The findings from this research should inform the design and structure of future surveys, lead to new approaches to training and motivating interviewers, and assist with the development of new, and hopefully better, quality control methods.