February 24, 2016

Data Quality Deserves an Honest Discussion

This week, a paper by social science researchers Noble Kuriakose and Michael Robbins was presented at a conference in Bethesda, Maryland. In it, they assert that as much as one-fifth of polling conducted outside the United States in recent years may have been subject to data falsification. They base their conclusion on the premise that such falsification is overwhelmingly likely to have occurred if there is a high level of “matched responses,” meaning that two people give the same response to more than 85% of the questions on the survey.

The following is Pew Research Center President Michael Dimock’s response to the research paper.

By Michael Dimock

Fraud is always a possibility in surveys, a possibility we’ve long known about, one we worry about, and one we work hard to avoid and detect. It is a serious issue for any polling organization.

Because Pew Research Center’s highest priority is to produce accurate data, we took seriously Kuriakose and Robbins’ suggestion that data falsification is more widespread than previously thought, particularly in international surveys conducted face to face.

We assigned a team of international-survey and methods experts to look into both their newly proposed fraud detection method and our own international survey data. Our assessment is that their statistical tool is not, on its own, an accurate indicator of data fraud, and so their claim that one-in-five international surveys contain falsified data does not hold up.

Their method flags as potentially fraudulent any interviews in which two people have answered a high percentage of the questions in exactly the same way. The problem with this approach? There are a number of perfectly legitimate reasons why two people’s answers to a survey can wind up looking extremely similar.
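The flagging rule described above can be illustrated with a short sketch. This is not the authors' actual implementation, only a minimal illustration of the idea: compare every pair of respondents and flag any pair whose answers agree on more than 85% of questions.

```python
from itertools import combinations

def match_rate(a, b):
    """Fraction of questions on which two respondents gave identical answers."""
    same = sum(x == y for x, y in zip(a, b))
    return same / len(a)

def flag_high_matches(responses, threshold=0.85):
    """Return index pairs of respondents whose answers match above the threshold."""
    return [
        (i, j)
        for i, j in combinations(range(len(responses)), 2)
        if match_rate(responses[i], responses[j]) > threshold
    ]

# Three hypothetical respondents, ten yes/no questions each.
respondents = [
    "YYNYYNYYNY",
    "YYNYYNYYNN",  # differs from the first on only one question: a 90% match
    "NNYNNYNNYN",
]
print(flag_high_matches(respondents))  # → [(0, 1)]
```

The first two respondents are flagged because they agree on nine of ten questions, even though, as the examples below show, such agreement can arise for entirely legitimate reasons.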

One reason is purely statistical: When you ask a large number of people a small number of questions, and don’t give them many answer choices (e.g., a simple “yes” or “no”), it is quite common to find sets of responses that look much the same. Another reason has more to do with the nature of public opinion itself: When it comes to certain topics, certain groups of people tend to think alike. As social scientists would say, some populations are more homogeneous in their views.
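The statistical point can be checked with a quick simulation. The numbers below (12 questions, an 80% "yes" rate standing in for a homogeneous population) are illustrative assumptions, not figures from any actual survey; the point is simply that independent respondents with shared views can exceed the 85% threshold by chance alone.

```python
import random

random.seed(0)

def share_of_high_match_pairs(n_respondents=200, n_questions=12, p_yes=0.8):
    """Draw independent yes/no answers from a homogeneous population and
    report the share of respondent pairs agreeing on >85% of questions."""
    answers = [
        [random.random() < p_yes for _ in range(n_questions)]
        for _ in range(n_respondents)
    ]
    pairs = high = 0
    for i in range(n_respondents):
        for j in range(i + 1, n_respondents):
            pairs += 1
            same = sum(a == b for a, b in zip(answers[i], answers[j]))
            if same / n_questions > 0.85:
                high += 1
    return high / pairs

# A nontrivial share of pairs clears the 85% threshold with no fraud at all.
print(f"{share_of_high_match_pairs():.1%} of pairs exceed the 85% threshold")
```

Shorter questionnaires, fewer answer options, or more like-minded populations all push this chance rate higher, which is exactly why a high match rate cannot by itself establish falsification.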

As an example, consider what we found when we looked at the views of Mormons in Pew Research Center’s 2014 U.S. Religious Landscape Survey. This was a very large poll involving upwards of 35,000 telephone interviews with questions focused primarily on religious beliefs and practices. If you suspect that American Mormons could look pretty similar in their answers to survey questions about religion, you would be right. When we applied the suggested tool, we found that about four-in-ten Mormons gave the same answers to more than 85% of the questions.

Does that high match rate among Mormons suggest massive amounts of interviewer falsification? Absolutely not. Each survey fieldhouse (or telephone bank) used in the project has live monitoring of interviewers and detailed records of how and when the interviews were conducted. With these controls, high-quality U.S. phone polling in general is not prone to fraud. Instead, this suggests the more obvious hypothesis: When it comes to religion, at least, Mormons hold remarkably consistent views.

The examples could go on: Pick a population with something in common and a topic on which they tend to share views, and you will likely find a high share of near-identical responses to surveys. We found this was true in two U.S. surveys fielded before the last congressional election: Ask Republicans about topics related to the election and a high percentage of their answers match. The same goes for Democrats.

These surveys aren’t fraudulent; they are doing what they are tasked to do: measuring the diversity and uniformity of public opinion.

This logic applies to the international surveys Kuriakose and Robbins deem “fabricated” as well. If you conduct a survey on a topic such as foreign affairs, religion, or economic concerns and opportunities in many countries, you will often find great areas of consensus across multiple items. These similarities will often cluster around specific population subgroups, or even in specific geographic areas. This is not evidence of fraud; it is evidence that people often hold similar views.

Kuriakose and Robbins developed their data-falsification test under specific conditions. But using this test to evaluate real-world survey data, with all its complexity, produces far too many false positives to be a reliable indicator of fraud. In fact, if wrongly applied to “fix” data, it could introduce errors by removing areas of agreement that actually exist among the surveyed population.

Pew Research Center conducts international survey work to explain the views, beliefs and behaviors of people worldwide. As a nonprofit organization that informs decision-making with facts, it is central to our mission to continually improve the quality of that data. To that end, we would welcome a future iteration of this test that takes into account the real-world survey factors we have found to affect its performance. Such a test would avoid the risk of mistaking real similarity of opinion for data fraud. But in the meantime, the data-assessment tool Kuriakose and Robbins developed should not be implemented as a threshold test of survey fabrication.

For those interested in a thorough statistical and conceptual breakdown of the flaws in Kuriakose and Robbins’ model, please see our detailed analysis here.