This study was designed to examine changes in national public opinion polling in the United States from 2000 to 2022. The study focuses on two key features: the sample source(s) (where the respondents came from) and the mode(s) (how they were interviewed).
The study’s unit of analysis is the organization sponsoring the polling. In total, 78 such organizations are included. Pew Research Center staff coded the sample sources and interview modes used in national public polls for each organization during each even-numbered year from 2000 to 2022. Odd-numbered years were excluded on purely practical grounds. Hundreds if not thousands of national public polls are released each year, and processing them for this report required substantial labor. Focusing on even-numbered years cut the manual labor roughly in half. As shown in the report, the even-numbered-year approach successfully tracked major changes in the industry.
The initial coding of sample source and mode of interview was based on information available from a variety of sources, including pollster websites, news articles, and the Roper iPoll opinion poll data archive. After these data were compiled, Center staff attempted to contact each organization to confirm that the information gathered for it was accurate. Most organizations that responded confirmed that it was. When organizations provided corrections or additions, the study database was updated with that information. Each of these steps is described in greater detail below.
The study aimed to examine change over time in national public polls. To be included, an organization needed to sponsor at least one national public poll in two or more of the years studied (i.e., the even-numbered years from 2000 to 2022). Organizations that sponsored a national public poll in only one of these years are not included. This criterion helped to reduce the influence of organizations that were not consistently involved in polling.
The national polls examined in this study are those based on the general public (e.g., U.S. adults ages 18 and older), registered voters or likely voters. Polls that exclusively surveyed a special population (e.g., teachers) were not included, as they often require unique designs. Additionally, polls described as experimental in their methods or research goals are not included.
Who is a ‘pollster’ in this study
One important choice in designing this type of project is deciding whether to study the organization sponsoring a poll or the organization collecting the data (known as the “vendor”). For example, many Pew Research Center polls are fielded by Ipsos. Pew Research Center is the sponsor, and Ipsos is the vendor. This study focuses on sponsors and refers to them as the “pollster.” There are several reasons why sponsors are the focus rather than vendors, including:
- The sponsor is typically the organization commissioning the poll and attaching their institutional name to its public reporting. Sponsors decide whether to make poll results public or keep them private. Generally speaking, sponsors are the entity most responsible for any public reporting from the poll.
- Sponsors dictate, in broad terms, the budget available for a poll, and the vendor works within that constraint to find a design that fits. While the vendor will often decide the exact final price tag, they usually are reacting to information from the sponsor as to whether the budget available is, say, $10,000 or closer to $100,000. In other words, whether a poll uses a very expensive or a very inexpensive method is generally dictated by the sponsor.
One complication is that sometimes the sponsor and vendor are one and the same. Continuing with the Ipsos example, in addition to the polls it conducts on behalf of clients, Ipsos also conducts national polling and releases the results itself. Accordingly, Ipsos is among the 78 pollsters in the analysis based on polling that it (co-)sponsors. Other vendors only conduct work on behalf of clients and never put themselves in the role of sponsor. This explains why a few major companies that conduct polling do not appear in the study dataset. Their work is represented in the rows associated with their clients.
Admittedly, focusing on the sponsor rather than the vendor feels more appropriate in some cases than others. For example, some sponsors are very engaged in decisions about how their polls are conducted, down to the smallest detail. Referring to such sponsors as the “pollster” feels accurate. Other sponsors are much more hands-off, letting the vendor make most or all decisions about how the poll is conducted. In these cases, it is tempting to think of the vendor as the pollster.
Ultimately, to execute a study like this, researchers must rely on information in the public domain. Nuanced records of who made which decisions are simply not available, nor is it possible to gather such information about polls fielded as early as 2000. Of the options available, focusing on the sponsor was the best fit for this study, but we acknowledge that in some cases the sponsors were likely not deeply engaged with decisions about sampling frames and interview mode.
Another complication is that some polls have multiple sponsors. For example, ABC News and The Washington Post have a long-standing partnership for live phone polling. In addition, they sponsor polls either solo or with other partners. In this study, each pollster is credited with using the methods employed for national public polls that they either sponsored or co-sponsored. For example, this study’s records for both ABC News and The Washington Post reflect the use of live phone with random-digit-dial sample for the years in which they jointly sponsored such polls.
A few well-known, recurring national surveys were considered but ultimately excluded from this analysis because they are too different from a typical public opinion poll. The General Social Survey and the American National Election Studies both measure public opinion, but their budgets and timelines are an order of magnitude different from those of a typical public opinion poll. On that basis, we decided it was unhelpful to include them. Similarly, the National Election Pool (NEP) poll and VoteCast are not included because they are designed specifically to cover election results not just nationally but at the state level. Their methodological and budget considerations are quite different from those of ordinary opinion polls. These studies are very valuable to opinion research in the U.S., but they are not comparable to the polling studied in this project.
Which polling organizations were included
The analysis is based on 78 organizations. Each organization sponsored and publicly released national poll results in at least two of the years studied (i.e., the even-numbered years from 2000 to 2022). There is no authoritative list of such organizations, and experts might disagree on whether certain edge cases should be included. Several gray areas required resolution.
Inclusion based on content
In the broadest sense, public opinion exists for many topics – from politics and the economy to pop culture and brand preferences. That said, the national dialogue around “polls” and “polling” is generally understood to be focused more narrowly on public affairs, politics and elections. For instance, a market research survey on public preferences in ice cream flavors is probably not what comes to mind when someone is asked about public opinion polling. At the other extreme, surveys measuring support for candidates running for public office (known as “the horserace”) perhaps represent the prototypical conception of a poll. Many polls fall in between – measuring attitudes about important national issues but not measuring the horserace.
Each of the 78 organizations included in this study has a track record of measuring public attitudes about public affairs, politics and elections. Not all organizations included here specifically measure the horserace, but all of them have asked the public about factors influencing how they might vote in an election.1
Academic research versus public opinion polling
The decision to consider a sponsoring organization as the pollster raised two practical questions when considering academic organizations. Colleges and universities may have multiple individuals or entities independently conducting polls within them, and so one question is whether these separate polling efforts should be considered as different pollsters or not. The second question is whether (or which) national polls conducted by faculty primarily for academic purposes should be included.
Although some academic institutions have more than one branded entity conducting public polls on politics and policy, publicly released surveys typically carried the institutional name. Moreover, there was no practical way to ascertain the degree of independence of the entities within a university. As a result, the decision was made to code all polling that met the study’s content criteria for inclusion as polling by that university.
However, this study does not contain all the surveys conducted by every college and university. Indeed, it would be nearly impossible to locate all such polling, since most of it is made public only through academic papers or conference presentations. We acknowledge that as a limitation of the study. This study does not purport to represent every national survey whose results can be found somewhere in the public domain. The ones that are included tend to have a news media partnership increasing their visibility; maintain a public website updated regularly with the latest polling results; and/or are archived in the Roper iPoll public opinion poll repository. It is worth underscoring that the goal of this study is to describe the nature and degree of changes in national public polling from 2000 to 2022. Research designed primarily for an academic audience (e.g., peer-reviewed journals) is not the type of polling people have in mind when they question whether polling still works after the 2016 and 2020 elections.
Among the news media organizations included and contacted in this study, only one raised the concern that multiple units within the organization might be conducting polls. Most of their polling was coordinated through their political unit, but a few polls over the years were not. Mapping out the decision structures within each organization was outside the scope of this project. In the interest of applying the same standard to each organization, this study includes the methods for any public polls we found during the years studied. In some cases, this yields a track record that does not reflect centralized decision-making. In all cases, though, this approach reflects what the public sees – either a poll was or was not sponsored by “Organization X.”
Creating a list of national public pollsters
There is no authoritative list of organizations sponsoring and releasing results from national public opinion polls, so researchers constructed one using a variety of sources. First, researchers compiled the names of organizations releasing national estimates for U.S. presidential approval or horserace estimates for U.S. presidential elections going back to 2000. For this task, researchers used polling databases and summaries from The Polling Report, FiveThirtyEight.com, RealClearPolitics.com and Wikipedia. Researchers then expanded the list with the names of prominent national polling organizations (e.g., Kaiser Family Foundation, the American Enterprise Institute’s Survey Center on American Life, PRRI) that do not necessarily appear in those sources. Finally, additional sponsors were identified through polling partnerships. For example, if researchers saw that Pollster A co-sponsored a few polls with Pollster B, then the researchers investigated all the polling associated with Pollster B. If Pollster B qualified for inclusion, they were then added to the study.
Determining which methods each pollster used in national public polling for each year
For each pollster in the study, researchers set out to document which sampling frames and interview modes the pollster used for national public polls in each even-numbered year from 2000 to 2022. Unfortunately, this kind of information cannot be found in any one location. Indeed, one of the main motivations for this study was that existing databases are insufficient for understanding key distinctions in modern polling. Existing resources might indicate whether a poll was done by “phone” or “online,” but there is often no information about the sample source. Consequently, Center researchers scoured the internet for more detailed documentation. Researchers executed this work in several steps.
1. Internet search for pollster and year. Researchers conducted a Google search for “[POLLSTER NAME] poll [YEAR]” starting with 2022 and working backwards, covering even-numbered years only. For each year, they limited the search time frame to 01/01/[YEAR] to 12/31/[YEAR]. They then investigated each of the hits on the first page of results. The poll in question could be disregarded if it was not sponsored by the pollster of interest and/or if the poll was fielded in the previous year (an odd-numbered year). Next, researchers conducted a Google search for “[POLLSTER NAME] survey [YEAR]” in the same manner. This was done because some organizations use the term “survey” instead of “poll.” These searches often yielded poll reports, press releases, methodology statements and other useful documentation.
2. Search the pollster’s website for documentation of polls and methodology. Some pollsters had a public webpage where poll results or documentation were posted. For some pollster websites this was productive, but for others it was not. In some cases, a poll was listed with a broken link. In those cases, researchers added the additional information from the pollster website to the Google search methods listed above.
3. Search the Roper iPoll Archive. Researchers entered the pollster and year in the search fields and looked for documentation of polls and methodology.
4. Search the FiveThirtyEight.com pollster database. This resource was helpful for finding instances of polls and methodologies that had not been identified through the prior steps.
5. Additional internet searches for missing information. In some cases, additional, more specific internet searches were needed to look for missing information. Researchers conducted Google searches for “[POLLSTER NAME] poll [YEAR] [MODE]” to confirm the year that use of a particular method began or ended. For example, “CBS News poll 2010 online” was used to check that the pollster did not do online polling in 2010.
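The search grid described in steps 1 and 5 can be sketched as a small script that enumerates the query strings and date restrictions for each pollster-year. This is an illustrative reconstruction, not the study's actual tooling; the pollster names are placeholders.

```python
# Sketch of the systematic search grid from steps 1 and 5.
# Pollster names and search terms below are illustrative placeholders.

def build_queries(pollsters, start=2000, end=2022):
    """Yield (query, date_from, date_to) tuples, newest year first."""
    terms = ["poll", "survey"]  # step 1 runs both variants
    for year in range(end, start - 1, -2):  # even-numbered years, backwards
        for name in pollsters:
            for term in terms:
                query = f"{name} {term} {year}"
                yield query, f"01/01/{year}", f"12/31/{year}"

queries = list(build_queries(["Pollster A", "Pollster B"]))
```

With two pollsters, two search terms and the 12 even-numbered years from 2000 to 2022, this grid produces 48 searches, which conveys why restricting the study to even-numbered years roughly halved the manual labor.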
Two researchers performed the steps above independently for each pollster. The team also created a codebook assigning a number to each combination of methods observed from a pollster in a given year. For example, code 1 denotes that the pollster used only live phone with random-digit-dial sample for at least one national public poll that year, while code 25 denotes that the pollster did at least one poll using online opt-in and at least one poll using a probability-based panel. The team conducted a reliability analysis on the two independently gathered sets of data. The Cohen’s kappa was 0.7, which is typically considered an acceptable level of agreement in social science content coding. A senior researcher then resolved any conflicts and produced a single dataset reflecting the best information available from the searching phase.
To record the data, researchers created a spreadsheet in which each column was a year (2000, 2002, … 2022) and each row was an instance of a pollster using a particular method during that year. Researchers archived the URL documenting each instance of a pollster using a given method in a particular year. If the pollster did multiple polls using the same method, only one instance was archived per year.
Pollster outreach to verify the information gathered
As a final quality check, a senior researcher emailed each pollster to verify the information. These emails explained the study goals and presented the methods information observed from 2000 to 2022 specifically for the pollster. The email asked for any additions or edits. Most organizations (47 of the 78) responded. Pollsters were very generous with their time. Among those responding, most (81%) confirmed that the information recorded looked accurate. Others offered corrections, which were then applied to the study dataset. In some cases, a pollster correction could be corroborated by a URL. In a small number of instances, staff could not find a corroborating URL, but the information provided by the pollster was taken as authoritative. Such instances are recorded in the documentation dataset as “pollster email” instead of a URL.
Assumptions made when information was incomplete
Sometimes the best available documentation of a poll did not clearly describe the sample source and/or interview mode. However, it was often the case that circumstantial evidence supported a reasonable educated guess. The team applied several guidelines in such situations. Each of these guidelines proved well-founded based on the input received during pollster outreach.
- If mode was not specified, but the questionnaire contained interviewer instructions, such as to read or not read certain response options, the poll was coded as live phone.
- If the mode was specified as live phone but the sample source was not, and the poll was described as a sample “of Americans,” the poll was coded as live phone with random-digit-dial sample.
- If the poll was described as having been conducted “online” but there was no other information, the poll was coded as online opt-in. (Our experience was that any time a poll used a probability-based panel, the pollster always disclosed the name of the panel.)
- If the poll documentation reported a “credibility interval” or “modeled margin of error” but did not disclose the sample source, the poll was coded as online opt-in.
- If the methodology was not disclosed but the pollster had a clear track record of consistently using a certain methodology, the poll was coded consistent with the pollster’s known methods.
- Documentation of live phone polls before roughly 2012 often did not disclose the sample source, such as whether it was random-digit dial or registered voter records. The information that is available suggests that most national public polls conducted from 2000 to 2012 were probably using random-digit-dial sample. If a poll was live phone but the sample source was not specified, the poll was coded as live phone with random-digit-dial sample (unless the pollster had a record of using registration-based sampling).
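Taken together, these guidelines amount to a fallback cascade applied to whatever documentation was found. A minimal sketch, using hypothetical field names for a poll documentation record (the study itself did not use code like this):

```python
def impute_method(poll, track_record=None):
    """Apply the fallback guidelines to a poll documentation record.

    `poll` is a dict with hypothetical field names; `track_record` is the
    pollster's known sample source, if any. Returns (mode, sample_source).
    """
    mode = poll.get("mode")
    source = poll.get("sample_source")
    # Interviewer instructions in the questionnaire imply live phone.
    if mode is None and poll.get("has_interviewer_instructions"):
        mode = "live phone"
    # Bare "online" with no named panel implies opt-in sample.
    if mode == "online" and source is None:
        source = "opt-in"
    # A credibility interval or modeled margin of error implies opt-in.
    if source is None and poll.get("credibility_interval"):
        mode, source = "online", "opt-in"
    # Live phone with unspecified source implies random-digit dial,
    # unless the pollster has a record of registration-based sampling.
    if mode == "live phone" and source is None:
        source = track_record if track_record else "random-digit dial"
    return mode, source
```

The ordering matters: mode is inferred before sample source, so a questionnaire with interviewer instructions and no other documentation falls through to the live phone rules rather than the online ones.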
Strengths and weaknesses of the research design
Like any study, this one has its strengths and limitations. The strengths include:
- Offering more insight than common industry characterizations. Existing databases that code polling methods tend to use terms like “online” or “telephone” that gloss over major distinctions, such as whether an online sample was recruited using convenience approaches or random sampling from a high-coverage frame. This study distinguishes between online surveys fielded with opt-in sample and those fielded on probability-based panels. This study also distinguishes between live phone surveys using registered voter records or random-digit-dial procedures.
- Documenting growth in the number of methods individual pollsters use. This study captures not just changes in methods but the increasing use of multiple methods within and across polls by the same pollster.
- Offering a timeline long enough to see how trends unfolded. While other reports have provided a snapshot of the methods used in a particular year or election, they generally do not shed light on the trajectory of those methods. This study distinguishes changes emerging in recent years from those with longer arcs.
- Studying new approaches. This study coded methods information at a level granular enough to see the emergence of new techniques such as using text as a supplemental mode or matching an online opt-in sample to registered voter records.
The study limitations include:
- Only national polling was considered. Unfortunately, attempting to find, document and code polling from all 50 states and the District of Columbia would have exceeded the time and staff resources available.
- Not all methods details were considered. This study focuses on two key poll features: where the respondents came from (the sample source or sources) and how they were interviewed (the mode or modes). While important, they are not exhaustive of the important decisions in designing a poll. The study did not attempt to track other details, such as weighting. Because the study only measured two out of all possible poll features, estimates from this study likely represent a lower bound of the total amount of change in the polling industry.
- Odd-numbered years were excluded. The study reflects data for even-numbered years only, which was a purely practical decision. Hundreds if not thousands of national public polls are released each year, and processing them for this report required substantial labor. Focusing on even-numbered years cut the manual labor roughly in half.
- Given the size and fluidity of the industry, the dataset may be missing some pollsters. While the research team went to great lengths to include all the pollsters that were eligible under the study criteria, it is very possible that some were missed. The online public record is only so complete, and the further back in time one goes, the more broken links one encounters. Moreover, technology has removed the barriers to public polling that once existed, so the number of potential pollsters is almost without limit. This study includes many of the most prominent polling organizations, but probably misses some of the less prominent pollsters with smaller digital footprints.
- The study does not measure the volume of polls conducted with each sample source and mode of interview. The low cost of online opt-in polling has led to a dramatic increase in the number of such polls during the period of study. In practical terms, this means that the growth in the number of organizations conducting online opt-in polls could understate the growth in the share of all polls conducted with this methodology.