Key facts about the quality of the 2020 census

The coronavirus pandemic broke out in the United States just as the 2020 census got underway, causing an unprecedented disruption of operations and raising questions about the extent of undercounts or overcounts – with implications for political representation and the allocation of federal and state resources. Politics, wildfires and extreme weather also may have affected the once-in-a-decade count, adding to the doubts among officials, experts and the general public.

The Census Bureau’s own research on data quality has concluded that the national total in the 2020 census was largely accurate, but has estimated miscounts for some states and demographic groups. The latest research, released in May 2022, found that the 2020 census overcounted household populations in eight states while undercounting household populations in six others. The states with overcounts were Delaware, Hawaii, Massachusetts, Minnesota, New York, Ohio, Rhode Island and Utah. Those with undercounts were Arkansas, Florida, Illinois, Mississippi, Tennessee and Texas. By contrast, in the 2010 census, the Census Bureau estimated that no states had overcounts or undercounts.

A map showing that six states had undercounts in the 2020 census, while eight had overcounts

The Census Bureau publishes two main indicators of data quality for decennial censuses. Its Demographic Analysis (DA) uses birth and death records and other federal government data to develop a range of three estimates (low, middle and high) of the total population size. Its Post-Enumeration Survey (PES) is a sample survey of households whose responses are matched with responses from the 2020 census. The PES sample does not include group quarters such as prisons or college dorms, while the DA does include them in its estimate of the total population. For DA, the Census Bureau makes estimates of the population from public records and federal data, which the agency then compares with the census count for the total population and for demographic groups, including age, sex, racial and Hispanic origin groups. The result is a measure of net undercount or overcount, or how close the decennial census count is to the estimated population based on other data sources. The PES not only provides similar measures of net undercount or overcount but also provides the components of that miscount. The components include whether people were missed (thus, undercounted), as well as whether they were counted more than once or included in the count when they should not have been (thus, overcounted).

Below are key facts about data quality in the 2020 U.S. census count.

How we did this

This post analyzes results from the two basic means the Census Bureau has used to estimate census coverage for the last seven censuses – the bureau’s Demographic Analysis and its Post-Enumeration Survey. Demographic Analysis (DA) constructs a national estimate of the U.S. population by age, sex, race and Hispanic origin using historical data on births and deaths, federal data on international migration and Medicare records. DA uses the basic demographic accounting equation that the total population is equal to births minus deaths, plus immigration, minus emigration. Birth and death records for 1945-2020 are used to estimate the U.S.-born population under age 75 for Census Day, April 1, 2020. International migration estimates employed a number of data sources, but mainly the American Community Survey for the foreign-born population younger than 75. Finally, administrative data from the Medicare program, adjusted for under-enrollment, was used to estimate the population ages 75 and older on Census Day since these cohorts were born before 1945 when vital records were less complete.
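
The accounting equation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name and every input figure are hypothetical placeholders, not Census Bureau values or the bureau's actual procedure.

```python
def accounting_population(births, deaths, immigration, emigration):
    """Basic demographic accounting: births - deaths + immigration - emigration."""
    return births - deaths + immigration - emigration


# Hypothetical inputs for a single birth cohort (NOT Census Bureau data).
cohort_estimate = accounting_population(
    births=4_000_000,
    deaths=150_000,
    immigration=600_000,
    emigration=100_000,
)

print(f"{cohort_estimate:,}")  # prints 4,350,000
```

In practice, a calculation like this would be repeated for each age, sex and race or Hispanic origin group and then compared with the census count for the same group.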

Although the main data for DA is administrative records, not sample surveys, there are a number of sources of potential uncertainty, including, for example, registration completeness, classification errors and differences in reporting between the 2020 census and the external data sources. To account for this uncertainty, the Census Bureau produced three alternative estimates – low, middle and high – reflecting alternative assumptions about births, international migration and Medicare enrollment.

The Post-Enumeration Survey (PES) in 2020 is a sample survey of about 10,000 census blocks that ultimately included almost 400,000 people. About 160,000 households were interviewed to determine where they lived on Census Day and their basic demographic characteristics. The people in these households were then matched to census records to determine who was counted correctly, missed or counted in error. The PES sample includes only households; it excludes people living in group quarters (such as college dorms, prisons and nursing homes) and the small number of people living in Remote Alaska areas. Consequently, the PES coverage estimates apply to a household population count of 323,200,000 and not the full census count of 331,400,000. As a sample survey, the PES is subject to sampling error, so the Census Bureau reports the PES results with a margin of error.

The estimates of the amount of net undercount shown in the final chart were developed by the Pew Research Center using two main data sources: the Census Bureau’s PES estimates of the percentage undercount for racial and Hispanic groups in 2010 and 2020, and the P.L. 94-171 census counts for these groups, to which those percentages were applied.
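
As a rough sketch of that last step, the snippet below multiplies a census count by a percentage net undercount. The function name is hypothetical, simple multiplication is an assumption about the calculation as described here, and the inputs are rounded figures quoted in this post (a Hispanic count of about 62 million and a miss rate of about one in twenty), not the exact PES rates or P.L. 94-171 counts.

```python
def estimated_net_miscount(census_count, net_undercount_pct):
    """People implied by a percentage net undercount.

    A positive percentage means an undercount; a negative one means an overcount.
    """
    return census_count * net_undercount_pct / 100


# Rounded, illustrative inputs taken from this post (not exact published values).
hispanic_missed = estimated_net_miscount(62_000_000, 5.0)

print(f"{hispanic_missed:,.0f}")  # prints 3,100,000
```

The result is in line with the "more than 3 million" figure cited for Hispanics, though the published estimate rests on the exact rate and count rather than these rounded ones.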

A table showing that census errors were generally larger in 2020 than in 2010

On the surface, the total count in the 2020 census was a success, but data for subgroups and states is flawed by undercounts, overcounts and incorrect counts. The overall population count matched well with the middle DA estimate, showing an undercount of 0.35%, or about 1.1 million people, out of a total count of 331.4 million. The PES estimated that the household population count (323.2 million) was 0.24%, or about 780,000 people, short.

However, this relatively small net undercount happened because the total included undercounts and overcounts of different demographic groups that canceled each other out. The PES estimates that 18.8 million people were left out of the 2020 census. They were counterbalanced by 10.85 million people who didn’t file a complete response to the census and were added to the count by a statistical technique called imputation, and by 7.17 million people who were counted more than once or were incorrectly included because they had died before the census date, were not yet born, were a temporary foreign visitor or were otherwise not a U.S. resident on Census Day, April 1, 2020.
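
The way these components offset one another can be checked with a short calculation. The sketch below uses only the figures quoted in this post, in millions; the variable names are illustrative, and dividing by the census household count is a simplifying assumption, since the Census Bureau's published rate may be computed against its own population estimate instead.

```python
# All figures in millions, as quoted in this post.
omissions = 18.80                # people left out of the census
imputations = 10.85              # people added to the count by imputation
erroneous_enumerations = 7.17    # duplicates and people wrongly included

# People missed, minus people added or counted in error.
net_undercount = omissions - imputations - erroneous_enumerations

household_count = 323.2          # census household population count
net_undercount_pct = net_undercount / household_count * 100

print(round(net_undercount, 2))      # prints 0.78
print(round(net_undercount_pct, 2))  # prints 0.24
```

The remainder of about 0.78 million people, or roughly 0.24% of the household count, matches the net undercount of about 780,000 reported by the PES.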

There was a record undercount of Hispanics. The census count of more than 62 million Hispanics still missed one-in-twenty of them. That is more than 3 million Hispanics, or about four times the number missed in 2010. The historic pattern of high undercount rates also continued in 2020 for the Black population, American Indians and Alaska Natives on reservations, and people who identified as “Some other race,” the vast majority of whom are Hispanic. These groups are less likely to fill out census forms or to respond to census workers who come to the door, for reasons that could include a lack of trust in the government and feelings of disconnect from it.

The Asian population was overcounted by about 600,000, out of more than 24 million people, after having neither an overcount nor an undercount in 2010. One clue as to why this occurred was turned up by researchers at the City University of New York: Plurality-Asian neighborhoods had high rates of census self-response, meaning that people filled in their forms on their own – considered the best indicator of a thorough, high-quality census. Some experts want the Census Bureau to release more data on Asian national-origin subgroups because they believe there is wide variation in count quality within the Asian population.

The non-Hispanic White population also was overcounted in 2020, as it was in 2010. In 2020, the overcount of White Americans was about 3.1 million, out of a total population of about 192 million.

A bar chart showing that census overcounts and undercounts for racial and Hispanic groups generally grew in 2020

Americans younger than 50, especially young children, were undercounted. Americans ages 50 and older – a group that includes a disproportionate number of White adults – were overcounted. Overall, groups that are undercounted are on average less well-off, more geographically mobile and less likely to be familiar with the census than groups that are overcounted. Overcounts can happen when people have more than one place where they could be counted – for example, college students living away from their parents’ home or people with second homes.

In most age categories, men were more likely to be undercounted than women. As in past censuses, homeowners were overcounted and renters undercounted.

There still is a lot we do not know about the quality of the 2020 census. Later this year, the Census Bureau plans to release experimental DA estimates with more detail about racial and ethnic groups, as well as state and county estimates for young children. Unlike in 2010, the bureau is not planning to release PES breakdowns for demographic groups below the national level, or overall population coverage estimates below the state level. The Census Bureau’s own National Advisory Committee, as well as outside expert panels of both the American Statistical Association and the Committee on National Statistics, have asked the bureau to release more local-level metrics. They say a complete evaluation is not possible without these metrics.

More than two dozen localities have filed administrative challenges to the census numbers as of early June 2022, and more are expected to file objections or even lawsuits. Austin, Texas, and Detroit, Michigan, are, so far, the most prominent local governments to challenge their census counts, and more could join before the June 2023 deadline. Following the 2010 census, 237 localities in the 50 states, plus the District of Columbia, challenged their counts. Because of concerns about the quality of 2020 data for college dorms and other group quarters, the Census Bureau added an option for governments to challenge those counts, too. However, any data corrections to the final counts are likely to be limited due to Census Bureau rules that do not permit broader challenges. The agency has already indicated, for example, that it will not change what it has already delivered for congressional reapportionment, though it does incorporate count corrections into post-census population estimates. Some researchers also have included warnings about data quality in publications using 2020 census data.

The 2020 census was worse on many quality measures than the previous two censuses. The estimated undercount for the Hispanic population in 2020 was larger than in 2010 or 2000. Those for the Black population and for American Indians and Alaska Natives on reservations were larger than in 2000. The overcounts for the White non-Hispanic and Asian populations in 2020 increased since 2010.

While the quality of the total count in 2020 matched that of 2010 and 2000, more detailed measures of census error were higher in 2020. The number of people left out of the census increased from 16 million in 2010 to 18.8 million in 2020. This increase was principally due to a large rise in whole-person imputations (10.85 million in 2020 vs. 5.99 million in 2010), which represent people added to the count using statistical techniques when the information collected was not sufficient to identify the individual.

What will be done about these flaws? The Census Bureau does not plan to alter the official numbers that are used to apportion seats in Congress or redraw political boundaries within states. One major use of census results is as the starting point for the population estimates for the rest of the decade. A Census Bureau task force is studying whether its research findings can improve these annual state and county population estimates, which are widely used to direct an estimated $1.5 trillion in federal funding for health, education, revenue-sharing and other programs every year. The Census Bureau did use its DA findings as inputs to the 2021 population estimates, especially the results for young children. Longer term, the agency is studying how to improve the 2030 census by building trust with undercounted groups and relying more on government records rather than requiring people to fill out forms.