Pew Research Center
Fact Tank Blog

The challenges of translating the U.S. census questionnaire into Arabic

In 2020, census questionnaires may for the first time be offered in Arabic, now the fastest-growing language in the U.S. However, the Census Bureau faces a challenge not only in translating the language but also in adjusting the appearance of the questionnaire for those accustomed to reading and writing Arabic script.

The Census Bureau has already conducted some research on what it would take to implement the new questionnaire and has made some recommendations. A final decision on these changes – or even whether the questionnaire will definitely be translated into Arabic – hasn’t been made. A new study presented at the American Association for Public Opinion Research annual conference in May detailed the bureau’s cognitive testing and focus groups of Arabic speakers not proficient in English to identify the translation and visual display issues that are unique to Arabic and anticipate the measurement problems that might result. The bureau will use this research to help determine whether a translation of the census form can accurately “translate” symbolic and layout meanings from English to Arabic.

Arabic is the fastest-growing language in the U.S.

The number of people ages 5 and older who speak Arabic at home grew by 29% between 2010 and 2014 to 1.1 million, making it the seventh most commonly spoken non-English language in the U.S. Meanwhile, the number who speak Spanish at home grew only 6% over the same period.

The growth in Arabic language use is tied to continued immigration from the Middle East and North Africa and the growing U.S. Muslim population. The increasing presence of this group is one reason the Census Bureau may add a Middle East/North Africa category to the 2020 census form as part of major changes being considered to questions about race and ethnicity. In 2010, the Census Bureau offered an Arabic language assistance guide to help Arabic speakers fill out an English-language questionnaire.

The bureau identified about 1.9 million people with Arab ancestry living in the United States in 2014, but advocacy groups have suggested that the number may be much higher. Among those who speak Arabic at home, 38% are not proficient in English – that is, they report speaking English less than “very well.” This is comparable to the rate of limited English proficiency among the 39.3 million U.S. residents who speak Spanish at home: Some 42% of this group does not speak English very well, according to census data.
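To put those shares in terms of people, the percentages can be applied to the speaker counts cited in this post. The short Python sketch below does that arithmetic; the function name is hypothetical, and because the inputs are rounded figures the results are rough estimates rather than official census counts.

```python
def limited_english_count(speakers: int, pct_not_proficient: int) -> int:
    """Estimate how many speakers report speaking English less than 'very well'.

    Both inputs are the rounded figures quoted in the article, so the
    result is only an approximation.
    """
    return speakers * pct_not_proficient // 100


# Figures quoted above: 1.1 million Arabic speakers (38% not proficient)
# and 39.3 million Spanish speakers (42% not proficient).
arabic_estimate = limited_english_count(1_100_000, 38)
spanish_estimate = limited_english_count(39_300_000, 42)

print(f"Arabic speakers with limited English:  ~{arabic_estimate:,}")
print(f"Spanish speakers with limited English: ~{spanish_estimate:,}")
```

By this estimate, roughly 420,000 Arabic speakers and about 16.5 million Spanish speakers have limited English proficiency: similar rates, but very different numbers of people.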

The challenges of translating surveys across cultures

Translating survey questionnaires is a tricky endeavor because it can be difficult to express the same meaning across two languages and cultures. But Arabic presents unique challenges because it is read from right to left on the page (the opposite of English and many other languages), the letters are connected like cursive writing in English, and, because it uses a different alphabet, words such as names can’t always be directly transliterated into English. Even if the questions are translated accurately, the visual elements of the survey may not necessarily transmit the same meaning as in English. For example, symbols such as an “X” to mark a response carry different connotations in different cultures. The census is usually a self-administered survey (that is, respondents complete the questionnaire on their own, on paper or online) and research shows that visual display can have a large effect on survey responses.

Careful formatting encourages respondents to give accurate and legible responses

In addition to translating the text of the questionnaire, the new study indicated the need for thoughtful formatting of the questions and answer write-in fields. Because Arabic is written right to left, the study recommends that most of the census form’s questions be aligned on the right side of the page. Also, while the English questionnaire uses capitalization or italicization to emphasize or de-emphasize text elements, there are no distinct capital letters or italicization in Arabic, so the study recommends that other methods be used to achieve a similar effect, such as bolded or underlined words.

In addition, the current design of the census paper questionnaire provides individual blocks for printed letters in write-in responses. For the Arabic translation, the study recommends eliminating these inside borders so that the response can be written in normal connected Arabic script. This challenge may be mitigated by the Census Bureau’s initiative to gather most responses online because these boxes only appear in the printed forms.

Another issue is that the census’s English instructions indicate that respondents should use an “X” to mark a checkbox, but in Arabic, an “X” holds connotations of a response being incorrect or not applicable, while a check mark is more culturally appropriate. The bureau is looking into allowing the use of check marks in future surveys and censuses.

Certain census questions may require responses in English

One key challenge is to determine when to require a response in English and when to allow an Arabic response. For example, the study recommends that the address fields require the respondent to use English, because an American address might not be accurately translated into Arabic. Beyond an instruction to use English, the study recommends two visual cues for these items: keeping the inside borders in the text boxes and aligning the response options to the left side of the page, both of which signal that English rather than Arabic letters should be used.

Names raise another complex issue. Arabic names can be transliterated into English in many different ways because the letters of the Arabic alphabet don’t necessarily have direct English equivalents. For example, the Arabic name “حسين” can be transliterated into English at least six different ways: Hussein, Hussain, Husein, Husain, Houssain and Houssein.

Because the Census Bureau is looking into supplementing its data collection with other government records, collecting a respondent’s “official” name (i.e., what is included on their government identification, tax forms, etc.) is important. This suggests asking for the name written in the English alphabet, but some of the respondents interviewed in the Census Bureau study interpreted “name in English” to mean their “Americanized” nickname, such as John or Lisa. The study recommended further research on collecting names from Arabic-speaking households.
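The matching problem behind this concern can be illustrated with a small Python sketch. It reduces the six spellings of “حسين” listed above to a single rough key so that records using different spellings could be compared. The function name and its three rewrite rules are hypothetical and hand-picked for this one name; they are not the Census Bureau’s method, and real record linkage would need far more robust techniques.

```python
import re


def transliteration_key(name: str) -> str:
    """Collapse some common spelling variation into a rough matching key.

    The rules below are ad hoc assumptions chosen for illustration only.
    """
    key = name.strip().lower()
    key = key.replace("ou", "u")          # Houssein -> hussein
    key = re.sub(r"(.)\1+", r"\1", key)   # collapse doubled letters: hussein -> husein
    key = key.replace("ai", "ei")         # treat "ai" and "ei" as the same vowel sound
    return key


# The six transliterations cited in the article.
variants = ["Hussein", "Hussain", "Husein", "Husain", "Houssain", "Houssein"]
keys = {transliteration_key(v) for v in variants}

print(keys)
```

All six spellings reduce to the same key here, but only because the rules were written with this name in mind. Other names would need different rules, and none of this helps if a respondent writes an “Americanized” nickname instead.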

The new Arabic translation comes along with many other proposed changes to the 2020 census. One change that is certain: The Census Bureau hopes to count most American households online for the first time, rather than using only paper questionnaires. The bureau is testing a number of other changes to the questionnaire, such as revising the race and ethnicity questions and using administrative data from other government agencies to fill in missing data on people who don’t fill out their census forms.