Sampling and Weighting
This report is based on the findings of a survey on Americans’ use of the internet. The results in this report are based on data from telephone interviews conducted by Princeton Survey Research Associates International from October 20 to November 28, 2010, among a sample of 2,255 adults, age 18 and older. Interviews were conducted in English. For results based on the total sample, one can say with 95% confidence that the error attributable to sampling is plus or minus 2.5 percentage points. For results based on internet users (n=1,787), the margin of sampling error is plus or minus 2.8 percentage points. In addition to sampling error, question wording and practical difficulties in conducting telephone surveys may introduce some error or bias into the findings of opinion polls.
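The reported margins can be related to sample size with the standard formula for a proportion. The sketch below is illustrative only: it assumes the conventional worst case of p = 0.5 and a 95% z-value of 1.96, and the function names are ours. The design effect it backs out is an inference from the published figures, not a number stated in this report.

```python
import math

def margin_of_error(n, deff=1.0, p=0.5, z=1.96):
    """Half-width of a confidence interval for a proportion, as a fraction.

    deff is the design effect; 1.0 corresponds to a simple random sample.
    """
    return z * math.sqrt(deff * p * (1 - p) / n)

def implied_design_effect(n, reported_margin, p=0.5, z=1.96):
    """Design effect that reconciles a reported margin with the unadjusted one."""
    return (reported_margin / margin_of_error(n, 1.0, p, z)) ** 2

# Total sample: n = 2,255, reported margin of plus or minus 2.5 points.
srs_total = margin_of_error(2255)
deff_total = implied_design_effect(2255, 0.025)

# Internet users: n = 1,787, reported margin of plus or minus 2.8 points.
srs_users = margin_of_error(1787)
deff_users = implied_design_effect(1787, 0.028)

print(f"Unadjusted margin, total sample: {100 * srs_total:.2f} points")
print(f"Implied design effect, total sample: {deff_total:.2f}")
print(f"Unadjusted margin, internet users: {100 * srs_users:.2f} points")
print(f"Implied design effect, internet users: {deff_users:.2f}")
```

The unadjusted margins come out smaller than the reported ones, which is what one would expect if the reported figures account for the loss of precision introduced by weighting.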
A combination of landline and cellular random digit dial (RDD) samples was used to represent all adults in the continental United States who have access to either a landline or cellular telephone. Both samples were provided by Survey Sampling International, LLC (SSI) according to PSRAI specifications. Numbers for the landline sample were selected with probabilities in proportion to their share of listed telephone households from active blocks (area code + exchange + two-digit block number) that contained three or more residential directory listings. The cellular sample was not list-assisted, but was drawn through a systematic sampling from dedicated wireless 100-blocks and shared service 100-blocks with no directory-listed landline numbers. The final data also included callback interviews with respondents who had previously been interviewed for the 2008 Personal Networks and Community survey. In total, 610 callback interviews were conducted: 499 from the landline sample and 111 from the cell sample.
New sample was released daily and was kept in the field for at least five days. The sample was released in replicates, which are representative subsamples of the larger sample. Releasing the sample in replicates ensures that complete call procedures are followed for the entire sample. At least 7 attempts were made to complete an interview at a sampled telephone number. The calls were staggered over times of day and days of the week to maximize the chances of making contact with a potential respondent. Each number received at least one daytime call in an attempt to find someone available.
The introduction and screening procedures differed depending on the sample segment. For the landline RDD sample, half of the time interviewers first asked to speak with the youngest adult male currently at home. If no male was at home at the time of the call, interviewers asked to speak with the youngest adult female. For the other half of the contacts, interviewers first asked to speak with the youngest adult female currently at home. If no female was available, interviewers asked to speak with the youngest adult male at home. For the cellular RDD sample, interviews were conducted with the person who answered the phone. Interviewers verified that the person was an adult and in a safe place before administering the survey. For the landline and cell callback samples, interviewers started by asking to talk with the person in the household who had previously completed a telephone interview in the 2008 survey. The person was identified by age and gender.
Cellular sample respondents were offered a post-paid cash incentive for their participation. All interviews completed on any given day were considered to be the final sample for that day.
Weighting is generally used in survey analysis to compensate for sample designs and patterns of non-response that might bias results. A two-stage weighting procedure was used to weight this dual-frame sample. The first-stage weight is the product of two adjustments made to the data – a Probability of Selection Adjustment (PSA) and a Phone Use Adjustment (PUA). The PSA corrects for the fact that respondents in the landline sample have different probabilities of being sampled depending on how many adults live in the household. The PUA corrects for the overlapping landline and cellular sample frames.
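As a rough illustration of how a first-stage weight of this kind can be assembled, the sketch below multiplies the two adjustments for a single respondent. The specific rules are assumptions made for the example and are not taken from this report: the PSA is set to the number of adults in a landline household, capped at a hypothetical maximum of 3, and the PUA halves the weight of respondents who have both a landline and a cell phone because they could have been reached through either frame.

```python
def first_stage_weight(sample, adults_in_household, has_landline, has_cell,
                       max_adults=3):
    """Product of a selection adjustment (PSA) and a phone use adjustment (PUA).

    All rules here are illustrative assumptions, not the report's actual values.
    """
    # PSA (assumed form): one adult is interviewed per landline household, so a
    # respondent from a larger household had a smaller chance of selection and
    # receives a larger weight. The cap limits extreme weights.
    if sample == "landline":
        psa = float(min(adults_in_household, max_adults))
    else:
        psa = 1.0  # assumed: a cell phone is treated as a personal device

    # PUA (assumed form): people reachable through both frames are
    # over-represented in the combined sample, so their weight is halved.
    pua = 0.5 if (has_landline and has_cell) else 1.0

    return psa * pua

# A landline respondent in a two-adult household who also owns a cell phone.
print(first_stage_weight("landline", 2, has_landline=True, has_cell=True))
```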
The second stage of weighting balances sample demographics to population parameters. The sample is balanced by form to match national population parameters for sex, age, education, race, Hispanic origin, region (U.S. Census definitions), population density, and telephone usage. The White, non-Hispanic subgroup is also balanced on age, education and region. The basic weighting parameters came from a special analysis of the Census Bureau’s 2009 Annual Social and Economic Supplement (ASEC) that included all households in the continental United States. The population density parameter was derived from Census 2000 data. The cell phone usage parameter came from an analysis of the July-December 2009 National Health Interview Survey.
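One common way to balance a sample to several population parameters at once is raking (iterative proportional fitting), which adjusts the weights to match each variable's targets in turn and repeats until the weights settle. The report does not name the algorithm that was used, so the sketch below should be read as an example of the general idea only. The respondents and the target shares are made up for the illustration.

```python
def rake(rows, weights, targets, iterations=100):
    """Adjust weights so weighted category shares match the targets.

    rows:    list of dicts, one per respondent
    targets: {variable: {category: population share}}; shares sum to 1
    """
    w = list(weights)
    for _ in range(iterations):
        for var, shares in targets.items():
            total = sum(w)
            # Current weighted total in each category of this variable.
            current = {}
            for row, wi in zip(rows, w):
                current[row[var]] = current.get(row[var], 0.0) + wi
            # Scale every respondent so this variable matches its target.
            for i, row in enumerate(rows):
                cat = row[var]
                w[i] *= shares[cat] * total / current[cat]
    return w

def weighted_share(rows, w, var, cat):
    """Weighted share of respondents whose value of var equals cat."""
    return sum(wi for row, wi in zip(rows, w) if row[var] == cat) / sum(w)

# Hypothetical six-person sample that over-represents older women.
sample = [
    {"sex": "F", "age": "18-49"},
    {"sex": "F", "age": "50+"},
    {"sex": "F", "age": "50+"},
    {"sex": "F", "age": "50+"},
    {"sex": "M", "age": "18-49"},
    {"sex": "M", "age": "50+"},
]
# Hypothetical population targets, chosen only for the example.
targets = {
    "sex": {"F": 0.5, "M": 0.5},
    "age": {"18-49": 0.6, "50+": 0.4},
}

raked = rake(sample, [1.0] * len(sample), targets)
print(round(weighted_share(sample, raked, "sex", "F"), 4))
print(round(weighted_share(sample, raked, "age", "18-49"), 4))
```

In practice the starting weights passed to such a procedure would be the first-stage weights rather than a vector of ones.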
The disposition reports all of the sampled telephone numbers ever dialed from the original telephone number samples. The response rate estimates the fraction of all eligible respondents in the sample that were ultimately interviewed. At PSRAI it is calculated by taking the product of three component rates:
- Contact rate – the proportion of working numbers where a request for interview was made
- Cooperation rate – the proportion of contacted numbers where a consent for interview was at least initially obtained, versus those refused
- Completion rate – the proportion of initially cooperating and eligible interviews that were completed
Thus the response rate for the landline sample was 17.3 percent. The response rate for the cellular sample was 19.9 percent.
The full disposition of all sampled telephone numbers is available in the PDF version of the appendix.
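The response rate calculation described above is a simple product of the three component rates. The sketch below shows the arithmetic. The component values are hypothetical and are not taken from the disposition table, so the result is not expected to match either of the published rates.

```python
def response_rate(contact_rate, cooperation_rate, completion_rate):
    """Response rate as the product of three component rates (each 0 to 1)."""
    for rate in (contact_rate, cooperation_rate, completion_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("each component rate must be between 0 and 1")
    return contact_rate * cooperation_rate * completion_rate

# Hypothetical component rates, for illustration only.
example = response_rate(contact_rate=0.70, cooperation_rate=0.30,
                        completion_rate=0.85)
print(f"{100 * example:.1f} percent")
```

Because the rates multiply, a low value on any one component caps the overall response rate no matter how high the other two are.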
In this report, we are trying to understand how technology and other factors are related to the size, diversity and character of people’s social networks. But we face a challenge. If we were simply to compare the social networks of people who are heavy users of technology with those who do not use technology, we would have no way of knowing whether any differences we observe were associated with demographic or other differences between these groups, rather than with their differing patterns of technology use. That’s because some demographic traits, such as more years of education, are associated with larger and more diverse social networks. And those with more formal education are also more likely to use technology.
To deal with this challenge, we use a statistical technique called regression analysis, which allows us to examine the relationship between technology use and network size while holding constant other factors such as education, age or gender. Thus, many of the results reported here are not shown as simple comparisons of the behavior of groups on our key measures, which is the typical approach of Pew Internet reports. Rather, the findings compare the social networks of people who use certain technologies with demographically similar people who do not use the technologies. For example, we use regression analysis to compare the average size of the social network of a demographically typical American who uses the internet and has a cell phone with an American who shares the same demographic characteristics but does not use the internet or a cell phone.
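To make the idea of holding other factors constant concrete, the sketch below fits an ordinary least squares regression to a small synthetic dataset. Everything in it is invented for illustration: network size is constructed to rise by 0.5 per year of education and by 2 for technology users, and technology use is made more common among the better educated. It is not the model or the data used in this report.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(rows, y):
    """Least squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def mean(values):
    return sum(values) / len(values)

# Synthetic people as (years of education, uses technology: 1 or 0).
people = ([(10, 0)] * 3 + [(10, 1)] * 1 + [(12, 0)] * 2 + [(12, 1)] * 2 +
          [(14, 0)] * 1 + [(14, 1)] * 3 + [(16, 0)] * 1 + [(16, 1)] * 3)
# Constructed outcome: network size depends on both education and technology.
size = [1 + 0.5 * edu + 2 * tech for edu, tech in people]

# Simple comparison of group means, ignoring education.
naive_gap = (mean([s for (_, t), s in zip(people, size) if t == 1]) -
             mean([s for (_, t), s in zip(people, size) if t == 0]))

# Regression with an intercept, education, and technology use.
design = [[1.0, float(edu), float(tech)] for edu, tech in people]
intercept, edu_coef, adjusted_gap = ols(design, size)

print(f"Simple difference in means: {naive_gap:.2f}")
print(f"Difference holding education constant: {adjusted_gap:.2f}")
```

The simple comparison overstates the gap because technology users in this synthetic sample also have more education. The regression coefficient recovers the difference between users and non-users with the same education.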
Another common type of analysis in the report estimates how much more likely a certain outcome is (such as having at least one person of a different race or ethnic group in a social network) for people who use certain technology compared with people who do not, all other things being equal. For example, holding demographic characteristics constant, the regression analysis finds that a person who blogs is nearly twice as likely as a demographically similar person (e.g., the same sex, age, education and marital status) who does not blog to have someone of a different race in their core discussion network.
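Statements of this kind are often derived from a logistic regression, where the exponentiated coefficient for a variable is an odds ratio. The text does not name the model behind the estimate, so that is an assumption here, and the coefficient and baseline probability below are hypothetical. The sketch shows how such a coefficient translates into odds and probabilities for two otherwise identical people.

```python
import math

def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)

def odds_ratio_from_coefficient(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)

def probability_with_odds_ratio(baseline_p, odds_ratio):
    """Probability for the comparison group, given a baseline and an odds ratio."""
    new_odds = odds(baseline_p) * odds_ratio
    return new_odds / (1 + new_odds)

# Hypothetical values, chosen only to illustrate the conversion.
beta = math.log(2)      # a coefficient that corresponds to an odds ratio of 2
baseline = 0.20         # assumed probability for a comparable non-user

ratio = odds_ratio_from_coefficient(beta)
user_p = probability_with_odds_ratio(baseline, ratio)

print(f"Odds ratio: {ratio:.2f}")
print(f"Probability for a comparable user: {user_p:.3f}")
print(f"Ratio of probabilities: {user_p / baseline:.2f}")
```

Under these assumed numbers the odds double while the probability rises by a smaller factor, which is why it matters whether a comparison is stated in terms of odds or probabilities.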
As with all studies that use data collected at only one point in time, none of the results we report should be interpreted as explanations of cause and effect. We cannot say from these findings that internet and mobile-phone use cause people to have bigger, more diverse networks. We can and do say that technology use is often strongly associated with larger and more diverse social networks.