Methodology

Reading Habits Survey

Prepared by Princeton Survey Research Associates International for the Pew Research Center’s Internet & American Life Project

December 2011

Summary

The Reading Habits Survey, conducted by the Pew Research Center’s Internet & American Life Project, obtained telephone interviews with a nationally representative sample of 2,986 people ages 16 and older living in the United States. Interviews were conducted via landline (n_LL=1,526) and cell phone (n_C=1,460, including 677 without a landline phone). The survey was conducted by Princeton Survey Research Associates International. The interviews were administered in English and Spanish by Princeton Data Source from November 16 to December 21, 2011. Statistical results are weighted to correct known demographic discrepancies. The margin of sampling error for results based on the complete set of weighted data is ±2.2 percentage points. Results based on the 2,571 internet users have a margin of sampling error of ±2.3 percentage points.

Details on the design, execution and analysis of the survey are discussed below.

Design and Data Collection Procedures

Sample Design

A combination of landline and cellular random digit dial (RDD) samples was used to represent all adults in the United States who have access to either a landline or cellular telephone. Both samples were provided by Survey Sampling International, LLC (SSI) according to PSRAI specifications.

Numbers for the landline sample were drawn with equal probabilities from active blocks (area code + exchange + two-digit block number) that contained three or more residential directory listings. The cellular sample was not list-assisted, but was drawn through systematic sampling from dedicated wireless 100-blocks and shared service 100-blocks with no directory-listed landline numbers.

Contact Procedures

Interviews were conducted from November 16 to December 21, 2011. As many as seven attempts were made to contact every sampled telephone number. Sample was released for interviewing in replicates, which are representative subsamples of the larger sample. Using replicates to control the release of sample ensures that complete call procedures are followed for the entire sample. Calls were staggered over times of day and days of the week to maximize the chance of making contact with potential respondents. Interviewing was spread as evenly as possible across the days in field. Each telephone number was called at least one time during the day in an attempt to complete an interview.
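The replicate-based release described above can be illustrated with a minimal sketch. It assumes replicates are formed by randomly shuffling the sampled numbers and dealing them into equal-sized groups; the function name `make_replicates`, the placeholder number labels, and the replicate count are hypothetical and are not taken from PSRAI's procedures.

```python
import random


def make_replicates(sampled_numbers, n_replicates, seed=0):
    """Split a sample into random subsamples (replicates) of near-equal size.

    Hypothetical helper: each replicate is a random subset of the full
    sample, so each one is itself a small random sample of the whole.
    """
    rng = random.Random(seed)
    shuffled = list(sampled_numbers)
    rng.shuffle(shuffled)
    # Deal the shuffled numbers round-robin into n_replicates groups.
    return [shuffled[i::n_replicates] for i in range(n_replicates)]


# Placeholder labels standing in for sampled telephone numbers.
sample = [f"number-{i:04d}" for i in range(1000)]
replicates = make_replicates(sample, n_replicates=8)

# Releasing one replicate at a time lets the full call procedure
# (here, up to seven attempts) finish before more sample is opened.
for index, replicate in enumerate(replicates, start=1):
    print(f"replicate {index}: {len(replicate)} numbers")
```

Because every replicate is a random subset, fieldwork can stop after any completed replicate without biasing the sample toward numbers that happened to be dialed first.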

For the landline sample, interviewers asked to speak with the youngest adult male or female currently at home based on a random rotation. If no male/female was available, interviewers asked to speak with the youngest adult of the other gender. This systematic respondent selection technique has been shown to produce samples that closely mirror the population in terms of age and gender when combined with cell interviewing.

For the cellular sample, interviews were conducted with the person who answered the phone. Interviewers verified that the person was an adult and in a safe place before administering the survey. Cellular respondents were offered a post-paid cash reimbursement for their participation.

Calls were made to the landline and cell samples until 1,125 interviews were completed in each. Once those targets were hit, screening for e-reader and tablet owners was implemented. During the screening, anyone who did not report owning an e-book reader or tablet device was screened out as ineligible. All others continued the survey until approximately 700 e-reader/tablet owners were interviewed overall.

Weighting and analysis

The first stage of weighting corrected for the oversampling of tablet and e-reader users via screening from the landline and cell sample frames. The second stage of weighting corrected for different probabilities of selection associated with the number of adults in each household and each respondent’s telephone usage patterns.¹⁸ This weighting also adjusts for the overlapping landline and cell sample frames and the relative sizes of each frame and each sample.

The equations can be simplified by plugging in the values for S_LL = 1,526 and S_CP = 1,460. Additionally, we estimate the ratio of the size of the landline sample frame to the cell phone sample frame as R = 1.03.

The final stage of weighting balances sample demographics to population parameters. The sample is balanced to match national population parameters for sex, age, education, race, Hispanic origin, region (U.S. Census definitions), population density, and telephone usage. Hispanic origin was split by nativity: U.S.-born and non-U.S.-born. The White, non-Hispanic subgroup is also balanced on age, education and region. The basic weighting parameters came from a special analysis of the Census Bureau’s 2010 Annual Social and Economic Supplement (ASEC) that included all households in the United States. The population density parameter was derived from Census 2000 data. The cell phone usage parameter came from an analysis of the July-December 2010 National Health Interview Survey.¹⁹ ²⁰

Weighting was accomplished using Sample Balancing, a special iterative sample weighting program that simultaneously balances the distributions of all variables using a statistical technique called the Deming Algorithm. Weights were trimmed to prevent individual interviews from having too much influence on the final results. The use of these weights in statistical analysis ensures that the demographic characteristics of the sample closely approximate the demographic characteristics of the national population. Table 1 compares weighted and unweighted sample distributions to population parameters.
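The iterative balancing described above is commonly known as raking, or iterative proportional fitting. The following is a minimal sketch of that general technique, not PSRAI's Sample Balancing program: the function `rake`, the ten toy respondents, and the target shares are all hypothetical, and the sketch omits weight trimming.

```python
def rake(records, targets, max_iter=200, tol=1e-8):
    """Iteratively adjust weights so weighted shares match target margins.

    records: list of dicts mapping variable name -> category
    targets: dict mapping variable name -> {category: population share}
    """
    weights = [1.0] * len(records)
    for _ in range(max_iter):
        max_change = 0.0
        for var, shares in targets.items():
            total = sum(weights)
            # Current weighted total in each category of this variable.
            current = {}
            for rec, w in zip(records, weights):
                current[rec[var]] = current.get(rec[var], 0.0) + w
            # Scale each case so its category hits the target share.
            for i, rec in enumerate(records):
                factor = shares[rec[var]] * total / current[rec[var]]
                max_change = max(max_change, abs(factor - 1.0))
                weights[i] *= factor
        if max_change < tol:  # every margin already matches
            break
    return weights


# Hypothetical toy sample of ten respondents (not survey data).
records = (
    [{"sex": "F", "age": "16-49"}] * 3
    + [{"sex": "F", "age": "50+"}] * 2
    + [{"sex": "M", "age": "16-49"}] * 1
    + [{"sex": "M", "age": "50+"}] * 4
)
# Hypothetical population shares for each variable.
targets = {
    "sex": {"F": 0.51, "M": 0.49},
    "age": {"16-49": 0.60, "50+": 0.40},
}
weights = rake(records, targets)
```

Each pass fixes one variable's margin at a time, which slightly disturbs the others; repeating the passes lets all margins converge together.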

Effects of Sample Design on Statistical Inference

Post-data collection statistical adjustments require analysis procedures that reflect departures from simple random sampling. PSRAI calculates the effects of these design features so that an appropriate adjustment can be incorporated into tests of statistical significance when using these data. The so-called “design effect” or deff represents the loss in statistical efficiency that results from systematic non-response. The total sample design effect for this survey is 1.46.

PSRAI calculates the composite design effect for a sample of size n, with each case having a weight w_i, as:

$$\mathit{deff} = \frac{n \sum_{i=1}^{n} w_i^{2}}{\left(\sum_{i=1}^{n} w_i\right)^{2}}$$

In a wide range of situations, the adjusted standard error of a statistic should be calculated by multiplying the usual formula by the square root of the design effect (√deff). Thus, the formula for computing the 95% confidence interval around a percentage is:

$$p \pm \left(\sqrt{\mathit{deff}} \times 1.96 \sqrt{\frac{p(1-p)}{n}}\right)$$

where p is the sample estimate and n is the unweighted number of sample cases in the group being considered.
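The calculation can be sketched in a few lines. This sketch assumes the design effect is the standard weight-based approximation (n times the sum of squared weights, divided by the squared sum of weights) and that the interval is the usual normal-approximation interval inflated by √deff. The function names and the toy weights are hypothetical; only n = 2,986 and deff = 1.46 come from this report.

```python
import math


def design_effect(weights):
    """Approximate design effect from unequal weights (assumed formula)."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


def adjusted_ci(p, n, deff, z=1.96):
    """95% confidence interval for a proportion, inflated by sqrt(deff)."""
    half_width = z * math.sqrt(deff) * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Hypothetical weights, only to show the deff calculation.
toy_weights = [0.5, 1.0, 1.0, 1.5, 2.0]
toy_deff = design_effect(toy_weights)  # equal weights would give exactly 1.0

# Figures reported for the full sample: n = 2,986 and deff = 1.46.
low, high = adjusted_ci(p=0.5, n=2986, deff=1.46)
margin = (high - low) / 2
print(round(100 * margin, 1))  # 2.2
```

Evaluated at p = 0.5, where the interval is widest, the half-width matches the ±2.2 percentage points reported for the full sample.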

The survey’s margin of error is the largest 95% confidence interval for any estimated proportion based on the total sample—the one around 50%. For example, the margin of error for the entire sample is ±2.2 percentage points. This means that in 95 out of every 100 samples drawn using the same methodology, estimated proportions based on the entire sample will be no more than 2.2 percentage points away from their true values in the population. It is important to remember that sampling fluctuations are only one possible source of error in a survey estimate. Other sources, such as respondent selection bias, questionnaire wording and reporting inaccuracy, may contribute additional error of greater or lesser magnitude.

Response Rate

Table 2 reports the disposition of all sampled telephone numbers ever dialed from the original telephone number samples. The response rate estimates the fraction of all eligible respondents in the sample that were ultimately interviewed. At PSRAI it is calculated by taking the product of three component rates:²¹

Contact rate—the proportion of working numbers where a request for interview was made²²
Cooperation rate—the proportion of contacted numbers where a consent for interview was at least initially obtained, versus those refused
Completion rate—the proportion of initially cooperating and eligible interviews that were completed

Thus the response rate for the landline sample was 14 percent. The response rate for the cellular sample was 11 percent.
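The product of the three component rates can be sketched as follows. The component values below are hypothetical placeholders chosen only to show the arithmetic; they are not the rates behind the 14 percent and 11 percent figures, which come from the dispositions in Table 2.

```python
def response_rate(contact_rate, cooperation_rate, completion_rate):
    """Overall response rate as the product of three component rates."""
    for rate in (contact_rate, cooperation_rate, completion_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("each component rate must be between 0 and 1")
    return contact_rate * cooperation_rate * completion_rate


# Hypothetical component rates, for illustration only.
example = response_rate(
    contact_rate=0.60,
    cooperation_rate=0.30,
    completion_rate=0.90,
)
print(f"{example:.1%}")
```

Because the rates multiply, a low value in any single component caps the overall response rate, no matter how high the other two are.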

Qualitative material

The qualitative material in this report, including the extended quotes from individuals regarding e-books and library use, comes from two sets of online interviews that were conducted in May 2012. The first group of interviews was of library patrons who have borrowed an e-book from the library. Some 6,573 people answered at least some of the questions on the patron canvassing, and 4,396 completed the questionnaire. The second group of interviews was of librarians themselves. Some 2,256 library staff members answered at least some of the questions on the canvassing of librarians, and 1,180 completed the questionnaire. Both sets of online interviews were opt-in canvassings meant to draw out comments from patrons and librarians, and they are not representative of the general population or even library users. As a result, no statistics or specific data points from either online questionnaire are cited in this report.

¹⁸ i.e., whether respondents have only a landline telephone, only a cell phone, or both kinds of telephone.

¹⁹ Blumberg SJ, Luke JV. Wireless substitution: Early release of estimates from the National Health Interview Survey, July-December, 2010. National Center for Health Statistics. June 2011.

²⁰ The phone use parameter used for this 16+ sample is the same as the parameter we use for all 18+ surveys. In other words, no adjustment was made to account for the fact that the target population for this survey is slightly different than a standard 18+ general population survey.

²¹ PSRAI’s disposition codes and reporting are consistent with the American Association for Public Opinion Research standards.

²² PSRAI assumes that 75 percent of cases that result in a constant disposition of “No answer” or “Busy” are actually not working numbers.