May 2, 2016

Evaluating Online Nonprobability Surveys

5. Variation in online nonprobability survey design

One possible explanation for the variability in estimates across the different nonprobability samples is the range of methods that online sample vendors employ. They differ on recruitment, weighting and everything in between. Even if the impact of each of these differences is small, the cumulative effect could be much larger.

Panel recruitment and survey sampling

Methods of recruiting respondents to modern nonprobability surveys differ across vendors. Some use banner ads, search engine ads or social media ads. Some allow panelists to sign up directly at the panel vendor’s website, while others recruit at the end of other web surveys. Some panel vendors directly recruit customers of partner companies, and some even turn to phone and mail recruitment for hard-to-reach demographic groups.

Vendors also differ on how they draw samples for a particular survey from among their own panelists or other sources available to them. Some draw a sample from their panel in much the same way a probability-based sample is drawn from a sampling frame, with subgroups selected at different rates depending on their propensity to respond and their desired proportion in the final sample. Most online sample providers do not draw samples for individual surveys, but rather invite panelists to an unspecified survey and route them to one of many surveys fielding simultaneously. The outcome of the routing is determined by the respondent’s characteristics and algorithms that determine where each respondent is needed most. These routing algorithms make for much more efficient use of sample, but they do imply that a respondent’s inclusion in any particular survey depends to some extent on what other surveys are fielding at the same time. These routing algorithms vary from provider to provider and their effects have received very little study by survey methodologists.
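The routing logic described above can be caricatured as a simple rule: send each incoming respondent to whichever open survey both accepts their profile and needs the most additional completes. Real routers are proprietary and considerably more complex; the survey records, field names and qualification rules below are invented purely for illustration.

```python
# Hypothetical sketch of a survey router. Each survey tracks how many
# completes it still needs ("needed") and a screening rule ("qualifies").
# A respondent is routed to the eligible survey with the greatest need.

def route(respondent, surveys):
    """Return the open, eligible survey needing this respondent most,
    or None if no fielding survey accepts their profile."""
    eligible = [s for s in surveys
                if s["needed"] > 0 and s["qualifies"](respondent)]
    if not eligible:
        return None  # no survey in the field fits this respondent
    return max(eligible, key=lambda s: s["needed"])

# Invented example: survey B targets adults 65 and older.
surveys = [
    {"name": "A", "needed": 10, "qualifies": lambda r: r["age"] >= 18},
    {"name": "B", "needed": 40, "qualifies": lambda r: r["age"] >= 65},
]
```

Even this toy version shows why inclusion in any one survey depends on what else is fielding: a 70-year-old is pulled toward survey B only while B still needs completes.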

As previously discussed, sample providers typically apply some form of quota sampling during data collection to achieve a pre-specified distribution on some set of variables. Most panel vendors set quotas on some combination of age, gender and Census region. However, they differ on which of those variables they use as well as the categories into which responses are grouped. For example, one panel vendor might quota on male vs. female and separately on age groups of 18-29, 30-49, 50-64, and 65 and older. Another might have quotas set on the fully crossed age-by-gender categories of male 18-34, female 18-34, male 35-54, female 35-54, male 55 and older, and female 55 and older.

Some online sample vendors are offering more statistically sophisticated sampling techniques that go beyond setting basic quotas. One such approach, propensity score matching, involves assigning each panelist a score based on their likelihood of being in a probability reference sample (e.g., the Current Population Survey) – rather than in a nonprobability sample – given their demographic profile. Quotas are then based on quantiles of this propensity score rather than on specific respondent characteristics. A related technique uses statistical matching to achieve a desired sample composition. Under this approach, the vendor draws a subsample from a large probability sample, such as the CPS, and then looks for members of its own panel who closely resemble each case in the probability subsample on a number of variables. The survey is complete when a suitably close “match” has been identified for every case in the subsample. Both of these methods allow vendors to flexibly incorporate a larger number of respondent characteristics into the selection process than is possible with standard quotas, in theory improving their ability to correct for sources of selection bias.

The sample used for a survey might also not be limited to members of a particular vendor’s panel. Sometimes panelists from multiple vendors’ panels are sampled for a survey, especially if a low incidence or hard-to-reach group is being targeted. Additionally, some panel vendors offer the option of including “river sample” cases along with panel sample to make up the final survey sample. “River” sample is a term used when internet users are invited to take a survey through an advertisement or webpage without being required to join a panel. In some cases, answering survey questions allows them to access content that they would otherwise have to pay for, in an arrangement known as a “survey wall.”

Weighting

Once the survey is out of the field, vendors also differ in how the data are weighted in order to be representative of the population of interest. A number of nonprobability sample vendors do not weight their data by default. Their view is that if the sample is properly balanced by the quotas employed in sampling, weighting is unnecessary. When weights are provided, the technique used varies by vendor (e.g., from iterative proportional fitting, or “raking,” as a default practice to more sophisticated, generalized regression-based approaches for some custom surveys), as do the variables on which the data are weighted.
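Raking, the default technique mentioned above, is simple enough to sketch: each respondent’s weight is alternately rescaled so the weighted sample matches each population margin in turn, cycling until the weights settle. The margins below (sex and an age group) are illustrative; a vendor would rake on whatever variables it has population benchmarks for.

```python
# Minimal sketch of raking (iterative proportional fitting).
# respondents: list of dicts mapping variable -> category.
# margins: {variable: {category: target population share}}.

def rake(respondents, margins, n_iter=50):
    """Return one weight per respondent, normalized to mean 1, such that
    weighted category shares approach the target margins."""
    n = len(respondents)
    weights = [1.0] * n
    for _ in range(n_iter):
        for var, targets in margins.items():
            # current weighted total of each category for this variable
            totals = {cat: 0.0 for cat in targets}
            for w, r in zip(weights, respondents):
                totals[r[var]] += w
            grand = sum(totals.values())
            # rescale each weight so this margin hits its target share
            for i, r in enumerate(respondents):
                weights[i] *= targets[r[var]] * grand / totals[r[var]]
    mean = sum(weights) / n
    return [w / mean for w in weights]
```

Because each pass fixes one margin while disturbing the others, the loop must iterate; in practice the weights converge quickly when the targets are mutually consistent.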

Strategies to increase quality: Incentives, monitoring, verification

Vendors also differ on the incentives they offer individuals to join their panel and/or respond to surveys. None of the vendors we tested offer direct monetary incentives for completing surveys. The more common approach is to incentivize panel members with points, which can be redeemed for consumer goods like gift cards and airline miles, as well as for cash. Other vendors offer drawings or donations to charity. Some offer incentives only if a panelist qualifies for and completes a survey, while others offer incentives even to panelists who are sampled for a particular survey and complete screening questions but ultimately do not qualify for the survey.

Finally, each vendor has its own set of quality control measures, which can range from the simple to the complex. These measures may be implemented at the survey level, at the respondent level or at a combination of the two. They may include monitoring for speeding (when respondents answer questions rapidly and without actually considering the question) and straightlining (when respondents simply select the same answer choice to every question), as well as trap questions that check to make sure respondents are reading the questions carefully. They may also regulate the frequency with which panelists can be invited to take surveys or the frequency with which they can respond to surveys at their own initiative. Most panels are double opt-in, meaning potential panelists first enter an email address and then respond to an email sent from the panel provider in order to confirm the email account. Depending on the vendor, other quality control features include IP address validation and digital fingerprinting, which guards against a single person having multiple accounts in a given panel.
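Two of the respondent-level checks named above, speeding and straightlining, reduce to very simple rules. The sketch below uses an arbitrary cutoff (flagging interviews shorter than 30% of the median length) purely for illustration; actual vendors set their own thresholds.

```python
# Illustrative respondent-level quality checks: speeding (finishing far
# faster than the median interview) and straightlining (identical
# answers down a grid of items). The 0.3 cutoff is an invented example.

from statistics import median

def is_speeder(duration_sec, all_durations, cutoff=0.3):
    """Flag interviews shorter than a fraction of the median length."""
    return duration_sec < cutoff * median(all_durations)

def is_straightliner(grid_answers):
    """Flag a grid where every item received the same answer choice."""
    return len(set(grid_answers)) == 1
```

Flags like these are typically combined with the survey-level checks described above (trap questions, invitation frequency limits, digital fingerprinting) rather than used alone to discard a respondent.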