Machine vision tools like facial recognition are increasingly being used for law enforcement, advertising, and other purposes. Pew Research Center itself recently used a machine vision system to measure the prevalence of men and women in online image search results. This kind of system develops its own rules for identifying men and women after seeing thousands of example images, but those rules can be hard for humans to discern. To better understand how this works, we showed images of the Center’s staff members to a trained machine vision system similar to the one we used to classify image searches. We then systematically obscured sections of each image to see which parts of the face caused the system to change its decision about the gender of the person pictured. Some of the results seemed intuitive; others were baffling. In this interactive challenge, see if you can guess what makes the system change its decision.

What these results can tell us about machine vision

Here are some of the images from the interactive challenge. As a reminder, if any yellow or purple area is covered, the model changes its initial decision about whether the image shows a man or a woman.

What do you see when you look at these images? Perhaps you noticed that the parts of a person’s face that cause the model to change its mind about the person’s gender are not always what you might expect. Sometimes covering the middle of a person’s face causes the system to change its decision – but sometimes the same thing happens when we cover a part of the face that seems trivial to a human for determining gender, like an ear or part of the forehead.

You might also notice that there aren’t obvious or consistent patterns to the parts of people’s faces that cause the model to change its mind. Covering up a certain part of one person’s face might change its decision for that image, but covering the same part of another person’s face might not cause that change.

Machine learning tools can bring substantial efficiency gains to analyzing large quantities of data, which is why we used this type of system to examine thousands of image search results in our own studies. But unlike traditional computer programs – which follow a highly prescribed set of steps to reach their conclusions – these systems make their decisions in ways that are largely hidden from public view, and highly dependent on the data used to train them. As such, they can be prone to systematic biases and can fail in ways that are difficult to understand and hard to predict in advance.

The technique used in this interactive is one way to shed light on the features these algorithms rely on to make their decisions. Relatedly, our data essay on machine vision takes an in-depth look at how the data used to train these systems can introduce hidden biases and unexpected sources of error into their results.
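The occlusion approach described above – covering one region of the image at a time and checking whether the classifier’s decision flips – can be sketched in a few lines. The sketch below is illustrative only and assumes a generic `classify` function, patch size, and gray-fill value; it is not the Center’s actual implementation:

```python
import numpy as np

def occlusion_map(image, classify, patch=16, stride=16):
    """Slide an occluding patch across the image and record, for each
    position, whether the classifier's decision flips relative to the
    decision on the unoccluded image."""
    base_label = classify(image)
    h, w = image.shape[:2]
    rows = (h - patch) // stride + 1
    cols = (w - patch) // stride + 1
    flips = np.zeros((rows, cols), dtype=bool)
    for i, y in enumerate(range(0, h - patch + 1, stride)):
        for j, x in enumerate(range(0, w - patch + 1, stride)):
            occluded = image.copy()
            # Fill the patch with the image's mean intensity ("gray" box).
            occluded[y:y + patch, x:x + patch] = image.mean()
            flips[i, j] = classify(occluded) != base_label
    return base_label, flips
```

The resulting boolean grid marks the regions that, when hidden, change the model’s answer – analogous to the yellow and purple areas highlighted in the interactive.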

Our machine learning tool is relatively simple and designed for the very specific purpose of identifying gender in digital images, so more advanced systems or those trained to do different tasks might produce different results. But this exercise can provide important insights into how these systems make their decisions – and, more importantly, into how those processes are not always clear or intuitive. As algorithms are increasingly brought to bear on decisions with profound real-world impacts for human beings, it is important to understand their limitations, as well as the ways in which their results may be biased or simply inaccurate.

For more about this analysis, read our methodology. To learn more about machine vision, read our data essay about the impact of training data on the accuracy of machine vision models. To read how Pew Research Center has used machine vision systems for research, see “Men Appear Twice as Often as Women in News Photos on Facebook” and “Gender and Jobs in Online Image Searches.”

Photographs by Bill Webster