Face detection without a deep neural network

I was surprised when I watched this video about how most face detection works. Granted, this is not face recognition (identifying the specific person). Face detection looks at an image or video and can almost instantly point out all the human faces. In a consumer camera, this is part of the code that puts a rectangle around each person’s face while you’re framing your shot.

What’s wonderful in the video is how the Viola–Jones object detection framework is illustrated and explained so that even we non-math types can understand it.

Like the game cases I wrote about yesterday, this is a case where tried-and-true algorithms are used, but deep neural networks are not.

As is typical with AI, there is a model. How does the code identify a human face? It “knows” some things about the shape and proportions of human faces. But it knows these attributes (features) not as noses and eyes and mouths — as we humans do. Instead, it knows them as rectangular shapes that map very well to the pixels in a digital image.

Above: Graphic from Viola and Jones (2001) — PDF

Make sure you stay with the video until 3:30, when Mike Pound begins to draw on paper. (This drawing-by-hand is a large part of why I love the videos from Computerphile!) At 8:30 he begins drawing a face to show how the algorithm analyzes that segment of an image.

The one part that might not be clear (depending on how much time you spend thinking about pixels in images) is that the numbers in the grid he draws represent values of lightness or darkness in the image. Computers require everything to be represented as numbers, and in a grayscale image each number is the brightness of one pixel. To compare sections of an image with other sections, the numeric values for one section are added up and compared with the sum of numeric values from another section. The difference between the two sums is what the algorithm treats as a feature.
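Here is a minimal sketch of that add-up-and-compare step, assuming a grayscale image stored as a NumPy array. The function names and the tiny 4x4 patch are made up for illustration; they are not from the video or the paper. The patch is dark on top and light below, roughly like the band across the eyes sitting above lighter cheeks.

```python
import numpy as np

def region_sum(image, top, left, height, width):
    """Add up the pixel values inside one rectangle of the image."""
    return int(image[top:top + height, left:left + width].sum())

def two_rectangle_feature(image, top, left, height, width):
    """Compare an upper rectangle with the same-sized rectangle directly below it.

    A large positive result means the lower rectangle is much lighter
    than the upper one.
    """
    upper = region_sum(image, top, left, height, width)
    lower = region_sum(image, top + height, left, height, width)
    return lower - upper

# Hypothetical 4x4 grayscale patch: 0 is black, 255 is white.
patch = np.array([
    [ 40,  50,  45,  40],
    [ 35,  45,  50,  40],
    [200, 210, 205, 190],
    [195, 205, 210, 200],
], dtype=np.uint8)

feature_value = two_rectangle_feature(patch, top=0, left=0, height=2, width=4)
print(feature_value)  # 1615 - 345 = 1270
```

A real detector evaluates many such rectangle pairs at many positions and sizes; this shows only one.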

The animations in the final three minutes of the video provide an awesomely clear explanation of how the regions of the image are assessed and quickly discarded as “not a face” or retained for further examination.
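The discard-early idea can be sketched as a chain of checks, where a region is dropped the moment it fails one. This is only a toy: in the Viola and Jones framework the stages are learned from training data, while the two stage functions and thresholds below are invented for illustration.

```python
def mean_brightness(rows):
    """Average pixel value over a list of rows."""
    values = [value for row in rows for value in row]
    return sum(values) / len(values)

def has_some_contrast(window):
    """Hypothetical cheap first stage: reject nearly uniform regions."""
    values = [value for row in window for value in row]
    return max(values) - min(values) > 50

def top_darker_than_bottom(window):
    """Hypothetical second stage: the top half should be darker than the bottom half."""
    half = len(window) // 2
    return mean_brightness(window[:half]) < mean_brightness(window[half:])

def run_cascade(window, stages):
    """Return False at the first failed stage; True only if every stage passes."""
    for stage in stages:
        if not stage(window):
            return False  # "not a face": stop spending time on this region
    return True  # retained for further examination

stages = [has_some_contrast, top_darker_than_bottom]

# Made-up 2x2 regions of pixel values.
flat_wall = [[120, 121], [119, 120]]
face_like = [[40, 50], [200, 210]]
inverted = [[200, 210], [40, 50]]

results = [run_cascade(w, stages) for w in (flat_wall, face_like, inverted)]
print(results)  # [False, True, False]
```

The cheap checks go first so that most regions, which contain no face, cost almost nothing to reject.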

Computers are lightning-fast at these kinds of calculations. This method is so efficient that it runs rapidly even on simple hardware, which is why it has been in use since 2002.
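Part of that speed comes from a trick described in the Viola and Jones paper, the integral image: a table of running totals that lets the sum of any rectangle be read off with four lookups, no matter how big the rectangle is. Below is a rough sketch of the idea, assuming NumPy; the function names and the test image are made up for illustration.

```python
import numpy as np

def integral_image(image):
    """Build a table where entry [r, c] is the sum of all pixels above and left of (r, c)."""
    # A leading row and column of zeros means lookups need no edge cases.
    table = np.zeros((image.shape[0] + 1, image.shape[1] + 1), dtype=np.int64)
    table[1:, 1:] = image.cumsum(axis=0, dtype=np.int64).cumsum(axis=1)
    return table

def fast_region_sum(table, top, left, height, width):
    """Sum of one rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return int(table[bottom, right] - table[top, right]
               - table[bottom, left] + table[top, left])

# Hypothetical 4x4 image holding the values 0 through 15.
image = np.arange(16, dtype=np.uint8).reshape(4, 4)
table = integral_image(image)

fast = fast_region_sum(table, top=1, left=1, height=2, width=3)
slow = int(image[1:3, 1:4].sum())  # direct sum of the same rectangle
print(fast, slow)  # 48 48
```

The table is built once per image, and after that every rectangle comparison costs the same small amount of arithmetic.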

Creative Commons License
AI in Media and Society by Mindy McAdams is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Include the author’s name (Mindy McAdams) and a link to the original post in any reuse of this content.


Racial and gender bias in AI

Different AI systems do different things when they attempt to identify humans. Everyone has heard about face recognition (a.k.a. facial recognition), which you might expect would return a name and other personal data about a person whose face is “seen” with a camera.

No, not always.

A system that analyzes human faces might simply try to return information about the person that you or I would tag in our minds when we see a stranger. The person’s gender, for example. That’s relatively easy to do most of the time for most humans — but it turns out to be tricky for machines.

Machines often get it wrong when trying to identify the gender of a trans person. But machines also misidentify the gender of people of color. In particular, they have a big problem recognizing Black women as women.

A short and good article about this ran in Time magazine in 2019, and the accompanying video is well worth watching. It shows various face recognition software systems at work.

Another serious problem concerns differentiating among people of Asian descent. When apartment buildings and other housing developments have installed face recognition as a security system (to open for residents and stay locked for others), the Asian residents can find themselves locked out of their own homes. The doors can also open for Asian people who don’t live there.

You can find a lot of articles about this widespread and very serious problem with AI technology, including the deservedly famous mug shots test by the American Civil Liberties Union.

“While it is usually incorrect to make statements across algorithms, we found empirical evidence for the existence of demographic differentials in the majority of the face recognition algorithms we studied.”

—Patrick Grother, NIST computer scientist

So how does this happen? How do companies with almost infinite resources deploy products that are so seriously — and even dangerously — flawed?

Yesterday I wrote a little about training data for object-detection AI. To identify any image, or any part of an image, an AI system is usually trained on an immense set of images. If you want to identify human faces, you feed the system hundreds of thousands, or even millions, of pictures of human faces. If you’re using supervised learning to train the system, the images are labeled: Man, woman. Black, white. Old, young. Convicted criminal. Sex offender. Psychopath.

Who is in the images? How are those images labeled?

This is part of how the whole thing goes sideways. There’s more to it, though. Before a system is marketed, or released to the public, its developers are going to test it. They’re going to test the hell out of it. Compare this with developing an AI that plays a particular game, like Go or chess. After the system has been trained, you test it: you have it play, and you see if it can win consistently. So when developers create a face recognition system, and they’ve tested it extensively, and they say, great, now it’s ready for the public, it’s ready for commercial use, ask yourself how they missed these glaring flaws.

Ask yourself how they missed the fact that the system can’t differentiate between various Asian faces.

Ask yourself how they missed the fact that the system identifies Black women as men.
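One possible way such flaws slip through testing is that a single overall accuracy number can look fine while one group fares far worse, especially if that group is a small share of the test images. The sketch below shows the arithmetic only. The records are made up purely for illustration and are not results from any real system, and the group names are placeholders.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Return the share of correct predictions for each group.

    Each record is a (group, predicted_label, actual_label) tuple.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, actual in records:
        total[group] += 1
        if predicted == actual:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}

# Invented test results: group_a has 10 test images, group_b has only 2.
records = (
    [("group_a", "woman", "woman")] * 9
    + [("group_a", "man", "woman")] * 1
    + [("group_b", "woman", "woman")] * 1
    + [("group_b", "man", "woman")] * 1
)

overall = sum(predicted == actual for _, predicted, actual in records) / len(records)
per_group = accuracy_by_group(records)

print(round(overall, 2))  # 0.83
print(per_group)          # {'group_a': 0.9, 'group_b': 0.5}
```

Reporting the per-group numbers alongside the overall one is what makes the gap visible.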

Fortunately, in just the past year these flaws have received so much attention that a number of large firms (Amazon, IBM, Microsoft) have pulled back on commercial deployments of face recognition technologies. Whether they will be able to build more trustworthy systems remains to be seen.

