The Puzzle of AI Facial Recognition


Like many other forms of AI, facial-recognition software received a shot in the arm from the maturation of artificial neural networks such as the large language models powering systems like ChatGPT.

Most current facial-recognition methods proceed in several steps.

First, the software translates a photo or video of your face into a set of measurements—the distance between your eyes, between your nose and your lips, and so on. This “faceprint” is then fed into an artificial neural network, which uses statistical methods to match your faceprint to others in its database. Programmers “train” the system by rewarding it for correct matches and penalizing it for misses. The model generates results in the same way that LLMs produce language—by processing a large database to learn statistical relationships among entities.
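The faceprint-and-match idea can be sketched in a few lines of code. This is a toy illustration only: the landmark names, coordinates, and the nearest-neighbor matching below are invented for the example, and real systems learn their measurements with deep neural networks rather than hand-coded distances.

```python
import math

def faceprint(landmarks):
    """Reduce named landmark coordinates to a vector of pairwise distances.

    A crude stand-in for the measurements the article describes
    (eye-to-eye distance, nose-to-lips distance, and so on).
    """
    names = sorted(landmarks)
    vector = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            (ax, ay), (bx, by) = landmarks[a], landmarks[b]
            vector.append(math.hypot(ax - bx, ay - by))
    return vector

def best_match(query, database):
    """Return the identity whose stored faceprint is closest (Euclidean)."""
    def dist(u, v):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
    return min(database, key=lambda name: dist(query, database[name]))

# Invented pixel coordinates for two people; "alice_party" is a second,
# slightly shifted photo of the same face.
alice = {"left_eye": (30, 40), "right_eye": (70, 40),
         "nose": (50, 60), "mouth": (50, 80)}
alice_party = {"left_eye": (31, 41), "right_eye": (69, 40),
               "nose": (50, 61), "mouth": (51, 79)}
bob = {"left_eye": (25, 45), "right_eye": (80, 44),
       "nose": (52, 70), "mouth": (52, 95)}

db = {"alice": faceprint(alice), "bob": faceprint(bob)}
print(best_match(faceprint(alice_party), db))  # prints "alice"
```

The sketch captures only the matching step; the training the article describes—rewarding correct matches and penalizing misses—is what lets a real network discover which measurements matter, rather than having them fixed in advance as here.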

Eventually, and with a sufficiently large database, the system can reliably match one image of a face with another. Given adequate lighting and a relatively clear picture, software can match your passport photo with your appearance in a party photo on a friend’s Facebook page, for example, 99 percent of the time.

No one knows exactly how the system obtains its matches. There is an entire field in AI known as “mechanistic interpretability” that is attempting to understand how LLMs move from a given input (your passport photo) to a given output (identifying your face in the Facebook post).

The existence of this field—and its limited success thus far, despite the significant resources devoted to it—is one index of the alien quality of the “thinking” that goes on in artificial “minds.”

Continue reading at Harpers.org