A 95% Match Score Sounds Solid. These 3 Reality Checks Show When It Isn’t.


Full Episode Transcript


A cybersecurity researcher walked onto a stage at R.S.A.C. twenty-twenty-six and fooled a live facial recognition system with a deepfake. The system didn't flag it. It didn't even hesitate. It said the fake face was real — and gave it a high confidence score.




If that makes your stomach drop, good. It should. Because that same technology is used to unlock your phone, verify your identity at your bank, and in some cases, decide whether someone goes to jail. For anyone who's ever trusted a confidence score — whether you're reviewing evidence or just logging into an app — what happened on that stage matters. The system returned a number that looked solid. But nobody asked the right follow-up questions. Today, I want to walk you through three reality checks that separate a meaningful match from a dangerous one. Because the real question isn't whether the algorithm found a match. It's whether that match survives scrutiny.

First reality check — the old liveness tests are dead. For years, identity verification systems asked you to blink, or turn your head, to prove you were a real person sitting in front of a camera. That made sense when the biggest threat was someone holding up a printed photo. But modern deepfakes don't just overlay a face on a screen. They inject synthetic video directly into the camera feed itself. The system never sees a fake image being held up. It sees what it believes is a live person — blinking, turning, smiling on command. That's what the R.S.A.C. demonstration proved. The deepfake passed every basic liveness check the system threw at it. And the system reported "real" with full confidence. It failed silently. No warning, no flag, no asterisk. For an investigator, that means your tool just told you a synthetic face is authentic. For you and me, it means someone could impersonate us on a video call and our bank might not catch it.
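
To make that silent failure concrete, here is a minimal sketch, in Python, of a challenge-response liveness check. Every name in it is hypothetical and real systems are more elaborate, but the structural gap is the same one the demonstration exploited: every signal is computed from the camera feed, and an injected synthetic feed controls that feed completely.

```python
from dataclasses import dataclass

@dataclass
class FrameSignals:
    blink_detected: bool   # did the eyes close and reopen in this frame?
    head_yaw_deg: float    # estimated head rotation, degrees from frontal

def passes_legacy_liveness(frames: list[FrameSignals]) -> bool:
    """Challenge: 'blink, then turn your head.' Reads only feed content."""
    blinked = any(f.blink_detected for f in frames)
    turned = any(abs(f.head_yaw_deg) > 20.0 for f in frames)
    # An injected deepfake simply renders a blink and a head turn on cue,
    # so this returns True with no warning and no flag.
    return blinked and turned
```

Nothing in that function can tell an honest camera from an injected stream, which is why the newer defenses attest the capture path itself rather than just inspecting the pixels that arrive.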

Second reality check — pose and lighting can collapse a match score without telling you. According to researchers at Carnegie Mellon's CyLab Biometrics Center, once a person's head turns past about thirty degrees from center, confidence scores can plummet by thirty to forty percent. That's not a small wobble. That's a cliff. And the number the algorithm hands back doesn't come with a footnote explaining that. So why do people trust a ninety-five percent match so readily? Because vendors publish accuracy benchmarks from controlled lab conditions — frontal pose, even lighting, high-resolution images. According to N.I.S.T.'s own documentation, those benchmarks don't account for motion blur, bad angles, low resolution, or aging. Most buyers never think to ask what happens after you leave the lab. A ninety-five percent match captured head-on in perfect light is a completely different animal from a ninety-five percent match pulled from grainy surveillance footage at an angle. The number looks identical. The reliability isn't even close. It's like airport security — a system that catches ninety-nine percent of volunteers walking straight toward a camera will catch far fewer people trying to avoid detection. The conditions changed. The score didn't update to reflect that.
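
As an illustration of what "the score didn't update" means, here is a hypothetical sketch that attaches capture conditions to a raw score and lets them gate how much weight it carries. The thresholds are illustrative placeholders, not vendor or N.I.S.T. values; only the thirty-degree figure echoes the CyLab finding above.

```python
def reliability_band(raw_score: float, yaw_deg: float,
                     min_face_px: int, well_lit: bool) -> str:
    """Classify how much weight a match score deserves given conditions."""
    degraded = (
        abs(yaw_deg) > 30        # past roughly thirty degrees, scores slide
        or min_face_px < 64      # low-resolution crop of the face
        or not well_lit          # uneven or poor lighting
    )
    if raw_score >= 0.95 and not degraded:
        return "strong: near-frontal, well-lit, adequate resolution"
    if raw_score >= 0.95 and degraded:
        return "unreliable: high score under degraded capture; re-verify"
    return "weak: treat as a lead, not evidence"

# The same 0.95 lands in two different bands once conditions are attached.
print(reliability_band(0.95, yaw_deg=5, min_face_px=120, well_lit=True))
print(reliability_band(0.95, yaw_deg=40, min_face_px=48, well_lit=False))
```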

Third reality check — and this one's newer. Some of the most promising detection methods don't look at what a face looks like at all. They look at how it moves. According to research published on arXiv, measuring the distribution of biometric facial similarity across video frames — basically tracking whether the tiny distances between facial landmarks shift naturally over time — catches deepfakes that pixel-level analysis misses. A real face has micro-movements that are hard to fake consistently across hundreds of frames. A synthetic face might look perfect in any single frame but behave unnaturally when you watch the pattern of movement over seconds. What makes this approach powerful is that it holds up even when video resolution is low or compression is heavy. Pixel-based detection struggles in those conditions because compression destroys the very artifacts it's looking for. Movement-based detection sidesteps that problem entirely. And there's a striking finding from a comparative evaluation of deepfake detection tools. According to that arXiv study, experienced human reviewers correctly identified deepfakes that the best-performing A.I. classifier missed. The humans spotted anatomical oddities, lighting that didn't make physical sense, objects that couldn't exist. The algorithm saw pixels and said "real." The human saw a face and said "something's wrong." That gap is why detection can't be fully automated — not yet.
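
Here is a rough sketch of that movement-based idea, assuming you already have per-frame facial landmarks from some upstream detector. The pairwise distances, the normalization, and the threshold are illustrative choices, not the paper's actual method.

```python
import numpy as np

def movement_signature(landmarks: np.ndarray) -> np.ndarray:
    """landmarks: (frames, points, 2) array of 2D facial landmarks.
    Returns the vector of pairwise inter-landmark distances per frame."""
    f, p, _ = landmarks.shape
    diffs = landmarks[:, :, None, :] - landmarks[:, None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)      # (frames, points, points)
    iu = np.triu_indices(p, k=1)                # unique landmark pairs
    return dists[:, iu[0], iu[1]]               # (frames, pairs)

def looks_unnaturally_stable(landmarks: np.ndarray,
                             min_jitter: float = 1e-3) -> bool:
    sig = movement_signature(landmarks)
    # Normalize temporal variation by mean distance so the test is
    # scale-invariant, then ask whether the jitter falls below a
    # plausible floor for a living face.
    jitter = (sig.std(axis=0) / (sig.mean(axis=0) + 1e-9)).mean()
    return jitter < min_jitter

# A face with natural micro-movement passes; one that is frame-perfect
# but statistically frozen trips the check.
rng = np.random.default_rng(0)
base = rng.uniform(0, 100, size=(1, 68, 2))
natural = base + rng.normal(0, 0.5, size=(300, 68, 2))  # micro-movement
frozen = np.repeat(base, 300, axis=0)                   # no movement at all
print(looks_unnaturally_stable(natural))  # False
print(looks_unnaturally_stable(frozen))   # True
```

Because the signal is geometric rather than pixel-level, heavy compression degrades it far less, which is exactly the advantage the paragraph above describes.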


The Bottom Line

How seriously is the industry taking this? Seriously enough to spend billions. The deepfake detection market is projected to grow from about one and a half billion dollars in twenty-twenty-five to nearly five billion by twenty-twenty-seven. The first wave of biometric injection attack detection assessments is already underway in European testing labs, and those protocols are becoming the foundation for an I.S.O. standard. When organizations start building international standards around a threat, the threat is no longer theoretical.

The match score was never the answer. It was always the first question. The expertise isn't in finding a match — it's in stress-testing that match against the conditions that could break it.

So — three things to carry with you. One — a confidence score only means something if you know the conditions it was captured under. Two — deepfakes now pass the basic checks that used to catch them, so those checks alone aren't enough. Three — the best detection combines what algorithms measure with what trained humans notice, because neither one catches everything on its own. Whether you're building a case or just trusting a video call, the number on the screen isn't proof. It's an invitation to look deeper. The full story's in the description if you want the deep dive.
