Radiologists Miss 59% of Fake X-Rays on First Look — What That Proves About Your Case Photos

This episode is based on our article. Read the full article →

Full Episode Transcript


A trained radiologist looks at an A.I.-generated chest X-ray and calls it real. Not because they're careless. Not because they're new. Because when nobody warned them fakes were in the mix, only forty-one percent caught them.


That number comes from a peer-reviewed study published in the journal Radiology by lead researcher Dr. Mickael Tordjman. And it should matter to anyone who works with photographic evidence. If the most highly trained medical image specialists on the planet miss synthetic forgeries nearly six times out of ten, what happens when an investigator glances at a case photo and decides it looks legit? Today you're going to learn exactly why expertise alone can't protect you from modern visual forgeries, what mathematical signatures give fakes away, and why the tool that created the fake can't even reliably spot its own work. So what separates the people who catch forgeries from the people who don't?

The study worked like this. Radiologists were shown a set of X-rays and asked to rank image quality. Nobody told them some images were generated by ChatGPT-4o. After they finished ranking, researchers asked a simple question — did you notice anything unusual? Only forty-one percent flagged something as synthetic. But once the same radiologists were told that A.I.-generated images were hidden in the batch, accuracy jumped to seventy-five percent overall. Individual scores ranged from fifty-eight percent all the way up to ninety-two percent. That gap — from forty-one to seventy-five — isn't about skill. It's about attention mode. When you're just looking, your brain takes shortcuts. When you're deliberately searching, you engage a completely different cognitive process.

Now, you might assume the most experienced radiologists performed best. They didn't. According to the study's findings, years of experience had no measurable effect on a radiologist's ability to spot the fakes. A doctor with five years and a doctor with twenty-five years were equally likely to be fooled. That inverts something most of us take for granted — that time on the job builds an instinct for spotting what's wrong. For anyone evaluating photographs in an investigation, the lesson is blunt. Tenure doesn't substitute for method.

So what do the fakes actually look like up close? According to Dr. Tordjman, deepfake medical images often look too perfect. Spines that are unnaturally straight. Blood vessel patterns that repeat with eerie uniformity. Real human anatomy is messy — asymmetrical, irregular, full of tiny variations. A forged image overcorrects for that messiness and produces geometric consistency that doesn't exist in nature. The same principle applies to facial photographs. A real face has micro-asymmetries. The spacing between landmarks shifts slightly under different lighting or camera angles. A manipulated image may smooth those out, creating a mathematical regularity that structured analysis can flag — even when your eye sees nothing wrong.
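To make that idea concrete, here's a minimal sketch of what a structured symmetry check can look like, assuming landmark coordinates have already been extracted by any standard face detector. The function name and the example coordinates are illustrative stand-ins, not CaraComp's actual method: the point is that a near-zero asymmetry score is the kind of unnatural regularity a reviewer would want to examine more closely.

```python
# Illustrative sketch only: score how far a face deviates from perfect mirror
# symmetry, given left/right landmark coordinates from any detector.
import math

def asymmetry_score(left_points, right_points, midline_x):
    """Mean distance between each left-side landmark and the mirror image of its
    right-side counterpart, reflected across a vertical midline at x = midline_x.
    Real faces yield small but nonzero scores; a score near zero suggests
    unnatural, synthetic-looking regularity."""
    errors = []
    for (lx, ly), (rx, ry) in zip(left_points, right_points):
        mirrored_x = 2 * midline_x - rx  # reflect the right-side point across the midline
        errors.append(math.hypot(lx - mirrored_x, ly - ry))
    return sum(errors) / len(errors)

# Hypothetical landmark pairs (eye corner, nostril, mouth corner), in pixels.
left = [(120.0, 200.0), (131.5, 248.0), (118.2, 280.5)]
right = [(182.0, 201.5), (171.0, 247.2), (184.9, 281.0)]

print(f"asymmetry score: {asymmetry_score(left, right, midline_x=151.0):.2f} px")
```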


The Bottom Line

And what about using A.I. to catch A.I.? Four large language models were tested as detectors, including the very model that generated the fakes. GPT-4o scored about seventy-five percent, and across the four models — including ones from Google and Meta — detection accuracy ranged from fifty-seven to eighty-five percent. Even the best performer missed roughly fifteen fakes out of every hundred. The creator couldn't reliably identify its own output. That asymmetry is critical. Generating a convincing fake is now far easier than detecting one. No single tool catches everything, which means cross-validation across multiple methods — measuring distances between facial landmarks, checking landmark consistency, analyzing file metadata — isn't optional anymore. It's the baseline.
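As a rough illustration of that cross-validation idea, here's a sketch that runs several independent checks and only treats an image as verified when all of them pass. The check functions, thresholds, and inputs are hypothetical stand-ins, not any specific product's workflow; a real pipeline would plug in actual landmark measurements and EXIF parsing.

```python
# Illustrative sketch: aggregate independent checks instead of trusting any one method.
from dataclasses import dataclass

@dataclass
class CheckResult:
    name: str
    passed: bool
    note: str

def landmark_distance_check(measured_mm, reference_mm, tolerance_mm=2.0):
    """Compare measured inter-landmark distances against reference measurements."""
    worst = max(abs(m - r) for m, r in zip(measured_mm, reference_mm))
    return CheckResult("landmark distances", worst <= tolerance_mm,
                       f"largest deviation {worst:.1f} mm")

def metadata_check(exif_fields):
    """Flag missing capture metadata; absence means dig deeper, not proof of forgery."""
    required = {"Make", "Model", "DateTimeOriginal"}
    missing = required - set(exif_fields)
    note = "all capture fields present" if not missing else "missing: " + ", ".join(sorted(missing))
    return CheckResult("file metadata", not missing, note)

def cross_validate(results):
    """No single method is trusted on its own: every check must pass."""
    for r in results:
        print(f"[{'PASS' if r.passed else 'FAIL'}] {r.name}: {r.note}")
    return all(r.passed for r in results)

# Hypothetical inputs for illustration only.
results = [
    landmark_distance_check([63.1, 41.8, 35.2], [62.4, 40.9, 36.0]),
    metadata_check({"Make": "Canon", "Model": "EOS R5"}),
]
print("verified" if cross_validate(results) else "needs deeper review")
```

In this toy run the landmark distances pass but the metadata check fails, so the image gets routed for deeper review, which is exactly the point of not relying on a single method.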

The real dividing line isn't between experts and amateurs. It's between people who trust a glance and people who trust a measurement.

So here's what to take away. Trained specialists miss A.I.-generated images most of the time when they aren't actively looking for them. Experience alone doesn't improve detection — structured, systematic methods do. And the A.I. that built the forgery can't reliably catch it either, so no single tool is enough. Every critical photograph that crosses your desk deserves the same scrutiny you'd give a questioned document — not "does this look right," but "can I verify this with numbers." The written version goes deeper — link's below.

Ready for forensic-grade facial comparison?

2 free comparisons with full forensic reports. Results in seconds.

Run My First Search