
Discord Leaked 70,000 IDs Answering One Simple Question: Are You 18?

In 2025, roughly 70,000 people had their government-issued photo IDs exposed when the third-party support system handling Discord's age-verification appeals was breached. Think about that for a second. Discord needed to answer one question: is this person old enough? What it ended up with was a centralized repository of scanned passports and driver's licenses—a concentrated target that, when breached, handed attackers a gift-wrapped identity dataset. Not because of an especially sophisticated attack. Because the system was designed to collect far more than it needed to answer a very simple question.

TL;DR

The biggest mistake in age verification isn't asking too little — it's collecting a full identity profile when all you need is a yes or no on whether someone clears a single age threshold.

This is the age verification trap, and almost everyone building compliance workflows falls into it. The logic seems reasonable enough: more verification must mean better security. So platforms reach for the most comprehensive data source available — government ID — because it feels airtight. Regulators can't complain about thoroughness, right? What could possibly go wrong?

Everything, as it turns out. The data you over-collect doesn't disappear after the check. It sits in a database, often managed by a third-party vendor, waiting.


The Binary That's Breaking Everything

Here's the mental model most compliance teams are working from: you either ask users to click a checkbox saying "I'm 18+" — which everyone agrees is useless — or you collect a full identity document. Self-attestation versus document verification. Pick your poison.

People get stuck here for a predictable reason. Regulators loudly and repeatedly reject self-attestation as legally insufficient. Nobody wants to be the platform that got sued because a 14-year-old clicked through a checkbox. So the instinct is to sprint to the opposite extreme: collect the most authoritative identity proof available. Document everything. Build an audit trail.

What this reasoning misses is the mismatch between the question being asked and the answer being collected. A platform needs to verify one binary fact — over or under 18. A driver's license answers that question, yes. But it also answers seventeen other questions nobody asked: full legal name, home address, hair color, license number, organ donor status, and an exact birthdate that can be used to reconstruct a person's full identity profile. The information disclosed is wildly disproportionate to the need.

Imagine if every bar in town photographed your driver's license, stored it in a shared database, and kept it for seven years — when the bouncer at the door only needed to glance at your face to know you weren't sixteen. That's precisely the system most age-verification platforms have built, just with better software around it. This article is part of a series — start with The Face Matched The Voice Matched The Person Never Existed.

$10.4B
projected global age assurance market by 2029, up from $5.7B in 2025
Source: Market research cited in Digital Information World coverage

That number should make you pause. Every dollar of growth in that market is a dollar locked into systems that concentrate sensitive identity data across a small cluster of commercial vendors. A single breach at one mid-tier identity verification provider can expose tens of thousands of users simultaneously — which is exactly what the Discord incident demonstrated. The concentration isn't an unfortunate side effect of the model; it is the danger.


What the Science Actually Supports

Here's where the technical picture gets genuinely interesting — and where the misconception starts to crack.

Facial age estimation, the kind used in threshold-based biometric checks, achieves a Mean Absolute Error of 1.3 years for users aged 13–17, according to Yoti's technical research on facial age estimation accuracy. That's not "good enough for a rough guess." That's genuinely precise — precise enough to reliably distinguish a 15-year-old from someone who clears an 18-year threshold, which is all a compliance system actually needs to do.

Now, a fair objection: doesn't accuracy degrade as people age? Yes, it does. A 40-year-old with minimal sun exposure might scan younger than a 35-year-old who spent summers outdoors. But here's the thing — that doesn't matter for threshold verification. The system doesn't need to know you're exactly 34. It needs to confirm you're not 16. Those are different problems with different accuracy requirements, and most platforms are solving the harder problem when the easier one would do fine.
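
To make that concrete, here is a minimal sketch in Python of a threshold-only decision. The function names and the two-year margin are illustrative assumptions, not any vendor's actual logic; the point is that the system compares an estimate against a buffered threshold and never stores an exact age or an identity.

```python
# Minimal sketch of a threshold-only age check (hypothetical names).
# The estimator's error margin (on the order of the ~1.3-year MAE cited
# above) becomes a buffer around the legal threshold, so the system only
# ever decides "clearly over", "clearly under", or "can't tell from the
# face alone" -- it never records who the person is or exactly how old.

from dataclasses import dataclass

@dataclass
class ThresholdResult:
    decision: str  # "pass", "fail", or "escalate"; no name, birthdate, or exact age retained

def check_age_threshold(estimated_age: float,
                        threshold: float = 18.0,
                        margin: float = 2.0) -> ThresholdResult:
    """Answer one binary question, with an uncertainty buffer."""
    if estimated_age >= threshold + margin:
        return ThresholdResult("pass")      # clearly over the threshold
    if estimated_age <= threshold - margin:
        return ThresholdResult("fail")      # clearly under the threshold
    return ThresholdResult("escalate")      # borderline: needs another signal

# Example: 15.4 fails cleanly, 24.1 passes cleanly, 18.6 falls inside the
# margin and gets escalated rather than guessed at.
print(check_age_threshold(15.4).decision)  # fail
print(check_age_threshold(24.1).decision)  # pass
print(check_age_threshold(18.6).decision)  # escalate
```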

"Regulators are increasingly open to facial age estimation that does not uniquely identify the individual, but broad biometric collection such as facial recognition tied to identity is discouraged or outright prohibited in many jurisdictions." — Center for Democracy and Technology, CDT Privacy-Preserving Age Verification Analysis

That distinction is doing enormous work and most compliance teams walk right past it. Facial age estimation that never links to an identity record is legally and technically different from facial recognition tied to an ID document. One answers the threshold question. The other builds a dossier. The former is increasingly acceptable to regulators; the latter is drawing legal challenge after legal challenge.

This is where work in facial analysis — the kind at the foundation of what we do at CaraComp — becomes directly relevant to compliance architecture. Understanding the difference between biometric estimation and biometric identification isn't semantic hair-splitting. It's the difference between a system that answers one question cleanly and a system that creates permanent records it can't legally justify keeping.



The Smarter Architecture Already Exists

The middle path between "click yes" and "upload your passport" isn't theoretical. It's already in production. Previously in this series: Call To Confirm Is Dead Carrier Level Voice Cloning Killed I.

Google released a Zero-Knowledge Proof solution that verifies a user is over 18 without transmitting birthdates or identity documents. According to Ondato's breakdown of privacy-first verification methods, the process works like this: a trusted provider — one that already knows your age, like your phone's operating system or a government app — issues a cryptographic token. The token says exactly one thing: "this user is 18+." The website receives that token, accepts it, and learns nothing else. The verification happens on your device. Only an attestation leaves your phone. No name, no address, no birthdate, no photograph transmitted anywhere.

Mathematically, the website receives a proof — not data. Think of it like a sealed envelope that only lights up green or red. You never open the envelope. You just read the light.
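
As a rough illustration of that flow, the sketch below uses a plain HMAC-signed claim as a stand-in for the real zero-knowledge cryptography, which is considerably more involved and does not require a shared secret. The names are hypothetical; what it shows is the shape of the exchange: a trusted issuer signs a single boolean, and the verifying site learns that one bit and nothing else.

```python
# Sketch of the attestation pattern: an issuer that already knows the
# user's age signs a single boolean claim, and the website verifies the
# signature without ever seeing a birthdate or document. HMAC with a
# shared key stands in for the real cryptography here; a production
# system would use asymmetric signatures or a true zero-knowledge proof.

import hmac, hashlib, json

ISSUER_KEY = b"issuer-secret-shared-with-verifier"  # illustrative only

def issue_attestation(user_is_over_18: bool) -> dict:
    """Issuer side (e.g. the phone OS, which already knows the user's age)."""
    claim = json.dumps({"age_over_18": user_is_over_18}).encode()
    sig = hmac.new(ISSUER_KEY, claim, hashlib.sha256).hexdigest()
    return {"claim": claim.decode(), "sig": sig}

def verify_attestation(token: dict) -> bool:
    """Website side: learns one bit, and nothing else."""
    expected = hmac.new(ISSUER_KEY, token["claim"].encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["sig"]):
        return False  # forged or tampered token
    return json.loads(token["claim"])["age_over_18"] is True

token = issue_attestation(user_is_over_18=True)
print(verify_attestation(token))  # True -- no name, address, or birthdate involved
```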

This approach aligns with the ISO/IEC 18013-7 standard for verifiable digital credentials, which provides a framework for sharing specific claims from identity documents without sharing the documents themselves. The standard was designed precisely for this scenario: proving one fact without handing over everything.
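
Under ISO/IEC 18013-7, which builds on the mdoc model from ISO/IEC 18013-5, the same idea shows up as selective disclosure: the verifier requests a single data element, such as an age-over-18 flag, instead of the whole credential. Glossing over the real CBOR encoding and session cryptography, the shape of such a request looks roughly like this:

```python
# Rough, illustrative shape of a selective-disclosure request in the mdoc
# model: only the age_over_18 element is asked for. Real requests are
# CBOR-encoded, signed, and carry retention flags; this dict is only meant
# to show which fields are (and are not) being requested.

age_only_request = {
    "docType": "org.iso.18013.5.1.mDL",
    "nameSpaces": {
        "org.iso.18013.5.1": {
            "age_over_18": True,  # the single claim being requested
            # deliberately absent: given_name, family_name, birth_date,
            # resident_address, document_number, portrait, ...
        }
    },
}
```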

And yet, most platforms haven't adopted it. Why? Because it requires intentional design. It means resisting the instinct to collect everything you could collect, in favor of collecting only what you need. That's a harder organizational problem than a technical one. It requires someone in the room to ask: wait, why do we actually need the full document?

What You Just Learned

  • 🧠 Threshold vs. identity — "Is this person 18?" and "Who is this person?" are different questions that don't require the same data to answer
  • 🔬 Facial age estimation is threshold-accurate — An MAE of 1.3 years for ages 13–17 is precise enough for compliance checks without identity linkage
  • 🔐 Zero-knowledge proofs solve this today — Cryptographic attestations can confirm age without transmitting any personal data to the verifying platform
  • ⚠️ Over-collection creates liability, not protection — 70,000 exposed IDs on Discord showed that thorough data collection concentrates risk rather than eliminating it

The Electronic Frontier Foundation has documented how age verification mandates — however well-intentioned — create access barriers for marginalized groups and expose identity theft risks that compound over time as vendor databases grow. Meanwhile, 438 security and privacy researchers from 32 countries signed an open letter warning that age verification mandates are technically impossible to get perfectly right and structurally likely to cause more harm than they prevent — not because verification is inherently bad, but because the infrastructure built to enforce it becomes a permanent data liability.

That's the escalation pattern nobody talks about in compliance meetings. A platform starts with facial age estimation. Confidence scores dip on edge cases, regulators ask for documentation of the process, and suddenly the platform is layering in ID checks as a fallback. Then the fallback becomes the default. What began as a lightweight threshold check becomes a full identity pipeline — and users who never consented to document collection are now in a vendor database they've never heard of.


Key Takeaway

The question "Is this person 18?" requires exactly one bit of information: yes or no. Every piece of identity data collected beyond that single answer is a liability you've voluntarily taken on — and the breach risk scales with every unnecessary byte you store. Up next: Age Verification Bypass Threat Model Facial Recognition.

The Investigator's Version of This Problem

Here's the reframe that makes this land beyond compliance teams and platform architects. If you work in case investigation or identity verification workflows — not just age-gating — you face the same structural temptation every time you open a new case.

The instinct is to pull everything available: full background, full document scan, every address on file, every alias. More data feels like more certainty. But every piece of identity information you collect that wasn't strictly necessary to answer the case question is now part of your evidentiary record. If that data is later challenged, mishandled, or subpoenaed, you're defending collection decisions you can't easily justify.

The principle is identical to the age verification problem. A good investigator — like a good platform architect — asks not "what can I collect?" but "what's the absolute minimum I need to answer the specific question in front of me?" For age verification, that question is narrow: over or under 18? For a case workflow, the question changes, but the discipline doesn't.

Minimum necessary. Answer the question asked. Stop there.

That discipline is harder than it sounds, because the data is usually right there, one click away. But the 70,000 people whose IDs were exposed on Discord weren't victims of a complicated attack. They were victims of a system that collected more than it needed, stored what it shouldn't have kept, and, when the breach came, handed attackers an address book nobody had asked it to build.

The bouncer didn't need the filing cabinet. He just needed to look at your face.

Ready for forensic-grade facial comparison?

2 free comparisons with full forensic reports. Results in seconds.

Run My First Search