Facial Comparison vs. Face Harvesting: Why GDPR Treats Them Differently

Here's something that surprises almost everyone the first time they hear it: GDPR does not ban facial comparison. It doesn't ban algorithms that analyze faces. It doesn't even ban processing biometric data in investigations — full stop. What it regulates, with considerable precision, is the architecture around that processing. The collection method. The retention logic. The access controls. The purpose at the moment of capture.

That distinction — between the tool and the system the tool lives inside — is one of the most misunderstood ideas in investigative technology today. And it's costing practitioners real operational confidence they don't need to sacrifice.

TL;DR

Under EU law, the legal risk in facial comparison isn't the algorithm — it's the collection architecture around it. Tightly scoped, case-bound comparison of photos you lawfully hold occupies fundamentally different legal territory than mass biometric harvesting.

The myth goes something like this: "Faces are biometric data. Biometric data is sensitive data. Sensitive data is restricted under GDPR. Therefore, any AI that processes faces is legally dangerous." Follow that chain of reasoning and you'd conclude that uploading two passport photos from a case file is roughly equivalent to scraping 30 million faces from social media. Which is, to put it gently, nonsense.


What the Fireflies.ai Lawsuit Actually Teaches Us

Earlier this year, a lawsuit against Fireflies.ai — an AI meeting assistant that records, transcribes, and analyzes calls — started generating serious attention from privacy lawyers. The core concern wasn't that it used AI. It was what the tool actually did with biometric data: captured voice and potentially facial data from multiple parties simultaneously, often without granular individual consent from everyone in the room, retained that data persistently, and processed it across unrelated sessions and users.

That's the architecture that draws legal fire. As Epstein Becker Green analyzed in their breakdown of the case, the problems compound when biometric data is captured broadly, stored indefinitely, and collected from third parties who have no meaningful ability to opt out. The issue isn't the comparison. It's the continuous, uninvited, multi-party harvest.

Now contrast that with an investigator who holds two photographs from a lawfully obtained case file and runs a facial comparison to determine whether the same person appears in both images. No new database is created. No third-party faces are swept up. The processing is closed at the end of the task. That's not a smaller version of what Fireflies.ai was doing. It's a categorically different operation — the same way a doctor comparing two X-rays from the same patient's folder is categorically different from photographing everyone entering a hospital to build a predictive health-risk profile. Same imaging principle. Completely different legal and ethical universe.

"The key question is not whether biometric data is processed, but whether the processing is proportionate to the purpose, limited to what is necessary, and accompanied by appropriate safeguards — including access controls and retention limits." — Analysis framework, Skadden, Arps, Slate, Meagher & Flom LLP, EU GDPR Decisions Commentary


The GDPR Mechanics Investigators Actually Need to Understand

Article 9 of GDPR restricts "biometric data processed for the purpose of uniquely identifying a natural person." That clause is doing a lot of work. The phrase "for the purpose of uniquely identifying" is not decorative — it signals that the intent and scope of the processing determines its legal classification, not simply whether a face appears in a dataset.

Here's where it gets genuinely interesting. The European Data Protection Board has issued guidance clarifying the treatment of pseudonymised biometric data — situations where re-identification risk is controlled, access is restricted to authorized personnel, and the processing is bounded by a defined purpose. In those conditions, the data occupies different legal territory than freely accessible, broadly identifiable personal data. The risk calculus changes. The compliance posture changes.
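
To make that concrete, here is a minimal sketch of what pseudonymisation can look like in a comparison workflow, using only Python's standard library. The key name and token length are illustrative assumptions, not a prescription:

```python
# Hypothetical pseudonymisation step: results are keyed by a salted,
# case-scoped token instead of the subject's name. The HMAC key would
# live in a separate, access-controlled secret store (a plain variable
# here only for the sketch), so re-identification requires crossing a
# second, auditable barrier.
import hashlib
import hmac

CASE_KEY = b"per-case secret from your key management system"  # illustrative

def pseudonym(subject_label: str) -> str:
    # HMAC rather than a bare hash: without the case key, the token
    # cannot be reversed or verified against a guessed name.
    return hmac.new(CASE_KEY, subject_label.encode(), hashlib.sha256).hexdigest()[:16]

print(pseudonym("individual-a-document-x"))  # stored in place of the name
```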

Article 9
GDPR restricts biometric data processed for the purpose of uniquely identifying a person — the operative phrase is about intent and scope, not about whether faces appear in data at all
Source: Regulation (EU) 2016/679 (General Data Protection Regulation)

For investigators, this translates into four concrete variables that determine defensibility:

1. Lawful basis for holding the original photos. If you have a subject's image because they submitted it during an employment application, because it's part of a court-ordered disclosure, or because you obtained it through proper legal channels — you already hold it lawfully. That matters from the moment the image enters your possession, not just at the moment of comparison.

2. Purpose limitation. The comparison must serve a defined investigative purpose. "We might need this later" is not a purpose. "We are comparing these two images to establish whether Individual A in Document X is the same person as Individual B in Document Y, as part of Case Reference 2024-0471" is a purpose. The specificity is the compliance.
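
What does a defined purpose look like in practice? One lightweight approach, sketched below with entirely hypothetical field names, is to capture the purpose as a structured record before the comparison runs, so the documentation exists by construction:

```python
# A hypothetical purpose record, captured before any comparison runs.
# Field names are illustrative, not drawn from any specific tool or standard.
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class ComparisonPurpose:
    case_reference: str   # the named case this comparison serves
    question: str         # the specific investigative question being answered
    lawful_basis: str     # why you hold the source images at all
    requested_by: str     # the accountable individual
    requested_at: datetime

purpose = ComparisonPurpose(
    case_reference="2024-0471",
    question="Is Individual A in Document X the same person as Individual B in Document Y?",
    lawful_basis="Court-ordered disclosure received via counsel",
    requested_by="investigator.id.0042",
    requested_at=datetime.now(timezone.utc),
)
```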

3. Data minimisation at the algorithm level. This is the part most people miss, and it's where tool selection actually matters. A facial comparison run on two images that produces a similarity score and then discards the intermediate biometric template is architecturally different from a system that retains a facial embedding in a persistent database. If your tool isn't creating a new, lasting biometric profile — just answering the specific comparison question — you're operating at the minimum necessary data footprint. That's not a legal technicality; it's the core of what defensible facial comparison in investigations is designed to do.
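
To see the difference in code, here is a minimal sketch of that compare-score-discard pattern. It uses the open-source face_recognition library purely as a stand-in for whatever engine you actually run; the architectural point is that the embeddings exist only as local variables and nothing biometric is written anywhere:

```python
# Compare two lawfully held images, return a score, retain nothing.
# face_recognition is a stand-in engine; substitute your own.
import face_recognition

def compare_case_images(path_a: str, path_b: str) -> float:
    img_a = face_recognition.load_image_file(path_a)
    img_b = face_recognition.load_image_file(path_b)
    enc_a = face_recognition.face_encodings(img_a)
    enc_b = face_recognition.face_encodings(img_b)
    if not enc_a or not enc_b:
        raise ValueError("no face detected in one of the images")
    # face_distance returns a euclidean distance; fold it into a rough
    # similarity score for the case file. The encodings themselves go
    # out of scope when this function returns: no template, no database.
    distance = face_recognition.face_distance([enc_a[0]], enc_b[0])[0]
    return 1.0 - float(distance)

score = compare_case_images("doc_x_photo.jpg", "doc_y_photo.jpg")
print(f"Similarity score: {score:.2f}")  # the score is all that persists
```

The contrast with a harvesting architecture is a single line of code: the moment a pipeline persists enc_a into a store keyed by identity, it has crossed from answering a question into building a profile.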

4. Retention and access controls. Who can see the results? How long are they stored? Is access logged? The EDPB's guidance on pseudonymised data makes clear that strong access restriction and defined deletion timelines are among the clearest signals that processing is proportionate. Document these. Not for a regulator's benefit — for your own operational clarity.
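
As a sketch of what documented controls can mean in code (an in-memory store stands in for your real storage layer, and the 90-day window is purely illustrative, not a legal recommendation):

```python
# Hypothetical access-and-retention wrapper around stored comparison
# results: every read is logged, and a scheduled purge enforces the
# deletion timeline defined in your retention policy.
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=90)  # example value; set per your documented policy
_results: dict = {}      # result_id -> {"score": ..., "stored_at": ...}
_access_log: list = []   # append-only audit trail

def store_result(result_id: str, score: float) -> None:
    _results[result_id] = {"score": score, "stored_at": datetime.now(timezone.utc)}

def read_result(result_id: str, user: str) -> float:
    # Who saw what, and when: the audit trail regulators ask about.
    _access_log.append({"user": user, "result": result_id,
                        "at": datetime.now(timezone.utc).isoformat()})
    return _results[result_id]["score"]

def purge_expired() -> int:
    # Run on a schedule; returns how many results were deleted.
    now = datetime.now(timezone.utc)
    expired = [rid for rid, r in _results.items() if now - r["stored_at"] > RETENTION]
    for rid in expired:
        del _results[rid]
    return len(expired)
```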

The Four Pillars of Defensible Facial Comparison

  • Lawful origin of images — You must hold the photos legitimately before comparison begins; that lawfulness carries through to processing
  • Defined, specific purpose — Comparison tied to a named case and a stated question, not a speculative or open-ended scan
  • No persistent biometric database — Processing that answers a comparison question without creating a lasting profile satisfies data minimisation at its core
  • Access logs and retention policy — Documented controls on who sees results and when data is deleted are among the clearest markers of proportionate processing

When the Law Shifted — and Most Practitioners Didn't Notice

A landmark EU court ruling clarified something that had been ambiguous for years: pseudonymised data — where the re-identification risk is genuinely controlled and not merely claimed — can fall outside the strictest tier of GDPR personal data protections. Skadden's analysis of the ruling notes that the court examined not just whether data could theoretically re-identify someone, but whether re-identification was reasonably likely given the actual access controls and context in place.

That's a meaningful shift. It moves the legal question from a binary "is this personal data?" to a contextual "does this processing, in this environment, with these controls, create a real identification risk?" For investigators with well-documented workflows — specific case files, restricted access, defined deletion — the answer is increasingly defensible.

Meanwhile, White & Case's review of the EU Digital Omnibus signals that upcoming adjustments to the GDPR framework are likely to reinforce, not loosen, this context-sensitive approach. Purpose, proportionality, and provenance of data are becoming more central to enforcement thinking — not less.

Look, nobody's saying this is simple. Biometric data in investigations carries real obligations, and those obligations deserve serious professional attention. But "serious attention" and "blanket avoidance" are not the same thing. Serious attention means understanding what your tools actually do to data after the comparison runs. It means documenting your lawful basis before you click compare, not after a complaint arrives.

Key Takeaway

GDPR evaluates why you process, what you retain, who can access it, and how long you keep it — not whether a face appears in your data. A tightly scoped comparison on two lawfully held images, with no persistent database created and documented controls in place, is architecturally and legally distinct from mass biometric harvesting. The myth that all face AI is the same under EU law isn't just wrong — it's operationally expensive for investigators who abandon defensible tools out of misplaced caution.

At CaraComp, the design principle behind facial comparison has always been that the tool should answer the investigative question — and then stop. No persistent templates. No shadow profiles. No aggregation across cases. That's not a marketing position. It's an architectural choice with direct compliance consequences, and it's exactly the kind of distinction regulators are now asking organizations to demonstrate, not just assert.

So here's the question worth sitting with: when a regulatory challenge to facial comparison technology eventually lands on an organization's desk — and given current scrutiny, it will — the decisive factor won't be which algorithm they used. It will be whether they can show exactly what happened to the data the moment the comparison finished. Can you answer that question about your current workflow right now, without looking anything up?

If the answer is no, that's not a technology problem. That's a documentation problem. And documentation problems are the easiest kind to fix.

Ready to try AI-powered facial recognition?

Match faces in seconds with CaraComp. Free 7-day trial.
