Big Tech Stole Their Voices to Train AI — Now Illinois Law Could Cost Billions

Full Episode Transcript

Nine journalists and voice actors just sued nine of the biggest tech companies on the planet. Their claim is simple. These companies took their voices — without asking — fed them into A.I. models, and built products that now compete directly with the people whose voices were taken.

Your voice is biometric data

Your voice is biometric data. It's shaped by your physiology — your pitch, your timbre, the way your vocal cords resonate. Like a fingerprint, it's unique to you. And unlike a stolen password or a compromised credit card number, you can't change it. That matters whether you narrate audiobooks for a living or you've ever left a voicemail, talked to a smart speaker, or joined a video call. These lawsuits were filed in Illinois under a law called B.I.P.A. — the Biometric Information Privacy Act. Passed back in 2008, it's the strongest biometric privacy law in the country. The plaintiffs say major tech firms scraped hours of their recorded speech, converted it into mathematical voice models, and used those models to train text-to-speech systems. Systems that now generate audiobooks and even podcasts — replacing the very people whose voices built them. So what happens when the law catches up to the technology?

A voiceprint isn't just a recording. It's a mathematical representation of everything that makes your voice yours. Pitch patterns, resonance, the physical shape of your throat and mouth. That's what makes it a biometric identifier under Illinois law — the same category as fingerprints and facial geometry. And the reason that distinction matters is permanence. If someone steals your Social Security number, you can get a new one. If someone captures your voiceprint and trains an A.I. on it, there's no reset button. Courts are starting to treat that as a form of permanent harm.

The plaintiffs in these cases aren't anonymous data points. They're working professionals — narrators and broadcast journalists whose livelihoods depend on their voices. According to the lawsuits, the defendants used those voices to build text-to-speech products that now directly compete with human narration. Google's text-to-speech technology, for example, is already being used by audiobook publishers as a substitute for hiring a human narrator. Google's NotebookLM tool generates full podcast-style audio overviews — synthetic conversations that sound like real hosts discussing real topics. That's not a hypothetical future threat. That's a product on the market right now, built on training data that allegedly includes voices taken without consent.

The financial exposure is enormous

And the financial exposure is enormous. Just last November, Google agreed to pay one point three seven five billion dollars to settle a case with the state of Texas over unauthorized collection of voiceprints and facial data through Google Photos and Google Assistant. One point three seven five billion. That wasn't a voiceprint case from a class of professional narrators — it was a state enforcement action over consumer data. The Illinois lawsuits could push the numbers even higher because B.I.P.A. includes something most privacy laws don't — a private right of action. That means individuals can sue directly. They don't have to wait for a state attorney general to take up their cause. That single provision has generated more than fourteen hundred class-action lawsuits under B.I.P.A. so far.

What makes these voice cases different from earlier B.I.P.A. fights over fingerprint scanners or facial recognition is the damage calculation. When a warehouse scanned employees' fingerprints without consent, the harm was a privacy violation. Real, but abstract. When a tech company encodes a narrator's voice into an A.I. model and then sells that model to the narrator's own clients, the harm is economic displacement on top of the privacy violation. Judges notice that. Juries notice that. And for anyone who's ever recorded a podcast, narrated a training video, or even left extended voice messages — your voice data exists in systems you may never have agreed to.

One detail from the complaints stands out. The plaintiffs allege these companies knew exactly how to build consent systems that comply with B.I.P.A. They had the technical infrastructure. They had the legal teams. They chose not to. That framing — intentional noncompliance, not honest mistake — changes how courts evaluate damages. It also shifts the business calculus. Building a consent-compliant training pipeline costs money upfront. Skipping consent and paying settlements later costs billions. The math is finally starting to favor doing it right the first time.

The Bottom Line

Now, the defense will push back. A.I. companies are likely to argue that aggregated voice features used in training don't qualify as biometric identifiers under B.I.P.A. If a model learns general speech patterns from thousands of voices blended together, does any single voiceprint get captured in a way that uniquely identifies one person? If courts define voiceprints narrowly, covering only speaker-specific signatures and not blended statistical patterns, the plaintiffs' entire legal theory weakens. And Texas just passed an A.I. exemption that could shield training data from biometric restrictions in that state. Other states might follow, which would split the legal landscape in ways that make national compliance a nightmare.

The instinct is to frame this as a privacy scandal. It's actually a market correction. Companies built a billion-dollar synthetic voice industry on training data they didn't pay for, because they assumed no law would make them. Illinois is making them.

So — nine lawsuits, nine tech giants, and a sixteen-year-old Illinois law that treats your voice the same way it treats your fingerprint. The core question is whether A.I. companies can encode someone's voice into a product without that person's consent. If courts say no, the cost of training synthetic speech models goes up dramatically — and the incentive to ask permission first finally becomes real. Whether you make a living with your voice or you just use one every day, what's being decided in these courtrooms is who owns the sound that comes out of your mouth. The full story's in the description if you want the deep dive.
