I've studied deepfakes for more than 25 years. Here's why AI is making it nearly impossible for you to know what's real.
Hany Farid, a digital forensics expert, said ordinary people can no longer reliably distinguish real content from AI-generated content.
By Lloyd Lee
This as-told-to essay is based on a conversation with Hany Farid, a digital forensics expert and former professor at the University of California, Berkeley.
Farid also co-founded GetReal, a digital forensics and cybersecurity startup.
It has been edited for length and clarity.
The average person on the internet today cannot tell whether an image, video, or audio recording on their feed is real or not.
We've done perceptual studies of this.
The human visual and auditory systems are simply not good enough to do this task reliably.
That doesn't mean we can't tell.
We have computational and mathematical tools.
You give us a piece of content and a little bit of time? Yeah, we'll figure it out.
But there is a big difference between what we do at GetReal, a digital forensics startup I co-founded, and what the average person doomscrolling on social media is capable of.
I started my academic career at Dartmouth College in 1999.
It's hard to remember '99.
We lived in a largely analog world.
We were still taking photographs on film.
Digital cameras were emerging.
The internet was emerging, but it was barely anything.
Social media didn't really exist.
Nobody knew where this was going.
I started thinking about the use of inherently malleable digital evidence in courts of law.
At the time, nobody thought this was a problem, and they were right.
I thought it might become one because the digital revolution was unlikely to stop.
So we started this very bespoke, niche, tiny, weird field called digital forensics — just me and a bunch of great grad students at Dartmouth writing papers.
Everybody was like, "This is cool, but what does this have to do with anything?" Then digital took off.
Citizen journalists emerged.
We started seeing The Associated Press and Reuters say, "Hey, how do we know that this photo that somebody submitted to us is real?" Over the years, the problem grew: we went from hearing from media outlets and courts of law once a month, and national security once a year, to hearing from them every day.
Suddenly, our whole world is upside down.
At the beginning of my field, I was mostly thinking about photographs.
Video was very hard to manipulate.
It's 24 to 30 frames a second and has an audio track.
Images were easier to manipulate using tools like Photoshop.
The good news was that manipulation still required skill.
So you'd find mistakes.
You'd find artifacts.
Shadows that were misaligned, geometry that was wrong, and sizes that were wrong.
Sometimes you had metadata that said a photo was edited in Photoshop.
Today, you don't need skill.
You don't need time.
You don't need anything.
You just need a keyboard and an internet connection.
You can type, "Do this to this image or audio or video," and AI takes over and can do remarkable things — things that were unimaginable five to 10 years ago.
With any technology, you don't look at where the puck is.
You look at where the puck's going.
We knew we'd reach a point where the content would be visually indistinguishable — not necessarily computationally indistinguishable, but visually.
Images were probably the first to pass through the uncanny valley.
Voice was next, with the inflection, the laughing, and the pauses.
Video is moving through the uncanny valley now.
If someone gives me a 30-minute HD video, it's probably not AI.
But if it's 15 to 30 seconds — the typical video you see online — it's hard to tell from visual cues alone.
For now.
AI-generated videos used to be about four seconds.
Now, there are some getting to 30 or 40 seconds by stitching them together.
The content will get better.
It's going to get cheaper and easier to use.
And it's going to become ubiquitous.
Generative AI doesn't know anything about the 3D world.
It doesn't know about physics and shadows.
I say "know" with air quotes.
AI can generate things that, to the human brain, are quite good.
But the physics of it are subtly wrong.
As long as you do something that is physically implausible, we have a signal that we can detect.
Sometimes, finding a fake can be really fast and relatively easy.
Once you find something wrong, you're sort of done.
The flip side — authenticating something — is much harder.
You run test after test, and you don't find anything wrong.
Does that mean it's real? Not really.
It means you didn't find anything.
On average, the work can take about an hour.
But an hour is a long time on the internet.
That's essentially an eternity.
Usually, we'll get a call about something, and there are already a million views on it.
We'll work on it, talk to a reporter, and they'll do the report.
Now there are 10 million views.
We're a little bit of a postmortem in that regard.
The fact checks come after the fact.
The stakes and consequences of being wrong are getting higher.
You're putting people in jail.
You're making geopolitical decisions.
You're reporting what is happening in the world to try to inform eight billion people.
You've got to get it right.
What scares me most is that we, as a society, are losing our shared sense of reality.
We're not arguing about what the tax rate should be, what the role of government is, what the role of foreign policy is, or other things we can and should disagree about.
We are arguing about whether two plus two is four.
I say two plus two is four, and the other person says, "No, it's not.
It's applesauce." That's the tenor of the conversation.
I'm not sure how you can have a stable democracy without a shared sense of reality.
We can disagree.
That's OK.
Disagreement is good.
We can't say, "This happened," and have the other say, "No, it didn't." That can't be how we have a society.
