GPT-2 full release: I was wrong to worry

OpenAI released the full GPT-2 model.

All 1.5 billion parameters. The whole thing. Available on GitHub. Anyone can download it. Anyone can run it. The model they said was “too dangerous to release” back in February is now open to the world.

The internet did not collapse.

I wrote about GPT-2 nine months ago. I was worried. I said the output was “unsettlingly good.” I said the distinction between human writing and machine writing was getting blurry. I wondered whether we were looking at a Wright Brothers moment for AI-generated text.

I think I was partly right and mostly wrong. Let me explain.

What happened (and didn’t happen)

OpenAI staged the release over several months, publishing progressively larger versions and monitoring for misuse after each one. And the result was… not much.

Some researchers used it for interesting experiments. Some people generated funny text. A few projects used it for autocomplete features. But the flood of AI-generated misinformation that everyone (including me) feared? It didn’t happen.

Not because GPT-2 isn’t capable of generating convincing text. It is. But because generating convincing text, it turns out, isn’t the bottleneck for misinformation. Distribution is. Getting people to read and believe fake content is harder than creating it. Fake news existed long before GPT-2, and the problem was never “we can’t write enough of it.”

I should have seen this. In hindsight, it’s obvious. But fear has a way of making you focus on the wrong variable.

What I learned

I think I was caught in a classic pattern. Roy Amara (a researcher at the Institute for the Future) said it decades ago: “We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run.”

In February, I overestimated the short-term impact. I imagined a flood. What I got was a trickle.

But here’s the part that nags at me. GPT-2 has 1.5 billion parameters. What happens when someone trains a model with 10 billion? 100 billion? What happens when the output goes from “close enough to fool you for a paragraph” to “indistinguishable from human writing across entire articles”?

That’s the long run. And I think I’m underestimating it. We all are.

The staged release experiment

One thing I want to give OpenAI credit for: the staged release was itself an experiment. They released the small model in February, then a medium model, then a larger one, monitoring for misuse at each step. It was the first time I can think of that an AI lab treated a release as a safety experiment rather than a product launch.

The results were informative. At each stage, the model was used mostly for benign purposes. Researchers studied it. Developers built autocomplete tools. People generated jokes and fake Wikipedia articles and terrible poetry. The world of coordinated AI-driven misinformation campaigns that everyone imagined just didn’t appear.

Does that mean it couldn’t happen? No. The tools exist. The capability is there. Someone, somewhere, at some point, will use AI-generated text to deceive at scale. I’m fairly confident about that.

But the staged release showed that the immediate risk was lower than the AI safety community expected. The bottleneck for misinformation, right now, in November 2019, is still human. It’s still about distribution and trust and social networks. The creation of convincing text was never the hard part.

I find that both reassuring and concerning. Reassuring because the short-term panic was overblown. Concerning because it suggests we’re not even looking at the right problem.

The lesson

I’m going to try to remember this feeling. The specific feeling of being wrong in a way that reveals a thinking error. I focused on capability (“GPT-2 can write convincingly”) and ignored context (“but that’s not actually the bottleneck for the problem I was worried about”).

It’s easy to look at a technology in isolation and project disaster. It’s harder to look at the full system (the technology plus the social context plus the distribution channels plus the human behavior) and make accurate predictions about what will actually change.

I got this one wrong. I’m weirdly glad about it. Being wrong means the world is more resilient than I thought, at least in the short term.

In the long term? I’m still watching. GPT-2 was a starting point, not an endpoint. Whatever comes after it will be bigger, better, and stranger. And I’ll probably be wrong about that too, just in different ways.

For now, the full model is out, the world is fine, and I have a new rule for myself: before panicking about a technology, ask what the actual bottleneck is for the thing you’re worried about. The technology might not be it.
