GPT-2 can write and OpenAI is scared to release it

OpenAI just built a text generator called GPT-2 and they’re afraid to release it.

That sentence alone is worth sitting with.

An artificial intelligence lab, one whose entire mission is advancing AI for the benefit of humanity, built a thing and decided the world isn’t ready for it. Not because it’s a weapon. Not because it’s physically dangerous. Because it writes too well.

What GPT-2 actually does

GPT-2 generates text. You give it a prompt (a sentence, a paragraph, a topic) and it continues writing. That’s it. That’s the whole trick.
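
If you want to see what “it continues writing” means mechanically, here’s a toy sketch. To be clear, this is not GPT-2 and not OpenAI’s code. GPT-2 is a transformer neural network working on sub-word pieces; this is just a word-pair counter, and the function names and the tiny corpus are made up for illustration. What it shares with the real thing is the loop: look at what’s there so far, predict the next word, append it, repeat.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def continue_prompt(follows, prompt, max_new_words=5):
    """Extend the prompt one word at a time, always taking the most common follower."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_new_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # the toy model never saw anything follow this word
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

# Hypothetical three-sentence corpus, standing in for a massive dataset.
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."
model = train_bigrams(corpus)

print(continue_prompt(model, "the dog", max_new_words=4))
# prints: the dog sat on the cat
```

Notice the output: grammatical, plausible on the surface, and not something the corpus ever said. Scale the same idea up by many orders of magnitude, with a far better predictor, and you get the effect I’m describing.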

But the quality of the output is… I don’t have a good word for it. “Good” isn’t right. “Convincing” is closer. “Unsettling” is closest.

I read the samples on OpenAI’s blog and here’s what happened. For the first paragraph, I couldn’t tell it was generated. My brain read it as human writing. By the second paragraph, something felt slightly off. The logic wobbled. A sentence didn’t quite follow from the one before it. By the third paragraph, I knew. The seams showed.

But that first paragraph. That first paragraph was enough to fool me.

And this is the version OpenAI considers safe enough to show people. The full model, which they’re withholding, has 1.5 billion parameters (compared to the 117 million in the released small version). I have to imagine it’s better. Maybe a lot better.

Why OpenAI is worried

The concern, as I understand it, is this: if a system can generate convincing text at scale, it could be used to create fake news articles, fake product reviews, fake social media posts, fake everything. Not obviously fake, not “Nigerian prince” fake, but subtly, plausibly fake. The kind of fake that slips past your defenses because it reads like something a real person would write.

I’ve been thinking about this for a few days and I keep landing in a weird place. I think OpenAI is right to be cautious. I also think withholding it won’t work.

Here’s why. GPT-2 is built on a technique called “transformer-based language modeling” that’s well understood in the research community. The architecture is published. Other groups (Google, Facebook, universities) are working on similar models. If OpenAI doesn’t release the full version, someone else will build something equivalent within a year or two. The genie is out of the bottle, or at least the genie’s instruction manual is.

So the question isn’t whether this technology will exist. It will. The question is whether we’ll be ready for it.

The Turing Test, sort of

Alan Turing proposed a test in 1950: if a machine can hold a conversation that’s indistinguishable from a human’s, we should consider it intelligent. The Turing Test has been debated and criticized and refined for decades, and nobody’s AI has convincingly passed it.

GPT-2 doesn’t pass the Turing Test. In a full conversation, it isn’t even close. It loses coherence. It contradicts itself. It can’t maintain a thread across more than a few exchanges. It’s not intelligent in any meaningful sense.

But it passes a different test. A test nobody thought to formalize. Call it the Paragraph Test: can a machine generate a single paragraph of text that a human reader accepts as human-written? GPT-2 passes this. Routinely.

The Paragraph Test isn’t as impressive as the Turing Test. But I think it might matter more, practically speaking. Because most of what we read online isn’t long conversations. It’s paragraphs. Comments. Tweets. Short posts. Snippets. And if a machine can generate convincing snippets at scale, the implications for trust in online text are… I’m still working through this.

A test I did

I showed three GPT-2 samples to friends without telling them what they were. Just texted them the paragraphs and asked “what do you think of this writing?”

Two out of three thought they were from a blog post or a student essay. One said “this is pretty good, who wrote it?” Nobody flagged any of them as AI-generated.

Small sample size. Unscientific. Probably biased by how I selected the samples. But still.

I keep thinking about what MIT Technology Review said in their coverage: the issue isn’t that GPT-2 is perfect. The issue is that it’s good enough. Good enough to fool someone scrolling Twitter. Good enough to populate a comments section. Good enough to generate a thousand blog posts that Google might index.

Good enough is a lower bar than perfect, and a much more dangerous one.

The thing I can’t stop thinking about

Here’s where I land, sitting with this at midnight.

I’ve been writing this blog for about a year. I write because I like writing. I write because putting thoughts into words helps me understand them. I write because there’s something satisfying about finding the right sentence for a complicated feeling.

GPT-2 doesn’t do any of that. It doesn’t think. It doesn’t feel. It doesn’t understand. It predicts the next word based on statistical patterns in a massive dataset. It’s autocomplete on a cosmic scale.

But the output looks like thinking. Reads like thinking. And if the output is indistinguishable from thought, does the distinction matter?

I think it does. But I’m less sure than I was a week ago. And that erosion of certainty, that slow “wait, actually, I’m not sure I can tell the difference” feeling…

I think that’s what OpenAI is afraid of too.

I’m probably wrong about how this plays out. Maybe GPT-2 will be a footnote. Maybe the fears are overblown and the technology plateaus. Or maybe we’re looking at the Wright Brothers moment for AI-generated text and we just can’t tell yet because the flight only lasted twelve seconds.

I don’t know. I genuinely don’t know. And that’s new for me. Usually I have at least a guess about which way something is heading. With this, I’m just sitting in the uncertainty.

Happy Valentine’s Day, I guess. A machine wrote you a love letter and you almost believed it.
