DeepMind solved protein folding and it barely

I think the most important thing that happened in AI this year just happened, and almost nobody I know is talking about it.

DeepMind’s AlphaFold predicted protein structures at CASP14 with an accuracy that the scientific community called “transformative.” Their model achieved a median GDT (Global Distance Test) score above 90 on a 100-point scale. For context, a score of around 90 is generally considered comparable to experimental methods. On that benchmark, a computer model is now roughly as good at determining protein structure as the physical experiments scientists have been refining for fifty years.

Let me say that differently. The shape a protein folds into determines what it does. Every drug interaction, every disease mechanism, every biological process is influenced by protein structure. Predicting those structures from amino acid sequences has been one of the grand challenges of biology since the 1960s. Scientists spent entire careers on single proteins. The problem was considered so hard that a competition was created specifically to track progress on it.

DeepMind essentially solved it.

And the biggest AI story of 2020, by attention and media coverage, is GPT-3 writing poetry.

Why this bothers me

I’m not knocking GPT-3. I’ve written about it extensively. It’s fascinating, it’s important, and the questions it raises about language and intelligence are real.

But protein folding is a different category.

GPT-3 writes text that sounds human. That’s cool. That’s interesting. That makes people think about the nature of language and creativity.

AlphaFold could accelerate drug discovery. It could help us understand diseases we’ve been studying for decades. It could shorten the timeline for developing treatments for cancer, Alzheimer’s, Parkinson’s. It could change how we design enzymes for industrial processes. It could help us understand the molecular basis of life itself.

One of these is interesting. The other could save millions of lives. And the interesting one got what felt like ten times the coverage.

The attention problem

I think our collective attention is miscalibrated. We gravitate toward AI that does things we personally understand. Writing. Art. Conversation. When a machine writes a poem, we can evaluate it. We can say “that’s good” or “that’s bad.” We have an intuitive relationship with language.

When a machine predicts that a chain of amino acids will fold into a specific three-dimensional structure with angstrom-level accuracy, most people (including me) can’t evaluate that. We don’t have an intuitive relationship with protein structures. So we nod, say “sounds impressive,” and go back to watching GPT-3 write limericks.

The result is a world where the AI breakthroughs that matter most get the least attention, and the ones that are most visible get the most.

I’m guilty of this too. My first reaction to AlphaFold was “cool, proteins.” My first reaction to GPT-3 was “holy shit, I need to try this.” The engagement isn’t even close.

What AlphaFold actually means

Here’s my attempt to make this concrete.

Right now, if a scientist wants to know the structure of a protein, the standard method involves X-ray crystallography or cryo-electron microscopy. Both are expensive, time-consuming, and difficult. Getting a single protein structure can take months or years of work and hundreds of thousands of dollars.

There are roughly 200 million known proteins. We have experimental structures for about 170,000 of them. That’s less than 0.1%.
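If you want to check that percentage yourself, here is a minimal Python sketch. The two counts are the rough figures quoted above, not exact database totals, and the function name coverage_percent is just something I made up for illustration.

```python
# Rough figures quoted in this post; both are estimates, not exact counts.
KNOWN_PROTEINS = 200_000_000
EXPERIMENTAL_STRUCTURES = 170_000


def coverage_percent(solved: int, total: int) -> float:
    """Return the share of `total` covered by `solved`, as a percentage."""
    return 100 * solved / total


pct = coverage_percent(EXPERIMENTAL_STRUCTURES, KNOWN_PROTEINS)
print(f"{pct:.3f}% of known proteins have an experimental structure")
```

That works out to about 0.085%, which is where "less than 0.1%" comes from.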

AlphaFold could predict the structures of all 200 million. DeepMind has already started. They released predictions for the entire human proteome and the proteomes of 20 other organisms. For free. On a website.

Free. All of them. Every protein in the human body, with a predicted structure that’s accurate enough to be useful for research.

I keep trying to find an analogy that captures this. It’s as if someone mapped the entire ocean floor overnight. Or decoded every language on Earth in a week. A problem that was supposed to take generations got solved by a neural network in a few years.

The quiet revolution

I think this is what a real AI revolution looks like. Not chatbots. Not poetry generators. Not viral demos.

A quiet paper. A competition result. A database of protein structures released for free. Scientists downloading predictions and using them to advance research that was stuck for years.

No one tweets about it. No one’s mom asks them about it at dinner. It doesn’t make the evening news.

But it might be the thing that, twenty years from now, we point to and say: that’s when AI started saving lives.

I hope so. I really hope so.
