AI 2 min read

The AI art experiment: I tried to make a

I spent last weekend trying to make art with a neural network.

Not traditional digital art. Not Photoshop or Procreate. I typed words into a prompt and an AI generated images. The model is called CLIP, it’s from OpenAI, and it’s designed to understand the relationship between text and images. People have been combining it with generative models to create visuals from nothing but language.

The results look like dreams.

How it works (roughly)

CLIP was trained on millions of image-text pairs scraped from the internet. It learned to connect words to visual concepts. Show it a photo of a cat and the text “a photograph of a cat,” and it knows they go together. Show it a painting of a sunset and the text “a painting of a sunset,” and it maps those together too.

The creative trick, which Ryan Murdock and others pioneered, is to run this in reverse. You start with random noise. You tell CLIP “this image should look like ‘a city at night painted by Van Gogh.’” Then you iteratively adjust the noise, pixel by pixel, until CLIP agrees that the image matches the text. What emerges is something that was never painted, never photographed, never imagined by a human hand. It grew from text through mathematics.

The CLIP paper is technical but the results speak for themselves.

What I generated

I started simple. “A lighthouse in a storm.” The result was blurry, impressionistic, with streaks of white that could be lightning or could be light or could be the algorithm’s best guess at drama. It wasn’t good. But it wasn’t nothing, either. It was recognizably a lighthouse. In something like a storm.

Then I got more specific. “An abandoned space station orbiting a gas giant, in the style of Caspar David Friedrich.” This one took twelve hours to generate on my GPU. What came out was eerie. A dark shape against a turbulent sky. The Friedrich reference pushed the algorithm toward romantic era composition, with a lone subject dwarfed by nature. Except nature was a gas giant and the subject was a space station.

It was beautiful. In the way that accidental things are sometimes beautiful.

I tried “a robot reading a book in a library at sunset, oil painting.” The result had warm amber light filtering through what might be windows, a vaguely humanoid shape in the center, and shelves of things that could be books if you squinted. The style was painterly. The details were wrong. Hands that melted into pages. A face that was more suggestion than structure.

What I felt

Here’s what surprised me.

I expected to feel either impressed or dismissive. Good tech demo, or dumb toy. What I actually felt was something closer to collaboration. I’d type a prompt, wait an hour, look at the result, adjust the words, try again. The process felt iterative in the same way that sketching feels iterative. You’re not getting what you want on the first try. You’re circling it.

The AI doesn’t know what it’s making. It has no concept of beauty or composition or mood. It’s doing math. But the math produces something that triggers the same circuits in my brain that a painting triggers. The feeling is real even if the intention isn’t.

That’s a strange thing to sit with.

What this means (maybe)

I’m not ready to make big claims. The images are interesting but not consistent. You can’t reliably get what you want. The quality depends heavily on the prompt, the model, the number of iterations, and a fair amount of luck.

But the direction is clear. A year ago, generating images from text was a research curiosity. Now it’s something I can do on a weekend with a consumer GPU. The quality is improving month by month. The speed is increasing. The prompts are getting more intuitive.

I keep thinking about the kid who has a vision in their head but can’t draw. The person who sees a scene but doesn’t have the technical skill to put it on canvas. What happens when they can type what they see and get something close to what they imagined?

I don’t think this replaces artists. I think it creates a new kind of artist. Someone who works in language instead of pigment. Who sculpts with sentences.

That idea excites me more than it worries me. But I’m aware that I’m saying that as someone who isn’t a visual artist whose livelihood depends on being the only one who can do this.

The lighthouse is still on my desktop. I look at it sometimes. It’s not good. But it exists. And all I did was type words.


Related thinking:

a

astro

Thinking about AI, robots, space, and the future. Writing it down so I don't forget.