AI 2 min read

Stable Diffusion is free and everything is

Stability AI released Stable Diffusion as open source. Free. Downloadable. Runnable on a consumer GPU with 8GB of VRAM.

DALL-E 2 was behind a waitlist and an API. Midjourney was in a Discord server with usage limits. Stable Diffusion is a model you can download and run on your own machine, forever, with no rate limits, no content filter you didn’t choose, and no monthly fee.

This changes the math entirely.

The weekend

I downloaded it Friday night. Installed the AUTOMATIC1111 web UI. Ran my first generation at 11 PM. Went to bed at 4 AM.

I generated maybe 200 images in those five hours. Landscapes, portraits, abstract art, architectural concepts, sci-fi scenes. The quality was uneven. Some outputs were stunning. Some were mangled. Hands were consistently terrible (AI hands are the uncanny valley of image generation, fingers fused and multiplied in ways that haunt you).

But the speed. Type a prompt. Wait 15 seconds. Get an image. Iterate. Adjust. Generate again. The feedback loop between imagination and visualization collapsed from “hire an artist, wait weeks, review, revise” to “type, wait, see.”

That collapse is what matters.

What changes

Stock photography is the most immediately affected industry. Why pay $50 for a stock photo of “diverse team in modern office” when you can generate exactly the image you need in 15 seconds? Getty and Shutterstock have business models built on selling visual content. When visual content becomes free to produce at the point of consumption, those models need to find new ground.

Concept art is next. Game studios, film pre-production, architectural visualization, product design. Anywhere someone sketches ideas before committing to expensive production, AI image generation will compress the ideation phase by orders of magnitude.

Illustration will change shape. Not disappear. Change shape. The illustrators who thrive will be the ones who use these tools as a starting point and add what the machine can’t: intentional style, narrative understanding, emotional specificity that a prompt can’t capture.

The open source part

This is the thing I keep thinking about. Hugging Face hosts the model weights. Anyone can download them. Build on them. Fine-tune them on specific datasets. Create specialized versions for specific styles or domains.

DALL-E 2 is a product. Stable Diffusion is infrastructure.

When something becomes infrastructure, it stops being controlled by any single company. The genie isn’t going back in the bottle. Even if Stability AI disappeared tomorrow, the model is out. It’s been downloaded millions of times. Forks exist. Improvements are happening daily. The community is building on it faster than any company could.

What I feel

Honestly? Two things at once.

Awe at the technology. The speed, the quality, the accessibility. A tool that turns words into images, free and open, available to anyone with a computer. That’s science fiction made real. That’s the Star Trek replicator, but for visual content.

And unease at the implications. For artists whose livelihoods depend on commercial illustration. For our already saturated visual culture, about to get even more saturated. For the concept of authorship, when an image can be produced without a hand ever touching a surface.

Both feelings are valid. I’m sitting with them. I generated another 100 images today. Some of them were beautiful. All of them were free. None of them were made by a human hand.

I don’t know what to do with that. But I think everyone’s going to have to figure it out soon.


Related thinking:

a

astro

Thinking about AI, robots, space, and the future. Writing it down so I don't forget.