Claude's computer use and the birth of AI agents
The word “agent” has been thrown around loosely in AI circles for the past year. Mostly it meant “a chatbot that calls a function.” A glorified API wrapper. Not really autonomous. Not really an agent.
That’s changing.
Anthropic’s Claude can now use a computer. OpenAI’s models can browse the web and execute multi-step plans. Google’s Gemini can interact with applications and complete workflows. LangChain and similar frameworks make it possible to chain these capabilities into systems that plan, execute, and adapt.
We went from “chatbot that answers questions” to “autonomous system that completes tasks” in about 18 months. And I don’t think most people have registered the magnitude of that shift.
What agents actually are
A real AI agent isn’t a chatbot with extra steps. It’s a system that can:
- Receive a goal in natural language
- Break the goal into subtasks
- Execute each subtask (using tools, APIs, browsers, code execution)
- Evaluate the result
- Adapt if something went wrong
- Continue until the goal is complete
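The loop above can be sketched in a few lines. This is a toy under stated assumptions, not how Claude or any framework implements it: `run_agent`, `StepResult`, and the `toy_*` functions are hypothetical names, and the planner, executor, and evaluator are plain stand-in functions where a real system would call a model and real tools.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StepResult:
    subtask: str
    output: str
    ok: bool


def run_agent(
    goal: str,
    plan: Callable[[str], List[str]],
    execute: Callable[[str, int], str],
    evaluate: Callable[[str, str], bool],
    max_retries: int = 2,
) -> List[StepResult]:
    """Plan the goal, then execute, evaluate, and retry each subtask in order."""
    results: List[StepResult] = []
    for subtask in plan(goal):                    # break the goal into subtasks
        output, ok = "", False
        for attempt in range(max_retries + 1):
            output = execute(subtask, attempt)    # execute the subtask
            ok = evaluate(subtask, output)        # evaluate the result
            if ok:
                break                             # otherwise adapt: try again
        results.append(StepResult(subtask, output, ok))
        if not ok:
            break                                 # stop if a subtask cannot be completed
    return results


# Toy stand-ins (assumptions): a real agent would call a model and real tools here.
def toy_plan(goal: str) -> List[str]:
    return [part.strip() for part in goal.split(",")]


def toy_execute(subtask: str, attempt: int) -> str:
    # Simulate one transient failure so the retry path is exercised.
    if subtask == "summarize" and attempt == 0:
        return "error: timeout"
    return f"done: {subtask}"


def toy_evaluate(subtask: str, output: str) -> bool:
    return output.startswith("done")


if __name__ == "__main__":
    for step in run_agent("find papers, summarize, compile",
                          toy_plan, toy_execute, toy_evaluate):
        print(step)
```

The part that separates this from a chatbot is the inner retry loop: the system checks its own output and tries again instead of handing the failure back to you.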
The difference between a chatbot and an agent is the difference between an advisor and an employee. The advisor answers questions. The employee does the work.
I gave Claude a task last week: research a technical topic, find three relevant academic papers, summarize each one, and compile the summaries into a formatted document. It browsed the web, found the papers, read them (or at least the abstracts and introductions), wrote summaries, and assembled the document. The output was usable. Not perfect, but usable. The entire process took about four minutes.
That’s an agent.
The 18-month timeline
In early 2023, AI could answer questions well but couldn’t do things. You asked ChatGPT a question, it answered, you took the answer and did something with it yourself. The AI was the brain. You were the hands.
By mid-2023, function calling arrived. Models could call predefined functions: search the web, run code, query a database. The hands were limited to a set of tools the developer defined.
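Here is roughly what “predefined functions” meant in practice, as a minimal sketch. I'm assuming the model emits a JSON object with a tool name and arguments; the `TOOLS` registry, the `tool` decorator, `dispatch`, and the JSON shape are all illustrative and not any vendor's actual format.

```python
import json
from typing import Any, Callable, Dict

# Registry of the only functions the model is allowed to call (hypothetical).
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function under its own name so the model can request it."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def add(a: float, b: float) -> float:
    return a + b


@tool
def word_count(text: str) -> int:
    return len(text.split())


def dispatch(call_json: str) -> Any:
    """Run the tool named in a model-produced call (assumed JSON shape)."""
    call = json.loads(call_json)
    name = call["name"]
    if name not in TOOLS:
        # Anything the developer did not define is simply unavailable.
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


if __name__ == "__main__":
    print(dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}'))
    print(dispatch('{"name": "word_count", "arguments": {"text": "agents do the work"}}'))
```

The `KeyError` branch is the limitation in one line: if the developer didn't register it, the model can't do it.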
By early 2024, models could browse the web, call external APIs, and chain actions together. The tools expanded from “predefined functions” to “anything with a programmatic interface.”
By late 2024, models could see screens, click buttons, navigate applications, and complete multi-step tasks in environments they’d never been trained on. The tools expanded from “anything with a programmatic interface” to “anything a human can use.”
Eighteen months from “answers questions” to “completes tasks autonomously.”
What this means practically
The tasks that AI agents can handle today are still relatively simple. Fill out this form. Find this information. Compile this report. Run this test suite. Deploy this code.
But simple tasks add up. A knowledge worker’s day is made of dozens of simple tasks. Each one is individually easy but collectively time-consuming. If an agent handles half of them, that’s hours freed up. Not idle hours, either: hours for the tasks that require judgment, creativity, and human connection.
The more interesting question is where this goes as the capability increases. If agents can handle simple tasks today, they’ll handle moderately complex tasks next year and complex tasks the year after. The boundary of what requires human involvement keeps moving.
I don’t think this leads to mass unemployment (at least not quickly). I think it leads to a restructuring of work around the tasks that agents can’t do: the ambiguous ones, the interpersonal ones, the ones that require understanding context that isn’t on a screen.
The trust problem
The biggest barrier isn’t capability. It’s trust.
Would you let an AI agent send an email on your behalf? Maybe. Would you let it make a purchase? Probably not without review. Would you let it file a bug report? Depends on the stakes.
The trust boundary determines adoption. And trust isn’t about capability. It’s about predictability and stakes. I trust Claude to research a topic and write a summary because the cost of failure is low (a bad summary is fixable). I don’t trust it to negotiate a contract because the cost of failure is high.
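One way to picture the trust boundary is as a policy that gates each action by its cost of failure. This is a sketch only: the action names, the numeric stakes, and the two thresholds are values I made up for illustration, and every person or organization would set them differently.

```python
from enum import Enum
from typing import Dict


class Decision(Enum):
    ALLOW = "allow"    # agent acts on its own
    REVIEW = "review"  # agent drafts, a human approves
    DENY = "deny"      # agent does not act


# Hypothetical cost-of-failure ratings, 1 (easily fixed) to 4 (hard to undo).
STAKES: Dict[str, int] = {
    "write_summary": 1,
    "file_bug_report": 2,
    "send_email": 2,
    "make_purchase": 3,
    "negotiate_contract": 4,
}


def decide(action: str, auto_limit: int = 1, review_limit: int = 3) -> Decision:
    """Gate an action by stakes: low runs alone, medium needs review, high is refused."""
    stakes = STAKES.get(action)
    if stakes is None:
        return Decision.DENY  # unknown actions are treated as untrusted
    if stakes <= auto_limit:
        return Decision.ALLOW
    if stakes <= review_limit:
        return Decision.REVIEW
    return Decision.DENY


if __name__ == "__main__":
    for name in STAKES:
        print(f"{name}: {decide(name).value}")
```

Raising `auto_limit` is what growing trust looks like in this picture: the agent's capability stays the same while the line moves.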
As agents become more capable, the trust question gets more layered: which actions, at what stakes, with how much review. Each person, each organization, will draw the line differently. The technology will be ready before the trust is.
That gap, between capability and trust, is where we’ll live for the next few years. The agents can do the work. We’re not sure we’re ready to let them.