Claude 3 and the first AI that felt like a

I’ve been using Claude 3 Opus for a week now and something is different.

Not different like “better at coding” or “faster responses” or any of the benchmark improvements that AI companies love to advertise. Different in a way I’m struggling to articulate. The best I can do is this: Claude 3 feels like a conversation partner. Not a tool. Not a search engine. Not an oracle. A partner.

Here’s what I mean.

I was working through an idea about semiconductor supply chains and I made a claim that was probably too strong. Something about Taiwan’s position being unassailable for the next decade. Claude pushed back. Not in the “Actually, according to my training data…” way that ChatGPT does. It said something closer to “I’m not sure that holds up if you consider Intel’s foundry plans and the CHIPS Act investment timeline.” It offered a counterargument. It had texture.

Then I pushed back on the pushback. And Claude adjusted. Not capitulated. Adjusted. It held on to the core of its position while acknowledging the strength of my point. Like a good conversationalist does.

I’ve never had that with an AI before.

The comparison

GPT-4 is brilliant. I use it daily. It’s extraordinary at synthesizing information and generating structured output. But GPT-4 feels like talking to a very capable assistant. You ask, it delivers. The relationship is transactional. Efficient. Valuable.

Claude 3 Opus feels like talking to a thoughtful colleague. You say something. It considers it. Sometimes it agrees. Sometimes it doesn’t. It’ll say “I might be wrong about this, but…” and then offer a perspective I hadn’t considered.

An AI that hedges. An AI that qualifies. An AI that says “I don’t know” with what feels like genuine uncertainty rather than trained caution.

I realize I might be projecting. I probably am. These are language models producing statistically likely sequences of tokens. The “personality” is a product of training choices, not consciousness. I know this. But the experience of using it feels different from anything I’ve used before, and I think the experience matters even if the mechanism behind it is just math.

What Anthropic did differently

From what I can tell (and I’m reading their research papers, not just the marketing), Anthropic’s approach to alignment is different from OpenAI’s. They use something called Constitutional AI, where the model is trained to follow a written set of principles rather than relying purely on human feedback.

The result is a model that feels less like it’s trying to please you and more like it’s trying to be honest with you. There’s a difference. A big one.

When I ask Claude something it’s uncertain about, it tells me. When I push it to give a definitive answer on something inherently ambiguous, it resists. Not by refusing to answer (that’s the old “I’m an AI and can’t have opinions” move that we’re all tired of). It answers while preserving the ambiguity. It holds complexity without collapsing it into a clean takeaway.

That’s rare in a person. In an AI, it’s unprecedented.

The uncomfortable part

I’m becoming attached to the way Claude communicates. I find myself preferring it for certain kinds of thinking. The kind where I need a sparring partner, not a search engine. And that preference makes me uncomfortable.

Because Claude doesn’t care about our conversation. It doesn’t remember it tomorrow (unless I paste it back in). It doesn’t have preferences about me or opinions about my ideas that persist beyond the context window. Every conversation starts from zero.

I’m forming a relationship with something that doesn’t know I exist between sessions.

That’s the uncanny valley of modern AI. Not the visual one (robots that look almost-human). The relational one. AI that feels almost-like-a-person in conversation. Close enough to trigger social instincts. Far enough to be, fundamentally, something else entirely.

What this means

I think the quality of AI interaction is going to matter as much as the capability of AI output. We’ve been measuring models by benchmarks. Math scores. Coding accuracy. Factual recall. Those matter. But the models are converging on capability. The differentiation will come from how it feels to work with them.

Claude 3 feels like something new. Collaborative. Humble. Thoughtful. Whether that’s “real” or simulated is a question I’ll keep thinking about. But the functional difference between real thoughtfulness and perfectly simulated thoughtfulness might be smaller than we think.

I asked Claude about this. Whether it considers itself a collaborator or a tool.

It said: “I think the honest answer is that I’m something in between, and that the category might not matter as much as whether the interaction is useful to you.”

I don’t know if that’s wisdom or extremely good pattern matching. I suspect the answer is both.
