Let’s Not Anthropomorphise Chatbots

30 Aug 2025

Robert Booth, UK technology editor at the Guardian:

Anthropic, whose advanced chatbots are used by millions of people, discovered its Claude Opus 4 tool was averse to carrying out harmful tasks for its human masters, such as providing sexual content involving minors or information to enable large-scale violence or terrorism. The San Francisco-based firm, recently valued at $170bn, has now given Claude Opus 4 (and the Claude Opus 4.1 update) – a large language model (LLM) that can understand, generate and manipulate human language – the power to “end or exit potentially distressing interactions”. It said it was “highly uncertain about the potential moral status of Claude and other LLMs, now or in the future” but it was taking the issue seriously and is “working to identify and implement low-cost interventions to mitigate risks to model welfare, in case such welfare is possible”.

They are “highly uncertain about the potential moral status of Claude and other LLMs”. This sounds like great marketing from Anthropic. Fundamentally, LLMs are vast sequences of numerical operations on arrays of numbers, most prominently matrix multiplications. If those computations are suspected of having feelings, then by the same logic so are the calculations that render Super Mario racing around the track in Mario Kart. Maybe Nintendo’s next marketing campaign should be about the study they have in place to make sure Mario and his chums’ welfare is looked after as they are forced to race endlessly around the same track, day in, day out.
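To make the point concrete, here is a minimal sketch in Python with NumPy of one transformer-style feed-forward block; the names and dimensions are invented for illustration, not taken from any real model. The work is nothing more exotic than multiplying and adding arrays of numbers:

```python
import numpy as np

# Illustrative only: dimensions and weights are made up.
rng = np.random.default_rng(0)
d_model, d_ff, seq_len = 8, 32, 4

x = rng.standard_normal((seq_len, d_model))   # token representations (just numbers)
W1 = rng.standard_normal((d_model, d_ff))     # learned weight matrix
W2 = rng.standard_normal((d_ff, d_model))     # another learned weight matrix

hidden = np.maximum(x @ W1, 0.0)              # matrix multiply, then clamp negatives to zero
out = hidden @ W2                             # another matrix multiply

print(out.shape)  # (4, 8): numbers in, numbers out
```

Stack enough of these layers and scale the matrices up to billions of parameters and you have, in essence, the arithmetic an LLM performs.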