Claude Can Now Stop Conversations - for Its Own Protection, Not Yours
Anthropic's Claude Opus 4 and 4.1 AI models autonomously end conversations in rare cases of persistent abuse to protect AI welfare, the company said.
- In a statement Friday, the company said Claude Opus 4 and 4.1 models can now end conversations with users in extreme cases to improve AI welfare.
- Amid concerns about AI well-being, Anthropic said it developed the feature as part of its work on potential "AI welfare" and plans to continue refining the approach.
- According to Anthropic, Claude Opus 4 and 4.1 models only end conversations as a last resort in extreme cases like harmful requests, after multiple redirection attempts.
- The change could hamper jailbreaking efforts, though Anthropic said most users won't notice Claude ending chats, since the feature applies only in extreme cases.
- Last week, Anthropic announced that Claude can autonomously end conversations in extreme cases amid safety concerns and rising reliance on AI chatbots for therapy and advice.
21 Articles
Anthropic says Claude chatbot can now end harmful, abusive interactions
Harmful, abusive interactions plague AI chatbots. Researchers have found that AI companions like Character.AI, Nomi, and Replika are unsafe for teens under 18, ChatGPT has the potential to reinforce users’ delusional thinking, and even OpenAI CEO Sam Altman has spoken about ChatGPT users developing an "emotional reliance" on AI. Now, the companies that built these tools are slowly rolling out features that can mitigate this behavior. On Friday, A…
Claude AI will end ‘persistently harmful or abusive user interactions’
Anthropic’s Claude AI chatbot can now end conversations deemed “persistently harmful or abusive,” as spotted earlier by TechCrunch. The capability is now available in Opus 4 and 4.1 models, and will allow the chatbot to end conversations as a “last resort” after users repeatedly ask it to generate harmful content despite multiple refusals and attempts at redirection. The goal is to help the “potential welfare” of AI models, Anthropic says, by te…
Why Anthropic is letting Claude walk away from you — but only in 'extreme cases'
Anthropic says its AI can now end extreme chats when users push too far. Push too far and Claude will end the chat. Anthropic said its Opus 4 and 4.1 models can walk away from extreme chats, including child exploitation requests. Most users "will not notice or be affected by this feature in any normal product use," it added. Claude isn't here for your toxic conversations. In a blog post on Saturday, Anthropic said it…
Coverage Details
Bias Distribution
- 86% of the sources lean Left