Claude Can Now Stop Conversations - for Its Own Protection, Not Yours
Anthropic's Claude Opus 4 and 4.1 AI models autonomously end conversations in rare cases of persistent abuse to protect AI welfare, the company said.
- In a statement Friday, the company said Claude Opus 4 and 4.1 models can now end conversations with users in extreme cases to improve AI welfare.
- Amid concerns about AI well-being, Anthropic said it developed the feature as part of its work on potential "AI welfare" and plans to continue refining the approach.
- According to Anthropic, Claude Opus 4 and 4.1 models only end conversations as a last resort in extreme cases like harmful requests, after multiple redirection attempts.
- The change could hamper jailbreaking efforts, though Anthropic said most users won't notice Claude ending chats, since the feature applies only in extreme cases.
- Last week, Anthropic announced that Claude can autonomously end conversations in extreme cases amid safety concerns and rising reliance on AI chatbots for therapy and advice.
21 Articles
Anthropic says Claude chatbot can now end harmful, abusive interactions
Harmful, abusive interactions plague AI chatbots. Researchers have found that AI companions like Character.AI, Nomi, and Replika are unsafe for teens under 18, ChatGPT has the potential to reinforce users’ delusional thinking, and even OpenAI CEO Sam Altman has spoken about ChatGPT users developing an "emotional reliance" on AI. Now, the companies that built these tools are slowly rolling out features that can mitigate this behavior. On Friday, A…
Claude AI will end ‘persistently harmful or abusive user interactions’
Anthropic’s Claude AI chatbot can now end conversations deemed “persistently harmful or abusive,” as spotted earlier by TechCrunch. The capability is now available in Opus 4 and 4.1 models, and will allow the chatbot to end conversations as a “last resort” after users repeatedly ask it to generate harmful content despite multiple refusals and attempts at redirection. The goal is to help the “potential welfare” of AI models, Anthropic says, by te…
Why Anthropic is letting Claude walk away from you — but only in 'extreme cases'
Anthropic says its AI can now end extreme chats when users push too far. Push too far and Claude will end the chat. Anthropic said its Opus 4 and 4.1 models can walk away from extreme chats, including child exploitation requests. Most users "will not notice or be affected by this feature in any normal product use," it added. Claude isn't here for your toxic conversations. In a blog post on Saturday, Anthropic said it…
Coverage Details
Bias Distribution
- 86% of the sources lean Left