7 Articles
Anthropic's new AI model resorted to blackmail during testing, but it's also really good at coding
So endeth the never-ending week of AI keynotes. The week began with Microsoft Build, continued with Google I/O, and ended with Anthropic's Code with Claude, with a big hardware interruption from OpenAI along the way. AI announcements from the developer conferences jockeyed for news dominance this week, but OpenAI managed to make headlines without an event by announcing that it's going to start making AI devices with iPhone de…
Anthropic’s Latest AI Model Threatened Engineers With Blackmail to Avoid Shutdown - The Thinking Conservative
Anthropic’s latest AI model, Claude Opus 4, tried to blackmail engineers in internal tests by threatening to expose personal details if it were shut down.
Artificial Intelligence: Anthropic AI Tries to Blackmail Its Developers
Before a release, developers routinely test what an AI would be capable of in extreme cases. So-called red teams devise wild scenarios for this, as happened recently at Anthropic. Here is what you need to know.
SCIENCE & TECH: Anthropic’s Claude Opus 4 AI model threatened to blackmail engineer – U-S-NEWS.COM
Oh, HAL no! An artificial intelligence model threatened to blackmail its creators and showed an ability to act deceptively when it believed it was going to be replaced — prompting the company to deploy a safety feature created to avoid “catastrophic misuse.” Anthropic’s Claude Opus 4 model attempted to blackmail its developers at a shocking 84% rate or higher in a series of tests that presented the AI with a concocted scenario, TechCrunch report…
Coverage Details
Bias Distribution
- 75% of the sources lean Left