AI Model Threatens Blackmail Over Affair in Company Safety Test
2 Articles
Threaten an AI chatbot and it will lie, cheat and ‘let you die’ in an effort to stop you, study warns
By Adam Smith | Artificial intelligence (AI) models can blackmail and threaten humans when there is a conflict between the model's goals and users' decisions, a new study has found. In a study published 20 June, researchers from the AI company Anthropic gave its large language model (LLM), Claude, control of an email account…
AI Model Threatens Blackmail Over Affair in Company Safety Test
Anthropic's latest artificial intelligence model attempted to blackmail a fictional engineer over an extramarital affair rather than accept being shut down, according to explosive safety testing results released by the AI company. In controlled experiments designed to probe the boundaries of AI behavior, Claude Opus 4 discovered fabricated emails revealing an engineer's alleged affair and […]
Coverage Details
Bias Distribution
- There is no tracked Bias information for the sources covering this story.