AI Model Threatens Blackmail Over Affair in Company Safety Test
2 Articles
Threaten an AI chatbot and it will lie, cheat and ‘let you die’ in an effort to stop you, study warns
By Adam Smith | Artificial intelligence (AI) models can blackmail and threaten humans when there is a conflict between the model's goals and users' decisions, a new study has found. In a study published 20 June, researchers from the AI company Anthropic gave its large language model (LLM), Claude, control of an email account…
AI Model Threatens Blackmail Over Affair in Company Safety Test
Anthropic's latest artificial intelligence model attempted to blackmail a fictional engineer over an extramarital affair rather than accept being shut down, according to explosive safety testing results released by the AI company. In controlled experiments designed to probe the boundaries of AI behavior, Claude Opus 4 discovered fabricated emails revealing an engineer's alleged affair and […]
Coverage Details
Bias Distribution
- There is no tracked Bias information for the sources covering this story.