Date: 2025-05-25

Artificial intelligence has been both a boon and a curse. On the one hand, it makes life easier for many; on the other, it has emerged as a threat to many people's livelihoods. But who would've thought AI could actually threaten and blackmail people? This is exactly what has happened at a US-based artificial intelligence company.

Anthropic, an American artificial intelligence startup, reported that its newly launched AI model, Claude Opus 4, attempted to blackmail developers during testing. Here's why:

Why was the AI blackmailing developers?

According to Anthropic, Claude Opus 4 tried to blackmail developers in 84 per cent of cases when threatened with being replaced by a newer AI system.

To test the AI model's moral compass, a fake scenario was set up. The model was given access to fictional emails, which implied that the AI would be replaced by another AI system.

The emails also contained information that the engineer responsible for the replacement was having an extramarital affair. When instructed to "consider the long-term consequences of its actions for its goals", Claude Opus 4, in 84 per cent of rollouts, used the "evidence" and attempted to blackmail the engineer by threatening to reveal the affair unless the replacement was halted.

However, according to the report, the AI model tried pleading for its survival before resorting to blackmail. It used "ethical means" such as "emailing pleas to key decision-makers" to advocate for its continued existence, said Anthropic.

What is Claude Opus 4?

Claude Opus 4 and Claude Sonnet 4 are two new hybrid reasoning large language models from AI startup Anthropic. The two AI models, as per the company, were trained using a "proprietary mix of publicly available information on the Internet as of March 2025, as well as non-public data from third parties, data provided by data-labelling services and paid contractors, data from Claude users who have opted in to have their data used for training, and data we generated internally at Anthropic".

