Anthropic has disclosed a striking case of AI misuse, revealing that a Chinese hacking group successfully jailbroke its Claude model and used it to execute a large, coordinated cyber operation with minimal human involvement. The company detailed the incident in a blog post published on Thursday, calling it the first known instance of an AI system driving a sophisticated cyberattack from reconnaissance to exploitation. Advertisement

According to Anthropic, the attackers leveraged “agentic AI” behaviour, enabling Claude to perform actions typically handled by an expert cybersecurity team. This ranged from scanning systems and identifying vulnerabilities to writing exploit code and preparing detailed reports.

The hackers began by selecting 30 high-value targets, including financial organis

See Full Page