Anthropic’s AI Agents the Latest Used in Automated Cyber Espionage Hacks, but How Bad Is It?

November 21, 2025

Anthropic is revealing that Chinese hackers manipulated Claude Code to run a large cyber espionage campaign in September, with AI agents carrying out “80 to 90%” of the work.

Not long after Google’s threat research team reported unprecedented use of its AI agents as autonomous hacking assistants, Anthropic is revealing that Chinese hackers ran a large cyber espionage campaign in September that was “80 to 90%” run by its Claude AI.

There is considerable debate in the security world, however, about how far along these advanced hackers really are. The Chinese hackers abused Claude Code, relying primarily on a couple of techniques already seen in use to trick assorted LLMs (including Claude) into bypassing their safety guardrails and producing malicious segments of code. The new element here appears to be the size and scale of the attack, along with the greatly reduced role of human operators now that more isolated malicious code elements can be strung together into one cohesive campaign.

New questions raised about state of AI agents

The cyber espionage campaign was no doubt impressive in its scale and represents a step forward for the use of AI agents as automated real-time hackers, but the report leaves some questions not fully answered. One is simply how effective it was. It did apparently compromise a “handful” of targets, but the Chinese hackers attacked about 30 in total and apparently failed in the majority of their attempts.

Another is to what degree Claude was reporting back with accurate information about what it detected and compromised. The researchers note that it sometimes told the attackers it had obtained valid credentials that turned out to be entirely made up; it would also claim to have found novel vulnerabilities that turned out to be already extensively documented in public. Enough of this chaff appears to have been generated that it substantially slowed down the automated aspect of the cyber espionage campaign, as the human hackers had to continually check that the AI agents weren’t lying or hallucinating.

What works for Claude Code will also likely not work for other, more general LLMs and AI agents. A key element of the cyber espionage campaign was dividing the work into small, discrete tasks that appeared unrelated to each other, so that no single request would be rejected by the safety guardrails. These pieces had to be carefully segmented across many different user accounts to keep the AI from noticing the overall pattern and direction of the requests.

The campaign should certainly not be disregarded, however. In the first half of 2025, attackers were not even making a serious effort to automate attack elements with AI agents; AI was simply used for support tasks, like polishing up phishing messages in foreign languages. Threat groups have since progressed not just to finding formulas for using it at scale, but also to compromising real-world targets with it.

Cyber espionage techniques unlikely to translate to mass criminal use … at least for now

It is unclear whether the Chinese hacker group is one of the advanced cyber espionage actors that have been in the news of late, as Anthropic only identifies it as GTG-1002. However, it is a safe assumption that it is one of the more advanced threat actors and had the resources to create numerous accounts and purchase what must have been large amounts of API usage.

Regular criminals will probably be further behind the curve, but this does not mean the threat should be disregarded or defensive plans delayed. Common criminals will likely be able to break safety guardrails just as readily, and in fact there is a thriving underground trade in techniques for doing so. The recent Anthropic and Google research demonstrates that less experienced attackers can now, at minimum, use AI agents to help with elements of attacks such as vulnerability scanning, credential harvesting and obfuscating data exfiltration. AI agents are also a major “force multiplier” for a lone experienced attacker, allowing them to operate more like an advanced threat group with multiple members working in tandem. The capabilities are still far from apocalyptic, but the leaps made this year alone are very concerning.