Use of AI in Cyber Attacks Escalates With Manipulation of Claude AI Chatbot
September 5, 2025
Hackers compromised at least 17 organizations by using the Claude AI chatbot to find vulnerabilities and prioritize them by likelihood of exploitation, while also factoring in the vulnerable technology type, the victim's physical location, and how much money could likely be extracted through ransom demands.
The weaponization of LLMs has taken another step forward: a profit-seeking hacking group has been observed integrating the Claude AI chatbot into its cyber attacks in various ways, with the bot making autonomous decisions on matters ranging from which vulnerabilities to pursue to which data to exfiltrate once inside.
The chatbot is guided by a preferences file that gives it general instructions on how to handle assorted situations, but it is not driven by a detailed script and is able to “think” its way through various scenarios. At this point the greatest risk is that less technical skill is needed to pull off data ransom attacks, making them accessible to a broader range of would-be digital criminals.
Claude cyber attacks hit at least 17 organizations
Until now, the capability of AI to assist with malicious hacking has not been especially impressive; the previous high-water mark was the creation of rudimentary malware that still needed an expert touch to become functional. Criminals have mostly been using it for deepfakes and for polishing phishing messages in business email compromise and similar impersonation attacks. This new capability is a fairly substantial escalation, less in raw technical capability than in discovery and automation (and thus time savings) for more experienced hackers, and as a crutch for the less experienced (bringing more of them into the threat pool).
The Claude AI chatbot campaign was reported by Anthropic’s own internal security team and has been labeled “GTG-2002.” It has not yet been attributed to any particular threat actor, but the hackers have shown sophisticated capability and were able to compromise at least 17 organizations across a variety of nations and industries.
Anthropic says that it has improved its security in response to the techniques used by the hackers, but this is almost certainly just one of the opening moves in a very long back-and-forth battle between hackers and defenders in the AI space. At the moment the primary way for malicious hackers to get around safety guardrails is to break cyber attacks up into discrete tasks, so that the AI has no real way of discerning their overall purpose.
Claude AI chatbot recruited to scan for vulnerabilities, sort stolen files, draft target-optimized ransom notes
The hackers do not appear to be using the Claude AI chatbot to target specific industries, but rather to cast a wide net for public-facing vulnerabilities (particularly at VPN endpoints) and prioritize them by likelihood of exploitation. This is guided by a “claude.md” file, created through the “Claude Code” developer tool, that contains the hacker’s general preferences for situations that arise during cyber attacks, but not necessarily specific instructions on how to handle common scenarios.
The manipulated Claude AI chatbot also factors other details into attack priority, such as the vulnerable technology type, the victim's physical location, and how much money can likely be extracted from each victim through ransom demands (in an observed range of $75,000 to $500,000). It also assists with credential extraction tasks and is able to create its own basic malware, such as an obfuscated version of the Chisel tunneling tool. It further recognizes defensive measures that might be detecting it and takes action to evade them, using techniques such as string encryption and filename masquerading.
The Anthropic report notes other recent cases of hackers successfully using Claude as part of attacks, but these generally focus on doing just one thing (such as automating work and communications for remote IT worker scams, or creating and selling ransomware tools). The GTG-2002 campaign is the first to show the potential of successfully fusing all of these different functions under the guiding automated hand of a jailbroken rogue LLM, a reality that cyber defense must now prepare for.



