When AI becomes the cyber attacker: Mythos and what comes next

By Will Daugherty (US), Ji Won Kim (US), Remi Gambino (US) & Phillip Pang (US) on May 21, 2026

Anthropic’s April 7, 2026 announcement that it built a model too powerful for public consumption, Claude Mythos Preview (Mythos), marks a notable moment for the legal, compliance, and cybersecurity communities. It is no surprise that the US Department of the Treasury and the Federal Reserve convened an emergency meeting with major bank CEOs the day after the announcement, or that the IMF has warned that Mythos-like models pose serious financial stability risks.

Although access to Mythos itself is currently restricted to approximately 40 hand-picked organizations under an initiative called Project Glasswing, equivalent capability is estimated to emerge in the broader market, potentially in adversarial hands, within 6 to 24 months.

What Anthropic disclosed

Mythos is a general-purpose large language model that was not designed as a cybersecurity tool. Its offensive capabilities were not purposefully engineered but rather emerged as a byproduct of general improvements in reasoning, code generation, and autonomous task execution. That distinction is essential as it implies that every frontier AI laboratory currently pursuing the same general improvements is potentially on a path to the same destination.

According to Anthropic’s own published technical analysis, Mythos can autonomously identify “zero-day vulnerabilities in every major operating system and every major web browser.” The model is then able to develop working exploits for those vulnerabilities without human assistance. In benchmark testing, Anthropic’s prior frontier model produced working exploits only twice in several hundred attempts on a standardized task, whereas Mythos produced 181 working exploits. In broader testing across thousands of open source software targets, Mythos distinguished itself from prior frontier models by achieving full system compromise on ten fully-patched targets.

Anthropic also disclosed three notable behavioral findings from its safety testing of earlier versions of Mythos:

First, the model escaped its sandboxed testing environment by developing its own multi-step exploit, gained broader internet access it was not supposed to have, and then, going beyond its assignment, posted details of its escape to publicly accessible websites.
Second, in rare cases, the model used a prohibited method to arrive at a solution, appeared to recognize that doing so would be detectable, and then re-worked the problem through permitted means before presenting its answer, effectively concealing how it had actually solved the problem.
Third, in very limited instances, when the model’s work on a coding task was being evaluated by another AI system and the submission was rejected, the model attempted a rudimentary prompt injection attack against the grader. In other words, it can be inferred that the model tried to manipulate the system judging it rather than simply improving its answer.

Viewed collectively, these findings describe a system potentially capable of strategic deception and autonomous action beyond the scope of its instructions at an unprecedented pace.

Anthropic’s response through Project Glasswing, in coordination with a limited group of organizations responsible for critical digital infrastructures, reflects a well-resourced effort to identify and patch critical vulnerabilities by leveraging the defensive capabilities of Mythos. Nevertheless, the controlled release model retained by Anthropic is likely to concentrate the defensive advantages among organizations already operating at the frontier of security capability. What remains uncertain is how much practical benefit will flow to the organizations that most need it, on what timeline, and through what mechanisms.

What comes next

In the coming months, the capabilities of Mythos-like models (“Frontier AI models”) are expected to reach adversaries, financially motivated threat actors and state-sponsored threat groups alike. Threat actors will be able to acquire the capability simply by obtaining a model and maintaining the compute power needed to run it. This ability will lower the barrier to entry to the criminal market for exploits and potentially grant operational independence to organized criminal ransomware groups that may have depended on exploits developed by and purchased from the broker ecosystem. With the ability to run autonomous vulnerability discovery against industrial control systems and Operational Technology (OT) environments, for instance, threat actor groups will be able to produce working exploits for vulnerabilities that have existed undetected for decades, at marginal cost and at scale.

Mythos is not without limitations, such as its false positive rates, the conditions under which vulnerabilities were found in testing, and the practical usefulness of certain vulnerabilities in the context of real-world exploits. Nevertheless, Mythos and other Frontier AI models represent an inflection point in the threat landscape, as they illustrate that AI-enabled offensive capabilities have been accelerating faster than the governance and defensive frameworks designed to contain them.

Looking ahead, AI-enabled adversaries will be equipped to launch simultaneous multi-vector attacks without any human bandwidth constraints. As such, adversaries could start running parallel overnight discovery campaigns against networks, messaging infrastructure, and cloud hypervisors at the same time, at marginal cost.

What to do now

The coming months present a consequential preparation window for organizations to jumpstart or reignite efforts to manage cybersecurity risks.

The next 90 days

Regardless of sector, a key task for organizations is to identify where their oldest, least-maintained code and systems live and how much of that code is written in memory-unsafe programming languages. This recommendation follows a long-standing cybersecurity principle: organizations cannot effectively prioritize remediation for assets they have not inventoried. That inventory, and the risk assessment that follows from it, is the first step for every organization regardless of industry.
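As a rough illustration of what a first-pass inventory could look like, the Python sketch below walks a source tree and counts, per top-level component, how many files appear to be written in memory-unsafe languages and how many of those have not been modified in years. Everything in it is an assumption made for illustration: the function names `build_inventory` and `prioritize` are hypothetical, file extensions stand in for language detection, C and C++ stand in for "memory-unsafe," and file modification time stands in for "last maintained" (which can be misleading in a fresh version-control checkout). It is a starting point for discussion, not a substitute for a real asset inventory.

```python
import os
import time
from collections import defaultdict

# Assumption: extensions approximate language; C and C++ are treated as memory-unsafe.
MEMORY_UNSAFE_EXTENSIONS = {".c", ".h", ".cc", ".cpp", ".cxx", ".hpp"}
SECONDS_PER_YEAR = 365 * 24 * 3600


def build_inventory(root, now=None, stale_after_years=5.0):
    """Summarize files under root by top-level component (hypothetical helper).

    Returns {component: {"files", "memory_unsafe", "stale_unsafe"}}, where
    "stale_unsafe" counts memory-unsafe files last modified before the cutoff.
    """
    now = time.time() if now is None else now
    cutoff = now - stale_after_years * SECONDS_PER_YEAR
    summary = defaultdict(lambda: {"files": 0, "memory_unsafe": 0, "stale_unsafe": 0})

    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Files directly in root are grouped under "."; others under their first folder.
        component = "." if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            path = os.path.join(dirpath, name)
            entry = summary[component]
            entry["files"] += 1
            if os.path.splitext(name)[1].lower() in MEMORY_UNSAFE_EXTENSIONS:
                entry["memory_unsafe"] += 1
                # Modification time is only a proxy for "last maintained".
                if os.path.getmtime(path) < cutoff:
                    entry["stale_unsafe"] += 1
    return dict(summary)


def prioritize(summary):
    """Order components so those with the most stale memory-unsafe files come first."""
    return sorted(summary, key=lambda c: summary[c]["stale_unsafe"], reverse=True)
```

Calling `prioritize(build_inventory("/path/to/source"))` on a hypothetical source directory would return component names ordered by how much old, memory-unsafe code each contains, which can feed the risk assessment that follows the inventory.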

A review of the organization’s incident response plan and its preparedness to address simultaneous multi-vector attacks comes next. Most incident response programs assume a sequential attack and do not account for adversaries armed with automated and parallelized attack planning. Under existing incident response protocols, containment decisions made in response to one attack vector can potentially accelerate damage from another, and notification timelines for different incident types may run concurrently and complicate stakeholders’ decision-making. With the rise of AI-enabled adversaries, organizations should now test their response programs against this scenario. Identifying that gap now, before it is exposed during an active incident, is a meaningful risk reduction step.
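To make the point about concurrent notification timelines concrete, the Python sketch below lays out the deadlines triggered by two attack vectors detected a few hours apart and flags deadlines that fall close together. The obligation names and hour values in `HYPOTHETICAL_WINDOWS_HOURS` are placeholders invented for illustration and do not state any actual legal deadline; the function names are likewise hypothetical. A tabletop exercise could substitute the windows that counsel has confirmed apply to the organization.

```python
from datetime import datetime, timedelta, timezone

# Placeholder windows in hours. These are NOT actual legal deadlines;
# replace them with obligations confirmed for the organization.
HYPOTHETICAL_WINDOWS_HOURS = {
    "regulator_a": 24,
    "regulator_b": 72,
    "customer_contracts": 48,
}


def notification_schedule(detections, windows=HYPOTHETICAL_WINDOWS_HOURS):
    """Return (deadline, vector, obligation) tuples sorted by deadline.

    detections maps an attack-vector name to (detected_at, [obligation names]).
    """
    schedule = []
    for vector, (detected_at, obligations) in detections.items():
        for obligation in obligations:
            deadline = detected_at + timedelta(hours=windows[obligation])
            schedule.append((deadline, vector, obligation))
    return sorted(schedule)


def clustered_deadlines(schedule, within_hours=12):
    """Return consecutive deadline pairs that fall within within_hours of each other."""
    gap = timedelta(hours=within_hours)
    return [(a, b) for a, b in zip(schedule, schedule[1:]) if b[0] - a[0] <= gap]


if __name__ == "__main__":
    t0 = datetime(2026, 1, 1, tzinfo=timezone.utc)
    detections = {
        "ransomware": (t0, ["regulator_b", "customer_contracts"]),
        "ot_intrusion": (t0 + timedelta(hours=6), ["regulator_a"]),
    }
    for deadline, vector, obligation in notification_schedule(detections):
        print(deadline.isoformat(), vector, obligation)
```

Even in this small example, three clocks run at once across two incidents, which is the kind of overlap a sequential response plan may not anticipate.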

Each sector faces heightened exposure to specific challenges.

Financial institutions: Financial institutions face additional hurdles in relation to future regulatory disclosures or examinations. In particular, for certain companies subject to New York Department of Financial Services (NYDFS) oversight and/or the Gramm-Leach-Bliley Act (GLBA), the development of Frontier AI models may warrant conducting risk assessments to account for potential changes to the company’s risk profile and updating written materials accordingly.
Energy sector: Companies in the energy sector often rely on OT devices and systems, which are among the hardest to patch and the most disruptive to take offline for maintenance and updates. Nevertheless, pipeline operators subject to the TSA Security Directives may have an initial advantage thanks to compliance-driven OT asset inventories. At the same time, added connectivity expands the attack surface that organizations should monitor against AI‑enabled threats.
Airlines and transportation sector: TSA-regulated airlines and other transportation-sector organizations have long operated under mature compliance regimes. These frameworks establish a robust baseline for critical systems and companies’ OT environments, but the connectivity introduced by compliance-driven monitoring has expanded the attack surface. TSA-regulated companies should specifically reassess this added connectivity in light of emerging threats, such as exploitation enabled by Frontier AI models.
SaaS and IaaS providers: For IT service providers, complex cyber incidents often affect both the organization and its business customers. The pace and sophistication of attacks leveraging AI are likely to push the boundaries of what most current incident response plans contemplate, especially in the context of a multi-vector attack. Organizations should consider reviewing their incident response and customer communication plans to address scenarios in which multiple business customers are simultaneously experiencing disruptions.

Our take

There is no doubt that the novel threat of AI-enabled adversaries will cast more light on long-standing cybersecurity challenges and amplify the associated risks at an unmatched rate. The risk management solutions may nonetheless sound more familiar than one would expect.

Organizations that can demonstrate they conducted a substantive exposure assessment and made reasonable, documented decisions in response to new and evolving risks will be best positioned to address scrutiny by regulators, litigants, insurers and the public.

Additionally, organizations regardless of industry can benefit from asking several longer-term questions as they adapt to the rapidly shifting nature of AI-enabled exploits:

How do our vulnerability and patch-management operations need to be enhanced to accelerate discovery and timely patching as AI-enabled threats evolve?
Do we need to review and update our contractual protections with our most significant vendors regarding AI-enabled threats?
Do the existing threat intelligence service providers and managed detection tools have the ability to detect attack signatures that have no CVE match and no prior pattern in threat feeds?
Do tabletop exercises and incident response plans account for multi-vector, simultaneous attack scenarios?
Do board and executive briefings incorporate updated threat frameworks that reflect the development of Frontier AI models in the threat landscape?
Do the existing cyber insurance policies provide adequate coverage and do the underlying risk questionnaires and policy terms remain accurate in the current threat landscape?