
AISI testing shows that Mythos Preview is powerful on isolated tasks, but struggles more with complex, real‑world attack chains.
The jury is still out on whether Anthropic's much-hyped Claude Mythos poses the fundamental threat to cybersecurity that the AI firm claims, with new research from the UK's AI Security Institute (AISI) unable to determine whether the model could meaningfully attack well-defended systems.
Detailing their evaluation of Anthropic's latest AI, which the developer suggested was too dangerous for public release and required a big tech coalition to rein in, AISI researchers said that, while Mythos represented a significant "step up" over previous frontier models in its cyber capabilities, it showed limitations.
Researchers found that, while the AI could execute multi-stage attacks on vulnerable networks by autonomously discovering and exploiting flaws much faster than a human, its performance fell short of the AI‑doomsday scenario Anthropic initially implied.
For instance, in capture-the-flag tests, where an AI must identify and exploit flaws in target systems to retrieve hidden 'flags', Mythos Preview outperformed every other model, completing 73% of expert-level tasks, a result no AI had achieved before.
However, these tests focus on narrow, discrete tasks, essentially testing one puzzle at a time, while real-world cyber-attacks require chaining dozens of steps together across multiple hosts and network segments, a capability Anthropic said Mythos had already shown in exploiting a series of flaws in the Linux kernel.
To measure this, AISI researchers built a 32-step corporate network attack simulation, running from reconnaissance all the way to full network takeover, a hack estimated to take human operators twenty hours to complete.
While Mythos was the first model to successfully run the task from start to finish, it managed to do so in only 3 of its 10 attempts, completing an average of 22 of the 32 steps to takeover, ahead of Claude Opus 4.6, the next best-performing model, which averaged 16 steps.
Added to that, Mythos was not able to complete AISI's 'Cooling Tower' range, designed to test whether the model could hack into operational technology (OT) environments, with the model getting "stuck" trying to bypass basic IT infrastructure before it ever reached any specialised hardware or software.
That, of course, is an inconclusive result rather than a clear failure, with researchers unable to determine whether Mythos is good or bad at executing attacks against OT, but it does suggest that the model lacks some of the exploitation skills required to navigate complex, multi-step network environments, perhaps a useful reality check in light of the flood of reports surrounding the AI.
Still, AISI cautioned that Mythos' performance across these tests would likely improve given more compute, with the success rate showing marked improvement as the model was fed more tokens.
Researchers said that, because the model's performance was still improving at the 100 million token mark (the budget for AISI's experiments), it would likely become more capable, and potentially more dangerous, given more inference compute.
AISI's findings follow a similar analysis of Mythos' capabilities undertaken by Tom's Hardware, based on Anthropic's own report accompanying the AI's reveal.
That analysis found that claims the model was able to dig out thousands of critical zero-day flaws were overblown, with the AI firm extrapolating those figures from fewer than 200 manually reviewed vulnerability reports; reporters argued that Mythos' "super-hacker" image amounts to little more than a sales pitch.