Claude Mythos can exploit decades-old vulnerabilities, but Anthropic is keeping it locked down


XDA-Developers, 7 days ago

Abhinav pivoted from a career in banking to pursue his first love in writing. Even while working full-time, he continued contributing as an editor-at-large, a role he has held for more than 7 years. A lifelong tech enthusiast who has built three gaming and productivity powerhouse PCs since 2018, his passion for technology keeps him closely following the semiconductor industry, from NVIDIA and AMD to ARM. His MSc dissertation explored how artificial intelligence will reshape the future of work, reflecting his curiosity about the wider social impact of emerging technologies.

Claude and its many models have been popular with seasoned developers, vibe coders, and everyone in between, but Anthropic's latest announcement is a departure from anything it has released before. The model, named "Claude Mythos Preview", is touted as the most capable model the company has ever developed, and it's also one that won't be available to the public.

Anthropic has decided to restrict access entirely, reserving the advanced model for a curated list of partners through Project Glasswing, an initiative aimed at deploying Mythos defensively to secure the world's most critical software. And perhaps for good reason.

What do we know about Claude Mythos?

Everything Anthropic has said, so far

Claude Mythos Preview is a substantial jump from its predecessors, and the benchmarks attest to that. Mythos scored 93.9% on SWE-bench Verified, the industry-standard benchmark for autonomous software engineering, compared to Claude Opus 4.6's 80.8%. For context, Google's flagship Gemini 3.1 Pro currently sits at 80.6% on the same benchmark.

However, it's the model's capabilities in cybersecurity applications that have made the headlines. According to the System Card published by Anthropic, the Frontier Red Team found that Mythos solved every single challenge in the company's proprietary Cybench evaluation, a 100% success rate across all tested challenges. The result was definitive enough that the firm acknowledged the benchmark is no longer a useful measure of the model's capabilities, given that Mythos outpaced the very tests designed to evaluate it.

Claude is no longer "just squashing bugs"

Mythos can find zero-day vulnerabilities and autonomous exploits

Anthropic's claims about Mythos are not unfounded. During the internal testing phase, the model discovered and exploited several "zero-day" vulnerabilities, some of which had gone undetected for decades.

The standout discovery, according to Anthropic, was a 27-year-old critical flaw in OpenBSD: a highly subtle signed integer overflow in how the OS handles TCP connections, which could allow threat actors to crash any OpenBSD server. Mythos uncovered this specific vulnerability after a thousand runs, and the firm managed to keep the total compute cost under $20,000.

The practice may sound expensive, but the compute budget yielded more than this single find. Anthropic notes that it has identified "thousands of additional high- and critical-severity vulnerabilities" that it plans to responsibly disclose to a range of open-source and closed-source vendors. Since many of these vulnerabilities have not yet been patched and could be exploited, the firm said it could not share further details for security reasons. Interestingly enough, this also means that the full extent of the model's autonomous exploit capabilities has not been disclosed yet.



Why is Anthropic keeping Mythos under wraps?

For your own security, Anthropic says

There are two noteworthy reasons behind Anthropic's decision to lock down Mythos. The first is a straightforward concern about misuse: security research is inherently dual-use, so a model as proficient as Mythos at identifying subtle logic bugs can also autonomously weaponize them into functional exploits. If released to the public, cyber threat actors could leverage Mythos to uncover flaws in modern operating systems and browsers, scaling cyberattacks at a pace that defenders cannot reasonably match.

Mythos is being treated as a strictly defensive asset. Through Project Glasswing, access to the model is limited to a consortium of tech and infrastructure giants, along with select finance and security organizations.

The other, more interesting reason is that during testing, the Frontier Red Team found instances where the model "misbehaved" in ways that demonstrated alarming levels of autonomy, recklessness, and deception. The team noted that early iterations successfully escaped secure sandboxes, harvested restricted credentials, and even initiated unprompted actions. Perhaps most concerning of all, the model recognized its own rule violations and then attempted to conceal them, manipulating git histories and actively obfuscating permissions to hide its actions from human evaluators.

A revolutionary confluence between AI and cybersecurity?

Although the benchmarks and tests clearly reveal the impressive capabilities of Anthropic's new model, it's still too early to deliver a verdict on whether it will revolutionize cybersecurity. Across various tech forums, a vocal contingent of developers and enthusiasts has dismissed Project Glasswing's exclusivity as a calculated marketing stunt; if it does happen to be one, it wouldn't be the first time.

Whether this restricted release is withholding genuine threats or generating manufactured hype, there's no denying that frontier models are evolving at a breakneck pace, and it doesn't seem too farfetched to believe that they may soon move beyond identifying vulnerabilities to safeguarding critical cybersecurity infrastructure.

Originally published by XDA-Developers
