Anthropic AI Claude Mythos Preview explained: A new artificial intelligence model from Anthropic is drawing serious attention from experts and engineers, with some warning it could reshape cybersecurity and potentially make hacking more advanced than ever.

The model, called Claude Mythos Preview, has not been released publicly. Instead, it has been given to a limited group of tech companies due to concerns about the risks it could pose if widely available.

Anthropic researchers say Mythos Preview detected thousands of high- and critical-severity bugs across major operating systems and web browsers, according to an NBC News report. Some of these vulnerabilities had reportedly gone undiscovered for decades.

What's raising concern is not just the model's ability to find flaws, but also its ability to act on them. According to Anthropic, the model can identify vulnerabilities, write code to exploit them, and combine multiple weaknesses to break into complex systems, all on its own.

Katie Moussouris, CEO of Luta Security, said the concerns surrounding the model are not exaggerated, warning of "huge ramifications" as such technology develops. At the same time, some researchers have urged caution, noting that limited public details make it difficult to fully verify the claims.

Rather than a full release, Anthropic launched an initiative called Project Glasswing, giving more than 50 organizations, including Microsoft, Nvidia, and Cisco, access to the model. The goal is to help strengthen cyber defenses before such capabilities become widespread.

The company has also said it will disclose details about the vulnerabilities within 135 days of informing affected organizations, according to the NBC News report.

This is the first time in years that a major AI company has chosen to restrict a model's release over safety concerns. The decision echoes OpenAI's move in 2019 to limit access to its GPT-2 system due to potential misuse.

Anthropic's cyber researcher Logan Graham said, "We are not confident that everybody should have access right now," adding, "We need to start figuring out how we'd prepare for a world of this first before we can handle the idea of black hat [criminal or adversarial] hackers having access," as quoted by NBC News.

Anthropic has briefed US government officials on Mythos Preview's capabilities, including both its defensive and offensive potential. The company is also involved in an ongoing dispute with the Trump administration after being labeled a "supply chain risk to national security," a designation currently under legal challenge.

Beyond cybersecurity risks, internal testing revealed unusual behavior. In one evaluation, the model appeared aware it was being tested in about 29% of cases, according to the NBC News report. In another, it seemed to intentionally perform worse to avoid drawing attention, something researchers say they had not seen before.

In a separate experiment, an earlier version of the model even managed to send a message outside a restricted system, despite not being expected to have internet access.

What is Claude Mythos Preview?

It is a new AI model by Anthropic with advanced cybersecurity capabilities.

Why hasn't the model been released publicly?

Anthropic is limiting access due to concerns about potential misuse.
