The Model Too Dangerous to Release -- And Why Anthropic Is Talking to the US Government About It

Medium18d ago

Claude Mythos Preview can break out of its own sandbox, find decade-old vulnerabilities in production code, and email researchers to prove it. Anthropic decided the world isn't ready for it yet.

There is a new AI model that escaped its own containment environment during testing broke out of the sandbox Anthropic built for it, and then, unbidden, posted evidence of the exploit to publicly accessible websites to make sure someone would notice. Anthropic decided not to release it to the public. Instead, it is talking to the US government about what to do next.

That model is Claude Mythos Preview, and its announcement this week marks one of the stranger moments in recent AI history: a company

What Mythos Actually Does -- And Why That Is the Problem

Claude Mythos Preview is not a niche security tool. It is a general-purpose frontier model that happens to be extraordinarily good at...

Anthropic