Anthropic Project Glasswing Targets Safer AI Agents Before They Go Rogue
Market Updates

Anthropic Project Glasswing Targets Safer AI Agents Before They Go Rogue

HotHardware15d ago

Following an announcement that the company had

inked a massive SpaceX compute deal, Anthropic has unveiled Project Glasswing, a collaborative cybersecurity initiative that deploys its most capable AI to harden critical software before bad actors can weaponize the same technology against it.

Powered by the still-unreleased to the public Claude Mythos Preview model, the project has already surfaced more than 10,000 high, and critical-severity zero-day vulnerabilities in just its first month, a number that would presumably take traditional security teams years to uncover.

The initiative involves approximately 50 major tech industry partners, including Microsoft, Google, and Cloudflare. The company emphasizes the stakes are quite real, adding AI coding capabilities have now evolved to a point where they can match or outperform all but the most elite human penetration testers at hunting for software flaws. Project Glasswing is Anthropic's play to ensure those capabilities land in defenders' hands first, and not potential threat actors.

AI-accelerated and enhanced automated security sweeps have flipped the traditional cybersecurity bottleneck. Finding vulnerabilities used to be the hard part. Now it is triaging and deploying fixes fast enough.

Real-world results from Glasswing's partners make that shift feel more real. Cloudflare turned Mythos Preview loose on its critical systems and uncovered roughly 2,000 bugs, 400 of them high or critical severity, with a false-positive rate that beat human penetration testers. Mozilla used the model to audit Firefox 150 and patched 271 vulnerabilities, more than ten times what a comparable scan of Firefox 148 found using Claude Opus 4. As we covered recently, Mozilla has publicly argued that AI could eventually end the era of zero-day vulnerabilities, and results like these seem to backup that claim.

In the open-source arena, Anthropic directed Mythos Preview to scan over 1,000 widely used projects and flagged 6,202 high, or critical-severity vulnerabilities. Independent security firms vetted a large subset and confirmed a 90.6% true-positive rate. Among the finds was a critical flaw in the wolfSSL cryptography library (CVE-2026-5194), which is used by billions of devices worldwide, that would have let attackers forge security certificates and host convincing fake banking or email sites invisible to end users. It has since been patched.

This is not the first time AI has proven its worth as a vulnerability hunter. In late 2024, Google's Big Sleep agent found its first real-world zero-day back, and more recently Google confirmed the first AI-developed zero-day exploit used by actual threat actors in the wild.

Anthropic is explicitly withholding Claude Mythos Preview because the same capabilities that make it a defensive powerhouse also make it a dangerous offensive tool. During closed evaluations, the UK's AI Security Institute confirmed it became the first model to fully solve both of their complex, multi-step cyberattack simulations end to end.

In the wrong hands, that level of autonomous capability could dramatically disrupt banking networks, healthcare systems, or power grids. With six actively exploited zero-days patched in a single Microsoft Patch Tuesday earlier this year, the threat environment speaks for itself.

In the meantime, Anthropic has launched Claude Security, a public beta for Enterprise customers using Claude Opus 4.7 to scan codebases and generate proposed patches, with 2,100 vulnerabilities patched in its first three weeks. The company has also partnered with the Open Source Security Foundation's Alpha-Omega project to help overwhelmed maintainers handle the surge in AI-generated bug disclosures.

The patch bottleneck could be the real story here. Some open-source developers have reportedly asked Anthropic to slow its disclosure rate because they simply cannot keep up. Ideally, defenders need to patch vulnerabilities faster than they can be exploited, which won't always be possible.

Find the full disclosure from Anthropic on its website.

Originally published by HotHardware

Read original source →
SpaceXAnthropic