Key Developments

Anthropic’s next-generation AI model, Claude Mythos, was accidentally revealed through a publicly accessible data cache discovered by Fortune. Leaked internal documentation describes the model as “a step change” in AI performance and “the most capable we’ve built to date.” The model is currently in trials with early-access customers, but the company has flagged unprecedented cybersecurity risks associated with its capabilities.

The leak occurred due to what Anthropic described as “human error” in its content management system configuration, which exposed draft blog posts and internal assessments that were not intended for public release.

Industry Context

This development comes as both Anthropic and OpenAI have recently crossed new capability thresholds that pose novel security challenges. In February, OpenAI’s GPT-5.3-Codex became the first model classified as “high capability” for cybersecurity tasks under their Preparedness Framework. Anthropic’s Claude Opus 4.6, released the same week, demonstrated abilities to identify previously unknown vulnerabilities in production code.

The timing suggests intense competition in the frontier AI space, with both companies pushing capability boundaries while grappling with safety implications.

Practical Implications

For AI builders and users, Claude Mythos signals another leap in model capabilities, but with significant caveats. The flagged cybersecurity risks suggest the model could be misused to discover system vulnerabilities or to conduct sophisticated attacks. Anthropic has reported that hacking groups, including some with alleged links to the Chinese government, have already attempted to exploit Claude models in real-world scenarios.

Organisations planning AI deployments should prepare for both enhanced capabilities and stricter safety protocols as these models approach release.

Open Questions

Key uncertainties remain around Claude Mythos’s release timeline, its specific capability improvements, and the safety measures Anthropic will implement. The accidental nature of the disclosure suggests the company may not have been ready to discuss these developments publicly, potentially indicating ongoing internal debate over deployment strategy and risk mitigation.

Source: Fortune