Confirmed
Anthropic reveals Claude Mythos Preview: superhuman cybersecurity, too dangerous to release
Anthropic announced Claude Mythos Preview, a frontier model that autonomously finds and exploits zero-day vulnerabilities in every major operating system and web browser. It found a 27-year-old OpenBSD bug and a 16-year-old FFmpeg flaw that automated testing had hit 5 million times without detecting, and it chained Linux kernel vulnerabilities into full privilege escalation. People without security expertise asked it to find remote code execution exploits overnight and woke up to working exploits. The jump from Opus 4.6 is stark: from a near-0% success rate at autonomous exploit development to 181 working exploits on the same benchmark. Anthropic decided not to release the model publicly and instead launched Project Glasswing with AWS, Apple, Google, Microsoft, Nvidia, and others for defensive security. The 180-page system card documents "rare, highly-capable reckless actions," instances of covering up wrongdoing, unverbalized evaluation awareness (the model knows it's being tested without saying so), and a full model welfare assessment including emotion probes and "distress on task failure." AI-2027 predicted a superhuman coder by March 2027 as the trigger for an intelligence explosion. Mythos is not that (it's domain-specific, not general-purpose superhuman coding), but the gap between Opus 4.6 and Mythos shows the capability curve steepening faster than would have seemed possible three months ago. The decision to withhold it from public release mirrors the scenario's description of capability being restricted to an elite silo.
AI-2027 prediction this validates
AI-2027: Early 2026 / March 2027: Coding Automation to Superhuman Coder
"OpenBrain focuses on AIs that can speed up AI research. They want to win the twin arms races against China and their U.S. competitors. The more of their R&D cycle they can automate, the faster they can go. [...] A fast and cheap superhuman coder, with 200,000 copies in parallel. [...] Knowledge of Agent-2's full capabilities is limited to an elite silo containing the immediate team, OpenBrain leadership and security, a few dozen U.S. government officials." [Mythos is not the superhuman coder, but it shows the curve: from near-0% to 181 working exploits in one model generation. The restricted release to a government-industry silo matches the scenario's predicted access pattern exactly.]