Cyber Model Rises, Rival Grounded: OpenAI's 85.6% Beats the One the Feds Pulled 🛡️

OpenAI's GPT-5.5-Cyber reached 85.6% on the CyberGym vulnerability-reproduction benchmark from UC Berkeley, surpassing Anthropic's Mythos 5 at 83.8% and the more widely available Claude Opus 4.7 at 73.1%, according to figures published on the CyberGym leaderboard. The full launch of the model was announced on June 22 as part of OpenAI's Daybreak cyber defense program. CyberGym presents AI agents with 1,507 known software vulnerabilities drawn from 188 open-source projects and scores them on how many they can reproduce in a controlled environment.

The result is notable as much for its context as for its margin of less than two points. Anthropic's Mythos 5 and Fable 5 were pulled offline on June 12 after the Trump administration issued an emergency export control directive citing national security. The order followed the discovery of a jailbreak, a technique for bypassing an AI model's built-in safety limits, and Anthropic's statement that it could not verify user nationality at scale. The company responded by disabling both models globally.

Anthropic had publicly amplified concerns about its own system in the months leading up to the action. CEO Dario Amodei published an essay on June 10 comparing frontier AI models to aircraft that safety regulators should be able to ground if they fail audits. Days earlier, the company had apologized for a hidden filter in Fable 5 that silently degraded outputs for users suspected of building competing AI, and reversed the policy. Anthropic is negotiating with the Commerce Department and continuing litigation against the Trump administration.

While Mythos remains offline, OpenAI has expanded Daybreak through cybersecurity partnerships with Australia, Canada, France, Germany, Japan, South Korea, and EU institutions including the European Union Agency for Cybersecurity. Twenty-eight security firms, including CrowdStrike, Cisco, and Cloudflare, have joined its Cyber Partner Program to embed GPT-5.5 into products offered to vetted customers. OpenAI's blog states that its Codex Security tool has scanned over 30 million commits across 30,000 codebases and logged more than 500,000 fixed vulnerabilities since launching in March, and the company has introduced a "Patch the Planet" initiative aimed at vulnerabilities in widely used open-source projects. GPT-5.5-Cyber is restricted to verified security professionals and is not available for general use.
