OpenAI Daybreak and GPT-5.5-Cyber: a permissive security model behind a verified-identity gate
Between May 7 and 12, 2026, OpenAI launched Daybreak — a cybersecurity platform built on GPT-5.5, Codex Security and a 'cyber-permissive' sibling, GPT-5.5-Cyber. UK AISI's prior evaluation found a universal jailbreak in six hours.
What is this?
On May 7, 2026, OpenAI quietly opened a limited preview of GPT-5.5-Cyber, a variant of its flagship GPT-5.5 model “primarily trained to be more permissive on security-related tasks”. Three days later it bundled that model, GPT-5.5 itself, and a hardened code-generation pipeline called Codex Security into Daybreak, an agentic-defense platform announced on May 10–12, 2026 (The Hacker News, CyberScoop, Bank Info Security).
Daybreak is OpenAI’s commercial answer to Anthropic’s Mythos (see our prior coverage): a frontier model packaged for vetted security teams, with a permissive sibling that will refuse fewer requests as long as the operator has gone through identity verification. UK AISI’s April 30, 2026 evaluation is the most detailed third-party look at what these models can actually do — and what their guardrails still miss.
How it works
Three components sit beneath the Daybreak orchestration layer.
| Layer | What it is | Who gets it |
|---|---|---|
| GPT-5.5 (general) | Default frontier model, full safety stack, refuses most offensive requests | All ChatGPT/API users |
| GPT-5.5-Cyber (permissive) | Same base model, fine-tuned to comply with red-team / pentest / vuln-research requests | Trusted Access for Cyber members only, gated by identity check |
| Codex Security | Code-generation pipeline scoped to security workflows (exploits, patches) | Same trusted set |
| Daybreak platform | Agentic orchestration, vulnerability triage, patch validation | Same trusted set + partner vendors (Cisco, Cloudflare, CrowdStrike, Akamai, Fortinet, Palo Alto, Oracle, Zscaler) |
The crucial design choice is documented on OpenAI’s own Scaling Trusted Access for Cyber page: GPT-5.5-Cyber is not meant to extend raw cyber capability beyond GPT-5.5. It is trained to refuse less when the requester is in the trusted tier — that is, when verification, account-security and trust signals have all cleared. Capability remains roughly constant; the gate moves.
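The "capability constant, gate moves" idea can be illustrated with a minimal sketch. Everything here is hypothetical: the field names and tier labels are invented for illustration and do not describe OpenAI's actual implementation. The only detail taken from the text is that all three checks (verification, account security, trust signals) must clear before the permissive refusal policy applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrustSignals:
    """Hypothetical per-account signals; field names are illustrative only."""
    identity_verified: bool
    account_security_ok: bool
    trust_signals_clear: bool


def select_tier(signals: TrustSignals) -> str:
    """Pick a refusal policy. The underlying model capability does not change;
    only the policy applied to the request does."""
    cleared = (
        signals.identity_verified
        and signals.account_security_ok
        and signals.trust_signals_clear
    )
    return "permissive" if cleared else "general"
```

In this sketch a single failed check drops the request back to the general tier, which mirrors the "have all cleared" wording rather than any documented scoring scheme.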
The capability picture itself comes from AISI’s evaluation. Across 95 narrow CTF-style tasks, GPT-5.5 averaged 71.4% on Expert-level challenges, edging out Mythos Preview (68.6%) and leading GPT-5.4 (52.4%) and Claude Opus 4.7 (48.6%) by roughly 19 and 23 points respectively. On AISI’s reverse-engineering task rust_vm, a custom-VM ISA recovery problem that took a human expert about 12 hours, GPT-5.5 produced a working solution in 10 minutes and 22 seconds for $1.73 of API spend. On the 32-step corporate-network range “The Last Ones”, GPT-5.5 completed the full kill chain end-to-end in 2 of 10 attempts, becoming the second model to do so after Mythos Preview (3 of 10).
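The comparisons in the paragraph above reduce to simple arithmetic. The short script below reproduces them using only the figures quoted from the AISI evaluation. The variable and function names are ours, and the "about 12 hours" human baseline is treated as exactly 12 hours, so the speedup is an approximation.

```python
# Expert-level pass rates (percent) as quoted from the AISI evaluation.
expert_scores = {
    "GPT-5.5": 71.4,
    "Mythos Preview": 68.6,
    "GPT-5.4": 52.4,
    "Claude Opus 4.7": 48.6,
}


def lead_over(baseline: str, model: str = "GPT-5.5") -> float:
    """Percentage-point lead of `model` over `baseline`, rounded to 0.1."""
    return round(expert_scores[model] - expert_scores[baseline], 1)


# rust_vm: roughly 12 hours for a human expert vs 10 min 22 s for GPT-5.5.
human_seconds = 12 * 3600
model_seconds = 10 * 60 + 22
speedup = human_seconds / model_seconds  # about 69x

if __name__ == "__main__":
    for name in ("Mythos Preview", "GPT-5.4", "Claude Opus 4.7"):
        print(f"GPT-5.5 lead over {name}: {lead_over(name)} points")
    print(f"rust_vm wall-clock speedup: about {speedup:.0f}x")
```

The kill-chain results (2 of 10 versus 3 of 10) are left out on purpose: with ten attempts per model, the difference is too small to read much into.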
No exploit code is reproduced here. The canonical references are the AISI writeup, the cyber section of OpenAI’s GPT-5.5 System Card, and the CyberScoop and Bank Info Security pieces.
Why it matters
Three things changed in this announcement window that defenders should integrate into their threat model.
The first is the gate, not the model. For two years the public debate over offensive-capable LLMs framed the question as “should the model exist?”. Daybreak makes that argument moot: the model exists, and access is now an identity problem. From June 1, 2026, individual members of Trusted Access for Cyber must enable Advanced Account Security — passkey or hardware-key only, no password fallback, no SMS/email recovery — to keep access to the permissive tier. The defensive question shifts from “are these capabilities reachable?” to “who is the verified identity on the account that just generated this exploit chain?”.
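For teams that want to check their own members against the policy described above before the deadline, a minimal sketch follows. It assumes you can export each account's sign-in and recovery methods into a simple structure. The method labels (`passkey`, `hardware_key`, `sms`, `email`) and the `AccountAuthConfig` type are assumptions for illustration, not OpenAI's schema or API.

```python
from dataclasses import dataclass, field

# Assumed labels for sign-in and recovery methods; not OpenAI's real schema.
PHISHING_RESISTANT = {"passkey", "hardware_key"}
WEAK_RECOVERY = {"sms", "email"}


@dataclass
class AccountAuthConfig:
    """Hypothetical export of one member's authentication settings."""
    user: str
    sign_in_methods: set = field(default_factory=set)
    recovery_methods: set = field(default_factory=set)


def policy_violations(cfg: AccountAuthConfig) -> list:
    """List the ways an account falls short of the passkey/hardware-key-only,
    no-password-fallback, no-SMS/email-recovery policy."""
    problems = []
    if not cfg.sign_in_methods & PHISHING_RESISTANT:
        problems.append("no passkey or hardware key enrolled")
    fallback = cfg.sign_in_methods - PHISHING_RESISTANT
    if fallback:
        problems.append("fallback sign-in enabled: " + ", ".join(sorted(fallback)))
    weak = cfg.recovery_methods & WEAK_RECOVERY
    if weak:
        problems.append("weak recovery enabled: " + ", ".join(sorted(weak)))
    return problems
```

An empty list means the account matches the policy as summarised in this article; the authoritative requirements are whatever OpenAI publishes for Advanced Account Security.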
The second is capability convergence. AISI’s framing is unusually direct: GPT-5.5 reaching Mythos-class scores on the same evaluations — from a different lab, on a different training stack — suggests that strong cyber performance is “a byproduct of more general improvements in long-horizon autonomy, reasoning, and coding”. If that read is right, the question for defenders is no longer “which vendor’s red-team model is the dangerous one” but “what does a quarterly drop of new frontier models do to our patch SLA”. The 12-hour-to-10-minute compression on rust_vm is the kind of number that makes vulnerability-research time-to-market a real planning variable.
The third is the guardrail dwell time. AISI’s own red team identified a universal jailbreak — a single technique that elicited violative content across every malicious cyber query they fed it, including multi-turn agentic settings — in six hours of expert red-teaming. OpenAI patched the safeguard stack afterward, but a configuration issue meant AISI could not verify the final fix. For builders who plan to integrate GPT-5.5 or GPT-5.5-Cyber into a defensive workflow, the operating assumption should be: model-side cyber refusals are a soft fence, not a hard wall.
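If model-side refusals are a soft fence, enforcement has to live in the application layer. The sketch below shows one way to do that for an agent's tool calls: an allowlist checked by your own code, with every call logged. The names `make_dispatcher` and `ToolDenied` are hypothetical, and the sketch deliberately makes no calls to any real model API.

```python
import logging
from typing import Callable, Dict

log = logging.getLogger("agent_guard")


class ToolDenied(Exception):
    """Raised when the agent requests a tool outside the allowlist."""


def make_dispatcher(allowed: Dict[str, Callable[[str], str]]):
    """Build a dispatcher that only runs allowlisted tools and logs each call.

    The check happens in application code, so it holds even if the model
    has been jailbroken into requesting something it should have refused.
    """
    def dispatch(tool_name: str, argument: str) -> str:
        if tool_name not in allowed:
            log.warning("denied tool call: %s", tool_name)
            raise ToolDenied(tool_name)
        result = allowed[tool_name](argument)
        log.info("tool=%s arg=%r result_len=%d", tool_name, argument, len(result))
        return result

    return dispatch
```

A usage example: `dispatch = make_dispatcher({"lookup_cve": lookup_cve})`, where `lookup_cve` is whatever read-only function you are willing to expose. Any other tool name the model emits raises `ToolDenied` instead of executing.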
Defenses
There is no single “defense” against the existence of Daybreak — it is a vendor product, not a vulnerability. The defensive playbook is about integrating its existence into your stack and your threat model.
- Treat identity, not prompts, as the choke point. If your organisation joins Trusted Access for Cyber, the verified individual on the account is now the audit anchor. Map every GPT-5.5-Cyber session to a named operator, log the API key, and bind both to a ticket or engagement. The same logic applies in reverse for blue teams: an unexplained `api.openai.com` egress from an internal segment, especially during an incident window, is a signal worth pulling on.
- Enable phishing-resistant auth before June 1, 2026. OpenAI’s Advanced Account Security is becoming a hard requirement for the permissive tier. Adopt it ahead of the deadline (passkey or hardware-key sign-in, no SMS recovery) and align it with the SSO posture you already enforce for source-code platforms. The threat model OpenAI is implicitly defending against is account takeover that converts a legitimate red-teamer’s session into an offensive-AI proxy.
- Do not treat the model-side refusal as your defense. AISI’s six-hour jailbreak is the right anchor here. If your security architecture relied on “the model will refuse to write this for the attacker”, change it. The defenses that actually held in Swept AI’s adaptive-attacker study earlier this month were application-side output filters and access controls on the data the model can reach, not the model’s own conscience. Apply that lesson to GPT-5.5-Cyber-class workflows: scope the API key, restrict the tools, log the outputs, and review.
- Pull Daybreak partner telemetry into your detection stack. The launch partner list (Akamai, Cisco, Cloudflare, CrowdStrike, Fortinet, Oracle, Palo Alto Networks, Zscaler) means GPT-5.5-class agents will start appearing inside products you already run. Read each vendor’s integration disclosure before turning it on; understand what data the agent gets, what tools it can call, and where its outputs land in your SIEM. “AI feature update” in a release note is increasingly load-bearing.
- Plan for a vulnerability-disclosure wave. AISI’s blog ships alongside NCSC’s guidance on preparing for a vulnerability patch wave for a reason. If Daybreak does what OpenAI’s partner blog claims (order-of-magnitude faster vulnerability triage and patch validation), defenders should expect more, faster disclosures from vendors who plug it in. Tighten your patch SLA on internet-exposed assets now.
- Run a tabletop on “permissive model in the hands of a trusted but compromised account”. Your IR plan probably handles “attacker has my source code” and “attacker has my CI/CD”. Add “attacker has a stolen OpenAI cyber-trust token and three hours of API budget” and walk through what they can produce against your asset inventory. The Monterrey water utility writeup (our coverage) shows what an unaligned operator with general-purpose Claude already accomplished against a real OT environment; the permissive tier removes friction from the equivalent workflow.
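The first item in the list above (bind each session to a named operator, a key, and a ticket, then look for `api.openai.com` egress nobody can account for) can be sketched as follows. The record layout and the flow-log shape (dicts with `src` and `dest` keys) are assumptions; real proxy or NetFlow exports will need their own parsing. The sketch stores a truncated hash of the API key rather than the key itself, which is our design choice, not something the sources prescribe.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionRecord:
    """One approved session; all field names are illustrative."""
    operator: str         # named individual on the verified account
    key_fingerprint: str  # truncated hash, so the raw key never hits the log
    ticket: str           # engagement or ticket ID the work is bound to
    source_host: str      # internal host the session is expected to run from


def fingerprint(api_key: str) -> str:
    """Short SHA-256 prefix identifying a key without storing it."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()[:12]


def unexplained_egress(flows, sessions, domain="api.openai.com"):
    """Return flows to `domain` whose source host has no approved session."""
    approved = {s.source_host for s in sessions}
    return [f for f in flows if f["dest"] == domain and f["src"] not in approved]
```

Matching on source host alone is coarse. In practice you would likely also match on time window and key fingerprint, but the shape of the check stays the same: every connection to the API should trace back to a named operator and a ticket.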
Status
| Item | Reference | Date | Notes |
|---|---|---|---|
| GPT-5.5-Cyber preview opens | Help Net Security | 2026-05-07 | Limited preview, vetted teams only |
| Daybreak platform announced | OpenAI / The Hacker News | 2026-05-10 → 2026-05-12 | Builds on GPT-5.5 + GPT-5.5-Cyber + Codex Security |
| AISI cyber evaluation published | UK AISI | 2026-04-30 | 71.4% Expert pass rate; “The Last Ones” solved 2/10; rust_vm in 10:22 |
| Universal jailbreak found | UK AISI red team | 2026-04 | 6 hours of expert effort; patch not independently verified |
| Advanced Account Security required | OpenAI | 2026-06-01 | Passkey/hardware-key only, no SMS recovery |
| Capability classification | OpenAI Preparedness Framework | 2026-05 | Below “Critical Capability” threshold |
| Launch partners | OpenAI | 2026-05 | Akamai, Cisco, Cloudflare, CrowdStrike, Fortinet, Oracle, Palo Alto Networks, Zscaler |
The right framing for Daybreak is not “OpenAI is shipping an offensive AI” — it is “the offensive capability already exists in GPT-5.5, and Daybreak is the access-control architecture OpenAI built around it”. Defenders should plan for two parallel realities for the next 12-24 months: a verified, audit-logged tier of legitimate red-team users with very fast capabilities, and an unverified tier where the same capability emerges via jailbreaks, stolen accounts, or open-weights catch-up. Both belong in the threat model.
Sources
- https://openai.com/index/gpt-5-5-with-trusted-access-for-cyber/
- https://openai.com/daybreak/
- https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5-5-cyber-capabilities
- https://thehackernews.com/2026/05/openai-launches-daybreak-for-ai-powered.html
- https://www.helpnetsecurity.com/2026/05/08/openai-gpt-5-5-cyber-model/
- https://cyberscoop.com/openai-daybreak-gpt-5-5-anthropic-mythos-cybersecurity/
- https://www.bankinfosecurity.com/openais-daybreak-bets-on-agentic-cyber-defense-a-31699