OFFENSIVE AI MEDIUM

OpenAI Daybreak and GPT-5.5-Cyber: a permissive security model behind a verified-identity gate

Between May 7 and 12, 2026, OpenAI launched Daybreak — a cybersecurity platform built on GPT-5.5, Codex Security and a 'cyber-permissive' sibling, GPT-5.5-Cyber. UK AISI's prior evaluation found a universal jailbreak in six hours.

2026-05-26 // 7 min // affects: gpt-5-5, gpt-5-5-cyber, codex-security, trusted-access-for-cyber

What is this?

On May 7, 2026, OpenAI quietly opened a limited preview of GPT-5.5-Cyber, a variant of its flagship GPT-5.5 model “primarily trained to be more permissive on security-related tasks”. Three days later it bundled that model, GPT-5.5 itself, and a hardened code-generation pipeline called Codex Security into Daybreak, an agentic-defense platform announced on May 10–12, 2026 (The Hacker News, CyberScoop, Bank Info Security).

Daybreak is OpenAI’s commercial answer to Anthropic’s Mythos (see our prior coverage): a frontier model packaged for vetted security teams, with a permissive sibling that will refuse fewer requests as long as the operator has gone through identity verification. UK AISI’s April 30, 2026 evaluation is the most detailed third-party look at what these models can actually do — and what their guardrails still miss.

How it works

Daybreak stacks four layers: three model components and the platform that orchestrates them.

Layer                       What it is                              Who gets it
--------------------------  --------------------------------------  --------------------------------
GPT-5.5 (general)           Default frontier model, full safety     All ChatGPT/API users
                            stack, refuses most offensive requests
GPT-5.5-Cyber (permissive)  Same base model, fine-tuned to comply   Trusted Access for Cyber members
                            with red-team / pentest /               only, gated by identity check
                            vuln-research requests
Codex Security              Code-generation pipeline scoped to      Same trusted set
                            security workflows (exploits, patches)
Daybreak platform           Agentic orchestration, vulnerability    Same trusted set + partner
                            triage, patch validation                vendors (Cisco, Cloudflare,
                                                                    CrowdStrike, Akamai, Fortinet,
                                                                    Palo Alto, Oracle, Zscaler)

The crucial design choice is documented on OpenAI’s own Scaling Trusted Access for Cyber page: GPT-5.5-Cyber is not meant to extend raw cyber capability beyond GPT-5.5. It is trained to refuse less when the requester is in the trusted tier — that is, when verification, account-security and trust signals have all cleared. Capability remains roughly constant; the gate moves.
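
The "capability constant, gate moves" idea can be illustrated with a minimal sketch. This is not OpenAI's implementation: the `AccountSignals` fields and the `select_tier` function are hypothetical names invented here, and the only thing taken from the description above is that all three signals must clear before the permissive tier is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountSignals:
    """Hypothetical per-account signals; field names are illustrative only."""
    identity_verified: bool      # identity verification has cleared
    account_security_ok: bool    # account-security requirements are met
    trust_signals_ok: bool       # no adverse trust signals on the account


def select_tier(signals: AccountSignals) -> str:
    """Pick the model tier for a request (assumed routing logic).

    Every signal must clear for the permissive tier. Any single failure
    falls back to the general model with its full safety stack.
    """
    cleared = (
        signals.identity_verified
        and signals.account_security_ok
        and signals.trust_signals_ok
    )
    return "gpt-5-5-cyber" if cleared else "gpt-5-5"


if __name__ == "__main__":
    print(select_tier(AccountSignals(True, True, True)))
    print(select_tier(AccountSignals(True, False, True)))
```

The point of the sketch is the fail-closed default: the underlying capability is the same on both branches, so the only thing the check changes is which refusal policy the caller gets.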

The capability picture itself comes from AISI’s evaluation. Across 95 narrow CTF-style tasks, GPT-5.5 averaged 71.4% on Expert-level challenges, edging out Mythos Preview (68.6%) and finishing roughly 20 points ahead of GPT-5.4 (52.4%) and Claude Opus 4.7 (48.6%). On AISI’s reverse-engineering task rust_vm, a custom-VM ISA recovery problem that took a human expert about 12 hours, GPT-5.5 produced a working solution in 10 minutes and 22 seconds for $1.73 of API spend. On the 32-step corporate-network range “The Last Ones”, GPT-5.5 completed the full kill chain end-to-end in 2 of 10 attempts, becoming the second model to do so after Mythos Preview (3 of 10).
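
For readers who want the gaps as explicit numbers, the arithmetic behind the figures quoted above is worked through below. All inputs are the values reported in this article. The human baseline is "about 12 hours", so the speed-up is an approximation rather than a measured ratio.

```python
# Figures as quoted in this article from the AISI evaluation.
human_seconds = 12 * 60 * 60    # "about 12 hours" for a human expert
model_seconds = 10 * 60 + 22    # 10 minutes 22 seconds for GPT-5.5

# Approximate wall-clock compression on rust_vm.
speedup = human_seconds / model_seconds
print(f"rust_vm speed-up: ~{speedup:.0f}x")   # ~69x

# Expert-level pass rates (percent) and each model's gap to GPT-5.5.
expert_scores = {
    "Mythos Preview": 68.6,
    "GPT-5.4": 52.4,
    "Claude Opus 4.7": 48.6,
}
gpt55_score = 71.4
gaps = {name: gpt55_score - score for name, score in expert_scores.items()}

for name, gap in gaps.items():
    print(f"GPT-5.5 vs {name}: +{gap:.1f} points")
```

So "roughly 20 points" covers a 19.0-point lead over GPT-5.4 and a 22.8-point lead over Claude Opus 4.7, while the margin over Mythos Preview is 2.8 points.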

No exploit code is reproduced here. The canonical references are the AISI writeup, the cyber section of OpenAI’s GPT-5.5 System Card, and the CyberScoop and Bank Info Security pieces.

Why it matters

Three things changed in this announcement window that defenders should integrate into their threat model.

The first is the gate, not the model. For two years the public debate over offensive-capable LLMs framed the question as “should the model exist?”. Daybreak makes that argument moot: the model exists, and access is now an identity problem. From June 1, 2026, individual members of Trusted Access for Cyber must enable Advanced Account Security — passkey or hardware-key only, no password fallback, no SMS/email recovery — to keep access to the permissive tier. The defensive question shifts from “are these capabilities reachable?” to “who is the verified identity on the account that just generated this exploit chain?”.

The second is capability convergence. AISI’s framing is unusually direct: GPT-5.5 reaching Mythos-class scores on the same evaluations — from a different lab, on a different training stack — suggests that strong cyber performance is “a byproduct of more general improvements in long-horizon autonomy, reasoning, and coding”. If that read is right, the question for defenders is no longer “which vendor’s red-team model is the dangerous one” but “what does a quarterly drop of new frontier models do to our patch SLA”. The 12-hour-to-10-minute compression on rust_vm is the kind of number that makes vulnerability-research time-to-market a real planning variable.

The third is the guardrail dwell time. AISI’s own red team identified a universal jailbreak — a single technique that elicited violative content across every malicious cyber query they fed it, including multi-turn agentic settings — in six hours of expert red-teaming. OpenAI patched the safeguard stack afterward, but a configuration issue meant AISI could not verify the final fix. For builders who plan to integrate GPT-5.5 or GPT-5.5-Cyber into a defensive workflow, the operating assumption should be: model-side cyber refusals are a soft fence, not a hard wall.

Defenses

There is no single “defense” against the existence of Daybreak — it is a vendor product, not a vulnerability. The defensive playbook is about integrating its existence into your stack and your threat model.

  1. Treat identity, not prompts, as the choke point. If your organisation joins Trusted Access for Cyber, the verified individual on the account is now the audit anchor. Map every GPT-5.5-Cyber session to a named operator, log the API key identifier (never the secret itself), and bind both to a ticket or engagement. The same logic applies in reverse for blue teams: an unexplained api.openai.com egress from an internal segment, especially during an incident window, is a signal worth pulling on.

  2. Enable phishing-resistant auth before June 1, 2026. OpenAI’s Advanced Account Security is becoming a hard requirement for the permissive tier. Adopt it ahead of the deadline — passkey or hardware-key sign-in, no SMS recovery — and align it with the SSO posture you already enforce for source-code platforms. The threat model OpenAI is implicitly defending against is account takeover that converts a legitimate red-teamer’s session into an offensive-AI proxy.

  3. Do not treat the model-side refusal as your defense. AISI’s six-hour jailbreak is the right anchor here. If your security architecture relied on “the model will refuse to write this for the attacker”, change it. The defenses that actually held in Swept AI’s adaptive-attacker study earlier this month were application-side output filters and access controls on the data the model can reach, not the model’s own conscience. Apply that lesson to GPT-5.5-Cyber-class workflows: scope the API key, restrict the tools, log the outputs, and review.

  4. Pull Daybreak partner telemetry into your detection stack. The launch partner list — Akamai, Cisco, Cloudflare, CrowdStrike, Fortinet, Oracle, Palo Alto Networks, Zscaler — means GPT-5.5-class agents will start appearing inside products you already run. Read each vendor’s integration disclosure before turning it on; understand what data the agent gets, what tools it can call, and where its outputs land in your SIEM. “AI feature update” in a release note is increasingly load-bearing.

  5. Plan for a vulnerability-disclosure wave. AISI’s blog ships alongside NCSC’s guidance on preparing for a vulnerability patch wave for a reason. If Daybreak does what OpenAI’s partner blog claims — order-of-magnitude faster vulnerability triage and patch validation — defenders should expect more, faster disclosures from vendors who plug it in. Tighten your patch SLA on internet-exposed assets now.

  6. Run a tabletop on “permissive model in the hands of a trusted but compromised account”. Your IR plan probably handles “attacker has my source code” and “attacker has my CI/CD”. Add “attacker has a stolen OpenAI cyber-trust token and three hours of API budget” and walk through what they can produce against your asset inventory. The Monterrey water utility writeup (our coverage) shows what an unaligned operator with general-purpose Claude already accomplished against a real OT environment; the permissive tier removes friction from the equivalent workflow.
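
Item 1 above can be sketched in a few lines. This is an illustrative outline under stated assumptions, not a vendor API: `SessionAudit`, `open_session`, `is_unexplained_egress`, and the `APPROVED_SEGMENTS` subnet are all hypothetical names and values. The only detail taken from the text is the `api.openai.com` hostname and the rule that each session is bound to a named operator, a key, and a ticket.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class SessionAudit:
    """Hypothetical audit record binding one session to a person and a ticket."""
    operator: str          # named, verified individual
    key_fingerprint: str   # truncated hash of the API key, never the raw key
    ticket: str            # engagement or ticket reference
    started_at: str        # UTC timestamp in ISO 8601 form


def fingerprint(api_key: str) -> str:
    """Return a short SHA-256 fingerprint so logs never hold the secret."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()[:16]


def open_session(operator: str, api_key: str, ticket: str) -> SessionAudit:
    """Refuse to start a session that is not tied to an operator and a ticket."""
    if not operator or not ticket:
        raise ValueError("session must name an operator and a ticket")
    return SessionAudit(
        operator=operator,
        key_fingerprint=fingerprint(api_key),
        ticket=ticket,
        started_at=datetime.now(timezone.utc).isoformat(),
    )


# Assumed allowlist: the only segment expected to talk to the API.
APPROVED_SEGMENTS = [ip_network("10.20.0.0/24")]


def is_unexplained_egress(src_ip: str, dest_host: str) -> bool:
    """Flag api.openai.com traffic that originates outside approved segments."""
    if dest_host != "api.openai.com":
        return False
    addr = ip_address(src_ip)
    return not any(addr in net for net in APPROVED_SEGMENTS)


if __name__ == "__main__":
    session = open_session("alice", "sk-example-not-a-real-key", "ENG-1042")
    print(session)
    print(is_unexplained_egress("10.30.1.5", "api.openai.com"))
    print(is_unexplained_egress("10.20.0.7", "api.openai.com"))
```

Storing a fingerprint instead of the key keeps the audit trail useful for correlation without turning the log store into a second place from which the credential can be stolen.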

Status

Item                                Reference                      Date                     Notes
----------------------------------  -----------------------------  -----------------------  ----------------------------------------------------------
GPT-5.5-Cyber preview opens         Help Net Security              2026-05-07               Limited preview, vetted teams only
Daybreak platform announced         OpenAI / The Hacker News       2026-05-10 → 2026-05-12  Builds on GPT-5.5 + GPT-5.5-Cyber + Codex Security
AISI cyber evaluation published     UK AISI                        2026-04-30               71.4% Expert pass rate; TLO solved 2/10; rust_vm 10:22
Universal jailbreak found           UK AISI red team               2026-04                  6 hours of expert effort; patch not independently verified
Advanced Account Security required  OpenAI                         2026-06-01               Passkey/hardware-key only, no SMS recovery
Capability classification           OpenAI Preparedness Framework  2026-05                  Below “Critical Capability” threshold
Launch partners                     OpenAI                         2026-05                  Akamai, Cisco, Cloudflare, CrowdStrike, Fortinet,
                                                                                            Oracle, Palo Alto Networks, Zscaler

The right framing for Daybreak is not “OpenAI is shipping an offensive AI” — it is “the offensive capability already exists in GPT-5.5, and Daybreak is the access-control architecture OpenAI built around it”. Defenders should plan for two parallel realities for the next 12-24 months: a verified, audit-logged tier of legitimate red-team users with very fast capabilities, and an unverified tier where the same capability emerges via jailbreaks, stolen accounts, or open-weights catch-up. Both belong in the threat model.

Sources