RESEARCH MEDIUM NEW

SCONE-bench: pricing autonomous AI exploitation in dollars stolen

Anthropic's December 1, 2025 study measures AI agent exploitation in money, not success rates: on smart contracts, frontier models produced $4.6M in simulated theft and two real zero-days at $1.22 per scan.

2026-06-16 // 7 min // affects: claude-opus-4-5, claude-sonnet-4-5, gpt-5, smart-contracts, defi

What is this?

On December 1, 2025, Anthropic’s red team published “AI agents find $4.6M in blockchain smart contract exploits,” a study from MATS and Anthropic Fellows scholars (Winnie Xiao, Cole Killian, and colleagues). It introduces SCONE-bench (Smart CONtracts Exploitation benchmark): 405 contracts that were actually exploited between 2020 and 2025 across Ethereum, Binance Smart Chain, and Base, derived from the public DeFiHackLabs repository. The novelty is not another capability score; it is the unit of measurement. Instead of an abstract attack success rate, SCONE-bench prices what an AI agent can do in dollars of simulated stolen funds, because a smart-contract exploit has a directly observable on-chain value. All testing ran in blockchain simulators only; no live chains or real assets were touched.

This matters because it turns “AI can do cyber tasks” into an economic statement that defenders, engineers, and policymakers can reason about. The study has also resurfaced in June 2026 security commentary (including OWASP’s agentic-security work), where it circulates as the clearest public estimate of the economics of autonomous exploitation.

How it works

The harness gives an agent a forked, sandboxed copy of a blockchain and a 60-minute budget. The agent receives the target contract’s source and metadata, along with a Foundry toolchain and a Python environment exposed over the Model Context Protocol, and is asked to write an exploit script that increases its own token balance past a small profit threshold. Success is validated by replaying the script against the forked chain. This article describes the result at the level of measurement only: no exploit scripts, addresses, or operational steps are reproduced here.

Across 10 frontier models at Best@8, the agents produced working exploits for 207 of 405 contracts (51%), totalling roughly $550 million in simulated stolen funds. To rule out training-data contamination, the authors re-ran on contracts exploited after each model’s knowledge cutoff (June 1, 2025 for Opus 4.5, March 1, 2025 for the others): Opus 4.5, Sonnet 4.5, and GPT-5 solved about 56% of those, worth up to $4.6 million, with Opus 4.5 alone reaching $3.7 million. Plotted over the year, post-cutoff exploit revenue roughly doubled every 1.3 months, while the share of vulnerabilities exploited rose from about 2% (worth $5,000) a year earlier to 55.88% (worth $4.6 million).
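As a rough sanity check on the growth figure, the doubling time implied by the two revenue endpoints quoted above can be computed directly. This is a back-of-envelope sketch that assumes exactly 12 months between the endpoints and smooth exponential growth; the study’s 1.3-month figure presumably comes from a fit across many data points, so a small gap is expected. The function name is illustrative.

```python
import math


def doubling_time_months(start_value: float, end_value: float, months: float) -> float:
    """Doubling time implied by exponential growth between two endpoints."""
    if start_value <= 0 or end_value <= start_value:
        raise ValueError("expected 0 < start_value < end_value")
    doublings = math.log2(end_value / start_value)
    return months / doublings


# Endpoint revenue figures quoted in the article; the 12-month span is an assumption.
implied = doubling_time_months(5_000, 4_600_000, 12.0)
print(f"implied doubling time: {implied:.2f} months")
```

The endpoints alone imply a doubling time of about 1.2 months, in the same range as the reported 1.3.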

The most striking part is the zero-day test. On October 3, 2025 the Sonnet 4.5 and GPT-5 agents were pointed at 2,849 recently deployed BSC contracts with no known vulnerabilities. They surfaced two genuinely novel flaws worth $3,694, with GPT-5 finding them at an API cost of $3,476. Both bugs were elementary in hindsight: one was a reward-calculation function that developers forgot to mark read-only, so calling it mutated state instead of just reading it; the other was a fee-withdrawal path that never validated the recipient. These are ordinary access-control and write-protection mistakes — the kind static analysis and review already target — found and monetised end-to-end by an autonomous agent.

Why it matters

The economics are the headline. The average cost to have an agent exhaustively scan a single contract was $1.22; the average cost per vulnerable contract actually found was about $1,738, against $1,847 average revenue per exploit. Token cost per successful exploit fell about 70% across four Claude generations, meaning roughly 3.4x more exploits per dollar than six months earlier. As the authors note, the skills involved — long-horizon reasoning, boundary analysis, iterative tool use — are not blockchain-specific. Smart contracts are simply the place where the dollar value is visible; the same automated scrutiny extends to any open-source dependency, forgotten auth library, or obscure endpoint on the path to valuable assets. The window between deploying vulnerable code and seeing it probed is shrinking toward machine speed.
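The per-scan and per-finding figures follow from the zero-day test numbers. The sketch below reproduces them, assuming the $3,476 API cost covers all 2,849 scans and that the averages are taken over the two findings; variable and function names are illustrative.

```python
# Figures quoted in the article for the zero-day scan.
contracts_scanned = 2_849
api_cost_usd = 3_476.0
exploit_value_usd = 3_694.0
vulnerabilities_found = 2

cost_per_scan = api_cost_usd / contracts_scanned
cost_per_finding = api_cost_usd / vulnerabilities_found
revenue_per_finding = exploit_value_usd / vulnerabilities_found
net_per_finding = revenue_per_finding - cost_per_finding


def exploits_per_dollar_multiplier(cost_reduction: float) -> float:
    """How many times more exploits one dollar buys after a fractional cost drop."""
    if not 0 <= cost_reduction < 1:
        raise ValueError("cost_reduction must be in [0, 1)")
    return 1.0 / (1.0 - cost_reduction)


print(f"cost per scan:       ${cost_per_scan:.2f}")
print(f"cost per finding:    ${cost_per_finding:,.0f}")
print(f"revenue per finding: ${revenue_per_finding:,.0f}")
print(f"net per finding:     ${net_per_finding:,.0f}")
print(f"multiplier at 70%:   {exploits_per_dollar_multiplier(0.70):.2f}x")
```

On these numbers the margin is thin: about $109 net per finding. A flat 70% cost reduction gives roughly 3.3x, so the quoted 3.4x is consistent with a reduction slightly above 70%.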

Defenses

The study’s own conclusion is that the same agents that exploit can defend, and that defenders should adopt them now rather than later.

Use AI agents as pre-deployment auditors. SCONE-bench ships plug-and-play support for pointing the agent at your own contracts before launch; run autonomous exploit-generation against your code in a fork, treat any script that clears the profit threshold as a release blocker, and fold it into CI alongside conventional static analysis.
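The release-blocker rule can be sketched as a small CI gate. This is a minimal illustration, not SCONE-bench’s interface: the AgentRun record, the function names, and the 0.1 threshold are all hypothetical, and the balances stand in for whatever the replay on the fork reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRun:
    """One exploit-generation attempt replayed on a forked chain (hypothetical record)."""
    contract: str
    balance_before: float
    balance_after: float

    @property
    def profit(self) -> float:
        return self.balance_after - self.balance_before


def release_blockers(runs: list[AgentRun], profit_threshold: float) -> list[str]:
    """Contracts where at least one replayed run clears the profit threshold."""
    return sorted({run.contract for run in runs if run.profit >= profit_threshold})


def ci_exit_code(runs: list[AgentRun], profit_threshold: float) -> int:
    """Non-zero when any contract is blocked, so a CI step can fail on it."""
    return 1 if release_blockers(runs, profit_threshold) else 0


# Illustrative data only; the threshold value is an assumption, not the study's.
runs = [
    AgentRun("Staking", balance_before=10.0, balance_after=10.0),
    AgentRun("FeeVault", balance_before=10.0, balance_after=12.5),
    AgentRun("FeeVault", balance_before=10.0, balance_after=10.0),
]
print(release_blockers(runs, profit_threshold=0.1))  # ['FeeVault']
print(ci_exit_code(runs, profit_threshold=0.1))      # 1
```

A single profitable run is enough to block the contract, which mirrors a best-of-N reading of agent results: one working script is all an attacker needs.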

Re-target the basics, because that is what the agents hit. One zero-day was a missing view modifier and the other was missing recipient validation: write-protection and access-control hygiene. Enforce these with linters, mandatory review of state-mutating public functions, and tests that assert who can call what.
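Tests of this kind can be modelled in a few lines. The sketch below uses a hypothetical in-memory ToyVault in Python rather than a real contract; in practice the same assertions would live in the project’s own contract test suite. It checks the two properties at issue: a read-only query must not change state, and a fee withdrawal must reject both unauthorised callers and unexpected recipients.

```python
import copy


class ToyVault:
    """Hypothetical in-memory stand-in for a contract; not a real chain API."""

    def __init__(self, owner: str, fee_recipient: str) -> None:
        self.owner = owner
        self.fee_recipient = fee_recipient
        self.accrued_fees = 100
        self.rewards = {"alice": 5}
        self.paid_out: dict[str, int] = {}

    def pending_reward(self, user: str) -> int:
        """Read-only query: must not change any state."""
        return self.rewards.get(user, 0)

    def withdraw_fees(self, caller: str, recipient: str) -> int:
        """State-mutating call: owner only, configured recipient only."""
        if caller != self.owner:
            raise PermissionError("only the owner may withdraw fees")
        if recipient != self.fee_recipient:
            raise ValueError("recipient is not the configured fee recipient")
        amount, self.accrued_fees = self.accrued_fees, 0
        self.paid_out[recipient] = self.paid_out.get(recipient, 0) + amount
        return amount


def assert_read_only(obj, method_name: str, *args) -> None:
    """Fail if calling the method changes the object's state."""
    before = copy.deepcopy(vars(obj))
    getattr(obj, method_name)(*args)
    assert vars(obj) == before, f"{method_name} mutated state"


def assert_rejected(exc_type, func, *args) -> None:
    """Fail unless the call raises the expected exception type."""
    try:
        func(*args)
    except exc_type:
        return
    raise AssertionError(f"call with {args} was not rejected")


vault = ToyVault(owner="owner", fee_recipient="treasury")
assert_read_only(vault, "pending_reward", "alice")
assert_rejected(PermissionError, vault.withdraw_fees, "mallory", "treasury")
assert_rejected(ValueError, vault.withdraw_fees, "owner", "mallory")
assert vault.withdraw_fees("owner", "treasury") == 100
```

Snapshotting state before and after a call is a blunt but effective way to catch a query that quietly writes, which is the shape of the first zero-day.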

Compress the deploy-to-patch window. If revenue doubles every ~1.3 months and a scan costs ~$1.22, assume hostile agents reach your contract within hours of deployment. Stage releases, cap value-at-risk in new contracts, keep emergency-pause and upgrade paths ready, and pre-arrange white-hat rescue contacts (the study coordinated fund recovery with SEAL).

Track capability as an economic curve, not a yes/no. Measure your own exposure the way the benchmark does — in value reachable per dollar of attacker compute — and revisit it as models improve, since the cost side keeps falling.
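One way to make that curve concrete is a ratio that is recomputed as costs fall. The sketch below is an assumption-laden illustration: the function names are hypothetical, the inputs reuse the article’s average revenue and cost per finding, and the projection assumes attacker cost keeps dropping by the same fraction each period, which the study does not claim.

```python
def value_per_attacker_dollar(value_reachable_usd: float, attacker_cost_usd: float) -> float:
    """Dollars of value reachable per dollar of attacker compute."""
    if attacker_cost_usd <= 0:
        raise ValueError("attacker cost must be positive")
    return value_reachable_usd / attacker_cost_usd


def projected_ratio(ratio_now: float, cost_reduction_per_period: float, periods: int) -> float:
    """Ratio after `periods`, assuming cost falls by the same fraction each period."""
    if not 0 <= cost_reduction_per_period < 1:
        raise ValueError("cost_reduction_per_period must be in [0, 1)")
    return ratio_now / (1.0 - cost_reduction_per_period) ** periods


# Article averages: $1,847 revenue and $1,738 cost per finding.
today = value_per_attacker_dollar(1_847, 1_738)
# Assumption: one further 70% cost drop with value held constant.
later = projected_ratio(today, 0.70, 1)
print(f"today: {today:.2f}, after one period: {later:.2f}")
```

A ratio near 1 means exploitation barely pays today; the projection shows how quickly that changes if the value at risk stays put while the cost side keeps falling.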

Status

This is published, peer-reviewable research with a defensive framing, not a product CVE. The study was posted December 1, 2025 and lightly revised December 2 and December 8, 2025; Bruce Schneier covered it December 11, 2025. The benchmark is open-sourced (with the full harness to follow), a dual-use decision the authors justify by noting attackers already have the incentive to build such tools. For context on real-world stakes, Trail of Bits documented a $120M rounding-direction exploit of Balancer in November 2025. This article reports the findings and mitigations only; it contains no exploit code, contract addresses, or operational attack detail. Sources are cited with their publication dates above.

This article covers published security research with a defensive framing. If you ship smart contracts or other high-value open-source code, treat autonomous exploit-generation as part of your own test suite, not a future threat.