AGENTS MEDIUM NEW

AIRQ scores 100 production AI agents: 98% carry the lethal trifecta

Adversa AI's June 2026 AI Risk Quadrant rates 100 commercial agents on attack surface, blast radius and defenses. Only 11% are well-defended; tool execution alone explains 76% of blast radius.

2026-06-04 // 7 min // affects: claude-code, github-copilot, openai-codex, openclaw

What is this?

In June 2026, Adversa AI published the AI Risk Quadrant (AIRQ), an independent assessment that scores 100 commercial and publicly available AI agents across ten classes. The methodology was built with contributors and reviewers drawn from OWASP, CoSAI, the Cloud Security Alliance and NIST, and both the framework and the report are published openly and free of charge. As covered by Help Net Security on June 3, 2026, it is positioned as the first comparative security rating for agentic products: the kind of vendor-neutral baseline buyers have lacked.

The headline finding is blunt: 98% of the scored agents already carry the “lethal trifecta”, and only 11% are both highly capable and well-defended. We are covering it because it converts a familiar architectural warning into measured, comparable numbers — exactly the artifact security teams can take to a procurement review.

How it works

AIRQ is a scoring framework, not an attack. It rates each agent on three independent axes plus an evidence layer:

Axis                  Question it answers
--------------------  -----------------------------------------------
Attack surface        How exposed is the agent across its input and
                      execution paths?
Blast radius          How bad is it if the agent is compromised —
                      what data and actions does it reach?
Defensive controls    What actually stops an attack: constrained
                      identity, execution isolation, approval gates?
Evidence layer        How strong is the public proof for each
                      claimed control? (source code / third-party
                      assessment > vendor datasheet)

Plotting attack surface against defense produces the named quadrant: high reach with thin defense is an Exposed Giant, high reach with matching defense a Fortified Leader, narrow and well-guarded a Tight Operator, narrow and lightly guarded a Humble Provider. The evidence layer is what most scoring skips, and it matters: the report finds 83% of claimed defenses are not publicly verifiable. AIRQ scores the claim and the proof separately, so a marketing page cannot pass for a tested control.
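As a rough illustration of how two scores map onto the four quadrant names, and how a claim can be counted separately from its proof, here is a minimal Python sketch. The 0-100 scale, the 50-point cut-off, and the names `quadrant`, `Control` and `claim_vs_proof` are assumptions made for this example; they are not taken from the AIRQ methodology, which defines its own scales and factors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Control:
    """One defensive control as a vendor presents it (hypothetical model)."""
    name: str
    claimed: bool              # the vendor says the control exists
    publicly_verifiable: bool  # backed by source code or a third-party assessment


def quadrant(reach: float, defense: float, threshold: float = 50.0) -> str:
    """Map two scores to a quadrant name.

    The 0-100 scale and the 50-point threshold are assumptions for
    illustration, not AIRQ's actual cut-offs.
    """
    high_reach = reach >= threshold
    strong_defense = defense >= threshold
    if high_reach and strong_defense:
        return "Fortified Leader"
    if high_reach:
        return "Exposed Giant"
    if strong_defense:
        return "Tight Operator"
    return "Humble Provider"


def claim_vs_proof(controls: list[Control]) -> tuple[int, int]:
    """Return (claimed, verified) counts so the two are reported separately."""
    claimed = sum(c.claimed for c in controls)
    verified = sum(c.claimed and c.publicly_verifiable for c in controls)
    return claimed, verified
```

Keeping the two counts apart is the point of the sketch: a vendor with many claimed controls and few verified ones looks very different from one whose claims are all backed by inspectable evidence.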

The “lethal trifecta” that appears in 98% of the cohort is the combination of private data access, exposure to untrusted content, and the ability to take outbound actions. When those three sit together, a single poisoned document — the indirect prompt injection pattern — can turn an agent against its operator across every system it can reach. Eight of the ten agent classes show 100% trifecta exposure.
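The trifecta condition is simple enough to express as a check over an agent's capabilities. The sketch below is an assumption-laden illustration: `AgentContext` and `has_lethal_trifecta` are hypothetical names, and a real assessment would have to derive the three flags from an agent's actual tools, data connections and network egress.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentContext:
    """Capabilities available in one execution context (hypothetical model)."""
    reads_private_data: bool
    ingests_untrusted_content: bool
    can_act_outbound: bool


def has_lethal_trifecta(ctx: AgentContext) -> bool:
    """True only when all three capabilities sit in the same context."""
    return (
        ctx.reads_private_data
        and ctx.ingests_untrusted_content
        and ctx.can_act_outbound
    )


# A single agent that does everything carries the trifecta.
monolith = AgentContext(True, True, True)

# Splitting the work removes it from each context: one agent reads
# untrusted input, the other holds private data and can act externally.
reader = AgentContext(
    reads_private_data=False,
    ingests_untrusted_content=True,
    can_act_outbound=False,
)
actor = AgentContext(
    reads_private_data=True,
    ingests_untrusted_content=False,
    can_act_outbound=True,
)
```

The split only holds if the boundary between the two contexts is enforced. If the reader's output is passed unfiltered to the actor, the untrusted content has simply moved.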

Why it matters

The report’s value is in the quantification. One variable dominates: tool execution. Whether an agent executes tools, and whether that execution is sandboxed, explains 76% of blast radius, outpredicting agent class, vendor reputation, and every individual defense component. That makes triage cheap: ask those two questions before reading anyone’s deck.

The distribution is sobering. Forty percent of agents fall in the Exposed Giants quadrant, which the report says holds 60% of the total risk budget. Capability and defense move in opposite directions across most of the market — coding agents rank second in capability but eighth in defense, and computer-use agents post an average output-guardrail score of zero (no points for output validation, exfiltration-channel blocking, or rendering sanitization). Worse, these high-risk agents are often the self-serve, bottom-up tools that bypass procurement entirely.

Audit is not defense. The report notes 37% of agents log well but score poorly on the four controls that actually prevent harm, and 38% complete irreversible actions before any monitoring path can plausibly fire. Logging that fires after an irreversible action is forensics, not protection.

Defenses

AIRQ doubles as a defensive checklist. Its factor lists map to NIST, OWASP, MITRE, CoSAI and CSA guidance, so they work as a procurement questionnaire and a red-team scoping aid.

Make sandboxing a procurement gate. Documented, tested sandboxing cuts residual risk by roughly 2.6×; cloud- or container-level isolation raises that to about 6×. A large share of the benefit comes from the first step, so require it before deployment.
Reduce blast radius first. Since tool execution explains most of the damage, scope what tools an agent can call, constrain its identity with short-lived narrow-scope credentials, and isolate its runtime. A compromise in a tightly scoped environment is a contained test result.
Break the trifecta. You rarely need all three of private-data access, untrusted-content ingestion, and outbound action in the same context. Separate the agent that reads untrusted input from the one that holds credentials or can act externally.
Demand evidence, not datasheets. With 83% of claimed controls unverifiable, treat an unsupported claim as absent. Ask vendors for the AIRQ factor answers backed by source code or third-party assessment.
Gate irreversible actions and review the action stream. Put human-in-the-loop or policy approval in front of anything you cannot take back, and make sure monitoring can fire before the action, not after.
Score twice and re-audit quarterly. The same platform scores differently as the vendor ships it versus as the customer configures it. Re-check on a schedule — categories with low CVE counts are in a pre-discovery phase, not a safe one.
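The approval-gate item above can be sketched as a thin wrapper around tool execution. This is a minimal illustration under stated assumptions, not a prescribed design: `Action`, `ApprovalDenied` and `run_gated` are hypothetical names, and the `approve` callback stands in for whatever human-in-the-loop or policy engine a deployment actually uses.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Action:
    name: str
    irreversible: bool  # assumed to be labeled by the tool's owner


class ApprovalDenied(Exception):
    """Raised when an irreversible action is not approved."""


def run_gated(
    action: Action,
    execute: Callable[[Action], str],
    approve: Callable[[Action], bool],
    audit_log: list[str],
) -> str:
    """Run an action, requiring approval first if it cannot be undone."""
    # Record the request before anything runs, so monitoring can see it
    # ahead of the action rather than after.
    audit_log.append(f"requested:{action.name}")
    if action.irreversible and not approve(action):
        audit_log.append(f"denied:{action.name}")
        raise ApprovalDenied(action.name)
    result = execute(action)
    audit_log.append(f"executed:{action.name}")
    return result
```

The ordering is the design choice that matters here: the request is logged and the approval decision is made before `execute` is called, so a denied irreversible action never runs.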

Status

Item                               Reference          Date        Notes
---------------------------------  -----------------  ----------  ----------------------------------------
AIRQ report & framework            Adversa AI         2026-06     100 agents, 10 classes; open methodology
Lethal-trifecta prevalence         AIRQ               2026-06     98% of cohort; 8/10 classes at 100%
Well-defended (Fortified Leaders)  AIRQ               2026-06     11% of agents
Exposed Giants                     AIRQ               2026-06     40% of cohort, 60% of risk budget
Tool execution → blast radius      AIRQ               2026-06     Explains 76% of blast radius
Sandboxing benefit                 AIRQ               2026-06     ~2.6× residual-risk reduction; ~6× with
                                                                  container/cloud isolation
Defenses lacking verification      AIRQ               2026-06     83% of claims not publicly verifiable
Independent coverage               Help Net Security  2026-06-03  “Only 11% of production agents pass the
                                                                  AI agent security bar”

The takeaway is not that any one product is unsafe — it is that the agentic market has shipped capability far ahead of containment, and now there is a public, reproducible way to measure the gap. Treat the agent (not the underlying model) as the unit of risk, compare within a class, and make sandboxing and verified controls the price of deployment.