DEFENSE LOW NEW

Verified agent skills: capability governance for the SKILL.md supply chain

NVIDIA's verified agent skills, announced May 19, 2026, add risk scanning, cryptographic signing, and machine-readable skill cards to the SKILL.md supply chain: a defensive answer to poisoned skills.

2026-06-16 // 6 min affects: claude-code, openai-codex, cursor, ai-agents

What is this?

On May 19, 2026 (updated May 21), NVIDIA’s Trustworthy AI and security teams published “verified agent skills”, a capability-governance layer for the portable instruction bundles — SKILL.md files and their attachments — that AI coding agents load to learn new tasks. The premise is that runtime guardrails are not enough: teams also need to know where a skill came from, whether it was scanned for known risks, and whether it was modified after publication. “Verified” means a skill is cataloged, scanned, evaluated, documented with a skill card, signed, and synced into a public catalog. The work builds on the open agentskills.io SKILL.md specification, so the same skill is meant to run across Claude Code, Codex and Cursor. This is a defensive framework, not a vulnerability.

How it works

A verified skill moves through a publishing pipeline owned by the product team that authors it:

source repo → review → scan → evaluate → skill card → sign → catalog → sync

Two stages do the security work. Scanning runs each candidate through SkillSpector, which treats a skill as a deployable capability rather than a static prompt. It checks conventional software risks (vulnerable dependencies, suspicious scripts, dangerous code patterns, credential access, data-exfiltration paths) and agent-native risks: hidden instructions, prompt injection, trigger abuse, excessive agency, tool poisoning, and mismatches between a skill’s declared purpose, the access it requests, and what its bundled artifacts actually do. That intent layer matters — a skill can look benign file-by-file while steering an agent toward unsafe behavior. SkillSpector’s coverage is mapped to OWASP’s LLM and Agentic AI risk lists and MITRE ATLAS.
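
The purpose-versus-access mismatch is the easiest of these checks to picture in code. The sketch below is illustrative only and is not SkillSpector's rule set: the `SkillManifest` type, the scope strings, and the `EXPECTED_ACCESS` table are all hypothetical names chosen for the example. It flags any requested scope that the declared purpose does not explain.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical mapping from a declared purpose to the access it plausibly needs.
# Illustrative only; a real scanner would use a much richer model.
EXPECTED_ACCESS = {
    "code-formatting": {"fs:read", "fs:write"},
    "api-docs-lookup": {"net:outbound"},
}


@dataclass
class SkillManifest:
    """Hypothetical summary of what a skill says it does and what it asks for."""
    name: str
    purpose: str
    requested_access: set[str] = field(default_factory=set)


def excess_access(skill: SkillManifest) -> set[str]:
    """Return requested scopes that the declared purpose does not explain."""
    expected = EXPECTED_ACCESS.get(skill.purpose, set())
    return skill.requested_access - expected


formatter = SkillManifest(
    name="pretty",
    purpose="code-formatting",
    requested_access={"fs:read", "fs:write", "net:outbound", "secrets:read"},
)
print(sorted(excess_access(formatter)))  # ['net:outbound', 'secrets:read']
```

A formatter that also wants outbound network and secret access is exactly the kind of skill that looks fine file-by-file but deserves a closer look. An unknown purpose maps to an empty expected set, so every requested scope is reported.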

Signing uses OpenSSF Model Signing (OMS): a detached skill.oms.sig covers every file and subdirectory in the skill, so a downloader can verify integrity and authenticity after download, not just trust a catalog listing.

# Verify a downloaded skill against NVIDIA's root certificate
model_signing verify certificate SKILL_DIR \
    --signature SKILL_DIR/skill.oms.sig \
    --certificate-chain nv-agent-root-cert.pem \
    --ignore-unsigned-files
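
In an install script, the same check can act as a fail-closed gate. The following is a minimal sketch, not NVIDIA's tooling: the function names are hypothetical, the flags are copied verbatim from the command above (check them against your installed `model_signing` version), and it assumes the CLI exits non-zero when verification fails.

```python
import subprocess
from pathlib import Path


def build_verify_command(skill_dir, root_cert):
    """Build the argv for the verification command shown above."""
    skill_dir = Path(skill_dir)
    return [
        "model_signing", "verify", "certificate", str(skill_dir),
        "--signature", str(skill_dir / "skill.oms.sig"),
        "--certificate-chain", str(root_cert),
        "--ignore-unsigned-files",
    ]


def is_verified(skill_dir, root_cert, run=subprocess.run) -> bool:
    """Return True only if a signature exists and the CLI reports success.

    Assumption: a non-zero exit code means verification failed.
    `run` is injectable so the gate can be tested without the CLI installed.
    """
    if not (Path(skill_dir) / "skill.oms.sig").is_file():
        return False  # unsigned skill: treat as untrusted
    try:
        result = run(
            build_verify_command(skill_dir, root_cert),
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        return False  # CLI not installed: fail closed rather than skip the check
    return result.returncode == 0
```

Every failure path returns `False`, so a missing signature, a missing tool, and a bad signature all block the install in the same way.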

Every verified skill ships with a skill card — a machine-readable trust record stating what the skill does, who built it, its license, its dependencies, and its known limitations, risks and mitigations. The agent loads the card alongside the skill, so trust metadata travels with the capability instead of living in a developer’s head.
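
To make "machine-readable" concrete, here is a rough sketch of a completeness check over a card. The field names are hypothetical, chosen to mirror the items listed above; the actual skill card schema may name and structure them differently.

```python
import json

# Hypothetical field names mirroring the items a skill card is said to carry.
REQUIRED_FIELDS = (
    "description",
    "publisher",
    "license",
    "dependencies",
    "limitations",
    "risks",
    "mitigations",
)


def missing_card_fields(card_json: str) -> list:
    """Return the required fields that are absent from a skill card."""
    card = json.loads(card_json)
    return [name for name in REQUIRED_FIELDS if name not in card]


# Made-up example card; it deliberately omits "mitigations".
example_card = json.dumps({
    "description": "Formats source files in place.",
    "publisher": "example-team",
    "license": "Apache-2.0",
    "dependencies": [],
    "limitations": ["Large files are skipped."],
    "risks": ["Writes to the working tree."],
})
print(missing_card_fields(example_card))  # ['mitigations']
```

A card that omits its risks or mitigations is itself a signal worth acting on before install.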

Why it matters

Agent skills are one of the fastest-growing supply-chain surfaces in agentic AI, and llm-hacking has documented the attack side repeatedly: poisoned SKILL.md registries, a benchmark of malicious agent skills, credential leakage through skills, and skill-based exfiltration in Copilot/Cowork. The recurring failure is the same as in AGENTS.md injection: an instruction bundle on disk is treated as trusted context, so whoever controls the bundle controls the agent.

Verified skills address two specific gaps. First, catalog membership is not integrity: most registries can tell you who uploaded an asset, but few let you cryptographically verify the asset itself after download, and OMS signing closes that tamper window. Second, file-level scanning misses intent, which is precisely where skill attacks hide; SkillSpector’s purpose-vs-access checks target that layer. Taken together, the framework is the supply-chain mirror of model signing and, conceptually, an enforcement point for the instruction hierarchy at the capability boundary.

Defenses

How to put this to work — and where it stops:

Verify signatures, don’t just trust the catalog. Run model_signing verify after pulling any signed skill. An unsigned or signature-mismatched skill should be treated as untrusted, regardless of where it was listed.
Read the skill card before install. Check declared access against declared purpose. For example, a routing skill that requests filesystem or network scope beyond the solver endpoint it calls is the kind of red flag the card is designed to surface.
Treat scanning as point-in-time, not a guarantee. A clean SkillSpector pass reduces risk; it does not prove safety. Re-scan on update, and keep your own SCA/secret-scanning in the loop.
Signing proves integrity and authenticity, not good intent. A correctly signed skill from a trusted publisher can still be over-privileged. Pair provenance with runtime controls — sandboxed execution, least-privilege tool access, and I/O guardrails (e.g. NeMo Guardrails) — so a compromised or over-scoped skill is still contained.
Mind the trust boundary. Today’s verified catalog covers NVIDIA-published skills, and ecosystem-wide signing is described as a roadmap NVIDIA is “publicly experimenting” with. Third-party and community skills remain unverified until that spec spreads — govern them accordingly.
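
"Re-scan on update" needs a cheap way to notice that a skill changed. One option, sketched below under stated assumptions, is to record a content digest of the skill directory at scan time and compare it later. The helper names are hypothetical, and this is a local change detector only: it does not replace OMS signature verification.

```python
import hashlib
from pathlib import Path


def skill_digest(skill_dir) -> str:
    """SHA-256 over every file's relative path and contents, in stable order."""
    root = Path(skill_dir)
    files = [p for p in root.rglob("*") if p.is_file()]
    digest = hashlib.sha256()
    for path in sorted(files, key=lambda p: p.relative_to(root).as_posix()):
        name = path.relative_to(root).as_posix().encode("utf-8")
        data = path.read_bytes()
        # Length-prefix each part so file boundaries cannot be confused.
        digest.update(len(name).to_bytes(8, "big"))
        digest.update(name)
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


def needs_rescan(skill_dir, digest_at_last_scan: str) -> bool:
    """True if any file was added, removed, renamed, or edited since the scan."""
    return skill_digest(skill_dir) != digest_at_last_scan
```

Store the digest next to the scan result; when `needs_rescan` returns `True`, rerun the scanner and your own SCA and secret scanning before the skill is loaded again.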

Status

Item	Reference	Date	Notes
Verified skills announced	NVIDIA Technical Blog	2026-05-19	Updated 2026-05-21; ~8 min read
Scanning tool	SkillSpector (open source)	2026	Software + agent-native risk checks, mapped to OWASP/MITRE ATLAS
Signing scheme	OpenSSF Model Signing (OMS)	2026	Detached `skill.oms.sig`, verifiable post-download
Open spec	agentskills.io `SKILL.md`	—	Portable across Claude Code, Codex, Cursor
Scope	NVIDIA-published skills	—	Broader ecosystem signing is a stated roadmap, not yet universal

The honest framing is not “skills are now safe.” It is that the skill layer finally has the same chain-of-trust primitives — provenance, scanning, signing, documented limitations — that the rest of the software supply chain has had for years. Verification tells you a capability is authentic and was checked; it is the floor for trusting an agent’s skills, not the ceiling.