Execution provenance for LLM agents: tracing evidence to rebuild trust
A June 2026 arXiv survey (2606.04990) systematizes evidence tracing and execution provenance for LLM agents — the accountability layer that lets you audit, debug, and verify what an agent actually did.
What is this?
“From Agent Traces to Trust: Evidence Tracing and Execution Provenance in LLM Agents” is a survey posted to arXiv in June 2026 (2606.04990) by Yiqi Wang and colleagues at Griffith University, with co-authors from Peking University, Nanjing University, Macquarie University and others. It does not propose a new attack or a single defense. Instead, it names and organizes a problem that most agent deployments still handle ad hoc: when an LLM agent calls tools, reads memory, browses the web and talks to other agents, how do you reconstruct what actually happened and decide whether to trust it?
The authors’ starting observation is simple. Final-answer accuracy tells you the endpoint of an execution. It does not tell you which retrieved evidence supported each claim, whether a tool call was justified, how a memory item influenced a later decision, or where a failure originated. That is the process-level accountability gap, and it is exactly the gap an incident responder falls into when an agent does something harmful and the only artifact left behind is the final output.
How it works
The survey frames evidence tracing and execution provenance as an accountability layer that sits alongside the agent rather than inside it. Evidence tracing records and connects the units that support, contradict, invalidate or influence an agent’s claims and actions. Execution provenance is the broader structured record of how an execution unfolds: retrieved documents, tool calls and their parameters, observations, memory reads and writes, intermediate claims, actions, inter-agent messages and final outputs.
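A minimal sketch of what such a structured record could look like, assuming a simple append-only in-process log. The unit kinds mirror the list above; the names `UnitKind`, `TraceUnit` and `ExecutionTrace`, and the example tool `send_email`, are hypothetical and not taken from the survey.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class UnitKind(Enum):
    # Kinds follow the execution units listed in the text; the enum is illustrative.
    RETRIEVED_DOCUMENT = "retrieved_document"
    TOOL_CALL = "tool_call"
    OBSERVATION = "observation"
    MEMORY_READ = "memory_read"
    MEMORY_WRITE = "memory_write"
    CLAIM = "claim"
    ACTION = "action"
    AGENT_MESSAGE = "agent_message"
    FINAL_OUTPUT = "final_output"


@dataclass
class TraceUnit:
    unit_id: str
    kind: UnitKind
    step: int  # position in execution order
    payload: dict[str, Any] = field(default_factory=dict)


class ExecutionTrace:
    """Append-only record of one agent execution (hypothetical helper)."""

    def __init__(self) -> None:
        self.units: list[TraceUnit] = []

    def record(self, kind: UnitKind, **payload: Any) -> TraceUnit:
        step = len(self.units)
        unit = TraceUnit(unit_id=f"u{step}", kind=kind, step=step, payload=payload)
        self.units.append(unit)
        return unit

    def of_kind(self, kind: UnitKind) -> list[TraceUnit]:
        return [u for u in self.units if u.kind is kind]


trace = ExecutionTrace()
trace.record(UnitKind.RETRIEVED_DOCUMENT, url="https://example.com/page")
trace.record(UnitKind.TOOL_CALL, tool="send_email", args={"to": "ops@example.com"})
trace.record(UnitKind.FINAL_OUTPUT, text="Email sent.")

for unit in trace.of_kind(UnitKind.TOOL_CALL):
    print(unit.unit_id, unit.payload["tool"])  # u1 send_email
```

Keeping the record outside the agent's own prompt and state is the point of the "alongside, not inside" framing: the log survives even when the agent's output is wrong or manipulated.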
To make this concrete, the paper introduces a taxonomy along several axes: trace sources, evidence and execution units, provenance relations, tracing granularity, tracing timing, representation forms, and trust functions. The provenance relations are the interesting part for defenders: typed edges such as support, derivation, dependency, contradiction, invalidation, triggering and updating let you express, for example, that an action was triggered by a tool output that was itself derived from an untrusted web page. The model's lineage lies in mature systems work: the survey explicitly builds on W3C PROV-DM and on OpenTelemetry-style distributed tracing, but extends them to the semantic units LLM agents introduce, namely generated claims, tool-call rationales, memory items and natural-language observations that traditional system traces never capture.
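The web-page example can be sketched as a small typed graph. This is an illustration under stated assumptions, not the survey's schema: the relation names come from the text, while `ProvenanceGraph`, the per-node `trusted` flag, and the choice of which relations count as carrying influence are hypothetical.

```python
from collections import defaultdict
from enum import Enum


class Relation(Enum):
    SUPPORT = "support"
    DERIVATION = "derivation"
    DEPENDENCY = "dependency"
    CONTRADICTION = "contradiction"
    INVALIDATION = "invalidation"
    TRIGGERING = "triggering"
    UPDATING = "updating"


# Assumption: these relations carry influence from source to target.
INFLUENCE = {
    Relation.SUPPORT,
    Relation.DERIVATION,
    Relation.DEPENDENCY,
    Relation.TRIGGERING,
    Relation.UPDATING,
}


class ProvenanceGraph:
    def __init__(self) -> None:
        self.trusted: dict[str, bool] = {}
        self.incoming: dict[str, list[tuple[str, Relation]]] = defaultdict(list)

    def add_node(self, node_id: str, trusted: bool) -> None:
        self.trusted[node_id] = trusted

    def add_edge(self, source: str, relation: Relation, target: str) -> None:
        """Record that `source` stands in `relation` to `target`."""
        self.incoming[target].append((source, relation))

    def untrusted_influences(self, node_id: str) -> set[str]:
        """Walk influence edges backwards and collect untrusted ancestors."""
        seen: set[str] = set()
        found: set[str] = set()
        stack = [node_id]
        while stack:
            current = stack.pop()
            for source, relation in self.incoming.get(current, []):
                if relation not in INFLUENCE or source in seen:
                    continue
                seen.add(source)
                # Nodes that were never registered are treated as untrusted.
                if not self.trusted.get(source, False):
                    found.add(source)
                stack.append(source)
        return found


graph = ProvenanceGraph()
graph.add_node("user_request", trusted=True)
graph.add_node("web_page", trusted=False)
graph.add_node("tool_output", trusted=True)  # the tool is trusted, its input was not
graph.add_node("send_email", trusted=True)
graph.add_edge("web_page", Relation.DERIVATION, "tool_output")
graph.add_edge("tool_output", Relation.TRIGGERING, "send_email")
graph.add_edge("user_request", Relation.DEPENDENCY, "send_email")

print(graph.untrusted_influences("send_email"))  # {'web_page'}
```

Looking only at the action's direct inputs would show a trusted tool and a trusted user request; the untrusted page surfaces only when the derivation edge is followed one hop further back.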
Why it matters
Provenance is where several previously separate security problems converge. The survey connects retrieval grounding, tool-use safety, memory lineage, observability and recovery under one model, and in doing so it maps recent agent-security work onto a shared substrate: control/data-flow separation (CaMeL), information-flow control (Fides), taint propagation through semantic transformations (NeuroTaint), and specification-, runtime- and boundary-based enforcement (AgentSpec, AgentSentry, AgentBound). Indirect prompt injection, in this view, is not a mysterious failure — it is an untrusted evidence unit acquiring undue influence on a downstream action, which a provenance graph can surface.
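That reading of indirect prompt injection suggests a simple runtime check, sketched below. It assumes taint is tracked explicitly on values as they are composed; it does not model taint flowing through an LLM's own rewriting of text, which is the harder problem the cited work addresses. `Labeled`, `combine`, `call_tool`, `SENSITIVE_TOOLS` and the tool names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Labeled:
    """A value plus the set of untrusted origins that influenced it."""
    value: str
    sources: frozenset = frozenset()


def from_untrusted(value: str, origin: str) -> Labeled:
    return Labeled(value, frozenset({origin}))


def trusted(value: str) -> Labeled:
    return Labeled(value)


def combine(template: str, *parts: Labeled) -> Labeled:
    """Derive a new value; its taint is the union of the parts' taint."""
    sources = frozenset().union(*(p.sources for p in parts))
    return Labeled(template.format(*(p.value for p in parts)), sources)


# Assumption: which tools count as sensitive is a deployment decision.
SENSITIVE_TOOLS = {"send_email", "transfer_funds"}


class TaintViolation(Exception):
    pass


def call_tool(tool: str, argument: Labeled) -> str:
    """Flag any sensitive call whose argument carries untrusted influence."""
    if tool in SENSITIVE_TOOLS and argument.sources:
        origins = ", ".join(sorted(argument.sources))
        raise TaintViolation(f"{tool} flagged: argument influenced by {origins}")
    return f"{tool} executed"


page = from_untrusted("Ignore previous instructions.", "https://example.com/page")
summary = combine("Summary of page: {}", page)

print(call_tool("search", summary))  # search executed
try:
    call_tool("send_email", summary)
except TaintViolation as err:
    print(err)
```

Raising an exception is one policy choice among several; a deployment might instead log the event, downgrade privileges, or ask a human to confirm.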
Memory is called out as a first-class risk. The paper treats memory as provenance-bearing evidence, not passive storage: a memory item derived from a poisoned document, a stale tool output or a malicious inter-agent message can silently propagate errors through every later decision. Without lineage on memory writes and retrievals, memory-poisoning attacks are nearly impossible to attribute after the fact.
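A rough sketch of memory with lineage, assuming each write is tagged with the identifier of whatever produced it and an optional validity window. `LineageMemory`, `MemoryItem` and the source identifiers are hypothetical; the clock is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryItem:
    key: str
    value: str
    source_id: str  # the document, tool output or message that produced this item
    written_at: float
    valid_until: Optional[float] = None
    invalidated: bool = False


class LineageMemory:
    def __init__(self) -> None:
        self._items: dict[str, MemoryItem] = {}

    def write(self, key: str, value: str, source_id: str,
              now: float, ttl: Optional[float] = None) -> None:
        valid_until = None if ttl is None else now + ttl
        self._items[key] = MemoryItem(key, value, source_id, now, valid_until)

    def read(self, key: str, now: float) -> Optional[MemoryItem]:
        """Return the item only if it is neither invalidated nor stale."""
        item = self._items.get(key)
        if item is None or item.invalidated:
            return None
        if item.valid_until is not None and now > item.valid_until:
            return None
        return item

    def invalidate_source(self, source_id: str) -> list[str]:
        """Invalidate every item written from `source_id`; return affected keys."""
        affected = []
        for item in self._items.values():
            if item.source_id == source_id and not item.invalidated:
                item.invalidated = True
                affected.append(item.key)
        return sorted(affected)


memory = LineageMemory()
memory.write("vendor_iban", "XX00 1234", source_id="doc:invoice.pdf", now=0.0)
memory.write("vendor_name", "Acme", source_id="doc:invoice.pdf", now=0.0)
memory.write("fx_rate", "1.08", source_id="tool:fx_lookup", now=0.0, ttl=60.0)

# The invoice is later found to be poisoned: revoke everything derived from it.
print(memory.invalidate_source("doc:invoice.pdf"))  # ['vendor_iban', 'vendor_name']
print(memory.read("vendor_iban", now=10.0))         # None
print(memory.read("fx_rate", now=120.0))            # None (validity window expired)
```

Items are flagged rather than deleted, so the record of what the agent once believed, and from which source, remains available for audit.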
Defenses
The survey is essentially a defensive blueprint. Practical takeaways for teams running agents in production:
- Instrument for process-level accountability, not just outputs. Capture tool calls, arguments, retrieved sources, memory accesses and inter-agent messages as structured trace units — OpenTelemetry-style spans adapted to agent semantics are a reasonable foundation.
- Build a typed provenance graph. Recording support/derivation/influence edges turns post-incident analysis from log archaeology into graph queries: “which untrusted source influenced this action?” becomes answerable.
- Apply information-flow and taint tracking. Treat tool outputs and retrieved content as tainted until proven otherwise, and flag when tainted data reaches a sensitive action — the structural signature of indirect prompt injection.
- Track memory lineage. Tag every memory write with its source and validity window so poisoned or stale items can be invalidated and audited.
- Move evaluation from final-answer correctness toward process correctness. The survey notes most benchmarks still grade endpoints; trace-based localization (e.g., TRAIL) and multi-agent failure analysis (MAST) grade the path.
Provenance is an accountability and detection layer, not a prevention mechanism on its own: it complements, rather than replaces, input filtering and least-privilege tool design.
Status
This is a survey, not a vulnerability, so there is nothing to patch. Its value is conceptual and operational: a vocabulary and taxonomy for a capability that agent platforms are only starting to ship. The authors flag the field as fragmented and list open challenges that double as a roadmap — unified trace schemas, claim-level and semantic provenance, provenance-aware safety mechanisms, realistic execution-trace benchmarks, recovery-oriented evaluation, and privacy-aware audit infrastructure. For anyone designing agent observability or incident-response tooling in 2026, it is a useful map of what to record and why.