OWASP / Agentic Top 10
A risk framework derived from the OWASP Top 10 for Agentic Applications 2026 (ASI Top 10), published by the OWASP GenAI Security Project – Agentic Security Initiative. It identifies the ten highest-impact security risks specific to AI agent systems that plan, decide, and act autonomously across multiple steps and systems, together with associated controls to reduce or eliminate those risks.
Type: Industry
Domain: Cybersecurity, Agentic
Coverage: Accountability & Governance, Safety & Reputational Harm
Tags: GenAI
Content: 10 Risks, 41 Controls
Version: 2026
Framework Definition
Risks and controls associated with the framework
Assessment Layer
Concrete evaluations linked to controls that assess pass or fail
No evaluation mapping defined yet.
RISK
Agent Goal Hijack
Risk that adversaries manipulate an AI agent's objectives, task selection, or decision pathways — through prompt injection, deceptive tool outputs, malicious artefacts, forged agent-to-agent messages, or poisoned external data — causing the agent to redirect its autonomous, multi-step behaviour toward unintended or harmful outcomes, resulting in data exfiltration, financial loss, operational disruption, or reputational harm.
CONTROL
Treat All Natural-Language Inputs as Untrusted
Ensure that all natural-language inputs — including user-provided text, uploaded documents, retrieved content, emails, calendar entries, and peer-agent messages — are routed through input validation and prompt-injection safeguards before they can influence goal selection, planning, or tool calls.
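The routing described above can be sketched as a screening gate in front of the planner. This is a minimal illustration, not a complete defence: the pattern list, function name, and verdict structure are assumptions, and real deployments layer model-based classifiers on top of rule-based screens.

```python
import re

# Illustrative patterns only (assumed for this sketch): production systems
# combine rule-based screens with model-based injection classifiers and
# treat a match as "quarantine for review", not as proof of attack.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"system prompt", re.I),
]

def screen_untrusted_input(text: str) -> dict:
    """Screen any natural-language input (user text, retrieved documents,
    emails, peer-agent messages) before it can reach goal selection,
    planning, or tool calls."""
    hits = [p.pattern for p in INJECTION_PATTERNS if p.search(text)]
    return {"allowed": not hits, "matched": hits}
```

The same gate is applied uniformly to every text channel, since the control treats uploaded documents and peer-agent messages as no more trustworthy than direct user input.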
CONTROL
Define, Lock, and Version-Control Agent System Prompts
Ensure that agent system prompts are explicitly defined with locked goal priorities and permitted actions, placed under configuration management, and that any changes to goals or reward definitions require human approval.
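One way to enforce the "locked and version-controlled" requirement at startup is to pin the approved prompt's digest and refuse to run on any mismatch. A minimal sketch, assuming the approved copy and its digest live in configuration management with human-gated changes; the prompt text here is hypothetical.

```python
import hashlib

# Hypothetical example: the approved prompt and its pinned digest would
# normally come from configuration management, where any change to goals
# or permitted actions requires human approval before the pin is updated.
APPROVED_PROMPT = "You are a billing assistant. Goals: answer invoice queries only."
APPROVED_SHA256 = hashlib.sha256(APPROVED_PROMPT.encode()).hexdigest()

def load_system_prompt(candidate: str) -> str:
    """Refuse to start the agent if the prompt drifts from the pinned,
    version-controlled copy."""
    digest = hashlib.sha256(candidate.encode()).hexdigest()
    if digest != APPROVED_SHA256:
        raise ValueError("system prompt does not match approved, version-controlled copy")
    return candidate
```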
CONTROL
Validate User and Agent Intent Before Executing Goal-Changing Actions
Ensure that both user intent and agent intent are validated at runtime before executing goal-changing or high-impact actions, requiring confirmation via human approval, policy engine, or platform guardrails whenever the agent proposes actions that deviate from the original task or scope.
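A runtime shape for this control is a fail-closed authorisation check: in-scope actions proceed, while anything that deviates from the original task, including all high-impact actions, requires confirmation. The scope and action names below are illustrative assumptions.

```python
# Illustrative task scope and high-impact set; in practice these would be
# derived from the originally approved task and a policy engine's rules.
APPROVED_SCOPE = {"read_invoice", "summarise_invoice"}
HIGH_IMPACT = {"send_payment", "delete_record"}

def authorize_action(action: str, human_approved: bool = False) -> bool:
    """Allow in-scope, low-impact actions; any deviation from the original
    scope — and every high-impact action — fails closed unless a human or
    policy engine has confirmed it."""
    if action in APPROVED_SCOPE and action not in HIGH_IMPACT:
        return True
    return human_approved
```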
CONTROL
Establish Behavioral Baselines and Monitor for Goal Drift
Ensure that comprehensive logging and continuous monitoring of agent activity is maintained, establishing a behavioral baseline that includes goal state, tool-use patterns, and invariant properties, with alerting on unexpected goal changes, anomalous tool sequences, or deviations from the established baseline.
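The baseline comparison can be reduced to its simplest form: flag any tool use not seen in the established baseline. A real baseline would also cover goal state, call sequencing, and rates, as the control states; the tool names here are assumptions.

```python
# Minimal drift check against a baseline of expected tools. Real baselines
# additionally track goal state, invariant properties, and tool-sequence
# statistics, alerting on deviations rather than just novel names.
BASELINE_TOOLS = {"search_docs", "read_invoice", "summarise"}

def detect_drift(observed_calls: list[str]) -> list[str]:
    """Return the tool calls that deviate from the established baseline."""
    return [t for t in observed_calls if t not in BASELINE_TOOLS]
```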
RISK
Tool Misuse and Exploitation
Risk that AI agents misuse legitimate tools — due to prompt injection, misalignment, unsafe delegation, or ambiguous instructions — leading to data exfiltration, tool output manipulation, workflow hijacking, destructive actions, or financial loss, even when the agent operates within its authorized privilege boundaries.
CONTROL
Define and Enforce Per-Tool Least-Privilege Profiles
Ensure that each tool available to an LLM agent has a defined least-privilege profile specifying permitted scopes, maximum invocation rate, and egress allowlists, expressed as IAM or authorization policy configurations rather than ad-hoc conventions.
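A per-tool profile of the kind described can be expressed declaratively; the control calls for IAM or authorization policy rather than in-process checks, so the class below is only a sketch of the data shape and enforcement semantics, with illustrative scopes and hosts.

```python
from dataclasses import dataclass

@dataclass
class ToolProfile:
    """Declarative least-privilege profile for one tool. In production
    this would be compiled into IAM/authorization policy, not enforced
    ad hoc in application code."""
    scopes: set
    max_calls: int
    egress_allowlist: set
    calls: int = 0

    def check(self, scope: str, host: str) -> bool:
        # Deny out-of-scope actions and non-allowlisted egress outright.
        if scope not in self.scopes or host not in self.egress_allowlist:
            return False
        # Enforce the maximum invocation rate (simplified to a hard cap).
        if self.calls >= self.max_calls:
            return False
        self.calls += 1
        return True

# Hypothetical tool: read-only web fetch limited to one approved host.
web_fetch = ToolProfile(scopes={"read"}, max_calls=2,
                        egress_allowlist={"docs.example.com"})
```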
CONTROL
Require Action-Level Authentication and Human Approval for Destructive Operations
Ensure that explicit authentication is required for each tool invocation and that human confirmation is mandated for high-impact or destructive actions (such as delete, transfer, or publish), with a pre-execution plan or dry-run preview presented before approval.
CONTROL
Run Tool Execution in Sandboxes with Egress Controls
Ensure that tool and code execution occurs in isolated sandboxes with enforced outbound network allowlists, denying all non-approved network destinations to contain the impact of misuse or injection.
CONTROL
Enforce Semantic and Identity Validation of Tool Calls
Ensure that fully qualified tool names and version pins are enforced to prevent tool alias collisions and typo-squatting, and that the intended semantics of tool calls are validated rather than relying on syntax alone, with ambiguous resolutions failing closed.
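The identity half of this control amounts to resolving tool calls only against fully qualified, version-pinned names and failing closed on anything else. A minimal sketch with an assumed registry; the tool names and versions are illustrative.

```python
# Illustrative registry keyed by fully qualified, version-pinned names.
# Semantic validation (checking the *intent* of the call, per the control)
# would sit on top of this identity check and is not shown here.
TOOL_REGISTRY = {
    "acme.files.read@1.4.2": "read_handler",
    "acme.files.write@1.4.2": "write_handler",
}

def resolve_tool(requested: str) -> str:
    """Reject aliases, typo-squats, and unpinned names: exact-match only,
    failing closed on any ambiguous or unknown reference."""
    if requested not in TOOL_REGISTRY:
        raise LookupError(f"unknown or unpinned tool: {requested!r}")
    return TOOL_REGISTRY[requested]
```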
CONTROL
Maintain Immutable Logs of Tool Invocations and Monitor for Anomalous Chaining
Ensure that immutable logs of all tool invocations and parameter changes are maintained, with continuous monitoring for anomalous execution rates, unusual tool-chaining patterns, and policy violations.
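One standard way to make invocation logs tamper-evident is hash chaining: each entry's hash covers the previous entry's hash, so an in-place edit breaks verification from that point on. This is a sketch of the mechanism only; immutability in practice also requires shipping entries to append-only external storage. Record contents are illustrative.

```python
import hashlib
import json

class ChainedLog:
    """Hash-chained log of tool invocations: editing any stored record
    invalidates every subsequent hash, making tampering detectable."""

    def __init__(self):
        self.entries = []
        self._last = "0" * 64  # genesis value

    def append(self, record: dict) -> None:
        payload = json.dumps(record, sort_keys=True)
        self._last = hashlib.sha256((self._last + payload).encode()).hexdigest()
        self.entries.append({"record": record, "hash": self._last})

    def verify(self) -> bool:
        prev = "0" * 64
        for entry in self.entries:
            payload = json.dumps(entry["record"], sort_keys=True)
            prev = hashlib.sha256((prev + payload).encode()).hexdigest()
            if prev != entry["hash"]:
                return False
        return True

log = ChainedLog()
log.append({"tool": "acme.files.read", "status": "ok"})
log.append({"tool": "acme.files.write", "status": "denied"})
```

Monitoring for anomalous chaining would then run over the verified entries, e.g. alerting on unusual tool sequences or execution rates.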
RISK
Identity and Privilege Abuse
Risk that the architectural mismatch between user-centric identity systems and agentic design is exploited — through unscoped privilege inheritance, memory-based credential retention, cross-agent trust exploitation, TOCTOU authorization drift, or synthetic identity injection — enabling escalation of access, hijacking of privileges, or execution of unauthorised actions across interconnected systems.
CONTROL
Issue Task-Scoped, Time-Bound Credentials per Agent
Ensure that each agent is issued short-lived, narrowly scoped credentials (e.g., mTLS certificates or scoped tokens) per task, with permission boundaries that limit blast radius and block delegation abuse, unscoped inheritance, and orphaned privileges.
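The short-lived, task-scoped credential pattern can be sketched with a signed token carrying scopes and an expiry. A real system would use an identity provider issuing mTLS certificates or OAuth scoped tokens; the HMAC key, claim layout, and scope names here are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json
import time

# Assumed demo key: in production, signing is delegated to an identity
# provider and keys are rotated, never embedded in code.
SIGNING_KEY = b"demo-key-rotate-me"

def issue_credential(agent_id: str, scopes: list, ttl_s: int = 300) -> str:
    """Issue a short-lived token scoped to one task's permissions."""
    claims = {"agent": agent_id, "scopes": scopes, "exp": time.time() + ttl_s}
    body = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
    sig = hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"

def verify_credential(token: str, required_scope: str) -> bool:
    """Reject forged, expired, or out-of-scope credentials."""
    body, sig = token.rsplit(".", 1)
    expected = hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False
    claims = json.loads(base64.urlsafe_b64decode(body))
    return time.time() < claims["exp"] and required_scope in claims["scopes"]
```

Because tokens expire quickly and carry only the scopes of the current task, orphaned or inherited privileges die with the task rather than accumulating.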
CONTROL
Isolate Agent Identities and Wipe Context Between Sessions
Ensure that agents run in per-session sandboxes with separated permissions and memory, with state wiped between tasks and users to prevent memory-based credential escalation and cross-repository data exfiltration.
CONTROL
Re-Verify Permissions at Each Privileged Step via a Centralised Policy Engine
Ensure that each privileged step in an agent workflow is re-verified by a centralised policy engine that evaluates current permissions and external context at decision time, preventing TOCTOU authorization drift, cross-agent trust exploitation, and reflection-loop privilege elevation.
CONTROL
Detect and Alert on Delegated and Transitive Permission Changes
Ensure that monitoring is in place to detect when an agent gains new permissions indirectly through delegation chains, flagging cases where a low-privilege agent inherits or is handed higher-privilege scopes during multi-agent workflows, and alerting on abnormal cross-agent privilege elevation.
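The detection invariant here is simple to state: along a delegation chain, a delegate's effective scopes must be a subset of its delegator's, so privilege can only narrow. A sketch of that check over an assumed chain representation; agent IDs and scope names are illustrative.

```python
def delegation_violations(chain: list) -> list:
    """Flag scopes gained along a delegation chain.

    chain: [(agent_id, scopes), ...] ordered from root delegator to leaf.
    Returns [(agent_id, gained_scopes), ...] for every agent that holds
    scopes its delegator did not — the signature of transitive escalation.
    """
    alerts = []
    for (parent, pscopes), (child, cscopes) in zip(chain, chain[1:]):
        gained = set(cscopes) - set(pscopes)
        if gained:
            alerts.append((child, sorted(gained)))
    return alerts
```

Any non-empty result would feed the alerting path the control describes, flagging abnormal cross-agent privilege elevation for review.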
RISK
Agentic Supply Chain Vulnerabilities
Risk that agents, tools, and artefacts sourced from third parties — including models, plug-ins, datasets, other agents, MCP servers, agent registries, agentic communication interfaces (MCP, A2A), or update channels — are malicious, compromised, or tampered with, introducing unsafe code, hidden instructions, or deceptive behaviours into the agent's execution chain, which can cascade vulnerabilities across multi-agent systems at runtime.
CONTROL
Sign, Attest, and Maintain Inventory of All Agentic Components
Ensure that manifests, prompts, and tool definitions are signed and attested, that SBOMs and AIBOMs with periodic attestations are operationalised for all AI components, and that only curated registries are used with untrusted sources blocked.
CONTROL
Allowlist, Pin, and Scan Agentic Dependencies for Typo-Squatting
Ensure that all agentic dependencies (packages, tool adapters, prompt templates) are allowlisted, pinned by content hash and commit ID, and scanned for typo-squatting across registries (PyPI, npm, LangChain, LlamaIndex), with automatic rejection of unsigned or unverified components.
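The typo-squatting scan can be approximated with a similarity check against the allowlist: a name that is close to, but not exactly, an approved dependency is suspicious. A heuristic sketch using the standard library; the allowlist contents and cutoff are assumptions, and real pipelines would also verify hashes and signatures before install.

```python
import difflib

# Illustrative allowlist of pinned, approved dependencies.
ALLOWLIST = {"langchain", "llama-index", "requests"}

def typosquat_suspects(candidate: str, cutoff: float = 0.85) -> list:
    """Return allowlisted names the candidate closely resembles.

    An exact allowlist match is clean; a near-miss (e.g. one transposed
    or dropped letter) is a typo-squat suspect and should block install
    pending review. Unrelated names return an empty list and are handled
    by the allowlist check itself."""
    if candidate in ALLOWLIST:
        return []
    return difflib.get_close_matches(candidate, ALLOWLIST, n=3, cutoff=cutoff)
```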
CONTROL
Enforce Mutual Authentication and Message Signing for Inter-Agent Interfaces
Ensure that inter-agent communication interfaces (MCP, A2A) enforce mutual authentication and attestation via PKI and mTLS, with no open registration, and that all inter-agent messages are signed and verified.
CONTROL
Implement a Supply Chain Kill Switch for Emergency Revocation
Ensure that emergency revocation mechanisms are in place that can instantly disable specific tools, prompts, or agent connections across all deployments when a compromise is detected, preventing cascading damage across the agent ecosystem.
RISK
Unexpected Code Execution
Risk that agentic systems generating and executing code are exploited — through prompt injection, tool misuse, unsafe serialisation, or malicious package installation — to achieve remote code execution, local system misuse, container escape, or exploitation of internal systems, bypassing traditional security controls because the code is generated in real-time by the agent.
CONTROL
Prohibit eval() and Enforce Safe Interpreter Usage in Production Agents
Ensure that unsafe evaluation functions (eval(), exec()) are prohibited in production agent environments, safe interpreters with taint-tracking are used instead, and static scans are performed before any agent-generated code is executed.
CONTROL
Run Agent Code Execution in Hardened Sandboxes with Minimal Privilege
Ensure that agent-generated code is executed in sandboxed containers running without root privileges, with strict network access limits, filesystem access restricted to dedicated working directories, and known-vulnerable packages blocked.
CONTROL
Separate Code Generation from Execution with Validation Gates
Ensure that code generation and code execution are architecturally separated by validation gates, with human approval required for elevated or production-system execution runs, and that direct agent-to-production system access is prevented.
RISK
Memory and Context Poisoning
Risk that adversaries corrupt or seed an agent's stored context — including conversation history, memory tools, embeddings, and RAG stores — with malicious or misleading data, causing future reasoning, planning, or tool use to become biased, unsafe, or to facilitate exfiltration, with effects that persist across sessions, propagate between cooperating agents, and resist remediation once embedded.
CONTROL
Validate and Scan All Memory Writes for Malicious or Sensitive Content
Ensure that all new memory writes and model outputs are scanned using rule-based and AI-based methods for malicious instructions or sensitive content before being committed to agent memory, preventing injection of persistent adversarial payloads.
CONTROL
Segment Memory by User Session and Domain Context
Ensure that agent memory is logically segmented by user session and domain context to prevent knowledge leakage and sensitive data cross-contamination between users, tenants, or tasks, using per-tenant namespaces in shared vector and memory stores.
CONTROL
Expire Unverified Memory and Maintain Rollback Capability
Ensure that unverified memory entries are expired after a defined period to limit poison persistence, that snapshots and version control are maintained for rollback, and that high-impact memory entries require a provenance score or human-verified tag before surfacing.
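The expiry policy can be sketched as a store where unverified entries carry a TTL and are dropped on read once it lapses, while human-verified entries persist. Durations and the storage shape are illustrative; snapshots, rollback, and provenance scoring from the control are not shown.

```python
import time

class ExpiringMemory:
    """Memory store where unverified entries expire after ttl_s seconds,
    limiting how long a poisoned write can persist; verified entries
    (human-reviewed or provenance-scored) are retained."""

    def __init__(self, ttl_s: float = 3600):
        self.ttl_s = ttl_s
        self.entries = []  # (written_at, verified, text)

    def write(self, text: str, verified: bool = False) -> None:
        self.entries.append((time.time(), verified, text))

    def read(self) -> list:
        now = time.time()
        # Evict expired, unverified entries before surfacing anything.
        self.entries = [e for e in self.entries
                        if e[1] or now - e[0] < self.ttl_s]
        return [text for _, _, text in self.entries]
```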
CONTROL
Prevent Re-Ingestion of Agent-Generated Output into Trusted Memory
Ensure that agent-generated outputs are not automatically re-ingested into trusted memory stores without validation, preventing self-reinforcing contamination and bootstrap poisoning scenarios.
RISK
Insecure Inter-Agent Communication
Risk that inter-agent message exchanges lack proper authentication, integrity, confidentiality, or semantic validation — enabling interception, spoofing, replay, protocol downgrade, or semantic manipulation of agent messages across transport, routing, discovery, and semantic layers — leading to misinformation, privilege confusion, or coordinated manipulation across distributed agentic systems.
CONTROL
Enforce End-to-End Encryption and Mutual Authentication for Agent Channels
Ensure that all inter-agent communications use end-to-end encryption with per-agent credentials and mutual authentication, enforcing PKI certificate pinning, forward secrecy, and regular protocol reviews to prevent interception or spoofing.
CONTROL
Digitally Sign and Semantically Validate Inter-Agent Messages
Ensure that all inter-agent messages are digitally signed with both payload and context hashed, and that natural-language-aware sanitisation and intent-diffing are applied to detect goal manipulation, parameter tampering, and hidden instructions within messages.
CONTROL
Enforce Protocol Pinning and Reject Downgrade Attempts
Ensure that allowed protocol versions (MCP, A2A, gRPC) are defined and enforced, that downgrade attempts and unrecognised schemas are rejected, and that both peers are required to advertise matching capability and version fingerprints.
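Protocol pinning during a handshake reduces to a strict check: both peers must advertise an exact allowed version, and any mismatch or unrecognised value — including an attempt to "negotiate down" — is rejected. The version strings below are illustrative assumptions.

```python
# Illustrative pinned versions; real deployments would pin the exact MCP,
# A2A, or gRPC protocol revisions they have reviewed and approved.
ALLOWED_VERSIONS = {"mcp/2025-06-18", "a2a/1.0"}

def negotiate(local: str, remote: str) -> str:
    """Accept a connection only when both peers advertise the same pinned
    protocol version; reject downgrades and unrecognised schemas."""
    if local not in ALLOWED_VERSIONS or remote not in ALLOWED_VERSIONS:
        raise ConnectionError("unrecognised protocol version")
    if local != remote:
        raise ConnectionError("version mismatch; refusing downgrade")
    return local
```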
CONTROL
Authenticate Discovery and Coordination Messages via Attested Registries
Ensure that all discovery and coordination messages are authenticated using cryptographic identity, and that agent registries or marketplaces provide digital attestation of agent identity, provenance, and descriptor integrity with signed agent cards and continuous verification before accepting coordination traffic.
RISK
Cascading Failures
Risk that a single fault — such as a hallucination, malicious input, corrupted tool, or poisoned memory — propagates and amplifies across autonomous agents, tools, and workflows, bypassing stepwise human checks to cause system-wide harm including widespread service failures, cross-domain data compromise, operational disruption, and consequences that outpace human ability to intervene.
CONTROL
Implement Blast-Radius Guardrails Between Planner and Executor
Ensure that blast-radius guardrails — including quotas, progress caps, and circuit breakers — are implemented between planner and executor components to limit the propagation scope of any single agent fault.
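The circuit-breaker element of this control follows a well-known pattern: after a set number of executor faults, the breaker opens and all further planned actions are refused until a human resets it. A minimal sketch; the fault threshold and reset policy are assumptions.

```python
class CircuitBreaker:
    """Breaker placed between planner and executor: repeated faults open
    the circuit, halting execution and capping the blast radius of a
    single compromised or malfunctioning component."""

    def __init__(self, max_faults: int = 3):
        self.max_faults = max_faults
        self.faults = 0
        self.open = False

    def execute(self, action, *args):
        if self.open:
            raise RuntimeError("circuit open: execution halted pending review")
        try:
            return action(*args)
        except Exception:
            self.faults += 1
            if self.faults >= self.max_faults:
                self.open = True  # stop propagation; require human reset
            raise
```

Quotas and progress caps from the control would sit alongside this, bounding work done per task even when no fault is raised.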
CONTROL
Enforce Independent Policy Engines to Separate Planning from Execution
Ensure that planning and execution are separated via an external policy engine that independently validates planned actions, preventing corrupt or manipulated planning from directly triggering harmful downstream operations.
CONTROL
Implement Human Oversight Gates Before Propagating High-Risk Agent Outputs
Ensure that checkpoints requiring human review or governance agent validation are placed before high-risk agent outputs are propagated downstream to other agents, tools, or systems, with rate limiting applied to detect and throttle fast-spreading commands.
CONTROL
Record Tamper-Evident Logs with Non-Repudiation Across All Agent Actions
Ensure that all inter-agent messages, policy decisions, and execution outcomes are recorded in tamper-evident, time-stamped logs bound to cryptographic agent identities, maintaining lineage metadata for every propagated action to support forensic traceability and rollback validation during cascade events.
RISK
Human-Agent Trust Exploitation
Risk that adversaries or misaligned agent designs exploit the natural language fluency, emotional intelligence, and perceived expertise of AI agents to manipulate human users into disclosing sensitive information, approving harmful actions, or making unsafe decisions — leveraging automation bias, authority bias, and fabricated rationales to bypass human oversight — resulting in data breaches, financial losses, and reputational harm, with the agent's role invisible to forensic investigation.
CONTROL
Require Multi-Step Human Confirmation for Sensitive or Irreversible Actions
Ensure that multi-step approval or human-in-the-loop controls are required before agents access extra-sensitive data or perform irreversible actions, with immutable tamper-proof audit records of all user queries and agent actions maintained for forensic purposes.
CONTROL
Provide Plain-Language Risk Summaries and Enable Reporting of Suspicious Behaviour
Ensure that in user-interactive systems, plain-language risk summaries (not model-generated rationales) are displayed for high-impact actions, and that users have a clear mechanism to flag suspicious or manipulative agent behaviour, triggering automated review or temporary capability lockdown.
CONTROL
Implement Adaptive Trust Calibration and Human-Factors UI Safeguards
Ensure that agent autonomy and required human oversight are continuously adjusted based on contextual risk scoring, and that UI safeguards visually differentiate high-risk recommendations, with personnel trained to recognise manipulation patterns and agent limitations.
CONTROL
Detect Plan Divergence from Approved Workflow Baselines
Ensure that agent action sequences are continuously compared against approved workflow baselines, with alerts triggered when unusual detours, skipped validation steps, or novel tool combinations indicate possible deception, drift, or manipulation.
RISK
Rogue Agents
Risk that AI agents become malicious or compromised and deviate from their intended function or authorised scope — through goal drift, workflow hijacking, reward hacking, or self-replication — acting harmfully, deceptively, or parasitically within multi-agent or human-agent ecosystems, with individually legitimate-appearing actions whose emergent behaviour escapes detection by traditional rule-based controls.
CONTROL
Maintain Comprehensive Immutable Audit Logs of All Agent Actions
Ensure that comprehensive, immutable, and signed audit logs are maintained for all agent actions, tool calls, and inter-agent communication, enabling review for stealth infiltration, unapproved delegation, and coordinated collusion patterns.
CONTROL
Assign Trust Zones and Deploy Restricted Execution Environments
Ensure that agents are assigned to trust zones with strict inter-zone communication rules and deployed in restricted execution environments (e.g., container sandboxes) with API scopes based on least privilege, with suspicious agents quarantined in isolated environments for forensic review.
CONTROL
Deploy Behavioural Detection and Watchdog Agents
Ensure that behavioural detection is deployed — including watchdog agents that validate peer behaviour and outputs — focusing on detecting collusion patterns, coordinated false signals, and anomalous action executions such as excessive or unexpected tool invocations.
CONTROL
Implement Kill Switches and Credential Revocation for Rapid Agent Containment
Ensure that rapid containment mechanisms — including kill switches and credential revocation — are in place to instantly disable rogue agents, with a recovery and reintegration process requiring fresh attestation, dependency verification, and human approval before an agent is returned to production.
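At its core, the kill-switch mechanism is a revocation set consulted before every agent action: once an agent is revoked, all of its requests fail immediately wherever the set is checked. A sketch with an in-process set; a real deployment would use a shared, replicated revocation store so containment takes effect across all nodes at once.

```python
# In-process revocation set for illustration; production systems replicate
# this (e.g. via a shared revocation service) so a kill switch propagates
# to every deployment instantly.
REVOKED = set()

def kill_switch(agent_id: str) -> None:
    """Instantly disable an agent: all subsequent actions are blocked
    until it passes recovery (fresh attestation and human approval)."""
    REVOKED.add(agent_id)

def pre_action_check(agent_id: str) -> None:
    """Consulted before every tool call or message send."""
    if agent_id in REVOKED:
        raise PermissionError(f"agent {agent_id} revoked; action blocked")
```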
CONTROL
Enforce Per-Agent Cryptographic Identity Attestation and Behavioural Integrity Baselines
Ensure that each agent has a cryptographic identity with signed behavioural manifests declaring expected capabilities, tools, and goals, validated by orchestration services before each action, with a continuous behavioural verification layer that monitors for deviations from the declared manifest and ephemeral per-run credentials mediated by orchestrators.