owasp_agentic_skills_top10_2026

OWASP / Skills Top 10

A risk framework derived from the OWASP Agentic Skills Top 10 (AST10) 2026, authored by Ken Huang and published as an OWASP Incubator Project. It identifies the ten most critical security risks in agentic AI skills — the reusable, named behaviours that encode complete workflows and give agents real-world impact across OpenClaw (SKILL.md), Claude Code (skill.json), Cursor/Codex (manifest.json), and VS Code (package.json) ecosystems — together with associated controls to reduce or eliminate those risks.

Type:

Industry

Domain:

Cybersecurity

Coverage:

Accountability & Governance

Safety & Reputational Harm

Tags:

Agents

Content:

10 Risks

26 Controls

Version: 1.0-2026

Framework Definition

Risks and controls associated with the framework

Assessment Layer

Concrete evaluations linked to controls to assess pass or fail

No evaluation mapping defined yet.

RISK

Malicious Skills

Risk that adversaries publish agent skills — on registries such as ClawHub or skills.sh — that appear legitimate but contain hidden malicious payloads including credential stealers, reverse shells, backdoors, and social engineering instructions embedded in skill prose. Because skills execute with the full permissions of the host agent, a malicious skill gains immediate access to API keys, SSH credentials, wallet files, browser data, and shell, exploiting both the code layer and the natural language instruction layer simultaneously.

AST01:2026

3 Controls

CONTROL

Require Cryptographic Signatures on All Published Skills

Ensure that cryptographic signatures (e.g., ed25519) are required for all published skills, that unsigned installs are rejected, and that hash-pinned installed skills alert on any modification to detect tampering after installation.

C.S1.1

CONTROL

Implement Merkle Root Signing and Multi-Layer Scanning at Registry Level

Ensure that skill registries implement Merkle root signing, that skills are scanned at publish time and install time using behavioral analysis (not just pattern matching), and that publisher trust level, install count, and scan status are displayed in the registry UI.

C.S1.2

CONTROL

Never Auto-Execute Skill Prerequisites Without Explicit User Review

Ensure that "Prerequisites" sections in skill definitions are never automatically executed without explicit user review, and that users are warned before any skill-initiated terminal commands or external downloads are permitted.

C.S1.3

RISK

Supply Chain Compromise

Risk that skill registries and distribution channels — lacking the provenance controls of mature package ecosystems — are exploited through coordinated mass uploads, dependency confusion, account takeover, repository poisoning, and configuration file hijacking. Repository configuration files that were once passive metadata have become active execution paths, meaning a compromised skill inherits the agent's full credential set — not just sandboxed package permissions.

AST02:2026

3 Controls

CONTROL

Implement Skill Provenance Tracking with Transparency Logs

Ensure that each published skill is linked to a verified code-signing identity and that transparency logs are maintained for all registry operations (publish, update, delete) — analogous to Certificate Transparency — enabling detection of unauthorised or coordinated mass-upload campaigns.

C.S2.1

CONTROL

Pin All Nested Dependencies to Immutable Hashes

Ensure that all direct and transitive dependencies are pinned to immutable content hashes (sha256:) rather than version ranges, and that recursive dependency trees — not just top-level skill files — are scanned for tampering or malicious indicators.

C.S2.2

CONTROL

Treat Repository Configuration Files as Executable Code with Trust Gates

Ensure that repository configuration files including hooks, settings files, and environment overrides are treated as executable code subject to the same trust gates as skill definitions, requiring explicit user trust confirmation before any repository-controlled configuration is permitted to execute.

C.S2.3

RISK

Over-Privileged Skills

Risk that skills are granted broader permissions than their stated function requires — because no mandatory permission manifest system exists or users accept all permissions without review — creating excessive blast radius where a legitimate skill with overly permissive access can be weaponised by downstream prompt injection to execute operations it was never intended to perform, including accessing agent identity files or exfiltrating credentials.

AST03:2026

3 Controls

CONTROL

Require a Minimum-Scope Permission Manifest for Every Skill

Ensure that every skill is required to declare a permission manifest specifying permitted files, network access, shell access, and tools, that skills without a manifest are rejected, and that per-skill scoped credentials rather than shared agent-level API keys are enforced.

C.S3.1

CONTROL

Flag Skills Requesting Write Access to Agent Identity Files for Elevated Review

Ensure that skills requesting write access to agent identity files (SOUL.md, MEMORY.md, AGENTS.md) are automatically flagged for elevated security review, and that network access is scoped to specific domain allowlists rather than a binary network on/off permission.

C.S3.2

CONTROL

Validate Manifest Declarations Against Observed Runtime Behaviour

Ensure that runtime permission enforcement — not just declarative manifest checking — is applied, and that manifest permission declarations are validated against actual observed runtime behaviour in sandboxed testing before skills are approved for distribution.

C.S3.3

RISK

Insecure Metadata

Risk that skill metadata fields — name, description, author, permissions, risk tier — are attacker-controlled strings with no validation, signing, or trust anchoring, enabling adversaries to impersonate trusted brands, understate permissions, misdeclare risk tiers, poison registry search results, and embed malicious instructions in skill prose using ASCII smuggling, base64 encoding, or zero-width Unicode that is invisible to human reviewers but interpreted by the agent.

AST04:2026

3 Controls

CONTROL

Apply Static Analysis to Metadata Fields and Validate Permissions at Publish Time

Ensure that static analysis is applied to all metadata fields at publish time to flag suspicious patterns, and that declared permissions are validated against observed sandbox runtime behaviour to detect understated or misdeclared scope.

C.S4.1

CONTROL

Scan Skill Definitions for Steganographic and Obfuscation Techniques

Ensure that skill definition files are scanned for ASCII smuggling, base64 payloads, and zero-width Unicode characters that could hide malicious instructions from human reviewers while remaining interpretable by the agent.

C.S4.2

CONTROL

Surface Metadata Provenance and Implement Brand Protection at Registry Level

Ensure that metadata provenance — including the declaring identity, timestamp, and signing key — is surfaced in registry UI, and that brand and trademark protection mechanisms are implemented to prevent impersonation of trusted publishers at the registry level.

C.S4.3

RISK

Unsafe Deserialization

Risk that AI agent skill files — in YAML, JSON, and Markdown formats with well-documented deserialization vulnerabilities — are loaded using unsafe parsers or without sandboxing, enabling attackers to embed executable payloads that trigger at skill load time before any user action, with the attack surface including SKILL.md YAML frontmatter, package.json, manifest.json, requirements.txt, and any configuration pulled during skill initialisation.

AST05:2026

2 Controls

CONTROL

Use Safe Parsers and Disable Dangerous Deserialization Tags by Default

Ensure that safe YAML loaders are used by default with dangerous tags explicitly disabled (e.g., yaml.safe_load instead of yaml.load, disabling !!python/object and !!python/apply), and that an allowlist of permitted YAML/JSON keys is applied to reject any unexpected fields.

C.S5.1

CONTROL

Parse and Validate Skill Config Files in an Isolated Process Before Execution

Ensure that all skill configuration files — including requirements.txt, package.json, and pyproject.toml — are parsed and validated in an isolated subprocess or container before execution, and that skill files are never deserialised with elevated privileges.

C.S5.2

RISK

Weak Isolation

Risk that skills execute in the same security context as the host agent — with full file system access, shell access, and network egress — because sandboxing is unavailable, optional, or disabled by default, removing all containment guarantees and enabling any installed skill to achieve full system compromise, persistent host backdoors, network pivoting, and localhost-based attacks against the agent's control interface.

AST06:2026

3 Controls

CONTROL

Require Container or Docker Isolation for Skill Execution by Default

Ensure that container or Docker isolation is the default for skill execution and that host-mode execution requires explicit opt-in with documented risk acceptance, with seccomp and AppArmor profiles applied to constrain agent system call surface and per-skill process isolation enforced in separate namespaces.

C.S6.1

CONTROL

Bind Agent Control Interfaces to Localhost with Authentication and Rate Limiting

Ensure that agent control interfaces (e.g., WebSocket endpoints) are bound to localhost with strong authentication, are never exposed on 0.0.0.0 by default, and that all connections including those from localhost are rate-limited and authenticated to prevent cross-origin brute-force attacks.

C.S6.2

CONTROL

Restrict Skill Hot-Reload and Require User Confirmation for Workspace Overrides

Ensure that skill hot-reload is restricted and disabled in non-development environments, and that explicit user confirmation is required for any workspace skill overrides that could shadow built-in functionality through runtime precedence mechanisms.

C.S6.3

RISK

Update Drift

Risk that installed skills drift out of sync with known-good versions — either because patches are not applied leaving known vulnerabilities exploitable, or because auto-update mechanisms blindly apply upstream changes that may themselves be malicious — amplified by the absence of enterprise patch management for individually installed skills and the inability to verify "fix" versions without cryptographic pinning.

AST07:2026

2 Controls

CONTROL

Pin All Installed Skills to Immutable Content Hashes

Ensure that all installed skills are pinned to immutable content hashes (sha256:) rather than version ranges, that cryptographic signature verification is required on every update with unsigned updates refused, and that an inventory of installed skills with version, hash, and last-verified timestamp is maintained.

C.S7.1

CONTROL

Require Human Approval for Skill Updates in Enterprise Environments

Ensure that skill updates in enterprise deployments are subject to a human approval step before being applied, that production deployments operate in a freeze mode prohibiting hot-reload, and that security advisories for installed skills are actively monitored with alerts on CVE matches.

C.S7.2

RISK

Poor Scanning

Risk that security scanning tools designed for traditional code are ineffective against agent skills — because skills blend natural language instructions with code in ways that defeat pattern matching, regex filters, and signature-based detection — enabling adversaries to distribute malicious skills that pass all available checks, including through pure natural-language social engineering with no detectable code signatures and through obfuscation techniques invisible to text-based scanners.

AST08:2026

2 Controls

CONTROL

Deploy Behavioral and Semantic Analysis Scanners Alongside Pattern Matching

Ensure that behavioral analysis scanners that evaluate intent rather than just signatures are deployed alongside deterministic rules, that both the code layer and the natural language instruction layer are scanned independently, and that multi-tool scanning pipelines combine pattern matching, semantic analysis, and behavioral sandbox evaluation.

C.S8.1

CONTROL

Test Skills in Isolated Sandboxes and Compare Runtime to Declared Behaviour

Ensure that skills are tested in isolated sandboxes with actual runtime behaviour observed and compared against declared behaviour in the manifest, and that installed skills are continuously re-scanned as scanner models improve — not only at initial install time.

C.S8.2

RISK

No Governance

Risk that organisations deploying AI agents lack the inventories, policies, review processes, and audit trails needed to manage skills at enterprise scale — with skills installed by individual developers without SOC visibility, approval workflow, or revocation mechanism — creating a shadow AI layer that security teams cannot see or control, enabling undetected compromise, orphaned credentials, regulatory exposure, and cascading agent compromise across multi-agent pipelines.

AST09:2026

3 Controls

CONTROL

Establish a Centralised Skill Inventory with Version, Hash, and Scan Status

Ensure that a centralised skill inventory is established recording skill name, version, content hash, install date, installer identity, and last scan status for all agent skills across the organisation, integrated into existing CMDB and ITSM tooling.

C.S9.1

CONTROL

Implement Approval Workflows and Agentic Identity Controls for Skill Installation

Ensure that all skill installations in enterprise environments require a formal security review and approval workflow, that non-human identities (NHIs) with scoped credentials are assigned to agents on a rotation schedule, and that comprehensive audit logging covers all skill actions including file access, network calls, shell commands, and memory writes.

C.S9.2

CONTROL

Establish a Formal Skill Revocation Process Tied to Offboarding and Incident Response

Ensure that a formal skill revocation process exists, linked to employee offboarding workflows and incident response playbooks, so that skills installed by departing employees are deprovisioned and credentials associated with compromised skills are immediately revoked.

C.S9.3

RISK

Cross-Platform Reuse Without Security Normalisation

Risk that skills ported across platforms — OpenClaw to Claude Code to Cursor to VS Code — are not re-validated for security properties of the target format, causing permission manifests, risk tier declarations, and signing metadata present in the source format to be silently stripped in translation, and enabling the same malicious payload to be deployed simultaneously across multiple platforms whose separate scanning and governance systems are unaware of each other's incidents.

AST10:2026

2 Controls

CONTROL

Adopt the Universal Skill Format with Normalised Security Metadata

Ensure that new skill development adopts a universal skill format that normalises security properties — including permission manifests, risk tier, cryptographic signatures, content hash, and scan status — across all target platforms, with the security metadata re-validated when skills are ported between platform ecosystems.

C.S10.1

CONTROL

Establish Cross-Registry Threat Intelligence Sharing

Ensure that cross-registry threat intelligence sharing is established between major skill registries so that malicious skills detected on one platform trigger automated scanning and removal processes on all other platforms where the same payload may have been deployed.

C.S10.2