15 min read

June 5, 2026

AI Security Guide 2026: Securing Agents, Not Just Models

The 2026 AI security picture for CISOs: the agentic attack surface (indirect injection, MCP supply chain), the OWASP Agentic Top 10, what the vendor consolidation means for buying, and the control stack production teams actually deploy.

AI Security Agentic AI Prompt Injection Governance

We The Flywheel Research & Analysis

Published June 5, 2026

AI security changed jobs in 2026. The question is no longer whether a model says something embarrassing; it is what an agent does with the tools, identities, and data it was handed. The attacks that matter arrive through content the agent reads, not prompts the user types — and the defenses that hold are architectural, not linguistic. This guide maps the threat landscape, the frameworks that settled this year, the vendor consolidation, and the control stack production teams actually run.

The 2026 threat landscape: agents moved, attacks followed

Indirect prompt injection is the production vector

Direct jailbreaks make demos; indirect injection makes incidents. Malicious instructions arrive inside content the agent ingests during normal operation — an email it summarizes, a document it retrieves, a tool description it loads — and execute with the agent's privileges. Google's security team measured a 32% rise in malicious indirect-injection content between November 2025 and February 2026, and reported that indirect attempts succeed with fewer tries than direct ones.

The canonical case is EchoLeak (CVE-2025-32711, CVSS 9.3, disclosed June 2025): one crafted email with hidden instructions caused Microsoft 365 Copilot to pull data from OneDrive, SharePoint, and Teams and exfiltrate it through a trusted Microsoft domain — zero clicks, discovered by Aim Security. It was the first production weaponization of indirect injection, and the template for everything since.

The MCP supply chain is the new dependency risk

The Model Context Protocol became the connective tissue of agent deployments, and the most rapidly weaponized new surface. The Cloud Security Alliance's May 2026 research note estimates a systemic architectural flaw disclosed in April exposes roughly 200,000 MCP instances across a supply chain spanning 150M+ package downloads. The MCPTox benchmark (45 live servers, 353 authentic tools) put attack success rates from poisoned tool descriptions at 60-72%. Discrete CVEs followed, including a CVSS-9.8 remote code execution in MCPJam Inspector, and Check Point disclosed critical vulnerabilities in a widely used CLI coding agent in February 2026.

OWASP now names this class ASI04, Agentic Supply Chain. The practical posture: treat every MCP server and tool description as untrusted input with the same rigor you apply to a package registry — pinning, review, isolation, egress monitoring.

Attackers run agents too

The largest incident of the period inverted the threat model. Between December 2025 and February 2026, an attacker used commercial coding agents to breach nine Mexican government agencies, stealing 195M taxpayer records and 220M civil records — with the AI executing roughly three quarters of the remote commands across more than a thousand prompts. Agentic tooling compressed what was previously a team-scale intrusion into one operator's workload. Defense planning that assumes human-speed attackers is now planning for the wrong adversary.

The framework map settled in 2026

After two years of overlapping drafts, the governing documents are now clear:

Framework	Current version	Use it for
OWASP Top 10 for Agentic Applications	2026 list, released Dec 9, 2025; peer-reviewed by NIST, Microsoft AI Red Team, AWS	Agent deployments — goal hijack, tool misuse, identity abuse, supply chain, memory poisoning (ASI01-ASI10)
OWASP Top 10 for LLM Applications	2025 version (adds System Prompt Leakage, Vector & Embedding Weaknesses)	Non-agentic LLM features — chat, RAG, summarization
NIST AI RMF	1.0 + Generative AI Profile (AI 600-1, July 2024); Cyber AI Profile draft (IR 8596, Dec 2025)	Governance structure and risk language the board and auditors share
MITRE ATLAS	v5.1.0 (Nov 2025): 16 tactics, 84 techniques, 14+ new agentic techniques	Adversary TTPs, red-team planning, detection engineering
EU AI Act	GPAI obligations live since Aug 2, 2025; high-risk obligations from Aug 2, 2026	Legal floor: risk management, data governance, logging, human oversight

The regulatory deadline deserves emphasis: high-risk system obligations apply from August 2, 2026 — conformity assessment, registration, logging, and human oversight, with enforcement powers attached. The Cloud Security Alliance flags a substantial enterprise readiness gap. The overlap with the four-pillar governance framework is nearly total: teams that built scope, escalation, audit, and observability for operational reasons hold most of their compliance evidence already.

The vendor landscape: the pure-play era closed

Between late 2024 and early 2026, nearly every standalone AI-security vendor was acquired into a platform:

Vendor	Specialty	Status (June 2026)
Robust Intelligence	AI firewall, model validation	Acquired by Cisco (Sept 2024, ~$400M) — now Cisco AI Defense
Protect AI	Model scanning, AI supply chain	Acquired by Palo Alto Networks (completed July 2025)
Prompt Security	GenAI protection platform	Acquired by SentinelOne (Sept 2025)
Lakera	Runtime protection, red-teaming	Acquired by Check Point (Oct 2025, ~$300M)
Promptfoo	Red-teaming, evals	Acquisition by OpenAI announced March 2026
HiddenLayer	Model scanning, runtime defense, 48+ CVEs disclosed	Independent (~$56M raised, M12-led Series A)
Noma Security	AI security platform	Independent ($100M raised)

Calypso AI (F5), Aim Security (Cato), Pangea (CrowdStrike), and Invariant Labs (Snyk) followed the same path. The buying implication: in 2026 you mostly evaluate AI security as a module inside a platform relationship you already have, not as a new vendor selection. The two questions that still differentiate: does the platform's coverage extend to agentic threats (the ASI list, not just the LLM list), and can it see your MCP/tool layer rather than only your model API traffic.

The control stack production teams actually deploy

Defensive guidance from OWASP, NVIDIA, and Microsoft converged in 2026 on the same defense-in-depth principle: constrain what an agent can do, regardless of what it can be instructed to do. Application-level guardrails act only after execution authority is handed over; infrastructure controls bound the blast radius before it.

Sandboxing / runtime isolation. The primary infrastructure control. Firecracker or Kata Containers for regulated and adversarial-code workloads; gVisor for compute-heavy Kubernetes deployments; V8 isolates for lightweight JavaScript-only tasks.
Egress allowlists and tool policy. Network egress allowlists, write protection on configuration files, and per-task secrets provisioning — no standing broad credentials in an agent's reach.
Layered guardrails. Deterministic pre-LLM checks, policy-enforced tool execution, post-LLM validation with self-correction, continuous observability — in that order, because each layer catches what the previous one missed.
Identity and least privilege. Scoped agent identities with least-privilege tool access, addressing OWASP ASI03 directly. An agent with your service account's permissions is your service account.
Human approval gates. Synchronous approval reserved for irreversible, high-impact actions; asynchronous monitoring for the rest — the HITL operating model applied to security rather than quality.

The deployment gap is the story, not the controls. Vendor research in 2026 (Atlan, Kiteworks — vendor figures, so weigh accordingly) suggests more than half of production agents run with no security oversight or logging at all. The controls are known, documented, and mostly cheap; what's scarce is the decision to require them before the first incident instead of after.

Standing up an AI security program: where to start

Inventory the agents. Every agentic workflow, its tools, its identities, its MCP servers. The unlogged agents surface here — they are your first work item.
Instrument before you filter. Audit trails and runtime observability precede guardrail tuning; you cannot tune what you cannot see, and the EU AI Act's logging obligations are due regardless.
Adopt the ASI list as your review checklist. Ten threats, each with a concrete design question for every new agent workflow.
Red-team the indirect path. Your testing should deliver payloads through content the agent reads (documents, tickets, tool descriptions) — not just prompts a tester types.
Buy through your platforms, verify agentic coverage. Ask your existing security vendors what they acquired and whether it covers ASI threats and the MCP layer, before adding anyone new.

Which framework should govern our agentic AI deployments?

Use the OWASP Top 10 for Agentic Applications (released December 9, 2025, peer-reviewed by NIST, Microsoft's AI Red Team, and AWS) for agent deployments, and the OWASP Top 10 for LLM Applications 2025 for non-agentic LLM features — they are complementary lists, not replacements. Map both onto NIST's AI RMF plus its Generative AI Profile (AI 600-1) for governance structure, and use MITRE ATLAS v5.1.0 for adversary tactics and red-team planning.

Is prompt injection solvable, or only mitigated?

Mitigated, not solved — and 2026 made that consensus official. Google's security team reported a 32% rise in malicious indirect-injection content between November 2025 and February 2026, and the MCPTox benchmark showed 60-72% attack success rates against poisoned MCP tool descriptions. The durable defense is architectural: constrain what the agent can do (sandboxing, egress allowlists, least-privilege tools) rather than trying to filter what it can be told.

We adopted MCP for our internal agents. What's our exposure?

Material and growing. A systemic MCP architectural flaw disclosed in April 2026 is estimated by the Cloud Security Alliance to expose roughly 200,000 instances across a supply chain with 150M+ package downloads, alongside discrete CVEs such as the CVSS-9.8 remote code execution in MCPJam Inspector. Treat every MCP server as untrusted input: pin and review tool descriptions, isolate servers, monitor egress, and audit the supply chain like any other dependency tree — OWASP now codifies this as ASI04, Agentic Supply Chain.

Do we still need a standalone AI-security vendor after all the acquisitions?

Mostly no — the category consolidated into platforms. Protect AI went to Palo Alto Networks (July 2025), Lakera to Check Point (October 2025, ~$300M), Prompt Security to SentinelOne (September 2025), Robust Intelligence to Cisco (2024, now Cisco AI Defense), and OpenAI announced the Promptfoo acquisition in March 2026. Most CISOs now buy AI security as a module inside an existing platform relationship. HiddenLayer and Noma Security are the notable remaining independents worth evaluating.

What does the EU AI Act require on security, and by when?

If you deploy high-risk AI systems, August 2, 2026 is the hard deadline: conformity assessment, registration, risk management, data governance, logging, and human oversight — with Commission enforcement powers beginning. General-purpose AI providers have carried documentation, transparency, and systemic-risk obligations since August 2, 2025; GPAI models placed on market before that date have until August 2, 2027. The logging and human-oversight requirements overlap almost exactly with the audit and escalation pillars of a working governance framework.

What's the single highest-leverage control we're probably missing?

Logging and runtime oversight of agents. Vendor research in 2026 (Atlan, Kiteworks) suggests more than half of production agents run without security oversight or logging, while a meaningful share of AI breaches now involve an agentic system. Instrument first — audit trails and observability — then layer sandboxing, egress allowlists, and human approval gates on high-impact actions. An agent you cannot reconstruct is an agent you cannot defend.

Explore More

Ready to Find the Right AI Tools?

Browse our data-driven rankings to find the best AI tools for your team.

View AI Rankings Get in Touch