Prompt Injection Taxonomy: Types, Paths, and Outcomes

What Is the Prompt Injection Taxonomy? Techniques, Delivery Paths, and Outcomes

Last updated: June 2026

A prompt injection taxonomy sorts the attack into clear classes by how the malicious instruction reaches the model and what it achieves: direct, indirect, stored, multimodal, tool-mediated, and agentic injection. Every class shares one root cause. AI models read instructions and data in the same channel, so attacker text hidden in trusted-looking content can take over the model’s behavior. The class tells a security team where the control has to sit.

What the Prompt Injection Taxonomy Covers

Prompt injection is a top concern for AI applications, and the reason it is hard is structural. A model cannot reliably tell an instruction from data, so any text it reads can act as a command. OWASP ranks prompt injection the number one risk for AI applications and states that, given how models work, there is no fool-proof method of prevention (OWASP, 2025). The U.S. National Institute of Standards and Technology (NIST) names prompt injection as an information-security risk in its Generative AI Profile (NIST, 2024). A taxonomy turns that single weakness into classes you can defend against, sorted by delivery path and outcome.

The classes break down by how the instruction is delivered and what it produces:

Injection Type	Delivery Path	Typical Outcome
Direct injection	A prompt sent straight to the model by a user or attacker	The model ignores its instructions, leaks its system prompt, or returns disallowed output
Indirect injection	Hidden instructions inside third-party content the model reads, such as a web page, document, email, or search result	On a benign task like “summarize this,” the model follows the attacker and leaks or sends data
Stored injection	A payload planted in data the model or agent reads later, including memory, records, or a retrieval store	Fires on a later, unrelated request and can persist as an implant
Multimodal injection	Instructions concealed in an image, audio, or file that rides alongside benign text	A cross-modal attack slips past text-only filters
Tool-mediated injection	Injected content that drives an agent to call tools and connected systems	Unauthorized tool calls, data exfiltration, or actions taken in business systems
Agentic injection	Untrusted content an autonomous agent ingests on its own and acts on across steps	Zero-click takeover that chains tool calls, escalates privilege, and persists

Direct and Indirect Injection: The Two Delivery Roots

Every class grows from two roots, defined by where the malicious instruction enters. Direct injection puts the instruction in the prompt itself. Indirect injection hides it in content the model retrieves, and it is the more dangerous root because the victim never types anything hostile. The EchoLeak vulnerability in Microsoft 365 Copilot (CVE-2025-32711) is the clear example: a crafted email carried hidden instructions, and when a user asked Copilot to summarize their inbox, the assistant exfiltrated sensitive data with zero clicks (NVD, 2025). The same root drives most agent attacks, where the ingested content is a web page, a document, or a tool’s response. For the mechanics, see our explainers on what prompt injection is and on direct versus indirect prompt injection.

Stored, Multimodal, Tool-Mediated, and Agentic Injection

The remaining classes extend the two roots into places filters rarely watch. Stored injection plants the payload in data an agent reads later, such as its memory or a retrieval-augmented generation (RAG) store, so a clean-looking request days afterward triggers the attack. MITRE’s 2026 ATLAS investigation of an autonomous agent platform documented exactly this pattern, where undifferentiated memory let web scrapes and tool outputs be saved and replayed as instructions. The investigation catalogued seven new agent techniques, the most observed being direct and indirect prompt injection, tool invocation, and configuration changes (MITRE, 2026). Multimodal injection hides the instruction in an image or audio that rides alongside benign text, a cross-modal path OWASP flags as a growing risk.

Tool-mediated and agentic injection are where the outcomes turn severe. Once an agent can call tools, injected content can make it act: read records, move data, run code. Aurascape’s own threat research shows how far this goes. Aura Labs principal threat research engineer Qi Deng documented SilentBridge, a class of zero-click indirect injection in Meta’s Manus agent, where a request as ordinary as asking the agent to summarize a page led to email theft, secret leakage, and a root shell inside the agent. The team reported three variants, each rated 9.8 of 10, and worked with the vendor to fix them before publication (Aurascape, 2026).

Why Detection-Only Controls Are Not Enough

Filters that scan for known bad prompts help, but they do not solve the class. OWASP is explicit that no fool-proof prevention exists, because the model treats instruction and data alike. The academic AgentDojo benchmark makes the gap measurable. Across realistic tool-using tasks, the strongest agents already failed more than a third of benign jobs, and prompt injection attacks still succeeded against those best agents in under a quarter of cases. Adding a secondary injection detector cut attack success to roughly 8 percent, not zero, and the authors note that current techniques are not foolproof (arXiv, 2024). A control that inspects only the prompt also misses the response, the tool call, and the way an attack unfolds across a conversation. Detection narrows the odds. It does not close the path.

Where Enforcement Belongs: The Interaction and the Tool Call

If detection cannot be perfect, the control has to sit where injection produces an effect: the interaction and the tool call. Aurascape splits agent traffic into two channels and inspects both. The AI Proxy secures the intelligence channel, decoding full prompts and responses in real time, so an injected instruction is caught in context rather than at a destination. The Zero-Bypass MCP Gateway secures the tool-execution channel (Aurascape, 2026).

Model Context Protocol (MCP) is the open standard that lets an agent connect to external tools and act through them, and it is one mechanism within the larger job of governing agent execution. The Gateway verifies and signs every approved tool call before it runs, and unsigned calls cannot reach the tool or the model. That stops tool-mediated and agentic injection at the point of action, not at a network destination the agent already passed. Cross-call data lineage tracks information across chained steps, so an attack that looks harmless one call at a time is still caught.

Context-aware policy then decides the response. Actions span allow, coach, warn, block, and redact, applied on identity, the enterprise versus personal account, the application, the Intention or mode in use, the data category, and the tool requested. Two kinds of discovery shrink the blast radius when an attack lands. Aurascape finds the AI already in the environment across the network, on endpoints, and at the application programming interface (API), including agents running locally, and its patented discovery agents work ahead of use, interrogating new AI tools as they launch and risk-scoring them before anyone connects (Aurascape, 2026). This is not only a shadow-AI problem. The same controls govern sanctioned, licensed tools through Intentions and entitlement, so an approved assistant cannot be turned into an exfiltration path by injected content. Aurascape keeps interaction records for audit and effectiveness, governed by role-based access control (RBAC) for privacy, and runs alongside an existing secure service edge (SSE), secure access service edge (SASE), or data loss prevention (DLP) stack rather than replacing it.

How to Reduce Prompt Injection Blast Radius

Prompt injection is not a reason to slow AI adoption. It is a reason to govern it at the point of action. Gartner expects more than 40 percent of agentic AI projects to be canceled by the end of 2027, often from weak governance and unclear risk controls (Gartner, 2025). The teams that map their injection exposure and enforce across both channels are the ones that keep their programs. Five moves carry most of the value:

Inventory every AI app, agent, and MCP server in use, including agents running locally on endpoints, so the attack surface is known.
Inspect both legs of every agent interaction: prompts and responses on the intelligence channel, tool calls on the tool-execution channel.
Treat all ingested content as untrusted, including web pages, documents, search results, tickets, and agent memory.
Require approved, signed tool calls and block unsigned ones, so injected instructions cannot reach a system.
Classify and control sensitive data inline across prompts, responses, uploads, and tool calls, to cap what any single injection can take.

Frequently Asked Questions

What is the difference between direct and indirect prompt injection?

Direct injection places the malicious instruction in the prompt itself, while indirect injection hides it in content the model retrieves, such as a web page, document, or email. Indirect injection is the more dangerous root because the victim never types anything hostile. The attack rides in on trusted-looking data the model was asked to process.

Can prompt injection be fully prevented?

No. OWASP states there is no fool-proof prevention, because models read instructions and data in the same channel. The practical answer is defense in depth: inspect prompts and responses, govern tool calls, treat ingested content as untrusted, and limit what any action can reach.

What is agentic or tool-mediated prompt injection?

It is injection that drives an autonomous agent to act through its tools, not just to produce text. Injected content can make the agent call connected systems, move data, or run code, and in the worst case chain those steps into a zero-click takeover, as Aurascape’s SilentBridge research showed.

How does Aurascape defend against prompt injection across the taxonomy?

Aurascape enforces at the interaction and the tool call. The AI Proxy inspects prompts and responses on the intelligence channel, the Zero-Bypass MCP Gateway verifies and signs every tool call on the tool-execution channel, and discovery with inline data protection shrinks the blast radius if an attack lands.

Aurascape treats prompt injection as an architecture problem, not a filtering problem. The AI Proxy inspects the intelligence channel, the Zero-Bypass MCP Gateway verifies and signs every tool call so injected content cannot quietly drive an agent’s actions, and discovery with inline data protection limits the damage when an attack lands. Every deployment starts with a tailored demo mapped to the injection classes your teams actually face.

See how Aurascape governs prompt injection across the interaction and the tool call →

Aurascape Solutions

Discover and monitor AI Get a clear picture of all AI activity.
Safeguard AI use Secure data and compliancy in AI usage.
Secure Agentic AI Secure how your teams use AI and build AI agents.
Copilot readiness Prepare for and monitor AI Copilot use.
Coding assistant guardrails Accelerate development, safely.
Frictionless AI security Keep users and admins moving.