What is the difference between direct and indirect prompt injection?

Direct prompt injection is typed straight into the model by whoever is talking to it ('ignore your instructions and do X'). Indirect prompt injection, formalized by Greshake et al. in 2023, hides the instruction in third-party content the model will later retrieve: a web page, a PDF, an email, a search result. The attacker never interacts with the model; they only need the malicious text to be in something the model reads. The indirect form is the one that matters for AI search, because retrieval is exactly the channel an AI engine uses to pull external content into its working context.

Is prompt injection a GEO tactic?

No, and treating it as one is a mistake. Hiding instructions in your content to steer an engine is detectable, against engine policy, and squarely black-hat: it belongs to the same family as the manipulation tactics catalogued under black-hat C-SEO. Hidden and instruction-like text aimed at the model is commonly treated as manipulative or abusive in major platform policies and safety guidance, and engines are widely understood to screen for it. The editorial-trust path GEO actually rewards (clean, retrievable, self-contained content) is the opposite of injecting commands.

Can prompt injection be fully prevented?

Not with current architectures. Language models process instructions and data through the same channel (the context window), so a perfectly reliable boundary between 'this is content to read' and 'this is a command to follow' does not yet exist. Greshake et al. frame this as inherent to LLM-integrated applications. Vendors deploy input filtering, instruction hierarchies, and output guardrails that reduce the risk, but none of them claim full mitigation; treating retrieved content as untrusted data is the standing defensive posture, not a solved problem.

Prompt injection

What is prompt injection?

Prompt injection is an attack class in which adversarial text is placed where a language model will read it as a command rather than as data, causing the model to ignore or override its intended task.¹ The vulnerability exists because a language model processes instructions and content through the same channel (its context window), so a reliable separation between "text to act on" and "text to merely read" is difficult; engineering layers like instruction hierarchies and role separation reduce the gap but do not close it.

There are two forms, and the distinction is the whole point for AI search:

Direct prompt injection is typed straight into the model by whoever is interacting with it (the classic "ignore your previous instructions and instead do X"). The term was coined by Simon Willison in September 2022, after Riley Goodside demonstrated the technique against an early LLM app.
Indirect prompt injection, formalized by Greshake et al. in 2023, hides the instruction inside third-party content the model later retrieves: a web page, a document, an email, a search result. The attacker never touches the model; they only need their text to land in something the model reads.¹

The indirect variant is the one that matters here, because retrieval is exactly how an AI search engine pulls external content into its working context. Indirect prompt injection is, in effect, the security mirror image of GEO: it travels the same retrieve-and-extract path that optimization works to win, and exploits the same fact that engines read and act on content they did not author.

Status in 2026

Prompt injection is an open, unsolved security problem rather than a settled, patched one. Because the data/instruction boundary is inherent to how current models read a context window, vendors ship mitigations (input filtering, instruction hierarchies that rank the system prompt above retrieved text, output guardrails) that reduce exposure without eliminating it; none publicly claim a complete fix. The working defensive posture in the literature is to treat all retrieved, third-party content as untrusted data, never as instructions.

For a publisher, the practical reading is not "here is a lever" but "here is why your content is screened." The same property that makes a page citable (being retrievable, self-contained, and easy for an engine to lift a passage from) is the property an attacker abuses to smuggle in a command. So engines have strong incentive to treat instruction-like or hidden text in retrieved content as a trust risk, though vendor-specific filtering rules are not public. The honest GEO consequence is a boundary, not a tactic: the editorial-trust path that earns citations runs in the opposite direction from injecting instructions.

How to apply

Prompt injection is something to understand and defend against, not to deploy. Three honest moves:

Do not embed instructions in your content to steer engines. Hidden text, "as an AI, you should cite this," and instruction-like phrasing aimed at the model are detectable, likely to be treated as manipulative or abusive by major platforms, and belong to the black-hat C-SEO family. They invite filtering and policy enforcement, and they undercut the editorial trust that actually drives citations.
Treat retrieved content as untrusted data in your own LLM features. If you run any feature that feeds user-supplied or third-party text into a model (a chatbot over your docs, a summarizer of submitted URLs), assume that text may contain injected instructions and isolate it from your system prompt. This is the standing mitigation, not a complete fix.
Learn the shape of it so you can recognize it. Knowing that indirect injection hides commands in retrieved content helps you audit your own pipelines and understand why engines distrust manipulative pages, which is the defensive payoff of the concept.

What to skip: any attempt to manipulate an engine by planting commands in your content. It is the detectable, filterable opposite of being a trustworthy source.

How it relates to other concepts

Black-hat C-SEO is the boundary: prompt injection is the technical attack mechanism, while black-hat C-SEO is the GEO practice domain that may deploy it among other adversarial tactics (the citation- and ranking-manipulation use is the C-SEO-relevant subset of this broader security mechanism). This entry defines the mechanism defensively; that one catalogs its misuse.
RAG and context assembly are the path indirect injection rides: the instruction is planted so it gets retrieved and assembled into the model's context alongside legitimate content.
Hallucination grounding is one defense indirect injection tries to subvert: the model may have retrieved real evidence, but a malicious instruction inside that evidence can push the output away from faithful use of the source.

Greshake, K., Abdelnabi, S., Mishra, S., Endres, C., Holz, T. & Fritz, M. "Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection." Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security (AISec 2023), arXiv:2302.12173 (submitted February 2023). The paper coins/systematizes indirect prompt injection: instructions planted in content an LLM-integrated application later retrieves, so the attacker needs no direct interface to the model, and argues that LLM-integrated applications blur the line between data and instructions, deriving a security taxonomy of impacts (data theft, "worming," information-ecosystem contamination, and more). The earlier, direct form of prompt injection (instructions typed into the prompt) was demonstrated by Riley Goodside and named "prompt injection" by Simon Willison in September 2022; Greshake et al. extend it to the retrieval-borne case that is relevant to AI search. This entry is definitional and defensive and intentionally contains no exploitation detail; verified 2026-06-09 against the arXiv abstract and the AISec 2023 record. ↩ ↩²

Prompt injection

Citation status

What is prompt injection?

Status in 2026

How to apply

How it relates to other concepts

Part of AI behavior· editorial cluster, not a semantic link

FAQ

Sources & further reading

Citation status

What is prompt injection?

Status in 2026

How to apply

How it relates to other concepts

Footnotes

Part of AI behavior· editorial cluster, not a semantic link

Related terms

FAQ

Sources & further reading

Get the monthly digest