What is Prompt Injection?
Prompt injection is a type of attack that targets AI models — especially large language models (LLMs) — by inserting hidden or malicious instructions into the text the AI receives. These hidden prompts can trick the AI into ignoring original commands, revealing sensitive information, or behaving in unintended ways.
It’s like sneaking a secret message into a conversation that changes how the AI responds.
Prompt injection can happen:
When users intentionally include misleading or harmful text in a prompt
When attackers embed hidden instructions in user-generated content, which the AI then processes unknowingly
This makes it a serious concern for applications using generative AI, especially in chatbots, virtual assistants, or customer-facing tools.