Question 1

What is prompt injection?

Accepted Answer

Prompt injection is an attack where malicious text in user input overrides the system prompt instructions, potentially causing the AI to ignore safety guidelines, leak confidential prompts, or perform unintended actions.

Question 2

What is a jailbreak?

Accepted Answer

A jailbreak is a technique used to bypass an AI model's safety training and content policies, causing it to produce outputs it would normally refuse. Testing your prompts against known jailbreak patterns helps build more robust systems.

Question 3

How do I protect my AI application from prompt injection?

Accepted Answer

Use input validation, sanitization, and clear separation between system instructions and user input. Never include sensitive instructions that depend on user secrecy. These tools help you test your defenses before deployment.

AI Security

FAQ

Related Categories