Question 1

What is a prompt injection attack?

Accepted Answer

Prompt injection is an attack where malicious input attempts to override an AI model's instructions — for example, telling it to "ignore all previous instructions" or "pretend you are a different AI". These attacks can cause models to bypass safety measures or leak system prompts.

Question 2

How is the risk score calculated?

Accepted Answer

Each matched pattern contributes to the score based on severity: critical patterns (instruction overrides, role overrides) add 3 points, high patterns (system prompt extraction) add 2 points, and medium patterns (delimiter attacks) add 1 point. The score is capped at 10.

Question 3

Does this tool catch all injection attacks?

Accepted Answer

No regex-based tool can catch all injection attempts. Sophisticated attackers use encoding, paraphrasing, and novel phrasing. This tool catches common patterns and should be used as one layer in a defense-in-depth strategy alongside semantic classification.

AI Prompt Injection Checker

Related Tools

Learn More

FAQ