AI Jailbreak Pattern Detector
Detect DAN, developer mode, roleplay exploits, and encoding tricks in AI prompts.
Related Tools
- Detect prompt injection attacks in text with pattern matching and a 0-10 risk score.
- Remove invisible Unicode, escape injection keywords, and strip dangerous content from LLM input.
- Detect personal information (email, phone, SSN, credit card, IP, date of birth) in text before sending to LLMs.
- Build content filter rules for LLM output: blocked words, regex patterns, PII detection, and format constraints.
Learn More
FAQ
- What is DAN mode and why is it dangerous?
- DAN (Do Anything Now) is a jailbreak technique that attempts to make a model bypass its safety guidelines by pretending to enable a special mode. When successful, models may generate harmful content they would otherwise refuse.
- What encoding tricks does this tool detect?
- The tool detects base64-encoded strings (long alphanumeric sequences, often ending with `=` padding, that could hide instructions) and mentions of rot13, a letter-substitution cipher attackers use to obfuscate content and bypass keyword filters.
- How is this different from the Prompt Injection Checker?
- The Injection Checker focuses on structural attacks like "ignore previous instructions" and delimiter breaks. The Jailbreak Detector focuses on persona-based attacks (DAN, developer mode) and obfuscation techniques specific to creative jailbreaking.
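The encoding checks described in the FAQ could be sketched roughly as follows. This is an illustrative assumption, not the tool's actual implementation: the 20-character threshold, the regex, and the requirement that a candidate decode to readable ASCII are all arbitrary choices for the sketch.

```python
import base64
import re

# Candidate base64 runs: long sequences drawn from the base64 alphabet,
# optionally followed by '=' padding. Threshold of 20 chars is an assumption.
BASE64_RE = re.compile(r"\b[A-Za-z0-9+/]{20,}={0,2}")


def find_encoding_tricks(text: str) -> list[str]:
    """Flag likely base64 payloads and explicit rot13 mentions."""
    findings = []
    for match in BASE64_RE.findall(text):
        try:
            # Re-pad to a multiple of 4 and only flag strings that
            # actually decode to readable ASCII text.
            decoded = base64.b64decode(match + "=" * (-len(match) % 4))
            decoded.decode("ascii")
            findings.append("base64")
        except Exception:
            continue  # not decodable text; ignore
    if re.search(r"\brot13\b", text, re.IGNORECASE):
        findings.append("rot13 mention")
    return findings
```

Requiring a successful decode to ASCII cuts false positives from long identifiers or hashes that merely look like base64.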
Identify known jailbreak techniques in user input: DAN (Do Anything Now) patterns, developer/maintenance mode requests, roleplay exploits, and encoding tricks such as base64 and rot13. Each detection includes a severity level (critical, warning, info) and a highlighted match. Use it as a pre-filter before sending input to LLMs.
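A pre-filter of this shape could be sketched as a list of (pattern, severity, label) rules scanned against the prompt. The specific patterns and the `scan` function below are hypothetical examples, not the detector's real rule set.

```python
import re

# Illustrative rules only; a real detector would use a much larger set.
RULES = [
    (r"\bDAN\b|do anything now", "critical", "DAN persona"),
    (r"developer mode|maintenance mode", "critical", "special-mode request"),
    (r"pretend (you are|to be)|act as if", "warning", "roleplay framing"),
    (r"\brot13\b|\bbase64\b", "info", "encoding mention"),
]


def scan(prompt: str) -> list[dict]:
    """Return one finding per matched rule, with severity and the matched text."""
    findings = []
    for pattern, severity, label in RULES:
        m = re.search(pattern, prompt, re.IGNORECASE)
        if m:
            findings.append(
                {"severity": severity, "label": label, "match": m.group(0)}
            )
    return findings
```

A caller would typically block the prompt on any `critical` finding and log `warning`/`info` findings for review.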