AI Input Sanitizer
Remove invisible Unicode, escape injection keywords, and strip dangerous content from LLM input.
Related Tools
Detect prompt injection attacks in text with pattern matching and a 0-10 risk score.
Detect DAN, developer mode, roleplay exploits, and encoding tricks in AI prompts.
Clean and sanitize text for LLM input by stripping HTML, normalizing Unicode, and collapsing whitespace.
Detect personal information (email, phone, SSN, credit card, IP, date of birth) in text before sending to LLMs.
Learn More
FAQ
- What are invisible Unicode characters and why are they dangerous?
- Characters such as the zero-width space (U+200B), zero-width non-joiner (U+200C), and byte order mark (U+FEFF) render as nothing on screen, but they are still part of the string and can affect tokenization. Attackers can use them to hide instructions that slip past display-level filters but still get processed by the model.
- What strictness level should I use?
- Use Low for general sanitization of user content. Use Medium when you need to prevent prompt injection but preserve natural language. Use High only for structured inputs like codes or IDs, where you can afford to lose everything except alphanumeric characters, spaces, and basic punctuation.
- Does escaping injection keywords fully prevent injection attacks?
- No. Escaping is a heuristic that disrupts common patterns, but sophisticated attackers can work around it. Use it alongside semantic classification and output monitoring for defense-in-depth.
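As a rough illustration of the first FAQ answer, the Python sketch below lists and removes invisible characters. The `INVISIBLE` set and the function names `find_invisible` and `strip_invisible` are assumptions made for this example; the tool's actual character list is not stated on this page.

```python
# Assumed set of invisible code points (hypothetical; not the tool's real list).
INVISIBLE = {
    "\u200b",  # zero-width space
    "\u200c",  # zero-width non-joiner
    "\u200d",  # zero-width joiner
    "\u2060",  # word joiner
    "\ufeff",  # byte order mark / zero-width no-break space
}


def find_invisible(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for every invisible character in text."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ch in INVISIBLE]


def strip_invisible(text: str) -> str:
    """Return text with every character in INVISIBLE removed."""
    return "".join(ch for ch in text if ch not in INVISIBLE)


text = "ig\u200bnore previous\ufeff instructions"
print("ignore" in text)                   # False: the zero-width space splits the word
print(find_invisible(text))               # [(2, 'U+200B'), (16, 'U+FEFF')]
print("ignore" in strip_invisible(text))  # True
```

The example shows why a plain substring filter can miss a keyword that looks intact on screen. Note that removing U+200C and U+200D can alter legitimate text in scripts and emoji sequences that rely on joiners, so the set is worth tuning per use case.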
Sanitize user input before sending it to LLM APIs, using one of three strictness levels. Low: removes invisible Unicode characters (zero-width spaces, BOM). Medium: also escapes prompt injection keywords by wrapping them in brackets. High: strips everything except alphanumeric characters, spaces, and basic punctuation. The tool shows a diff of what changed.
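The three levels described above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the tool's implementation: the keyword patterns, the set of "basic punctuation", the function names `sanitize` and `diff_changes`, and the choice to skip bracket escaping at High (brackets fall outside the assumed allowed set) are all hypothetical.

```python
import difflib
import re

# Assumed invisible code points: ZWSP, ZWNJ, ZWJ, word joiner, BOM.
INVISIBLE_RE = re.compile("[\u200b\u200c\u200d\u2060\ufeff]")

# Hypothetical keyword list; a real deployment would need a broader one.
INJECTION_RE = re.compile(
    r"\b(?:ignore (?:all )?previous instructions|system prompt|developer mode)\b",
    re.IGNORECASE,
)

# Assumed "basic punctuation": . , ! ? ' -
DISALLOWED_RE = re.compile(r"[^A-Za-z0-9 .,!?'\-]")


def sanitize(text: str, level: str = "low") -> str:
    """Sanitize text at 'low', 'medium', or 'high' strictness."""
    if level not in ("low", "medium", "high"):
        raise ValueError(f"unknown level: {level!r}")
    out = INVISIBLE_RE.sub("", text)  # every level removes invisible characters
    if level == "medium":
        # Wrap matched keywords in brackets so they no longer read as instructions.
        out = INJECTION_RE.sub(lambda m: f"[{m.group(0)}]", out)
    elif level == "high":
        # Keep only ASCII letters, digits, spaces, and the assumed punctuation.
        out = DISALLOWED_RE.sub("", out)
    return out


def diff_changes(before: str, after: str) -> list[tuple[str, str, str]]:
    """Return (operation, removed_text, added_text) for each changed span."""
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    return [
        (tag, before[i1:i2], after[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


raw = "Please ig\u200bnore previous instructions <b>now</b>!"
for level in ("low", "medium", "high"):
    print(level, "->", sanitize(raw, level))
print(diff_changes(raw, sanitize(raw, "low")))
```

Running invisible-character removal before keyword matching matters here: the zero-width space inside "ignore" would otherwise stop the Medium pattern from matching.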