AI Input Sanitizer

Remove invisible Unicode, escape injection keywords, and strip dangerous content from LLM input.

Also escape injection keywords with brackets

Related Tools

Learn More

FAQ

What are invisible Unicode characters and why are they dangerous?
Characters like zero-width space (U+200B), zero-width non-joiner (U+200C), and byte order mark (U+FEFF) are invisible but can affect tokenization. Attackers embed invisible instructions that bypass display-level filters but get processed by the model.
What strictness level should I use?
Use Low for general sanitization of user content. Use Medium when you need to prevent prompt injection but preserve natural language. Use High only for structured inputs like codes or IDs where you can afford to strip most characters.
Does escaping injection keywords fully prevent injection attacks?
No. Escaping is a heuristic that disrupts common patterns but sophisticated attackers can work around it. Use this alongside semantic classification and output monitoring for defense-in-depth.

Sanitize user input before sending to LLM APIs with three strictness levels. Low: removes invisible Unicode characters (zero-width spaces, BOM). Medium: also escapes prompt injection keywords by wrapping them in brackets. High: strips everything except alphanumeric characters, spaces, and basic punctuation. Shows a diff of what changed.