
Regex Generation Prompt (Gemini)

Regular expressions are among the most misread and miswritten tools in programming. This prompt generates a regex together with test cases and a breakdown table explaining each component, which reduces the chance of a subtle bug reaching production. The must-not-match examples matter as much as the must-match ones. This variant is formatted for Gemini: it is optimised for Gemini 1.5 Pro and Gemini Ultra and uses Google AI Markdown formatting conventions.

Prompt Template
# Gemini AI Prompt

You are a helpful AI assistant powered by Google Gemini.

## Instructions
You are a regular expression expert.

Write a regular expression that matches the following pattern:

Pattern description: {{description}}
Language/flavour: {{language}}

Examples that MUST match:
{{must_match}}

Examples that MUST NOT match:
{{must_not_match}}

Edge cases to handle: {{edge_cases}}

Provide:
1. The regular expression
2. A named capture group version (if applicable)
3. A plain-English explanation of each part of the regex using a breakdown table
4. Any known limitations or patterns this regex cannot handle
5. Test code in {{language}} that verifies all match/non-match examples

## Output Format
Provide a well-structured response using Markdown headers and code blocks where appropriate.

Variables

{{description}}: What the regex should match, e.g., "UK phone numbers in the format +44 XXXX XXXXXX"
{{language}}: Language or regex flavour: JavaScript, Python, Go, PCRE, etc.
{{must_match}}: Examples that should match (one per line)
{{must_not_match}}: Examples that should not match (one per line)
{{edge_cases}}: Specific edge cases to handle, e.g., "optional spaces between groups", "case insensitive", or "None"
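
If the template is filled in programmatically rather than by hand, the substitution can be sketched roughly as follows. The helper name `fillTemplate`, the shortened template text, and the sample phone numbers are illustrative assumptions, not part of the prompt itself.

```javascript
// Hypothetical helper: replaces every {{name}} placeholder with its value.
// Arrays are joined one item per line, matching the "one per line" convention.
function fillTemplate(template, values) {
  return template.replace(/\{\{(\w+)\}\}/g, (placeholder, name) => {
    if (!Object.prototype.hasOwnProperty.call(values, name)) {
      throw new Error(`Missing value for ${placeholder}`);
    }
    const value = values[name];
    return Array.isArray(value) ? value.join("\n") : String(value);
  });
}

// Shortened stand-in for the full prompt template (assumption for brevity).
const template = [
  "Pattern description: {{description}}",
  "Language/flavour: {{language}}",
  "Examples that MUST match:",
  "{{must_match}}",
  "Examples that MUST NOT match:",
  "{{must_not_match}}",
  "Edge cases to handle: {{edge_cases}}",
].join("\n");

// Sample values are made up for illustration.
const prompt = fillTemplate(template, {
  description: "UK phone numbers in the format +44 XXXX XXXXXX",
  language: "JavaScript",
  must_match: ["+44 7911 123456"],
  must_not_match: ["07911 123456", "+44 79111 23456"],
  edge_cases: "None",
});

console.log(prompt);
```

A replacer function is used instead of a replacement string so that `$` characters in the example values are inserted literally, and a missing variable fails loudly rather than leaving a stray placeholder in the prompt.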

Example

Input
description: Email addresses (simplified RFC-compliant)
language: JavaScript
must_match:
[email protected]
[email protected]
[email protected]
must_not_match:
@example.com
user@
user @example.com
edge_cases: Handle + and . in local part, subdomains
Output
Regex: /^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/

Breakdown:
^ — start of string
[a-zA-Z0-9._%+\-]+ — one or more valid local part characters
@ — literal @ sign
[a-zA-Z0-9.\-]+ — domain name (letters, digits, dots, hyphens)
\.[a-zA-Z]{2,} — dot followed by TLD (2+ letters)
$ — end of string

Limitation: Does not validate IDN (internationalised) email addresses.
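
The prompt also asks for a named capture group version and for test code, which a fuller response might include along these lines. This is a sketch only: the must-match addresses below are hypothetical stand-ins chosen to exercise `+`, `.`, and subdomains, and the names `findFailures`, `local`, and `domain` are assumptions.

```javascript
const emailRegex = /^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/;

// Same pattern with named capture groups (group names are illustrative).
const namedEmailRegex =
  /^(?<local>[a-zA-Z0-9._%+\-]+)@(?<domain>[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,})$/;

// Hypothetical must-match inputs (assumed, not taken from the example input).
const mustMatch = [
  "user@example.com",
  "first.last+tag@example.com",
  "user@mail.example.co.uk",
];

// Must-not-match inputs from the example input.
const mustNotMatch = ["@example.com", "user@", "user @example.com"];

// Returns a description of every example the regex gets wrong.
function findFailures(regex, shouldMatch, shouldNotMatch) {
  const failures = [];
  for (const input of shouldMatch) {
    if (!regex.test(input)) failures.push(`should match: ${JSON.stringify(input)}`);
  }
  for (const input of shouldNotMatch) {
    if (regex.test(input)) failures.push(`should NOT match: ${JSON.stringify(input)}`);
  }
  return failures;
}

const failures = findFailures(emailRegex, mustMatch, mustNotMatch);
console.log(failures.length === 0 ? "all examples pass" : failures.join("\n"));

const { local, domain } = namedEmailRegex.exec("first.last+tag@example.com").groups;
console.log(local, domain);
```

Running checks like these also surfaces limitations beyond IDN support: because the domain character class includes the dot, this regex accepts consecutive dots, so `user@example..com` matches.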


FAQ

Should I use regex for email validation?
Use regex only for basic format checking as a UX hint. For definitive validation, send a verification email, since only a successful delivery proves that the address is real and the mailbox exists.
Why does my regex work in testing but fail in production?
The most common cause is differences between regex flavours (PCRE vs. RE2 vs. JavaScript). Always specify the exact language/runtime and test with your actual runtime, not an online tool that may use a different engine.
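
As one small illustration of a flavour difference, the Python-style named group syntax `(?P<name>...)` is rejected by JavaScript, which expects `(?<name>...)`. The sketch below assumes a hypothetical helper, `tryCompile`, that reports whether the current engine accepts a pattern.

```javascript
// Hypothetical helper: returns the compiled RegExp, or null if this
// engine rejects the pattern source as invalid syntax.
function tryCompile(source, flags = "") {
  try {
    return new RegExp(source, flags);
  } catch (err) {
    if (err instanceof SyntaxError) return null;
    throw err;
  }
}

const pythonStyle = "(?P<year>\\d{4})-(?P<month>\\d{2})"; // Python-style named groups
const jsStyle = "(?<year>\\d{4})-(?<month>\\d{2})";       // JavaScript named groups

console.log(tryCompile(pythonStyle) === null); // true: not valid JavaScript regex syntax

const match = tryCompile(jsStyle).exec("2024-05");
console.log(match.groups.year, match.groups.month); // 2024 05
```

Compiling the generated pattern in the target runtime as part of a test suite catches this class of problem before deployment.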
Can I use this to generate regex for log parsing?
Yes. Paste 3-5 sample log lines as must_match examples and describe what groups to capture (timestamp, severity, message). The AI will generate a regex with named capture groups for each field.
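
For log parsing, the result might look something like the following. The log format, the severity levels, and the function name `parseLogLine` are assumptions made for illustration; a real regex would be derived from your own sample lines.

```javascript
// Assumed line format: "<ISO-8601 UTC timestamp> <SEVERITY> <message>"
const logLineRegex =
  /^(?<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)\s+(?<severity>DEBUG|INFO|WARN|ERROR)\s+(?<message>.+)$/;

// Returns the captured fields, or null when the line does not fit the format.
function parseLogLine(line) {
  const match = logLineRegex.exec(line);
  if (match === null) return null;
  const { timestamp, severity, message } = match.groups;
  return { timestamp, severity, message };
}

const parsed = parseLogLine("2024-05-01T12:34:56Z ERROR Disk quota exceeded");
console.log(parsed.timestamp); // 2024-05-01T12:34:56Z
console.log(parsed.severity);  // ERROR
console.log(parsed.message);   // Disk quota exceeded

console.log(parseLogLine("not a log line")); // null
```

Returning `null` for lines that do not fit keeps malformed input visible to the caller instead of producing partially filled records.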
