What is Regex? — Regular Expressions Explained

Definition

A regular expression (regex) is a sequence of characters that defines a search pattern for matching, extracting, and manipulating text. Regex patterns are supported by virtually every programming language and many text editors, command-line tools, and databases. They range from simple literal matches to complex patterns with quantifiers, character classes, groups, and lookaheads that can describe sophisticated text constraints in a compact notation.

How It Works

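A regex engine reads the pattern and the input string and attempts to find a match. Literal characters match themselves, while metacharacters have special meanings:

  • . matches any single character (in most engines, any character except a newline by default)
  • * means "zero or more of the preceding item", + means "one or more", and ? means "zero or one" (optional)
  • ^ anchors to the start of the string and $ to the end; in multiline mode they anchor to the start and end of each line
  • Character classes match one character from a set: [a-z] matches any lowercase ASCII letter, and [^0-9] matches any character that is not a digit
  • A counted quantifier such as {3,5} matches 3 to 5 repetitions of the preceding item
  • Groups (...) treat a sub-pattern as a unit and capture the text it matched
  • | is alternation (OR)
  • Lookaheads (?=...) and lookbehinds (?<=...) assert what follows or precedes the current position without consuming characters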
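The metacharacters described above can be tried out directly. The following is a minimal sketch using Python's built-in re module; the sample strings and variable names are made up for illustration, and other engines may differ in small details.

```python
import re

# "." matches any single character except a newline (Python's default).
dot_match = re.search(r"c.t", "a cat sat").group()             # "cat"

# "*" allows zero repetitions, "+" needs at least one, "?" makes an item optional.
star_ok = re.fullmatch(r"ab*", "a") is not None                # True
plus_ok = re.fullmatch(r"ab+", "a") is not None                # False
optional_ok = re.fullmatch(r"colou?r", "color") is not None    # True

# "^" and "$" anchor the pattern; {3,5} bounds the repetition count.
bounded_ok = re.search(r"^[a-z]{3,5}$", "regex") is not None   # True (5 letters)
too_long = re.search(r"^[a-z]{3,5}$", "pattern") is not None   # False (7 letters)

# [^0-9] matches any single character that is not a digit.
non_digits = re.findall(r"[^0-9]", "a1b2")                     # ["a", "b"]

# "|" is alternation; "(...)" captures the text it matched.
animal = re.search(r"(cat|dog)s", "hot dogs").group(1)         # "dog"

# A lookahead checks what follows without consuming it.
before_px = re.findall(r"\d+(?=px)", "10px 20em 30px")         # ["10", "30"]

# A lookbehind checks what precedes the match position.
after_dollar = re.findall(r"(?<=\$)\d+", "cost: $42, qty: 7")  # ["42"]
```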

Common Use Cases

  • Validating email addresses, phone numbers, URLs, and other formatted input
  • Extracting data from text like dates, IP addresses, or structured records
  • Find-and-replace operations with captured groups in editors and code
  • Parsing log files and extracting structured fields from unstructured text
  • Tokenizing code and text for compilers, linters, and formatters
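
As an illustration of the log-parsing and data-extraction use cases, here is a small Python sketch. The log line layout ("date time LEVEL message"), the LOG_LINE pattern, and the parse_log_line helper are all hypothetical, chosen only for this example. The IPv4 pattern is deliberately loose and does not check that each number is in the 0-255 range.

```python
import re

# Hypothetical log format: "<date> <time> <LEVEL> <message>"
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<message>.*)$"
)

# Loose IPv4 shape: four groups of 1-3 digits separated by dots.
IPV4 = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def parse_log_line(line):
    """Return a dict of named fields, or None if the line does not fit the format."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


record = parse_log_line("2024-03-15 10:42:07 ERROR Connection refused by 10.0.0.5")
ips = IPV4.findall(record["message"])

print(record["level"], ips)
```

Named groups keep the extraction code readable, because each field is looked up by name rather than by position.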

Example

// Email validation (simplified):
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

// Extract dates in YYYY-MM-DD:
(\d{4})-(\d{2})-(\d{2})
Match: "2024-03-15" → groups: ["2024","03","15"]

// Collapse each run of whitespace into a single space:
\s+  →  " "
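
The three patterns above are engine-neutral. As one possible way to run them, here is a short sketch using Python's re module; the sample inputs and the names EMAIL and DATE are assumptions made for this example.

```python
import re

EMAIL = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")
DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

# Simplified email validation: accepts a common shape, not every valid address.
valid = EMAIL.match("dev@example.com") is not None     # True
invalid = EMAIL.match("dev@example") is not None       # False (no dot and TLD after "@")

# Extract the year, month, and day as separate captured groups.
date_groups = DATE.search("Released on 2024-03-15.").groups()   # ("2024", "03", "15")

# Collapse every run of spaces, tabs, and newlines into one space.
collapsed = re.sub(r"\s+", " ", "too   many\t\nspaces")          # "too many spaces"
```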

FAQ

What is the difference between greedy and lazy quantifiers?
Greedy quantifiers (*, +, ?, {n,m}) match as much as possible while still letting the overall pattern succeed. Lazy (non-greedy) quantifiers (*?, +?, ??, {n,m}?) match as little as possible. For example, <.+> on "<b>text</b>" greedily matches the whole string, while <.+?> matches <b> and then </b> as two separate matches.
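The difference is easy to see by running both patterns on the same input. A minimal Python sketch follows; the negated character class at the end is a common alternative to a lazy quantifier, added here for comparison.

```python
import re

html = "<b>text</b>"

# Greedy: .+ grabs as much as it can, so the match spans both tags.
greedy = re.findall(r"<.+>", html)        # ["<b>text</b>"]

# Lazy: .+? stops at the first ">" that lets the pattern succeed.
lazy = re.findall(r"<.+?>", html)         # ["<b>", "</b>"]

# Alternative: forbid ">" inside the tag instead of relying on laziness.
negated = re.findall(r"<[^>]+>", html)    # ["<b>", "</b>"]
```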
What is a capturing group vs a non-capturing group?
A capturing group (...) stores the text it matched so it can be used in backreferences or replacement operations. A non-capturing group (?:...) groups a sub-pattern without storing anything, which is useful for applying a quantifier or alternation to several characters at once. Use non-capturing groups when you do not need the matched text.
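A short Python sketch of the distinction, using made-up sample strings. Note that in Python's re.findall, a pattern containing a capturing group returns the group's text rather than the whole match, which is one practical reason to prefer (?:...) when only grouping is needed.

```python
import re

text = "2024-03-15"

# Three capturing groups: all three parts are stored.
capturing = re.search(r"(\d{4})-(\d{2})-(\d{2})", text).groups()

# The year is grouped but not captured, so only two groups remain.
non_capturing = re.search(r"(?:\d{4})-(\d{2})-(\d{2})", text).groups()

# Captured groups can be reused in a replacement string (\1, \2, \3).
reordered = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", text)

# A backreference (\1) matches the same text the group captured earlier.
repeated = re.search(r"\b(\w+) \1\b", "this is is a test").group(1)

# With a quantified group, findall returns the last capture, not the full match.
with_capture = re.findall(r"(ab)+", "ababab ab")
without_capture = re.findall(r"(?:ab)+", "ababab ab")

print(capturing, non_capturing, reordered, repeated)
print(with_capture, without_capture)
```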
Why does my regex work in Python but not in JavaScript?
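Different languages and engines use slightly different regex dialects. JavaScript only gained lookbehind and named groups in ES2018, so older engines reject them. Named groups are also spelled differently: Python writes (?P<name>...) and (?P=name), while JavaScript writes (?<name>...) and \k<name>. The end anchor differs too: outside multiline mode, Python's $ also matches just before a trailing newline, whereas JavaScript's $ matches only at the very end of the input. Always test a regex in the specific language environment where it will be used.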
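As one illustration of dialect-specific behavior, the sketch below shows a few details of Python's re module that do not carry over unchanged to JavaScript. The sample strings are made up, and the JavaScript equivalents are mentioned only in comments.

```python
import re

# Python spells named groups (?P<name>...) and named backreferences (?P=name).
# JavaScript writes the same idea as (?<name>...) and \k<name>.
doubled = re.search(r"(?P<word>\w+) (?P=word)", "it was was fine")
doubled_word = doubled.group("word")

# In Python, "$" also matches just before a trailing newline ...
end_default = re.search(r"\d+$", "42\n") is not None

# ... while \Z matches only at the very end of the string.
end_strict = re.search(r"\d+\Z", "42\n") is not None

# Multiline mode is a flag (re.MULTILINE); JavaScript uses the "m" flag instead.
line_starts = re.findall(r"^\w+", "one\ntwo", flags=re.MULTILINE)

print(doubled_word, end_default, end_strict, line_starts)
```

When a pattern is used for validation in Python, re.fullmatch or \Z avoids accidentally accepting input with a trailing newline.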
