Essential Regex Patterns Every Developer Should Know
Regular expressions show up everywhere in software development: input validation, log parsing, data transformation, search and replace, routing. Knowing the core syntax cold and keeping a library of reliable patterns for common tasks saves repeated lookups and prevents the subtle bugs that come from writing a regex under pressure.
Core syntax reference
Character classes
| Pattern | Matches |
|---|---|
| . | Any character except newline |
| \d | Digit (0–9) |
| \D | Non-digit |
| \w | Word character: [a-zA-Z0-9_] |
| \W | Non-word character |
| \s | Whitespace: space, tab, newline, carriage return, form feed, vertical tab, and Unicode spaces |
| \S | Non-whitespace |
| [abc] | Any of: a, b, or c |
| [^abc] | Anything except a, b, or c |
| [a-z] | Any lowercase letter |
| [a-zA-Z0-9] | Any alphanumeric character |
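As a quick illustration of the classes in the table above, here is a minimal sketch. The sample string and variable names are made up for the example.

```javascript
// Hypothetical sample text, chosen only to exercise the classes above.
const sample = "Order #42 shipped to J_Doe at 10:30";

// \d+ with the g flag collects each run of digits.
const numbers = sample.match(/\d+/g);
// → ["42", "10", "30"]

// \w+ collects runs of letters, digits, and underscores.
const words = sample.match(/\w+/g);
// → ["Order", "42", "shipped", "to", "J_Doe", "at", "10", "30"]

// A negated class removes everything that is not a letter or digit.
const compact = sample.replace(/[^a-zA-Z0-9]/g, "");
// → "Order42shippedtoJDoeat1030"

console.log(numbers, words, compact);
```

Note that the underscore counts as a word character for \w but is removed by the negated alphanumeric class.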
Quantifiers
| Pattern | Meaning |
|---|---|
| * | Zero or more (greedy) |
| + | One or more (greedy) |
| ? | Zero or one |
| {n} | Exactly n times |
| {n,} | n or more times |
| {n,m} | Between n and m times |
| *? | Zero or more (lazy — matches as few as possible) |
| +? | One or more (lazy) |
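The quantifiers above can be sketched with a few anchored tests. The inputs are illustrative, and the simplified patterns are not meant as production validators.

```javascript
// {n} requires an exact count; {n,m} accepts a range.
const fiveDigits = /^\d{5}$/.test("90210");         // true
const fourDigits = /^\d{5}$/.test("9021");          // false
const hexRange = /^#[0-9a-f]{3,6}$/.test("#1a2b");  // true: 4 is within 3..6

// ? makes the preceding token optional.
const bothSpellings = ["color", "colour"].every((w) => /^colou?r$/.test(w)); // true

// * allows zero occurrences, + requires at least one.
const starMatches = /^ab*$/.test("a"); // true
const plusMatches = /^ab+$/.test("a"); // false

console.log(fiveDigits, fourDigits, hexRange, bothSpellings, starMatches, plusMatches);
```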
Anchors and boundaries
| Pattern | Meaning |
|---|---|
| ^ | Start of string (or start of line with m flag) |
| $ | End of string (or end of line with m flag) |
| \b | Word boundary (between \w and \W) |
| \B | Not a word boundary |
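A small sketch of how the anchors and boundaries above behave on a multi-line string. The three-line input is invented for the example.

```javascript
// Three lines: "cat", "concat", "cat nap".
const text = "cat\nconcat\ncat nap";

// Without the m flag, ^ matches only at the very start of the string.
const startOfString = text.match(/^cat/g);
// → ["cat"]

// With the m flag, ^ also matches right after each line break.
const startOfLines = text.match(/^cat/gm);
// → ["cat", "cat"]

// \b keeps "cat" from matching inside "concat".
const wholeWords = text.match(/\bcat\b/g);
// → ["cat", "cat"]

// \B requires that there is no boundary, so only the "cat" inside "concat" matches.
const insideWord = text.match(/\Bcat/g);
// → ["cat"]

console.log(startOfString, startOfLines, wholeWords, insideWord);
```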
Groups and alternation
| Pattern | Meaning |
|---|---|
| (abc) | Capturing group |
| (?:abc) | Non-capturing group (faster, no capture overhead) |
| (?<name>abc) | Named capturing group |
| a\|b | Alternation: a or b |
| (?=abc) | Positive lookahead: followed by abc |
| (?!abc) | Negative lookahead: not followed by abc |
| (?<=abc) | Positive lookbehind: preceded by abc |
| (?<!abc) | Negative lookbehind: not preceded by abc |
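The group and lookaround constructs above could be used as in the following sketch. The inputs and variable names are illustrative assumptions, and lookbehind requires a reasonably modern JavaScript engine.

```javascript
// Named groups: read parts of a match by name instead of by index.
const iso = "2024-03-15".match(/^(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})$/);
const { year, month, day } = iso.groups;
// year → "2024", month → "03", day → "15"

// Non-capturing group: group the alternation without storing a capture.
const isImage = /\.(?:png|jpe?g|gif)$/i.test("photo.JPG"); // true

// Positive lookahead: digits that are followed by "px", without consuming "px".
const pixelValues = "12px 3em 40px".match(/\d+(?=px)/g);
// → ["12", "40"]

// Positive lookbehind: digits that are preceded by a dollar sign.
const dollarAmounts = "$5 and €7".match(/(?<=\$)\d+/g);
// → ["5"]

// Negative lookahead: reject names that start with "admin".
const allowsAdmin = /^(?!admin)\w+$/.test("admin_01"); // false
const allowsUser = /^(?!admin)\w+$/.test("user_01");   // true

console.log(year, month, day, isImage, pixelValues, dollarAmounts, allowsAdmin, allowsUser);
```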
Flags
| Flag | Effect |
|---|---|
| g | Global — find all matches, not just the first |
| i | Case-insensitive |
| m | Multiline — ^ and $ match start/end of each line |
| s | Dotall — . matches newline characters too |
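To make the effect of each flag concrete, here is a minimal sketch that runs near-identical patterns with different flags. The log text is made up for the example.

```javascript
// Hypothetical three-line log excerpt.
const log = "Error: disk full\nerror: retry\nWARN: ok";

// i alone: case-insensitive, but only the first match is returned.
const firstMatch = log.match(/error/i)[0];
// → "Error"

// g + i: every case-insensitive match.
const allMatches = log.match(/error/gi);
// → ["Error", "error"]

// m: ^ matches at the start of each line, so we get one word per line.
const lineLabels = log.match(/^\w+/gm);
// → ["Error", "error", "WARN"]

// s: without it, . cannot cross the line break between "full" and "error".
const withoutDotAll = /full.error/.test(log); // false
const withDotAll = /full.error/s.test(log);   // true

console.log(firstMatch, allMatches, lineLabels, withoutDotAll, withDotAll);
```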
Greedy vs lazy matching
By default, quantifiers are greedy — they match as much as possible. This is a frequent source of bugs when extracting content from HTML or structured text.
const html = "<b>bold</b> and <b>more bold</b>";
// Greedy — matches everything from first <b> to last </b>
html.match(/<b>.+<\/b>/)[0]
// → "<b>bold</b> and <b>more bold</b>"
// Lazy — matches each <b>...</b> individually
html.match(/<b>.+?<\/b>/g)
// → ["<b>bold</b>", "<b>more bold</b>"]
Adding ? after a quantifier makes it lazy: it matches as few characters as possible while still satisfying the overall pattern.
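A lazy quantifier is not the only way to stop at the first closing tag. As a hedged alternative sketch, a negated character class gives the same result for this input, and matchAll with a capturing group pulls out just the inner text. This assumes the tag content contains no nested tags; it is not a general HTML parser.

```javascript
const html = "<b>bold</b> and <b>more bold</b>";

// [^<]+ cannot run past the next "<", so greedy matching is safe here.
const viaNegatedClass = html.match(/<b>[^<]+<\/b>/g);
// → ["<b>bold</b>", "<b>more bold</b>"]

// matchAll yields one match object per hit; index 1 is the first capture group.
const innerText = Array.from(html.matchAll(/<b>(.+?)<\/b>/g), (m) => m[1]);
// → ["bold", "more bold"]

console.log(viaNegatedClass, innerText);
```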
Practical patterns
Common mistakes
Not anchoring validation patterns
Without ^ and $, a pattern matches anywhere in the string. The pattern /\d+/ matches "abc123def" because there are digits somewhere in it.
// Wrong — matches any string containing digits
/\d+/.test("abc123") // true — matches "123" inside
// Correct — entire string must be digits
/^\d+$/.test("abc123") // false
/^\d+$/.test("123") // true
Catastrophic backtracking
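Certain regex patterns can cause exponential worst-case matching time in backtracking engines. Nested quantifiers over overlapping character classes are the most common culprit: /^(a+)+$/ applied to a long string of a's followed by a non-matching character can take seconds or minutes, because the engine tries every way of splitting the a's between the inner and outer repetition before it gives up.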
ReDoS vulnerability: When a vulnerable pattern runs against user-provided strings, or when users can supply the pattern itself, catastrophic backtracking becomes a denial-of-service vector. Audit your own patterns for nested quantifiers, validate regex patterns from untrusted sources, or use a safe regex library with backtracking limits.
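The following sketch compares a nested-quantifier pattern with a flat pattern that accepts the same strings. The timeMatch helper is a hypothetical name introduced for this example, and it assumes a runtime with a global performance object (recent Node.js or a browser). The input is kept deliberately short, since in a backtracking engine each extra "a" roughly doubles the work for the nested pattern.

```javascript
// Nested quantifiers, anchored so that a failed match forces full backtracking.
const nested = /^(a+)+$/;
// Accepts exactly the same strings, but has only one way to match them.
const flat = /^a+$/;

// Hypothetical helper: run a test and report the result with elapsed milliseconds.
function timeMatch(regex, input) {
  const start = performance.now();
  const matched = regex.test(input);
  return { matched, ms: performance.now() - start };
}

// 20 a's plus a character that makes the overall match fail.
const input = "a".repeat(20) + "!";

const nestedResult = timeMatch(nested, input);
const flatResult = timeMatch(flat, input);

// Both report no match; the timings depend on the engine and machine.
console.log("nested:", nestedResult);
console.log("flat:  ", flatResult);
```

Rewriting the pattern so that the input can be matched in only one way is usually the most direct fix; limiting input length before matching is a complementary safeguard.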
Special characters not escaped
These characters have special meaning in regex and must be escaped with \ if you want to match them literally: . * + ? ^ $ { } [ ] | ( ) \. Forgetting to escape a dot is extremely common — /example.com/ matches "exampleXcom" because . means any character.
Escaping utility: To build a regex that literally matches a string, escape all special characters: str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'). This is useful when building dynamic patterns from user input.
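The escaping expression above can be wrapped in a small helper and used to build a literal-matching pattern. The function name escapeRegExp and the sample strings are assumptions made for this sketch.

```javascript
// Hypothetical helper wrapping the replace call shown above.
// "\\$&" inserts a backslash followed by the matched special character.
function escapeRegExp(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const needle = "example.com";

// Unescaped: the dot matches any character.
const loose = new RegExp(needle);
// Escaped: the dot matches only a literal dot.
const strict = new RegExp(escapeRegExp(needle));

const looseHit = loose.test("exampleXcom");    // true
const strictMiss = strict.test("exampleXcom"); // false
const strictHit = strict.test("example.com");  // true

// Quantifier characters in the input are neutralized as well.
const escapedSum = escapeRegExp("1+1=2?");
// → "1\\+1=2\\?" as a JavaScript string literal, i.e. the text 1\+1=2\?

console.log(looseHit, strictMiss, strictHit, escapedSum);
```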