13.2 Regular Expressions
Overview
Regular expressions (regex) are patterns used to match character combinations in strings. JavaScript supports regex through the RegExp object and string methods.
Creating Regular Expressions
// Literal notation (preferred)
const regex1 = /pattern/flags;
// Constructor notation
const regex2 = new RegExp('pattern', 'flags');
const regex3 = new RegExp(variable, 'gi'); // For dynamic patterns
| Expression | Result |
|---|---|
| `/hello/` | matches 'hello' |
| `/hello/i` | matches 'hello', 'HELLO', 'Hello' |
| `/hello/g` | matches all occurrences |
| `new RegExp('hello')` | same as `/hello/` |
| `new RegExp('hello', 'i')` | same as `/hello/i` |
| `new RegExp('\\d+')` | same as `/\d+/` (the backslash must be escaped in a string) |
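When a pattern is built from a variable, any regex metacharacters inside that value are read as syntax rather than as literal text. The sketch below assumes a hypothetical helper named `escapeRegExp` (it is not part of the text above) that escapes the value before it reaches the constructor:

```javascript
// Hypothetical helper: escape characters that have special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const userInput = '1+1=2';

// Unescaped: '+' is treated as a quantifier, so the literal text is not found.
const unescaped = new RegExp(userInput);
console.log(unescaped.test('1+1=2')); // false

// Escaped: every character is matched literally.
const escaped = new RegExp(escapeRegExp(userInput), 'g');
console.log(escaped.source); // 1\+1=2
console.log('1+1=2 and 1+1=2'.match(escaped).length); // 2
```

Escaping matters most when the value comes from user input, where an unescaped pattern can fail to match or throw a `SyntaxError`.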
Flags
| Flag | Meaning |
|---|---|
| `g` | Global: find all matches |
| `i` | Case-insensitive |
| `m` | Multi-line: `^` and `$` match at line starts/ends |
| `s` | DotAll: `.` also matches newlines |
| `u` | Unicode: treat the pattern as a sequence of Unicode code points |
| `y` | Sticky: match only at the position given by `lastIndex` |
| `d` | Indices: include start/end indices for the match and its groups |
// Common flag combinations
/hello/gi // Global, case-insensitive
/^start/m // Multi-line, match at line starts
/emoji/u // Unicode for emoji handling
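The less common flags are easier to understand by observing what changes when they are set. A short sketch, assuming a runtime that supports the `d` flag (ES2022); the sample strings are illustrative only:

```javascript
const text = 'first line\nsecond line';

// m: ^ matches at the start of every line, not only the start of the string.
const lineStarts = text.match(/^\w+/gm);
console.log(lineStarts); // ['first', 'second']

// s: the dot also matches the newline between the two lines.
console.log(/line.second/.test(text));  // false
console.log(/line.second/s.test(text)); // true

// y: the match must begin exactly at lastIndex.
const sticky = /line/y;
sticky.lastIndex = 6;
console.log(sticky.test(text)); // true ('line' starts at index 6)
sticky.lastIndex = 0;
console.log(sticky.test(text)); // false (index 0 holds 'first')

// u: the dot consumes a whole code point instead of one UTF-16 code unit.
const emoji = '\u{1F600}';
console.log(emoji.match(/./)[0].length);  // 1
console.log(emoji.match(/./u)[0].length); // 2

// d: the result carries [start, end] offsets for the match and each group.
const withIndices = /(\d+)/d.exec('abc 123');
console.log(withIndices.indices[0]); // [4, 7]
```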
Basic Patterns
Literal Characters
/hello/; // Matches exactly 'hello'
/hello/i; // Matches 'hello', 'HELLO', etc.
Character Classes
| Pattern | Matches |
|---|---|
| `[abc]` | Any of a, b, c |
| `[^abc]` | Any except a, b, c |
| `[a-z]` | Any lowercase letter |
| `[A-Z]` | Any uppercase letter |
| `[0-9]` | Any digit |
| `[a-zA-Z0-9]` | Any alphanumeric |
| `.` | Any character (except newline) |
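A few of these classes applied to a sample string (the string itself is made up for illustration):

```javascript
const input = 'Room 12B, floor 3';

// [0-9] matches one digit; with g, match() returns each digit separately.
console.log(input.match(/[0-9]/g)); // ['1', '2', '3']

// [^a-zA-Z0-9] matches anything that is not alphanumeric; strip those characters.
console.log(input.replace(/[^a-zA-Z0-9]/g, '')); // 'Room12Bfloor3'

// [A-Z] is case-sensitive unless the i flag is set.
console.log(input.match(/[A-Z]/g)); // ['R', 'B']

// The dot does not match a newline, so this match stops at the line break.
console.log('ab\ncd'.match(/.+/)[0]); // 'ab'
```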
Shorthand Character Classes
| Pattern | Matches |
|---|---|
| `\d` | Digit `[0-9]` |
| `\D` | Non-digit `[^0-9]` |
| `\w` | Word char `[a-zA-Z0-9_]` |
| `\W` | Non-word char |
| `\s` | Whitespace (space, tab, newline) |
| `\S` | Non-whitespace |
| `\b` | Word boundary |
| `\B` | Non-word boundary |
/\d{3}/ // Three digits: '123'
/\w+/ // One or more word chars: 'hello123'
/\s+/ // One or more whitespace
/\bhello\b/ // 'hello' as complete word
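The shorthand classes are often combined with `match()`, `split()`, and `replace()`. A small sketch with made-up input:

```javascript
const line = 'id_7  costs 15 dollars';

// \d+ collects runs of digits.
console.log(line.match(/\d+/g)); // ['7', '15']

// \w includes the underscore, so 'id_7' stays in one piece.
console.log(line.match(/\w+/g)); // ['id_7', 'costs', '15', 'dollars']

// \s+ treats any run of whitespace as a single separator.
console.log(line.split(/\s+/)); // ['id_7', 'costs', '15', 'dollars']

// \b requires a word boundary on both sides, so 'cost' inside 'costs' is rejected.
console.log(/\bcost\b/.test(line));  // false
console.log(/\bcosts\b/.test(line)); // true

// \D removes everything that is not a digit.
console.log('(555) 123-4567'.replace(/\D/g, '')); // '5551234567'
```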
Quantifiers
| Pattern | Meaning |
|---|---|
| `?` | Zero or one (optional) |
| `*` | Zero or more |
| `+` | One or more |
| `{n}` | Exactly n times |
| `{n,}` | n or more times |
| `{n,m}` | Between n and m times |
| `*?` `+?` `??` | Non-greedy versions |
| `{n,m}?` | Non-greedy range |
/colou?r/ // 'color' or 'colour'
/go*d/ // 'gd', 'god', 'good', 'goood'...
/go+d/ // 'god', 'good', 'goood'...
/\d{3}-\d{4}/ // '123-4567'
/\d{2,4}/ // 2, 3, or 4 digits
// Greedy vs Non-greedy
'<div>content</div>'.match(/<.*>/)[0]; // '<div>content</div>' (greedy)
'<div>content</div>'.match(/<.*?>/)[0]; // '<div>' (non-greedy)
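The difference between greedy and non-greedy matching becomes clearer when there are several candidates in one string. The sketch below uses an illustrative HTML fragment and also shows a negated character class, which is a common alternative to a non-greedy quantifier:

```javascript
const html = '<b>bold</b> and <i>italic</i>';

// Greedy: .+ runs to the end and backtracks only to the last '>'.
console.log(html.match(/<.+>/)[0]); // '<b>bold</b> and <i>italic</i>'

// Non-greedy: .+? stops at the first '>' it can reach.
console.log(html.match(/<.+?>/g)); // ['<b>', '</b>', '<i>', '</i>']

// Negated class: same result here, without relying on backtracking.
console.log(html.match(/<[^>]+>/g)); // ['<b>', '</b>', '<i>', '</i>']

// {n,m} with word boundaries: only whole numbers of 2 to 4 digits.
console.log('1 22 333 4444 55555'.match(/\b\d{2,4}\b/g)); // ['22', '333', '4444']
```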
Anchors
| Pattern | Meaning |
|---|---|
| `^` | Start of string (or line with m flag) |
| `$` | End of string (or line with m flag) |
| `\b` | Word boundary |
| `\B` | Non-word boundary |
| `(?=...)` | Positive lookahead |
| `(?!...)` | Negative lookahead |
| `(?<=...)` | Positive lookbehind |
| `(?<!...)` | Negative lookbehind |
/^hello/ // Starts with 'hello'
/world$/ // Ends with 'world'
/^hello$/ // Exactly 'hello'
/\bword\b/ // 'word' as complete word
// Multi-line
/^line/m // Match 'line' at start of any line
// Word boundaries
'hello world'.match(/\bworld\b/); // ['world']
'helloworld'.match(/\bworld\b/); // null
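Anchors change what counts as a match rather than which characters are matched. A brief sketch with an invented log string, showing the effect of the `m` flag and of anchoring both ends for validation:

```javascript
const log = 'error: disk full\ninfo: started\nerror: timeout';

// Without m, $ only matches at the very end of the string, so nothing matches.
console.log(log.match(/^error: .+$/g)); // null

// With m, ^ and $ apply to each line.
console.log(log.match(/^error: .+$/gm)); // ['error: disk full', 'error: timeout']

// Unanchored patterns succeed if any part of the input matches.
console.log(/\d+/.test('abc123')); // true

// Anchoring both ends requires the whole input to match.
console.log(/^\d+$/.test('abc123')); // false
console.log(/^\d+$/.test('123'));    // true
```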
Groups and Capturing
Capturing Groups
const pattern = /(\d{3})-(\d{4})/;
const match = '123-4567'.match(pattern);
// ['123-4567', '123', '4567']
// Named groups
const named = /(?<area>\d{3})-(?<number>\d{4})/;
const result = '123-4567'.match(named);
// result.groups = { area: '123', number: '4567' }
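Named groups pair well with destructuring. The following sketch assumes a hypothetical `parseDate` helper (not from the text above) and handles the `null` that `match()` returns when nothing matches:

```javascript
const datePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;

// Hypothetical helper: returns numeric date parts, or null when no date is found.
function parseDate(text) {
  const match = text.match(datePattern);
  if (match === null) {
    return null;
  }
  const { year, month, day } = match.groups;
  return { year: Number(year), month: Number(month), day: Number(day) };
}

console.log(parseDate('Released on 2024-03-15')); // { year: 2024, month: 3, day: 15 }
console.log(parseDate('no date here'));           // null
```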
Non-Capturing Groups
/(?:https?):\/\//; // Groups but doesn't capture
Backreferences
// Match repeated words
/(\w+)\s+\1/ // 'the the', 'is is'
// Named backreference
/(?<word>\w+)\s+\k<word>/
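One detail worth noting: without word boundaries, a repeated-word pattern can also match inside longer words (for example the `is is` in `This is`). A sketch that adds `\b` on both sides and uses `$1` to collapse the duplicates; the sample sentence is illustrative:

```javascript
// \b on both sides keeps the match from starting or ending inside a longer word.
const doubled = /\b(\w+)\s+\1\b/gi;
const draft = 'This is is a test of the the parser';

console.log(draft.match(doubled)); // ['is is', 'the the']

// $1 inserts the first captured word, dropping the repetition.
console.log(draft.replace(doubled, '$1')); // 'This is a test of the parser'

// Without the boundaries, the pattern matches across 'This' and 'is'.
console.log('This is fine'.match(/(\w+)\s+\1/)[0]); // 'is is'
```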
Alternation
/cat|dog/ // 'cat' or 'dog'
/gr(a|e)y/ // 'gray' or 'grey'
/(red|green|blue)/ // Any of the colors
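Alternation has the lowest precedence of any regex operator, which matters once anchors are involved. A small sketch of the pitfall and the grouped fix:

```javascript
// Reads as (^cat) OR (dog$): each anchor applies to only one alternative.
const loose = /^cat|dog$/;
console.log(loose.test('catalog')); // true (starts with 'cat')
console.log(loose.test('hotdog'));  // true (ends with 'dog')

// Grouping makes both anchors apply to the whole alternation.
const strict = /^(?:cat|dog)$/;
console.log(strict.test('catalog')); // false
console.log(strict.test('hotdog'));  // false
console.log(strict.test('dog'));     // true
```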
Lookahead and Lookbehind
// Positive lookahead (?=...)
/hello(?= world)/ // 'hello' followed by ' world'
// Negative lookahead (?!...)
/hello(?! world)/ // 'hello' NOT followed by ' world'
// Positive lookbehind (?<=...)
/(?<=\$)\d+/ // Digits preceded by $
// Negative lookbehind (?<!...)
/(?<!\$)\d+/ // Digits NOT preceded by $
// Examples
'$100'.match(/(?<=\$)\d+/); // ['100']
'100'.match(/(?<=\$)\d+/); // null
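Because lookarounds assert without consuming characters, they can mark positions as well as filter matches. The sketch below assumes a hypothetical `addThousandsSeparators` helper and also shows a subtlety of the negative lookbehind above: on its own it can still match later digits of a number that follows `$`.

```javascript
// Hypothetical helper: insert a comma at each position followed by
// a multiple of three digits up to the end of the number.
function addThousandsSeparators(digits) {
  return digits.replace(/\B(?=(\d{3})+(?!\d))/g, ',');
}

console.log(addThousandsSeparators('1234567')); // '1,234,567'
console.log(addThousandsSeparators('999'));     // '999'

const receipt = 'Cost: $45, qty 3, fee $7';

// Positive lookbehind: numbers that directly follow a dollar sign.
console.log(receipt.match(/(?<=\$)\d+/g)); // ['45', '7']

// The plain negative lookbehind skips '4' but still matches '5'.
console.log('$45'.match(/(?<!\$)\d+/)[0]); // '5'

// Also excluding a preceding digit leaves only numbers with no '$' prefix.
console.log(receipt.match(/(?<![$\d])\d+/g)); // ['3']
```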
String Methods with Regex
test() (a RegExp method)
const regex = /hello/i;
regex.test('Hello World'); // true
regex.test('Goodbye World'); // false
match()
// Without g flag - returns details
'abc 123 def 456'.match(/\d+/);
// ['123', index: 4, input: '...']
// With g flag - returns all matches
'abc 123 def 456'.match(/\d+/g);
// ['123', '456']
// No match returns null
'hello'.match(/\d+/); // null
matchAll()
// Returns iterator with details for each match
const str = 'test1test2test3';
const matches = [...str.matchAll(/test(\d)/g)];
// [
// ['test1', '1', index: 0],
// ['test2', '2', index: 5],
// ['test3', '3', index: 10]
// ]
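Since `matchAll()` returns an iterator, it can be consumed directly in a `for...of` loop, and each match exposes its named groups. A sketch using an invented `key=value` string; note that `matchAll()` throws a `TypeError` when given a regex without the `g` flag:

```javascript
const config = 'host=localhost; port=8080; mode=dev';
const pairPattern = /(?<key>\w+)=(?<value>\w+)/g;

// Build an object from every key=value pair in the string.
const settings = {};
for (const match of config.matchAll(pairPattern)) {
  settings[match.groups.key] = match.groups.value;
}
console.log(settings); // { host: 'localhost', port: '8080', mode: 'dev' }

// A regex without the g flag is rejected.
let rejected = false;
try {
  config.matchAll(/\w+/);
} catch (error) {
  rejected = error instanceof TypeError;
}
console.log(rejected); // true
```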
search()
'hello world'.search(/world/); // 6
'hello world'.search(/xyz/); // -1
replace() and replaceAll()
// Basic replace
'hello world'.replace(/world/, 'there');
// 'hello there'
// Global replace
'hello hello'.replace(/hello/g, 'hi');
// 'hi hi'
// With capture groups
'John Doe'.replace(/(\w+) (\w+)/, '$2, $1');
// 'Doe, John'
// With function
'hello'.replace(/./g, (char, i) => (i === 0 ? char.toUpperCase() : char));
// 'Hello'
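The examples above cover `replace()`; a short sketch of `replaceAll()` and of named groups in the replacement string may help round this out. The sample values are illustrative, and `replaceAll()` assumes an ES2021 runtime:

```javascript
const path = 'a-b-c';

// replace() with a string pattern changes only the first occurrence.
console.log(path.replace('-', '+')); // 'a+b-c'

// replaceAll() changes every occurrence; a regex argument must have the g flag.
console.log(path.replaceAll('-', '+'));  // 'a+b+c'
console.log(path.replaceAll(/-/g, '+')); // 'a+b+c'

// Named groups are referenced as $<name> in the replacement string.
const iso = '2024-03-15';
const reordered = iso.replace(/(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})/, '$<d>/$<m>/$<y>');
console.log(reordered); // '15/03/2024'

// A replacer function can compute each replacement from the matched text.
const doubledValues = 'price: 5, tax: 2'.replace(/\d+/g, (digits) => String(Number(digits) * 2));
console.log(doubledValues); // 'price: 10, tax: 4'
```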
split()
'a, b, c'.split(/,\s*/); // ['a', 'b', 'c']
'a1b2c3'.split(/\d/); // ['a', 'b', 'c', '']
'a1b2c3'.split(/(\d)/); // ['a', '1', 'b', '2', 'c', '3', '']
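In practice `split()` is often used to tokenize loosely formatted input. A sketch with made-up data, showing a multi-delimiter class, the optional limit argument, and one way to drop the trailing empty string seen above:

```javascript
const raw = ' apple, banana;cherry  date ';

// Trim first, then split on any run of commas, semicolons, or whitespace.
const items = raw.trim().split(/[,;\s]+/);
console.log(items); // ['apple', 'banana', 'cherry', 'date']

// The second argument caps the number of pieces returned.
console.log('a1b2c3'.split(/\d/, 2)); // ['a', 'b']

// filter(Boolean) removes empty strings left by a trailing separator.
console.log('a1b2c3'.split(/\d/).filter(Boolean)); // ['a', 'b', 'c']
```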
Common Patterns
- Email (simple): `/^[\w.-]+@[\w.-]+\.\w{2,}$/`
- URL (simple): `/https?:\/\/[\w.-]+(?:\/[\w.-]*)*\/?/`
- Phone (US): `/^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$/`
- Date (YYYY-MM-DD): `/^\d{4}-\d{2}-\d{2}$/`
- Hex color: `/^#?([a-fA-F0-9]{6}|[a-fA-F0-9]{3})$/`
- IP address: `/^(?:\d{1,3}\.){3}\d{1,3}$/`
- Password (8+ chars, upper, lower, digit): `/^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$/`
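These patterns check the shape of the input, not its meaning. The IP address pattern, for example, accepts `999.1.1.1`. A minimal sketch, assuming a hypothetical `isValidIPv4` helper, of pairing the regex with a range check in ordinary code:

```javascript
const ipShape = /^(?:\d{1,3}\.){3}\d{1,3}$/;

// Hypothetical helper: regex for the shape, plain code for the numeric range.
function isValidIPv4(text) {
  if (!ipShape.test(text)) {
    return false;
  }
  return text.split('.').every((octet) => Number(octet) <= 255);
}

console.log(ipShape.test('999.1.1.1'));  // true (shape only)
console.log(isValidIPv4('999.1.1.1'));   // false
console.log(isValidIPv4('192.168.0.1')); // true
console.log(isValidIPv4('192.168.0'));   // false
```

The same idea applies to the date pattern, which accepts strings such as `2024-99-99`.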
Regex Methods
RegExp.prototype.exec()
const regex = /\d+/g;
const str = 'a1b2c3';
let match;
while ((match = regex.exec(str)) !== null) {
  console.log(`Found ${match[0]} at ${match.index}`);
}
// Found 1 at 1
// Found 2 at 3
// Found 3 at 5
RegExp Properties
const regex = /hello/gi;
regex.source; // 'hello'
regex.flags; // 'gi'
regex.global; // true
regex.ignoreCase; // true
regex.multiline; // false
regex.lastIndex; // Position for next match (with g flag)
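Because `lastIndex` is stored on the regex object, reusing a regex that has the `g` flag with `test()` or `exec()` can give alternating results. A short sketch of the behavior:

```javascript
const globalPattern = /cat/g;

console.log(globalPattern.test('cat')); // true
console.log(globalPattern.lastIndex);   // 3

// The next search starts at index 3, so the same input now fails.
console.log(globalPattern.test('cat')); // false
console.log(globalPattern.lastIndex);   // 0 (reset after the failed match)

// Without the g flag, lastIndex is ignored and results are stable.
const plainPattern = /cat/;
console.log(plainPattern.test('cat')); // true
console.log(plainPattern.test('cat')); // true
```

For simple yes/no checks, omitting the `g` flag (or resetting `lastIndex` to 0 before each call) avoids the surprise.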
Performance Tips
1. Use literal notation for static patterns
   - Prefer: `const regex = /pattern/g;`
   - Avoid: `const regex = new RegExp('pattern', 'g');`
2. Avoid catastrophic backtracking
   - Avoid: `/(a+)+$/` (exponential time)
   - Prefer: `/a+$/` (linear time)
3. Be specific with quantifiers
   - Avoid: `/.*something/`
   - Prefer: `/[^x]*something/` or `/.{0,100}something/`
4. Use non-capturing groups when not extracting: `(?:...)` instead of `(...)`
5. Anchor patterns when possible: `^...$` is faster than unanchored
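For tip 2, the nested and flat patterns accept the same strings, so the rewrite changes only how much work the engine does on a failed match. The sketch below checks that agreement on short, illustrative inputs; it deliberately avoids long inputs, where the nested form can take a very long time to fail:

```javascript
const nested = /(a+)+$/; // nested quantifiers: many ways to split a run of 'a'
const flat = /a+$/;      // one quantifier: a single way to match the run

// Short samples only; a long run of 'a' followed by 'b' would stall the nested form.
const samples = ['a', 'aaaa', '', 'aaab', 'baaa'];
const agree = samples.every((sample) => nested.test(sample) === flat.test(sample));

console.log(agree);             // true
console.log(flat.test('aaab')); // false
console.log(flat.test('baaa')); // true
```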
Key Takeaways
- Use literal notation: `/pattern/` is cleaner than `new RegExp()`
- Understand flags: `g` for all matches, `i` for case-insensitive
- Character classes: `\d`, `\w`, `\s` are your friends
- Capture groups: `()` captures, `(?:)` doesn't
- Named groups: `(?<name>...)` for readable code
- Non-greedy: add `?` after quantifiers for minimal matches
- Lookahead/behind: match without consuming
- Test your regex: use tools like regex101.com