SmartToolsToday
๐Ÿ“–
RegexProgrammingDeveloper Tools

Regular Expressions for Beginners: Patterns, Syntax, and Examples

Master regex from scratch. This guide covers characters, quantifiers, groups, and common patterns with real examples you can test immediately.

ST
SmartToolsTodayยทJune 16, 2026ยท5 min read
Ad ยท 728ร—90 Leaderboard

What Is a Regular Expression?

A regular expression (regex) is a sequence of characters that defines a search pattern. You use regex to find, extract, or replace text that matches a specific structure โ€” without knowing the exact text in advance.

For example, the pattern \d{4}-\d{2}-\d{2} matches any string that looks like a date in YYYY-MM-DD format. It does not care whether the date is 2024-01-15 or 1999-12-31 โ€” both match because both fit the structure.

Regex is built into nearly every programming language: JavaScript, Python, Java, Go, Ruby, PHP. It powers search in VS Code, grep in the terminal, form validation on websites, and log parsing in production systems.


Your First Regex: Literal Characters

The simplest regex is just a literal string. The pattern cat matches any text containing the letters c, a, t in sequence:

Text:    "The cat sat on the mat."
Pattern: cat
Match:   "cat" (position 4)

Case sensitivity matters. cat does not match Cat unless you enable the case-insensitive flag (/cat/i in JavaScript).


Special Characters (Metacharacters)

Certain characters have special meaning in regex. They are called metacharacters:

Character Meaning
. Any single character except newline
^ Start of string (or line in multiline mode)
$ End of string (or line)
* Zero or more of the preceding element
+ One or more of the preceding element
? Zero or one of the preceding element
\ Escape the next character
[] Character class โ€” match any one character inside
() Capturing group
` `
{} Quantifier with exact count

To match a literal ., escape it: \.


Character Classes

A character class [...] matches any single character from the set:

[aeiou]       โ€” any vowel
[a-z]         โ€” any lowercase letter
[A-Z]         โ€” any uppercase letter
[0-9]         โ€” any digit (same as \d)
[a-zA-Z0-9]   โ€” any letter or digit
[^aeiou]      โ€” any character that is NOT a vowel (^ negates inside [])

Shorthand Classes

These are so common they have their own shortcuts:

Shorthand Equivalent Matches
\d [0-9] Any digit
\D [^0-9] Any non-digit
\w [a-zA-Z0-9_] Word character
\W [^a-zA-Z0-9_] Non-word character
\s [ \t\r\n\f] Whitespace
\S [^ \t\r\n\f] Non-whitespace

Quantifiers

Quantifiers control how many times the preceding element must appear:

\d+       โ€” one or more digits       โ†’ matches "42", "0", "1000"
\d*       โ€” zero or more digits      โ†’ matches "", "5", "999"
\d?       โ€” zero or one digit        โ†’ matches "" or "7"
\d{3}     โ€” exactly 3 digits         โ†’ matches "123" but not "12"
\d{2,4}   โ€” between 2 and 4 digits   โ†’ matches "12", "123", "1234"
\d{3,}    โ€” 3 or more digits

Greedy vs. Lazy

By default quantifiers are greedy โ€” they match as much as possible:

Pattern: <.+>
Text:    <b>bold</b>
Match:   <b>bold</b>   (the whole thing)

Adding ? makes it lazy โ€” match as little as possible:

Pattern: <.+?>
Text:    <b>bold</b>
Matches: <b>   and   </b>  (separately)

Anchors

Anchors match a position rather than a character:

^Hello        โ€” string must START with "Hello"
world$        โ€” string must END with "world"
^\d{5}$       โ€” string must be exactly 5 digits (US zip code)
\bcat\b       โ€” "cat" as a whole word (\b is a word boundary)

Word boundaries are particularly useful. \bcat\b matches "cat" in "the cat sat" but not in "concatenate".


Capturing Groups

Parentheses () create a capturing group, which lets you extract parts of a match:

(\d{4})-(\d{2})-(\d{2})

Applied to "2024-06-15":

  • Group 0 (full match): 2024-06-15
  • Group 1: 2024
  • Group 2: 06
  • Group 3: 15

In JavaScript you access groups via the exec() result or named groups:

const pattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const match = pattern.exec("2024-06-15");
console.log(match.groups.year);  // "2024"
console.log(match.groups.month); // "06"

Non-capturing groups (?:...) group without capturing โ€” useful when you need grouping for quantifiers but do not need the captured value:

(?:https?://)?example\.com

Alternation

The pipe | means OR:

cat|dog        โ€” matches "cat" or "dog"
(jpg|jpeg|png) โ€” matches any image extension

Real-World Patterns

Email Address (simplified)

^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$

This matches most valid emails. A truly RFC-compliant email regex is famously complex โ€” for production use, validate with a library.

US Phone Number

^\+?1?\s?[\(\-]?\d{3}[\)\-\s]?\d{3}[\-\s]?\d{4}$

Matches: (555) 123-4567, 555-123-4567, +1 555 123 4567

URL

https?://[^\s/$.?#].[^\s]*

IPv4 Address

\b(?:\d{1,3}\.){3}\d{1,3}\b

Hex Color Code

#[0-9a-fA-F]{6}|#[0-9a-fA-F]{3}

Matches #FF5733 and #F53.

Strong Password (at least 8 chars, upper, lower, digit, special)

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$

This uses lookaheads (?=...) โ€” zero-width assertions that check a condition without consuming characters.


Flags / Modifiers

Most regex engines support flags that change matching behavior:

Flag Meaning
i Case insensitive
g Global โ€” find all matches, not just the first
m Multiline โ€” ^ and $ match each line
s Dotall โ€” . matches newlines too
x Extended โ€” ignore whitespace and allow comments

In JavaScript: /pattern/flags โ€” e.g. /\d+/gi In Python: re.compile(r'\d+', re.IGNORECASE | re.MULTILINE)


Testing and Debugging Regex

Regex is notoriously hard to read back after writing it. Always test with real data including edge cases. Use the Regex Tester to paste your pattern and test strings side by side โ€” it highlights matches in real time and shows capturing groups.

Common debugging steps:

  1. Start with a simple literal match to confirm the tool works.
  2. Add one metacharacter at a time.
  3. Test both matching and non-matching cases.
  4. Check edge cases: empty strings, very long inputs, special characters.

Frequently Asked Questions

Why does my regex match too much? Quantifiers are greedy by default. Add ? after * or + to make them lazy, or use more specific character classes instead of ..

What is the difference between ^ inside and outside []? Outside [], ^ anchors to the start of a string. Inside [^abc], it negates the class โ€” matching any character that is not a, b, or c.

Is regex the same in every language? No. Most support the same core syntax but differ in advanced features: named groups, lookaheads, Unicode properties, and flags vary. JavaScript, Python, and PCRE (PHP, Perl) are similar. POSIX regex (used in older Unix tools) lacks some features.

When should I not use regex? For structured formats like JSON, XML, or HTML, use a proper parser instead. Regex on HTML is a notorious anti-pattern that breaks on nested tags and edge cases.


Try your patterns live in the Regex Tester โ€” paste a pattern, type test strings, and see matches highlighted instantly.

Ad ยท 728ร—90 Leaderboard
Back to BlogBrowse Tools โ†’

Related Articles

๐Ÿ“–
Base64Encoding
5 min read

Base64 Encoding Explained: What It Is and When to Use It

Understand Base64 encoding from first principles: how it works, when to use it, when to avoid it, and practical examples in APIs, emails, and data URIs.

ST
Jun 20, 2026Read โ†’
๐Ÿ“–
Password SecurityCybersecurity
6 min read

Creating Strong Passwords: A Practical Security Guide

Learn what makes a password truly strong, how attackers crack weak ones, and how to build a password strategy that actually works for your daily life.

ST
Jun 20, 2026Read โ†’
๐Ÿ“–
JSONDeveloper Tools
5 min read

How to Format and Validate JSON: A Developer's Guide

Learn how to format, validate, and debug JSON data with practical examples. Master JSON syntax rules and avoid common pitfalls in APIs and config files.

ST
Jun 20, 2026Read โ†’