🦋 Regex Basics in Raku

2026-03-16

If you have used regular expressions in Perl, Python, or JavaScript, Raku's regexes will feel familiar in purpose but refreshingly different in syntax. Raku redesigned regular expressions from scratch to be more readable, more powerful, and easier to maintain. Whitespace is ignored by default (making complex patterns readable), and the syntax is more consistent. Let us learn Raku regexes from the ground up.

Your First Match

The smart match operator ~~ checks whether a string matches a regex:

my $text = "Hello, World!";

if $text ~~ /World/ {
    say "Found it!";     # Found it!
}

if $text ~~ /Raku/ {
    say "Found Raku";
} else {
    say "No Raku here";  # No Raku here
}

The !~~ operator is the negated form:

if "hello123" !~~ /^\d+$/ {
    say "Not all digits";    # Not all digits
}

Whitespace in Raku Regexes

This is the biggest difference from traditional regexes. In Raku, whitespace inside a regex is ignored by default. This lets you spread out complex patterns for readability:

# These look similar but are NOT identical:
/hello world/       # WRONG: matches "helloworld" (space is ignored!)
/hello ' ' world/   # RIGHT: matches "hello world" (quoted literal space)
/hello \s world/    # RIGHT: matches "hello" + one whitespace character + "world"

To match a literal space, you must quote it or use \s. This is a deliberate design choice that makes complex regexes much easier to read.
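
As a minimal sketch of this rule, the snippet below runs three variants of the pattern against a few sample strings. The sample strings and variable names are made up for illustration, and both words are quoted so that the only difference between the patterns is how the gap is written:

```raku
my $no-gap = / 'hello' 'world' /;       # whitespace between the atoms is ignored
my $space  = / 'hello' ' ' 'world' /;   # quoted space: exactly one literal space
my $any-ws = / 'hello' \s 'world' /;    # \s: one whitespace character of any kind

for "helloworld", "hello world", "hello\tworld" -> $s {
    my $a = so $s ~~ $no-gap;
    my $b = so $s ~~ $space;
    my $c = so $s ~~ $any-ws;
    say "{$s.raku}: no-gap=$a space=$b any-ws=$c";
}
# "helloworld": no-gap=True space=False any-ws=False
# "hello world": no-gap=False space=True any-ws=True
# "hello\tworld": no-gap=False space=False any-ws=True
```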

Literal Matching

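To match literal text, just write it. For characters that have special meaning, quote them. A successful match returns a Match object, which say would print as ｢abc｣, so these examples put so in front of the match to print a plain True or False:

say so "abc" ~~ /abc/;           # True
say so "hello" ~~ /hell/;        # True (partial match)

# Special characters must be quoted or escaped
say so "3.14" ~~ / 3 '.' 14 /;   # True (quoted dot)
say so "3.14" ~~ / 3 \. 14 /;    # True (escaped dot)
say so '$100' ~~ / '$' 100 /;    # True (single-quoted string, so $100 is not interpolated)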

Anchors

Anchors match positions, not characters:

say so "hello" ~~ /^ hello $/;     # True (exact match)
say so "hello world" ~~ /^ hello/; # True (starts with)
say so "hello world" ~~ /world $/; # True (ends with)

# ^^ and $$ match start/end of lines in multiline text
my $text = "line one\nline two\nline three";
say so $text ~~ /^^ 'line two' $$/;   # True (a complete line in the middle)

Character Classes

Raku uses <[ ]> for character classes instead of the traditional [ ]:

# Match a single vowel
say so "hello" ~~ /<[aeiou]>/;        # True

# Match digits
say so "abc123" ~~ /<[0..9]>/;        # True

# Match a range of letters
say so "hello" ~~ /<[a..z]>/;         # True

# Negated character class with -
say so "hello" ~~ /<-[aeiou]>/;       # True (matches 'h', a non-vowel)

Character Class Operations

You can combine character classes with set operations:

# Union: vowels OR digits
/<[aeiou] + [0..9]>/

# Difference: letters BUT NOT vowels (i.e., consonants)
/<[a..z] - [aeiou]>/

# Built-in character classes
/\d/    # digit (any Unicode digit, not only 0..9)
/\w/    # word character (letter, digit, underscore)
/\s/    # whitespace
/\D/    # non-digit
/\W/    # non-word character
/\S/    # non-whitespace
/./     # any character, including newline
/\N/    # any character except newline
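
To see the set operations at work, here is a small sketch that pulls characters out of a sample word. The word and the variable names are illustrative, and it uses the standard .comb method (not covered above), which returns every non-overlapping match of a regex as a string:

```raku
my $word = "regex101";

# Difference: lowercase letters minus vowels leaves the consonants.
my @consonants = $word.comb(/ <[a..z] - [aeiou]> /);
say @consonants;          # [r g x]

# Union: vowels or digits.
my @vowels-or-digits = $word.comb(/ <[aeiou] + [0..9]> /);
say @vowels-or-digits;    # [e e 1 0 1]
```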

Quantifiers

Quantifiers control how many times a pattern matches:

my $text = "aabbbcccc";

say so $text ~~ /a+/;        # True: one or more 'a'
say so $text ~~ /a*b/;       # True: zero or more 'a' followed by 'b'
say so $text ~~ /d?/;        # True: zero or one 'd' (zero matches)
say so $text ~~ /b ** 3/;    # True: exactly 3 'b's
say so $text ~~ /c ** 2..4/; # True: 2 to 4 'c's

Summary:

Quantifier	Meaning
`*`	zero or more
`+`	one or more
`?`	zero or one
`** N`	exactly N times
`** N..M`	between N and M times
`** N..*`	N or more times
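
The "N or more" form is the only row without an example above, so here is a short sketch of it. The sample string and variable names are invented for illustration, and .comb is the standard method that returns every match of a regex as a string:

```raku
my $numbers = "7 42 1234 98765";

# ** 3..* means "three or more"; the * leaves the upper bound open.
my @long = $numbers.comb(/ \d ** 3..* /);
say @long;    # [1234 98765]
```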

Greedy vs Frugal

By default, quantifiers are greedy (they match as much as possible). Add ? to make them frugal (match as little as possible):

my $html = '<b>bold</b> and <b>more bold</b>';

# Greedy: matches from first <b> to LAST </b>
if $html ~~ /'<b>' .* '</b>'/ {
    say ~$/;    # <b>bold</b> and <b>more bold</b>
}

# Frugal: matches from first <b> to FIRST </b>
if $html ~~ /'<b>' .*? '</b>'/ {
    say ~$/;    # <b>bold</b>
}

Captures

Parentheses create numbered captures, accessible as $0, $1, etc.:

my $date = "2026-03-16";

if $date ~~ /(\d ** 4) '-' (\d ** 2) '-' (\d ** 2)/ {
    say "Year:  $0";     # Year:  2026
    say "Month: $1";     # Month: 03
    say "Day:   $2";     # Day:   16
}

The match result is stored in the special variable $/:

"Hello World" ~~ /(\w+) \s+ (\w+)/;
say ~$/[0];   # Hello
say ~$/[1];   # World
say ~$/;      # Hello World (the entire match)

Named Captures

Named captures use $<name> syntax and are much more readable:

my $line = "Alice: 95";

if $line ~~ /$<name>=(\w+) ':' \s* $<score>=(\d+)/ {
    say "Student: $<name>";     # Student: Alice
    say "Score:   $<score>";    # Score:   95
}
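
Named captures are Match objects, so they usually need a coercion before further use. The following sketch, with illustrative variable names, converts them with the ~ (string) and + (numeric) prefixes and shows that $<score> is shorthand for $/<score>:

```raku
my $line = "Alice: 95";
my ($name, $score);

if $line ~~ / $<name>=(\w+) ':' \s* $<score>=(\d+) / {
    $name  = ~$<name>;     # Str: the matched text
    $score = +$<score>;    # Int: the matched text as a number
    say $/<score> == $score;    # True
}

say "$name needs {100 - $score} more points";    # Alice needs 5 more points
```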

Built-in Named Patterns

Raku provides several useful predefined patterns:

say so "hello123" ~~ /<alpha>/;     # True (alphabetic char)
say so "hello123" ~~ /<digit>/;     # True (digit char)
say so "hello123" ~~ /<alnum>/;     # True (alphanumeric char)
say so "  hello"  ~~ /<space>/;     # True (whitespace)
say so "HELLO"    ~~ /<upper>/;     # True (uppercase letter)
say so "hello"    ~~ /<lower>/;     # True (lowercase letter)

Use + with named patterns for matching multiple characters:

"hello123" ~~ /<alpha>+/;
say ~$/;    # hello

"hello123" ~~ /<digit>+/;
say ~$/;    # 123

Modifiers (Adverbs)

Raku regex adverbs modify matching behavior:

:i (case insensitive)

say so "Hello" ~~ /:i hello/;      # True
say so "HELLO" ~~ /:i hello/;      # True

:g (global matching)

my $text = "cat bat hat mat";
my @matches = $text ~~ m:g/\w+ 'at'/;
say @matches.elems;     # 4
say ~@matches;          # cat bat hat mat

:s (significant whitespace)

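Normally whitespace in regexes is ignored. The :s adverb (short for :sigspace) makes whitespace in the pattern significant: each run of whitespace matches the built-in <.ws> rule, which requires at least one whitespace character between two word characters and allows optional whitespace elsewhere: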

my $text = "Hello   World";

# Without :s - whitespace ignored, so this only matches "HelloWorld"
say so $text ~~ /Hello World/;     # False (there's no "HelloWorld")

# With :s - the space in the regex must match whitespace between the two words
say so $text ~~ /:s Hello World/;  # True (matches "Hello   World")
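
A short sketch of how significant whitespace behaves in different positions, using the m:s/.../ adverb form and made-up sample strings. Between two word characters the whitespace is required; next to punctuation it is optional:

```raku
my $spaced    = so "Hello   World" ~~ m:s/Hello World/;      # whitespace required, any amount
my $glued     = so "HelloWorld"    ~~ m:s/Hello World/;      # no whitespace between the words
my $with-punc = so "Hello, World"  ~~ m:s/Hello ',' World/;  # nothing needed before the comma

say $spaced;       # True
say $glued;        # False
say $with-punc;    # True
```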

Combining Adverbs

my $text = "The Cat sat on the Mat";
my @found = $text ~~ m:i:g/ <[cm]> at /;
say ~@found;   # Cat Mat

The .match Method

You can also use the .match method on strings:

my $result = "Hello World".match(/(\w+) \s+ (\w+)/);
if $result {
    say ~$result[0];   # Hello
    say ~$result[1];   # World
}

For global matches, pass :g:

my @all = "one 1 two 2 three 3".match(/\d+/, :g);
say ~@all;   # 1 2 3

Alternation

Use | for longest-match alternation and || for first-match:

# | tries all alternatives and picks the longest match
"railroad" ~~ / rail | railroad /;
say ~$/;    # railroad (longest match wins)

# || tries alternatives left to right, picks the first that matches
"railroad" ~~ / rail || railroad /;
say ~$/;    # rail (first match wins)

This distinction matters when alternatives overlap:

# Match file extensions
my $file = "photo.jpeg";
if $file ~~ / '.' (jpg | jpeg | png | gif) $/ {
    say "Image format: $0";    # Image format: jpeg
}

Practical Example: Email Validator

#!/usr/bin/env raku

my @emails = (
    'alice@example.com',
    'bob.smith@company.co.uk',
    'invalid@',
    '@nouser.com',
    'spaces in@address.com',
    'good_one+tag@gmail.com',
    'missing.domain@',
);

# Simple email pattern
my $email-rx = /^
    <[\w.+\-]>+       # local part: word chars, dots, plus, hyphen
    '@'
    <[\w.\-]>+         # domain: word chars, dots, hyphens
    '.'
    <[a..zA..Z]> ** 2..10    # TLD: 2-10 letters
$/;

for @emails -> $addr {
    my $status = $addr ~~ $email-rx ?? "VALID" !! "INVALID";
    say sprintf("%-30s %s", $addr, $status);
}

Output:

alice@example.com              VALID
bob.smith@company.co.uk        VALID
invalid@                       INVALID
@nouser.com                    INVALID
spaces in@address.com          INVALID
good_one+tag@gmail.com         VALID
missing.domain@                INVALID

Practical Example: Log Line Parser

#!/usr/bin/env raku

my @logs = (
    '2026-03-16 10:30:15 [INFO] Server started on port 8080',
    '2026-03-16 10:30:22 [ERROR] Connection refused: 10.0.0.5:3306',
    '2026-03-16 10:31:05 [WARN] Memory usage at 85%',
);

for @logs -> $line {
    if $line ~~ /
        $<date>  = (\d ** 4 '-' \d ** 2 '-' \d ** 2) \s+
        $<time>  = (\d ** 2 ':' \d ** 2 ':' \d ** 2) \s+
        '[' $<level> = (\w+) ']' \s+
        $<message> = (.+)
    / {
        say "Level:   $<level>";
        say "Time:    $<date> $<time>";
        say "Message: $<message>";
        say "---";
    }
}
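
Once a line is parsed, the captures can feed ordinary data structures. As a sketch under the same log format, this variation counts lines per level; the %count hash and the shortened pattern are illustrative additions, not part of the parser above:

```raku
my @logs = (
    '2026-03-16 10:30:15 [INFO] Server started on port 8080',
    '2026-03-16 10:30:22 [ERROR] Connection refused: 10.0.0.5:3306',
    '2026-03-16 10:31:05 [WARN] Memory usage at 85%',
);

my %count;
for @logs -> $line {
    # Only the level is needed here, so the pattern captures just the bracketed word.
    if $line ~~ / '[' $<level>=(\w+) ']' / {
        %count{~$<level>}++;    # ~ turns the Match into a plain string key
    }
}

for %count.keys.sort -> $level {
    say "$level: %count{$level}";
}
# ERROR: 1
# INFO: 1
# WARN: 1
```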

What is Next?

You now have a solid foundation in Raku regex basics. There is much more to explore (grammars, tokens, rules), but what you have learned here will handle the vast majority of text matching and extraction tasks. Next up: file I/O in Raku, where we will read, write, and navigate the filesystem.