
Regex Basics in Raku

2026-03-16

If you have used regular expressions in Perl, Python, or JavaScript, Raku's regexes will feel familiar in purpose but refreshingly different in syntax. Raku redesigned regular expressions from scratch to be more readable, more powerful, and easier to maintain. Whitespace is ignored by default (making complex patterns readable), and the syntax is more consistent. Let us learn Raku regexes from the ground up.

Your First Match

The smart match operator checks if a string matches a regex:
my $text = "Hello, World!";

if $text ~~ /World/ {
    say "Found it!";        # Found it!
}

if $text ~~ /Raku/ {
    say "Found Raku";
} else {
    say "No Raku here";     # No Raku here
}
The !~~ operator is the negated form:
if "hello123" !~~ /^ \d+ $/ {
    say "Not all digits";   # Not all digits
}

Whitespace in Raku Regexes

This is the biggest difference from traditional regexes. In Raku, whitespace inside a regex is ignored by default. This lets you spread out complex patterns for readability:
# These are NOT identical:
/hello world/       # WRONG: matches "helloworld" (the space is ignored!)
/hello ' ' world/   # RIGHT: matches "hello world" (quoted literal space)
/hello \s world/    # RIGHT: matches "hello" + one whitespace character + "world"
To match a literal space, you must quote it or use \s. This is a deliberate design choice that makes complex regexes much easier to read.

Literal Matching

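To match literal text, just write it. Letters, digits, and underscores match themselves; any other character must be quoted or escaped:
# `so` turns the Match object into True/False; a bare `say` would print the match itself
say so "abc" ~~ /abc/;            # True
say so "hello" ~~ /hell/;         # True (partial match)

# Special characters must be quoted or escaped
say so "3.14" ~~ / 3 '.' 14 /;    # True (quoted dot)
say so "3.14" ~~ / 3 \. 14 /;     # True (escaped dot)
say so '$100' ~~ / '$' 100 /;     # True (single-quoted string, so $100 is not interpolated)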

Anchors

Anchors match positions, not characters:
say so "hello" ~~ /^ hello $/;          # True (exact match)
say so "hello world" ~~ /^ hello/;      # True (starts with)
say so "hello world" ~~ /world $/;      # True (ends with)

# ^^ and $$ match start/end of lines in multiline text
my $text = "line one\nline two\nline three";
say so $text ~~ /^^ line \s two $$/;    # True (matches the whole second line)

Character Classes

Raku uses <[ ]> for character classes instead of the traditional [ ]:
# Match a single vowel
say so "hello" ~~ /<[aeiou]>/;      # True

# Match digits
say so "abc123" ~~ /<[0..9]>/;      # True

# Match a range of letters
say so "hello" ~~ /<[a..z]>/;       # True

# Negated character class with -
say so "hello" ~~ /<-[aeiou]>/;     # True (matches 'h', a non-vowel)

Character Class Operations

You can combine character classes with set operations:
# Union: vowels OR digits
/<[aeiou] + [0..9]>/

# Difference: letters BUT NOT vowels (i.e., consonants)
/<[a..z] - [aeiou]>/

# Built-in character classes
/\d/    # digit (any Unicode digit, not only 0..9)
/\w/    # word character (letter, digit, underscore)
/\s/    # whitespace
/\D/    # non-digit
/\W/    # non-word character
/\S/    # non-whitespace
/./     # any character, including newline
/\N/    # any character except newline
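The set operations above are easier to judge when they run against real input. The following is a minimal sketch; the helper names `consonants` and `count-vowels-or-digits` are made up for illustration, and it relies on `.comb`, which returns every substring that matches the given regex.

```raku
# Hypothetical helper: keep only the consonants of a lowercase word,
# using character class difference.
sub consonants(Str $word) {
    $word.comb(/ <[a..z] - [aeiou]> /).join
}

# Hypothetical helper: count characters that are vowels OR digits,
# using character class union.
sub count-vowels-or-digits(Str $s) {
    $s.comb(/ <[aeiou] + [0..9]> /).elems
}

say consonants("regex");                    # rgx
say count-vowels-or-digits("raku 2026");    # 6
```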

Quantifiers

Quantifiers control how many times a pattern matches:
my $text = "aabbbcccc";
say so $text ~~ /a+/;           # True: one or more 'a'
say so $text ~~ /a*b/;          # True: zero or more 'a' followed by 'b'
say so $text ~~ /d?/;           # True: zero or one 'd' (zero matches)
say so $text ~~ /b ** 3/;       # True: exactly 3 'b's
say so $text ~~ /c ** 2..4/;    # True: 2 to 4 'c's
Summary:
Quantifier   Meaning
*            zero or more
+            one or more
?            zero or one
** N         exactly N times
** N..M      between N and M times
** N..*      N or more times
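As a rough illustration of the range quantifiers, the sketch below checks a made-up rule (a PIN of 4 to 6 digits) and then uses an open-ended range. The name `valid-pin` and the PIN rule itself are assumptions for the example, not part of any library.

```raku
# Hypothetical rule: a PIN is 4 to 6 digits and nothing else.
sub valid-pin(Str $s) {
    so $s ~~ /^ \d ** 4..6 $/
}

say valid-pin("1234");       # True
say valid-pin("123");        # False (too short)
say valid-pin("1234567");    # False (too long)

# Open-ended range: two or more 'a's in a row
say so "caaat" ~~ / a ** 2..* /;    # True
```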

Greedy vs Frugal

By default, quantifiers are greedy (they match as much as possible). Add ? to make them frugal (match as little as possible):
my $html = '<b>bold</b> and <b>more bold</b>';

# Greedy: matches from first <b> to LAST </b>
if $html ~~ /'<b>' .* '</b>'/ {
    say ~$/;    # <b>bold</b> and <b>more bold</b>
}

# Frugal: matches from first <b> to FIRST </b>
if $html ~~ /'<b>' .*? '</b>'/ {
    say ~$/;    # <b>bold</b>
}

Captures

Parentheses create numbered captures, accessible as $0, $1, etc.:
my $date = "2026-03-16";
if $date ~~ /(\d ** 4) '-' (\d ** 2) '-' (\d ** 2)/ {
    say "Year: $0";     # Year: 2026
    say "Month: $1";    # Month: 03
    say "Day: $2";      # Day: 16
}
The match result is stored in the special variable $/. Each capture is itself a Match object, so prefix it with ~ to get the plain string:
"Hello World" ~~ /(\w+) \s+ (\w+)/;
say ~$/[0];     # Hello
say ~$/[1];     # World
say ~$/;        # Hello World (the entire match)

Named Captures

Named captures use $<name> syntax and are much more readable:
my $line = "Alice: 95";
if $line ~~ /$<name>=(\w+) ':' \s* $<score>=(\d+)/ {
    say "Student: $<name>";     # Student: Alice
    say "Score: $<score>";      # Score: 95
}
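Named captures are Match objects, so it often helps to convert them before using them elsewhere. Here is a small sketch, assuming a hypothetical helper `parse-score` that returns a hash of plain values: prefix ~ produces a string and prefix + produces a number.

```raku
# Hypothetical helper: parse "Name: score" into a hash of plain values.
# Returns Nil when the line does not match.
sub parse-score(Str $line) {
    return Nil unless $line ~~ / $<name>=(\w+) ':' \s* $<score>=(\d+) /;
    return %( name => ~$<name>, score => +$<score> );
}

my $rec = parse-score("Alice: 95");
say $rec<name>;         # Alice
say $rec<score> + 5;    # 100 (the score is a number, so arithmetic works)
```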

Built-in Named Patterns

Raku provides several useful predefined patterns:
say so "hello123" ~~ /<alpha>/;     # True (alphabetic char)
say so "hello123" ~~ /<digit>/;     # True (digit char)
say so "hello123" ~~ /<alnum>/;     # True (alphanumeric char)
say so " hello" ~~ /<space>/;       # True (whitespace)
say so "HELLO" ~~ /<upper>/;        # True (uppercase letter)
say so "hello" ~~ /<lower>/;        # True (lowercase letter)
Use + with named patterns for matching multiple characters:
"hello123" ~~ /<alpha>+/;
say ~$/;    # hello

"hello123" ~~ /<digit>+/;
say ~$/;    # 123

Modifiers (Adverbs)

Raku regex adverbs modify matching behavior:

:i (case insensitive)

say so "Hello" ~~ /:i hello/;   # True
say so "HELLO" ~~ /:i hello/;   # True

:g (global matching)

my $text = "cat bat hat mat";
my @matches = $text ~~ m:g/\w+ 'at'/;
say @matches.elems;         # 4
say @matches.map(*.Str);    # (cat bat hat mat)

:s (significant whitespace)

Normally whitespace in regexes is ignored. The :s adverb makes whitespace in the pattern significant: between two word characters it requires at least one whitespace character in the text (like \s+), and elsewhere it allows optional whitespace:
my $text = "Hello World";

# Without :s - whitespace ignored, so this looks for "HelloWorld"
say so $text ~~ /Hello World/;      # False (there's no "HelloWorld")

# With :s - the space in the regex must match whitespace in the text
say so $text ~~ /:s Hello World/;   # True (matches "Hello World")
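A short sketch of how :s behaves with different amounts of whitespace in the text. The helper name `has-greeting` is invented for this example.

```raku
# Hypothetical helper: does the text contain "Hello", whitespace, then "World"?
sub has-greeting(Str $s) {
    so $s ~~ /:s Hello World/
}

say has-greeting("Hello World");        # True
say has-greeting("Hello     World");    # True (several spaces are fine)
say has-greeting("Hello\tWorld");       # True (a tab is whitespace too)
say has-greeting("HelloWorld");         # False (whitespace is required between the words)
```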

Combining Adverbs

my $text = "The Cat sat on the Mat";
my @found = $text ~~ m:i:g/ <[cm]> at /;
say @found.map(*.Str);  # (Cat Mat)

The .match Method

You can also use the .match method on strings:
my $result = "Hello World".match(/(\w+) \s+ (\w+)/);
if $result {
    say ~$result[0];    # Hello
    say ~$result[1];    # World
}
For global matches, pass :g:
# :g is an argument to .match, not part of the regex itself
my @all = "one 1 two 2 three 3".match(/\d+/, :g);
say @all.map(*.Str);    # (1 2 3)
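Global matching combines naturally with the line anchors covered earlier. As a sketch, the hypothetical helper `last-words` below collects the final word of every line by pairing `.match(..., :g)` with `$$`.

```raku
# Hypothetical helper: return the last word on each line of a multiline string.
sub last-words(Str $text) {
    # \w+ $$ matches a run of word characters that ends exactly at a line end
    $text.match(/ \w+ $$ /, :g).map(*.Str).list
}

my $text = "line one\nline two\nline three";
say last-words($text);    # (one two three)
```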

Alternation

Use | for longest-match alternation and || for first-match:
# | tries all alternatives and picks the longest match
"railroad" ~~ / rail | railroad /;
say ~$/;    # railroad (longest match wins)

# || tries alternatives left to right, picks the first that matches
"railroad" ~~ / rail || railroad /;
say ~$/;    # rail (first match wins)
This distinction matters when alternatives overlap:
# Match file extensions
my $file = "photo.jpeg";
if $file ~~ / '.' (jpg | jpeg | png | gif) $/ {
    say "Image format: $0";     # Image format: jpeg
}

Practical Example: Email Validator

#!/usr/bin/env raku

my @emails = (
    'alice@example.com',
    'bob.smith@company.co.uk',
    'invalid@',
    '@nouser.com',
    'spaces in@address.com',
    'good_one+tag@gmail.com',
    'missing.domain@',
);

# Simple email pattern
my $email-rx = /^
    <[\w.+\-]>+                 # local part: word chars, dots, plus, hyphen
    '@'
    <[\w.\-]>+                  # domain: word chars, dots, hyphens
    '.'
    <[a..zA..Z]> ** 2..10       # TLD: 2-10 letters
$/;

for @emails -> $addr {
    my $status = $addr ~~ $email-rx ?? "VALID" !! "INVALID";
    say sprintf("%-30s %s", $addr, $status);
}
Output:
alice@example.com              VALID
bob.smith@company.co.uk        VALID
invalid@                       INVALID
@nouser.com                    INVALID
spaces in@address.com          INVALID
good_one+tag@gmail.com         VALID
missing.domain@                INVALID

Practical Example: Log Line Parser

#!/usr/bin/env raku

my @logs = (
    '2026-03-16 10:30:15 [INFO] Server started on port 8080',
    '2026-03-16 10:30:22 [ERROR] Connection refused: 10.0.0.5:3306',
    '2026-03-16 10:31:05 [WARN] Memory usage at 85%',
);

for @logs -> $line {
    if $line ~~ /
        $<date> = (\d ** 4 '-' \d ** 2 '-' \d ** 2)
        \s+
        $<time> = (\d ** 2 ':' \d ** 2 ':' \d ** 2)
        \s+
        '[' $<level> = (\w+) ']'
        \s+
        $<message> = (.+)
    / {
        say "Level: $<level>";
        say "Time: $<date> $<time>";
        say "Message: $<message>";
        say "---";
    }
}

What is Next?

You now have a solid foundation in Raku regex basics. There is much more to explore (grammars, tokens, rules), but what you have learned here will handle the vast majority of text matching and extraction tasks. Next up: file I/O in Raku, where we will read, write, and navigate the filesystem.