Regex Sets and Ranges
Summary: in this tutorial, you’ll learn how to use regex sets and ranges to create regular expressions that match one character out of a set of characters.
Sets
A set is one or more characters specified in square brackets. For example:
[abc]
Since a set matches any single character listed in the square brackets, the set [abc] matches the character a, b, or c.
The following example uses a set to match the string Jill or Hill:
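// [JH] matches either J or H, followed by the literal text "ill"
$pattern = '/[JH]ill/';
$title = 'Jack and Jill Went Up the Hill';
// preg_match_all() collects every match into $matches[0]
if (preg_match_all($pattern, $title, $matches)) {
    print_r($matches[0]);
}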
Output:
Array
(
    [0] => Jill
    [1] => Hill
)
In this example, the set [JH] matches the character J or H. Therefore, the regular expression /[JH]ill/ matches Jill and Hill.
Ranges
Suppose you want to match many characters in a set, e.g., every letter from a to z. Listing all of these characters inside the square brackets would be tedious and error-prone.
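A range lets you specify a run of consecutive characters by writing the first and last character separated by a hyphen (-). For example, the range [a-z] matches any single lowercase letter from a to z.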
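As a quick check of this idea, the following sketch compares a set that lists every lowercase letter with the equivalent range. The sample string and variable names are made up for illustration only.

```php
<?php
// Hypothetical sample text, chosen only for illustration.
$text = 'PHP 8 regex';

// A set that lists every lowercase letter explicitly...
$listed = '/[abcdefghijklmnopqrstuvwxyz]/';
// ...and the equivalent range.
$range = '/[a-z]/';

preg_match_all($listed, $text, $listedMatches);
preg_match_all($range, $text, $rangeMatches);

// Both patterns find the same lowercase letters.
var_dump($listedMatches[0] === $rangeMatches[0]); // bool(true)
echo implode(' ', $rangeMatches[0]), PHP_EOL;     // r e g e x
```

The uppercase letters, the digit, and the spaces are skipped because none of them fall between a and z.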
Also, you can specify multiple ranges inside the square brackets. For example, the set [a-z0-9] matches any single lowercase letter from a to z or any digit from 0 to 9.
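The following sketch shows two ranges combined in one set. The input string is a hypothetical value picked so that some characters fall inside the ranges and some do not.

```php
<?php
// Hypothetical sample text, chosen only for illustration.
$text = 'A1-b2_C3';

// One set, two ranges: lowercase letters and digits.
$pattern = '/[a-z0-9]/';

preg_match_all($pattern, $text, $matches);

// Uppercase letters, the hyphen, and the underscore are not in either range.
echo implode(' ', $matches[0]), PHP_EOL; // 1 b 2 3
```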
Similarly, for ASCII text, the set [a-zA-Z0-9_] is equivalent to the \w character class, and the range [0-9] is equivalent to the \d character class.
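The equivalence can be checked with a small sketch like the one below. It assumes plain ASCII input and patterns without the u modifier; the sample string is hypothetical.

```php
<?php
// Hypothetical ASCII-only sample text.
$text = 'user_42@example.com';

// Explicit set versus the \w shorthand.
preg_match_all('/[a-zA-Z0-9_]/', $text, $setMatches);
preg_match_all('/\w/', $text, $wordMatches);

// Explicit range versus the \d shorthand.
preg_match_all('/[0-9]/', $text, $rangeMatches);
preg_match_all('/\d/', $text, $digitMatches);

var_dump($setMatches[0] === $wordMatches[0]);    // bool(true)
var_dump($rangeMatches[0] === $digitMatches[0]); // bool(true)
echo implode('', $digitMatches[0]), PHP_EOL;     // 42
```

The @ and . characters are left out of both word-character results because they are neither letters, digits, nor underscores.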
Negate sets and ranges
To negate a set or range, place the caret character (^) immediately after the opening square bracket. For example, the set [^0-9] matches any single character except a digit. It is the same as \D.
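As an illustration, the sketch below runs the negated range and the \D shorthand against the same hypothetical string and compares the results.

```php
<?php
// Hypothetical sample text, chosen only for illustration.
$text = 'Tel: 555-0199';

// Negated range versus the \D shorthand.
preg_match_all('/[^0-9]/', $text, $negatedMatches);
preg_match_all('/\D/', $text, $shorthandMatches);

var_dump($negatedMatches[0] === $shorthandMatches[0]); // bool(true)

// Everything except the digits is matched, including the space and the hyphen.
echo '"', implode('', $negatedMatches[0]), '"', PHP_EOL; // "Tel: -"
```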
Note that outside square brackets, the caret (^) is an anchor that matches the beginning of a string. When the caret (^) is the first character inside the square brackets, it acts as a negation operator, not an anchor.
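To see both roles of the caret side by side, consider the following sketch. The inputs and array keys are hypothetical and exist only for this illustration.

```php
<?php
// Hypothetical inputs, chosen only for illustration.
$inputs = ['42 apples', 'apples 42'];

$results = [];
foreach ($inputs as $input) {
    $results[$input] = [
        // Outside the brackets, ^ anchors the match to the start of the string.
        'starts_with_digit' => preg_match('/^[0-9]/', $input) === 1,
        // The first ^ is still the anchor; the second ^ negates the range.
        'starts_with_non_digit' => preg_match('/^[^0-9]/', $input) === 1,
    ];
}

var_dump($results['42 apples']['starts_with_digit']);     // bool(true)
var_dump($results['42 apples']['starts_with_non_digit']); // bool(false)
var_dump($results['apples 42']['starts_with_digit']);     // bool(false)
var_dump($results['apples 42']['starts_with_non_digit']); // bool(true)
```

In the pattern /^[^0-9]/, the two carets do different jobs: the first one ties the match to the start of the string, and the second one turns the range into "anything but a digit".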
The following example uses the caret (^) to negate the set [aeoiu], so the pattern matches every character that is not a lowercase vowel. In the string 'Hello', these are the consonants:
// [^aeoiu] matches any single character that is not a lowercase vowel
$pattern = '/[^aeoiu]/';
$title = 'Hello';
if (preg_match_all($pattern, $title, $matches)) {
    print_r($matches[0]);
}
Output:
Array
(
    [0] => H
    [1] => l
    [2] => l
)
Summary
- A set matches any single character listed in the square brackets.
- A range such as [a-z] matches any single character between its two endpoints, inclusive.
- To negate a set or range, place the caret character (^) immediately after the opening square bracket: [^...].