Regex Lookahead
Summary: in this tutorial, you’ll learn to use the regex lookahead and negative lookahead.
Introduction to the regex lookahead
Sometimes, you want to match A but only if it is followed by B. For example, suppose you have the following string:
2 chicken weigh 30lb
And you want to match the number (30) followed by the string lb, not the number 2. In this case, you can use the regex lookahead with the following syntax:
A(?=B)
The lookahead searches for A but matches it only if B follows. B itself is checked but not included in the match. For a number followed by the string lb, you can use the following pattern:
\d+(?=lb)
In this pattern:
- \d+ matches one or more digits.
- ?= is the lookahead.
- lb matches the text lb.
The following code uses the regex lookahead syntax to match a number followed by the text lb:
$pattern = '/\d+(?=lb)/';
$str = '2 chicken weigh 30lb';
if (preg_match($pattern, $str, $matches)) {
print_r($matches); // 30
}
Output:
Array
(
[0] => 30
)
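As a minimal sketch of the same pattern on a longer input, the code below collects every number followed by lb and then replaces those numbers. The sample string and the variable name $masked are assumptions made for illustration, not part of the tutorial. It shows that the lookahead only checks for lb and leaves it out of the match.

```php
<?php
// Hypothetical sample text with two weights; not from the tutorial.
$str = '2 chickens weigh 30lb and 12lb';
$pattern = '/\d+(?=lb)/';

// Collect every number that is directly followed by "lb".
preg_match_all($pattern, $str, $matches);
print_r($matches[0]); // 30 and 12

// The lookahead consumes no characters, so "lb" survives the replacement.
$masked = preg_replace($pattern, 'N', $str);
echo $masked, PHP_EOL; // 2 chickens weigh Nlb and Nlb
```

Because lb is never part of the match, preg_replace() swaps only the digits and leaves the unit in place.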
Multiple lookaheads
The following illustrates the multiple lookaheads:
A(?=B)(?=C)
It works like this:
- Find A
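- Test if B is immediately after A; skip if it’s not.
- Test if C is also immediately after A, at the same position where B was tested; skip if it’s not. A lookahead consumes no characters, so the second test does not start after B.
- If both tests pass, A is a match; otherwise, search for the next match.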
In short, the A(?=B)(?=C) pattern matches A only if the text immediately after A matches both B and C at the same position.
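To make the two tests concrete, here is a small sketch that combines the lb lookahead with a second one. The sample string and the second lookahead, (?=[a-z]{2}\b), are illustrative assumptions: it requires exactly two letters before a word boundary, so lbs is rejected even though it starts with lb.

```php
<?php
// Hypothetical sample text; the second lookahead is an illustrative choice.
$str = '10lbs 20lb 30kg';

// (?=lb)          the number must be followed by "lb"
// (?=[a-z]{2}\b)  and by exactly two letters before a word boundary
$pattern = '/\d+(?=lb)(?=[a-z]{2}\b)/';

preg_match_all($pattern, $str, $matches);
print_r($matches[0]); // only 20
```

Both lookaheads are checked from the position right after the digits: 10 fails the second test, 30 fails the first, and only 20 passes both.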
Negative Lookahead
Suppose you want to match only the number 2
in the following text but not the number 30
:
2 chicken weigh 30lb
To do that, you can use the negative lookahead syntax:
A(?!B)
The A(?!B) pattern matches A only if it is not followed by B. For example, the following pattern matches \d+ not followed by the string lb:
$pattern = '/\d+(?!lb)/';
$str = '2 chicken weigh 30lb';
if (preg_match($pattern, $str, $matches)) {
print_r($matches); // 2
}
Output:
Array
(
[0] => 2
)
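One thing to watch for: preg_match() stops at the first match, which is why only 2 appears above. The sketch below uses preg_match_all() to show what happens on the rest of the string, and one possible way to tighten the pattern. The variable names $loose and $strict and the stricter pattern are assumptions added for illustration.

```php
<?php
$str = '2 chicken weigh 30lb';

// \d+ can backtrack from "30" to "3"; "3" is followed by "0", not "lb".
preg_match_all('/\d+(?!lb)/', $str, $loose);
print_r($loose[0]); // 2 and 3

// Also forbidding a following digit keeps \d+ from stopping mid-number.
preg_match_all('/\d+(?!\d|lb)/', $str, $strict);
print_r($strict[0]); // only 2
```

The extra \d alternative inside the negative lookahead means a number is accepted only when it is complete and not followed by lb.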
Summary
- Use the regex lookahead A(?=B) to match A only if it is followed by B.
- Use the negative regex lookahead A(?!B) to match A only if it is not followed by B.