MySQL REGEXP: Search Based On Regular Expressions

Created with Sketch.

MySQL REGEXP: Search Based On Regular Expressions

 

Summary: in this tutorial, you will learn how to use the MySQL REGEXP operator to perform complex searches based on regular expressions.

Introduction to regular expressions

A regular expression is a special string that describes a search pattern. It is a powerful tool that gives you a concise and flexible way to identify strings of text e.g., characters, and words, based on patterns.

For example, you can use regular expressions to search for email, IP address, phone number, social security number, or anything that has a specific pattern.

A regular expression uses its own syntax that can be interpreted by a regular expression processor. A regular expression is widely used in almost platforms from programming languages to databases including MySQL.

The advantage of using regular expression is that you are not limited to search for a string based on a fixed pattern with the percent sign (%) and underscore (_) in the LIKE operator. The regular expressions have more meta-characters to construct flexible patterns.

The disadvantage of using regular expression is that it is quite difficult to understand and maintain such a complicated pattern. Therefore, you should describe the meaning of the regular expression in the comment of the SQL statement. In addition, the speed of data retrieval, in some cases, is decreased if you use complex patterns in a regular expression.

The abbreviation of regular expressions is regex or regexp.

MySQL REGEXP operator

MySQL adapts the regular expression implemented by Henry Spencer. MySQL allows you to match pattern right in the SQL statements by using REGEXP operator.

The following illustrates the syntax of the REGEXP operator in the  WHERE clause:

SELECT
column_list
FROM
table_name
WHERE
string_column REGEXP pattern;

Code language: SQL (Structured Query Language) (sql)

This statement performs a pattern match of a string_column against a pattern.

If a value in the string_column matches the pattern, the expression in the WHERE clause returns true, otherwise it returns false.

If either string_column or pattern is NULL, the result is NULL.

In addition to the REGEXP operator, you can use the RLIKE operator, which is the synonym of the REGEXP operator.

The negation form of the REGEXP operator is NOT REGEXP.

MySQL REGEXP examples

Suppose you want to find all products whose last names start with character A, B or C. You can use a regular expression in the following SELECT statement:

SELECT
productname
FROM
products
WHERE
productname REGEXP '^(A|B|C)'
ORDER BY productname;

Code language: SQL (Structured Query Language) (sql)

 

The pattern allows you to find the product whose name begins with A, B, or C.

  • The character ^ means to match from the beginning of the string.
  • The character | means to search for alternatives if one fails to match.

The following table illustrates some commonly used metacharacters and constructs in a regular expression.

MetacharacterBehavior
^matches the position at the beginning of the searched string
$matches the position at the end of the searched string
.matches any single character
[…]matches any character specified inside the square brackets
[^…]matches any character not specified inside the square brackets
p1|p2matches any of the patterns p1 or p2
*matches the preceding character zero or more times
+matches preceding character one or more times
{n}matches n number of instances of the preceding character
{m,n}matches from m to n number of instances of the preceding character

To find products whose names start with the character a, you use the metacharacter '^' to match at the beginning of the name:

SELECT
productname
FROM
products
WHERE
productname REGEXP '^a';

Code language: SQL (Structured Query Language) (sql)

 

If you want the REGEXP operator to compare strings in a case-sensitive fashion, you can use the BINARY operator to cast a string to a binary string.

Because MySQL compares binary strings byte by byte rather than character by character. This allows the string comparison to be case sensitive.

For example, the following statement matches only uppercase "C" at the beginning of the product name.

SELECT
productname
FROM
products
WHERE
productname REGEXP BINARY '^C';

Code language: SQL (Structured Query Language) (sql)

 

To find the product whose name ends with f, you use 'f$' to match the end of a string.

SELECT
productname
FROM
products
WHERE
productname REGEXP 'f$'

Code language: SQL (Structured Query Language) (sql)

 

To find the product whose name contains the word "ford", you use the following query:

SELECT
productname
FROM
products
WHERE
productname REGEXP 'ford';

Code language: SQL (Structured Query Language) (sql)

 

To find the product whose name contains exactly 10 characters, you use ‘^' and ‘$ to match the beginning and end of the product name, and repeat {10} times of any character ‘.' in between as shown in the following query:

SELECT
productname
FROM
products
WHERE
productname REGEXP '^.{10}$';

Code language: SQL (Structured Query Language) (sql)

 

In this tutorial, you have learned how to query data using the MySQL REGEXP operator with regular expressions.

Leave a Reply

Your email address will not be published. Required fields are marked *