Managing Hierarchical Data in MySQL Using the Adjacency List Model
Summary: in this tutorial, you will learn how to use adjacency list model for managing hierarchical data in MySQL.
Introduction to adjacency list model
Hierarchical data is everywhere. It can be blog categories, product hierarchies, or organizational structures.
There are many ways to manage hierarchical data in MySQL and the adjacency list model may be the simplest solution. Because of its simplicity, the adjacency list model is a very popular choice by developers and database administrators.
In the adjacency list model, each node has a pointer that points to its parent. The top node has no parent. See the following categories of electronics products:
There are some terms that you should be familiar with before you work with the adjacency list model:
Electronics
is a top node or root node.Laptops, Cameras & photo, Phones & Accessories
nodes are the children of theElectronics
node. And vice versa Electronics node is the parent ofLaptops, Cameras & photo, Phones & Accessories
nodes.- The leaf nodes are the nodes that have no children e.g.,
Laptops
,PC
,Android
,iOS
, etc while the non-leaf nodes are the ones that have at least one child. - The children and grandchildren of a node are called descendants. And the parents, grandparents, etc., of a node are also known as ancestors.
To model this category tree, we can create a table named category
with three columns: id
, title
, and parent_id
as follows:
CREATE TABLE category (
id int(10) unsigned NOT NULL AUTO_INCREMENT,
title varchar(255) NOT NULL,
parent_id int(10) unsigned DEFAULT NULL,
PRIMARY KEY (id),
FOREIGN KEY (parent_id) REFERENCES category (id)
ON DELETE CASCADE ON UPDATE CASCADE
);
Code language: SQL (Structured Query Language) (sql)
Each row in a table is a node in the tree identified by the id
column. The parent_id
column is the foreign key of the category
table itself. It acts like a pointer that points to the id
column.
Inserting data
The root node of the tree has no parent, therefore, the parent_id
is set to NULL
. The other nodes must have one and only one parent.
To insert a root node, you set the parent_id
to NULL
as follows:
INSERT INTO category(title,parent_id)
VALUES('Electronics',NULL);
Code language: SQL (Structured Query Language) (sql)
To insert a non-root node, you just need to set its parent_id
to the id of its parent node. For example, the parent_id
of Laptop & PC
, Cameras & Photos
and Phone & Accessories
nodes are set to 1:
INSERT INTO category(title,parent_id)
VALUES('Laptops & PC',1);
INSERT INTO category(title,parent_id)VALUES(‘Laptops’,2);
INSERT INTO category(title,parent_id)
VALUES(‘PC’,2);
INSERT INTO category(title,parent_id)
VALUES(‘Cameras & photo’,1);
INSERT INTO category(title,parent_id)
VALUES(‘Camera’,5);
INSERT INTO category(title,parent_id)
VALUES(‘Phones & Accessories’,1);
INSERT INTO category(title,parent_id)
VALUES(‘Smartphones’,7);
INSERT INTO category(title,parent_id)
VALUES(‘Android’,8);
INSERT INTO category(title,parent_id)
VALUES(‘iOS’,8);
INSERT INTO category(title,parent_id)
VALUES(‘Other Smartphones’,8);
INSERT INTO category(title,parent_id)
VALUES(‘Batteries’,7);
INSERT INTO category(title,parent_id)
VALUES(‘Headsets’,7);
INSERT INTO category(title,parent_id)
VALUES(‘Screen Protectors’,7);
Code language: SQL (Structured Query Language) (sql)
Finding the root node
The root node is the node that has no parent. In other words, its parent_id
is NULL
:
SELECT
id, title
FROM
category
WHERE
parent_id IS NULL;
Code language: SQL (Structured Query Language) (sql)
Finding the immediate children of a node
The following query gets the immediate children of the root node:
SELECT
id, title
FROM
category
WHERE
parent_id = 1;
Code language: SQL (Structured Query Language) (sql)
Finding the leaf nodes
The leaf nodes are the nodes that have no children.
SELECT
c1.id, c1.title
FROM
category c1
LEFT JOIN
category c2 ON c2.parent_id = c1.id
WHERE
c2.id IS NULL;
Code language: SQL (Structured Query Language) (sql)
Querying the whole tree
The following recursive common table expression (CTE) retrieves the whole category tree. Notice that the CTE feature has been available since MySQL 8.0
WITH RECURSIVE category_path (id, title, path) AS
(
SELECT id, title, title as path
FROM category
WHERE parent_id IS NULL
UNION ALL
SELECT c.id, c.title, CONCAT(cp.path, ' > ', c.title)
FROM category_path AS cp JOIN category AS c
ON cp.id = c.parent_id
)
SELECT * FROM category_path
ORDER BY path;
Code language: SQL (Structured Query Language) (sql)
Querying a sub-tree
The following query gets the sub-tree of Phone & Accessories
whose id
is 7.
WITH RECURSIVE category_path (id, title, path) AS
(
SELECT id, title, title as path
FROM category
WHERE parent_id = 7
UNION ALL
SELECT c.id, c.title, CONCAT(cp.path, ' > ', c.title)
FROM category_path AS cp JOIN category AS c
ON cp.id = c.parent_id
)
SELECT * FROM category_path
ORDER BY path;
Code language: SQL (Structured Query Language) (sql)
Querying a single path
To query a single path from bottom to top e.g., from iOS
to Electronics
, you use the following statement:
WITH RECURSIVE category_path (id, title, parent_id) AS
(
SELECT id, title, parent_id
FROM category
WHERE id = 10 -- child node
UNION ALL
SELECT c.id, c.title, c.parent_id
FROM category_path AS cp JOIN category AS c
ON cp.parent_id = c.id
)
SELECT * FROM category_path;
Code language: SQL (Structured Query Language) (sql)
Calculating level of each node
Suppose the level of the root node is 0, each node underneath has a level that equals its parent node’s level plus 1.
WITH RECURSIVE category_path (id, title, lvl) AS
(
SELECT id, title, 0 lvl
FROM category
WHERE parent_id IS NULL
UNION ALL
SELECT c.id, c.title,cp.lvl + 1
FROM category_path AS cp JOIN category AS c
ON cp.id = c.parent_id
)
SELECT * FROM category_path
ORDER BY lvl;
Code language: SQL (Structured Query Language) (sql)
Deleting a node and its descendants
To delete a node and its descendants, just remove the node itself, all the descendants will be deleted automatically by the DELETE CASCADE
of the foreign key constraint.
For example, to delete the Laptops & PC
node and its children ( Laptops
, PC
), you use the following statement:
DELETE FROM category
WHERE
id = 2;
Code language: SQL (Structured Query Language) (sql)
Deleting a node and promote its descendants
To delete a non-leaf node and promote its descendants:
- First, update the
parent_id
of the immediate children of the node to theid
of the new parent node. - Then, delete the node.
For example, to delete the Smartphones
node and promote its children such as Android
, iOS
, Other Smartphones
nodes:
First, update the parent_id
for all immediate children of Smartphones
:
UPDATE category
SET
parent_id = 7 -- Phones & Accessories
WHERE
parent_id = 5; -- Smartphones
Code language: SQL (Structured Query Language) (sql)
Second, delete the Smartphones
node:
DELETE FROM category
WHERE
id = 8;
Code language: SQL (Structured Query Language) (sql)
Both statements should be wrapped in a single transaction:
BEGIN;
UPDATE categorySET
parent_id = 7
WHERE
parent_id = 5;
DELETE FROM category
WHERE
id = 8;
COMMIT;
Code language: SQL (Structured Query Language) (sql)
Moving a subtree
To move a subtree, just update the parent_id
of the top node of the subtree. For example, to move the Cameras & photo
as the children of Phone and Accessories
, you use the following statement:
UPDATE category
SET
parent_id = 7
WHERE
id = 5;
Code language: SQL (Structured Query Language) (sql)
In this tutorial, you have learned how to use the adjacency list model to manage hierarchical data in MySQL.