Managing Hierarchical Data in MySQL Using the Adjacency List Model

Created with Sketch.

Managing Hierarchical Data in MySQL Using the Adjacency List Model

Summary: in this tutorial, you will learn how to use adjacency list model for managing hierarchical data in MySQL.

Introduction to adjacency list model

Hierarchical data is everywhere. It can be blog categories, product hierarchies, or organizational structures.

There are many ways to manage hierarchical data in MySQL and the adjacency list model may be the simplest solution. Because of its simplicity, the adjacency list model is a very popular choice by developers and database administrators.

In the adjacency list model, each node has a pointer that points to its parent. The top node has no parent. See the following categories of electronics products:

There are some terms that you should be familiar with before you work with the adjacency list model:

  • Electronicsis a top node or root node.
  • Laptops, Cameras & photo, Phones & Accessories nodes are the children of the Electronics node. And vice versa Electronics node is the parent of Laptops, Cameras & photo, Phones & Accessories nodes.
  • The leaf nodes are the nodes that have no children e.g., Laptops, PC, Android, iOS, etc while the non-leaf nodes are the ones that have at least one child.
  • The children and grandchildren of a node are called descendants. And the parents, grandparents, etc., of a node are also known as ancestors.

To model this category tree, we can create a table named category with three columns: id, title, and parent_id as follows:

CREATE TABLE category (
id int(10) unsigned NOT NULL AUTO_INCREMENT,
title varchar(255) NOT NULL,
parent_id int(10) unsigned DEFAULT NULL,
PRIMARY KEY (id),
FOREIGN KEY (parent_id) REFERENCES category (id)
ON DELETE CASCADE ON UPDATE CASCADE
);

Code language: SQL (Structured Query Language) (sql)

Each row in a table is a node in the tree identified by the id column.  The parent_id column is the foreign key of the category table itself. It acts like a pointer that points to the id column.

Inserting data

The root node of the tree has no parent, therefore, the parent_id is set to NULL. The other nodes must have one and only one parent.

To insert a root node, you set the parent_id to NULL as follows:

INSERT INTO category(title,parent_id)
VALUES('Electronics',NULL);

Code language: SQL (Structured Query Language) (sql)

To insert a non-root node, you just need to set its parent_id to the id of its parent node. For example, the parent_id of Laptop & PC, Cameras & Photos and Phone & Accessories nodes are set to 1:

INSERT INTO category(title,parent_id)
VALUES('Laptops & PC',1);
INSERT INTO category(title,parent_id)
VALUES(‘Laptops’,2);
INSERT INTO category(title,parent_id)
VALUES(‘PC’,2);

INSERT INTO category(title,parent_id)
VALUES(‘Cameras & photo’,1);
INSERT INTO category(title,parent_id)
VALUES(‘Camera’,5);

INSERT INTO category(title,parent_id)
VALUES(‘Phones & Accessories’,1);
INSERT INTO category(title,parent_id)
VALUES(‘Smartphones’,7);

INSERT INTO category(title,parent_id)
VALUES(‘Android’,8);
INSERT INTO category(title,parent_id)
VALUES(‘iOS’,8);
INSERT INTO category(title,parent_id)
VALUES(‘Other Smartphones’,8);

INSERT INTO category(title,parent_id)
VALUES(‘Batteries’,7);
INSERT INTO category(title,parent_id)
VALUES(‘Headsets’,7);
INSERT INTO category(title,parent_id)
VALUES(‘Screen Protectors’,7);

Code language: SQL (Structured Query Language) (sql)

Finding the root node

The root node is the node that has no parent. In other words, its parent_id is NULL:

SELECT
id, title
FROM
category
WHERE
parent_id IS NULL;

Code language: SQL (Structured Query Language) (sql)

Finding the immediate children of a node

The following query gets the immediate children of the root node:

SELECT
id, title
FROM
category
WHERE
parent_id = 1;

Code language: SQL (Structured Query Language) (sql)

Finding the leaf nodes

The leaf nodes are the nodes that have no children.

SELECT
c1.id, c1.title
FROM
category c1
LEFT JOIN
category c2 ON c2.parent_id = c1.id
WHERE
c2.id IS NULL;

Code language: SQL (Structured Query Language) (sql)

Querying the whole tree

The following recursive common table expression (CTE) retrieves the whole category tree. Notice that the CTE feature has been available since MySQL 8.0

WITH RECURSIVE category_path (id, title, path) AS
(
SELECT id, title, title as path
FROM category
WHERE parent_id IS NULL
UNION ALL
SELECT c.id, c.title, CONCAT(cp.path, ' > ', c.title)
FROM category_path AS cp JOIN category AS c
ON cp.id = c.parent_id
)
SELECT * FROM category_path
ORDER BY path;

Code language: SQL (Structured Query Language) (sql)

Querying a sub-tree

The following query gets the sub-tree of Phone & Accessories whose id is 7.

WITH RECURSIVE category_path (id, title, path) AS
(
SELECT id, title, title as path
FROM category
WHERE parent_id = 7
UNION ALL
SELECT c.id, c.title, CONCAT(cp.path, ' > ', c.title)
FROM category_path AS cp JOIN category AS c
ON cp.id = c.parent_id
)
SELECT * FROM category_path
ORDER BY path;

Code language: SQL (Structured Query Language) (sql)

Querying a single path

To query a single path from bottom to top e.g., from iOS to Electronics, you use the following statement:

WITH RECURSIVE category_path (id, title, parent_id) AS
(
SELECT id, title, parent_id
FROM category
WHERE id = 10 -- child node
UNION ALL
SELECT c.id, c.title, c.parent_id
FROM category_path AS cp JOIN category AS c
ON cp.parent_id = c.id
)
SELECT * FROM category_path;

Code language: SQL (Structured Query Language) (sql)

Calculating level of each node

Suppose the level of the root node is 0, each node underneath has a level that equals its parent node’s level plus 1.

WITH RECURSIVE category_path (id, title, lvl) AS
(
SELECT id, title, 0 lvl
FROM category
WHERE parent_id IS NULL
UNION ALL
SELECT c.id, c.title,cp.lvl + 1
FROM category_path AS cp JOIN category AS c
ON cp.id = c.parent_id
)
SELECT * FROM category_path
ORDER BY lvl;

Code language: SQL (Structured Query Language) (sql)

Deleting a node and its descendants

To delete a node and its descendants, just remove the node itself, all the descendants will be deleted automatically by the DELETE CASCADE of the foreign key constraint.

For example, to delete the Laptops & PC node and its children ( Laptops, PC), you use the following statement:

DELETE FROM category
WHERE
id = 2;

Code language: SQL (Structured Query Language) (sql)

Deleting a node and promote its descendants

To delete a non-leaf node and promote its descendants:

  1. First, update the parent_id of the immediate children of the node to the id of the new parent node.
  2. Then, delete the node.

For example, to delete the Smartphones node and promote its children such as Android, iOS, Other Smartphones nodes:

First, update the parent_id for all immediate children of Smartphones:

UPDATE category
SET
parent_id = 7 -- Phones & Accessories
WHERE
parent_id = 5; -- Smartphones

Code language: SQL (Structured Query Language) (sql)

Second, delete the Smartphones node:

DELETE FROM category
WHERE
id = 8;

Code language: SQL (Structured Query Language) (sql)

Both statements should be wrapped in a single transaction:

BEGIN;UPDATE category
SET
parent_id = 7
WHERE
parent_id = 5;

DELETE FROM category
WHERE
id = 8;

COMMIT;

Code language: SQL (Structured Query Language) (sql)

Moving a subtree

To move a subtree, just update the parent_id of the top node of the subtree. For example, to move the Cameras & photo as the children of Phone and Accessories, you use the following statement:

UPDATE category
SET
parent_id = 7
WHERE
id = 5;

Code language: SQL (Structured Query Language) (sql)

In this tutorial, you have learned how to use the adjacency list model to manage hierarchical data in MySQL.

Leave a Reply

Your email address will not be published. Required fields are marked *