Find node level in a tree

2019-04-29 04:46发布

问题:

I have a tree (nested categories) stored as follows:

CREATE TABLE `category` (
  `category_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `category_name` varchar(100) NOT NULL,
  `parent_id` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`category_id`),
  UNIQUE KEY `category_name_UNIQUE` (`category_name`,`parent_id`),
  KEY `fk_category_category1` (`parent_id`,`category_id`),
  CONSTRAINT `fk_category_category1` FOREIGN KEY (`parent_id`) REFERENCES `category` (`category_id`) ON DELETE SET NULL ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_spanish_ci

I need to feed my client-side language (PHP) with node information (child+parent) so it can build the tree in memory. I can tweak my PHP code but I think the operation would be way simpler if I could just retrieve the rows in such an order that all parents come before their children. I could do that if I knew the level for each node:

SELECT category_id, category_name, parent_id
FROM category
ORDER BY level -- No `level` column so far :(

Can you think of a way (view, stored routine or whatever...) to calculate the node level? I guess it's okay if it's not real-time and I need to recalculate it on node modification.

First update: progress so far

I've written these triggers based on feedback by Amarghosh:

DROP TRIGGER IF EXISTS `category_before_insert`;

DELIMITER //

CREATE TRIGGER `category_before_insert` BEFORE INSERT ON `category` FOR EACH ROW BEGIN
    IF NEW.parent_id IS NULL THEN
        SET @parent_level = 0;
    ELSE
        SELECT level INTO @parent_level
        FROM category
        WHERE category_id = NEW.parent_id;
    END IF;

    SET NEW.level = @parent_level+1;
END//

DELIMITER ;


DROP TRIGGER IF EXISTS `category_before_update`;

DELIMITER //

CREATE TRIGGER `category_before_update` BEFORE UPDATE ON `category` FOR EACH ROW BEGIN
    IF NEW.parent_id IS NULL THEN
        SET @parent_level = 0;
    ELSE
        SELECT level INTO @parent_level
        FROM category
        WHERE category_id = NEW.parent_id;
    END IF;

    SET NEW.level = @parent_level+1;
END//

DELIMITER ;

It seems to work fine for insertions and modifications. But it doesn't work for deletions: MySQL Server does not launch triggers when the rows are updated from ON UPDATE CASCADE foreign keys.

The first obvious idea is to write a new trigger for deletion; however, a trigger on table categories is not allowed to modify other rows on this same table:

DROP TRIGGER IF EXISTS `category_after_delete`;

DELIMITER //

CREATE TRIGGER `category_after_delete` AFTER DELETE ON `category` FOR EACH ROW BEGIN
    /*
     * Raises an error, see below
     */
    UPDATE category SET parent_id=NULL
    WHERE parent_id = OLD.category_id;
END//

DELIMITER ;

Error:

Grid editing error: SQL Error (1442): Can't update table 'category' in stored function/trigger because it is already used by statement which invoked this stored function/trigger.

Second update: working solution (unless proved wrong)

My first attempt was pretty sensible but I found a problem I could not manage to solve: when you launch a series of operations from a trigger, MySQL will not allow to alter other lines from the same table. Since node deletions require to adjust the level of all descendants, I had hit a wall.

In the end, I changed the approach using code from here: rather than correcting individual levels when a node change, I have code to calculate all levels and I trigger it on every edit. Since it's a slow calculation and fetching data requires a very complex query, I cache it into a table. In my case, it's an acceptable solution since editions should be rare.

1. New table for cached levels:

CREATE TABLE `category_level` (
  `category_id` int(10) NOT NULL,
  `parent_id` int(10) DEFAULT NULL, -- Not really necesary
  `level` int(10) NOT NULL,
  PRIMARY KEY (`category_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_spanish_ci

2. Helper function to calculate levels

If I really got a grasp on how it works, it doesn't really return anything useful by itself. Instead, it stores stuff in session variables.

CREATE FUNCTION `category_connect_by_parent_eq_prior_id`(`value` INT) RETURNS int(10)
    READS SQL DATA
BEGIN
    DECLARE _id INT;
    DECLARE _parent INT;
    DECLARE _next INT;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET @category_id = NULL;

    SET _parent = @category_id;
    SET _id = -1;

    IF @category_id IS NULL THEN
        RETURN NULL;
    END IF;

    LOOP
        SELECT  MIN(category_id)
        INTO    @category_id
        FROM    category
        WHERE   COALESCE(parent_id, 0) = _parent
            AND category_id > _id;
        IF @category_id IS NOT NULL OR _parent = @start_with THEN
            SET @level = @level + 1;
            RETURN @category_id;
        END IF;
        SET @level := @level - 1;
        SELECT  category_id, COALESCE(parent_id, 0)
        INTO    _id, _parent
        FROM    category
        WHERE   category_id = _parent;
    END LOOP;
END

3. Procedure to launch the recalculation process

It basically encapsulates the complex query that retrieves the levels aided by the helper function.

CREATE PROCEDURE `update_category_level`()
    SQL SECURITY INVOKER
BEGIN
    DELETE FROM category_level;

    INSERT INTO category_level (category_id, parent_id, level)
    SELECT hi.category_id, parent_id, level
    FROM (
        SELECT category_connect_by_parent_eq_prior_id(category_id) AS category_id, @level AS level
        FROM (
            SELECT  @start_with := 0,
                @category_id := @start_with,
                @level := 0
            ) vars, category
        WHERE @category_id IS NOT NULL
        ) ho
    JOIN category hi ON hi.category_id = ho.category_id;
END

4. Triggers to keep cache table up-to-date

CREATE TRIGGER `category_after_insert` AFTER INSERT ON `category` FOR EACH ROW BEGIN
    call update_category_level();
END

CREATE TRIGGER `category_after_update` AFTER UPDATE ON `category` FOR EACH ROW BEGIN
    call update_category_level();
END

CREATE TRIGGER `category_after_delete` AFTER DELETE ON `category` FOR EACH ROW BEGIN
    call update_category_level();
END

5. Known issues

  • It's pretty suboptimal if nodes are altered frequently.
  • MySQL does not allow transactions or table locking in triggers and procedures. You must take care of these details where you edit nodes.

回答1:

There's an excellent series of articles here on Hierarchical Queries in MySQL that includes how to identify level, leaf nodes, loops in the hierarchy, etc.



回答2:

If there won't be any cycles (if it'd always be a tree and not a graph), you can have a level field that is set to zero (top most) by default and a stored procedure that updates the level to (parent's level + 1) whenever you update the parent_id.

CREATE TRIGGER setLevelBeforeInsert BEFORE INSERT ON category
FOR EACH ROW
BEGIN
IF NEW.parent_id IS NOT NULL THEN
SELECT level INTO @pLevel FROM category WHERE id = NEW.parent_id;
SET NEW.level = @pLevel + 1;
ELSE 
SET NEW.level = 0;
END IF;
END;


回答3:

No level column so far :(

Hmm * shrug *
I'd just made this level field manually.
Say, like Materialized path, with just one update after insert and without all these fancy triggers.
A field which is going to be like 000000100000210000022 for the 3-rd level for example

so it can build the tree in memory.

if you going to get whole table into PHP, I see no problem here. A little recursive function can give you your nested arrays tree.

I can tweak my PHP code but I think the operation would be way simpler

well, well.
The code you've got so far doesn't seem to me "way simple" :)