Deeply nested subqueries for traversing trees in M

2019-05-07 16:02发布

问题:

I have a table in my database where I store a tree structure using the hybrid Nested Set (MPTT) model (the one which has lft and rght values) and the Adjacency List model (storing parent_id on each node).

my_table (id, parent_id, lft, rght, alias)

This question doesn't relate to any of the MPTT aspects of the tree but I thought I'd leave it in in case anyone had a good idea about how to leverage that.

I want to convert a path of aliases to a specific node. For example: "users.admins.nickf" would find the node with alias "nickf" which is a child of one with alias "admins" which is a child of "users" which is at the root. There is a unique index on (parent_id, alias).

I started out by writing the function so it would split the path to its parts, then query the database one by one:

SELECT `id` FROM `my_table` WHERE `parent_id` IS NULL AND `alias` = 'users';-- 1
SELECT `id` FROM `my_table` WHERE `parent_id` = 1 AND `alias` = 'admins';   -- 8
SELECT `id` FROM `my_table` WHERE `parent_id` = 8 AND `alias` = 'nickf';    -- 37

But then I realised I could do it with a single query, using a variable amount of nesting:

SELECT `id` FROM `my_table` WHERE `parent_id` = (
    SELECT `id` FROM `my_table` WHERE `parent_id` = (
        SELECT `id` FROM `my_table`
        WHERE `parent_id` IS NULL AND `alias` = 'users'
    ) AND `alias`  = 'admins'
) AND `alias` = 'nickf';

Since the number of sub-queries is dependent on the number of steps in the path, am I going to run into issues with having too many subqueries? (If there even is such a thing)

Are there any better/smarter ways to perform this query?

回答1:

Does this work?

select r0.id 
  from my_table as r0 
  join my_table as r1 on(r0.parent_id = r1.id) 
  join my_table as r2 on(r1.parent_id = r2.id)
 where r0.alias='nickf'
   and r1.alias='admins'
   and r2.alias='users'
   and r2.parent_id is null

Seems to me there is not really a need for nested subqueries ..

or am I wrong, missing something?



回答2:

I wondered about this myself, and was looking for something that didn't get slower as you went deeper (meaning more levels in both options above.) The problem I have with "my version", is that it has to create every possible path before it narrows the result down to the one you're actually searching for... so I think lexu's version should outperform mine even for very large nesting because it's a simple join, but I'm hoping someone might see it and wish to expand on it further.

Also, this way of doing it would definitely benefit from a stored proc, and/or a view of the 'paths' part of it (without the HAVING clause). Perhaps with those it is a better solution, but I unfortunately don't know enough at this time about SQL performance to say for sure. I can say that mine gets slower as the data (number of possible combination of paths) gets larger, but with a view (since the result is cached, and using that to narrow it down) it seems fast (the largest data set I found was 370 total, at some point I'll create a much larger set to test.)

SELECT node.id, GROUP_CONCAT(parent.alias
                 ORDER BY parent.lft SEPARATOR '.') AS path_name
FROM my_table AS node, my_table AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rght
GROUP BY node.id HAVING path_name = 'users.admins.nickf'