Thanks to Paul Spiegel's answer on a previous question, I have a table tab
in database dab
and it consists of four columns. A simple, a ID column (id
) that is also the primary key (int), a pat
column (varchar(255)), a path
column (varchar(191)), and finally has_tree
(bit(1)).
id pat path has_tree
1 AA 1 1
2 ABA 1/2 1
3 ABCA 1/3 1
4 ABA 1/5 0
5 AB 2 0
6 BBB 2/1 1
7 CAB 2/2 1
8 ABC 1/4 0
9 ABC 1/5/7 1
10 ABA 3/2 1
Given a pat
value (if it exists), I want
- all unique
pat
values (including the input's if ithas_tree==1
), - where their path contains the path of the input,
- and where
has_tree == 1
.
So input AA should return ['AA', 'ABA', 'ABCA', 'ABC']
.
Important remarks are that the path
for the input pat
is not necessarily a single item 'root path', in other words: it's very well possible that the input path
is something like 1/12/45/966
in which case I want all the unique pat
values where the path
is a descendant of 1/12/45/966
and where has_tree
is 1
. Another remark is that the path can be very deep, so even if the input path
is 1
, the result can be many levels deeper, e.g. 1/22/36/88/98/455/878/1205/2555
.
Paul suggested the following query, which worked for the proposed data structure, however as you can see the current structure and requirements are slightly different. (Also note that I have one database with multiple tables. So given a table tab
in database dab
I'd like to execute the script.)
SELECT t.*
FROM tree t
CROSS JOIN tree r -- root
WHERE r.CELL = '3B0'
AND t.path LIKE CONCAT(r.path, '%');
I am trying to do this with PHP and then calling some SQL database. Is it possible to write this in a generic, performant form such that it works for PostgreSQL, MySQL(i) and others? If not, I'd at least like to see the MySQLi and PostgreSQL variants.
Building on Paul's example code above, I think it would be something along these lines, but I am not sure. I am also not sure why the JOIN
operator is necessary and what it does in this context (I know what it generally does, but not why it is necessary in this case). I am sure this code can be improved, fixed, and made more general for all SQL flavours? Additionally, I do not see that my attempt below includes the current input pat
in the output.
SELECT t.pat
FROM `tab` t
JOIN `tab` r
WHERE r.pat = 'AA' -- input
AND t.path LIKE CONCAT(r.path, '/', '%')
AND t.has_tree = 1;
Bonus question: I have indexed all columns except for has_tree
. Would it be beneficial to index this column as well?
First note: The paths should end with a backslash (
'1/'
,'1/2/'
instead of'1'
and'1/2'
). This is important because otherwise you can not match the the root of the subtree. Example:will not match
'1'
but will match'1/'
.You can also not just skip the backslash in the query:
will match the root
'1'
but also'10..'
Now to your question: If you already know the path of the root node, you don't need a join:
To get only "unique" pats you can use the
DISTINCT
keyword:That's it.
If you don't know the roots path but only the
pat
value, you would either need to run two queries (first get the path then get the descendants) or use a join.PostgreSQL:
Demo: http://rextester.com/EXZT43019
MySQL:
Demo: http://rextester.com/DNHRJ83456
Note: As you will see in the domo, you can also use pipes (
||
) in MySQL for concatenation if you change the mode with:Regarding the index: An index on a boolean column is usually not very usefull. However you can only know for sure if you test it. For the given queries a composite index on
(has_tree, path)
might improve the performance.