hierarchical query match with multiple tables its

2019-08-25 12:32发布

问题:

I have this business_table

ref_ID      name    parent_id 
-----------------------------
ABC-0001    Amb     NULL 
PQR-899     boss    NULL
tgv-632     pick    NULL
yyy-888     xyz     NULL
kkk-456     ued     NULL

I want to update parent_id of business_table

parent_customer is another table which list the hierarchy of ref_ID and parent_id given below.

To update the parent_id of business_table staps are

1) check ref_id of business_table with ref_id of parent_customer . eg. ref_ID ABC-0001 of business_table match with parent_customer ref_id 1st row 1 ref_id-ABC-0001 opr-656 match found

2) then check parent_id of parent_customer of that matched record which is in this case parent_id opr-656 check with match_table_CM table

match_table_CM table list the ids which we want to match before updating record (we are checking this because of this is CRM id need to check emplpoyee exist of not)

3)match not found then check with parent_id opr-656 of parent_customer with same table parent_customer ref_id , 2nd record found with ref_id opr-656 then pick its parent_id ttK-668 check with match_table_CM match found 1 ttK-668 then update with business_table parent_id other wise check till the parent_customer ref_ID = parent_id (parent of all) and update that id even if match not found so in this case if match not found then ttK-668 is should be updated at last

note : - parent_customer table lists a hierarchy of data in which when both ref_id and parent_id are the same means it's the parent of the entire hierarchy.

For example:

4 PQR-899 PQR-899 this is ultimate parent of hierarchy

parent_customer

ID  ref_id     parent_id  
---------------------------
1   ABC-0001   opr-656
2   opr-656    ttK-668
3   ttK-668    ttK-668
4   PQR-899    PQR-899
5   kkk-565    AJY-567  
6   AJY-567    UXO-989
7   UXO-989    tgv-632
8   tgv-632    mnb-784 
9   mnb-784    qwe-525 
10  qwe-525    qwe-525
11  kkk-456    jjj-888

match_table_CM:

id    main_id
--------------
1     ttK-668
2     PQR-899
3     tgv-632
4     mnb-784

Expected output

ref_ID      name    parent_id 
-----------------------------
ABC-0001    Amb     ttK-668                    
PQR-899     boss    PQR-899
tgv-632     pick    qwe-525
yyy-888     xyz     NULL
kkk-456     ued     jjj-888

回答1:

This should return the expected result:

WITH hierarchy AS
 ( -- all rows from source table
   SELECT b.ref_id, pc.parent_id, 
      0 AS match,
      1 AS lvl
   FROM business_table AS b
   LEFT JOIN parent_customer AS pc
     ON b.ref_id = pc.ref_id

   UNION ALL

   SELECT h.ref_id, pc.parent_id, 
      -- check if we found a match or reached top of hierarchy
      CASE WHEN mt.main_id IS NOT NULL OR pc.parent_id = pc.ref_id THEN 1 ELSE 0 END,
      lvl+1
   FROM hierarchy AS h
   JOIN parent_customer AS pc 
     ON pc.ref_id = h.parent_id -- going up in the hierarchy
   LEFT JOIN match_table_CM AS mt
     ON mt.main_id = pc.ref_id
   WHERE h.match = 0 -- no match yet
     AND lvl < 10 -- just in case there's an endless loop due to bad data
 )
SELECT * FROM hierarchy AS h
WHERE lvl = 
 ( -- return the last row, matching or not
   SELECT Max(lvl)
   FROM hierarchy AS h2
   WHERE h.ref_id = h2.ref_id
 );

Edit:

Rewrite using EXISTS because SQL Server doesn't support Outer Joins in the recursive part:

WITH hierarchy AS
 ( -- all rows from source table
   SELECT b.ref_id, pc.parent_id, 
      0 AS match,
      1 AS lvl
   FROM business_table AS b
   LEFT JOIN parent_customer AS pc
     ON b.ref_id = pc.ref_id

   UNION ALL

   SELECT h.ref_id, pc.parent_id, 
      -- check if we found a match or reached top of hierarchy
      CASE WHEN exists
            ( select * 
              from match_table_CM AS mt
              where mt.main_id = pc.ref_id
            ) OR pc.parent_id = pc.ref_id
           THEN 1
           ELSE 0
      END,
      lvl+1
   FROM hierarchy AS h
   JOIN parent_customer AS pc 
     ON pc.ref_id = h.parent_id -- going up in the hierarchy
   WHERE h.match = 0 -- no match yet
     AND lvl < 10 -- just in case there's an endless loop due to bad data
 )
SELECT * FROM hierarchy AS h
WHERE lvl = 
 ( -- return the last row, matching or not
   SELECT Max(lvl)
   FROM hierarchy AS h2
   WHERE h.ref_id = h2.ref_id
 );

The optimizer's plan looked bad, so another rewrite to use a Windowed Aggregate instead of a Correlated Subquery:

WITH  hierarchy AS
 ( -- all rows from source table
   SELECT b.ref_id, pc.parent_id, 
      0 AS match,
      1 AS lvl
   FROM business_table AS b
   LEFT JOIN parent_customer AS pc
     ON b.ref_id = pc.ref_id

   UNION ALL

   SELECT h.ref_id, pc.parent_id, 
      -- check if we found a match or reached top of hierarchy
      CASE WHEN exists
            ( select * 
              from match_table_CM AS mt
              where mt.main_id = pc.ref_id
            ) OR pc.parent_id = pc.ref_id
           THEN 1
           ELSE 0
      END,
      lvl+1
   FROM hierarchy AS h
   JOIN parent_customer AS pc 
     ON pc.ref_id = h.parent_id -- going up in the hierarchy
   WHERE h.match = 0 -- no match yet
     AND lvl < 10 -- just in case there's an endless loop due to bad data
 )
select *
from 
 ( 
   SELECT h.*,
      max(lvl) over (partition by ref_id) as maxlvl
   FROM hierarchy AS h
 ) as dt
WHERE lvl = maxlvl
;


回答2:

You can get the ultimate parent using a recursive CTE:

with cte as (
      select pc.ref_id, pc.parent_id as ultimate_parent, 1 as lev
      from parent_customer pc
      where pc.ref_id = pc.parent_id
      union all
      select pc.ref_id, cte.ultimate_parent, lev + 1 
      from cte
           parent_customer pc
           on pc.parent_id = cte.ref_id and pc.ref_id <> pc.parent_id
    )
select *
from cte;

You can put this in an update:

with cte as (
      select pc.ref_id, pc.parent_id as ultimate_parent, 1 as lev
      from parent_customer pc
      where pc.ref_id = pc.parent_id
      union all
      select pc.ref_id, cte.ultimate_parent, lev + 1 
      from cte
           parent_customer pc
           on pc.parent_id = cte.ref_id and pc.ref_id <> pc.parent_id
    )
update bt
    set parent_id = cte.ultimate_parent
    from business_table bt join
         cte
         on cte.ref_id = bt.ref_id