Deleting duplicates in SQL and modifying relations

Posted 2019-07-27 05:31

I have three tables:

menu_tab has columns (menu_id, menu_description)
item_tab has columns (item_id, item_name, item_description, item_price)
menu_has_item has columns (menu_tab_menu_id, a foreign key to menu_id, the pk of menu_tab; item_tab_item_id, a foreign key to item_id, the pk of item_tab)

There are two kinds of duplicates I will encounter:
1) An item duplicated within the same menu_description
2) An item duplicated across different menu_descriptions

Example: two Chicken Sandwiches in the Lunch menu (case 1), or one Chicken Sandwich in the Lunch menu and another in the Dinner menu (case 2).

menu_tab    
menu_id menu_description
1        lunch
2        dinner
3        Specials


item_tab        
item_id item_description    
1       b 
2       d   
3       g   
4       x   
5       g          delete g
6       d   
7       e   
8       b          delete b
9       x   



menu_has_item
menu_tab_menu_id  item_tab_item_id
1                 1
1                 2
1                 3
1                 4
2                 5          replace by 3
2                 6
3                 7
3                 8          replace by 1
3                 9

How do I update my menu_has_item with the replaced values after removing the duplicates?

3 answers
趁早两清
Reply #2 · 2019-07-27 05:41
begin
  for x in (
            -- find duplicate items: every row after the first in each
            -- item_description group; ordering by item_id keeps the lowest id
            select *
              from (select rowid row_id,
                           item_id,
                           item_description,
                           row_number() over(partition by item_description order by
                           item_id) row_no
                       from item_tab)
            where row_no > 1) loop
    -- repoint references from the duplicate to the kept item (lowest item_id)
    update menu_has_item
    set menu_has_item.item_tab_item_id =
           ( select min(item_id)
               from item_tab
              where item_tab.item_description = x.item_description)
    where menu_has_item.item_tab_item_id = x.item_id;
    -- delete the duplicate item
    delete item_tab where rowid = x.row_id;
  end loop;
-- commit;
end;
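
One case the loop above does not handle by itself: when two duplicate items sit in the same menu (the question's first kind of duplicate), repointing both link rows at the kept item leaves two identical rows in menu_has_item. Here is a minimal sketch of collapsing them. It rests on assumptions: Python's sqlite3 is used as a stand-in engine so the snippet runs anywhere (the answer itself is Oracle PL/SQL), and the sample rows are made up to represent the link table right after the remap.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE menu_has_item (menu_tab_menu_id INTEGER, item_tab_item_id INTEGER);
-- Hypothetical state right after the remap: menu 1 held two copies of the
-- same item, so both of its link rows now point at item 1.
INSERT INTO menu_has_item VALUES (1, 1), (1, 1), (2, 1), (2, 3);
""")

# Keep the first physical row of each (menu, item) pair and delete the rest.
conn.execute("""
DELETE FROM menu_has_item
WHERE rowid NOT IN (
    SELECT MIN(rowid)
    FROM menu_has_item
    GROUP BY menu_tab_menu_id, item_tab_item_id
)
""")

links = conn.execute(
    "SELECT menu_tab_menu_id, item_tab_item_id FROM menu_has_item ORDER BY 1, 2"
).fetchall()
print(links)
```

If menu_has_item has a primary key or unique constraint on the pair, the update itself would fail on such rows, so the colliding link row would have to be deleted instead of updated.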
SAY GOODBYE
Reply #3 · 2019-07-27 05:42

I did this for my tables Rout(RoutID,SourceCityID,DestCityID) and Form(FormID,RoutID,...): I deleted the duplicate routes from the Rout table and updated RoutID in the Form table.
First, get one row per duplicate group, grouped by the columns that you want to compare for duplicates:

(SELECT * FROM
    Rout,
    (SELECT MIN(RoutID) MinRoutID
    FROM Rout,
        (SELECT SourceCityID,DestCityID
        FROM Rout
        GROUP BY SourceCityID,DestCityID
        HAVING count(*) > 1) AS Duplicates
    WHERE Rout.SourceCityID=Duplicates.SourceCityID AND Rout.DestCityID=Duplicates.DestCityID
    GROUP BY Rout.SourceCityID,Rout.DestCityID)AS MRCols
WHERE RoutID=MinRoutID)AS DuplicateGroup

Then get all the duplicated rows, ungrouped, together with the columns that are compared for duplicates:

(SELECT RoutID,Rout.SourceCityID,Rout.DestCityID FROM Rout,
    (SELECT SourceCityID,DestCityID
    FROM Rout
    GROUP BY SourceCityID,DestCityID
    HAVING count(*) > 1)AS Duplicates
WHERE Rout.SourceCityID=Duplicates.SourceCityID AND Rout.DestCityID=Duplicates.DestCityID)AS DuplicateDetail

Then update the Form table like this:

UPDATE Form SET RoutID=DuplicateGroup.RoutID
FROM
    Form,
    (SELECT * FROM
        Rout,
        (SELECT MIN(RoutID) MinRoutID
        FROM Rout,
            (SELECT SourceCityID,DestCityID
            FROM Rout
            GROUP BY SourceCityID,DestCityID
            HAVING count(*) > 1) AS Duplicates
        WHERE Rout.SourceCityID=Duplicates.SourceCityID AND Rout.DestCityID=Duplicates.DestCityID
        GROUP BY Rout.SourceCityID,Rout.DestCityID)AS MRCols
    WHERE RoutID=MinRoutID)AS DuplicateGroup
    ,
    (SELECT RoutID,Rout.SourceCityID,Rout.DestCityID FROM Rout,
        (SELECT SourceCityID,DestCityID
        FROM Rout
        GROUP BY SourceCityID,DestCityID
        HAVING count(*) > 1)AS Duplicates
    WHERE Rout.SourceCityID=Duplicates.SourceCityID AND Rout.DestCityID=Duplicates.DestCityID)AS DuplicateDetail
WHERE
    Form.RoutID=DuplicateDetail.RoutID AND
    DuplicateGroup.SourceCityID=DuplicateDetail.SourceCityID
    AND DuplicateGroup.DestCityID=DuplicateDetail.DestCityID

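Now delete the rows in Rout that are no longer referenced by the Form table. Note that this also deletes any route that no form references, whether or not it was a duplicate: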

DELETE FROM Rout WHERE RoutID NOT IN(SELECT DISTINCT RoutID FROM Form)
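
The statements above can be condensed by building the old-to-kept mapping once and reusing it for both the update and the delete. This is a minimal sketch under assumptions: Python's sqlite3 stands in for the answer's database engine, route_map is a hypothetical helper table, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Rout (RoutID INTEGER PRIMARY KEY, SourceCityID INTEGER, DestCityID INTEGER);
CREATE TABLE Form (FormID INTEGER PRIMARY KEY, RoutID INTEGER);
-- Invented sample data: routes 1, 2 and 4 are duplicates of each other.
INSERT INTO Rout VALUES (1, 10, 20), (2, 10, 20), (3, 30, 40), (4, 10, 20), (5, 50, 60);
INSERT INTO Form VALUES (100, 2), (101, 4), (102, 3), (103, 1);

-- route_map (hypothetical name): one row per duplicate to remove,
-- paired with the lowest RoutID of its (source, destination) group.
CREATE TEMP TABLE route_map AS
SELECT r.RoutID AS old_id, g.keep_id
FROM Rout AS r
JOIN (SELECT SourceCityID, DestCityID, MIN(RoutID) AS keep_id
      FROM Rout
      GROUP BY SourceCityID, DestCityID
      HAVING COUNT(*) > 1) AS g
  ON g.SourceCityID = r.SourceCityID AND g.DestCityID = r.DestCityID
WHERE r.RoutID <> g.keep_id;

-- Repoint forms, then delete exactly the mapped duplicates.
UPDATE Form
SET RoutID = (SELECT keep_id FROM route_map WHERE old_id = Form.RoutID)
WHERE RoutID IN (SELECT old_id FROM route_map);

DELETE FROM Rout WHERE RoutID IN (SELECT old_id FROM route_map);
""")

forms = conn.execute("SELECT FormID, RoutID FROM Form ORDER BY FormID").fetchall()
routes = [row[0] for row in conn.execute("SELECT RoutID FROM Rout ORDER BY RoutID")]
print(forms)
print(routes)
```

Because the delete is driven by the mapping, route 5, which no form references, survives here; a DELETE based on NOT IN (SELECT RoutID FROM Form) would remove it as well.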
小情绪 Triste *
Reply #4 · 2019-07-27 05:50

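First you need to replace the duplicate item references in menu_has_item with the new value. Oracle does not allow MERGE to update a column that appears in its ON clause (ORA-38104), so the match is done on rowid:

merge into menu_has_item dest
using (select m.rowid as rid, d.new_item_id
         from menu_has_item m
         join (select item_id, min(item_id) over(partition by item_description) as new_item_id from item_tab) d
           on m.item_tab_item_id = d.item_id
        where d.item_id != d.new_item_id) src
on (dest.rowid = src.rid)
when matched then
    update set dest.item_tab_item_id = src.new_item_id;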

After that you need to remove the duplicates from the item table. You can find a script for that here: http://sprogram.com.ua/en/articles/oracle-delete-duplicate-record
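
Both steps, remap then delete, can be sketched end to end on the question's sample data. This is a sketch under assumptions: it uses Python's sqlite3 so it runs as-is, and since SQLite has no MERGE statement, a correlated UPDATE is swapped in for it. Duplicates are taken to mean equal item_description, so on this data items 6 and 9 are merged into 2 and 4 as well, in addition to the two rows the question marks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Only the columns shown in the question's sample data are modelled here.
CREATE TABLE item_tab (item_id INTEGER PRIMARY KEY, item_description TEXT);
CREATE TABLE menu_has_item (menu_tab_menu_id INTEGER, item_tab_item_id INTEGER);
INSERT INTO item_tab VALUES
    (1,'b'),(2,'d'),(3,'g'),(4,'x'),(5,'g'),(6,'d'),(7,'e'),(8,'b'),(9,'x');
INSERT INTO menu_has_item VALUES
    (1,1),(1,2),(1,3),(1,4),(2,5),(2,6),(3,7),(3,8),(3,9);
""")

# Step 1: point every link row at the lowest item_id sharing its description.
conn.execute("""
UPDATE menu_has_item
SET item_tab_item_id = (
    SELECT MIN(k.item_id)
    FROM item_tab AS k
    JOIN item_tab AS d ON d.item_description = k.item_description
    WHERE d.item_id = menu_has_item.item_tab_item_id
)
WHERE item_tab_item_id IN (SELECT item_id FROM item_tab)
""")

# Step 2: delete every item that is not the lowest id of its description.
conn.execute("""
DELETE FROM item_tab
WHERE item_id NOT IN (
    SELECT MIN(item_id) FROM item_tab GROUP BY item_description
)
""")

links = conn.execute(
    "SELECT menu_tab_menu_id, item_tab_item_id FROM menu_has_item ORDER BY rowid"
).fetchall()
items = [row[0] for row in conn.execute("SELECT item_id FROM item_tab ORDER BY item_id")]
print(links)
print(items)
```

The order matters: the delete runs only after the link table has been repointed, so no row in menu_has_item is left referencing a deleted item.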

Oops, you tagged the question as plsql, so I mistakenly assumed you were asking about Oracle, sorry. If you are on MySQL, note that it has no MERGE statement; an UPDATE with a JOIN is the usual substitute there. Good luck.
