I have a database generated from an XML document with duplicate records. I know how to delete one record from the main table, but not those with foreign-key restraints.
I have a large amount of XML documents, and they are inserted without caring about duplicates or not. One solution to removing duplicates is to just delete the lowest Primary_Key values (and all related foreign key records) and keep the highest. I don't know how to do that, though.
The database looks like this:
Table 1: [type]
+-------------+---------+-----------+
| Primary_Key | Food_ID | Food_Type |
+-------------+---------+-----------+
| 70001 | 12345 | fruit |
| 70002 | 12345 | fruit |
| 70003 | 12345 | meat |
+----^--------+---------+-----------+
|
|-----------------|
|
| Linked to primary key in the first table
+-------------+--------v--------+-------------+-------------+------------+
| Primary_Key | Information_ID | Food_Name | Information | Comments |
+-------------+-----------------+-------------+-------------+------------+
| 0001 | 70001 | banana | buy @ toms | delicious! |
| 0002 | 70002 | banana | buy @ mats | so-so |
| 0003 | 70003 | decade meat | buy @ sals | disgusting |
+-------------+-----------------+-------------+-------------+------------+
^ Table 2: [food_information]
There are several other linked tables as well, which all have a foreign key value of the matched primary key value in the main table ([type]).
My question based on which solution might be best:
- How do I delete all of those records, except 70003 (the highest one)? We can't know if it's duplicate record unless [Food_ID] shows up more than once. If it shows up more than once, we need to delete records from ALL tables (there are 10) based on the Primary_Key and Foreign_Key relationship.
- How do I update/merge these SQL records on insertion to avoid having to delete multiples again?
I'd prefer #1, as it prevents me from having to rebuild the database, and it makes inserting much easier.
Thanks!