How to update in SQL to get distinct tuples / not

2019-06-14 14:06发布

问题:

I have a mapping table with a unique contraint on the tuple (c_id, t_id).

Here's some sample data to illustrate the situation:

id  c_id    t_id  
----------------
1   10      2
2   10      3
3   10      7
4   12      2
5   13      3

I wrote a merge function for t_ids (x,y -> z OR x,y -> x). If my content (c_id) has both t_ids, then I'm of course violating the constraint by using this statement:

UPDATE mapping_table
SET t_id = '$target_tid'
WHERE t_id = '$t1_id' OR t_id = '$t2_id';

The result would be:

id  c_id    t_id
----------------
1   10      4
2   10      4       /* violates unique constraint */
3   10      7

Now I came up with this:

/* delete one of the duplicate entries */
DELETE FROM mapping_table
WHERE   ( SELECT count(c_id)
          FROM mapping_table
          WHERE t_id = '$t1_id' OR t_id = '$t2_id'
        ) > 1;

/* update the remaining row */
UPDATE mapping_table
SET t_id = '$target_tid'
WHERE t_id = '$t1_id' OR t_id = '$t2_id';

Now I'm getting the following error:
You can't specify target table 'mapping_table' for update in FROM clause

My questions are:

  1. What's exactly wrong here? Is the DELETE statement seen as an update and cannot be used in the WHERE clause?
  2. This there any more efficient way to do this?

回答1:

The error that you are having is a peculiarity of MySQL. You can get around this with a double set of subqueries:

DELETE FROM mapping_table
WHERE  (select *
        from ( SELECT count(c_id)
               FROM mapping_table
               WHERE t_id = '$t1_id' OR t_id = '$t2_id'
             ) > 1
        ) t

To fix your problem though, just remove all ids except for the minimum. I think this might also work:

delete from mapping_table
where id > (select minid from (select min(id) from mapping_table mt2
                               where mt2.c_id = mapping_table.c_id and
                                     mt2.t_id = mapping_table.t_id
                              )
           )

You can also store the list of ids in a temporary table, and use that in the query:

create temporary table minids as
     select c_id, t_id, min(id) as minid
     from mapping_table
     group by c_id, t_id;

delete from mapping_table
where exists (select 1 from minids
              where mt2.c_id = mapping_table.c_id and
                    mt2.t_id = mapping_table.t_id and
                    mt2.minid > mapping_table.id
             )


回答2:

Try this

DELETE FROM mapping_table
    WHERE   ( SELECT count(c_id)
      FROM mapping_table
      WHERE t_id = '$t1_id' OR t_id = '$t2_id'
      Having count(c_id) > 1
    );

EDIT:

try this in your update statment

 UPDATE mapping_table
 SET t_id = '$target_tid'
WHERE t_id in (select t_id from mapping_table where t_id= '$t1_id' OR t_id = '$t2_id') 


回答3:

I was looking for this solution. The performance is probably amazingly low, but at least I found a working solution (and learnt something).

/* actually delete rows that will become duplicate after the update */
DELETE FROM mt1 USING mapping_table AS mt1 WHERE id IN (

    /* sub-query to allow `mapping_table` in the DELETE statement */
    SELECT * FROM (

        /* select ids/rows with one t_id available */
        SELECT id
        FROM mapping_table AS mt2
        WHERE mt2.tag_id = $t1_id AND c_id IN (

            /* select ids/rows with both t_id available */
            SELECT c_id
            FROM mapping_table AS mt3
            WHERE mt3.c_id = mt2.c_id AND mt3.tag_id = $t2_id)

    /* alias needed for every derived table */
    ) as mres
)

/* Update to merge t_ids */
UPDATE mapping_table
SET t_id = '$target_id'
WHERE t_id = '$t1_id' OR t_id = '$t2_id';