MySQL 5.5: Efficient way to copy the same user inf

2019-09-04 06:44发布

问题:

In MySQL 5.5, assume we have the following MAIN table

Id  Name    Score
3   a   100
3   b   99
4   c   98
4   d   97

Also a SUB table:

Id  New_Id
3   1
3   1
4   2
4   2

The ideal output is OUTPUT table:

Id  Name    Score
3   a   100
3   b   99
4   c   98
4   d   97
1   a   100
1   b   99
2   c   98
2   d   97

The MySQL fiddle is available here

http://sqlfiddle.com/#!9/91c1cf/6

In the OUTPUT table,

we can see that the Id=1 shares the same information as Id =3. Similarly,

we can see that the Id=2 shares the same information as Id =4.

Is there any simple way such as "INSERT INTO...SELECT", "COPY", and "UPDATE" etc. that we can just copy the data from Id=3 to Id=1, and

copy the data from Id=4 to Id=2?

The following query does generate our ideal output, but the JOIN with the proper indexes is still painfully slow in our few hundreds millions rows. The job even fails due to not sufficient RAM and /tmp folder space in the case of few billion rows. We are upgrading our system from MySQL to somewhere more scab-able. However, we need to make sure the MySQL system will be functional during the few months of the transit period. Any 2 cents will be highly appreciated!

SELECT Id, Name, Score FROM MAIN

UNION

SELECT d.New_Id AS Id, c.Name, c.Score FROM MAIN c
RIGHT JOIN SUB d

ON c.Id = d.Id;

回答1:

Use INNER JOIN rather than RIGHT JOIN, since you don't need the null rows that result from non-matching rows. You can use INSERT INTO ... SELECT to add these new rows to the table. And rather than using UNION, you can simply do two inserts into the new table:

INSERT INTO OUTPUT (id, name, score)
SELECT id, name, score
FROM MAIN;

INSERT INTO OUTPUT (id, name, score)
SELECT d.new_id, c.name, c.score
FROM MAIN AS c
JOIN SUB AS d ON c.id = d.id;

As long as you have indexes on the id columns in both input tables this should be as efficient as possible.

Finally, when doing large UNION queries, use UNION ALL if you know there are no duplicates that need to be merged. UNION by itself defaults to UNION DISTINCT, so it needs to create a temporary table to hold all the results to scan for duplicates.