Question:
I have a table which is as follows:
emp_name    emp_address    sex    marital_status
uuuu        eee            m      s
iiii        iii            f      s
uuuu        eee            m      s

I want to remove the duplicate entries based on the three fields emp_name, emp_address, and sex,
and my resulting table (after removing the duplicates) should look like this:

emp_name    emp_address    sex    marital_status
uuuu        eee            m      s
iiii        iii            f      s

I am not able to recall how to write a SQL query for this. Can anyone please help?
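(For anyone who wants to run the answers below: a minimal, assumed setup script that reproduces the sample data. The answers refer to this table variously as YourTable, yourtable, mytable, and Employee.)

-- Assumed schema; column types are guesses based on the sample data.
CREATE TABLE Employee (
    emp_name       VARCHAR(50),
    emp_address    VARCHAR(100),
    sex            CHAR(1),
    marital_status CHAR(1)
);

INSERT INTO Employee (emp_name, emp_address, sex, marital_status) VALUES
    ('uuuu', 'eee', 'm', 's'),
    ('iiii', 'iii', 'f', 's'),
    ('uuuu', 'eee', 'm', 's');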
Answer 1:
It looks like all four column values are duplicated, so you can do this:
select distinct emp_name, emp_address, sex, marital_status
from YourTable
However, if marital_status can differ and you have some other column to choose by (for example, you want the latest record based on a create_date column), you can do this:
select emp_name, emp_address, sex, marital_status
from YourTable a
where not exists (select 1
                  from YourTable b
                  where b.emp_name    = a.emp_name and
                        b.emp_address = a.emp_address and
                        b.sex         = a.sex and
                        b.create_date > a.create_date)  -- strictly newer; >= would match the row itself and return nothing
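If the goal is to delete the older duplicates rather than select the survivors, the same condition can be inverted; here is a hedged sketch using T-SQL's DELETE ... FROM alias syntax, still assuming the hypothetical create_date column:

-- Sketch only: removes every row for which a strictly newer duplicate exists,
-- so the latest row per (emp_name, emp_address, sex) group survives.
delete a
from YourTable a
where exists (select 1
              from YourTable b
              where b.emp_name    = a.emp_name and
                    b.emp_address = a.emp_address and
                    b.sex         = a.sex and
                    b.create_date > a.create_date)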
Answer 2:
I would create a new table with a unique index over the columns that you want to keep unique, then insert from the old table into the new one, ignoring the warnings about duplicate rows. Lastly, I would drop (or rename) the old table and replace it with the new table. In MySQL, this would look like:
CREATE TABLE tmp LIKE mytable;
ALTER TABLE tmp ADD UNIQUE INDEX myindex (emp_name, emp_address, sex);  -- unique over the three fields that define a duplicate
INSERT IGNORE INTO tmp SELECT * FROM mytable;
DROP TABLE mytable;
RENAME TABLE tmp TO mytable;
Or something similar (this is totally untested).
Answer 3:
This is not a query but a delete statement; it will remove the duplicate rows from your table:
;with C as
(
    select row_number() over(partition by emp_name, emp_address, sex
                             order by (select 0)) as rn  -- arbitrary tie-break within each duplicate group
    from yourtable
)
delete C
where rn > 1
If you are only interested in querying the table and getting the non-duplicates as a result, you should use this instead:
;with C as
(
    select *,
           row_number() over(partition by emp_name, emp_address, sex
                             order by (select 0)) as rn
    from yourtable
)
select *
from C
where rn = 1
Answer 4:
One way:
select emp_name, emp_address, sex, max(marital_status) as marital_status
from Yourtable
group by emp_name, emp_address, sex
Since I don't know which value you want, I used max for the marital status. See also Including an Aggregated Column's Related Values for more examples.
Answer 5:
If you are okay with trading space for performance and simplicity, then the duplicates in the emp_name | emp_address | sex combination can be eliminated by introducing a calculated/derived column that uses the T-SQL CHECKSUM() function, and then using the DISTINCT keyword while querying.
Here's an example of CHECKSUM:
SELECT CHECKSUM(*) FROM HumanResources.Employee WHERE EmployeeID = 2
Google around and create a dependent column that contains the checksum of the 3 columns. Then you can select distinct rows by looking at this question.
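A minimal sketch of that idea in T-SQL; the computed-column name dup_checksum and the ROW_NUMBER step used to pick one row per checksum are assumptions, not part of the original answer, and CHECKSUM() can collide, so distinct rows may occasionally hash to the same value:

-- Assumed computed column over the three fields (the name is illustrative).
ALTER TABLE Employee
    ADD dup_checksum AS CHECKSUM(emp_name, emp_address, sex);

-- Keep one arbitrary row per checksum value; beware of CHECKSUM collisions.
SELECT emp_name, emp_address, sex, marital_status
FROM (
    SELECT emp_name, emp_address, sex, marital_status,
           ROW_NUMBER() OVER (PARTITION BY dup_checksum
                              ORDER BY (SELECT 0)) AS rn
    FROM Employee
) AS t
WHERE rn = 1;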
Answer 6:
The best answer is here:
Use this SQL statement to identify the extra duplicated rows:
select * from Employee a
where %%physloc%% >
(select min(%%physloc%%) from Employee b
where a.emp_name=b.emp_name and a.emp_address=b.emp_address and a.sex=b.sex);
You will get the extra row:
uuuu eee m s
Use this SQL statement to delete the extra duplicated rows:
delete a from Employee a
where %%physloc%% >
      (select min(%%physloc%%) from Employee b
       where a.emp_name=b.emp_name and a.emp_address=b.emp_address and a.sex=b.sex);
For all duplicated records, only the one with the lowest physical location is kept. This method can be applied to remove all kinds of duplicated rows.
I am assuming that you use MS SQL Server. If you are using Oracle DB, then you can just replace '%%physloc%%' with 'rowid' (Oracle also accepts the plain delete from Employee a form).
Enjoy the code!
Answer 7:
I know this is an old post, but I recently tested a solution and want to share it in case anyone finds it helpful:
CREATE TABLE tmpTable LIKE yourTable;

INSERT INTO tmpTable (col1, col2, ... colN)
SELECT DISTINCT col1, col2, ... colN
FROM yourTable
WHERE 1;

DROP TABLE yourTable;

RENAME TABLE tmpTable TO yourTable;
Please note, the insert into statement can be executed without the primary key column; leave it out of the column list so that select distinct can collapse the duplicates.
Thanks.