可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I would like to select duplicate entries from a SQL Server table, but only if the id is consecutive.
I have been trying to twist this answer to my needs, but I can't get it to work.
The above answer is for Oracle, but I see that SQL Server also has lead
and lag
functions.
Also, I think that the answer above puts a *
next to duplicates, but I only want to select the duplicates.
select
id, companyName,
case
when companyName in (prev, next)
then '*'
end match,
prev,
next
from
(select
id,
companyName,
lag(companyName, 1) over (order by id) prev,
lead(companyName, 1) over (order by id) next
from
companies)
order by
id;
Example:
So from this data set:
id companyName
-------------------
1 dogs ltd
2 cats ltd
3 pigs ltd
4 pigs ltd
5 cats ltd
6 cats ltd
7 dogs ltd
8 pigs ltd
I want to select:
id companyName
-------------------
3 pigs ltd
4 pigs ltd
5 cats ltd
6 cats ltd
Update
Every now and again I am taken aback by the quantity and quality of answers I get on SO. This is one of those times. I don't have the level of expertise to judge one answer as being better than another, so I've gone for SqlZim as this was the first working answer I saw. But it's great to see the different approaches. Especially when only an hour ago I was wondering "is this even possible?".
回答1:
This is a gaps and islands style problem, but instead of using two row_numbers()
, we use the id
and row_number()
in the innermost subquery. Followed by count() over()
to get the count per grp
, and finally return those with a cnt > 1
.
select id, companyname
from (
select
id
, companyName
, grp
, cnt = count(*) over (partition by companyname, grp)
from (
select *
, grp = id - row_number() over (partition by companyname order by id)
from
companies
) islands
) d
where cnt > 1
order by id
rextester demo: http://rextester.com/ACP73683
returns:
+----+-------------+
| id | companyname |
+----+-------------+
| 3 | pigs ltd |
| 4 | pigs ltd |
| 5 | cats ltd |
| 6 | cats ltd |
+----+-------------+
回答2:
You are very close to what you want:
select id, companyName
from (select c.*,
lag(companyName, 1) over (order by id) prev,
lead(companyName, 1) over (order by id) next
from companies c
) a
where CompanyName in (prev, next)
order by id;
回答3:
One more alternate form, using LEAD() and LAG() (SQL 2012 and up)
SELECT id, CompanyName
FROM (
SELECT *,
LEAD(CompanyName, 1) OVER(ORDER BY id) as nc,
LAG(CompanyName, 1) OVER(ORDER BY id) AS pc
FROM #t t
) x
WHERE nc = companyName
OR pc = companyName
Here is the test data, so you can check it out yourself.
CREATE TABLE #T (id int not null PRIMARY KEY, companyName varchar(16) not null)
INSERT INTO #t Values
(1, 'dogs ltd'),
(2, 'cats ltd'),
(3, 'pigs ltd'),
(4, 'pigs ltd'),
(5, 'cats ltd'),
(6, 'cats ltd'),
(7, 'dogs ltd'),
(8, 'pigs ltd')
回答4:
In the WHERE clause you just need to limit to those where the companyName is the same as the prev or the next
select id, companyName
from (
select id, companyName,
lag(companyName, 1) over (order by id) as prev,
lead(companyName, 1) over (order by id) as next
from companies
) q
where companyName in (prev, next)
order by id;
To make sure the id's are really without gaps then you can do it like this:
select id, companyName
from (
select id, companyName,
lag(concat(id+1,companyName), 1) over (order by id) as prev,
lead(concat(id-1,companyName), 1) over (order by id) as next
from companies
) q
where concat(id,companyName) in (prev, next)
order by id;
回答5:
You can use Row_Number() and get the duplicates based on partition by clause
;with cte as (
SELECT id, companyName,
RowN = Row_Number() over (partition by id order by companynae) from #yourTable
)
Select * from cte where RowN > 1
Can you provide your input and expected output to verify this query