SQL to check when pairs don't match

Posted 2019-06-04 07:49

Question:

I am using SQL Server 2012, and I have the following sample data:

Date        Type    Symbol      Price
6/30/1995   gaus    313586U72   109.25
6/30/1995   gbus    313586U72   108.94
6/30/1995   csus    NES         34.5
6/30/1995   lcus    NES         34.5
6/30/1995   lcus    NYN         40.25
6/30/1995   uaus    NYN         40.25
6/30/1995   agus    SRR         10.25
6/30/1995   lcus    SRR         0.45
7/1/1995    gaus    313586U72   109.25
7/1/1995    gbus    313586U72   108.94

I want to filter out rows where the symbol and price match; it's OK if the type doesn't match. With the above data, I would expect to see only:

Date        Type    Symbol      Price
6/30/1995   gaus    313586U72   109.25
6/30/1995   gbus    313586U72   108.94
6/30/1995   agus    SRR         10.25
6/30/1995   lcus    SRR         0.45
7/1/1995    gaus    313586U72   109.25
7/1/1995    gbus    313586U72   108.94

NES and NYN have been filtered out because their symbol and price match.

I was thinking of using Partition and row number, but I am not sure how to pair and filter rows using that or another function.

UPDATE: I will be testing the replies. I should have mentioned that I just want to see duplicates for symbol and price that occur on the same date. Also, the table is called duppri.

Answer 1:

One way is to use the EXISTS predicate with a correlated subquery that checks whether the symbol has more than one price:

select * from table1 t
where exists (
  select 1
  from table1
  where symbol = t.symbol
  and price <> t.price);


This would return:

|                   Date | Type |    Symbol |  Price |
|------------------------|------|-----------|--------|
| June, 30 1995 02:00:00 | gaus | 313586U72 | 109.25 |
| June, 30 1995 02:00:00 | gbus | 313586U72 | 108.94 |
| June, 30 1995 02:00:00 | agus |       SRR |  10.25 |
| June, 30 1995 02:00:00 | lcus |       SRR |   0.45 |
| July, 01 1995 02:00:00 | gaus | 313586U72 | 109.25 |
| July, 01 1995 02:00:00 | gbus | 313586U72 | 108.94 |

Edit: inspired by Gordon Linoff's clever answer, another option is to use avg() as a window function:

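select Date, Type, Symbol, Price
from (
  -- avg_price: average price across all rows sharing the symbol
  select Date, Type, Symbol, Price,
         avg(price) over (partition by symbol) as avg_price
  from table1) a
-- caveat: a row whose price happens to equal the average of differing prices is dropped
where avg_price <> price;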

Edit: with a check to ensure only duplicates on the same date are returned: http://www.sqlfiddle.com/#!6/29d67/1
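
The linked fiddle is not reproduced here. As a minimal sketch of what a same-date check might look like with the EXISTS approach, the query below adds a date comparison to the correlated subquery. It assumes the table is named duppri, as the asker's update says; the CTE is only a stand-in for that table, filled with a few rows from the question so the statement runs on its own.

```sql
-- Stand-in for the duppri table, built from rows in the question's sample data.
with duppri ([Date], [Type], Symbol, Price) as (
  select cast(d as date), t, s, p
  from (values
    ('19950630', 'gaus', '313586U72', 109.25),
    ('19950630', 'gbus', '313586U72', 108.94),
    ('19950630', 'csus', 'NES',        34.50),
    ('19950630', 'lcus', 'NES',        34.50),
    ('19950630', 'agus', 'SRR',        10.25),
    ('19950630', 'lcus', 'SRR',         0.45)
  ) v (d, t, s, p)
)
select d.[Date], d.[Type], d.Symbol, d.Price
from duppri d
where exists (
  select 1
  from duppri d2
  where d2.Symbol = d.Symbol
    and d2.[Date] = d.[Date]   -- only compare rows from the same date
    and d2.Price <> d.Price    -- some other row has a different price
);
```

With these rows, the 313586U72 and SRR rows are returned and the NES rows are not, because both NES prices on that date are equal.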



Answer 2:

I would approach this using window functions:

select s.*
from (select s.*,
             min(price) over (partition by symbol) as minprice,
             max(price) over (partition by symbol) as maxprice
      from sample s
     ) s
where minprice <> maxprice;
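
Given the asker's update, the same idea could be scoped to a single date by adding the date to the partition. This is a minimal sketch, assuming the table is named duppri as stated in the update. The CTE only stands in for that table, and the two XYZ rows are hypothetical, added to show what the date scope changes.

```sql
-- Stand-in for duppri: four rows from the question plus two hypothetical XYZ rows.
with duppri ([Date], [Type], Symbol, Price) as (
  select cast(d as date), t, s, p
  from (values
    ('19950630', 'csus', 'NES', 34.50),
    ('19950630', 'lcus', 'NES', 34.50),
    ('19950630', 'agus', 'SRR', 10.25),
    ('19950630', 'lcus', 'SRR',  0.45),
    ('19950630', 'aaus', 'XYZ',  5.00),  -- hypothetical row
    ('19950701', 'aaus', 'XYZ',  6.00)   -- hypothetical row
  ) v (d, t, s, p)
)
select x.[Date], x.[Type], x.Symbol, x.Price
from (
  select d.*,
         -- lowest and highest price for this symbol on this date
         min(Price) over (partition by Symbol, [Date]) as minprice,
         max(Price) over (partition by Symbol, [Date]) as maxprice
  from duppri d
) x
where x.minprice <> x.maxprice;
```

Only the SRR rows come back. With `partition by Symbol` alone, the XYZ rows would also be returned, because their prices differ across dates even though each date has a single price.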


Answer 3:

Use a subquery with GROUP BY and HAVING COUNT(DISTINCT price) to find the symbols that have more than one price, and keep only the rows for those symbols:

select * from your_table
where symbol in
(
  select symbol
  from your_table
  group by symbol
  having count(distinct price) > 1
)
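
To honor the asker's same-date requirement with the grouped approach, the grouping needs both symbol and date. SQL Server's IN takes a single expression on its left side, so this sketch matches the (symbol, date) pairs with a join instead. It assumes the table is named duppri, per the asker's update; the CTE is only a stand-in filled with rows from the question, and it returns the rows whose symbol has differing prices on the same date, as in the question's expected output.

```sql
-- Stand-in for the duppri table, built from rows in the question's sample data.
with duppri ([Date], [Type], Symbol, Price) as (
  select cast(d as date), t, s, p
  from (values
    ('19950630', 'gaus', '313586U72', 109.25),
    ('19950630', 'gbus', '313586U72', 108.94),
    ('19950630', 'lcus', 'NYN',        40.25),
    ('19950630', 'uaus', 'NYN',        40.25),
    ('19950701', 'gaus', '313586U72', 109.25),
    ('19950701', 'gbus', '313586U72', 108.94)
  ) v (d, t, s, p)
)
select t.[Date], t.[Type], t.Symbol, t.Price
from duppri t
join (
  -- (symbol, date) groups that contain more than one distinct price
  select Symbol, [Date]
  from duppri
  group by Symbol, [Date]
  having count(distinct Price) > 1
) m
  on m.Symbol = t.Symbol
 and m.[Date] = t.[Date];
```

With these rows, the four 313586U72 rows are returned and the NYN rows are not.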