SQL Selecting distinct rows from multiple columns

2019-08-12 20:27发布

问题:

This is my SQL View - lets call it MyView :

ECode   SHCode  TotalNrShare    CountryCode Country
000001  +00010  100 UKI United Kingdom
000001  ABENSO  900 USA United States
000355  +00012  1000    ESP Spain
000355  000010  50  FRA France
000042  009999  10  GER Germany
000042  +00012  999 ESP Spain
000787  ABENSO  500 USA United States
000787  000150  500 ITA Italy
001010  009999  100 GER Germany

I would like to return the single row with the highest number in the column TotalNrShare for each ECode.

For example, I’d like to return these results from the above view:

ECode   SHCode  TotalNrShare    CountryCode Country
000001  ABENSO  900 USA United States
000355  +00012  1000    ESP Spain
000042  +00012  999 ESP Spain
000787  ABENSO  500 USA United States
001010  009999  100 GER Germany

(note in the case of ECode 000787 where there are two SHCode's with 500 each, as they are the same amount we can just return the first row rather than both, it isnt important for me which row is returned since this will happen very rarely and my analysis doesnt need to be 100%)

Ive tried various things but do not seem to be able to return either unqiue results or the additional country code/country info that I need.

This is one of my attempts (based on other solutions on this site, but I am doing something wrong):

SELECT tsh.ECode, tsh.SHCode, tsh.TotalNrShare, tsh.CountryCode, tsh.Country
FROM  dbo.MyView AS tsh INNER JOIN
                   (SELECT DISTINCT ECode, MAX(TotalNrShare) AS MaxTotalSH
                    FROM   dbo.MyView
                    GROUP BY ECode) AS groupedtsh ON tsh.ECode = groupedtsh.ECode AND tsh.TotalNrShare = groupedtsh.MaxTotalSH

回答1:

WITH
  sequenced_data AS
(
  SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY ECode ORDER BY TotalNrShare) AS sequence_id
  FROM
    myView
)
SELECT
  *
FROM
  sequenced_data
WHERE
  sequence_id = 1

This should, however, give the same results as your example query. It's simply a different approach to accomplish the same thing.

As you say that something is wrong, however, please could you elaborate on what is going wrong? Is TotalNrShare actually a string for example? And is that messing up your ordering (and so the MAX())?

EDIT:

Even if the above code was not compatible with your SQL Server, it shouldn't crash it out completely. You should just get an error message. Try executing Select * By Magic, for example, and it should just give an error. I strongly suggest getting your installation of Management Studio looked at and/or re-installed.

In terms of an alternative, you could do this...

SELECT
  *
FROM
  (SELECT ECode FROM MyView GROUP BY ECode) AS base
CROSS APPLY
  (SELECT TOP 1 * FROM MyView WHERE ECode = base.ECode ORDER BY TotalNrShare DESC) AS data

Ideally you would replace the base sub-query with a table that already has a distinct list of all the ECodes that you are interested in.



回答2:

try this;

with cte as( 
 SELECT tsh.ECode, tsh.SHCode, tsh.TotalNrShare, tsh.CountryCode, tsh.Country,
 ROW_NUMBER() over (partition by ECode order by SHCode ) as row_num 
FROM  dbo.MyView)
select * from cte where row_num=1