SQL/mysql - Select distinct/UNIQUE but return all

2018-12-31 17:20发布

SELECT DISTINCT field1, field2, field3, ......   FROM table

I am trying to accomplish the following sql statement but I want it to return all columns is this possible? Something like:

SELECT DISTINCT field1, * from table

15条回答
若你有天会懂
2楼-- · 2018-12-31 18:16

From the phrasing of your question, I understand that you want to select the distinct values for a given field and for each such value to have all the other column values in the same row listed. Most DBMSs will not allow this with neither DISTINCT nor GROUP BY, because the result is not determined.

Think of it like this: if your field1 occurs more than once, what value of field2 will be listed (given that you have the same value for field1 in two rows but two distinct values of field2 in those two rows).

You can however use aggregate functions (explicitely for every field that you want to be shown) and using a GROUP BY instead of DISTINCT:

SELECT field1, MAX(field2), COUNT(field3), SUM(field4), .... FROM table GROUP BY field1
查看更多
人气声优
3楼-- · 2018-12-31 18:16

For SQL Server you can use the dense_rank and additional windowing functions to get all rows AND columns with duplicated values on specified columns. Here is an example...

with t as (
    select col1 = 'a', col2 = 'b', col3 = 'c', other = 'r1' union all
    select col1 = 'c', col2 = 'b', col3 = 'a', other = 'r2' union all
    select col1 = 'a', col2 = 'b', col3 = 'c', other = 'r3' union all
    select col1 = 'a', col2 = 'b', col3 = 'c', other = 'r4' union all
    select col1 = 'c', col2 = 'b', col3 = 'a', other = 'r5' union all
    select col1 = 'a', col2 = 'a', col3 = 'a', other = 'r6'
), tdr as (
    select 
        *, 
        total_dr_rows = count(*) over(partition by dr)
    from (
        select 
            *, 
            dr = dense_rank() over(order by col1, col2, col3),
            dr_rn = row_number() over(partition by col1, col2, col3 order by other)
        from 
            t
    ) x
)

select * from tdr where total_dr_rows > 1

This is taking a row count for each distinct combination of col1, col2, and col3.

查看更多
步步皆殇っ
4楼-- · 2018-12-31 18:23

That's a really good question. I have read some useful answers here already, but probably I can add a more precise explanation.

Reducing the number of query results with a GROUP BY statement is easy as long as you don't query additional information. Let's assume you got the following table 'locations'.

--country-- --city--
 France      Lyon
 Poland      Krakow
 France      Paris
 France      Marseille
 Italy       Milano

Now the query

SELECT country FROM locations
GROUP BY country

will result in:

--country--
 France
 Poland
 Italy

However, the following query

SELECT country, city FROM locations
GROUP BY country

...throws an error in MS SQL, because how could your computer know which of the three French cities "Lyon", "Paris" or "Marseille" you want to read in the field to the right of "France"?

In order to correct the second query, you must add this information. One way to do this is to use the functions MAX() or MIN(), selecting the biggest or smallest value among all candidates. MAX() and MIN() are not only applicable to numeric values, but also compare the alphabetical order of string values.

SELECT country, MAX(city) FROM locations
GROUP BY country

will result in:

--country-- --city--
 France      Paris
 Poland      Krakow
 Italy       Milano

or:

SELECT country, MIN(city) FROM locations
GROUP BY country

will result in:

--country-- --city--
 France      Lyon
 Poland      Krakow
 Italy       Milano

These functions are a good solution as long as you are fine with selecting your value from the either ends of the alphabetical (or numeric) order. But what if this is not the case? Let us assume that you need a value with a certain characteristic, e.g. starting with the letter 'M'. Now things get complicated.

The only solution I could find so far is to put your whole query into a subquery, and to construct the additional column outside of it by hands:

SELECT
     countrylist.*,
     (SELECT TOP 1 city
     FROM locations
     WHERE
          country = countrylist.country
          AND city like 'M%'
     )
FROM
(SELECT country FROM locations
GROUP BY country) countrylist

will result in:

--country-- --city--
 France      Marseille
 Poland      NULL
 Italy       Milano
查看更多
登录 后发表回答