SQL to get distinct record for a combination of tw

2020-05-07 19:01发布

问题:

Consider the below Table structure and data

CREATE TABLE Distance(
      source VARCHAR(20),
      destination VARCHAR(20),
      distance INTEGER
      );

Select * from Distance;

source     destination  distance
=======    ===========  ======
Chennai    Mumbai       500
Mumbai     Chennai      500
Mumbai     Bangalore    500
Bangalore  Mumbai       500
Goa        Mumbai       100
Mumbai     Goa          100
Kolkata    Goa          1000

I need the output to have single record for 2 cities if repeating, i,e, any one record among the below 2 is fine.

Chennai    Mumbai       500
Mumbai     Chennai      500

Expected o/p:
source     destination  distance
=======    ===========  ======
Chennai    Mumbai       500
Mumbai     Bangalore    500
Goa        Mumbai       100
Kolkata    Goa          1000

回答1:

Here is one method using least() and greatest():

select least(source, destination), greatest(source, destination), max(distance)
from distance
group by least(source, destination), greatest(source, destination);

This has the disadvantage that you could return a row not in the table. For instance, if you had a single row with "Mumbai/Chennai/500", then this query would return "Chennai/Mumbai/500" -- and this row is not in the original table.

So, an alternative method is:

select source, destination, distance
from distance
where source < destination
union all
select destination, source, distance
from distance d
where source > destination and
      not exists (select 1
                  from distance d2
                  where d2.source = d.destination and d2.destination = d.source
                 );

This version is also ANSI-compatible and should work in all databases.



回答2:

SELECT DISTINCT LEAST(source,destination) a
              , GREATEST(source,destination) b
              , distance 
           FROM distance;


回答3:

If you need to preserve the order of columns you might use

SELECT *
FROM Distance t1
WHERE NOT EXISTS
 (
   SELECT * FROM Distance t2
   WHERE t1.destination = t2.source
     AND t1.source = t2.destination
     AND t1.destination > t2.destination
 );

When multiple rows per source/combination exist you must either add DISTINCT or a GROUP BY.