Select all columns from rows distinct on one column

Posted 2019-09-13 04:59

Question:

I am using Netezza (based on PostgreSQL) and need to select all columns in a table for rows distinct on one column. A related question with an answer can be found here, but it doesn't handle the case with all columns. Going by that answer throws an error:

select distinct on (some_field) table1.* from table1 order by some_field;

Snippet from error with real data:

"(" (at char 77) expecting '')''

Answer 1:

I don't think your code should throw an error in Postgres. However, it won't do what you expect without an ORDER BY, because the row kept for each some_field value would otherwise be unpredictable:

select distinct on (some_field) table1.*
from table1
order by some_field;
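
If a specific row per group is wanted, the ORDER BY can be extended with a tie-breaker after the DISTINCT ON column. A minimal sketch for Postgres only (not Netezza), assuming a hypothetical column created_at that is not part of the original question:

```sql
-- Keeps one row per some_field: the one with the latest created_at.
-- created_at is an assumed column; substitute whatever decides which row wins.
-- The leading ORDER BY expression must match the DISTINCT ON expression.
SELECT DISTINCT ON (some_field) table1.*
FROM   table1
ORDER  BY some_field, created_at DESC;
```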


Answer 2:

The syntax of your query is correct for Postgres (like you declared at first). See:

  • Select first row in each GROUP BY group?

You later clarified you actually work with Netezza, which is only loosely related to Postgres. Wikipedia states:

Netezza is based on PostgreSQL 7.2,[8] but does not maintain compatibility.

Netezza does not seem to support DISTINCT ON (), only DISTINCT.

It supports row_number(), though. So this should work:

SELECT *
FROM  (
   SELECT *, row_number() OVER (PARTITION BY some_field) AS rn
   FROM   table1
   ) sub
WHERE  rn = 1;

The question remains: which row do you want from each set with identical some_field? If any row is good, you are done here. Otherwise, you need to add ORDER BY to the OVER clause.
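
As a rough sketch of that last case, here is the same query with an ORDER BY inside the OVER clause. The column created_at is hypothetical (it does not appear in the question) and stands in for whatever decides which row to keep:

```sql
SELECT *
FROM  (
   SELECT *, row_number() OVER (PARTITION BY some_field
                                ORDER BY created_at DESC) AS rn  -- created_at is an assumed column
   FROM   table1
   ) sub
WHERE  rn = 1;  -- rn = 1 is the first row of each some_field group in that order
```

Note that the outer SELECT * also returns the helper column rn. List the wanted columns explicitly in the outer SELECT if it should not appear in the result.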

Related:

  • SQL to get unique rows in Netezza DB