SQL to get unique rows in Netezza DB

2019-05-11 15:01发布

问题:

I have a table with rows like:

id    group_name_code  
1     999  
2     16  
3     789  
4     999  
5     231  
6     999  
7     349  
8     16  
9     819  
10     999  
11     654  

But I want output rows like this:

id    group_name_code
1     999  
2     16  
3     789  
4     231  
5     349  
6     819  
7     654  

Will this query help?

select id, distinct(group_name_code) from group_table;

回答1:

You seem to want:

Distinct values for group_name_code and a sequential id ordered by minimum id per set of group_name_code.

Netezza has the DISTINCT key word, but not DISTINCT ON () (Postgres feature):
https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.dbu.doc/r_dbuser_select.html

You could:

SELECT DISTINCT group_name_code FROM group_table;

No parentheses, the DISTINCT key word does not require parentheses.
But you would not get the sequential id you show with this.

There are "analytic functions" a.k.a. window functions:
https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.dbu.doc/c_dbuser_overview_analytic_funcs.html

And there is also row_number():
https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.dbu.doc/r_dbuser_functions.html

So this should work:

SELECT row_number() OVER (ORDER BY min(id)) AS new_id, group_name_code
FROM   group_table
GROUP  BY group_name_code
ORDER  BY min(id);

Or use a subquery if Netezza should not allow to nest aggregate and window functions:

SELECT row_number() OVER (ORDER BY id) AS new_id, group_name_code
FROM  (
   SELECT min(id) AS id, group_name_code
   FROM   group_table
   GROUP  BY group_name_code
   ) sub
ORDER  BY id;


回答2:

If you do not mind losing data on id you can use an aggregate function on that column and group by group_name_code:

select min(id) as id, group_name_code 
from group_table
group by group_name_code
order by id; 

This way you pull unique values for group_name_code and the lowest id for each code.

If you don't need id in your output (it seems like this doesn't correspond to input table) and just want the unique codes, try this:

select group_name_code 
from p
group by group_name_code
order by id; 

This gets the codes you want. If you want id to be the rownumber that will depend on which RDBMS you are using



回答3:

you can get that result using CTE, replace #t with you table name and value with group_name_code

; WITH tbl AS ( SELECT DISTINCT value FROM #t )

SELECT ROW_NUMBER() OVER (ORDER BY value) AS id,* FROM tbl