LISTAGG in Oracle to return distinct values

2019-01-02 15:34发布

I am trying to use the LISTAGG function in Oracle. I would like to get only the distinct values for that column. Is there a way in which I can get only the distinct values without creating a function or a procedure?

  col1  col2 Created_by
   1     2     Smith 
   1     2     John 
   1     3     Ajay 
   1     4     Ram 
   1     5     Jack 

I need to select col1 and the LISTAGG of col2 (column 3 is not considered). When I do that, I get something like this as the result of LISTAGG: [2,2,3,4,5]

I need to remove the duplicate '2' here; I need only the distinct values of col2 against col1.

23条回答
一个人的天荒地老
2楼-- · 2019-01-02 15:51

To get around the string length issue you can use XMLAGG which is similar to listagg but it returns a clob.

You can can then parse using regexp_replace and get the unique values and then turn it back into a string using dbms_lob.substr(). If you have a huge amount of distinct values you will still run out of space this way but for a lot of cases the code below should work.

You can also change the delimiters you use. In my case I wanted '-' instead of ',' but you should be able to replace the dashes in my code and use commas if you want that.

select col1,
    dbms_lob.substr(ltrim(REGEXP_REPLACE(REPLACE(
         REPLACE(
           XMLAGG(
             XMLELEMENT("A",col2)
               ORDER BY col2).getClobVal(),
             '<A>','-'),
             '</A>',''),'([^-]*)(-\1)+($|-)', 
           '\1\3'),'-'), 4000,1) as platform_mix
from table
查看更多
流年柔荑漫光年
3楼-- · 2019-01-02 15:52

I think this could help - CASE the columns value to NULL if it's duplicate - then it's not appended to LISTAGG string:

with test_data as 
(
      select 1 as col1, 2 as col2, 'Smith' as created_by from dual
union select 1, 2, 'John' from dual
union select 1, 3, 'Ajay' from dual
union select 1, 4, 'Ram' from dual
union select 1, 5, 'Jack' from dual
union select 2, 5, 'Smith' from dual
union select 2, 6, 'John' from dual
union select 2, 6, 'Ajay' from dual
union select 2, 6, 'Ram' from dual
union select 2, 7, 'Jack' from dual
)
SELECT col1  ,
      listagg(col2 , ',') within group (order by col2 ASC) AS orig_value,
      listagg(CASE WHEN rwn=1 THEN col2 END , ',') within group (order by col2 ASC) AS distinct_value
from 
    (
    select row_number() over (partition by col1,col2 order by 1) as rwn, 
           a.*
    from test_data a
    ) a
GROUP BY col1   

Results in:

COL1  ORIG         DISTINCT
1   2,2,3,4,5   2,3,4,5
2   5,6,6,6,7   5,6,7
查看更多
春风洒进眼中
4楼-- · 2019-01-02 15:52

Further refining @YoYo's correction to @a_horse_with_no_name's row_number() based approach using DECODE vs CASE (i saw here). I see that @Martin Vrbovsky also has this case approach answer.

select
  col1, 
  listagg(col2, ',') within group (order by col2) AS col2_list,
  listagg(col3, ',') within group (order by col3) AS col3_list,
  SUM(col4) AS col4
from (
  select
    col1, 
    decode(row_number() over (partition by col1, col2 order by null),1,col2) as col2,
    decode(row_number() over (partition by col1, col3 order by null),1,col3) as col3
  from foo
)
group by col1;
查看更多
千与千寻千般痛.
5楼-- · 2019-01-02 15:53

Upcoming Oracle 19c will support DISTINCT with LISTAGG.

LISTAGG with DISTINCT option:

This feature is coming with 19c:

SQL> select deptno, listagg (distinct sal,', ') within group (order by sal)  
  2  from scott.emp  
  3  group by deptno;  
查看更多
旧时光的记忆
6楼-- · 2019-01-02 15:56

What about creating a dedicated function that will make the "distinct" part :

create or replace function listagg_distinct (t in str_t, sep IN VARCHAR2 DEFAULT ',') 
  return VARCHAR2
as 
  l_rc VARCHAR2(4096) := '';
begin
  SELECT listagg(val, sep) WITHIN GROUP (ORDER BY 1)
    INTO l_rc
    FROM (SELECT DISTINCT column_value val FROM table(t));
  RETURN l_rc;
end;
/

And then use it to do the aggregation :

SELECT col1, listagg_distinct(cast(collect(col_2) as str_t ), ', ')
  FROM your_table
  GROUP BY col_1;
查看更多
唯独是你
7楼-- · 2019-01-02 15:59

listagg() ignores NULL values, so in a first step you could use the lag() function to analyse whether the previous record had the same value, if yes then NULL, else 'new value'.

WITH tab AS 
(           
          SELECT 1 as col1, 2 as col2, 'Smith' as created_by FROM dual
UNION ALL SELECT 1 as col1, 2 as col2, 'John'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 3 as col2, 'Ajay'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 4 as col2, 'Ram'   as created_by FROM dual
UNION ALL SELECT 1 as col1, 5 as col2, 'Jack'  as created_by FROM dual
)
SELECT col1
     , CASE 
       WHEN lag(col2) OVER (ORDER BY col2) = col2 THEN 
         NULL 
       ELSE 
         col2 
       END as col2_with_nulls
     , created_by
  FROM tab;

Results

      COL1 COL2_WITH_NULLS CREAT
---------- --------------- -----
         1               2 Smith
         1                 John
         1               3 Ajay
         1               4 Ram
         1               5 Jack

Note that the second 2 is replaced by NULL. Now you can wrap a SELECT with the listagg() around it.

WITH tab AS 
(           
          SELECT 1 as col1, 2 as col2, 'Smith' as created_by FROM dual
UNION ALL SELECT 1 as col1, 2 as col2, 'John'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 3 as col2, 'Ajay'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 4 as col2, 'Ram'   as created_by FROM dual
UNION ALL SELECT 1 as col1, 5 as col2, 'Jack'  as created_by FROM dual
)
SELECT listagg(col2_with_nulls, ',') WITHIN GROUP (ORDER BY col2_with_nulls) col2_list
  FROM ( SELECT col1
              , CASE WHEN lag(col2) OVER (ORDER BY col2) = col2 THEN NULL ELSE col2 END as col2_with_nulls
              , created_by
           FROM tab );

Result

COL2_LIST
---------
2,3,4,5

You can do this over multiple columns too.

WITH tab AS 
(           
          SELECT 1 as col1, 2 as col2, 'Smith' as created_by FROM dual
UNION ALL SELECT 1 as col1, 2 as col2, 'John'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 3 as col2, 'Ajay'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 4 as col2, 'Ram'   as created_by FROM dual
UNION ALL SELECT 1 as col1, 5 as col2, 'Jack'  as created_by FROM dual
)
SELECT listagg(col1_with_nulls, ',') WITHIN GROUP (ORDER BY col1_with_nulls) col1_list
     , listagg(col2_with_nulls, ',') WITHIN GROUP (ORDER BY col2_with_nulls) col2_list
     , listagg(created_by, ',')      WITHIN GROUP (ORDER BY created_by) created_by_list
  FROM ( SELECT CASE WHEN lag(col1) OVER (ORDER BY col1) = col1 THEN NULL ELSE col1 END as col1_with_nulls
              , CASE WHEN lag(col2) OVER (ORDER BY col2) = col2 THEN NULL ELSE col2 END as col2_with_nulls
              , created_by
           FROM tab );

Result

COL1_LIST COL2_LIST CREATED_BY_LIST
--------- --------- -------------------------
1         2,3,4,5   Ajay,Jack,John,Ram,Smith
查看更多
登录 后发表回答