I am trying to use the LISTAGG
function in Oracle. I would like to get only the distinct values for that column. Is there a way in which I can get only the distinct values without creating a function or a procedure?
col1 col2 Created_by 1 2 Smith 1 2 John 1 3 Ajay 1 4 Ram 1 5 Jack
I need to select col1 and the LISTAGG
of col2 (column 3 is not considered). When I do that, I get something like this as the result of LISTAGG
: [2,2,3,4,5]
I need to remove the duplicate '2' here; I need only the distinct values of col2 against col1.
I wrote a function to handle this using regular expressions. The in parameters are: 1) the listagg call itself 2) A repeat of the delimiter
Now you don't have to repeat the regular expression every time you do this, simply say:
select col1, listaggr(col2,',') within group(Order by col2) from table group by col1
meaning aggregate the strings (col2) into list keeping the order n then afterwards deal with the duplicates as group by col1 meaning merge col1 duplicates in 1 group. perhaps this looks clean and simple as it should be and if in case you want col3 as well just you need to add one more listagg() that isselect col1, listaggr(col2,',') within group(Order by col2),listaggr(col3,',') within group(order by col3) from table group by col1
you can use undocumented
wm_concat
function.this function returns clob column, if you want you can use
dbms_lob.substr
to convert clob to varchar2.Use listagg_clob function created like this:
Using
SELECT DISTINCT ...
as part of a Subquery before calling LISTAGG is probably the best way for simple queries, as noted by @a_horse_with_no_nameHowever, in more complex queries, it might not be possible, or easy, to accomplish this. I had this come up in a scenario that was using top-n approach using an analytic function.
So I found the
COLLECT
aggregate function. It is documented to have theUNIQUE
orDISTINCT
modifier available. Only in 10g, it quietly fails (it ignores the modifier without error). However, to overcome this, from another answer, I came to this solution:Basically, by using
SET
, I remove the duplicates in my collection.You would still need to define the
tab_typ
as a basic collection type, and in the case of aVARCHAR
, this would be for example:Also as a correction to the answer from @a_horse_with_no_name on the multi column situation, where you might want to aggregate still on a third (or more) columns:
If you would leave the
rn = 1
as a where condition to the query, you would aggregate other columns incorrectly.