Question:
I have
POW,POW,POWPRO,PRO,PRO,PROUTL,TNEUTL,TNEUTL,UTL,UTLTNE,UTL,UTLTNE
I want
POW,POWPRO,PRO,PROUTL,TNEUTL,UTL,UTLTNE
I tried
select regexp_replace('POW,POW,POWPRO,PRO,PRO,PROUTL,TNEUTL,TNEUTL,UTL,UTLTNE,UTL,UTLTNE','([^,]+)(,\1)+','\1') from dual
And I get the output
POWPROUTL,TNEUTL,UTLTNE,UTLTNE
But I want the output to be
POW,POWPRO,PRO,PROUTL,TNEUTL,UTL,UTLTNE
Please help.
Answer 1:
Here are two solutions that use only SQL, and a third that uses a small PL/SQL helper function, which makes the final SQL query very short.
Oracle Setup:
CREATE TABLE data ( value ) AS
  SELECT 'POW,POW,POWPRO,PRO,PRO,PROUTL,TNEUTL,TNEUTL,UTL,UTLTNE,UTL,UTLTNE' FROM DUAL;

CREATE TYPE stringlist AS TABLE OF VARCHAR2(4000);
/
Query 1:
SELECT LISTAGG( t.COLUMN_VALUE, ',' ) WITHIN GROUP ( ORDER BY t.COLUMN_VALUE ) AS list
FROM   data d,
       TABLE(
         SET(
           CAST(
             MULTISET(
               SELECT REGEXP_SUBSTR( d.value, '[^,]+', 1, LEVEL )
               FROM   DUAL
               CONNECT BY LEVEL <= REGEXP_COUNT( d.value, '[^,]+' )
             ) AS stringlist
           )
         )
       ) t
GROUP BY d.value;
Output:
LIST
---------------------------------------
POW,POWPRO,PRO,PROUTL,TNEUTL,UTL,UTLTNE
Query 2:
SELECT ( SELECT LISTAGG( COLUMN_VALUE, ',' ) WITHIN GROUP ( ORDER BY ROWNUM )
         FROM   TABLE( d.uniques ) ) AS list
FROM   (
  SELECT ( SELECT CAST(
                    COLLECT(
                      DISTINCT
                      REGEXP_SUBSTR( d.value, '[^,]+', 1, LEVEL )
                    )
                    AS stringlist
                  )
           FROM   DUAL
           CONNECT BY LEVEL <= REGEXP_COUNT( d.value, '[^,]+' )
         ) uniques
  FROM   data d
) d;
Output:
LIST
---------------------------------------
POW,POWPRO,PRO,PROUTL,TNEUTL,UTL,UTLTNE
Oracle Setup:
A small helper function:
CREATE FUNCTION split_String(
  i_str    IN  VARCHAR2,
  i_delim  IN  VARCHAR2 DEFAULT ','
) RETURN stringlist DETERMINISTIC
AS
  p_result       stringlist := stringlist();
  p_start        NUMBER(5) := 1;   -- start position of the current token
  p_end          NUMBER(5);        -- position of the next delimiter (0 if none)
  c_len CONSTANT NUMBER(5) := LENGTH( i_str );
  c_ld  CONSTANT NUMBER(5) := LENGTH( i_delim );
BEGIN
  IF c_len > 0 THEN
    p_end := INSTR( i_str, i_delim, p_start );
    -- Append one token for each delimiter found.
    WHILE p_end > 0 LOOP
      p_result.EXTEND;
      p_result( p_result.COUNT ) := SUBSTR( i_str, p_start, p_end - p_start );
      p_start := p_end + c_ld;
      p_end := INSTR( i_str, i_delim, p_start );
    END LOOP;
    -- Append the final token that follows the last delimiter.
    IF p_start <= c_len + 1 THEN
      p_result.EXTEND;
      p_result( p_result.COUNT ) := SUBSTR( i_str, p_start, c_len - p_start + 1 );
    END IF;
  END IF;
  RETURN p_result;
END;
/
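To see what the helper returns on its own, a quick check along these lines can be run once the function compiles. The input 'a,,b' is an illustrative value, not taken from the original post. The empty token between the two commas should come back as a NULL element, because Oracle treats a zero-length string as NULL.

```sql
-- Illustrative input only: three elements are returned, the middle one NULL.
SELECT COLUMN_VALUE AS token
FROM   TABLE( split_String( 'a,,b' ) );
```

LISTAGG skips NULL values, so empty tokens of this kind do not show up in the aggregated list.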
Query 3:
SELECT ( SELECT LISTAGG( COLUMN_VALUE, ',' ) WITHIN GROUP ( ORDER BY ROWNUM )
         FROM   TABLE( SET( split_String( d.value ) ) ) ) AS list
FROM   data d;
or (if you only want to pass a single value):
SELECT LISTAGG( COLUMN_VALUE, ',' ) WITHIN GROUP ( ORDER BY ROWNUM ) AS list
FROM   TABLE( SET( split_String(
         'POW,POW,POWPRO,PRO,PRO,PROUTL,TNEUTL,TNEUTL,UTL,UTLTNE,UTL,UTLTNE'
       ) ) );
Output:
LIST
---------------------------------------
POW,POWPRO,PRO,PROUTL,TNEUTL,UTL,UTLTNE
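As a side note on why the REGEXP_REPLACE attempt in the question merges tokens: the pattern ([^,]+)(,\1)+ is not tied to comma boundaries, so the backreference \1 can match just the leading part of a longer token. A minimal reproduction, using a two-token string chosen for illustration (it is not from the original post):

```sql
-- \1 captures 'POW'; ',\1' then matches the ',POW' at the start of ',POWPRO',
-- so 'POW,POW' is replaced by 'POW' and the separating comma is lost.
SELECT REGEXP_REPLACE( 'POW,POWPRO', '([^,]+)(,\1)+', '\1' ) AS result
FROM   DUAL;
-- result: POWPRO
```

Even a boundary-aware pattern would only collapse adjacent repeats, and the input in the question also repeats UTL and UTLTNE non-adjacently. That is why the queries in this answer split the string into tokens before removing duplicates.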
Answer 2:
The solution offered below uses straight SQL (no PL/SQL). It works with any possible input string, and it removes duplicates in place - it keeps the order of input tokens, whatever that order is. It also removes consecutive commas (it "deletes nulls" from the input string) while treating null inputs correctly. Notice the output for an input string consisting of commas only, and the correct treatment of "tokens" consisting of two spaces and one space respectively.
The query runs relatively slowly; if performance is an issue, it can be rewritten as a recursive query using "traditional" substr and instr, which are quite a bit faster than regular expressions.
with inputs (input_string) as (
    select 'POW,POW,POWPRO,PRO,PRO,PROUTL,TNEUTL,TNEUTL,UTL,UTLTNE,UTL,UTLTNE' from dual
    union all
    select null from dual
    union all
    select 'ab,ab,st,ab,st,  , ,  ,x,,,r' from dual
    union all
    select ',,,' from dual
),
tokens (input_string, rk, token) as (
    select input_string, level,
           regexp_substr(input_string, '([^,]+)', 1, level, null, 1)
    from   inputs
    connect by level <= 1 + regexp_count(input_string, ',')
),
distinct_tokens (input_string, rk, token) as (
    select input_string, min(rk) as rk, token
    from   tokens
    group by input_string, token
)
select input_string, listagg(token, ',') within group (order by rk) output_string
from   distinct_tokens
group by input_string
;
Results for the inputs I created:
INPUT_STRING                                                       OUTPUT_STRING
------------------------------------------------------------------ ----------------------------------------
,,,                                                                (null)
POW,POW,POWPRO,PRO,PRO,PROUTL,TNEUTL,TNEUTL,UTL,UTLTNE,UTL,UTLTNE  POW,POWPRO,PRO,PROUTL,TNEUTL,UTL,UTLTNE
ab,ab,st,ab,st,  , ,  ,x,,,r                                       ab,st,  , ,x,r
(null)                                                             (null)

4 rows selected.
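The recursive rewrite mentioned above could look roughly like the following. This is a sketch rather than a tuned implementation: the CTE name bounds and the columns start_pos and end_pos are made up for illustration, it assumes a single-character comma delimiter, and it needs recursive subquery factoring and LISTAGG (Oracle 11g Release 2 or later). It has not been timed against the regular-expression version.

```sql
WITH inputs ( input_string ) AS (
  SELECT 'POW,POW,POWPRO,PRO,PRO,PROUTL,TNEUTL,TNEUTL,UTL,UTLTNE,UTL,UTLTNE' FROM DUAL
),
-- bounds (hypothetical name): one row per token, holding its start position
-- and the position of the comma that ends it (0 for the last token).
bounds ( input_string, rk, start_pos, end_pos ) AS (
  SELECT input_string, 1, 1, INSTR( input_string, ',', 1 )
  FROM   inputs
  UNION ALL
  SELECT input_string, rk + 1, end_pos + 1, INSTR( input_string, ',', end_pos + 1 )
  FROM   bounds
  WHERE  end_pos > 0
),
-- Cut each token out with SUBSTR; empty tokens come out as NULL.
tokens ( input_string, rk, token ) AS (
  SELECT input_string, rk,
         CASE WHEN end_pos > 0
              THEN SUBSTR( input_string, start_pos, end_pos - start_pos )
              ELSE SUBSTR( input_string, start_pos )
         END
  FROM   bounds
),
-- Keep the first position at which each distinct token appears.
distinct_tokens ( input_string, rk, token ) AS (
  SELECT input_string, MIN( rk ), token
  FROM   tokens
  GROUP BY input_string, token
)
SELECT input_string,
       LISTAGG( token, ',' ) WITHIN GROUP ( ORDER BY rk ) AS output_string
FROM   distinct_tokens
GROUP BY input_string;
```

As in the query above, ordering by the smallest rk of each token keeps the input order, and LISTAGG drops the NULL tokens produced by consecutive commas.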