I know this has been answered to some degree with PHP and MYSQL, but I was wondering if someone could teach me the simplest approach to splitting a string (comma delimited) into multiple rows in Oracle 10g (preferably) and 11g.
The table is as follows:
Name | Project | Error
108 test Err1, Err2, Err3
109 test2 Err1
I want to create the following:
Name | Project | Error
108 Test Err1
108 Test Err2
108 Test Err3
109 Test2 Err1
I've seen a few potential solutions around stack, however they only accounted for a single column (being the comma delimited string). Any help would be greatly appreciated.
REGEXP_COUNT wasn't added until Oracle 11i. Here's an Oracle 10g solution, adopted from Art's solution.
Without using connect by or regexp:
A couple of more examples of the same:
Also, may use DBMS_UTILITY.comma_to_table & table_to_comma: http://www.oracle-base.com/articles/9i/useful-procedures-and-functions-9i.php#DBMS_UTILITY.comma_to_table
Here is an alternative implementation using XMLTABLE that allows for casting to different data types:
... or if your delimited strings are stored in one or more rows of a table:
The accepted answer has poor performance when using large datasets.
This may be an improved way (also with regexp and connect by):
EDIT: Here is a simple (as in, "not in depth") explanation of the query.
length (regexp_replace(t.error, '[^,]+')) + 1
usesregexp_replace
to erase anything that is not the delimiter (comma in this case) andlength +1
to get how many elements (errors) are there.The
select level from dual connect by level <= (...)
uses a hierarchical query to create a column with an increasing number of matches found, from 1 to the total number of errors.Preview:
table(cast(multiset(.....) as sys.OdciNumberList))
does some casting of oracle types.cast(multiset(.....)) as sys.OdciNumberList
transforms multiple collections (one collection for each row in the original data set) into a single collection of numbers, OdciNumberList.table()
function transforms a collection into a resultset.FROM
without a join creates a cross join between your dataset and the multiset. As a result, a row in the data set with 4 matches will repeat 4 times (with an increasing number in the column named "column_value").Preview:
trim(regexp_substr(t.error, '[^,]+', 1, levels.column_value))
uses thecolumn_value
as the nth_appearance/ocurrence parameter forregexp_substr
.t.name, t.project
as an example) for easy visualization.Some references to Oracle docs:
regular expressions is a wonderful thing :)