I have the following text to search:
#S7Z OK
#Wed Feb 18 07:16:26 GMT 2015
expiration=10.0
lastModified=1424192425832
length=466472
path=/name/_master_/563/5638ad54-8079-4399-ba2b-3257b6e6c7fd.pdf
userType=
The words proceeding each = are the names of properties. For each property name, I'd like to get the property value. That means I'm looking for a regular expression to use with regexp_substr to get the value of each known property.
Something like this:
SELECT REGEXP_SUBSTR(
'#S7Z OK
#Wed Feb 18 07:16:26 GMT 2015
expiration=10.0
lastModified=1424192425832
length=466472
path=/name/_master_/563/5638ad54-8079-4399-ba2b-3257b6e6c7fd.pdf
userType=',
'path=.+')
FROM dual
which returns:
path=/name/master/563/5638ad54-8079-4399-ba2b-3257b6e6c7fd.pdf
But I only want the value, that is "/name/master/563/5638ad54-8079-4399-ba2b-3257b6e6c7fd.pdf ".
It should also work for expiration, lastModified etc., that is, I don't just want to search for a url but any kind of value.
How can I achieve that in one regular expression?
SELECT REGEXP_SUBSTR(
'#S7Z OK
#Wed Feb 18 07:16:26 GMT 2015
expiration=10.0
lastModified=1424192425832
length=466472
path=/name/_master_/563/5638ad54-8079-4399-ba2b-3257b6e6c7fd.pdf
userType=',
'path=(.+)', 1, 1, null, 1)
FROM dual;
'. +=' as second argument
'' as third argument
Here is how you might capture all of the name=value
pairs all at once. Note that I use an explicit quantifier {1,10}
in the regular expression to prevent catastrophic backtracking. (This particular regex might not actually be subject to that, in which case you could replace the explicit quantifier with +
. But best not to take chances!)
WITH s1 AS (
SELECT '#S7Z OK
#Wed Feb 18 07:16:26 GMT 2015
expiration=10.0
lastModified=1424192425832
length=466472
path=/name/_master_/563/5638ad54-8079-4399-ba2b-3257b6e6c7fd.pdf
userType=' AS str
FROM dual
)
SELECT SUBSTR(name_value, 1, INSTR(name_value, '=') - 1) AS myname
, SUBSTR(name_value, INSTR(name_value, '=') + 1, LENGTH(name_value)) AS myvalue
FROM (
SELECT REGEXP_SUBSTR(REGEXP_SUBSTR(s1.str,'(\S+=\S*\s*){1,10}'), '\S+', 1, LEVEL) AS name_value
FROM s1
CONNECT BY REGEXP_SUBSTR(REGEXP_SUBSTR(s1.str,'(\S+=\S*\s*){1,10}'), '\S+', 1, LEVEL) IS NOT NULL
);
Output as follows:
MYNAME | MYVALUE
-------------------------------------------------------------------------
expiration | 10.0
lastModified | 1424192425832
length | 466472
path | /name/_master_/563/5638ad54-8079-4399-ba2b-3257b6e6c7fd.pdf
userType | (null)
Please see SQL Fiddle here.
Note that I could have used REGEXP_SUBSTR(name_value, '^[^=]+')
, etc., in the outer query but I figured there were enough regexes in this query (it's a bit expensive, I am sure SUBSTR()
plus INSTR()
is cheaper!). Also, note that if you're using Oracle 11g or higher, the CONNECT BY
clause can be replaced by the following:
CONNECT BY LEVEL <= REGEXP_COUNT(REGEXP_SUBSTR(s1.str,'(\S+=\S*\s*){1,10}'), '\S+')
See revised SQL Fiddle.