Regular Expression for REGEXP_SUBSTR in Oracle

2019-05-28 09:59发布

I have the following text to search:

#S7Z OK
#Wed Feb 18 07:16:26 GMT 2015
expiration=10.0
lastModified=1424192425832
length=466472
path=/name/_master_/563/5638ad54-8079-4399-ba2b-3257b6e6c7fd.pdf 
userType=

The words proceeding each = are the names of properties. For each property name, I'd like to get the property value. That means I'm looking for a regular expression to use with regexp_substr to get the value of each known property.

Something like this:

SELECT REGEXP_SUBSTR(
'#S7Z OK
#Wed Feb 18 07:16:26 GMT 2015
expiration=10.0
lastModified=1424192425832
length=466472
path=/name/_master_/563/5638ad54-8079-4399-ba2b-3257b6e6c7fd.pdf 
userType=',
'path=.+')
FROM dual

which returns: path=/name/master/563/5638ad54-8079-4399-ba2b-3257b6e6c7fd.pdf

But I only want the value, that is "/name/master/563/5638ad54-8079-4399-ba2b-3257b6e6c7fd.pdf ". It should also work for expiration, lastModified etc., that is, I don't just want to search for a url but any kind of value.

How can I achieve that in one regular expression?

3条回答
别忘想泡老子
2楼-- · 2019-05-28 10:25
SELECT REGEXP_SUBSTR(
'#S7Z OK
#Wed Feb 18 07:16:26 GMT 2015
expiration=10.0
lastModified=1424192425832
length=466472
path=/name/_master_/563/5638ad54-8079-4399-ba2b-3257b6e6c7fd.pdf 
userType=',
'path=(.+)', 1, 1, null, 1)
FROM dual;
查看更多
Fickle 薄情
3楼-- · 2019-05-28 10:28

'. +=' as second argument '' as third argument

查看更多
够拽才男人
4楼-- · 2019-05-28 10:29

Here is how you might capture all of the name=value pairs all at once. Note that I use an explicit quantifier {1,10} in the regular expression to prevent catastrophic backtracking. (This particular regex might not actually be subject to that, in which case you could replace the explicit quantifier with +. But best not to take chances!)

WITH s1 AS (
    SELECT '#S7Z OK
#Wed Feb 18 07:16:26 GMT 2015
expiration=10.0
lastModified=1424192425832
length=466472
path=/name/_master_/563/5638ad54-8079-4399-ba2b-3257b6e6c7fd.pdf 
userType=' AS str
      FROM dual
)
SELECT SUBSTR(name_value, 1, INSTR(name_value, '=') - 1) AS myname
     , SUBSTR(name_value, INSTR(name_value, '=') + 1, LENGTH(name_value)) AS myvalue
  FROM (
    SELECT REGEXP_SUBSTR(REGEXP_SUBSTR(s1.str,'(\S+=\S*\s*){1,10}'), '\S+', 1, LEVEL) AS name_value
      FROM s1
   CONNECT BY REGEXP_SUBSTR(REGEXP_SUBSTR(s1.str,'(\S+=\S*\s*){1,10}'), '\S+', 1, LEVEL) IS NOT NULL
);

Output as follows:

MYNAME       | MYVALUE
-------------------------------------------------------------------------
expiration   | 10.0
lastModified | 1424192425832
length       | 466472
path         | /name/_master_/563/5638ad54-8079-4399-ba2b-3257b6e6c7fd.pdf
userType     | (null)

Please see SQL Fiddle here.

Note that I could have used REGEXP_SUBSTR(name_value, '^[^=]+'), etc., in the outer query but I figured there were enough regexes in this query (it's a bit expensive, I am sure SUBSTR() plus INSTR() is cheaper!). Also, note that if you're using Oracle 11g or higher, the CONNECT BY clause can be replaced by the following:

CONNECT BY LEVEL <= REGEXP_COUNT(REGEXP_SUBSTR(s1.str,'(\S+=\S*\s*){1,10}'), '\S+')

See revised SQL Fiddle.

查看更多
登录 后发表回答