PowerCenter REG_EXTRACT issue

2019-09-07 01:58发布

问题:

I'm having an issue, converting REGEXP_SUBSTR from ORACLE to REG_EXTRACT in PWC (9.5.1).

In Oracle i have the statement below:

select 
regexp_substr('AA          12345678               *             123','[^' || CHR (9) || ']+', 1,1)  FIELD1,
regexp_substr('AA          12345678               *             123','[^' || CHR (9) || ']+', 1,2)  FIELD2,
regexp_substr('AA          12345678               *             123','[^' || CHR (9) || ']+', 1,3)  FIELD3,
regexp_substr('AA          12345678               *             123','[^' || CHR (9) || ']+', 1,4)  FIELD4
from    DUAL;

Result:

FIELD1=AA
FIELD2=12345678
FIELD3=*
FIELD4=123

In PWC i've created a transformation with 4 ports with a similar regular expression:

FIELD1=REG_EXTRACT('AA          12345678               *             123','([^\s]+)', 1,1)
FIELD2=REG_EXTRACT('AA          12345678               *             123','([^\s]+)', 1,2)
FIELD3=REG_EXTRACT('AA          12345678               *             123','([^\s]+)', 1,3)
FIELD4=REG_EXTRACT('AA          12345678               *             123','([^\s]+)', 1,4)

Result:

FIELD1=NULL
FIELD2=NULL
FIELD3=NULL
FIELD4=NULL

Why i'm getting NULL values in FIELD[1-4] ? Can anyone help me with this question ?

Thank You !

回答1:

The 3rd and 4th arguments of REGEXP_SUBSTR and REG_EXTRACT are different - Oracle's position (3rd argument) is PowerCenter's match_from_start (4th parameter), occurrence and subPatternNum are switched similarly.

Try to use the following expressions:

FIELD1=REG_EXTRACT('AA  12345678  *  123', '([^\s]+)\s+([^\s]+)\s+([^\s]+)\s+([^\s]+)', 1)
FIELD2=REG_EXTRACT('AA  12345678  *  123', '([^\s]+)\s+([^\s]+)\s+([^\s]+)\s+([^\s]+)', 2)
FIELD3=REG_EXTRACT('AA  12345678  *  123', '([^\s]+)\s+([^\s]+)\s+([^\s]+)\s+([^\s]+)', 3)
FIELD4=REG_EXTRACT('AA  12345678  *  123', '([^\s]+)\s+([^\s]+)\s+([^\s]+)\s+([^\s]+)', 4)

Here's a list of REG_EXTRACT arguments:

  1. subject - value you want to compare against the regular expression pattern
  2. pattern - regular expression pattern that you want to match; enclose each subpattern in parentheses
  3. subPatternNum - subpattern number of the regular expression you want to match
  4. match_from_start - returns the substring if a match is found from the start of the string; use the following guidelines to determine the match from start value:
    • 0 - matches pattern with subject string from the starting index or any index
    • non-zero - matches pattern with subject string from the starting index