Postgresql regexp_replace

2019-07-15 23:49发布

问题:

How can I replace any year from 1990-2050 with a space ?

I can replace any 4 digit number as follows

select regexp_replace('sdfg 2000', '(\y(\d{4})\y)', '', 'g');

But how additionally I can check the range? Any help are welcome

回答1:

I have discovered an alternative way to solve your problem. Take a look.

You wish to replace year from 1990-2050. Let's break that range into

  1. 1990-1999
  2. 2000-2049
  3. 2050

All three ranges can be matched by following regex.

Regex: [1][9][9][0-9]|[2][0][0-4][0-9]|2050

Explanation:

  • [1][9][9][0-9] will match years from 1990 to 1999.

  • [2][0][0-4][0-9] will match years from 2000 to 2049.

  • 2050 will match 2050 literally

  • | means alteration. It will check either of these three patterns.

Regex101 Demo



回答2:

You can use a CASE expression to extract and test for the year and only replace if the year falls into the range you want:

with test_data (col1) as (
  values ('sdfg 2000'), ('foo 1983'), ('bar 2010'), ('bla 1940')
)
select col1, 
       case 
         when nullif(regexp_replace(col1, '[^0-9]+',''),'')::int between 1990 and 2050 
              then regexp_replace(col1, '\d{4}', '', 'g')
         else col1 
       end as replaced
from test_data;

Results in:

col1      | replaced
----------+---------
sdfg 2000 | sdfg    
foo 1983  | foo 1983
bar 2010  | bar     
bla 1940  | bla 1940

The nullif(..) is necessary for values that do not contain any numbers. If you don't have values like that, you can leave it out.



回答3:

You can't, from Wikipedia (emphasis mine):

Each character in a regular expression (that is, each character in the string describing its pattern) is understood to be: a metacharacter (with its special meaning), or a regular character (with its literal meaning).

In your case, the letters do not have literal meaning, their meaning depends on characters around it.