Extracting number preceding a particular text usin

Posted 2019-07-20 00:52

I'm looking for a regex to extract two numbers from the same text (they can be extracted independently; there is no need to get both in one go).

I'm using Yahoo Pipes.

Source Text: S$ 5,200 / month Negotiable, 1,475 sqft / 137 sqm (built-in) - Apartment, 10 Anson Road (D02)

I need to extract 1,475 as a number, and also 137 as a number (the second can be extracted in a separate step).

I got the following pattern from someone quite helpful on a different forum:

\b(\d+(,\d+)*)\s+(sqft|sqm)

But when I use it with a replacement of $1, it brings back the whole source text instead of just the number I want (i.e. 1,475 or 137, depending on whether I run \b(\d+(,\d+))\s+(sqft) or \b(\d+(,\d+))\s+(sqm)).

What am I doing wrong?

4 answers
劫难
#2 · 2019-07-20 01:34

Well, you could do this by iterating through the matches and reading the captured groups from each one.

But if you want to use the replace method then this could work:

^.*?(?<sqft>\d+(,\d+)*)\s?sqft.*?(?<sqm>\d+(,\d+)*)\s?sqm.*$

And then replace with:

${sqft}
${sqm}

Here it is in action.

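This works with or without a comma in the sqft or sqm numbers. The .*? at the beginning and in the middle, together with the .* at the end, force the pattern to match the entire string, so the replacement eliminates everything except what you're after. A replace only substitutes the part of the text that the pattern matched, which is why a pattern covering just the number and the unit leaves the rest of the source text in place.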
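To illustrate the difference, here is a minimal Python sketch rather than Yahoo Pipes itself. It assumes Python's re module, where named groups are written (?P<name>...) and referenced in the replacement as \g<name> instead of the (?<name>...) and ${name} forms above. The variable names are only for illustration.

```python
import re

text = ("S$ 5,200 / month Negotiable, 1,475 sqft / 137 sqm (built-in)"
        " - Apartment, 10 Anson Road (D02)")

# Partial match: only "1,475 sqft" is replaced by "1,475",
# so everything else in the source text is left untouched.
partial = re.sub(r"\b(\d+(,\d+)*)\s+sqft", r"\1", text)

# Whole-string match: the pattern consumes the entire text,
# so the replacement keeps nothing but the chosen named group.
full = re.compile(
    r"^.*?(?P<sqft>\d+(,\d+)*)\s?sqft.*?(?P<sqm>\d+(,\d+)*)\s?sqm.*$"
)
sqft = full.sub(r"\g<sqft>", text)
sqm = full.sub(r"\g<sqm>", text)

print(partial)  # nearly the whole source text, minus the word "sqft"
print(sqft)     # 1,475
print(sqm)      # 137
```

Whether Yahoo Pipes accepts named groups is not confirmed here; numbered groups with the same whole-string pattern should behave the same way in engines that lack them.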

ゆ 、 Hurt°
#3 · 2019-07-20 01:40

Since you didn't specify a language, here is some Python:

import re

s = "$ 5,200 / month Negotiable, 1,475 sqft / 137 sqm (built-in) - Apartment, 10 Anson Road (D02)"
print(re.search(r'\b([0-9.,]+) ?sqft ?/ ?([0-9.,]+) ?sqm', s).groups())
# prints ('1,475', '137')

This searches for one or more digits, commas, or periods after a word boundary, followed by an optional space and the word 'sqft', then an optional space, a slash, and another optional space, followed by one or more digits, commas, or periods, an optional space, and the word 'sqm'.

This should allow your formatting to be pretty loose (optional spaces, thousands and decimal separators).
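Since the question asks for the values as numbers, the captured strings still need converting. A minimal sketch follows, assuming that "," is always a thousands separator and "." a decimal point; to_number is a hypothetical helper, not part of the re module.

```python
import re

s = ("S$ 5,200 / month Negotiable, 1,475 sqft / 137 sqm (built-in)"
     " - Apartment, 10 Anson Road (D02)")

match = re.search(r"\b([0-9.,]+) ?sqft ?/ ?([0-9.,]+) ?sqm", s)


def to_number(raw):
    # Assumption: commas are thousands separators, periods are decimal points.
    return float(raw.replace(",", ""))


sqft, sqm = (to_number(group) for group in match.groups())
print(sqft, sqm)  # 1475.0 137.0
```

Input that uses the opposite convention (for example "1.475,5") would need a different conversion, and re.search returns None when nothing matches, so real code should check for that before calling groups().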

仙女界的扛把子
#4 · 2019-07-20 01:54

You may wish to consider the situations discussed in this answer in crafting a regex for numbers.

狗以群分
#5 · 2019-07-20 01:57

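In Perl, I would write something like:

# The binding operator is =~ ; $1 holds the text captured by the first group.
if ($line =~ m/\b([0-9.,]+) sqft/)
{
  $sqft = $1;
}
else
{
  $sqft = undef;
}

# Same check for the square-metre figure.
if ($line =~ m/\b([0-9.,]+) sqm/)
{
  $sqm = $1;
}
else
{
  $sqm = undef;
}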
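
For comparison, the same independent, one-unit-at-a-time extraction can be sketched in Python. The function name area_before is hypothetical, and None plays the role of undef when the unit is not found.

```python
import re


def area_before(unit, line):
    """Return the number text directly before `unit`, or None if absent."""
    # re.escape guards against units containing regex metacharacters.
    m = re.search(r"\b([0-9.,]+) " + re.escape(unit), line)
    return m.group(1) if m else None


line = ("S$ 5,200 / month Negotiable, 1,475 sqft / 137 sqm (built-in)"
        " - Apartment, 10 Anson Road (D02)")

sqft = area_before("sqft", line)      # '1,475'
sqm = area_before("sqm", line)        # '137'
missing = area_before("acres", line)  # None, because no such unit appears
```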