Python regex for number with or without decimals u

Posted 2020-06-13 08:50

I'm just learning regex and now I'm trying to match a number which more or less represents this:

[zero or more numbers][possibly a dot or comma][zero or more numbers]

No dot or comma is also okay. So it should match the following:

1
123
123.
123.4
123.456
.456
123,  # From here it's the same but with commas instead of dot separators
123,4
123,456
,456

But it should not match the following:

0.,1
0a,1
0..1
1.1.2
100,000.99  # I know this and the one below are valid in many languages, but I simply want to reject these
100.000,99

So far I've come up with [0-9]*[.,][0-9]*, but it doesn't seem to work so well:

>>> import re
>>> r = re.compile("[0-9]*[.,][0-9]*")
>>> if r.match('0.1.'): print('it matches!')
...
it matches!
>>> if r.match('0.abc'): print('it matches!')
...
it matches!

I have the feeling I'm doing two things wrong: I don't use match correctly AND my regex is not correct. Could anybody enlighten me on what I'm doing wrong? All tips are welcome!

Tags: python regex
8 answers
时光不老,我们不散
Reply #2 · 2020-06-13 09:50

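The problem is that re.match accepts a partial match: the pattern only has to match at the start of the string, it does not have to consume all of it.

One way around this is to end the regex with \Z (or $, which additionally matches just before a trailing newline).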

\Z Matches only at the end of the string.

and the other is to use re.fullmatch instead.

import re
help(re.match)
#>>> Help on function match in module re:
#>>>
#>>> match(pattern, string, flags=0)
#>>>     Try to apply the pattern at the start of the string, returning
#>>>     a match object, or None if no match was found.
#>>>

vs

import re
help(re.fullmatch)
#>>> Help on function fullmatch in module re:
#>>>
#>>> fullmatch(pattern, string, flags=0)
#>>>     Try to apply the pattern to all of the string, returning
#>>>     a match object, or None if no match was found.
#>>>

Note that re.fullmatch is new in Python 3.4.
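To make the difference concrete, here is a small sketch that runs the asker's original pattern through both functions. The variable name pattern is only illustrative, and the last two lines show the $ versus \Z difference on a string with a trailing newline.

```python
import re

# The asker's original pattern, unchanged: the separator is still mandatory here.
pattern = re.compile(r"[0-9]*[.,][0-9]*")

# match() only anchors at the start, so the trailing "." is left unconsumed.
print(pattern.match("0.1.").group())   # prints the matched prefix: 0.1

# fullmatch() (Python 3.4+) must consume the entire string, so this is None.
print(pattern.fullmatch("0.1."))

# $ also matches just before a trailing newline, whereas \Z does not.
print(bool(re.match(r"[0-9]+$", "123\n")))   # True
print(bool(re.match(r"[0-9]+\Z", "123\n")))  # False
```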

You should also make the [.,] part optional by appending a ? to it.

'?' Causes the resulting RE to match 0 or 1 repetitions of the preceding RE. ab? will match either ‘a’ or ‘ab’.

For example:

import re
r = re.compile(r"[0-9]*[.,]?[0-9]*\Z")  # raw string so \Z reaches the regex engine untouched

bool(r.match('0.1.'))
#>>> False

bool(r.match('0.abc'))
#>>> False

bool(r.match('0123'))
#>>> True
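As a sanity check, the pattern can be run against every example from the question. This is only a sketch: the list names and the strict variant at the end are my own additions, not part of the answer. One caveat it brings out is that every part of the pattern is optional, so the empty string and a lone separator are accepted as well.

```python
import re

# Pattern from the answer above, written as a raw string.
number = re.compile(r"[0-9]*[.,]?[0-9]*\Z")

# Examples copied from the question.
should_match = ["1", "123", "123.", "123.4", "123.456", ".456",
                "123,", "123,4", "123,456", ",456"]
should_reject = ["0.,1", "0a,1", "0..1", "1.1.2", "100,000.99", "100.000,99"]

assert all(number.match(s) for s in should_match)
assert not any(number.match(s) for s in should_reject)

# Caveat: nothing is mandatory, so "" and "." also pass.
print(bool(number.match("")), bool(number.match(".")))  # True True

# Hypothetical stricter variant (an assumption, not from the answer):
# require at least one digit on one side of the separator.
strict = re.compile(r"(?:[0-9]+[.,]?[0-9]*|[.,][0-9]+)")
print(strict.fullmatch(""), strict.fullmatch("."))      # None None
```

Whether the empty string should count as a number is a design choice the question leaves open; the strict variant is one way to rule it out if needed.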
聊天终结者
Reply #3 · 2020-06-13 09:50

A more generic method is as follows:

import re
r = re.compile(r"^\d\d*[,]?\d*[,]?\d*[.,]?\d*\d$")
print(bool(r.match('100,000.00')))

  1. This will match the following inputs:
    • 100
    • 1,000
    • 100.00
    • 1,000.00
    • 1,00,000
    • 1,00,000.00
  2. This will not match the following inputs:
    • .100
    • ..100
    • 100.100.00
    • ,100
    • 100,
    • 100.
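The two lists above can be checked mechanically. The sketch below uses illustrative names (grouped, accepted, rejected) and also probes two edge cases the lists do not mention: the pattern needs at least two digits, and it accepts 100,000.99, which the original question explicitly wanted rejected.

```python
import re

# Pattern from this answer, unchanged.
grouped = re.compile(r"^\d\d*[,]?\d*[,]?\d*[.,]?\d*\d$")

accepted = ["100", "1,000", "100.00", "1,000.00", "1,00,000", "1,00,000.00"]
rejected = [".100", "..100", "100.100.00", ",100", "100,", "100."]

assert all(grouped.match(s) for s in accepted)
assert not any(grouped.match(s) for s in rejected)

# The leading \d and the trailing \d are separate, so one digit is not enough.
print(bool(grouped.match("1")))           # False

# Grouped thousands are accepted, unlike what the question asked for.
print(bool(grouped.match("100,000.99")))  # True
```

So this pattern suits inputs with grouping commas, but it is not a drop-in answer to the question as posed, which also wants .456 and 123. to match.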