Question:
This question already has an answer here: Extract float/double value (4 answers)
I have a number of strings similar to "Current Level: 13.4 db." and I would like to extract just the floating point number. I say floating and not decimal as it's sometimes whole. Can RegEx do this, or is there a better way?
Answer 1:
If your float is always expressed in decimal notation, something like
>>> import re
>>> re.findall(r"\d+\.\d+", "Current Level: 13.4 db.")
['13.4']
may suffice.
A more robust version would be:
>>> re.findall(r"[-+]?\d*\.\d+|\d+", "Current Level: -13.2 db or 14.2 or 3")
['-13.2', '14.2', '3']
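To get numeric values rather than strings, the matches from the more robust pattern can be passed to float(). This is only a minimal sketch; the helper name extract_floats is hypothetical and not part of the answer:

```python
import re

# Pattern from the answer above: a decimal with an optional sign, or a bare integer.
FLOAT_RE = re.compile(r"[-+]?\d*\.\d+|\d+")

def extract_floats(text):
    """Return every number found in text, converted to float."""
    return [float(match) for match in FLOAT_RE.findall(text)]

print(extract_floats("Current Level: -13.2 db or 14.2 or 3"))  # [-13.2, 14.2, 3.0]
```

Because the pattern has no capturing groups, findall() returns the whole match for each hit, which is what float() needs.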
If you want to validate user input, you could alternatively check for a float by attempting the conversion directly:
user_input = "Current Level: 1e100 db"
for token in user_input.split():
    try:
        # if this succeeds, you have your (first) float
        print(float(token), "is a float")
    except ValueError:
        print(token, "is something else")

# => Would print ...
#
# Current is something else
# Level: is something else
# 1e+100 is a float
# db is something else
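The same idea can be wrapped in a small function that stops at the first token that converts. This is a sketch under the answer's own assumption that the number is separated from its neighbours by whitespace; the name first_float is made up for illustration:

```python
def first_float(text):
    """Return the first whitespace-separated token that parses as a float, else None."""
    for token in text.split():
        try:
            return float(token)
        except ValueError:
            continue  # not a number, try the next token
    return None

print(first_float("Current Level: 1e100 db"))  # 1e+100
print(first_float("no numbers here"))          # None
```

Keep in mind that float() also accepts tokens such as "nan", "inf" and "infinity", so those words would count as numbers here, while a token like "13.4db" would not.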
Answer 2:
You may like to try something like this, which covers all the bases, including not relying on whitespace after the number:
>>> import re
>>> numeric_const_pattern = r"""
...     [-+]? # optional sign
...     (?:
...         (?: \d* \. \d+ ) # .1 .12 .123 etc 9.1 etc 98.1 etc
...         |
...         (?: \d+ \.? ) # 1. 12. 123. etc 1 12 123 etc
...     )
...     # followed by optional exponent part if desired
...     (?: [Ee] [+-]? \d+ ) ?
...     """
>>> rx = re.compile(numeric_const_pattern, re.VERBOSE)
>>> rx.findall(".1 .12 9.1 98.1 1. 12. 1 12")
['.1', '.12', '9.1', '98.1', '1.', '12.', '1', '12']
>>> rx.findall("-1 +1 2e9 +2E+09 -2e-9")
['-1', '+1', '2e9', '+2E+09', '-2e-9']
>>> rx.findall("current level: -2.03e+99db")
['-2.03e+99']
For easy copy-pasting:
numeric_const_pattern = r'[-+]? (?: (?: \d* \. \d+ ) | (?: \d+ \.? ) )(?: [Ee] [+-]? \d+ ) ?'
rx = re.compile(numeric_const_pattern, re.VERBOSE)
rx.findall("Some example: Jr. it. was .23 between 2.3 and 42.31 seconds")
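If you also need to know where each number sits in the string, finditer() can be used instead of findall(). The following is a rough sketch built on the same pattern; the function name find_numbers and the tuple layout are assumptions for illustration only:

```python
import re

# Same pattern as above, as a raw string so the backslashes reach the regex engine intact.
NUMERIC_RE = re.compile(
    r"[-+]? (?: (?: \d* \. \d+ ) | (?: \d+ \.? ) ) (?: [Ee] [+-]? \d+ )?",
    re.VERBOSE,
)

def find_numbers(text):
    """Yield (value, start, end) for each numeric constant found in text."""
    for match in NUMERIC_RE.finditer(text):
        yield float(match.group()), match.start(), match.end()

text = "current level: -2.03e+99db"
for value, start, end in find_numbers(text):
    # text[start:end] is the exact substring that was matched
    print(value, repr(text[start:end]))
```

Every form the pattern accepts (".1", "1.", "2e9" and so on) is also accepted by float(), so the conversion should not raise for any match.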
Answer 3:
The Python docs have an answer that covers +/- and exponent notation:

scanf() Token     Regular Expression
%e, %E, %f, %g    [-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?
%i                [-+]?(0[xX][\dA-Fa-f]+|0[0-7]*|\d+)

This regular expression does not support international formats where a comma is used as the separator character between the whole and fractional part (3,14159). In that case, replace all \. with [.,] in the above float regex:

                       Regular Expression
International float    [-+]?(\d+([.,]\d*)?|[.,]\d+)([eE][-+]?\d+)?
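One thing to watch when using these patterns with re.findall(): they contain capturing groups, so findall() returns tuples of groups rather than the whole match. A possible way around this, sketched here with the groups rewritten as non-capturing (the helper name parse_intl_floats is hypothetical), is:

```python
import re

# International float regex from the table above, with the groups made
# non-capturing so that findall() returns whole matches instead of tuples.
INTL_FLOAT_RE = re.compile(r"[-+]?(?:\d+(?:[.,]\d*)?|[.,]\d+)(?:[eE][-+]?\d+)?")

def parse_intl_floats(text):
    """Find numbers that use '.' or ',' as the decimal separator."""
    # float() only understands '.', so normalise the comma before converting.
    return [float(m.replace(",", ".")) for m in INTL_FLOAT_RE.findall(text)]

print(parse_intl_floats("pi is 3,14159 and e is 2.71828"))  # [3.14159, 2.71828]
```

Treating the comma as a decimal separator means comma-separated lists such as "1,2,3" and thousands groupings such as "1,000" will be misread, so this only suits input where the convention is known.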
Answer 4:
re.findall(r"[-+]?\d*\.\d+|\d+", "Current Level: -13.2 db or 14.2 or 3")
as described above, works really well!
One suggestion though:
re.findall(r"[-+]?\d*\.\d+|[-+]?\d+", "Current Level: -13.2 db or 14.2 or 3 or -3")
will also return negative int values (like the -3 at the end of this string).
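Since both alternatives now start with the same optional sign, the sign can be factored out in front of a non-capturing group. This is just a sketch of an equivalent spelling, not something the answer itself proposes:

```python
import re

# Optional sign, then either a decimal (digits optional before the dot)
# or a plain integer. The group is non-capturing so findall() returns
# whole matches.
SIGNED_NUMBER_RE = re.compile(r"[-+]?(?:\d*\.\d+|\d+)")

print(SIGNED_NUMBER_RE.findall("Current Level: -13.2 db or 14.2 or 3 or -3"))
# ['-13.2', '14.2', '3', '-3']
```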
Answer 5:
You can use the following regex to get integer and floating values from a string:
>>> re.findall(r'[\d\.\d]+', 'hello -34 42 +34.478m 88 cricket -44.3')
['34', '42', '34.478', '88', '44.3']
Note that the signs are not captured.
Thanks
Rex
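Be aware that [\d\.\d] is a character class, so it is the same as [\d.] and matches any run of digits and dots, including things that are not numbers. A small sketch of the effect, with a hypothetical helper that discards the runs float() rejects:

```python
import re

def numbers_from_char_class(text):
    """Collect runs of digits and dots, keeping only those float() accepts."""
    values = []
    for candidate in re.findall(r"[\d.]+", text):
        try:
            values.append(float(candidate))
        except ValueError:
            pass  # e.g. '1.2.3' or a lone '.' is not a number
    return values

sample = "version 1.2.3 weighs 88 kg."
print(re.findall(r"[\d.]+", sample))    # ['1.2.3', '88', '.']
print(numbers_from_char_class(sample))  # [88.0]
```

A number directly followed by a sentence-ending period, such as "4.5.", would be captured together with the period and then rejected, so this filter is a stopgap rather than a fix.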
Answer 6:
I think you'll find interesting material in the following answer that I wrote for a previous similar question:
https://stackoverflow.com/q/5929469/551449
In that answer, I proposed a pattern that allows a regex to catch any kind of number, and since I have nothing else to add to it, I think it is fairly complete.
Answer 7:
Another approach that may be more readable is simple type conversion. I've added a replacement function to cover instances where people may enter European decimals:
>>> for possibility in "Current Level: -13.2 db or 14,2 or 3".split():
...     try:
...         str(float(possibility.replace(',', '.')))
...     except ValueError:
...         pass
...
'-13.2'
'14.2'
'3.0'
This has disadvantages too, however. If someone types in "1,000", this will be converted to 1. Also, it assumes that people will be inputting with whitespace between words. This is not the case with other languages, such as Chinese.