How do I check if a string is a number (float)?

2018-12-31 01:31发布

What is the best possible way to check if a string can be represented as a number in Python?

The function I currently have right now is:

def is_number(s):
    try:
        float(s)
        return True
    except ValueError:
        return False

Which, not only is ugly and slow, seems clunky. However I haven't found a better method because calling float in the main function is even worse.

30条回答
初与友歌
2楼-- · 2018-12-31 01:54

I wanted to see which method is fastest. Overall the best and most consistent results were given by the check_replace function. The fastest results were given by the check_exception function, but only if there was no exception fired - meaning its code is the most efficient, but the overhead of throwing an exception is quite large.

Please note that checking for a successful cast is the only method which is accurate, for example, this works with check_exception but the other two test functions will return False for a valid float:

huge_number = float('1e+100')

Here is the benchmark code:

import time, re, random, string

ITERATIONS = 10000000

class Timer:    
    def __enter__(self):
        self.start = time.clock()
        return self
    def __exit__(self, *args):
        self.end = time.clock()
        self.interval = self.end - self.start

def check_regexp(x):
    return re.compile("^\d*\.?\d*$").match(x) is not None

def check_replace(x):
    return x.replace('.','',1).isdigit()

def check_exception(s):
    try:
        float(s)
        return True
    except ValueError:
        return False

to_check = [check_regexp, check_replace, check_exception]

print('preparing data...')
good_numbers = [
    str(random.random() / random.random()) 
    for x in range(ITERATIONS)]

bad_numbers = ['.' + x for x in good_numbers]

strings = [
    ''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(random.randint(1,10)))
    for x in range(ITERATIONS)]

print('running test...')
for func in to_check:
    with Timer() as t:
        for x in good_numbers:
            res = func(x)
    print('%s with good floats: %s' % (func.__name__, t.interval))
    with Timer() as t:
        for x in bad_numbers:
            res = func(x)
    print('%s with bad floats: %s' % (func.__name__, t.interval))
    with Timer() as t:
        for x in strings:
            res = func(x)
    print('%s with strings: %s' % (func.__name__, t.interval))

Here are the results with Python 2.7.10 on a 2017 MacBook Pro 13:

check_regexp with good floats: 12.688639
check_regexp with bad floats: 11.624862
check_regexp with strings: 11.349414
check_replace with good floats: 4.419841
check_replace with bad floats: 4.294909
check_replace with strings: 4.086358
check_exception with good floats: 3.276668
check_exception with bad floats: 13.843092
check_exception with strings: 15.786169

Here are the results with Python 3.6.5 on a 2017 MacBook Pro 13:

check_regexp with good floats: 13.472906000000009
check_regexp with bad floats: 12.977665000000016
check_regexp with strings: 12.417542999999995
check_replace with good floats: 6.011045999999993
check_replace with bad floats: 4.849356
check_replace with strings: 4.282754000000011
check_exception with good floats: 6.039081999999979
check_exception with bad floats: 9.322753000000006
check_exception with strings: 9.952595000000002

Here are the results with PyPy 2.7.13 on a 2017 MacBook Pro 13:

check_regexp with good floats: 2.693217
check_regexp with bad floats: 2.744819
check_regexp with strings: 2.532414
check_replace with good floats: 0.604367
check_replace with bad floats: 0.538169
check_replace with strings: 0.598664
check_exception with good floats: 1.944103
check_exception with bad floats: 2.449182
check_exception with strings: 2.200056
查看更多
妖精总统
3楼-- · 2018-12-31 01:54

So to put it all together, checking for Nan, infinity and complex numbers (it would seem they are specified with j, not i, i.e. 1+2j) it results in:

def is_number(s):
    try:
        n=str(float(s))
        if n == "nan" or n=="inf" or n=="-inf" : return False
    except ValueError:
        try:
            complex(s) # for complex
        except ValueError:
            return False
    return True
查看更多
后来的你喜欢了谁
4楼-- · 2018-12-31 01:55

You can use Unicode strings, they have a method to do just what you want:

>>> s = u"345"
>>> s.isnumeric()
True

Or:

>>> s = "345"
>>> u = unicode(s)
>>> u.isnumeric()
True

http://www.tutorialspoint.com/python/string_isnumeric.htm

http://docs.python.org/2/howto/unicode.html

查看更多
低头抚发
5楼-- · 2018-12-31 01:56

Which, not only is ugly and slow

I'd dispute both.

A regex or other string parsing would be uglier and slower.

I'm not sure that anything much could be faster than the above. It calls the function and returns. Try/Catch doesn't introduce much overhead because the most common exception is caught without an extensive search of stack frames.

The issue is that any numeric conversion function has two kinds of results

  • A number, if the number is valid
  • A status code (e.g., via errno) or exception to show that no valid number could be parsed.

C (as an example) hacks around this a number of ways. Python lays it out clearly and explicitly.

I think your code for doing this is perfect.

查看更多
伤终究还是伤i
6楼-- · 2018-12-31 01:59
import re
def is_number(num):
    pattern = re.compile(r'^[-+]?[-0-9]\d*\.\d*|[-+]?\.?[0-9]\d*$')
    result = pattern.match(num)
    if result:
        return True
    else:
        return False


​>>>: is_number('1')
True

>>>: is_number('111')
True

>>>: is_number('11.1')
True

>>>: is_number('-11.1')
True

>>>: is_number('inf')
False

>>>: is_number('-inf')
False
查看更多
柔情千种
7楼-- · 2018-12-31 02:00

In case you are looking for parsing (positive, unsigned) integers instead of floats, you can use the isdigit() function for string objects.

>>> a = "03523"
>>> a.isdigit()
True
>>> b = "963spam"
>>> b.isdigit()
False

String Methods - isdigit()

There's also something on Unicode strings, which I'm not too familiar with Unicode - Is decimal/decimal

查看更多
登录 后发表回答