How do I check if a string is a number (float)?

2018-12-31 01:31发布

What is the best possible way to check if a string can be represented as a number in Python?

The function I currently have right now is:

def is_number(s):
    try:
        float(s)
        return True
    except ValueError:
        return False

Which, not only is ugly and slow, seems clunky. However I haven't found a better method because calling float in the main function is even worse.

30条回答
像晚风撩人
2楼-- · 2018-12-31 01:44

This answer provides step by step guide having function with examples to find the string is:

  • Positive integer
  • Positive/negative - integer/float
  • How to discard "NaN" (not a number) strings while checking for number?

Check if string is positive integer

You may use str.isdigit() to check whether given string is positive integer.

Sample Results:

# For digit
>>> '1'.isdigit()
True
>>> '1'.isalpha()
False

Check for string as positive/negative - integer/float

str.isdigit() returns False if the string is a negative number or a float number. For example:

# returns `False` for float
>>> '123.3'.isdigit()
False
# returns `False` for negative number
>>> '-123'.isdigit()
False

If you want to also check for the negative integers and float, then you may write a custom function to check for it as:

def is_number(n):
    try:
        float(n)   # Type-casting the string to `float`.
                   # If string is not a valid `float`, 
                   # it'll raise `ValueError` exception
    except ValueError:
        return False
    return True

Sample Run:

>>> is_number('123')    # positive integer number
True

>>> is_number('123.4')  # positive float number
True

>>> is_number('-123')   # negative integer number
True

>>> is_number('-123.4') # negative `float` number
True

>>> is_number('abc')    # `False` for "some random" string
False

Discard "NaN" (not a number) strings while checking for number

The above functions will return True for the "NAN" (Not a number) string because for Python it is valid float representing it is not a number. For example:

>>> is_number('NaN')
True

In order to check whether the number is "NaN", you may use math.isnan() as:

>>> import math
>>> nan_num = float('nan')

>>> math.isnan(nan_num)
True

Or if you don't want to import additional library to check this, then you may simply check it via comparing it with itself using ==. Python returns False when nan float is compared with itself. For example:

# `nan_num` variable is taken from above example
>>> nan_num == nan_num
False

Hence, above function is_number can be updated to return False for "NaN" as:

def is_number(n):
    is_number = True
    try:
        num = float(n)
        # check for "nan" floats
        is_number = num == num   # or use `math.isnan(num)`
    except ValueError:
        is_number = False
    return is_number

Sample Run:

>>> is_number('Nan')   # not a number "Nan" string
False

>>> is_number('nan')   # not a number string "nan" with all lower cased
False

>>> is_number('123')   # positive integer
True

>>> is_number('-123')  # negative integer
True

>>> is_number('-1.12') # negative `float`
True

>>> is_number('abc')   # "some random" string
False

PS: Each operation for each check depending on the type of number comes with additional overhead. Choose the version of is_number function which fits your requirement.

查看更多
余生无你
3楼-- · 2018-12-31 01:45

TL;DR The best solution is s.replace('.','',1).isdigit()

I did some benchmarks comparing the different approaches

def is_number_tryexcept(s):
    """ Returns True is string is a number. """
    try:
        float(s)
        return True
    except ValueError:
        return False

import re    
def is_number_regex(s):
    """ Returns True is string is a number. """
    if re.match("^\d+?\.\d+?$", s) is None:
        return s.isdigit()
    return True


def is_number_repl_isdigit(s):
    """ Returns True is string is a number. """
    return s.replace('.','',1).isdigit()

If the string is not a number, the except-block is quite slow. But more importantly, the try-except method is the only approach that handles scientific notations correctly.

funcs = [
          is_number_tryexcept, 
          is_number_regex,
          is_number_repl_isdigit
          ]

a_float = '.1234'

print('Float notation ".1234" is not supported by:')
for f in funcs:
    if not f(a_float):
        print('\t -', f.__name__)

Float notation ".1234" is not supported by:
- is_number_regex

scientific1 = '1.000000e+50'
scientific2 = '1e50'


print('Scientific notation "1.000000e+50" is not supported by:')
for f in funcs:
    if not f(scientific1):
        print('\t -', f.__name__)




print('Scientific notation "1e50" is not supported by:')
for f in funcs:
    if not f(scientific2):
        print('\t -', f.__name__)

Scientific notation "1.000000e+50" is not supported by:
- is_number_regex
- is_number_repl_isdigit
Scientific notation "1e50" is not supported by:
- is_number_regex
- is_number_repl_isdigit

EDIT: The benchmark results

import timeit

test_cases = ['1.12345', '1.12.345', 'abc12345', '12345']
times_n = {f.__name__:[] for f in funcs}

for t in test_cases:
    for f in funcs:
        f = f.__name__
        times_n[f].append(min(timeit.Timer('%s(t)' %f, 
                      'from __main__ import %s, t' %f)
                              .repeat(repeat=3, number=1000000)))

where the following functions were tested

from re import match as re_match
from re import compile as re_compile

def is_number_tryexcept(s):
    """ Returns True is string is a number. """
    try:
        float(s)
        return True
    except ValueError:
        return False

def is_number_regex(s):
    """ Returns True is string is a number. """
    if re_match("^\d+?\.\d+?$", s) is None:
        return s.isdigit()
    return True


comp = re_compile("^\d+?\.\d+?$")    

def compiled_regex(s):
    """ Returns True is string is a number. """
    if comp.match(s) is None:
        return s.isdigit()
    return True


def is_number_repl_isdigit(s):
    """ Returns True is string is a number. """
    return s.replace('.','',1).isdigit()

enter image description here

查看更多
谁念西风独自凉
4楼-- · 2018-12-31 01:45

Lets say you have digits in string. str = "100949" and you would like to check if it has only numbers

if str.isdigit():
returns TRUE or FALSE 

isdigit docs

otherwise your method works great to find the occurrence of a digit in a string.

查看更多
孤独总比滥情好
5楼-- · 2018-12-31 01:46

how about this:

'3.14'.replace('.','',1).isdigit()

which will return true only if there is one or no '.' in the string of digits.

'3.14.5'.replace('.','',1).isdigit()

will return false

edit: just saw another comment ... adding a .replace(badstuff,'',maxnum_badstuff) for other cases can be done. if you are passing salt and not arbitrary condiments (ref:xkcd#974) this will do fine :P

查看更多
旧人旧事旧时光
6楼-- · 2018-12-31 01:46

Just Mimic C#

In C# there are two different functions that handle parsing of scalar values:

  • Float.Parse()
  • Float.TryParse()

float.parse():

def parse(string):
    try:
        return float(string)
    except Exception:
        throw TypeError

Note: If you're wondering why I changed the exception to a TypeError, here's the documentation.

float.try_parse():

def try_parse(string, fail=None):
    try:
        return float(string)
    except Exception:
        return fail;

Note: You don't want to return the boolean 'False' because that's still a value type. None is better because it indicates failure. Of course, if you want something different you can change the fail parameter to whatever you want.

To extend float to include the 'parse()' and 'try_parse()' you'll need to monkeypatch the 'float' class to add these methods.

If you want respect pre-existing functions the code should be something like:

def monkey_patch():
    if(!hasattr(float, 'parse')):
        float.parse = parse
    if(!hasattr(float, 'try_parse')):
        float.try_parse = try_parse

SideNote: I personally prefer to call it Monkey Punching because it feels like I'm abusing the language when I do this but YMMV.

Usage:

float.parse('giggity') // throws TypeException
float.parse('54.3') // returns the scalar value 54.3
float.tryParse('twank') // returns None
float.tryParse('32.2') // returns the scalar value 32.2

And the great Sage Pythonas said to the Holy See Sharpisus, "Anything you can do I can do better; I can do anything better than you."

查看更多
笑指拈花
7楼-- · 2018-12-31 01:46

I needed to determine if a string cast into basic types (float,int,str,bool). After not finding anything on the internet I created this:

def str_to_type (s):
    """ Get possible cast type for a string

    Parameters
    ----------
    s : string

    Returns
    -------
    float,int,str,bool : type
        Depending on what it can be cast to

    """    
    try:                
        f = float(s)        
        if "." not in s:
            return int
        return float
    except ValueError:
        value = s.upper()
        if value == "TRUE" or value == "FALSE":
            return bool
        return type(s)

Example

str_to_type("true") # bool
str_to_type("6.0") # float
str_to_type("6") # int
str_to_type("6abc") # str
str_to_type(u"6abc") # unicode       

You can capture the type and use it

s = "6.0"
type_ = str_to_type(s) # float
f = type_(s) 
查看更多
登录 后发表回答