How to correctly sort a string with a number inside?

Posted 2019-01-03 05:41

Possible Duplicate:
Does Python have a built in function for string natural sort?

I have a list of strings containing numbers and I cannot find a good way to sort them.
For example I get something like this:

something1
something12
something17
something2
something25
something29

with the sort() method.

I know that I probably need to extract the numbers somehow and then sort the list, but I have no idea how to do it in the simplest way.

1 answer
干净又极端
#2 · 2019-01-03 06:11

Perhaps you are looking for human sorting (also known as natural sorting):

import re

def atoi(text):
    return int(text) if text.isdigit() else text

def natural_keys(text):
    '''
    alist.sort(key=natural_keys) sorts in human order
    http://nedbatchelder.com/blog/200712/human_sorting.html
    (See Toothy's implementation in the comments)
    '''
    return [ atoi(c) for c in re.split(r'(\d+)', text) ]

alist=[
    "something1",
    "something12",
    "something17",
    "something2",
    "something25",
    "something29"]

alist.sort(key=natural_keys)
print(alist)

yields

['something1', 'something2', 'something12', 'something17', 'something25', 'something29']
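
To see why the key works, it may help to print what natural_keys returns for two of the names. This is only an illustrative sketch that repeats the two functions above so it can run on its own. Because the pattern has a capturing group, re.split keeps the digit runs, and atoi turns them into ints, so the lists compare 2 against 12 as numbers rather than "2" against "1" as characters.

```python
import re

def atoi(text):
    # Digit runs become ints; everything else stays a string.
    return int(text) if text.isdigit() else text

def natural_keys(text):
    # The capturing group makes re.split keep the digit runs in the result.
    return [atoi(c) for c in re.split(r'(\d+)', text)]

for name in ["something2", "something12"]:
    print(name, "->", natural_keys(name))
# something2 -> ['something', 2, '']
# something12 -> ['something', 12, '']

# Plain string comparison puts "something12" first, because "1" < "2".
print(sorted(["something2", "something12"]))
# ['something12', 'something2']

# With the key, the integers 2 and 12 are compared instead.
print(sorted(["something2", "something12"], key=natural_keys))
# ['something2', 'something12']
```

Since the split always alternates text, number, text, the keys line up position by position, so strings are compared with strings and ints with ints.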

PS. I've changed my answer to use Toothy's implementation of natural sorting (posted in the comments here) since it is significantly faster than my original answer.


If you wish to sort text with floats, then you'll need to change the regex from one that matches ints (i.e. (\d+)) to a regex that matches floats:

import re

def atof(text):
    try:
        retval = float(text)
    except ValueError:
        retval = text
    return retval

def natural_keys(text):
    '''
    alist.sort(key=natural_keys) sorts in human order
    http://nedbatchelder.com/blog/200712/human_sorting.html
    (See Toothy's implementation in the comments)
    float regex comes from https://stackoverflow.com/a/12643073/190597
    '''
    return [ atof(c) for c in re.split(r'[+-]?([0-9]+(?:[.][0-9]*)?|[.][0-9]+)', text) ]

alist=[
    "something1",
    "something2",
    "something1.0",
    "something1.25",
    "something1.105"]

alist.sort(key=natural_keys)
print(alist)

yields

['something1', 'something1.0', 'something1.105', 'something1.25', 'something2']
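
One thing to be aware of with the float regex: the optional sign [+-]? sits outside the capturing group, so re.split discards it and "-1.5" is keyed as 1.5. The sketch below shows that behaviour next to a variant that moves the sign inside the group. The names UNSIGNED, SIGNED, float_keys and the sample readings are made up for this illustration and are not part of the answer above. Whether the variant is appropriate depends on your data, since it would also treat a hyphen used as a separator (as in "file-2") as a minus sign.

```python
import re

# Pattern from the answer: the optional sign is outside the capturing group.
UNSIGNED = r'[+-]?([0-9]+(?:[.][0-9]*)?|[.][0-9]+)'
# Hypothetical variant (an assumption, not from the answer): sign inside the group.
SIGNED = r'([+-]?(?:[0-9]+(?:[.][0-9]*)?|[.][0-9]+))'

def atof(text):
    # Convert to float where possible; leave other chunks as strings.
    try:
        return float(text)
    except ValueError:
        return text

def float_keys(text, pattern=UNSIGNED):
    # Same idea as natural_keys, with the pattern passed in for comparison.
    return [atof(c) for c in re.split(pattern, text)]

print(float_keys("temp-1.5C"))          # ['temp', 1.5, 'C']
print(float_keys("temp-1.5C", SIGNED))  # ['temp', -1.5, 'C']

readings = ["temp2C", "temp-1.5C", "temp0.5C"]
print(sorted(readings, key=float_keys))
# ['temp0.5C', 'temp-1.5C', 'temp2C']
print(sorted(readings, key=lambda s: float_keys(s, SIGNED)))
# ['temp-1.5C', 'temp0.5C', 'temp2C']
```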