Is there something similar to PHP's natsort function in Python?
l = ['image1.jpg', 'image15.jpg', 'image12.jpg', 'image3.jpg']
l.sort()
gives:
['image1.jpg', 'image12.jpg', 'image15.jpg', 'image3.jpg']
but I would like to get:
['image1.jpg', 'image3.jpg', 'image12.jpg', 'image15.jpg']
UPDATE
Solution based on this link:
import re

def try_int(s):
    "Convert to integer if possible."
    try:
        return int(s)
    except ValueError:
        return s

def natsort_key(s):
    "Used internally to get a list by which s is sorted."
    return map(try_int, re.findall(r'(\d+|\D+)', s))

def natcmp(a, b):
    "Natural string comparison, case sensitive."
    return cmp(natsort_key(a), natsort_key(b))

def natcasecmp(a, b):
    "Natural string comparison, ignores case."
    return natcmp(a.lower(), b.lower())

# Python 2 only: cmp() and the comparison-function argument to list.sort()
# were removed in Python 3.
l.sort(natcasecmp)
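The comparison-function approach above relies on cmp() and on passing a comparison function to list.sort(), neither of which exists in Python 3. A minimal sketch of one possible Python 3 port follows. It is an assumption-laden adaptation, not the original code: it wraps the comparison with functools.cmp_to_key, emulates cmp() with a boolean subtraction, and swaps re.findall for re.split so that text and digit runs always alternate and an int is never compared with a str.

```python
import re
from functools import cmp_to_key

def try_int(s):
    """Convert to integer if possible, otherwise return the string unchanged."""
    try:
        return int(s)
    except ValueError:
        return s

def natsort_key(s):
    """Split s into alternating text and number parts.

    re.split with a capturing group always yields text at even positions
    and digit runs at odd positions, so keys compare position by position
    without mixing int and str.
    """
    return [try_int(part) for part in re.split(r'(\d+)', s)]

def natcmp(a, b):
    """Natural string comparison, case sensitive."""
    ka, kb = natsort_key(a), natsort_key(b)
    # Emulates the removed cmp(): -1, 0 or 1.
    return (ka > kb) - (ka < kb)

def natcasecmp(a, b):
    """Natural string comparison, ignores case."""
    return natcmp(a.lower(), b.lower())

l = ['image1.jpg', 'image15.jpg', 'image12.jpg', 'image3.jpg']
l.sort(key=cmp_to_key(natcasecmp))
print(l)
```

If the comparison functions are not needed elsewhere, passing a key function directly (key=natsort_key) is usually simpler than going through cmp_to_key.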
From my answer to Natural Sorting algorithm:
import re

def natural_key(string_):
    """See http://www.codinghorror.com/blog/archives/001018.html"""
    return [int(s) if s.isdigit() else s for s in re.split(r'(\d+)', string_)]
Example:
>>> L = ['image1.jpg', 'image15.jpg', 'image12.jpg', 'image3.jpg']
>>> sorted(L)
['image1.jpg', 'image12.jpg', 'image15.jpg', 'image3.jpg']
>>> sorted(L, key=natural_key)
['image1.jpg', 'image3.jpg', 'image12.jpg', 'image15.jpg']
To support Unicode strings, .isdecimal() should be used instead of .isdigit(). See the example in @phihag's comment. Related: How to reveal Unicodes numeric value property.
.isdigit() may also fail (return a value that is not accepted by int()) for a bytestring on Python 2 in some locales, e.g., '\xb2' ('²') in the cp1252 locale on Windows.
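To illustrate the difference on Python 3, here is a small sketch. The names natural_key_isdigit and natural_key_isdecimal are hypothetical, chosen only for this comparison; the first repeats the key function from this answer, and the second swaps in .isdecimal(). The superscript two (U+00B2) counts as a digit for .isdigit() but is rejected by int().

```python
import re

def natural_key_isdigit(string_):
    # Same logic as natural_key above: trusts str.isdigit().
    return [int(s) if s.isdigit() else s for s in re.split(r'(\d+)', string_)]

def natural_key_isdecimal(string_):
    # Hypothetical variant: only converts parts that int() can parse.
    return [int(s) if s.isdecimal() else s for s in re.split(r'(\d+)', string_)]

superscript_two = '\u00b2'
print(superscript_two.isdigit())    # True
print(superscript_two.isdecimal())  # False

name = 'x1' + superscript_two  # splits into 'x', '1' and the superscript two

try:
    natural_key_isdigit(name)
except ValueError:
    print('isdigit version raised ValueError')

result = natural_key_isdecimal(name)
print(result == ['x', 1, superscript_two])  # True
```

On Python 2, .isdecimal() is only available on unicode objects, so this sketch assumes Python 3 str input.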
You can check out the third-party natsort library on PyPI:
>>> import natsort
>>> l = ['image1.jpg', 'image15.jpg', 'image12.jpg', 'image3.jpg']
>>> natsort.natsorted(l)
['image1.jpg', 'image3.jpg', 'image12.jpg', 'image15.jpg']
Full disclosure, I am the author.
This function can be used as the key= argument for sorted in Python 2.x and 3.x:

import re

def sortkey_natural(s):
    return tuple(int(part) if re.match(r'[0-9]+$', part) else part
                 for part in re.split(r'([0-9]+)', s))
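A short usage sketch for the function above, using the file names from the question. The variable names are illustrative only. Printing one key shows what the sort actually compares: text parts stay strings and digit runs become integers.

```python
import re

def sortkey_natural(s):
    return tuple(int(part) if re.match(r'[0-9]+$', part) else part
                 for part in re.split(r'([0-9]+)', s))

names = ['image1.jpg', 'image15.jpg', 'image12.jpg', 'image3.jpg']

# The key for a single name: text, number, text.
example_key = sortkey_natural('image12.jpg')
print(example_key)  # ('image', 12, '.jpg')

ordered = sorted(names, key=sortkey_natural)
print(ordered)  # ['image1.jpg', 'image3.jpg', 'image12.jpg', 'image15.jpg']
```

Because the pattern uses [0-9] rather than \d, only ASCII digits are treated as numbers, which sidesteps the .isdigit() pitfalls on both Python 2 and 3.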