I am concerned about the order of files and directories given by os.walk(). If I have these directories, 1, 10, 11, 12, 2, 20, 21, 22, 3, 30, 31, 32, what is the order of the output list?
Is it sorted by numeric values?
1 2 3 10 20 30 11 21 31 12 22 32
Or sorted by ASCII values, like what is given by ls?
1 10 11 12 2 20 21 22 3 30 31 32
Additionally, how can I get a specific sort?
os.walk uses os.listdir (since Python 3.5 it uses os.scandir, which likewise yields entries in arbitrary order). Here is the docstring for os.listdir:
listdir(path) -> list_of_strings
Return a list containing the names of the entries in the directory.
path: path of directory to list
The list is in arbitrary order. It does not include the special
entries '.' and '..' even if they are present in the directory.
(My emphasis is on "arbitrary order".)
You could, however, use sorted to ensure the order you desire.
for root, dirs, files in os.walk(path):
    for dirname in sorted(dirs):
        print(dirname)
(Note that the dirnames are strings, not ints, so sorted(dirs) sorts them as strings, which is desirable for once.)
As Alfe and Ciro Santilli point out, if you want the directories to be recursed in sorted order, then modify dirs in-place:
for root, dirs, files in os.walk(path):
    dirs.sort()
    for dirname in dirs:
        print(os.path.join(root, dirname))
You can test this yourself:
import os

# Assumes the directory /tmp/tmp already exists
os.chdir('/tmp/tmp')
for dirname in '1 10 11 12 2 20 21 22 3 30 31 32'.split():
    try:
        os.makedirs(dirname)
    except OSError:
        pass  # the directory already exists

for root, dirs, files in os.walk('.'):
    for dirname in sorted(dirs):
        print(dirname)
prints
1
10
11
12
2
20
21
22
3
30
31
32
If you wanted to list them in numeric order use:
for dirname in sorted(dirs, key=int):
To sort alphanumeric strings, use natural sort.
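Natural sort is not built into sorted(); one common approach is to split each name into digit and non-digit runs and compare the digit runs as integers. A minimal sketch, where natural_key is a hypothetical helper name rather than a library function:

```python
import re


def natural_key(name):
    """Build a sort key so that 'dir2' sorts before 'dir10'.

    re.split with a capturing group alternates text and digit runs,
    so the digit runs always sit at the odd positions.
    """
    chunks = re.split(r"(\d+)", name)
    return [int(chunk) if i % 2 else chunk.lower()
            for i, chunk in enumerate(chunks)]


names = ["dir10", "dir2", "dir1", "dir21", "dir3"]
print(sorted(names, key=natural_key))
# ['dir1', 'dir2', 'dir3', 'dir10', 'dir21']
```

The same key can be passed to dirs.sort(key=natural_key) inside an os.walk loop. Lowercasing the text runs makes the comparison case-insensitive, which is a design choice of this sketch and may not be what you want.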
os.walk() yields in each step what it will do in the next steps. In each step you can influence the order of the next steps by sorting the lists the way you want them. Quoting the 2.7 manual:
When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search, impose a specific order of visiting
So sorting the dirNames will influence the order in which they will be visited:
for rootName, dirNames, fileNames in os.walk(path):
    dirNames.sort()  # you may want to use the args key and reverse here (cmp exists only in Python 2)
After this, the dirNames are sorted in-place and the next values yielded by walk will follow that order.
Of course you can also sort the list of fileNames, but that won't influence any further steps (because files have no descendants for walk to visit).
And of course you can iterate through sorted versions of these lists, as unutbu's answer proposes, but that won't influence the further progress of the walk itself.
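The in-place sorting can be wrapped in a small generator so that every step arrives sorted. This is a minimal sketch, not part of the os module: the name sorted_walk and its key parameter are hypothetical, and it assumes the default topdown=True, since in-place changes to the directory list only steer the walk in that mode.

```python
import os
import tempfile


def sorted_walk(top, key=None):
    """Like os.walk(top), but with dirs and files sorted at every step.

    Hypothetical helper for illustration; 'key' is passed to list.sort().
    """
    for root, dirs, files in os.walk(top):  # topdown=True is the default
        dirs.sort(key=key)   # in place: os.walk descends in this order
        files.sort(key=key)  # affects only what the caller sees
        yield root, dirs, files


# Usage: build a throwaway tree and print the visiting order.
with tempfile.TemporaryDirectory() as top:
    for name in ["2", "10", "1"]:
        os.makedirs(os.path.join(top, name))
    for root, dirs, files in sorted_walk(top):
        print(os.path.relpath(root, top), dirs, files)
```

Because the generator yields the same list objects that os.walk holds, the caller can still prune dirs after receiving a step. Passing key=int would give numeric order here, but it would raise ValueError for any name that is not a number, file names included.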
The order of the unmodified values is not defined by os.walk, meaning that it can be any order. You should not rely on what you observe today. In practice it will probably be whatever the underlying file system returns, which in some file systems is alphabetical.
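The simplest way is to sort the return values of os.walk(), which reads the whole tree first and then orders the steps by root path, e.g. using:
for rootName, dirNames, fileNames in sorted(os.walk(path)):
    # steps arrive ordered by rootName; dirNames and fileNames within each step are not sorted
    print(rootName)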
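To see what sorting the whole walk does, the sketch below builds a small throwaway tree (the directory names and the helper name roots_in_sorted_walk are made up for illustration) and sorts the steps. The tuples are compared by their first element, the root path, so the result is ordered by path string. The dirs and files lists inside each step stay as os.walk produced them, and the entire tree is read before the first step is processed.

```python
import os
import tempfile


def roots_in_sorted_walk(top):
    """Return the root of each os.walk step, relative to top, after sorting the steps."""
    steps = sorted(os.walk(top))  # consumes the whole walk, then sorts by root path
    return [os.path.relpath(root, top) for root, dirs, files in steps]


with tempfile.TemporaryDirectory() as top:
    for name in ["b", "a-b", os.path.join("a", "x")]:
        os.makedirs(os.path.join(top, name))
    print(roots_in_sorted_walk(top))
```

In this tree a-b lands between a and its subdirectory x, because "-" sorts before the path separator. The result is plain string order, which can differ from the order a walk with in-place sorted directory lists would follow.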