I am trying to do something to all the files under a given path. I don't want to collect all the file names beforehand then do something with them, so I tried this:
import os
import stat

def explore(p):
    s = ''
    list = os.listdir(p)
    for a in list:
        path = p + '/' + a
        stat_info = os.lstat(path)
        if stat.S_ISDIR(stat_info.st_mode):
            explore(path)
        else:
            yield path

if __name__ == "__main__":
    for x in explore('.'):
        print '-->', x
But this code skips over directories when it hits them, instead of yielding their contents. What am I doing wrong?
You can also implement the recursion using a stack.
There is not really any advantage in doing this though, other than the fact that it is possible. If you are using Python in the first place, the performance gains are probably not worthwhile.
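For illustration, here is a minimal sketch of what the stack-based version might look like. The name explore_iterative is made up for this example, and the islink check is my own addition so that symlinked directories are reported rather than followed, roughly matching the lstat test in the question.

```python
import os


def explore_iterative(top):
    """Yield every non-directory path under top, without recursion."""
    stack = [top]  # directories that still have to be listed
    while stack:
        current = stack.pop()
        for name in os.listdir(current):
            path = os.path.join(current, name)
            # isdir() follows symlinks, so exclude links explicitly to
            # stay close to the lstat-based check in the question.
            if os.path.isdir(path) and not os.path.islink(path):
                stack.append(path)  # list it on a later iteration
            else:
                yield path
```

It is consumed the same way as the recursive generator, for example by looping over explore_iterative('.'). The order in which files come out differs from the recursive version, because directories are processed last-in, first-out.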
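The recursive explore(path) line calls explore like an ordinary function and discards the generator it returns. What you should do is iterate over it like a generator, yielding each item it produces.
EDIT: Instead of the stat module, you could use os.path.isdir(path).

Iterators do not work recursively like that. You have to re-yield each result by replacing the bare explore(path) call with a loop over explore(path) that yields each item in turn.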
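A sketch of the corrected generator, combining the re-yield loop with the os.path.isdir suggestion. Using os.path.join instead of string concatenation and adding the islink check (to keep the lstat behaviour of not descending into symlinked directories) are my own assumptions, not part of the original code.

```python
import os


def explore(p):
    """Yield every non-directory path under p, recursing into subdirectories."""
    for name in os.listdir(p):
        path = os.path.join(p, name)
        if os.path.isdir(path) and not os.path.islink(path):
            # explore(path) on its own only creates a generator object.
            # Looping over it and yielding each item is what passes the
            # nested results back to our own caller.
            for sub_path in explore(path):
                yield sub_path
        else:
            yield path
```

The calling code from the question stays the same: a plain for loop over explore('.') now sees the files inside subdirectories too.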
Python 3.3 added the syntax yield from X, as proposed in PEP 380, to serve this purpose. With it, the loop can be replaced by the single statement yield from explore(path). If you're using generators as coroutines, this syntax also supports the use of generator.send() to pass values back into the recursively invoked generators. The simple for loop above would not.

Use os.walk instead of reinventing the wheel. In particular, following the examples in the library documentation, an untested attempt would loop over os.walk(p) and yield each file name joined to its directory path.
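The two suggestions above might look roughly like this, assuming Python 3.3 or later for yield from. The function names explore_yield_from and explore_walk are invented for this sketch, and the islink check is again an assumption to mirror the lstat test in the question.

```python
import os


def explore_yield_from(p):
    """Recursive generator that delegates to sub-generators with yield from."""
    for name in os.listdir(p):
        path = os.path.join(p, name)
        if os.path.isdir(path) and not os.path.islink(path):
            # Delegate to the nested generator; every path it produces
            # is passed straight through to our caller.
            yield from explore_yield_from(path)
        else:
            yield path


def explore_walk(p):
    """Same idea built on os.walk, which handles the recursion itself."""
    for root, dirs, files in os.walk(p):
        for name in files:
            yield os.path.join(root, name)
```

One difference to be aware of: os.walk lists symbolic links to directories under dirs rather than files, so explore_walk does not yield them, while explore_yield_from does.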
To answer the original question as asked, the key is that the yield statement needs to be propagated back out of the recursion (just like, say, return). A working reimplementation of os.walk() can be written this way; I'm using one in a pseudo-VFS implementation, where I additionally replace os.listdir() and similar calls.

os.walk is great if you need to traverse all the folders and subfolders. If you don't need that, it's like using an elephant gun to kill a fly.
However, for this specific case, os.walk could be a better approach.
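Coming back to the reimplementation of os.walk() mentioned above, here is a simplified sketch of how one could look. It is not the original poster's code: the listdir and isdir parameters are hypothetical hooks standing in for the pseudo-VFS replacements, and options such as topdown, onerror and followlinks are left out.

```python
import os


def walk(top, listdir=os.listdir, isdir=os.path.isdir):
    """Yield (dirpath, dirnames, filenames) top-down, like a reduced os.walk.

    listdir and isdir are replaceable so a virtual file system could
    supply its own versions (an assumption made for this sketch).
    """
    dirnames, filenames = [], []
    for name in listdir(top):
        if isdir(os.path.join(top, name)):
            dirnames.append(name)
        else:
            filenames.append(name)
    yield top, dirnames, filenames
    for name in dirnames:
        # The nested generator's results have to be yielded again here,
        # otherwise they never reach the outermost caller.
        for entry in walk(os.path.join(top, name), listdir, isdir):
            yield entry
```

Unlike the real os.walk, this sketch descends into symlinked directories, so it should not be pointed at a tree containing symlink cycles.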