
Recursion definition using Python os.walk as an ex

Posted 2019-07-25 12:51

Question:

I am having trouble grasping how to recursively list files from a directory using Python. Does all the recursion logic take place in the module itself (os.walk)?

import os

def listfiles(path):
    # os.walk yields one (root, dirs, files) tuple per directory under path
    for root, dirs, files in os.walk(path):
        for f in files:
            print(f)

I have looked up multiple ways to recursively list files in a directory and they all follow the same pattern as above. This appears to be iterating through the files. Could someone explain to me how this is being recursively done?

Answer 1:

os.walk() is a generator function. It recurses into directories and yields results as it goes, so the loop in your code simply consumes what the recursion produces. See the source code; simplified, it comes down to:

import os

def walk(top):
    try:
        names = os.listdir(top)
    except OSError:
        # directory could not be listed; skip it
        return

    # split the entries into subdirectories and everything else
    dirs, nondirs = [], []
    for name in names:
        if os.path.isdir(os.path.join(top, name)):
            dirs.append(name)
        else:
            nondirs.append(name)

    yield top, dirs, nondirs

    for name in dirs:
        new_path = os.path.join(top, name)
        for x in walk(new_path):  # recursive call
            yield x

This code yields the top path after sorting its entries into directories and non-directories, then recurses into each nested directory. Each recursive call returns its own generator, so its results are passed up to the caller by yielding them again explicitly. The Python 3 version uses generator delegation there:

for name in dirs:
    new_path = os.path.join(top, name)
    yield from walk(new_path)
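
Putting the pieces together, here is a minimal, self-contained sketch of the simplified walk in its Python 3 form, checked against os.walk on a throwaway directory tree. The names simple_walk, build_demo_tree and collect_files, as well as the directory layout, are made up for illustration; this is not the actual standard-library source.

```python
import os
import tempfile


def simple_walk(top):
    """Simplified stand-in for os.walk (illustrative, not the real source)."""
    try:
        names = os.listdir(top)
    except OSError:
        return  # directory could not be listed: end this branch quietly

    dirs, nondirs = [], []
    for name in names:
        if os.path.isdir(os.path.join(top, name)):
            dirs.append(name)
        else:
            nondirs.append(name)

    yield top, dirs, nondirs  # report this directory before its children

    for name in dirs:
        # The recursion lives here: each subdirectory gets its own generator,
        # and "yield from" forwards everything it produces to our caller.
        yield from simple_walk(os.path.join(top, name))


def build_demo_tree(root):
    """Create a.txt, sub/b.txt and sub/deep/c.txt under root (hypothetical layout)."""
    os.makedirs(os.path.join(root, "sub", "deep"))
    for rel in ("a.txt",
                os.path.join("sub", "b.txt"),
                os.path.join("sub", "deep", "c.txt")):
        with open(os.path.join(root, rel), "w") as fh:
            fh.write("x")


def collect_files(walker, root):
    """Flatten (root, dirs, files) triples into a sorted list of file paths."""
    return sorted(os.path.join(r, f)
                  for r, _, files in walker(root)
                  for f in files)


with tempfile.TemporaryDirectory() as tmp:
    build_demo_tree(tmp)
    mine = collect_files(simple_walk, tmp)
    stdlib = collect_files(os.walk, tmp)

print(mine == stdlib)
```

The caller only ever sees a flat stream of tuples, which is why the loop in the question looks like plain iteration even though the traversal underneath is recursive.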

To keep this simple, I omitted support for error callbacks, symlink filtering, and bottom-up generation (the onerror, followlinks, and topdown parameters of os.walk).
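
For a feel of two of the omitted features, the following sketch calls the real os.walk with topdown=False and with an onerror callback. The temporary directory layout and the missing path no_such_dir are assumptions chosen only for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub", "deep"))

    # Default top-down order: a parent is yielded before its children.
    top_down = [os.path.relpath(root, tmp) for root, _, _ in os.walk(tmp)]

    # Bottom-up order: children are yielded before their parent.
    bottom_up = [os.path.relpath(root, tmp)
                 for root, _, _ in os.walk(tmp, topdown=False)]

    # onerror is called with the OSError raised while listing a directory;
    # without it, os.walk ignores such errors silently.
    errors = []
    missing = os.path.join(tmp, "no_such_dir")  # hypothetical missing path
    results = list(os.walk(missing, onerror=errors.append))

print(top_down)
print(bottom_up)
print(len(results), len(errors))
```

In the bottom-up case the tuple for a directory is yielded only after the recursion into its subdirectories has finished, which amounts to moving the yield in the simplified code below the recursive loop.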