Why does this recursive copy function copy all fil

Posted 2019-08-26 01:58

I wrote a function to copy files from directory A to directory B recursively. The code looks like this:

import os
import shutil
import sys
from os.path import join, exists

def copy_file(src, dest):
    for path, dirs, files in os.walk(src, topdown=True):
        if len(dirs) > 0:
            for di in dirs:
                copy_file(join(path, di), join(dest,  di))

        if not exists(dest):
            os.makedirs(dest)
        for fi in files:
            shutil.copy(join(path, fi), dest)

In my test, the input arguments are:

src = d:/dev

It has one subdirectory named py, and py has a subdirectory named test.

dest = d:/dev_bak

When I ran the code, something strange happened. Three subdirectories were created in the dest directory d:/dev_bak: d:/dev_bak/py, d:/dev_bak/py/test, and d:/dev_bak/test.

I intended the structure of dev_bak to be the same as dev. Why did this happen?

2 answers
甜甜的少女心
#2 · 2019-08-26 02:03

You can easily diagnose this by putting

    print path, dirs, files

right below

    for path, dirs, files in os.walk(src, topdown=True):

Essentially, you're recursing twice.

os.walk already descends into subdirectories on its own. By also calling your own function recursively, you descend a second time. Here is some example output from that print statement:

>>> copy_file("c:\Intel", "c:\Intel-Bak")
c:\Intel ['ExtremeGraphics', 'Logs'] []
c:\Intel\ExtremeGraphics ['CUI'] []
c:\Intel\ExtremeGraphics\CUI ['Resource'] []
c:\Intel\ExtremeGraphics\CUI\Resource [] ['Intel\xae Graphics and Media Control Panel.lnk', 'Intel\xae HD Graphics.lnk']
c:\Intel\ExtremeGraphics\CUI\Resource [] ['Intel\xae Graphics and Media Control Panel.lnk', 'Intel\xae HD Graphics.lnk']
c:\Intel\ExtremeGraphics\CUI ['Resource'] []
c:\Intel\ExtremeGraphics\CUI\Resource [] ['Intel\xae Graphics and Media Control Panel.lnk', 'Intel\xae HD Graphics.lnk']
c:\Intel\ExtremeGraphics\CUI\Resource [] ['Intel\xae Graphics and Media Control Panel.lnk', 'Intel\xae HD Graphics.lnk']
c:\Intel\Logs [] ['IntelChipset.log', 'IntelControlCenter.log', 'IntelGFX.log', 'IntelGFXCoin.log']
c:\Intel\ExtremeGraphics ['CUI'] []
c:\Intel\ExtremeGraphics\CUI ['Resource'] []
c:\Intel\ExtremeGraphics\CUI\Resource [] ['Intel\xae Graphics and Media Control Panel.lnk', 'Intel\xae HD Graphics.lnk']
c:\Intel\ExtremeGraphics\CUI\Resource [] ['Intel\xae Graphics and Media Control Panel.lnk', 'Intel\xae HD Graphics.lnk']
c:\Intel\ExtremeGraphics\CUI ['Resource'] []
c:\Intel\ExtremeGraphics\CUI\Resource [] ['Intel\xae Graphics and Media Control Panel.lnk', 'Intel\xae HD Graphics.lnk']
c:\Intel\ExtremeGraphics\CUI\Resource [] ['Intel\xae Graphics and Media Control Panel.lnk', 'Intel\xae HD Graphics.lnk']
c:\Intel\Logs [] ['IntelChipset.log', 'IntelControlCenter.log', 'IntelGFX.log', 'IntelGFXCoin.log']

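As you can see, the directories get visited more than once: Logs appears twice, CUI four times, and Resource eight times, because each level of nesting doubles the number of visits.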
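To connect this to the d:/dev example from the question, here is a rough sketch (written for Python 3) that keeps the same control flow but only records which destination each visit would use, instead of copying anything. The names trace_copy and demo are made up for illustration, and the tree is built in a temporary directory rather than on d:.

```python
import os
import shutil
import tempfile
from os.path import join, relpath

def trace_copy(src, dest, log):
    # Same control flow as the question's copy_file, but it only records
    # (source directory, destination directory) instead of copying files.
    for path, dirs, files in os.walk(src, topdown=True):
        for di in dirs:
            trace_copy(join(path, di), join(dest, di), log)
        log.append((path, dest))

def demo():
    # Hypothetical stand-in for d:/dev with py and py/test inside it.
    root = tempfile.mkdtemp()
    try:
        src = join(root, "dev")
        dest = join(root, "dev_bak")
        os.makedirs(join(src, "py", "test"))
        log = []
        trace_copy(src, dest, log)
        # Report paths relative to src and dest so they are easy to read.
        return [(relpath(p, src), relpath(d, dest)) for p, d in log]
    finally:
        shutil.rmtree(root)

for source_dir, dest_dir in demo():
    print("%s -> %s" % (source_dir, dest_dir))
```

When the outer os.walk reaches py, dest still points at the top-level destination, so the recursive call for test is given join(dest, "test"). That accounts for the extra d:/dev_bak/test in the question. For the same reason, files from subdirectories also get copied into the top-level destination.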

You should fix the logic of your program so it visits each directory only once, but theoretically you could just ignore any directory you've already been to:

visited = []  # module-level, so it persists between calls to copy_file
def copy_file(src, dest):
    for path, dirs, files in os.walk(src, topdown=True):
        # Skip directories that a deeper recursive call has already copied.
        if path not in visited:
            for di in dirs:
                print dest, di
                copy_file(join(path, di), join(dest, di))
            if not exists(dest):
                os.makedirs(dest)
            for fi in files:
                shutil.copy(join(path, fi), dest)
            visited.append(path)
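As for fixing the logic so each directory is visited only once, one possible approach is to drop the recursive call and let os.walk do all the descending, working out the destination from the path currently being visited. This is only a sketch written for Python 3; copy_tree_once is a made-up name, and it assumes dest is not located inside src.

```python
import os
import shutil
from os.path import exists, join, normpath, relpath

def copy_tree_once(src, dest):
    # os.walk already visits every directory under src exactly once,
    # so no recursive call is needed here.
    for path, dirs, files in os.walk(src):
        # Map the directory being visited onto the matching place under dest.
        target = normpath(join(dest, relpath(path, src)))
        if not exists(target):
            os.makedirs(target)
        for fi in files:
            shutil.copy(join(path, fi), target)
```

Called as copy_tree_once("d:/dev", "d:/dev_bak"), this should create py and py/test under dev_bak with no stray test directory, since the destination follows the path os.walk reports rather than staying fixed.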
我命由我不由天
#3 · 2019-08-26 02:04

The shutil module already has a copytree function which will copy directories recursively. You might want to use it instead of providing your own implementation.
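For example, a minimal sketch for Python 3. The temporary directory and the file name a.txt are only there to make the example self-contained; in the question's case the arguments would be the d:/dev and d:/dev_bak paths. By default copytree expects the destination not to exist yet; from Python 3.8 on, passing dirs_exist_ok=True relaxes that.

```python
import os
import shutil
import tempfile

# Build a small throwaway tree: dev/py/test/a.txt
root = tempfile.mkdtemp()
src = os.path.join(root, "dev")
dest = os.path.join(root, "dev_bak")
os.makedirs(os.path.join(src, "py", "test"))
with open(os.path.join(src, "py", "test", "a.txt"), "w") as fh:
    fh.write("hello")

# Copy the whole tree in one call; dest must not exist beforehand.
shutil.copytree(src, dest)

# Inspect the result before cleaning up.
copied = sorted(os.listdir(dest))
file_copied = os.path.isfile(os.path.join(dest, "py", "test", "a.txt"))
shutil.rmtree(root)
```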
