How can I safely create a nested directory in Pyth

2018-12-31 05:36发布

What is the most elegant way to check if the directory a file is going to be written to exists, and if not, create the directory using Python? Here is what I tried:

import os

file_path = "/my/directory/filename.txt"
directory = os.path.dirname(file_path)

try:
    os.stat(directory)
except:
    os.mkdir(directory)       

f = file(filename)

Somehow, I missed os.path.exists (thanks kanja, Blair, and Douglas). This is what I have now:

def ensure_dir(file_path):
    directory = os.path.dirname(file_path)
    if not os.path.exists(directory):
        os.makedirs(directory)

Is there a flag for "open", that makes this happen automatically?

25条回答
时光乱了年华
2楼-- · 2018-12-31 06:25

Using try except and the right error code from errno module gets rid of the race condition and is cross-platform:

import os
import errno

def make_sure_path_exists(path):
    try:
        os.makedirs(path)
    except OSError as exception:
        if exception.errno != errno.EEXIST:
            raise

In other words, we try to create the directories, but if they already exist we ignore the error. On the other hand, any other error gets reported. For example, if you create dir 'a' beforehand and remove all permissions from it, you will get an OSError raised with errno.EACCES (Permission denied, error 13).

查看更多
旧时光的记忆
3楼-- · 2018-12-31 06:27

I have put the following down. It's not totally foolproof though.

import os

dirname = 'create/me'

try:
    os.makedirs(dirname)
except OSError:
    if os.path.exists(dirname):
        # We are nearly safe
        pass
    else:
        # There was an error on creation, so make sure we know about it
        raise

Now as I say, this is not really foolproof, because we have the possiblity of failing to create the directory, and another process creating it during that period.

查看更多
谁念西风独自凉
4楼-- · 2018-12-31 06:29

The relevant Python documentation suggests the use of the EAFP coding style (Easier to Ask for Forgiveness than Permission). This means that the code

try:
    os.makedirs(path)
except OSError as exception:
    if exception.errno != errno.EEXIST:
        raise
    else:
        print "\nBE CAREFUL! Directory %s already exists." % path

is better than the alternative

if not os.path.exists(path):
    os.makedirs(path)
else:
    print "\nBE CAREFUL! Directory %s already exists." % path

The documentation suggests this exactly because of the race condition discussed in this question. In addition, as others mention here, there is a performance advantage in querying once instead of twice the OS. Finally, the argument placed forward, potentially, in favour of the second code in some cases --when the developer knows the environment the application is running-- can only be advocated in the special case that the program has set up a private environment for itself (and other instances of the same program).

Even in that case, this is a bad practice and can lead to long useless debugging. For example, the fact we set the permissions for a directory should not leave us with the impression permissions are set appropriately for our purposes. A parent directory could be mounted with other permissions. In general, a program should always work correctly and the programmer should not expect one specific environment.

查看更多
低头抚发
5楼-- · 2018-12-31 06:29

For a one-liner solution, you can use IPython.utils.path.ensure_dir_exists():

from IPython.utils.path import ensure_dir_exists
ensure_dir_exists(dir)

From the documentation: Ensure that a directory exists. If it doesn’t exist, try to create it and protect against a race condition if another process is doing the same.

查看更多
不再属于我。
6楼-- · 2018-12-31 06:30

When working with file I/O, the important thing to consider is

TOCTTOU (time of check to time of use)

So doing a check with if and then reading or writing later may end up in an unhandled I/O exception. The best way to do it is:

try:
    os.makedirs(dir_path)
except OSError as e:
    if e.errno != errno.EEXIS:
        raise
查看更多
泪湿衣
7楼-- · 2018-12-31 06:31

I found this Q/A and I was initially puzzled by some of the failures and errors I was getting. I am working in Python 3 (v.3.5 in an Anaconda virtual environment on an Arch Linux x86_64 system).

Consider this directory structure:

└── output/         ## dir
   ├── corpus       ## file
   ├── corpus2/     ## dir
   └── subdir/      ## dir

Here are my experiments/notes, which clarifies things:

# ----------------------------------------------------------------------------
# [1] https://stackoverflow.com/questions/273192/how-can-i-create-a-directory-if-it-does-not-exist

import pathlib

""" Notes:
        1.  Include a trailing slash at the end of the directory path
            ("Method 1," below).
        2.  If a subdirectory in your intended path matches an existing file
            with same name, you will get the following error:
            "NotADirectoryError: [Errno 20] Not a directory:" ...
"""
# Uncomment and try each of these "out_dir" paths, singly:

# ----------------------------------------------------------------------------
# METHOD 1:
# Re-running does not overwrite existing directories and files; no errors.

# out_dir = 'output/corpus3'                ## no error but no dir created (missing tailing /)
# out_dir = 'output/corpus3/'               ## works
# out_dir = 'output/corpus3/doc1'           ## no error but no dir created (missing tailing /)
# out_dir = 'output/corpus3/doc1/'          ## works
# out_dir = 'output/corpus3/doc1/doc.txt'   ## no error but no file created (os.makedirs creates dir, not files!  ;-)
# out_dir = 'output/corpus2/tfidf/'         ## fails with "Errno 20" (existing file named "corpus2")
# out_dir = 'output/corpus3/tfidf/'         ## works
# out_dir = 'output/corpus3/a/b/c/d/'       ## works

# [2] https://docs.python.org/3/library/os.html#os.makedirs

# Uncomment these to run "Method 1":

#directory = os.path.dirname(out_dir)
#os.makedirs(directory, mode=0o777, exist_ok=True)

# ----------------------------------------------------------------------------
# METHOD 2:
# Re-running does not overwrite existing directories and files; no errors.

# out_dir = 'output/corpus3'                ## works
# out_dir = 'output/corpus3/'               ## works
# out_dir = 'output/corpus3/doc1'           ## works
# out_dir = 'output/corpus3/doc1/'          ## works
# out_dir = 'output/corpus3/doc1/doc.txt'   ## no error but creates a .../doc.txt./ dir
# out_dir = 'output/corpus2/tfidf/'         ## fails with "Errno 20" (existing file named "corpus2")
# out_dir = 'output/corpus3/tfidf/'         ## works
# out_dir = 'output/corpus3/a/b/c/d/'       ## works

# Uncomment these to run "Method 2":

#import os, errno
#try:
#       os.makedirs(out_dir)
#except OSError as e:
#       if e.errno != errno.EEXIST:
#               raise
# ----------------------------------------------------------------------------

Conclusion: in my opinion, "Method 2" is more robust.

[1] How can I create a directory if it does not exist?

[2] https://docs.python.org/3/library/os.html#os.makedirs

查看更多
登录 后发表回答