I have several threads running in parallel from Python on a cluster system. Each Python thread writes to a directory, mydir. Before writing, each script checks whether mydir exists and creates it if it does not:
if not os.path.isdir(mydir):
    os.makedirs(mydir)
but this yields the error:
    os.makedirs(self.log_dir)
  File "/usr/lib/python2.6/os.py", line 157, in makedirs
    mkdir(name,mode)
OSError: [Errno 17] File exists
I suspect it might be due to a race condition, where one job creates the dir before the other gets to it. Is this possible? If so, how can this error be avoided?
I'm not sure it's a race condition so was wondering if other issues in Python can cause this odd error.
Any time code can execute between when you check something and when you act on it, you have a race condition. One way to avoid this (and the usual way in Python) is to just try the operation and then handle the exception:
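import errno
import os

while True:
    # next_dir_name() stands for whatever produces the next candidate name
    mydir = next_dir_name()
    try:
        os.makedirs(mydir)
        break
    except OSError as e:
        # Re-raise anything other than "already exists"
        if e.errno != errno.EEXIST:
            raise
        # Someone else created this one; loop and try the next name.
        # time.sleep might help here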
If you have a lot of threads trying to make a predictable series of directories, this will still raise a lot of exceptions, but you will get there in the end. In that case it is better to have just one thread create the directories.
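The "one thread creates the dirs" idea can be sketched roughly as below. This is only an illustration, not the original poster's setup: run_jobs, worker, and the result_N.txt file names are hypothetical, and it assumes the jobs are threads started from a single launcher that can create the directory before any worker runs.

```python
import os
import tempfile
import threading

def worker(out_dir, index):
    # Hypothetical worker: it assumes out_dir already exists and only writes a file.
    path = os.path.join(out_dir, "result_%d.txt" % index)
    with open(path, "w") as handle:
        handle.write("output from worker %d\n" % index)

def run_jobs(out_dir, n_workers=4):
    # Only the launching thread creates the directory, so no two threads race on it.
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    threads = [threading.Thread(target=worker, args=(out_dir, i))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(os.listdir(out_dir))

if __name__ == "__main__":
    base = tempfile.mkdtemp()  # scratch location for the demonstration
    print(run_jobs(os.path.join(base, "mydir")))
```

If the jobs are separate processes submitted independently to the cluster, there is no single launcher to do this, and handling the exception in each job is probably the more practical route.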
As of Python 3.2, os.makedirs() accepts a third optional argument, exist_ok:
os.makedirs(mydir, exist_ok=True)
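As a rough check of how this behaves under contention, the sketch below has several threads create the same nested directory at once. The helper create_concurrently and the temporary path are made up for the demonstration; only os.makedirs(..., exist_ok=True) comes from the answer, and Python 3.2 or later is assumed.

```python
import os
import tempfile
import threading

def create_concurrently(path, n_threads=8):
    """Have several threads create the same directory; return any errors seen."""
    errors = []

    def make_dir():
        try:
            # exist_ok=True suppresses the error when the directory is already there.
            os.makedirs(path, exist_ok=True)
        except OSError as exc:
            errors.append(exc)

    threads = [threading.Thread(target=make_dir) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors

base = tempfile.mkdtemp()  # scratch location for the demonstration
mydir = os.path.join(base, "mydir", "logs")
errors = create_concurrently(mydir)
print(os.path.isdir(mydir), errors)
```

No check before the call is needed, which removes the window between checking and acting.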
Catch the exception and, if the errno is 17 (EEXIST), ignore it. That's the only thing you can do if there's a race condition between the isdir and makedirs calls.
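A minimal sketch of that approach, written so it should also run on Python versions without exist_ok. The name ensure_dir is made up here, and the extra isdir check is an addition so that a non-directory with the same name is still reported rather than silently ignored.

```python
import errno
import os
import tempfile

def ensure_dir(path):
    """Create path if needed; tolerate another job having created it first."""
    try:
        os.makedirs(path)
    except OSError as exc:
        # Ignore "already exists" only when the path really is a directory.
        if exc.errno != errno.EEXIST or not os.path.isdir(path):
            raise

if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "mydir")
    ensure_dir(target)
    ensure_dir(target)  # second call finds the directory and does nothing
    print(os.path.isdir(target))
```

Comparing against errno.EEXIST rather than the literal 17 keeps the intent readable.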
However, it is also possible that a file with the same name exists. In that case os.path.exists would return True but os.path.isdir would return False.
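To tell the two situations apart, something like the following could be used. It is a sketch only: describe_path and the temporary file are invented for the demonstration, and the check is a diagnostic, not a fix for the race.

```python
import errno
import os
import tempfile

def describe_path(path):
    """Classify a path as a directory, some other existing entry, or missing."""
    if os.path.isdir(path):
        return "directory"
    if os.path.exists(path):
        return "exists but is not a directory"
    return "missing"

base = tempfile.mkdtemp()  # scratch location for the demonstration
blocker = os.path.join(base, "mydir")
with open(blocker, "w") as handle:
    handle.write("a regular file that happens to be named mydir\n")

print(describe_path(blocker))

try:
    os.makedirs(blocker)
except OSError as exc:
    # The same errno is reported even though no directory exists here.
    print(exc.errno == errno.EEXIST)
```

In this case ignoring the error would be wrong, because the job would go on to write into a directory that is not there.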
I had a similar issue, and here is what I did:
try:
    if not os.path.exists(os.path.dirname(mydir)):
        os.makedirs(os.path.dirname(mydir))
except OSError as err:
    print(err)
Description:
The [Errno 17] File exists message appears when the check made before os.makedirs() does not actually establish whether the target directory already exists. What was missing was a check on the path itself, which can be done with os.path.exists(), passing it the respective directory name. Because the call is wrapped in try/except, an OSError raised anyway is printed instead of stopping the script.
To ignore the "directory or file exists" error, you can try this:
try:
    os.makedirs(mydir)
except OSError as e:
    if e.errno != 17:  # 17 is errno.EEXIST
        print("Error:", e)