So I'm writing a generic backup application with the os module and pickle, and so far I've tried the code below to see if something is a file or a directory (based on its string input and not its physical contents).
    import os, re

    def test(path):
        # Looks for "name.ext" with a three-letter extension
        prog = re.compile(r"^[-\w,\s]+.[A-Za-z]{3}$")
        result = prog.match(path)
        if os.path.isfile(path) or result:
            print("is file")
        elif os.path.isdir(path):
            print("is directory")
        else:
            print("I dont know")
Problems
test("C:/treeOfFunFiles/")
is directory
test("/beach.jpg")
I dont know
test("beach.jpg")
I dont know
test("/directory/")
I dont know
Desired Output
test("C:/treeOfFunFiles/")
is directory
test("/beach.jpg")
is file
test("beach.jpg")
is file
test("/directory/")
is directory
Resources
What regular expression should I be using to tell the difference between what might be a file and what might be a directory? Or is there a different way to go about this?
In a character class, a - that is meant as a literal hyphen needs to be either the first or last character, or escaped as \-
So change "^[\w-,\s]+\.[A-Za-z]{3}$" to "^[-\w,\s]+\.[A-Za-z]{3}$", for instance.

Otherwise, I think using regexes to determine whether something looks like a filename or a directory is pointless:

/dev/fd0 isn't a file or directory, for instance
~/comm.pipe could look like a file but is a named pipe
~/images/test is a symbolic link to a file called '~/images/holiday/photo1.jpg'

Have a look at the os.path module, which has functions that ask the OS what something is.

This might help someone. I had the exact same need, and I used the following regular expressions to test whether an input string is a directory, a file, or neither.
For a generic file:
For a generic directory:
So the generated Python function looks like:
Example:
This layer of input checking may be reinforced later by the os.path.isfile() and os.path.isdir() functions, as Mr.Squig kindly showed, but I'd bet this preliminary test may save you a few microseconds and boost your script's performance.
PS: While using this piece of code, I noticed I had missed a huge use case: paths that contain special characters such as the dash "-", which is widely used. To solve this I replaced \w{0,}, which only allows word characters (letters, digits and underscore), with .{0,}, which allows any characters. This is more of a workaround than a solution, but that's all I have for now.
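To illustrate the point in the PS, here is a small sketch. It does not use the original expressions; it only shows why \w{0,} rejects a name containing a dash while .{0,} accepts it, and tries a narrower character class as one possible middle ground. The sample name and the [\w-] class are assumptions chosen for illustration.

```python
import re

name = "my-holiday-photo"  # hypothetical sample name containing dashes

# \w matches letters, digits and underscore only, so the dash breaks the match.
word_only = re.fullmatch(r"\w{0,}", name)

# . matches any character except a newline, so any single-line string passes.
any_char = re.fullmatch(r".{0,}", name)

# A hyphen placed last in a character class is taken as a literal hyphen.
word_or_dash = re.fullmatch(r"[\w-]{0,}", name)

print(word_only is not None)     # False
print(any_char is not None)      # True
print(word_or_dash is not None)  # True
```

The character class is stricter than .{0,} because it still rejects characters outside the listed set, such as spaces.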
The os module (through os.path) provides functions to check whether a path is a file or a directory. It is advisable to use this module over regular expressions.
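As a rough sketch of what that can look like, the hypothetical helper below asks the OS about a path using os.path and the stat module. The name classify and its return strings are assumptions made for this example, not part of any library. It only works for paths that exist on the machine running it, so it does not classify a name from its spelling alone.

```python
import os
import stat


def classify(path):
    """Return a short label describing what the OS says `path` is."""
    # Check for a link first: isdir/isfile follow links to their targets.
    if os.path.islink(path):
        return "symbolic link"
    if os.path.isdir(path):
        return "directory"
    if os.path.isfile(path):
        return "file"
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # os.stat raises OSError when the path cannot be found or accessed.
        return "does not exist"
    if stat.S_ISFIFO(mode):
        return "named pipe"
    return "other"


print(classify(os.getcwd()))
print(classify("beach.jpg"))
```

Because the answer depends on what is actually on disk, a string such as "beach.jpg" is reported as a file only if a file of that name exists in the current directory.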