How to make Python check if ftp directory exists?

Posted 2019-03-24 17:55

Question:

I'm using this script to connect to a sample FTP server and list the available directories:

from ftplib import FTP
ftp = FTP('ftp.cwi.nl')   # connect to host, default port (an example server; I'll use a different one)
ftp.login()               # user anonymous, passwd anonymous@
ftp.retrlines('LIST')     # list directory contents
ftp.quit()

How do I use the output of ftp.retrlines('LIST') to check whether a directory (for example, public_html) exists? If it exists, I want to cd into it, execute some other code, and exit; if not, execute the code right away and exit.

Answer 1:

nlst() returns a list of the names of all files and directories in the current directory on the FTP server. Just check whether your folder name is in it.

from ftplib import FTP 
ftp = FTP('yourserver')
ftp.login('username', 'password')

folderName = 'yourFolderName'
if folderName in ftp.nlst():
    # do the needed task here
    pass


Answer 2:

You can collect the listing in a list. For example:

import ftplib
server="localhost"
user="user"
password="test@email.com"
try:
    ftp = ftplib.FTP(server)
    ftp.login(user, password)
except Exception as e:
    print(e)
else:
    filelist = []  # to store all lines of the listing
    ftp.retrlines('LIST', filelist.append)    # append each line to the list
    f = 0
    for f in filelist:
        if "public_html" in f:
            # do something
            f = 1
    if f == 0:
        print("No public_html")
        # do your processing here


Answer 3:

You can send "MLST path" over the control connection. That will return a line including the type of the path (notice 'type=dir' down here):

250-Listing "/home/user":
 modify=20131113091701;perm=el;size=4096;type=dir;unique=813gc0004; /
250 End MLST.

Translated into Python, that should be something along these lines:

import ftplib
ftp = ftplib.FTP()
ftp.connect('ftp.somedomain.com', 21)
ftp.login()
resp = ftp.sendcmd('MLST pathname')
if 'type=dir;' in resp:
    # it should be a directory
    pass

Of course, the code above is not 100% reliable and would need a 'real' parser. You can look at the implementation of the MLSD command in ftplib.py, which is very similar (MLSD differs from MLST in that the response is sent over the data connection, but the format of the lines being transmitted is the same): http://hg.python.org/cpython/file/8af2dc11464f/Lib/ftplib.py#l577
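As a rough illustration of what such a parser could look like, here is a minimal sketch. The helper names parse_mlst_facts and is_directory are made up for this example. It assumes the server follows the usual MLST reply layout, where the facts sit on a single line that begins with one space, as in the sample response above.

```python
def parse_mlst_facts(resp):
    """Return (facts, pathname) parsed from a multi-line MLST reply string."""
    for line in resp.splitlines():
        # The fact line is the only one that starts with a space.
        if line.startswith(' '):
            facts_part, _, pathname = line[1:].partition(' ')
            facts = {}
            for fact in facts_part.rstrip(';').split(';'):
                key, _, value = fact.partition('=')
                facts[key.lower()] = value
            return facts, pathname
    return {}, None


def is_directory(resp):
    """True if the 'type' fact says the path is a directory."""
    facts, _ = parse_mlst_facts(resp)
    # 'cdir' and 'pdir' are the types for the listed directory and its parent.
    return facts.get('type', '').lower() in ('dir', 'cdir', 'pdir')
```

With a live connection this would be called as is_directory(ftp.sendcmd('MLST pathname')), and sendcmd still raises ftplib.error_perm when the path does not exist.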



Answer 4:

The examples attached to ghostdog74's answer have a bit of a bug: the list you get back is the whole line of the response, so you get something like

drwxrwxrwx    4 5063     5063         4096 Sep 13 20:00 resized

This means that if your directory name is something like '50' (which it was in my case), you'll get a false positive. I modified the code to handle this:

def directory_exists_here(self, directory_name):
    filelist = []
    self.ftp.retrlines('LIST',filelist.append)
    for f in filelist:
        if f.split()[-1] == directory_name:
            return True
    return False

N.B., this is inside an FTP wrapper class I wrote and self.ftp is the actual FTP connection.
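The same idea can be sketched as a standalone function that works on lines already collected from LIST. This is only an illustration: listing_has_directory is a hypothetical name, and it assumes the server prints Unix "ls -l" style lines with nine columns, which the FTP protocol does not guarantee. Splitting into at most nine fields keeps names that contain spaces intact, and the leading 'd' check skips plain files.

```python
def listing_has_directory(lines, directory_name):
    """Check LIST output lines for a directory entry with the given name."""
    for line in lines:
        # Assumed layout: perms, links, owner, group, size, month, day, time, name
        fields = line.split(None, 8)
        if len(fields) == 9 and line.startswith('d') and fields[8] == directory_name:
            return True
    return False
```

Usage would be along the lines of collecting the lines with ftp.retrlines('LIST', filelist.append) and then calling listing_has_directory(filelist, 'public_html').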



Answer 5:

Tom is correct, but no one voted his answer up. For those who upvoted ghostdog74's answer, I have combined the two into the code below; it works for me and should work for you.

import ftplib
server="localhost"
user="user"
uploadToDir="public_html"
password="test@email.com"
try:
    ftp = ftplib.FTP(server)
    ftp.login(user, password)
except Exception as e:
    print(e)
else:
    filelist = []  # to store all names
    ftp.retrlines('NLST', filelist.append)    # append each line to the list
    num = 0
    for f in filelist:
        if f.split()[-1] == uploadToDir:
            # do something
            num = 1
    if num == 0:
        print("No public_html")
        # do your processing here

First, with ghostdog74's method, a test such as "public" in f evaluates to true even when no directory named public exists, because the word "public" occurs in "public_html". That is where Tom's if condition helps, so I changed the test to if f.split()[-1] == uploadToDir:.

Also, if you look for a directory name that does not exist while other files and folders do, the second if in ghostdog74's code never executes: f is never 0 because the for loop overwrites it. So I used a separate num variable instead of f, and voila, the goodness follows...

Vinay and Jonathon are right about what they commented.



Answer 6:

In Python 3.x the nlst() method is deprecated. Use this code:

import ftplib

remote = ftplib.FTP('example.com')
remote.login()

if 'foo' in [name for name, data in list(remote.mlsd())]:
    # do your stuff
    pass

mlsd() returns a generator of (name, facts) tuples, so the names have to be extracted before the membership test. The list() call is not strictly required, because the list comprehension already iterates over the generator.

You can wrap the [name for name, data in list(remote.mlsd())] list comprehension in a function or method and call it whenever you need to check whether a directory (or file) exists.
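One possible shape for such a wrapper is sketched below. The function name remote_dir_exists is invented for this example, and it assumes the server supports MLSD and reports the 'type' fact. Unlike the one-liner above, it also checks that the entry is a directory and not a file with the same name.

```python
def remote_dir_exists(ftp, name, path=''):
    """Return True if `name` is listed as a directory in `path` on the server."""
    # mlsd() yields (name, facts) tuples; fact keys are lower-cased by ftplib.
    for entry_name, facts in ftp.mlsd(path, facts=['type']):
        if entry_name == name and facts.get('type', '').lower() == 'dir':
            return True
    return False
```

It would be called as remote_dir_exists(remote, 'foo') with the remote connection object from the code above.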



Answer 7:

=> I found this web-page while googling for a way to check if a file exists using ftplib in python. The following is what I figured out (hope it helps someone):

=> When trying to list non-existent files/directories, ftplib raises an exception. Even though adding a try/except block is standard practice and a good idea, I would prefer my FTP scripts to download file(s) only after making sure they exist. This helps keep my scripts simpler, at least when listing a directory on the FTP server is possible.

For example, the Edgar FTP server has multiple files that are stored under the directory /edgar/daily-index/. Each file is named like "master.YYYYMMDD.idx". There is no guarantee that a file will exist for every date (YYYYMMDD): there is no file dated 24th Nov 2013, but there is a file dated 22nd Nov 2013. How does listing work in these two cases?

# Code
from __future__ import print_function  
import ftplib  

ftp_client = ftplib.FTP("ftp.sec.gov", "anonymous", "MY.EMAIL@gmail.com")  
resp = ftp_client.sendcmd("MLST /edgar/daily-index/master.20131122.idx")  
print(resp)   
resp = ftp_client.sendcmd("MLST /edgar/daily-index/master.20131124.idx")  
print(resp)  

# Output
250-Start of list for /edgar/daily-index/master.20131122.idx  
modify=20131123030124;perm=adfr;size=301580;type=file;unique=11UAEAA398;  
UNIX.group=1;UNIX.mode=0644;UNIX.owner=1019;  
/edgar/daily-index/master.20131122.idx
250 End of list  

Traceback (most recent call last):
  File "", line 10, in <module>
    resp = ftp_client.sendcmd("MLST /edgar/daily-index/master.20131124.idx")
  File "lib/python2.7/ftplib.py", line 244, in sendcmd
    return self.getresp()
  File "lib/python2.7/ftplib.py", line 219, in getresp
    raise error_perm, resp
ftplib.error_perm: 550 '/edgar/daily-index/master.20131124.idx' cannot be listed

As expected, listing a non-existent file generates an exception.

=> Since I know that the Edgar FTP server will surely have the directory /edgar/daily-index/, my script can do the following to avoid raising exceptions due to non-existent files:
a) list this directory.
b) download the required file(s) if they are present in this listing. To check the listing, I typically perform a regexp search on the list of strings that the listing operation returns.

For example this script tries to download files for the past three days. If a file is found for a certain date then it is downloaded, else nothing happens.

import ftplib
import re
from datetime import date, timedelta

ftp_client = ftplib.FTP("ftp.sec.gov", "anonymous", "MY.EMAIL@gmail.com")
listing = []
# List the directory and store each directory entry as a string in a list
ftp_client.retrlines("LIST /edgar/daily-index", listing.append)
# go back 1, 2 and 3 days
for diff in [1, 2, 3]:
  day = date.today() - timedelta(days=diff)
  today = day.strftime("%Y%m%d")
  month = day.strftime("%Y_%m")
  # the absolute path of the file we want to download - if it indeed exists
  file_path = "/edgar/daily-index/master.%(date)s.idx" % { "date": today }
  # create a regex to match the file's name (escaped so the dots match literally)
  pattern = re.compile(re.escape("master.%(date)s.idx" % { "date": today }))
  # keep the elements of the listing that match the pattern
  found = [x for x in listing if pattern.search(x)]
  if found:
    # the local directory ./edgar/daily-index/<month>/ must already exist
    local_path = './edgar/daily-index/%(month)s/master.%(date)s.idx' % {
      "month": month, "date": today
    }
    with open(local_path, 'wb') as local_file:
      ftp_client.retrbinary(
        "RETR %(file_path)s" % { "file_path": file_path },
        local_file.write
      )

=> Interestingly, there are situations where we cannot list a directory on the FTP server. The edgar FTP server, for example, disallows listing on /edgar/data because it contains far too many sub-directories. In such cases, I wouldn't be able to use the "List and check for existence" approach described here - in these cases I would have to use exception handling in my downloader script to recover from non-existent file/directory access attempts.
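For the exception-handling route, a minimal sketch could look like the following. The helper name try_cwd is invented for this example. It relies on ftplib raising ftplib.error_perm for 5xx replies, as the 550 traceback above shows, and it assumes a refused CWD means the directory is missing or inaccessible.

```python
import ftplib


def try_cwd(ftp, directory):
    """Try to change into `directory`; return False if the server refuses."""
    try:
        ftp.cwd(directory)
    except ftplib.error_perm:
        # 5xx reply: the directory does not exist or cannot be entered.
        return False
    return True
```

For the original question this gives: if try_cwd(ftp, 'public_html') is true, the connection is already inside public_html and the directory-specific code can run; otherwise the fallback code runs right away.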



Tags: python ftp