I have been troubleshooting an issue that occurs when we download a file over FTP/FTPS. The file is downloaded successfully, but nothing runs after the download completes, and no error is raised that could give more information about the issue. I searched Stack Overflow and found this link, which describes a similar problem. It looks like I am facing the same issue, though I am not sure, and I need a little more help resolving it.
I tried setting the FTP connection timeout to 60 minutes, but it did not help. Before this I was using retrbinary() from ftplib, and the same issue occurred there. I also tried different blocksize and windowsize values, but the issue was still reproducible.
I am trying to download a file of about 3GB from an AWS EMR cluster. Sample code is below.
    def download_ftp(self, ip, port, user_name, password, file_name, target_path):
        try:
            os.chdir(target_path)
            ftp = FTP(host=ip)
            ftp.connect(port=int(port), timeout=3000)
            ftp.login(user=user_name, passwd=password)
            if ftp.nlst(file_name) != []:
                dir = os.path.split(file_name)
                ftp.cwd(dir[0])
                for filename in ftp.nlst(file_name):
                    sock = ftp.transfercmd('RETR ' + filename)
                    def background():
                        fhandle = open(filename, 'wb')
                        while True:
                            block = sock.recv(1024 * 1024)
                            if not block:
                                break
                            fhandle.write(block)
                        sock.close()
                    t = threading.Thread(target=background)
                    t.start()
                    while t.is_alive():
                        t.join(60)
                        ftp.voidcmd('NOOP')
                    logger.info("File " + filename + " fetched successfully")
                return True
            else:
                logger.error("File " + file_name + " is not present in FTP")
        except Exception, e:
            logger.error(e)
            raise
Another option suggested in the link mentioned above is to close the connection after downloading a small chunk of the file and then reopen it. Can someone suggest how this can be achieved? I am not sure how to resume the download from the point where it stopped before the connection was closed. Will this method be a foolproof way of downloading the entire file?
I don't know much about FTP server-level timeout settings, so I don't know what needs to be altered or how. I basically want to write a generic FTP downloader that can download files over FTP/FTPS.
When I use the retrbinary() method of ftplib and set the debug level to 2:
    ftp.set_debuglevel(2)
    ftp.retrbinary('RETR ' + filename, fhandle.write)
the logs below are printed.
    cmd 'TYPE I'
    put 'TYPE I\r\n'
    get '200 Type set to I.\r\n'
    resp '200 Type set to I.'
    cmd 'PASV'
    put 'PASV\r\n'
    get '227 Entering Passive Mode (64,27,160,28,133,251).\r\n'
    resp '227 Entering Passive Mode (64,27,160,28,133,251).'
    cmd 'RETR FFFT_BRA_PM_R_201711.txt'
    put 'RETR FFFT_BRA_PM_R_201711.txt\r\n'
    get '150 Opening BINARY mode data connection for FFFT_BRA_PM_R_201711.txt.\r\n'
    resp '150 Opening BINARY mode data connection for FFFT_BRA_PM_R_201711.txt.'
Before doing anything, note that there is something very wrong with your connection, and diagnosing that and getting it fixed is far better than working around it. But sometimes, you just have to deal with a broken server, and even sending keepalives doesn't help. So, what can you do?
The trick is to download a chunk at a time, then abort the download—or, if the server can't handle aborting, close and reopen the connection.
Note that I'm testing everything below with ftp://speedtest.tele2.net/5MB.zip, and I hope this doesn't cause a million people to start hammering their servers. Of course you'll want to test it with your actual server.
Testing for REST
The entire solution of course relies on the server being able to resume transfers, which not all servers can do, especially when you're dealing with something badly broken. So we'll need to test for that. Note that this test will be very slow, and very heavy on the server, so do not test with your 3GB file; find something much smaller. Also, if you can put something readable there, it will help for debugging, because you may be stuck comparing files in a hex editor.
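A minimal sketch of such a resume test, assuming anonymous login works and the server accepts passive mode. The names download_reconnecting and make_ftp are made up for illustration. It reconnects before every piece, asks the server to resume at the current local file offset, reads whatever a single recv returns, and prints the offset reached so far:

```python
import ftplib


def download_reconnecting(make_ftp, remote_name, out, max_recv=1024 * 1024):
    """Fetch remote_name one recv() at a time, reconnecting before each piece.

    make_ftp is a hypothetical callable returning a connected, logged-in
    ftplib.FTP object; out is a local file opened in binary write mode.
    """
    while True:
        ftp = make_ftp()
        try:
            ftp.voidcmd('TYPE I')          # binary mode, so offsets are exact
            # rest=<offset> makes ftplib send "REST <offset>" before the RETR.
            sock = ftp.transfercmd('RETR ' + remote_name, rest=out.tell())
            try:
                buf = sock.recv(max_recv)
            finally:
                sock.close()
        finally:
            ftp.close()                    # drop the connection without QUIT
        if not buf:                        # nothing left: we reached the end
            break
        out.write(buf)
        print(out.tell())                  # watch how far each pass gets
    return out.tell()


# Example use against the public test server mentioned above:
#
# def make_ftp():
#     ftp = ftplib.FTP('speedtest.tele2.net')
#     ftp.login()                          # anonymous login
#     return ftp
#
# with open('5MB.zip', 'wb') as f:
#     download_reconnecting(make_ftp, '5MB.zip', f)
```

Passing the connection factory in, rather than a hostname, is only a design convenience here: it keeps login details (or an FTP_TLS connection for FTPS) out of the loop.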
Each read will probably not return 1MB, but instead something under 8KB. Let's assume you're seeing the local file offset grow as 1448, then 2896, 4344, etc.
- If the server rejects REST, the server does not handle resuming; give up, you're hosed.
- [...] f.seek, I can explain, but you probably won't run into it.
Testing for ABOR
One thing we can do is try to abort the download and not reconnect.
You're going to want to try multiple variations:
- Just sock.close.
- Just ftp.abort.
- sock.close after ftp.abort.
- ftp.abort after sock.close.
- Each of the four above with TYPE I moved to before the loop instead of each time.
Some will raise exceptions. Others will just appear to hang forever. If that's true for all 8 of them, we need to give up on aborting. But if any of them works, great!
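The variations above can be sketched with one function whose arguments select the cleanup order and where TYPE I is sent. This is an illustration rather than a tested recipe for any particular server: the name download_aborting and its steps and type_each_time parameters are hypothetical, and ftp is assumed to be an already logged-in ftplib.FTP object.

```python
def download_aborting(ftp, remote_name, out, steps=('close', 'abort'),
                      type_each_time=True, max_recv=1024 * 1024):
    """Fetch remote_name piece by piece over one control connection.

    steps is the cleanup order to try after each piece, for example
    ('close',), ('abort',), ('close', 'abort') or ('abort', 'close').
    """
    if not type_each_time:
        ftp.voidcmd('TYPE I')              # set binary mode once, up front
    while True:
        if type_each_time:
            ftp.voidcmd('TYPE I')          # or set it again before every RETR
        # Resume at the current local offset (ftplib sends REST for us).
        sock = ftp.transfercmd('RETR ' + remote_name, rest=out.tell())
        buf = sock.recv(max_recv)
        if not buf:                        # end of file
            sock.close()
            break
        out.write(buf)
        for step in steps:
            if step == 'close':
                sock.close()               # close the data connection
            elif step == 'abort':
                ftp.abort()                # sends ABOR on the control connection
            else:
                raise ValueError('unknown step: %r' % (step,))
    return out.tell()
```

Running it once per combination of steps and type_each_time covers all eight cases. Expect some combinations to raise (ftp.abort raises ftplib.error_proto on an unexpected reply) and others to block, so a socket timeout on the connection is worth setting while experimenting.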
Downloading a full chunk
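The other way to speed things up is to download 1MB (or more) at a time before aborting or reconnecting. Instead of calling sock.recv once per transfer and writing whatever it returns, keep calling it in a loop until a full chunk has arrived or the server closes the data connection.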
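A minimal sketch of the looped read, assuming sock is the data socket returned by ftp.transfercmd and out is a local file opened in binary mode. The helper name recv_full_chunk is made up for illustration:

```python
def recv_full_chunk(sock, out, chunk=1024 * 1024):
    """Write up to `chunk` bytes from sock to out; return the bytes written.

    A single recv() usually returns far less than requested, so keep
    asking for the remainder until the chunk is full or the peer closes.
    """
    remaining = chunk
    while remaining:
        buf = sock.recv(remaining)
        if not buf:                        # data connection closed: end of file
            break
        out.write(buf)
        remaining -= len(buf)
    return chunk - remaining
```

A return value of 0 means the server had nothing more to send, which is the signal to stop the outer abort-or-reconnect loop.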
Now, instead of reading 1448 or 8192 bytes for each transfer, you're reading up to 1MB for each transfer. Try pushing it further.
Combining with keepalives
If, say, your downloads were failing at 10MB, and the keepalive code in your question got things up to 512MB, but it just wasn't enough for 3GB—you can combine the two. Use keepalives to read 512MB at a time, then abort or reconnect and read the next 512MB, until you're done.
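As a rough sketch of that combination, the helper below reads one large chunk on a background thread while the calling thread sends NOOP on the control connection, in the same way as the code in the question. The name recv_chunk_with_keepalive and its parameters are hypothetical, and it assumes the server answers NOOP while a transfer is in progress, which not every server does.

```python
import threading


def recv_chunk_with_keepalive(ftp, sock, out, chunk=512 * 1024 * 1024,
                              interval=60):
    """Read up to `chunk` bytes from sock into out, sending NOOP meanwhile."""
    state = {'received': 0, 'error': None}

    def background():
        try:
            while state['received'] < chunk:
                want = min(chunk - state['received'], 1024 * 1024)
                buf = sock.recv(want)
                if not buf:                # server closed the data connection
                    break
                out.write(buf)
                state['received'] += len(buf)
        except Exception as exc:           # hand the error to the caller
            state['error'] = exc

    t = threading.Thread(target=background)
    t.start()
    while t.is_alive():
        t.join(interval)                   # wait up to `interval` seconds
        if t.is_alive():
            ftp.voidcmd('NOOP')            # keepalive on the control connection
    if state['error'] is not None:
        raise state['error']
    return state['received']
```

After each call, abort or reconnect and start the next RETR at the new local file offset; a return value of 0 means the download is complete.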