How to download big file in python via ftp (with m

Posted 2019-01-23 21:29

UPDATE #1

The code in the question works well on a stable connection (such as a local network or intranet).

UPDATE #2

I implemented an FTPClient class with ftplib which can:

  1. monitor download progress
  2. reconnect after a timeout or disconnect
  3. make several attempts to download a file
  4. show the current download speed.

After reconnecting, it resumes the download from the point of disconnection (if the FTP server supports it). For details, see my answer below.
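As a rough illustration of resuming from the disconnect point (this is a minimal sketch, not the FTPClient class itself), the `rest` argument of `ftplib.FTP.retrbinary` can be set to the size of the partial local file. The function name `download_with_resume` and the `connect` factory are hypothetical, and the sketch assumes that any existing local file is a partial copy of the same remote file.

```python
import ftplib
import os
import time


def download_with_resume(connect, remote_name, local_name,
                         max_attempts=5, wait=1.0):
    """Download remote_name, resuming from the local file size after a failure.

    `connect` is a hypothetical zero-argument factory that returns a
    connected, logged-in ftplib.FTP object (or something with the same
    retrbinary/close methods).
    """
    for _ in range(max_attempts):
        # Bytes already on disk tell the server where to restart (REST).
        offset = os.path.getsize(local_name) if os.path.exists(local_name) else 0
        ftp = None
        try:
            ftp = connect()
            # Append mode keeps the bytes received by earlier attempts.
            with open(local_name, 'ab') as f:
                ftp.retrbinary('RETR ' + remote_name, f.write,
                               rest=offset or None)
            return True
        except ftplib.all_errors:
            # Timeout, dropped connection, or an FTP error reply (for
            # example a server that rejects REST): wait and try again.
            time.sleep(wait)
        finally:
            if ftp is not None:
                ftp.close()
    return False
```

The function returns `False` once `max_attempts` is exhausted, so the caller can decide whether to delete the partial file or keep it for a later run.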


Question

I have to implement a task in Python that downloads a batch of big files every day (0.3-1.5 GB per file, 200-300 files) via FTP and then processes them. I did it with ftplib, but from time to time it hangs and cannot complete the download of some files. To fix the issue I started experimenting with KEEPALIVE settings, but I still haven't gotten a good result:

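import ftplib
import logging
import os
import socket
from contextlib import closing

# Excerpt from a method: self.host, self.port, etc. are attributes of the
# enclosing class; local_filename, orig_filename, filename and file_ext
# are defined earlier in that method.
with closing(ftplib.FTP()) as ftp:
    try:
        ftp.connect(self.host, self.port, 30*60)  # 30-minute timeout
        # print(ftp.getwelcome())
        ftp.login(self.login, self.passwd)
        ftp.set_pasv(True)
        # Keepalive is set on ftp.sock, which is the control connection.
        ftp.sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
        ftp.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 75)
        ftp.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)
        with open(local_filename, 'w+b') as f:
            res = ftp.retrbinary('RETR %s' % orig_filename, f.write)

        # Check the reply code only; the text after "226" varies by server.
        if not res.startswith('226'):
            logging.error('Download of file {0} is not complete.'.format(orig_filename))
            os.remove(local_filename)  # the file is already closed here
            return None

        os.rename(local_filename, self.storage + filename + file_ext)
        ftp.rename(orig_filename, orig_filename + '.copied')

        return filename + file_ext

    except Exception:
        logging.exception('Error during download from FTP')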
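
The keepalive options set above are platform-specific: `TCP_KEEPIDLE` and `TCP_KEEPINTVL` are not defined on every operating system, so the code can fail with an `AttributeError` outside Linux. A minimal sketch of a more portable helper follows; the name `enable_keepalive` and its default values are assumptions for illustration. Note that in ftplib `ftp.sock` is the control connection, and the data connection used by `retrbinary` is a separate socket, so these settings only affect the control channel.

```python
import socket


def enable_keepalive(sock, idle=60, interval=75, count=5):
    """Turn on TCP keepalive, setting only the options this platform exposes."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket tuning constants are missing on some platforms,
    # so look them up defensively instead of referencing them directly.
    for name, value in (('TCP_KEEPIDLE', idle),
                        ('TCP_KEEPINTVL', interval),
                        ('TCP_KEEPCNT', count)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

It would be called as `enable_keepalive(ftp.sock)` right after `ftp.login(...)`.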

Details

  • Usually it takes 7-15 minutes to download a file.
  • The FTP server logs always show that the files were fully downloaded, but the client side hangs. This does not happen every time, only occasionally.

Questions

  • Could this be caused by a disconnect?
  • How can I monitor the download process and reconnect if the connection is lost?

Tags: python ftplib
1 answer
Viruses.
#2 · 2019-01-23 22:00

Because I couldn't find any good suggestions or code samples, I implemented my own solution. Many thanks to the Stack Overflow community for some of the ideas I used in my code. I put the code on GitHub (pyFTPclient) because of its size (about 120 lines).

I tested the solution on a poor-quality network (including 3G mobile internet) and it worked fine for me. Of course, it may still have some bugs.

I will appreciate any comments or suggestions. Thank you in advance.
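
For readers who only need the progress and speed monitoring, here is a minimal sketch of the idea. It is not the pyFTPclient code; the class name `ProgressWriter` and its methods are hypothetical. It wraps the file's `write` so it can be passed as the callback to `retrbinary`, and records when data last arrived so that another thread could detect a stalled transfer.

```python
import time


class ProgressWriter:
    """Wrap a file's write() to count bytes and estimate download speed."""

    def __init__(self, fileobj, clock=time.monotonic):
        self._file = fileobj
        self._clock = clock
        self._started = clock()
        self.last_activity = self._started
        self.received = 0

    def __call__(self, chunk):
        # Called by retrbinary for every block of data received.
        self._file.write(chunk)
        self.received += len(chunk)
        self.last_activity = self._clock()

    def speed(self):
        """Average bytes per second since the writer was created."""
        elapsed = self._clock() - self._started
        return self.received / elapsed if elapsed > 0 else 0.0

    def stalled(self, timeout):
        """True if no data has arrived for more than `timeout` seconds."""
        return self._clock() - self.last_activity > timeout
```

A typical use would be `writer = ProgressWriter(f)` followed by `ftp.retrbinary('RETR ' + name, writer)`, with a separate monitoring thread polling `writer.speed()` and `writer.stalled(60)` and closing the connection when the transfer stalls.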
