Python ftplib: NOOP command works in ASCII mode but not in binary mode

Posted 2019-07-04 07:55

Question:

I have a threaded FTP script. While the data socket is receiving the file, a loop in the main thread sends NOOP commands on the control socket to keep the control connection alive during large transfers.

I cannot use FTP.retrbinary() because keeping the control connection alive requires handling the data socket separately from the control socket, and retrbinary() does not allow that.

Code below:

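import threading
from ftplib import FTP

def downloadFile(filename, folder):
    myhost = 'HOST'
    myuser = 'USER'
    passw = 'PASS'
    # log in
    ftp = FTP(myhost, myuser, passw)

    ftp.set_debuglevel(2)
    # force binary (image) mode
    ftp.voidcmd('TYPE I')
    # open the data connection; the control connection stays on ftp
    sock = ftp.transfercmd('RETR ' + filename)
    def background():
        # receive on the data socket until the server closes it
        with open(folder + filename, 'wb') as f:
            while True:
                block = sock.recv(1024*1024)
                if not block:
                    break
                f.write(block)
        sock.close()
    t = threading.Thread(target=background)
    t.start()
    # send a NOOP on the control connection every 120 seconds
    while t.is_alive():
        t.join(120)
        ftp.voidcmd('NOOP')
    ftp.quit()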


My problem: FTP.transfercmd("RETR " + filename) defaults to ASCII transfers, and I'm transferring video, so the transfer has to be binary (hence the ftp.voidcmd('TYPE I') call to force binary mode).

If I DON'T call ftp.voidcmd('TYPE I'), the NOOP commands are sent successfully and the output is as follows:

*cmd* 'NOOP'
*put* 'NOOP\r\n'
*get* '200 NOOP: data transfer in progress\n'
*resp* '200 NOOP: data transfer in progress'
*cmd* 'NOOP'
*put* 'NOOP\r\n'
*get* '200 NOOP: data transfer in progress\n'
*resp* '200 NOOP: data transfer in progress'
*cmd* 'NOOP'
*put* 'NOOP\r\n'
*get* '200 NOOP: data transfer in progress\n'
*resp* '200 NOOP: data transfer in progress'

etc. But the file is transferred in ASCII mode and is therefore corrupted. If I DO call ftp.voidcmd('TYPE I'), the NOOP command is sent only once, and the control socket doesn't respond until the transfer completes. If the file is large, the control socket times out as if the NOOPs were never sent...

Very strange, but I am sure the cause is simple. It seems as though transfercmd() is not splitting the control and data sockets as it is supposed to, so the ftp object is not separated from the data stream, or something like that.

Thanks in advance for any advice you can offer.

Answer 1:

tcpdump confirms that the server only sends 226 Transfer complete. after it has sent the entire file.

I suspect that's part of the FTP specification.

In fact, look at the retrbinary code in ftplib.py:

    self.voidcmd('TYPE I')
    conn = self.transfercmd(cmd, rest)
    while 1:
        data = conn.recv(blocksize)
        if not data:
            break
        callback(data)
    conn.close()
    return self.voidresp()

The last line expects to get the transfer result (as known to the server) only after the transfer is complete.

In fact, it seems your code is missing the voidresp() call.
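To make the missing voidresp() step concrete, here is a minimal sketch of the question's threaded download with the final reply read explicitly. It is not taken from the answer above. The function name `download_with_keepalive` and its parameters are hypothetical, and `ftp` is assumed to be an already logged-in `ftplib.FTP` object. As a swapped-in technique, the NOOPs are sent with `putcmd()`, which only writes the command and does not wait for a reply, so the loop cannot block on a server that stays silent until the transfer ends. The sketch assumes the server eventually answers every NOOP; if it silently drops them, the final reads would block until the socket times out.

```python
import threading


def download_with_keepalive(ftp, remote_name, out_file,
                            interval=120, blocksize=64 * 1024):
    """Fetch remote_name into out_file, sending NOOPs without waiting for replies."""
    ftp.voidcmd('TYPE I')                        # binary mode
    sock = ftp.transfercmd('RETR ' + remote_name)

    def background():
        # Read the data connection until the server closes it.
        try:
            while True:
                block = sock.recv(blocksize)
                if not block:
                    break
                out_file.write(block)
        finally:
            sock.close()

    t = threading.Thread(target=background)
    t.start()

    pending = 0                                  # NOOPs sent but not yet answered
    while t.is_alive():
        t.join(interval)
        if t.is_alive():
            ftp.putcmd('NOOP')                   # send only, do not read a reply
            pending += 1

    # Assumption: one reply per NOOP plus the final transfer reply (e.g. 226).
    # voidresp() raises if a reply is not a 2xx success code.
    return [ftp.voidresp() for _ in range(pending + 1)]
```

The order in which the NOOP replies and the 226 arrive may differ between servers; reading `pending + 1` replies at the end is intended to consume them all either way.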

I am not very familiar with FTP, but from what I've seen, background downloaders like lftp actually open a new control connection for each parallel download.

You have a valid concern if your file is really large.

There are many extensions to FTP; there may be one that does what you want.

Alternatively, you can write a loop like this:

pos = 0
while file not fully downloaded:
    send REST pos, then RETR
    download for a while in a separate thread
    send ABOR
    wait for the separate thread to stop
    pos += length of downloaded chunk
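The loop above could be sketched in Python roughly as follows. This is a simplified illustration under several assumptions, not a tested recipe. The name `download_in_chunks` and its parameters are hypothetical, `ftp` is assumed to be a logged-in `ftplib.FTP` object, the server is assumed to support REST, and each chunk is limited by byte count in the calling thread rather than by time in a separate thread. The abort step uses ftplib's `abort()` method, which sends the FTP ABOR command; the ftplib documentation itself warns that this does not always work, and some servers send more than one reply afterwards, which this sketch does not handle.

```python
def download_in_chunks(ftp, remote_name, out_file,
                       chunk_size=64 * 1024 * 1024, blocksize=64 * 1024):
    """Fetch remote_name in restartable chunks; return the number of bytes written."""
    ftp.voidcmd('TYPE I')                        # binary mode, needed for REST offsets
    pos = 0
    finished = False
    while not finished:
        # transfercmd() sends "REST <pos>" before RETR when rest is not None.
        sock = ftp.transfercmd('RETR ' + remote_name, rest=pos if pos else None)
        received = 0
        try:
            while received < chunk_size:
                block = sock.recv(min(blocksize, chunk_size - received))
                if not block:                    # server closed: end of file
                    finished = True
                    break
                out_file.write(block)
                received += len(block)
        finally:
            sock.close()
        pos += received
        if finished:
            ftp.voidresp()                       # final reply, e.g. 226
        else:
            # Stop this transfer early. abort() reads a single reply; a server
            # that sends two (e.g. 426 then 226) would need an extra read here.
            ftp.abort()
    return pos
```

Because every chunk issues fresh commands on the control connection, the connection sees regular traffic without any NOOPs, at the cost of restarting the transfer once per chunk.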