Python - Upload an in-memory file (generated by

Posted 2019-02-20 01:44

Question:

I need to be able to upload a file through FTP and SFTP in Python, but with some unusual constraints.

  1. The file MUST NOT be written to disk.

  2. The file is generated by calling an API and writing the JSON response to the file.

  3. There are multiple calls to the API. It is not possible to retrieve the whole result in a single API call.

  4. I cannot accumulate the full result in a string variable by making all the calls and appending each response until the whole file is in memory. The file could be huge and memory is constrained. Each chunk should be sent and its memory released.

Here is some sample code showing what I would like to do:

import io
from ftplib import FTP

import requests

def chunks_generator():
    # someurl is a placeholder for the paginated API endpoint
    for offset in range(0, 4000, 100):
        response = requests.get(url=someurl, params={'offset': offset, 'limit': 100})
        yield response.content  # raw bytes of the JSON response

def upload_file(self):
    for chunk in chunks_generator():
        chunk_io = io.BytesIO(chunk)
        ftp = FTP(self.host)
        ftp.login(user=self.username, passwd=self.password)
        ftp.cwd(self.remote_path)
        # STOR replaces the remote file on every iteration,
        # so only the last chunk survives
        ftp.storbinary("STOR " + "myfilename.json", chunk_io)

I want only one file with all the chunks appended. What already works is holding the whole file in memory and sending it at once, like this:

string_io = io.BytesIO(all_chunks_together_in_one_string)
ftp = FTP(self.host)
ftp.login(user=self.username, passwd=self.password)
ftp.cwd(self.remote_path)
ftp.storbinary("STOR " + "myfilename.json", string_io)

Bonus

I need this with ftplib, but will also need it with Paramiko for SFTP. If other libraries would work better for this, I am open to them.

What if I need to zip the file? Can I zip each chunk and send one compressed chunk at a time?

Answer 1:

You can implement a file-like class whose .read(blocksize) method retrieves the data with requests each time it is called.

Something like this (untested):

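import requests

class ChunksGenerator:
    def __init__(self, requests):
        self.requests = requests
        self.i = 0  # offset of the next block to request

    def read(self, blocksize):
        # TODO: somehow detect end-of-file and return b"" in that case,
        # otherwise storbinary never stops calling read()
        buf = self.requests.get(
                  url=someurl, params={'offset': self.i, 'limit': blocksize})
        self.i += blocksize
        return buf.content  # storbinary expects bytes, not a Response object

# someurl is a placeholder; ftp is an FTP connection that is already logged in
generator = ChunksGenerator(requests)
ftp.storbinary("STOR " + "myfilename.json", generator)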
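
A variant of the same idea, sketched under assumptions rather than tested against a real server: instead of issuing the HTTP request inside read(), wrap any generator that yields bytes (for example one that makes the API calls from the question) in a small file-like object. The names IterReader and upload are hypothetical and only serve the illustration. At any moment the reader holds at most one requested block plus one pending chunk in memory.

```python
from ftplib import FTP


class IterReader:
    """Read-only file-like object backed by an iterator of bytes chunks."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._buffer = b""

    def read(self, size=-1):
        if size is None or size < 0:
            # Unbounded read: drain everything that is left.
            data = self._buffer + b"".join(self._chunks)
            self._buffer = b""
            return data
        # Pull chunks until the requested size is buffered or the source ends.
        while len(self._buffer) < size:
            try:
                self._buffer += next(self._chunks)
            except StopIteration:
                break
        data, self._buffer = self._buffer[:size], self._buffer[size:]
        return data  # b"" once the iterator is exhausted, which signals EOF


def upload(host, user, password, remote_name, chunks):
    """Hypothetical helper: stream all chunks into one remote file over FTP."""
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        # storbinary calls read(blocksize) repeatedly until it gets b""
        ftp.storbinary("STOR " + remote_name, IterReader(chunks))
```

Because the upload is a single STOR command, the remote side ends up with one file containing every chunk in order, and end-of-file is detected simply by the generator finishing.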

With Paramiko, you can use the same class with the SFTPClient.putfo method.
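
On the compression part of the question, here is a minimal sketch that assumes a single gzip stream (a .gz file) is acceptable in place of a ZIP archive. It swaps in zlib's incremental compressor, so each chunk is compressed as it arrives and nothing is accumulated. The name gzip_chunks is hypothetical.

```python
import gzip
import zlib


def gzip_chunks(chunks, level=6):
    """Lazily compress an iterable of bytes chunks into one gzip stream."""
    # wbits=31 selects the gzip container (header and trailer)
    # instead of the default zlib wrapper.
    compressor = zlib.compressobj(level, zlib.DEFLATED, 31)
    for chunk in chunks:
        compressed = compressor.compress(chunk)
        if compressed:  # the compressor may buffer and emit nothing yet
            yield compressed
    # Emit whatever is still buffered, plus the gzip trailer.
    yield compressor.flush()


# Round trip on two small fake chunks to show the stream is valid gzip.
parts = [b'{"offset": 0}', b'{"offset": 100}']
stream = b"".join(gzip_chunks(parts))
restored = gzip.decompress(stream)
```

The generator yields bytes, so it can be served through any file-like wrapper whose read() draws from an iterator, and then handed to storbinary or putfo in the same way as uncompressed data.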