HTTP POST binary files using Python: concise non-p

Posted 2019-03-16 12:41

Question:

I'm interested in writing a short Python script which uploads a short binary file (.wav/.raw audio) via a POST request to a remote server.

I've done this with pycurl, which makes it very simple and results in a concise script; unfortunately it also requires that the end user have pycurl installed, which I can't rely on.

I've also seen some examples in other posts which rely only on basic libraries (urllib, urllib2, etc.); however, these generally seem to be quite verbose, which is also something I'd like to avoid.

I'm wondering if there are any concise examples which do not require the use of external libraries, and which will be quick and easy for third parties to understand, even if they aren't particularly familiar with Python.

What I'm using at present looks like,


import urllib

import pycurl
import simplejson


def upload_wav(wavfile, url=None, verbose=False, **kwargs):
    """Upload a wav file to the server, return the response."""

    class responseCallback:
        """Store the server response."""
        def __init__(self):
            self.contents = ''

        def body_callback(self, buf):
            # pycurl calls this with each chunk of the response body
            self.contents = self.contents + buf

        def decode(self):
            # Undo URL quoting, then parse as JSON when possible
            self.contents = urllib.unquote(self.contents)
            try:
                self.contents = simplejson.loads(self.contents)
            except ValueError:
                pass  # not JSON: keep the unquoted string

    t = responseCallback()
    c = pycurl.Curl()
    c.setopt(c.POST, 1)
    c.setopt(c.WRITEFUNCTION, t.body_callback)
    c.setopt(c.URL, url)
    postdict = [
        ('userfile', (c.FORM_FILE, wavfile)),  # wav file to post
    ]
    # If there are extra keyword args, add them to the postdict
    for key in kwargs:
        postdict.append((key, kwargs[key]))
    c.setopt(c.HTTPPOST, postdict)
    c.setopt(c.VERBOSE, verbose)
    c.perform()
    c.close()
    t.decode()
    return t.contents

This isn't exact, but it gives you the general idea. It works great and it's simple for third parties to understand, but it requires pycurl.
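
The response-handling step in the function above (unquote, then try to parse JSON) does not depend on pycurl. A minimal sketch of just that step in Python 3, using only the standard library; the helper name decode_response is hypothetical and not part of the original script:

```python
import json
from urllib.parse import unquote


def decode_response(raw):
    """Percent-decode a response body, then parse it as JSON if possible."""
    # Accept either bytes (as read from a socket) or an already-decoded str.
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    text = unquote(text)
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return text     # not JSON: hand back the unquoted string
```

Catching ValueError rather than using a bare except keeps unrelated errors (a typo, a KeyboardInterrupt) from being silently swallowed.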

Answer 1:

POSTing a file requires multipart/form-data encoding and, as far as I know, there's no easy way (i.e. one-liner or something) to do this with the stdlib. But as you mentioned, there are plenty of recipes out there.

Although they seem verbose, your use case suggests that you can probably just encapsulate it once into a function or class and not worry too much, right? Take a look at the recipe on ActiveState and read the comments for suggestions:

  • Recipe 146306: Http client to POST using multipart/form-data

or see the MultiPartForm class in this PyMOTW, which seems pretty reusable:

  • PyMOTW: urllib2 - Library for opening URLs.

I believe both handle binary files.
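
To give a sense of how much code "encapsulate it once" means, here is a minimal sketch of a multipart/form-data encoder in Python 3. It is not the ActiveState recipe or the PyMOTW class; the function name encode_multipart and its argument shapes are assumptions made for illustration.

```python
import mimetypes
import uuid


def encode_multipart(fields, files):
    """Build a multipart/form-data body.

    fields: dict mapping field name -> str value
    files:  dict mapping field name -> (filename, bytes content)
    Returns (content_type_header_value, body_bytes).
    """
    boundary = uuid.uuid4().hex          # random, so it will not occur in the data
    delimiter = b"--" + boundary.encode("ascii")
    lines = []

    for name, value in fields.items():
        lines += [
            delimiter,
            ('Content-Disposition: form-data; name="%s"' % name).encode("utf-8"),
            b"",
            value.encode("utf-8"),
        ]

    for name, (filename, content) in files.items():
        # Fall back to a generic binary type when the extension is unknown.
        ctype = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        disposition = 'Content-Disposition: form-data; name="%s"; filename="%s"' % (
            name, filename)
        lines += [
            delimiter,
            disposition.encode("utf-8"),
            ("Content-Type: %s" % ctype).encode("ascii"),
            b"",
            content,                     # raw bytes, no escaping needed
        ]

    lines += [delimiter + b"--", b""]    # closing delimiter plus final CRLF
    return "multipart/form-data; boundary=" + boundary, b"\r\n".join(lines)
```

The returned pair can be passed to urllib.request.Request(url, data=body, headers={"Content-Type": content_type}). The sketch does not escape quotes in field names or filenames, so it suits fixed, trusted names only.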



Answer 2:

I ran into a similar issue today. After trying both pycurl and multipart/form-data, I decided to read the Python httplib/urllib2 source code, and I found a comparatively good solution:

  1. Set the Content-Length header (to the size of the file) before doing the POST
  2. Pass an open file object as the data when doing the POST

Here is the code:

import urllib2, os

image_path = "png\\01.png"
url = 'http://xx.oo.com/webserviceapi/postfile/'
length = os.path.getsize(image_path)
png_data = open(image_path, "rb")  # binary mode; the file object is the request body
request = urllib2.Request(url, data=png_data)
request.add_header('Cache-Control', 'no-cache')
request.add_header('Content-Length', '%d' % length)
request.add_header('Content-Type', 'image/png')
res = urllib2.urlopen(request).read().strip()
png_data.close()
print res

See my blog post: http://www.2maomao.com/blog/python-http-post-a-binary-file-using-urllib2/
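
For readers on Python 3, where urllib2 became urllib.request, the same two steps might look like the sketch below. The helper name build_upload_request and the example URL in the comment are hypothetical, not from the answer, and the request is only built here, not sent.

```python
import os
import urllib.request


def build_upload_request(path, url, content_type="application/octet-stream"):
    """Return (request, file_handle) for POSTing the raw bytes of `path`.

    The caller sends the request and is responsible for closing the handle.
    """
    fh = open(path, "rb")  # binary mode, so bytes are passed through unchanged
    req = urllib.request.Request(url, data=fh, method="POST")
    # Step 1: declare the body size up front, taken from the file on disk.
    req.add_header("Content-Length", str(os.path.getsize(path)))
    req.add_header("Content-Type", content_type)
    # Step 2: the open file object itself is the request body (data=fh above).
    return req, fh


# Usage sketch (the URL is a placeholder):
#   req, fh = build_upload_request("clip.wav", "http://example.invalid/upload", "audio/wav")
#   with fh:
#       reply = urllib.request.urlopen(req).read()
```

Note that this sends the file as the entire request body rather than as a form field, so the server has to be written to accept a raw upload.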



Answer 3:

I know this is an old, old thread, but I have a different solution.

If you went through the trouble of building all the magic headers and everything, and are just UPSET that a binary file suddenly can't pass because the Python library is mean, you can monkey-patch a solution:

import httplib

class HTTPSConnection(httplib.HTTPSConnection):
    def _send_output(self, message_body=None):
        # Send the header block first, then the body in a separate call,
        # so the two are never joined into a single string.
        self._buffer.extend(("", ""))
        msg = "\r\n".join(self._buffer)
        del self._buffer[:]
        self.send(msg)
        if message_body is not None:
            self.send(message_body)

httplib.HTTPSConnection = HTTPSConnection

If you are using HTTP:// instead of HTTPS:// then replace all instances of HTTPSConnection above with HTTPConnection.

Before people get upset with me, YES, this is a BAD SOLUTION, but it is a way to fix existing code you really don't want to re-engineer to do it some other way.

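Why does this fix it? Go look at the original Python source, the httplib.py file. In Python 2.7 the stock _send_output appends a str body to the header block before sending; if the headers ended up as unicode, that concatenation tries to decode the binary body as ASCII and fails. The patched version sends the two separately.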



Answer 4:

How's urllib substantially more verbose? You build postdict basically the same way, except you start with

postdict = [ ('userfile', open(wavfile, 'rb').read()) ]

Once you have postdict,

resp = urllib.urlopen(url, urllib.urlencode(postdict))

and then you get and save resp.read() and maybe unquote and try JSON-loading if needed. Seems like it would be actually shorter! So what am I missing...?
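
As a rough check of what this approach puts on the wire, here is a Python 3 sketch (urllib.urlencode moved to urllib.parse.urlencode). The helper name form_encode_file is hypothetical. The file bytes are percent-escaped into an application/x-www-form-urlencoded body, so the server has to expect an ordinary form field rather than a multipart file upload.

```python
from urllib.parse import urlencode, unquote_to_bytes


def form_encode_file(field, payload, **extra):
    """Encode raw bytes plus optional extra fields as a urlencoded POST body."""
    pairs = [(field, payload)] + sorted(extra.items())
    # urlencode percent-escapes every byte outside its safe set, so arbitrary
    # binary data survives, at the cost of up to three characters per byte.
    return urlencode(pairs).encode("ascii")


body = form_encode_file("userfile", b"\x00\xffRIFF")
# Decoding the value side recovers the original bytes.
recovered = unquote_to_bytes(body.split(b"=", 1)[1])
```

The resulting body can be passed as data to urllib.request.urlopen(url, data=body). For audio files the size overhead of the escaping may matter more than the brevity of the code.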



Answer 5:

urllib.urlencode doesn't like some kinds of binary data.