python: HTTP PUT with binary data

2019-07-26 13:32发布

问题:

So I adapted urllib2 as suggested by answers to another question:

class HttpRequest(urllib2.Request):
  def __init__(self, *args, **kwargs):
    self._method = kwargs.pop('method', 'GET')
    urllib2.Request.__init__(self, *args, **kwargs)
  def get_method(self):
    return self._method

and it works nicely for PUT with JSON:

req = HttpRequest(url=url, method='PUT', 
    data=json.dumps(metadata))
response = urllib2.urlopen(req)

but it fails with data= binary data (partial stacktrace below):

  File "c:\appl\python\2.7.2\lib\urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "c:\appl\python\2.7.2\lib\urllib2.py", line 394, in open
    response = self._open(req, data)
  File "c:\appl\python\2.7.2\lib\urllib2.py", line 412, in _open
    '_open', req)
  File "c:\appl\python\2.7.2\lib\urllib2.py", line 372, in _call_chain
    result = func(*args)
  File "c:\appl\python\2.7.2\lib\urllib2.py", line 1199, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "c:\appl\python\2.7.2\lib\urllib2.py", line 1168, in do_open
    h.request(req.get_method(), req.get_selector(), req.data, headers)
  File "c:\appl\python\2.7.2\lib\httplib.py", line 955, in request
    self._send_request(method, url, body, headers)
  File "c:\appl\python\2.7.2\lib\httplib.py", line 989, in _send_request
    self.endheaders(body)
  File "c:\appl\python\2.7.2\lib\httplib.py", line 951, in endheaders
    self._send_output(message_body)
  File "c:\appl\python\2.7.2\lib\httplib.py", line 809, in _send_output
    msg += message_body
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 10: ordinal
 not in range(128)

Is there a way I can fix this?

回答1:

It's because

data should be a buffer in the standard application/x-www-form-urlencoded format. The urllib.urlencode() function takes a mapping or sequence of 2-tuples and returns a string in this format.

from urllib2 doc



回答2:

You are trying to automatically convert a python unicode string to a regular byte string. JSoN is always unicode, but HTTP must send bytes. If you are confident that the reciever will understand the json encoded data in a particular encoding, you can just encode it that way:

>>> urllib2.urlopen(urllib2.Request("http://example.com", data=u'\u0ca0'))
Traceback (most recent call last):
  ...
UnicodeEncodeError: 'ascii' codec cannot encode character u'\u0ca0' in position 0: ordinal not in range(128)
>>> urllib2.urlopen(urllib2.Request("http://example.com", 
...                                 data=u'\u0ca0'.encode('utf-8')))
<addinfourl at 15700984 whose fp = <socket._fileobject object at 0xdfbe50>>
>>> 

Note the .encode('utf-8'), which converts unicode to str in utf-8. The implicit conversion would use ascii, which cant encode non-ascii characters.

tl;dr ... data=json.dumps(blabla).encode('utf-8') ...



回答3:

According to the urllib2 documentation, you will need to percent-encode the byte-string.