How to POST chunked encoded data in Python

2019-05-15 23:23发布

问题:

I am trying to POST chunked encoded data to httpbin.org/post. I tried two options: Requests and httplib

Using Requests

#!/usr/bin/env python

import requests

def gen():
        l = range(130)
        for i in l:
                yield '%d' % i

if __name__ == "__main__":
        url = 'http://httpbin.org/post'
        headers = {
                        'Transfer-encoding':'chunked',
                        'Cache-Control': 'no-cache',
                        'Connection': 'Keep-Alive',
                        #'User-Agent': 'ExpressionEncoder'
                }
        r = requests.post(url, headers = headers, data = gen())
        print r

Using httplib

#!/usr/bin/env python

import httplib
import os.path

if __name__ == "__main__":
        conn = httplib.HTTPConnection('httpbin.org')
        conn.connect()
        conn.putrequest('POST', '/post')
        conn.putheader('Transfer-Encoding', 'chunked')
        conn.putheader('Connection', 'Keep-Alive')
        conn.putheader('Cache-Control', 'no-cache')
        conn.endheaders()
        for i in range(130):
                conn.send(str(i))

        r = conn.getresponse()
        print r.status, r.reason

In both of these cases, whenever I analyze Wireshark traces, I do not see multiple chunks being sent. Instead, what I see is that all of the data is being sent in a single chunk? Am I missing something here?

回答1:

The code you posted shouldn't have worked correctly. The reason you still get a successful response back is because httpbin.org doesn't currently support chunked transfer encoding. See bug https://github.com/kennethreitz/httpbin/issues/102.

Like in the post Piotr linked to above, you're supposed to write out the length of each chunk in hexadecimal and then the chunk itself.

I butchered your code for an example. The http://httpbin.org/post endpoint has a form that you can use for testing. That's where I generated the chunk1 and chunk2 form data.

import httplib
import time

chunk1 = "custname=bob&custtel=11111&custemail=bob%40email.com&si"
chunk2 = "ze=medium&topping=bacon&delivery=11%3A00&comments=if+you%27re+late+we+get+it+free"

if __name__ == "__main__":
    conn = httplib.HTTPConnection('httpbin.org')
    conn.connect()
    conn.putrequest('POST', '/post')
    conn.putheader('Transfer-Encoding', 'chunked')
    conn.putheader('Content-Type', 'application/x-www-form-urlencoded')
    conn.endheaders()

    conn.send("%s\r\n" % hex(len(chunk1))[2:])
    conn.send("%s\r\n" % chunk1)

    time.sleep(1)

    conn.send("%s\r\n" % hex(len(chunk2))[2:])
    conn.send("%s\r\n" % chunk2)

    time.sleep(1)
    /* last chunk */
    conn.send("0\r\n\r\n")

    r = conn.getresponse()
    print r.status, r.reason, r.read()

The stream in wireshark will look something like the following which is incorrect as it's not waiting for (notice the trailing 0) or interpreting the request body (notice the json: null) that we sent:

POST /post HTTP/1.1
Host: httpbin.org
Accept-Encoding: identity
Transfer-Encoding: chunked
Content-Type: application/x-www-form-urlencoded

37
custname=bob&custtel=11111&custemail=bob%40email.com&si
51
ze=medium&topping=bacon&delivery=11%3A00&comments=if+you%27re+late+we+get+it+free
HTTP/1.1 200 OK
Connection: close
Server: gunicorn/18.0
Date: Fri, 31 Oct 2014 10:37:24 GMT
Content-Type: application/json
Content-Length: 494
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
Via: 1.1 vegur

{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {}, 
  "headers": {
    "Accept-Encoding": "identity", 
    "Connect-Time": "2", 
    "Connection": "close", 
    "Content-Type": "application/x-www-form-urlencoded", 
    "Host": "httpbin.org", 
    "Total-Route-Time": "0", 
    "Transfer-Encoding": "chunked", 
    "Via": "1.1 vegur", 
    "X-Request-Id": "5053a365-ca6a-4c29-b97a-f7a6ded7f2d9"
  }, 
  "json": null, 
  "origin": "110.174.97.16", 
  "url": "http://httpbin.org/post"
}0