Apache HttpClient 4.x behaving strangely during upload

Published 2019-03-30 17:44

Question:

I'm developing and testing a small, straightforward client-server application using Java (and Scala).

The server is based on com.sun.net.httpserver.HttpServer and allows files to be uploaded through a basic RESTful interface using POST and PUT operations. Uploads are protected by Digest authentication, which we implemented ourselves; it is tested and works with browsers, curl and Apache HttpClient.

The upload client wraps Apache HttpClient 4.1.2 and executes PUT operations over HTTP to upload file entities. The Content-Type of the file is specified as application/xml in the header, and only a single file is uploaded at a time.

When uploading files of different sizes, I observed some strange behaviour:

  • Files of 1,076,006 bytes or less are uploaded successfully.
  • Files of 1,122,158 bytes or more fail with a java.net.SocketException: Broken pipe.

(The exact critical size is unknown, since I created files of different sizes by hand to approximate the largest size that still works.)

The reason for the broken pipe is that, for files of that size, the client somehow ignores the WWW-Authenticate response, as the server logs show. "Ignores" means that it just sends multiple (4) requests containing no authentication header at all. Smaller files work well: the client sends a request with the proper challenge response immediately after the WWW-Authenticate response, as it should.

The upload works in curl with files of all sizes, so no problem there.

So at this point, one could say: "There is some bug in your client." Okay, I kind of hope so, but I've also tried an open-source Java RESTclient (also wrapping Apache HttpClient) and it shows exactly the same behaviour!

We also tried this client over the internet, and the behaviour is the same as described. So right now I just hope I've missed some important setting in Apache HttpClient that leads to this erroneous behaviour, and that the developer of the open-source RESTclient missed it as well. Any ideas about what it could be would be great!

Answer 1:

Most likely, a combination of several factors leads to this situation:

(1) Your client probably does not use the 'expect-continue' handshake when sending a large request entity with a request that does not include an authentication header.

(2) The server detects early that the request fails its expectations and, instead of reading and discarding the full request body, responds early with a 401 status and closes the connection on its end. In my opinion, this is an HTTP protocol violation on the part of the server.

(3) While some HTTP agents can deal with early responses, Apache HttpClient cannot, due to a limitation of Java blocking I/O (a thread of execution can either read from or write to a blocking socket, but not both at once).
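
As an illustration of point (2), a server built on com.sun.net.httpserver could read and discard the request body before answering 401, so that the client is not cut off in the middle of its upload. This is only a sketch under assumptions: the class name, the drain helper, the delegate handler and the challenge string are all made up for illustration, the actual Digest verification is left to the delegate, and draining a large body costs bandwidth on every unauthenticated attempt.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import java.io.IOException;
import java.io.InputStream;

public class DrainingUnauthorizedHandler implements HttpHandler {

    // Placeholder challenge (assumption); a real server generates a fresh nonce per challenge.
    static final String CHALLENGE =
            "Digest realm=\"upload\", qop=\"auth\", nonce=\"replace-with-fresh-nonce\"";

    // Hypothetical handler that verifies the Digest response and stores the file.
    private final HttpHandler authenticated;

    public DrainingUnauthorizedHandler(HttpHandler authenticated) {
        this.authenticated = authenticated;
    }

    /** Reads the stream to its end, discarding the data; returns the number of bytes read. */
    static long drain(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
        }
        return total;
    }

    @Override
    public void handle(HttpExchange exchange) throws IOException {
        if (exchange.getRequestHeaders().getFirst("Authorization") == null) {
            // Consume the whole body first, then send the challenge.
            drain(exchange.getRequestBody());
            exchange.getResponseHeaders().add("WWW-Authenticate", CHALLENGE);
            exchange.sendResponseHeaders(401, -1); // -1: no response body
            exchange.close();
            return;
        }
        authenticated.handle(exchange);
    }
}
```

With this arrangement the client finishes writing before it has to read the 401, which sidesteps the blocking I/O limitation described in point (3).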

There are multiple ways of addressing the issue, the 'expect-continue' handshake being the easiest and most natural. Alternatively, one can execute a simple HEAD or GET request to force HTTP authentication before executing a large POST or PUT request. HttpClient is capable of re-using authentication data for subsequent requests in the same logical HTTP session.
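
Both suggestions can be sketched against the HttpClient 4.1 API roughly as follows. The class and method names, the host, the credentials and the URL are placeholders rather than anything from the question. The sketch also assumes that sharing one HttpContext between the two requests is what keeps them in the same logical session, and whether the server accepts the re-used Digest data depends on how it handles nonces. In practice either measure alone may be enough; they are combined here only for brevity.

```java
import java.io.File;
import java.io.IOException;

import org.apache.http.HttpResponse;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.methods.HttpHead;
import org.apache.http.client.methods.HttpPut;
import org.apache.http.entity.FileEntity;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.params.CoreProtocolPNames;
import org.apache.http.protocol.BasicHttpContext;
import org.apache.http.protocol.HttpContext;
import org.apache.http.util.EntityUtils;

public class ExpectContinueUpload {

    /** Builds a client that requests the 'expect-continue' handshake for requests with a body. */
    static DefaultHttpClient newClient(String host, int port, String user, String password) {
        DefaultHttpClient client = new DefaultHttpClient();
        client.getCredentialsProvider().setCredentials(
                new AuthScope(host, port),
                new UsernamePasswordCredentials(user, password));
        // Set explicitly rather than relying on the version-specific default.
        client.getParams().setBooleanParameter(CoreProtocolPNames.USE_EXPECT_CONTINUE, true);
        return client;
    }

    /** Sends a body-less HEAD first to trigger authentication, then PUTs the file. */
    static int upload(DefaultHttpClient client, String url, File file) throws IOException {
        // One context for both requests so that authentication state can be re-used.
        HttpContext context = new BasicHttpContext();

        HttpResponse head = client.execute(new HttpHead(url), context);
        EntityUtils.consume(head.getEntity()); // release the connection

        HttpPut put = new HttpPut(url);
        put.setEntity(new FileEntity(file, "application/xml"));
        HttpResponse response = client.execute(put, context);
        EntityUtils.consume(response.getEntity());
        return response.getStatusLine().getStatusCode();
    }
}
```

A caller would create the client once with newClient, call upload for each file, and shut down the connection manager with client.getConnectionManager().shutdown() when finished.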