I'm writing a large response (hundreds of MB) to WebSphere's response.getOutputStream().
It appears that WebSphere always stores the whole output stream in internal in-memory buffers before serving it to the client. So my servlet processing (generating the data) finishes in seconds, while the browser may still be downloading it for half an hour. During that time the whole response remains buffered in memory.
Is it possible to avoid this buffering?
I'd prefer to have more servlet threads waiting on the output stream than to waste GBs of memory.
My WebSphere version is 8.5.0.
I've tried setting the content length and using a chunked response. It makes no difference: the response is still buffered.
My TCP transport chain settings are the defaults, with a 32 KB response buffer, but that limit is somehow ignored.
In the meantime, I found the answer among the WAS web container custom properties:
By default, the web container uses asynchronous writes to write response data in chunks up to the response buffer size. For larger responses that are greater than the response buffer size, the web container continues to buffer response data into memory while waiting for an asynchronous write of a response data chunk to complete. This process can result in part of a large response held in memory, which can lead to high memory usage and potentially an out of memory error. An application server hang might also occur when a server is simultaneously processing more requests than web container-defined threads.
If the com.ibm.ws.webcontainer.channelwritetype property is set to sync, synchronous writing is used, otherwise asynchronous writing is used by default. With synchronous writing, response data are written synchronously in chunks of up to the value of responsebuffersize and no response data are buffered into memory while waiting for a synchronous write of a response data chunk to complete. As a result, the approximate maximum amount of response data that is held in memory is equal to the responsebuffersize multiplied by the number of web container threads. The maximum number of requests that can be processed simultaneously by the web container is limited by the number of web container threads. Additional requests are queued, waiting for a request that is in process to complete.
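The memory bound described above is simple arithmetic, so a rough sketch may help when sizing a server. The 32 KB buffer comes from the documentation quoted here; the thread count of 50 is only an assumed example value, and the class and method names are hypothetical, not part of any WebSphere API.

```java
// Hypothetical helper: estimates the worst-case response data held in memory
// when synchronous writes are enabled (buffer size times thread count).
final class ResponseBufferEstimate {

    // Approximate upper bound, in bytes, on buffered response data.
    static long maxBufferedBytes(int responseBufferSizeBytes, int webContainerThreads) {
        // Widen to long before multiplying to avoid int overflow.
        return (long) responseBufferSizeBytes * webContainerThreads;
    }

    public static void main(String[] args) {
        int bufferSize = 32 * 1024; // 32 KB default responsebuffersize
        int threads = 50;           // assumed pool size, for illustration only
        System.out.println(maxBufferedBytes(bufferSize, threads) + " bytes");
    }
}
```

With these example values the bound is about 1.6 MB, compared with hundreds of MB per response when a large response is buffered asynchronously.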
The responsebuffersize web container custom property defines the maximum amount of response data written by the web container in a single chunk, and is 32k by default. As a result, it is used to change the number of writes needed by the web container to send complete response data. However, if an application flushes response data, any response data held by the web container is immediately written irrespective of the responsebuffersize.
Use the following name-value pair to write chunks of data using synchronous writes.
com.ibm.ws.webcontainer.channelwritetype sync
Are you sure that this data is stored by WebSphere in an internal in-memory buffer? The WebSphere channel output buffer can only hold 32 KB of data at a time, so the remaining data is probably held by your servlet: you have generated a large amount of data that lives on the heap and is still referenced by your code.
Take a look at your code and see what data it is referring to. If you want to know what is stored in memory and what is keeping it alive, take a snapshot of the heap (a heap dump).
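If the servlet does turn out to be building the whole payload in memory before writing it, one way to rule that out is to copy the data to the output stream in fixed-size chunks. The following is only a minimal sketch using plain java.io: the names StreamingCopy and copyInChunks are hypothetical, and in a real servlet the out argument would be response.getOutputStream() and the in argument whatever source produces the data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: streams data without holding the full payload on the heap.
final class StreamingCopy {

    // Copies everything from in to out using one reusable buffer of chunkSize bytes.
    // Returns the total number of bytes written.
    static long copyInChunks(InputStream in, OutputStream out, int chunkSize) throws IOException {
        byte[] chunk = new byte[chunkSize];
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n); // only the bytes actually read
            total += n;
        }
        out.flush(); // push out anything still held by the stream
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100_000]; // stand-in for generated content
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for the response stream
        long written = copyInChunks(new ByteArrayInputStream(data), sink, 32 * 1024);
        System.out.println(written + " bytes written");
    }
}
```

Here the servlet itself never references more than one chunk at a time, so anything beyond that showing up in a heap dump would point at the container rather than at the application code.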