Apache Commons FileUpload “Streaming API”

Posted 2020-06-04 14:35

Question:

I quote from the Apache Commons page for Commons FileUpload:

This page describes the traditional API of the commons fileupload library. The traditional API is a convenient approach. However, for ultimate performance, you might prefer the faster Streaming API.

My Question

What specific differences make the Streaming API faster than the traditional API?

Answer 1:

The key difference is in how the file is handled, as you noticed yourself with the factory class.

The streaming API does not save anything to disk while you read the input stream. As a result, you can handle the file faster (at the cost of some temporary memory); the idea is to avoid saving the binary to disk unless you really want or need to.

After that, you can of course save the data to disk, using a BufferedInputStream, a byte array, or similar.

EDIT: The handle you get when you open the stream (fileItemStreamElement.openStream()) is an ordinary InputStream instance. So the answer to your "what if it's a big file" question is roughly the one given in "Memory issues with InputStream in Java".

EDIT: The streaming API should save neither to disk nor to memory. It simply provides a stream you can read from to copy the file wherever you want. This avoids needing a temp directory and also avoids allocating enough memory to hold the whole file. It should be faster if only because the data is not copied twice: once from the browser to disk or memory, and then again from disk or memory to wherever you save it.
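The single-copy idea can be sketched with the JDK alone. This is only an illustration, not Commons FileUpload code: a ByteArrayInputStream stands in for the stream returned by openStream(), and the class name StreamCopy, the method copy, and the 8 KiB buffer size are assumptions chosen for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class StreamCopy {
    // Hypothetical buffer size: only this much data is held in memory at once.
    private static final int BUFFER_SIZE = 8192;

    /** Copies everything from in to out in fixed-size chunks and returns the byte count. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int read;
        // read() returns -1 once the end of the stream is reached
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for an uploaded file part (assumption: 20000 bytes of zeros)
        byte[] upload = new byte[20000];
        try (InputStream in = new ByteArrayInputStream(upload);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            long copied = copy(in, out);
            System.out.println("copied " + copied + " bytes");
        }
    }
}
```

In a real upload handler the OutputStream would typically be the final destination (a file, a database blob, another service), so the data passes through the small buffer exactly once.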



Answer 2:

The traditional API, which is described in the User Guide, assumes that file items must be stored somewhere before they are actually accessible by the user. This approach is convenient, because it allows easy access to an item's contents. On the other hand, it is memory and time consuming.

http://commons.apache.org/fileupload/streaming.html



Answer 3:

The streaming API should save neither to disk nor to memory. It simply provides a stream you can read from to copy the file wherever you want. This avoids needing a temp directory and also avoids allocating enough memory to hold the whole file. It should be faster if only because the data is not copied twice: once from the browser to disk or memory, and then again from disk or memory to wherever you save it.



Answer 4:

Streaming generally refers to an API (such as Apache FileUpload or StAX) in which data is transmitted and parsed serially at application run time, often in real time, and often from dynamic sources whose contents are not precisely known beforehand.

Traditional models refer to APIs (such as traditional file-handling APIs or the DOM API) that provide much more detailed information about the data.

For a file-handling API, for example, the traditional approach assumes that file items must be stored somewhere before they are actually accessible by the user. This approach is convenient, because it allows easy access to an item's contents. On the other hand, it is memory and time consuming.

A streaming API has a smaller memory footprint and lower processor requirements, and can offer higher performance in certain situations.

It is based on a "cardboard tube" view of the document you are working with: at any moment you see only the small part that is currently passing through.
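To make the footprint difference concrete, here is a small JDK-only sketch that computes the same CRC32 checksum in both styles. It does not use Commons FileUpload or StAX; the class FootprintDemo, both method names, and the 4 KiB window are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;

class FootprintDemo {
    /** Store-first style: hold the whole content in memory, then process it. */
    static long checksumStored(InputStream in) throws IOException {
        ByteArrayOutputStream stored = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int read;
        while ((read = in.read(chunk)) != -1) {
            stored.write(chunk, 0, read);
        }
        byte[] all = stored.toByteArray(); // memory use grows with the content size
        CRC32 crc = new CRC32();
        crc.update(all, 0, all.length);
        return crc.getValue();
    }

    /** Streaming style: process each chunk as it passes by, then let it go. */
    static long checksumStreamed(InputStream in) throws IOException {
        CRC32 crc = new CRC32();
        byte[] window = new byte[4096]; // the only data visible at any moment
        int read;
        while ((read = in.read(window)) != -1) {
            crc.update(window, 0, read);
        }
        return crc.getValue();
    }

    public static void main(String[] args) throws IOException {
        // Sample data standing in for a large document
        byte[] data = new byte[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        long stored = checksumStored(new ByteArrayInputStream(data));
        long streamed = checksumStreamed(new ByteArrayInputStream(data));
        System.out.println("checksums match: " + (stored == streamed));
    }
}
```

Both methods produce the same result, but the streamed version never needs more than its fixed window, whereas the stored version needs memory proportional to the input. The trade-off is that the streamed version cannot go back to data that has already passed.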