Using the HTTP Range Header with a range specifier

2019-01-08 23:32发布

问题:

The core question is about the use of the HTTP Headers, including Range, If-Range, Accept-Ranges and a user defined range specifier.

Here is a manufactured example to help illustrate my question. Assume I have a Web 2.0 style application that displays some sort of human readable documents. These documents are editorially broken up into pages (similar to articles you see on news websites). For this example, assume:

  • There is a document titled "HTTP Range Question" is broken up into three pages.
  • The shell page (/document/shell/http-range-question) knows the meta information about the document, including the number of pages.
  • The first readable page of the document is loaded during the page onload event via an ajax GET and inserted onto the page.
  • A UI control that looks like [ 1 2 3 All ] is at the bottom of the page, and clicking on a number will display that readable page (also loaded via ajax), and clicking "All" will display the entire document. Assume these URLS for the 1, 2, 3 and All use cases:
    • /document/content/http-range-question?page=1
    • /document/content/http-range-question?page=2
    • /document/content/http-range-question?page=3
    • /document/content/http-range-question

Now to the question. Can I use the HTTP Range headers instead part of the URL (e.g. a querystring parameter)? Maybe something like this on the GET /document/content/http-range-question request:

Range: page=1

It looks like the spec only defines byte ranges as allowable, so even if I made my ajax calls work with my browser and server code, anything in the middle could break the contract (e.g. a caching proxy server).

Range: bytes=0-499

Any opinions or real world examples of custom range specifiers?

Update: I did find a similar question about the Range header (Paging in a Rest Collection) where they mention that Dojo's JsonRestStore uses a custom Range header value.

Range: items=0-24

回答1:

Absolutely - you are free to specify any range units you like.

From RFC 2616:

3.12 Range Units

HTTP/1.1 allows a client to request that only part (a range of) the
response entity be included within the response. HTTP/1.1 uses range units in the Range (section 14.35) and Content-Range (section 14.16)
header fields. An entity can be broken down into subranges according to various structural units.

  range-unit       = bytes-unit | other-range-unit
  bytes-unit       = "bytes"
  other-range-unit = token

The only range unit defined by HTTP/1.1 is "bytes". HTTP/1.1
implementations MAY ignore ranges specified using other units.

The key piece is the last paragraph. Really what it's saying is that when they wrote the spec for HTTP/1.1, they only outlined the "bytes" token. But, as you can see from the 'other-range-unit' bit, you are free to come up with your own token specifiers.

Coming up with your own Range specifiers does mean that you have to have control over the client and server code that uses that specifier. So, if you own the backend piece that exposes the "/document/content/http-range-question" URI, you are good to go; presumably you're using a modern web framework that lets you inspect the request headers coming in. You could then look at the Range values to perform the backing query correctly.

Furthermore, if you control the AJAX code that makes requests to the backend, you should be able to set the Range header yourself.

However, there is a potential downside which you anticipate in your question: the potential to break caching. If you are using a custom Range unit, any caches between your client and the origin servers "MAY ignore ranges specified using [units other than 'bytes']". So for example, if you had a Squid/Varnish cache between the front and backend, there's no guarantee that the results you're hoping for will be served from the cache!

You might also consider an alternative implementation where, rather than using a query string, you make the page a "parameter" of the URI; e.g.: /document/content/http-range-question/page/1. This would likely be a little more work for you server-side, but it's HTTP/1.1 compliant and caches should treat it properly.

Hope this helps.



回答2:

HTTP Range is typically used for recovering interrupted downloads without starting from the beginning.

What you're trying to do would be better handled by OAI-ORE, which allows you to define relationships between multiple documents. (alternative formats, components of the whole, etc)

Unfortunately, it's a relatively new metadata format, and I don't know of any web browsers that ship with native support.



回答3:

bytes is the only unit supported by HTTP 1.1 Specification.



回答4:

It sounds like you want to change the HTTP spec just to remove a querystring parameter. In order to do this you'd have to modify code on both the client to send the modified header and the server to read from the "Range" header instead of the querystring.

The end result is that this will probably work, but you're breaking all of the standards and existing tools to do so.