Content-Type with charset only

Posted 2019-05-10 07:00

Question:

I came across this interesting header:

Content-Type: charset=utf-8

Set HTTP header to UTF-8 using PHP

The answerer says that this syntax is defined by RFC 2616, but I am not seeing it in the provided link. Is this valid syntax, and if so where specifically is this defined?

Answer 1:

The production in RFC 2616 for the Content-Type header is this:

Content-Type   = "Content-Type" ":" media-type

And the media-type production is this:

media-type     = type "/" subtype *( ";" parameter )
type           = token
subtype        = token

That says that while the parameter part (e.g., charset=utf-8) is optional, the type "/" subtype part is not. In other words, a media type must have a type, followed by a slash, followed by a subtype.

So Content-Type: charset=utf-8 isn't valid syntax per that grammar, and no other normative or authoritative spec defines it as valid either.
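To make the grammar concrete, here is a minimal sketch of a checker for the media-type production quoted above. The function name isValidMediaType is hypothetical, and the handling of quoted-string parameter values and of whitespace around ";" is simplified, so treat it as an illustration rather than a full parser.

```javascript
// token: one or more characters, excluding control characters and separators.
const TOKEN = "[!#$%&'*+\\-.^_`|~0-9A-Za-z]+";
// Simplified quoted-string: double quotes with backslash escapes (assumption).
const QUOTED = '"(?:[^"\\\\]|\\\\.)*"';
// parameter = attribute "=" value, where value is a token or a quoted-string.
const PARAM = `${TOKEN}=(?:${TOKEN}|${QUOTED})`;
// media-type = type "/" subtype *( ";" parameter ), with optional spaces/tabs around ";".
const MEDIA_TYPE = new RegExp(`^${TOKEN}/${TOKEN}(?:[ \\t]*;[ \\t]*${PARAM})*$`);

// Hypothetical helper: true only if the value has the mandatory type "/" subtype part.
function isValidMediaType(value) {
  return MEDIA_TYPE.test(value);
}

console.log(isValidMediaType('charset=utf-8'));            // no type "/" subtype
console.log(isValidMediaType('image; charset=utf-8'));     // type without subtype
console.log(isValidMediaType('text/html; charset=utf-8')); // well-formed
```

The first two calls fail for the same reason: the pattern requires a token, a slash, and another token before any parameter may appear.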

RFC 2616 is actually obsoleted by RFC 7231 and several other RFCs (the current HTTP RFCs).

But the corresponding parts of RFC 7231 define essentially the same productions for this case:

The production in RFC 7231 for the value of the Content-Type header is this:

Content-Type = media-type

And the media-type production is this:

media-type = type "/" subtype *( OWS ";" OWS parameter )
type       = token
subtype    = token

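At the time of writing, no other spec obsoleted or superseded that part, so RFC 7231 was authoritative on this. (RFC 9110 has since replaced RFC 7231, and its media-type production still requires type "/" subtype.)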


Most programming languages have good media-type parsing libraries that can check the syntax. For example, with the content-type package for Node.js:

npm install content-type
node -e "var ct = require('content-type'); ct.parse('charset=utf-8')"
=> TypeError: invalid media type
node -e "var ct = require('content-type'); ct.parse('image; charset=utf-8')"
=> TypeError: invalid media type


Answer 2:

No, I cannot find such a Content-Type value defined anywhere in RFC 2616 or RFC 7231.

And it doesn't even work in Chrome.

(I tried xhr.setRequestHeader('Content-type','charset=utf-8'). When I then call xhr.send, the request has no Content-Type header.)
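For comparison, a well-formed value puts a type and subtype before any parameter. The sketch below builds such a value; the helper name formatContentType is hypothetical, and the quoting of non-token parameter values is a simplifying assumption.

```javascript
// Characters allowed in a token (type, subtype, parameter name).
const TOKEN_RE = /^[!#$%&'*+\-.^_`|~0-9A-Za-z]+$/;

// Hypothetical helper: builds 'type/subtype; name=value' and rejects a missing type or subtype.
function formatContentType(type, subtype, params = {}) {
  if (!TOKEN_RE.test(type) || !TOKEN_RE.test(subtype)) {
    throw new TypeError('invalid type or subtype');
  }
  let value = `${type}/${subtype}`;
  for (const [name, raw] of Object.entries(params)) {
    if (!TOKEN_RE.test(name)) {
      throw new TypeError('invalid parameter name');
    }
    const text = String(raw);
    // Emit the value as a bare token when possible, otherwise as a quoted-string.
    const encoded = TOKEN_RE.test(text)
      ? text
      : '"' + text.replace(/(["\\])/g, '\\$1') + '"';
    value += `; ${name}=${encoded}`;
  }
  return value;
}

console.log(formatContentType('text', 'plain', { charset: 'utf-8' }));
// In a browser, the result could be passed to xhr.setRequestHeader('Content-Type', ...).
```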