Web applications that want to force a resource to be downloaded rather than directly rendered in a Web browser issue a Content-Disposition
header in the HTTP response of the form:
Content-Disposition: attachment; filename=FILENAME
The filename parameter can be used to suggest a name for the file into which the resource is downloaded by the browser. RFC 2183 (Content-Disposition), however, states in section 2.3 (The Filename Parameter) that the file name can only use US-ASCII characters:
Current [RFC 2045] grammar restricts parameter values (and hence Content-Disposition filenames) to US-ASCII. We recognize the great desirability of allowing arbitrary character sets in filenames, but it is beyond the scope of this document to define the necessary mechanisms.
There is empirical evidence, nevertheless, that most popular Web browsers today seem to permit non-US-ASCII characters yet (for lack of a standard) disagree on the encoding scheme and character set specification of the file name. The question, then, is: what are the various schemes and encodings employed by the popular browsers if the file name “naïvefile” (without quotes and where the third letter is U+00EF) needed to be encoded into the Content-Disposition header?
For the purpose of this question, the popular browsers are:
- Firefox
- Internet Explorer
- Safari
- Google Chrome
- Opera
Put your file name in double quotes. That solved the problem for me. See:
http://kb.mozillazine.org/Filenames_with_spaces_are_truncated_upon_download
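A minimal sketch of the quoting advice, in Python for illustration only. The function name `quoted_filename_header` and the sample file name are hypothetical. Note that quoting addresses spaces and similar separators; it does not by itself solve the non-ASCII case asked about in the question.

```python
def quoted_filename_header(filename: str) -> str:
    """Build an attachment header with the file name as a quoted-string."""
    # Inside a quoted-string, backslash and double quote must be escaped.
    escaped = filename.replace("\\", "\\\\").replace('"', '\\"')
    return f'Content-Disposition: attachment; filename="{escaped}"'


print(quoted_filename_header("my report.pdf"))
# Content-Disposition: attachment; filename="my report.pdf"
```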
We had a similar problem in a web application and ended up reading the filename from the HTML <input type="file"> and setting it, in URL-encoded form, in a new HTML <input type="hidden">. Of course, we had to remove the path like "C:\fakepath\" that is returned by some browsers. This does not directly answer the OP's question, but it may be a solution for others.
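The two steps described here (drop the fake path, then URL-encode the bare name) can be sketched as follows. This is only an illustration of the logic in Python; the original presumably ran in the browser, and the helper name `hidden_field_value` is an assumption.

```python
import ntpath
from urllib.parse import quote, unquote


def hidden_field_value(raw_value: str) -> str:
    """Strip a leading path such as C:\\fakepath\\ and URL-encode the name."""
    # ntpath splits on both "\" and "/", regardless of the host OS.
    name = ntpath.basename(raw_value)
    # Percent-encode the UTF-8 bytes of everything except unreserved characters.
    return quote(name, safe="")


encoded = hidden_field_value("C:\\fakepath\\naïvefile.txt")
print(encoded)           # na%C3%AFvefile.txt
print(unquote(encoded))  # naïvefile.txt
```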
RFC 6266 describes the “Use of the Content-Disposition Header Field in the Hypertext Transfer Protocol (HTTP)”. Quoting from that:
And in their examples section:
In Appendix D there is also a long list of suggestions to increase interoperability. It also points at a site which compares implementations. Current all-pass tests suitable for common file names include:
That RFC 5987 in turn references RFC 2231, which describes the actual format. RFC 2231 is primarily for mail, and RFC 5987 tells us which parts may be used for HTTP headers as well. Don't confuse this with MIME headers used inside a multipart/form-data HTTP body, which is governed by RFC 2388 (section 4.4 in particular) and the HTML 5 draft.
There is discussion of this, including links to browser testing and backwards compatibility, in the proposed RFC 5987, "Character Set and Language Encoding for Hypertext Transfer Protocol (HTTP) Header Field Parameters."
RFC 2183 indicates that such headers should be encoded according to RFC 2184, which was obsoleted by RFC 2231, covered by the draft RFC above.
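To make the RFC 5987 format concrete, here is a hedged Python sketch of its extended value: a charset, an optional language tag, and the percent-encoded UTF-8 bytes, separated by single quotes. The function name `encode_ext_value` is hypothetical; the set of characters left unescaped follows the attr-char rule in RFC 5987.

```python
from urllib.parse import quote

# RFC 5987 attr-char allows ALPHA, DIGIT and ! # $ & + - . ^ _ ` | ~
# quote() already keeps letters, digits and "_.-~", so only the rest is listed.
ATTR_CHAR_EXTRA = "!#$&+^`|"


def encode_ext_value(value: str, language: str = "") -> str:
    """Return charset'language'percent-encoded-value using UTF-8."""
    encoded = quote(value, safe=ATTR_CHAR_EXTRA, encoding="utf-8")
    return f"UTF-8'{language}'{encoded}"


print("filename*=" + encode_ext_value("naïvefile"))
# filename*=UTF-8''na%C3%AFvefile
```

The language tag is usually left empty for file names, which is why two consecutive single quotes appear in most real headers.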
If you are using a Node.js backend, you can use the following code I found here
In PHP this did it for me (assuming the filename is UTF-8 encoded):
Tested against IE8-11, Firefox and Chrome.
If the browser can interpret filename*=utf-8, it will use the UTF-8 version of the filename; otherwise it will use the decoded filename. If your filename contains characters that can't be represented in ISO-8859-1, you might want to consider using iconv instead.
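The idea in this answer (an ISO-8859-1 fallback in filename plus a UTF-8 filename* for browsers that understand it) can be sketched in Python as follows. This is not the original PHP code; the function name `content_disposition` is hypothetical, and replacing unrepresentable characters with "?" is an assumption standing in for whatever transliteration you prefer.

```python
from urllib.parse import quote


def content_disposition(filename: str) -> str:
    """Build a header value with a legacy fallback and an RFC 5987 name."""
    # Fallback for browsers that ignore filename*: keep only ISO-8859-1,
    # turning anything unrepresentable into "?" (assumed policy).
    fallback = filename.encode("iso-8859-1", errors="replace").decode("iso-8859-1")
    fallback = fallback.replace("\\", "\\\\").replace('"', '\\"')
    # UTF-8 percent-encoded form for browsers that support filename*.
    extended = quote(filename, safe="")
    # filename comes first so that filename* takes precedence where supported.
    return f"attachment; filename=\"{fallback}\"; filename*=UTF-8''{extended}"


print(content_disposition("naïvefile.txt"))
# attachment; filename="naïvefile.txt"; filename*=UTF-8''na%C3%AFvefile.txt
```

When the value is actually written to the wire, the fallback part has to be sent as ISO-8859-1 bytes; how that is done depends on the web framework in use.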