We can declare the character encoding in an INDIVIDUAL CSS file with the code below:
@charset "UTF-8";
My question is:
How to declare character encoding in an INDIVIDUAL JS file?
If I send a JS file to my friend, I hope he (she) can tell this JS file's character encoding from the code itself when he (she) starts to browse or edit this JS file.
Thank you!
You can't. You can, however, define it in the script tag that brings the file into the page, using the charset attribute. This must match the charset, if any, in the Content-Type that you serve the file with. Quoting:
The charset attribute gives the character encoding of the external script resource. The attribute must not be specified if the src attribute is not present. If the attribute is set, its value must be a valid character encoding name, must be an ASCII case-insensitive match for the preferred MIME name for that encoding, and must match the encoding given in the charset parameter of the Content-Type metadata of the external file, if any. [IANACHARSET]
Re your edit:
If I send a JS file to my friend, I hope he (she) can tell this JS file's character encoding from the code itself when he (she) starts to browse or edit this JS file.
For that, you'll pretty much just have to tell him/her. If the file is in UTF-8 or Windows-1252 or ISO 8859-1, unfortunately there's no in-file indicator of the encoding available, so I'd include a comment at the beginning along the lines of:
// Encoding: UTF-8
If you're using UTF-16 or UTF-32, though, you should be able to tell your editor to write a BOM, which other editors should see and understand (if they're Unicode-aware editors). This would typically only apply if you were writing your comments in a script (language) requiring lots of multi-byte characters, and if you have a high ratio of comments to code (since the code itself is written in ASCII characters), although of course you're welcome to use any encoding you like.
If the ratio of comments to code is low, you're probably better off sticking with UTF-8 even if the comments are in a script whose characters take three bytes each in UTF-8, because the code will only require one byte per character. In UTF-16, those comment characters would mostly take two bytes instead of three, but the code would always require two bytes per character; in UTF-32, everything takes four bytes per character. So on the whole the file may well be larger even though the comments take less space. (But here I'm probably telling you things you already know far better than I, if I'm guessing correctly about your reasons for the question.)
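To make the size trade-off concrete, here is a rough sketch that counts how many bytes the same text needs in each encoding. The sample snippet and the byteSizes helper are made up for illustration, and BOM bytes are not counted.

```javascript
// Hypothetical helper: bytes needed for `text` in each encoding (BOM excluded).
function byteSizes(text) {
  return {
    // TextEncoder always produces UTF-8.
    utf8: new TextEncoder().encode(text).length,
    // JavaScript strings are sequences of 16-bit code units.
    utf16: text.length * 2,
    // Array.from splits the string by code point, not by code unit.
    utf32: Array.from(text).length * 4,
  };
}

// Made-up sample: one Japanese comment line on top of ASCII-only code.
const commentLine = "// 合計を計算する";
const wholeFile = [
  commentLine,
  "function sum(a, b) {",
  "  return a + b;",
  "}",
].join("\n");

console.log("comment only:", byteSizes(commentLine));
console.log("whole file:  ", byteSizes(wholeFile));
```

With a sample like this, the comment line on its own should come out smaller in UTF-16 than in UTF-8, while the file as a whole comes out smaller in UTF-8, which is the effect described above.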
There is no JavaScript construct for declaring the encoding in the file itself, the way you can do in CSS. The encoding should be communicated to the recipients when delivering the data. When sending files as e-mail attachments, your e-mail program might or might not include them with Content-Type headers that indicate the encoding (but it might have a hard time figuring out what the encoding is).
You can use a Byte Order Mark (BOM) at the start of a UTF-8 encoded file, too. Although there is no byte order issue in UTF-8, the BOM acts as a useful indicator: a file that starts with the bytes that constitute a BOM in UTF-8 encoding is most probably UTF-8 encoded. Programs may well use this to infer the encoding in the absence of any other indication. This is of course not 100% reliable, but it is useful.
Many text editors have the option of saving your file as “UTF-8 encoded with a BOM”.
(On web pages, the BOM was once regarded as a risk, since browsers were observed to treat it as character data. These days, the BOM even in UTF-8 is useful rather than a risk.)
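As a rough illustration of how a program might infer the encoding from a BOM, here is a minimal sketch. The detectBom name and the table layout are just for this example, and the check is a heuristic: a file without a BOM tells you nothing, and a UTF-16LE file whose first character is U+0000 would be mistaken for UTF-32LE.

```javascript
// Known BOM byte sequences. Longer signatures come first because the
// UTF-32LE BOM starts with the same two bytes as the UTF-16LE BOM.
const BOMS = [
  { encoding: "UTF-32BE", bytes: [0x00, 0x00, 0xfe, 0xff] },
  { encoding: "UTF-32LE", bytes: [0xff, 0xfe, 0x00, 0x00] },
  { encoding: "UTF-8", bytes: [0xef, 0xbb, 0xbf] },
  { encoding: "UTF-16BE", bytes: [0xfe, 0xff] },
  { encoding: "UTF-16LE", bytes: [0xff, 0xfe] },
];

// Returns the encoding suggested by a leading BOM, or null if there is none.
// `bytes` can be a plain array, a Uint8Array, or a Node.js Buffer.
function detectBom(bytes) {
  for (const { encoding, bytes: signature } of BOMS) {
    if (signature.every((b, i) => bytes[i] === b)) {
      return encoding;
    }
  }
  return null;
}

// Example input: the UTF-8 BOM followed by the ASCII text "var".
const sampleBytes = Uint8Array.from([0xef, 0xbb, 0xbf, 0x76, 0x61, 0x72]);
console.log(detectBom(sampleBytes));
```

In Node.js, the Buffer returned by fs.readFileSync(path) (called without an encoding argument) can be passed to this function directly.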
If you are interested in indicating the file's encoding in a human-readable way, T.J. Crowder's idea (adding a comment to the file like // Encoding: UTF-8) is just the thing. And as Jukka K. Korpela pointed out, you can use the BOM as well.
But if you want a machine-readable way to indicate charset that is declared in the document there are a couple of other ways:
For instance, on an Apache httpd server you might use any of the following declarations:
AddDefaultCharset UTF-8
AddCharset UTF-8 .js
AddType 'application/javascript; charset=UTF-8' js
*
* I am not interested in making the case for using "application/javascript" over "text/javascript". But if you are interested in knowing why one or the other might be preferable, cf. https://stackoverflow.com/a/4101763/1070047. Given the topic, though, application/javascript seems quite appropriate (especially if you are intending to use a BOM, because it indicates that the code should be treated as a binary).
If the code will be interpreted/processed/compiled server-side (e.g. PHP), you can set headers in the document, e.g.…
header("Content-Type: application/javascript; charset=utf-8");
At least within PHP, be sure to add that header statement before any output takes place.
Lastly, when determining which declaration to use, consider that (when understood/honored, i.e. not in IE) the BOM has greater authority than document headers. And both take precedence over the linked/sourced charset declarations (like <script type="application/javascript" src="script.js" charset="utf-8"></script>).
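The precedence described above can be sketched as a small lookup. This is only an illustration of the ordering, not how any particular browser implements it; the resolveEncoding function, its parameter names, and the UTF-8 fallback are all assumptions made for the example.

```javascript
// Hypothetical helper: picks an encoding using the order described above.
// 1. BOM found in the file, 2. charset from the Content-Type header,
// 3. charset attribute on the script tag, 4. an assumed fallback.
function resolveEncoding(
  { bom = null, contentTypeCharset = null, attributeCharset = null } = {},
  fallback = "UTF-8" // assumption: not stated in the text above
) {
  return bom ?? contentTypeCharset ?? attributeCharset ?? fallback;
}

// The header wins over the attribute when no BOM is present.
console.log(
  resolveEncoding({ contentTypeCharset: "ISO-8859-1", attributeCharset: "UTF-8" })
);

// A BOM wins over both.
console.log(
  resolveEncoding({ bom: "UTF-16LE", contentTypeCharset: "ISO-8859-1", attributeCharset: "UTF-8" })
);
```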