Are codepage 65001 and UTF-8 the same thing?

Posted 2019-01-14 00:32

Question:

<%@LANGUAGE="VBSCRIPT" CODEPAGE="65001"%>
<!--#include file="conn.asp"-->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Is the above code right?

Answer 1:

Yes.

UTF-8 is CP65001 in Windows (65001 is simply the identifier used for UTF-8 in the legacy code page system). As far as I have read, ASP can handle UTF-8 when it is specified that way.



Answer 2:

Your code is correct, although I prefer to set the CharSet in code rather than use the meta tag:

<% Response.CharSet = "UTF-8" %>

Codepage 65001 does refer to the UTF-8 character set. You need to make sure that your ASP page (and any includes) are saved as UTF-8 if they contain any characters outside the standard ASCII character set.

By specifying the CODEPAGE attribute in the <%@ block you are indicating that anything written using Response.Write should be encoded to the codepage specified, in this case 65001 (UTF-8). It's worth bearing in mind that this does not affect any static content, which is sent verbatim, byte for byte, to the response. That is why the file actually needs to be saved in the codepage that is specified.

The CharSet property of the response sets the charset value of the Content-Type header. This has no impact on how the content may be encoded; it merely tells the client what encoding is being received. Again, it is important that this value matches the actual encoding sent.
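
As a minimal sketch of how these pieces fit together (not a definitive setup), the page below sets both the codepage and the charset in code. It assumes the file and all of its includes are saved as UTF-8; the sample string is only an illustration.

```asp
<%@ Language="VBScript" CodePage="65001" %>
<%
' Sketch only. Assumes this file and every include are saved as UTF-8.
Response.CodePage = 65001    ' dynamic output (Response.Write) is encoded as UTF-8
Response.CharSet  = "UTF-8"  ' adds charset=UTF-8 to the Content-Type header

' ChrW(233) is U+00E9 (e with acute accent). Because it is built in code,
' it does not depend on how the file is saved; with codepage 65001 it is
' sent to the client as the two bytes C3 A9.
Response.Write "caf" & ChrW(233)
%>
<p>Static text like this is sent byte for byte, exactly as stored on disk.</p>
```

Setting Response.CodePage explicitly keeps the output encoding from depending on session or server defaults. The static paragraph still relies entirely on the file's saved encoding.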



Answer 3:

Yes, 65001 is the Windows code page identifier for UTF-8, as documented on the Microsoft website. Wikipedia suggests that IBM code page 1208 and SAP code page 4110 are also identifiers for UTF-8.



Answer 4:

Response.CodePage = 65001

seems to give bad results when the physical file is saved as UTF-8.

Otherwise, it works as it is supposed to.