I have a very simple ColdFusion web app that takes one URL parameter and prints it to the page. But it doesn't seem to be receiving a UTF-8 encoded value, even though it's sent that way.
Here's an HTTP request, taken from Fiddler:
POST http://blahblahwebservice/getme.htm HTTP/1.1
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
User-Agent: unicode_post
Host: miscmsuatsw
Content-Length: 14
Pragma: no-cache
Cookie: CFID=247445; CFTOKEN=305db8322d5cecfb-627BD26F-BC91-0EC2-25E9745308EF96F7
sysopts=ΠΣΩ
Here's what it looks like in HEX (URL & HTTP/1.1 with 2 CRLFs & host stripped):
43 6F 6E 74 65 6E 74 2D 54 79 70 65 3A 20 61 70 Content-Type: ap
70 6C 69 63 61 74 69 6F 6E 2F 78 2D 77 77 77 2D plication/x-www-
66 6F 72 6D 2D 75 72 6C 65 6E 63 6F 64 65 64 3B form-urlencoded;
20 63 68 61 72 73 65 74 3D 55 54 46 2D 38 0D 0A charset=UTF-8..
55 73 65 72 2D 41 67 65 6E 74 3A 20 75 6E 69 63 User-Agent: unic
6F 64 65 5F 70 6F 73 74 0D 0A 48 6F 73 74 3A 20 ode_post..Host:
-- -- -- -- -- -- -- -- -- -- -- 0D 0A 43 6F 6E .............Con
74 65 6E 74 2D 4C 65 6E 67 74 68 3A 20 31 34 0D tent-Length: 14.
0A 50 72 61 67 6D 61 3A 20 6E 6F 2D 63 61 63 68 .Pragma: no-cach
65 0D 0A 43 6F 6F 6B 69 65 3A 20 43 46 49 44 3D e..Cookie: CFID=
32 34 37 34 34 35 3B 20 43 46 54 4F 4B 45 4E 3D 247445; CFTOKEN=
33 30 35 64 62 38 33 32 32 64 35 63 65 63 66 62 305db8322d5cecfb
2D 36 32 37 42 44 32 36 46 2D 42 43 39 31 2D 30 -627BD26F-BC91-0
45 43 32 2D 32 35 45 39 37 34 35 33 30 38 45 46 EC2-25E9745308EF
39 36 46 37 0D 0A 0D 0A 73 79 73 6F 70 74 73 3D 96F7....sysopts=
CE A0 CE A3 CE A9 ΠΣΩ
Specifically, ΠΣΩ is CE A0 CE A3 CE A9.
When rendered, I merely get "???". I know that the page can render UTF-8; I think the problem is in the reception of these bytes, because when I set the UTF-8 hex chars to U+03A0 and so forth, it renders just fine.
Is there something my CF webpage is missing in order to handle UTF-8?
You can try getting ColdFusion to "think harder" about them being UTF-8 by using the setEncoding function, called once for the form scope and once for the url scope.
Unless you're doing something exceptionally specialised, those two lines should always be in your Application.cfm/c.
The tag cfprocessingdirective is only of benefit if you have UTF-8 characters in the .cfm file itself. UTF-8 encoded strings that come from url/form/cookie/database, etc. do not fall into that category, and therefore won't be affected by that tag.
Also, every application/system has its own set or default encoding: this includes your web server, your db server, your browser, your db front-end, your editor. All of those things need to be checked before ruling out the obvious.
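For illustration, the setEncoding calls suggested in this answer might look like the following. This is a minimal sketch, assuming the parameter arrives in the form scope (as the POST in the question suggests) or the url scope; the placement in Application.cfm is taken from the advice above, and the comments are mine.

```cfml
<!--- Application.cfm (sketch): runs before each requested page --->
<!--- Ask ColdFusion to decode incoming form and URL variables as UTF-8 --->
<cfset setEncoding("form", "utf-8")>
<cfset setEncoding("url", "utf-8")>
```

If you use Application.cfc instead, the same two calls would presumably go somewhere that runs at the start of every request, such as onRequestStart; treat that placement as an assumption to verify against your setup.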
Also try this to see what you get.
I am curious as to whether your data is actually binary data. If so, this will encode it for you. But the recommended approach is to check your browser, server, etc.
Faith Sloan
Try
<cfprocessingdirective pageEncoding="utf-8">
on the first line of your CFM. If it works, you should switch to an editor like CF Builder, where the BOM stamp will take care of that and no processing directive is needed.