I am using Google App Engine to write a new file to a Google Cloud Storage bucket for eventual serving in the browser. On my local computer, the following writes a text file that I can open to see the test character as expected:
    with open('new_file.txt', 'w') as f:
        f.write(u'é'.encode('utf-8'))
When I open new_file.txt in Notepad, it is properly displayed as é.
But when I try the analogous process on Google Cloud Storage:
    with gcs.open('/mybucket/newfile.txt', 'w', content_type='text/html') as f:
        f.write(u'é'.encode('utf-8'))
My files are served in the browser with special characters all messed up. In this case the output is Ã©.
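The default charset for text content in HTTP 1.1 (as defined in RFC 2616) is ISO-8859-1, so when no charset is declared, the two UTF-8 bytes of é are read as two separate ISO-8859-1 characters.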
If you want the browser to interpret your text as UTF-8, you should set the content-type header to include the charset, for example by passing content_type='text/html; charset=utf-8' to gcs.open.
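
As a rough sketch of what is going on, the snippet below reproduces the misreading locally and builds a content type that carries the charset. The helper name with_charset is made up for illustration and is not part of the cloudstorage library. The gcs.open call is left as a comment because it only runs inside App Engine.

```python
# -*- coding: utf-8 -*-


def with_charset(content_type, charset='utf-8'):
    """Append a charset parameter to a media type (hypothetical helper)."""
    return '{0}; charset={1}'.format(content_type, charset)


# The same bytes the question writes to the bucket: 0xC3 0xA9.
data = u'é'.encode('utf-8')

# What a client shows if it assumes ISO-8859-1 for those bytes.
misread = data.decode('iso-8859-1')

# What it shows once it is told the bytes are UTF-8.
correct = data.decode('utf-8')

content_type = with_charset('text/html')

# With the GCS client from the question, the write would then look like this
# (not executed here, since it needs the App Engine cloudstorage library):
#
# with gcs.open('/mybucket/newfile.txt', 'w', content_type=content_type) as f:
#     f.write(data)
```

The bytes stored in the bucket are the same in both cases. Only the declared charset changes, and that is what tells the browser how to decode them.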