I'm trying to output a CSV file that the user can open with Excel. I've encoded all strings as UTF-8, but when I open the file in Excel I see gibberish. Only after converting the file to UTF-8 with BOM (using Notepad++ on Windows) was I able to display the content properly.
I'm following this pattern from the docs:
def render_to_csv(self, request, qs):
    response = HttpResponse(content_type='text/csv')
    response['Content-Disposition'] = 'attachment; filename="test.csv"'
    writer = csv.writer(response, delimiter=',')
    for row in qs.values_list(*self.fields_to_export):
        writer.writerow([unicode(v).encode('utf-8') if v is not None else '' for v in row])
    return response
Where does the BOM fit into all of this?
BTW, there are similar questions on SO, but unfortunately none of them are answered.
EDIT
Building on @Alastair McCormack's answer, I ended up explicitly adding the BOM bytes at the beginning of the file. The only difference is that I used the codecs module instead of hard-coding the bytes. Feels awkward, but it does the trick!
import codecs
def render_to_csv(self, request, qs):
    ...
    response.write(codecs.BOM_UTF8)
    ...
    return response
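As a quick sanity check on what `codecs.BOM_UTF8` actually contains, the snippet below compares it with the hand-written bytes and with the encoded U+FEFF character. It assumes Python 3 and uses only the standard library; the variable names are illustrative.

```python
import codecs

# The three bytes that are often hard-coded as the UTF-8 BOM.
hard_coded_bom = b"\xef\xbb\xbf"

# The BOM is the character U+FEFF encoded as UTF-8.
bom_from_text = "\ufeff".encode("utf-8")

# Both spellings match the constant provided by the codecs module.
same_as_hard_coded = codecs.BOM_UTF8 == hard_coded_bom
same_as_encoded_char = codecs.BOM_UTF8 == bom_from_text
```

So using the constant changes nothing in the output; it only avoids a magic byte string in the view.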
Add the UTF-8 BOM to the response object before you write your data:
def render_to_csv(self, request, qs):
    response = HttpResponse(content_type='text/csv')
    response['Content-Disposition'] = 'attachment; filename="test.csv"'
    # BOM (a byte string on Python 2; on Python 3 write b"\xEF\xBB\xBF" or codecs.BOM_UTF8 instead)
    response.write("\xEF\xBB\xBF")
    writer = csv.writer(response, delimiter=',')
    …
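The code in this thread targets Python 2. On Python 3, the literal "\xEF\xBB\xBF" is text rather than bytes, so it would be re-encoded when written to the response and would no longer be a valid BOM. One possible alternative, sketched here under the assumption of Python 3, is to build the CSV as text and encode it with the standard `utf-8-sig` codec, which prepends the BOM. The helper name `rows_to_excel_csv_bytes` and the sample rows are hypothetical, not part of the original view.

```python
import codecs
import csv
import io


def rows_to_excel_csv_bytes(rows):
    """Serialize rows to CSV bytes that start with the UTF-8 BOM (hypothetical helper)."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, delimiter=",")
    for row in rows:
        # Mirror the original view: None becomes an empty cell.
        writer.writerow(["" if v is None else str(v) for v in row])
    # "utf-8-sig" writes the BOM before the encoded text.
    return buffer.getvalue().encode("utf-8-sig")


payload = rows_to_excel_csv_bytes([["name", "city"], ["José", None]])
```

The resulting bytes could then be passed as the content of an `HttpResponse` with `content_type='text/csv'`. This buffers the whole file in memory, so it is only a reasonable fit for small exports.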
With StreamingHttpResponse, emit the UTF-8 BOM (\xEF\xBB\xBF, available as codecs.BOM_UTF8) before the CSV rows.
Modified from the official documentation:
import csv
import codecs
from django.utils.six.moves import range  # Python 2 compatibility; on Python 3 the built-in range is enough
from django.http import StreamingHttpResponse

class Echo(object):
    def write(self, value):
        # Return the value instead of storing it, so each row can be streamed.
        return value

def iter_csv(rows, pseudo_buffer):
    # Emit the UTF-8 BOM first so Excel detects the encoding.
    yield pseudo_buffer.write(codecs.BOM_UTF8)
    writer = csv.writer(pseudo_buffer)
    for row in rows:
        yield writer.writerow(row)

def some_streaming_csv_view(request):
    rows = (["Row {}".format(idx), str(idx)] for idx in range(65536))
    response = StreamingHttpResponse(iter_csv(rows, Echo()), content_type="text/csv")
    return response
The given answers are great, but here is a hint for making the official example from the Django docs work with the UTF-8 BOM at the start of the stream by changing only one line:
import itertools
import codecs
streaming_content = itertools.chain([codecs.BOM_UTF8], (writer.writerow(row) for row in rows))
response = StreamingHttpResponse(streaming_content, content_type="text/csv")
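To see what the chained iterator yields without running a Django view, the pattern can be exercised with the standard library alone. This is a minimal sketch assuming Python 3: `Echo` is the pseudo-buffer from the docs example, the three sample rows are made up, and the final join only approximates what a streaming response would send by encoding text chunks as UTF-8.

```python
import codecs
import csv
import itertools


class Echo:
    """Pseudo-buffer whose write() returns the value instead of storing it."""

    def write(self, value):
        return value


# Sample data for illustration only.
rows = (["Row {}".format(idx), str(idx)] for idx in range(3))
writer = csv.writer(Echo())

# The BOM comes first, followed by one chunk per CSV row.
streaming_content = itertools.chain(
    [codecs.BOM_UTF8],
    (writer.writerow(row) for row in rows),
)
chunks = list(streaming_content)

# Approximate the bytes on the wire: text chunks encoded as UTF-8.
body = b"".join(c if isinstance(c, bytes) else c.encode("utf-8") for c in chunks)
```

The first chunk is the BOM and each later chunk is one CSV line, so the BOM appears exactly once at the start of the stream.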