I am currently inserting data in my django models using csv file. Below is a simple save function that am using:
def save(self):
myfile = file.csv
data = csv.reader(myfile, delimiter=',', quotechar='"')
i=0
for row in data:
if i == 0:
i = i + 1
continue #skipping the header row
b=MyModel()
b.create_from_csv_row(row) # calls a method to save in models
The function is working perfectly with ascii characters. However, if the csv file has some non-ascii characters then, an error is raised: UnicodeDecodeError 'ascii' codec can't decode byte 0x93 in position 1526: ordinal not in range(128)
My question is: How can i remove non-ascii characters before saving my csv file to avoid this error.
Thanks in advance.
If you really want to strip it, try:
* WARNING THIS WILL MODIFY YOUR DATA * It attempts to find a close match - i.e. ć -> c
Perhaps a better answer is to use unicodecsv instead.
----- EDIT ----- Okay, if you don't care that the data is represented at all, try the following:
If row is a collection, not a unicode string, you will need to iterate over the collection to the string level to re-serialize it.
If you want to remove non-ascii characters from your data then iterate through your data and keep only the ascii.
If you want to convert unicode characters to ascii, then the response above by DivinusVox is accurate.
Pandas csv parser (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.parsers.read_csv.html) supports different encodings: