I get a UnicodeEncodeError when writing text that contains a special character to a file:
File "D:\SOFT\Python3\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 956: character maps to <undefined>
My code:
expFile = open(expFilePath, 'w')
# data var is what contains a special char
expFile.write("\n\n" + data)
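A minimal sketch of what seems to be happening, assuming `open()` falls back to cp1252 as the traceback suggests. The sample string is made up; only the '\ufffd' character comes from the error message.

```python
# Hypothetical sample text; U+FFFD is the character named in the traceback.
data = "caf\ufffd"

try:
    # Encode with the same codec the traceback shows (cp1252).
    data.encode("cp1252")
    failed = False
except UnicodeEncodeError as exc:
    failed = True
    # exc.object is the original string; start/end mark the offending span.
    bad_char = exc.object[exc.start:exc.end]
    print(f"cannot encode {bad_char!r} at position {exc.start}")
```

If this reproduces the error, the problem is the file's encoding rather than the write call itself.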
The data probably contains some odd character from something like Microsoft Word that was pasted into the application's HTML form and persisted; now I am importing it. I can't even see the character: it shows as a diamond in my DB editor when I query it, and as a placeholder in the text editor. The input should be checked more rigorously for character set compliance, but it is not.
Is there a way to encode the data so that any character is digestible for I/O processing?
Alternatively, is there a way to check whether my str complies with the character set expected by file I/O, so that I can replace any data that violates it?
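To make the two questions concrete, here is a rough sketch of the kind of thing being asked for, using only built-in `str.encode` and `open()` parameters. The helper name `find_unencodable`, the sample string, and the temporary file path are assumptions for illustration, not part of the real application.

```python
import os
import tempfile


def find_unencodable(text, encoding):
    """Return (index, character) pairs that `encoding` cannot represent."""
    bad = []
    for i, ch in enumerate(text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            bad.append((i, ch))
    return bad


# Hypothetical stand-in for the imported data.
data = "ok \ufffd end"

# Check: which characters would violate the target encoding?
problems = find_unencodable(data, "cp1252")

# Encode: writing as UTF-8 accepts every character except lone surrogates.
path = os.path.join(tempfile.mkdtemp(), "export.txt")
with open(path, "w", encoding="utf-8") as exp_file:
    exp_file.write("\n\n" + data)

# Replace: stay with cp1252 but substitute "?" for anything it cannot encode.
lossy = data.encode("cp1252", errors="replace").decode("cp1252")
```

Passing `errors="replace"` directly to `open()` should give the same substitution at write time, at the cost of silently losing the original characters.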