Weird characters while reading file content

Posted 2019-07-17 05:55

I'm not sure what is wrong:

for line in open(textfile, 'r'):
    print(line)

Output:

abcd

The file was created in Notepad++ with Unix EOL and UTF-8 encoding.

It now works properly when I save the file using the UTF-8 without BOM encoding option in Notepad++. But why? And how can I convert all the files I am sent to UTF-8 so that these weird characters don't appear?

2 answers
叛逆
#2 · 2019-07-17 06:39

You need to specify the file's encoding when you read it. Here the file is UTF-8 with a BOM, which Python calls utf-8-sig.

Add the encoding keyword argument to your open() call. From:

for line in open(textfile, 'r'):
    print(line)

to:

for line in open(textfile, 'r', encoding='utf-8-sig'):
    print(line)
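
To see why this helps, here is a minimal sketch that reproduces the problem. It assumes the file starts with the UTF-8 byte order mark (BOM); the temporary sample.txt file and the variable names are made up for the demonstration and are not from the original question.

```python
import codecs
import os
import tempfile

# Hypothetical sample file: "abcd" preceded by the UTF-8 BOM (bytes EF BB BF)
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "wb") as f:
    f.write(codecs.BOM_UTF8 + b"abcd\n")

# Plain utf-8 decodes the BOM as the character U+FEFF and keeps it in the text
with open(path, "r", encoding="utf-8") as f:
    plain = f.readline()

# utf-8-sig skips a leading BOM if one is present
with open(path, "r", encoding="utf-8-sig") as f:
    stripped = f.readline()

print(repr(plain))     # '\ufeffabcd\n'
print(repr(stripped))  # 'abcd\n'
```

If the encoding is left out entirely, Python falls back to a platform-dependent default, so the three BOM bytes may be decoded as other characters instead.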
仙女界的扛把子
#3 · 2019-07-17 06:47

Specifying the encoding will solve your problem.

for line in open(textfile, 'r', encoding='utf-8-sig'):
    print(line)

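utf_8_sig (also spelled utf-8-sig): UTF-8 codec with BOM signature. When decoding, it skips a leading BOM if one is present, so it also reads files that were saved without a BOM.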
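
As for converting the files you receive, one possible approach is to strip the BOM from each file once, so that a plain utf-8 read works afterwards. The sketch below assumes the files are already UTF-8 and differ only in whether they start with a BOM; strip_bom and demo.txt are hypothetical names used for illustration.

```python
import codecs
import os
import tempfile

def strip_bom(path):
    """Rewrite the file without its UTF-8 BOM; return True if one was removed."""
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False  # no BOM, leave the file untouched
    with open(path, "wb") as f:
        f.write(data[len(codecs.BOM_UTF8):])
    return True

# Demonstration on a throwaway file
demo = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(demo, "wb") as f:
    f.write(codecs.BOM_UTF8 + b"abcd\n")

first = strip_bom(demo)   # BOM present on the first call
second = strip_bom(demo)  # nothing left to remove on the second call
with open(demo, "rb") as f:
    content = f.read()
print(first, second, content)
```

Working on raw bytes leaves line endings and the rest of the content unchanged. If you cannot modify the files, reading with encoding='utf-8-sig' as shown above handles both cases.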
