Why does my Python code print the extra characters "ï»¿"?

Posted 2020-02-05 11:42

try:
    data=open('info.txt')
    for each_line in data:
        try:
            (role,line_spoken)=each_line.split(':',1)
            print(role,end='')
            print(' said: ',end='')
            print(line_spoken,end='')
        except ValueError:
            print(each_line)
    data.close()
except IOError:
    print("File is missing")

When printing the file line by line, the code adds three unwanted characters at the front of the output, namely "ï»¿".

Actual output:

ï»¿Man said:  Is this the right room for an argument?
Other Man said:  I've told you once.
Man said:  No you haven't!
Other Man said:  Yes I have.

Expected output:

Man said:  Is this the right room for an argument?
Other Man said:  I've told you once.
Man said:  No you haven't!
Other Man said:  Yes I have.

2 Answers
够拽才男人
#2 · 2020-02-05 12:20

I had a very similar problem when dealing with Excel CSV files. Initially I had saved my file using the "CSV UTF-8 (Comma delimited)" option from the drop-down choices. Then I saved it as a plain "CSV (Comma delimited)" file and all was well. Perhaps there is a similar issue with your .txt file.
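If re-saving the file is not an option, the CSV can probably be read as it is by telling Python to skip the byte order mark. The following is only a minimal sketch: `read_csv_rows` and the sample file are made up for illustration, and it assumes the export begins with a UTF-8 BOM.

```python
import csv
import os
import tempfile


def read_csv_rows(path):
    """Read a CSV file, tolerating an optional UTF-8 BOM at the start (hypothetical helper)."""
    # 'utf-8-sig' skips a leading BOM if present and otherwise behaves like 'utf-8'.
    # newline='' is what the csv module documentation recommends for file objects.
    with open(path, encoding='utf-8-sig', newline='') as f:
        return list(csv.reader(f))


# Build a small sample file that starts with a BOM (assumed to mimic the UTF-8 export).
sample_path = os.path.join(tempfile.mkdtemp(), 'sample.csv')
with open(sample_path, 'wb') as f:
    f.write(b'\xef\xbb\xbfname,line\r\nMan,Hello\r\n')

rows = read_csv_rows(sample_path)
print(rows)
```

The first header cell comes back as `name`, without stray characters attached to it.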

孤傲高冷的网名
#3 · 2020-02-05 12:36

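I can't find a duplicate of this for Python 3, which handles encodings differently from Python 2. So here's the answer: instead of opening the file with the default encoding (which depends on your platform and locale, and on Windows is often a legacy code page such as 'cp1252'), use 'utf-8-sig', which strips off the UTF-8 Byte Order Mark if one is present. The BOM is the three bytes EF BB BF, which is what shows up as "ï»¿" when they are decoded as cp1252.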

That is, instead of

data = open('info.txt')

Do

data = open('info.txt', encoding='utf-8-sig')
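To see where the three characters come from, it may help to decode the same bytes in different ways. This is an illustrative sketch, not the asker's actual file: the sample line is built in memory, and 'cp1252' is only an assumption about which default encoding was in effect.

```python
import codecs

# The UTF-8 byte order mark is the three bytes EF BB BF.
raw = codecs.BOM_UTF8 + b'Man: Is this the right room for an argument?\n'

# A single-byte Windows code page turns each BOM byte into a visible character.
as_cp1252 = raw.decode('cp1252')
# Plain 'utf-8' keeps the BOM as one invisible character, U+FEFF.
as_utf8 = raw.decode('utf-8')
# 'utf-8-sig' removes the BOM from the decoded text.
as_utf8_sig = raw.decode('utf-8-sig')

print(repr(as_cp1252[:3]))
print(repr(as_utf8[0]))
print(as_utf8_sig, end='')
```

Only the 'utf-8-sig' result starts directly with `Man:`.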

Note that if you're on Python 2, you should see e.g. the questions "Python, Encoding output to UTF-8" and "Convert UTF-8 with BOM to UTF-8 with no BOM in Python". You'll need to do some shenanigans with codecs or with str.decode for this to work right in Python 2. But in Python 3, all you need to do is set the encoding= parameter when you open the file.
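One way to write the same fix so that it should run on both Python 2.6+ and Python 3 is `io.open`, which accepts `encoding=` on both. This is a sketch under that assumption; `read_lines_without_bom` and the demo file are hypothetical, and on Python 2 the returned lines would be `unicode` objects rather than `str`.

```python
import io
import os
import tempfile


def read_lines_without_bom(path):
    """Return the file's lines with any leading UTF-8 BOM removed (hypothetical helper)."""
    # io.open takes encoding= on Python 2.6+; on Python 3 it is the same function as open().
    with io.open(path, encoding='utf-8-sig') as f:
        return f.read().splitlines()


# Write a demo file that starts with the BOM bytes EF BB BF.
demo_path = os.path.join(tempfile.mkdtemp(), 'info.txt')
with io.open(demo_path, 'wb') as f:
    f.write(b'\xef\xbb\xbfMan: No you haven\'t!\n')

lines = read_lines_without_bom(demo_path)
print(lines[0])
```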
