How to read inputs from stdin and enforce an encoding

The goal is to continuously read from stdin and enforce UTF-8 decoding in both Python 2 and Python 3.

I've tried the following, based on the answers linked in the code comments:

#!/usr/bin/env python

from __future__ import print_function, unicode_literals
import io
import sys

# Supports Python2 read from stdin and Python3 read from stdin.buffer
# https://stackoverflow.com/a/23932488/610569
user_input = getattr(sys.stdin, 'buffer', sys.stdin)


# Enforcing utf-8 in Python3
# https://stackoverflow.com/a/16549381/610569
with io.TextIOWrapper(user_input, encoding='utf-8') as fin:
    for line in fin:
        # Reads the input line by line
        # and do something, for e.g. just print line.
        print(line)

The code works in Python 3, but in Python 2 io.TextIOWrapper fails because the file object passed to it has no readable() method, and it throws:

Traceback (most recent call last):
  File "testfin.py", line 12, in <module>
    with io.TextIOWrapper(user_input, encoding='utf-8') as fin:
AttributeError: 'file' object has no attribute 'readable'

That's because in Python 3 user_input, i.e. sys.stdin.buffer, is an _io.BufferedReader object, and its attributes include readable:

<class '_io.BufferedReader'>

['__class__', '__del__', '__delattr__', '__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__lt__', '__ne__', '__new__', '__next__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '_checkClosed', '_checkReadable', '_checkSeekable', '_checkWritable', '_dealloc_warn', '_finalizing', 'close', 'closed', 'detach', 'fileno', 'flush', 'isatty', 'mode', 'name', 'peek', 'raw', 'read', 'read1', 'readable', 'readinto', 'readinto1', 'readline', 'readlines', 'seek', 'seekable', 'tell', 'truncate', 'writable', 'write', 'writelines']

In Python 2, on the other hand, user_input is a plain file object (sys.stdin itself, since it has no buffer attribute), and its attributes don't include readable:

<type 'file'>

['__class__', '__delattr__', '__doc__', '__enter__', '__exit__', '__format__', '__getattribute__', '__hash__', '__init__', '__iter__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'close', 'closed', 'encoding', 'errors', 'fileno', 'flush', 'isatty', 'mode', 'name', 'newlines', 'next', 'read', 'readinto', 'readline', 'readlines', 'seek', 'softspace', 'tell', 'truncate', 'write', 'writelines', 'xreadlines']

Tags: python file utf-8 io stdin

2 answers

戒情不戒烟

Reply #2 · 2019-07-15 06:52

If you don't need a fully-fledged io.TextIOWrapper, but just a decoded stream for reading, you can use codecs.getreader() to create a decoding wrapper:

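import codecs
import sys

# Binary stdin: sys.stdin.buffer in Python 3, sys.stdin itself in Python 2
user_input = getattr(sys.stdin, 'buffer', sys.stdin)
reader = codecs.getreader('utf8')(user_input)
for line in reader:
    # do whatever you need...
    print(line)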

codecs.getreader('utf8') creates a factory for a codecs.StreamReader, which is then instantiated using the original stream. I'm not sure the StreamReader supports the with context, but this might not be strictly necessary (there's no need to close STDIN after reading, I guess...).

I've successfully used this solution in situations where the underlying stream only offers a very limited interface.
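To make the codecs.getreader() approach above easier to try without a terminal, here is a minimal sketch that decodes lines from an in-memory binary stream. The helper name decode_lines and the io.BytesIO stand-in for stdin are assumptions made for illustration only; they are not part of the original answer.

```python
from __future__ import print_function, unicode_literals
import codecs
import io


def decode_lines(byte_stream, encoding='utf-8'):
    """Yield decoded text lines from a binary stream (hypothetical helper)."""
    # codecs.getreader() returns a StreamReader class for the encoding;
    # calling it wraps the binary stream in a decoding reader.
    reader = codecs.getreader(encoding)(byte_stream)
    for line in reader:
        yield line


# io.BytesIO stands in for the binary stdin stream in this sketch.
fake_stdin = io.BytesIO('h\xe9llo\nw\xf6rld\n'.encode('utf-8'))
lines = list(decode_lines(fake_stdin))
print(len(lines), 'lines decoded')
```

In a real script, the binary stream would be getattr(sys.stdin, 'buffer', sys.stdin), as in the question. Each yielded line keeps its trailing newline, so print(line) emits a blank line after it unless the newline is stripped first.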

Update (2nd version)

From the comments, it became clear that you actually need an io.TextIOWrapper to have proper line buffering etc. in interactive mode; codecs.StreamReader only works for piped input and the like.

Using this answer, I was able to get interactive input to work properly:

#!/usr/bin/env python
# coding: utf8

from __future__ import print_function, unicode_literals
import io
import sys

user_input = getattr(sys.stdin, 'buffer', sys.stdin)

with io.open(user_input.fileno(), encoding='utf8') as f:
    for line in f:
        # do whatever you need...
        print(line)

This creates an io.TextIOWrapper with enforced encoding from the binary STDIN buffer.
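The claim that io.open() on a file descriptor yields an io.TextIOWrapper can be checked with a small sketch. Here an os.pipe() stands in for stdin so the example runs non-interactively; the helper name open_text_from_fd and the closefd=False choice are assumptions for illustration, not something the code above does.

```python
from __future__ import print_function, unicode_literals
import io
import os


def open_text_from_fd(fd, encoding='utf-8'):
    """Wrap a readable file descriptor in a text stream (hypothetical helper)."""
    # closefd=False leaves the descriptor open when the wrapper is closed.
    return io.open(fd, mode='r', encoding=encoding, closefd=False)


# A pipe stands in for stdin here; in a real script, pass sys.stdin.fileno().
read_fd, write_fd = os.pipe()
os.write(write_fd, 'caf\xe9\n'.encode('utf-8'))
os.close(write_fd)

with open_text_from_fd(read_fd) as stream:
    wrapper_type = type(stream).__name__
    lines = [line for line in stream]
os.close(read_fd)

print(wrapper_type, len(lines))
```

Passing closefd=False is a design choice for cases where stdin should stay usable after the with block; the original code uses the default, which closes the underlying descriptor on exit.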


虎瘦雄心在

Reply #3 · 2019-07-15 06:57

Have you tried forcing UTF-8 as the default encoding in Python 2, as follows:

import sys
reload(sys)
sys.setdefaultencoding('utf-8')
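Note that reload() is a builtin only in Python 2 and sys.setdefaultencoding() does not exist in Python 3, so the snippet above raises an error there. A guarded sketch that can be imported under both versions might look like the following; the function name force_default_encoding is hypothetical. As far as I know, this hack only changes implicit str/unicode conversions in Python 2 and does not by itself change how bytes read from sys.stdin are decoded.

```python
from __future__ import print_function
import sys


def force_default_encoding(encoding='utf-8'):
    """Apply the Python 2 default-encoding hack; return True if applied."""
    if sys.version_info[0] >= 3:
        # Python 3 has no sys.setdefaultencoding(); nothing to do.
        return False
    # Python 2 only: site.py deletes sys.setdefaultencoding at startup,
    # and reloading the sys module makes it available again.
    reload(sys)  # noqa: F821 (builtin exists only in Python 2)
    sys.setdefaultencoding(encoding)
    return True


applied = force_default_encoding()
print('default encoding:', sys.getdefaultencoding(), 'hack applied:', applied)
```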

