I'm trying to use genfromtxt with Python 3 to read a simple CSV file containing strings and numbers. For example, something like this (hereinafter "test.csv"):
1,a
2,b
3,c
With Python 2, the following works well:
import numpy
data=numpy.genfromtxt("test.csv", delimiter=",", dtype=None)
# Now data is something like [(1, 'a') (2, 'b') (3, 'c')]
In Python 3, the same code returns [(1, b'a') (2, b'b') (3, b'c')]. This is somewhat expected, given the different way Python 3 reads files. Therefore I use a converter to decode the strings:
decodef = lambda x: x.decode("utf-8")
data=numpy.genfromtxt("test.csv", delimiter=",", dtype="f8,S8", converters={1: decodef})
This works with Python 2, but not with Python 3 (same [(1, b'a') (2, b'b') (3, b'c')] output).
However, if in Python 3 I use the code above to read only one column:
data=numpy.genfromtxt("test.csv", delimiter=",", usecols=(1,), dtype="S8", converters={1: decodef})
the output strings are ['a' 'b' 'c'], already decoded as expected.
I've also tried passing the file as an object returned by open with the 'rb' mode, as suggested at this link, but there was no improvement.
Why does the converter work when only one column is read, but not when two columns are read? Could you please suggest the correct way to use genfromtxt in Python 3? Am I doing something wrong? Thank you in advance!
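
One possible direction, sketched here under assumptions rather than as a confirmed fix: NumPy 1.14 and later give genfromtxt an encoding argument, and passing it together with dtype=None should yield str values rather than bytes for the string column. In the sketch below, an in-memory io.StringIO stands in for test.csv so the snippet is self-contained, and f0 and f1 are the default field names NumPy assigns when none are given.

```python
import io

import numpy

# Same contents as test.csv, held in memory so the sketch is self-contained.
csv_text = "1,a\n2,b\n3,c\n"

# Assumption: NumPy >= 1.14, where genfromtxt accepts the `encoding` argument.
# With dtype=None each column's type is inferred separately.
data = numpy.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    dtype=None,
    encoding="utf-8",
)

# Unnamed fields get the default names f0, f1, ...
numbers = data["f0"].tolist()
letters = data["f1"].tolist()
print(numbers, letters)
```

With a file on disk, the same call would take "test.csv" in place of the StringIO object. No converter is involved here, so this does not explain the one-column versus two-column difference described above.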