How can I make genfromtxt read only a specified number of rows?

Posted 2019-07-08 11:05

Question:

genfromtxt can skip header and footer lines and specify which columns to use. But how can I control how many lines it reads?

Sometimes a text file contains several blocks with different shapes. For example,

from io import StringIO
from numpy import genfromtxt

a=StringIO('''
1,2,3
1,2,3
2,3
2,3
''')
genfromtxt(a,delimiter=',',skip_header=1)

This raises an error:

ValueError: Some errors were detected !
    Line #4 (got 2 columns instead of 3)
    Line #5 (got 2 columns instead of 3)

Of course, I can do it like this:

a=StringIO('''
1,2,3
1,2,3
2,3
2,3
''')
genfromtxt(a,delimiter=',',skip_header=1,skip_footer=2)

That is ugly, because I have to count the rows that come after the block I want.

What I would like is something like

genfromtxt(a,delimiter=',',skip_header=1,nrows=2)

which would be much clearer.

Does anyone have a good idea for this? Or should I use a different function?


Update (October 2015)

This has been solved in newer versions of NumPy.

genfromtxt now has a keyword argument named max_rows, which controls the maximum number of lines to read; see the genfromtxt documentation.
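As a minimal sketch of how max_rows might be used on the sample data from the question: the first call stops after the two 3-column rows, and the second call skips past them to read the 2-column block. The variable names (text, first, second) are illustrative only, and the row counts are assumed to be known in advance.

```python
from io import StringIO

import numpy as np

# Same sample data as in the question: a leading blank line,
# then a 3-column block followed by a 2-column block.
text = "\n1,2,3\n1,2,3\n2,3\n2,3\n"

# First block: skip the leading blank line, then read at most 2 rows.
first = np.genfromtxt(StringIO(text), delimiter=",", skip_header=1, max_rows=2)

# Second block: skip the blank line plus the 2 rows of the first block.
second = np.genfromtxt(StringIO(text), delimiter=",", skip_header=3)

print(first.shape)   # shape of the 3-column block
print(second.shape)  # shape of the 2-column block
```

Note that max_rows cannot be combined with skip_footer in the same call, so the skip_footer workaround from the question has to be dropped when switching to it.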

Answer 1:

You can pass invalid_raise=False to skip lines whose number of columns does not match the rest of the data. E.g.

b = np.genfromtxt(a, delimiter=',', invalid_raise=False)

This will give you a warning, but will not raise an exception.
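A self-contained sketch of this approach on the data from the question, assuming the block you want is the one whose width matches the first data row. The warning is captured here only to show that one is issued; the names text and caught are illustrative.

```python
import warnings
from io import StringIO

import numpy as np

# Same sample data as in the question.
text = "\n1,2,3\n1,2,3\n2,3\n2,3\n"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is recorded
    # Rows whose column count differs from the first data row are dropped.
    b = np.genfromtxt(StringIO(text), delimiter=",", invalid_raise=False)

print(b)            # only the 3-column rows remain
print(len(caught))  # number of warnings issued for the dropped rows
```

One caveat with this design: rows are filtered by column count rather than by position, so any later 3-column row in the file would also be kept, and the 2-column block cannot be read this way.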