I have data similar to that seen in this gist, and I am trying to extract it with NumPy. I am rather new to Python, so I tried the following code:
import numpy as np
from datetime import datetime
convertfunc = lambda x: datetime.strptime(x, '%H:%M:%S:.%f')
col_headers = ["Mass", "Thermocouple", "T O2 Sensor",\
"Igniter", "Lamps", "O2", "Time"]
data = np.genfromtxt(files[1], skip_header=22,\
names=col_headers,\
converters={"Time": convertfunc})
As can be seen in the gist, there are 22 rows of header material. In IPython, when I run the code above, I receive an error that ends with the following:
TypeError: float() argument must be a string or a number
The full ipython error trace can be seen here.
I am able to extract the six columns of numeric data just fine using an argument to genfromtxt like usecols=range(0,6), but when I try to use a converter to handle the last column, I'm stumped. Any and all comments would be appreciated!
This is happening because np.genfromtxt is trying to create a float array, which fails because convertfunc returns a datetime object, which cannot be cast to float. The easiest solution would be to just pass the argument dtype='object' to np.genfromtxt, ensuring the creation of an object array and preventing a conversion to float. However, this would mean that the other columns would be saved as strings. To get them properly saved as floats, you need to specify the dtype of each column to get a structured array. Set them all to double except the last column, which gets an object dtype. This gives you a structured array whose fields you can access with the names you gave.
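The structured-dtype approach might look roughly like the sketch below. It is not the exact code from this answer: the two sample rows, the two-line header (instead of 22), and the time format '%H:%M:%S.%f' are all invented for illustration, so adjust them to match the real file. Passing encoding="utf-8" is also an assumption, made so that the converter receives str rather than bytes on recent NumPy versions.

```python
import io
from datetime import datetime

import numpy as np

col_headers = ["Mass", "Thermocouple", "T O2 Sensor",
               "Igniter", "Lamps", "O2", "Time"]


def convertfunc(text):
    # Assumed time layout (hypothetical): hours:minutes:seconds.fraction
    return datetime.strptime(text, "%H:%M:%S.%f")


# Every column is a double ('f8') except the last, which holds Python objects.
dtypes = [(name, "f8") for name in col_headers[:-1]]
dtypes.append((col_headers[-1], object))

# Stand-in for the real file: two header lines, then two made-up data rows.
sample = io.StringIO(
    "header line 1\n"
    "header line 2\n"
    "0.10 21.5 20.9 0 0 20.8 00:00:01.500\n"
    "0.12 21.7 20.9 1 0 20.7 00:00:02.000\n"
)

data = np.genfromtxt(sample, skip_header=2, dtype=dtypes,
                     names=col_headers,
                     converters={"Time": convertfunc},
                     encoding="utf-8")

print(data["Mass"])     # float column
print(data["Time"][0])  # datetime object
```

With the real file you would pass files[1] and skip_header=22 instead of the in-memory sample.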
You can use pandas read_table, which worked for me. You need to process the header separately, since the header lines contain a variable number of space-separated fields.
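A pandas version could look something like this. It is only a sketch, not the original code from this answer: the sample rows, the two-line header, and the time format are made up for illustration, and here the header lines are simply skipped rather than parsed.

```python
import io

import pandas as pd

col_headers = ["Mass", "Thermocouple", "T O2 Sensor",
               "Igniter", "Lamps", "O2", "Time"]

# Stand-in for the real file: two header lines, then two made-up data rows.
sample = io.StringIO(
    "header line 1\n"
    "header line 2\n"
    "0.10 21.5 20.9 0 0 20.8 00:00:01.500\n"
    "0.12 21.7 20.9 1 0 20.7 00:00:02.000\n"
)

# Split on runs of whitespace, skip the header block, and supply our own names.
df = pd.read_table(sample, sep=r"\s+", skiprows=2,
                   header=None, names=col_headers)

# Parse the time strings afterwards (assumed format, hypothetical).
df["Time"] = pd.to_datetime(df["Time"], format="%H:%M:%S.%f")

print(df.dtypes)
```

Since the strings carry no date, to_datetime fills in a default date of 1900-01-01; if you only need elapsed time, subtracting the first entry from the column gives timedeltas.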