I have a CSV file ready to load into my Python code; however, I want to load it into the following format:
data = [[A,B,C,D],
[A,B,C,D],
[A,B,C,D],
]
How would I go about loading a .csv file so that it is readable as a numpy array? For example, simply following previous tutorials plays havoc with:
data = np.array(data)
Failing that, I would just like to load my csv file (e.g. 'dual-Cored.csv' as data = dual-Cored.csv).
The simplest solution is just:
import numpy as np
data = np.loadtxt("myfile.csv", delimiter=",")
As long as the data is convertible into float and has an equal number of columns on each row, this works. Note that numpy.loadtxt splits on whitespace by default, so a comma-separated file needs delimiter=",".
If the data in some column is not convertible into float, you may write your own converters for it. Please see the numpy.loadtxt documentation. It is really very flexible.
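As a rough sketch of what such a converter could look like, assume a file whose second column holds percentages like "50%". The sample data and the percent_to_fraction helper are made up for illustration; only the converters argument of numpy.loadtxt comes from the library.

```python
import io
import numpy as np

# Hypothetical sample: the second column holds percentages such as "50%",
# which float() cannot parse directly.
sample = io.StringIO("1.0,50%,3.0\n4.0,25%,6.0\n")

def percent_to_fraction(field):
    # Depending on the NumPy version, converters receive bytes or str.
    if isinstance(field, bytes):
        field = field.decode()
    return float(field.strip().rstrip("%")) / 100.0

# converters maps a column index to the function used to parse that column.
data = np.loadtxt(sample, delimiter=",", converters={1: percent_to_fraction})
print(data)
```

Here the second column ends up as 0.5 and 0.25, and the other columns are parsed as plain floats.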
If your CSV looks like this:
A,B,C,D
A,B,C,D
A,B,C,D
A,B,C,D
then
import csv
# In Python 3, open in text mode with newline='' (Python 2 used 'rb').
with open(filename, newline='') as f:
    data = list(csv.reader(f))
would make data equal to
[['A', 'B', 'C', 'D'],
['A', 'B', 'C', 'D'],
['A', 'B', 'C', 'D'],
['A', 'B', 'C', 'D']]
As a small example, I have a file data.csv with the following contents.
A,B,C,D
1,2,3,4
W,X,Y,Z
5,6,7,8
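with open('data.csv', 'r') as f:
    # split() breaks the text on whitespace (one chunk per line here),
    # then each chunk is split on commas.
    data = [i.split(",") for i in f.read().split()]
print(data)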
Output
[['A', 'B', 'C', 'D'],
['1', '2', '3', '4'],
['W', 'X', 'Y', 'Z'],
['5', '6', '7', '8']]
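One caveat with splitting by hand: it only works when no field contains whitespace or a quoted comma. A small sketch of the difference, using made-up contents (the names text, naive and parsed are just for illustration):

```python
import csv
import io

# Hypothetical contents: one field contains a space,
# another is quoted and contains a comma.
text = 'name,comment\nJohn Smith,"hello, world"\n'

# Splitting by hand breaks the row apart at the space and at the quoted comma.
naive = [chunk.split(",") for chunk in text.split()]

# csv.reader keeps each row together and understands the quoting.
parsed = list(csv.reader(io.StringIO(text)))

print(naive)
print(parsed)
```

With data like this, csv.reader gives two rows of two fields each, while the manual split produces four ragged rows.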
I'm assuming you mean to get all your data points as integers or floating point numbers.
First I wrote some sample data:
with open('dual-Cored.csv', 'w') as f:
    f.write('1,2,3,4\n5,6,7,8\n9,10,11,12')
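Now I'm reading the sample data back in:
import csv
# newline='' is the mode the csv module expects in Python 3 ('rU' was removed).
with open('dual-Cored.csv', newline='') as f:
    c = csv.reader(f)
    for l in c:
        # Convert every field in the row from str to int.
        print(list(map(int, l)))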
Which prints:
[1, 2, 3, 4]
[5, 6, 7, 8]
[9, 10, 11, 12]
I recommend you read up a bit on datatypes in the Python tutorial, which talks about the difference between strings and numerical types.
To read into a numpy array with the csv module:
import csv
import numpy
with open('dual-Cored.csv', newline='') as f:
    c = csv.reader(f)
    # dtype=int converts the string fields to integers.
    ar = numpy.array(list(c), dtype=int)
and ar now returns:
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
Or directly use the numpy.genfromtxt function (you'll need to specify the delimiter):
numpy.genfromtxt('dual-Cored.csv', delimiter=',')
returns:
array([[ 1., 2., 3., 4.],
[ 5., 6., 7., 8.],
[ 9., 10., 11., 12.]])
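Note that genfromtxt returns floats by default. If the file has a header row and you want integers, something along these lines should work; the sample contents are invented, while skip_header and dtype are standard genfromtxt parameters.

```python
import io
import numpy as np

# Hypothetical file contents with a header line followed by integer rows.
sample = io.StringIO("A,B,C,D\n1,2,3,4\n5,6,7,8\n")

# skip_header=1 drops the first line; dtype=int requests an integer array.
arr = np.genfromtxt(sample, delimiter=",", skip_header=1, dtype=int)
print(arr)
```

Passing a filename such as 'dual-Cored.csv' instead of the StringIO object works the same way.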