In another question, other users offered some help if I could supply the array I was having trouble with. However, I even fail at a basic I/O task, such as writing an array to a file.
Can anyone explain what kind of loop I would need to write a 4x11x14 numpy array to file?
This array consists of four 11 x 14 arrays, so I should separate them with a newline to make the file easier for others to read.
Edit: So I've tried the numpy.savetxt function. Strangely, it gives the following error:
TypeError: float argument required, not numpy.ndarray
I assume that this is because the function doesn't work with multidimensional arrays? Is there a solution that keeps everything within one file?
You can simply traverse the array in three nested loops and write their values to your file. For reading, you simply use the same exact loop construction. You will get the values in exactly the right order to fill your arrays correctly again.
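A minimal sketch of the nested-loop approach described above, assuming one value per line and a shape that is known when reading. The file name and the sample data are made up for illustration.

```python
import os
import tempfile

import numpy as np

# Illustrative data with the 4 x 11 x 14 shape from the question.
data = np.arange(4 * 11 * 14, dtype=float).reshape((4, 11, 14))
loop_path = os.path.join(tempfile.gettempdir(), "array_nested_loops.txt")

# Write: three nested loops, one value per line.
with open(loop_path, "w") as f:
    for i in range(data.shape[0]):
        for j in range(data.shape[1]):
            for k in range(data.shape[2]):
                # repr() of a Python float round-trips exactly.
                f.write(repr(float(data[i, j, k])) + "\n")

# Read: the same loop construction fills the array in the same order.
restored = np.empty((4, 11, 14))
with open(loop_path) as f:
    for i in range(restored.shape[0]):
        for j in range(restored.shape[1]):
            for k in range(restored.shape[2]):
                restored[i, j, k] = float(f.readline())
```

Note that the shape has to be known in advance when reading, since the file itself only stores a flat list of values.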
I have a way to do it using a simple filename.write() operation. It works fine for me, but I'm only dealing with arrays of ~1500 data elements.
I basically just have for loops to iterate through the file and write it to the output destination line-by-line in a csv style output.
The if and elif statements are used to add commas between the data elements. For whatever reason, these get stripped out when reading the file in as an nd array. My goal was to output the file as a csv, so this method helps to handle that.
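The original code for this answer is not shown, so the following is only a guess at what such a loop could look like, assuming a 2D array. The names data2d and csv_path are hypothetical.

```python
import os
import tempfile

import numpy as np

# Hypothetical 2D data; the answer does not say what the real array looked like.
data2d = np.arange(12, dtype=float).reshape((3, 4))
csv_path = os.path.join(tempfile.gettempdir(), "array_manual.csv")

with open(csv_path, "w") as out:
    for row in data2d:
        last = len(row) - 1
        for j, value in enumerate(row):
            if j < last:
                # Not the last element of the row: follow it with a comma.
                out.write(f"{float(value)},")
            elif j == last:
                # Last element of the row: end the line instead.
                out.write(f"{float(value)}\n")

# The file can be read back by telling loadtxt about the delimiter.
restored_csv = np.loadtxt(csv_path, delimiter=",")
```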
Hope this helps!
If you don't need a human-readable output, another option you could try is to save the array as a MATLAB .mat file, which is a structured array. I despise MATLAB, but the fact that I can both read and write a .mat in very few lines is convenient.
Unlike Joe Kington's answer, the benefit of this is that you don't need to know the original shape of the data in the .mat file, i.e. no need to reshape upon reading in. And, unlike using pickle, a .mat file can be read by MATLAB, and probably some other programs/languages as well.
Here is an example:
And of course you can store many arrays using many more keys. If you forget the key that an array is stored under in the .mat file, you can always list the keys of the dictionary that loadmat returns.
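A small sketch of looking up the stored names, assuming two made-up arrays saved under the hypothetical keys "first" and "second". The dictionary returned by loadmat also holds metadata entries whose names start with a double underscore, which are filtered out here.

```python
import os
import tempfile

import numpy as np
import scipy.io

mat_path = os.path.join(tempfile.gettempdir(), "example_many_arrays.mat")

# Store two arrays under two (hypothetical) keys in one file.
scipy.io.savemat(mat_path, {"first": np.zeros((4, 11, 14)), "second": np.ones((2, 3))})

# List the stored names, skipping metadata such as __header__.
contents = scipy.io.loadmat(mat_path)
array_names = sorted(key for key in contents if not key.startswith("__"))
print(array_names)
```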
So yes – it won't be readable with your eyes, but only takes 2 lines to write and read the data, which I think is a fair trade-off.
Take a look at the docs for scipy.io.savemat and scipy.io.loadmat and also this tutorial page: scipy.io File IO Tutorial
Pickle is best for these cases. Suppose you have an ndarray named x_train. You can dump it into a file and load it back with pickle.dump and pickle.load.
I'm not certain if this meets your requirements, given I think you're interested in making the file readable by people, but if that's not a primary concern, just pickle it.
To save it:
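A minimal sketch of the save step, assuming a made-up array and file name:

```python
import os
import pickle
import tempfile

import numpy as np

# Illustrative data with the 4 x 11 x 14 shape from the question.
data = np.arange(4 * 11 * 14, dtype=float).reshape((4, 11, 14))
pickle_path = os.path.join(tempfile.gettempdir(), "array_saved.pkl")

# Pickle files are binary, so open in "wb" mode.
with open(pickle_path, "wb") as f:
    pickle.dump(data, f)
```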
To read it back:
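A matching sketch of the load step. So that it runs on its own, it first writes a small made-up array to a hypothetical file and then reads it back.

```python
import os
import pickle
import tempfile

import numpy as np

pickle_path = os.path.join(tempfile.gettempdir(), "array_roundtrip.pkl")

# Create a file to read, so this snippet is self-contained.
original = np.arange(4 * 11 * 14, dtype=float).reshape((4, 11, 14))
with open(pickle_path, "wb") as f:
    pickle.dump(original, f)

# Read it back: open in "rb" mode; shape and dtype are restored automatically.
with open(pickle_path, "rb") as f:
    restored = pickle.load(f)
```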
If you want to write it to disk so that it will be easy to read back in as a numpy array, look into numpy.save. Pickling it will work fine, as well, but it's less efficient for large arrays (which yours isn't, so either is perfectly fine).
If you want it to be human readable, look into numpy.savetxt.
Edit: So, it seems like savetxt isn't quite as great an option for arrays with >2 dimensions... But just to draw everything out to its full conclusion:
I just realized that numpy.savetxt chokes on ndarrays with more than 2 dimensions... This is probably by design, as there's no inherently defined way to indicate additional dimensions in a text file.
E.g. this (a 2D array) works fine:
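A minimal sketch of the 2D case, with a made-up array and file name:

```python
import os
import tempfile

import numpy as np

x = np.arange(20).reshape((4, 5))
txt_path = os.path.join(tempfile.gettempdir(), "array_2d.txt")

# A 2D array is written as one text row per array row.
np.savetxt(txt_path, x)

# loadtxt returns floats by default, with the 2D shape intact.
restored = np.loadtxt(txt_path)
```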
The same thing fails for a 3D array, with a rather uninformative error: TypeError: float argument required, not numpy.ndarray
One workaround is just to break the 3D (or greater) array into 2D slices. E.g.:
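The sketch below first shows the failure and then the slice-by-slice workaround, using a made-up 4 x 5 x 10 array. Newer NumPy versions raise a ValueError rather than the TypeError quoted above, so both are caught here.

```python
import os
import tempfile

import numpy as np

data = np.arange(200).reshape((4, 5, 10))
slices_path = os.path.join(tempfile.gettempdir(), "array_3d_slices.txt")

# Passing the 3D array directly is rejected by savetxt.
savetxt_failed = False
try:
    np.savetxt(slices_path, data)
except (TypeError, ValueError):
    savetxt_failed = True

# Workaround: iterating over a 3D array yields its 2D slices,
# and each slice can be appended to the same open file.
with open(slices_path, "w") as outfile:
    for data_slice in data:
        np.savetxt(outfile, data_slice)

# Without separators, the file reads back as one tall 2D array.
flat = np.loadtxt(slices_path)
```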
However, our goal is to be clearly human readable, while still being easily read back in with numpy.loadtxt. Therefore, we can be a bit more verbose, and differentiate the slices using commented-out lines. By default, numpy.loadtxt will ignore any lines that start with # (or whichever character is specified by the comments kwarg). (This looks more verbose than it actually is...)
The result is a text file in which the 2D slices are separated by comment lines.
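A sketch of the comment-separated layout, assuming integer data (hence fmt="%d") and made-up wording for the comment lines. Any text works as long as each such line starts with the comment character.

```python
import os
import tempfile

import numpy as np

data = np.arange(200).reshape((4, 5, 10))
commented_path = os.path.join(tempfile.gettempdir(), "array_3d_commented.txt")

with open(commented_path, "w") as outfile:
    # Header comment recording the shape, for the benefit of human readers.
    outfile.write("# Array shape: {0}\n".format(data.shape))
    for data_slice in data:
        # The data here are integers; float data would need a float format.
        np.savetxt(outfile, data_slice, fmt="%d")
        # Comment line marking the end of each 2D slice.
        outfile.write("# New slice\n")

with open(commented_path) as infile:
    lines = infile.read().splitlines()
```

Here lines[0] holds the shape comment, and every block of five data rows is followed by a "# New slice" line.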
Reading it back in is very easy, as long as we know the shape of the original array. We can just do numpy.loadtxt('test.txt').reshape((4,5,10)). As an example (you can do this in one line, I'm just being verbose to clarify things):
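A self-contained sketch of the round trip: it writes a made-up 4 x 5 x 10 array with comment separators and then reads it back. The default savetxt format is used so that the values survive unchanged; a shorter format such as two decimals would lose precision.

```python
import os
import tempfile

import numpy as np

data = np.arange(200, dtype=float).reshape((4, 5, 10))
roundtrip_path = os.path.join(tempfile.gettempdir(), "array_3d_roundtrip.txt")

# Write the slices, separated by comment lines.
with open(roundtrip_path, "w") as outfile:
    for data_slice in data:
        np.savetxt(outfile, data_slice)
        outfile.write("# New slice\n")

# loadtxt skips the comment lines and returns a 2D array of shape (20, 10).
new_data = np.loadtxt(roundtrip_path)

# The original shape has to be supplied by hand.
new_data = new_data.reshape((4, 5, 10))
```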