I know that if I want to save and load arrays of complex numbers with numpy, I can use the method described here: How to save and load an array of complex numbers using numpy.savetxt?
Assume, however, that someone did not know about this and saved their array numbers with numpy.savetxt("numbers.txt", numbers), producing a file with entries of the form
(0.000000000000000000e+00+-2.691033635430225765e-02j)
In this case
numbers_load = numpy.loadtxt("numbers.txt").view(complex)
will, predictably, fail with
ValueError: could not convert string to float: (0.000000000000000000e+00+-2.691033635430225765e-02j)
What would be an easy way of extracting the complex numbers from this file (without generating a different version of it)?
You could use converters to handle the custom format. The only thing that prevents the complex values from being read properly is the +- in entries like 1+-2j; replacing it with a plain -, so that the entry becomes 1-2j, would work.
>>> numpy.savetxt('1.txt', numpy.array([2.3+4.5j, 6.7-0.89j]))
>>> numpy.loadtxt('1.txt', dtype=complex) # <- doesn't work directly
ValueError: complex() arg is a malformed string
>>> numpy.loadtxt('1.txt', dtype=complex, converters={0: lambda s: complex(s.decode().replace('+-', '-'))})
array([ 2.3+4.5j , 6.7-0.89j])
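Whether a converter receives bytes or str depends on the NumPy version and on the encoding argument, so the .decode() call above may or may not be needed. Here is a minimal sketch of a converter that accepts both; parse_complex is a name I made up for illustration, and the file contents are sample values written by hand.

```python
import os
import tempfile

import numpy as np


def parse_complex(token):
    """Convert a savetxt-style token such as '(1e+00+-2e+00j)' to complex."""
    # Converters may be handed bytes or str, so accept either.
    if isinstance(token, bytes):
        token = token.decode()
    # Python's complex() accepts the parentheses but not the '+-' sequence.
    return complex(token.strip().replace('+-', '-'))


# Write a small sample file in the problematic format.
path = os.path.join(tempfile.mkdtemp(), 'numbers.txt')
with open(path, 'w') as fobj:
    fobj.write('(0.000000000000000000e+00+-2.691033635430225765e-02j)\n')
    fobj.write('(2.500000000000000000e+00+4.500000000000000000e+00j)\n')

numbers_load = np.loadtxt(path, dtype=complex, converters={0: parse_complex})
print(numbers_load)
```

The converter is attached to column 0 only, so this assumes one complex value per line, as in the question.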
Before saving the array, you should use .view(float) to convert it to an array of floats, and then .view(complex) to convert the floats back to complex numbers on loading.
In [1]: import numpy as np
In [2]: A = np.array([1+2j, 2+5j, 3-4j, -3+1j])
In [3]: A.view(float)
Out[3]: array([ 1., 2., 2., 5., 3., -4., -3., 1.])
In [4]: np.savetxt("numbers.txt", A.view(float))
In [5]: np.loadtxt("numbers.txt")
Out[5]: array([ 1., 2., 2., 5., 3., -4., -3., 1.])
In [6]: np.loadtxt("numbers.txt").view(complex)
Out[6]: array([ 1.+2.j, 2.+5.j, 3.-4.j, -3.+1.j])
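The same trick should carry over to a 2-D array, since .view(float) only doubles the last axis. A small sketch, with a made-up array A2 and a temporary file name chosen for illustration:

```python
import os
import tempfile

import numpy as np

A2 = np.array([[1 + 2j, 2 + 5j],
               [3 - 4j, -3 + 1j]])

path = os.path.join(tempfile.mkdtemp(), 'numbers2d.txt')

# view(float) turns each row into re, im, re, im, ... (shape (2, 4) here).
np.savetxt(path, A2.view(float))

# ndmin=2 keeps the loaded array two-dimensional even for a single-row file.
B2 = np.loadtxt(path, ndmin=2).view(complex)
print(B2)
```

Because savetxt writes floats with '%.18e' by default, the values come back unchanged.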
If you can't modify the file, you can transform the strings as you read them in line by line.
import numpy as np
import re

# a regular expression that picks out the two components of the complex number
reg = re.compile(r'(-?\d\.\d*e[+-]\d+)\+(-?\d\.\d*e[+-]\d+)')

# a function that returns the two components separated by a space
edit = lambda s: reg.search(s).expand(r'\1 \2')

with open("numbers.txt", 'r') as fobj:
    # calling map applies the edit function to each line of numbers.txt in turn
    numbers_load = np.loadtxt(map(edit, fobj))

print(numbers_load) # [ 0. -0.02691034]
print(numbers_load.view('complex')) # [ 0.-0.02691034j]
The docs for savetxt talk about fmt options for a complex array.
Starting with a 1d array:
In [17]: np.arange(5)+np.arange(5,0,-1)*1j
Out[17]: array([ 0.+5.j, 1.+4.j, 2.+3.j, 3.+2.j, 4.+1.j])
In [18]: arr = np.arange(5)+np.arange(5,0,-1)*1j
The default is to write the numbers one per line, each as a parenthesized string. Reading that with loadtxt (or genfromtxt) is going to be a problem. It will have to be loaded as a string, and then converted line by line.
In [19]: np.savetxt('test.txt',arr)
In [20]: cat test.txt
(0.000000000000000000e+00+5.000000000000000000e+00j)
(1.000000000000000000e+00+4.000000000000000000e+00j)
(2.000000000000000000e+00+3.000000000000000000e+00j)
(3.000000000000000000e+00+2.000000000000000000e+00j)
(4.000000000000000000e+00+1.000000000000000000e+00j)
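The line-by-line conversion could look roughly like the sketch below. load_parenthesized_complex is a hypothetical helper, not a numpy function, and it assumes the file was written with the default whitespace delimiter.

```python
import os
import tempfile

import numpy as np


def load_parenthesized_complex(path):
    """Read a file written by np.savetxt(path, complex_array) with default fmt."""
    rows = []
    with open(path) as fobj:
        for line in fobj:
            tokens = line.split()
            if not tokens:
                continue  # skip blank lines
            # complex() accepts '(a+bj)' but not '(a+-bj)', so patch the sign.
            rows.append([complex(tok.replace('+-', '-')) for tok in tokens])
    # Drop the length-1 axis that a single-column file produces.
    return np.squeeze(np.array(rows))


# Sample data with negative imaginary parts, which savetxt may write as '+-'.
arr_demo = np.array([0 - 5j, 1 + 4j, 2 - 3j])
path = os.path.join(tempfile.mkdtemp(), 'test.txt')
np.savetxt(path, arr_demo)

loaded = load_parenthesized_complex(path)
print(loaded)
```

This skips loadtxt entirely, so it does not depend on how a particular numpy version parses complex strings.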
The docs say I can specify a format for the real and imaginary parts, in which case it saves them as 2 columns. That's easy to read with loadtxt.
In [21]: np.savetxt('test.txt',arr, fmt='%f %f')
In [22]: cat test.txt
0.000000 5.000000
1.000000 4.000000
2.000000 3.000000
3.000000 2.000000
4.000000 1.000000
In [23]: np.loadtxt('test.txt')
Out[23]:
array([[ 0., 5.],
[ 1., 4.],
[ 2., 3.],
[ 3., 2.],
[ 4., 1.]])
In [25]: np.loadtxt('test.txt').view(complex)
Out[25]:
array([[ 0.+5.j],
[ 1.+4.j],
[ 2.+3.j],
[ 3.+2.j],
[ 4.+1.j]])
With a 2d complex array I need to specify fmt for all columns:
In [28]: arr1=np.array((arr, arr*.1, arr+1))
In [29]: arr1
Out[29]:
array([[ 0.0+5.j , 1.0+4.j , 2.0+3.j , 3.0+2.j , 4.0+1.j ],
[ 0.0+0.5j, 0.1+0.4j, 0.2+0.3j, 0.3+0.2j, 0.4+0.1j],
[ 1.0+5.j , 2.0+4.j , 3.0+3.j , 4.0+2.j , 5.0+1.j ]])
In [33]: np.savetxt('test.txt',arr1, fmt=['%f %f']*5)
In [34]: cat test.txt
0.000000 5.000000 1.000000 4.000000 2.000000 3.000000 3.000000 2.000000 4.000000 1.000000
0.000000 0.500000 0.100000 0.400000 0.200000 0.300000 0.300000 0.200000 0.400000 0.100000
1.000000 5.000000 2.000000 4.000000 3.000000 3.000000 4.000000 2.000000 5.000000 1.000000
In [35]: np.loadtxt('test.txt').view(complex)
Out[35]:
array([[ 0.0+5.j , 1.0+4.j , 2.0+3.j , 3.0+2.j , 4.0+1.j ],
[ 0.0+0.5j, 0.1+0.4j, 0.2+0.3j, 0.3+0.2j, 0.4+0.1j],
[ 1.0+5.j , 2.0+4.j , 3.0+3.j , 4.0+2.j , 5.0+1.j ]])
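One caveat with '%f' is that it writes only six decimals, so small components are rounded away. If an exact round trip matters, a wider format such as '%.18e' (the precision savetxt uses by default) can be used instead. A sketch with made-up values and temporary file names:

```python
import os
import tempfile

import numpy as np

small = np.array([[1e-9 + 2j, 0.1 - 3e-8j],
                  [4 + 5j, -6.25 + 7j]])
ncol = small.shape[1]
tmpdir = tempfile.mkdtemp()

# '%f' keeps six decimals, so 1e-9 is written as 0.000000.
path_f = os.path.join(tmpdir, 'short.txt')
np.savetxt(path_f, small, fmt=['%f %f'] * ncol)
loaded_f = np.loadtxt(path_f, ndmin=2).view(complex)

# '%.18e' keeps enough digits to restore every float64 exactly.
path_e = os.path.join(tmpdir, 'full.txt')
np.savetxt(path_e, small, fmt=['%.18e %.18e'] * ncol)
loaded_e = np.loadtxt(path_e, ndmin=2).view(complex)

print(np.array_equal(loaded_f, small))
print(np.array_equal(loaded_e, small))
```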
The docs show a long format string with all columns, but evidently a list of strings works:
In [36]: ['%f %f']*5
Out[36]: ['%f %f', '%f %f', '%f %f', '%f %f', '%f %f']
savetxt joins that list with the delimiter to make one long format string.
In [37]: np.savetxt('test.txt',arr1, fmt=['%f %f']*5, delimiter=',')
In [38]: cat test.txt
0.000000 5.000000,1.000000 4.000000,2.000000 3.000000,3.000000 2.000000,4.000000 1.000000
...
For loadtxt the delimiter between complex parts and columns has to be compatible:
In [39]: np.savetxt('test.txt',arr1, fmt=['%f %f']*5, delimiter=' ')
In [40]: cat test.txt
0.000000 5.000000 1.000000 4.000000 2.000000 3.000000 3.000000 2.000000 4.000000 1.000000
...
In sum, the save/load round trip will be easiest if the save is done with load-compatible formats.
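Specifying the data type like:
np.loadtxt("numbers.txt", dtype=np.complex_)
works for me. (np.complex_ is an alias of np.complex128 that was removed in NumPy 2.0; dtype=complex is equivalent.)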