loading complex numbers with numpy.loadtxt

Posted 2019-05-23 14:41

Question:

I know that if I want to save and load arrays of complex numbers with numpy, I can use the method described in "How to save and load an array of complex numbers using numpy.savetxt?".

Assume, however, that someone did not know about this and saved their array numbers with numpy.savetxt("numbers.txt", numbers), producing a file with entries of the form

(0.000000000000000000e+00+-2.691033635430225765e-02j)  .

In this case

numbers_load = numpy.loadtxt("numbers.txt").view(complex)

will, predictably, fail with

ValueError: could not convert string to float: (0.000000000000000000e+00+-2.691033635430225765e-02j)  .

What would be an easy way of extracting the complex numbers from this file (without generating a different version of it)?

Answer 1:

You could use converters to handle the custom format. The only thing that prevents the complex value from being read properly is the +- in 1+-2j; replacing it with - (giving 1-2j) makes the string parseable.

>>> numpy.savetxt('1.txt', numpy.array([2.3+4.5j, 6.7-0.89j]))

>>> numpy.loadtxt('1.txt', dtype=complex)  # <- doesn't work directly
ValueError: complex() arg is a malformed string

>>> numpy.loadtxt('1.txt', dtype=complex, converters={0: lambda s: complex(s.decode().replace('+-', '-'))})
array([ 2.3+4.5j ,  6.7-0.89j])
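The .decode() call above assumes the converter is handed bytes. Whether it receives bytes or str appears to depend on the NumPy version and on the encoding argument, so a converter that accepts both may be safer. A minimal sketch, where parse_complex is a hypothetical helper name that is not part of the original answer:

```python
def parse_complex(token):
    """Parse one savetxt-style entry such as '(1e+00+-2e+00j)' into a complex."""
    # Assumption: the converter may be handed either bytes or str.
    if isinstance(token, bytes):
        token = token.decode()
    # complex() accepts surrounding whitespace and parentheses, but not '+-'.
    return complex(token.strip().replace('+-', '-'))


entry = "(0.000000000000000000e+00+-2.691033635430225765e-02j)"
value = parse_complex(entry)
print(value.real, value.imag)
# The bytes form of the same entry parses to the same value.
print(parse_complex(entry.encode()) == value)
```

It could then be passed as converters={0: parse_complex} in place of the lambda in the loadtxt call above.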


Answer 2:

Before saving the array, you should use .view(float) to convert it to an array of floats, and then .view(complex) to convert the floats back to complex numbers on loading.

In [1]: import numpy as np

In [2]: A = np.array([1+2j, 2+5j, 3-4j, -3+1j])

In [3]: A.view(float)
Out[3]: array([ 1.,  2.,  2.,  5.,  3., -4., -3.,  1.])

In [4]: np.savetxt("numbers.txt", A.view(float))

In [5]: np.loadtxt("numbers.txt")
Out[5]: array([ 1.,  2.,  2.,  5.,  3., -4., -3.,  1.])

In [6]: np.loadtxt("numbers.txt").view(complex)
Out[6]: array([ 1.+2.j,  2.+5.j,  3.-4.j, -3.+1.j])


Answer 3:

If you can't modify the file, you can transform the strings as you read them in line by line.

import numpy as np
import re

# a regular expression that picks out the two components of the complex number
# (raw string, with the decimal point escaped so it matches only a literal '.')
reg = re.compile(r'(-?\d\.\d*e[+-]\d+)\+(-?\d\.\d*e[+-]\d+)')
# a function that returns the two components as a space-separated string
edit = lambda s: reg.search(s).expand(r'\1 \2')

with open("numbers.txt", 'r') as fobj:
    # calling map applies the edit function to each line of numbers.txt in turn
    numbers_load = np.loadtxt(map(edit, fobj))
print(numbers_load) # [ 0.         -0.02691034]
print(numbers_load.view('complex')) # [ 0.-0.02691034j]
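The edit lambda above raises an AttributeError on any line the pattern does not match (a blank line, for instance), because search returns None. A hedged variant that reports the offending line instead is sketched below; split_parts is a hypothetical name, and the pattern assumes the default savetxt layout of one entry per line.

```python
import re

# Matches the real and imaginary parts of one savetxt-style complex entry.
pattern = re.compile(r'(-?\d\.\d*e[+-]\d+)\+(-?\d\.\d*e[+-]\d+)')


def split_parts(line):
    """Return 'real imag' for one entry, or raise ValueError if none is found."""
    match = pattern.search(line)
    if match is None:
        raise ValueError(f"no complex entry found in {line!r}")
    return match.expand(r'\1 \2')


sample = "(0.000000000000000000e+00+-2.691033635430225765e-02j)"
converted = split_parts(sample)
print(converted)
```

The function can be used in place of edit in the map call above.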


Answer 4:

The docs for savetxt talk about fmt options for a complex array.

Starting with a 1d array:

In [17]: np.arange(5)+np.arange(5,0,-1)*1j
Out[17]: array([ 0.+5.j,  1.+4.j,  2.+3.j,  3.+2.j,  4.+1.j])
In [18]: arr = np.arange(5)+np.arange(5,0,-1)*1j

The default is to write the numbers as one parenthesized string per line. Reading that with loadtxt (or genfromtxt) is going to be a problem: it will have to be loaded as a string and then converted line by line.

In [19]: np.savetxt('test.txt',arr)
In [20]: cat test.txt
 (0.000000000000000000e+00+5.000000000000000000e+00j)
 (1.000000000000000000e+00+4.000000000000000000e+00j)
 (2.000000000000000000e+00+3.000000000000000000e+00j)
 (3.000000000000000000e+00+2.000000000000000000e+00j)
 (4.000000000000000000e+00+1.000000000000000000e+00j)

The docs say I can specify a format for the real and imaginary parts, in which case each number is saved as 2 columns. That's easy to read with loadtxt.

In [21]: np.savetxt('test.txt',arr, fmt='%f %f')
In [22]: cat test.txt
0.000000 5.000000
1.000000 4.000000
2.000000 3.000000
3.000000 2.000000
4.000000 1.000000

In [23]: np.loadtxt('test.txt')
Out[23]: 
array([[ 0.,  5.],
       [ 1.,  4.],
       [ 2.,  3.],
       [ 3.,  2.],
       [ 4.,  1.]])

In [25]: np.loadtxt('test.txt').view(complex)
Out[25]: 
array([[ 0.+5.j],
       [ 1.+4.j],
       [ 2.+3.j],
       [ 3.+2.j],
       [ 4.+1.j]])
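Two details of the round trip above are easy to miss: '%f' keeps only six decimal places, and .view(complex) on the two-column result has shape (5, 1) rather than (5,). A minimal sketch that addresses both, assuming an in-memory io.StringIO buffer as a stand-in for test.txt and '%.18e' as the format (chosen here to match the precision savetxt uses by default):

```python
import io

import numpy as np

arr = np.arange(5) + np.arange(5, 0, -1) * 1j

buf = io.StringIO()  # assumption: stands in for test.txt
# One format per part: real and imaginary values become two columns.
np.savetxt(buf, arr, fmt='%.18e %.18e')
buf.seek(0)

pairs = np.loadtxt(buf)                 # float array with shape (5, 2)
restored = pairs.view(complex).ravel()  # (5, 1) complex, flattened to (5,)
print(restored.shape)
print(np.array_equal(restored, arr))
```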

With a 2d complex array I need to specify fmt for all columns:

In [28]: arr1=np.array((arr, arr*.1, arr+1))
In [29]: arr1
Out[29]: 
array([[ 0.0+5.j ,  1.0+4.j ,  2.0+3.j ,  3.0+2.j ,  4.0+1.j ],
       [ 0.0+0.5j,  0.1+0.4j,  0.2+0.3j,  0.3+0.2j,  0.4+0.1j],
       [ 1.0+5.j ,  2.0+4.j ,  3.0+3.j ,  4.0+2.j ,  5.0+1.j ]])

In [33]: np.savetxt('test.txt',arr1, fmt=['%f %f']*5)
In [34]: cat test.txt
0.000000 5.000000 1.000000 4.000000 2.000000 3.000000 3.000000 2.000000 4.000000 1.000000
0.000000 0.500000 0.100000 0.400000 0.200000 0.300000 0.300000 0.200000 0.400000 0.100000
1.000000 5.000000 2.000000 4.000000 3.000000 3.000000 4.000000 2.000000 5.000000 1.000000
In [35]: np.loadtxt('test.txt').view(complex)
Out[35]: 
array([[ 0.0+5.j ,  1.0+4.j ,  2.0+3.j ,  3.0+2.j ,  4.0+1.j ],
       [ 0.0+0.5j,  0.1+0.4j,  0.2+0.3j,  0.3+0.2j,  0.4+0.1j],
       [ 1.0+5.j ,  2.0+4.j ,  3.0+3.j ,  4.0+2.j ,  5.0+1.j ]])

The docs show a long format string with all columns, but evidently a list of strings works too:

In [36]: ['%f %f']*5
Out[36]: ['%f %f', '%f %f', '%f %f', '%f %f', '%f %f']

savetxt joins that list with the delimiter to make one long format string.

In [37]: np.savetxt('test.txt',arr1, fmt=['%f %f']*5, delimiter=',')
In [38]: cat test.txt
0.000000 5.000000,1.000000 4.000000,2.000000 3.000000,3.000000 2.000000,4.000000 1.000000
...

For loadtxt, the separator between the real and imaginary parts and the delimiter between columns have to be compatible:

In [39]: np.savetxt('test.txt',arr1, fmt=['%f %f']*5, delimiter='  ')
In [40]: cat test.txt
0.000000 5.000000  1.000000 4.000000  2.000000 3.000000  3.000000 2.000000  4.000000 1.000000
...

In sum, the save/load round trip is easiest if the file is saved in a format that loadtxt can read directly.



Answer 5:

Specifying the data type like this:

np.loadtxt("numbers.txt", dtype=np.complex128)  # originally np.complex_, an alias removed in NumPy 2.0

works for me.