What's the simplest way to extend a numpy arra

2019-01-10 12:21发布

问题:

I have a 2d array that looks like this:

XX
xx

What's the most efficient way to add an extra row and column:

xxy
xxy
yyy

For bonus points, I'd like to also be able to knock out single rows and columns, so for example in the matrix below I'd like to be able to knock out all of the a's leaving only the x's - specifically I'm trying to delete the nth row and the nth column at the same time - and I want to be able to do this as quickly as possible:

xxaxx
xxaxx
aaaaa
xxaxx
xxaxx

回答1:

The shortest in terms of lines of code i can think of is for the first question.

>>> import numpy as np
>>> p = np.array([[1,2],[3,4]])

>>> p = np.append(p, [[5,6]], 0)
>>> p = np.append(p, [[7],[8],[9]],1)

>>> p
array([[1, 2, 7],
   [3, 4, 8],
   [5, 6, 9]])

And the for the second question

    p = np.array(range(20))
>>> p.shape = (4,5)
>>> p
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])
>>> n = 2
>>> p = np.append(p[:n],p[n+1:],0)
>>> p = np.append(p[...,:n],p[...,n+1:],1)
>>> p
array([[ 0,  1,  3,  4],
       [ 5,  6,  8,  9],
       [15, 16, 18, 19]])


回答2:

A useful alternative answer to the first question, using the examples from tomeedee’s answer, would be to use numpy’s vstack and column_stack methods:

Given a matrix p,

>>> import numpy as np
>>> p = np.array([ [1,2] , [3,4] ])

an augmented matrix can be generated by:

>>> p = np.vstack( [ p , [5 , 6] ] )
>>> p = np.column_stack( [ p , [ 7 , 8 , 9 ] ] )
>>> p
array([[1, 2, 7],
       [3, 4, 8],
       [5, 6, 9]])

These methods may be convenient in practice than np.append() as they allow 1D arrays to be appended to a matrix without any modification, in contrast to the following scenario:

>>> p = np.array([ [ 1 , 2 ] , [ 3 , 4 ] , [ 5 , 6 ] ] )
>>> p = np.append( p , [ 7 , 8 , 9 ] , 1 )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/dist-packages/numpy/lib/function_base.py", line 3234, in append
    return concatenate((arr, values), axis=axis)
ValueError: arrays must have same number of dimensions

In answer to the second question, a nice way to remove rows and columns is to use logical array indexing as follows:

Given a matrix p,

>>> p = np.arange( 20 ).reshape( ( 4 , 5 ) )

suppose we want to remove row 1 and column 2:

>>> r , c = 1 , 2
>>> p = p [ np.arange( p.shape[0] ) != r , : ] 
>>> p = p [ : , np.arange( p.shape[1] ) != c ]
>>> p
array([[ 0,  1,  3,  4],
       [10, 11, 13, 14],
       [15, 16, 18, 19]])

Note - for reformed Matlab users - if you wanted to do these in a one-liner you need to index twice:

>>> p = np.arange( 20 ).reshape( ( 4 , 5 ) )    
>>> p = p [ np.arange( p.shape[0] ) != r , : ] [ : , np.arange( p.shape[1] ) != c ]

This technique can also be extended to remove sets of rows and columns, so if we wanted to remove rows 0 & 2 and columns 1, 2 & 3 we could use numpy's setdiff1d function to generate the desired logical index:

>>> p = np.arange( 20 ).reshape( ( 4 , 5 ) )
>>> r = [ 0 , 2 ]
>>> c = [ 1 , 2 , 3 ]
>>> p = p [ np.setdiff1d( np.arange( p.shape[0] ), r ) , : ] 
>>> p = p [ : , np.setdiff1d( np.arange( p.shape[1] ) , c ) ]
>>> p
array([[ 5,  9],
       [15, 19]])


回答3:

Another elegant solution to the first question may be the insert command:

p = np.array([[1,2],[3,4]])
p = np.insert(p, 2, values=0, axis=1) # insert values before column 2

Leads to:

array([[1, 2, 0],
       [3, 4, 0]])

insert may be slower than append but allows you to fill the whole row/column with one value easily.

As for the second question, delete has been suggested before:

p = np.delete(p, 2, axis=1)

Which restores the original array again:

array([[1, 2],
       [3, 4]])


回答4:

I find it much easier to "extend" via assigning in a bigger matrix. E.g.

import numpy as np
p = np.array([[1,2], [3,4]])
g = np.array(range(20))
g.shape = (4,5)
g[0:2, 0:2] = p

Here are the arrays:

p

   array([[1, 2],
       [3, 4]])

g:

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

and the resulting g after assignment:

   array([[ 1,  2,  2,  3,  4],
       [ 3,  4,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])


回答5:

Answer to the first question:

Use numpy.append.

http://docs.scipy.org/doc/numpy/reference/generated/numpy.append.html#numpy.append

Answer to the second question:

Use numpy.delete

http://docs.scipy.org/doc/numpy/reference/generated/numpy.delete.html



回答6:

You can use:

>>> np.concatenate([array1, array2, ...]) 

e.g.

>>> import numpy as np
>>> a = [[1, 2, 3],[10, 20, 30]]
>>> b = [[100,200,300]]
>>> a = np.array(a) # not necessary, but numpy objects prefered to built-in
>>> b = np.array(b) # "^
>>> a
array([[ 1,  2,  3],
       [10, 20, 30]])
>>> b
array([[100, 200, 300]])
>>> c = np.concatenate([a,b])
>>> c
array([[  1,   2,   3],
       [ 10,  20,  30],
       [100, 200, 300]])
>>> print c
[[  1   2   3]
 [ 10  20  30]
 [100 200 300]]

~-+-~-+-~-+-~

Sometimes, you will come across trouble if a numpy array object is initialized with incomplete values for its shape property. This problem is fixed by assigning to the shape property the tuple: (array_length, element_length).

Note: Here, 'array_length' and 'element_length' are integer parameters, which you substitute values in for. A 'tuple' is just a pair of numbers in parentheses.

e.g.

>>> import numpy as np
>>> a = np.array([[1,2,3],[10,20,30]])
>>> b = np.array([100,200,300]) # initialize b with incorrect dimensions
>>> a.shape
(2, 3)
>>> b.shape
(3,)
>>> c = np.concatenate([a,b])

Traceback (most recent call last):
  File "<pyshell#191>", line 1, in <module>
    c = np.concatenate([a,b])
ValueError: all the input arrays must have same number of dimensions
>>> b.shape = (1,3)
>>> c = np.concatenate([a,b])
>>> c
array([[  1,   2,   3],
       [ 10,  20,  30],
       [100, 200, 300]])


回答7:

maybe you need this.

>>> x = np.array([11,22])
>>> y = np.array([18,7,6])
>>> z = np.array([1,3,5])
>>> np.concatenate((x,y,z))
array([11, 22, 18,  7,  6,  1,  3,  5])