Avoid extra dimension added by numpy.vsplit

Posted 2019-08-18 00:01

We can join several 1d arrays with vstack (or hstack), e.g. D = np.vstack([a,b,c]).
The reverse operation is [a2,b2,c2] = np.vsplit(D, 3). But the dimensionality changes in the round-trip:

import numpy as np
a = np.random.rand(10,)
b = np.random.rand(10,)
c = np.random.rand(10,)
D = np.vstack([a,b,c])
[a2,b2,c2] = np.vsplit(D, 3)

>>> a.shape
(10,)

>>> a2.shape
(1, 10)

I know about squeeze to remove a dimension:

>>> a2.squeeze().shape
(10,)

But this is cumbersome, especially when splitting more than a couple of arrays.

Is there any way to 'automatically' perform a squeeze, or otherwise control the output of vsplit to avoid the mismatch in dimensions?

(The split docs do not mention any way to control the output dimensions, as far as I can tell.)

Tags: python numpy
2 answers
成全新的幸福
Reply #2 · 2019-08-18 00:02
In [98]: D = np.arange(12).reshape(4,3)
In [99]: np.vsplit(D, 4)
Out[99]: 
[array([[0, 1, 2]]),
 array([[3, 4, 5]]),
 array([[6, 7, 8]]),
 array([[ 9, 10, 11]])]

split selects rows with a slice, which preserves the leading dimension. Splitting D into 4 pieces is equivalent to:

[D[i:i+1,:] for i in range(4)]

Slicing is the general behavior: it is what lets split return pieces with more than one row each.
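As a small illustration of why split slices rather than indexes, here is a sketch that passes row indices instead of a section count. The variable names `pieces` and `shapes` are only for this example.

```python
import numpy as np

D = np.arange(12).reshape(4, 3)

# Split before rows 1 and 3, giving D[0:1], D[1:3], D[3:4].
pieces = np.vsplit(D, [1, 3])

# Every piece stays 2d, whether it holds one row or two.
shapes = [p.shape for p in pieces]
print(shapes)  # [(1, 3), (2, 3), (1, 3)]
```

A one-row piece is just the special case of a slice of length 1, so it keeps the same number of dimensions as the larger pieces.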

But it appears you want to return one row at a time. There are many ways of doing this:

It's easy to apply squeeze iteratively (and not much more expensive, since split is already iterating):

In [100]: [np.squeeze(x) for x in np.vsplit(D, 4)]
Out[100]: [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8]), array([ 9, 10, 11])]

Or you can use a plain list comprehension, since iterating over a 2d array yields its rows as 1d arrays:

In [101]: [x for x in D]
Out[101]: [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8]), array([ 9, 10, 11])]

Or convert the array to a list (this is different from D.tolist()):

In [102]: list(D)
Out[102]: [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8]), array([ 9, 10, 11])]
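To make the difference from D.tolist() concrete, here is a brief sketch. The names `rows_as_arrays` and `rows_as_lists` are only for this example.

```python
import numpy as np

D = np.arange(12).reshape(4, 3)

# list(D) iterates over the first axis: a Python list of 1d arrays.
rows_as_arrays = list(D)

# D.tolist() converts all the way down: nested lists of Python ints.
rows_as_lists = D.tolist()

print(type(rows_as_arrays[0]))  # <class 'numpy.ndarray'>
print(type(rows_as_lists[0]))   # <class 'list'>
print(rows_as_lists[0])         # [0, 1, 2]
```

If you want to keep working with numpy arrays after the split, list(D) is the form to use.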

Or iteration by index. This is like split, but uses a scalar index rather than the slice. It's good to understand the difference between D[i,:] and D[i:i+1, :].

In [103]: [D[i] for i in range(4)]
Out[103]: [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8]), array([ 9, 10, 11])]
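The difference between the two indexing forms can be checked directly. This is a minimal sketch, with `row_scalar` and `row_slice` as illustrative names.

```python
import numpy as np

D = np.arange(12).reshape(4, 3)

row_scalar = D[1, :]   # scalar index: the first axis is dropped
row_slice = D[1:2, :]  # length-1 slice: the first axis is kept

print(row_scalar.shape)  # (3,)
print(row_slice.shape)   # (1, 3)
```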

Since you are using unpacking, you don't need any of this. The unpacking will do the row 'iteration' for you:

In [106]: a,b,c,d = D
In [107]: a,b,c,d
Out[107]: (array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8]), array([ 9, 10, 11]))
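Applied to the arrays from the question, the round trip might look like the sketch below. The seeded generator `rng` is an assumption added here only to make the example reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only for reproducibility
a = rng.random(10)
b = rng.random(10)
c = rng.random(10)

D = np.vstack([a, b, c])

# Unpacking iterates over the rows of D, so each name gets a 1d array.
a2, b2, c2 = D

print(a2.shape)                 # (10,)
print(np.array_equal(a, a2))    # True
print(np.shares_memory(a2, D))  # True: a2 is a view into D, not a copy
```

Two things to keep in mind: the number of names on the left must equal the number of rows, otherwise Python raises a ValueError, and because the rows are views, writing into a2 also changes D. Call a2.copy() if you need independent arrays.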
我欲成王,谁敢阻挡
Reply #3 · 2019-08-18 00:10

You can try indexing each row with a scalar:

import numpy as np
a = np.random.rand(10,)
b = np.random.rand(10,)
c = np.random.rand(10,)
D = np.vstack([a,b,c])
[a2,b2,c2]=[D[x,:] for x in range(3)]

print(a2.shape)

Output:

(10,)