Delete all elements in an array corresponding to B

Posted 2019-05-18 10:47

I have a Boolean mask stored as a 2-D NumPy array (Boolean Array):

array([[ True,  True,  True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True,  True,  True],
       [False, False, False, False, False, False, False],
       [False, False, False, False, False, False, False],
       [False, False, False, False, False, False, False]], dtype=bool)

I also have a separate 2-D NumPy array of values with the same dimensions as the Boolean mask (Values Array):

array([[ 19.189 ,  23.2535,  23.1555,  23.4655,  22.6795,  20.3295,  19.7005],
       [ 20.688 ,  20.537 ,  23.8465,  21.2265,  24.5805,  25.842 ,  23.198 ],
       [ 22.418 ,  21.0115,  21.0355,  20.217 ,  24.1275,  24.4595,  21.981 ],
       [ 21.156 ,  18.6195,  23.299 ,  22.5535,  23.2305,  28.749 ,  21.0245],
       [ 21.7495,  19.614 ,  20.3025,  21.706 ,  22.853 ,  19.623 ,  16.7415],
       [ 20.9715,  21.9505,  21.1895,  21.471 ,  21.0445,  21.096 ,  19.3295],
       [ 24.3815,  26.2095,  25.3595,  22.9985,  21.586 ,  23.796 ,  20.375 ]])

What I would like to do is delete every element of the values array whose corresponding entry in the Boolean array is False. Is there an easy way to do this?

The desired output for this example is:

array([[ 19.189 ,  23.2535,  23.1555,  23.4655,  22.6795,  20.3295,  19.7005],
       [ 20.688 ,  20.537 ,  23.8465,  21.2265,  24.5805,  25.842 ,  23.198 ],
       [ 22.418 ,  21.0115,  21.0355,  20.217 ,  24.1275,  24.4595,  21.981 ],
       [ 21.156 ,  18.6195,  23.299 ,  22.5535,  23.2305,  28.749 ,  21.0245]])

In this particular example, all the False values sit at the end of the Boolean array, but this is not always the case: they can be randomly distributed. I therefore need a way of deleting any element of the values array whose corresponding mask value in the Boolean array is False.

3 answers
叛逆
Reply #2 · 2019-05-18 11:17

For most purposes you could simply create a MaskedArray, which behaves as if the masked elements had been removed. This also lets you "remove" single elements from a row or column while keeping the shape of the array unchanged:

import numpy as np
arr = np.array([[ 19.189 , 23.2535, 23.1555, 23.4655, 22.6795, 20.3295, 19.7005],
                [ 20.688 , 20.537 , 23.8465, 21.2265, 24.5805, 25.842 , 23.198 ],
                [ 22.418 , 21.0115, 21.0355, 20.217 , 24.1275, 24.4595, 21.981 ],
                [ 21.156 , 18.6195, 23.299 , 22.5535, 23.2305, 28.749 , 21.0245],
                [ 21.7495, 19.614 , 20.3025, 21.706 , 22.853 , 19.623 , 16.7415],
                [ 20.9715, 21.9505, 21.1895, 21.471 , 21.0445, 21.096 , 19.3295],
                [ 24.3815, 26.2095, 25.3595, 22.9985, 21.586 , 23.796 , 20.375 ]])
mask = np.array([[ True,  True,  True,  True,  True,  True,  True],
                 [ True,  True,  True,  True,  True,  True,  True],
                 [ True,  True,  True,  True,  True,  True,  True],
                 [ True,  True,  True,  True,  True,  True,  True],
                 [False, False, False, False, False, False, False],
                 [False, False, False, False, False, False, False],
                 [False, False, False, False, False, False, False]])
marr = np.ma.MaskedArray(arr, mask=~mask)
marr

Gives:

masked_array(data =
 [[19.189 23.2535 23.1555 23.4655 22.6795 20.3295 19.7005]
 [20.688 20.537 23.8465 21.2265 24.5805 25.842 23.198]
 [22.418 21.0115 21.0355 20.217 24.1275 24.4595 21.981]
 [21.156 18.6195 23.299 22.5535 23.2305 28.749 21.0245]
 [-- -- -- -- -- -- --]
 [-- -- -- -- -- -- --]
 [-- -- -- -- -- -- --]],
             mask =
 [[False False False False False False False]
 [False False False False False False False]
 [False False False False False False False]
 [False False False False False False False]
 [ True  True  True  True  True  True  True]
 [ True  True  True  True  True  True  True]
 [ True  True  True  True  True  True  True]],
       fill_value = 1e+20)

In this case it is also possible to drop every row that contains at least one masked element with np.ma.compress_rows:

>>> np.ma.compress_rows(marr)
array([[ 19.189 ,  23.2535,  23.1555,  23.4655,  22.6795,  20.3295,  19.7005],
       [ 20.688 ,  20.537 ,  23.8465,  21.2265,  24.5805,  25.842 ,  23.198 ],
       [ 22.418 ,  21.0115,  21.0355,  20.217 ,  24.1275,  24.4595,  21.981 ],
       [ 21.156 ,  18.6195,  23.299 ,  22.5535,  23.2305,  28.749 ,  21.0245]])
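
As a rough sketch of how this behaves when the False entries are scattered rather than grouped into whole rows, consider the following. The small 3x4 array and mask are made up for illustration and are not taken from the question.

```python
import numpy as np

# Hypothetical example data; True in `mask` means "keep this element".
arr = np.arange(12, dtype=float).reshape(3, 4)
mask = np.array([[True,  True,  True,  True],
                 [True,  False, True,  True],
                 [True,  True,  True,  True]])

# np.ma uses the opposite convention (True means "masked out"), hence ~mask.
marr = np.ma.MaskedArray(arr, mask=~mask)

# Reductions skip masked entries: column 1 is averaged over rows 0 and 2 only.
col_means = marr.mean(axis=0)

# compress_rows drops every row that contains at least one masked element.
kept_rows = np.ma.compress_rows(marr)

# compress_cols does the same for columns.
kept_cols = np.ma.compress_cols(marr)

print(col_means)
print(kept_rows.shape, kept_cols.shape)
```

Note that a single masked element is enough to discard its whole row (or column), so the masked array itself is usually the better choice when the False values are scattered.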
来,给爷笑一个
Reply #3 · 2019-05-18 11:18

To illustrate my comment:

In [33]: arr = np.arange(12).reshape(3,4)
In [34]: mask = ((arr+1)%3)>0
In [35]: mask
Out[35]: 
array([[ True,  True, False,  True],
       [ True, False,  True,  True],
       [False,  True,  True, False]], dtype=bool)

arr[mask] is 1d, because in general this selection does not return a neat 2d array.

In [36]: arr[mask]
Out[36]: array([ 0,  1,  3,  4,  6,  7,  9, 10])

We can see this clearly with the masked array solution

In [37]: marr = np.ma.MaskedArray(arr,mask=~mask)
In [38]: marr
Out[38]: 
masked_array(data =
 [[0 1 -- 3]
 [4 -- 6 7]
 [-- 9 10 --]],
             mask =
 [[False False  True False]
 [False  True False False]
 [ True False False  True]],
       fill_value = 999999)

The masked array's compressed() method returns the same 1d array:

In [39]: marr.compressed()
Out[39]: array([ 0,  1,  3,  4,  6,  7,  9, 10])

With 8 terms I can reshape it to (4,2), but nothing involving 3.
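
A minimal sketch of that reshape point, rebuilding the same arr and mask as in the session above (the variable names selected, as_4x2 and reshape_failed are only for illustration):

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
mask = ((arr + 1) % 3) > 0

# Boolean indexing flattens the selection into a 1-D array of 8 elements.
selected = arr[mask]

# 8 elements fit a (4, 2) shape.
as_4x2 = selected.reshape(4, 2)

# 8 is not divisible by 3, so no shape with a dimension of 3 is possible.
try:
    selected.reshape(3, -1)
    reshape_failed = False
except ValueError:
    reshape_failed = True

print(selected.shape, as_4x2.shape, reshape_failed)
```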

You can mask whole rows or columns with various combinations of any or all.
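
One way this might look, as a sketch with a made-up mask in which True means "keep" (the names rows_all_true, rows_any_true and so on are hypothetical):

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# Hypothetical mask: row 1 is entirely False, row 2 has a single False.
mask = np.array([[True,  True,  True,  True],
                 [False, False, False, False],
                 [True,  True,  False, True]])

rows_all_true = mask.all(axis=1)   # rows where every element is kept
rows_any_true = mask.any(axis=1)   # rows with at least one kept element

strict_rows = arr[rows_all_true]   # only row 0 survives
loose_rows = arr[rows_any_true]    # rows 0 and 2 survive

# Combine both: drop the empty rows first, then keep only the columns
# that are True in every remaining row. The result is still a 2-D array.
cols_keep = mask[rows_any_true].all(axis=0)
result = arr[rows_any_true][:, cols_keep]

print(result)
```

Whether all or any is the right choice depends on whether a partly False row should be dropped or kept. Either way the result stays rectangular, because only whole rows and columns are removed.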

女痞
Reply #4 · 2019-05-18 11:34

Assuming every row of your mask is either all True or all False, you can use mask.all(axis=1) as a row index:

In [116]: x
Out[116]: 
array([[ 1.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  1.]])

In [117]: x[mask.all(axis=1)]
Out[117]: 
array([[ 1.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.,  0.,  0.]])
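
A self-contained sketch of the same idea, with an added check of the stated assumption. Here x is taken to be a 7x7 identity matrix like the one shown above, the mask is rebuilt with the first four rows True as in the question, and the rows_uniform check is an addition for illustration.

```python
import numpy as np

x = np.eye(7)                        # assumed stand-in for the array above
mask = np.zeros((7, 7), dtype=bool)
mask[:4] = True                      # first four rows kept, as in the question

# True for rows where every mask value is True.
row_keep = mask.all(axis=1)

# Guard for the assumption: each row is entirely True or entirely False.
rows_uniform = bool((mask.all(axis=1) | ~mask.any(axis=1)).all())
if not rows_uniform:
    raise ValueError("mask has rows that mix True and False")

result = x[row_keep]
print(result.shape)
```

If the check fails, row indexing alone cannot express the selection, and a masked array or a flat arr[mask] selection is the fallback.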