Say I have a two dimensional array of coordinates that looks something like
x = np.array([[1,2],[2,3],[3,4]])
Earlier in my work, I generated a mask that ends up looking something like
mask = [False,False,True]
When I try to use this mask on the 2D coordinate array, I get an error:
newX = np.ma.compressed(np.ma.masked_array(x,mask))
>>>numpy.ma.core.MaskError: Mask and data not compatible: data size is 6, mask size is 3.
which makes sense, I suppose. So I tried to simply use the following mask instead:
mask2 = np.column_stack((mask,mask))
newX = np.ma.compressed(np.ma.masked_array(x,mask2))
What I get is
>>>array([1,2,2,3])
which is close to what I would expect (and want):
>>>array([[1,2],[2,3]])
There must be an easier way to do this?
Is this what you are looking for?
import numpy as np
x[~np.array(mask)]
# array([[1, 2],
# [2, 3]])
Or, using a NumPy masked array:
newX = np.ma.array(x, mask = np.column_stack((mask, mask)))
newX
# masked_array(data =
# [[1 2]
# [2 3]
# [-- --]],
# mask =
# [[False False]
# [False False]
# [ True True]],
# fill_value = 999999)
Your x is 3x2:
In [379]: x
Out[379]:
array([[1, 2],
[2, 3],
[3, 4]])
Make a 3 element boolean mask:
In [380]: rowmask=np.array([False,False,True])
That can be used to select the rows where it is True, or where it is False. In both cases the result is 2d:
In [381]: x[rowmask,:]
Out[381]: array([[3, 4]])
In [382]: x[~rowmask,:]
Out[382]:
array([[1, 2],
[2, 3]])
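As a minimal sketch of the same row selection starting from the plain Python list in the question: the list has to become a boolean array before ~ can invert it. The names kept and dropped are only illustrative, and I am assuming True marks a row to discard, as in the question.

```python
import numpy as np

x = np.array([[1, 2], [2, 3], [3, 4]])
mask = [False, False, True]  # assumption: True marks a row to drop

# ~ does not work on a Python list, so convert to a boolean array first.
rowmask = np.asarray(mask, dtype=bool)

kept = x[~rowmask]    # rows where the mask is False
dropped = x[rowmask]  # rows where the mask is True

print(kept)
print(dropped)
```

Both results stay 2d because a 1d boolean index on the first axis selects whole rows.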
This is without using the MaskedArray subclass. To make such an array, we need a mask that matches x in shape. There isn't any provision for masking just one dimension.
In [393]: xmask=np.stack((rowmask,rowmask),-1) # column stack
In [394]: xmask
Out[394]:
array([[False, False],
[False, False],
[ True, True]], dtype=bool)
In [395]: np.ma.MaskedArray(x,xmask)
Out[395]:
masked_array(data =
[[1 2]
[2 3]
[-- --]],
mask =
[[False False]
[False False]
[ True True]],
fill_value = 999999)
Applying compressed to that produces a raveled array: array([1, 2, 2, 3])
Since masking is element by element, it could mask one element in row 1, two in row 2, etc. So in general compressing, i.e. removing the masked elements, will not yield a 2d array. The flattened form is the only general choice.
np.ma makes most sense when there's a scattering of masked values. It isn't of much value if you want to select, or deselect, whole rows or columns.
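If you do end up with a masked array and still want a 2d result, np.ma.compress_rows is one option: it drops every row that contains at least one masked value and returns a plain ndarray. A rough sketch, where full_mask and the other variable names are my own:

```python
import numpy as np

x = np.array([[1, 2], [2, 3], [3, 4]])
rowmask = np.array([False, False, True])

# Repeat the per-row mask across the columns so it matches x.shape.
full_mask = np.repeat(rowmask[:, np.newaxis], x.shape[1], axis=1)
mx = np.ma.masked_array(x, mask=full_mask)

# Drops each row that has any masked element; the result stays 2d.
rows_kept = np.ma.compress_rows(mx)
print(rows_kept)

# With a scattered mask, a single masked element removes its whole row.
scattered = np.ma.masked_equal(x, 2)
rows_kept_scattered = np.ma.compress_rows(scattered)
print(rows_kept_scattered)
```

Note that this treats a partially masked row the same as a fully masked one, which may or may not be what you want.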
===============
Here are more typical masked arrays:
In [403]: np.ma.masked_inside(x,2,3)
Out[403]:
masked_array(data =
[[1 --]
[-- --]
[-- 4]],
mask =
[[False True]
[ True True]
[ True False]],
fill_value = 999999)
In [404]: np.ma.masked_equal(x,2)
Out[404]:
masked_array(data =
[[1 --]
[-- 3]
[3 4]],
mask =
[[False True]
[ True False]
[False False]],
fill_value = 2)
In [406]: np.ma.masked_outside(x,2,3)
Out[406]:
masked_array(data =
[[-- 2]
[2 3]
[3 --]],
mask =
[[ True False]
[False False]
[False True]],
fill_value = 999999)
Since none of these solutions worked for me, I thought I would write down the one that did; maybe it will be useful for somebody else. I use Python 3.x and I worked with two 3D arrays. One, which I call data_3D, contains float values of recordings in a brain scan, and the other, template_3D, contains integers which represent regions of the brain. I wanted to choose those values from data_3D corresponding to an integer region_code as per template_3D:
my_mask = np.in1d(template_3D, region_code).reshape(template_3D.shape)
data_3D_masked = data_3D[my_mask]
which gives me a 1D array of only relevant recordings.
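A small sketch of the same idea on toy data, in case it helps. The two volumes below are made up for illustration. For a single integer code a plain == comparison already gives a mask of the right shape, and np.isin keeps the input shape, so no reshape is needed (np.in1d is deprecated as of NumPy 2.0 in favour of np.isin).

```python
import numpy as np

# Hypothetical toy volumes standing in for the real scans.
template_3D = np.array([[[1, 1], [2, 2]],
                        [[3, 3], [2, 1]]])
data_3D = np.arange(8, dtype=float).reshape(2, 2, 2)
region_code = 2

# One region: an elementwise comparison keeps the 3D shape.
mask_single = template_3D == region_code
data_region = data_3D[mask_single]

# Several regions: np.isin also returns a mask shaped like template_3D.
mask_multi = np.isin(template_3D, [1, 2])
data_regions = data_3D[mask_multi]

print(data_region)
print(data_regions)
```

As in the original snippet, indexing with a full-shape boolean mask returns a 1D array of the selected values.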
In your last example, the problem is not the mask. It is your use of compressed. From the docstring of compressed:
Return all the non-masked data as a 1-D array.
So compressed flattens the non-masked values into a 1-d array. (It has to, because there is no guarantee that the compressed data will have an n-dimensional structure.)
Take a look at the masked array before you compress it:
In [8]: np.ma.masked_array(x, mask2)
Out[8]:
masked_array(data =
[[1 2]
[2 3]
[-- --]],
mask =
[[False False]
[False False]
[ True True]],
fill_value = 999999)
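In this particular case the mask covers whole rows, so every surviving row still has the full number of columns and the flat output of compressed can be reshaped back. This is only a sketch and relies on that assumption; with a scattered mask the reshape would either fail or silently mix up rows.

```python
import numpy as np

x = np.array([[1, 2], [2, 3], [3, 4]])
mask = np.array([False, False, True])
mask2 = np.column_stack((mask, mask))

# compressed() always returns a 1-D array of the unmasked values.
flat = np.ma.masked_array(x, mask2).compressed()

# Valid only because entire rows are masked: restore the column count.
restored = flat.reshape(-1, x.shape[1])

print(flat)
print(restored)
```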