Why does my array lose its mask after multidimensional indexing in Numpy?

Question

I want to use a multidimensional MaskedArray as an array of indices:

Data:

In [149]: np.ma.arange(10, 60, 2)
Out[149]: 
masked_array(data = [10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58],
             mask = False,
       fill_value = 999999)

Indices:

In [140]: np.ma.array(np.arange(20).reshape(4, 5), 
                      mask=np.arange(20).reshape(4, 5) % 3)
Out[140]: 
masked_array(data =
 [[0 -- -- 3 --]
 [-- 6 -- -- 9]
 [-- -- 12 -- --]
 [15 -- -- 18 --]],
             mask =
 [[False  True  True False  True]
 [ True False  True  True False]
 [ True  True False  True  True]
 [False  True  True False  True]],
       fill_value = 999999)

Desired output:

In [151]: np.ma.arange(10, 60, 2)[np.ma.array(np.arange(20).reshape(4, 5), mask=np.arange(20).reshape(4, 5) % 3)]
Out[151]: 
masked_array(data =
 [[10 -- -- 16 --]
 [-- 22 -- -- 28]
 [-- -- 34 -- --]
 [40 -- -- 46 --]],
             mask =
 False,
       fill_value = 999999)

Actual output:

In [160]: np.ma.arange(10, 60, 2)[np.ma.array(np.arange(20).reshape(4, 5), mask=np.arange(20).reshape(4, 5) % 3)]
Out[160]: 
masked_array(data =
 [[10 12 14 16 18]
 [20 22 24 26 28]
 [30 32 34 36 38]
 [40 42 44 46 48]],
             mask =
 False,
       fill_value = 999999)

Why does the resulting array lose its mask? According to the answer in "Indexing with Masked Arrays in numpy", this way of indexing is very bad. Why?

python numpy multidimensional-array indexing

user2561747 01 june 15 at 23:23

2 answers

Try using the choose method like this:

idx = np.ma.array(np.arange(20).reshape(4, 5),
                  mask=np.arange(20).reshape(4, 5) % 3)
idx.choose(np.ma.arange(10, 60, 2))

which gives:

masked_array(data =
 [[10 -- -- 16 --]
 [-- 22 -- -- 28]
 [-- -- 34 -- --]
 [40 -- -- 46 --]],
             mask =
 [[False  True  True False  True]
 [ True False  True  True False]
 [ True  True False  True  True]
 [False  True  True False  True]],
       fill_value = 999999)

Andrzej Pronobis 02 june At 1:07 am

hpaulj · Accepted Answer · 2015-06-02T04:04:49+0000

It looks like indexing with a masked array simply ignores the mask. Without digging into the docs or code, I'd say that NumPy array indexing has no special knowledge of the masked array subclass. The array you get is just the result of normal indexing with arange(20).
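A quick way to check this is to compare indexing with the masked array against indexing with its underlying data. The sketch below reuses the arrays from the question; the variable names are only illustrative.

```python
import numpy as np

data = np.arange(10, 60, 2)
values = np.arange(20).reshape(4, 5)
idx = np.ma.array(values, mask=values % 3)  # nonzero mask entries mean "masked"

# Indexing with the masked array gives the same values as indexing with
# its plain data: the mask on the index plays no part in the lookup.
via_masked = data[idx]
via_plain = data[idx.data]
same_values = np.array_equal(via_masked, via_plain)

# The same holds when the indexed array is itself a MaskedArray.
via_masked_data = np.ma.arange(10, 60, 2)[idx]
any_masked = bool(np.ma.getmaskarray(via_masked_data).any())

print(same_values, any_masked)
```

If both checks behave as expected, the values match and no element of the result is masked, which is the behaviour shown in the question.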

But you can do normal indexing and "copy" the mask:

In [13]: data=np.arange(10,60,2)

In [14]: mI = np.ma.array(np.arange(20).reshape(4,5),mask=np.arange(20).reshape(4,5) % 3)

...

In [16]: np.ma.array(data[mI], mask=mI.mask)
Out[16]: 
masked_array(data =
 [[10 -- -- 16 --]
 [-- 22 -- -- 28]
 [-- -- 34 -- --]
 [40 -- -- 46 --]],
             mask =
 [[False  True  True False  True]
 [ True False  True  True False]
 [ True  True False  True  True]
 [False  True  True False  True]],
       fill_value = 999999)

Do you really need to combine the indexing and the masking into one operation (and one masked array)? The operation works just as well if the mask is kept separate.

 I = np.arange(20).reshape(4,5)
 m = (np.arange(20).reshape(4,5) % 3)>0
 np.ma.array(data[I], mask=m)

If the masked index entries are not valid (e.g. out of range), you can fill them with something valid (followed by masking if necessary):

data[mI.filled(fill_value=0)]
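As a minimal sketch of that fill-then-mask pattern, assume an index array whose masked entries hold an out-of-range placeholder. The placeholder value 99 and the variable names are made up for illustration.

```python
import numpy as np

data = np.arange(10, 60, 2)  # 25 elements, valid indices are 0..24

# Hypothetical index array: the entries equal to 99 are out of range
# for `data`, so they are masked.
raw = np.array([[0, 99, 3],
                [99, 6, 99]])
idx = np.ma.masked_equal(raw, 99)

# Replace the masked entries with a valid index before the lookup...
safe = idx.filled(fill_value=0)

# ...then reattach the mask so the placeholder lookups stay hidden.
result = np.ma.array(data[safe], mask=np.ma.getmaskarray(idx))

print(result)
```

Indexing with `raw` directly would raise an IndexError. Filling first avoids that, and the mask keeps the dummy lookups out of any later computation.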

Have you seen an example in the NumPy masked array docs of using a masked array to index another? Or are all the masked arrays there data? Perhaps the designers never intended masked arrays to be used as indices.

The masked array .choose works because it uses a method that was overridden for masked arrays. Regular indexing probably converts the index to a regular array with something like data[np.asarray(mI)].

The __getitem__ method of the MaskedArray class starts with:

    def __getitem__(self, indx):
        """
        Return the item described by i, as a masked array.

        """
        # This test is useful, but we should keep things light...
#        if getmask(indx) is not nomask:
#            msg = "Masked arrays must be filled before they can be used as indices!"
#            raise IndexError(msg)

This is the method that is called when you use [] on a masked array. Evidently the developer(s) considered formally blocking the use of a masked index, but decided that it was not an important enough issue. See the np.ma.core.py file for details.
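To see what that commented-out test would have done, here is a minimal sketch of the same check as a standalone helper. The function name `require_unmasked_index` is hypothetical and not part of NumPy; only `np.ma.getmask` and `np.ma.nomask` are real NumPy names.

```python
import numpy as np

def require_unmasked_index(indx):
    """Hypothetical guard: reject an index array that carries a mask."""
    # Same condition as the commented-out test in MaskedArray.__getitem__.
    if np.ma.getmask(indx) is not np.ma.nomask:
        raise IndexError(
            "Masked arrays must be filled before they can be used as indices!"
        )
    return np.asarray(indx)

data = np.arange(10, 60, 2)
values = np.arange(20).reshape(4, 5)
masked_idx = np.ma.array(values, mask=values % 3)

# A plain index array passes through unchanged.
plain_result = data[require_unmasked_index(values)]

# A masked index is rejected until it has been filled.
try:
    data[require_unmasked_index(masked_idx)]
    rejected = False
except IndexError:
    rejected = True

filled_result = data[require_unmasked_index(masked_idx.filled(0))]
```

A guard like this makes the silent mask loss explicit: the caller has to choose a fill value and reattach the mask afterwards.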
