Python NumPy: How to Apply a Mask from One Array to Another
In data analysis with NumPy, a common task is to filter or select elements from an array based on conditions derived from another array. This often involves creating a boolean mask from one array (e.g., based on a threshold) and then applying this same mask to a second, corresponding array to select or hide its elements at the same positions. NumPy's masked array submodule (numpy.ma) provides convenient tools for these operations.
This guide shows how to create a mask from one NumPy array based on a condition and then apply that exact mask to another array, so that both arrays are filtered at the same positions. It relies mainly on numpy.ma.masked_where() and numpy.ma.getmask(), and covers both 1D and 2D array scenarios.
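As a quick preview of the idea, here is a minimal sketch. The array names (temps, labels) and the threshold of 25 are made up for illustration and are not part of any NumPy API.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical example data: two arrays whose positions correspond.
temps = np.array([15, 22, 30, 18, 27])
labels = np.array([1, 2, 3, 4, 5])

# Mask every element of temps that is greater than 25 (assumed threshold).
masked_temps = ma.masked_where(temps > 25, temps)

# Extract the boolean mask that was just created.
shared_mask = ma.getmask(masked_temps)

# Apply the same mask to the second array.
masked_labels = ma.masked_where(shared_mask, labels)

print(masked_temps)   # [15 22 -- 18 --]
print(masked_labels)  # [1 2 -- 4 --]
```

Both results are masked at the positions where temps exceeded the threshold, even though the condition was never evaluated on labels.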
Understanding Masked Arrays in NumPy
NumPy's numpy.ma module allows for the creation of "masked arrays." A masked array is a standard NumPy array combined with a boolean mask of the same shape. Where the mask is True, the corresponding element in the data array is considered "masked" or invalid (e.g., hidden from calculations, often displayed as --). Where the mask is False, the data element is considered valid.
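To make the True/False convention concrete, the following small sketch builds a masked array directly from a data array and an explicit mask. The values are arbitrary and chosen only for illustration.

```python
import numpy as np
import numpy.ma as ma

# Arbitrary illustrative data and a hand-written boolean mask of the same shape.
data = np.array([10, 20, 30, 40])
mask = np.array([False, True, False, True])  # True = masked (invalid)

masked = ma.masked_array(data, mask=mask)

print(masked)        # [10 -- 30 --]
print(masked.mask)   # [False  True False  True]
print(masked.sum())  # 40, because only the unmasked values 10 and 30 are summed
print(masked.data)   # [10 20 30 40], the underlying data is still present
```

Note that masking hides values rather than deleting them: the original data remains available through the .data attribute.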