Python NumPy: How to Fix "IndexError: arrays used as indices must be of integer (or boolean) type"

When you use one NumPy array (or list) to select elements from another NumPy array, a technique known as "fancy indexing" or "boolean masking", you might encounter the IndexError: arrays used as indices must be of integer (or boolean) type. The message means what it says: the array you use as an indexer (the one inside the square brackets []) must have an integer dtype (to specify positions) or a boolean dtype (to act as a mask). If your indexing array has a floating-point or other non-integer, non-boolean dtype, NumPy cannot interpret it as positional indices or as a mask.

This guide explains why this IndexError occurs, reproduces it with a float-based index array, and shows how to fix it by converting the indexing array to an integer or boolean dtype with astype() or by setting the correct dtype when the array is created.

Understanding the Error: NumPy's Indexing Type Requirements

NumPy offers powerful ways to select elements from an array using another array as the indexer:

Integer Array Indexing (Fancy Indexing)

You can pass an array (or list) of integers to select elements at those specific integer positions.

import numpy as np

data_arr = np.array(['A', 'B', 'C', 'D', 'E'])
int_indices = np.array([0, 2, 4]) # Integer indices
selected_elements = data_arr[int_indices] # Selects elements at positions 0, 2, 4
print(selected_elements) # Output: ['A' 'C' 'E']

Output:

['A' 'C' 'E']
Note: For this to work, int_indices must have an integer dtype; whole-number floats such as 0.0 are not accepted.

Boolean Array Indexing (Masking)

You can pass a boolean array whose length matches the axis being indexed (or whose shape matches the whole array). Elements corresponding to True in the mask are selected.

import numpy as np

data_arr_mask = np.array([10, 20, 30, 40, 50])
bool_mask = np.array([True, False, True, False, True]) # Boolean mask
selected_by_mask = data_arr_mask[bool_mask] # Selects elements where mask is True
print(selected_by_mask) # Output: [10 30 50]

Note:
  • For this to work, bool_mask must have a boolean dtype.
  • The IndexError: arrays used as indices must be of integer (or boolean) type is raised when the array you use inside the [] has a dtype that is neither integer nor boolean (e.g., float).

Reproducing the Error: Using a Non-Integer/Non-Boolean Array as an Index

This typically happens if your indexing array is inadvertently created with or converted to a floating-point dtype.

import numpy as np

main_data_array = np.array([
    [10, 11, 12, 13],  # Row 0
    [20, 21, 22, 23],  # Row 1
    [30, 31, 32, 33],  # Row 2
    [40, 41, 42, 43],  # Row 3
])

# Indexing array that contains floats (e.g., from a calculation or another source).
# Suppose we want to select rows using the first column of another array that happens to be float.
potential_row_indices_float = np.array([
    [0.0, 10.5, 20.3],  # First column is 0.0
    [1.0, 11.2, 22.8],  # First column is 1.0 (a whole number, but still stored as a float)
    [2.7, 12.1, 24.5],  # First column is 2.7
])
# If we intended to use the first column of this as indices:
indices_to_use_raw = potential_row_indices_float[:, 0]  # This will be [0.0, 1.0, 2.7]
print(f"Raw indexing array: {indices_to_use_raw}")
print(f"dtype of raw indexing array: {indices_to_use_raw.dtype}")  # Output: float64

try:
    # ⛔️ Incorrect: Trying to index main_data_array using an array of floats
    selected_rows_error = main_data_array[indices_to_use_raw]
    print(selected_rows_error)
except IndexError as e:
    print(f"Error: {e}")

Output:

Raw indexing array: [0.  1.  2.7]
dtype of raw indexing array: float64
Error: arrays used as indices must be of integer (or boolean) type
Note: Even though 0.0 and 1.0 are whole numbers, they are stored as floats, so the entire indices_to_use_raw array has dtype=float64. NumPy checks the dtype of the indexing array, not its individual values, so it cannot use this array as positional indices.
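To illustrate that the dtype matters rather than the values, here is a minimal sketch using a float array that holds only whole numbers. The names data and whole_number_floats are made up for this example.

```python
import numpy as np

data = np.array([10, 20, 30])
whole_number_floats = np.array([0.0, 2.0])  # whole numbers, but dtype is float64

try:
    data[whole_number_floats]
    raised = False
except IndexError:
    raised = True

print(whole_number_floats.dtype)  # float64
print(raised)                     # True: the float dtype alone triggers the error
```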

Verifying the dtype of the Indexing Array

Always check the dtype of your indexing array if you encounter this error: print(your_indexing_array.dtype)
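If you would rather check this programmatically than by eye, one possible approach is np.issubdtype. The helper name is_valid_index_dtype below is hypothetical and only meant as a sketch.

```python
import numpy as np

def is_valid_index_dtype(index_array):
    """Hypothetical helper: True if the dtype is integer or boolean."""
    dtype = np.asarray(index_array).dtype
    return bool(np.issubdtype(dtype, np.integer) or np.issubdtype(dtype, np.bool_))

print(is_valid_index_dtype(np.array([0, 2, 4])))          # True
print(is_valid_index_dtype(np.array([True, False])))      # True
print(is_valid_index_dtype(np.array([0.0, 1.0, 2.7])))    # False
```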

Solution 1: Convert Indexing Array to Integer Type using astype(int) (Most Common)

If your indexing array contains numbers that are meant to be integer positions but are currently floats (perhaps due to calculations), convert them to integers using the ndarray.astype(int) method.

import numpy as np

# main_data_array and indices_to_use_raw defined as above
main_data_array = np.array([
    [10, 11, 12, 13],  # Row 0
    [20, 21, 22, 23],  # Row 1
    [30, 31, 32, 33],  # Row 2
    [40, 41, 42, 43],  # Row 3
])
potential_row_indices_float = np.array([
    [0.0, 10.5, 20.3],  # First column is 0.0
    [1.0, 11.2, 22.8],  # First column is 1.0 (a whole number, but still stored as a float)
    [2.7, 12.1, 24.5],  # First column is 2.7
])
indices_to_use_raw = potential_row_indices_float[:, 0]  # This will be [0.0, 1.0, 2.7]

# ✅ Convert the float indexing array to integer type
integer_indices = indices_to_use_raw.astype(int)
print(f"Indexing array after astype(int): {integer_indices}") # Output: [0 1 2] (floats are truncated)
print(f"dtype of integer_indices: {integer_indices.dtype}") # Output: e.g., int64 or int32

# Now use the integer_indices for indexing
selected_rows_correct = main_data_array[integer_indices]
print("Selected rows using integer indices:")
print(selected_rows_correct)

Output:

Indexing array after astype(int): [0 1 2]
dtype of integer_indices: int64
Selected rows using integer indices:
[[10 11 12 13]
 [20 21 22 23]
 [30 31 32 33]]
Warning: astype(int) truncates floats toward zero (e.g., 2.7 becomes 2 and -2.7 becomes -2). Make sure this truncation is acceptable for your indexing logic. If you need rounding to the nearest integer instead, use np.round(arr).astype(int).
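The difference between truncating and rounding can be seen in a short sketch. The sample values are made up, and note that np.round rounds exact halves to the nearest even integer.

```python
import numpy as np

float_positions = np.array([0.2, 1.5, 2.5, 2.7])

truncated = float_positions.astype(int)          # drops the fractional part
rounded = np.round(float_positions).astype(int)  # rounds first (halves go to even)

print(truncated)  # [0 1 2 2]
print(rounded)    # [0 2 2 3]
```

The two results select different positions, so pick the conversion that matches what the float values are supposed to mean.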

Solution 2: Convert Indexing Array to Boolean Type using astype(bool)

If your indexing array's values are intended to act as a boolean mask (where non-zero typically means True and zero means False), you can convert it to bool type.

How Numeric Values Convert to Booleans

When converting a numeric array to boolean using astype(bool):

  • 0 (of any numeric type like 0, 0.0) becomes False.
  • All other non-zero numbers become True.
  • np.nan converts to True, because NaN is not equal to zero (this can be surprising, so be careful with NaNs if using this method).

import numpy as np

# main_data_array defined as above
main_data_array = np.array([
    [10, 11, 12, 13],  # Row 0
    [20, 21, 22, 23],  # Row 1
    [30, 31, 32, 33],  # Row 2
    [40, 41, 42, 43],  # Row 3
])

# Example indexing array where 0 means "don't select" and non-zero means "select"
numeric_mask_like = np.array([1.0, 0.0, 0.0, 5.5])  # Intend to select rows 0 and 3

# The mask length must match the axis being indexed (here, the number of rows)
if len(numeric_mask_like) == main_data_array.shape[0]:
    # ✅ Convert to boolean type
    boolean_mask_indices = numeric_mask_like.astype(bool)
    print(f"Indexing array after astype(bool): {boolean_mask_indices}")
    print(f"dtype of boolean_mask_indices: {boolean_mask_indices.dtype}\n")

    selected_rows_bool_mask = main_data_array[boolean_mask_indices]
    print("Selected rows using boolean mask from numeric array:")
    print(selected_rows_bool_mask)
else:
    print("Length of numeric_mask_like does not match number of rows in main_data_array for boolean masking.")

Output:

Indexing array after astype(bool): [ True False False  True]
dtype of boolean_mask_indices: bool

Selected rows using boolean mask from numeric array:
[[10 11 12 13]
 [40 41 42 43]]
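The NaN caveat listed above can be sketched as follows. The array values and the name safe_mask are illustrative, and excluding NaN explicitly is just one possible way to handle it.

```python
import numpy as np

values = np.array([0.0, 1.5, np.nan])

as_mask = values.astype(bool)                   # NaN is non-zero, so it becomes True
safe_mask = (values != 0) & ~np.isnan(values)   # explicitly treat NaN as "not selected"

print(as_mask)    # [False  True  True]
print(safe_mask)  # [False  True False]
```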
Note: This approach is mainly useful when an array conceptually represents a mask but is not yet boolean. Creating the boolean mask directly from a comparison (e.g., main_data_array[:, 0] > 20) is usually preferred.
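As a brief sketch of the preferred approach, a comparison already produces a boolean array, so no conversion is needed. The names row_mask and selected_rows are chosen for this example.

```python
import numpy as np

main_data_array = np.array([
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
    [40, 41, 42, 43],
])

# The comparison yields a bool array with one entry per row
row_mask = main_data_array[:, 0] > 20
selected_rows = main_data_array[row_mask]

print(row_mask)        # [False False  True  True]
print(selected_rows)   # rows whose first column is greater than 20
```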

Solution 3: Specify Integer dtype During Indexing Array Creation

If you are creating the indexing array from data that might be interpreted as float, explicitly set its dtype to an integer type during creation if appropriate.

import numpy as np

# main_data_array defined as above
main_data_array = np.array([
    [10, 11, 12, 13],  # Row 0
    [20, 21, 22, 23],  # Row 1
    [30, 31, 32, 33],  # Row 2
    [40, 41, 42, 43],  # Row 3
])

# Data for indices that would otherwise become float
potential_float_indices_data = [0, 1.0, 2]  # Mixing int and float produces a float64 array by default

# ✅ Create indexing array with explicit integer dtype
integer_indices_at_creation = np.array(potential_float_indices_data, dtype=int)
print(f"Indexing array created with dtype=int: {integer_indices_at_creation}") # Output: [0 1 2]
print(f"dtype: {integer_indices_at_creation.dtype}\n") # Output: e.g., int64

selected_rows_dtype_init = main_data_array[integer_indices_at_creation]
print("Selected rows using dtype=int at creation:")
print(selected_rows_dtype_init)

Output:

Indexing array created with dtype=int: [0 1 2]
dtype: int64

Selected rows using dtype=int at creation:
[[10 11 12 13]
 [20 21 22 23]
 [30 31 32 33]]

This preempts the issue by ensuring the indexing array is of the correct integer type from the start.

Key Takeaway: Match Indexer dtype to Indexing Method

The IndexError: arrays used as indices must be of integer (or boolean) type is NumPy's way of enforcing strict type requirements for its advanced indexing mechanisms.

  • For fancy indexing (selecting specific elements/rows/columns by position), the indexing array must contain integers.
  • For boolean masking, the indexing array must contain booleans.
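These two requirements could be enforced in one place with a small guard function. The sketch below is not part of NumPy: to_index_array is a hypothetical helper, and refusing to convert floats that are not whole numbers is a design choice made for this example (it avoids silent truncation).

```python
import numpy as np

def to_index_array(candidate):
    """Hypothetical helper: return an array with an integer or boolean dtype.

    Integer and boolean arrays pass through unchanged. Float arrays are
    converted only if every value is a finite whole number.
    """
    arr = np.asarray(candidate)
    if np.issubdtype(arr.dtype, np.integer) or np.issubdtype(arr.dtype, np.bool_):
        return arr
    if np.issubdtype(arr.dtype, np.floating):
        if np.all(np.isfinite(arr)) and np.all(arr == np.floor(arr)):
            return arr.astype(int)
        raise ValueError("index array contains non-integer or non-finite values")
    raise TypeError(f"unsupported index dtype: {arr.dtype}")

data = np.array(['A', 'B', 'C', 'D', 'E'])
picked = data[to_index_array(np.array([0.0, 2.0, 4.0]))]
print(picked)  # ['A' 'C' 'E']
```

With this guard, an array such as [0.0, 2.7] raises a ValueError instead of being silently truncated to [0, 2].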

Conclusion

The NumPy IndexError regarding non-integer or non-boolean index arrays is a common issue when the data type of your indexing array is incorrect for the type of indexing you're attempting. To resolve it:

  1. Verify dtype: Always check your_indexing_array.dtype.
  2. Convert to Integer: If you intend to select by position and your indexing array contains floats (or numbers that can be safely truncated/rounded to integers), use your_indexing_array.astype(int).
  3. Convert to Boolean: If your indexing array represents a condition and should be a mask, convert it using your_indexing_array.astype(bool) (being mindful of how numbers and NaN convert to booleans).
  4. Specify dtype at Creation: When creating an array intended for integer indexing, ensure it's created with an integer dtype from the outset if the source data might lead to float inference (e.g., np.array(data, dtype=int)).

By ensuring your indexing arrays have the appropriate integer or boolean data type, you can leverage NumPy's powerful indexing capabilities without encountering this IndexError.