Python NumPy: How to Iterate Over the Columns of an Array

Iterating over a NumPy array is a common task, but by default, Python's for loop iterates over the rows of a 2D array. To process data column by column, you need to employ specific NumPy techniques. This is essential for many column-wise calculations, transformations, or when you need to analyze each feature (column) independently.

This guide will comprehensively demonstrate several effective methods to iterate over the columns of a 2D NumPy array, including using the transpose attribute (.T), the transpose() method, iterating with range() and array slicing, and leveraging Python's built-in zip() function. We'll also briefly cover iterating over columns in a 3D array.

Understanding Default Iteration (Row-wise) vs. Column-wise Iteration

When you directly use a for loop on a 2D NumPy array, Python iterates through its first dimension, which corresponds to the rows.

import numpy as np

array_2d = np.array([
    [10, 11, 12, 13],  # Row 0
    [20, 21, 22, 23],  # Row 1
    [30, 31, 32, 33]   # Row 2
])
print("Original 2D NumPy Array:")
print(array_2d)
print()

print("Default iteration (iterates over rows):")
for item in array_2d:
    print(item)

Output:

Original 2D NumPy Array:
[[10 11 12 13]
 [20 21 22 23]
 [30 31 32 33]]

Default iteration (iterates over rows):
[10 11 12 13]
[20 21 22 23]
[30 31 32 33]

To iterate over columns, we need to change how we access or structure the array for the loop.

Method 1: Transposing the Array

Transposing an array swaps its rows and columns. Iterating over the rows of the transposed 2D array is therefore equivalent to iterating over the columns of the original array. Because no data is copied, this is usually the most idiomatic and efficient NumPy approach.

Iterating with the .T Attribute

The .T attribute returns a view of the transposed array. No data is copied; NumPy only swaps the shape and strides used to index the same memory buffer.

import numpy as np

# array_2d defined as above
array_2d = np.array([
    [10, 11, 12, 13],  # Row 0
    [20, 21, 22, 23],  # Row 1
    [30, 31, 32, 33]   # Row 2
])

print("Iterating over columns using array.T:")
for column_vector in array_2d.T:
    print(column_vector)
    print("---")  # Separator

Output:

Iterating over columns using array.T:
[10 20 30]
---
[11 21 31]
---
[12 22 32]
---
[13 23 33]
---

Note: Each column_vector is a 1D array representing a column from the original array_2d.
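
Because .T returns a view, each column produced by the loop refers to the same memory as the original array. The following is a minimal sketch of what that implies in practice; the variable names transposed and first_column are illustrative only.

```python
import numpy as np

array_2d = np.array([
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
])

transposed = array_2d.T

# The transpose shares its data buffer with the original array.
shares_buffer = np.shares_memory(array_2d, transposed)
print(shares_buffer)  # True

# Writing through a column taken from the view changes the original array.
first_column = transposed[0]
first_column[:] = 0
print(array_2d[:, 0])  # [0 0 0]
```

If the loop needs to modify a column without touching the source array, calling .copy() on the column first avoids the side effect.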

Iterating with the transpose() Method

Called with no arguments, the array.transpose() method reverses the order of the axes, so for a 2D array it returns the same view as .T.

import numpy as np

# array_2d defined as above
array_2d = np.array([
    [10, 11, 12, 13],  # Row 0
    [20, 21, 22, 23],  # Row 1
    [30, 31, 32, 33]   # Row 2
])

print("Iterating over columns using array.transpose():")
for column_vector in array_2d.transpose():
    print(column_vector)
    print("---")

Output:

Iterating over columns using array.transpose():
[10 20 30]
---
[11 21 31]
---
[12 22 32]
---
[13 23 33]
---
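
As a quick check that the two spellings agree, the sketch below compares .T with transpose() on a small hypothetical array built with np.arange. It also shows the explicit axis order (1, 0), which is what both forms amount to for a 2D array.

```python
import numpy as np

# Hypothetical 3x4 array used only for this comparison.
array_2d = np.arange(12).reshape(3, 4)

# .T and transpose() with no arguments produce the same values.
same_values = np.array_equal(array_2d.T, array_2d.transpose())
print(same_values)  # True

# transpose() also accepts an explicit axis order; (1, 0) swaps the two axes.
explicit_axes = array_2d.transpose(1, 0)
print(explicit_axes.shape)  # (4, 3)
```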

How Transposition Works

import numpy as np

# array_2d defined as above
array_2d = np.array([
    [10, 11, 12, 13],  # Row 0
    [20, 21, 22, 23],  # Row 1
    [30, 31, 32, 33]   # Row 2
])

print("Original array_2d (shape {}):".format(array_2d.shape)) # (3, 4)
print(array_2d)
print()

transposed_array = array_2d.T
print("Transposed array_2d.T (shape {}):".format(transposed_array.shape)) # (4, 3)
print(transposed_array)

Output:

Original array_2d (shape (3, 4)):
[[10 11 12 13]
 [20 21 22 23]
 [30 31 32 33]]

Transposed array_2d.T (shape (4, 3)):
[[10 20 30]
 [11 21 31]
 [12 22 32]
 [13 23 33]]

Note: Iterating over the rows of transposed_array gives you the columns of array_2d.
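
One way to see that transposition only changes how the same buffer is indexed is to compare the strides of the two arrays. This is a small sketch; it computes the expected strides from itemsize because the default integer size differs between platforms.

```python
import numpy as np

array_2d = np.array([
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
])

itemsize = array_2d.itemsize  # bytes per element, platform dependent

# Original: step 4 elements to reach the next row, 1 element for the next column.
original_strides_match = array_2d.strides == (4 * itemsize, itemsize)

# Transposed: the same two strides, in swapped order.
transposed_strides_match = array_2d.T.strides == (itemsize, 4 * itemsize)

print(original_strides_match, transposed_strides_match)  # True True

# The transposed view is no longer C-contiguous; it is Fortran-contiguous instead.
print(array_2d.T.flags["C_CONTIGUOUS"], array_2d.T.flags["F_CONTIGUOUS"])  # False True
```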

Method 2: Iterating with range() and Column Slicing

You can iterate through the column indices and use NumPy's slicing to extract each column.

import numpy as np

# array_2d defined as above
array_2d = np.array([
    [10, 11, 12, 13],  # Row 0
    [20, 21, 22, 23],  # Row 1
    [30, 31, 32, 33]   # Row 2
])

num_columns = array_2d.shape[1]  # Number of columns (index 1 of the shape tuple)

print("Iterating over columns using range() and slicing:")
for col_index in range(num_columns):
    # ':' selects all rows, col_index selects the current column
    column_vector = array_2d[:, col_index]
    print(column_vector)
    print("---")

Output:

Iterating over columns using range() and slicing:
[10 20 30]
---
[11 21 31]
---
[12 22 32]
---
[13 23 33]
---

Note:
  • array_2d.shape returns a tuple (number_of_rows, number_of_columns). So, array_2d.shape[1] is the column count.
  • array_2d[:, col_index] slices all rows (:) for the specific col_index.
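
The index-based loop is handy when the column position itself matters. As a rough sketch, the example below collects a per-column mean and then checks it against the vectorized array_2d.mean(axis=0); the names column_means and column_2d are illustrative.

```python
import numpy as np

array_2d = np.array([
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
])

column_means = []
for col_index in range(array_2d.shape[1]):
    # array_2d[:, col_index] is a 1D view of one column.
    column_means.append(float(array_2d[:, col_index].mean()))

print(column_means)  # [20.0, 21.0, 22.0, 23.0]

# The loop result matches the vectorized reduction along axis 0.
print(np.allclose(column_means, array_2d.mean(axis=0)))  # True

# Slicing with a range (1:2) instead of an integer keeps the column 2D.
column_2d = array_2d[:, 1:2]
print(column_2d.shape)  # (3, 1)
```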

Method 3: Iterating with zip(*array)

Python's built-in zip() function, when used with the unpacking operator * on a 2D array (or list of lists), can effectively iterate over columns. zip(*array_2d) groups elements from the same column position across all rows.

import numpy as np

# array_2d defined as above
array_2d = np.array([
    [10, 11, 12, 13],  # Row 0
    [20, 21, 22, 23],  # Row 1
    [30, 31, 32, 33]   # Row 2
])

print("Iterating over columns using zip(*array):")
for column_tuple in zip(*array_2d):
    # zip returns tuples; convert to a NumPy array if needed for NumPy operations
    column_vector = np.array(column_tuple)
    print(column_vector)
    print("---")

Output:

Iterating over columns using zip(*array):
[10 20 30]
---
[11 21 31]
---
[12 22 32]
---
[13 23 33]
---

Each column_tuple yielded by zip(*array_2d) contains the elements of one column from array_2d.
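
To make the tuple behavior concrete, the sketch below inspects what zip(*array_2d) actually yields and shows the same idiom on a plain list of lists. The nested_list data is hypothetical and only serves the illustration.

```python
import numpy as np

array_2d = np.array([
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
])

# Take only the first column produced by zip.
first_column = next(zip(*array_2d))
print(type(first_column).__name__)              # tuple
print(isinstance(first_column[0], np.integer))  # True: elements are NumPy scalars

# The same idiom works without NumPy on a list of lists.
nested_list = [[1, 2, 3], [4, 5, 6]]
list_columns = list(zip(*nested_list))
print(list_columns)  # [(1, 4), (2, 5), (3, 6)]
```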

Bonus: Iterating Over Columns of a 3D NumPy Array

For a 3D array of shape (num_planes, num_rows_per_plane, num_cols_per_plane), direct iteration yields one 2D plane at a time, because a for loop always walks the first axis. What counts as a "column" is therefore a matter of definition, and you select it by choosing which axis transpose() moves to the front.

Two axis orders illustrate the idea:

  • arr.transpose(1, 0, 2) puts the row axis first. Each iteration yields a 2D array of shape (num_planes, num_cols_per_plane): the row with a given index, taken from every plane.
  • arr.transpose(1, 2, 0) also puts the row axis first, followed by the column axis. Each iteration yields a 2D array of shape (num_cols_per_plane, num_planes).

An example:

import numpy as np

arr_3d = np.array([
    [[1, 3, 5, 7], [2, 4, 6, 8]],  # Plane 0
    [[3, 5, 7, 9], [4, 6, 8, 11]]  # Plane 1
], dtype=object)  # object dtype stores Python ints; a numeric dtype is the usual choice

print("Original 3D array (shape {}):".format(arr_3d.shape))  # (2, 2, 4)
print(arr_3d)

print("Iterating using arr.transpose(1, 0, 2):")
# arr_3d.shape is (planes, rows, cols) = (2, 2, 4)
# transpose(1, 0, 2) reorders the axes to (rows, planes, cols) = (2, 2, 4)
# Iterating over the new first axis (the original rows) gives 2D slices,
# each of shape (planes, cols): one original row index across all planes.
for row_across_planes in arr_3d.transpose(1, 0, 2):
    print(row_across_planes)
    print("---")

Output:

Original 3D array (shape (2, 2, 4)):
[[[1 3 5 7]
  [2 4 6 8]]

 [[3 5 7 9]
  [4 6 8 11]]]
Iterating using arr.transpose(1, 0, 2):
[[1 3 5 7]
 [3 5 7 9]]
---
[[2 4 6 8]
 [4 6 8 11]]
---
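
When picking an axis order for a 3D array, printing the resulting shapes is a cheap sanity check. The sketch below assumes a hypothetical array of shape (2, 3, 4), chosen so that planes, rows, and columns all have different sizes and are easy to tell apart.

```python
import numpy as np

# Hypothetical array: 2 planes, 3 rows per plane, 4 columns per plane.
arr_3d = np.arange(24).reshape(2, 3, 4)

shapes = {}
for axes in [(1, 0, 2), (1, 2, 0), (2, 0, 1)]:
    reordered = arr_3d.transpose(axes)
    # reordered[0] is what the first loop iteration would yield.
    shapes[axes] = (reordered.shape, reordered[0].shape)
    print(axes, reordered.shape, reordered[0].shape)

# (1, 0, 2) (3, 2, 4) (2, 4)
# (1, 2, 0) (3, 4, 2) (4, 2)
# (2, 0, 1) (4, 2, 3) (2, 3)
```

Only the last order, (2, 0, 1), puts the column axis first, so it is the one that yields one column index per iteration.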

A more common interpretation of "iterating columns" for a 3D array might be iterating through the "column vectors" within each 2D plane.

import numpy as np

arr_3d = np.array([
    [[1, 3, 5, 7], [2, 4, 6, 8]],  # Plane 0
    [[3, 5, 7, 9], [4, 6, 8, 11]]  # Plane 1
], dtype=object)  # object dtype stores Python ints; a numeric dtype is the usual choice

print("Iterating columns within each 2D plane of 3D array:")
for i, plane in enumerate(arr_3d):  # Iterate through planes
    print(f"Plane {i}:")
    for column_in_plane in plane.T:  # Transpose each 2D plane to iterate its columns
        print(column_in_plane)
    print("---")

Output:

Iterating columns within each 2D plane of 3D array:
Plane 0:
[1 2]
[3 4]
[5 6]
[7 8]
---
Plane 1:
[3 4]
[5 6]
[7 8]
[9 11]
---

Note: The definition of "column" in 3D+ arrays depends heavily on context. For simple 2D-like column iteration, arr.T is usually what's needed.
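
If "column" is taken to mean a fixed index along the last axis, np.moveaxis is one way to bring that axis to the front without spelling out a full transpose() order. This is a sketch under that assumption, using a hypothetical array of shape (2, 3, 4).

```python
import numpy as np

# Hypothetical array: 2 planes, 3 rows per plane, 4 columns per plane.
arr_3d = np.arange(24).reshape(2, 3, 4)

# Move the last axis (columns) to position 0; the other axes keep their order.
columns_first = np.moveaxis(arr_3d, -1, 0)
print(columns_first.shape)  # (4, 2, 3)

for col_index, column_block in enumerate(columns_first):
    # column_block holds column col_index of every plane: arr_3d[:, :, col_index]
    print(col_index, column_block.shape)
```

Each column_block has shape (planes, rows), so a single loop covers the same column position across all planes.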

Choosing the Best Method (for 2D Arrays)

  • Transposition (array.T or array.transpose()): The most idiomatic NumPy approach. It returns a view rather than a copy, so it costs almost nothing even for large arrays. This is usually the recommended method.
  • range() and Slicing: Explicit and understandable, but slightly more verbose. Each array[:, col_index] slice is also a view, and the column index is available when you need it.
  • zip(*array): A concise Pythonic idiom that also works on plain lists of lists. It yields tuples of NumPy scalars, so convert with np.array() inside the loop if you need array operations on the column. The extra Python-level work tends to make it the slowest option on large arrays.
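
The three approaches can be checked against each other directly. The sketch below gathers the columns with each method and confirms they agree; it also shows that a simple per-column reduction often needs no loop at all when the operation accepts axis=0.

```python
import numpy as np

array_2d = np.array([
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
])

# Collect every column as a plain Python list, once per method.
via_transpose = [col.tolist() for col in array_2d.T]
via_slicing = [array_2d[:, j].tolist() for j in range(array_2d.shape[1])]
via_zip = [[int(v) for v in col] for col in zip(*array_2d)]

print(via_transpose == via_slicing == via_zip)  # True

# For reductions such as sums, axis=0 operates on all columns at once.
column_sums = array_2d.sum(axis=0)
print(column_sums)  # [60 63 66 69]
```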

Conclusion

While default iteration over a 2D NumPy array yields rows, several effective methods allow you to iterate over its columns:

  1. Transposing the array using .T or .transpose() and then iterating is the most common and idiomatic NumPy approach.
  2. Iterating through column indices using range(arr.shape[1]) and slicing arr[:, col_index] provides explicit control.
  3. Using Python's zip(*array) offers a compact way to achieve column-wise iteration.

For 3D arrays, "iterating over columns" requires careful definition of what a "column" means in that context, often involving more specific transpose() arguments or nested loops. For most 2D array tasks, transposition provides the cleanest solution for column-wise processing.