
Python Pandas: How to Apply Function to Each Cell/Element in DataFrame

Applying a custom function or a built-in function to every single cell (element) in a Pandas DataFrame is a common requirement for data transformation, cleaning, or feature engineering. Pandas provides methods to perform these element-wise operations efficiently.

This guide explains how to use DataFrame.map() (the successor to the deprecated DataFrame.applymap()) and alternatives like DataFrame.apply() with numpy.vectorize() to apply a function to each cell in a DataFrame.

The Goal: Element-wise Operations on a DataFrame

We want to take a function that operates on a single scalar value and apply it individually to every element within a Pandas DataFrame, producing a new DataFrame of the same shape with the transformed values.

Example DataFrame

import pandas as pd
import numpy as np # For np.nan and np.vectorize
import math # For math.sqrt

data = {
'A': [1.0, 4.0, 9.0, np.nan, 16.0],
'B': [0.5, 2.5, 4.5, 6.5, 8.5],
'C': [0.0, 1.0, -2.0, 3.0, np.nan]
}

df_original = pd.DataFrame(data)
print("Original DataFrame:")
print(df_original)

Output:

Original DataFrame:
A B C
0 1.0 0.5 0.0
1 4.0 2.5 1.0
2 9.0 4.5 -2.0
3 NaN 6.5 3.0
4 16.0 8.5 NaN

Method 1: Using DataFrame.map() (Recommended)

As of Pandas version 2.1.0, DataFrame.applymap() has been deprecated and renamed to DataFrame.map(). The DataFrame.map(func, na_action=None) method takes a function func that accepts and returns a scalar, and applies it to every element of the DataFrame.

Applying a Built-in Function (e.g., math.sqrt)

The function passed to map() must work on individual scalar values. math.sqrt is one example: it accepts a single number, returns NaN when given NaN, and raises ValueError for negative input.

import pandas as pd
import numpy as np
import math

df = pd.DataFrame({
'A': [1.0, 4.0, 9.0, np.nan, 16.0],
'B': [0.25, 2.25, 4.0, 6.25, 9.0], # Values suitable for sqrt
'C': [0.0, 1.0, 1.0, np.nan, 4.0] # Added another NaN for C
})


# ✅ Apply math.sqrt to each element using DataFrame.map()
# Note: math.sqrt(float('nan')) returns nan, so NaN cells pass through unchanged.
# A negative cell, however, makes math.sqrt raise ValueError ("math domain error").
# This DataFrame has no negative values, so the call succeeds.
try:
    df_sqrt = df.map(math.sqrt)  # Would raise ValueError if any cell were negative
    print("DataFrame after df.map(math.sqrt) - if no errors:")
    print(df_sqrt)
    print()
except ValueError as e:  # Raised by math.sqrt for negative input
    print(f"Error with df.map(math.sqrt) due to a negative value: {e}")
    print("Guard against negative values inside the function, or use np.sqrt(df) instead.")
    print()

# More robust for general numeric operations: if the function can handle arrays (like np.sqrt)
# you can use df.apply(np.sqrt) or np.sqrt(df) directly if dtypes are numeric.
# However, for a function that STRICTLY takes a scalar, .map() is the way.

# Example: Forcing a scalar function like custom_sqrt on each cell
def custom_scalar_sqrt(x):
    if pd.isna(x) or x < 0:
        return np.nan  # Handle NaN and negative numbers
    return math.sqrt(x)

df_custom_sqrt = df.map(custom_scalar_sqrt)
print("DataFrame after df.map(custom_scalar_sqrt):")
print(df_custom_sqrt)

Output:

DataFrame after df.map(math.sqrt) - if no errors:
A B C
0 1.0 0.5 0.0
1 2.0 1.5 1.0
2 3.0 2.0 1.0
3 NaN 2.5 NaN
4 4.0 3.0 2.0

DataFrame after df.map(custom_scalar_sqrt):
A B C
0 1.0 0.5 0.0
1 2.0 1.5 1.0
2 3.0 2.0 1.0
3 NaN 2.5 NaN
4 4.0 3.0 2.0
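
The code comments above mention np.sqrt as an array-aware alternative. As a minimal sketch, assuming an all-numeric DataFrame (the name df_mixed and its values are made up for this illustration), np.sqrt can be applied to the whole DataFrame in one call, and negative inputs become NaN instead of raising:

```python
import numpy as np
import pandas as pd

# Hypothetical data containing both a NaN and a negative value
df_mixed = pd.DataFrame({
    'A': [1.0, 4.0, np.nan],
    'C': [0.0, -2.0, 9.0],
})

# np.sqrt is a NumPy ufunc, so it operates on the whole DataFrame without .map().
# np.errstate silences the "invalid value" RuntimeWarning that NumPy emits
# when it meets a negative input.
with np.errstate(invalid='ignore'):
    df_np_sqrt = np.sqrt(df_mixed)

print(df_np_sqrt)
```

When the operation exists as a NumPy function like this, it is usually the simpler choice; map() remains useful for functions that only accept scalars.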

Applying a Custom Function

You can pass any Python function (including a lambda function) that takes a single scalar argument and returns a single scalar value.

import pandas as pd
import numpy as np

df = pd.DataFrame({
'A': [1.0, 4.0, 9.0, np.nan, 16.0],
'B': [0.5, 2.5, 4.5, 6.5, 8.5],
'C': [0.0, 1.0, -2.0, 3.0, np.nan]
})


def transform_value(x):
    if pd.isna(x):
        return x  # Keep NaN as is
    if x > 5:
        return x * 10
    elif x < 1:
        return x / 2
    else:
        return x  # Keep values between 1 and 5 (inclusive) as is

# ✅ Apply the custom function using DataFrame.map()
df_transformed = df.map(transform_value)

print("DataFrame after df.map(custom_function):")
print(df_transformed)

Output:

DataFrame after df.map(custom_function):
A B C
0 1.0 0.25 0.0
1 4.0 2.50 1.0
2 90.0 4.50 -1.0
3 NaN 65.00 3.0
4 160.0 85.00 NaN

Handling NaN Values with na_action

The na_action parameter in df.map() controls how NaN values are handled:

  • None (default): NaN values are passed to the function func. If func cannot handle NaN (for example, int raises ValueError when given NaN), the call fails.
  • 'ignore': NaN values are propagated. The function func is not called for NaN cells; they remain NaN in the output.

import pandas as pd
import numpy as np
import math

df_with_nan = pd.DataFrame({
'A': [1.0, 4.0, 9.0, np.nan, 16.0],
'B': [0.25, 2.25, np.nan, 6.25, 9.0],
'C': [0.0, 1.0, 1.0, np.nan, 4.0]
})

# math.sqrt itself returns nan for NaN input, but many scalar functions do not
# tolerate NaN. na_action='ignore' skips NaN cells so func never sees them.

# ✅ Use na_action='ignore' to skip NaNs
df_sqrt_ignore_nan = df_with_nan.map(lambda x: math.sqrt(x) if x >= 0 else np.nan, na_action='ignore')
# The lambda also guards against negative numbers, which make math.sqrt raise ValueError

print("DataFrame after df.map(lambda_sqrt, na_action='ignore'):")
print(df_sqrt_ignore_nan)

Output:

DataFrame after df.map(lambda_sqrt, na_action='ignore'):
A B C
0 1.0 0.5 0.0
1 2.0 1.5 1.0
2 3.0 NaN 1.0
3 NaN 2.5 NaN
4 4.0 3.0 2.0

Method 2: Using DataFrame.apply() with numpy.vectorize()

While DataFrame.map() is preferred for element-wise operations on the whole DataFrame, you can also achieve this using DataFrame.apply() in conjunction with numpy.vectorize(). np.vectorize is primarily a convenience for functions not designed to work on arrays.

How It Works

  • numpy.vectorize(pyfunc): Takes a Python function pyfunc that works on scalars and returns a new "vectorized" function that can operate on NumPy arrays (element-wise).
  • DataFrame.apply(vectorized_func): Applies the vectorized function. With the default axis=0, apply passes each column (a Series) to the function; with axis=1, it passes each row. Because the vectorized function works element-wise on whatever array it receives, applying it column by column transforms every cell of the DataFrame. Alternatively, you can call the vectorized function on the whole DataFrame and assign the result back with df[:] = ....

import pandas as pd
import numpy as np

df = pd.DataFrame({
'A': [1.0, 4.0, 9.0, np.nan, 16.0],
'B': [0.5, 2.5, 4.5, 6.5, 8.5],
'C': [0.0, 1.0, -2.0, 3.0, np.nan]
})

def transform_value_np(x):  # Same custom function as before
    if pd.isna(x): return x
    if x > 5: return x * 10
    elif x < 1: return x / 2
    else: return x

# Vectorize the custom scalar function
vectorized_transform = np.vectorize(transform_value_np)

# ✅ Apply the vectorized function to the DataFrame
# df.apply() on a DataFrame by default passes each column (Series) to the function.
# np.vectorize makes our scalar function work on these Series element-wise.
df_transformed_np_apply = df.apply(vectorized_transform)

print("DataFrame after df.apply(np.vectorize(custom_function)):")
print(df_transformed_np_apply)

Output: (Same as df.map(transform_value))

DataFrame after df.apply(np.vectorize(custom_function)):
A B C
0 1.0 0.25 0.0
1 4.0 2.50 1.0
2 90.0 4.50 -1.0
3 NaN 65.00 3.0
4 160.0 85.00 NaN
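
One caveat with np.vectorize: unless told otherwise, it infers the output dtype from the result of the first call. The example above is unaffected because every value is a float, but with integer input the fractional part of later results can be lost. A minimal sketch of the issue and of the otypes argument that avoids it (halve_large and values are made-up names):

```python
import numpy as np

def halve_large(x):
    # Returns the input unchanged (an int here) for small values, a float otherwise
    return x if x < 2 else x / 2

values = np.array([1, 2, 3])

# The output dtype is inferred from the first call, halve_large(1), which returns
# an integer, so the later float results are cast to int.
inferred = np.vectorize(halve_large)(values)

# Declaring the output type up front avoids the inference.
declared = np.vectorize(halve_large, otypes=[float])(values)

print(inferred)
print(declared)
```

Passing otypes is a cheap safeguard whenever the scalar function can return different numeric types for different inputs.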

Direct Application of Vectorized Function

You can also apply the vectorized function directly to the DataFrame (or its underlying NumPy array).

import pandas as pd
import numpy as np

def transform_value_np(x):  # Same custom function as before
    if pd.isna(x): return x
    if x > 5: return x * 10
    elif x < 1: return x / 2
    else: return x

df_copy = pd.DataFrame({
'A': [1.0, 4.0, 9.0, np.nan, 16.0],
'B': [0.5, 2.5, 4.5, 6.5, 8.5],
'C': [0.0, 1.0, -2.0, 3.0, np.nan]
})
vectorized_transform = np.vectorize(transform_value_np)

# ✅ Apply vectorized function directly and reassign
# This works because vectorized_transform now handles array-like inputs element-wise.
df_copy[:] = vectorized_transform(df_copy)

print("DataFrame after vectorized_function(df) and reassignment:")
print(df_copy)

Output: (Same as df.map(transform_value))

DataFrame after vectorized_function(df) and reassignment:
A B C
0 1.0 0.25 0.0
1 4.0 2.50 1.0
2 90.0 4.50 -1.0
3 NaN 65.00 3.0
4 160.0 85.00 NaN

Note on np.vectorize: While convenient, np.vectorize is essentially a Python-level for loop. It is provided for convenience, not performance, and does not match the speed of true NumPy ufuncs implemented in C. For performance-critical tasks, if your operation can be expressed using native NumPy/Pandas vectorized operations, that's usually better. For element-wise DataFrame operations with a scalar function, df.map() is generally preferred over np.vectorize.

Important Note: DataFrame.applymap() is Deprecated

If you see older code or tutorials using DataFrame.applymap(func), be aware that this method was deprecated in Pandas 2.1.0 and replaced by DataFrame.map(func). They serve the same purpose: applying a scalar function element-wise to a DataFrame. You should update your code to use DataFrame.map().
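
If the same code has to run on pandas versions both before and after 2.1.0, one option is a small wrapper. This is only a sketch: map_elements is a hypothetical helper name, not part of pandas, and it assumes that a DataFrame without a map attribute still provides applymap.

```python
import pandas as pd

def map_elements(df, func, **kwargs):
    """Hypothetical helper: use DataFrame.map when available, else applymap."""
    if hasattr(df, "map"):  # pandas 2.1 and newer
        return df.map(func, **kwargs)
    return df.applymap(func, **kwargs)  # older pandas releases

# Made-up data for illustration
df_demo = pd.DataFrame({'A': [1.0, 2.0], 'B': [3.0, 4.0]})
df_doubled = map_elements(df_demo, lambda x: x * 2)
print(df_doubled)
```

Once older pandas versions no longer need support, the wrapper can be dropped in favor of calling df.map() directly.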

Choosing the Right Method

  • DataFrame.map(func, na_action=...): This is the recommended and modern method for applying a scalar function to every element of a DataFrame. It's clear, concise, and provides control over NaN handling.
  • DataFrame.apply(np.vectorize(func)): A viable alternative, especially if you already have a scalar function and want to make it work element-wise on a DataFrame (or its columns/rows). However, df.map() is generally more direct for this specific task.
  • Native Pandas/NumPy vectorized operations: If your transformation can be expressed directly using arithmetic operators (+, -, *, /) or NumPy universal functions (np.sqrt, np.log, np.sin, etc.) on the DataFrame or specific columns, this is usually the most performant approach (e.g., df * 2, np.sqrt(df)). df.map() is for more arbitrary scalar functions.
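
As an illustration of the native option, the transform_value rules used earlier (multiply by 10 above 5, halve below 1, otherwise unchanged) can be written without any per-cell Python call. This is one possible sketch using DataFrame.mask; other formulations, such as np.select, would work too.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1.0, 4.0, 9.0, np.nan, 16.0],
    'B': [0.5, 2.5, 4.5, 6.5, 8.5],
    'C': [0.0, 1.0, -2.0, 3.0, np.nan],
})

# Same rules as transform_value, applied to whole arrays at once:
#   x > 5 -> x * 10,  x < 1 -> x / 2,  otherwise unchanged.
# mask() replaces cells where the condition is True. Comparisons with NaN
# are False, so NaN cells are left as NaN.
df_native = df.mask(df > 5, df * 10).mask(df < 1, df / 2)

print(df_native)
```

Both conditions are evaluated against the original df, so a value changed by the first mask() is not re-examined by the second.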

Conclusion

To apply a function to each cell (element-wise) in a Pandas DataFrame:

  1. Use DataFrame.map(your_function). This is the current standard, replacing the deprecated applymap(). Ensure your_function takes a single scalar value and returns a single scalar value. Use the na_action parameter to control how NaNs are treated.
  2. As an alternative, DataFrame.apply(numpy.vectorize(your_scalar_function)) can be used, but DataFrame.map() is generally more idiomatic for this purpose.

For optimal performance, always try to use built-in Pandas/NumPy vectorized operations if your transformation allows. If you need to apply a custom scalar function, DataFrame.map() is your primary tool.