Python Pandas: How to Multiply DataFrame Columns (Element-wise)

Element-wise multiplication of two or more columns in a Pandas DataFrame is a common operation in data analysis, often used to calculate totals, weighted scores, or new features. Pandas provides several straightforward ways to do this, including the standard multiplication operator (*) and the Series.mul() method.

This guide explains how to multiply DataFrame columns, handle potential non-numeric data, and perform conditional multiplication.

The Goal: Element-wise Column Multiplication

Given a Pandas DataFrame, we want to create a new column (or update an existing one) where each value is the product of the corresponding values from two or more specified columns in the same row. For example, multiplying a 'Price' column by a 'Quantity' column to get a 'Total_Sale' column.

Example DataFrame

import pandas as pd

data = {
    'ProductName': ['Apple', 'Banana', 'Orange', 'Grape'],
    'Price_Per_Unit': [0.5, 0.25, 0.4, 1.5],
    'Quantity_Sold': [100, 150, 200, 50],
    'Discount_Factor': [1.0, 0.9, 1.0, 0.8],  # 1.0 means no discount
    'Category': ['Fruit', 'Fruit', 'Fruit', 'Fruit']
}

df_original = pd.DataFrame(data)
print("Original DataFrame:")
print(df_original)

Output:

Original DataFrame:
  ProductName  Price_Per_Unit  Quantity_Sold  Discount_Factor Category
0       Apple            0.50            100              1.0    Fruit
1      Banana            0.25            150              0.9    Fruit
2      Orange            0.40            200              1.0    Fruit
3       Grape            1.50             50              0.8    Fruit

Method 1: Using the Multiplication Operator (*)

The * operator is the most direct and Pythonic way to perform element-wise multiplication between Series (DataFrame columns).

Multiplying Two Columns

Select the columns and use the * operator. Assign the result to a new column.

import pandas as pd

df = pd.DataFrame({
    'ProductName': ['Apple', 'Banana'],
    'Price_Per_Unit': [0.5, 0.25],
    'Quantity_Sold': [100, 150]
})

# ✅ Multiply 'Price_Per_Unit' by 'Quantity_Sold'
df['Total_Revenue'] = df['Price_Per_Unit'] * df['Quantity_Sold']

print("DataFrame with 'Total_Revenue' column:")
print(df)

Output:

DataFrame with 'Total_Revenue' column:
  ProductName  Price_Per_Unit  Quantity_Sold  Total_Revenue
0       Apple            0.50            100           50.0
1      Banana            0.25            150           37.5

Multiplying More Than Two Columns

You can chain multiplications.

import pandas as pd

df = pd.DataFrame({
    'ProductName': ['Apple', 'Banana'],
    'Price_Per_Unit': [0.5, 0.25],
    'Quantity_Sold': [100, 150],
    'Discount_Factor': [1.0, 0.9]
})

# ✅ Calculate Net Revenue = Price * Quantity * Discount_Factor
df['Net_Revenue'] = df['Price_Per_Unit'] * df['Quantity_Sold'] * df['Discount_Factor']

print("DataFrame with 'Net_Revenue' column:")
print(df)

Output:

DataFrame with 'Net_Revenue' column:
  ProductName  Price_Per_Unit  Quantity_Sold  Discount_Factor  Net_Revenue
0       Apple            0.50            100              1.0        50.00
1      Banana            0.25            150              0.9        33.75
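When the list of columns to multiply is long, or only known at runtime, chaining * becomes awkward. A minimal sketch of one alternative, assuming every selected column is numeric: select the columns as a sub-DataFrame and take the row-wise product with DataFrame.prod(axis=1). The name factor_cols is illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'ProductName': ['Apple', 'Banana'],
    'Price_Per_Unit': [0.5, 0.25],
    'Quantity_Sold': [100, 150],
    'Discount_Factor': [1.0, 0.9]
})

# Hypothetical list of the columns to multiply together
factor_cols = ['Price_Per_Unit', 'Quantity_Sold', 'Discount_Factor']

# Row-wise product across the selected columns
df['Net_Revenue'] = df[factor_cols].prod(axis=1)

print(df)
```

One difference from chaining *: prod() skips NaN by default (skipna=True), so a missing factor is effectively treated as 1. Pass skipna=False if a missing value should make the whole product NaN.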

Ensuring Numeric Data Types (astype())

Multiplication requires numeric data types (e.g., int, float). If your columns are stored as strings (object dtype), you must convert them to a numeric type first using astype().

import pandas as pd

data_str = {
    'Price_Str': ['10.5', '20.0', '5.75'],
    'Quantity_Str': ['5', '3', '10'],
    'Item': ['A', 'B', 'C']
}
df_str = pd.DataFrame(data_str)
print("DataFrame with string numeric values:")
print(df_str.dtypes) # Price_Str and Quantity_Str will be 'object'
print()

# Convert to numeric before multiplying
price_numeric = df_str['Price_Str'].astype(float)
quantity_numeric = df_str['Quantity_Str'].astype(int) # Or float

df_str['Total_Value'] = price_numeric * quantity_numeric

print("DataFrame after converting to numeric and multiplying:")
print(df_str)

Output:

DataFrame with string numeric values:
Price_Str       object
Quantity_Str    object
Item            object
dtype: object

DataFrame after converting to numeric and multiplying:
  Price_Str Quantity_Str Item  Total_Value
0      10.5            5    A         52.5
1      20.0            3    B         60.0
2      5.75           10    C         57.5
Note: Multiplying two string columns directly raises a TypeError, and multiplying a string column by an integer column typically repeats each string instead of computing a numeric product.
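astype() raises an error if any entry cannot be parsed as a number. A minimal sketch of one way to handle that, assuming unparseable entries should become NaN: pd.to_numeric() with errors='coerce'. The sample data, including the 'N/A' entry, is invented for illustration.

```python
import pandas as pd

df_messy = pd.DataFrame({
    'Price_Str': ['10.5', 'N/A', '5.75'],  # 'N/A' cannot be parsed as a number
    'Quantity_Str': ['5', '3', '10']
})

# errors='coerce' turns unparseable entries into NaN instead of raising
price = pd.to_numeric(df_messy['Price_Str'], errors='coerce')
quantity = pd.to_numeric(df_messy['Quantity_Str'], errors='coerce')

# The row with the unparseable price ends up as NaN in the product
df_messy['Total_Value'] = price * quantity

print(df_messy)
```

Coercing hides bad input rather than reporting it, so it is worth checking how many NaN values were introduced (for example with price.isna().sum()) before relying on the result.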

Method 2: Using Series.mul() Method

Each Pandas Series (a DataFrame column) has a .mul() method for element-wise multiplication. This is equivalent to the * operator for basic multiplication but offers a fill_value parameter for handling missing data.

Basic Usage

import pandas as pd

df = pd.DataFrame({
    'ProductName': ['Apple', 'Banana'],
    'Price_Per_Unit': [0.5, 0.25],
    'Quantity_Sold': [100, 150]
})

# ✅ Using .mul()
df['Total_Revenue_mul'] = df['Price_Per_Unit'].mul(df['Quantity_Sold'])

print("DataFrame with 'Total_Revenue_mul' (using .mul()):")
print(df)

Output:

DataFrame with 'Total_Revenue_mul' (using .mul()):
  ProductName  Price_Per_Unit  Quantity_Sold  Total_Revenue_mul
0       Apple            0.50            100               50.0
1      Banana            0.25            150               37.5
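The same method also exists on DataFrames. A short sketch, assuming you want to multiply several columns by one column in a single call: DataFrame.mul() with axis=0 matches the Series against the row index. The Shipping_Per_Unit column and the output column names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'Price_Per_Unit': [0.5, 0.25],
    'Shipping_Per_Unit': [0.1, 0.05],  # hypothetical extra column
    'Quantity_Sold': [100, 150]
})

per_unit_cols = ['Price_Per_Unit', 'Shipping_Per_Unit']

# axis=0 aligns the Series with the row index, so each selected column
# is multiplied row by row with 'Quantity_Sold'
totals = df[per_unit_cols].mul(df['Quantity_Sold'], axis=0)
totals.columns = ['Total_Price', 'Total_Shipping']

print(totals)
```

Without axis=0, DataFrame.mul() tries to match the Series index against the column labels, which is usually not what you want when multiplying by a column.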

Handling Missing Values (fill_value)

If either Series has a NaN in a given row, the result of * (or of mul() without fill_value) is NaN for that row. The fill_value parameter of mul() substitutes a specific value (e.g., 0 or 1) for a missing entry during that operation only; the original columns are not modified. If both values in a row are missing, the result is still NaN.

import pandas as pd
import numpy as np

data_nan = {
    'Price': [10, 20, np.nan, 40],
    'Quantity': [2, np.nan, 5, 3]
}
df_nan = pd.DataFrame(data_nan)
print("DataFrame with NaNs:")
print(df_nan)
print()

# Multiplication with * (NaN propagates)
df_nan['Total_default'] = df_nan['Price'] * df_nan['Quantity']
print("Total (default, NaN propagates):")
print(df_nan['Total_default'])
print()

# ✅ Using .mul() with fill_value
# If only Price is NaN in a row, it is treated as 0 for this multiplication.
# If only Quantity is NaN, it is likewise treated as 0.
df_nan['Total_fill_0'] = df_nan['Price'].mul(df_nan['Quantity'], fill_value=0)

print("Total (using .mul() with fill_value=0):")
print(df_nan)

Output:

DataFrame with NaNs:
   Price  Quantity
0   10.0       2.0
1   20.0       NaN
2    NaN       5.0
3   40.0       3.0

Total (default, NaN propagates):
0     20.0
1      NaN
2      NaN
3    120.0
Name: Total_default, dtype: float64

Total (using .mul() with fill_value=0):
   Price  Quantity  Total_default  Total_fill_0
0   10.0       2.0           20.0          20.0
1   20.0       NaN            NaN           0.0
2    NaN       5.0            NaN           0.0
3   40.0       3.0          120.0         120.0

Choose fill_value carefully based on how you want missing data to affect the product (e.g., fill_value=1 if missing means "no change to the other factor").
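To illustrate that choice, here is a small sketch using fill_value=1 on two standalone Series (the data is invented for illustration). The last row is missing on both sides; fill_value is not applied there, and the result stays NaN.

```python
import numpy as np
import pandas as pd

price = pd.Series([10, 20, np.nan, np.nan])
quantity = pd.Series([2, np.nan, 5, np.nan])

# fill_value=1: a value missing on one side leaves the other factor unchanged
total = price.mul(quantity, fill_value=1)

print(total)
# 0    20.0
# 1    20.0
# 2     5.0
# 3     NaN
# dtype: float64
```

Compare this with fill_value=0 above, where a single missing factor forces the product to 0 instead of passing the other factor through.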

Conditional Multiplication of Columns

Sometimes you only want to multiply columns if a certain condition is met for that row.

Using numpy.where() or Series.where()

np.where(condition, value_if_true, value_if_false) evaluates the condition for every row and picks the matching value, all in a single vectorized call.

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'ProductName': ['Apple', 'Banana', 'Orange'],
    'Price_Per_Unit': [0.5, 0.25, 0.4],
    'Quantity_Sold': [100, 150, 200],
    'On_Sale': [False, True, False]
})

# Multiply Price by Quantity only if On_Sale is True; otherwise use 0.
df['Conditional_Total'] = np.where(
    df['On_Sale'],                               # Condition (already boolean)
    df['Price_Per_Unit'] * df['Quantity_Sold'],  # Value if True
    0                                            # Value if False (e.g., no sale value)
)
# Or, to keep the original price when not on sale (less common for a total):
# df['Price_Adjusted'] = np.where(df['On_Sale'], df['Price_Per_Unit'] * 0.9, df['Price_Per_Unit'])

print("DataFrame with conditional multiplication using np.where():")
print(df)

Output:

DataFrame with conditional multiplication using np.where():
  ProductName  Price_Per_Unit  Quantity_Sold  On_Sale  Conditional_Total
0       Apple            0.50            100    False                0.0
1      Banana            0.25            150     True               37.5
2      Orange            0.40            200    False                0.0

Series.where(condition, other) also works: df['Conditional_Total'] = (df['Price_Per_Unit'] * df['Quantity_Sold']).where(df['On_Sale'], other=0). It keeps the computed values where the condition is True and replaces them with other where it is False.
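A runnable sketch of the Series.where() variant, applied to the same sample columns as the np.where() example. The column name Conditional_Total_where is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'ProductName': ['Apple', 'Banana', 'Orange'],
    'Price_Per_Unit': [0.5, 0.25, 0.4],
    'Quantity_Sold': [100, 150, 200],
    'On_Sale': [False, True, False]
})

# Compute the product for every row, then keep it only where On_Sale is True
product = df['Price_Per_Unit'] * df['Quantity_Sold']
df['Conditional_Total_where'] = product.where(df['On_Sale'], other=0)

print(df)
```

Unlike np.where(), which returns a NumPy array, Series.where() returns a Series that keeps the original index.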

Using DataFrame.apply() with a Custom Function

For more complex row-wise conditional logic, apply(axis=1) can be used.

import pandas as pd

df = pd.DataFrame({
    'ProductName': ['Apple', 'Banana', 'Orange'],
    'Price_Per_Unit': [0.5, 0.25, 0.4],
    'Quantity_Sold': [100, 150, 200],
    'On_Sale': [False, True, False]
})

def calculate_total_apply(row):
    # Return the row's total only for on-sale items
    if row['On_Sale']:
        return row['Price_Per_Unit'] * row['Quantity_Sold']
    else:
        return 0  # Or some other default for non-sale items

df['Conditional_Total_apply'] = df.apply(calculate_total_apply, axis=1)

print("DataFrame with conditional multiplication using apply():")
print(df)

Output:

DataFrame with conditional multiplication using apply():
  ProductName  Price_Per_Unit  Quantity_Sold  On_Sale  Conditional_Total_apply
0       Apple            0.50            100    False                      0.0
1      Banana            0.25            150     True                     37.5
2      Orange            0.40            200    False                      0.0

apply(axis=1) calls a Python function once per row, so it is generally slower than vectorized solutions such as np.where or direct arithmetic. Reserve it for logic that is hard to express in vectorized form.
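Logic with several branches can often stay vectorized as well. A minimal sketch using numpy.select(), which checks a list of conditions in order and uses the first match. The discount rules (10% off for on-sale items, 5% off for quantities of 200 or more) and the column names Factor and Tiered_Total are invented for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Price_Per_Unit': [0.5, 0.25, 0.4],
    'Quantity_Sold': [100, 150, 200],
    'On_Sale': [False, True, False]
})

# Conditions are checked in order; the first one that is True wins
conditions = [
    df['On_Sale'],               # hypothetical rule: 10% off when on sale
    df['Quantity_Sold'] >= 200,  # hypothetical rule: 5% off for bulk orders
]
factors = [0.9, 0.95]

# Rows matching no condition get the default factor of 1.0 (no discount)
df['Factor'] = np.select(conditions, factors, default=1.0)
df['Tiered_Total'] = df['Price_Per_Unit'] * df['Quantity_Sold'] * df['Factor']

print(df)
```

Expressing the branches as a per-row factor keeps the multiplication itself a plain column product, which is typically easier to read and faster than an equivalent apply(axis=1) function.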

Conclusion

Multiplying columns in a Pandas DataFrame is a fundamental element-wise operation:

  • The most direct method is using the standard multiplication operator (*) between selected columns: df['NewCol'] = df['ColA'] * df['ColB']. This also extends to more than two columns.
  • The Series.mul(other_series, fill_value=...) method provides an alternative, especially useful for its fill_value parameter to handle NaNs during the operation.
  • Ensure columns are of numeric type (int or float) before multiplication; use astype() if necessary.
  • For conditional multiplication, numpy.where() (or Series.where()) is efficient and recommended for clear conditions. DataFrame.apply(axis=1) offers more flexibility for complex row-wise logic.

These techniques allow for powerful and efficient calculations across your DataFrame columns.