Python Pandas: How to Do Element-wise Logical NOT (~) and OR (|) Operations
Performing element-wise logical operations on boolean Series (often derived from conditions on DataFrame columns) is fundamental for complex data filtering and manipulation in Pandas. The logical NOT operation inverts boolean values, while the logical OR operation combines conditions, selecting rows if at least one condition is true.
This guide explains how to use the tilde (~) operator for logical NOT and the pipe (|) operator for logical OR in Pandas, as well as their NumPy equivalents.
The Goal: Boolean Logic on Series/DataFrame
- Logical NOT: Given a boolean Series (e.g., df['Column'] > 10), we want to invert its values (True becomes False, False becomes True).
- Logical OR: Given two or more boolean Series (conditions), we want to create a new boolean Series that is True if at least one of the input conditions is True for a given row.
This is essential for building complex filters for DataFrame row selection.
Example DataFrame
import pandas as pd
import numpy as np # For NaN and numpy functions
data = {
'EmployeeID': ['E01', 'E02', 'E03', 'E04', 'E05'],
'Department': ['Sales', 'HR', 'Engineering', 'Sales', 'HR'],
'YearsExperience': [2, 5, 1, 7, 4],
'IsManager': [False, True, False, True, False],
'OnProject': [True, False, True, True, np.nan] # Contains NaN
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
Output:
Original DataFrame:
EmployeeID Department YearsExperience IsManager OnProject
0 E01 Sales 2 False True
1 E02 HR 5 True False
2 E03 Engineering 1 False True
3 E04 Sales 7 True True
4 E05 HR 4 False NaN
Element-wise Logical NOT
Using the Tilde Operator (~) (Recommended)
The tilde ~ is Python's unary bitwise NOT operator, and Pandas overloads it for element-wise logical NOT on boolean Series/DataFrames.
import pandas as pd
df_example = pd.DataFrame({
'IsManager': [False, True, False, True, False]
})
# Original boolean Series
is_manager_series = df_example['IsManager']
print("Original 'IsManager' Series:")
print(is_manager_series)
print()
# ✅ Apply logical NOT using ~
is_not_manager_series = ~is_manager_series
print("Inverted 'IsManager' Series (is_not_manager_series):")
print(is_not_manager_series)
Output:
Original 'IsManager' Series:
0 False
1 True
2 False
3 True
4 False
Name: IsManager, dtype: bool
Inverted 'IsManager' Series (is_not_manager_series):
0 True
1 False
2 True
3 False
4 True
Name: IsManager, dtype: bool
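In practice the inverted Series is usually used directly as a row filter. The following is a minimal sketch, assuming a small frame modeled on the example DataFrame in this guide; the names non_managers and not_experienced are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "EmployeeID": ["E01", "E02", "E03", "E04", "E05"],
    "YearsExperience": [2, 5, 1, 7, 4],
    "IsManager": [False, True, False, True, False],
})

# Keep only the rows where IsManager is False
non_managers = df[~df["IsManager"]]
print(non_managers["EmployeeID"].tolist())

# Invert a comparison: the parentheses make ~ apply to the whole condition
not_experienced = df[~(df["YearsExperience"] > 2)]
print(not_experienced["EmployeeID"].tolist())
```

The parentheses in the second filter matter because ~ binds more tightly than the comparison operator.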
Using the Unary Minus Operator (-) (Less Common for Booleans)
Pandas also allows the unary minus - to act as a logical NOT for boolean Series. This is less conventional for boolean logic than ~, and it is a Pandas-specific convenience: NumPy raises a TypeError when - is applied to a boolean array.
import pandas as pd
df_example = pd.DataFrame({
'IsManager': [False, True, False, True, False]
})
# Original boolean Series
is_manager_series = df_example['IsManager']
is_not_manager_minus = -is_manager_series
print("Inverted using unary minus (-):")
print(is_not_manager_minus) # Output is same as with ~
Output:
Inverted using unary minus (-):
0 True
1 False
2 True
3 False
4 True
Name: IsManager, dtype: bool
While it works, ~ is generally preferred for clarity when dealing with boolean inversion.
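One practical reason to prefer ~ is portability between Pandas and NumPy. The sketch below, using made-up data, shows that ~ inverts a boolean Series, while unary minus on a plain NumPy boolean array is rejected with a TypeError.

```python
import numpy as np
import pandas as pd

flags = pd.Series([False, True, False])

# ~ is the portable way to invert booleans
inverted_with_tilde = ~flags
print(inverted_with_tilde.tolist())

# NumPy refuses unary minus on boolean arrays
try:
    -np.array([False, True, False])
    numpy_minus_error = None
except TypeError as exc:
    numpy_minus_error = exc
    print(f"NumPy raised: {exc}")
```

Code that uses ~ keeps working if a Series is later replaced by a NumPy array, whereas code that uses - does not.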
Using numpy.invert()
NumPy's np.invert() function performs a bitwise NOT, which on boolean arrays/Series is equivalent to a logical NOT.
import pandas as pd
import numpy as np
df_example = pd.DataFrame({ 'IsManager': [False, True, False, True, False] })
is_manager_series = df_example['IsManager']
# ✅ Apply logical NOT using np.invert()
is_not_manager_np = np.invert(is_manager_series)
print("Inverted 'IsManager' Series (using np.invert()):")
print(is_not_manager_np)
Output:
Inverted 'IsManager' Series (using np.invert()):
0 True
1 False
2 True
3 False
4 True
Name: IsManager, dtype: bool
~ is more idiomatic Pandas than calling np.invert() directly for this.
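Handling Missing Values (NaN) Before NOT
A Series that mixes True/False with NaN is stored with the object dtype rather than bool. Applying ~ (or np.invert()) to such a Series does not give a clean boolean inversion: the operator is applied to each Python object in turn and typically raises a TypeError when it reaches the float NaN. Note also that NaN is truthy in Python (bool(np.nan) is True), so astype(bool) silently turns NaN into True. If missing values should stay missing after inversion, convert to the nullable "boolean" dtype first; if they should count as True or False, make that choice explicit with fillna() before calling astype(bool).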
import pandas as pd
import numpy as np
s_with_nan = pd.Series([True, False, np.nan, True])
print(f"Original Series with NaN:\n{s_with_nan}\n")
# Convert to Pandas' nullable "boolean" dtype so ~ propagates missing values
s_nullable_bool = s_with_nan.astype("boolean")
# ~ on the nullable Series: NaN becomes <NA> and stays <NA> after inversion
print(f"~s_with_nan (nullable boolean):\n{~s_nullable_bool}\n")
# Alternative: decide explicitly how missing values count before inverting.
# Here NaN is filled with True, so it becomes False after inversion.
s_astype_bool = s_with_nan.fillna(True).astype(bool)
print(f"s_with_nan.astype(bool) [NaN -> True]:\n{s_astype_bool}\n") # NaN becomes True
print(f"~s_with_nan.astype(bool):\n{~s_astype_bool}") # Then invert
# Use fillna(False) instead if missing values should count as True after inversion.
# To keep missing values missing, use the nullable "boolean" dtype shown above.
Output:
Original Series with NaN:
0 True
1 False
2 NaN
3 True
dtype: object
~s_with_nan (nullable boolean):
0 False
1 True
2 <NA>
3 False
dtype: boolean
s_with_nan.astype(bool) [NaN -> True]:
0 True
1 False
2 True
3 True
dtype: bool
~s_with_nan.astype(bool):
0 False
1 True
2 False
3 False
dtype: bool
When converting object columns to bool using astype(bool), each value is converted by its Python truthiness: np.nan becomes True, while None becomes False. This behavior is easy to overlook, so handle missing values explicitly before converting.
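The differences between these conversions are easy to check directly. The following is a small sketch with a made-up object-dtype Series; it assumes a reasonably recent Pandas version that provides the nullable "boolean" dtype.

```python
import numpy as np
import pandas as pd

# Object-dtype Series holding two kinds of missing value plus real booleans
s_object = pd.Series([np.nan, None, True, False], dtype=object)

# astype(bool) uses Python truthiness: np.nan -> True, None -> False
as_plain_bool = s_object.astype(bool)
print(as_plain_bool.tolist())

# ~ on the object-dtype Series is attempted element by element and fails on NaN
try:
    ~s_object
    invert_error = None
except TypeError as exc:
    invert_error = exc
print(type(invert_error).__name__)

# The nullable dtype maps both np.nan and None to <NA> and keeps them through ~
inverted_nullable = ~s_object.astype("boolean")
print(inverted_nullable)
```

Converting to "boolean" before inverting is the only one of these three routes that leaves missing entries marked as missing.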
Element-wise Logical OR
Using the Pipe Operator (|) (Recommended)
The pipe | is the bitwise OR operator in Python. Pandas overloads it for element-wise logical OR between boolean Series.
import pandas as pd
df_example = pd.DataFrame({
'Department': ['Sales', 'HR', 'Engineering', 'Sales'],
'YearsExperience': [2, 5, 1, 7],
'IsManager': [False, True, False, True]
})
# Condition 1: Department is 'HR'
cond1 = (df_example['Department'] == 'HR')
# Condition 2: IsManager is True
cond2 = (df_example['IsManager'] == True)
# ✅ Combine conditions with OR (|)
combined_or_mask = cond1 | cond2
print("Combined OR mask (HR or IsManager):")
print(combined_or_mask)
print()
print("Rows where Department is 'HR' OR IsManager is True:")
print(df_example[combined_or_mask])
Output:
Combined OR mask (HR or IsManager):
0 False
1 True
2 False
3 True
dtype: bool
Rows where Department is 'HR' OR IsManager is True:
Department YearsExperience IsManager
1 HR 5 True
3 Sales 7 True
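The example above uses conditions without missing values. As a hedged sketch of what happens otherwise: with the nullable "boolean" dtype, | follows three-valued (Kleene) logic, so True | <NA> is True while False | <NA> stays <NA>. The two Series below are made up for illustration.

```python
import pandas as pd

on_project = pd.Series([True, False, pd.NA, pd.NA], dtype="boolean")
is_manager = pd.Series([False, False, True, False], dtype="boolean")

# Element-wise OR with Kleene logic for missing values
combined = on_project | is_manager
print(combined)

# Decide explicitly how unknown results should count before filtering
mask = combined.fillna(False)
print(mask.tolist())
```

Filling the remaining <NA> values before using the mask makes the treatment of unknown rows an explicit choice rather than an implicit one.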
Using numpy.logical_or()
NumPy's np.logical_or(condition1, condition2) performs an element-wise logical OR.
import pandas as pd
import numpy as np
df_example = pd.DataFrame({
'Department': ['Sales', 'HR'], 'IsManager': [False, True]
})
cond1 = (df_example['Department'] == 'HR')
cond2 = (df_example['IsManager'] == True)
combined_or_np = np.logical_or(cond1, cond2)
print("Combined OR mask (using np.logical_or()):")
print(combined_or_np) # A pandas Series here, because the inputs are Series
print()
print("Rows selected using np.logical_or():")
print(df_example[combined_or_np])
Output:
Combined OR mask (using np.logical_or()):
0 False
1 True
dtype: bool
Rows selected using np.logical_or():
Department IsManager
1 HR True
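The return type of np.logical_or() depends on its inputs. This short check, using made-up conditions, shows that Series inputs give a Series back while plain NumPy arrays give a NumPy array.

```python
import numpy as np
import pandas as pd

cond1 = pd.Series([False, True, False])
cond2 = pd.Series([False, False, True])

# Series in, Series out (the index is preserved)
from_series = np.logical_or(cond1, cond2)
print(type(from_series).__name__)

# Arrays in, array out
from_arrays = np.logical_or(cond1.to_numpy(), cond2.to_numpy())
print(type(from_arrays).__name__)
```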
Using numpy.logical_or.reduce() for Multiple Conditions
If you have a list of more than two boolean Series (conditions) and want to OR them all together, np.logical_or.reduce() is useful. Note that it returns a NumPy array rather than a Series.
import pandas as pd
import numpy as np
df_example = pd.DataFrame({
'Department': ['Sales', 'HR', 'Dev', 'Sales'],
'YearsExperience': [2, 5, 1, 7],
'IsManager': [False, True, False, False]
})
cond_dept_hr = (df_example['Department'] == 'HR')
cond_is_manager = (df_example['IsManager'] == True)
cond_exp_gt_5 = (df_example['YearsExperience'] > 5)
list_of_conditions = [cond_dept_hr, cond_is_manager, cond_exp_gt_5]
# ✅ Apply logical OR across all conditions in the list
combined_or_reduce = np.logical_or.reduce(list_of_conditions)
print("Combined OR mask (using np.logical_or.reduce()):")
print(combined_or_reduce)
print()
print("Rows selected using np.logical_or.reduce():")
print(df_example[combined_or_reduce])
Output:
Combined OR mask (using np.logical_or.reduce()):
[False True False True]
Rows selected using np.logical_or.reduce():
Department YearsExperience IsManager
1 HR 5 True
3 Sales 7 False
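If a Series result with the original index is preferred over a NumPy array, one alternative is to fold the list with the | operator itself. This is a sketch using Python's functools.reduce and operator.or_ on the same sample data; it is not part of the approach shown above.

```python
from functools import reduce
import operator

import pandas as pd

df_example = pd.DataFrame({
    "Department": ["Sales", "HR", "Dev", "Sales"],
    "YearsExperience": [2, 5, 1, 7],
    "IsManager": [False, True, False, False],
})

list_of_conditions = [
    df_example["Department"] == "HR",
    df_example["IsManager"],
    df_example["YearsExperience"] > 5,
]

# Equivalent to cond_a | cond_b | cond_c, applied pairwise from left to right
combined_series = reduce(operator.or_, list_of_conditions)
print(combined_series)
print(df_example[combined_series])
```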
Operator Precedence and Parentheses
When combining multiple logical conditions (especially with & AND and | OR), it's crucial to use parentheses () to ensure the correct order of operations, because of Python's operator precedence.
- ~ (NOT) has higher precedence than & and |.
- & (AND) has higher precedence than | (OR).
- Both & and | have higher precedence than comparison operators such as >, <, and ==, so an unparenthesized comparison is not evaluated first.
Always wrap individual conditions in parentheses when combining them:
# Correct: each comparison is parenthesized; & is evaluated before |
mask = (df['colA'] > 10) & (df['colB'] == 'X') | (df['colC'] < 0)
# Incorrect: & and | bind tighter than the comparisons, so this usually raises an error
mask = df['colA'] > 10 & df['colB'] == 'X' | df['colC'] < 0
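The effect of precedence can be seen on a small frame. The sketch below uses made-up numeric columns (colA and colC only) and hypothetical variable names; it shows the unparenthesized form failing and how ~ applies to only the first operand unless parentheses are added.

```python
import pandas as pd

df = pd.DataFrame({"colA": [5, 20, 30], "colC": [1, -1, 2]})

# Parenthesized comparisons are evaluated first, then combined with |
correct = (df["colA"] > 10) | (df["colC"] < 0)
print(correct.tolist())

# Without parentheses, 10 | df["colC"] is evaluated first and the rest
# becomes a chained comparison, which Pandas cannot reduce to one bool
try:
    broken = df["colA"] > 10 | df["colC"] < 0
    precedence_error = None
except (TypeError, ValueError) as exc:
    precedence_error = exc
    print(f"Raised {type(exc).__name__}")

# ~ binds tighter than |, so these two masks differ
a = pd.Series([True, False, False])
b = pd.Series([True, True, False])
not_a_or_b = ~a | b        # (~a) | b
not_either = ~(a | b)      # NOT applied to the combined condition
print(not_a_or_b.tolist(), not_either.tolist())
```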
Conclusion
Pandas leverages Python's bitwise operators for element-wise logical operations on boolean Series:
- For Logical NOT (inversion): Use the tilde operator (~) directly on a boolean Series: ~your_boolean_series.
- For Logical OR (select if any condition is true): Use the pipe operator (|) between boolean Series: (condition1) | (condition2). Remember to use parentheses around individual conditions.
- NumPy functions like np.invert(), np.logical_or(), and np.logical_or.reduce() provide alternatives, though the Pandas operators (~, |) are often more concise and idiomatic for DataFrame operations.
- Be mindful of how NaN values are treated, especially with the NOT operator, and consider converting to the nullable "boolean" dtype, or using fillna() before astype(bool), if specific handling of missing boolean values is required before inversion.
These logical operators are fundamental for constructing complex boolean masks used in filtering and selecting data within your Pandas DataFrames.