Python Pandas: Select DataFrame Rows Between Two Values (Numeric Range)

Filtering a Pandas DataFrame to select rows where values in a specific column fall within a certain numerical range (between two values) is a fundamental data selection task. This is crucial for tasks like isolating data points within specific thresholds, analyzing segments, or data cleaning.

This guide demonstrates several common and effective methods to select DataFrame rows based on a numerical range, primarily using Series.between(), boolean indexing with AND (&), and DataFrame.query().

The Goal: Filtering by a Numerical Range

Given a Pandas DataFrame and a specific column containing numerical data, we want to select only those rows where the value in that column lies between a lower bound and an upper bound. For example, selecting all employees with a salary between $50,000 and $70,000.

Example DataFrame

import pandas as pd

data = {
    'EmployeeID': [101, 102, 103, 104, 105, 106],
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank'],
    'Department': ['HR', 'Engineering', 'Sales', 'HR', 'Engineering', 'Sales'],
    'Salary': [55000, 85000, 70000, 48000, 90000, 65000],
    'YearsExperience': [3, 7, 5, 2, 8, 4]
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

Output:

   EmployeeID     Name   Department  Salary  YearsExperience
       101    Alice           HR   55000                3
       102      Bob  Engineering   85000                7
       103  Charlie        Sales   70000                5
       104    David           HR   48000                2
       105      Eve  Engineering   90000                8
       106    Frank        Sales   65000                4

We will filter based on the 'Salary' column.

Method 1: Using `Series.between()` (Recommended)

The Series.between(left, right, inclusive="both") method is specifically designed for this task. It returns a boolean Series indicating whether each element of the calling Series lies between left and right.

Basic Usage (Inclusive Boundaries)

By default, between() includes both the left and right boundary values in the range.

import pandas as pd

df_example = pd.DataFrame({
    'EmployeeID': [101, 102, 103, 104, 105, 106],
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank'],
    'Salary': [55000, 85000, 70000, 48000, 90000, 65000]
})

lower_bound = 50000
upper_bound = 70000

# Create a boolean mask using .between()
salary_in_range_mask = df_example['Salary'].between(lower_bound, upper_bound)
print("Boolean mask from .between():")
print(salary_in_range_mask)
print()

# ✅ Use the mask to filter the DataFrame
df_salary_between = df_example[salary_in_range_mask]

print(f"Rows where Salary is between {lower_bound} and {upper_bound} (inclusive):")
print(df_salary_between)

Output:

Boolean mask from .between():
0     True
1    False
2     True
3    False
4    False
5     True
Name: Salary, dtype: bool

Rows where Salary is between 50000 and 70000 (inclusive):
   EmployeeID     Name  Salary
0         101    Alice   55000
2         103  Charlie   70000
5         106    Frank   65000

Controlling Inclusivity (`inclusive` parameter)

The inclusive parameter can take values:

"both" (default): left <= series <= right
"neither": left < series < right
"left": left <= series < right
"right": left < series <= right

import pandas as pd

df_example = pd.DataFrame({
    'EmployeeID': [101, 102, 103, 104, 105, 106],
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank'],
    'Salary': [55000, 85000, 70000, 48000, 90000, 65000]
})


lower_bound = 50000
upper_bound = 70000

# Example: Salary strictly between 50000 and 70000 (exclusive)
salary_strictly_between_mask = df_example['Salary'].between(lower_bound, upper_bound, inclusive="neither")
df_salary_strictly_between = df_example[salary_strictly_between_mask]

print(f"Rows where Salary is strictly between {lower_bound} and {upper_bound}:")
print(df_salary_strictly_between)

Output (Employee 101 and 103 are excluded as their salaries are on the boundaries):

Rows where Salary is strictly between 50000 and 70000:
   EmployeeID   Name  Salary
0         101  Alice   55000
5         106  Frank   65000

Method 2: Using Boolean Indexing with Logical AND (`&`)

You can construct the range check manually using two separate comparisons combined with the logical AND operator (&). Remember to enclose each individual comparison in parentheses due to operator precedence.

import pandas as pd

df_example = pd.DataFrame({
    'EmployeeID': [101, 102, 103, 104, 105, 106],
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank'],
    'Salary': [55000, 85000, 70000, 48000, 90000, 65000]
})

lower_bound = 50000
upper_bound = 70000

# ✅ Define two conditions and combine with &
condition_ge_lower = (df_example['Salary'] >= lower_bound)
condition_le_upper = (df_example['Salary'] <= upper_bound)

df_salary_between_and = df_example[condition_ge_lower & condition_le_upper]

print(f"Rows where Salary is between {lower_bound} and {upper_bound} (using AND):")
print(df_salary_between_and)

Output: (Same as .between() with inclusive="both")

Rows where Salary is between 50000 and 70000 (using AND):
   EmployeeID     Name  Salary
0         101    Alice   55000
2         103  Charlie   70000
5         106    Frank   65000

note

This method is very explicit and flexible if you need to adjust the inclusivity (e.g., using > instead of >=).

Method 3: Using `DataFrame.query()`

The DataFrame.query() method allows you to filter rows using a string expression. You can express "between" logic directly in the query string.

import pandas as pd

df_example = pd.DataFrame({
    'EmployeeID': [101, 102, 103, 104, 105, 106],
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank'],
    'Salary': [55000, 85000, 70000, 48000, 90000, 65000]
})


lower_bound_var = 50000 # Using variables in query with @
upper_bound_var = 70000

# ✅ Query using chained comparison (Python-like syntax)
query_string = f"@lower_bound_var <= Salary <= @upper_bound_var"
# Or, using 'and': "Salary >= @lower_bound_var and Salary <= @upper_bound_var"
df_salary_between_query = df_example.query(query_string)

print(f"Rows where Salary is between {lower_bound_var} and {upper_bound_var} (using query()):")
print(df_salary_between_query)

Output: (Same as .between() with inclusive="both")

Rows where Salary is between 50000 and 70000 (using query()):
   EmployeeID     Name  Salary
0         101    Alice   55000
2         103  Charlie   70000
5         106    Frank   65000

note

query() can be very readable for complex conditions.
You can reference local variables within the query string by prefixing them with @.
If column names have spaces, wrap them in backticks: `Column Name With Spaces`.

Selecting Rows NOT Between Two Values (Negation)

To get rows where the column value is outside the specified range, you can negate the boolean mask created by Series.between() or your AND condition using the tilde (~) operator.

import pandas as pd

df_example = pd.DataFrame({
    'EmployeeID': [101, 102, 103, 104, 105, 106],
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank'],
    'Salary': [55000, 85000, 70000, 48000, 90000, 65000]
})

lower_bound = 50000
upper_bound = 70000

salary_in_range_mask = df_example['Salary'].between(lower_bound, upper_bound)

# ✅ Negate the mask using ~
df_salary_not_between = df_example[~salary_in_range_mask]

print(f"Rows where Salary is NOT between {lower_bound} and {upper_bound}:")
print(df_salary_not_between)
print()

# Using boolean logic for "not between" (salary < lower OR salary > upper)
df_salary_not_between_alt = df_example[(df_example['Salary'] < lower_bound) | (df_example['Salary'] > upper_bound)]
print("Rows where Salary is NOT between (using OR logic):")
print(df_salary_not_between_alt)

Output:

Rows where Salary is NOT between 50000 and 70000:
   EmployeeID   Name  Salary
1         102    Bob   85000
3         104  David   48000
4         105    Eve   90000

Rows where Salary is NOT between (using OR logic):
   EmployeeID   Name  Salary
1         102    Bob   85000
3         104  David   48000
4         105    Eve   90000

Choosing the Right Method

Series.between(left, right, inclusive=...): Highly recommended for its clarity, conciseness, and explicit control over boundary inclusivity. It's specifically designed for this task.
Boolean Indexing with &: Very flexible and explicit. Good if between() isn't available (older Pandas) or if your conditions are more complex than a simple range.
DataFrame.query(): Can be very readable, especially with chained comparisons. Useful if you prefer string-based querying or need to reference local variables with @.

All three methods are effective. Series.between() is often the most direct and idiomatic choice for range filtering.

Conclusion

Selecting DataFrame rows where a column's values fall between two numerical limits is a fundamental Pandas operation.

The Series.between() method is the most specialized and often the cleanest way to define such a range, with options to control boundary inclusion.
Traditional boolean indexing with the & (AND) operator (df[(df['col'] >= L) & (df['col'] <= R)]) provides full control and explicitness.
DataFrame.query() offers an alternative string-based syntax (df.query('L <= col <= R')) that many find readable.
To select rows outside a range, negate the boolean condition using the tilde (~) operator.

By applying these techniques, you can effectively filter your Pandas DataFrames to isolate data within specific numerical ranges for focused analysis.

The Goal: Filtering by a Numerical Range​

Example DataFrame​

Method 1: Using Series.between() (Recommended)​

Basic Usage (Inclusive Boundaries)​

Controlling Inclusivity (inclusive parameter)​

Method 2: Using Boolean Indexing with Logical AND (&)​

Method 3: Using DataFrame.query()​

Selecting Rows NOT Between Two Values (Negation)​

Choosing the Right Method​

Conclusion​

Table of Contents

The Goal: Filtering by a Numerical Range

Example DataFrame

Method 1: Using `Series.between()` (Recommended)

Basic Usage (Inclusive Boundaries)

Controlling Inclusivity (`inclusive` parameter)

Method 2: Using Boolean Indexing with Logical AND (`&`)

Method 3: Using `DataFrame.query()`

Selecting Rows NOT Between Two Values (Negation)

Choosing the Right Method

Conclusion