Python Pandas: How to Select DataFrame Rows Based on a List of Indices

Selecting specific rows from a Pandas DataFrame based on their numerical positions (indices) is a fundamental operation in data manipulation. You might have a list of row indices that you want to extract from a larger DataFrame for further analysis or processing.

This guide demonstrates several effective methods to select DataFrame rows using a list of integer indices, including iloc, index.isin(), loc, take, and query().

The Task: Selecting Rows by Positional Indices

Given a Pandas DataFrame and a list of integers representing the desired row positions (e.g., [0, 2, 5]), we want to create a new DataFrame containing only the rows at these specified positions. It's important to distinguish this from selecting by label-based indices, which might not be integers or might not be sequential. Here, we focus on zero-based integer positions.
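The difference between positions and labels is easiest to see on an index that is not the default 0, 1, 2, ... sequence. The following is a minimal sketch; the frame df_demo and its shuffled integer labels are made up for illustration and are not part of the example data used in the rest of this guide.

```python
import pandas as pd

# Hypothetical frame whose integer labels are deliberately out of order
df_demo = pd.DataFrame(
    {"value": ["a", "b", "c", "d"]},
    index=[10, 0, 2, 1],
)

by_position = df_demo.iloc[[0, 2]]  # rows at positions 0 and 2
by_label = df_demo.loc[[0, 2]]      # rows whose index labels are 0 and 2

print(by_position["value"].tolist())  # ['a', 'c']
print(by_label["value"].tolist())     # ['b', 'c']
```

The same list of integers returns different rows depending on whether it is read as positions or as labels, which is why the methods below are described as either position-based or label-based.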

Example DataFrame:

import pandas as pd

data = {
    'product_id': ['P101', 'P102', 'P103', 'P104', 'P105'],
    'category': ['Electronics', 'Books', 'Home', 'Electronics', 'Apparel'],
    'price': [299.99, 19.95, 45.50, 799.00, 32.00],
    'in_stock': [True, True, False, True, True]
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

Output:

Original DataFrame:
  product_id     category   price  in_stock
0       P101  Electronics  299.99      True
1       P102        Books   19.95      True
2       P103         Home   45.50     False
3       P104  Electronics  799.00      True
4       P105      Apparel   32.00      True

Method 1: Using `DataFrame.iloc` (Position-Based Indexer - Recommended)

The DataFrame.iloc indexer is designed specifically for integer-position based selection. It's the most direct and idiomatic way to select rows (and/or columns) by their integer positions.

import pandas as pd

df = pd.DataFrame({
    'product_id': ['P101', 'P102', 'P103', 'P104', 'P105'],
    'category': ['Electronics', 'Books', 'Home', 'Electronics', 'Apparel'],
    'price': [299.99, 19.95, 45.50, 799.00, 32.00],
    'in_stock': [True, True, False, True, True]
})


list_of_indices_to_select = [0, 3, 4] # Select first, fourth, and fifth rows

# ✅ Select rows using .iloc with the list of indices
selected_rows_iloc = df.iloc[list_of_indices_to_select]

print("Selected rows using .iloc:")
print(selected_rows_iloc)

Output:

Selected rows using .iloc:
  product_id     category   price  in_stock
0       P101  Electronics  299.99      True
3       P104  Electronics  799.00      True
4       P105      Apparel   32.00      True

df.iloc[list_of_indices_to_select]: This directly selects the rows at the integer positions specified in the list.
df.iloc[list_of_indices_to_select, :]: The colon : selects all columns for the chosen rows. It is implied when only row positions are given, so both forms return the same result.
iloc raises an IndexError if any position in the list is out of bounds. Negative positions are allowed and count from the end of the DataFrame.
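If the list of positions comes from an untrusted source, one option is to filter it before calling iloc. The sketch below is only one way to do this; the names df_small, requested, and valid are hypothetical.

```python
import pandas as pd

df_small = pd.DataFrame({"price": [299.99, 19.95, 45.50, 799.00, 32.00]})

requested = [0, 3, 7]  # 7 is out of bounds for a 5-row frame

# Passing the list unchanged raises IndexError
try:
    df_small.iloc[requested]
    raised = False
except IndexError:
    raised = True

# Keep only positions that are valid for this frame
# (negative positions down to -n are valid and count from the end)
n = len(df_small)
valid = [i for i in requested if -n <= i < n]
safe_rows = df_small.iloc[valid]

print(raised)                    # True
print(safe_rows.index.tolist())  # [0, 3]
```

Silently dropping invalid positions can hide upstream bugs, so in some pipelines letting the IndexError propagate is the better choice.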

note

To select all columns for the specified rows, you can also use:

selected_rows_iloc_all_cols = df.iloc[list_of_indices_to_select, :]
print("Selected rows using .iloc (all columns specified):")
print(selected_rows_iloc_all_cols)

Method 2: Using `DataFrame.index.isin()`

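This method checks whether each of the DataFrame's index labels is present in your list. It is label-based, not position-based: the labels coincide with positions only when the DataFrame has the default integer index (0, 1, 2, ...). It also works with other index types if your list contains labels from that index. Because the result is a boolean mask, the selected rows keep the DataFrame's original order, and repeated values in the list do not repeat rows.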

import pandas as pd

df = pd.DataFrame({
    'product_id': ['P101', 'P102', 'P103', 'P104', 'P105'],
    'category': ['Electronics', 'Books', 'Home', 'Electronics', 'Apparel'],
    'price': [299.99, 19.95, 45.50, 799.00, 32.00],
    'in_stock': [True, True, False, True, True]
})

list_of_indices_to_select = [0, 3, 4]

# Create a boolean NumPy array: True where the index label is in the list
boolean_mask = df.index.isin(list_of_indices_to_select)
print(f"Boolean mask from .isin():\n{boolean_mask}")

# ✅ Use the boolean mask to select rows
selected_rows_isin = df[boolean_mask]

print("Selected rows using .index.isin():")
print(selected_rows_isin)

Output:

Boolean mask from .isin():
[ True False False  True  True]
Selected rows using .index.isin():
  product_id     category   price  in_stock
0       P101  Electronics  299.99      True
3       P104  Electronics  799.00      True
4       P105      Apparel   32.00      True

df.index.isin(list_of_indices_to_select): Returns a boolean NumPy array (not a Series) indicating, for each row, whether its index label is in list_of_indices_to_select.
df[boolean_mask]: Standard boolean indexing to select rows where the mask is True.
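Because a boolean mask only marks rows as kept or dropped, it cannot reorder or repeat them. The short comparison below illustrates this; df_demo and wanted are hypothetical names, and the frame is a cut-down version of the example data.

```python
import pandas as pd

df_demo = pd.DataFrame({"product_id": ["P101", "P102", "P103", "P104", "P105"]})

wanted = [4, 0, 0]  # out of order, with a repeated entry

via_iloc = df_demo.iloc[wanted]                  # follows the list exactly
via_isin = df_demo[df_demo.index.isin(wanted)]   # follows the frame's row order

print(via_iloc["product_id"].tolist())  # ['P105', 'P101', 'P101']
print(via_isin["product_id"].tolist())  # ['P101', 'P105']
```

If the order or multiplicity of the list matters, iloc or take is the safer choice.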

Method 3: Using `DataFrame.loc` with Index Slicing

While df.loc is primarily for label-based indexing, if your DataFrame's index consists of the integer labels you wish to select (e.g., a default RangeIndex), you can use it. However, to select by a list of positions using loc when the index might be something else (e.g., strings), you first get the actual index labels corresponding to those positions.

import pandas as pd

df = pd.DataFrame({
    'product_id': ['P101', 'P102', 'P103', 'P104', 'P105'],
    'category': ['Electronics', 'Books', 'Home', 'Electronics', 'Apparel'],
    'price': [299.99, 19.95, 45.50, 799.00, 32.00],
    'in_stock': [True, True, False, True, True]
})

list_of_positions = [0, 3, 4]

# Get the actual index LABELS at these integer positions
index_labels_at_positions = df.index[list_of_positions]
print(f"Index labels at positions {list_of_positions}: {index_labels_at_positions}")

# ✅ Use these labels with .loc
selected_rows_loc = df.loc[index_labels_at_positions]

print("Selected rows using .loc with derived index labels:")
print(selected_rows_loc)

Output (pandas 2.x; older versions print Int64Index instead of Index):

Index labels at positions [0, 3, 4]: Index([0, 3, 4], dtype='int64')
Selected rows using .loc with derived index labels:
  product_id     category   price  in_stock
0       P101  Electronics  299.99      True
3       P104  Electronics  799.00      True
4       P105      Apparel   32.00      True

note

This is slightly more indirect than iloc for purely positional selection but is useful if you have positions and need to use .loc for some reason (e.g., when also selecting columns by label). If the DataFrame index is already just [0, 1, 2, ...], then df.loc[list_of_indices] works directly.
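The case where this pattern pays off is a non-integer index combined with label-based column selection. The following sketch assumes a hypothetical variant of the example data, df_labeled, that uses the product IDs as its index.

```python
import pandas as pd

# Hypothetical variant of the example data with product IDs as the index
df_labeled = pd.DataFrame(
    {
        "category": ["Electronics", "Books", "Home", "Electronics", "Apparel"],
        "price": [299.99, 19.95, 45.50, 799.00, 32.00],
    },
    index=["P101", "P102", "P103", "P104", "P105"],
)

positions = [0, 3, 4]

# Translate positions into the labels found at those positions
labels = df_labeled.index[positions]

# Rows chosen by (derived) label, column chosen by label
subset = df_labeled.loc[labels, ["price"]]
print(subset)
```

Note that if the index contains duplicate labels, .loc returns every row carrying a matching label, so the result can contain more rows than there are positions in the list.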

Method 4: Using `DataFrame.take()`

The DataFrame.take() method is specifically designed to select elements along an axis using their integer positions.

import pandas as pd

df = pd.DataFrame({
    'product_id': ['P101', 'P102', 'P103', 'P104', 'P105'],
    'category': ['Electronics', 'Books', 'Home', 'Electronics', 'Apparel'],
    'price': [299.99, 19.95, 45.50, 799.00, 32.00],
    'in_stock': [True, True, False, True, True]
})

list_of_indices_to_select = [0, 3, 4]

# ✅ Select rows using .take()
# Default axis is 0 (rows)
selected_rows_take = df.take(list_of_indices_to_select)

print("Selected rows using .take():")
print(selected_rows_take)

Output:

Selected rows using .take():
  product_id     category   price  in_stock
0       P101  Electronics  299.99      True
3       P104  Electronics  799.00      True
4       P105      Apparel   32.00      True

df.take(list_of_indices_to_select): Returns a new DataFrame with rows at the specified integer positions.
df.take(indices, axis=1) would select columns by position.
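As a small illustration of both points, the sketch below uses take() on rows (with a negative position, which counts from the end) and on columns. The frame df_demo is a shortened, hypothetical version of the example data.

```python
import pandas as pd

df_demo = pd.DataFrame({
    "product_id": ["P101", "P102", "P103"],
    "category": ["Electronics", "Books", "Home"],
    "price": [299.99, 19.95, 45.50],
})

last_and_first = df_demo.take([-1, 0])        # rows: last, then first
outer_columns = df_demo.take([0, 2], axis=1)  # columns at positions 0 and 2

print(last_and_first["product_id"].tolist())  # ['P103', 'P101']
print(outer_columns.columns.tolist())         # ['product_id', 'price']
```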

Method 5: Using `DataFrame.query()` with `index`

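If your DataFrame's index consists of the integers you want to select by (e.g., a default RangeIndex), you can use DataFrame.query() and refer to the index through the special identifier index. This is a label-based match, so it equals positional selection only with a default index. If the DataFrame has a column that is itself named index, the column takes precedence inside the query string.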

import pandas as pd

df = pd.DataFrame({
    'product_id': ['P101', 'P102', 'P103', 'P104', 'P105'],
    'category': ['Electronics', 'Books', 'Home', 'Electronics', 'Apparel'],
    'price': [299.99, 19.95, 45.50, 799.00, 32.00],
    'in_stock': [True, True, False, True, True]
})

list_of_indices_to_select = [0, 3, 4] # These are also the index labels in this case

# ✅ Query based on the index values
# The '@' prefix allows referencing a local variable in the query string
query_string = 'index in @list_of_indices_to_select'
selected_rows_query = df.query(query_string)

print("Selected rows using .query() on index:")
print(selected_rows_query)

Output:

Selected rows using .query() on index:
  product_id     category   price  in_stock
0       P101  Electronics  299.99      True
3       P104  Electronics  799.00      True
4       P105      Apparel   32.00      True

index in @list_of_indices_to_select: The query string. index refers to the DataFrame's index. @list_of_indices_to_select allows the query to access the Python list variable.
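Since query() matches index labels, the same pattern also works with a non-integer index, and not in gives the complement. The sketch below assumes a hypothetical frame df_labeled indexed by product ID.

```python
import pandas as pd

# Hypothetical frame that uses product IDs as index labels
df_labeled = pd.DataFrame(
    {"price": [299.99, 19.95, 45.50, 799.00, 32.00]},
    index=["P101", "P102", "P103", "P104", "P105"],
)

wanted_labels = ["P101", "P104"]

kept = df_labeled.query("index in @wanted_labels")         # matching labels
dropped = df_labeled.query("index not in @wanted_labels")  # everything else

print(kept.index.tolist())     # ['P101', 'P104']
print(dropped.index.tolist())  # ['P102', 'P103', 'P105']
```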

Choosing the Right Method

df.iloc[list_of_indices]: Most recommended for purely integer-position based row selection. It's explicit, idiomatic, and designed for this purpose.
df.take(list_of_indices): Also very good for positional selection and clear in its intent.
df[df.index.isin(list_of_indices)]: Useful, especially if you are already working with boolean masks or if your list_of_indices contains labels that match a non-integer index. For default integer indices, it works well.
df.loc[df.index[list_of_positions]]: More verbose for simple positional selection. Primarily used when you need label-based selection and derive labels from positions.
df.query('index in @list_of_indices'): A good option if you prefer the query string syntax and your list contains the actual index labels you want to match (which would be integer positions if you have a default index).
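As a quick sanity check, the five approaches can be run side by side on the example DataFrame. This is only a sketch; the results dictionary and the variable name positions are introduced here for illustration. With the default index and a sorted list without repeats, every method selects the same rows.

```python
import pandas as pd

df = pd.DataFrame({
    "product_id": ["P101", "P102", "P103", "P104", "P105"],
    "category": ["Electronics", "Books", "Home", "Electronics", "Apparel"],
    "price": [299.99, 19.95, 45.50, 799.00, 32.00],
    "in_stock": [True, True, False, True, True],
})

positions = [0, 3, 4]

results = {
    "iloc": df.iloc[positions],                # positional
    "take": df.take(positions),                # positional
    "isin": df[df.index.isin(positions)],      # label-based mask
    "loc": df.loc[df.index[positions]],        # labels derived from positions
    "query": df.query("index in @positions"),  # label-based
}

for name, frame in results.items():
    print(name, frame.index.tolist())
```

Each line prints the index [0, 3, 4]. The agreement holds only because the labels equal the positions here; with a reordered or non-integer index, the label-based variants would need real labels rather than positions.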

Conclusion

Pandas offers several flexible ways to select DataFrame rows based on a list of integer indices.

For direct positional selection, DataFrame.iloc is generally the most direct and preferred method.
DataFrame.take() is another excellent choice specifically designed for positional selection.
DataFrame.index.isin() provides a boolean masking approach that works well with default integer indices.
DataFrame.query() can be used if your list of indices matches the actual labels of your DataFrame's index.

Choose the method that best fits the clarity of your code and the nature of your DataFrame's index. iloc is often the go-to for straightforward selection by integer position.
