Python Pandas: How to Reorder DataFrame Rows Based on a List of Index Labels
Rearranging the order of rows in a Pandas DataFrame according to a custom sequence of its index labels is a common data manipulation task. You might need to present data in a specific order, align it with another dataset, or simply organize it for better readability.
This guide explains how to reorder DataFrame rows based on a provided list of index labels, primarily using DataFrame.reindex() and DataFrame.loc[].
The Goal: Custom Row Ordering by Index Labels
Given a Pandas DataFrame with an existing index (which could be strings, numbers, datetimes, etc.), and a separate Python list containing some or all of those index labels in a new desired order, we want to create a new DataFrame where the rows are arranged according to this new order.
Example DataFrame with a Custom Index
We'll use a DataFrame with string labels for its index.
import pandas as pd
data = {
'Product Name': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam'],
'Category': ['Electronics', 'Accessories', 'Accessories', 'Electronics', 'Accessories'],
'Price': [1200, 25, 75, 300, 45],
'Stock': [10, 150, 75, 20, 90]
}
index_labels = ['P101', 'A205', 'A210', 'P107', 'A215']
df_original = pd.DataFrame(data, index=index_labels)
print("Original DataFrame:")
print(df_original)
Output:
Original DataFrame:
Product Name Category Price Stock
P101 Laptop Electronics 1200 10
A205 Mouse Accessories 25 150
A210 Keyboard Accessories 75 75
P107 Monitor Electronics 300 20
A215 Webcam Accessories 45 90
Method 1: Using DataFrame.reindex(labels=...) (Recommended)
The DataFrame.reindex() method is specifically designed to conform a DataFrame to a new index. When you pass a list of labels to its labels parameter (or just as the first positional argument if only reindexing rows), it returns a new DataFrame whose rows follow the order of that list.
import pandas as pd
df = pd.DataFrame({
'Product Name': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam'],
'Category': ['Electronics', 'Accessories', 'Accessories', 'Electronics', 'Accessories'],
'Price': [1200, 25, 75, 300, 45],
'Stock': [10, 150, 75, 20, 90]
}, index=['P101', 'A205', 'A210', 'P107', 'A215'])
# Define the new desired order of index labels
custom_index_order = ['A210', 'P107', 'P101', 'A215', 'A205']
# ✅ Reorder rows using df.reindex()
df_reordered_reindex = df.reindex(labels=custom_index_order)
# Or simply: df_reordered_reindex = df.reindex(custom_index_order)
print("DataFrame reordered using df.reindex():")
print(df_reordered_reindex)
Output:
DataFrame reordered using df.reindex():
Product Name Category Price Stock
A210 Keyboard Accessories 75 75
P107 Monitor Electronics 300 20
P101 Laptop Electronics 1200 10
A215 Webcam Accessories 45 90
A205 Mouse Accessories 25 150
df.reindex(labels=custom_index_order): Creates a new DataFrame with rows arranged in the order specified by custom_index_order.
- If custom_index_order contains labels not present in the original DataFrame's index, reindex will introduce rows with NaN values for those labels by default (this behavior can be controlled with the fill_value parameter).
- If custom_index_order omits labels present in the original index, those rows will be dropped from the result.
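Both behaviors can be seen in a small sketch. The label X999 is a hypothetical one that does not exist in the index, and the frame is trimmed to two columns for brevity. Note that the NaN rows force the integer columns to float, which the fill_value variant avoids needing.

```python
import pandas as pd

df = pd.DataFrame(
    {"Price": [1200, 25, 75], "Stock": [10, 150, 75]},
    index=["P101", "A205", "A210"],
)

# "X999" is a hypothetical label that is not in df.index;
# "A205" is deliberately left out of the new order.
new_order = ["A210", "X999", "P101"]

# Default behavior: the unknown label gets a row of NaN values
with_nan = df.reindex(new_order)

# fill_value replaces NaN in the rows created for unknown labels
with_fill = df.reindex(new_order, fill_value=0)

print(with_nan)
print(with_fill)
```

In both results the row for A205 is gone, because it was not listed in new_order.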
Method 2: Using DataFrame.loc[index_list]
The DataFrame.loc[] indexer, which is primarily for label-based selection, can also accept a list of index labels. It will return a new DataFrame with rows selected in the order they appear in the provided list.
import pandas as pd
df = pd.DataFrame({
'Product Name': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam'],
'Category': ['Electronics', 'Accessories', 'Accessories', 'Electronics', 'Accessories'],
'Price': [1200, 25, 75, 300, 45],
'Stock': [10, 150, 75, 20, 90]
}, index=['P101', 'A205', 'A210', 'P107', 'A215'])
custom_index_order = ['A210', 'P107', 'P101', 'A215', 'A205']
# ✅ Reorder rows using df.loc[]
df_reordered_loc = df.loc[custom_index_order]
print("DataFrame reordered using df.loc[]:")
print(df_reordered_loc)
Output:
DataFrame reordered using df.loc[]:
Product Name Category Price Stock
A210 Keyboard Accessories 75 75
P107 Monitor Electronics 300 20
P101 Laptop Electronics 1200 10
A215 Webcam Accessories 45 90
A205 Mouse Accessories 25 150
df.loc[custom_index_order]: Selects rows based on the labels in custom_index_order and arranges them in that sequence.
- Important Difference from reindex: If custom_index_order contains labels not present in df.index, df.loc[] will raise a KeyError. reindex is more flexible in handling missing/new labels. If your list of labels is guaranteed to be a permutation or subset of existing index labels, .loc[] is a concise alternative.
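The KeyError can be reproduced with a short sketch, assuming a recent pandas version (1.0 or later). The label X999 is hypothetical, and the list comprehension is just one possible way to drop unknown labels before calling .loc[].

```python
import pandas as pd

df = pd.DataFrame(
    {"Price": [1200, 25, 75]},
    index=["P101", "A205", "A210"],
)

# "X999" is a hypothetical label that is not in df.index
requested = ["A210", "X999", "P101"]

try:
    df.loc[requested]
    raised = False
except KeyError:
    raised = True

# Keep only the labels that exist, preserving the requested order
existing = [label for label in requested if label in df.index]
result = df.loc[existing]

print(raised)
print(result)
```

Filtering first keeps the strictness of .loc[] for the labels that remain, while silently skipping the unknown ones; whether skipping is acceptable depends on the use case.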
Important Consideration: Handling Missing/Extra Labels in index_list
- df.reindex(new_labels):
  - If new_labels contains labels not in the original index, new rows filled with NaN (or fill_value if specified) are added.
  - If new_labels omits labels from the original index, those rows are dropped.
- df.loc[new_labels]:
  - If new_labels contains labels not in the original index, a KeyError is raised.
  - If new_labels omits labels, only the specified rows are selected (effectively dropping the others).
Choose reindex if you need to conform to a new index that might have different labels. Choose loc if you are sure your new_labels list contains only existing index labels and you want to select/reorder them.
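When it is unclear which case applies, the discrepancies can be inspected before choosing. The following is a minimal sketch using Index.difference(); the variable names unknown and omitted, and the label X999, are illustrative assumptions rather than part of the original examples.

```python
import pandas as pd

df = pd.DataFrame(
    {"Price": [1200, 25, 75]},
    index=["P101", "A205", "A210"],
)

# "X999" is a hypothetical label that is not in df.index
new_labels = ["A210", "X999", "P101"]

# Labels in the list that the DataFrame does not have
unknown = pd.Index(new_labels).difference(df.index)

# Labels in the DataFrame that the list leaves out
omitted = df.index.difference(new_labels)

# Use the strict .loc[] only when every requested label exists
if unknown.empty:
    result = df.loc[new_labels]
else:
    result = df.reindex(new_labels)

print(list(unknown), list(omitted))
print(result)
```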
Alternative: Sorting by Index (sort_index())
If your goal is not a custom arbitrary order but rather to sort the rows based on their index labels (alphabetically, numerically, etc.), use DataFrame.sort_index().
import pandas as pd
df_original = pd.DataFrame({
'Product Name': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam'],
'Category': ['Electronics', 'Accessories', 'Accessories', 'Electronics', 'Accessories'],
'Price': [1200, 25, 75, 300, 45],
'Stock': [10, 150, 75, 20, 90]
}, index=['P101', 'A205', 'A210', 'P107', 'A215'])
df_shuffled = df_original.sample(frac=1, random_state=42) # Shuffle for demo
print("Shuffled DataFrame before sort_index():")
print(df_shuffled)
print()
# ✅ Sort rows by index labels in ascending order
df_sorted_asc = df_shuffled.sort_index(ascending=True)
print("DataFrame sorted by index (ascending):")
print(df_sorted_asc)
print()
# Sort rows by index labels in descending order
df_sorted_desc = df_shuffled.sort_index(ascending=False)
print("DataFrame sorted by index (descending):")
print(df_sorted_desc)
Output:
Shuffled DataFrame before sort_index():
Product Name Category Price Stock
A205 Mouse Accessories 25 150
A215 Webcam Accessories 45 90
A210 Keyboard Accessories 75 75
P101 Laptop Electronics 1200 10
P107 Monitor Electronics 300 20
DataFrame sorted by index (ascending):
Product Name Category Price Stock
A205 Mouse Accessories 25 150
A210 Keyboard Accessories 75 75
A215 Webcam Accessories 45 90
P101 Laptop Electronics 1200 10
P107 Monitor Electronics 300 20
DataFrame sorted by index (descending):
Product Name Category Price Stock
P107 Monitor Electronics 300 20
P101 Laptop Electronics 1200 10
A215 Webcam Accessories 45 90
A210 Keyboard Accessories 75 75
A205 Mouse Accessories 25 150
This is different from reordering to a specific custom list but is mentioned for completeness as a common way to change row order based on the index.
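If the desired order can be derived from the labels by a rule rather than listed by hand, sort_index() also accepts a key function (available in pandas 1.1 and later, as far as the documentation indicates). As a sketch, assuming every label is one letter followed by digits, the rows can be sorted by the numeric part alone:

```python
import pandas as pd

df = pd.DataFrame(
    {"Stock": [10, 150, 75, 20, 90]},
    index=["P101", "A205", "A210", "P107", "A215"],
)

# The key function receives the whole Index and must return
# an array-like of the same length; here, the digits after
# the first character converted to integers.
by_number = df.sort_index(key=lambda idx: idx.str[1:].astype(int))

print(by_number)
```

This places P101 and P107 before the A2xx labels, because 101 and 107 are smaller than 205.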
Conclusion
To reorder Pandas DataFrame rows based on a specific list of its existing index labels:
- df.reindex(labels=your_custom_list_of_labels): This is the most flexible method, as it can handle cases where the custom list might include new labels (filling with NaN) or omit existing ones (dropping them). Generally recommended.
- df.loc[your_custom_list_of_labels]: This is a concise alternative if you are certain that your_custom_list_of_labels only contains labels already present in the DataFrame's index and you simply want to select and reorder those specific rows. It will raise a KeyError if any label in the list is not found in the index.
If your goal is simply to sort the DataFrame by its index labels in ascending or descending order, use df.sort_index(). Choose the method that best suits your specific reordering needs and how you want to handle discrepancies between your custom list and the DataFrame's current index.