Python Pandas: How to Add a Level to a MultiIndex (Rows or Columns)
Pandas MultiIndex (also known as a hierarchical index) allows you to have multiple levels of indexing on your DataFrame's rows or columns, enabling more complex data organization and analysis. Sometimes, you may need to add a new level to an existing index or column structure.
This guide explains several methods to add a new level to a Pandas MultiIndex for both rows (index) and columns, primarily using pd.concat() and pd.MultiIndex.from_product().
Understanding MultiIndex and Levels
A MultiIndex can be thought of as an index with multiple "layers" or "levels." Each level can have its own name.
- Row Index: If a DataFrame has a MultiIndex on its rows, each row is identified by a tuple of values, one from each level.
- Column Index: Similarly, columns can have multiple levels, meaning each column header is a tuple of values.
Adding a level means introducing a new layer of grouping or categorization to your existing index or columns.
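To make the idea of levels concrete, here is a minimal sketch of a two-level row index. The labels (Region, Year, Sales) are hypothetical and chosen only for illustration; they do not appear elsewhere in this guide.

```python
import pandas as pd

# Hypothetical labels, used only for illustration
row_index = pd.MultiIndex.from_tuples(
    [("North", 2023), ("North", 2024), ("South", 2023)],
    names=["Region", "Year"],
)
demo = pd.DataFrame({"Sales": [10, 20, 30]}, index=row_index)

print(demo.index.nlevels)      # number of levels in the row index
print(list(demo.index.names))  # name of each level
print(demo.index[0])           # each row label is a tuple, one value per level
```

Adding a level to this index would turn each row label into a three-element tuple.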
Example DataFrame (Single Index):
import pandas as pd
# A plain DataFrame with a default integer index and single-level columns
df_simple_cols = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
    'C': [100, 200, 300, 400, 500]
})
print("Original DataFrame (Simple Index):")
print(df_simple_cols)
Output:
Original DataFrame (Simple Index):
   A   B    C
0  1  10  100
1  2  20  200
2  3  30  300
3  4  40  400
4  5  50  500
Adding a Level to the DataFrame's Row Index
This adds an outer level to the existing row index.
Using pd.concat() with keys and names (Recommended)
The pd.concat() function can add a new outermost level to the index: wrap the DataFrame in a list and pass the keys and names arguments.
import pandas as pd
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
    'C': [100, 200, 300, 400, 500]
})
new_outer_level_key = 'Experiment_X'
new_outer_level_name = 'Experiment_Set'
# ✅ Add a new outer level to the row index
df_multi_index_rows = pd.concat(
    [df],                          # DataFrame(s) to concatenate, wrapped in a list
    keys=[new_outer_level_key],    # Key(s) for the new outermost level
    names=[new_outer_level_name]   # Name(s) for the new level(s)
)
print("DataFrame with New Outer Row Index Level:")
print(df_multi_index_rows)
print()
print("New index names:")
print(df_multi_index_rows.index.names)
Output:
DataFrame with New Outer Row Index Level:
                  A   B    C
Experiment_Set
Experiment_X   0  1  10  100
               1  2  20  200
               2  3  30  300
               3  4  40  400
               4  5  50  500
New index names:
FrozenList(['Experiment_Set', None])
- [df]: You pass a list containing the DataFrame(s) to concatenate. For a single DataFrame, that is [df].
- keys=[new_outer_level_key]: This provides the label(s) for the new outermost level. concat() pairs each DataFrame in the input list with the key at the same position.
- names=[new_outer_level_name]: This assigns a name to the newly created level. The original level keeps its own name, which is None here.
- You can also pass a dictionary to pd.concat(), where the keys become the labels of the new outer index level:
df_multi_index_dict = pd.concat({new_outer_level_key: df}, names=[new_outer_level_name])
This produces the same result as above.
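The dictionary form is most useful when there are several DataFrames to label. The sketch below is illustrative only: the frames run_a and run_b and the labels Run_A, Run_B, and Run are hypothetical names, not part of the example above.

```python
import pandas as pd

# Two hypothetical result sets sharing the same columns
run_a = pd.DataFrame({"A": [1, 2], "B": [10, 20]})
run_b = pd.DataFrame({"A": [3, 4], "B": [30, 40]})

# Dictionary keys become the labels of the new outer level
combined = pd.concat({"Run_A": run_a, "Run_B": run_b}, names=["Run"])

# Selecting one outer label returns its rows with the outer level dropped
only_b = combined.loc["Run_B"]

print(combined)
print(only_b)
```

Because each frame carries its own outer label, the rows stay distinguishable even though both frames use the same inner index values.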
Using pd.MultiIndex.from_frame() (More Manual Control)
This method gives you more control if you need to insert the new level at a specific position (not just outermost) or if the new level's values are more complex.
import pandas as pd
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
    'C': [100, 200, 300, 400, 500]
})
# 1. Convert the current index to a DataFrame
index_df = df.index.to_frame(name='OriginalIndex') # Give the original index level a name
# 2. Prepare the values for the new level
new_level_values = ['SetA'] * 2 + ['SetB'] * 3 # Example: different values for the new level
new_level_name = 'New_Level'
# 3. Insert the new level into the index DataFrame (e.g., at the beginning)
index_df.insert(0, new_level_name, new_level_values)
print("DataFrame representing the new MultiIndex structure:")
print(index_df)
print()
# 4. Create a MultiIndex from this DataFrame and assign it back
df.index = pd.MultiIndex.from_frame(index_df)
print("DataFrame with New Index Level via from_frame():")
print(df)
Output:
DataFrame representing the new MultiIndex structure:
  New_Level  OriginalIndex
0      SetA              0
1      SetA              1
2      SetB              2
3      SetB              3
4      SetB              4
DataFrame with New Index Level via from_frame():
                         A   B    C
New_Level OriginalIndex
SetA      0              1  10  100
          1              2  20  200
SetB      2              3  30  300
          3              4  40  400
          4              5  50  500
This method is more verbose but offers flexibility in constructing the new MultiIndex.
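The same steps also work when the row index already has several levels and the new one belongs in the middle. The following is a minimal sketch under assumed names: the levels Set and Sample, the new level Batch, and its values are all hypothetical.

```python
import pandas as pd

# Hypothetical two-level row index
idx = pd.MultiIndex.from_product(
    [["SetA", "SetB"], [0, 1]], names=["Set", "Sample"]
)
df2 = pd.DataFrame({"A": [1, 2, 3, 4]}, index=idx)

# Turn the index into a DataFrame with one column per level
index_df = df2.index.to_frame(index=False)

# Insert the new level at position 1, between "Set" and "Sample"
index_df.insert(1, "Batch", ["b1", "b2", "b1", "b2"])

# Rebuild the MultiIndex and assign it back
df2.index = pd.MultiIndex.from_frame(index_df)

print(df2.index.names)
print(df2)
```

The position passed to insert() decides where the new level sits, which is the part pd.concat() cannot control.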
Adding a Level to the DataFrame's Column Index
This adds an outer level to the existing column headers.
Using pd.concat() with axis=1
Similar to adding a row index level, but you specify axis=1 to operate on columns.
import pandas as pd
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
    'C': [100, 200, 300, 400, 500]
})
new_outer_col_level_key = 'Metrics'
new_outer_col_level_name = 'Category'
# ✅ Add a new outer level to the column index
df_multi_col = pd.concat(
    [df],
    keys=[new_outer_col_level_key],
    names=[new_outer_col_level_name],
    axis=1  # Operate on columns
)
print("DataFrame with New Outer Column Level:")
print(df_multi_col)
print()
print("New column MultiIndex:")
print(df_multi_col.columns)
Output:
DataFrame with New Outer Column Level:
Category  Metrics
                A   B    C
0               1  10  100
1               2  20  200
2               3  30  300
3               4  40  400
4               5  50  500
New column MultiIndex:
MultiIndex([('Metrics', 'A'),
            ('Metrics', 'B'),
            ('Metrics', 'C')],
           names=['Category', None])
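With axis=1, the keys also make it possible to place several DataFrames side by side under distinct outer labels. This is a hedged sketch: the frames raw and scaled and the labels Raw, Scaled, and Source are hypothetical and not taken from the example above.

```python
import pandas as pd

raw = pd.DataFrame({"A": [1, 2], "B": [10, 20]})
scaled = raw * 2  # hypothetical derived table with the same columns

# Each key labels the columns of the input frame at the same position
wide = pd.concat(
    [raw, scaled], keys=["Raw", "Scaled"], names=["Source"], axis=1
)

# Selecting an outer label returns the sub-frame with the original columns
print(wide["Scaled"])
print(wide.columns.tolist())
```

Selecting wide["Scaled"] drops the outer level again, so downstream code can keep using the original column names.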
Using pd.MultiIndex.from_product()
This is very useful for creating a new column MultiIndex when the new level is the same for all existing columns, or when you want to create a Cartesian product of levels.
import pandas as pd
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
    'C': [100, 200, 300, 400, 500]
})
# Define the new outer level label(s)
new_outer_level = ['Measurements']
# Note: with several labels (e.g. ['Raw', 'Processed']) the product has
# len(labels) * len(df.columns) entries, so it no longer matches the
# existing columns and cannot be assigned to df.columns directly.
# Create the new MultiIndex by taking a product of the new level and existing columns
df.columns = pd.MultiIndex.from_product(
    [new_outer_level, df.columns],           # Iterables for the levels
    names=['Data_Type', 'Original_Metric']   # Optional names for the levels
)
print("DataFrame with New Column Level via from_product():")
print(df)
print()
print("New column MultiIndex:")
print(df.columns)
Output:
DataFrame with New Column Level via from_product():
Data_Type        Measurements
Original_Metric             A   B    C
0                           1  10  100
1                           2  20  200
2                           3  30  300
3                           4  40  400
4                           5  50  500
New column MultiIndex:
MultiIndex([('Measurements', 'A'),
            ('Measurements', 'B'),
            ('Measurements', 'C')],
           names=['Data_Type', 'Original_Metric'])
If you wanted to add a new inner level instead, you'd swap the order in from_product: [df.columns, ['SubMetric']].
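The swapped order can be sketched as follows. 'SubMetric' is the placeholder label from the sentence above, while the level names Original_Metric and Sub are assumptions made for this illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [10, 20]})

# Existing columns come first, so the new label becomes the inner level.
# 'SubMetric' is a placeholder label; the level names are illustrative.
df.columns = pd.MultiIndex.from_product(
    [df.columns, ["SubMetric"]],
    names=["Original_Metric", "Sub"],
)

print(df.columns.tolist())
print(df)
```

Each original column is now addressed by a tuple such as ('A', 'SubMetric').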
Conclusion
Adding levels to a Pandas MultiIndex allows for more sophisticated data structuring:
- For Row Indexes:
  - pd.concat([df], keys=['NewOuterKey'], names=['NewLevelName']) is straightforward for adding a new outermost level.
  - Converting the index to a DataFrame (df.index.to_frame()), inserting the new level's data, and then converting back (pd.MultiIndex.from_frame()) offers more control over placement and values.
- For Column Indexes:
  - pd.concat([df], keys=['NewOuterKey'], names=['NewLevelName'], axis=1) adds a new outermost column level.
  - pd.MultiIndex.from_product([['OuterLevelName'], df.columns]) is excellent for adding a uniform outer level to existing columns or creating new hierarchical structures.
Choose the method that best fits the complexity of the new level you're adding and whether you're targeting rows or columns.