Python Pandas: Convert Pivot Table Back to DataFrame (Flatten)

Pandas pivot tables (DataFrame.pivot_table() or pd.pivot_table()) are excellent for summarizing and reshaping data. However, the resulting pivot table often has a hierarchical index (MultiIndex) on its rows and/or columns, and the name of the original column used for columns in the pivot operation becomes the name of the column index. For further analysis, plotting, or exporting, you might need to convert this structured pivot table back into a "flat" DataFrame with a simple index and clear column names.

This guide explains how to effectively convert a Pandas pivot table back into a standard DataFrame format.

Understanding Pivot Table Structure

A pivot table typically transforms your data as follows:

Index: Unique values from the column(s) specified in the index argument become the new row index. If multiple columns are used for index, a MultiIndex is created.
Columns: Unique values from the column(s) specified in the columns argument become the new column headers. If multiple columns are used, a MultiIndex is created for columns. The name of the original column used for columns becomes the name of the column index (e.g., pivot_table.columns.name).
Values: Cells are populated by aggregating values from the column specified in the values argument, using the function in aggfunc.

This often results in a DataFrame where the index is one or more of your original data columns, and the column headers are also derived from data. To "flatten" it means to move the index levels back into regular data columns and simplify the column header structure.

Example: Creating a Pivot Table

Let's create a sample DataFrame and a pivot table from it.

import pandas as pd
import numpy as np

data = {
    'Region': ['North', 'North', 'South', 'South', 'North', 'West', 'South', 'West'],
    'Product': ['A', 'B', 'A', 'B', 'A', 'A', 'B', 'C'],
    'Month': ['Jan', 'Jan', 'Jan', 'Feb', 'Feb', 'Feb', 'Jan', 'Feb'],
    'Sales': [100, 150, 200, 50, 120, 300, 180, 90]
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

# Create a pivot table: Sum of Sales by Region (index) and Product (columns)
pivot_table_df = df.pivot_table(
    index='Region',
    columns='Product',
    values='Sales',
    aggfunc='sum',
    fill_value=0 # Fill missing combinations with 0
)
print("Generated Pivot Table:")
print(pivot_table_df)
print()

print(f"Pivot table's column index name: '{pivot_table_df.columns.name}'")

Output:

Original DataFrame:
  Region Product Month  Sales
0  North       A   Jan    100
1  North       B   Jan    150
2  South       A   Jan    200
3  South       B   Feb     50
4  North       A   Feb    120
5   West       A   Feb    300
6  South       B   Jan    180
7   West       C   Feb     90

Generated Pivot Table:
Product    A    B   C
Region               
North    220  150   0
South    200  230   0
West     300    0  90
Pivot table's column index name: 'Product'

Our goal is to convert pivot_table_df into a DataFrame where 'Region' is a regular column, and the 'Product' name above columns A, B, C is removed or flattened.

Method 1: Using `reset_index()` and Managing Column Index Name (Recommended)

This is a two-step process: first, clean up the column index name, then move the row index levels to become regular columns.

Removing the Column Index Name (`df.columns.name = None`)

The pivot_table often sets the name attribute of the columns index (e.g., to 'Product' in our example). To get a cleaner "flat" DataFrame, it's good to remove this.

Resetting the Row Index (`df.reset_index()`)

The reset_index() method converts the current DataFrame index (or levels of a MultiIndex) into regular data columns.

import pandas as pd

pivot_table_df = pd.DataFrame({
    'A': {'North': 220, 'South': 200, 'West': 300},
    'B': {'North': 150, 'South': 230, 'West': 0},
    'C': {'North': 0, 'South': 0, 'West': 90}
}, dtype=int)
pivot_table_df.index.name = 'Region'
pivot_table_df.columns.name = 'Product' # Simulate pivot table output


# Make a copy to work with
df_flattened = pivot_table_df.copy()

# Step 1: Remove the name of the columns index
df_flattened.columns.name = None
print("After setting columns.name = None:")
print(df_flattened)
print()

# Step 2: Convert the row index ('Region') into a regular column
df_flattened = df_flattened.reset_index()
print("After reset_index():")
print(df_flattened)

Output:

After setting columns.name = None:
          A    B   C
Region              
North   220  150   0
South   200  230   0
West    300    0  90

After reset_index():
  Region    A    B   C
0  North  220  150   0
1  South  200  230   0
2   West  300    0  90

This is generally the most straightforward and common approach.

Method 2: Using `reset_index()` and `rename_axis(None, axis=1)`

This method achieves a similar result by first resetting the index and then removing the name from the column axis.

import pandas as pd

pivot_table_df = pd.DataFrame({
    'A': {'North': 220, 'South': 200, 'West': 300},
    'B': {'North': 150, 'South': 230, 'West': 0},
    'C': {'North': 0, 'South': 0, 'West': 90}
}, dtype=int)
pivot_table_df.index.name = 'Region'
pivot_table_df.columns.name = 'Product'


df_flattened_rename = pivot_table_df.copy()

# Chain reset_index() and rename_axis()
df_flattened_rename = df_flattened_rename.reset_index().rename_axis(None, axis=1)

print("DataFrame flattened using reset_index() and rename_axis(None, axis=1):")
print(df_flattened_rename)

Output:

DataFrame flattened using reset_index() and rename_axis(None, axis=1):
  Region    A    B   C
0  North  220  150   0
1  South  200  230   0
2   West  300    0  90

reset_index(): Moves 'Region' from index to a column.
rename_axis(None, axis=1): Sets the name of the column index (which was 'Product') to None. axis=1 specifies that we are renaming the column axis's name.

Method 3: Using `DataFrame.to_records()` and `pd.DataFrame()` Constructor

The to_records() method can convert a DataFrame (including one with a MultiIndex from a pivot table) into a NumPy record array. This record array can then be used to construct a new, flat DataFrame.

import pandas as pd

pivot_table_df = pd.DataFrame({
    'A': {'North': 220, 'South': 200, 'West': 300},
    'B': {'North': 150, 'South': 230, 'West': 0},
    'C': {'North': 0, 'South': 0, 'West': 90}
}, dtype=int)
pivot_table_df.index.name = 'Region'
pivot_table_df.columns.name = 'Product'

# Convert pivot table to NumPy record array
# The index levels become fields in the record array
records = pivot_table_df.to_records()
print("Records from pivot table:")
print(records)

# ✅ Create a new DataFrame from the records
df_from_records = pd.DataFrame(records)

print("DataFrame flattened using to_records():")
print(df_from_records)

Output:

Records from pivot table:
[('North', 220, 150,  0) ('South', 200, 230,  0) ('West', 300,   0, 90)]
DataFrame flattened using to_records():
  Region    A    B   C
0  North  220  150   0
1  South  200  230   0
2   West  300    0  90

note

to_records() automatically includes the index levels as fields in the resulting record array, which then become columns in the new DataFrame. The original column names from the pivot table are usually preserved as field names.

Handling MultiIndex Columns (e.g., from multiple `values` or `aggfunc`)

If your pivot table has a MultiIndex on the columns (e.g., if you used multiple values arguments or a list of aggfunc), you might need an extra step to flatten these column headers.

note

Flattening MultiIndex columns often involves re-creating the column names by joining the levels or selecting a specific level. This is a common precursor to the reset_index() step.

There are two options:

Option 1: Join levels with an underscore

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'North', 'South', 'South', 'North', 'West', 'South', 'West'],
    'Product': ['A', 'B', 'A', 'B', 'A', 'A', 'B', 'C'],
    'Month': ['Jan', 'Jan', 'Jan', 'Feb', 'Feb', 'Feb', 'Jan', 'Feb'],
    'Sales': [100, 150, 200, 50, 120, 300, 180, 90],
    'Quantity': [10, 15, 12, 5, 8, 20, 18, 7] # Added Quantity for demo
})

pivot_multi_col = df.pivot_table(
    index='Region',
    columns='Product',
    values=['Sales', 'Quantity'], # Multiple values -> MultiIndex columns
    aggfunc='sum',
    fill_value=0
)
print("Pivot Table with MultiIndex Columns:")
print(pivot_multi_col)

# Flattening MultiIndex Columns (common techniques)
df_flat_cols = pivot_multi_col.copy()
df_flat_cols.columns = ['_'.join(col_levels).strip() for col_levels in df_flat_cols.columns.values]
df_flat_cols = df_flat_cols.reset_index()
print("Flattened MultiIndex Columns (joined):")
print(df_flat_cols)

Output:

Pivot Table with MultiIndex Columns:
        Quantity        Sales         
Product        A   B  C     A    B   C
Region                                
North         18  15  0   220  150   0
South         12  23  0   200  230   0
West          20   0  7   300    0  90
Flattened MultiIndex Columns (joined):
  Region  Quantity_A  Quantity_B  Quantity_C  Sales_A  Sales_B  Sales_C
0  North          18          15           0      220      150        0
1  South          12          23           0      200      230        0
2   West          20           0           7      300        0       90

Option 2: Option 2: If you only care about one level of the column MultiIndex

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'North', 'South', 'South', 'North', 'West', 'South', 'West'],
    'Product': ['A', 'B', 'A', 'B', 'A', 'A', 'B', 'C'],
    'Month': ['Jan', 'Jan', 'Jan', 'Feb', 'Feb', 'Feb', 'Jan', 'Feb'],
    'Sales': [100, 150, 200, 50, 120, 300, 180, 90],
    'Quantity': [10, 15, 12, 5, 8, 20, 18, 7] # Added Quantity for demo
})

pivot_multi_col = df.pivot_table(
    index='Region',
    columns='Product',
    values=['Sales', 'Quantity'], # Multiple values -> MultiIndex columns
    aggfunc='sum',
    fill_value=0
)
print("Pivot Table with MultiIndex Columns:")
print(pivot_multi_col)
print()

# If you only care about one level of the column MultiIndex (e.g., the inner 'Product')
df_flat_cols_droplevel = pivot_multi_col.copy()
df_flat_cols_droplevel.columns = df_flat_cols_droplevel.columns.droplevel(0)    # Drop the 'Sales'/'Quantity' level
df_flat_cols_droplevel.columns.name = None                                      # Remove the 'Product' name
df_flat_cols_droplevel = df_flat_cols_droplevel.reset_index()
print("Flattened MultiIndex Columns (droplevel):")
print(df_flat_cols_droplevel)                                                   

Output:

Pivot Table with MultiIndex Columns:
        Quantity        Sales         
Product        A   B  C     A    B   C
Region                                
North         18  15  0   220  150   0
South         12  23  0   200  230   0
West          20   0  7   300    0  90

Flattened MultiIndex Columns (droplevel):
  Region   A   B  C    A    B   C
0  North  18  15  0  220  150   0
1  South  12  23  0  200  230   0
2   West  20   0  7  300    0  90

note

This might lead to duplicate column names if not handled carefully.

Renaming Columns After Flattening (Optional)

After you have a flat DataFrame, you might want to rename the columns for clarity, especially if the original column names from the pivot table values (like 'A', 'B', 'C' in our example) aren't descriptive enough.

import pandas as pd

df_flattened = pd.DataFrame({
    'Region': ['North', 'South', 'West'],
    'A': [220, 200, 300],
    'B': [150, 230, 0],
    'C': [0, 0, 90]
})

df_renamed = df_flattened.rename(columns={
    'A': 'Sales_Product_A',
    'B': 'Sales_Product_B',
    'C': 'Sales_Product_C'
})

# Or assign to df.columns directly if replacing all:
# df_flattened.columns = ['Region_Name', 'Sales_Product_A', 'Sales_Product_B', 'Sales_Product_C']
print("DataFrame with Renamed Columns:")
print(df_renamed)

Output:

DataFrame with Renamed Columns:
  Region  Sales_Product_A  Sales_Product_B  Sales_Product_C
0  North              220              150                0
1  South              200              230                0
2   West              300                0               90

Conclusion

Converting a Pandas pivot table back into a standard, "flat" DataFrame typically involves two main steps:

Handling the column index: Remove the columns.name attribute (e.g., df_pivot.columns.name = None) or use rename_axis(None, axis=1) after resetting the row index. If the columns themselves form a MultiIndex, you might need to flatten them first (e.g., by joining level names or using droplevel()).
Handling the row index: Use df_pivot.reset_index() to move the pivot table's row index levels into regular data columns.

The DataFrame.to_records() method followed by pd.DataFrame() also provides a concise way to achieve flattening. Choose the method that you find most readable and that best suits the structure of your pivot table, especially if dealing with MultiIndex columns.

Understanding Pivot Table Structure​

Example: Creating a Pivot Table​

Method 1: Using reset_index() and Managing Column Index Name (Recommended)​

Removing the Column Index Name (df.columns.name = None)​

Resetting the Row Index (df.reset_index())​

Method 2: Using reset_index() and rename_axis(None, axis=1)​

Method 3: Using DataFrame.to_records() and pd.DataFrame() Constructor​

Handling MultiIndex Columns (e.g., from multiple values or aggfunc)​

Option 1: Join levels with an underscore​

Option 2: Option 2: If you only care about one level of the column MultiIndex​

Renaming Columns After Flattening (Optional)​

Conclusion​

Table of Contents

Understanding Pivot Table Structure

Example: Creating a Pivot Table

Method 1: Using `reset_index()` and Managing Column Index Name (Recommended)

Removing the Column Index Name (`df.columns.name = None`)

Resetting the Row Index (`df.reset_index()`)

Method 2: Using `reset_index()` and `rename_axis(None, axis=1)`

Method 3: Using `DataFrame.to_records()` and `pd.DataFrame()` Constructor

Handling MultiIndex Columns (e.g., from multiple `values` or `aggfunc`)

Option 1: Join levels with an underscore

Option 2: Option 2: If you only care about one level of the column MultiIndex

Renaming Columns After Flattening (Optional)

Conclusion