Python Pandas: Convert Nested Dictionary to DataFrame

Nested dictionaries are a common way to represent structured hierarchical data in Python. Pandas provides flexible methods to convert these nested structures into DataFrames, which are essential for tabular data analysis and manipulation. The key is often how you want the levels of the dictionary to map to the rows and columns of the DataFrame.

This guide explains how to convert various forms of nested dictionaries into Pandas DataFrames, primarily using pd.DataFrame.from_dict() and the pd.DataFrame() constructor, along with techniques for handling different nesting patterns.

Understanding Nested Dictionaries for DataFrame Creation

A nested dictionary is a dictionary where some of its values are themselves dictionaries. For example:

data_rows = {
'record1': {'name': 'Alice', 'age': 30, 'city': 'New York'},
'record2': {'name': 'Bob', 'age': 24, 'city': 'London'}
}
# Here, 'record1' and 'record2' are outer keys.
# {'name': ..., 'age': ..., 'city': ...} are inner dictionaries.

Pandas needs to know how to map these levels to DataFrame rows, columns, and index.

Common Nested Dictionary Structures

Outer Keys as Rows, Inner Keys as Columns

This is the typical structure for representing multiple records. Each outer key is a unique row identifier (it becomes the DataFrame index), and the keys of each inner dictionary are the record's attributes (they become the column names).

Example:

data_structure_1 = {
'record_001': {
'name': 'Alice Wonderland',
'age': 30,
'department': 'Engineering',
'city': 'New York'
},
'record_002': {
'name': 'Bob The Builder',
'age': 24,
'department': 'Construction',
'city': 'London'
},
'record_003': {
'name': 'Charlie Chaplin',
'age': 45,
'department': 'Entertainment',
'city': 'Paris'
}
}
  • Outer keys ('record_001', 'record_002', 'record_003') are unique identifiers for each entity (e.g., a person, a product, an observation).
  • Inner dictionaries contain the attributes (like 'name', 'age') and their corresponding values for each outer key.

Outer Keys as Columns, Inner Keys as Row Index

Here the outer keys form the DataFrame columns, and the inner keys form the row index. This is the orientation the pd.DataFrame() constructor assumes when it receives a dictionary of dictionaries.

This structure organizes data primarily by attribute (outer keys), with each attribute having values corresponding to different entities (inner keys).

Example:

data_structure_2 = {
'Feature_A': {
'item_X': 10,
'item_Y': 15,
'item_Z': 20
},
'Feature_B': {
'item_X': 100,
'item_Y': 150,
'item_Z': 220 # Note: item_Z has a different value for Feature_B
},
'Feature_C': {
'item_X': True,
'item_Y': False,
'item_Z': True
}
}
  • Outer keys ('Feature_A', 'Feature_B', 'Feature_C') represent the main attributes or variables you are tracking.
  • Inner dictionaries map identifiers ('item_X', 'item_Y', 'item_Z') to the values for that specific feature.
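
As a quick illustration of this layout, the sketch below passes a compact copy of the dictionary above to the pd.DataFrame() constructor. The variable names feature_data and df_features are chosen here for illustration only; the point is that each outer key becomes a column and each inner key becomes a row label.

```python
import pandas as pd

# Compact copy of data_structure_2 (names are illustrative)
feature_data = {
    'Feature_A': {'item_X': 10, 'item_Y': 15, 'item_Z': 20},
    'Feature_B': {'item_X': 100, 'item_Y': 150, 'item_Z': 220},
    'Feature_C': {'item_X': True, 'item_Y': False, 'item_Z': True},
}

# Outer keys -> columns, inner keys -> row index
df_features = pd.DataFrame(feature_data)

print(df_features)
print(df_features.dtypes)  # each column gets its own inferred dtype
```

Because each attribute lives in its own column, numeric and boolean features do not have to share a single dtype.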

Dictionary with List Values

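Strictly speaking, this is a flat dictionary rather than a nested one, because its values are lists instead of dictionaries. It is included here because it often appears alongside nested data.

This structure is often used when the "columns" of your conceptual table are known, and each "column" contains a list of values, one for each "row". Repeated values in identifier columns (such as the same experiment appearing several times) can later be promoted to a MultiIndex.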

Example:

data_structure_3 = {
'Experiment_ID': ['Exp1', 'Exp2', 'Exp1', 'Exp3', 'Exp2'],
'Measurement_Type': ['Temp', 'Pressure', 'Temp', 'Humidity', 'Pressure'],
'Value': [25.5, 101.2, 26.1, 60.3, 100.9],
'Unit': ['Celsius', 'kPa', 'Celsius', '%', 'kPa']
}
  • Keys ('Experiment_ID', 'Measurement_Type', etc.) directly map to what will become column names in the DataFrame.
  • Values are lists, where each list contains all the entries for that column. All lists must be of the same length for this to work directly with pd.DataFrame().
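
A minimal sketch of how this structure can be converted, and how two of its columns could then be promoted to a MultiIndex with set_index(). The choice of 'Experiment_ID' and 'Measurement_Type' as index levels is an assumption made for illustration, not something the data requires.

```python
import pandas as pd

measurements = {
    'Experiment_ID': ['Exp1', 'Exp2', 'Exp1', 'Exp3', 'Exp2'],
    'Measurement_Type': ['Temp', 'Pressure', 'Temp', 'Humidity', 'Pressure'],
    'Value': [25.5, 101.2, 26.1, 60.3, 100.9],
    'Unit': ['Celsius', 'kPa', 'Celsius', '%', 'kPa'],
}

# Each key becomes a column; list positions become rows 0..4
df_flat = pd.DataFrame(measurements)

# Optional: move the two identifier columns into a two-level row index
df_multi = df_flat.set_index(['Experiment_ID', 'Measurement_Type'])

print(df_flat)
print(df_multi)
```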

Method 1: pd.DataFrame.from_dict(nested_dict, orient='index') (Outer Keys to Rows)

This is typically the most straightforward way when your nested dictionary structure has outer keys intended as row identifiers and inner keys as column headers.

import pandas as pd

student_data_nested = {
'student_A': {'Math': 90, 'Science': 85, 'History': 78},
'student_B': {'Math': 75, 'Science': 92, 'History': 88},
'student_C': {'Math': 88, 'Science': 80, 'History': 95}
}

# ✅ orient='index' makes outer keys ('student_A', etc.) the row index.
# Inner keys ('Math', etc.) become column names.
df_from_dict_orient_index = pd.DataFrame.from_dict(student_data_nested, orient='index')

print("DataFrame from nested dict (orient='index'):")
print(df_from_dict_orient_index)

Output:

DataFrame from nested dict (orient='index'):
           Math  Science  History
student_A    90       85       78
student_B    75       92       88
student_C    88       80       95
  • orient='index': Tells Pandas to treat the keys of the input dict as the row labels (index).
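
Real data is often ragged, with some inner dictionaries missing keys that others have. The sketch below uses a made-up dictionary (ragged_scores is a hypothetical name) to show what orient='index' does in that case: the missing cells are filled with NaN.

```python
import pandas as pd

# Hypothetical data: each student is missing one subject
ragged_scores = {
    'student_A': {'Math': 90, 'Science': 85},
    'student_B': {'Math': 75, 'History': 88},
}

df_ragged = pd.DataFrame.from_dict(ragged_scores, orient='index')

print(df_ragged)
print(df_ragged.isna().sum())  # count of missing cells per column
```

Columns that contain NaN are typically stored as floats, so integer scores in those columns will print with a decimal point.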

Method 2: pd.DataFrame.from_dict(nested_dict, orient='columns') (Outer Keys to Columns - Default)

With orient='columns', which is the default for from_dict, the outer keys of your nested_dict become the DataFrame's column names, and the keys of the inner dictionaries become the DataFrame's index.

import pandas as pd

# Same data as before, but note how it's interpreted differently
student_data_nested = {
'student_A': {'Math': 90, 'Science': 85, 'History': 78},
'student_B': {'Math': 75, 'Science': 92, 'History': 88},
'student_C': {'Math': 88, 'Science': 80, 'History': 95}
}

# ✅ orient='columns' (or omitting orient, since 'columns' is the default)
df_from_dict_orient_cols = pd.DataFrame.from_dict(student_data_nested, orient='columns')

print("DataFrame from nested dict (orient='columns'):")
print(df_from_dict_orient_cols)

Output:

DataFrame from nested dict (orient='columns'):
         student_A  student_B  student_C
Math            90         75         88
Science         85         92         80
History         78         88         95

This is useful if your nested dictionary is structured "by column."

Method 3: pd.DataFrame(nested_dict) (Often Outer Keys to Columns)

Passing a dictionary of dictionaries directly to the pd.DataFrame() constructor gives the same result as orient='columns': the outer keys become columns and the inner keys become the index.

import pandas as pd

# Using the same student_data_nested
student_data_nested = {
'student_A': {'Math': 90, 'Science': 85, 'History': 78},
'student_B': {'Math': 75, 'Science': 92, 'History': 88},
'student_C': {'Math': 88, 'Science': 80, 'History': 95}
}
df_constructor = pd.DataFrame(student_data_nested)

print("DataFrame from nested dict using pd.DataFrame() constructor:")
print(df_constructor)

Output: (Same as orient='columns')

DataFrame from nested dict using pd.DataFrame() constructor:
         student_A  student_B  student_C
Math            90         75         88
Science         85         92         80
History         78         88         95
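
To confirm that the constructor and from_dict agree for this kind of input, the following small check compares the three spellings with DataFrame.equals(). The variable names are illustrative.

```python
import pandas as pd

scores = {
    'student_A': {'Math': 90, 'Science': 85, 'History': 78},
    'student_B': {'Math': 75, 'Science': 92, 'History': 88},
}

df_ctor = pd.DataFrame(scores)
df_default = pd.DataFrame.from_dict(scores)                      # orient omitted
df_explicit = pd.DataFrame.from_dict(scores, orient='columns')   # orient spelled out

# True when all three frames have identical labels, dtypes, and values
all_equal = df_ctor.equals(df_default) and df_default.equals(df_explicit)
print(all_equal)
```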

Using .T (Transpose) to Flip Rows and Columns

If the pd.DataFrame() constructor gives you columns where you wanted rows (or vice-versa), you can use the transpose attribute .T to swap rows and columns.

import pandas as pd

student_data_nested = {
'student_A': {'Math': 90, 'Science': 85, 'History': 78},
'student_B': {'Math': 75, 'Science': 92, 'History': 88},
'student_C': {'Math': 88, 'Science': 80, 'History': 95}
}
df_constructor = pd.DataFrame(student_data_nested)

# pd.DataFrame(nested_dict) puts the outer keys in the columns.
# If you wanted them as rows, transpose the result.
df_transposed = df_constructor.T # .T is shorthand for .transpose()

print("Transposed DataFrame (to get outer keys as rows):")
print(df_transposed)

Output:

Transposed DataFrame (to get outer keys as rows):
           Math  Science  History
student_A    90       85       78
student_B    75       92       88
student_C    88       80       95
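
One caveat worth checking when the inner dictionaries mix types (for example strings and integers): transposing can leave every column with the generic object dtype, whereas orient='index' infers a dtype per column. The sketch below, using the small two-record dictionary from the start of this guide, compares the two routes and shows infer_objects() as one possible way to recover the dtypes after a transpose.

```python
import pandas as pd

people = {
    'record1': {'name': 'Alice', 'age': 30, 'city': 'New York'},
    'record2': {'name': 'Bob', 'age': 24, 'city': 'London'},
}

# Route 1: constructor + transpose (columns start out holding mixed types)
df_via_transpose = pd.DataFrame(people).T

# Route 2: from_dict with orient='index' (dtype inferred per column)
df_via_orient = pd.DataFrame.from_dict(people, orient='index')

print(df_via_transpose.dtypes)
print(df_via_orient.dtypes)

# Ask pandas to re-infer better dtypes for object columns
df_fixed = df_via_transpose.infer_objects()
print(df_fixed.dtypes)
```

For purely numeric data such as the student scores above, both routes give the same result, so this only matters for mixed-type records.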

Handling Nested Dictionaries with List Values

If your inner dictionaries contain lists as values, pd.DataFrame.from_dict(..., orient='index') will create columns where each cell contains a list.

import pandas as pd

data_with_lists = {
'SensorA': {'Temperature': [22, 23, 22], 'Humidity': [60, 62, 61]},
'SensorB': {'Temperature': [25, 25, 26], 'Humidity': [55, 54, 55]}
}

df_lists = pd.DataFrame.from_dict(data_with_lists, orient='index')
print("DataFrame from dict with list values:")
print(df_lists)
print()

# To expand these lists into separate rows (if desired), you can chain
# .stack().explode(), or call DataFrame.explode() on the list columns.
df_exploded = df_lists.stack().explode().reset_index()
df_exploded.columns = ['Sensor', 'Metric', 'Value']
print("Exploded DataFrame:")
print(df_exploded.head()) # Show first few rows

Output:

DataFrame from dict with list values:
          Temperature      Humidity
SensorA  [22, 23, 22]  [60, 62, 61]
SensorB  [25, 25, 26]  [55, 54, 55]

Exploded DataFrame:
    Sensor       Metric  Value
0  SensorA  Temperature     22
1  SensorA  Temperature     23
2  SensorA  Temperature     22
3  SensorA     Humidity     60
4  SensorA     Humidity     62

Further processing with explode() or other methods might be needed depending on the desired final structure.
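
If you would rather keep 'Temperature' and 'Humidity' as separate columns with one row per reading, a possible alternative is to explode both list columns at once. This sketch assumes pandas 1.3 or newer (where DataFrame.explode() accepts a list of columns) and that the lists in each row have matching lengths.

```python
import pandas as pd

data_with_lists = {
    'SensorA': {'Temperature': [22, 23, 22], 'Humidity': [60, 62, 61]},
    'SensorB': {'Temperature': [25, 25, 26], 'Humidity': [55, 54, 55]},
}
df_lists = pd.DataFrame.from_dict(data_with_lists, orient='index')

# Explode both list columns in parallel: one row per (sensor, reading)
df_rows = df_lists.explode(['Temperature', 'Humidity'])

# Move the sensor label out of the index into a regular column
df_rows = df_rows.rename_axis('Sensor').reset_index()

print(df_rows)
```

The exploded columns usually come back with object dtype, so a follow-up astype() call may be needed before doing arithmetic on them.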

Converting Deeply Nested Dictionaries (Custom Logic/List Comprehensions)

If your dictionary is more deeply nested or has an irregular structure not directly supported by from_dict's orient parameter, you might need to first "flatten" the dictionary into a list of simpler dictionaries or a list of lists using custom Python logic (often list comprehensions or loops) before passing it to pd.DataFrame().

import pandas as pd

deeply_nested_dict = {
'dept_sales': {
'employee_101': {'name': 'Alice', 'region': 'North', 'sales': 1000},
'employee_102': {'name': 'Bob', 'region': 'South', 'sales': 1500}
},
'dept_hr': {
'employee_201': {'name': 'Charlie', 'region': 'North', 'role': 'Manager'},
'employee_202': {'name': 'David', 'region': 'West', 'role': 'Assistant'}
}
}

# Flatten into a list of dictionaries
flattened_data = []
for dept_name, employees in deeply_nested_dict.items():
    for emp_id, emp_details in employees.items():
        record = {'department': dept_name, 'employee_id': emp_id}
        record.update(emp_details) # Add all inner details
        flattened_data.append(record)

df_deep_flat = pd.DataFrame(flattened_data)
print("DataFrame from deeply nested dict after custom flattening:")
print(df_deep_flat[['department', 'employee_id', 'name', 'region', 'sales', 'role']]) # Fix the column order for display

Output:

DataFrame from deeply nested dict after custom flattening:
   department   employee_id     name  region   sales       role
0  dept_sales  employee_101    Alice   North  1000.0        NaN
1  dept_sales  employee_102      Bob   South  1500.0        NaN
2     dept_hr  employee_201  Charlie   North     NaN    Manager
3     dept_hr  employee_202    David    West     NaN  Assistant

This approach gives you full control over how the nested structure is translated into rows and columns.
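
When the nesting is regular (two levels of keys above the attribute dictionaries, as in the example above), one possible alternative to a hand-written loop is to build one DataFrame per outer key and combine them with pd.concat(). The sketch below passes a dictionary of DataFrames to pd.concat(), which uses the dictionary keys as the outer index level; the level names 'department' and 'employee_id' are chosen here for illustration.

```python
import pandas as pd

deeply_nested_dict = {
    'dept_sales': {
        'employee_101': {'name': 'Alice', 'region': 'North', 'sales': 1000},
        'employee_102': {'name': 'Bob', 'region': 'South', 'sales': 1500},
    },
    'dept_hr': {
        'employee_201': {'name': 'Charlie', 'region': 'North', 'role': 'Manager'},
        'employee_202': {'name': 'David', 'region': 'West', 'role': 'Assistant'},
    },
}

# One DataFrame per department, keyed by department name
frames = {
    dept: pd.DataFrame.from_dict(employees, orient='index')
    for dept, employees in deeply_nested_dict.items()
}

# Dict keys become the outer index level; names labels both levels
df_multi = pd.concat(frames, names=['department', 'employee_id'])

print(df_multi)
print(df_multi.reset_index())  # flat version, comparable to the loop result
```

This keeps the hierarchy in a MultiIndex instead of flattening it away, which can be convenient for selecting a whole department with df_multi.loc['dept_sales'].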

Setting Index Names After Conversion

After creating a DataFrame where dictionary keys became the index, you can label that index by assigning to df.index.name. If the DataFrame was instead assembled with pd.concat() and its names argument, the index levels will already carry names.

import pandas as pd

student_data_nested = {
'student_A': {'Math': 90, 'Science': 85, 'History': 78},
'student_B': {'Math': 75, 'Science': 92, 'History': 88},
'student_C': {'Math': 88, 'Science': 80, 'History': 95}
}
df_from_dict_orient_index = pd.DataFrame.from_dict(student_data_nested, orient='index')

# Set the name of the index
df_from_dict_orient_index.index.name = 'StudentID'
print("DataFrame with named index:")
print(df_from_dict_orient_index)

Output:

DataFrame with named index:
           Math  Science  History
StudentID
student_A    90       85       78
student_B    75       92       88
student_C    88       80       95
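
Assigning to index.name modifies the DataFrame in place. As a chainable alternative, rename_axis() returns a new DataFrame with the index named, and reset_index() can then turn that named index into an ordinary column. A minimal sketch, with illustrative variable names:

```python
import pandas as pd

scores = {
    'student_A': {'Math': 90, 'Science': 85},
    'student_B': {'Math': 75, 'Science': 92},
}

# Name the index as part of the construction chain
df_scores = pd.DataFrame.from_dict(scores, orient='index').rename_axis('StudentID')

# Promote the named index to a regular column
df_scores_flat = df_scores.reset_index()

print(df_scores)
print(df_scores_flat)
```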

Conclusion

Pandas offers several ways to convert nested dictionaries into DataFrames, depending on how the dictionary's structure should map to rows and columns:

  • pd.DataFrame.from_dict(my_dict, orient='index'): Ideal when outer dictionary keys should become the DataFrame's row index and inner keys its columns.
  • pd.DataFrame(my_dict) or pd.DataFrame.from_dict(my_dict, orient='columns'): Use when outer dictionary keys should become DataFrame columns and inner keys the row index. You can use .T to transpose if the initial orientation is not what you need.
  • Custom Flattening Logic (e.g., list comprehensions) + pd.DataFrame(): Necessary for more complex or irregularly nested dictionaries to transform the data into a list of records (dictionaries) or list of lists suitable for DataFrame creation.

Understanding the orient parameter of from_dict and the default behavior of the DataFrame constructor is key to efficiently converting your nested dictionary data into a structured Pandas DataFrame.