Python Pandas: How to Convert DataFrame to List of Dictionaries
Converting a Pandas DataFrame into a list of dictionaries is a common requirement when you need to process data row by row in a dictionary format, serialize it to JSON (where a list of objects is standard), or pass it to APIs that expect this structure. Pandas provides the versatile DataFrame.to_dict() method, which, with the correct orient parameter, makes this conversion straightforward.
This guide explains how to effectively use DataFrame.to_dict(orient='records') and other related techniques to transform your DataFrame into a list of dictionaries.
The Goal: DataFrame Rows as Dictionaries in a List
We want to transform a Pandas DataFrame where each row becomes a dictionary. In these dictionaries, the DataFrame's column names will be the keys, and the cell values for that row will be the dictionary values. All these row-dictionaries will then be collected into a Python list.
Example Structure:
If a DataFrame row is: Name: Alice, Age: 30
The corresponding dictionary in the list will be: {'Name': 'Alice', 'Age': 30}
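As a minimal sketch of that mapping, the single row above can be built and converted as follows. The variable names df_row and rows are illustrative only and do not appear elsewhere in this guide.

```python
import pandas as pd

# One-row DataFrame matching the example structure above
df_row = pd.DataFrame({'Name': ['Alice'], 'Age': [30]})

# Each row becomes one dictionary keyed by column name
rows = df_row.to_dict(orient='records')
print(rows)  # [{'Name': 'Alice', 'Age': 30}]
```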
Example DataFrame
import pandas as pd
data = {
'EmployeeID': ['E101', 'E102', 'E103', 'E104'],
'Name': ['Alice Smith', 'Bob Johnson', 'Charlie Brown', 'Diana Prince'],
'Department': ['HR', 'Engineering', 'HR', 'Sales'],
'Salary': [60000, 85000, 62000, 70000]
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
Output:
Original DataFrame:
  EmployeeID           Name   Department  Salary
0       E101    Alice Smith           HR   60000
1       E102    Bob Johnson  Engineering   85000
2       E103  Charlie Brown           HR   62000
3       E104   Diana Prince        Sales   70000
Method 1: Using DataFrame.to_dict(orient='records') (Recommended)
The DataFrame.to_dict(orient='records') method is specifically designed for this conversion. Setting orient='records' tells Pandas to create a list where each element is a dictionary representing a row.
import pandas as pd
df_example = pd.DataFrame({
'EmployeeID': ['E101', 'E102', 'E103', 'E104'],
'Name': ['Alice Smith', 'Bob Johnson', 'Charlie Brown', 'Diana Prince'],
'Department': ['HR', 'Engineering', 'HR', 'Sales'],
'Salary': [60000, 85000, 62000, 70000]
})
# ✅ Convert DataFrame to a list of dictionaries
list_of_dicts = df_example.to_dict(orient='records')
print("List of Dictionaries (orient='records'):")
# For better readability, print one dictionary per line
for record in list_of_dicts:
    print(record)
print(f"Type of result: {type(list_of_dicts)}")
if list_of_dicts:
    print(f"Type of first element: {type(list_of_dicts[0])}")
Output:
List of Dictionaries (orient='records'):
{'EmployeeID': 'E101', 'Name': 'Alice Smith', 'Department': 'HR', 'Salary': 60000}
{'EmployeeID': 'E102', 'Name': 'Bob Johnson', 'Department': 'Engineering', 'Salary': 85000}
{'EmployeeID': 'E103', 'Name': 'Charlie Brown', 'Department': 'HR', 'Salary': 62000}
{'EmployeeID': 'E104', 'Name': 'Diana Prince', 'Department': 'Sales', 'Salary': 70000}
Type of result: <class 'list'>
Type of first element: <class 'dict'>
This is the most direct and idiomatic way to achieve the desired list of dictionaries.
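Because JSON serialization is one of the main reasons for this conversion, here is a small sketch of passing the result to the standard library's json module. It assumes the columns hold only strings and integers. Columns with timestamps or missing values may need extra handling, and DataFrame.to_json(orient='records') is an alternative worth considering in those cases. The names df_json, records, and json_text are illustrative.

```python
import json
import pandas as pd

df_json = pd.DataFrame({
    'EmployeeID': ['E101', 'E102'],
    'Salary': [60000, 85000],
})

records = df_json.to_dict(orient='records')

# The list can be passed to json.dumps directly because the
# values in this example are plain strings and integers
json_text = json.dumps(records, indent=2)
print(json_text)

# Round-trip check: parsing the JSON text gives back the same records
restored = json.loads(json_text)
print(restored == records)
```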
Understanding Other orient Options of to_dict()
The to_dict() method is versatile, and its behavior changes significantly with the orient parameter:
orient='dict' (Default)
If orient is not specified or is set to 'dict', to_dict() returns a dictionary of dictionaries, where the outer keys are column names and the inner dictionaries map index labels to cell values.
import pandas as pd
df_example = pd.DataFrame({
'EmployeeID': ['E101', 'E102', 'E103', 'E104'],
'Name': ['Alice Smith', 'Bob Johnson', 'Charlie Brown', 'Diana Prince'],
'Department': ['HR', 'Engineering', 'HR', 'Sales'],
'Salary': [60000, 85000, 62000, 70000]
})
dict_of_dicts = df_example.to_dict() # Default orient='dict'
print("Result of to_dict() with orient='dict' (default):")
print(dict_of_dicts)
Output:
Result of to_dict() with orient='dict' (default):
{'EmployeeID': {0: 'E101', 1: 'E102', 2: 'E103', 3: 'E104'}, 'Name': {0: 'Alice Smith', 1: 'Bob Johnson', 2: 'Charlie Brown', 3: 'Diana Prince'}, 'Department': {0: 'HR', 1: 'Engineering', 2: 'HR', 3: 'Sales'}, 'Salary': {0: 60000, 1: 85000, 2: 62000, 3: 70000}}
orient='index'
This returns a dictionary where outer keys are the DataFrame's index labels, and inner dictionaries map column names to cell values for that row.
import pandas as pd
df_example = pd.DataFrame({
'EmployeeID': ['E101', 'E102', 'E103', 'E104'],
'Name': ['Alice Smith', 'Bob Johnson', 'Charlie Brown', 'Diana Prince'],
'Department': ['HR', 'Engineering', 'HR', 'Sales'],
'Salary': [60000, 85000, 62000, 70000]
})
dict_orient_index = df_example.to_dict(orient='index')
print("Result of to_dict() with orient='index':")
print(dict_orient_index)
Output:
Result of to_dict() with orient='index':
{0: {'EmployeeID': 'E101', 'Name': 'Alice Smith', 'Department': 'HR', 'Salary': 60000}, 1: {'EmployeeID': 'E102', 'Name': 'Bob Johnson', 'Department': 'Engineering', 'Salary': 85000}, 2: {'EmployeeID': 'E103', 'Name': 'Charlie Brown', 'Department': 'HR', 'Salary': 62000}, 3: {'EmployeeID': 'E104', 'Name': 'Diana Prince', 'Department': 'Sales', 'Salary': 70000}}
While orient='index' gives you one dictionary per row, the overall structure is a dictionary keyed by index label, not a list of dictionaries.
For converting to a list of dictionaries, orient='records' is the correct choice.
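To see the three shapes side by side, the following sketch prints the container type each orient produces on a small two-row DataFrame (df_small is an illustrative name). It also checks that discarding the index keys from the 'index' result leaves the same rows that 'records' returns.

```python
import pandas as pd

df_small = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Salary': [60000, 85000]})

as_dict = df_small.to_dict()                     # {column: {index: value}}
as_index = df_small.to_dict(orient='index')      # {index: {column: value}}
as_records = df_small.to_dict(orient='records')  # [{column: value}, ...]

for label, result in [('dict', as_dict), ('index', as_index), ('records', as_records)]:
    print(f"{label:<8} -> {type(result).__name__}")

# Dropping the index keys from the 'index' result leaves the same rows as 'records'
print(list(as_index.values()) == as_records)
```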
Including the DataFrame Index in Each Dictionary
If you want the DataFrame's index to be included as a key-value pair in each dictionary within the list, first use df.reset_index() to turn the index into a regular column.
import pandas as pd
df_custom_index = pd.DataFrame({
'Name': ['Alice', 'Bob'], 'Salary': [60, 85]
}, index=pd.Index(['emp1', 'emp2'], name='EmpCode'))
print("DataFrame with custom named index:")
print(df_custom_index)
# Step 1: Reset the index to make 'EmpCode' a column
df_with_index_col = df_custom_index.reset_index()
print("After reset_index():")
print(df_with_index_col)
print()
# Step 2: Convert to list of dictionaries
list_of_dicts_with_index = df_with_index_col.to_dict(orient='records')
print("List of Dictionaries including original index as 'EmpCode':")
for record in list_of_dicts_with_index:
    print(record)
Output:
DataFrame with custom named index:
          Name  Salary
EmpCode
emp1     Alice      60
emp2       Bob      85
After reset_index():
  EmpCode   Name  Salary
0    emp1  Alice      60
1    emp2    Bob      85

List of Dictionaries including original index as 'EmpCode':
{'EmpCode': 'emp1', 'Name': 'Alice', 'Salary': 60}
{'EmpCode': 'emp2', 'Name': 'Bob', 'Salary': 85}
If the original index was unnamed, reset_index() creates a column named 'index'.
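The unnamed-index case can be sketched like this. The DataFrame df_unnamed and the replacement key 'EmpCode' are illustrative choices. Renaming the generated column is optional and shown only as one way to get a more descriptive key.

```python
import pandas as pd

df_unnamed = pd.DataFrame(
    {'Name': ['Alice', 'Bob'], 'Salary': [60, 85]},
    index=['emp1', 'emp2'],  # this index has no name
)

# reset_index() stores the old index under the key 'index'
records_default = df_unnamed.reset_index().to_dict(orient='records')
print(records_default[0])  # {'index': 'emp1', 'Name': 'Alice', 'Salary': 60}

# Optionally rename the generated column before converting
records_renamed = (
    df_unnamed.reset_index()
    .rename(columns={'index': 'EmpCode'})
    .to_dict(orient='records')
)
print(records_renamed[0])  # {'EmpCode': 'emp1', 'Name': 'Alice', 'Salary': 60}
```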
Converting Specific Columns to a List of Dictionaries
If you only want a subset of columns in your list of dictionaries, select those columns before calling to_dict().
import pandas as pd
df_example = pd.DataFrame({
'EmployeeID': ['E101', 'E102', 'E103', 'E104'],
'Name': ['Alice Smith', 'Bob Johnson', 'Charlie Brown', 'Diana Prince'],
'Department': ['HR', 'Engineering', 'HR', 'Sales'],
'Salary': [60000, 85000, 62000, 70000]
})
# Select only 'Name' and 'Department' columns
columns_subset = ['Name', 'Department']
df_subset = df_example[columns_subset]
# ✅ Convert the subset DataFrame to a list of dictionaries
list_of_dicts_subset = df_subset.to_dict(orient='records')
print("List of Dictionaries from subset of columns ('Name', 'Department'):")
for record in list_of_dicts_subset:
    print(record)
Output:
List of Dictionaries from subset of columns ('Name', 'Department'):
{'Name': 'Alice Smith', 'Department': 'HR'}
{'Name': 'Bob Johnson', 'Department': 'Engineering'}
{'Name': 'Charlie Brown', 'Department': 'HR'}
{'Name': 'Diana Prince', 'Department': 'Sales'}
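Note that indexing with a list of column names, as in df_example[columns_subset] (equivalent to the double-bracket form df_example[['Name', 'Department']]), returns a DataFrame, so to_dict(orient='records') can be called on the result.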
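The difference between selecting with a list of labels and selecting with a single label matters here, as the following sketch illustrates (df_people is an illustrative name). A single label returns a Series, whose to_dict() maps index labels to values rather than producing row dictionaries.

```python
import pandas as pd

df_people = pd.DataFrame({
    'Name': ['Alice Smith', 'Bob Johnson'],
    'Department': ['HR', 'Engineering'],
    'Salary': [60000, 85000],
})

# A list of labels returns a DataFrame, so orient='records' is available
subset_records = df_people[['Name', 'Department']].to_dict(orient='records')
print(subset_records)

# A single label returns a Series; its to_dict() maps index labels to values
name_series = df_people['Name']
print(type(name_series).__name__)  # Series
print(name_series.to_dict())       # {0: 'Alice Smith', 1: 'Bob Johnson'}
```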
Alternative (Less Common): Using Transpose and dict.values()
This method is more circuitous for this specific task but demonstrates another way to_dict() can be used.
- Transpose the DataFrame (df.T) so rows become columns.
- Convert this transposed DataFrame to a dictionary (default orient='dict'). This gives a dictionary where the keys are the original row index labels and the values are dictionaries mapping the original column names to cell values.
- Take the .values() of this dictionary and convert it to a list.
import pandas as pd
df_example = pd.DataFrame({
'EmployeeID': ['E101', 'E102', 'E103', 'E104'],
'Name': ['Alice Smith', 'Bob Johnson', 'Charlie Brown', 'Diana Prince'],
'Department': ['HR', 'Engineering', 'HR', 'Sales'],
'Salary': [60000, 85000, 62000, 70000]
})
# Transpose, convert to dict, get values
transposed_dict = df_example.T.to_dict() # orient='dict' is default
print(transposed_dict)
print()
list_of_dicts_transpose = list(transposed_dict.values())
print("List of Dictionaries using transpose method:")
for record in list_of_dicts_transpose:
    print(record)
Output:
{0: {'EmployeeID': 'E101', 'Name': 'Alice Smith', 'Department': 'HR', 'Salary': 60000}, 1: {'EmployeeID': 'E102', 'Name': 'Bob Johnson', 'Department': 'Engineering', 'Salary': 85000}, 2: {'EmployeeID': 'E103', 'Name': 'Charlie Brown', 'Department': 'HR', 'Salary': 62000}, 3: {'EmployeeID': 'E104', 'Name': 'Diana Prince', 'Department': 'Sales', 'Salary': 70000}}
List of Dictionaries using transpose method:
{'EmployeeID': 'E101', 'Name': 'Alice Smith', 'Department': 'HR', 'Salary': 60000}
{'EmployeeID': 'E102', 'Name': 'Bob Johnson', 'Department': 'Engineering', 'Salary': 85000}
{'EmployeeID': 'E103', 'Name': 'Charlie Brown', 'Department': 'HR', 'Salary': 62000}
{'EmployeeID': 'E104', 'Name': 'Diana Prince', 'Department': 'Sales', 'Salary': 70000}
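This is generally less direct and potentially less performant than df.to_dict(orient='records'), in part because transposing a DataFrame with mixed column types first creates an intermediate object-dtype copy.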
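One way to check both points on your own data is a rough comparison like the sketch below. The 200-row DataFrame df_bench and the helper names via_records and via_transpose are made up for illustration. The timings depend on the machine and the pandas version, so no particular numbers are implied.

```python
import timeit
import pandas as pd

df_bench = pd.DataFrame({
    'EmployeeID': [f'E{i}' for i in range(200)],
    'Salary': list(range(200)),
})

def via_records():
    return df_bench.to_dict(orient='records')

def via_transpose():
    return list(df_bench.T.to_dict().values())

# Both approaches should produce the same rows
print(via_records() == via_transpose())

# Rough timing comparison; results vary between environments
t_records = timeit.timeit(via_records, number=10)
t_transpose = timeit.timeit(via_transpose, number=10)
print(f"records:   {t_records:.4f}s")
print(f"transpose: {t_transpose:.4f}s")
```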
Conclusion
Converting a Pandas DataFrame to a list of dictionaries is a common requirement for data interchange and processing.
- The most direct and recommended method is your_dataframe.to_dict(orient='records'). This clearly expresses the intent and efficiently produces the desired list of row-based dictionaries.
- If you need to include the DataFrame's index as a field in each dictionary, first use df.reset_index() to convert the index to a column, then apply to_dict(orient='records').
- To convert only specific columns, select the desired columns into a subset DataFrame first, then call to_dict(orient='records') on the subset.
By using to_dict(orient='records'), you can easily transform your tabular DataFrame data into a list of dictionary objects suitable for various downstream applications.
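The three recommendations above can be combined in one small wrapper. This is only a sketch: frame_to_records and its columns and include_index parameters are hypothetical names introduced here, not part of Pandas.

```python
import pandas as pd

def frame_to_records(df, columns=None, include_index=False):
    """Hypothetical helper: return the rows of df as a list of dictionaries."""
    if columns is not None:
        df = df[list(columns)]   # keep only the requested columns
    if include_index:
        df = df.reset_index()    # the index becomes a regular column
    return df.to_dict(orient='records')

df_demo = pd.DataFrame(
    {'Name': ['Alice', 'Bob'], 'Salary': [60, 85]},
    index=pd.Index(['emp1', 'emp2'], name='EmpCode'),
)

print(frame_to_records(df_demo))
print(frame_to_records(df_demo, columns=['Name'], include_index=True))
```

Selecting the columns before resetting the index keeps the index out of the column filter, so it can still be added back as a field afterwards.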