Python Pandas: How to Convert GroupBy Results to Dictionary of Lists/DataFrames
After performing a groupby() operation in Pandas, you get a DataFrameGroupBy object. A common next step is to convert the results of this grouping into a Python dictionary, where keys are the group names and values are lists (or even sub-DataFrames) of the corresponding group's data. This format can be convenient for further processing, JSON serialization, or custom data structuring.
This guide explains how to convert Pandas GroupBy results into a dictionary of lists or a dictionary of DataFrames.
Understanding GroupBy Objects
When you call df.groupby('ColumnA'), Pandas creates a DataFrameGroupBy object. This object itself doesn't immediately show the groups; rather, it holds information about how the DataFrame has been split. You can iterate over this object to get (group_name, group_dataframe) pairs, apply aggregation methods (.sum(), .mean(), .size(), etc.) to it, or run an arbitrary function on each group with .apply().
Our goal is to take these (group_name, group_data) relationships and structure them into a Python dictionary.
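To make the (group_name, group_dataframe) pairs concrete, here is a minimal sketch. The DataFrame demo_df and its column names are made up for illustration and are not part of the example used in the rest of this guide.

```python
import pandas as pd

# Small illustrative DataFrame (hypothetical data)
demo_df = pd.DataFrame({
    'ColumnA': ['x', 'y', 'x'],
    'Value': [1, 2, 3],
})

grouped = demo_df.groupby('ColumnA')

# Each iteration yields a (group_name, group_dataframe) pair;
# here we record the name and the shape of each sub-DataFrame
pairs = [(name, group.shape) for name, group in grouped]
print(pairs)  # [('x', (2, 2)), ('y', (1, 2))]
```

Group 'x' has two rows and group 'y' has one; each sub-DataFrame keeps both original columns.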
Example DataFrame
import pandas as pd
data = {
'Department': ['Sales', 'HR', 'Engineering', 'Sales', 'HR', 'Engineering', 'Sales'],
'Employee': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank', 'Grace'],
'Salary': [70000, 60000, 95000, 72000, 65000, 105000, 68000],
'Project': ['P1', 'P2', 'P3', 'P1', 'P3', 'P4', 'P2']
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
Output:
Original DataFrame:
    Department  Employee  Salary  Project
0        Sales     Alice   70000       P1
1           HR       Bob   60000       P2
2  Engineering   Charlie   95000       P3
3        Sales     David   72000       P1
4           HR       Eve   65000       P3
5  Engineering     Frank  105000       P4
6        Sales     Grace   68000       P2
We'll group by 'Department' and create dictionaries from the 'Salary' or 'Employee' columns, or entire sub-DataFrames.
Convert GroupBy Results to Dictionary of Lists
Here, the dictionary keys will be the unique group names (e.g., 'Sales', 'HR'), and the values will be lists of items from another column corresponding to each group.
Using groupby().apply(list).to_dict() (Recommended for single-column lists)
This is a concise way to get a dictionary where values are lists from a single specified column.
import pandas as pd
df_example = pd.DataFrame({
'Department': ['Sales', 'HR', 'Sales', 'HR', 'Sales'],
'Salary': [70, 60, 72, 65, 68]
})
# Group by 'Department' and collect 'Salary' values into lists
dict_of_salary_lists = df_example.groupby('Department')['Salary'].apply(list).to_dict()
print("Dictionary of Salary Lists per Department (using apply(list).to_dict()):")
print(dict_of_salary_lists)
Output:
Dictionary of Salary Lists per Department (using apply(list).to_dict()):
{'HR': [60, 65], 'Sales': [70, 72, 68]}
- df_example.groupby('Department'): Creates the GroupBy object.
- ['Salary']: Selects the 'Salary' column within each group. This results in a SeriesGroupBy object.
- .apply(list): For each group in the SeriesGroupBy object, applies the list constructor, converting the Series of salaries for that group into a Python list. This produces a Series where the index is 'Department' and the values are lists of salaries.
- .to_dict(): Converts this resulting Series (index -> list) into a dictionary.
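The chain above can also be split into its intermediate steps, which makes it easier to inspect what each call returns. This is only a sketch; the variable names (df_steps, salary_groups, salary_lists, salary_dict) are chosen here for illustration.

```python
import pandas as pd

df_steps = pd.DataFrame({
    'Department': ['Sales', 'HR', 'Sales', 'HR', 'Sales'],
    'Salary': [70, 60, 72, 65, 68],
})

# Step 1 and 2: group, then select one column -> SeriesGroupBy
salary_groups = df_steps.groupby('Department')['Salary']

# Step 3: one list per group -> Series indexed by 'Department'
salary_lists = salary_groups.apply(list)

# Step 4: Series (index -> list) -> plain dictionary
salary_dict = salary_lists.to_dict()

print(type(salary_groups).__name__)   # SeriesGroupBy
print(salary_lists.index.tolist())    # ['HR', 'Sales']
print(salary_dict)                    # {'HR': [60, 65], 'Sales': [70, 72, 68]}
```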
Using a Dictionary Comprehension
A dictionary comprehension makes the iteration over groups explicit and lets you transform each group's values while building the dictionary.
import pandas as pd
df_example = pd.DataFrame({
'Department': ['Sales', 'HR', 'Sales', 'HR', 'Sales'],
'Salary': [70, 60, 72, 65, 68]
})
# Iterating over the grouped 'Salary' column yields (name, Series) pairs
dict_comp_salary_lists = {
    dept_name: group_salaries.tolist()  # .tolist() converts the Series to a list
    for dept_name, group_salaries in df_example.groupby('Department')['Salary']
}
print("Dictionary of Salary Lists per Department (dict comprehension):")
print(dict_comp_salary_lists)
Output:
Dictionary of Salary Lists per Department (dict comprehension):
{'HR': [60, 65], 'Sales': [70, 72, 68]}
- df_example.groupby('Department')['Salary']: When iterated, yields (group_name, series_for_that_group) pairs.
- group_salaries.tolist(): Converts the Series of salaries for the current group into a list.
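Because the comprehension handles one group at a time, it is a natural place to adjust the values as the lists are built. As a sketch, the variant below sorts each department's salaries in descending order; the sorting step is an illustrative addition, not something the original example requires.

```python
import pandas as pd

df_sorted_example = pd.DataFrame({
    'Department': ['Sales', 'HR', 'Sales', 'HR', 'Sales'],
    'Salary': [70, 60, 72, 65, 68],
})

# Same pattern as before, with a per-group transformation (descending sort)
dict_sorted_salaries = {
    dept_name: sorted(group_salaries.tolist(), reverse=True)
    for dept_name, group_salaries in df_sorted_example.groupby('Department')['Salary']
}

print(dict_sorted_salaries)  # {'HR': [65, 60], 'Sales': [72, 70, 68]}
```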
Using a for Loop
A traditional for loop can achieve the same result.
import pandas as pd
df_example = pd.DataFrame({
'Department': ['Sales', 'HR', 'Sales', 'HR', 'Sales'],
'Salary': [70, 60, 72, 65, 68]
})
dict_loop_salary_lists = {}
for dept_name, group_salaries in df_example.groupby('Department')['Salary']:
    dict_loop_salary_lists[dept_name] = group_salaries.tolist()
print("Dictionary of Salary Lists per Department (for loop):")
print(dict_loop_salary_lists)
Output:
Dictionary of Salary Lists per Department (for loop):
{'HR': [60, 65], 'Sales': [70, 72, 68]}
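Since a dictionary of lists contains only plain Python objects, it can be passed to the standard library's json module. The following is a minimal sketch of that use case; it relies on .tolist(), which returns native Python integers, and the variable names are illustrative.

```python
import json

import pandas as pd

df_json_example = pd.DataFrame({
    'Department': ['Sales', 'HR', 'Sales', 'HR', 'Sales'],
    'Salary': [70, 60, 72, 65, 68],
})

# Build the dictionary of lists, then serialize it to a JSON string
salary_lists_by_dept = {
    dept_name: group_salaries.tolist()
    for dept_name, group_salaries in df_json_example.groupby('Department')['Salary']
}
json_text = json.dumps(salary_lists_by_dept)

print(json_text)  # {"HR": [60, 65], "Sales": [70, 72, 68]}
```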
Convert GroupBy Results to Dictionary of DataFrames
Here, the dictionary keys are the group names, and the values are the actual sub-DataFrame objects corresponding to each group.
Using dict(tuple(df.groupby(...))) (Recommended)
Iterating over a DataFrameGroupBy object yields (name, group_df) tuples. You can convert this sequence of tuples directly into a dictionary.
import pandas as pd
df_full = pd.DataFrame({
'Department': ['Sales', 'HR', 'Engineering', 'Sales'],
'Employee': ['Alice', 'Bob', 'Charlie', 'David'],
'Salary': [70000, 60000, 95000, 72000]
})
# Create the GroupBy object
grouped_by_dept = df_full.groupby('Department')
# ✅ Convert the iterator of (name, group_df) tuples into a dictionary
dict_of_dataframes = dict(tuple(grouped_by_dept))
print("Dictionary of DataFrames per Department:")
for dept, sub_df in dict_of_dataframes.items():
    print(f"--- Department: {dept} ---")
    print(sub_df)
    print()
Output:
Dictionary of DataFrames per Department:
--- Department: Engineering ---
    Department  Employee  Salary
2  Engineering   Charlie   95000

--- Department: HR ---
   Department  Employee  Salary
1          HR       Bob   60000

--- Department: Sales ---
   Department  Employee  Salary
0       Sales     Alice   70000
3       Sales     David   72000
- tuple(grouped_by_dept): Converts the GroupBy iterator into a tuple of (name, df) pairs.
- dict(...): Converts this tuple of pairs into a dictionary.
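Once the dictionary exists, each sub-DataFrame is available by key. The sketch below looks up one group and compares it with GroupBy.get_group(), which retrieves a single group without building the full dictionary. The variable names are illustrative, and the comparison is expected to report matching frames rather than guaranteed to do so on every Pandas version.

```python
import pandas as pd

df_lookup = pd.DataFrame({
    'Department': ['Sales', 'HR', 'Engineering', 'Sales'],
    'Employee': ['Alice', 'Bob', 'Charlie', 'David'],
    'Salary': [70000, 60000, 95000, 72000],
})

grouped_lookup = df_lookup.groupby('Department')
frames_by_dept = dict(tuple(grouped_lookup))

# Keys are the group names, in sorted order by default
print(list(frames_by_dept))  # ['Engineering', 'HR', 'Sales']

# Values are ordinary DataFrames that keep the original row labels
sales_df = frames_by_dept['Sales']
print(sales_df['Employee'].tolist())  # ['Alice', 'David']
print(sales_df.index.tolist())        # [0, 3]

# get_group() fetches one group directly from the GroupBy object
print(sales_df.equals(grouped_lookup.get_group('Sales')))
```

If only one or two groups are needed, get_group() avoids materializing every sub-DataFrame; the dictionary is more convenient when all groups will be visited.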
Using a Dictionary Comprehension
import pandas as pd
df_full = pd.DataFrame({
'Department': ['Sales', 'HR', 'Engineering', 'Sales'],
'Employee': ['Alice', 'Bob', 'Charlie', 'David'],
'Salary': [70000, 60000, 95000, 72000]
})
dict_of_dfs_comp = {
dept_name: group_df
for dept_name, group_df in df_full.groupby('Department')
}
print("Dictionary of DataFrames (dict comprehension):")
print(dict_of_dfs_comp['Sales']) # Example access
print()
for dept, sub_df in dict_of_dfs_comp.items():
    print(f"--- Department: {dept} ---")
    print(sub_df)
    print()
Output:
Dictionary of DataFrames (dict comprehension):
   Department  Employee  Salary
0       Sales     Alice   70000
3       Sales     David   72000

--- Department: Engineering ---
    Department  Employee  Salary
2  Engineering   Charlie   95000

--- Department: HR ---
   Department  Employee  Salary
1          HR       Bob   60000

--- Department: Sales ---
   Department  Employee  Salary
0       Sales     Alice   70000
3       Sales     David   72000
This is arguably even more readable than the dict(tuple(...)) approach for creating a dictionary of DataFrames.
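The comprehension form also makes it easy to post-process each sub-DataFrame. As one possible variation (an illustrative assumption, not part of the original example), the sketch below drops the now-redundant 'Department' column and resets each group's index.

```python
import pandas as pd

df_clean_example = pd.DataFrame({
    'Department': ['Sales', 'HR', 'Engineering', 'Sales'],
    'Employee': ['Alice', 'Bob', 'Charlie', 'David'],
    'Salary': [70000, 60000, 95000, 72000],
})

# The key already carries the department name, so the column can be dropped;
# reset_index(drop=True) renumbers each group's rows from 0
dict_of_clean_dfs = {
    dept_name: group_df.drop(columns='Department').reset_index(drop=True)
    for dept_name, group_df in df_clean_example.groupby('Department')
}

print(dict_of_clean_dfs['Sales'])
```

Each value is a new DataFrame, so the original df_clean_example is left unchanged.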
Choosing the Right Method
- Dictionary of Lists (from a single column):
  - df.groupby('group_col')['value_col'].apply(list).to_dict(): Very concise and idiomatic Pandas.
  - Dict comprehension: Pythonic and clear, good if you need slight modifications during list creation.
- Dictionary of DataFrames:
  - dict(tuple(df.groupby('group_col'))) or the dict comprehension {name: group for name, group in df.groupby('group_col')}: Both are clear and Pythonic. The dict comprehension might be slightly more explicit.
Conclusion
Converting Pandas GroupBy results to dictionaries provides a flexible way to structure grouped data for various purposes.
- To create a dictionary of lists (where values are lists of elements from a specific column for each group), use df.groupby('group_col')['value_col'].apply(list).to_dict() or an equivalent dictionary comprehension.
- To create a dictionary of DataFrames (where values are the sub-DataFrames for each group), iterate over the DataFrameGroupBy object (which yields (name, group_df) pairs) and construct a dictionary, for instance, using dict(tuple(df.groupby('group_col'))).
These methods allow you to easily transition from a grouped representation to a standard Python dictionary structure tailored to your needs.