Python Pandas: How to Convert GroupBy Results to Dictionary of Lists/DataFrames

Calling groupby() on a Pandas DataFrame returns a DataFrameGroupBy object. A common next step is to convert the grouped data into a Python dictionary whose keys are the group names and whose values are lists (or sub-DataFrames) holding each group's data. This format is convenient for further processing, JSON serialization, or custom data structuring.

This guide explains how to convert Pandas GroupBy results into a dictionary of lists or a dictionary of DataFrames.

Understanding GroupBy Objects

When you call df.groupby('ColumnA'), Pandas creates a DataFrameGroupBy object. The object does not display the groups itself; it holds information about how the DataFrame has been split. You can iterate over it to get (group_name, group_dataframe) pairs, call aggregation methods such as .sum(), .mean(), or .size(), or run an arbitrary function on each group with .apply().

Our goal is to take these (group_name, group_data) relationships and structure them into a Python dictionary.
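As a quick illustration of the iteration behavior described above, the following minimal sketch groups a small, made-up DataFrame and collects each group's name and shape. The variable names (grouped, pairs) are chosen only for this example.

```python
import pandas as pd

# Small illustrative DataFrame (not the article's main example)
df = pd.DataFrame({
    'Department': ['Sales', 'HR', 'Sales'],
    'Salary': [70, 60, 72],
})

grouped = df.groupby('Department')

# The object only describes the split; it holds no printed result yet
print(type(grouped).__name__)

# Iterating yields (group_name, group_dataframe) pairs
pairs = [(name, group.shape) for name, group in grouped]
print(pairs)
```

Each pair holds the group key and the sub-DataFrame containing that group's rows, which is exactly the relationship the dictionaries in the following sections capture.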

Example DataFrame

import pandas as pd

data = {
    'Department': ['Sales', 'HR', 'Engineering', 'Sales', 'HR', 'Engineering', 'Sales'],
    'Employee': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank', 'Grace'],
    'Salary': [70000, 60000, 95000, 72000, 65000, 105000, 68000],
    'Project': ['P1', 'P2', 'P3', 'P1', 'P3', 'P4', 'P2']
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

Output:

Original DataFrame:
    Department  Employee  Salary  Project
0        Sales     Alice   70000       P1
1           HR       Bob   60000       P2
2  Engineering   Charlie   95000       P3
3        Sales     David   72000       P1
4           HR       Eve   65000       P3
5  Engineering     Frank  105000       P4
6        Sales     Grace   68000       P2

We'll group by 'Department' and create dictionaries from the 'Salary' or 'Employee' columns, or entire sub-DataFrames.

Convert GroupBy Results to Dictionary of Lists

Here, the dictionary keys will be the unique group names (e.g., 'Sales', 'HR'), and the values will be lists of items from another column corresponding to each group.

Chaining .apply(list).to_dict() onto the grouped column is a concise way to get a dictionary whose values are lists taken from a single specified column.

import pandas as pd

df_example = pd.DataFrame({
    'Department': ['Sales', 'HR', 'Sales', 'HR', 'Sales'],
    'Salary': [70, 60, 72, 65, 68]
})

# Group by 'Department' and collect 'Salary' values into lists
dict_of_salary_lists = df_example.groupby('Department')['Salary'].apply(list).to_dict()

print("Dictionary of Salary Lists per Department (using apply(list).to_dict()):")
print(dict_of_salary_lists)

Output:

Dictionary of Salary Lists per Department (using apply(list).to_dict()):
{'HR': [60, 65], 'Sales': [70, 72, 68]}

Step by step:

  • df_example.groupby('Department'): Creates the GroupBy object. Groups are sorted by key by default, which is why 'HR' appears before 'Sales' in the output.
  • ['Salary']: Selects the 'Salary' column within each group. This results in a SeriesGroupBy object.
  • .apply(list): Applies the list constructor to each group in the SeriesGroupBy object, converting that group's Series of salaries into a Python list. The result is a Series whose index is 'Department' and whose values are lists of salaries.
  • .to_dict(): Converts the resulting Series (index -> list) into a dictionary.
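To make the intermediate objects in the chain visible, here is a minimal sketch that splits it into separate steps. It also shows .agg(list) as an alternative to .apply(list); that variant is an addition for illustration, not something the steps above rely on.

```python
import pandas as pd

df_example = pd.DataFrame({
    'Department': ['Sales', 'HR', 'Sales', 'HR', 'Sales'],
    'Salary': [70, 60, 72, 65, 68],
})

# Step 1 and 2: group by department, then select the 'Salary' column
series_grouped = df_example.groupby('Department')['Salary']

# Step 3: one list per group, indexed by department
series_of_lists = series_grouped.apply(list)
print(series_of_lists)

# Step 4: Series (index -> list) becomes a plain dictionary
result = series_of_lists.to_dict()
print(result)

# Alternative for illustration: agg(list) builds the same Series of lists
result_agg = series_grouped.agg(list).to_dict()
print(result_agg == result)
```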

Using a Dictionary Comprehension

A dictionary comprehension makes the iteration over groups explicit and lets you transform each group's values while the list is built.

import pandas as pd

df_example = pd.DataFrame({
    'Department': ['Sales', 'HR', 'Sales', 'HR', 'Sales'],
    'Salary': [70, 60, 72, 65, 68]
})

# Iterating over the grouped 'Salary' column yields (name, Series) pairs
dict_comp_salary_lists = {
    dept_name: group_salaries.tolist()  # .tolist() converts the Series to a list
    for dept_name, group_salaries in df_example.groupby('Department')['Salary']
}

print("Dictionary of Salary Lists per Department (dict comprehension):")
print(dict_comp_salary_lists)

Output:

Dictionary of Salary Lists per Department (dict comprehension):
{'HR': [60, 65], 'Sales': [70, 72, 68]}

How it works:

  • df_example.groupby('Department')['Salary']: When iterated, yields (group_name, series_for_that_group) pairs.
  • group_salaries.tolist(): Converts the Series of salaries for the current group into a list.
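Because the comprehension exposes each group's Series, small transformations are easy to slot in. The sketch below sorts each department's salaries in descending order; the sorting step is an illustrative assumption, not part of the original example.

```python
import pandas as pd

df_example = pd.DataFrame({
    'Department': ['Sales', 'HR', 'Sales', 'HR', 'Sales'],
    'Salary': [70, 60, 72, 65, 68],
})

# Same comprehension as before, with each list sorted highest-first
sorted_salary_lists = {
    dept_name: sorted(group_salaries.tolist(), reverse=True)
    for dept_name, group_salaries in df_example.groupby('Department')['Salary']
}

print(sorted_salary_lists)
# {'HR': [65, 60], 'Sales': [72, 70, 68]}
```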

Using a for Loop

A traditional for loop achieves the same result and leaves room for extra per-group logic.

import pandas as pd

df_example = pd.DataFrame({
    'Department': ['Sales', 'HR', 'Sales', 'HR', 'Sales'],
    'Salary': [70, 60, 72, 65, 68]
})

dict_loop_salary_lists = {}
for dept_name, group_salaries in df_example.groupby('Department')['Salary']:
    # Store each group's salaries as a plain Python list
    dict_loop_salary_lists[dept_name] = group_salaries.tolist()

print("Dictionary of Salary Lists per Department (for loop):")
print(dict_loop_salary_lists)

Output:

Dictionary of Salary Lists per Department (for loop):
{'HR': [60, 65], 'Sales': [70, 72, 68]}
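
A loop is handy when some groups should be skipped or handled differently. The following sketch keeps only departments with at least a given number of rows; min_group_size is a hypothetical threshold introduced purely for this example.

```python
import pandas as pd

df_example = pd.DataFrame({
    'Department': ['Sales', 'HR', 'Sales', 'HR', 'Sales'],
    'Salary': [70, 60, 72, 65, 68],
})

min_group_size = 3  # hypothetical threshold, chosen for illustration

large_groups = {}
for dept_name, group_salaries in df_example.groupby('Department')['Salary']:
    # Skip departments with too few rows
    if len(group_salaries) < min_group_size:
        continue
    large_groups[dept_name] = group_salaries.tolist()

print(large_groups)
# {'Sales': [70, 72, 68]}
```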

Convert GroupBy Results to Dictionary of DataFrames

Here, the dictionary keys are the group names, and the values are the actual sub-DataFrame objects corresponding to each group.

Iterating over a DataFrameGroupBy object yields (name, group_df) tuples. You can convert this sequence of tuples directly into a dictionary.

import pandas as pd

df_full = pd.DataFrame({
    'Department': ['Sales', 'HR', 'Engineering', 'Sales'],
    'Employee': ['Alice', 'Bob', 'Charlie', 'David'],
    'Salary': [70000, 60000, 95000, 72000]
})

# Create the GroupBy object
grouped_by_dept = df_full.groupby('Department')

# ✅ Convert the iterator of (name, group_df) tuples into a dictionary
dict_of_dataframes = dict(tuple(grouped_by_dept))

print("Dictionary of DataFrames per Department:")
for dept, sub_df in dict_of_dataframes.items():
    print(f"--- Department: {dept} ---")
    print(sub_df)
    print()

Output:

Dictionary of DataFrames per Department:
--- Department: Engineering ---
    Department  Employee  Salary
2  Engineering   Charlie   95000

--- Department: HR ---
   Department  Employee  Salary
1          HR       Bob   60000

--- Department: Sales ---
   Department  Employee  Salary
0       Sales     Alice   70000
3       Sales     David   72000
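
How it works:

  • tuple(grouped_by_dept): Iterates over the GroupBy object and collects the (name, group_df) pairs into a tuple.
  • dict(...): Converts this tuple of pairs into a dictionary. Each sub-DataFrame keeps the row index labels of the original DataFrame.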
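The key order of the resulting dictionary follows the order in which groupby() yields the groups. By default the groups are sorted by key; passing sort=False keeps them in order of first appearance instead. A minimal sketch comparing the two, with variable names chosen for this example:

```python
import pandas as pd

df_full = pd.DataFrame({
    'Department': ['Sales', 'HR', 'Engineering', 'Sales'],
    'Employee': ['Alice', 'Bob', 'Charlie', 'David'],
    'Salary': [70000, 60000, 95000, 72000],
})

# Default: groups (and therefore dictionary keys) are sorted by key
sorted_dict = dict(tuple(df_full.groupby('Department')))
print(list(sorted_dict))
# ['Engineering', 'HR', 'Sales']

# sort=False: groups appear in order of first occurrence in the data
unsorted_dict = dict(tuple(df_full.groupby('Department', sort=False)))
print(list(unsorted_dict))
# ['Sales', 'HR', 'Engineering']
```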

Using a Dictionary Comprehension

import pandas as pd

df_full = pd.DataFrame({
    'Department': ['Sales', 'HR', 'Engineering', 'Sales'],
    'Employee': ['Alice', 'Bob', 'Charlie', 'David'],
    'Salary': [70000, 60000, 95000, 72000]
})

dict_of_dfs_comp = {
    dept_name: group_df
    for dept_name, group_df in df_full.groupby('Department')
}

print("Dictionary of DataFrames (dict comprehension):")
print(dict_of_dfs_comp['Sales']) # Example access
print()

for dept, sub_df in dict_of_dfs_comp.items():
    print(f"--- Department: {dept} ---")
    print(sub_df)
    print()

Output:

Dictionary of DataFrames (dict comprehension):
   Department  Employee  Salary
0       Sales     Alice   70000
3       Sales     David   72000

--- Department: Engineering ---
    Department  Employee  Salary
2  Engineering   Charlie   95000

--- Department: HR ---
   Department  Employee  Salary
1          HR       Bob   60000

--- Department: Sales ---
   Department  Employee  Salary
0       Sales     Alice   70000
3       Sales     David   72000

This is arguably even more readable than the dict(tuple(...)) approach for creating a dictionary of DataFrames.
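
The comprehension form also makes it easy to tidy each sub-DataFrame on the way in. As a sketch, assuming you want each value to drop the now-redundant grouping column and start its index at 0, you could write:

```python
import pandas as pd

df_full = pd.DataFrame({
    'Department': ['Sales', 'HR', 'Engineering', 'Sales'],
    'Employee': ['Alice', 'Bob', 'Charlie', 'David'],
    'Salary': [70000, 60000, 95000, 72000],
})

# Drop the grouping column (it is already the dictionary key)
# and renumber the rows of each sub-DataFrame from 0
clean = {
    dept_name: group_df.drop(columns='Department').reset_index(drop=True)
    for dept_name, group_df in df_full.groupby('Department')
}

print(clean['Sales'])
```

Whether to keep the original index depends on whether you need to trace rows back to the source DataFrame later.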

Choosing the Right Method

  • Dictionary of Lists (from a single column):
    • df.groupby('group_col')['value_col'].apply(list).to_dict(): Very concise and idiomatic Pandas.
    • Dict comprehension: Pythonic and clear, good if you need slight modifications during list creation.
  • Dictionary of DataFrames:
    • dict(tuple(df.groupby('group_col'))) or the dict comprehension {name: group for name, group in df.groupby('group_col')}: Both produce the same dictionary. The dict comprehension is more explicit and lets you transform each sub-DataFrame as it is added.
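
Since JSON serialization is one of the uses named in the introduction, here is a minimal sketch of passing a dictionary of lists to Python's standard json module. It uses the comprehension with .tolist(), which returns plain Python scalars; the variable names are chosen for this example.

```python
import json

import pandas as pd

df_example = pd.DataFrame({
    'Department': ['Sales', 'HR', 'Sales', 'HR', 'Sales'],
    'Salary': [70, 60, 72, 65, 68],
})

# Build the dictionary of lists; .tolist() yields plain Python ints
salary_lists = {
    dept_name: group_salaries.tolist()
    for dept_name, group_salaries in df_example.groupby('Department')['Salary']
}

# Serialize to a JSON string
json_text = json.dumps(salary_lists)
print(json_text)
# {"HR": [60, 65], "Sales": [70, 72, 68]}
```

A dictionary of DataFrames cannot be passed to json.dumps directly; each sub-DataFrame would first need converting to a serializable structure.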

Conclusion

Converting Pandas GroupBy results to dictionaries provides a flexible way to structure grouped data for various purposes.

  • To create a dictionary of lists (where values are lists of elements from a specific column for each group), use df.groupby('group_col')['value_col'].apply(list).to_dict() or an equivalent dictionary comprehension.
  • To create a dictionary of DataFrames (where values are the sub-DataFrames for each group), iterate over the DataFrameGroupBy object (which yields name, group_df pairs) and construct a dictionary, for instance, using dict(tuple(df.groupby('group_col'))).

These methods allow you to easily transition from a grouped representation to a standard Python dictionary structure tailored to your needs.