# How to Resolve Python "TypeError: Object of type ndarray is not JSON serializable"
When working with data analysis libraries like NumPy and Pandas in Python, you frequently need to serialize your data, often into JSON format for APIs, storage, or web applications. However, attempting to serialize NumPy arrays (`ndarray`) or Pandas DataFrames directly with Python's standard `json.dumps()` function leads to errors like `TypeError: Object of type ndarray is not JSON serializable` or `TypeError: Object of type DataFrame is not JSON serializable`. This occurs because the default JSON encoder doesn't natively understand these complex data structures.

This guide explains why these errors happen and provides standard solutions using built-in methods and custom encoders.
## Understanding the Error: JSON Serializable Types
Python's built-in `json` library can directly serialize standard Python types into their JSON equivalents:

| Python Type | JSON Equivalent |
|---|---|
| `dict` | object |
| `list`, `tuple` | array |
| `str` | string |
| `int`, `float` | number |
| `True` | `true` |
| `False` | `false` |
| `None` | `null` |
Specialized objects from external libraries, like NumPy's `ndarray` or Pandas' `DataFrame`, are not on this list. The default `json.dumps()` function doesn't know how to represent their internal structure using standard JSON types, hence the `TypeError`.
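As a quick illustration (the keys and values here are arbitrary), a structure built only from the types in the table above serializes without any conversion:

```python
import json

# Only built-in types from the table: dict, str, list, int, float, None, bool
record = {
    "name": "sensor-1",
    "readings": [1, 2.5, None],
    "active": True,
}

json_string = json.dumps(record)
print(json_string)
# {"name": "sensor-1", "readings": [1, 2.5, null], "active": true}
```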
## Error 1: TypeError: Object of type ndarray is not JSON serializable
### Cause: `json.dumps()` Cannot Handle `ndarray`

You are trying to pass a NumPy array directly into `json.dumps()`.
```python
import json
import numpy as np

# Create a NumPy array
my_array = np.array([10, 20, 30, 40])
data = {'id': 'A123', 'values': my_array}

print(f"Type of data['values']: {type(data['values'])}")
# Output: Type of data['values']: <class 'numpy.ndarray'>

try:
    # ⛔️ TypeError: Object of type ndarray is not JSON serializable
    json_string = json.dumps(data)
    print(json_string)
except TypeError as e:
    print(e)
```
### Solution 1: Convert `ndarray` to `list` Using `.tolist()` (Recommended)

The simplest and most common solution is to convert the NumPy array into a standard Python list before serialization using the array's `.tolist()` method. Lists are JSON serializable.
```python
import json
import numpy as np

my_array = np.array([10, 20, 30, 40])

# Create data with the array converted to a list
data = {'id': 'A123', 'values': my_array.tolist()}  # ✅ Convert here

print(f"Type of data['values'] now: {type(data['values'])}")
# Output: Type of data['values'] now: <class 'list'>

# ✅ Now json.dumps works correctly
json_string = json.dumps(data)
print(f"Serialized JSON: {json_string}")
# Output: Serialized JSON: {"id": "A123", "values": [10, 20, 30, 40]}

# --- Deserialization back to NumPy (Optional) ---
if json_string:
    loaded_data = json.loads(json_string)
    recreated_array = np.array(loaded_data['values'])
    print(f"Recreated NumPy array: {recreated_array}")
    print(f"Type: {type(recreated_array)}")
    # Output:
    # Recreated NumPy array: [10 20 30 40]
    # Type: <class 'numpy.ndarray'>
```
This is generally the preferred method due to its simplicity and directness.
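It is worth noting that `.tolist()` also handles multi-dimensional arrays, producing nested lists, and converts NumPy scalar elements to plain Python types along the way. A short sketch (the array contents are arbitrary):

```python
import json
import numpy as np

# A 2-D array: .tolist() produces nested Python lists
matrix = np.array([[1, 2], [3, 4]], dtype=np.int64)
nested = matrix.tolist()

# Elements come back as plain Python ints, not np.int64
print(type(nested[0][0]))  # <class 'int'>

json_string = json.dumps({'matrix': nested})
print(json_string)
# {"matrix": [[1, 2], [3, 4]]}
```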
### Solution 2: Use a Custom JSONEncoder
For more complex scenarios or to handle various NumPy types centrally, create a custom encoder.
```python
import json
import numpy as np

class NumpyEncoder(json.JSONEncoder):
    """Custom encoder for NumPy data types."""
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)        # Convert NumPy int to Python int
        elif isinstance(obj, np.floating):
            return float(obj)      # Convert NumPy float to Python float
        elif isinstance(obj, np.ndarray):
            return obj.tolist()    # Convert ndarray to Python list
        else:
            return super().default(obj)  # Default handling (raises TypeError)

my_array = np.array([1.1, 2.2, 3.3])

# Use np.int64 for demonstration
data = {'id': 'B456', 'count': np.int64(5), 'measurements': my_array}

# ✅ Use the custom encoder via the 'cls' argument
json_string = json.dumps(data, cls=NumpyEncoder)
print(f"Serialized JSON (Custom Encoder): {json_string}")
# Output: Serialized JSON (Custom Encoder): {"id": "B456", "count": 5, "measurements": [1.1, 2.2, 3.3]}
```
This encoder handles basic NumPy integer/float types and arrays by converting them to their Python equivalents.
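The same pattern extends to other NumPy scalar types. For example, `np.bool_` is not a subclass of Python's `bool`, so it also trips up `json.dumps`; the extra branch below is an illustrative extension of the encoder shown above, not part of it:

```python
import json
import numpy as np

class ExtendedNumpyEncoder(json.JSONEncoder):
    """Like the NumpyEncoder above, with an extra branch for NumPy booleans."""
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.bool_):
            return bool(obj)       # np.bool_ is not a Python bool
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        return super().default(obj)

data = {'flag': np.bool_(True), 'values': np.array([1, 2])}
print(json.dumps(data, cls=ExtendedNumpyEncoder))
# {"flag": true, "values": [1, 2]}
```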
### Solution 3: Use the `default` Argument in `json.dumps()`

Provide a function to the `default` argument that handles the conversion for specific types.
```python
import json
import numpy as np

def numpy_serializer(obj):
    """Serialize NumPy types for the json.dumps 'default' hook."""
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    # Add checks for other NumPy types if needed (np.integer, np.floating)
    # Raise TypeError for anything unhandled so errors surface clearly
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

my_array = np.array([5, 10, 15])
data = {'run': 1, 'samples': my_array}

# ✅ Pass the serializer function to the 'default' argument
json_string = json.dumps(data, default=numpy_serializer)
print(f"Serialized JSON (default arg): {json_string}")
# Output: Serialized JSON (default arg): {"run": 1, "samples": [5, 10, 15]}
```
This is similar to the custom encoder but might be simpler for one-off conversions.
### Solution 4: Use Pandas `Series.to_json()` (If Using Pandas)

If your array is part of a Pandas workflow, you can convert it to a Series and use its `to_json()` method.
```python
import numpy as np
import pandas as pd  # Requires pandas installation

my_array = np.array([100, 200, 300])

# Convert to a Pandas Series, then to a JSON string
json_string = pd.Series(my_array).to_json(orient='values')
print(f"Serialized JSON (Pandas Series): {json_string}")
# Output: Serialized JSON (Pandas Series): [100,200,300]
```
This is specific to workflows already involving Pandas.
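Since the resulting string is ordinary JSON, it can be read back with the standard library; a quick round-trip sketch (the array values are arbitrary):

```python
import json

import numpy as np
import pandas as pd

my_array = np.array([100, 200, 300])
json_string = pd.Series(my_array).to_json(orient='values')

# The output is a plain JSON array, so json.loads recovers a Python list
loaded = json.loads(json_string)
recreated = np.array(loaded)
print(recreated)
# [100 200 300]
```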
## Error 2: TypeError: Object of type DataFrame is not JSON serializable

### Cause: `json.dumps()` Cannot Handle `DataFrame`

Similar to NumPy arrays, Pandas DataFrames are complex objects that the standard `json.dumps()` doesn't know how to serialize.
```python
import json
import pandas as pd  # Requires pandas installation

df = pd.DataFrame(
    {
        "ID": ["X1", "X2", "X3"],
        "Value": [10.5, 12.1, 13.8],
        "Status": ["OK", "OK", "Error"],
    }
)
print("Original DataFrame:")
print(df)
print(f"Type of df: {type(df)}")
# Output: Type of df: <class 'pandas.core.frame.DataFrame'>

try:
    # ⛔️ TypeError: Object of type DataFrame is not JSON serializable
    json_string = json.dumps({'data': df})
except TypeError as e:
    print(f"Error: {e}")
```
### Solution 1: Use the Pandas `.to_json()` Method (Recommended)

Pandas DataFrames have their own built-in `.to_json()` method designed specifically for this purpose. It offers various formatting options via the `orient` parameter.
```python
import pandas as pd

df = pd.DataFrame(
    {
        "ID": ["X1", "X2", "X3"],
        "Value": [10.5, 12.1, 13.8],
        "Status": ["OK", "OK", "Error"],
    }
)

# ✅ Use the DataFrame's .to_json() method
# Default orient='columns': {"ID":{"0":"X1",...},"Value":{...},...}
json_string_cols = df.to_json()
print(f"Serialized JSON (orient='columns'): {json_string_cols}")

# orient='records': [{"ID":"X1","Value":10.5,"Status":"OK"}, ...]
json_string_records = df.to_json(orient='records')
print(f"Serialized JSON (orient='records'): {json_string_records}")

# orient='split': {"columns":["ID",...],"index":[0,1,2],"data":[["X1",10.5,"OK"],...]}
json_string_split = df.to_json(orient='split')
print(f"Serialized JSON (orient='split'): {json_string_split}")

print(f"Type of result: {type(json_string_cols)}")  # Output: <class 'str'>
```
Choose the `orient` value that produces the JSON structure most suitable for your needs. This is the standard way to serialize DataFrames.
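Strings produced by `.to_json()` can also be read back into a DataFrame with `pd.read_json`, using the matching `orient`. Recent pandas versions expect a file-like object rather than a raw string, so the string is wrapped in `StringIO` here (the frame contents are arbitrary):

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame({"ID": ["X1", "X2"], "Value": [10.5, 12.1]})
json_string = df.to_json(orient='records')

# Round trip: read the JSON string back with the same orient
df_restored = pd.read_json(StringIO(json_string), orient='records')
print(df_restored)
```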
### Solution 2: Convert the `DataFrame` to a `dict` Using `.to_dict()`

You can convert the DataFrame to a Python dictionary using `.to_dict()` (which also has an `orient` parameter) and then serialize that dictionary with `json.dumps()`.
```python
import json
import pandas as pd

df = pd.DataFrame(
    {
        "ID": ["X1", "X2", "X3"],
        "Value": [10.5, 12.1, 13.8],
    }
)

# Convert the DataFrame to a dictionary (default orientation, keyed by column)
data_dict = df.to_dict()
print(f"DataFrame converted to dict: {data_dict}")
# Output: {'ID': {0: 'X1', 1: 'X2', 2: 'X3'}, 'Value': {0: 10.5, 1: 12.1, 2: 13.8}}

# ✅ Serialize the resulting dictionary (integer keys become JSON strings)
json_string_from_dict = json.dumps(data_dict)
print(f"Serialized JSON (from dict): {json_string_from_dict}")
# Output: Serialized JSON (from dict): {"ID": {"0": "X1", "1": "X2", "2": "X3"}, "Value": {"0": 10.5, "1": 12.1, "2": 13.8}}
```
This works because the standard `json` library can serialize Python dictionaries. However, using the DataFrame's native `.to_json()` method (Solution 1) is usually more direct.
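`.to_dict()` accepts an `orient` argument much like `.to_json()`; for instance, `orient='records'` yields one dictionary per row, which often serializes into a more API-friendly shape. A brief sketch (the frame contents are arbitrary):

```python
import json
import pandas as pd

df = pd.DataFrame({"ID": ["X1", "X2"], "Value": [10.5, 12.1]})

# One dictionary per row instead of one per column
records = df.to_dict(orient='records')
print(records)
# [{'ID': 'X1', 'Value': 10.5}, {'ID': 'X2', 'Value': 12.1}]

json_string = json.dumps(records)
print(json_string)
# [{"ID": "X1", "Value": 10.5}, {"ID": "X2", "Value": 12.1}]
```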
## Conclusion

The `TypeError` indicating that NumPy `ndarray` or Pandas `DataFrame` objects are not JSON serializable arises because Python's standard `json.dumps()` only handles built-in types.

- For NumPy arrays (`ndarray`): the best practice is to convert them to Python lists using `.tolist()` before calling `json.dumps()`. Custom encoders or the `default` argument are alternatives.
- For Pandas DataFrames: the recommended approach is to use the DataFrame's own `.to_json()` method, selecting the `orient` option that fits your desired JSON output structure. Converting to a dictionary first using `.to_dict()` is also possible.

By applying the appropriate conversion method, you can successfully serialize your NumPy and Pandas data structures into JSON format.