How to Resolve Python "json.decoder.JSONDecodeError: Extra data"
When parsing JSON data in Python using json.loads() (for strings) or json.load() (for files), you might encounter the json.decoder.JSONDecodeError: Extra data: line X column Y (char Z). This error indicates that the parser successfully read a complete, valid JSON value (like an object {...} or an array [...]) but then found additional, unexpected characters immediately following it within the same input stream or string, which violates the JSON standard.
This guide explains the common causes of this error, primarily related to multiple JSON objects/arrays not being properly contained, and provides standard solutions.
Understanding the Error: JSON Expects a Single Top-Level Value
The JSON specification requires a valid JSON text to consist of one single top-level value. This value can be a simple type (string, number, boolean, null) or a complex type (object {} or array []).
The Extra data error occurs because the Python json parser successfully parsed one complete JSON value but then found more non-whitespace characters immediately after it. The parser doesn't know how to handle this "extra" data because it expects the input to end after the first complete value.
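The exception object itself reports where the first value ended. As a small sketch (the helper name split_at_extra_data and the sample string are made up for illustration), the msg and pos attributes of json.JSONDecodeError can be used to split the input into the part that parsed and the part that triggered the error:

```python
import json

def split_at_extra_data(text):
    """Return (valid_part, extra_part); extra_part is '' when text is valid JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError as e:
        if e.msg != "Extra data":
            raise  # some other syntax problem; not handled in this sketch
        # e.pos is the index of the first unexpected character
        return text[:e.pos], text[e.pos:]
    return text, ""

valid_part, extra_part = split_at_extra_data('{"ok": true} trailing')
print(repr(valid_part))
print(repr(extra_part))
```

Whitespace after the first value is skipped before the error is raised, so it ends up in the first part of the split rather than in the extra part.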
Cause 1: Multiple JSON Objects/Arrays Not Enclosed in a Container
This is the most common cause of the "Extra data" error. You have input that looks like multiple JSON objects or arrays placed next to each other without being contained within a single parent array or object.
Multiple Objects Side-by-Side
- Error Scenario (String):
import json
# Two JSON objects directly concatenated
invalid_json_string = r'{"id": 1, "name": "A"}{"id": 2, "name": "B"}'
print(f"Input String: {invalid_json_string}")
try:
    # ⛔️ json.decoder.JSONDecodeError: Extra data: line 1 column 23 (char 22)
    # Parser finishes reading {"id": 1, "name": "A"} then sees '{' again.
    data = json.loads(invalid_json_string)
    print(data)
except json.JSONDecodeError as e:
    print(e)
- Error Scenario (File - data.json):
{"id": 1, "name": "Alice"}
{"id": 2, "name": "Bob"}
Note: Reading this file with json.load() will cause the error after parsing the first line/object.
Multiple Arrays Side-by-Side
Error Scenario (String):
import json
invalid_json_string = r'[1, 2][3, 4]' # Two arrays adjacent
print(f"\nInput String: {invalid_json_string}")
try:
    # ⛔️ json.decoder.JSONDecodeError: Extra data: line 1 column 7 (char 6)
    # Parser finishes reading [1, 2] then sees '[' again.
    data = json.loads(invalid_json_string)
    print(data)
except json.JSONDecodeError as e:
    print(e)
Solution 1: Wrap Multiple Values in a JSON Array [] (Recommended)
The standard way to represent a collection of multiple JSON objects or values is to enclose them in a JSON array (square brackets []), with commas separating the elements.
Fixing Strings
import json
# --- Fixing Multiple Objects ---
# Original invalid string: r'{"id": 1}{"id": 2}'
# ✅ Wrap in [] and add comma
valid_json_string_objects = r'[{"id": 1}, {"id": 2}]'
print(f"\nValid Objects String: {valid_json_string_objects}")
data_objects = json.loads(valid_json_string_objects)
print(f"Parsed Objects: {data_objects}") # Output: [{'id': 1}, {'id': 2}]
print(f"Accessing: {data_objects[0]['id']}") # Output: 1
# --- Fixing Multiple Arrays ---
# Original invalid string: r'[1, 2][3, 4]'
# ✅ Wrap in outer [] and add comma
valid_json_string_arrays = r'[[1, 2], [3, 4]]'
print(f"\nValid Arrays String: {valid_json_string_arrays}")
data_arrays = json.loads(valid_json_string_arrays)
print(f"Parsed Arrays: {data_arrays}") # Output: [[1, 2], [3, 4]]
print(f"Accessing: {data_arrays[1][0]}") # Output: 3
Do not put a comma after the last element inside the array brackets [], as trailing commas are also invalid JSON.
Fixing Files
Modify the .json file content to wrap the multiple objects/arrays in a single top-level array.
- Corrected data.json (Multiple Objects):
[
  {"id": 1, "name": "Alice"},
  {"id": 2, "name": "Bob"},
  {"id": 3, "name": "Carl"}
]
- Corrected data.json (Multiple Arrays):
[
  [1, 2],
  [3, 4],
  [5, 6]
]
Now, reading these files with json.load() will work correctly.
import json
import os
# Assume 'data_fixed.json' contains the array structure
filename = "data_fixed.json"
# Create dummy file (just for this example)
with open(filename, "w", encoding='utf-8') as f:
    f.write('[\n {"id": 1, "name": "Alice"},\n {"id": 2, "name": "Bob"}\n]')
try:
    with open(filename, 'r', encoding='utf-8') as f:
        # ✅ Reading the file with a single top-level array works
        my_data = json.load(f)
    print(f"Data loaded from {filename}:")
    print(my_data) # Output: [{'id': 1, 'name': 'Alice'}, {'id': 2, 'name': 'Bob'}]
except FileNotFoundError:
    print(f"Error: File '{filename}' not found.")
except json.JSONDecodeError as e:
    print(f"Error decoding JSON from file: {e}")
finally:
    # Clean up the dummy file
    if os.path.exists(filename):
        os.remove(filename)
Output:
Data loaded from data_fixed.json:
[{'id': 1, 'name': 'Alice'}, {'id': 2, 'name': 'Bob'}]
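When the original file has many entries, wrapping them by hand is tedious. A minimal sketch of doing it in code, assuming each non-empty line of the input holds exactly one complete JSON value (the function name lines_to_json_array and the sample text are made up for this example):

```python
import json

def lines_to_json_array(text):
    """Parse one JSON value per non-empty line and return a single JSON array string."""
    values = [json.loads(line) for line in text.splitlines() if line.strip()]
    return json.dumps(values, indent=2)

# Hypothetical content of a file that currently triggers "Extra data"
broken = '{"id": 1, "name": "Alice"}\n{"id": 2, "name": "Bob"}\n'
fixed = lines_to_json_array(broken)
print(fixed)              # a single top-level array
print(json.loads(fixed))  # parses without error
```

Writing the returned string back to the file gives the array layout shown earlier. If a single value spans several lines, this line-based approach does not apply.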
Solution 2: Wrap Multiple Values in a JSON Object {}
Alternatively, you can structure your JSON so the multiple items are values within a single top-level JSON object.
Fixing Strings
import json
# Original invalid string: r'{"id": 1}{"id": 2}' -> Error
# Original invalid string: r'[1, 2][3, 4]' -> Error
# Wrap Objects in a parent object
valid_object_wrap = r'{"obj1": {"id": 1}, "obj2": {"id": 2}}'
data_obj = json.loads(valid_object_wrap)
print(f"\nParsed Object Wrap: {data_obj}") # Output: Parsed Object Wrap: {'obj1': {'id': 1}, 'obj2': {'id': 2}}
print(f"Accessing: {data_obj['obj1']['id']}") # Output: 1
# Wrap Arrays in a parent object
valid_array_wrap = r'{"listA": [1, 2], "listB": [3, 4]}'
data_arr = json.loads(valid_array_wrap)
print(f"\nParsed Array Wrap: {data_arr}") # Output: Parsed Array Wrap: {'listA': [1, 2], 'listB': [3, 4]}
print(f"Accessing: {data_arr['listB'][0]}") # Output: 3
Fixing Files
Modify the .json file to have a single top-level object containing the other items.
Corrected data.json:
{
  "users": [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"}
  ],
  "status": "processed"
}
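As a brief illustration of reading this structure, the sketch below inlines the same content as a string (instead of opening data.json) and pulls the individual items back out of the wrapper object. The variable names are chosen only for this example:

```python
import json

# Same content as the corrected data.json above, inlined as a string
document = '''
{
  "users": [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"}
  ],
  "status": "processed"
}
'''

data = json.loads(document)
names = [user["name"] for user in data["users"]]
print(names)           # ['Alice', 'Bob']
print(data["status"])  # processed
```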
Solution 3: Process as Line-Delimited JSON (JSON Lines / ndjson)
If your input source intentionally contains one complete, independent JSON object (or value) per line, this is a format known as JSON Lines (.jsonl) or newline-delimited JSON (ndjson). The standard json.load() or json.loads() cannot parse this directly because it expects a single JSON value for the whole input.
You need to read the input line by line and parse each line individually.
- Example data.jsonl:
{"id": 1, "event": "start"}
{"id": 2, "event": "process", "value": 10}
{"id": 3, "event": "end"}
- Python Code:
import json
import os
filename = "data.jsonl"
# Create dummy file
with open(filename, "w", encoding='utf-8') as f:
    f.write('{"id": 1, "event": "start"}\n')
    f.write('{"id": 2, "event": "process", "value": 10}\n')
    f.write('{"id": 3, "event": "end"}\n')
parsed_data = []
print(f"Reading line-delimited JSON from '{filename}':")
try:
    with open(filename, 'r', encoding='utf-8') as f:
        for line in f:
            # Skip empty lines if necessary
            line = line.strip()
            if line:
                try:
                    # ✅ Parse each line individually
                    obj = json.loads(line)
                    parsed_data.append(obj)
                    print(f" Parsed line: {obj}")
                except json.JSONDecodeError as e:
                    print(f" Error parsing line '{line}': {e}")
    print(f"Final list of parsed objects: {parsed_data}")
except FileNotFoundError:
    print(f"Error: File '{filename}' not found.")
except Exception as e:
    print(f"An unexpected error occurred: {e}")
finally:
    # Clean up the dummy file
    if os.path.exists(filename):
        os.remove(filename)
Output:
Reading line-delimited JSON from 'data.jsonl':
 Parsed line: {'id': 1, 'event': 'start'}
 Parsed line: {'id': 2, 'event': 'process', 'value': 10}
 Parsed line: {'id': 3, 'event': 'end'}
Final list of parsed objects: [{'id': 1, 'event': 'start'}, {'id': 2, 'event': 'process', 'value': 10}, {'id': 3, 'event': 'end'}]
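Sometimes the values are concatenated with no newline between them (as in Cause 1) and the source cannot be changed, so splitting on lines is not an option. One possible approach, which goes beyond the three solutions in this guide, is the standard library's json.JSONDecoder.raw_decode(): it parses a single value starting at a given index and returns that value together with the index where it stopped. The helper name iter_json_values is hypothetical:

```python
import json

def iter_json_values(text):
    """Yield each top-level JSON value found in text, in order."""
    decoder = json.JSONDecoder()
    idx = 0
    end = len(text)
    while idx < end:
        # raw_decode does not skip leading whitespace, so skip it here
        while idx < end and text[idx] in " \t\r\n":
            idx += 1
        if idx >= end:
            break
        # Returns the parsed value and the index just past it
        value, idx = decoder.raw_decode(text, idx)
        yield value

values = list(iter_json_values('{"id": 1}{"id": 2} [3, 4]\n'))
print(values)
```

Input that is malformed in some other way still raises json.JSONDecodeError from raw_decode(), so this sketch only addresses the case of several valid values placed back to back.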
Checking JSON Validity
If you suspect your JSON might be malformed beyond just the "Extra data" issue, use an online JSON validator or editor linters to pinpoint syntax errors (like missing quotes, incorrect bracket usage, invalid literals).
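A quick local check is also possible with the json module itself. The sketch below (check_json is a name invented for this example) reports the first problem the parser hits, using the msg, lineno, and colno attributes of json.JSONDecodeError:

```python
import json

def check_json(text):
    """Return None if text is valid JSON, else a short description of the first error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as e:
        return f"{e.msg} at line {e.lineno}, column {e.colno}"
    return None

print(check_json('[1, 2]'))        # None
print(check_json('[1, 2][3, 4]'))  # Extra data at line 1, column 7
print(check_json("{'a': 1}"))      # single quotes are not valid JSON
```

The exact wording of messages other than "Extra data" can differ between Python versions. For files, running python -m json.tool on the file from the command line reports the same kind of error.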
Conclusion
The json.decoder.JSONDecodeError: Extra data error occurs when the JSON parser finds unexpected characters after successfully parsing one complete top-level JSON value (object or array). This usually means your input contains multiple JSON values side-by-side without a proper container.
The standard solutions are:
- Wrap the multiple values in a single JSON array [...], separating elements with commas (no trailing comma).
- Structure the data within a single top-level JSON object {...}, where the multiple values become properties of that object.
- If the input is intentionally line-delimited JSON (.jsonl), read and parse each line individually using json.loads().
Ensuring your input conforms to the JSON standard of having a single top-level value is key to resolving this error.