How to Resolve Python "JSONDecodeError: Invalid control character at ..."
When parsing JSON data in Python using the json module, you might encounter json.decoder.JSONDecodeError: Invalid control character at .... This error indicates that the JSON parser found an unescaped control character (such as a raw newline or tab) inside a string value. The JSON specification requires such characters to be escaped, and Python's parser enforces this by default.
This guide explains why the error occurs and presents the primary solution, the strict parameter, along with alternative approaches.
Understanding the Error: Control Characters in JSON
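Control characters are special, mostly non-printing characters with code points 0-31 (U+0000 through U+001F). Common examples include newline (\n), tab (\t), carriage return (\r), and null (\0 in Python, written \u0000 in JSON, which has no \0 escape). Standard JSON syntax requires these characters to be written as escape sequences when they appear inside a string value. For example, the Python literal '"line1\\nline2"' produces JSON text containing a backslash followed by n, which is valid, whereas '"line1\nline2"' produces JSON text containing an actual newline character, which is not.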
By default, Python's json.loads() and json.load() functions operate in strict mode (strict=True), which follows this rule and rejects unescaped control characters inside strings, raising the error.
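As a small illustration of the difference between the escaped and raw forms (a sketch; the variable names are chosen for this example only), a Python raw string keeps the backslash and the n as two separate characters, which is what strict JSON expects, while an ordinary string literal turns \n into a single raw newline before the parser ever sees it:

```python
import json

# Raw string: backslash + "n" stay as two characters, i.e. a valid JSON escape.
escaped_json = r'{"message": "Hello\nWorld"}'

# Ordinary string: Python converts \n to one raw newline character.
raw_newline_json = '{"message": "Hello\nWorld"}'

# The escaped form parses under the default strict=True.
parsed = json.loads(escaped_json)

# The raw form is rejected in strict mode.
try:
    json.loads(raw_newline_json)
    strict_rejected = False
except json.JSONDecodeError:
    strict_rejected = True

# json.dumps emits the escaped form, so its output parses cleanly again.
round_trip = json.dumps(parsed)
```

If you control the code that produces the JSON, generating it with json.dumps() rather than building strings by hand should avoid the problem at the source.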
Cause: Unescaped Control Characters with strict=True
The error occurs when you pass a string containing raw, unescaped control characters to json.loads() (or read such content with json.load()) without telling the parser to relax its rules.
import json
# The Python escape '\n' becomes a raw newline character inside the JSON string value
invalid_json_string = '{"message": "Hello\nWorld"}'
print(f"Input string (repr): {repr(invalid_json_string)}")
try:
    # ⛔️ json.decoder.JSONDecodeError: Invalid control character at: line 1 column 19 (char 18)
    # The error points to the location of the raw newline character.
    data = json.loads(invalid_json_string)  # strict=True by default
    print(data)
except json.JSONDecodeError as e:
    print(e)
Output:
Input string (repr): '{"message": "Hello\nWorld"}'
Invalid control character at: line 1 column 19 (char 18)
The parser encounters the raw newline at character index 18 (column 19) and raises the error because, in strict mode, it expects control characters to be escaped (for example, written as \\n in a Python string literal).
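To locate the offending character in longer input, the exception object itself can help. The sketch below relies on the documented JSONDecodeError attributes msg, doc, pos, lineno, and colno; the variable names and the 10-character window are arbitrary choices for illustration. The exact position reported may vary between interpreters or parser implementations, so it prints a small window around the position rather than a single character.

```python
import json

bad = '{"message": "Hello\nWorld"}'

error_msg = None
error_pos = None
context = None

try:
    json.loads(bad)
except json.JSONDecodeError as e:
    # Copy what we need: the name `e` is unbound once the except block ends.
    error_msg = e.msg
    error_pos = e.pos
    # A window of the original document around the reported position.
    context = e.doc[max(0, e.pos - 10):e.pos + 10]
    print(f"msg={e.msg!r} pos={e.pos} line={e.lineno} col={e.colno}")
    print(f"context: {context!r}")
```

Printing the window with !r (repr) makes invisible characters such as \n or \t show up as escape sequences.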
Solution 1: Allow Control Characters (strict=False) (Recommended)
The json module provides a direct way to handle this specific error: set the strict parameter to False when calling json.loads() or json.load(). This tells the parser to permit unescaped control characters within strings.
import json
# Same string with a raw newline character
json_string_with_newline = '{"message": "Hello\nWorld"}'
print(f"Input string (repr): {repr(json_string_with_newline)}")
try:
    # ✅ Set strict=False to allow unescaped control characters
    data = json.loads(json_string_with_newline, strict=False)
    # Parsing succeeds, the newline is preserved in the Python string
    print("\nParsed data:", data)
    # Output: Parsed data: {'message': 'Hello\nWorld'}
    # Computed outside the f-string: a backslash inside an f-string
    # expression is a SyntaxError before Python 3.12
    has_newline = '\n' in data['message']
    print(f"Value contains newline: {has_newline}")  # Output: True
except json.JSONDecodeError as e:
    # This block should not run now for this specific error
    print(f"Error despite strict=False: {e}")
Output:
Input string (repr): '{"message": "Hello\nWorld"}'
Parsed data: {'message': 'Hello\nWorld'}
Value contains newline: True
Setting strict=False is the json library's built-in mechanism for parsing JSON-like data that might contain raw control characters.
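If you would rather keep strict parsing as the default and relax it only when needed, one possible pattern is to try a strict parse first and fall back to a lenient decoder. This is a sketch, not an established recipe: loads_with_fallback is a hypothetical helper name, and matching on the error message text is an assumption that could break if the wording changes. json.JSONDecoder(strict=False) is the documented way to build a reusable lenient decoder.

```python
import json

# A decoder configured once and reused for lenient parsing.
lenient_decoder = json.JSONDecoder(strict=False)

def loads_with_fallback(text):
    """Parse strictly; fall back to strict=False only for the
    control-character error. Returns (data, mode). Hypothetical helper."""
    try:
        return json.loads(text), "strict"
    except json.JSONDecodeError as e:
        # Assumption: the message text identifies this specific error.
        if "control character" not in e.msg:
            raise  # a different problem; strict=False would not help
        return lenient_decoder.decode(text), "lenient"

data, mode = loads_with_fallback('{"message": "Hello\nWorld"}')
print(mode, data)
```

Returning the mode alongside the data lets the caller log or count how often the input needed lenient parsing.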
Solution 2: Remove Control Characters Before Parsing
If you don't need the control characters in your final data, you can remove them from the string before parsing. This approach modifies the data.
import json
json_string_with_newline = '{"message": "Hello\nWorld", "note": "Tab\there"}'
print(f"Original string (repr): {repr(json_string_with_newline)}")
# ✅ Remove specific control characters (e.g., newline and tab)
cleaned_string = json_string_with_newline.replace('\n', '').replace('\t', '')
print(f"Cleaned string: {cleaned_string}")
# Output: Cleaned string: {"message": "HelloWorld", "note": "Tabhere"}
try:
    # Parse the cleaned string (strict=True is fine now)
    data = json.loads(cleaned_string)
    print("Parsed data after removal:", data)
    # Output: Parsed data after removal: {'message': 'HelloWorld', 'note': 'Tabhere'}
except json.JSONDecodeError as e:
    print(f"Error parsing cleaned string: {e}")
Output:
Original string (repr): '{"message": "Hello\nWorld", "note": "Tab\there"}'
Cleaned string: {"message": "HelloWorld", "note": "Tabhere"}
Parsed data after removal: {'message': 'HelloWorld', 'note': 'Tabhere'}
Call str.replace(control_char, '') for each control character you want to eliminate. Be aware that this changes the content of the string values.
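When several different control characters may appear, chaining replace() calls gets tedious. A possible alternative, sketched here under the assumption that none of the characters in the range U+0000 to U+001F carry meaning in your data, is a single str.translate() pass. The table name DROP_CONTROL_CHARS is made up for this example.

```python
import json

# Translation table mapping every code point 0-31 to None, which
# tells str.translate() to delete that character.
DROP_CONTROL_CHARS = dict.fromkeys(range(0x20))

raw = '{"message": "Hello\nWorld", "note": "Tab\there", "cr": "a\rb"}'

# One pass removes newline, tab, carriage return, and the rest of the range.
cleaned = raw.translate(DROP_CONTROL_CHARS)

data = json.loads(cleaned)  # the default strict=True is fine now
print(data)
```

Like the replace() approach, this deletes content from the string values, so it only fits cases where that loss is acceptable.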
Solution 3: Escape Control Characters Before Parsing
If you need to preserve the meaning of the control characters but make the JSON string strictly valid, you can replace each raw control character with its escape sequence before parsing.
import json
json_string_with_newline = '{"message": "Hello\nWorld"}'
print(f"Original string (repr): {repr(json_string_with_newline)}")
# ✅ Escape the newline character
escaped_string = json_string_with_newline.replace('\n', '\\n')  # Raw newline -> backslash + n
print(f"Escaped string (repr): {repr(escaped_string)}")
# Output: Escaped string (repr): '{"message": "Hello\\nWorld"}'
try:
    # Parse the escaped string (strict=True is fine now)
    data = json.loads(escaped_string)
    # The resulting Python string will contain the literal newline character
    print("Parsed data after escaping:", data)
    # Output: Parsed data after escaping: {'message': 'Hello\nWorld'}
    # Computed outside the f-string for compatibility with Python < 3.12
    has_newline = '\n' in data['message']
    print(f"Value contains newline: {has_newline}")
    # Output: Value contains newline: True
except json.JSONDecodeError as e:
    print(f"Error parsing escaped string: {e}")
Output:
Original string (repr): '{"message": "Hello\nWorld"}'
Escaped string (repr): '{"message": "Hello\\nWorld"}'
Parsed data after escaping: {'message': 'Hello\nWorld'}
Value contains newline: True
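Replacing \n with \\n, \t with \\t, and so on creates a string that conforms to strict JSON rules for control characters. Note that a blanket replace also rewrites line breaks that sit between tokens, such as those in pretty-printed JSON, and an escape sequence outside a string is itself invalid, so this approach suits single-line input like the example above.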
Handling Files (json.load() with strict=False)
The same strict=False parameter can be used with json.load() when reading from a file object that might contain unescaped control characters.
import json
import io  # For simulating a file
# Simulate file content with a raw newline
file_content = '{"config": "path/to/file\nline2"}'
file_simulation = io.StringIO(file_content)
try:
    # ✅ Use strict=False with json.load()
    data = json.load(file_simulation, strict=False)
    print("Parsed data from file stream:", data)
    # Output: Parsed data from file stream: {'config': 'path/to/file\nline2'}
except json.JSONDecodeError as e:
    print(f"Error loading from file stream: {e}")
# --- Without strict=False, this would fail ---
file_simulation_strict = io.StringIO(file_content)
try:
    # ⛔️ json.decoder.JSONDecodeError: Invalid control character...
    data_strict = json.load(file_simulation_strict)  # Default strict=True
except json.JSONDecodeError as e:
    print(f"\nError loading from file stream (strict=True): {e}")
Output:
Parsed data from file stream: {'config': 'path/to/file\nline2'}
Error loading from file stream (strict=True): Invalid control character at: line 1 column 25 (char 24)
Alternatively, you could read the entire file content into a string (f.read()), perform removal or escaping (Solutions 2 or 3), and then parse the modified string using json.loads().
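Both file-based options can be tried against a real file on disk. The sketch below writes a throwaway temporary file so that it is self-contained; in practice you would open your own path instead. The file content mirrors the single-line example above, which is why a plain replace() is sufficient for the second option.

```python
import json
import os
import tempfile

# Create a temporary file containing a raw newline inside a string value.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False,
                                 encoding="utf-8", newline="") as tmp:
    tmp.write('{"config": "path/to/file\nline2"}')
    path = tmp.name

try:
    # Option A: let json.load() tolerate the raw control character.
    with open(path, encoding="utf-8") as f:
        lenient = json.load(f, strict=False)

    # Option B: read the text, escape the newline, then parse strictly.
    with open(path, encoding="utf-8") as f:
        text = f.read()
    escaped = json.loads(text.replace("\n", "\\n"))
finally:
    os.remove(path)  # clean up the temporary file

print(lenient == escaped)
```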
Related Issue: Unterminated Strings (Missing Quotes)
While the "Invalid control character" error is specific, you may encounter other JSONDecodeError types when dealing with malformed strings. A common one is Unterminated string starting at..., which usually means a closing double quote (") is missing for a key or string value.
import json
# Missing closing quote after Nolan
malformed_string = '{"name": "Tom Nolan}'
try:
    # ⛔️ json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 10 (char 9)
    data = json.loads(malformed_string)
except json.JSONDecodeError as e:
    print(e)
Output:
Unterminated string starting at: line 1 column 10 (char 9)
Solution: Ensure all strings and keys in your JSON data are properly enclosed in double quotes ("). This is a different issue from control characters, and strict=False does not fix it, but it also relates to JSON validity.
Conclusion
The JSONDecodeError: Invalid control character at ... error occurs when Python's json parser, in its default strict mode (strict=True), encounters unescaped control characters (like \n or \t) within JSON strings.
The most direct solution is to allow these characters by setting strict=False in your json.loads() or json.load() call. This is the intended approach when you need to parse JSON-like data that includes raw control characters.
Alternatively, if you don't need the control characters or want to ensure strict JSON validity before parsing, you can remove them (e.g., string.replace('\n', '')) or escape them (e.g., string.replace('\n', '\\n')) in the input string prior to parsing. Remember to also ensure overall JSON validity (correct quoting, no trailing commas, etc.) to avoid other types of JSONDecodeError.