Python String encode() Function
The string encode() method converts a string into bytes using the specified encoding scheme and returns the result.
Since Python 3.0, encode() uses UTF-8 when no encoding is specified.
Syntax
my_string.encode(encoding, errors)
encode() Parameters
The Python string encode() method takes the following parameters:
Parameter | Condition | Description |
---|---|---|
encoding | Optional | The encoding to use, such as ascii, utf-8, windows-1250, windows-1252, etc. Defaults to utf-8. |
errors | Optional | Specifies what to do if the encoding fails (strict, replace, ignore, backslashreplace, namereplace, xmlcharrefreplace). Defaults to strict. |
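As a minimal sketch of how the two parameters can be passed, the snippet below calls encode() with positional arguments, with keyword arguments, and with no arguments at all. The sample word and variable names are illustrative only.

```python
text = 'straße'

# Positional arguments, keyword arguments, and the defaults all agree here.
by_position = text.encode('utf-8', 'strict')
by_keyword = text.encode(encoding='utf-8', errors='strict')
no_args = text.encode()
print(by_position == by_keyword == no_args)  # True

# A single-byte code page stores 'ß' in one byte instead of two.
cp1252_bytes = text.encode('windows-1252')
print(cp1252_bytes)  # b'stra\xdfe'
```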
Each value of the errors parameter has a different effect:
strict: raises a UnicodeEncodeError if a character cannot be encoded. This is the default value.
replace: replaces the problem character with a suitable replacement marker, such as ? when encoding or U+FFFD when decoding.
ignore: skips the problem character and continues without further notice.
xmlcharrefreplace: replaces the unencodable character with the appropriate XML character reference (for encoding only).
backslashreplace: replaces the unencodable character with a backslashed escape sequence.
namereplace: replaces the unencodable character with a \N{...} escape sequence containing its Unicode name (for encoding only).
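The default strict handler is the only one that raises an exception. The following sketch, using an illustrative sample string, shows one way to catch the resulting UnicodeEncodeError and inspect which character caused it.

```python
text = 'Das straße'

failed_char = None
position = None
try:
    text.encode('ascii')  # errors defaults to 'strict'
except UnicodeEncodeError as exc:
    # The exception records the offending slice of the original string.
    failed_char = exc.object[exc.start:exc.end]
    position = exc.start

print(failed_char, position)  # ß 8
```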
encode() Return Value
The Python string encode() method returns a bytes object representing the encoded string.
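A short sketch of what the return value looks like in practice: the result is a bytes object, its length can differ from the length of the original string, and decode() with the same encoding recovers the original text. The variable names are illustrative.

```python
text = 'Das straße'
encoded = text.encode('utf-8')

print(type(encoded))            # <class 'bytes'>
print(len(text), len(encoded))  # 10 11 ('ß' takes two bytes in UTF-8)

# Decoding with the same encoding restores the original string.
decoded = encoded.decode('utf-8')
print(decoded == text)          # True
```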
Examples
Example 1: Encode to Default UTF-8 Encoding
This example encodes a string using the default UTF-8 encoding.
The output shows the encoded version of the string, where non-ASCII characters are represented by their escaped byte sequences.
my_str = 'Das straße'
result = my_str.encode()
print(result) # Output: b'Das stra\xc3\x9fe'
output
b'Das stra\xc3\x9fe'
Example 2: Encoding with Various errors Values
Let's try to encode the German words Das straße, which translates to The street in English, using each of the error handlers.
my_str = 'Das straße'
result = my_str.encode(encoding='ascii', errors='backslashreplace')
print(result) # Output b'Das stra\\xdfe'
result = my_str.encode(encoding='ascii', errors='ignore')
print(result) # Output b'Das strae'
result = my_str.encode(encoding='ascii', errors='namereplace')
print(result) # Output b'Das stra\\N{LATIN SMALL LETTER SHARP S}e'
result = my_str.encode(encoding='ascii', errors='replace')
print(result) # Output b'Das stra?e'
result = my_str.encode(encoding='ascii', errors='xmlcharrefreplace')
print(result) # Output b'Das stra&#223;e'
result = my_str.encode(encoding='UTF-8', errors='strict')
print(result) # Output b'Das stra\xc3\x9fe'
output
b'Das stra\\xdfe'
b'Das strae'
b'Das stra\\N{LATIN SMALL LETTER SHARP S}e'
b'Das stra?e'
b'Das stra&#223;e'
b'Das stra\xc3\x9fe'
Observation about String Encoding
Starting from Python 3.0, strings are stored as Unicode, meaning each character in a string is represented by a Unicode code point.
To store or transmit a string, its sequence of Unicode code points must be converted into a sequence of bytes through a process known as encoding.
Python supports various encodings, with utf-8 being the most commonly used. The encode() method converts a Unicode string into any encoding supported by Python, with utf-8 being the default.
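To illustrate the relationship between code points and bytes, the sketch below prints the code point of a few sample characters and counts how many bytes UTF-8 uses for each. The characters were chosen only for illustration.

```python
text = 'aß€'

# ord() gives the Unicode code point of each character.
code_points = [hex(ord(ch)) for ch in text]
print(code_points)  # ['0x61', '0xdf', '0x20ac']

# UTF-8 is a variable-width encoding: higher code points need more bytes.
utf8_sizes = {ch: len(ch.encode('utf-8')) for ch in text}
print(utf8_sizes)   # {'a': 1, 'ß': 2, '€': 3}
```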