How to Convert Integers to Bytes in Python
Converting integers to their byte representations is essential for tasks like network programming, file I/O, and data serialization.
This guide explains how to use Python's built-in int.to_bytes() and int.from_bytes() methods to convert between integers and bytes, covering byte order, signed integers, and calculating the required byte length. It also briefly covers converting an integer to a string and then to bytes, and clarifies how that differs.
Converting an Integer to Bytes with to_bytes()
The int.to_bytes() method is the primary and most direct way to convert an integer to a bytes object:
num = 2048
my_bytes = num.to_bytes(2, byteorder='big')
print(my_bytes) # Output: b'\x08\x00'
# Or, without an intermediate variable:
print((2048).to_bytes(2, byteorder='big')) # Output: b'\x08\x00'
num.to_bytes(length, byteorder)
- length: The number of bytes to use to represent the integer. If the integer requires more bytes than you specify, you'll get an OverflowError.
- byteorder: Specifies the byte order:
  - 'big' (Big-Endian): The most significant byte is placed first (at the lowest memory address). This is the most common order for network protocols.
  - 'little' (Little-Endian): The least significant byte is placed first. This is common on x86/x64 processors.
length Argument
The length argument determines the size of the resulting bytes object. If the integer can be represented with fewer bytes than specified, the result is padded with zero bytes.
my_bytes = (1024).to_bytes(2, byteorder='big')
print(my_bytes) # Output: b'\x04\x00' (2 bytes: 0x04 0x00)
my_bytes = (1024).to_bytes(4, byteorder='big')
print(my_bytes) # Output: b'\x00\x00\x04\x00' (4 bytes, padded)
# OverflowError:
# my_bytes = (1024).to_bytes(1, byteorder='big') # ERROR: OverflowError
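The overflow case can also be handled explicitly rather than left commented out. The following is a minimal sketch; safe_to_bytes is a hypothetical helper name introduced here for illustration, not something from the standard library:

```python
def safe_to_bytes(value, length):
    """Return `value` as big-endian bytes, or None if it does not fit.

    Hypothetical helper for illustration only.
    """
    try:
        return value.to_bytes(length, byteorder='big')
    except OverflowError:
        # Raised when `value` needs more than `length` bytes
        # (or is negative, since signed defaults to False).
        return None


print(safe_to_bytes(1024, 2))  # b'\x04\x00'
print(safe_to_bytes(1024, 1))  # None: 1024 does not fit in one byte
print(safe_to_bytes(255, 1))   # b'\xff': largest value that fits in one byte
```

Returning None is just one possible design choice; re-raising with a clearer message or picking a larger length would work equally well.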
byteorder Argument: Big-Endian vs. Little-Endian
Byte order matters when an integer requires more than one byte.
- Big-Endian ('big'): Most significant byte first. Think of it like writing a number normally (e.g., 1024 is written with the '1' first). This is the standard for network protocols.
- Little-Endian ('little'): Least significant byte first.
my_bytes_big = (1024).to_bytes(2, byteorder='big')
print(my_bytes_big) # Output: b'\x04\x00' (Big-Endian: 0x04 0x00)
my_bytes_little = (1024).to_bytes(2, byteorder='little')
print(my_bytes_little) # Output: b'\x00\x04' (Little-Endian: 0x00 0x04)
- The bit representation of number 1024 needs two bytes to be stored. Using big-endian byte order, the first byte is the "big" end (most significant) and the second one is the "little" end (least significant).
- When using the little-endian byte order, the order is reversed.
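As a small sketch of the relationship between the two orders, the snippet below shows that one is the byte-wise reverse of the other, and what happens when bytes are decoded with the wrong order. The variable names are illustrative only:

```python
import sys

big = (1024).to_bytes(2, byteorder='big')
little = (1024).to_bytes(2, byteorder='little')

# Reversing the byte sequence converts one order into the other.
print(big[::-1] == little)  # True

# Decoding with the wrong byte order silently yields a different number.
print(int.from_bytes(big, byteorder='little'))  # 4

# sys.byteorder reports the native order of the machine running the code
# ('little' or 'big'); it can be passed directly as the byteorder argument.
native = (1024).to_bytes(2, byteorder=sys.byteorder)
print(native in (big, little))  # True
```

Because a wrong byte order does not raise an error, both sides of a protocol or file format need to agree on the order in advance.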
Converting Signed (Negative) Integers
To represent negative integers, use the signed=True argument:
def int_to_bytes(integer):
    return integer.to_bytes(
        # Smallest length that still leaves room for the sign bit
        length=(8 + (integer + (integer < 0)).bit_length()) // 8,
        byteorder='big',
        signed=True  # Handles the sign of the integer
    )
print(int_to_bytes(-1024)) # Output: b'\xfc\x00'
print(int_to_bytes(-2048)) # Output: b'\xf8\x00'
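- signed=True: Indicates that two's complement representation should be used for negative numbers. If set to False while the number is negative, an OverflowError is raised.
- Calculating the required byte length is more complex here because the sign bit needs room. The formula in the function uses (8 + bit_length) // 8 rather than (bit_length + 7) // 8, so one extra bit is always reserved for the sign.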
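To make the two's complement behavior concrete, here is a short sketch using a fixed length of 2 bytes (an assumption chosen for the example). It shows that -1024 shares its 16-bit pattern with the unsigned value 2**16 - 1024, and that omitting signed=True raises an error:

```python
value = -1024

signed_bytes = value.to_bytes(2, byteorder='big', signed=True)
print(signed_bytes)  # b'\xfc\x00'

# In 16-bit two's complement, -1024 and 2**16 - 1024 have the same bits.
unsigned_equivalent = (2**16 + value).to_bytes(2, byteorder='big')
print(unsigned_equivalent == signed_bytes)  # True

# signed defaults to False, so a negative value is rejected.
try:
    value.to_bytes(2, byteorder='big')
    raised = False
except OverflowError:
    raised = True
print(raised)  # True

# The sign bit also limits positive values: one signed byte holds -128..127.
print((127).to_bytes(1, byteorder='big', signed=True))  # b'\x7f'
```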
Calculating Byte Length Automatically
To automatically determine the minimum number of bytes required to represent an integer, you can use this function:
def int_to_bytes(integer):
    return integer.to_bytes((integer.bit_length() + 7) // 8, 'big')
print(int_to_bytes(65)) # Output: b'A'
print(int_to_bytes(1024)) # Output: b'\x04\x00'
print(int_to_bytes(2048)) # Output: b'\x08\x00'
- integer.bit_length(): Returns the number of bits needed to represent the integer.
- + 7: We add 7 to ensure that we round up to the nearest byte. For instance, if bit_length() is 9, we need 2 bytes (16 bits): 9 + 7 = 16, and 16 // 8 = 2.
- // 8: Integer division by 8 converts bits to bytes.
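The rounding can be checked on a few boundary values. In this sketch, min_byte_length is a hypothetical helper name wrapping the same expression; note the edge case that 0 has a bit length of 0, so the formula yields zero bytes and to_bytes returns an empty bytes object:

```python
def min_byte_length(integer):
    """Bytes needed for a non-negative integer (hypothetical helper)."""
    return (integer.bit_length() + 7) // 8


for n in (0, 1, 255, 256, 65535, 65536):
    print(n, n.bit_length(), min_byte_length(n))
# 0 0 0
# 1 1 1
# 255 8 1
# 256 9 2
# 65535 16 2
# 65536 17 3

# Edge case: zero converts to an empty bytes object with this formula.
print((0).to_bytes(min_byte_length(0), 'big'))  # b''

# One possible workaround, if at least one byte is wanted:
print((0).to_bytes(max(1, min_byte_length(0)), 'big'))  # b'\x00'
```

Whether an empty result for 0 is acceptable depends on the format being written; the max(1, ...) variant is only one option.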
Converting Bytes Back to an Integer with from_bytes()
The int.from_bytes() method reverses the process, converting a bytes object back to an integer:
def int_from_bytes(bytes_obj):
    return int.from_bytes(bytes_obj, byteorder='big')  # Using big-endian by default
print(int_from_bytes(b'A')) # Output: 65
print(int_from_bytes(b'\x04\x00')) # Output: 1024
print(int_from_bytes(b'\x08\x00')) # Output: 2048
- The int.from_bytes() method takes the bytes object and the byte order. Before Python 3.11 the byte order must be passed explicitly; from 3.11 onward it defaults to 'big'.
- To handle negative numbers, use the signed=True parameter:
def int_from_bytes(binary_data):
    return int.from_bytes(binary_data, byteorder='big', signed=True)
print(int_from_bytes(b'\xfc\x00')) # Output: -1024
print(int_from_bytes(b'\xf8\x00')) # Output: -2048
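A round trip is a quick way to confirm that encoding and decoding agree. The sketch below assumes a fixed 4-byte signed format; roundtrip is a hypothetical helper name used only for this example:

```python
def roundtrip(value, length=4, byteorder='big'):
    """Encode then decode `value` as a signed integer (hypothetical helper)."""
    data = value.to_bytes(length, byteorder=byteorder, signed=True)
    return int.from_bytes(data, byteorder=byteorder, signed=True)


# Every value in the signed 32-bit range should survive unchanged.
for n in (0, 1, -1, 1024, -1024, 2**31 - 1, -2**31):
    assert roundtrip(n) == n

# The signed flag must match on both sides: decoding signed bytes
# as unsigned gives a large positive number instead of -1024.
print(int.from_bytes(b'\xfc\x00', byteorder='big'))               # 64512
print(int.from_bytes(b'\xfc\x00', byteorder='big', signed=True))  # -1024
```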
Converting an Integer to a String, then to Bytes (Different!)
It's important to distinguish between converting an integer directly to bytes (using to_bytes()) and converting an integer to its string representation and then encoding that string:
num = 2048
# Correct way to represent the *number* 2048 as bytes:
bytes_representation = num.to_bytes(2, byteorder='big')
print(bytes_representation) # Output: b'\x08\x00'
# Converting the *string* "2048" to bytes:
string_representation = str(num).encode('utf-8')
print(string_representation) # Output: b'2048'
- num.to_bytes(2, 'big'): This gives the binary representation of the number 2048, which requires two bytes (0x08 0x00 in big-endian).
- str(num).encode('utf-8'): This gives the UTF-8 encoded bytes of the string "2048". Each digit is represented by its UTF-8 byte value. This is not the same as the numerical representation.
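Looking at the individual byte values makes the difference visible. This sketch reuses the two conversions from this section; the variable names binary and text are illustrative:

```python
num = 2048

binary = num.to_bytes(2, byteorder='big')  # the number itself
text = str(num).encode('utf-8')            # the characters '2', '0', '4', '8'

print(len(binary), len(text))  # 2 4
print(list(binary))            # [8, 0]
print(list(text))              # [50, 48, 52, 56]

# Each form needs its own matching decoder to recover the integer.
print(int.from_bytes(binary, byteorder='big'))  # 2048
print(int(text.decode('utf-8')))                # 2048
```

The text form grows with the number of decimal digits, while the binary form grows with the number of bits, so the two are not interchangeable in a file format or protocol.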