How to Solve "OverflowError: Python int too large to convert to C long" in Python
The "OverflowError: Python int too large to convert to C long" error occurs when you attempt to convert a Python integer that is too large to fit into a C long data type.
This most often happens when working with NumPy or Pandas, where C types are used internally for performance reasons.
This guide explains the cause of the error and presents practical solutions, covering cases with NumPy arrays, Pandas DataFrames, and considerations for large integers.
Understanding the Error: C long Limits
The error arises because NumPy, and sometimes Pandas, use C data types internally for performance.
The C long type has a fixed size that depends on the platform: 32 bits on Windows (including 64-bit Windows) and 64 bits on most 64-bit Linux and macOS systems.
Python's built-in integers have arbitrary precision (they can be arbitrarily large), but when you try to force a large Python integer into a C long, you get an overflow. A 32-bit C long tops out at 2147483647 (2**31 - 1), so even 2147483648 is too large for it.
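import numpy as np

# ⛔️ OverflowError: Python int too large to convert to C long
# Raised where the default integer is 32 bits (e.g. NumPy 1.x on Windows);
# on platforms with a 64-bit default integer this value fits and no error occurs.
try:
    arr = np.array([1, 5, 2147483648], dtype=int)  # Using default int type
except OverflowError as e:
    print(e)
    # Output: Python int too large to convert to C long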
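To see which case applies on your machine, you can check the width of the platform's C long directly. The following is a minimal sketch using only the standard library's ctypes module; the variable names are illustrative and not part of the original example.

```python
import ctypes
import sys

# Width of the platform's C long in bits
# (32 on Windows, 64 on most 64-bit Linux/macOS builds)
c_long_bits = ctypes.sizeof(ctypes.c_long) * 8

# Largest value a signed C long of that width can hold
c_long_max = 2 ** (c_long_bits - 1) - 1

print(f"C long: {c_long_bits} bits, max value {c_long_max}")
print(f"sys.maxsize: {sys.maxsize}")

# 2147483648 (2**31) overflows a 32-bit C long but fits in a 64-bit one
print("2147483648 fits in C long:", 2147483648 <= c_long_max)
```

The printed values differ between platforms, which is why the same script can fail on Windows and succeed on Linux.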
Solutions for NumPy Arrays
Using np.int64
The most direct fix is to specify np.int64 as the dtype when creating the NumPy array. This uses a 64-bit integer on every platform, which has a much larger range than the default int type (which is a 32-bit integer on Windows in NumPy versions before 2.0).
import numpy as np
arr = np.array([1, 5, 2147483648], dtype=np.int64) # Explicitly use int64
print(arr)
Output:
[         1          5 2147483648]
np.int64 explicitly declares the array's element type as 64-bit integers, regardless of the platform default.
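Before building an int64 array from untrusted data, it can help to check which values actually fit. The sketch below uses plain Python and a hypothetical helper named fits_in_int64; the bounds it hard-codes are the same ones np.iinfo(np.int64) reports.

```python
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def fits_in_int64(value: int) -> bool:
    """Return True if value can be stored in a signed 64-bit integer."""
    return INT64_MIN <= value <= INT64_MAX


values = [1, 5, 2147483648, 9223372036854775808]

# Collect the values that would overflow an int64 array
too_large = [v for v in values if not fits_in_int64(v)]
print(too_large)  # [9223372036854775808]
```

If the resulting list is empty, np.int64 is safe to use; otherwise one of the approaches for larger integers is needed.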
Using np.float64 (for very large numbers)
If your numbers are extremely large (larger than sys.maxsize on a 64-bit build, which is 2**63 - 1), even np.int64 is not enough. In this case, you can use np.float64. Keep in mind that a float64 represents every integer exactly only up to 2**53, so beyond that you lose integer precision, but you can represent much larger magnitudes.
import numpy as np
import sys
print(sys.maxsize) # Output: 9223372036854775807 (for 64-bit system)
arr = np.array([1, 5, 9223372036854775808], dtype=np.float64)
print(arr) # Output: [1.00000000e+00 5.00000000e+00 9.22337204e+18]
- The type np.float64 can store the large values in the array (approximately), while an int dtype would raise an exception.
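The precision loss is easy to observe with Python's built-in float, which is the same IEEE 754 double-precision format as np.float64 on mainstream platforms. This is a small illustrative sketch; the variable name is an assumption for the example.

```python
# float64 has a 53-bit significand, so 2**53 is the last point
# up to which every integer is exactly representable
exact_limit = 2 ** 53

# 2**53 and 2**53 + 1 round to the same float
print(float(exact_limit) == float(exact_limit + 1))  # True

# Round-tripping the int64 maximum through float changes its value
print(int(float(9223372036854775807)))  # 9223372036854775808
```

If exact values matter (IDs, account numbers, hashes), converting to float is therefore not a safe workaround.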
Solutions for Pandas DataFrames
Using astype('int64') or astype(float)
If the error arises in a Pandas DataFrame, use the astype() method to change the data type of the problematic column:
import pandas as pd
df = pd.DataFrame({'salary': ['9223372036854775804', '439243294932']})
df['new'] = df['salary'].astype('int64') # Convert to int64
print(df)
Output:
                salary                  new
0  9223372036854775804  9223372036854775804
1         439243294932         439243294932
import pandas as pd
df = pd.DataFrame({'salary': ['9223372036854775808', '439243294932']})
df['new'] = df['salary'].astype(float) # Or convert to float
print(df)
Output:
                salary           new
0  9223372036854775808  9.223372e+18
1         439243294932  4.392433e+11
Using pd.to_numeric()
The function pd.to_numeric() is recommended for converting a column's data type, because its errors parameter also lets you control what happens to values that cannot be converted:
import pandas as pd
df = pd.DataFrame({
    'experience': ['5', '14', '7', '10'],
    'salary': [175.1, 180.2, 190.3, 205.4],
}, index=['Alice', 'Bob', 'Carl', 'Dan'])
df['experience'] = pd.to_numeric(df['experience'])
print(df.dtypes)
print(df.idxmax())
Output:
experience      int64
salary        float64
dtype: object
experience    Bob
salary        Dan
dtype: object
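The example above converts clean data only. To illustrate the error handling mentioned earlier, here is a hedged sketch that passes errors='coerce' to pd.to_numeric(); the Series contents are made-up sample data with one deliberately invalid entry.

```python
import pandas as pd

# Hypothetical input with one value that is not a number
raw = pd.Series(['5', '14', 'unknown', '10'])

# errors='coerce' replaces unparseable values with NaN instead of raising
cleaned = pd.to_numeric(raw, errors='coerce')

print(cleaned.isna().sum())  # 1
print(cleaned.dtype)         # float64, because NaN forces a floating-point column
```

Note that once NaN appears, the column becomes float64, so the precision caveat for large integers applies here as well.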
Handling Integers Larger than sys.maxsize
Using Native Python Lists
Standard Python lists can handle arbitrarily large integers. If you don't need the specific features of NumPy or Pandas, a regular list might be sufficient:
a_list = [1, 5, 92233720368547758088888888]
print(a_list) # Output: [1, 5, 92233720368547758088888888]
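If you would rather keep the data in a NumPy array, one further option, offered here as a suggestion rather than something the examples above rely on, is dtype=object. The array then holds references to ordinary Python integers, so nothing is converted to a C type. The trade-off is that operations run at Python speed instead of using NumPy's fast C loops.

```python
import numpy as np

# dtype=object stores ordinary Python ints, so no C conversion takes place
big = np.array([1, 5, 92233720368547758088888888], dtype=object)

print(big.dtype)  # object
print(big.sum())  # 92233720368547758088888894
```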
Conclusion
The "OverflowError: Python int too large to convert to C long" error typically occurs when working with NumPy or Pandas and trying to store integers that exceed the limits of the underlying C data types.
By using np.int64, np.float64 (for very large numbers, at the cost of exact integer precision), astype('int64'), astype(float), or pd.to_numeric(), and by being mindful of the data types you're working with, you can resolve this error and handle large integers effectively.
Remember that standard Python lists are a good option for storing arbitrarily large integers if you don't need the specific features of NumPy or Pandas.