How to Solve "OverflowError: Python int too large to convert to C long" in Python

The OverflowError: Python int too large to convert to C long error in Python occurs when you attempt to convert a Python integer that is too large to fit into a C long data type.

This most often happens when working with NumPy or Pandas, where C types are used internally for performance reasons.

This guide explains the cause of the error and presents practical solutions, covering cases with NumPy arrays, Pandas DataFrames, and considerations for large integers.

Understanding the Error: C long Limits

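NumPy, and sometimes Pandas, store integers in fixed-width C data types for performance.

A C long has a fixed size that depends on the platform: 32 bits on Windows (including 64-bit Windows) and 64 bits on most 64-bit Linux and macOS systems. A signed 32-bit long can hold at most 2147483647 (2**31 - 1), so 2147483648 is the smallest positive value that overflows it.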

Python's built-in integers have arbitrary precision (they can be arbitrarily large), but when you try to force a large Python integer into a C long, you get an overflow.

import numpy as np

# ⛔️ OverflowError: Python int too large to convert to C long
# (raised on platforms where NumPy's default int is 32 bits wide)
try:
    arr = np.array([1, 5, 2147483648], dtype=int)  # Using default int type
except OverflowError as e:
    print(e)
    # Output: Python int too large to convert to C long
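
Because the limit depends on the platform, it can help to check what your own environment uses before deciding on a fix. The following is a minimal sketch, assuming only the standard ctypes module and NumPy's iinfo; the variable names are illustrative and the printed values will differ between systems.

```python
import ctypes
import numpy as np

# Width of the C long type on this platform, in bits (32 or 64).
c_long_bits = ctypes.sizeof(ctypes.c_long) * 8
print(f"C long: {c_long_bits} bits")

# NumPy's default integer dtype and the range of values it can hold.
default_int = np.dtype(int)
limits = np.iinfo(default_int)
print(f"NumPy default int: {default_int.name}")
print(f"Range: {limits.min} to {limits.max}")

# The range of an explicit 64-bit integer is the same everywhere.
int64_max = np.iinfo(np.int64).max
print(int64_max)  # 9223372036854775807
```

If the default integer reported here is int32, any value above 2147483647 needs an explicit wider dtype.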

Solutions for NumPy Arrays

Using np.int64

The most direct fix is to specify np.int64 as the dtype when creating the NumPy array. This requests a 64-bit integer on every platform, which has a much larger range than a 32-bit default int (the default on Windows with NumPy versions before 2.0).

import numpy as np

arr = np.array([1, 5, 2147483648], dtype=np.int64) # Explicitly use int64
print(arr)

Output:

[         1          5 2147483648]
Note: np.int64 explicitly declares the array's elements as 64-bit integers, regardless of platform.

Using np.float64 (for very large numbers)

If your numbers exceed the np.int64 maximum of 9223372036854775807 (the same value as sys.maxsize on a 64-bit system), np.int64 is not enough. In this case, you can use np.float64. Keep in mind that you lose integer precision, since a 64-bit float represents integers exactly only up to 2**53, but you can represent much larger magnitudes.

import numpy as np
import sys

print(sys.maxsize) # Output: 9223372036854775807 (for 64-bit system)

arr = np.array([1, 5, 9223372036854775808], dtype=np.float64)
print(arr) # Output: [1.00000000e+00 5.00000000e+00 9.22337204e+18]

The np.float64 type can store the large values in the array, whereas an integer dtype would raise an exception.
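
The precision trade-off mentioned above is easy to overlook, so here is a small sketch of what it looks like in practice. The values are chosen only for illustration: above 2**53, a 64-bit float can no longer tell neighbouring integers apart.

```python
import numpy as np

exact = 2**53 + 1             # 9007199254740993 as a Python int
as_float = np.float64(exact)  # nearest representable float is 2**53

print(int(as_float))           # 9007199254740992: the trailing 1 is lost
print(int(as_float) == exact)  # False

# Two different integers near 2**63 collapse to the same float.
arr = np.array([2**63, 2**63 + 1], dtype=np.float64)
print(arr[0] == arr[1])        # True
```

If the exact digits matter (identifiers, for example), a float column is probably the wrong choice even though it avoids the error.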

Solutions for Pandas DataFrames

Using astype(int) or astype(float)

If the error arises in a Pandas DataFrame, use the astype() method to change the data type of the problematic column. Passing the explicit 'int64' is safer than plain int, whose width can depend on the platform:

import pandas as pd

df = pd.DataFrame({'salary': ['9223372036854775804', '439243294932']})
df['new'] = df['salary'].astype('int64') # Convert to int64

print(df)

Output:

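                salary                  new
0  9223372036854775804  9223372036854775804
1         439243294932         439243294932

When a value exceeds the int64 range, as 9223372036854775808 does, convert the column to float instead:

import pandas as pd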

df = pd.DataFrame({'salary': ['9223372036854775808', '439243294932']})
df['new'] = df['salary'].astype(float) # Or convert to float

print(df)

Output:

                salary           new
0  9223372036854775808  9.223372e+18
1         439243294932  4.392433e+11
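
Choosing between the two conversions by hand requires knowing whether every value fits into 64 bits. As a sketch, the check could be automated along these lines; to_int64_or_float is a hypothetical helper written for this guide, not a Pandas function, and it assumes every entry in the column parses as an integer.

```python
import numpy as np
import pandas as pd

INT64 = np.iinfo(np.int64)  # bounds of a signed 64-bit integer

def to_int64_or_float(series):
    """Hypothetical helper: int64 if every value fits, float64 otherwise."""
    values = [int(v) for v in series]  # Python ints never overflow
    if all(INT64.min <= v <= INT64.max for v in values):
        return pd.Series(values, index=series.index, dtype="int64")
    # At least one value is out of range: accept reduced precision.
    return pd.Series([float(v) for v in values],
                     index=series.index, dtype="float64")

fits = to_int64_or_float(pd.Series(['9223372036854775804', '439243294932']))
too_big = to_int64_or_float(pd.Series(['9223372036854775808', '439243294932']))

print(fits.dtype)     # int64
print(too_big.dtype)  # float64
```

The range check runs on plain Python integers, so it cannot itself overflow.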

Using pd.to_numeric()

The pd.to_numeric() function is the recommended way to convert a column to a numeric type, because its errors parameter (for example errors='coerce') lets you control what happens to values that can't be parsed:

import pandas as pd

df = pd.DataFrame({
    'experience': ['5', '14', '7', '10'],
    'salary': [175.1, 180.2, 190.3, 205.4],
}, index=['Alice', 'Bob', 'Carl', 'Dan'])

df['experience'] = pd.to_numeric(df['experience'])

print(df.dtypes)
print(df.idxmax())

Output:

experience      int64
salary        float64
dtype: object
experience    Bob
salary        Dan
dtype: object
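
The example above converts clean data, so it does not show the error handling that makes pd.to_numeric() attractive. A minimal sketch with errors='coerce' follows; the sample values, including the unparseable 'unknown' entry, are made up for illustration.

```python
import pandas as pd

raw = pd.Series(['5', '14', 'unknown', '10'])  # one entry is not a number

# errors='coerce' turns unparseable entries into NaN instead of raising.
cleaned = pd.to_numeric(raw, errors='coerce')

print(cleaned.dtype)         # float64, because NaN requires a float column
print(cleaned.isna().sum())  # 1
```

Counting the NaN values afterwards is a quick way to see how many entries failed to convert.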

Handling Integers Larger than sys.maxsize

Using Native Python Lists

Standard Python lists can handle arbitrarily large integers. If you don't need the specific features of NumPy or Pandas, a regular list might be sufficient:

a_list = [1, 5, 92233720368547758088888888]
print(a_list) # Output: [1, 5, 92233720368547758088888888]
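
If you do want array-style operations on such values, one alternative that is not covered above is a NumPy array with dtype=object, which stores ordinary Python integers rather than C integers. The sketch below assumes you can accept slower, non-vectorized arithmetic in exchange for exact results.

```python
import numpy as np

# dtype=object keeps each element as an arbitrary-precision Python int.
big = np.array([1, 5, 92233720368547758088888888], dtype=object)
print(big.dtype)  # object

# Element-wise arithmetic still works and stays exact.
doubled = big * 2
print(doubled[2] == 2 * 92233720368547758088888888)  # True

total = big.sum()
print(total)  # 92233720368547758088888894
```

Object arrays give up most of NumPy's speed advantage, so this mainly suits small arrays or cases where an existing NumPy-based interface has to be kept.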

Conclusion

The OverflowError: Python int too large to convert to C long typically occurs when working with NumPy or Pandas and trying to store integers that exceed the limits of the underlying C data types.

By using np.int64, np.float64 (for very large numbers), astype(int), astype(float) or pd.to_numeric(), and being mindful of the data types you're working with, you can resolve this error and handle large integers effectively.

Remember that standard Python lists are a good option for storing arbitrarily large integers if you don't need the specific features of NumPy or Pandas.