Python Pandas: How to Convert Entire DataFrame to Numeric (int or float)

When working with data in Pandas, columns that should be numeric are sometimes loaded as strings (object dtype), especially if the source data (like a CSV) contains mixed types or non-standard numeric representations. Converting these columns, or even an entire DataFrame where applicable, to appropriate numeric types (integer or float) is essential for performing calculations, statistical analysis, and plotting.

This guide explains how to convert all convertible columns in a Pandas DataFrame to numeric types using DataFrame.apply() with pd.to_numeric() and how to handle non-convertible values.

The Goal: Numeric Conversion for the Entire DataFrame

Given a Pandas DataFrame where multiple columns contain data that should be numeric (integers or floats) but are currently stored as strings (object dtype), we want to convert all such columns to their appropriate numeric types efficiently.

Example DataFrame with String Numerics

import pandas as pd

data = {
    'RecordID_Str': ['101', '102', '103', '104'],
    'Quantity_Str': ['5', '12', '8', '20'],
    'Price_Str': ['19.99', '5.50', '120.00', '0.99'],
    'Category_NonNumeric': ['A', 'B', 'A', 'C']
}

df_original = pd.DataFrame(data)
print("Original DataFrame:")
print(df_original)
print()

print("Original dtypes:")
print(df_original.dtypes)

Output:

Original DataFrame:
  RecordID_Str Quantity_Str Price_Str Category_NonNumeric
0          101            5     19.99                   A
1          102           12      5.50                   B
2          103            8    120.00                   A
3          104           20      0.99                   C

Original dtypes:
RecordID_Str           object
Quantity_Str           object
Price_Str              object
Category_NonNumeric    object
dtype: object

All columns here are initially object type because their values are strings.
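
You can confirm this at the value level; a quick check on a one-column frame:

import pandas as pd

df_check = pd.DataFrame({'Quantity_Str': ['5', '12', '8', '20']})

# Each cell still holds a Python str, which is why the column dtype is object
print(type(df_check.loc[0, 'Quantity_Str']))  # <class 'str'>
print(df_check['Quantity_Str'].dtype)         # object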

Method: Using DataFrame.apply(pd.to_numeric)

The DataFrame.apply(func) method applies a function func along an axis of the DataFrame. When applied column-wise (default axis=0), it passes each column (as a Series) to the function. pd.to_numeric() is designed to convert an array-like or Series to a numeric type.
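
To make the column-wise mechanics visible, here is a minimal sketch (the wrapper function and its printout are purely illustrative):

import pandas as pd

df = pd.DataFrame({'a': ['1', '2'], 'b': ['3.5', '4.5']})

def convert_and_report(col):
    # With the default axis=0, apply() hands each column to this function as a Series
    print(f"Converting column '{col.name}' (received as a {type(col).__name__})")
    return pd.to_numeric(col)

df_numeric = df.apply(convert_and_report)
print(df_numeric.dtypes)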

Basic Conversion (Assumes All Columns Are Convertible)

If all columns in your DataFrame contain string representations of numbers, you can directly apply pd.to_numeric.

import pandas as pd

# Example with only convertible string columns
df_all_convertible = pd.DataFrame({
    'Col_Int_Str': ['10', '20', '30'],
    'Col_Float_Str': ['1.1', '2.2', '3.3']
})
print("DataFrame with all convertible string columns (before):")
print(df_all_convertible.dtypes)
print()

# ✅ Apply pd.to_numeric to each column
df_all_numeric = df_all_convertible.apply(pd.to_numeric)

print("DataFrame dtypes after apply(pd.to_numeric):")
print(df_all_numeric.dtypes)

Output:

DataFrame with all convertible string columns (before):
Col_Int_Str      object
Col_Float_Str    object
dtype: object

DataFrame dtypes after apply(pd.to_numeric):
Col_Int_Str        int64
Col_Float_Str    float64
dtype: object

  • pd.to_numeric infers whether to convert to int64 or float64 based on the content (e.g., the presence of a decimal point), as the quick illustration below shows.
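
A quick Series-level illustration of this inference:

import pandas as pd

# All-integer strings are converted to int64 ...
print(pd.to_numeric(pd.Series(['10', '20'])).dtype)   # int64

# ... while a decimal point anywhere forces float64
print(pd.to_numeric(pd.Series(['10', '2.5'])).dtype)  # float64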

Verifying Data Types (.dtypes or .info())

After conversion, check the dtypes attribute or use df.info() to confirm.

import pandas as pd

df_all_convertible = pd.DataFrame({
    'Col_Int_Str': ['10', '20', '30'],
    'Col_Float_Str': ['1.1', '2.2', '3.3']
})

df_all_numeric = df_all_convertible.apply(pd.to_numeric)

print("--- Using .info() to verify ---")
df_all_numeric.info()

Output:

--- Using .info() to verify ---
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   Col_Int_Str    3 non-null      int64
 1   Col_Float_Str  3 non-null      float64
dtypes: float64(1), int64(1)
memory usage: 120.0 bytes

Handling Non-Numeric Columns or Values (The errors Parameter)

If your DataFrame contains columns that are genuinely non-numeric (like 'Category_NonNumeric' in our main example) or individual string values within a column that cannot be converted to a number (e.g., "Unknown"), applying pd.to_numeric directly will raise a ValueError. The errors parameter in pd.to_numeric controls this behavior.

Default Behavior: errors='raise' (Raises ValueError)

By default, if pd.to_numeric encounters a value it cannot parse, it raises an exception.

import pandas as pd

data = {
    'RecordID_Str': ['101', '102', '103', '104'],
    'Quantity_Str': ['5', '12', '8', '20'],
    'Price_Str': ['19.99', '5.50', '120.00', '0.99'],
    'Category_NonNumeric': ['A', 'B', 'A', 'C']
}

df_original = pd.DataFrame(data)

# Applying pd.to_numeric directly fails because 'Category_NonNumeric' cannot be parsed
try:
    df_error_attempt = df_original.apply(pd.to_numeric)  # errors='raise' is the default
except ValueError as e:
    print(f"ValueError: {e}")
# Example message: Unable to parse string "A" at position 0 (from 'Category_NonNumeric')

Solution: errors='ignore' (Keeps Non-Convertible Columns/Values as Original)

If errors='ignore', pd.to_numeric converts each column it can; any column that contains even one unparseable value is returned unchanged and keeps its original object dtype. Note that errors='ignore' is deprecated for pd.to_numeric in recent pandas releases (2.2+), where it emits a FutureWarning, so prefer errors='coerce' in new code.

import pandas as pd

# Sample DataFrame with a genuinely non-numeric column ('Category_NonNumeric')
df_original_copy = pd.DataFrame({
    'RecordID_Str': ['101', '102'], 'Quantity_Str': ['5', '12'],
    'Price_Str': ['19.99', '5.50'], 'Category_NonNumeric': ['A', 'B']
})

# ✅ Apply pd.to_numeric with errors='ignore'
# The 'errors' parameter is passed to to_numeric for each column.
df_ignore_errors = df_original_copy.apply(pd.to_numeric, errors='ignore')

print("DataFrame dtypes after apply(pd.to_numeric, errors='ignore'):")
print(df_ignore_errors.dtypes)
print()

print("DataFrame content (errors='ignore'):")
print(df_ignore_errors)

Output:

DataFrame dtypes after apply(pd.to_numeric, errors='ignore'):
RecordID_Str             int64
Quantity_Str             int64
Price_Str              float64
Category_NonNumeric     object
dtype: object

DataFrame content (errors='ignore'):
   RecordID_Str  Quantity_Str  Price_Str Category_NonNumeric
0           101             5      19.99                   A
1           102            12       5.50                   B

Solution: errors='coerce' (Converts Non-Convertible to NaN)

If errors='coerce', pd.to_numeric will convert columns/values it can, and replace those it cannot parse with NaN (Not a Number). Columns containing NaN (and successfully converted numbers) will become float type.

import pandas as pd

# Sample DataFrame with some columns having non-numeric data
df_original_copy = pd.DataFrame({
    'RecordID_Str': ['101', '102'],
    'Quantity_Str': ['5', '12'],
    'Price_Str': ['19.99', '5.50'],
    'Category_NonNumeric': ['A', 'B'],
    'Mixed_Numeric': ['100', 'Error']  # Column with some non-numeric strings
})

# ✅ Apply pd.to_numeric with errors='coerce'
df_coerce_errors = df_original_copy.apply(pd.to_numeric, errors='coerce')

# Display the data types after coercion
print("DataFrame dtypes after apply(pd.to_numeric, errors='coerce'):")
print(df_coerce_errors.dtypes)
print()

# Display the DataFrame content after coercion
print("DataFrame content (errors='coerce'):")
print(df_coerce_errors)

Output:

DataFrame dtypes after apply(pd.to_numeric, errors='coerce'):
RecordID_Str             int64
Quantity_Str             int64
Price_Str              float64
Category_NonNumeric    float64
Mixed_Numeric          float64
dtype: object

DataFrame content (errors='coerce'):
   RecordID_Str  Quantity_Str  Price_Str  Category_NonNumeric  Mixed_Numeric
0           101             5      19.99                  NaN          100.0
1           102            12       5.50                  NaN            NaN

Notice that the 'Category_NonNumeric' column, which contained only non-numeric strings, becomes all NaN and thus float64 dtype.
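
If you need an integer dtype even though a coerced column now contains NaN, one option (beyond the basic workflow shown here) is pandas' nullable Int64 extension dtype, which stores missing values as <NA> instead of forcing a float:

import pandas as pd

df = pd.DataFrame({'Mixed_Numeric': ['100', 'Error']})
coerced = df.apply(pd.to_numeric, errors='coerce')
print(coerced['Mixed_Numeric'].dtype)  # float64, because NaN is a float

# Cast to the nullable integer extension dtype; NaN becomes <NA>
nullable_int = coerced['Mixed_Numeric'].astype('Int64')
print(nullable_int.dtype)  # Int64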

Using functools.partial with apply (Alternative for passing errors)

If the df.apply(pd.to_numeric, errors='...') syntax feels indirect for passing keyword arguments to pd.to_numeric, you can use functools.partial to create a new function with the errors argument pre-set.

from functools import partial
import pandas as pd

# Sample DataFrame: 'Price_Str' holds numeric strings, 'Category_NonNumeric' does not
df_original_copy = pd.DataFrame({
    'Category_NonNumeric': ['A', 'B'],
    'Price_Str': ['19.99', '5.50']
})

# Create a partial function with errors='ignore' pre-set
to_numeric_ignore_errors = partial(pd.to_numeric, errors='ignore')

# Apply the partial function across the DataFrame
df_partial_example = df_original_copy.apply(to_numeric_ignore_errors)

# Display the data types after using functools.partial with errors='ignore'
print("DataFrame dtypes (using functools.partial with errors='ignore'):")
print(df_partial_example.dtypes)

Output:

DataFrame dtypes (using functools.partial with errors='ignore'):
Category_NonNumeric     object
Price_Str              float64
dtype: object

This achieves the same as passing errors='ignore' directly to apply.
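
An inline lambda is another common way to forward keyword arguments without an extra import; a minimal sketch (using errors='coerce' here, but any errors value can be passed the same way):

import pandas as pd

df = pd.DataFrame({
    'Category_NonNumeric': ['A', 'B'],
    'Price_Str': ['19.99', '5.50']
})

# The lambda simply forwards each column to pd.to_numeric with the chosen errors setting
df_lambda = df.apply(lambda col: pd.to_numeric(col, errors='coerce'))
print(df_lambda.dtypes)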

Choosing the Right errors Strategy

  • errors='raise' (default): Use if you expect all data to be cleanly numeric. Any non-numeric value will halt the process, forcing you to identify and fix data quality issues.
  • errors='ignore': Use if you want to convert what you can and leave problematic columns/values as they are (often object dtype). You'll need subsequent steps if you want to further process or clean these ignored columns.
  • errors='coerce': Often the most practical for data cleaning. It converts valid numerics and flags unconvertible entries as NaN, which can then be easily counted, imputed, or dropped using standard Pandas methods (.isnull().sum(), .fillna(), .dropna()); see the cleanup sketch after this list.
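
As a concrete follow-up to the coerce strategy, here is a minimal cleanup sketch (the column names and fill value are purely illustrative):

import pandas as pd

df = pd.DataFrame({
    'Quantity_Str': ['5', '12', 'Unknown'],
    'Price_Str': ['19.99', '5.50', '0.99']
})

# Coerce: anything unparseable becomes NaN
df_numeric = df.apply(pd.to_numeric, errors='coerce')

# Count how many values failed to convert in each column
print(df_numeric.isnull().sum())

# Then either fill the gaps ...
print(df_numeric.fillna(0))

# ... or drop the affected rows entirely
print(df_numeric.dropna())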

Conclusion

To convert all applicable columns in a Pandas DataFrame to numeric types (integer or float):

  1. Use df_numeric = df.apply(pd.to_numeric, errors=...).
  2. The errors parameter is crucial for handling columns or individual values that cannot be converted:
    • errors='raise' (default): Stops on error.
    • errors='ignore': Leaves non-convertible data as is (often object type).
    • errors='coerce': Converts non-convertible data to NaN, allowing numeric operations on the rest. This is often the most useful option for data cleaning workflows.

By applying pd.to_numeric across the DataFrame with an appropriate error handling strategy, you can efficiently ensure your data is in the correct numeric format for further analysis.