Python Pandas: How to Convert timedelta64[ns] Column to Total Seconds
Pandas uses the timedelta64[ns] data type to represent durations or differences between two datetime objects. While this is useful for precise time arithmetic, you often need to express these durations in a more common unit, like total seconds, for reporting, calculations, or feature engineering.
This guide explains how to convert a Pandas DataFrame column of timedelta64[ns] values into total seconds, primarily using Series.dt.total_seconds(), with division by a one-second timedelta as an alternative.
Understanding timedelta64[ns] in Pandas
A timedelta64[ns] object in Pandas represents a duration. The [ns] indicates that the duration is stored with nanosecond precision. For example, a timedelta could represent "2 days, 5 hours, 30 minutes" or "7500 seconds." Our goal is to convert such durations purely into their total second representation (e.g., 1 minute = 60 seconds, 1 hour = 3600 seconds).
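As a quick check of that arithmetic, "2 days, 5 hours, 30 minutes" works out to 2 × 86,400 + 5 × 3,600 + 30 × 60 = 192,600 seconds. The short sketch below confirms this on a single pd.Timedelta scalar; the variable names are illustrative only.

```python
import pandas as pd

# "2 days, 5 hours, 30 minutes" as a single Timedelta scalar
duration = pd.Timedelta(days=2, hours=5, minutes=30)

# Manual arithmetic: days, hours and minutes each converted to seconds
manual_seconds = 2 * 86_400 + 5 * 3_600 + 30 * 60

print(duration.total_seconds())  # 192600.0
print(manual_seconds)            # 192600
```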
Example DataFrame with a Timedelta Column
import pandas as pd
import numpy as np # For np.timedelta64
data = {
    'TaskID': [1, 2, 3, 4, 5],
    'TaskName': ['Analysis', 'Development', 'Testing', 'Deployment', 'Meeting'],
    'Duration': [
        np.timedelta64(2, 'h'),               # 2 hours
        pd.Timedelta(days=1, hours=4),        # 1 day, 4 hours
        pd.Timedelta(minutes=90),             # 90 minutes
        np.timedelta64(3600 * 3 + 300, 's'),  # 3 hours 5 minutes in seconds
        pd.NaT                                # A missing timedelta
    ]
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
print()
print("Dtype of 'Duration' column:")
print(df['Duration'].dtype)
Output:
Original DataFrame:
   TaskID     TaskName        Duration
0       1     Analysis 0 days 02:00:00
1       2  Development 1 days 04:00:00
2       3      Testing 0 days 01:30:00
3       4   Deployment 0 days 03:05:00
4       5      Meeting             NaT

Dtype of 'Duration' column:
timedelta64[ns]
The 'Duration' column is of timedelta64[ns] type.
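If a duration column arrives as text instead (for example, after reading a file), it has to be converted to a timedelta dtype before the methods in this guide apply. The following is a minimal sketch, assuming string inputs in a form that pd.to_timedelta can parse; the raw Series is hypothetical and not part of the example DataFrame.

```python
import pandas as pd

# Hypothetical raw data: durations stored as strings rather than timedeltas
raw = pd.Series(["02:00:00", "1 days 04:00:00", None], name="Duration")

# pd.to_timedelta parses the strings; the missing entry becomes NaT
durations = pd.to_timedelta(raw)

print(pd.api.types.is_timedelta64_dtype(raw))        # False
print(pd.api.types.is_timedelta64_dtype(durations))  # True
```

Checking the dtype first is a cheap way to catch a column that only looks like a duration.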
Method 1: Using Series.dt.total_seconds() (Recommended)
Once a column has the timedelta64[ns] dtype, you can use the .dt accessor to reach timedelta-specific properties and methods. The Series.dt.total_seconds() method returns the total duration of each timedelta in the Series, expressed in seconds, as a float.
Basic Conversion
import pandas as pd
import numpy as np
df_example = pd.DataFrame({
    'Duration': [np.timedelta64(2, 'h'), pd.Timedelta(days=1, hours=4), pd.NaT]
})
# ✅ Convert the 'Duration' column to total seconds
duration_in_seconds = df_example['Duration'].dt.total_seconds()
print("'Duration' column converted to total seconds:")
print(duration_in_seconds)
Output:
'Duration' column converted to total seconds:
0      7200.0
1    100800.0
2         NaN
Name: Duration, dtype: float64
- This method correctly calculates the total number of seconds, considering all components of the timedelta (days, hours, minutes, seconds, etc.).
- NaT (Not a Time) values in the original timedelta column are converted to NaN (Not a Number) in the resulting float Series.
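The first point is easy to get wrong, because the similarly named .dt.seconds attribute returns only the seconds component of each timedelta (0 to 86399) and leaves out whole days. A small sketch of the difference, using an illustrative Series:

```python
import pandas as pd

durations = pd.Series([pd.Timedelta(days=1, hours=4), pd.Timedelta(minutes=90)])

# .dt.seconds holds only the seconds component; whole days are excluded
component_seconds = durations.dt.seconds

# .dt.total_seconds() folds days, hours, minutes and seconds into one number
total_seconds = durations.dt.total_seconds()

print(component_seconds.tolist())  # [14400, 5400]
print(total_seconds.tolist())      # [100800.0, 5400.0]
```

The two agree only for durations shorter than one day, which is why .dt.total_seconds() is the safer choice for this conversion.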
Updating the Column In-Place or Creating a New Column
You can either overwrite the original 'Duration' column or create a new column for the seconds.
- Overwrite existing column:
import pandas as pd
import numpy as np  # Needed for np.timedelta64 below
data = {
    'TaskID': [1, 2, 3, 4, 5],
    'TaskName': ['Analysis', 'Development', 'Testing', 'Deployment', 'Meeting'],
    'Duration': [
        np.timedelta64(2, 'h'),               # 2 hours
        pd.Timedelta(days=1, hours=4),        # 1 day, 4 hours
        pd.Timedelta(minutes=90),             # 90 minutes
        np.timedelta64(3600 * 3 + 300, 's'),  # 3 hours 5 minutes in seconds
        pd.NaT                                # A missing timedelta
    ]
}
df = pd.DataFrame(data)
df_update = df.copy()
df_update['Duration'] = df_update['Duration'].dt.total_seconds()
print("DataFrame with 'Duration' column updated to seconds:")
print(df_update)
Output:
DataFrame with 'Duration' column updated to seconds:
   TaskID     TaskName  Duration
0       1     Analysis    7200.0
1       2  Development  100800.0
2       3      Testing    5400.0
3       4   Deployment   11100.0
4       5      Meeting       NaN
- Create a new column:
import pandas as pd
import numpy as np
df_new_col = pd.DataFrame({
    'TaskName': ['Analysis', 'Development', 'Testing'],
    'Duration': [np.timedelta64(2, 'h'), pd.Timedelta(days=1, hours=4), np.timedelta64(30, 'm')]
})
df_new_col['Duration_Seconds'] = df_new_col['Duration'].dt.total_seconds()
print("DataFrame with new 'Duration_Seconds' column:")
print(df_new_col)
Output:
DataFrame with new 'Duration_Seconds' column:
      TaskName        Duration  Duration_Seconds
0     Analysis 0 days 02:00:00            7200.0
1  Development 1 days 04:00:00          100800.0
2      Testing 0 days 00:30:00            1800.0
Method 2: Dividing by np.timedelta64(1, 's')
Another way to convert a timedelta64[ns] Series to total seconds is to divide it by a timedelta64 object representing one second. This works because dividing one timedelta by another yields a plain floating-point ratio, here the number of one-second units in each duration.
import pandas as pd
import numpy as np
df_divide_example = pd.DataFrame({
    'TaskName': ['Analysis', 'Development'],
    'Duration': [np.timedelta64(2, 'h'), pd.Timedelta(days=1, hours=4)]
})
# Define a timedelta object representing one second
one_second_timedelta = np.timedelta64(1, 's')
# ✅ Divide the 'Duration' column by one second
duration_in_seconds_div = df_divide_example['Duration'] / one_second_timedelta
print("'Duration' column converted to total seconds (by division):")
print(duration_in_seconds_div)
Output:
'Duration' column converted to total seconds (by division):
0      7200.0
1    100800.0
Name: Duration, dtype: float64
- This method also correctly converts NaT to NaN.
- While it works, Series.dt.total_seconds() is generally considered more explicit and idiomatic for this specific conversion.
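The example above contains no missing values, so the sketch below checks the NaT claim directly and compares the two methods on the same data. It also shows, as an aside, that changing the divisor changes the unit. The Series and variable names are illustrative.

```python
import numpy as np
import pandas as pd

durations = pd.Series(
    [pd.Timedelta(hours=2), pd.Timedelta(days=1, hours=4), pd.NaT],
    name="Duration",
)

by_method = durations.dt.total_seconds()
by_division = durations / np.timedelta64(1, "s")

# Raises AssertionError if the results differ; matching NaN positions count as equal
pd.testing.assert_series_equal(by_method, by_division)

# Dividing by a different unit yields that unit instead, here minutes
minutes = durations / pd.Timedelta(minutes=1)
print(minutes.tolist())  # [120.0, 1680.0, nan]
```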
Handling NaT (Not a Time) Values
Both Series.dt.total_seconds() and division by np.timedelta64(1, 's') convert NaT values in the original timedelta column to np.nan (Not a Number) in the resulting floating-point Series of seconds. This is usually the desired behavior for representing missing duration data.
import pandas as pd
import numpy as np
df_with_nat = pd.DataFrame({'Duration': [np.timedelta64(1, 'h'), pd.NaT]})
seconds_col = df_with_nat['Duration'].dt.total_seconds()
print("Conversion of NaT to NaN:")
print(seconds_col)
Output:
Conversion of NaT to NaN:
0    3600.0
1       NaN
Name: Duration, dtype: float64
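When NaN is not what a later step expects, the seconds column can be post-processed. The sketch below shows two possible options, assuming whole-second values are acceptable in the first case and that zero is a sensible stand-in for a missing duration in the second; neither is required by the conversion itself.

```python
import pandas as pd

durations = pd.Series([pd.Timedelta(hours=1), pd.NaT], name="Duration")
seconds = durations.dt.total_seconds()

# Option 1: keep the missing value but store whole seconds as nullable integers
seconds_int = seconds.round().astype("Int64")

# Option 2: replace missing durations with 0 (only if that fits your data)
seconds_filled = seconds.fillna(0)

print(seconds_int.tolist())     # [3600, <NA>]
print(seconds_filled.tolist())  # [3600.0, 0.0]
```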
Conclusion
To convert a Pandas DataFrame column from timedelta64[ns] type to total seconds:
- The recommended, idiomatic method is to use the dt accessor with the total_seconds() method:
df['duration_in_seconds'] = df['your_timedelta_column'].dt.total_seconds()
- An alternative is to divide the timedelta column by a numpy.timedelta64(1, 's') object:
df['duration_in_seconds'] = df['your_timedelta_column'] / np.timedelta64(1, 's')
Both methods produce a column of float dtype representing the total duration in seconds, and both convert NaT values to np.nan. The .dt.total_seconds() method is generally preferred for its explicitness.