Python Pandas: How to Find Closest Value to a Number in a DataFrame Column

Locating the value in a Pandas DataFrame column that is numerically closest to a given target number is a common task in data analysis. This might be needed for nearest neighbor searches, data matching, or finding the best approximation. Pandas, often in conjunction with NumPy, provides efficient ways to achieve this.

This guide explains how to find the row(s) containing the value closest to a target number in a specific DataFrame column, primarily using absolute differences with argsort() or idxmin().

The Goal: Finding the Nearest Numerical Match

Given a Pandas DataFrame, a specific numerical column, and a target number, we want to identify the row (or rows, if there are ties or we want multiple closest) where the value in the specified column is numerically closest to our target number.

Example DataFrame

import pandas as pd

data = {
    'ProductID': ['A101', 'B202', 'C303', 'D404', 'E505', 'F606'],
    'Measurement': [10.5, 12.1, 9.8, 12.5, 10.1, 15.0],
    'Temperature': [22.5, 21.0, 23.1, 20.5, 22.8, 19.9],
    'Quantity': [100, 150, 90, 160, 105, 200]
}

df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

Output:

Original DataFrame:
  ProductID  Measurement  Temperature  Quantity
    A101         10.5         22.5       100
    B202         12.1         21.0       150
    C303          9.8         23.1        90
    D404         12.5         20.5       160
    E505         10.1         22.8       105
    F606         15.0         19.9       200

Let's say we want to find the row where 'Measurement' is closest to 11.0.

Method 1: Using Absolute Difference, `argsort()`, and `iloc` (Recommended for N Closest)

This method involves these steps:

Calculate the absolute difference between each value in the target column and the given number.
Get the indices that would sort these absolute differences using Series.argsort(). The first index in this sorted list corresponds to the smallest difference (i.e., the closest value).
Use DataFrame.iloc with these sorted indices to select the row(s).

Calculating Absolute Differences

import pandas as pd

df_example = pd.DataFrame({
    'ProductID': ['A101', 'B202', 'C303'],
    'Measurement': [10.5, 12.1, 9.8],
})
target_number = 11.0
column_to_search = 'Measurement'

# Calculate the absolute difference
absolute_differences = (df_example[column_to_search] - target_number).abs()
print(f"Absolute differences from {target_number} in '{column_to_search}':")
print(absolute_differences)

Output:

Absolute differences from 11.0 in 'Measurement':
0    0.5
1    1.1
2    1.2
Name: Measurement, dtype: float64

Getting Sorted Indices with `argsort()`

Series.argsort() returns a Series of integer positions that would sort the original Series. The first value in this result is the position of the smallest element (smallest difference).

import pandas as pd

df_example = pd.DataFrame({
    'ProductID': ['A101', 'B202', 'C303'],
    'Measurement': [10.5, 12.1, 9.8],
})

df_example = pd.DataFrame({ 'Measurement': [10.5, 12.1, 9.8] })
target_number = 11.0
column_to_search = 'Measurement'
absolute_differences = (df_example[column_to_search] - target_number).abs()

# Get indices that would sort the absolute differences
sorted_indices_by_diff = absolute_differences.argsort()
print(f"Indices that sort the differences:\n{sorted_indices_by_diff}")

Output:

Indices that sort the differences:
0    0
1    1
2    2
Name: Measurement, dtype: int32

Selecting the Row with the Closest Value

Use df.iloc[] with the first index from sorted_indices_by_diff to get the row with the closest value. To get it as a DataFrame, pass the index in a list.

import pandas as pd

df = pd.DataFrame({
    'ProductID': ['A101', 'B202', 'C303', 'D404', 'E505', 'F606'],
    'Measurement': [10.5, 12.1, 9.8, 12.5, 10.1, 15.0],
})
target_number = 11.0
column_to_search = 'Measurement'

# Combine steps
sorted_indices_by_diff = (df[column_to_search] - target_number).abs().argsort()

# ✅ Select the row at the first sorted index position (which corresponds to the smallest difference)
# sorted_indices_by_diff[0] would give the index label of the closest row.
# To select using iloc, we need the integer position.
# If the index is already 0,1,2,... then argsort gives these positions.
# If the index is custom, argsort gives the *positions* that would sort.
# The result of argsort() is an array of integer indices (positions).
closest_row_df = df.iloc[sorted_indices_by_diff.iloc[[0]]] # Get first position, select as DataFrame

print(f"Row with '{column_to_search}' closest to {target_number}:")
print(closest_row_df)
print()

# More directly using the values from argsort to slice with iloc:
closest_row_df_direct = df.iloc[(df[column_to_search] - target_number).abs().argsort()[:1]]
print(f"Closest row (direct iloc slice):")
print(closest_row_df_direct)

Output:

Row with 'Measurement' closest to 11.0:
  ProductID  Measurement
0      A101         10.5

Closest row (direct iloc slice):
  ProductID  Measurement
0      A101         10.5

Getting the Closest Value Itself (or N Closest Values)

import pandas as pd

df = pd.DataFrame({ 'Measurement': [10.5, 12.1, 9.8, 12.5, 10.1, 15.0] })
target_number = 11.0
column_to_search = 'Measurement'
closest_row_df_direct = df.iloc[(df[column_to_search] - target_number).abs().argsort()[:1]]

# Get the actual closest value from the selected row
closest_value = closest_row_df_direct[column_to_search].iloc[0] # Use .iloc[0] if result is DataFrame
# Or if closest_row_df_direct was a Series: closest_value = closest_row_df_direct[column_to_search]
print(f"The closest value to {target_number} is: {closest_value}")

# To get the N closest rows (e.g., 2 closest)
n_closest = 2
closest_n_rows_df = df.iloc[(df[column_to_search] - target_number).abs().argsort()[:n_closest]]
print(f"{n_closest} closest rows to {target_number}:")
print(closest_n_rows_df)

Output:

The closest value to 11.0 is: 10.5
2 closest rows to 11.0:
   Measurement
0         10.5
4         10.1

Method 2: Using Absolute Difference and `idxmin()` (For Single Closest Row Index)

The Series.idxmin() method returns the index label of the first occurrence of the minimum value in the Series. We can apply this to the Series of absolute differences.

import pandas as pd

df = pd.DataFrame({
    'ProductID': ['A101', 'B202', 'C303', 'D404', 'E505'],
    'Measurement': [10.5, 12.1, 9.8, 12.5, 10.1],
})
target_number = 11.0
column_to_search = 'Measurement'

# Calculate absolute differences
absolute_differences = (df[column_to_search] - target_number).abs()

# ✅ Get the index LABEL of the minimum difference
index_label_of_closest = absolute_differences.idxmin()
print(f"Index label of the row with '{column_to_search}' closest to {target_number}: {index_label_of_closest}\n")

# Select the row using this index label with .loc
closest_row_idxmin = df.loc[[index_label_of_closest]] # Use [[...]] to get a DataFrame
print("Closest row using idxmin():")
print(closest_row_idxmin)
print()

# Get the closest value itself
closest_value_idxmin = df.loc[index_label_of_closest, column_to_search]
print(f"The closest value itself (idxmin): {closest_value_idxmin}")

Output:

Index label of the row with 'Measurement' closest to 11.0: 0

Closest row using idxmin():
  ProductID  Measurement
0      A101         10.5

The closest value itself (idxmin): 10.5

idxmin() is excellent if you only need the single closest row and its index label.
If there are multiple rows with the same minimum difference, idxmin() returns the index label of the first such row.

Getting the Index Label of the Closest Value

As shown in the idxmin() method, absolute_differences.idxmin() directly gives you the index label.

With the argsort() method, the values returned by argsort() are integer positions. If your DataFrame has a default integer index (0, 1, 2,...), these positions are also the index labels. If you have a custom index, you can get the corresponding labels:

import pandas as pd

df = pd.DataFrame({ 'Measurement': [10.5, 12.1, 9.8, 12.5, 10.1] }, index=['a','b','c','d','e'])
target_number = 11.0
column_to_search = 'Measurement'
sorted_indices_positions = (df[column_to_search] - target_number).abs().argsort()

# sorted_indices_positions contains integer positions: e.g., [0, 4, 1, 2, 3]
# df.index gives the actual labels: Index(['a', 'b', 'c', 'd', 'e'], dtype='object')

# Get the label of the closest (first in sorted_indices_positions)
closest_index_label_argsort = df.index[sorted_indices_positions.iloc[0]]
print(f"Index LABEL of closest value (argsort method): {closest_index_label_argsort}")

# Get labels of N closest
n_closest = 2
closest_n_index_labels_argsort = df.index[sorted_indices_positions.iloc[:n_closest]]
print(f"{n_closest} closest index LABELS (argsort method): {closest_n_index_labels_argsort.tolist()}")

Output:

Index LABEL of closest value (argsort method): a
2 closest index LABELS (argsort method): ['a', 'e']

Conclusion

To find the value in a Pandas DataFrame column closest to a target number:

Calculate the absolute difference between the column's values and the target number: (df['column'] - target).abs().
To get the single closest row/value/index:
- Use .idxmin() on the absolute difference Series to get the index label of the closest row: diff_series.idxmin(). Then use df.loc[] to get the row or value. (Recommended for single closest)
To get N closest rows/values/indices:
- Use .argsort() on the absolute difference Series to get the integer positions that would sort the differences.
- Use these positions with df.iloc[] to select the N closest rows: df.iloc[diff_series.argsort()[:N]]. (Recommended for N closest)

Both methods are efficient. idxmin() is more direct for finding just the single closest match's index label, while argsort() followed by iloc is more flexible for getting multiple closest matches or their positional indices.

The Goal: Finding the Nearest Numerical Match​

Example DataFrame​

Method 1: Using Absolute Difference, argsort(), and iloc (Recommended for N Closest)​

Calculating Absolute Differences​

Getting Sorted Indices with argsort()​

Selecting the Row with the Closest Value​

Getting the Closest Value Itself (or N Closest Values)​

Method 2: Using Absolute Difference and idxmin() (For Single Closest Row Index)​

Getting the Index Label of the Closest Value​

Conclusion​

Table of Contents

The Goal: Finding the Nearest Numerical Match

Example DataFrame

Method 1: Using Absolute Difference, `argsort()`, and `iloc` (Recommended for N Closest)

Calculating Absolute Differences

Getting Sorted Indices with `argsort()`

Selecting the Row with the Closest Value

Getting the Closest Value Itself (or N Closest Values)

Method 2: Using Absolute Difference and `idxmin()` (For Single Closest Row Index)

Getting the Index Label of the Closest Value

Conclusion