Python Pandas: Select First N or Last N Columns of DataFrame

Selecting a fixed number of columns from the beginning or end of a Pandas DataFrame is a common task when you want to focus on a subset of your data, prepare input for a specific function, or simply tidy up what is displayed. Pandas supports this through integer-location based indexing with iloc, as well as by slicing the DataFrame.columns Index.

This guide explains how to select the first N, last N, or exclude the last N columns of a Pandas DataFrame.

The Goal: Subsetting Columns by Position

Given a Pandas DataFrame, we want to create a new DataFrame that contains:

  • Only the first N columns from the original.
  • Only the last N columns from the original.
  • All columns except for the last N columns.

This selection is based on the position of the columns, not their names.

Example DataFrame

import pandas as pd

data = {
    'Col_A': [1, 2, 3, 4, 5],
    'Col_B': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri'],
    'Col_C': [10.1, 20.2, 30.3, 40.4, 50.5],
    'Col_D': [True, False, True, False, True],
    'Col_E': ['X', 'Y', 'Z', 'X', 'Y']
}

df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

Output:

Original DataFrame:
Col_A Col_B Col_C Col_D Col_E
0 1 Mon 10.1 True X
1 2 Tue 20.2 False Y
2 3 Wed 30.3 True Z
3 4 Thu 40.4 False X
4 5 Fri 50.5 True Y

Select the FIRST N Columns

The DataFrame.iloc indexer selects by integer position, using the syntax df.iloc[row_slicer, column_slicer]:

  • : for row_slicer selects all rows.
  • :N for column_slicer selects columns from the beginning (position 0) up to (but not including) position N.

import pandas as pd

df_example = pd.DataFrame({
    'Col_A': [1], 'Col_B': ['Mon'], 'Col_C': [10.1], 'Col_D': [True], 'Col_E': ['X']
})

# N = number of first columns to select
n_first = 3

# ✅ Select all rows (:) and the first N columns (:n_first)
df_first_n_cols = df_example.iloc[:, :n_first]

print(f"Selecting the first {n_first} columns using .iloc:")
print(df_first_n_cols)

Output:

Selecting the first 3 columns using .iloc:
Col_A Col_B Col_C
0 1 Mon 10.1
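
One detail worth knowing about the slice form is what happens when N is larger than the number of columns. The short sketch below, which uses a made-up three-column frame named df_small, suggests that a slice is simply clamped to the available columns, while a single out-of-range integer position raises an error.

```python
import pandas as pd

# Hypothetical three-column frame used only for this illustration
df_small = pd.DataFrame({
    'Col_A': [1], 'Col_B': ['Mon'], 'Col_C': [10.1]
})

# Asking for more columns than exist does not raise: the slice is clamped
df_clamped = df_small.iloc[:, :10]
print(df_clamped.shape)  # (1, 3)

# A single integer position that is out of range does raise
try:
    df_small.iloc[:, 10]
except IndexError as exc:
    print("IndexError:", exc)
```

This mirrors ordinary Python list slicing, so `df.iloc[:, :N]` is safe to use even when you are not sure the DataFrame has at least N columns.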

Creating a Reusable Function

If you do this often, wrap the logic in a small helper function:

import pandas as pd

def select_first_n_columns(dataframe, n):
    """Selects the first N columns of a DataFrame."""
    if n <= 0:
        return pd.DataFrame(index=dataframe.index)  # Return empty DataFrame with same index
    if n > len(dataframe.columns):
        return dataframe.copy()  # Or raise error/warning if n is too large
    return dataframe.iloc[:, :n]

df_example = pd.DataFrame({
    'Col_A': [1,2], 'Col_B': ['Mon','Tue'], 'Col_C': [10.1,20.2], 'Col_D': [True,False], 'Col_E': ['X','Y']
})

df_first_2 = select_first_n_columns(df_example, 2)
print("Using reusable function for first 2 columns:")
print(df_first_2)
print()

df_first_1 = select_first_n_columns(df_example, 1)
print("Using reusable function for first 1 column:")
print(df_first_1)

Output:

Using reusable function for first 2 columns:
Col_A Col_B
0 1 Mon
1 2 Tue

Using reusable function for first 1 column:
Col_A
0 1
1 2
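
The helper's comment mentions raising an error as an alternative when n is too large. A minimal sketch of that stricter behaviour might look like the following; the name select_first_n_columns_strict and the choice of ValueError are assumptions made for illustration, not part of Pandas.

```python
import pandas as pd

def select_first_n_columns_strict(dataframe, n):
    """Hypothetical strict variant: reject n outside 0..number of columns."""
    n_cols = dataframe.shape[1]
    if not 0 <= n <= n_cols:
        raise ValueError(f"n must be between 0 and {n_cols}, got {n}")
    return dataframe.iloc[:, :n]

df_demo = pd.DataFrame({
    'Col_A': [1, 2], 'Col_B': ['Mon', 'Tue'], 'Col_C': [10.1, 20.2]
})

print(select_first_n_columns_strict(df_demo, 2).columns.tolist())  # ['Col_A', 'Col_B']

# Requesting more columns than exist is reported instead of silently clamped
try:
    select_first_n_columns_strict(df_demo, 5)
except ValueError as exc:
    print("Rejected:", exc)
```

Whether to clamp silently or fail loudly depends on the caller: failing loudly tends to surface mistakes in pipelines where the column count is expected to be fixed.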

Select the LAST N Columns

Negative indices in a slice count from the end, so the same df.iloc[row_slicer, column_slicer] syntax can select trailing columns:

  • : for row_slicer selects all rows.
  • -N: for column_slicer selects columns from the Nth-to-last position up to the end.

import pandas as pd

df_example = pd.DataFrame({
    'Col_A': [1], 'Col_B': ['Mon'], 'Col_C': [10.1], 'Col_D': [True], 'Col_E': ['X']
})

# N = number of last columns to select
n_last = 3

# ✅ Select all rows (:) and the last N columns (-n_last:)
df_last_n_cols = df_example.iloc[:, -n_last:]

print(f"Selecting the last {n_last} columns using .iloc:")
print(df_last_n_cols)

Output:

Selecting the last 3 columns using .iloc:
Col_C Col_D Col_E
0 10.1 True X
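
One edge case of the `-N:` slice is N = 0: because -0 is just 0, the slice becomes `0:` and returns every column rather than none. If N can be zero in your code, a sketch like the one below avoids the surprise by computing the start position explicitly. The helper name select_last_n_columns is hypothetical.

```python
import pandas as pd

def select_last_n_columns(dataframe, n):
    """Hypothetical helper: last n columns, where n == 0 yields no columns."""
    n_cols = dataframe.shape[1]
    # Compute a non-negative start position instead of relying on -n
    start = max(n_cols - max(n, 0), 0)
    return dataframe.iloc[:, start:]

df_demo = pd.DataFrame({
    'Col_A': [1], 'Col_B': ['Mon'], 'Col_C': [10.1], 'Col_D': [True], 'Col_E': ['X']
})

n_zero = 0
print(df_demo.iloc[:, -n_zero:].shape)           # (1, 5): the slice is really 0:
print(select_last_n_columns(df_demo, 0).shape)   # (1, 0)
print(select_last_n_columns(df_demo, 2).columns.tolist())  # ['Col_D', 'Col_E']
```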

Using DataFrame.columns Slicing

You can slice the df.columns Index object to get the names of the last N columns, and then use these names to select from the DataFrame.

import pandas as pd

df_example = pd.DataFrame({
    'Col_A': [1], 'Col_B': ['Mon'], 'Col_C': [10.1], 'Col_D': [True], 'Col_E': ['X']
})

n_last = 2

# Get the names of the last N columns
last_n_column_names = df_example.columns[-n_last:]
print(f"Names of last {n_last} columns: {last_n_column_names.tolist()}\n")

# ✅ Select columns by these names
df_last_n_cols_by_name = df_example[last_n_column_names]

print(f"Selecting the last {n_last} columns using df.columns slicing:")
print(df_last_n_cols_by_name)

Output:

Names of last 2 columns: ['Col_D', 'Col_E']

Selecting the last 2 columns using df.columns slicing:
Col_D Col_E
0 True X

While this works, df.iloc[:, -N:] is generally more direct for positional selection.
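
The two approaches can also give different results when column names are duplicated, because selecting by name returns every column carrying that name. The sketch below illustrates this with a small made-up frame, df_dupes, whose first and last columns are both called 'x'.

```python
import pandas as pd

# Hypothetical frame with a repeated column name
df_dupes = pd.DataFrame([[1, 2, 3]], columns=['x', 'y', 'x'])

# Positional selection: exactly the last column
by_position = df_dupes.iloc[:, -1:]

# Name-based selection: every column named 'x'
by_name = df_dupes[df_dupes.columns[-1:]]

print(by_position.shape)  # (1, 1)
print(by_name.shape)      # (1, 2)
```

If your data may contain repeated column names, the iloc form is the safer choice for purely positional selection.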

EXCLUDE the Last N Columns (Select All BUT Last N)

To select all columns except for the last N ones, use a negative stop index in the column slicer.

  • : for row_slicer selects all rows.
  • :-N for column_slicer selects columns from the beginning up to (but not including) the Nth-to-last column.

import pandas as pd

df_example = pd.DataFrame({
    'Col_A': [1], 'Col_B': ['Mon'], 'Col_C': [10.1], 'Col_D': [True], 'Col_E': ['X']
})

# N = number of last columns to EXCLUDE
n_exclude_last = 2

# ✅ Select all rows (:) and all columns UP TO the last N (:-n_exclude_last)
df_exclude_last_n = df_example.iloc[:, :-n_exclude_last]

print(f"Excluding the last {n_exclude_last} columns using .iloc:")
print(df_exclude_last_n)

Output:

Excluding the last 2 columns using .iloc:
Col_A Col_B Col_C
0 1 Mon 10.1
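
Be careful when the number of columns to exclude can be 0: `:-0` is the same as `:0`, which selects no columns at all instead of keeping everything. A minimal sketch of a safer form, assuming you want "exclude 0" to mean "keep all columns", computes the stop position from the column count (the variable names here are illustrative).

```python
import pandas as pd

df_demo = pd.DataFrame({
    'Col_A': [1], 'Col_B': ['Mon'], 'Col_C': [10.1], 'Col_D': [True], 'Col_E': ['X']
})

n_exclude = 0

# Naive form: :-0 is :0, so nothing is selected
print(df_demo.iloc[:, :-n_exclude].shape)  # (1, 0)

# Safer form: derive an explicit, non-negative stop position
keep = max(df_demo.shape[1] - n_exclude, 0)
print(df_demo.iloc[:, :keep].shape)        # (1, 5)
```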

Conclusion

Pandas DataFrame.iloc provides a powerful and concise way to select columns based on their integer positions:

  • To select the FIRST N columns: Use df.iloc[:, :N].
  • To select the LAST N columns: Use df.iloc[:, -N:].
  • To EXCLUDE the last N columns (select all columns except the last N): Use df.iloc[:, :-N].

Remember that iloc is purely integer-location based: it selects columns by their order (0, 1, 2, ... from the start, or -1, -2, ... from the end), regardless of their names. Slicing df.columns to get the names first and then selecting by name is an alternative, but it is usually less direct for positional tasks.