Skip to main content

Python Pandas: How to Fix "ValueError: Usecols do not match columns, columns expected but not found"

When using pandas.read_csv() to load data, the usecols parameter is a handy feature for selectively importing only specific columns, which can save memory and processing time. However, this can sometimes lead to the ValueError: Usecols do not match columns, columns expected but not found: ['your_column_name']. This error signals a mismatch: the column names you've specified in usecols do not correspond to any of the column headers Pandas was able to identify in the CSV file.

This guide will thoroughly explain the two primary reasons for this ValueError—misspelled column names in usecols and incorrect delimiter specification via the sep parameter—and provide clear solutions to ensure your read_csv calls correctly identify and load your desired columns.

Understanding the Error: usecols and Column Header Mismatch

The usecols parameter in pd.read_csv() allows you to provide a list of column names (or callable, or integer positions) that you want to import from the CSV file. Pandas first reads the header row of the CSV to identify all available column names. It then tries to match the names you provided in usecols against these identified headers.

The ValueError: Usecols do not match columns, columns expected but not found: ['X', 'Y'] means that one or more of the names you listed in usecols (like 'X' or 'Y' in the error message) were not found among the actual column headers Pandas read from the file.

Common Cause 1: Misspelled or Non-Existent Column Names in usecols

This is the most straightforward cause: typos, case sensitivity issues, or referencing column names that simply aren't present in the CSV file's header row.

Let's assume we have an inventory.csv file: inventory.csv:

ProductID,ProductName,StockQuantity,UnitPrice
P101,Laptop,50,1200.00
P102,Mouse,200,25.50
P103,Keyboard,150,75.00
P104,Monitor,30,300.00

Reproducing the Error

import pandas as pd

file_path = 'inventory.csv'

try:
# ⛔️ Incorrect: 'product_id' (lowercase 'p') and 'Unit_Price' (capital 'U') are not exact matches.
# Also, 'Category' does not exist in the CSV header.
df_error = pd.read_csv(
file_path,
usecols=['product_id', 'ProductName', 'Unit_Price', 'Category']
)
print(df_error)
except ValueError as e:
print(f"Error: {e}")

Output:

Error: Usecols do not match columns, columns expected but not found: ['Unit_Price', 'product_id', 'Category']

The CSV headers are ProductID, ProductName, StockQuantity, UnitPrice.

  • 'product_id' in usecols doesn't match 'ProductID' (case difference).
  • 'Unit_Price' in usecols doesn't match 'UnitPrice' (case difference).
  • 'Category' in usecols doesn't exist at all.

Solution: Verify and Correct Column Names in usecols

Ensure that every column name passed to the usecols list is an exact match (including case) to a column header present in your CSV file.

import pandas as pd

file_path = 'inventory.csv'

# ✅ Correct: Column names in usecols exactly match headers in inventory.csv
df_correct_usecols = pd.read_csv(
file_path,
sep=',', # Assuming comma is the separator
encoding='utf-8',
usecols=['ProductID', 'ProductName', 'UnitPrice'] # These names exist in the CSV
)

print("DataFrame with correctly specified usecols:")
print(df_correct_usecols)

Output:

DataFrame with correctly specified usecols:
ProductID ProductName UnitPrice
0 P101 Laptop 1200.0
1 P102 Mouse 25.5
2 P103 Keyboard 75.0
3 P104 Monitor 300.0

Common Cause 2: Incorrect Delimiter (Separator) Specification

If the sep (or delimiter) argument in pd.read_csv() does not match the actual delimiter used in your CSV file, Pandas will not be able to correctly parse the header row into distinct column names. Consequently, the names it thinks it found will not match what you provided in usecols.

How Incorrect sep Leads to the usecols Error

If your CSV is, for example, semicolon-separated (data;more_data) but you tell Pandas sep=',' (or rely on the default comma), Pandas will likely read the entire header line as a single column name (e.g., 'ProductID;ProductName;StockQuantity;UnitPrice'). When you then try usecols=['ProductID', 'ProductName'], neither of these individual names will match the single, long, incorrectly parsed header.

Reproducing the Error with Incorrect sep

Let's modify inventory.csv to be semicolon-separated for this example:

inventory_semicolon.csv:

ProductID;ProductName;StockQuantity;UnitPrice
P101;Laptop;50;1200.00
P102;Mouse;200;25.50
P103;Keyboard;150;75.00

Now, let's try to read it with the default comma separator (or an explicitly wrong one):

import pandas as pd

file_path_semicolon = 'inventory_semicolon.csv'

try:
# ⛔️ Incorrect: CSV uses ';', but we specify (or default to) sep=','
df_sep_error = pd.read_csv(
file_path_semicolon,
sep=',', # This is the problem
usecols=['ProductID', 'ProductName']
)
print(df_sep_error)
except ValueError as e:
print(f"Error with incorrect separator: {e}")

Output:

Error with incorrect separator: Usecols do not match columns, columns expected but not found: ['ProductName', 'ProductID']
note

Pandas, using sep=',', reads the first line 'ProductID;ProductName;StockQuantity;UnitPrice' as a single column name. Thus, 'ProductID' is not found as a distinct header.

Solution: Specify the Correct sep Argument

Ensure the sep argument in pd.read_csv() matches the delimiter actually used in your CSV file.

import pandas as pd

file_path_semicolon = 'inventory_semicolon.csv'

# ✅ Correct: Specify the correct separator sep=';'
df_correct_sep = pd.read_csv(
file_path_semicolon,
sep=';', # Correct delimiter for this file
encoding='utf-8',
usecols=['ProductID', 'UnitPrice']
)

print("DataFrame with correct separator and usecols:")
print(df_correct_sep)

Output:

DataFrame with correct separator and usecols:
ProductID UnitPrice
0 P101 1200.0
1 P102 25.5
2 P103 75.0
note

With the correct sep=';', Pandas correctly identifies ProductID and UnitPrice as distinct column headers, and usecols finds them.

Debugging Tips

Inspect CSV Header Manually

Open your CSV file in a simple text editor (or a spreadsheet program) and carefully examine the very first line (the header row). Note the exact spelling, casing, and the delimiter character used.

Read Without usecols First to See Inferred Headers

If you're unsure about the headers or delimiter, try reading the CSV without usecols (and perhaps without sep initially to see what Pandas infers):

import pandas as pd

file_path_to_inspect = 'inventory.csv' # Or inventory_semicolon.csv

# Read a few rows to inspect columns
df_inspect = pd.read_csv(file_path_to_inspect, nrows=5)
print("Inferred columns from initial read:")
print(df_inspect.columns.tolist())
print()

# If the above doesn't look right (e.g., one long string for columns),
# try with different common separators:
df_inspect_semicolon = pd.read_csv(file_path_to_inspect, sep=';', nrows=5)
print("Inferred columns with sep=';':")
print(df_inspect_semicolon.columns.tolist())

Output:

Inferred columns from initial read:
['ProductID', 'ProductName', 'StockQuantity', 'UnitPrice']

Inferred columns with sep=';':
['ProductID,ProductName,StockQuantity,UnitPrice']
note

This will show you the column names as Pandas sees them after applying the specified (or default) separator. You can then adjust your usecols list and sep argument accordingly.

Conclusion

The ValueError: Usecols do not match columns, columns expected but not found in Pandas is a direct indication that the column names you're trying to select via usecols are not being recognized as they appear in the CSV file's header. The two primary solutions are:

  1. Verify usecols Accuracy: Double-check the spelling, capitalization, and existence of every column name in your usecols list against the actual headers in your CSV file.
  2. Specify the Correct sep (Delimiter): Ensure the sep argument in pd.read_csv() accurately reflects the character used to separate columns in your CSV file. An incorrect separator will prevent Pandas from identifying the true column headers.

By carefully managing these two aspects, you can reliably use usecols to efficiently load only the data you need.