Python Pandas: How to Fix "ValueError: Usecols do not match columns, columns expected but not found"
When using pandas.read_csv()
to load data, the usecols
parameter is a handy feature for selectively importing only specific columns, which can save memory and processing time. However, this can sometimes lead to the ValueError: Usecols do not match columns, columns expected but not found: ['your_column_name']
. This error signals a mismatch: the column names you've specified in usecols
do not correspond to any of the column headers Pandas was able to identify in the CSV file.
This guide will thoroughly explain the two primary reasons for this ValueError
—misspelled column names in usecols
and incorrect delimiter specification via the sep
parameter—and provide clear solutions to ensure your read_csv
calls correctly identify and load your desired columns.
Understanding the Error: usecols
and Column Header Mismatch
The usecols
parameter in pd.read_csv()
allows you to provide a list of column names (or callable, or integer positions) that you want to import from the CSV file. Pandas first reads the header row of the CSV to identify all available column names. It then tries to match the names you provided in usecols
against these identified headers.
The ValueError: Usecols do not match columns, columns expected but not found: ['X', 'Y']
means that one or more of the names you listed in usecols
(like 'X' or 'Y' in the error message) were not found among the actual column headers Pandas read from the file.
Common Cause 1: Misspelled or Non-Existent Column Names in usecols
This is the most straightforward cause: typos, case sensitivity issues, or referencing column names that simply aren't present in the CSV file's header row.
Let's assume we have an inventory.csv
file:
inventory.csv
:
ProductID,ProductName,StockQuantity,UnitPrice
P101,Laptop,50,1200.00
P102,Mouse,200,25.50
P103,Keyboard,150,75.00
P104,Monitor,30,300.00
Reproducing the Error
import pandas as pd
file_path = 'inventory.csv'
try:
# ⛔️ Incorrect: 'product_id' (lowercase 'p') and 'Unit_Price' (capital 'U') are not exact matches.
# Also, 'Category' does not exist in the CSV header.
df_error = pd.read_csv(
file_path,
usecols=['product_id', 'ProductName', 'Unit_Price', 'Category']
)
print(df_error)
except ValueError as e:
print(f"Error: {e}")
Output:
Error: Usecols do not match columns, columns expected but not found: ['Unit_Price', 'product_id', 'Category']
The CSV headers are ProductID
, ProductName
, StockQuantity
, UnitPrice
.
'product_id'
inusecols
doesn't match'ProductID'
(case difference).'Unit_Price'
inusecols
doesn't match'UnitPrice'
(case difference).'Category'
inusecols
doesn't exist at all.
Solution: Verify and Correct Column Names in usecols
Ensure that every column name passed to the usecols
list is an exact match (including case) to a column header present in your CSV file.
import pandas as pd
file_path = 'inventory.csv'
# ✅ Correct: Column names in usecols exactly match headers in inventory.csv
df_correct_usecols = pd.read_csv(
file_path,
sep=',', # Assuming comma is the separator
encoding='utf-8',
usecols=['ProductID', 'ProductName', 'UnitPrice'] # These names exist in the CSV
)
print("DataFrame with correctly specified usecols:")
print(df_correct_usecols)
Output:
DataFrame with correctly specified usecols:
ProductID ProductName UnitPrice
0 P101 Laptop 1200.0
1 P102 Mouse 25.5
2 P103 Keyboard 75.0
3 P104 Monitor 300.0
Common Cause 2: Incorrect Delimiter (Separator) Specification
If the sep
(or delimiter
) argument in pd.read_csv()
does not match the actual delimiter used in your CSV file, Pandas will not be able to correctly parse the header row into distinct column names. Consequently, the names it thinks it found will not match what you provided in usecols
.
How Incorrect sep
Leads to the usecols
Error
If your CSV is, for example, semicolon-separated (data;more_data
) but you tell Pandas sep=','
(or rely on the default comma), Pandas will likely read the entire header line as a single column name (e.g., 'ProductID;ProductName;StockQuantity;UnitPrice'
). When you then try usecols=['ProductID', 'ProductName']
, neither of these individual names will match the single, long, incorrectly parsed header.
Reproducing the Error with Incorrect sep
Let's modify inventory.csv
to be semicolon-separated for this example:
inventory_semicolon.csv
:
ProductID;ProductName;StockQuantity;UnitPrice
P101;Laptop;50;1200.00
P102;Mouse;200;25.50
P103;Keyboard;150;75.00
Now, let's try to read it with the default comma separator (or an explicitly wrong one):
import pandas as pd
file_path_semicolon = 'inventory_semicolon.csv'
try:
# ⛔️ Incorrect: CSV uses ';', but we specify (or default to) sep=','
df_sep_error = pd.read_csv(
file_path_semicolon,
sep=',', # This is the problem
usecols=['ProductID', 'ProductName']
)
print(df_sep_error)
except ValueError as e:
print(f"Error with incorrect separator: {e}")
Output:
Error with incorrect separator: Usecols do not match columns, columns expected but not found: ['ProductName', 'ProductID']
Pandas, using sep=','
, reads the first line 'ProductID;ProductName;StockQuantity;UnitPrice'
as a single column name. Thus, 'ProductID'
is not found as a distinct header.
Solution: Specify the Correct sep
Argument
Ensure the sep
argument in pd.read_csv()
matches the delimiter actually used in your CSV file.
import pandas as pd
file_path_semicolon = 'inventory_semicolon.csv'
# ✅ Correct: Specify the correct separator sep=';'
df_correct_sep = pd.read_csv(
file_path_semicolon,
sep=';', # Correct delimiter for this file
encoding='utf-8',
usecols=['ProductID', 'UnitPrice']
)
print("DataFrame with correct separator and usecols:")
print(df_correct_sep)
Output:
DataFrame with correct separator and usecols:
ProductID UnitPrice
0 P101 1200.0
1 P102 25.5
2 P103 75.0
With the correct sep=';'
, Pandas correctly identifies ProductID
and UnitPrice
as distinct column headers, and usecols
finds them.
Debugging Tips
Inspect CSV Header Manually
Open your CSV file in a simple text editor (or a spreadsheet program) and carefully examine the very first line (the header row). Note the exact spelling, casing, and the delimiter character used.
Read Without usecols
First to See Inferred Headers
If you're unsure about the headers or delimiter, try reading the CSV without usecols
(and perhaps without sep
initially to see what Pandas infers):
import pandas as pd
file_path_to_inspect = 'inventory.csv' # Or inventory_semicolon.csv
# Read a few rows to inspect columns
df_inspect = pd.read_csv(file_path_to_inspect, nrows=5)
print("Inferred columns from initial read:")
print(df_inspect.columns.tolist())
print()
# If the above doesn't look right (e.g., one long string for columns),
# try with different common separators:
df_inspect_semicolon = pd.read_csv(file_path_to_inspect, sep=';', nrows=5)
print("Inferred columns with sep=';':")
print(df_inspect_semicolon.columns.tolist())
Output:
Inferred columns from initial read:
['ProductID', 'ProductName', 'StockQuantity', 'UnitPrice']
Inferred columns with sep=';':
['ProductID,ProductName,StockQuantity,UnitPrice']
This will show you the column names as Pandas sees them after applying the specified (or default) separator. You can then adjust your usecols
list and sep
argument accordingly.
Conclusion
The ValueError: Usecols do not match columns, columns expected but not found
in Pandas is a direct indication that the column names you're trying to select via usecols
are not being recognized as they appear in the CSV file's header. The two primary solutions are:
- Verify
usecols
Accuracy: Double-check the spelling, capitalization, and existence of every column name in yourusecols
list against the actual headers in your CSV file. - Specify the Correct
sep
(Delimiter): Ensure thesep
argument inpd.read_csv()
accurately reflects the character used to separate columns in your CSV file. An incorrect separator will prevent Pandas from identifying the true column headers.
By carefully managing these two aspects, you can reliably use usecols
to efficiently load only the data you need.