How to Use Wildcards for String Matching and Filtering in Python

This guide explains how to use wildcard characters in Python to filter lists of strings and check if strings match specific patterns. We'll cover the fnmatch module (for shell-style wildcards) and regular expressions (for more complex patterns).

Filtering Lists with fnmatch

The fnmatch module provides support for Unix shell-style wildcards, which are simpler than full regular expressions.

Wildcard Characters in fnmatch

  • *: Matches any sequence of characters (including zero characters).
  • ?: Matches any single character.
  • [seq]: Matches any character in seq.
  • [!seq]: Matches any character not in seq.
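The character-class wildcards are easiest to see side by side. The sketch below uses made-up file names and `fnmatch.fnmatchcase()`, the case-sensitive variant of `fnmatch.fnmatch()`, so the result should be the same on every platform:

```python
from fnmatch import fnmatchcase

# Hypothetical file names, chosen only to exercise each wildcard.
names = ['file1.txt', 'file2.txt', 'fileA.txt', 'file10.txt']

# [0-9] matches exactly one digit, so 'file10.txt' (two characters) is excluded.
digits = [n for n in names if fnmatchcase(n, 'file[0-9].txt')]

# [!0-9] matches exactly one character that is NOT a digit.
non_digits = [n for n in names if fnmatchcase(n, 'file[!0-9].txt')]

# * matches any run of characters, including the two-character '10'.
any_suffix = [n for n in names if fnmatchcase(n, 'file*.txt')]

print(digits)      # ['file1.txt', 'file2.txt']
print(non_digits)  # ['fileA.txt']
print(any_suffix)  # ['file1.txt', 'file2.txt', 'fileA.txt', 'file10.txt']
```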

Using fnmatch.filter()

The fnmatch.filter() function filters a list, returning only the elements that match a given pattern:

import fnmatch

a_list = ['abc_tom.csv', 'nolan', '!@#', 'abc_employees.csv']
pattern = 'abc_*.csv' # Matches strings starting with 'abc_' and ending with '.csv'

filtered_list = fnmatch.filter(a_list, pattern)
print(filtered_list) # Output: ['abc_tom.csv', 'abc_employees.csv']
  • fnmatch.filter(names, pattern): Filters the names list, keeping only strings that match the pattern.

Here's another example using ?:

import fnmatch

a_list = ['abc', 'abz', 'abxyz']

pattern = 'ab?' # Matches strings starting with 'ab' and followed by one character
filtered_list = fnmatch.filter(a_list, pattern)
print(filtered_list) # Output: ['abc', 'abz']

Using fnmatch.fnmatch() with a List Comprehension

You can also use fnmatch.fnmatch() within a list comprehension, which is useful when you want to combine the wildcard test with other conditions or transform the matching items:

import fnmatch

a_list = ['abc_tom.csv', 'nolan', '!@#', 'abc_employees.csv']
pattern = 'abc_*.csv'

# Keep only the items that match the wildcard pattern
filtered_list = [
    item for item in a_list
    if fnmatch.fnmatch(item, pattern)
]

print(filtered_list) # Output: ['abc_tom.csv', 'abc_employees.csv']
  • fnmatch.fnmatch(item, pattern): Checks if a single item (string) matches the pattern. Returns True or False.
  • The list comprehension builds a new list containing only the matching items.
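One case where the list comprehension form helps is matching against several patterns at once, which `fnmatch.filter()` does not do in a single call. A minimal sketch, assuming a hypothetical list of names and patterns:

```python
import fnmatch

# Hypothetical data: keep anything that looks like a CSV or a text file.
a_list = ['abc_tom.csv', 'report.txt', 'notes.md', 'abc_employees.csv']
patterns = ['*.csv', '*.txt']

# An item is kept if it matches at least one of the patterns.
filtered_list = [
    item for item in a_list
    if any(fnmatch.fnmatch(item, pattern) for pattern in patterns)
]

print(filtered_list)  # ['abc_tom.csv', 'report.txt', 'abc_employees.csv']
```

The original order of `a_list` is preserved because the comprehension walks the list once and tests each item against every pattern.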

Checking if a String Matches a Pattern with fnmatch

To check if a single string matches a wildcard pattern, use fnmatch.fnmatch():

import fnmatch

a_string = '2023_tom.txt'
pattern = '2023*.txt'

matches_pattern = fnmatch.fnmatch(a_string, pattern)
print(matches_pattern) # Output: True

if matches_pattern:
    print('The string matches the pattern')
else:
    print('The string does NOT match the pattern')
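One detail worth knowing: `fnmatch.fnmatch()` normalizes case according to the operating system's file name rules, so the same call can behave differently on Windows and on Linux. If that matters, `fnmatch.fnmatchcase()` always compares case-sensitively. The sketch below uses a hypothetical file name and shows one possible way to get a case-insensitive match on any platform by lowercasing both sides:

```python
from fnmatch import fnmatchcase

a_string = 'REPORT.TXT'  # hypothetical file name
pattern = '*.txt'

# Case-sensitive on every platform: 'TXT' does not match 'txt'.
strict = fnmatchcase(a_string, pattern)

# Case-insensitive on every platform: lowercase both sides first.
folded = fnmatchcase(a_string.lower(), pattern.lower())

print(strict)  # False
print(folded)  # True
```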

Filtering and Matching with Regular Expressions

For more complex patterns, regular expressions (the re module) offer much greater power and flexibility.

Filtering a List with re.match()

import re

a_list = ['abc_tom.csv', 'nolan', '!@#', 'abc_employees.csv']
regex = re.compile(r'abc_.*\.csv') # Matches strings that start with "abc_" and contain ".csv" later on

filtered_list = [
    item for item in a_list
    if re.match(regex, item) # Checks whether each item matches the regex from its start.
]

print(filtered_list) # Output: ['abc_tom.csv', 'abc_employees.csv']
  • re.compile(r'abc_.*\.csv'): Compiles the regular expression. This is optional but good practice for efficiency if you reuse the same pattern multiple times.
    • abc_: Matches the literal characters "abc_".
    • .*: Matches any character except a newline (.) zero or more times (*). This is the "wildcard" part.
    • \.csv: Matches the literal characters ".csv". The backslash (\) escapes the dot (.), which has a special meaning in regular expressions.
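  • re.match(regex, item): Checks whether the regular expression matches at the beginning of the string item. Returns a match object if it does, None otherwise. The match does not have to reach the end of the string, so a name such as 'abc_tom.csv.bak' would also pass; use re.fullmatch() when the entire string must match.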
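The difference between matching at the start and matching the whole string is easy to miss. This short sketch, using a few made-up candidates, compares `match()` and `fullmatch()` on the same compiled pattern:

```python
import re

regex = re.compile(r'abc_.*\.csv')

# Hypothetical candidates: a clean name, one with a trailing suffix,
# and one with a leading extra character.
candidates = ['abc_tom.csv', 'abc_tom.csv.bak', 'xabc_tom.csv']

# match() only requires the pattern to match starting at position 0.
prefix_matches = [s for s in candidates if regex.match(s)]

# fullmatch() requires the pattern to cover the entire string.
full_matches = [s for s in candidates if regex.fullmatch(s)]

print(prefix_matches)  # ['abc_tom.csv', 'abc_tom.csv.bak']
print(full_matches)    # ['abc_tom.csv']
```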

Matching a Single String with re.match()

import re

a_string = '2023_tom.txt'
matches_pattern = bool(re.match(r'2023_.*\.txt', a_string)) # Using re.match directly.
print(matches_pattern) # Output: True

if matches_pattern:
    print('The string matches the pattern')
else:
    print('The string does NOT match the pattern')
  • re.match tries to match from the start of the string, and bool() converts the result (a match object or None) to True or False.
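The two approaches in this guide can also be combined: `fnmatch.translate()` converts a shell-style wildcard pattern into a regular expression string that can be passed to `re.compile()`. The following is a sketch with a hypothetical list; the exact text of the generated regex differs between Python versions, so it is not printed here:

```python
import fnmatch
import re

# Hypothetical input list, including one name with a trailing suffix.
a_list = ['abc_tom.csv', 'nolan', 'abc_tom.csv.bak', 'abc_employees.csv']

# Convert the wildcard pattern to regex source, then compile it once.
regex_source = fnmatch.translate('abc_*.csv')
regex = re.compile(regex_source)

# The translated regex is anchored at the end of the string,
# so match() rejects 'abc_tom.csv.bak'.
filtered_list = [item for item in a_list if regex.match(item)]

print(filtered_list)  # ['abc_tom.csv', 'abc_employees.csv']
```

This can be handy when a pattern starts out as a simple wildcard but later needs to be used with `re` functions, or when the compiled pattern is reused across many strings.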