How to Replace Spaces with Underscores in Python Strings

Replacing spaces with underscores is a common string manipulation task in Python, often used for creating valid variable names, filenames, or URL slugs.

This guide explains how to efficiently replace spaces (and other whitespace) with underscores using str.replace(), regular expressions (re.sub()), and the combination of split() and join().

For simple replacement of single spaces, the str.replace() method is the most straightforward and efficient:

my_str = 'tutorial reference com'
result = my_str.replace(' ', '_')
print(result) # Output: tutorial_reference_com
  • my_str.replace(' ', '_'): This replaces all occurrences of a single space character (' ') with an underscore ('_'). It's very fast and readable.
  • The method does not modify the original string, but creates and returns a new string object with replaced characters.
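The two points above can be checked with a short sketch. The sample string and variable names here are illustrative assumptions; note that each space is replaced individually, so two adjacent spaces produce two underscores.

```python
# Strings are immutable: replace() returns a new string
# and leaves the original untouched.
original = 'my  file name'  # two spaces after 'my'
renamed = original.replace(' ', '_')

print(renamed)   # my__file_name (each space becomes its own underscore)
print(original)  # my  file name
```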

Replacing All Whitespace with Underscores Using re.sub()

If you need to replace all whitespace characters (spaces, tabs, newlines, etc.) with underscores, and handle multiple consecutive spaces correctly, use re.sub() from the re (regular expression) module:

import re

my_str = 'tutorial   reference   com' # Multiple spaces
result = re.sub(r"\s+", '_', my_str) # Replace 1 or more whitespace with '_'
print(result) # Output: tutorial_reference_com
  • r"\s+": This is a regular expression.
    • \s: Matches any whitespace character (space, tab, newline, etc.).
    • +: Matches one or more occurrences of the preceding element (here \s), so a run of consecutive whitespace characters is replaced by a single underscore.
  • re.sub(pattern, replacement, string): Substitutes all occurrences of the pattern in string with replacement.

This is the most robust method for handling all types of whitespace and multiple spaces.
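As a small illustration with mixed whitespace (the sample string is an assumption made for this sketch), a tab, a newline followed by a space, and a double space each collapse to one underscore:

```python
import re

# A tab, a newline followed by a space, and two spaces in a row.
mixed = 'first\tsecond\n third  fourth'

result = re.sub(r'\s+', '_', mixed)
print(result)  # first_second_third_fourth
```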

Replacing Spaces with Underscores Using split() and join()

An alternative approach combines str.split() and str.join():

my_str = 'tutorial reference com'
result = '_'.join(my_str.split())
print(result) # Output: tutorial_reference_com
  • my_str.split(): With no arguments, splits the string into a list of words on any run of whitespace, so multiple spaces are treated as one delimiter.
  • '_'.join(...): Joins the words back into a single string, using an underscore as the separator.

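This method automatically handles multiple spaces and other whitespace characters (like tabs and newlines), much like re.sub(r"\s+", ...), and it needs no import. One difference is that split() discards leading and trailing whitespace, whereas re.sub(r"\s+", ...) turns it into underscores. When that behavior is what you want, split() and join() is often the simpler choice.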
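The two approaches can be compared side by side on a string with leading and trailing whitespace. This is a minimal sketch; the padded sample string and the variable names are assumptions chosen for illustration.

```python
import re

padded = '  tutorial   reference\tcom\n'

# split() with no arguments drops leading/trailing whitespace.
joined = '_'.join(padded.split())

# re.sub() replaces every whitespace run, including those at the edges.
substituted = re.sub(r'\s+', '_', padded)

print(joined)       # tutorial_reference_com
print(substituted)  # _tutorial_reference_com_

# Stripping first makes the regex version match the split/join result.
print(re.sub(r'\s+', '_', padded.strip()) == joined)  # True
```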

While possible, using a for loop is the least efficient and least readable way to do this:

my_str = 'tutorial reference com'
new_str = ''
for char in my_str:
    if char == ' ':
        new_str += '_'
    else:
        new_str += char

print(new_str) # Output: tutorial_reference_com
  • This approach iterates through the string one character at a time in Python code and builds the result by repeated concatenation, which is typically much slower than the built-in string methods or regular expressions. Avoid this unless you have a very specific reason.
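One rough way to check the speed difference on your own machine is the standard timeit module. The function names, the sample text, and the repetition count below are assumptions chosen for this sketch, and the timings will vary between systems and Python versions.

```python
import timeit

def replace_with_loop(text):
    # Manual character-by-character replacement, as in the loop above.
    new_str = ''
    for char in text:
        new_str += '_' if char == ' ' else char
    return new_str

def replace_with_method(text):
    # Built-in equivalent.
    return text.replace(' ', '_')

# A longer input makes the difference easier to see.
sample = 'tutorial reference com ' * 1000

loop_time = timeit.timeit(lambda: replace_with_loop(sample), number=100)
method_time = timeit.timeit(lambda: replace_with_method(sample), number=100)

print(f'loop: {loop_time:.4f}s, str.replace(): {method_time:.4f}s')
```

Both functions return the same string, so only the running time differs.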