
How to Remove Spaces, Tabs, and Newlines from Strings in Python

This guide explains how to remove unwanted whitespace (spaces, tabs, newlines) from strings in Python. We'll cover:

  • Removing all whitespace (including internal spaces, tabs, and newlines).
  • Removing only leading and trailing whitespace.
  • Splitting a string by tabs (or other whitespace).
  • Using regular expressions for more complex whitespace handling.

Removing All Whitespace

To remove all whitespace characters (spaces, tabs, newlines, etc.) from a string, a concise and readable approach is to combine split() and join():

my_str = '   tutorial    reference   '
result = ''.join(my_str.split())
print(result) # Output: tutorialreference
  • my_str.split(): When called with no arguments, split() does two things:
    1. It removes leading and trailing whitespace.
    2. It splits the string into a list of words, using any sequence of whitespace characters (spaces, tabs, newlines) as the delimiter.
  • ''.join(...): This joins the resulting list of words back into a single string, using an empty string ('') as the separator. This effectively removes all whitespace.

This approach is concise, handles all whitespace characters, and is generally faster than using regular expressions for this specific task.
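The same technique applies when the string mixes several kinds of whitespace. The following is a small sketch using an illustrative input string (not taken from the original example) that contains spaces, a tab, a newline, and a carriage return:

```python
# Illustrative input: spaces, a tab, a newline, and a trailing '\r\n'.
my_str = ' tutorial\t reference\n.com\r\n'

# split() with no arguments treats every run of whitespace as one delimiter.
words = my_str.split()
print(words)   # ['tutorial', 'reference', '.com']

# Joining with an empty separator drops all of the whitespace.
result = ''.join(words)
print(result)  # tutorialreference.com
```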

Using re.sub()

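For more complex whitespace removal (e.g., removing specific whitespace characters but not others), you can use the re.sub() function from the re (regular expressions) module. The basic form below removes every whitespace character; narrowing the pattern restricts the removal to the characters you choose: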

import re

my_str = ' tutorial reference '
result = re.sub(r'\s+', '', my_str) # Replace all whitespace with empty string.
print(result) # Output: tutorialreference
  • re.sub(pattern, replacement, string): Replaces all occurrences of the pattern in string with replacement.
  • r'\s+': This regular expression matches one or more whitespace characters (\s matches any whitespace character, and + means "one or more").
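To target only some whitespace characters, the pattern can be replaced with a character class. The sketch below uses an illustrative input and two assumed goals: removing tabs and newlines while keeping spaces, and collapsing every run of whitespace into a single space.

```python
import re

# Illustrative input containing spaces, a tab, and a trailing newline.
my_str = 'tutorial \t reference\n'

# Remove only tabs and newlines; ordinary spaces are left untouched.
no_tabs_newlines = re.sub(r'[\t\n]+', '', my_str)
print(repr(no_tabs_newlines))  # 'tutorial  reference' (two spaces remain)

# Collapse each run of whitespace into one space, then trim the ends.
collapsed = re.sub(r'\s+', ' ', my_str).strip()
print(repr(collapsed))         # 'tutorial reference'
```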

Removing Leading/Trailing Whitespace with strip(), lstrip(), and rstrip()

If you only want to remove whitespace from the beginning and end of a string (but keep internal spaces), use strip():

my_str = '   tutorial    reference   '
result = my_str.strip()
print(result) # Output: tutorial reference
  • my_str.strip(): Removes leading and trailing whitespace (spaces, tabs, newlines).
  • my_str.lstrip(): Removes only leading whitespace.
  • my_str.rstrip(): Removes only trailing whitespace.
my_str = '\ttutorial\treference\t'  # Example with tabs
result = my_str.strip() # Strips leading/trailing whitespace (including tabs)
print(repr(result)) # Output: 'tutorial\treference' (inner tabs preserved)

my_str = ' tutorial reference '
result = my_str.lstrip() # Left strip
print(repr(result)) # Output: 'tutorial reference '

result = my_str.rstrip() # Right strip
print(repr(result)) # Output: ' tutorial reference'
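
A common use of rstrip() is cleaning lines of text that still carry their line endings. The sketch below uses a hypothetical list of lines, and also shows that these methods accept an optional argument listing the characters to strip:

```python
# Hypothetical lines, such as those obtained by iterating over a text file.
lines = ['first line\n', '  second line  \r\n', '\tthird line\n']

# rstrip() removes trailing whitespace only, so leading indentation is kept.
cleaned = [line.rstrip() for line in lines]
print(cleaned)  # ['first line', '  second line', '\tthird line']

# Passing characters restricts stripping to those characters.
tabs_only = '\ttutorial\t '.strip('\t')
print(repr(tabs_only))  # 'tutorial\t ' (stripping stops at the trailing space)
```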

Splitting a String by Tabs

To split a string into a list of substrings based on tab characters (\t):

Using split()

my_str = 'tutorial\treference\tcom'
my_list = my_str.split('\t')
print(my_list) # Output: ['tutorial', 'reference', 'com']
  • my_str.split('\t') splits the string on each tab character.
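  • A leading or trailing tab produces an empty string at the start or end of the list. To avoid this, strip the string beforehand: my_str.strip().split('\t'). Note that strip() with no arguments removes all leading and trailing whitespace; use my_str.strip('\t') to remove only tabs.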
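
The effect of leading and trailing tabs can be seen in a short sketch (the input string here is illustrative):

```python
# Illustrative input with a tab at each end.
my_str = '\ttutorial\treference\tcom\t'

# Splitting directly leaves an empty string at each end of the list.
raw_parts = my_str.split('\t')
print(raw_parts)    # ['', 'tutorial', 'reference', 'com', '']

# Stripping first removes the outer tabs, so no empty strings appear.
clean_parts = my_str.strip().split('\t')
print(clean_parts)  # ['tutorial', 'reference', 'com']
```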

Using re.split() (for multiple tabs)

If your string contains multiple consecutive tab characters, split('\t') produces an empty string between each adjacent pair of tabs. Using re.split() with the pattern \t+ treats each run of tabs as a single delimiter:

import re
my_str = '\ttutorial\t\treference\t\tcom\t'
my_list = re.split(r'\t+', my_str.strip())
print(my_list) # Output: ['tutorial', 'reference', 'com']
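  • re.split(r'\t+', my_str.strip()) splits the string on any sequence of one or more tab characters, so consecutive tabs do not produce empty strings. The strip() call removes the leading and trailing tabs, which would otherwise add an empty string at each end of the list.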
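
For comparison, the sketch below runs both approaches on the same illustrative string with doubled tabs, and shows one alternative that avoids the re module by filtering out the empty strings:

```python
import re

# Illustrative input with two consecutive tabs between words.
my_str = 'tutorial\t\treference\t\tcom'

# str.split('\t') yields an empty string between adjacent tabs.
plain = my_str.split('\t')
print(plain)     # ['tutorial', '', 'reference', '', 'com']

# re.split() with \t+ treats each run of tabs as one delimiter.
by_regex = re.split(r'\t+', my_str)
print(by_regex)  # ['tutorial', 'reference', 'com']

# Alternative without re: drop the empty strings after splitting.
filtered = [part for part in my_str.split('\t') if part]
print(filtered)  # ['tutorial', 'reference', 'com']
```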