How to Manipulate Path Components in Python

Working with file paths often requires extracting specific parts of the path or removing prefixes and suffixes.

This guide explores several techniques for manipulating path components in Python with both the pathlib and os.path modules: getting the last part of a path, removing the last component, and removing a prefix or the first folder.

Getting the Last Part of a Path

To extract the last part of a file path, whether it is a filename or a folder name, you can use any of the following techniques.

Using pathlib.PurePath().name

The pathlib module provides a convenient way to get the last component of a path using the name attribute:

import pathlib
path = '/home/tomnolan/Desktop/last/'
last_part = pathlib.PurePath(path).name
print(last_part) # Output: last
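  • PurePath(path) creates a path object without touching the file system.
  • The .name attribute returns the last component of the path; a trailing slash is ignored.
  • PurePath parses the string using the conventions of the operating system it runs on, so a backslash-separated Windows path is not split into components on Linux. Use pathlib.PureWindowsPath to parse Windows-style paths on any platform.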
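
How the last component is found depends on which path flavour parses the string. The sketch below parses the same Windows-style string with both explicit flavours, PureWindowsPath and PurePosixPath; the sample path is only illustrative.

```python
from pathlib import PurePosixPath, PureWindowsPath

windows_path = 'C:\\Users\\tomnolan\\Desktop\\example.txt'

# Parsed as a Windows path: the backslashes are treated as separators.
windows_name = PureWindowsPath(windows_path).name
print(windows_name)  # Output: example.txt

# Parsed as a POSIX path: there is no '/', so the whole string is one component.
posix_name = PurePosixPath(windows_path).name
print(posix_name)  # Output: C:\Users\tomnolan\Desktop\example.txt
```

Plain PurePath picks PureWindowsPath on Windows and PurePosixPath elsewhere, so naming the flavour explicitly keeps the result the same on every platform.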

Using ntpath for Windows-Style paths on Linux

If you have to process Windows-style paths in Linux environments, use the ntpath module:

import ntpath

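def get_last_path(path):
    # Split into (everything before the last separator, everything after it).
    head, tail = ntpath.split(path)
    # Fall back to the last component of head when the path ends with a separator.
    return tail or ntpath.basename(head)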

last_path = get_last_path('C:\\Users\\tomnolan\\Desktop\\example.txt')
print(last_path) # Output: example.txt
  • The function uses ntpath.split() which splits the path into two strings: before the last separator, and after the last separator (the tail).
  • If the tail is empty, which means that the last character was / or \, then the function uses ntpath.basename to return the last path component of the head, if any.
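
To illustrate the fallback described above, here is a small sketch that runs the same helper on a few sample inputs, including paths that end with a separator. The sample paths are made up for illustration.

```python
import ntpath

def get_last_path(path):
    head, tail = ntpath.split(path)
    return tail or ntpath.basename(head)

samples = [
    'C:\\Users\\tomnolan\\Desktop\\example.txt',  # ends with a file name
    'C:\\Users\\tomnolan\\Desktop\\',             # ends with a backslash
    'C:/Users/tomnolan/Desktop/',                 # forward slashes are accepted too
]

results = [get_last_path(sample) for sample in samples]
print(results)  # Output: ['example.txt', 'Desktop', 'Desktop']
```

Because ntpath treats both \ and / as separators, the helper behaves the same for either style of slash.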

Using os.path.basename()

Alternatively, you can use the os.path.basename() function to extract the last part of a path, after stripping any trailing slashes:

import os

path = '/home/tomnolan/Desktop/last/'
last_path = os.path.basename(os.path.normpath(path))
print(last_path) # Output: last
  • os.path.normpath() removes trailing slashes (and collapses redundant separators), so the result is the same whether or not the path ends with /.
  • os.path.basename() then returns everything after the last separator of the normalized path.
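
The following sketch shows why the normalization step matters: called directly on a path with a trailing slash, os.path.basename() returns an empty string.

```python
import os

path = '/home/tomnolan/Desktop/last/'

# Without normpath(), everything after the final '/' is an empty string.
without_normpath = os.path.basename(path)
print(repr(without_normpath))  # Output: ''

# With normpath(), the trailing slash is removed first.
with_normpath = os.path.basename(os.path.normpath(path))
print(with_normpath)  # Output: last
```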

Removing the Last Path Component

To remove the last part of a path, you can use either the pathlib or the os.path module.

Using pathlib.Path().parent

The pathlib.Path class provides a convenient way to remove the last component using the parent attribute, which returns a new Path object:

from pathlib import Path

absolute_path = '/home/tomnolan/Desktop/python/main.py'
result = Path(absolute_path).parent
print(result) # Output: /home/tomnolan/Desktop/python

absolute_path = '/home/tomnolan/Desktop/'
result = Path(absolute_path).parent
print(result) # Output: /home/tomnolan
  • Path() creates a path object.
  • The parent attribute returns a new Path containing everything before the last path component; a trailing slash is ignored.
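
If more than one trailing component has to go, parent can be chained, or the parents sequence can be indexed. The sketch below uses PurePosixPath so that it prints the same result on any operating system; that choice is an assumption made for the example, not something the technique requires.

```python
from pathlib import PurePosixPath

path = PurePosixPath('/home/tomnolan/Desktop/python/main.py')

one_up = path.parent             # removes 'main.py'
two_up = path.parent.parent      # removes 'python/main.py'
also_two_up = path.parents[1]    # parents[0] is the same as .parent

print(one_up)       # Output: /home/tomnolan/Desktop/python
print(two_up)       # Output: /home/tomnolan/Desktop
print(also_two_up)  # Output: /home/tomnolan/Desktop

# The parent of the root is the root itself, so chaining never fails.
root_parent = PurePosixPath('/').parent
print(root_parent)  # Output: /
```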

Using os.path.dirname()

The os.path.dirname() function can also remove the last path component, but it’s important to strip trailing slashes first using os.path.normpath():

import os

absolute_path = '/home/tomnolan/Desktop/python/main.py'
result = os.path.dirname(os.path.normpath(absolute_path))
print(result) # Output: /home/tomnolan/Desktop/python

absolute_path = '/home/tomnolan/Desktop/'
result = os.path.dirname(os.path.normpath(absolute_path))
print(result) # Output: /home/tomnolan
  • os.path.normpath() strips trailing slashes from the path.
  • os.path.dirname() returns the directory component (everything before the last / or \ character), effectively removing the last component.
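
A short sketch of what happens when the normalization step is skipped. It imports posixpath, the module that backs os.path on Linux and macOS, so that the printed results do not depend on the machine running it.

```python
import posixpath

path = '/home/tomnolan/Desktop/'

# Without normpath(), only the trailing slash is removed.
without_normpath = posixpath.dirname(path)
print(without_normpath)  # Output: /home/tomnolan/Desktop

# With normpath(), the last folder is removed as intended.
with_normpath = posixpath.dirname(posixpath.normpath(path))
print(with_normpath)  # Output: /home/tomnolan
```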

Removing a Path Prefix

To remove a path prefix, you can use the os.path.relpath() function or the relative_to() method of a pathlib path object.

Using os.path.relpath()

The os.path.relpath() function returns the path given as its first argument, expressed relative to the start directory given as its second argument.

import os

absolute_path = '/home/tomnolan/Desktop/python/main.py'
prefix = '/home/tomnolan'
without_prefix = os.path.relpath(absolute_path, prefix)
print(without_prefix) # Output: Desktop/python/main.py
  • The result is the first path expressed relative to the second argument, which is the starting directory.
  • os.path.relpath() does not check that the starting directory is actually a prefix of the path; if it is not, the result contains .. components.
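
Since relpath() computes a relative path between any two locations, it is worth guarding against a start directory that is not really a prefix. Below is a minimal sketch of such a guard; strip_prefix is a hypothetical helper name, and it assumes both arguments are absolute POSIX paths (posixpath is used so the output is platform-independent).

```python
import posixpath

def strip_prefix(path, prefix):
    """Return path relative to prefix, or None if prefix is not an ancestor of path.

    Hypothetical helper; assumes both arguments are absolute POSIX paths.
    """
    path = posixpath.normpath(path)
    prefix = posixpath.normpath(prefix)
    # commonpath() compares whole components, so '/home/tom' is not
    # treated as a prefix of '/home/tomnolan'.
    if posixpath.commonpath([path, prefix]) != prefix:
        return None
    return posixpath.relpath(path, prefix)

absolute_path = '/home/tomnolan/Desktop/python/main.py'

print(strip_prefix(absolute_path, '/home/tomnolan'))  # Output: Desktop/python/main.py
print(strip_prefix(absolute_path, '/home/alice'))     # Output: None

# For comparison, relpath() on its own climbs out of the unrelated directory.
unguarded = posixpath.relpath(absolute_path, '/home/alice')
print(unguarded)  # Output: ../tomnolan/Desktop/python/main.py
```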

Using pathlib.PurePath().relative_to()

The pathlib module also offers a way to remove a prefix.

from pathlib import PurePath

absolute_path = '/home/tomnolan/Desktop/python/main.py'
a_path = PurePath(absolute_path)
without_prefix = str(a_path.relative_to('/home/tomnolan'))
print(without_prefix) # Output: Desktop/python/main.py
  • The relative_to() method returns a_path expressed relative to /home/tomnolan, and raises ValueError if a_path is not located under that prefix.
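
By default, relative_to() raises ValueError when the path does not start with the given prefix, so code that handles arbitrary input may want to catch that. A minimal sketch follows, with remove_prefix as a hypothetical helper name and PurePosixPath chosen so the example behaves the same on every platform.

```python
from pathlib import PurePosixPath

def remove_prefix(path, prefix):
    """Return path without prefix, or None if path is not under prefix.

    Hypothetical helper for illustration.
    """
    try:
        return PurePosixPath(path).relative_to(prefix)
    except ValueError:
        return None

absolute_path = '/home/tomnolan/Desktop/python/main.py'

inside = remove_prefix(absolute_path, '/home/tomnolan')
print(inside)   # Output: Desktop/python/main.py

outside = remove_prefix(absolute_path, '/home/alice')
print(outside)  # Output: None
```

Unlike os.path.relpath(), this never produces .. components, which makes a wrong prefix easy to detect.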

Removing the First Folder from a Path

To remove the first folder from a path, use the Path class to get the path components with parts, then use relative_to() to strip the first two parts: the root (/) and the first folder.

Using Path().relative_to() and Slicing

from pathlib import Path

absolute_path = '/home/tomnolan/Desktop/python/main.py'
a_path = Path(absolute_path)
result = a_path.relative_to(*a_path.parts[:2])
print(result) # Output: tomnolan/Desktop/python/main.py
  • The parts attribute of a Path object returns the components of the path as a tuple. For an absolute POSIX path, the first element is the root, /.
  • The slice [:2] selects the first two components, here '/' and 'home'.
  • The iterable unpacking operator * passes those components as separate arguments to the relative_to() method.
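
The same idea can be generalized to drop any number of leading folders, for both absolute and relative paths. The sketch below is one possible way to do it; drop_first_folders is a hypothetical helper, and PurePosixPath is used so the result does not depend on the operating system.

```python
from pathlib import PurePosixPath

def drop_first_folders(path, count=1):
    """Return path without its root (if any) and its first `count` folders.

    Hypothetical helper for illustration.
    """
    p = PurePosixPath(path)
    # For an absolute path, parts[0] is the root '/', which is skipped as well.
    skip = count + (1 if p.anchor else 0)
    return PurePosixPath(*p.parts[skip:])

absolute_result = drop_first_folders('/home/tomnolan/Desktop/python/main.py')
print(absolute_result)  # Output: tomnolan/Desktop/python/main.py

relative_result = drop_first_folders('home/tomnolan/Desktop/python/main.py')
print(relative_result)  # Output: tomnolan/Desktop/python/main.py

two_dropped = drop_first_folders('/home/tomnolan/Desktop/python/main.py', count=2)
print(two_dropped)      # Output: Desktop/python/main.py
```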

Using os.path.join() and split()

An alternative way to remove the first path component is using str.split() with os.path.join():

import os

absolute_path = '/home/tomnolan/Desktop/python/main.py'
result = os.path.join(*(absolute_path.split(os.path.sep)[2:]))
print(result) # Output: tomnolan/Desktop/python/main.py
  • absolute_path.split(os.path.sep) splits the path into a list of components on the system's path separator (/ on POSIX, \ on Windows). Because the path starts with a separator, the first element of the list is an empty string.
  • The slice [2:] selects all elements from index 2 onward, skipping the empty string and the first folder.
  • The asterisk (*) unpacks the sliced list into os.path.join(), which joins the components with the path separator.
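
The slice index in this approach depends on whether the path starts with a separator. The sketch below prints the intermediate list to make that visible, then shows one possible variant that filters out empty strings first; drop_first_folder is a hypothetical helper that assumes the path has at least two components, and posixpath is used so the output is the same on every platform.

```python
import posixpath

absolute_path = '/home/tomnolan/Desktop/python/main.py'
components = absolute_path.split(posixpath.sep)
print(components)  # Output: ['', 'home', 'tomnolan', 'Desktop', 'python', 'main.py']

# A relative path has no leading empty string, so [2:] drops two folders.
relative_path = 'home/tomnolan/Desktop/python/main.py'
over_trimmed = posixpath.join(*relative_path.split(posixpath.sep)[2:])
print(over_trimmed)  # Output: Desktop/python/main.py

def drop_first_folder(path):
    """Drop the first folder of an absolute or relative POSIX path.

    Hypothetical helper; assumes the path has at least two components,
    because join() needs at least one argument.
    """
    parts = [part for part in path.split(posixpath.sep) if part]
    return posixpath.join(*parts[1:])

print(drop_first_folder(absolute_path))  # Output: tomnolan/Desktop/python/main.py
print(drop_first_folder(relative_path))  # Output: tomnolan/Desktop/python/main.py
```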