How to Decode URL Parameters in Python

URL encoding (also known as percent-encoding) replaces characters that are reserved or unsafe in a URL, as well as non-ASCII characters, with a % followed by two hexadecimal digits for each byte.

This guide explains how to decode URL parameters in Python, effectively reversing this process. We'll focus on the urllib.parse.unquote() and urllib.parse.unquote_plus() functions, handle double-encoding, and briefly touch on using the requests library.

Decoding URL Parameters with urllib.parse.unquote()

The urllib.parse.unquote() function is the standard and recommended way to decode URL-encoded strings in Python:

from urllib.parse import unquote

url = 'https://tutorialreference.com/doc%3Fpage%3D1%26offset%3D10'
decoded_url = unquote(url)
print(decoded_url) # Output: https://tutorialreference.com/doc?page=1&offset=10
  • unquote(url) replaces %xx escapes with their single-character equivalent. For example, %3F becomes ?, %3D becomes =, and %26 becomes &.
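  • unquote() decodes the resulting bytes as UTF-8 by default. Pass the encoding and errors arguments to change this; by default, invalid byte sequences are replaced with a placeholder character (errors='replace').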
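To make the UTF-8 behavior concrete, here is a minimal sketch. The sample strings are illustrative assumptions, not taken from a real URL; the encoding and errors parameters are part of the standard unquote() signature.

```python
from urllib.parse import unquote

# Two bytes (C3 A9) form one UTF-8 character
utf8_result = unquote('caf%C3%A9')
print(utf8_result)  # café

# A single byte E9 is "é" in Latin-1, so the encoding must be named explicitly
latin1_result = unquote('caf%E9', encoding='latin-1')
print(latin1_result)  # café

# E9 on its own is not valid UTF-8; with the default errors='replace'
# it becomes the replacement character U+FFFD instead of raising
fallback_result = unquote('caf%E9')
print(fallback_result == 'caf\ufffd')  # True
```

If silent replacement is not acceptable, passing errors='strict' makes unquote() raise a UnicodeDecodeError instead.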

Handling Plus Signs (+) as Spaces with unquote_plus()

In HTML form encoding (application/x-www-form-urlencoded), spaces are represented by plus signs (+). urllib.parse.unquote() leaves + unchanged. To convert plus signs to spaces, use urllib.parse.unquote_plus():

from urllib.parse import unquote_plus

url = 'https://tutorialreference.com/doc%3Fpage%3D1+%26+offset%3D10' # + instead of space
result = unquote_plus(url, encoding='utf-8')
print(result) # Output: https://tutorialreference.com/doc?page=1 & offset=10
  • unquote_plus() behaves like unquote(), but also replaces plus signs with spaces. This is crucial for correctly decoding form data.
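The difference is easiest to see side by side. The sketch below uses a hypothetical form-encoded query string and also shows urllib.parse.parse_qs(), a standard-library function that splits a query string into parameters and decodes each one with the same form rules. It is offered as one possible approach when you need individual parameters rather than a single decoded string.

```python
from urllib.parse import parse_qs, unquote, unquote_plus

# Hypothetical query string: + stands for a space, %2B for a literal plus sign
raw = 'name=Ada+Lovelace&expr=1%2B1%3D2'

print(unquote(raw))       # name=Ada+Lovelace&expr=1+1=2  (the + separator is kept)
print(unquote_plus(raw))  # name=Ada Lovelace&expr=1+1=2  (the + separator becomes a space)

# parse_qs splits on & and = first, then decodes each key and value
params = parse_qs(raw)
print(params)  # {'name': ['Ada Lovelace'], 'expr': ['1+1=2']}
```

Note that decoding the whole string first, as in the first two calls, makes the decoded = inside the expr value indistinguishable from a key/value separator. Splitting before decoding, as parse_qs() does, avoids that ambiguity.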

Decoding Double-Encoded Parameters

Sometimes a parameter is encoded twice, for example when an already-encoded URL passes through a second encoding step, which turns % itself into %25. In these cases, call unquote() (or unquote_plus()) twice:

from urllib.parse import unquote

url = 'https://tutorialreference.com/doc%253Fpage%253D1%2526offset%253D10'
result = unquote(unquote(url)) # Call unquote() twice
print(result) # Output: https://tutorialreference.com/doc?page=1&offset=10
  • Each call to unquote() decodes one level of encoding.
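When the number of encoding layers is not known in advance, one option is to decode until the string stops changing. The helper below is a hypothetical sketch, not a standard-library function; its name and the max_rounds limit are assumptions chosen for illustration.

```python
from urllib.parse import unquote

def unquote_fully(value, max_rounds=5):
    """Hypothetical helper: call unquote() until the value no longer changes."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            break  # nothing left to decode
        value = decoded
    return value

url = 'https://tutorialreference.com/doc%253Fpage%253D1%2526offset%253D10'
result = unquote_fully(url)
print(result)  # https://tutorialreference.com/doc?page=1&offset=10
```

Use this with care: if the original data legitimately contains text that looks like an escape (such as a literal "%41"), repeated decoding will alter it. Decoding a fixed, known number of times is safer whenever that number is known.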

Using requests.utils.unquote() (If you already have requests)

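If requests is already installed, you can also call requests.utils.unquote(). On Python 3 this is the same function as urllib.parse.unquote(), made available through the requests.utils module, so it behaves identically: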

import requests

url = 'https://tutorialreference.com/doc%3Fpage%3D1%26offset%3D10'
result = requests.utils.unquote(url)
print(result) # Output: https://tutorialreference.com/doc?page=1&offset=10
  • requests.utils.unquote() decodes the string by replacing each %xx escape with its corresponding character.