How to filter a JSON array in Python

Filtering data within JSON arrays is a common task when processing structured data in Python. Whether the data comes from APIs, configuration files, or databases, you need effective ways to extract the specific records you care about.

This guide explores different methods for filtering JSON arrays in Python, providing practical examples and explanations.

Filtering JSON Arrays with List Comprehension

List comprehensions provide a concise way to filter a JSON array after parsing it into a Python list.

import json

json_array = json.dumps(
    [
        {'name': 'Alice', 'salary': 100},
        {'name': 'Bob', 'salary': 50},
        {'name': 'Carl', 'salary': 75}
    ]
)

a_list = json.loads(json_array)

filtered_list = [
    dictionary for dictionary in a_list
    if dictionary['salary'] > 50
]

print(filtered_list)
# Output: [{'name': 'Alice', 'salary': 100}, {'name': 'Carl', 'salary': 75}]

Explanation:

  • First, json.dumps() creates a JSON string. This step exists only for the sake of the example; in a real scenario you would probably already have a JSON string, for instance one read from a file or received from an API.
  • json.loads(json_array) parses the JSON string into a native Python list of dictionaries.
  • The list comprehension [dictionary for dictionary in a_list if dictionary['salary'] > 50] iterates through each dictionary in the a_list.
  • The condition if dictionary['salary'] > 50 checks if the salary value is greater than 50 for each dictionary.
  • Only the dictionaries that satisfy this condition are included in the filtered_list.

This approach provides a very readable and Pythonic way to filter JSON data.
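
Real-world records do not always carry every key. As a minimal sketch, assuming some entries may lack a salary field (the extra 'Dana' record below is hypothetical and not part of the original example), dict.get() with a default value keeps the comprehension from raising a KeyError:

```python
import json

# Hypothetical input: the last record has no 'salary' key.
json_array = json.dumps([
    {'name': 'Alice', 'salary': 100},
    {'name': 'Bob', 'salary': 50},
    {'name': 'Carl', 'salary': 75},
    {'name': 'Dana'},
])

a_list = json.loads(json_array)

# .get() returns the default (0 here) when the key is missing,
# so incomplete records fail the condition instead of raising KeyError.
filtered_list = [
    dictionary for dictionary in a_list
    if dictionary.get('salary', 0) > 50
]

print(filtered_list)
# Output: [{'name': 'Alice', 'salary': 100}, {'name': 'Carl', 'salary': 75}]
```

Whether a missing salary should count as 0 is a choice made for this illustration; depending on your data, you may prefer to reject such records explicitly.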

Filtering JSON Arrays Using a for Loop

Alternatively, you can use a traditional for loop for filtering. This method is useful for developers who prefer more explicit control over the filtering process.

import json

json_array = json.dumps(
    [
        {'name': 'Alice', 'salary': 100},
        {'name': 'Bob', 'salary': 50},
        {'name': 'Carl', 'salary': 75}
    ]
)

a_list = json.loads(json_array)

filtered_list = []

for dictionary in a_list:
    if dictionary['salary'] > 50:
        filtered_list.append(dictionary)

print(filtered_list)
# Output: [{'name': 'Alice', 'salary': 100}, {'name': 'Carl', 'salary': 75}]

Explanation:

  • We parse the JSON string into a Python list using json.loads().
  • An empty list filtered_list is created to store the filtered dictionaries.
  • The for loop iterates through each dictionary in the a_list.
  • Inside the loop, the if condition dictionary['salary'] > 50 checks if the salary is greater than 50.
  • If the condition is true, the dictionary is appended to the filtered_list.

This approach achieves the same filtering result as the list comprehension but is more explicit in terms of the steps involved.
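
One place where the explicit loop can pay off is when each item needs more than a keep-or-drop decision. The following sketch, which assumes you also want to retain the records that fail the condition (the names above_threshold and at_or_below are illustrative), splits the list into two groups in a single pass:

```python
import json

json_array = json.dumps([
    {'name': 'Alice', 'salary': 100},
    {'name': 'Bob', 'salary': 50},
    {'name': 'Carl', 'salary': 75},
])

a_list = json.loads(json_array)

above_threshold = []  # records with salary > 50
at_or_below = []      # everything else

for dictionary in a_list:
    if dictionary['salary'] > 50:
        above_threshold.append(dictionary)
    else:
        at_or_below.append(dictionary)

print([d['name'] for d in above_threshold])
# Output: ['Alice', 'Carl']
print([d['name'] for d in at_or_below])
# Output: ['Bob']
```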

Filtering JSON Arrays from Files

Often, JSON data is stored in files. To filter data from a JSON file, first load its content with the json.load() function.

import json

file_name = 'example.json' # Assuming 'example.json' exists

with open(file_name, 'r', encoding='utf-8') as f:
    a_list = json.load(f)
    print(a_list)

filtered_list = [
    dictionary for dictionary in a_list
    if dictionary['salary'] > 50
]

print(filtered_list)
# Output:
# [{'name': 'Alice', 'salary': 100}, {'name': 'Bob', 'salary': 50}, {'name': 'Carl', 'salary': 75}]
# [{'name': 'Alice', 'salary': 100}, {'name': 'Carl', 'salary': 75}]

Explanation:

  • The code opens the file using a with open(...) statement, which ensures that the file is closed correctly.
  • json.load(f) parses the JSON data from the opened file to a Python list.
  • The remaining steps are the same as the list comprehension examples, where we filter the list of dictionaries based on the salary.

This allows you to filter JSON data directly from a file and save the results in a variable.
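
If you want to try the file-based flow without preparing a file by hand, the sketch below writes the sample data to a temporary directory first, then loads, filters, and saves the result with json.dump(). The file name filtered.json and the use of a temporary directory are assumptions made for this illustration.

```python
import json
import os
import tempfile

records = [
    {'name': 'Alice', 'salary': 100},
    {'name': 'Bob', 'salary': 50},
    {'name': 'Carl', 'salary': 75},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    source_path = os.path.join(tmp_dir, 'example.json')
    target_path = os.path.join(tmp_dir, 'filtered.json')  # hypothetical output name

    # Create the input file so the example is self-contained.
    with open(source_path, 'w', encoding='utf-8') as f:
        json.dump(records, f)

    # Load the JSON array from the file and filter it.
    with open(source_path, 'r', encoding='utf-8') as f:
        a_list = json.load(f)

    filtered_list = [d for d in a_list if d['salary'] > 50]

    # Write the filtered array back out as JSON.
    with open(target_path, 'w', encoding='utf-8') as f:
        json.dump(filtered_list, f, indent=2)

    # Read the output back to confirm it is valid JSON.
    with open(target_path, 'r', encoding='utf-8') as f:
        round_tripped = json.load(f)

print(round_tripped)
# Output: [{'name': 'Alice', 'salary': 100}, {'name': 'Carl', 'salary': 75}]
```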

Note: You'll need an example.json file that contains a valid JSON array for this example to work:

[
{"name": "Alice", "salary": 100},
{"name": "Bob", "salary": 50},
{"name": "Carl", "salary": 75}
]

Note: The json.load() function expects a file-like object, opened in text or binary mode, that implements a .read() method and contains a JSON document.
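
To illustrate that any object with a .read() method works, the following sketch uses in-memory streams from the standard io module in place of real files. The sample string is made up for the example.

```python
import io
import json

json_text = '[{"name": "Alice", "salary": 100}, {"name": "Bob", "salary": 50}]'

# A text stream and a binary stream holding the same JSON document.
text_stream = io.StringIO(json_text)
binary_stream = io.BytesIO(json_text.encode('utf-8'))

# json.load() calls .read() on the object it is given.
from_text = json.load(text_stream)
from_binary = json.load(binary_stream)

print(from_text == from_binary)
# Output: True

filtered_list = [d for d in from_binary if d['salary'] > 50]
print(filtered_list)
# Output: [{'name': 'Alice', 'salary': 100}]
```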

Filtering JSON Arrays with the filter() Function

The built-in filter() function provides another way to filter data based on a given condition. It's particularly useful with lambda functions.

import json

json_array = json.dumps(
    [
        {'name': 'Alice', 'salary': 100},
        {'name': 'Bob', 'salary': 50},
        {'name': 'Carl', 'salary': 75}
    ]
)

a_list = json.loads(json_array)

filtered_list = list(
    filter(
        lambda dictionary: dictionary['salary'] > 50,
        a_list
    )
)

print(filtered_list)
# Output: [{'name': 'Alice', 'salary': 100}, {'name': 'Carl', 'salary': 75}]

Explanation:

  • We load the JSON string into a Python list using json.loads().
  • The filter() function takes two arguments: a function (in this case a lambda function) that serves as a condition and an iterable (a_list).
  • The lambda function lambda dictionary: dictionary['salary'] > 50 checks if the salary is greater than 50.
  • The filter() function returns an iterator that yields the items of the list for which the lambda function returns a truthy value.
  • We use list() to convert this iterator into a new list, filtered_list.

This method can be more concise than using a for loop for straightforward filtering conditions.
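
filter() also accepts a regular named function, which can read better once the condition grows. The sketch below uses a hypothetical helper, earns_more_than_50, and also shows that the iterator returned by filter() can be consumed only once:

```python
import json

json_array = json.dumps([
    {'name': 'Alice', 'salary': 100},
    {'name': 'Bob', 'salary': 50},
    {'name': 'Carl', 'salary': 75},
])

a_list = json.loads(json_array)


def earns_more_than_50(dictionary):
    """Hypothetical predicate: True when the salary exceeds 50."""
    return dictionary['salary'] > 50


# filter() is lazy: nothing is evaluated until the iterator is consumed.
high_earners = filter(earns_more_than_50, a_list)

first_pass = list(high_earners)
second_pass = list(high_earners)  # the iterator is already exhausted

print(first_pass)
# Output: [{'name': 'Alice', 'salary': 100}, {'name': 'Carl', 'salary': 75}]
print(second_pass)
# Output: []
```

If you need the results more than once, convert the iterator to a list right away, as the main example does.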