
How to Resolve Python "ValueError: Sample larger than population or is negative"

The ValueError: Sample larger than population or is negative is a common error raised by the random.sample() function in Python. It occurs when you ask for more unique random elements than the sequence contains, or when the requested sample size is negative.

This guide explains the cause of this error and provides effective solutions, including using random.choices() and error handling.

Understanding the Error: random.sample() and Uniqueness

The random.sample(population, k) function returns a list of k unique elements chosen randomly from the population sequence (like a list or tuple). Because no element is chosen more than once, the requested sample size (k) cannot be larger than the number of elements in the population, and it cannot be negative.

import random

a_list = ['tutorial', 'reference', 'com'] # Population size is 3

# This causes the error because we ask for 4 *unique* elements from a list of 3
# random_elements = random.sample(a_list, 4)
# ⛔️ ValueError: Sample larger than population or is negative
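
Both conditions named in the message can be reproduced and caught with a try...except block. The following is a minimal sketch; the helper name try_sample is hypothetical and is not part of the random module.

```python
import random

a_list = ['tutorial', 'reference', 'com']  # Population size is 3


def try_sample(population, k):
    """Hypothetical helper: return a sample, or None if random.sample() rejects k."""
    try:
        return random.sample(population, k)
    except ValueError as error:
        # Report which k was rejected along with the library's message
        print(f'k={k}: {error}')
        return None


too_large = try_sample(a_list, 4)   # k exceeds len(a_list), so ValueError is caught
negative = try_sample(a_list, -1)   # k is negative, so ValueError is caught
valid = try_sample(a_list, 3)       # k equal to len(a_list) is allowed
```

Note that asking for exactly as many elements as the population holds is valid; the result is then a shuffled copy of the list.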

Solution 1: Use random.choices() to Allow Duplicates

If you need to select k random elements and don't require them to be unique (meaning an element can be picked more than once), use random.choices():

import random

a_list = ['tutorial', 'reference', 'com']

# Select 4 elements, allowing duplicates (sampling with replacement)
random_elements = random.choices(a_list, k=4)
print(random_elements) # Output (Example): ['tutorial', 'com', 'tutorial', 'reference']
  • random.choices(population, k=N) returns a list of k elements chosen from the population with replacement. This means the same element can appear multiple times in the result, and k can be larger than the population size.
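
To illustrate the difference from random.sample(), the short sketch below requests far more elements than the list contains. The value k=10 is an arbitrary choice for demonstration.

```python
import random

a_list = ['tutorial', 'reference', 'com']

# k is larger than the population, which random.sample() would reject
picks = random.choices(a_list, k=10)

print(len(picks))            # 10
print(len(set(picks)) <= 3)  # True: at most 3 distinct values are possible
```

Every returned element still comes from a_list, so with 10 picks from 3 values at least one value must repeat.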

Solution 2: Use min() to Limit Sample Size

If you must use random.sample() (e.g., you strictly need unique elements) but want to avoid the error if k might exceed the population size, you can limit k using the min() function:

import random

a_list = ['tutorial', 'reference', 'com']
requested_sample_size = 4

# Ensure the sample size doesn't exceed the list length
actual_sample_size = min(requested_sample_size, len(a_list))

random_elements = random.sample(a_list, actual_sample_size)
print(random_elements) # Output (Example, unique): ['com', 'tutorial', 'reference']
  • min(requested_sample_size, len(a_list)) ensures that the second argument passed to random.sample() is never larger than the number of elements in a_list.
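
The min() check covers a sample size that is too large but not one that is negative. One possible way to cover both cases is to clamp the size at both ends, as in this sketch. The function name safe_sample is hypothetical, and silently returning fewer items than requested is an assumption about what the caller wants.

```python
import random


def safe_sample(population, requested_size):
    """Hypothetical helper: clamp the sample size to the range 0..len(population)."""
    size = max(0, min(requested_size, len(population)))
    return random.sample(population, size)


a_list = ['tutorial', 'reference', 'com']

print(len(safe_sample(a_list, 4)))  # 3
print(safe_sample(a_list, -2))      # []
print(safe_sample([], 2))           # []
```

If receiving fewer elements than requested would hide a bug in your program, letting the ValueError propagate may be the better choice.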

Getting a Single Random Element with random.choice()

If you only need one random element from a sequence, use the simpler random.choice():

import random

a_list = ['tutorial', 'reference', 'com']
random_element = random.choice(a_list)
print(random_element) # Output (Example): reference

Handling Empty Sequences with random.choice()

random.choice() raises an IndexError if the sequence is empty. Use a try...except block to handle this:

import random

a_list = [] # Empty list

try:
    random_element = random.choice(a_list)
    print(random_element)
except IndexError:
    print('The sequence is empty') # Output: The sequence is empty
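
As an alternative to catching the exception, you can check for an empty sequence before calling random.choice(). This is a minimal sketch; choice_or_default and its default parameter are hypothetical names introduced for illustration.

```python
import random


def choice_or_default(sequence, default=None):
    """Hypothetical helper: return a random element, or default if the sequence is empty."""
    if len(sequence) == 0:
        return default
    return random.choice(sequence)


print(choice_or_default([]))                      # None
print(choice_or_default([], default='fallback'))  # fallback
print(choice_or_default(['com']))                 # com
```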

Conclusion

The ValueError: Sample larger than population or is negative occurs with random.sample() when you request more unique items than the population contains, or when the requested sample size is negative.

  • The best solution is often to use random.choices() if you allow duplicate selections (sampling with replacement).
  • If unique items are required, ensure your requested sample size (k) doesn't exceed the population size, potentially by using min(k, len(population)).
  • For selecting a single random item, random.choice() is the appropriate function.