How to Resolve Python "ValueError: Sample larger than population or is negative"
The ValueError: Sample larger than population or is negative is a common error encountered when using the random.sample() function in Python. It arises when you attempt to select more unique random elements from a sequence than are actually available, or when you pass a negative sample size.
This guide explains the cause of this error and provides effective solutions, including using random.choices(), limiting the sample size with min(), and error handling.
Understanding the Error: random.sample() and Uniqueness
The random.sample(population, k) function returns a list of k unique elements chosen randomly from the population sequence (like a list or tuple). Because it samples without replacement (no position in the population is chosen more than once), the requested sample size (k) cannot be larger than the number of elements available in the population, and it cannot be negative.
import random
a_list = ['tutorial', 'reference', 'com'] # Population size is 3
# This causes the error because we ask for 4 *unique* elements from a list of 3
# random_elements = random.sample(a_list, 4)
# ⛔️ ValueError: Sample larger than population or is negative
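Both conditions named in the error message can be reproduced and caught with try...except. The following is a minimal sketch; the helper name try_sample is hypothetical and exists only for this illustration.

```python
import random

a_list = ['tutorial', 'reference', 'com']


def try_sample(population, k):
    """Hypothetical helper: return a sample, or None if k is invalid."""
    try:
        return random.sample(population, k)
    except ValueError as exc:
        # Raised when k > len(population) or when k < 0
        print(f'Cannot sample {k} items: {exc}')
        return None


too_many = try_sample(a_list, 4)   # k exceeds the population size
negative = try_sample(a_list, -1)  # k is negative
valid = try_sample(a_list, 2)      # a list of 2 elements from distinct positions
```

Catching the exception is most useful when an invalid k should be reported to the caller rather than silently corrected.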
Solution 1: Use random.choices() for Sampling with Replacement (Recommended)
If you need to select k random elements and don't require them to be unique (meaning elements can be picked multiple times), use random.choices():
import random
a_list = ['tutorial', 'reference', 'com']
# Select 4 elements, allowing duplicates (sampling with replacement)
random_elements = random.choices(a_list, k=4)
print(random_elements) # Output (Example): ['tutorial', 'com', 'tutorial', 'reference']
random.choices(population, k=N) returns a list of N elements chosen from the population with replacement. This means the same element can appear multiple times in the result, and N can be larger than the population size.
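To see sampling with and without replacement side by side, the sketch below draws from the same list with both functions. The seed value is an arbitrary assumption, used only so that repeated runs print the same result.

```python
import random

random.seed(0)  # arbitrary seed, only to make repeated runs reproducible

a_list = ['tutorial', 'reference', 'com']

# With replacement: k may exceed len(a_list), so repeats are unavoidable here
with_replacement = random.choices(a_list, k=5)

# Without replacement: k is capped at len(a_list), each position is used once
without_replacement = random.sample(a_list, k=3)

print(with_replacement)
print(without_replacement)

# Five picks from three distinct values must contain at least one repeat
print(len(set(with_replacement)) < len(with_replacement))  # True
```

Sampling all three elements with random.sample() amounts to returning the list in a random order.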
Solution 2: Use min() to Limit Sample Size
If you must use random.sample() (e.g., you strictly need unique elements) but want to avoid the error when k might exceed the population size, you can limit k using the min() function:
import random
a_list = ['tutorial', 'reference', 'com']
requested_sample_size = 4
# Ensure the sample size doesn't exceed the list length
actual_sample_size = min(requested_sample_size, len(a_list))
random_elements = random.sample(a_list, actual_sample_size)
print(random_elements) # Output (Example, unique): ['com', 'tutorial', 'reference']
min(requested_sample_size, len(a_list)) ensures that the second argument passed to random.sample() is never larger than the number of elements in a_list. Note that the result may then contain fewer elements than originally requested.
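The min() guard covers only the "larger than population" half of the error. A possible extension, sketched below, also clamps negative values to zero. The function name safe_sample is hypothetical and not part of the random module.

```python
import random


def safe_sample(population, k):
    """Hypothetical helper: sample at most len(population) items, never fewer than 0."""
    # Clamp k into the valid range [0, len(population)]
    clamped_k = max(0, min(k, len(population)))
    return random.sample(population, clamped_k)


a_list = ['tutorial', 'reference', 'com']

print(len(safe_sample(a_list, 10)))  # 3
print(len(safe_sample(a_list, 2)))   # 2
print(safe_sample(a_list, -5))       # []
print(safe_sample([], 4))            # []
```

Clamping silently returns fewer items than requested, so callers that depend on an exact count should check the length of the result.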
Getting a Single Random Element with random.choice()
If you only need one random element from a sequence, use the simpler random.choice():
import random
a_list = ['tutorial', 'reference', 'com']
random_element = random.choice(a_list)
print(random_element) # Output (Example): reference
Handling Empty Sequences with random.choice()
random.choice() raises an IndexError if the sequence is empty. Use a try...except block to handle this:
import random
a_list = [] # Empty list
try:
    random_element = random.choice(a_list)
    print(random_element)
except IndexError:
    print('The sequence is empty') # Output: The sequence is empty
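An alternative to catching the exception is to check for emptiness before calling random.choice(). The sketch below assumes a hypothetical helper named choice_or_default that returns a fallback value when there is nothing to choose from.

```python
import random


def choice_or_default(seq, default=None):
    """Hypothetical helper: return a random element, or `default` if seq is empty."""
    # Checking the length up front avoids the IndexError entirely
    if len(seq) == 0:
        return default
    return random.choice(seq)


print(choice_or_default([]))                 # None
print(choice_or_default([], default='n/a'))  # n/a
print(choice_or_default(['tutorial']))       # tutorial
```

Whether to check first or catch the exception is largely a matter of style; the check reads more directly when an empty sequence is an expected case.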
Conclusion
The ValueError: Sample larger than population or is negative specifically occurs with random.sample() when requesting more unique items than available.
- The best solution is often to use random.choices() if you allow duplicate selections (sampling with replacement).
- If unique items are required, ensure your requested sample size (k) doesn't exceed the population size, potentially by using min(k, len(population)).
- For selecting a single random item, random.choice() is the appropriate function.