I need to obtain a k-sized sample without replacement from a population, where each member of the population has an associated weight (W).
Python's random.choices only samples with replacement, and random.sample won't take a weighted input.
Currently, this is what I am using:
P = np.zeros((1,Parent_number))
n=0
while n < Parent_number:
    draw = random.choices(population,weights=W,k=1)
    if draw not in P:
        P[0,n] = draw[0]
        n=n+1
P=np.asarray(sorted(P[0]))
While this works, it requires switching back and forth between arrays and lists, which is less than ideal.
I am looking for the simplest and easiest to understand solution as this code will be shared with others.
You can use np.random.choice with replace=False as follows:
np.random.choice(vec,size,replace=False, p=P)
where vec is your population and P is the weight vector. Note that the entries of P must be probabilities that sum to 1.
For example:
import numpy as np
vec=[1,2,3]
P=[0.5,0.2,0.3]
np.random.choice(vec,size=2,replace=False, p=P)
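If your weights are arbitrary positive numbers rather than probabilities, they need to be normalized first. Here is a minimal sketch of how that might look, using the newer Generator API. The function name weighted_sample_without_replacement, the seed argument, and the sample data are made up for illustration, and the result is sorted only because the code in the question sorts it:

```python
import numpy as np

def weighted_sample_without_replacement(population, weights, k, seed=None):
    """Draw k distinct members with probability proportional to weights."""
    rng = np.random.default_rng(seed)
    w = np.asarray(weights, dtype=float)
    p = w / w.sum()  # choice() expects probabilities that sum to 1
    picked = rng.choice(population, size=k, replace=False, p=p)
    return np.sort(picked)

# Hypothetical data: five candidates with unnormalized weights.
population = np.array([10, 20, 30, 40, 50])
W = [5, 1, 3, 1, 2]
parents = weighted_sample_without_replacement(population, W, k=3, seed=0)
print(parents)
```

This stays in arrays throughout, so there is no switching between lists and arrays. Keep in mind that k cannot exceed the number of entries with a nonzero weight.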
For numpy, Miriam Farber's answer is the way to go.
For pure python, the technique is to pre-weight the population and then use random.sample() to extract the values without replacement:
>>> # Extract 10 values without replacement from a population
>>> # of ten heads and four tails.
>>> from random import sample
>>> population = ['heads', 'tails']
>>> counts = [10, 4]
>>> weighted_pop = [elem for elem, cnt in zip(population, counts) for i in range(cnt)]
>>> sample(weighted_pop, k=10)
['heads', 'tails', 'tails', 'heads', 'heads', 'tails', 'heads', 'heads', 'heads', 'heads']
Note, the weights are really counts. This is important because when you sample without replacement, the count needs to be reduced by one for each selection.
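On Python 3.9 or later, random.sample() accepts a counts argument, so the expanded list does not have to be built by hand. A short sketch under that version assumption, reusing the same population and counts (the variable name picked is just for illustration):

```python
from random import sample

population = ['heads', 'tails']
counts = [10, 4]

# Equivalent to sampling from ten 'heads' and four 'tails'
# without building the expanded list explicitly.
picked = sample(population, counts=counts, k=10)
print(picked)
```

As before, the counts must be integers, and k cannot exceed their total.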