stk.StochasticUniversalSampling
- class stk.StochasticUniversalSampling(num_batches=None, batch_size=1, duplicate_molecules=True, duplicate_batches=True, key_maker=Inchi(), fitness_modifier=<function StochasticUniversalSampling.<lambda>>, random_seed=None)
Bases: Selector[T]
Yields batches of molecules through stochastic universal sampling.
Stochastic universal sampling lays out batches along a line, with each batch taking up length proportional to its fitness. It then creates a set of evenly spaced pointers to different points on the line, each of which is occupied by a batch. Batches which are pointed to are yielded.
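The line-and-pointer description above can be illustrated with a minimal, generic sketch. This is not stk's implementation: the function name `stochastic_universal_sampling`, its parameters, and the use of plain items in place of batches are assumptions made for illustration, and all fitness values are assumed to be positive.

```python
import random
from itertools import accumulate


def stochastic_universal_sampling(fitness, num_pointers, rng=None):
    """Pick num_pointers items from fitness (item -> positive fitness).

    Hypothetical helper for illustration only, not part of stk.
    """
    rng = rng or random.Random()
    items = list(fitness)
    # Lay the items end to end: cumulative[i] is where item i's segment ends.
    cumulative = list(accumulate(fitness[item] for item in items))
    total = cumulative[-1]
    # Evenly spaced pointers that all share one random offset.
    spacing = total / num_pointers
    start = rng.uniform(0, spacing)
    selected = []
    index = 0
    for k in range(num_pointers):
        pointer = start + k * spacing
        # Advance to the segment that contains this pointer.
        while index < len(items) - 1 and cumulative[index] <= pointer:
            index += 1
        selected.append(items[index])
    return selected


picked = stochastic_universal_sampling(
    {"a": 1.0, "b": 2.0, "c": 7.0},
    num_pointers=10,
    rng=random.Random(0),
)
```

Because the line here has total length 10 and the ten pointers are spaced exactly 1 apart, each item is picked once per unit of fitness: "a" once, "b" twice and "c" seven times. Only the position of the first pointer is random.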
This approach means weaker members of the population are given a greater chance to be chosen than in Roulette selection [1].
References:
Examples
Yielding Single Molecule Batches
Yielding molecules one at a time. For example, if molecules need to be selected for mutation or the next generation.
    import stk

    # Make the selector.
    stochastic_sampling = stk.StochasticUniversalSampling(5)

    population = {
        stk.MoleculeRecord(
            topology_graph=stk.polymer.Linear(
                building_blocks=[
                    stk.BuildingBlock('BrCCBr', stk.BromoFactory()),
                ],
                repeating_unit='A',
                num_repeating_units=2,
            ),
        ): i
        for i in range(100)
    }

    # Select the molecules.
    for selected, in stochastic_sampling.select(population):
        # Do stuff with each selected molecule.
        pass
- Parameters:
num_batches (int | None) – The number of batches to yield. If None, then yielding will continue forever or until the generator is exhausted, whichever comes first.
batch_size (int) – The number of molecules yielded at once.
duplicate_molecules (bool) – If True, the same molecule can be yielded in more than one batch.
duplicate_batches (bool) – If True, the same batch can be yielded more than once.
key_maker (MoleculeKeyMaker) – Used to get the keys of molecules. If two molecules have the same key, they are considered duplicates.
fitness_modifier (Callable[[dict[T, float]], dict[T, float]]) – Takes the population on which select() is called and returns a dict, which maps records in the population to the fitness values the Selector should use.
random_seed (int | Generator | None) – The random seed to use.
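As a sketch of what a fitness_modifier could look like, the function below takes a population mapping and returns a new mapping with adjusted fitness values. The name `shift_to_positive` and the shifting rule are assumptions chosen for illustration; the premise is that segment lengths on the line should be positive, so a population with zero or negative fitness values might be shifted before selection.

```python
def shift_to_positive(population):
    """Return a copy of population with all fitness values above zero.

    Hypothetical fitness modifier for illustration. Keys are left
    untouched, so it works for any hashable record type.
    """
    minimum = min(population.values())
    # Shift only when needed, so the smallest fitness becomes 1.
    offset = 1 - minimum if minimum <= 0 else 0
    return {
        record: fitness + offset
        for record, fitness in population.items()
    }


# Plain strings stand in for molecule records here.
shifted = shift_to_positive({"a": -2.0, "b": 0.0, "c": 3.0})

# Following the signature documented above, it would be passed as:
# stk.StochasticUniversalSampling(
#     num_batches=5,
#     fitness_modifier=shift_to_positive,
# )
```

The modifier returns a new dict rather than mutating its argument, so the fitness values stored in the caller's population are left unchanged.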
Methods
select – Yield batches of molecule records from population.
- select(population, included_batches=None, excluded_batches=None)
Yield batches of molecule records from population.
- Parameters:
population (dict[T, float]) – A collection of molecules from which batches are selected.
included_batches (set[BatchKey] | None) – The identity keys of batches which are allowed to be yielded. If None, all batches can be yielded. If not None, only batches in included_batches will be yielded.
excluded_batches (set[BatchKey] | None) – The identity keys of batches which are not allowed to be yielded. If None, no batch is forbidden from being yielded.
- Yields:
A batch of selected molecule records.
- Return type: