Roulette
- class Roulette(num_batches=None, batch_size=1, duplicate_molecules=True, duplicate_batches=True, key_maker=Inchi(), fitness_modifier=None, random_seed=None)[source]
Bases: Selector
Uses roulette selection to select batches of molecules.
In roulette selection, the probability that a batch is selected is proportional to its fitness. If the total fitness is the sum of all fitness values, the chance that a batch is selected is given by:
    p = batch fitness / total fitness,
where p is the probability of selection and the batch fitness is the sum of the fitness values of all molecules in the batch [1].
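The formula above can be worked through in plain Python. This is a minimal sketch and not the stk implementation: the helper `batch_probabilities` and the molecule names are hypothetical, and it assumes that the candidate batches are all combinations of `batch_size` molecules and that the total fitness is summed over those candidate batches, so that the probabilities add up to 1.

```python
import itertools
import random


def batch_probabilities(fitness_values, batch_size=1):
    """Map each batch (a tuple of molecule names) to its selection probability."""
    # Assumption: every combination of `batch_size` molecules is a candidate batch.
    batches = itertools.combinations(sorted(fitness_values), batch_size)
    # The batch fitness is the sum of the fitness values of its molecules.
    batch_fitness = {
        batch: sum(fitness_values[name] for name in batch)
        for batch in batches
    }
    total_fitness = sum(batch_fitness.values())
    # p = batch fitness / total fitness
    return {
        batch: fitness / total_fitness
        for batch, fitness in batch_fitness.items()
    }


# Hypothetical fitness values for three molecules.
fitness_values = {"a": 1.0, "b": 2.0, "c": 7.0}

single = batch_probabilities(fitness_values)
pairs = batch_probabilities(fitness_values, batch_size=2)

# Draw five batches of two molecules, weighted by their probabilities.
generator = random.Random(4)
candidates = list(pairs)
selected = generator.choices(
    candidates,
    weights=[pairs[batch] for batch in candidates],
    k=5,
)
```

With these values, the single-molecule batch holding "c" has probability 7 / 10, and the pair ("b", "c") has probability 9 / 20, because the three candidate pairs have batch fitness 3, 8 and 9.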
Examples
Yielding Single Molecule Batches
Yield molecules one at a time, for example when molecules need to be selected for mutation or for the next generation.
    import stk

    # Make the selector.
    roulette = stk.Roulette(num_batches=5)

    population = tuple(
        stk.MoleculeRecord(
            topology_graph=stk.polymer.Linear(
                building_blocks=(
                    stk.BuildingBlock(
                        smiles='BrCCBr',
                        functional_groups=[stk.BromoFactory()],
                    ),
                ),
                repeating_unit='A',
                num_repeating_units=2,
            ),
        ).with_fitness_value(i)
        for i in range(100)
    )

    # Select the molecules.
    for selected, in roulette.select(population):
        # Do stuff with each selected molecule.
        pass
Yielding Batches Holding Multiple Molecules
Yield multiple molecules at once, for example when molecules need to be selected for crossover.
    import stk

    # Make the selector.
    roulette = stk.Roulette(num_batches=5, batch_size=2)

    population = tuple(
        stk.MoleculeRecord(
            topology_graph=stk.polymer.Linear(
                building_blocks=(
                    stk.BuildingBlock(
                        smiles='BrCCBr',
                        functional_groups=[stk.BromoFactory()],
                    ),
                ),
                repeating_unit='A',
                num_repeating_units=2,
            ),
        ).with_fitness_value(i)
        for i in range(100)
    )

    # Select the molecules.
    for selected1, selected2 in roulette.select(population):
        # Do stuff to the molecules.
        pass
Methods
select(population[, included_batches, ...]) – Yield batches of molecule records from population.
- __init__(num_batches=None, batch_size=1, duplicate_molecules=True, duplicate_batches=True, key_maker=Inchi(), fitness_modifier=None, random_seed=None)[source]
Initialize a Roulette instance.
- Parameters:
    num_batches (int, optional) – The number of batches to yield. If None, yielding will continue forever or until the generator is exhausted, whichever comes first.
    batch_size (int, optional) – The number of molecules yielded at once.
    duplicate_molecules (bool, optional) – If True, the same molecule can be yielded in more than one batch.
    duplicate_batches (bool, optional) – If True, the same batch can be yielded more than once.
    key_maker (MoleculeKeyMaker, optional) – Used to get the keys of molecules. If two molecules have the same key, they are considered duplicates.
    fitness_modifier (callable, optional) – Takes the population on which select() is called and returns a dict, which maps records in the population to the fitness values the Selector should use. If None, the regular fitness values of the records are used.
    random_seed (int, optional) – The random seed to use.
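As an illustration of the fitness_modifier contract described above (a callable that takes the population and returns a dict mapping records to fitness values), here is a minimal sketch. It runs without stk: `StandInRecord` is a hypothetical stand-in for a molecule record, `shift_to_positive` is a hypothetical modifier, and the `get_fitness_value()` accessor is assumed to mirror the one on real records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandInRecord:
    """Hypothetical stand-in for a molecule record; not part of stk."""

    name: str
    fitness_value: float

    def get_fitness_value(self):
        # Assumed to mirror the accessor on a real molecule record.
        return self.fitness_value


def shift_to_positive(population):
    """Return a dict mapping each record to a strictly positive fitness."""
    population = tuple(population)
    minimum = min(record.get_fitness_value() for record in population)
    # Shift every value so that the smallest one becomes 1.
    return {
        record: record.get_fitness_value() - minimum + 1
        for record in population
    }


population = (
    StandInRecord("a", -3.0),
    StandInRecord("b", 0.0),
    StandInRecord("c", 2.0),
)
modified = shift_to_positive(population)
```

A modifier of this kind may be useful when raw fitness values can be zero or negative, since selection probabilities derived from them would otherwise be ill-defined. With stk installed, such a callable would presumably be passed as `stk.Roulette(fitness_modifier=shift_to_positive)`.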
- select(population, included_batches=None, excluded_batches=None)
Yield batches of molecule records from population.
- Parameters:
    population (tuple of MoleculeRecord) – A collection of molecules from which batches are selected.
    included_batches (set, optional) – The identity keys of batches which are allowed to be yielded. If None, all batches can be yielded. If not None, only batches in included_batches will be yielded.
    excluded_batches (set, optional) – The identity keys of batches which are not allowed to be yielded. If None, no batch is forbidden from being yielded.
- Yields:
    Batch of MoleculeRecord – A batch of selected molecule records.
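The generator behaviour described above (yielding forever when num_batches is None, or stopping once the generator is exhausted) can be illustrated with a simplified stand-in. This sketch is not the stk implementation: `roulette_select` is a hypothetical function that yields single molecule names only, and its `duplicates` flag is an assumed simplification of duplicate_molecules and duplicate_batches.

```python
import itertools
import random


def roulette_select(fitness_values, num_batches=None, duplicates=True,
                    random_seed=None):
    """Yield names with probability proportional to fitness.

    Simplified, hypothetical stand-in: single-molecule batches only.
    """
    generator = random.Random(random_seed)
    remaining = dict(fitness_values)
    yielded = 0
    # Stop when the cap is reached or nothing is left to select.
    while remaining and (num_batches is None or yielded < num_batches):
        names = list(remaining)
        weights = [remaining[name] for name in names]
        (choice,) = generator.choices(names, weights=weights, k=1)
        if not duplicates:
            # Without duplicates, a selected name cannot be yielded again.
            del remaining[choice]
        yielded += 1
        yield choice


# Hypothetical fitness values for three molecules.
fitness_values = {"a": 1.0, "b": 2.0, "c": 7.0}

# num_batches=None with duplicates allowed never stops, so bound it explicitly.
first_ten = list(itertools.islice(roulette_select(fitness_values, random_seed=1), 10))

# Without duplicates the generator is exhausted after every name is yielded.
unique = list(roulette_select(fitness_values, duplicates=False, random_seed=1))

# num_batches caps the number of yields.
capped = list(roulette_select(fitness_values, num_batches=2, random_seed=1))
```

When iterating over an unbounded selection, `itertools.islice` or an explicit `break` keeps the loop finite. Fixing the seed makes the sequence of selections reproducible across runs.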