stk.StochasticUniversalSampling

class stk.StochasticUniversalSampling(num_batches=None, batch_size=1, duplicate_molecules=True, duplicate_batches=True, key_maker=Inchi(), fitness_modifier=<function StochasticUniversalSampling.<lambda>>, random_seed=None)

Bases: Selector[T]

Yields batches of molecules through stochastic universal sampling.

Stochastic universal sampling lays out batches along a line, with each batch taking up length proportional to its fitness. It then creates a set of evenly spaced pointers to different points on the line, each of which is occupied by a batch. Batches which are pointed to are yielded.

This approach means weaker members of the population are given a greater chance to be chosen than in Roulette selection [1].
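The procedure described above can be illustrated with a minimal, library-independent sketch. It is not the stk implementation: the function name `stochastic_universal_sampling` and its arguments are hypothetical, items stand in for batches, and fitness values are assumed to be positive.

```python
import random


def stochastic_universal_sampling(fitness, num_pointers, rng=None):
    """Select ``num_pointers`` items using evenly spaced pointers.

    ``fitness`` maps each item to a positive fitness value.
    (Hypothetical helper for illustration, not part of stk.)
    """
    rng = rng or random.Random()
    items = list(fitness)
    total = sum(fitness.values())
    spacing = total / num_pointers
    # A single random offset fixes the position of every pointer.
    start = rng.random() * spacing

    selected = []
    index = 0
    # Right-hand edge of the line segment owned by items[index].
    upper = fitness[items[0]]
    for i in range(num_pointers):
        pointer = start + i * spacing
        # Walk along the line to the segment containing the pointer.
        while pointer >= upper and index < len(items) - 1:
            index += 1
            upper += fitness[items[index]]
        selected.append(items[index])
    return selected


picks = stochastic_universal_sampling(
    fitness={"a": 1.0, "b": 1.0, "c": 2.0},
    num_pointers=4,
    rng=random.Random(0),
)
```

Because only one random number is drawn, the pointers cannot cluster on a single high-fitness item, which is why low-fitness items are less likely to be missed than under repeated independent roulette draws.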

References:

Examples

Yielding Single Molecule Batches

Yield molecules one at a time, for example when molecules need to be selected for mutation or for the next generation.

import stk

# Make the selector.
stochastic_sampling = stk.StochasticUniversalSampling(5)

population = {
    stk.MoleculeRecord(
        topology_graph=stk.polymer.Linear(
            building_blocks=[
                stk.BuildingBlock('BrCCBr', stk.BromoFactory()),
            ],
            repeating_unit='A',
            num_repeating_units=2,
        ),
    ): i
    for i in range(100)
}

# Select the molecules.
for selected, in stochastic_sampling.select(population):
    # Do stuff with each selected molecule.
    pass

Parameters:
  • num_batches (int | None) – The number of batches to yield. If None, yielding continues forever or until the generator is exhausted, whichever comes first.

  • batch_size (int) – The number of molecules yielded at once.

  • duplicate_molecules (bool) – If True the same molecule can be yielded in more than one batch.

  • duplicate_batches (bool) – If True the same batch can be yielded more than once.

  • key_maker (MoleculeKeyMaker) – Used to get the keys of molecules. If two molecules have the same key, they are considered duplicates.

  • fitness_modifier (Callable[[dict[T, float]], dict[T, float]]) – Takes the population on which select() is called and returns a dict, which maps records in the population to the fitness values the Selector should use.

  • random_seed (int | Generator | None) – The random seed to use.
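As an illustration of the fitness_modifier signature described above, the following sketch defines a function that takes a population dict and returns a new dict with adjusted fitness values. The name `shift_to_positive` is hypothetical, and the choice to shift values so the smallest becomes 1 is an assumption made for this example (so that every record would occupy a positive length on the line), not a documented stk behavior.

```python
def shift_to_positive(population):
    """Return a copy of ``population`` with fitness values shifted
    so that the smallest value becomes 1.

    Hypothetical example of a callable with the shape expected by
    ``fitness_modifier``: dict in, dict out, same keys.
    """
    minimum = min(population.values())
    return {
        record: fitness - minimum + 1
        for record, fitness in population.items()
    }


# Plain strings stand in for molecule records here.
modified = shift_to_positive({"a": -2.0, "b": 0.0, "c": 3.0})
```

Following the class signature, such a function would be passed as `fitness_modifier=shift_to_positive` when the selector is created.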

Methods

select

Yield batches of molecule records from population.

select(population, included_batches=None, excluded_batches=None)

Yield batches of molecule records from population.

Parameters:
  • population (dict[T, float]) – A mapping of molecule records to their fitness values, from which batches are selected.

  • included_batches (set[BatchKey] | None) – The identity keys of batches which are allowed to be yielded. If None, all batches can be yielded. If not None, only batches in included_batches will be yielded.

  • excluded_batches (set[BatchKey] | None) – The identity keys of batches which are not allowed to be yielded. If None, no batch is forbidden from being yielded.

Yields:

A batch of selected molecule records.

Return type:

Iterator[Batch[T]]