Batch

class Batch(records, fitness_values, key_maker)

Bases: object

Represents a batch of molecule records.

Batches can be compared with one another; the comparison is based on their fitness values. Batches can also be iterated over, which yields every molecule record in the batch.

Examples

Sorting Batches by Fitness Value

Sorting a collection of batches orders them by fitness value, from smallest to largest.

import stk

record1 = stk.MoleculeRecord(
    topology_graph=stk.polymer.Linear(
        building_blocks=(
            stk.BuildingBlock(
                smiles='BrCCBr',
                functional_groups=[stk.BromoFactory()],
            ),
        ),
        repeating_unit='A',
        num_repeating_units=2,
    ),
)
record2 = stk.MoleculeRecord(
    topology_graph=stk.polymer.Linear(
        building_blocks=(
            stk.BuildingBlock(
                smiles='BrCCBr',
                functional_groups=[stk.BromoFactory()],
            ),
        ),
        repeating_unit='A',
        num_repeating_units=2,
    ),
)
record3 = stk.MoleculeRecord(
    topology_graph=stk.polymer.Linear(
        building_blocks=(
            stk.BuildingBlock(
                smiles='BrCCBr',
                functional_groups=[stk.BromoFactory()],
            ),
        ),
        repeating_unit='A',
        num_repeating_units=2,
    ),
)

batches = (
    stk.Batch(
        records=(record1, ),
        fitness_values={record1: 1},
        key_maker=stk.Inchi(),
    ),
    stk.Batch(
        records=(record2, ),
        fitness_values={record2: 2},
        key_maker=stk.Inchi(),
    ),
    stk.Batch(
        records=(record3, ),
        fitness_values={record3: 3},
        key_maker=stk.Inchi(),
    ),
)
sorted_batches = sorted(batches)
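
The ordering that sorted() relies on can be illustrated without stk. The following is a minimal sketch, not stk's implementation: ToyBatch is a hypothetical stand-in whose less-than comparison looks only at a stored fitness value, which is the behaviour this page describes for Batch.

```python
class ToyBatch:
    """Hypothetical stand-in for a batch; not part of stk."""

    def __init__(self, name, fitness_value):
        self.name = name
        self._fitness_value = fitness_value

    def get_fitness_value(self):
        return self._fitness_value

    def __lt__(self, other):
        # sorted() only needs "<" to order objects.
        return self._fitness_value < other._fitness_value


toy_batches = [ToyBatch('c', 3), ToyBatch('a', 1), ToyBatch('b', 2)]

# Default order: smallest fitness value first.
ascending_names = [batch.name for batch in sorted(toy_batches)]

# reverse=True puts the fittest batch first.
descending_names = [
    batch.name for batch in sorted(toy_batches, reverse=True)
]
```

If the fittest batch is needed first, passing reverse=True to sorted() avoids reversing the result by hand.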

Comparing Batches by Fitness Value

Comparison is also based on fitness value.

import stk

record1 = stk.MoleculeRecord(
    topology_graph=stk.polymer.Linear(
        building_blocks=(
            stk.BuildingBlock(
                smiles='BrCCBr',
                functional_groups=[stk.BromoFactory()],
            ),
        ),
        repeating_unit='A',
        num_repeating_units=2,
    ),
)
batch1 = stk.Batch(
    records=(record1, ),
    fitness_values={record1: 1},
    key_maker=stk.Inchi(),
)

record2 = stk.MoleculeRecord(
    topology_graph=stk.polymer.Linear(
        building_blocks=(
            stk.BuildingBlock(
                smiles='BrCCBr',
                functional_groups=[stk.BromoFactory()],
            ),
        ),
        repeating_unit='A',
        num_repeating_units=2,
    ),
)
batch2 = stk.Batch(
    records=(record2, ),
    fitness_values={record2: 2},
    key_maker=stk.Inchi(),
)

if batch1 < batch2:
    print('batch1 has a smaller fitness value than batch2.')

Iterating Through Molecule Records in a Batch

Batches can be iterated over to get the molecule records in the batch.

import stk

record1 = stk.MoleculeRecord(
    topology_graph=stk.polymer.Linear(
        building_blocks=(
            stk.BuildingBlock(
                smiles='BrCCBr',
                functional_groups=[stk.BromoFactory()],
            ),
        ),
        repeating_unit='A',
        num_repeating_units=2,
    ),
)
record2 = stk.MoleculeRecord(
    topology_graph=stk.polymer.Linear(
        building_blocks=(
            stk.BuildingBlock(
                smiles='BrCCBr',
                functional_groups=[stk.BromoFactory()],
            ),
        ),
        repeating_unit='A',
        num_repeating_units=2,
    ),
)
batch = stk.Batch(
    records=(record1, record2),
    fitness_values={record1: 1, record2: 2},
    key_maker=stk.Inchi(),
)
for record in batch:
    # Do stuff with record.
    pass

Methods

get_fitness_value()

Get the fitness value of the batch.

get_identity_key()

Get the identity key of the batch.

get_size()

Get the number of molecules in the batch.

__init__(records, fitness_values, key_maker)

Initialize a Batch.

Parameters
  • records (tuple of MoleculeRecord) – The molecule records which are part of the batch.

  • fitness_values (dict) – Maps each MoleculeRecord in records to the fitness value which should be used for it.

  • key_maker (MoleculeKeyMaker) – Used to make keys for the molecules in the batch. These keys determine the identity key of the batch: two batches that contain the same molecule keys, each the same number of times, have the same identity key.
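
The relationship between records and fitness_values can be sketched as follows. This page does not state how the per-record values are combined into the single fitness value of the batch, so the sum used here is an assumption made for illustration; toy_batch_fitness and the string records are hypothetical stand-ins, not stk objects.

```python
def toy_batch_fitness(records, fitness_values):
    """Combine per-record fitness values into one batch value.

    Assumption: the values are summed. Check the stk source before
    relying on this rule.
    """
    return sum(fitness_values[record] for record in records)


# Strings stand in for MoleculeRecord instances.
records = ('record1', 'record2')

# The mapping may cover more records than the batch holds; only the
# entries for records in the batch are looked up.
fitness_values = {'record1': 1, 'record2': 2, 'record3': 10}

batch_fitness = toy_batch_fitness(records, fitness_values)
```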

get_fitness_value()

Get the fitness value of the batch.

Returns

The fitness value.

Return type

float

get_identity_key()

Get the identity key of the batch.

Two batches that hold the same molecules, each the same number of times, have the same identity key.

Returns

A hashable object that can be used to check whether two batches have the same identity.

Return type

object
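
One way to build a key with this property is to count how often each molecule key occurs and freeze the counts. The sketch below is an assumption for illustration, not necessarily how stk computes the key; toy_identity_key and the single-letter molecule keys are hypothetical.

```python
from collections import Counter


def toy_identity_key(molecule_keys):
    """Return a hashable key that ignores order but keeps counts."""
    return frozenset(Counter(molecule_keys).items())


key1 = toy_identity_key(['A', 'B', 'A'])
key2 = toy_identity_key(['B', 'A', 'A'])  # Same keys and counts, new order.
key3 = toy_identity_key(['A', 'B'])       # 'A' occurs only once.

same_identity = key1 == key2
different_identity = key1 != key3

# Because the key is hashable, it can be stored in a set to detect
# batches that have already been seen.
seen = {key1}
already_seen = key2 in seen
```

A frozenset of (key, count) pairs is used because a plain Counter is not hashable and a tuple of keys would depend on the order of the records.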

get_size()

Get the number of molecules in the batch.

Returns

The number of molecules in the batch.

Return type

int