Genetic Recombination
- class GeneticRecombination(get_gene, name='GeneticRecombination')
Bases: MoleculeCrosser
Recombine building blocks using biological systems as a model.
Overall, this crosser mimics how animals and plants inherit DNA from their parents, but generalized to work with any number of parents. First, it is worth discussing some terminology. A gene is the smallest packet of genetic information. In animals, each gene can have multiple alleles. For example, there is a gene for hair color, and individual alleles for black, red, brown, etc. hair. This means that every person has a gene for hair color, but a person with black hair will have the black hair allele and a person with red hair will have the red hair allele. When two parents produce an offspring, the offspring will have a hair color gene and will inherit the allele of one of the parents at random. Therefore, if you have two parents, one with black hair and one with red hair, the offspring will have either black or red hair, depending on which allele they inherit.
In an stk ConstructedMolecule, each building block represents an allele. The question is, which gene is each building block an allele of? To answer that, let’s first construct a couple of building block molecules

    import stk

    bb1 = stk.BuildingBlock(
        smiles='NCC(N)CN',
        functional_groups=[stk.PrimaryAminoFactory()],
    )
    bb2 = stk.BuildingBlock('O=CCC=O', [stk.AldehydeFactory()])
    bb3 = stk.BuildingBlock(
        smiles='O=CCNC(C=O)C=O',
        functional_groups=[stk.AldehydeFactory()],
    )
    bb4 = stk.BuildingBlock(
        smiles='NCOCN',
        functional_groups=[stk.PrimaryAminoFactory()],
    )

We can define a function which analyzes a building block molecule and returns the gene it belongs to, for example

    def get_gene(building_block):
        fg, = building_block.get_functional_groups(0)
        return type(fg)

Here, we can see that the gene to which each building block molecule belongs is given by the class of its first functional group. Therefore there is a PrimaryAmino gene, which has two alleles, bb1 and bb4, and there is an Aldehyde gene, which has two alleles, bb2 and bb3.

Alternatively, we could have defined a function such as

    def get_gene2(building_block):
        return building_block.get_num_functional_groups()

Now we end up with a gene called 3, which has two alleles, bb1 and bb3, and a second gene called 2, which has the alleles bb2 and bb4.

To produce offspring molecules, this class categorizes each building block of the parent molecules into genes using the get_gene parameter. Then, to generate a single offspring, it picks a building block for every gene. The picked building blocks are used to construct the offspring. The topology graph of the offspring is one of the parents'. Because building blocks are pooled by gene rather than paired parent by parent, this approach works with any number of parents.
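
The gene and allele bookkeeping described above can be sketched in plain Python, without stk. This is only an illustration under stated assumptions, not the library's implementation: building blocks are replaced by hypothetical (label, functional group kind) tuples, the helper names group_alleles and recombine are invented for this sketch, topology graphs are ignored, and every combination of one allele per gene is enumerated.

```python
import itertools
from collections import defaultdict


def group_alleles(parents, get_gene):
    """Map each gene to the distinct alleles found across all parents."""
    genes = defaultdict(list)
    for parent in parents:
        for block in parent:
            alleles = genes[get_gene(block)]
            if block not in alleles:
                alleles.append(block)
    return genes


def recombine(parents, get_gene):
    """Yield every combination that takes one allele from each gene."""
    genes = group_alleles(parents, get_gene)
    yield from itertools.product(*genes.values())


# Hypothetical stand-ins for building blocks: (label, functional group kind).
parent1 = (("bb1", "amine"), ("bb2", "aldehyde"))
parent2 = (("bb3", "amine"), ("bb4", "aldehyde"))

# The gene of a stand-in block is its functional group kind.
offspring = list(
    recombine((parent1, parent2), get_gene=lambda block: block[1])
)
```

With two genes of two alleles each, the sketch yields four combinations, including the two that reproduce the parents. Adding a third parent only adds alleles to the existing genes, which is why the number of parents is not restricted.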
Examples
Crossing Constructed Molecules
Note that any number of parents can be used for the crossover

    import stk

    # Create the molecule records which will be crossed.
    bb1 = stk.BuildingBlock('NCCN', [stk.PrimaryAminoFactory()])
    bb2 = stk.BuildingBlock('O=CCCCC=O', [stk.AldehydeFactory()])
    graph1 = stk.polymer.Linear((bb1, bb2), 'AB', 2)
    polymer1 = stk.ConstructedMolecule(graph1)
    record1 = stk.MoleculeRecord(graph1)

    bb3 = stk.BuildingBlock('NCCCN', [stk.PrimaryAminoFactory()])
    bb4 = stk.BuildingBlock(
        smiles='O=C[Si]CCC=O',
        functional_groups=[stk.AldehydeFactory()],
    )
    graph2 = stk.polymer.Linear((bb3, bb4), 'AB', 2)
    polymer2 = stk.ConstructedMolecule(graph2)
    record2 = stk.MoleculeRecord(graph2)

    # Create the crosser.
    def get_functional_group_type(building_block):
        fg, = building_block.get_functional_groups(0)
        return type(fg)

    recombination = stk.GeneticRecombination(
        get_gene=get_functional_group_type,
    )

    # Get the offspring molecules.
    cohort1 = tuple(recombination.cross(
        records=(record1, record2),
    ))

Methods
cross(records) – Cross records.
- __init__(get_gene, name='GeneticRecombination')
Initialize a GeneticRecombination instance.
- Parameters:
  get_gene (callable) – A callable, which takes a BuildingBlock object and returns its gene. To produce an offspring, one of the building blocks from each gene is picked.
  name (str, optional) – A name to identify the crosser instance.
- cross(records)
Cross records.
- Parameters:
  records (iterable of MoleculeRecord) – The molecule records on which a crossover operation is performed.
- Yields:
  CrossoverRecord – A record of a crossover operation.