stk.ValueMongoDb

class stk.ValueMongoDb(mongo_client, collection, database='stk', key_makers=(InchiKey(),), put_lru_cache_size=128, get_lru_cache_size=128, indices=('InChIKey',))

Bases: ValueDatabase

Use MongoDB to store and retrieve molecular property values.

Examples

See also the examples in ValueDatabase.

Storing Molecular Properties in a Database

To store property values in a database, create a ValueMongoDb, write values with put() and read them back with get().

import stk
import pymongo

# Connect to MongoDB. This example connects to a local
# instance, but MongoClient() can also connect to a remote
# one; see the pymongo documentation for details.
client = pymongo.MongoClient()
db = stk.ValueMongoDb(
    mongo_client=client,
    collection='atom_counts',
)

molecule = stk.BuildingBlock('BrCCBr')
# Add the value to the database.
db.put(molecule, molecule.get_num_atoms())
# Retrieve the value from the database.
num_atoms = db.get(molecule)

# Works with constructed molecules too.
polymer = stk.ConstructedMolecule(
    topology_graph=stk.polymer.Linear(
        building_blocks=(
            stk.BuildingBlock('BrCCBr', [stk.BromoFactory()]),
        ),
        repeating_unit='A',
        num_repeating_units=2,
    ),
)
db.put(polymer, polymer.get_num_atoms())
num_polymer_atoms = db.get(polymer)

Initialize a ValueMongoDb instance.

Parameters:
  • mongo_client (pymongo.MongoClient) – The database client.

  • collection (str) – The name of the MongoDB collection used for storing the property values.

  • database (str, optional) – The name of the MongoDB database used for storing the property values.

  • key_makers (tuple of MoleculeKeyMaker, optional) – Used to make the keys of molecules, with which the values are associated. If two molecules have the same key, they will return the same value from the database.

  • put_lru_cache_size (int, optional) – A RAM-based least recently used (LRU) cache is used to avoid writing to the database repeatedly. This sets the number of values that fit into the cache. If None, the cache size will be unlimited.

  • get_lru_cache_size (int, optional) – A RAM-based least recently used (LRU) cache is used to avoid reading from the database repeatedly. This sets the number of values that fit into the cache. If None, the cache size will be unlimited.

  • indices (tuple of str, optional) – The names of the molecule keys on which an index should be created to minimize lookup time.
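The effect of the two cache-size parameters can be pictured with Python's functools.lru_cache. The following is a minimal sketch under stated assumptions, not stk's implementation: CountingBackend and make_cached_reader are hypothetical names, and a plain dictionary stands in for the MongoDB collection. It shows a repeated read being served from RAM instead of the backend, and a size of None giving an unbounded cache.

```python
from functools import lru_cache


class CountingBackend:
    """Hypothetical stand-in for a slow database; counts every read."""

    def __init__(self, data):
        self._data = dict(data)
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self._data[key]


def make_cached_reader(backend, cache_size=128):
    # Wrap the backend read in an LRU cache. With cache_size=None,
    # functools.lru_cache never evicts entries.
    return lru_cache(maxsize=cache_size)(backend.read)


backend = CountingBackend({'key-a': 8, 'key-b': 14})
cached_read = make_cached_reader(backend, cache_size=128)

first = cached_read('key-a')   # goes to the backend
second = cached_read('key-a')  # answered from the cache

# An unbounded cache, analogous to passing None as the cache size.
unbounded_read = make_cached_reader(backend, cache_size=None)
```

A larger cache trades memory for fewer round trips to the database. In this sketch, entries are evicted in least-recently-used order once cache_size is exceeded.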

Methods

  • get(molecule) – Get the stored value for molecule.

  • put(molecule, value) – Put a value into the database.

get(molecule)

Get the stored value for molecule.

Parameters:

molecule (Molecule) – The molecule whose value is to be retrieved from the database.

Returns:

The value associated with molecule.

Return type:

object

Raises:

KeyError – If molecule is not found in the database.
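Because get() raises KeyError for a molecule that has not been stored, a common pattern is to compute the value on a miss and store it for next time. The sketch below assumes only the get/put interface documented here. InMemoryValueDb, get_or_compute and count_characters are hypothetical names introduced for illustration, and plain strings stand in for molecules so that the example runs without a MongoDB server.

```python
class InMemoryValueDb:
    """Hypothetical stand-in exposing the same get/put interface."""

    def __init__(self):
        self._values = {}

    def get(self, molecule):
        # Raises KeyError when the molecule has not been stored.
        return self._values[molecule]

    def put(self, molecule, value):
        self._values[molecule] = value


def get_or_compute(db, molecule, compute):
    """Return the stored value, computing and storing it on a miss."""
    try:
        return db.get(molecule)
    except KeyError:
        value = compute(molecule)
        db.put(molecule, value)
        return value


calls = []


def count_characters(molecule):
    # Toy "property" calculation; records each call it receives.
    calls.append(molecule)
    return len(molecule)


db = InMemoryValueDb()
first = get_or_compute(db, 'BrCCBr', count_characters)   # computed
second = get_or_compute(db, 'BrCCBr', count_characters)  # retrieved
```

Since the helper relies only on get() and put(), the same function should work with any object that provides those two methods and raises KeyError on a miss.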

put(molecule, value)

Put a value into the database.

Parameters:
  • molecule (Molecule) – The molecule which is associated with the value.

  • value (object) – Some value associated with molecule.

Returns:

None

Return type:

NoneType
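As the key_makers parameter describes, a value is associated with the keys of a molecule, so two molecules with the same key return the same value. The toy model below illustrates that behaviour only. It is not how stk or MongoDB store documents: KeyedValueStore is a hypothetical class, strings stand in for molecules, and str.lower stands in for a real key maker such as InchiKey().

```python
class KeyedValueStore:
    """Toy in-memory model of values stored under molecule keys."""

    def __init__(self, key_makers):
        # Maps a key name to a function that computes the key.
        self._key_makers = dict(key_makers)
        self._values = {}

    def _keys(self, molecule):
        # Build a hashable (name, key) record for the molecule.
        return tuple(
            (name, make_key(molecule))
            for name, make_key in self._key_makers.items()
        )

    def put(self, molecule, value):
        self._values[self._keys(molecule)] = value

    def get(self, molecule):
        # Raises KeyError if no value is stored under these keys.
        return self._values[self._keys(molecule)]


store = KeyedValueStore({'lowercase': str.lower})
store.put('BrCCBr', 8)

# A different object with the same key retrieves the same value.
same_key_value = store.get('brccbr')
```

The choice of key maker therefore decides which molecules count as "the same" for storage purposes.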