Value MongoDB

class ValueMongoDb(mongo_client, collection, database='stk', key_makers=(InchiKey(), ), put_lru_cache_size=128, get_lru_cache_size=128, indices=('InChIKey', ))

Bases: stk.databases.value.ValueDatabase

Use MongoDB to store and retrieve molecular property values.

Examples

See also examples in ValueDatabase.

Storing Molecular Properties in a Database

Suppose you want to store molecular property values in a database and read them back later.

import stk
import pymongo

# Connect to a MongoDB server. This example connects to a local
# server, but MongoClient() can also connect to a remote one.
# See the pymongo documentation for how to do that.
client = pymongo.MongoClient()
db = stk.ValueMongoDb(
    mongo_client=client,
    collection='atom_counts',
)

molecule = stk.BuildingBlock('BrCCBr')
# Add the value to the database.
db.put(molecule, molecule.get_num_atoms())
# Retrieve the value from the database.
num_atoms = db.get(molecule)

# Works with constructed molecules too.
polymer = stk.ConstructedMolecule(
    topology_graph=stk.polymer.Linear(
        building_blocks=(
            stk.BuildingBlock('BrCCBr', [stk.BromoFactory()]),
        ),
        repeating_unit='A',
        num_repeating_units=2,
    ),
)
db.put(polymer, polymer.get_num_atoms())
num_polymer_atoms = db.get(polymer)

Methods

get(molecule)

Get the stored value for molecule.

put(molecule, value)

Put a value into the database.

__init__(mongo_client, collection, database='stk', key_makers=(InchiKey(), ), put_lru_cache_size=128, get_lru_cache_size=128, indices=('InChIKey', ))

Initialize a ValueMongoDb instance.

Parameters
  • mongo_client (pymongo.MongoClient) – The database client.

  • collection (str) – The name of the MongoDB collection used for storing the property values.

  • database (str, optional) – The name of the MongoDB database used for storing the property values.

  • key_makers (tuple of MoleculeKeyMaker, optional) – Used to make the molecule keys under which the values are stored. If two molecules have the same key, the database returns the same value for both.

  • put_lru_cache_size (int, optional) – A RAM-based least recently used cache is used to avoid writing to the database repeatedly. This sets the number of values which fit into the LRU cache. If None, the cache size will be unlimited.

  • get_lru_cache_size (int, optional) – A RAM-based least recently used cache is used to avoid reading from the database repeatedly. This sets the number of values which fit into the LRU cache. If None, the cache size will be unlimited.

  • indices (tuple of str, optional) – The names of the molecule keys on which an index should be created, in order to minimize lookup time.
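The effect of get_lru_cache_size can be illustrated without a running MongoDB. The following is a minimal sketch, not the stk implementation: it assumes the cache behaves like Python's functools.lru_cache, and backing_store, stats, read_from_store and cached_get are hypothetical names introduced only for this illustration.

```python
from functools import lru_cache

# Hypothetical stand-in for a database collection (not part of stk).
backing_store = {"key-a": 8, "key-b": 14}
# Counts how often the stand-in "database" is actually read.
stats = {"reads": 0}


def read_from_store(key):
    """Simulate one round trip to the database."""
    stats["reads"] += 1
    return backing_store[key]


# maxsize plays the role of get_lru_cache_size here. Passing
# maxsize=None would make the cache unbounded.
@lru_cache(maxsize=128)
def cached_get(key):
    return read_from_store(key)


first = cached_get("key-a")   # read from the backing store
second = cached_get("key-a")  # served from the in-memory cache
```

After both calls, stats["reads"] is 1, because the second lookup never reaches the backing store. A larger cache trades memory for fewer round trips.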

get(molecule)

Get the stored value for molecule.

Parameters

molecule (Molecule) – The molecule whose value is to be retrieved from the database.

Returns

The value associated with molecule.

Return type

object

Raises

KeyError – If molecule is not found in the database.
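Because get() raises KeyError for a molecule with no stored value, callers that want a fallback need to catch it. The sketch below shows one way to do that. InMemoryValueDb and get_or_default are hypothetical names, not part of stk, and plain SMILES strings stand in for molecules so that the example runs without a database.

```python
class InMemoryValueDb:
    """Hypothetical stand-in with the same get/put shape."""

    def __init__(self):
        self._values = {}

    def put(self, molecule, value):
        self._values[molecule] = value

    def get(self, molecule):
        # A dict lookup raises KeyError for a missing entry,
        # mirroring the documented behaviour of get().
        return self._values[molecule]


def get_or_default(db, molecule, default=None):
    """Return the stored value, or default if nothing is stored."""
    try:
        return db.get(molecule)
    except KeyError:
        return default


db = InMemoryValueDb()
db.put("BrCCBr", 8)
known = get_or_default(db, "BrCCBr")
missing = get_or_default(db, "CCO", default=-1)
```

The same helper should work with any object whose get() raises KeyError on a miss, which is the contract documented above.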

put(molecule, value)

Put a value into the database.

Parameters
  • molecule (Molecule) – The molecule which is associated with the value.

  • value (object) – Some value associated with molecule.

Returns

None

Return type

NoneType
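Values are associated with molecule keys rather than with Python objects, so two distinct objects that produce the same key share one stored value. The following toy sketch illustrates the idea under stated assumptions: KeyedValueStore, FakeMolecule and the key strings are hypothetical and only mimic the behaviour described for key_makers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeMolecule:
    """Hypothetical molecule stand-in carrying a precomputed key."""

    label: str
    key: str


class KeyedValueStore:
    """Hypothetical in-memory store that files values under a key."""

    def __init__(self, key_maker):
        self._key_maker = key_maker
        self._values = {}

    def put(self, molecule, value):
        self._values[self._key_maker(molecule)] = value

    def get(self, molecule):
        return self._values[self._key_maker(molecule)]


store = KeyedValueStore(key_maker=lambda molecule: molecule.key)

# Two different objects that map to the same key.
first = FakeMolecule(label="built from SMILES", key="KEY-1")
second = FakeMolecule(label="loaded from file", key="KEY-1")

store.put(first, 8)
shared_value = store.get(second)
```

Here shared_value is 8, even though the value was stored using a different object.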