In-silico screening of drug-like molecules.


Potencyscreen is a small library for predicting molecular properties from string descriptors such as SMILES or SELFIES. It provides classical ML models based on hand-crafted fingerprint functions as well as deep learning models, and builds on the molfeat library for featurization.

Getting Started

The following code shows how to optimize the hyperparameters of a random forest model and evaluate its performance on the test set.

import datamol as dm
from molfeat.trans.fp import FPVecTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

from potencyscreen import trainers

df = dm.data.freesolv()  # FreeSolv example dataset shipped with datamol

# define ML model and hyperparameters
param_grid = {
    'feat__kind': ['fcfp:6', 'ecfp:6'],
    'feat__length': [1024],
    'rf__n_estimators': [100, 500]
}
pipe = Pipeline([('feat', FPVecTransformer(kind='rdkit')),
                 ('scaler', StandardScaler()),
                 ('rf', RandomForestRegressor())])
grid_search = GridSearchCV(pipe, param_grid=param_grid)

# optimize hyperparameters and evaluate performance metrics on test set
sk_trainer = trainers.SklearnTrainer(
    grid_search, df['smiles'], df['expt'], thresh=0.)
_ = sk_trainer.train()
_, _ = sk_trainer.test_metrics()
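
The keys of param_grid follow scikit-learn's `<step>__<parameter>` convention, so 'feat__kind' sets the kind argument of the pipeline step named 'feat'. GridSearchCV then evaluates every combination of the listed values. As a rough illustration of what that means for this grid, the following standard-library sketch enumerates the combinations. It is not the scikit-learn implementation, and expand_grid is a hypothetical helper introduced only for this example.

```python
from itertools import product

# Same grid as in the example above.
param_grid = {
    'feat__kind': ['fcfp:6', 'ecfp:6'],
    'feat__length': [1024],
    'rf__n_estimators': [100, 500],
}


def expand_grid(grid):
    """Return one dict per combination of the listed values (hypothetical helper)."""
    keys = sorted(grid)
    value_lists = [grid[key] for key in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


candidates = expand_grid(param_grid)
for params in candidates:
    print(params)
```

This grid yields 2 x 1 x 2 = 4 candidates. Each candidate is refit once per cross-validation fold, so adding values to the grid quickly increases training time.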

Installation

To use potencyscreen and run all examples, install the package in editable mode from the repository root:

pip install -e .

Run example

To run the example Jupyter notebook, execute

jupyter notebook examples/EGFR/

and open the file ‘training_inference.ipynb’ in the browser window.

Requirements

To run the example, the following packages are required:

    datamol
    rdkit
    molfeat
    python-dotenv
    pandas
    numpy
    scikit-learn
    torch
    torch-geometric
    tqdm
    jupyter
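
To check which of the packages listed above are available in the current environment, a small standard-library sketch such as the following may help. The mapping from pip names to import names is an assumption based on the usual module names of these packages (for example, scikit-learn is imported as sklearn), and missing_packages is a hypothetical helper, not part of potencyscreen.

```python
from importlib.util import find_spec

# pip name -> import name (assumed; some import names differ from the pip names).
REQUIRED_MODULES = {
    "datamol": "datamol",
    "rdkit": "rdkit",
    "molfeat": "molfeat",
    "python-dotenv": "dotenv",
    "pandas": "pandas",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "torch": "torch",
    "torch-geometric": "torch_geometric",
    "tqdm": "tqdm",
    "jupyter": "jupyter",
}


def missing_packages(modules=REQUIRED_MODULES):
    """Return the pip names whose import module cannot be located (hypothetical helper)."""
    return sorted(
        pip_name
        for pip_name, module_name in modules.items()
        if find_spec(module_name) is None
    )


missing = missing_packages()
print("missing:", ", ".join(missing) if missing else "none")
```

The check only looks up whether each module can be found. It does not import the packages or verify their versions.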

The code was tested with Python 3.8.

For Developers

To install the additional packages required to build the documentation and run the tests, install potencyscreen with the docs and test extras (the quotes keep shells such as zsh from interpreting the square brackets):

pip install -e ".[docs,test]"

To run the tests:

pytest tests

To compile the documentation:

make -C docs html

Contact

For questions, please contact stephan.thaler@tum.de.
