Druglikeness Prediction

Patrick Chirdon
31 min read · Nov 19, 2020

In the past I preferred RDKit for making predictions because it was easier to use. However, many groups now represent molecules as graphs rather than as SMILES strings, so I decided to give that approach a shot using DeepChem. I could also have used DGL-LifeSci (DGL Life Sciences) from Amazon to accomplish something similar.
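To make the difference between a SMILES string and a graph concrete, here is a minimal pure-Python sketch. It is only an illustration, not how DeepChem stores molecules internally: the atom and bond lists for ethanol (SMILES `CCO`) are typed in by hand, and the helper names `adjacency_matrix` and `neighbors` are hypothetical.

```python
# Ethanol, written as the SMILES string "CCO", expressed as an explicit graph.
# The atom and bond lists are entered by hand purely for illustration.
atoms = ["C", "C", "O"]      # node labels (heavy atoms only, hydrogens implicit)
bonds = [(0, 1), (1, 2)]     # undirected edges between atom indices


def adjacency_matrix(n_atoms, bonds):
    """Build a symmetric 0/1 adjacency matrix from an edge list."""
    adj = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        adj[i][j] = 1
        adj[j][i] = 1
    return adj


def neighbors(adj, atom_index):
    """Return the indices of the atoms bonded to the given atom."""
    return [j for j, bonded in enumerate(adj[atom_index]) if bonded]


adj = adjacency_matrix(len(atoms), bonds)
degrees = [sum(row) for row in adj]   # number of heavy-atom neighbors per atom
```

A graph convolution model works on this kind of structure: each atom carries features, and information is passed along the bonds listed in the adjacency matrix, instead of the model reading the characters of the SMILES string in order.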

I wanted to predict the druglikeness of molecules in a library and sort them based on the frequency of druglike fragments. I have attached the fragments at the end of this article in case you want to reproduce this.
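The sorting step can be sketched in a few lines. This is one possible reading of "frequency of druglike fragments" (the total number of fragment matches per molecule), not necessarily the exact procedure used here. The function name `rank_by_fragment_hits`, the molecule identifiers, and the fragment labels are all hypothetical placeholders, and the fragment matching itself is assumed to have been done beforehand.

```python
def rank_by_fragment_hits(fragment_hits):
    """Sort molecules by their number of druglike-fragment matches, highest first.

    fragment_hits maps a molecule identifier to the list of fragment labels
    matched in that molecule (a fragment may be listed more than once).
    """
    counts = [(mol, len(hits)) for mol, hits in fragment_hits.items()]
    # sorted() is stable, so molecules with equal counts keep their input order
    return sorted(counts, key=lambda pair: pair[1], reverse=True)


# Placeholder input: made-up identifiers and labels, not real screening data
example_hits = {
    "mol_a": ["frag_1", "frag_2"],
    "mol_b": [],
    "mol_c": ["frag_1", "frag_1", "frag_3"],
}
ranking = rank_by_fragment_hits(example_hits)
```

With the placeholder input above, `mol_c` ranks first with three matches and `mol_b` last with none.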

Here is an example output

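import deepchem as dc
from deepchem.models import GraphConvModel
import numpy as np
import sys
import pandas as pd
import seaborn as sns
from rdkit.Chem import PandasTools


def generate_graph_conv_model():
    # Single-task graph convolution classifier; checkpoints are written to ./model_dir
    batch_size = 128
    model = GraphConvModel(1, batch_size=batch_size, mode='classification', model_dir="./model_dir")
    return model


dataset_file = "fragments.csv"
tasks = ["is_active"]
# ConvMolFeaturizer converts each SMILES string into the graph features GraphConvModel expects
featurizer = dc.feat.ConvMolFeaturizer()
# Note: newer DeepChem releases name this argument feature_field; smiles_field is the older name
loader = dc.data.CSVLoader(tasks=tasks, smiles_field="SMILES", featurizer=featurizer)…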
