Tox21 in DeepChem
It's not so simple to label a molecule as toxic or not toxic. There are usually multiple ways to measure this. The Tox21 dataset uses 12 different tests to measure whether a given molecule is toxic. These tests are:
['NR-AR',
'NR-AR-LBD',
'NR-AhR',
'NR-Aromatase',
'NR-ER',
'NR-ER-LBD',
'NR-PPAR-gamma',
'SR-ARE',
'SR-ATAD5',
'SR-HSE',
'SR-MMP',
'SR-p53']
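The task names follow a prefix convention: in the Tox21 challenge, NR marks the nuclear receptor assays and SR marks the stress response assays. As a small sketch, the list can be split by panel like this (the helper name `group_by_panel` is made up for illustration and is not part of DeepChem):

```python
from collections import defaultdict

tox21_tasks = ['NR-AR', 'NR-AR-LBD', 'NR-AhR', 'NR-Aromatase', 'NR-ER',
               'NR-ER-LBD', 'NR-PPAR-gamma', 'SR-ARE', 'SR-ATAD5',
               'SR-HSE', 'SR-MMP', 'SR-p53']

def group_by_panel(tasks):
    """Group task names by the prefix before the first hyphen (hypothetical helper)."""
    panels = defaultdict(list)
    for task in tasks:
        prefix = task.split('-', 1)[0]
        panels[prefix].append(task)
    return dict(panels)

panels = group_by_panel(tox21_tasks)
for prefix, names in panels.items():
    # Print the panel prefix, how many tasks it has, and the task names.
    print(prefix, len(names), names)
```

The total across both panels is the `n_tasks=12` passed to the model further down.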
You can read about the Tox21 challenge here: https://www.frontiersin.org/articles/10.3389/fenvs.2015.00080/full
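import numpy as np
import deepchem as dc

# Load the Tox21 task names, the train/valid/test splits, and the transformers.
tox21_tasks, tox21_datasets, transformers = dc.molnet.load_tox21()
train_dataset, valid_dataset, test_dataset = tox21_datasets

# One hidden layer of 1000 units feeding all 12 tasks.
model = dc.models.MultitaskClassifier(n_tasks=12, n_features=1024, layer_sizes=[1000])
model.fit(train_dataset, nb_epoch=10)

# One array of shape (12, 2) per molecule in the validation set.
pred = [x for x in model.predict(valid_dataset)]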
For each molecule in the validation set, the output has one row for each of the 12 tasks. The first column is the probability of being negative, the second is the probability of being positive.
[array([[9.75349307e-01, 2.46507041e-02]…
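To make that layout concrete, here is a minimal sketch that pulls out the second column and pairs it with the task names. The helpers `positive_probabilities` and `flag_positive`, the 0.5 threshold, and the two-task toy input are all assumptions for illustration, not part of DeepChem. The first toy row reuses the values printed above; the second row is made up.

```python
def positive_probabilities(pred, tasks):
    """Map each task name to P(positive) for every molecule.

    `pred` is indexed as pred[molecule][task][class], where class 0 is
    negative and class 1 is positive.
    """
    results = []
    for per_molecule in pred:
        results.append({task: float(row[1])
                        for task, row in zip(tasks, per_molecule)})
    return results

def flag_positive(probabilities, threshold=0.5):
    """Return the tasks whose positive probability exceeds the threshold."""
    return [task for task, p in probabilities.items() if p > threshold]

# Toy input: one molecule, two tasks. The second row is invented.
toy_tasks = ['NR-AR', 'NR-AR-LBD']
toy_pred = [[[9.75349307e-01, 2.46507041e-02],
             [0.30, 0.70]]]

probs = positive_probabilities(toy_pred, toy_tasks)
print(probs[0])
print(flag_positive(probs[0]))
```

With the real output, calling `positive_probabilities(pred, tox21_tasks)` should give one dictionary per validation molecule, since the helper only relies on indexing and works on NumPy arrays as well as nested lists.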