Downloaded several GB of data from https://www.ebi.ac.uk/chembl/. There are around 75 tables of data on around 2.1M chemical compounds. In bioassays for Homo sapiens alone, there are:
- 45,941 – ADMET (A) – ADME and Tox data e.g. t1/2, oral bioavailability, LD50.
- 122,533 – Binding (B) – Data measuring binding of a compound to a molecular target, e.g. Ki, IC50, Kd.
- 172,353 – Functional (F) – Data measuring the biological effect of a compound, e.g. % cell death in a cell line, rat weight.
In the compounds table, there are 6,846 molecules related to the assay for inhibitors and substrates of cytochrome P450 3A4.
This is pretty cool. I already have some of the data, but I’m going in to explore it and see what’s there. I’m putting together some SQL queries to pull some initial data, but will need to go back and build a tool for getting all the structure files too.
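
As a rough sketch of what such initial queries could look like, the snippet below builds a tiny in-memory SQLite database and runs two queries against it: a count of assays per type for one organism, and a lookup of the distinct molecules tested in assays whose description matches a pattern. The table and column names (`assays`, `activities`, `compound_structures`, `assay_type`, `assay_organism`, `molregno`, `canonical_smiles`) are assumptions modelled on ChEMBL-style naming and should be checked against the actual dump. The rows are made-up toy data, not real ChEMBL records, and the helper function names are hypothetical.

```python
import sqlite3

# Toy schema. Table and column names are ASSUMPTIONS modelled on
# ChEMBL-style naming; verify them against the real dump before use.
SCHEMA = """
CREATE TABLE assays (
    assay_id INTEGER PRIMARY KEY,
    assay_type TEXT,        -- 'A', 'B' or 'F'
    assay_organism TEXT,
    description TEXT
);
CREATE TABLE activities (
    activity_id INTEGER PRIMARY KEY,
    assay_id INTEGER,
    molregno INTEGER
);
CREATE TABLE compound_structures (
    molregno INTEGER PRIMARY KEY,
    canonical_smiles TEXT
);
"""


def build_toy_db():
    """Create an in-memory database filled with made-up example rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO assays VALUES (?, ?, ?, ?)", [
        (1, "A", "Homo sapiens", "Cytochrome P450 3A4 inhibition"),
        (2, "B", "Homo sapiens", "Binding affinity to a kinase"),
        (3, "F", "Homo sapiens", "Cell death in a cell line"),
        (4, "A", "Rattus norvegicus", "Oral bioavailability in rat"),
        (5, "A", "Homo sapiens", "Half-life in liver microsomes"),
    ])
    conn.executemany("INSERT INTO activities VALUES (?, ?, ?)", [
        (1, 1, 100), (2, 1, 101), (3, 1, 100), (4, 2, 102), (5, 3, 101),
    ])
    conn.executemany("INSERT INTO compound_structures VALUES (?, ?)", [
        (100, "CCO"), (101, "c1ccccc1"), (102, "CC(=O)O"),
    ])
    conn.commit()
    return conn


def count_assays_by_type(conn, organism):
    """Return {assay_type: number of assays} for a single organism."""
    rows = conn.execute(
        "SELECT assay_type, COUNT(*) FROM assays "
        "WHERE assay_organism = ? "
        "GROUP BY assay_type ORDER BY assay_type",
        (organism,),
    ).fetchall()
    return dict(rows)


def molecules_for_assays_like(conn, pattern):
    """Return distinct (molregno, SMILES) pairs for assays whose
    description matches the given SQL LIKE pattern."""
    return conn.execute(
        "SELECT DISTINCT cs.molregno, cs.canonical_smiles "
        "FROM assays a "
        "JOIN activities act ON act.assay_id = a.assay_id "
        "JOIN compound_structures cs ON cs.molregno = act.molregno "
        "WHERE a.description LIKE ? "
        "ORDER BY cs.molregno",
        (pattern,),
    ).fetchall()


if __name__ == "__main__":
    conn = build_toy_db()
    print(count_assays_by_type(conn, "Homo sapiens"))
    print(molecules_for_assays_like(conn, "%P450 3A4%"))
```

Using parameterized queries (`?` placeholders) keeps the organism name and search pattern out of the SQL string, so the same helpers can be reused for other organisms or targets. Against the real database, `sqlite3.connect(":memory:")` would be swapped for a connection to the downloaded file or server, and the SMILES column could feed a later structure-file export tool.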