On the ChEMBL Platform, a Large-scale Evaluation of Machine Learning Algorithms for Drug Target Prediction

Authors

  • Naresh Babu Bynagari, Android Developer, Keypixel Software Solutions, 777 Washington Rd, Parlin, NJ 08859, Middlesex, USA

Keywords:

ChEMBL Platform
Machine Learning Algorithms
Drug Target Prediction

Abstract

Deep learning is currently the most successful machine learning technology in a wide range of application fields, and it has recently been applied in drug discovery research to predict potential therapeutic targets and to screen for active compounds. However, it remains unclear whether deep learning can outperform existing computational methods on drug discovery tasks, for three reasons: the lack of large-scale studies, the compound series bias that is common in drug discovery datasets, and the hyperparameter selection bias that arises from the large number of possible deep learning architectures. We therefore compared several deep learning methods with other machine learning and target prediction methods on a large-scale drug development dataset. To avoid bias from hyperparameter selection and from compound series, we employed a stacked cluster-cross-validation technique. We found that (i) deep learning methods outperformed all competing methods, and (ii) the predictive performance of deep learning is often comparable to that of tests conducted in wet labs (i.e., in vitro assays).
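To make the stacked cluster-cross-validation idea concrete, the following is a minimal sketch, not the procedure used in the study. It assumes that each compound already carries a cluster label (for example, a compound series), that whole clusters are kept inside a single fold, and that hyperparameters are chosen in an inner loop that never sees the held-out fold. The names assign_cluster_folds, nested_cluster_cv, fit and score are hypothetical, and the greedy fold balancing and the toy threshold model are illustrative choices only.

```python
from collections import defaultdict


def assign_cluster_folds(cluster_ids, n_folds):
    """Place every cluster wholly inside one fold (assumed greedy balancing)."""
    members = defaultdict(list)
    for idx, cid in enumerate(cluster_ids):
        members[cid].append(idx)
    fold_sizes = [0] * n_folds
    fold_of = [0] * len(cluster_ids)
    # The largest remaining cluster goes to the currently smallest fold.
    for cid in sorted(members, key=lambda c: (-len(members[c]), str(c))):
        target = fold_sizes.index(min(fold_sizes))
        for idx in members[cid]:
            fold_of[idx] = target
        fold_sizes[target] += len(members[cid])
    return fold_of


def nested_cluster_cv(X, y, cluster_ids, param_grid, fit, score, n_folds=3):
    """Return one (chosen_params, test_score) pair per outer fold.

    Assumes n_folds >= 3 and at least n_folds clusters, so no fold is empty.
    """
    fold_of = assign_cluster_folds(cluster_ids, n_folds)

    def rows(folds):
        idx = [i for i, f in enumerate(fold_of) if f in folds]
        return [X[i] for i in idx], [y[i] for i in idx]

    results = []
    for test_fold in range(n_folds):
        inner_folds = [f for f in range(n_folds) if f != test_fold]
        # Inner loop: select hyperparameters without touching the test fold.
        best_params, best_score = None, float("-inf")
        for params in param_grid:
            scores = []
            for val_fold in inner_folds:
                train_folds = {f for f in inner_folds if f != val_fold}
                model = fit(*rows(train_folds), params)
                scores.append(score(model, *rows({val_fold})))
            mean_score = sum(scores) / len(scores)
            if mean_score > best_score:
                best_params, best_score = params, mean_score
        # Outer loop: refit on all inner folds, evaluate once on the held-out fold.
        model = fit(*rows(set(inner_folds)), best_params)
        results.append((best_params, score(model, *rows({test_fold}))))
    return results


# Toy usage: the "model" is just a threshold taken from the hyperparameters.
def fit(X_train, y_train, params):
    return params["threshold"]


def score(model, X_eval, y_eval):
    hits = sum(int(x > model) == label for x, label in zip(X_eval, y_eval))
    return hits / len(y_eval)


X = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7]
y = [0, 1, 0, 1, 0, 1]
clusters = ["a", "a", "b", "b", "c", "c"]
grid = [{"threshold": 0.5}, {"threshold": 0.95}]
results = nested_cluster_cv(X, y, clusters, grid, fit, score, n_folds=3)
```

Because folds are built from clusters rather than from individual compounds, structurally similar compounds cannot appear on both sides of a split, which is the compound series bias the abstract refers to. Selecting hyperparameters only in the inner loop keeps the outer test score free of hyperparameter selection bias.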

Published

2018-07-10

How to Cite

Bynagari, N. B. (2018). On the ChEMBL Platform, a Large-scale Evaluation of Machine Learning Algorithms for Drug Target Prediction. Asian Journal of Applied Science and Engineering, 7(1), 53-64. https://doi.org/10.18034/ajase.v7i1.46

Section

Articles