On the ChEMBL Platform, a Large-scale Evaluation of Machine Learning Algorithms for Drug Target Prediction
Abstract
Deep learning is currently the most successful machine learning technology in a wide range of application fields, and it has recently been used in drug discovery research to predict potential drug targets and to screen for active compounds. However, it remains unclear whether deep learning can outperform existing computational methods in drug discovery tasks, for three reasons: the lack of large-scale studies, the compound series bias that is common in drug discovery datasets, and the hyperparameter selection bias that comes with the large number of potential deep learning architectures. We therefore compared the results of different deep learning methods with those of other machine learning and target prediction methods on a large-scale drug discovery dataset. To avoid biases from hyperparameter selection and from compound series, we employed a stacked cluster-cross-validation technique. We found that (i) deep learning methods outperform all competing methods, and (ii) the predictive performance of deep learning is in many cases comparable to that of tests conducted in wet labs (i.e., in vitro assays).
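The cluster-cross-validation idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each compound already carries a cluster label (the clustering method is not specified here), that whole clusters are assigned to folds so that no compound series is split between training and test data, and that hyperparameters are selected in an inner loop that never sees the outer test fold. The function names, the greedy balancing heuristic, and the choice of three folds are all hypothetical.

```python
from collections import defaultdict


def assign_clusters_to_folds(cluster_ids, n_folds=3):
    """Map compound indices to folds so that a cluster never spans folds.

    Clusters are placed largest-first into the currently smallest fold,
    a simple greedy balancing heuristic (an assumption of this sketch).
    """
    members = defaultdict(list)
    for idx, cid in enumerate(cluster_ids):
        members[cid].append(idx)

    folds = [[] for _ in range(n_folds)]
    # Sort by size (descending), then by cluster id for determinism.
    for cid in sorted(members, key=lambda c: (-len(members[c]), str(c))):
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(members[cid])
    return folds


def nested_cluster_splits(folds):
    """Yield (train, validation, test) index lists.

    Outer loop: one fold is held out for performance estimation.
    Inner loop: each remaining fold serves once as the validation set
    for hyperparameter selection; the rest is used for training.
    """
    n = len(folds)
    for test_f in range(n):
        for val_f in range(n):
            if val_f == test_f:
                continue
            train = [i for f in range(n) if f not in (test_f, val_f)
                     for i in folds[f]]
            yield train, folds[val_f], folds[test_f]


# Hypothetical toy data: nine compounds in five clusters (compound series).
example_clusters = ["a", "a", "b", "b", "b", "c", "d", "d", "e"]
example_folds = assign_clusters_to_folds(example_clusters, n_folds=3)
for train, val, test in nested_cluster_splits(example_folds):
    print(len(train), len(val), len(test))
```

Because folds are built from whole clusters, a model is always tested on compound series it has not seen during training, and because the validation fold is distinct from the test fold, the reported performance is not inflated by hyperparameter selection.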
License
Copyright (c) 2018 Asian Journal of Applied Science and Engineering
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.