Revolutionizing Drug Discovery; Transformative Role of Machine Learning


  • Moazzam Siddiq, Independent Researcher, Manchester, United Kingdom


Keywords: Machine Learning, Drug Discovery, Pharmaceutics, Robotics, Artificial Intelligence


This review examines the use of machine learning in drug discovery, including its potential advantages, difficulties, and prospective future developments. It surveys the machine learning models that have been developed for this purpose and emphasises the value of machine learning in predicting drug properties, identifying new therapeutic targets, and designing new drug candidates. It also covers the obstacles and limitations of applying machine learning to drug discovery, among them the need for high-quality data, the need for greater collaboration and data sharing, and ethical and regulatory considerations. The review highlights the need for regulatory frameworks that can ensure the safety and efficacy of novel pharmaceuticals developed with these models, as well as the importance of transparency and accountability in the use of machine learning algorithms. It closes with a discussion of future directions for the field, including deep learning models, multi-task learning, personalised medicine, and the integration of machine learning with other technologies such as robotics and automation. To speed up the drug discovery process and bring novel, effective medicines to patients in need, the author proposes addressing these difficulties and limitations while continuing to investigate these areas of research and development. Overall, the review summarises the current state of machine learning in drug discovery, its potential advantages and disadvantages, and the major areas of future research and development expected to drive progress.
The paper will interest researchers, drug developers, and policymakers who want to understand how machine learning could change the drug discovery process and improve patient outcomes.












How to Cite

Moazzam Siddiq. (2022). Revolutionizing Drug Discovery; Transformative Role of Machine Learning. BULLET : Jurnal Multidisiplin Ilmu, 1(02), 162–170. Retrieved from