
Mini Review


AI-Driven Retrosynthesis Framework for Drug Discovery: The Use of LLMs


David Joshua Ferguson*

Volume 6, Issue 5
Dates: Received: 2025-05-16 | Accepted: 2025-05-24 | Published: 2025-05-28
Pages: 556-562

Abstract

The process of retrosynthetic analysis, introduced by Corey, systematically deconstructs complex molecules into simpler precursors, providing a logical pathway for chemical synthesis. Here, we propose an innovative AI-driven retrosynthesis framework for drug discovery leveraging Large Language Models (LLMs) and advanced computational tools. This "retro drug discovery" platform integrates AlphaFold2-generated protein structures, MolGPT-driven scaffold generation, and a tailored ChatGPT model orchestrating Structure-Activity Relationship (SAR) analyses, virtual screening, and iterative optimization cycles. We applied this framework retrospectively to twenty FDA-approved small-molecule drugs spanning cardiovascular, neurological, oncological, and endocrine therapeutic areas. Each case study illustrates how AI systems can recapitulate historical discovery pathways with high fidelity, as demonstrated by metrics including structural similarity (average Tanimoto coefficient ≈ 0.82) and bioactivity-prediction concordance (mean Pearson r ≈ 0.78). The methodology emphasizes bioisosteric replacements, scaffold hopping, and pharmacophore optimization, reflecting human medicinal-chemistry strategies. The implementation of an AI-driven retrosynthetic platform, "ChemGPT Discover," exemplifies automation of medicinal-chemistry processes, enhancing efficiency in hit-to-lead development. Our results validate the capability of LLM-assisted retrosynthesis to rediscover known drug leads accurately, underscoring the transformative potential of AI in accelerating drug discovery and medicinal chemistry research.
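The two agreement metrics quoted in the abstract, the Tanimoto coefficient and the Pearson correlation, can be illustrated with a minimal sketch. The abstract does not state which fingerprint type or software produced the reported values, so the code below makes its own assumptions: fingerprints are represented as plain sets of "on" bit indices, the function names `tanimoto` and `pearson_r` are hypothetical, and all numbers are toy values invented for illustration, not data from the article.

```python
import math


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # assumed convention: two empty fingerprints score 0
    return len(a & b) / union


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Toy inputs, invented for illustration only (not from the article).
generated_fp = {1, 4, 7, 9, 12}   # bits set for an AI-proposed molecule
reference_fp = {1, 4, 7, 9, 15}   # bits set for the approved drug
predicted_activity = [6.1, 7.4, 5.2, 8.0]
measured_activity = [6.0, 7.0, 5.5, 8.3]

similarity = tanimoto(generated_fp, reference_fp)
concordance = pearson_r(predicted_activity, measured_activity)
print(f"Tanimoto: {similarity:.3f}")
print(f"Pearson r: {concordance:.3f}")
```

In this toy case the two fingerprints share four of six distinct bits, so the Tanimoto coefficient is 4/6. In practice a cheminformatics toolkit would supply the fingerprints; the set-based form is used here only to keep the sketch self-contained.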

DOI: 10.37871/jbres2110






Copyright

© 2025 Ferguson DJ. Distributed under Creative Commons CC-BY 4.0.

How to cite this article

Ferguson DJ. AI-Driven Retrosynthesis Framework for Drug Discovery: The Use of LLMs. J Biomed Res Environ Sci. 2025 May 28; 6(5): 556-562. doi: 10.37871/jbres2110, Article ID: JBRES2110, Available at: https://www.jelsciences.com/articles/jbres2110.pdf




