The Future of Drug Discovery: Why AI Is More Than Just a Buzzword

   


In 2025, drug development faces ever-growing challenges: pipelines are expensive, timelines are long, and failure rates remain dauntingly high. A novel therapeutic often takes over a decade and roughly $2.6 billion to reach patients1, with fewer than 10% of candidates ultimately approved. Against this backdrop, artificial intelligence (AI) has moved from buzzword status to a key enabler in biotech and pharma R&D. Unlike marketing hype, today's AI applications deliver tangible value by significantly compressing development timelines.

Key AI applications in drug discovery include:

  • Predictive modeling and virtual screening - using machine learning (ML) to forecast molecular properties (e.g. activity or ADMET) and to explore ultra-large chemical libraries2, 3.
  • Data-driven target identification - mining genomics, proteomics, imaging, and literature data to propose novel drug targets4.
  • Generative molecule design - AI-driven creation of new small molecules, antibodies, and other modalities tailored to specific targets5, 6.
  • Preclinical lead optimization - in silico prediction of ADME (absorption, distribution, metabolism, excretion), toxicity, potency, etc., to prioritize the best candidates3.
  • Smart clinical trials - AI-assisted patient matching, predictive simulation of trial outcomes, and innovative designs (e.g. synthetic control arms, digital twins) to streamline development7, 8.

Each of these areas offers concrete benefits:

Predictive Modeling and Data Analytics

Modern drug discovery produces enormous datasets. AI excels at predictive modeling - learning complex, non-linear relationships in data that traditional methods cannot capture2, 9. For example, deep-learning QSAR (quantitative structure-activity relationship) models now enable ultra-large virtual screening campaigns far beyond what human analysts could handle2. Emerging neural-network scoring functions likewise improve predictions of binding affinity and other properties10. These models can also forecast ADME and toxicity profiles early in development3, helping chemists focus on compounds with the most promising drug-like properties. In practice, these predictive tools shorten the hit-finding and hit-to-lead optimization phases by focusing resources on the best candidates.
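To make the idea of computational triage concrete, here is a minimal Python sketch. It does not use a deep-learning QSAR model; it substitutes a much simpler, classical technique (ranking a library by Tanimoto similarity to known actives) purely to show the workflow of scoring many compounds and keeping the best. The fingerprints, compound names, and the `rank_library` helper are all hypothetical.

```python
# Toy similarity-based virtual screen (a simple stand-in for a learned model).
# Fingerprints are hypothetical sets of "on" bit indices; a real pipeline would
# compute them from chemical structures with a cheminformatics toolkit.

def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) coefficient between two bit-index sets."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_library(known_actives, library, top_n=2):
    """Score each compound by its best similarity to any known active,
    then return the top_n (name, score) pairs, highest score first."""
    scored = [
        (name, max(tanimoto(fp, active) for active in known_actives))
        for name, fp in library.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Illustrative data only.
known_actives = [{1, 4, 7, 9}, {2, 4, 7, 11}]
library = {
    "cmpd_A": {1, 4, 7, 9, 12},
    "cmpd_B": {3, 5, 8},
    "cmpd_C": {2, 4, 7},
}

top_hits = rank_library(known_actives, library)
for name, score in top_hits:
    print(f"{name}: {score:.2f}")
```

In a learned approach, the `tanimoto`-based score would be replaced by a model's predicted activity, but the surrounding loop of scoring, sorting, and shortlisting stays much the same.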


Notably, industry leaders are pushing even further. Roche/Genentech describe an ML "trifecta" - predictive, generative, and interpretable models - that could predict whether a molecule will reach a target, generate a molecule to bind that target, and explain how they interact11. This integrated approach promises to recast hit-finding and lead optimization as continuous, data-driven processes rather than lengthy trial-and-error campaigns.

AI-Driven Target Identification

Selecting the right biological target is crucial: a wrong target early on often leads to failure in late-stage trials12. AI enhances target discovery by integrating diverse data sources to uncover hidden patterns13. For example, Insilico Medicine's PandaOmics platform combined patient multi-omics (genomic and transcriptomic data), network analysis, and natural-language mining of the literature to rank potential fibrosis targets14, 15. This AI pipeline identified TNIK - a kinase not previously studied in idiopathic pulmonary fibrosis - as the top prediction16. That novel target is now being explored further, showing how AI can spotlight therapeutic hypotheses that would have been missed by traditional approaches.
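As a rough illustration of how several evidence streams might be combined into a single target ranking, consider the Python sketch below. It is not the PandaOmics method, whose internals are not described here; the weighted-sum scheme, the weights, the target names, and the scores are all assumptions chosen for illustration.

```python
# Hypothetical evidence scores in [0, 1] for three placeholder targets.
# The weights are illustrative assumptions, not values from any real platform.
WEIGHTS = {"omics": 0.5, "network": 0.3, "literature": 0.2}


def combined_score(scores: dict) -> float:
    """Weighted sum of the per-source evidence scores."""
    return sum(WEIGHTS[source] * scores[source] for source in WEIGHTS)


def rank_targets(evidence: dict) -> list:
    """Return target names ordered from strongest to weakest combined evidence."""
    return sorted(evidence, key=lambda t: combined_score(evidence[t]), reverse=True)


evidence = {
    "TARGET_A": {"omics": 0.9, "network": 0.8, "literature": 0.1},
    "TARGET_B": {"omics": 0.4, "network": 0.5, "literature": 0.9},
    "TARGET_C": {"omics": 0.2, "network": 0.3, "literature": 0.3},
}

ranking = rank_targets(evidence)
print(ranking)
```

Note the design choice: because literature support carries a low weight, a target with strong data-driven evidence but little prior publication (TARGET_A here) can still rank first, which is the kind of under-studied hypothesis the paragraph above describes.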

Large pharma companies are adopting similar strategies. Roche/Genentech have partnered with Recursion to fuse high-content cell imaging and single-cell genomics: by "generating and analyzing different types of cellular and genetic data - at a huge scale," they are building maps of human biology to reveal new druggable pathways17. Recursion's "Operating System" uses its massive image-and-omics dataset to continuously train ML models. In their words, this enables them to "rapidly identify new targets and design highly optimized molecules" from their data18. Put simply, high-throughput phenotypic screening coupled with ML creates an iterative loop of experiment and design, which, as Roche scientists note, "bears the potential to impact target and drug discovery in a really powerful way"19.

Generative Molecule Design

Once a target is chosen, AI can drive de novo molecule creation. Advanced generative algorithms (transformers, GANs, reinforcement learning) can propose entirely new chemical structures optimized against a desired target5. Insilico's Chemistry42 engine exemplifies this: it employed 500 ML models (transformers, GANs, genetic algorithms, etc.) to generate and score millions of compounds, ultimately selecting a novel small-molecule TNIK inhibitor for development20. This kind of AI-driven search explores chemical space far more efficiently than brute-force enumeration.

AI design is not limited to classic small molecules. In biologics discovery, new generative models are making strides. Diffusion-based tools (e.g. EvoDiff, DiffAb) can generate novel antibody sequences with specific structural features21. In fact, startup Nabla Bio recently reported that its AI platform produced the first fully de novo antibody against a G-protein-coupled receptor (CXCR7), as well as hundreds of other candidate antibodies22. This suggests that even challenging targets once deemed "undruggable" may be within reach of AI design.

AI is also extending to cutting-edge modalities. For antibody-drug conjugates (ADCs), early studies show ML predicting the optimal payload conjugation sites and trafficking behavior in cells23. However, ADCs remain exceptionally complex: they are three-part hybrids (antibody, linker, cytotoxic payload), and most existing small-molecule or antibody AI models still "fail to translate" directly to ADCs due to unique developability challenges24. In short, many AI tools were trained on traditional modalities and must be adapted for next-generation therapeutics. Even so, these examples highlight how AI can propose novel candidates - small or large - that give chemists and biologists a much higher-quality starting point.

Preclinical Lead Optimization

After design, AI continues to streamline preclinical development. Once lead candidates emerge, ML models can predict their pharmacokinetics and safety faster than lab assays. For instance, AI can estimate solubility, metabolic stability, off-target activity, and more in silico, flagging likely ADME or toxicity issues before costly animal studies3. In practice, this means chemists get rapid feedback on which chemical modifications improve drug-like properties, reducing the number of analogs that must be synthesized and tested.
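The kind of early in silico flagging described above can be sketched in Python. This example does not use a machine-learning model; it applies Lipinski's rule of five, a long-established rule-based heuristic for oral drug-likeness, as a simple stand-in for a learned ADME predictor. The candidate names and property values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    mol_weight: float   # daltons
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def rule_of_five_violations(c: Candidate) -> list:
    """Return the Lipinski rule-of-five criteria that this candidate violates."""
    checks = {
        "mol_weight > 500": c.mol_weight > 500,
        "logP > 5": c.logp > 5,
        "H-bond donors > 5": c.h_donors > 5,
        "H-bond acceptors > 10": c.h_acceptors > 10,
    }
    return [label for label, violated in checks.items() if violated]


# Illustrative values only.
candidates = [
    Candidate("lead_1", 412.5, 3.1, 2, 6),
    Candidate("lead_2", 612.0, 5.8, 3, 9),
]

flags = {c.name: rule_of_five_violations(c) for c in candidates}
for name, violations in flags.items():
    print(name, "->", violations or "no violations")
```

An ML-based workflow would swap the fixed thresholds for model predictions of properties such as solubility or metabolic stability, but the output serves the same purpose: a list of flagged issues per candidate that chemists can act on before synthesis.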

Recent evidence suggests these AI approaches pay off. Deep-learning virtual screening and ML-enhanced scoring often outperform classical QSAR and docking10. Neural-network models can even incorporate predicted 3D structures (e.g. AlphaFold predictions) to refine binding site analysis25. Companies report efficiency gains: Recursion, for example, claims "significant improvements in speed, efficiency, and reduced costs from hit identification to IND-enabling studies" compared to industry norms26. In other words, AI helps deliver better-characterized molecules into animal studies, which should reduce costly late-stage failures. Overall, machine learning is making lead optimization faster and more precise by focusing experimental effort on the most promising candidates27, 10.

Accelerating Clinical Trials with AI

AI's impact is now extending into clinical development. Large electronic health record (EHR) databases are fertile ground. AI tools can rapidly match patients to trials - for example, by scanning EHRs to flag eligible participants or identify sites with suitable cohorts28. This accelerates enrollment and can improve population diversity by catching candidates that traditional screening might miss.
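The screening step behind patient matching can be sketched as a simple eligibility filter over structured records. This Python example is rule-based rather than AI-driven, and every field name, criterion, and record in it is a hypothetical placeholder; real EHR data are far messier and often require language models to extract such fields in the first place.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    patient_id: str
    age: int
    diagnoses: set = field(default_factory=set)
    medications: set = field(default_factory=set)


@dataclass
class TrialCriteria:
    min_age: int
    max_age: int
    required_diagnosis: str
    excluded_medications: frozenset


def is_eligible(patient: PatientRecord, criteria: TrialCriteria) -> bool:
    """Check age range, required diagnosis, and excluded medications."""
    return (
        criteria.min_age <= patient.age <= criteria.max_age
        and criteria.required_diagnosis in patient.diagnoses
        and not (patient.medications & criteria.excluded_medications)
    )


# Illustrative criteria and records only.
criteria = TrialCriteria(40, 80, "IPF", frozenset({"drug_y"}))
records = [
    PatientRecord("P001", 62, {"IPF"}, {"drug_x"}),
    PatientRecord("P002", 35, {"IPF"}),
    PatientRecord("P003", 70, {"IPF"}, {"drug_y"}),
    PatientRecord("P004", 58, {"asthma"}),
    PatientRecord("P005", 75, {"IPF", "hypertension"}),
]

matches = [r.patient_id for r in records if is_eligible(r, criteria)]
print(matches)
```

Running the same filter across sites' record sets would also give a rough count of eligible patients per site, which is one way to shortlist sites with suitable cohorts.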

More radically, AI enables virtual trial simulations and novel trial designs. Predictive models can simulate trial outcomes under different scenarios (varying doses, patient subgroups, endpoints, etc.) to optimize protocols before any patient is enrolled29. Two innovations are synthetic control arms and digital twins30, 8. In a synthetic control arm, real-world or historical data create a virtual "placebo" group so fewer new patients must be assigned to control - reducing cost and ethical concerns. In digital twin approaches, AI builds computational avatars of patients (using molecular and clinical data) and virtually tests therapies on them. Both methods are already being piloted: synthetic controls can shorten trial duration by using prior data, while digital twin simulations help refine dosing strategies before the first patient is dosed8.
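Simulating trial outcomes under different scenarios can be illustrated with a small Monte Carlo sketch in Python. It assumes a two-arm trial with a binary response endpoint and a one-sided two-proportion z-test; the response rates, arm size, and number of simulations are hypothetical, and real trial simulations model far more (dropout, covariates, interim analyses).

```python
import math
import random


def simulate_trial(rng, n_per_arm, p_control, p_treatment, z_crit=1.645):
    """Simulate one two-arm trial with a binary endpoint.

    Returns True if the treatment arm beats control on a one-sided
    two-proportion z-test at roughly the 5% level.
    """
    control = sum(rng.random() < p_control for _ in range(n_per_arm))
    treated = sum(rng.random() < p_treatment for _ in range(n_per_arm))
    pooled = (control + treated) / (2 * n_per_arm)
    std_err = math.sqrt(pooled * (1 - pooled) * 2 / n_per_arm)
    if std_err == 0:
        return False
    z = (treated - control) / n_per_arm / std_err
    return z > z_crit


def estimated_power(n_per_arm, p_control, p_treatment, n_sims=1000, seed=0):
    """Fraction of simulated trials that reach significance."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    wins = sum(
        simulate_trial(rng, n_per_arm, p_control, p_treatment)
        for _ in range(n_sims)
    )
    return wins / n_sims


# Hypothetical scenarios: control response rate 30%, varying treatment rate.
scenarios = {"no effect": 0.30, "moderate effect": 0.40, "strong effect": 0.50}
power_by_scenario = {
    label: estimated_power(100, 0.30, p_treat)
    for label, p_treat in scenarios.items()
}

for label, power in power_by_scenario.items():
    print(f"{label}: estimated power {power:.2f}")
```

Varying `n_per_arm` or the assumed response rates in such a simulation shows how likely a given design is to detect an effect, which is the kind of question protocol optimization tries to answer before enrollment begins.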


Challenges and Realistic Expectations

All that said, AI is not a magic bullet. Its capabilities come with caveats. Data quality and biases are chief concerns: AI models are only as reliable as their training data. If datasets are incomplete, noisy, or skewed toward certain demographics, predictions can be misleading31, 32. For example, oncology trial data often over-represents certain populations, so an AI trained on those data might not generalize well worldwide. Overfitting is another risk: models might pick up spurious patterns that don't hold up on new molecules or patients. In short, historical and real-world data can accelerate discovery, but they must be used judiciously.

Certain modalities reveal AI's limits. As noted above, ADCs (and other advanced bioconjugates or cell/gene therapies) pose unique hurdles that most current models can't capture out of the box24. Many AI tools were built on classic small molecules or antibodies and must be extended for truly novel therapeutic formats. The biotech field is actively developing new algorithms to handle these cases, but this remains a work in progress.

Importantly, every AI prediction still requires experimental validation. An in silico "hit" must be synthesized, tested in vitro and in vivo, and optimized by human experts. No algorithm replaces the need to characterize safety and efficacy in the lab. Moreover, the "black-box" nature of many deep learning models raises practical issues: regulatory agencies demand transparency and robust evidence. In practice, companies form cross-functional teams of computational scientists, chemists, biologists, and clinicians to interpret AI suggestions and ensure they fit within established validation pipelines.

Despite these limitations, the pace of progress is striking. Not long ago AI drug design was speculative; today some AI-discovered molecules are in clinical trials33, 34, and major pharma companies are investing heavily in AI capabilities. The technology is not static - models continually improve as more data accrue. Over time we can expect AI's current blind spots (e.g. underrepresented data domains, interpretability issues) to narrow.

Looking Ahead

By 2025, AI in drug discovery has proven it's more than just marketing jargon. It is already contributing to target discovery, molecule generation, and even trial design in tangible ways. The real question is how the field will build on this foundation. Technical challenges - such as better AI models for ADCs, nucleic acids, or truly personalized therapies - remain a focus. Non-technical issues (data sharing practices, IP strategy, regulatory frameworks) also need attention (though full discussion of those is beyond our scope here).

What's clear is that smart companies in biotech and pharma are no longer asking if AI can help, but how to harness it best. Integrating AI requires robust data pipelines and collaboration across disciplines, but it holds transformative potential. As one industry review puts it, combining ML with experimental feedback loops "bears the potential to impact target and drug discovery in a really powerful way"19. In practice, the synergy of human expertise with machine intelligence will likely define the next wave of biotech innovation.

 

About the Author
Dr Mohamad Toutounji has held various positions in R&D, CMC, and manufacturing at Molgenium, Sanofi, and GE Healthcare in recent years. He is also the CEO and founder of Molgenium.

Note:
1, 2, 3, 4, 5, 7, 8, 9, 10, 13, 25, 27, 28, 29, 30, 31, 32
Integrating artificial intelligence in drug discovery and early drug development: a transformative approach | Biomarker Research | Full Text
https://biomarkerres.biomedcentral.com/articles/10.1186/s40364-025-00758-2
6, 22
Scratch That? De Novo Antibody Design Enters the AI Drug Discovery Toolbox
https://www.genengnews.com/topics/artificial-intelligence/scratch-that-de-novo-antibody-design-enters-the-ai-drug-discovery-toolbox/
11, 17, 19
Roche | Harnessing the power of AI
https://www.roche.com/stories/harnessing-the-power-of-ai
12, 15, 16, 34
A small-molecule TNIK inhibitor targets fibrosis in preclinical and clinical models | Nature Biotechnology
https://www.nature.com/articles/s41587-024-02143-0
14, 20, 33
First Generative AI Drug Begins Phase II Trials with Patients | Insilico Medicine
https://insilico.com/blog/first_phase2
18, 26
Pioneering TechBio Solutions in Drug Discovery | Recursion
https://www.recursion.com/
21, 23, 24
frontiersin.org
https://www.frontiersin.org/journals/drug-discovery/articles/10.3389/fddsv.2025.1628789/pdf


 
