Catalyzing the Drug Discovery Pipeline with AI

By Annabel Romero, PhD

The main goal of Drug Discovery and Development is to find and develop new medicines to treat diseases and conditions more effectively. The process includes basic research, preclinical development, clinical trials, and FDA approval. Each of these steps is a possible failure point in the chain and requires a significant amount of time and money. The cost of each phase increases gradually as consumables, personnel, and equipment become more specialized, with these sunk costs recoverable only at commercialization. Finding a promising drug candidate for a particular target is essential if the downstream research and clinical steps are to succeed.

Lack of ample background knowledge about the main target, its related disease, and the drug candidate itself often leads to silent failures that can be discovered, at great expense, only during clinical trials. Ideally, a drug candidate that will succeed in clinical trials requires a robust research foundation for the target selected. An enormous amount of knowledge is needed to fully understand a biological system and, by proxy, the mechanisms by which a disease functions. With the advent of the ‘omics’ era, the amount of complex biological data continues to grow dramatically, expanding the research capabilities and understanding required for drug discovery. This presents a significant opportunity for AI solutions, which can provide 24/7 vigilance of the discovery process, are not beholden to any design heuristics or organizational bias, and offer the opportunity to anticipate failure points in the discovery process.

Traditionally, Drug Discovery has functioned as a linear pipeline in which information is collected in sequential steps. The major steps include Target Identification, Screening of Candidates, Selection of a Lead Molecule, Optimization of the Lead Candidate, Preclinical Studies, and Clinical Trials (see image). Several decades ago, following these steps was the most efficient way to select organic compounds (small molecules) as new candidates. However, modern drugs include larger, more complex classes of molecules; some examples are oligonucleotides, peptides, and monoclonal antibodies. Drugs being developed now can have multiple targets and diverse mechanisms of action, which makes them more complicated to understand and harder to fit into a traditional setup.

Private and public databases are systematically measuring and curating data from both healthy individuals and those with specific diseases, as well as clinical imaging material. This information presents new opportunities to understand a particular condition of interest. However, these massive, multi-dimensional datasets are impossible for humans to grasp alone; they require complex statistical analysis, which opens up an ample opportunity for exploiting AI and ML. To fully understand how Drug Discovery works today, we must see the information as a network rather than a linear process.

By viewing Drug Discovery as a network, we can incorporate results and knowledge from various datasets and studies covering a vast array of research areas. This network-based view of the diverse biomedical literature and its associated results is possible today thanks to nearly infinitely scalable storage, relatively cheap computation to power massive AI models, and the performance of new models. These models can now ingest and find links between diverse datasets covering multiple modalities such as text, images, and ‘omics’ data. The increasing availability of storage and decreasing compute costs in recent years have moved AI from theoretical studies to real-world applications, using graphics processing units (GPUs) to process large amounts of data in parallel.
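
To make the network view a little more concrete, here is a minimal sketch in Python. It is only an illustration, not a description of any production system: the entity names (`compound_A`, `protein_X`, `disease_D`, and so on) and the relation labels are made up, and a plain breadth-first search stands in for the far richer models discussed in this article. The idea is simply that once findings from different sources are stored as links, we can ask whether, and how, a drug candidate connects to a disease.

```python
from collections import defaultdict, deque

# Hypothetical, made-up entities and relations for illustration only.
# Each triple is (source, relation, target), as might be extracted from
# literature, 'omics' datasets, or imaging studies.
EDGES = [
    ("compound_A", "binds", "protein_X"),
    ("protein_X", "participates_in", "pathway_P"),
    ("pathway_P", "implicated_in", "disease_D"),
    ("compound_B", "binds", "protein_Y"),
    ("protein_Y", "expressed_in", "tissue_T"),
    ("imaging_study_1", "observes", "disease_D"),
]


def build_graph(edges):
    """Store every triple in both directions so links can be followed either way."""
    graph = defaultdict(list)
    for source, relation, target in edges:
        graph[source].append((relation, target))
        graph[target].append((f"inverse_{relation}", source))
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of nodes on a shortest path, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _relation, neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


graph = build_graph(EDGES)
path_a = shortest_path(graph, "compound_A", "disease_D")
path_b = shortest_path(graph, "compound_B", "disease_D")
print(path_a)
print(path_b)
```

In this toy graph, `compound_A` reaches `disease_D` through a protein and a pathway, while `compound_B` has no connection to the disease at all. A linear pipeline would surface that kind of gap only step by step; a network makes it a single query.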

ML and deep neural networks offer a way to capture the complex contexts of biological space and the mechanisms of action of new drugs and targets. Nonlinear models can extract intricate patterns from multi-level representations, weighing several variables simultaneously to make a prediction. Prediction models such as AlphaFold2 have revolutionized the drug discovery pipeline, showing what tangible applications can achieve when ML is used to accelerate steps in the process. In the past few years, many pharmaceutical companies have invested in resources and technology to advance AI research and integrate ML into their workflows. With the application of AI to Drug Discovery, we will see significant advances in biomedical research. Understanding how a disease functions and incorporating multiple protein targets when designing a new drug is the key to moving a successful candidate from lead to approved drug faster and more affordably than the industry does now.
