Evidential Learning and Advancements in Drug Discovery
Over the past two decades, the pharmaceutical industry, valued at roughly $1 trillion, has faced significant challenges. Despite committing nearly $80 billion annually to drug development, the return on every dollar invested in research and development has dropped from 10 cents in 2010 to just 2 cents today. Academic studies show that the number of new drugs approved per dollar of R&D spending has halved approximately every 9 years since 1950, which amounts to an 80-fold reduction in inflation-adjusted terms. Additionally, the process of developing a drug, from research to market launch, is lengthy, typically taking 10-12 years. The monetary implications of these challenges are profound: current R&D expenses for each drug have reached $2.17 billion, compared with $1.19 billion in 2010.
Amidst these challenges, the advent of AI in drug discovery represents a potentially momentous shift for the industry. In certain contexts, AI has markedly expedited drug discovery, speeding up the process by a factor of 15. For example, deep learning has been shown to enable rapid identification of DDR1 kinase inhibitors, that is, substances that block the action of a protein that plays a key role in important cellular processes. Another example of groundbreaking discovery through AI is the AlphaFold deep learning system, created by Google DeepMind, which is able to predict protein structures with unprecedented accuracy.
From a clinical standpoint, the capabilities of AI in drug discovery are very promising, especially in the following areas:
Enhancing Virtual Screening: By analyzing vast datasets, AI speeds up virtual screening, helping researchers pinpoint potential drug candidates more precisely.
AI-Powered Predictive Analytics in Clinical Trials: AI facilitates the prediction of clinical trial results, refines trial structures, and pinpoints specific patient groups for enhanced success.
The AI-Driven Shift in Drug Repurposing: AI fast-tracks the repurposing of drugs by spotting existing ones that might work for new medical conditions, conserving both time and resources.
Generating New Molecules with AI: Through de novo design powered by AI, novel molecules with the desired characteristics can be created, broadening the horizons of drug discovery.
The pharmaceutical industry has witnessed a fundamental shift in the AI landscape, resulting in wide adoption of AI tools across biopharma R&D, and companies now demand products that are tech-centric, modular, bio-specific, and secure. Benevolent UK, an end-to-end AI-augmented drug discovery company, points out in its latest investor report that there is a “clear, growing market demand from Biopharma to leverage AI in drug discovery and increase the probability of success as a key drive for revenue generation and value creation.” Industry titans such as Pfizer, GlaxoSmithKline and Novartis are not simply observers standing by but are proactively cultivating their AI capabilities in-house.
Anchored in more than five years of comprehensive research at MIT CSAIL, Themis AI has spearheaded innovations in uncertainty estimation, proving instrumental in molecular property prediction and discovery initiatives. In 2020, a new evidential deep learning approach was presented. This solution introduced a novel methodology for uncertainty quantification in neural network-based molecular structure-property predictions, achieved without increasing computational demands.
Molecular property prediction involves using computational models to predict specific properties or behaviors of molecules based on their chemical structure. These predictions can be related to various molecule characteristics, such as solubility, toxicity, binding affinity to specific proteins, or other physicochemical properties.
As shown in an academic publication, although neural networks achieve state-of-the-art performance on numerous tasks in molecular modeling and structure-property prediction, they often struggle to generalize to examples outside their training data, learn inefficiently from small amounts of data, and tend to generate inaccurate predictions.
Hence, it is crucial to gain a deeper insight into the predictive confidence of neural models, especially in contexts like drug discovery and virtual screening, where the prediction accuracy plays a vital role in guiding safety-critical experimental processes. The evidential learning approach delivers uncertainty estimates that allow reliable adoption of these models in the chemical sciences. These estimates track the robustness of the models and are different from output probabilities.
When estimating epistemic uncertainty, evidential learning outperforms competing techniques, such as Bayesian neural networks and sampling-based approaches (e.g., model ensembling, dropout sampling). These methods provide only rough estimates of uncertainty through stochastic sampling, and because they need several models or repeated forward passes per prediction, they also increase computational expense and processing time.
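To make the cost of the sampling-based baseline concrete, here is a minimal sketch of ensemble-style uncertainty estimation. The linear "models" and the helper names (make_linear_model, ensemble_predict) are hypothetical stand-ins for trained networks, not part of any specific library or of the method described in this article. The point it illustrates is that every prediction needs one forward pass per ensemble member.

```python
import statistics
from typing import Callable, List, Sequence, Tuple

# A "model" here is any function mapping a feature vector to one number.
Model = Callable[[Sequence[float]], float]


def make_linear_model(weights: Sequence[float], bias: float) -> Model:
    """Build a toy linear model (hypothetical stand-in for a trained network)."""
    def predict(features: Sequence[float]) -> float:
        return sum(w * x for w, x in zip(weights, features)) + bias
    return predict


def ensemble_predict(models: List[Model],
                     features: Sequence[float]) -> Tuple[float, float]:
    """Return (mean prediction, variance across members) for one input.

    The disagreement between members serves as the uncertainty estimate.
    Cost: len(models) forward passes for a single prediction.
    """
    preds = [model(features) for model in models]
    return statistics.fmean(preds), statistics.pvariance(preds)


# Three members that differ only in their bias term (illustrative values).
members = [make_linear_model([1.0, 2.0], b) for b in (0.0, 1.0, 2.0)]
mean, variance = ensemble_predict(members, [1.0, 1.0])
print(f"prediction={mean:.3f}, ensemble variance={variance:.3f}")
```

With a real ensemble, each member is a separately trained network, so both training and inference cost grow with the number of members.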
Evidential Learning offers a fast and scalable uncertainty estimation technique that can be deployed to increase model robustness across a range of molecular property prediction and discovery tasks. The solution is model-agnostic and does not require significant architectural changes.
Evidential Learning takes the concept of learning probability distribution parameters a step further by predicting distributions over the original likelihood parameters. That is, the key to Evidential Learning is that, rather than placing priors on the network weights (a common technique in Bayesian neural networks), it introduces evidential priors over the original Gaussian likelihood function (also known as a normal distribution). In simpler terms, instead of making assumptions about the weights (parameters) of the neural network as in Bayesian approaches, Evidential Learning makes assumptions about the model output. These assumptions are represented as a higher-order distribution, known as an evidential distribution. The neural network in Evidential Learning is designed to learn and output the hyperparameters of this evidential distribution. Here, "hyperparameters" does not mean training settings such as the learning rate; it means the parameters of the higher-order distribution, which define the shape and characteristics of the evidential distribution. By doing this, Evidential Learning allows the model to express its own uncertainty about its predictions in a single forward pass.
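The following is a minimal sketch of how such network outputs might be turned into a prediction and two uncertainty values. It assumes the Normal-Inverse-Gamma form of the evidential distribution used in the deep evidential regression literature, with hyperparameters conventionally named gamma, nu, alpha, and beta. The article itself does not name the distribution, and the raw values below are hypothetical stand-ins for the final layer of a trained network.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class EvidentialOutput:
    """Hyperparameters of an assumed Normal-Inverse-Gamma distribution."""
    gamma: float  # predicted mean of the target
    nu: float     # > 0, strength of evidence for the mean
    alpha: float  # > 1, shape of the distribution over the variance
    beta: float   # > 0, scale of the distribution over the variance


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x)); always positive."""
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)


def to_evidential(raw: Sequence[float]) -> EvidentialOutput:
    """Map four unconstrained network outputs to valid hyperparameters."""
    g, n, a, b = raw
    return EvidentialOutput(
        gamma=g,
        nu=softplus(n),
        alpha=softplus(a) + 1.0,  # shift so that alpha > 1
        beta=softplus(b),
    )


def summarize(out: EvidentialOutput) -> Tuple[float, float, float]:
    """Return (prediction, aleatoric variance, epistemic variance)."""
    prediction = out.gamma                               # E[mu]
    aleatoric = out.beta / (out.alpha - 1.0)             # E[sigma^2]
    epistemic = out.beta / (out.nu * (out.alpha - 1.0))  # Var[mu]
    return prediction, aleatoric, epistemic


# Hypothetical raw outputs for one molecule (illustrative values only).
out = to_evidential([2.0, 1.5, 0.7, -0.3])
pred, aleatoric, epistemic = summarize(out)
print(f"prediction={pred:.3f}, aleatoric={aleatoric:.3f}, "
      f"epistemic={epistemic:.3f}")
```

Under this assumption, all three quantities come from one forward pass: a larger nu (more evidence) shrinks the epistemic term while leaving the aleatoric term unchanged.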
The methodology has several compelling aspects. Predictions are calibrated so that uncertainty aligns with actual errors, which facilitates efficient training through uncertainty-informed active learning and yields improved experimental validation success rates. Results show a 60% reduction in required training data, an 18% improvement in error, and a 95% hit rate with confidence filtering. It is estimated that this could produce a 75% reduction in drug discovery costs, more than a billion dollars in savings within four major therapeutic areas, and a tenfold increase in discovery speed.
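Confidence filtering and uncertainty-informed active learning can both be sketched as rankings over the same per-molecule uncertainty scores. The example below is only an illustration under stated assumptions: the names (Scored, confidence_filter, select_for_labeling, hit_rate), the molecule identifiers, and all numbers are hypothetical and are not the results quoted above.

```python
from typing import Dict, List, NamedTuple


class Scored(NamedTuple):
    molecule_id: str
    prediction: float   # predicted property value
    uncertainty: float  # model's uncertainty for this prediction


def confidence_filter(scored: List[Scored], keep_fraction: float) -> List[Scored]:
    """Keep the most confident fraction of predictions (lowest uncertainty)."""
    ranked = sorted(scored, key=lambda s: s.uncertainty)
    n_keep = max(1, round(len(ranked) * keep_fraction))
    return ranked[:n_keep]


def select_for_labeling(scored: List[Scored], batch_size: int) -> List[Scored]:
    """Active learning step: pick the most uncertain molecules to measure next."""
    ranked = sorted(scored, key=lambda s: s.uncertainty, reverse=True)
    return ranked[:batch_size]


def hit_rate(selected: List[Scored], measured: Dict[str, float],
             threshold: float) -> float:
    """Fraction of selected molecules whose measured value meets the threshold."""
    if not selected:
        return 0.0
    hits = sum(1 for s in selected if measured[s.molecule_id] >= threshold)
    return hits / len(selected)


# Hypothetical screening pool (illustrative values only).
pool = [
    Scored("m1", 0.90, 0.05),
    Scored("m2", 0.80, 0.40),
    Scored("m3", 0.70, 0.10),
    Scored("m4", 0.95, 0.60),
]

confident = confidence_filter(pool, keep_fraction=0.5)
to_label = select_for_labeling(pool, batch_size=1)

# Hypothetical experimental measurements for the confident subset.
measured = {"m1": 0.85, "m3": 0.40}
print([s.molecule_id for s in confident])
print([s.molecule_id for s in to_label])
print(hit_rate(confident, measured, threshold=0.5))
```

The two rankings pull in opposite directions: low-uncertainty predictions are sent to experimental validation, while high-uncertainty molecules are the ones worth labeling to improve the model.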
As the pharmaceutical sector faces pivotal decisions and momentous changes, innovative endeavors such as AI in drug discovery and groundbreaking methodologies like evidential deep learning show promise towards renewed industry growth and scientific achievements.