Scientists at Lawrence Livermore National Laboratory (LLNL) are combining artificial intelligence/machine learning, bioinformatics and supercomputing to help discover candidates for new antibodies and pharmaceutical drugs to combat COVID-19.
Backed by five high performance computing (HPC) clusters and years of expertise in vaccine and countermeasure development, a COVID-19 response team of LLNL researchers from various disciplines has used modeling & simulation, along with machine learning, to identify about 20 initial, yet promising, antibody designs from a nearly infinite set of potentials and to examine millions of small molecules that could have anti-viral properties. The candidates will need to be synthesized and experimentally tested — which Lab researchers cautioned could take time — but progress is being made.
“For several decades, the Laboratory has been at the forefront of protecting the country against biological threats of any type,” said Lab Senior Science Adviser Dave Rakestraw, who formerly ran LLNL’s biodefense programs and is coordinating the Lab’s COVID-19 technical response. “We’ve been putting a large amount of focus for the last six years on applying the computational resources at LLNL to try to accelerate the timescales for developing a response to an emerging biological threat. We’ve done that by using our extensive computational capabilities (staff and computer infrastructure) and developing partnerships with universities, drug companies and tech companies. That effort has put us in a position where we have tools now that are applicable to helping with the current response.”
When the COVID-19 outbreak began, LLNL’s Adam Zemla developed and published a predicted 3D protein structure of the virus, which was downloaded and used by more than a dozen outside research groups. Since then, the actual crystal structure of the key protein from SARS-CoV-2, the virus that causes COVID-19, has been determined, and it closely matched the team’s predictions, researchers said.
Armed with the virus’ predicted 3D structure and a few antibodies known to bind and neutralize SARS, an LLNL team led by Daniel Faissol and Thomas Desautels used two HPC clusters to perform AI-driven virtual screening of antibodies capable of binding to SARS-CoV-2, generating high-fidelity simulations to test the molecular interactions for efficacy. The modeling platform, supported by the Defense Advanced Research Projects Agency (DARPA) and internal Laboratory Directed Research and Development (LDRD) funding, is the first of its kind in integrating experimental data, structural biology, bioinformatic modeling and molecular simulations, all driven by a machine learning algorithm, to design antibody candidates. The platform was used to identify potential high-value modifications to the SARS antibodies so that they bind to SARS-CoV-2.
“Our approach, while still being developed, is aimed at designing high quality antibody therapeutics or vaccines in extremely rapid time-scales for scenarios in which waiting for many rounds of time-consuming experimental steps is not an option,” said Faissol. “Experimental data and structural bioinformatics are important components to enable high-quality predictions, but integrating machine learning and molecular simulations on HPC are the key to enabling the speed and scalability we need to search and evaluate huge numbers of possible antibody designs.”
The approach has not only sped up the process considerably over selection guided solely by human intuition, narrowing the number of antibody candidates from 10³⁹ possibilities to a handful in a matter of weeks, but has also focused on areas where scientists might not otherwise have looked.
“Now we’re not just searching blindly; we’re actually creating structures that we think are in the proper part of the design space, then we do our evaluations on those,” said Jim Brase, the Laboratory’s deputy associate director for Data Science. “We’ll get novelty, and — we hope — a higher percentage of real validated answers out of this approach at the end.”
Researchers said they are just beginning to look at the data and are working to arrange synthesis, as well as to set up testing and evaluation of the designs, through both internal efforts and targeted external collaborations.
Antiviral drug design
Another component of the multi-pronged response involves antiviral drug design. A group of Lab scientists led by Felice Lightstone and Jonathan Allen recently used dedicated access time on the entire Quartz supercomputing cluster to perform virtual screening of small molecules against two COVID-19 proteins. Using LLNL-customized software, created by Lab scientist Xiaohua Zhang, the LLNL team has performed a large-scale computational run to screen 26 million molecules against four protein sites (totaling more than 100 million docking calculations) to identify compounds that possibly could prevent infection or treat COVID-19.
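The scale of that run can be checked with simple arithmetic. The short Python sketch below is illustrative only and is not LLNL's software: it assumes each molecule is docked exactly once against each protein site, and the function and variable names are hypothetical.

```python
# Figures quoted in the article.
MOLECULES_SCREENED = 26_000_000  # small molecules in the virtual screen
PROTEIN_SITES = 4                # binding sites across the two viral proteins


def docking_calculations(n_molecules: int, n_sites: int) -> int:
    """Count docking runs, assuming one run per molecule per site.

    This one-to-one pairing is an assumption made for illustration; the
    article reports only the totals, not how the runs were organized.
    """
    return n_molecules * n_sites


total = docking_calculations(MOLECULES_SCREENED, PROTEIN_SITES)
print(f"{total:,} docking calculations")
```

Under that assumption the product is 104 million, consistent with the "more than 100 million" figure reported.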
“Using the computational tools and data that we created from our American Heart Association’s Center for Accelerated Drug Discovery, we were able to computationally screen these molecules so quickly and at such a large scale,” Lightstone said. “This is the first step toward finding a new antiviral. We developed a whole pipeline for drug design and plan to continue in the coming weeks, ending with experimental testing of the predicted molecules. This should speed up the drug design process.”
Some models being used to determine the molecules’ safety are derived from the system developed through the multi-institutional ATOM (Accelerating Therapeutics for Opportunities in Medicine) consortium, a project aimed at speeding up cancer drug discovery. That work has helped the Lab evaluate molecules in a timely manner and produce models useful for any outbreak, researchers said.
Need for DOE lab capabilities
LLNL scientists called the COVID-19 pandemic a “wake-up call” signifying the need for a longer-term investment and sustained government-wide effort, particularly in applying high performance computing to personalized medicine.
“It has clarified the need for and value of leadership Department of Energy capabilities,” said Shankar Sundaram, director of LLNL’s Center for Bioengineering. “The Laboratory anticipated this kind of situation in pursuing a predictive biology initiative. The reason we were able to jump onto this quickly was not just because we had the capabilities, but because we’ve been thinking about these scenarios for a long time.”
LLNL is also adapting its portable, rapid PCR-based molecular diagnostics platform (Bio ID), developed by LLNL biomedical scientist Larry Dugan, as a potential tool to quickly diagnose COVID-19.
The overall COVID-19 response effort involves all 17 DOE national laboratories. LLNL’s response team includes scientists and engineers from the Lab’s Center for Bioengineering, Forensic Science Center and Biodefense Knowledge Center (BKC), Biosciences and Biotechnology Division, and the HPC capabilities of cluster systems Quartz, Lassen, Corona, Pascal and Catalyst.