To Save Lives, and Energy, Wellcome Sanger Institute Speeds Cancer Research With NVIDIA Accelerated Computing

The U.K.-based institute’s DNA sequencing lab analyzes tens of thousands of genomes a year, providing insights into cancer formation and treatment effectiveness.
by Harry Clifford

The Wellcome Sanger Institute, a key contributor to the international Human Genome Project, is turning to NVIDIA accelerated computing to save energy while saving lives.

With one of the world’s largest sequencing facilities, the U.K.-based institute has read more than 48 petabases — or 48 quadrillion base pairs — of DNA and RNA sequences to uncover crucial insights into health and disease.

Its Cancer, Ageing and Somatic Mutation (CASM) program sequences and analyzes tens of thousands of cancer genomes a year to study the mutational processes driving cancer formation, as well as genetic variations that determine treatment effectiveness.

To tackle such large-scale initiatives, the Sanger Institute is exploring the use of an NVIDIA DGX system with NVIDIA Parabricks, a scalable genomics analysis software suite that taps into accelerated computing to process data in just minutes.

“The Sanger Institute handles hundreds of thousands of somatic samples annually,” said Jingwei Wang, principal software developer for CASM at the Wellcome Sanger Institute. “NVIDIA accelerated computing and Parabricks will save us considerable time, cost and energy when analyzing samples, and we’re excited to explore NVIDIA’s advanced architectures, such as NVIDIA Grace and Grace Hopper, for even higher performance and efficiency.”

Reducing Runtime and Energy Consumption

The Sanger Institute develops high-throughput models of cancer samples for genome-wide functional screens and drug testing.

NVIDIA accelerated computing and software drastically reduce the institute’s analysis runtime and energy consumption per genome.

For genomic analysis, Sanger runs its proprietary CaVEMan workflow on CPUs, and it is tapping into Parabricks on NVIDIA GPUs to accelerate Burrows-Wheeler Aligner (BWA), a software package for mapping DNA sequences against a large reference genome.

Using one NVIDIA DGX system instead of 128 dual-socket CPU servers, the institute reduced runtime by 1.6x, costs by 24x and energy consumption by up to 42x.

Each year, the institute consumes about 125 million CPU hours per 10,000 genomes sequenced.

This means the Sanger Institute could save $1 million and 1,000 megawatt-hours each year by switching to BWA with Parabricks on GPUs. That’s about the amount of energy needed to power an average American home for a century.
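As a rough sanity check on the home-energy comparison above, the arithmetic can be sketched in a few lines of Python. The 1,000 megawatt-hour figure comes from the article. The household figure of roughly 10,500 kilowatt-hours per year is an assumption, in line with commonly cited U.S. averages, and is not a number from the Sanger Institute or NVIDIA. The function and constant names are illustrative only.

```python
# Rough check of the "power an average American home for a century" comparison.

ANNUAL_SAVINGS_MWH = 1_000  # estimated yearly energy savings quoted in the article

# ASSUMPTION: approximate annual electricity use of an average U.S. home, in kWh.
# This value is not from the article; adjust it to the figure you trust.
ASSUMED_HOME_KWH_PER_YEAR = 10_500


def home_years(savings_mwh: float, home_kwh_per_year: float) -> float:
    """Return how many years the saved energy could power a single home."""
    savings_kwh = savings_mwh * 1_000  # 1 MWh = 1,000 kWh
    return savings_kwh / home_kwh_per_year


years = home_years(ANNUAL_SAVINGS_MWH, ASSUMED_HOME_KWH_PER_YEAR)
print(f"{years:.0f} home-years")  # prints "95 home-years"
```

Under that assumption the savings come to about 95 years of household electricity, which is consistent with the article's "a century" rounding. A different household figure shifts the result proportionally.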

Collaborating With Industry Leaders

The Sanger Institute’s NVIDIA-accelerated sequencing lab can be considered an AI factory, where data comes in and intelligence comes out.

AI factories are next-generation data centers that host advanced, full-stack accelerated computing platforms for the most computationally intensive tasks.

As it explores crucial scientific questions to discover new cancer genes and mutational processes, the Sanger Institute is boosting operational and energy efficiency by using NVIDIA infrastructure for its AI factory.

In addition, companies and organizations building AI factories are participating in cross-industry collaborations with leaders like Schneider Electric, an energy management and automation company, to optimize data center designs for running demanding workloads in the most energy-efficient way.

The Sanger Institute is collaborating with Schneider Electric to minimize data center downtime and equip the DNA sequencing lab’s data center with uninterruptible power supplies and cooling equipment, among other technologies pivotal to reducing energy consumption.

At the NVIDIA GTC conference in March, Schneider Electric announced it’s helping organizations across industries optimize infrastructure by releasing AI data center reference designs tailored for NVIDIA accelerated computing clusters.

The reference designs — built for data processing, engineering simulation, electronic design automation, computer-aided drug design and generative AI — will focus on high-power distribution, liquid-cooling systems and other aspects of scalable, high-performance, sustainable data centers.

In a panel hosted by The Economist at NYC Climate Week this week, representatives from Sanger, Schneider Electric and NVIDIA will talk about their work.

Learn more about sustainable computing and the Sanger Institute’s potentially life-saving work.

Featured image courtesy of the Wellcome Sanger Institute.