BEAGLE Eyes: Hidden Evolutionary History Uncovered Using GPUs

by Tonie Hansen

Editor’s note: This is one of five profiles of finalists for NVIDIA’s 2017 Global Impact Award, which provides $150,000 to researchers using NVIDIA technology for groundbreaking work that addresses social, humanitarian and environmental problems.

The HMS Beagle took naturalist Charles Darwin on a round-the-world journey that helped him unlock ideas about the process of evolution. Almost two centuries later, another BEAGLE is helping scientists unblock bottlenecks in complex genetic data to advance our understanding of the living world, and potentially save lives.

Monkey flower

This 21st century BEAGLE, which stands for Broad-platform Evolutionary Analysis General Likelihood Evaluator, is an open source library and API that runs on NVIDIA GPUs. The speed with which its software can crunch through data is critical to analyzing biological sequence data such as DNA and RNA, which carry the genetic information of all living organisms and viruses, including those that cause AIDS, influenza and Ebola.

Thanks to its fast and accurate computations of specific models, BEAGLE has become an essential component in the software workflow of many scientists studying the evolutionary history of organisms. The field, known as phylogenetic inference, spans everything from plague-causing bacteria to the study of how monkey flowers adapted to different geographic regions.

Michael Cummings, a professor at the University of Maryland’s Institute for Advanced Computer Studies, and colleague Daniel Ayres, who handled software design and programming, led the development of BEAGLE.

Their efforts have placed them among five finalists for NVIDIA’s 2017 Global Impact Award. Our annual grant of $150,000 is given to researchers using NVIDIA technology for groundbreaking work that addresses social, humanitarian and environmental problems.

Computer Catchup

Cummings first had the idea of using GPUs for phylogenetic analysis in 2003, but the nascent development frameworks of the time weren’t very capable. With the advent of NVIDIA’s CUDA in 2007, the rise of GPUs for high-performance computing, and funding from the National Science Foundation, BEAGLE came alive.

Working with the giant, computationally demanding datasets used in phylogenetic inference is slow going, and prone to logjams. With the ability to produce results quickly, researchers have a better chance of helping public health agencies react to health threats.

Phylogenetic relationships describe the inferred evolutionary relationships among various biological species. Think of Darwin seeking the connections between varieties of finches living on different islands. Researchers use BEAGLE to similarly understand the evolutionary dynamics of organisms that might otherwise seem unconnected.

Ebola virus

Powerful Performance

Now, with BEAGLE’s powerful GPU performance, scientists can use more complex models and larger datasets. This improves the quality of their inferences and delivers results in much less time.

“BEAGLE is used for inferring the evolutionary history of influenza and Ebola,” said Cummings. “That’s allowed scientists to try and see where outbreaks originated both geographically and in specific time periods.”

The BEAGLE library is part of the CIPRES Science Gateway portal, a public resource for phylogenetic analyses. The computing infrastructure includes a compute cluster populated with NVIDIA Tesla K20 cards.

The team’s latest work on the CUDA platform uses Tesla K40 and Quadro P5000 cards. They harness the large number of processing cores to efficiently parallelize calculations when implementing novel computational methods.
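For readers curious about what kind of calculation gets parallelized, here is a minimal sketch in Python. The article doesn't spell out BEAGLE's internals, so everything in it is an illustrative assumption: a toy three-species tree with made-up branch lengths, the simple Jukes-Cantor substitution model, and hypothetical function names that are not part of the BEAGLE API. It shows a phylogenetic likelihood computed column by column over an alignment. Each column is independent of the others, which is the kind of structure that maps well onto many GPU cores.

```python
import math

STATES = "ACGT"

def jc69_matrix(t):
    """Jukes-Cantor transition probabilities for a branch of length t
    (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    same = 0.25 + 0.75 * e
    diff = 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]

def tip_partials(base):
    """Partial likelihoods at a leaf: 1 for the observed base, 0 otherwise."""
    return [1.0 if s == base else 0.0 for s in STATES]

def combine(left, t_left, right, t_right):
    """Partial likelihoods at a parent node from its two children (one site)."""
    p_left, p_right = jc69_matrix(t_left), jc69_matrix(t_right)
    out = []
    for i in range(4):
        from_left = sum(p_left[i][j] * left[j] for j in range(4))
        from_right = sum(p_right[i][j] * right[j] for j in range(4))
        out.append(from_left * from_right)
    return out

def site_likelihood(a, b, c):
    """Likelihood of one alignment column on the toy tree
    ((A:0.1,B:0.1):0.05,C:0.15). Branch lengths are made up."""
    ab = combine(tip_partials(a), 0.1, tip_partials(b), 0.1)
    root = combine(ab, 0.05, tip_partials(c), 0.15)
    # Weight each root state by its equilibrium frequency (0.25 under JC69).
    return sum(0.25 * x for x in root)

def log_likelihood(seq_a, seq_b, seq_c):
    """Sum of per-column log-likelihoods. The columns do not depend on one
    another, so this loop is the part that could run in parallel."""
    return sum(math.log(site_likelihood(a, b, c))
               for a, b, c in zip(seq_a, seq_b, seq_c))

# Example call on three short, invented sequences.
print(log_likelihood("ACGTAC", "ACGTAC", "ACGAAC"))
```

Real analyses involve far more species and sites, plus richer models, and they repeat this calculation many times while searching over candidate trees. That repetition is why speeding up the inner loop matters.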

Disease Outbreaks

Some of the most widely used programs in evolutionary biology have adopted the BEAGLE library, giving access to thousands of scientists working on other disease-causing viruses, such as HIV, dengue virus and foot-and-mouth disease virus.

Studies include air travel-associated spread of Dengue virus in Brazil, multiple episodes of polioviruses in Nigeria and global epidemics of drug-resistant bacteria causing gastroenteritis.

Among animal populations, BEAGLE has been used to characterize swine influenza viruses in North America, to study the relationship between waterfowl migration and influenza in Korea, and to examine the dynamics of vampire bat-transmitted rabies in Argentina, to name a few.

Vampire bat

“BEAGLE is used in a variety of studies that contribute to our understanding of evolution and biology, which can have a role in informing decision-making,” Ayres said. “The performance afforded by GPUs is especially relevant in epidemiology, where one might be tasked with characterizing fast-spreading disease agents.”

Cummings and Ayres are now focused on taking fuller advantage of NVIDIA’s powerful Pascal and upcoming Volta processor architectures to further increase performance. The BEAGLE project has benefited from a community of scientists who contributed to its development, including Marc Suchard, a professor at UCLA, and Andrew Rambaut, a professor at the University of Edinburgh.

The winner of the 2017 Global Impact Award will be revealed at the GPU Technology Conference, May 8-11, in Silicon Valley. To register for the conference, visit our GTC registration page.

Other Global Impact Award 2017 finalists include: The Indian Institute of Technology Guwahati.

Check out the work of last year’s Global Impact Award winner.

AI Podcast: Deep Learning Hears Once Extinct Bird

And if you’re interested in the ways technology and biology intersect, this episode of our AI Podcast is worth a listen. We spoke with Matthew McKown, CEO of Conservation Metrics, about how deep learning techniques helped rediscover a bird that was once thought extinct, and how GPU-powered AI now helps biologists crunch vast quantities of data to spot trends that would have been impossible to detect before.