Faster Physics: How AI and NVIDIA A100 GPUs Automate Particle Physics

by Brian Caulfield

What are the fundamental laws that govern our universe? How did the matter in the universe today get there? What exactly is dark matter?

The questions may be eternal, but no human scientist has an eternity to answer them.

Now, thanks to NVIDIA technology and cutting-edge AI, the more than 1,000 collaborators from 26 countries working on the Belle II particle physics experiment are able to learn more about these big questions, faster.

The Belle II detector, based just north of Tokyo, reproduces the particles created in the early universe by smashing high-energy electrons and anti-electrons (positrons) together.

These collisions generate a serious amount of data. Researchers will make high-precision recordings of hundreds of billions of collisions over the experiment’s lifetime. Sifting through all this data, without sacrificing the detailed information needed for high-precision measurements, is a daunting task.

To reconstruct how the individual particles detected at Belle II decayed from larger groups of particles, researchers turned to AI, says James Kahn, a Belle II researcher at the Karlsruhe Institute of Technology, or KIT, and an AI consultant with Helmholtz AI, a German public research platform for applied AI.

“Given the successes of AI and its ability to learn by example on large volumes of data, this is the perfect place to apply it,” Kahn said.

And to accelerate that AI, they’re using the NVIDIA Ampere architecture’s multi-instance GPU technology, built into the NVIDIA A100 GPU.

Physics Meets the A100

Kahn’s team was able to get early access to the “fresh out of the oven” NVIDIA DGX A100, a compact system packing 5 petaflops of AI computing power.

It’s among the first in Europe, and the first connected via InfiniBand high-speed interconnect technology. It was installed at KIT thanks to the high-performance computing operations team at the Steinbuch Center for Computing.

The close collaboration among the AI consultant team, international scientists and the HPC operations team is expected to benefit future research.

“We are really happy to see that only a few hours after we had the DGX A100 up and running, scientific analyses were already being performed,” said Jennifer Buchmüller, HPC core facility leader at KIT.

There’s more to come: HoreKa, the next supercomputer at KIT, will be equipped with more than 740 NVIDIA A100 GPUs.

A New View on Particle Decays

All of this helps Kahn and his team accelerate a new approach developed at KIT in collaboration with researchers from the nearby University of Strasbourg.

By designing a new representation of particle decays, or how unstable subatomic particles fall apart, Kahn’s team has been able to use a specialized neural network, known as a graph neural network, to automate the reconstruction of the particle decays from the individual particles detected by Belle II.

“We realized we could re-express particle decays in terms of the detected particles’ relations alone,” said Kahn. “This was the key ingredient to enable a full, end-to-end AI solution.”
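
To illustrate what expressing a decay purely through the relations between detected particles could look like, the sketch below encodes a small decay tree as a matrix over its final-state particles, where each entry is the depth of the lowest common ancestor of a pair. This encoding, the toy decay and the names ancestors and lca_matrix are assumptions made for the example; the article does not say which representation Kahn's team used.

```python
def ancestors(node, parent):
    """Return the path from node up to the root, starting with node itself."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def lca_matrix(parent, leaves):
    """Encode a decay tree using only pairwise relations between its leaves.

    Entry (i, j) is the depth (root = 0) of the lowest common ancestor of
    leaves i and j. On the diagonal this is the depth of the leaf itself.
    """
    paths = {leaf: ancestors(leaf, parent) for leaf in leaves}
    size = len(leaves)
    matrix = [[0] * size for _ in range(size)]
    for i, a in enumerate(leaves):
        for j, b in enumerate(leaves):
            on_b_path = set(paths[b])
            # First node on a's path to the root that also lies on b's path.
            lca = next(node for node in paths[a] if node in on_b_path)
            matrix[i][j] = len(ancestors(lca, parent)) - 1
    return matrix


# Hypothetical toy decay: B -> D pi1, followed by D -> K pi2.
# Only K, pi2 and pi1 would reach the detector.
TOY_PARENT = {"D": "B", "pi1": "B", "K": "D", "pi2": "D"}
TOY_LEAVES = ["K", "pi2", "pi1"]

print(lca_matrix(TOY_PARENT, TOY_LEAVES))  # [[2, 1, 0], [1, 2, 0], [0, 0, 1]]
```

Because every entry refers only to a pair of detected particles, a matrix like this could in principle serve as the training target for a network that never sees the intermediate particles directly.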

The team has already demonstrated this technique’s success on a selection of specially designed simulations of particle decays, and recently scaled up to simulations of the interactions occurring at Belle II.

Scaling up, however, required resources that could handle both the volume of data and the large neural networks trained on it.

To do so, they partitioned the GPUs using multi-instance GPU technology, which allows a single GPU to run multiple tasks simultaneously, and spread a search of the network hyperparameters across the resulting instances.
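
A hyperparameter search spread over GPU partitions might be organized roughly as in the sketch below. An A100 can be split into up to seven multi-instance GPU partitions, and a process can be pinned to one of them through the CUDA_VISIBLE_DEVICES environment variable. Everything else here is an assumption for illustration: the device identifiers are placeholders (real ones are listed by nvidia-smi -L), the search space is invented, and the helper names are hypothetical. The sketch only plans the jobs and does not launch any training.

```python
import itertools
import os


def build_grid(search_space):
    """Expand a dict of candidate values into a list of configurations."""
    keys = sorted(search_space)
    value_lists = (search_space[key] for key in keys)
    return [dict(zip(keys, values)) for values in itertools.product(*value_lists)]


def assign_round_robin(configs, devices):
    """Spread configurations evenly over the available GPU instances."""
    plan = {device: [] for device in devices}
    for index, config in enumerate(configs):
        plan[devices[index % len(devices)]].append(config)
    return plan


def job_environment(device):
    """Copy the current environment and pin the job to a single instance."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = device
    return env


# Placeholder identifiers, not real MIG UUIDs.
MIG_DEVICES = ["MIG-placeholder-0", "MIG-placeholder-1", "MIG-placeholder-2"]

# Invented search space for a small network.
SEARCH_SPACE = {
    "learning_rate": [1e-3, 1e-4],
    "hidden_size": [64, 128],
    "num_layers": [2, 4],
}

grid = build_grid(SEARCH_SPACE)
plan = assign_round_robin(grid, MIG_DEVICES)

for device, configs in plan.items():
    print(device, len(configs), "configurations")
```

Each entry in the plan could then be handed to a separate training process started with the environment returned by job_environment, so that the trials on different instances run side by side on the same physical GPU.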

“Architecture searches which took days could now be completed in a matter of hours,” Kahn said.

The result: more time for more science, and for more of those eternal questions to be asked, and answered.