Three, it turns out, is better than one. At least that’s how it worked for a trio of former rivals who teamed up to claim the just-announced top prize in this year’s Data Science Bowl.
The fourth annual event focused on one of healthcare’s most pressing problems — the soaring cost and time needed to discover new drugs. A record-setting 18,000 participants battled over 90 days to deliver a deep learning algorithm to accelerate a crucial step in the drug-discovery pipeline: identifying the nucleus of each cell.
This year’s Data Science Bowl was “driven by a very real need to develop new treatments faster and more accurately,” said Anne Carpenter, director of the imaging platform at the Broad Institute of MIT and Harvard, the nonprofit partner for the contest.
International Team Takes the Prize
The winners beat out nearly 4,000 teams to win the Data Science Bowl, presented by the consulting firm Booz Allen Hamilton and the Kaggle platform for data science competitions, with additional sponsorship from NVIDIA and the medical diagnostics company PerkinElmer. Creators of the top algorithms will split $170,000 in cash and prizes, including powerful NVIDIA GPU hardware for deep learning.
In addition to the difficulty of spotting cell nuclei in dense medical images, the winning threesome (Selim Seferbekov, Alexander Buslaev and Victor Durnov) faced the challenge of collaborating across six time zones and three countries: Germany, Belarus and Russia. Using our GPUs for both training and inference, the team toiled for some 300 hours to create and implement their algorithm.
Their efforts paid off: Together they’ll collect $50,000 in cash, plus an estimated $70,000 in the latest NVIDIA GPUs built on our new Volta architecture. Volta uses NVIDIA CUDA Tensor Cores to deliver unprecedented levels of deep learning performance in hardware like our DGX Station, one of the most powerful tools for researchers.
Record-Setting Data Science Bowl
Collectively, competition participants worked an estimated 288,000 hours and submitted 68,000 algorithms, nearly three times as many submissions as in last year’s Data Science Bowl.
All three top teams used our GPUs to achieve their winning results. The other two top-three finishers were:
- Second Place ($25,000): Minxi Jiang, chief data scientist at a Beijing-based startup, who finished in the top one percent in last year’s Data Science Bowl.
- Third Place ($12,000): Angel Lopez-Urrutia, a marine biologist in Spain who uses machine learning to automatically classify images of plankton, a challenge that was central to the inaugural Data Science Bowl.
Drug Discovery Bottleneck
Finding new drugs is a complex and laborious task that can cost billions and take a decade or more per treatment. Biochemists try thousands of chemical compounds to figure out which, if any, are effective against a particular virus or bacterium, or which cause a desired reaction in the human body. They do that by measuring how diseased and healthy cells respond to various treatments.
Because nearly all human cells contain a nucleus, the most direct route to identifying each cell is to spot the nucleus. Existing methods require time-consuming researcher oversight. Sometimes biologists have no choice but to personally examine thousands of images to complete their experiments.
“By identifying nuclei quickly and accurately, the algorithms developed in this competition can free up biologists to focus on other aspects of their research, shortening the approximately 10 years it takes for each new drug to come to market and, ultimately, improving quality of life,” said Ray Hensberger, a Booz Allen Hamilton principal.
Carpenter, of the Broad Institute, aims to use a winning algorithm to build deep learning software for drug discovery. The institute is now exploring the idea of creating user-friendly, open-source software that biomedical researchers can use in their day-to-day work.
Learn more about NVIDIA technology to advance deep learning in healthcare.
* Main image for this story shows human cell nuclei, which contain most of a cell’s genetic material. RNA-processing proteins are in red and chromosomes are in blue. Image courtesy of the National Cancer Institute.