Get Well Sooner: Data Science Bowl Aims to Speed Drug Discovery

More than 2,100 teams are competing in the fourth annual Data Science Bowl.
by Kimberly Powell

When it comes to developing drugs to fight disease, a billion dollars doesn’t buy what it used to.

Not so long ago, a fortune that size might have covered the cost of creating 30 new medicines. Today, it won’t even buy you one.

This year’s Data Science Bowl could help cure what ails drug discovery. In the world’s largest AI competition for social good, more than 2,100 teams are vying to reduce the soaring costs and testing times for new drugs.

Their challenge is to use deep learning to speed up, and improve the accuracy of, a crucial step in the drug-discovery pipeline: identifying the nucleus of each cell.

“The 2018 Data Science Bowl is driven by a very real need to develop new treatments faster and more accurately,” said Anne Carpenter, director of the imaging platform at the Broad Institute of MIT and Harvard, the nonprofit partner for the competition.

Intrigued? It’s not too late to get in on the action: The entry deadline is April 9. Final submissions are due on April 16.

$100,000 in Cash, and a Deep Learning Supercomputer

The fourth annual Data Science Bowl calls on participants from around the world to train deep learning models to examine cell images and identify nuclei.

“If we can improve that process, new cures and new treatments could come out at a much faster pace,” said Ray Hensberger, director of data solutions and machine intelligence at the consulting firm Booz Allen Hamilton.

Booz Allen and the Kaggle platform for data science competitions are co-presenting the contest, with additional sponsorship from NVIDIA, the medical diagnostics company PerkinElmer and others. In addition to potentially advancing drug discovery, top teams will split $170,000 in cash and prizes, including an NVIDIA DGX Station personal AI supercomputer.

The Opposite of Moore’s Law

Finding new drugs is a complex and laborious task that can cost billions and take a decade or more per treatment. Despite improvements in technology, the cost of developing a new drug roughly doubles every nine years, according to an observation known as Eroom’s law (that’s “Moore” spelled backward).
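To make the doubling arithmetic concrete, here is a minimal Python sketch. It assumes a clean exponential with a nine-year doubling period, which is a simplification of the trend rather than a model taken from the article. The function names are hypothetical and chosen only for illustration.

```python
import math


def cost_multiplier(years, doubling_period=9.0):
    """Factor by which cost grows after `years`, assuming it doubles
    every `doubling_period` years (an idealized reading of Eroom's law)."""
    return 2.0 ** (years / doubling_period)


def years_to_multiply(factor, doubling_period=9.0):
    """Years needed for cost to grow by `factor` under the same assumption."""
    return doubling_period * math.log2(factor)


if __name__ == "__main__":
    # After one, two and three doubling periods the cost is 2x, 4x and 8x.
    for years in (9, 18, 27):
        print(years, "years ->", cost_multiplier(years), "x")
```

Under this assumption, three doubling periods (27 years) already make a drug eight times as expensive to develop.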

Biochemists try thousands of chemical compounds to figure out which, if any, are effective against a particular virus or bacterium, or which cause a desired reaction in the human body. They do that by measuring how diseased and healthy cells respond to various treatments.

Because nearly all human cells contain a nucleus, the most direct route to identifying each cell is to spot the nucleus, Carpenter said. Today’s image-processing algorithms can find nuclei and measure the disease status of cells, but they work best when nuclei are fairly round and not too crowded, she said.

The algorithms fall short when nuclei have unusual shapes or are crowded together, which occurs in complicated experiments involving tissue samples.
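A toy Python sketch can show why crowding is a problem for simple methods. It is not the Broad Institute’s software, and it assumes the image has already been reduced to a binary mask in which 1 marks a nucleus pixel. The function `count_blobs` and the two sample masks are hypothetical. The sketch counts groups of connected pixels, a common building block of classical image processing, and shows that two touching nuclei are counted as one object.

```python
from collections import deque


def count_blobs(mask):
    """Count 4-connected groups of 1s in a binary grid (a list of lists)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Found an unvisited foreground pixel: start a new blob
            # and flood-fill everything connected to it.
            blobs += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return blobs


# Two well-separated "nuclei" (hypothetical data).
SEPARATED = [
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
]

# The same two "nuclei" pushed together until they touch.
TOUCHING = [
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0, 0],
]

if __name__ == "__main__":
    print(count_blobs(SEPARATED))  # 2
    print(count_blobs(TOUCHING))   # 1
```

Telling the touching pair apart takes extra information about shape or texture, which is the kind of cue competitors are training deep learning models to pick up.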

“Sometimes biologists have no choice but to personally examine thousands of images to complete their experiments,” Carpenter said. She discusses additional technical details of the Data Science Bowl in the video below:

Deep Learning Drug Discovery

Existing methods also require scientists to repeatedly revise algorithms to fit different types of images and cells. Carpenter wants deep learning software to do that without biologists’ intervention, saving hundreds of thousands of hours a year and opening up faster channels for new discoveries.

She hopes to use a winning algorithm to build deep learning software for drug discovery.

“Deep learning will help find relevant details within an image that match or surpass the power of human observation,” said Kyle Karhohs, a postdoctoral researcher working in Carpenter’s lab. “This could increase the scale and speed of drug discovery, and enhance our ability to characterize the biology in images with unprecedented precision and accuracy.”

To find out how the Data Science Bowl and similar AI competitions can help society, attend “AI for Social Good as a Technology Driver” and other sessions at the GPU Technology Conference, March 26-29, in Silicon Valley. Register today.

* The main photo in this story shows the inside of a human cell nucleus. Image by Steve Mabon and Tom Misteli of the National Cancer Institute.