Evening the Odds: Cornell’s STORK AI Tool Evaluates Embryo Candidates for Better IVF

by Isha Salian

There’s less than a 50 percent chance that a round of in vitro fertilization — one of the most common treatments for infertility, running up to $15,000 — will succeed. But those odds could be dramatically improved with an AI tool developed by researchers at Cornell University.

Introduced in 1978, IVF is a process through which eggs are fertilized with sperm in a lab, creating multiple embryos that can be transferred into a patient’s uterus. Clinics monitor embryo development to pick the highest-quality embryos for transfer, improving the odds of pregnancy.

Still, fewer than half of transferred blastocysts (embryos that have grown for around five days) successfully implant in a patient’s uterus, according to the CDC. That figure drops below 15 percent for patients over the age of 40.

Cornell researchers created an AI model dubbed STORK that uses convolutional neural networks to analyze embryo growth and evaluate which candidates are most likely to lead to successful implantation. The model was trained and tested on a dataset of more than 10,000 time-lapse images of human embryos.

To increase the probability of pregnancy, clinics often transfer multiple embryos at once. And that carries risks.

“This can lead to twins, triplets and other multiples, which adds to the complications,” said Iman Hajirasouliha, assistant professor of computational genomics at Weill Cornell Medicine. “If we can reliably predict the implantation success rate based on an algorithm, then we can limit the number of transfers.”

Betting on the Best Embryo Candidate

Over 2.5 million cycles of IVF are performed each year, resulting in around 500,000 births. For each of these cycles, the task of choosing which embryos are most likely to result in a successful pregnancy lies with a team of embryologists.

These experts manually grade the developing embryos based on time-lapse images — a time-consuming and subjective evaluation. With no universal grading system, there’s little agreement among embryologists on which are the best embryo candidates.

The scientists developing STORK found that a panel of five embryologists unanimously agreed less than 25 percent of the time on whether an embryo was high, fair or low quality.

In contrast, STORK’s predictions agreed with the embryologist panel’s majority vote more than 95 percent of the time — suggesting that the tool may outperform individual embryologists and bring better consistency to the embryo evaluation process.
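To make the two agreement figures concrete, here is a minimal Python sketch of how a unanimous-agreement rate and a match rate against a panel's majority vote could be computed. It is an illustration only, not the researchers' evaluation code: the function names, the choice to skip embryos where the panel's vote is tied, and the toy grades are all assumptions made for this example.

```python
from collections import Counter


def majority_vote(grades):
    """Return the most common grade in a panel, or None if the top two tie."""
    counts = Counter(grades).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # e.g. a 2-2-1 split among five graders
    return counts[0][0]


def unanimous_rate(panel_grades):
    """Fraction of embryos on which every panel member gave the same grade."""
    unanimous = sum(len(set(grades)) == 1 for grades in panel_grades)
    return unanimous / len(panel_grades)


def agreement_rate(panel_grades, model_grades):
    """Fraction of embryos where the model matches the panel's majority vote.

    Assumption for this sketch: embryos with a tied panel vote are skipped.
    """
    scored = 0
    matches = 0
    for grades, predicted in zip(panel_grades, model_grades):
        consensus = majority_vote(grades)
        if consensus is None:
            continue
        scored += 1
        matches += int(predicted == consensus)
    return matches / scored if scored else 0.0


# Invented toy grades for four embryos, five graders each (not study data).
toy_panel = [
    ["high", "high", "high", "high", "high"],
    ["high", "high", "fair", "fair", "low"],
    ["low", "low", "low", "fair", "fair"],
    ["fair", "fair", "fair", "high", "low"],
]
toy_model = ["high", "high", "low", "high"]

print(unanimous_rate(toy_panel))
print(agreement_rate(toy_panel, toy_model))
```

The two numbers measure different things: the first asks whether all five graders agree with one another, while the second only asks whether the model lands on the grade most of them chose.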

AI is also much faster at analyzing the image data. A clinic that treats around 4,000 people a year may have three embryologists manually evaluate embryo candidates for each patient. STORK can evaluate embryo candidate quality for 2,000 patients in just four minutes.

The Cornell researchers developed the deep learning model using the TensorFlow framework and four NVIDIA GPUs, accelerating the training process up to 4x over CPUs.

So far, the scientists have tested their tool on embryo images from clinics in New York, Spain and the United Kingdom. They hope any IVF facility that collects time-series images of embryos could use the tool.

However, embryo quality is just one clinical factor behind IVF success rates. Patient age is a key variable affecting the probability of implantation — and the likelihood of a healthy full-term pregnancy.

To better assess the rate of successful pregnancy and live birth, the researchers have developed a decision tree model that incorporates STORK’s embryo quality analyses as well as patient age data.
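As a rough picture of what such a model produces, the Python sketch below groups transfer records by embryo quality grade and patient age band, then reports the success rate in each group, much as the leaves of a decision tree would. It is a simplification under stated assumptions, not the researchers' model: the splits here are fixed by hand rather than learned from data, and the age cut-offs, record layout and toy records are invented placeholders.

```python
from collections import defaultdict

# Hypothetical age cut-offs, chosen only for illustration.
AGE_BANDS = [(35, "under 35"), (41, "35 to 40")]
OLDEST_BAND = "41 and over"


def age_band(age):
    """Map a patient age to the label of the first band it falls below."""
    for upper, label in AGE_BANDS:
        if age < upper:
            return label
    return OLDEST_BAND


def leaf_rates(records):
    """Map (quality, age band) to the fraction of successful transfers.

    Each record is a (quality_grade, patient_age, success) tuple.
    """
    totals = defaultdict(lambda: [0, 0])  # [successes, transfers]
    for quality, age, success in records:
        leaf = totals[(quality, age_band(age))]
        leaf[0] += int(success)
        leaf[1] += 1
    return {key: wins / count for key, (wins, count) in totals.items()}


# Invented toy records (not clinical data).
toy_records = [
    ("high", 31, True),
    ("high", 33, True),
    ("high", 34, False),
    ("high", 43, False),
    ("low", 29, False),
    ("low", 44, False),
]

for leaf, rate in sorted(leaf_rates(toy_records).items()):
    print(leaf, round(rate, 2))
```

A learned decision tree would choose its own split points from the data instead of relying on bands fixed in advance.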