Breast density readings, an assessment that indicates a patient’s risk of developing breast cancer, vary widely depending on which radiologist analyzes the mammogram.
One study found that radiologists classified anywhere from six to 85 percent of mammograms into the higher-risk categories of “heterogeneously dense” or “extremely dense.”
Researchers at MIT are using neural networks to reduce this variation in radiologists’ interpretation of mammograms.
Their deep learning model is being used by radiologists in Massachusetts General Hospital’s screening centers. It’s the first time such a model has been deployed in the daily workflow of a large-scale clinical practice, according to the researchers.
A Better Picture of Risk
Around 33 million screening mammography exams are performed in the U.S. each year. These screens can reveal the presence of breast cancer before any symptoms appear, but they also include another important assessment: breast tissue density.
When assessing a mammogram, radiologists categorize the scan into one of four buckets, depending on the density and distribution of breast tissue: fatty, scattered, heterogeneously dense or extremely dense.
The latter two categories are the ones to watch out for. If a mammogram falls into one of them, it means there’s a higher proportion of dense, supportive breast tissue. Unlike fatty tissue, which appears dark and transparent, dense supportive tissue shows up opaque on a mammogram, obscuring other parts of the breast and making it harder to spot abnormalities.
It’s also an independent risk factor for cancer — women with high breast density are four to five times more likely to get breast cancer than those with low breast density.
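For readers who think in code, the four categories map naturally onto a small lookup table. Here’s a minimal sketch in Python (the label names follow the article; the index-to-label mapping is illustrative, not taken from the team’s code):

```python
# Illustrative only: the four density categories described above, ordered
# from least to most dense. The index-to-label mapping is an assumption,
# not the MIT team's actual code.
DENSITY_CATEGORIES = [
    "fatty",
    "scattered",
    "heterogeneously dense",  # higher risk
    "extremely dense",        # higher risk
]

def label_from_prediction(class_index: int) -> str:
    """Map a 4-class classifier output index to its density category."""
    return DENSITY_CATEGORIES[class_index]
```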
Roughly half of American women ages 40 to 74 are assessed as having dense breasts, which means they may need to undergo other screening methods like MRI due to the higher long-term risk of developing breast cancer.
Deep learning can help give patients the most consistent screening results, providing them with a better understanding of risk.
Breast density is a holistic feature, an attribute gauged from the whole mammogram rather than any single region. That makes it easier for neural networks to analyze, said Kyle Swanson, an MIT graduate student and paper co-author.
“It’s not just, ‘Is there dense tissue?’, but ‘What’s the structure? How does it look holistically?’” said Adam Yala, second author on the paper and a Ph.D. student in MIT’s Computer Science and Artificial Intelligence Laboratory. “That global pattern is something a neural network can learn.”
The team trained their deep learning tool on thousands of labeled digital mammograms, assessed by different radiologists.
As a result, the neural network’s mammogram assessments match the consensus reading of multiple radiologists more closely than any individual radiologist’s assessment does. In a clinical setting, this lets radiologists make a decision about a scan with that consensus-level assessment in mind.
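The article doesn’t spell out how that consensus reading is formed; one common choice is a simple majority vote across readers. Here’s a minimal sketch of that idea, with hypothetical data:

```python
from collections import Counter

def consensus_label(readings: list[str]) -> str:
    """Return the most common assessment among several radiologists.

    Majority vote is an assumed consensus rule, used here for
    illustration; ties are broken arbitrarily by Counter ordering.
    """
    return Counter(readings).most_common(1)[0][0]

# Hypothetical readings of one mammogram by three radiologists.
print(consensus_label(["scattered", "heterogeneously dense", "scattered"]))
# -> "scattered"
```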
Bringing Deep Learning to the Clinic
Since January, radiologists at Mass General screening centers have been using the deep learning model as part of their clinical workflow. When analyzing mammograms, radiologists see the assessment made by the deep learning model and decide whether or not to agree with it.
To evaluate the model’s success, the researchers recorded how often the neural network’s assessments of more than 10,000 scans were accepted by the interpreting radiologist.
When radiologists read mammograms without first seeing the model’s decision, their assessments agreed with the neural network 87 percent of the time. But when shown the deep learning assessment, the mammographers agreed with the model 94 percent of the time.
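Agreement here is simply the fraction of scans on which the radiologist’s final call matches the model’s. A quick sketch of that calculation (the function name and inputs are hypothetical):

```python
def agreement_rate(model_labels: list[str], radiologist_labels: list[str]) -> float:
    """Fraction of scans where the radiologist's assessment matches the model's."""
    if len(model_labels) != len(radiologist_labels):
        raise ValueError("label lists must be the same length")
    matches = sum(m == r for m, r in zip(model_labels, radiologist_labels))
    return matches / len(model_labels)

# A 94 percent rate on 10,000 scans would mean the two matched on 9,400.
```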
The paper’s results show the deep learning model can read scans at the level of experienced radiologists and improve the consistency of their density assessments. Automated approaches that don’t use deep learning show lower agreement with radiologists, Yala said.
So far, the deep learning model has been used by radiologists on around 18,000 mammograms. The researchers use NVIDIA GPUs to train their convolutional neural networks, developed using the PyTorch deep learning framework.
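The article doesn’t describe the network itself, but as a rough sketch of what a PyTorch convolutional classifier for this task could look like (the architecture, layer sizes, and input resolution are all assumptions, not the team’s model):

```python
import torch
import torch.nn as nn

class DensityClassifier(nn.Module):
    """Toy CNN for four-class breast density classification.

    Illustrative only: layer sizes, depth, and input resolution are
    assumptions, not the architecture the MIT team actually used.
    """

    def __init__(self, num_classes: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),  # mammograms are grayscale
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )
        # Global average pooling mirrors the "holistic" nature of density:
        # the prediction summarizes the whole image, not one local patch.
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        x = self.pool(x).flatten(1)
        return self.classifier(x)

model = DensityClassifier()
logits = model(torch.randn(1, 1, 256, 256))  # one fake 256x256 mammogram
print(logits.shape)  # torch.Size([1, 4])
```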
Yala says the goal is to reduce the variation in this subjective judgment, making sure that patients are sorted into the correct risk bucket.
“It shouldn’t be a matter of luck,” he said. “Anyone should be able to give you the same assessment.”
Density assessment is just the first step: the researchers are also working on deep learning tools to predict, five years in advance, which patients are at high risk of developing cancer.