AI model identifies certain breast tumor stages likely to progress to invasive cancer

Adam Zewe | MIT News • July 22, 2024

Ductal carcinoma in situ (DCIS) is a type of preinvasive tumor that sometimes progresses to a highly deadly form of breast cancer. It accounts for about 25 percent of all breast cancer diagnoses.

Because it is difficult for clinicians to determine the type and stage of DCIS, patients with DCIS are often overtreated. To address this, an interdisciplinary team of researchers from MIT and ETH Zurich developed an AI model that can identify the different stages of DCIS from a cheap and easy-to-obtain breast tissue image. Their model shows that both the state and arrangement of cells in a tissue sample are important for determining the stage of DCIS.

Because such tissue images are so easy to obtain, the researchers were able to build one of the largest datasets of its kind, which they used to train and test their model. When they compared its predictions to conclusions of a pathologist, they found clear agreement in many instances.

In the future, the model could be used as a tool to help clinicians streamline the diagnosis of simpler cases without the need for labor-intensive tests, giving them more time to evaluate cases where it is less clear if DCIS will become invasive.

“We took the first step in understanding that we should be looking at the spatial organization of cells when diagnosing DCIS, and now we have developed a technique that is scalable. From here, we really need a prospective study. Working with a hospital and getting this all the way to the clinic will be an important step forward,” says Caroline Uhler, a professor in the Department of Electrical Engineering and Computer Science (EECS) and the Institute for Data, Systems, and Society (IDSS), who is also director of the Eric and Wendy Schmidt Center at the Broad Institute of MIT and Harvard and a researcher at MIT’s Laboratory for Information and Decision Systems (LIDS).

Uhler, co-corresponding author of a paper on this research, is joined by lead author Xinyi Zhang, a graduate student in EECS and the Eric and Wendy Schmidt Center; co-corresponding author GV Shivashankar, professor of mechogenomics at ETH Zurich jointly with the Paul Scherrer Institute; and others at MIT, ETH Zurich, and the University of Palermo in Italy. The open-access research was published July 20 in Nature Communications.

Combining imaging with AI

Between 30 and 50 percent of patients with DCIS develop a highly invasive stage of cancer, but researchers don’t know the biomarkers that could tell a clinician which tumors will progress.

Researchers can use techniques like multiplexed staining or single-cell RNA sequencing to determine the stage of DCIS in tissue samples. However, these tests are too expensive to be performed widely, Shivashankar explains.

In previous work, these researchers showed that a cheap imaging technique known as chromatin staining could be as informative as the much costlier single-cell RNA sequencing.

For this research, they hypothesized that combining this single stain with a carefully designed machine-learning model could provide the same information about cancer stage as costlier techniques.

First, they created a dataset containing 560 tissue sample images from 122 patients at three different stages of disease. They used this dataset to train an AI model that learns a representation of the state of each cell in a tissue sample image, which it uses to infer the stage of a patient’s cancer.

However, not every cell is indicative of cancer, so the researchers had to aggregate them in a meaningful way.

They designed the model to create clusters of cells in similar states, identifying eight states that are important markers of DCIS. Some cell states are more indicative of invasive cancer than others. The model determines the proportion of cells in each state in a tissue sample.
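To make the idea of state proportions concrete, here is a minimal Python sketch. It is not the researchers' code: it assumes each cell has already been assigned one of eight state labels (numbered 0 to 7), and the function name `state_proportions` and the example labels are hypothetical.

```python
from collections import Counter


def state_proportions(cell_states, n_states=8):
    """Return the fraction of cells assigned to each state (0..n_states-1)."""
    if not cell_states:
        raise ValueError("need at least one cell")
    counts = Counter(cell_states)
    total = len(cell_states)
    # States that never occur in the sample get a proportion of 0.0.
    return [counts.get(state, 0) / total for state in range(n_states)]


# Hypothetical state labels for ten cells in one tissue image.
labels = [0, 0, 1, 3, 3, 3, 7, 7, 1, 0]
proportions = state_proportions(labels)
```

The result is one fixed-length vector per tissue sample, which is a convenient summary to compare across samples, but it discards where the cells sit relative to one another.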

Organization matters

“But in cancer, the organization of cells also changes. We found that just having the proportions of cells in every state is not enough. You also need to understand how the cells are organized,” says Shivashankar.

With this insight, they designed the model to consider both the proportion and the arrangement of cell states, which significantly boosted its accuracy.

“The interesting thing for us was seeing how much spatial organization matters. Previous studies had shown that cells which are close to the breast duct are important. But it is also important to consider which cells are close to which other cells,” says Zhang.
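One simple way to see why arrangement carries extra information is to count which cell states sit next to which. The Python sketch below is only an illustration under stated assumptions, not the model described in the paper: it assumes each cell has a 2D position and a state label, and the function name `neighbor_state_counts`, the distance threshold, and the toy coordinates are all hypothetical.

```python
from math import dist


def neighbor_state_counts(positions, states, radius, n_states):
    """Count pairs of nearby cells, grouped by the states of the two cells.

    counts[a][b] is the number of cell pairs within `radius` of each other
    where one cell is in state a and the other in state b (symmetric matrix).
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if dist(positions[i], positions[j]) <= radius:
                a, b = states[i], states[j]
                counts[a][b] += 1
                if a != b:
                    counts[b][a] += 1
    return counts


# Two toy samples with identical state proportions (half state 0, half state 1)
# but different spatial arrangements of those states.
positions = [(0, 0), (1, 0), (10, 0), (11, 0)]
clustered = neighbor_state_counts(positions, [0, 0, 1, 1], radius=1.5, n_states=2)
mixed = neighbor_state_counts(positions, [0, 1, 0, 1], radius=1.5, n_states=2)
```

Both toy samples would look identical if only proportions were measured, yet their neighbor counts differ: in the first, neighbors share a state, and in the second, every neighboring pair mixes the two states.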

When they compared the model’s results with samples evaluated by a pathologist, they found clear agreement in many instances. In cases that were not as clear-cut, the model could provide information about features in a tissue sample, like the organization of cells, that a pathologist could use in decision-making.

This versatile model could also be adapted for use in other types of cancer, or even neurodegenerative conditions, which is one area the researchers are also currently exploring.

“We have shown that, with the right AI techniques, this simple stain can be very powerful. There is still much more research to do, but we need to take the organization of cells into account in more of our studies,” Uhler says.

This research was funded, in part, by the Eric and Wendy Schmidt Center at the Broad Institute, ETH Zurich, the Paul Scherrer Institute, the Swiss National Science Foundation, the U.S. National Institutes of Health, the U.S. Office of Naval Research, the MIT Jameel Clinic for Machine Learning and Health, the MIT-IBM Watson AI Lab, and a Simons Investigator Award.
