Lung cancer research may have just taken a considerable leap forward: a recent study reports that a computer trained with a machine-learning approach predicted the type, severity, and prognosis of such cancers far better than a skilled pathologist.
In addition to giving oncologists a tool for predicting outcomes, the study, “Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features,” published in the journal Nature Communications, shows the method could also provide scientists with new insights into cancer cell characteristics, helping them understand how a tumor starts developing and how it changes during disease progression.
Machine learning is based on data. The computer uses algorithms to learn patterns from examples, observations, or instruction, and then applies what it has learned to complete a task or make a prediction.
While the study was performed on lung cancer, researchers believe the method can be adapted to other types of tumors. They also believe that the wealth of data generated by this approach can be used with other “big data” approaches, such as analyses of all genes or proteins in a tumor, to better understand how tumors evolve.
“This approach replaces this subjectivity with sophisticated, quantitative measurements that we feel are likely to improve patient outcomes,” said Michael Snyder, PhD, a professor of genetics at Stanford University, in a news release, referring to the subjectivity of judging tissue samples by eye.
Traditional assessments of cancer grade and stage are often limited. The “grade” tells an oncologist how abnormal the cancer cells are compared with the tissue from which they grew, so the cells of a high-grade tumor bear little resemblance to healthy cells. The “stage” describes whether, and to what extent, a cancer has spread.
These measures are commonly used to set a prognosis, but oncologists know they are often inaccurate.
The first problem with cancer tissue assessments is that they are made by a pathologist who examines how the cancer tissue looks under a microscope. Because human judgment is subjective, results depend on the pathologist’s experience. Further, even when two highly experienced pathologists look at the same tissue, the chance of them agreeing completely is only about 60 percent, according to Snyder.
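To make an agreement figure of about 60 percent concrete, the short Python sketch below computes the share of samples on which two readers give the same label. The labels are invented for illustration and are not data from the study. The sketch also computes Cohen's kappa, a standard chance-corrected agreement measure that the study's authors are not reported to have used here.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Fraction of samples on which both raters gave the same label."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same label at random,
    # given how often each rater uses each label.
    expected = sum(counts_a[label] * counts_b[label]
                   for label in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical calls on ten slides: "A" = adenocarcinoma, "S" = squamous.
pathologist_1 = ["A", "A", "A", "A", "A", "S", "S", "S", "S", "S"]
pathologist_2 = ["A", "A", "A", "S", "S", "S", "S", "S", "A", "A"]

agreement = percent_agreement(pathologist_1, pathologist_2)
kappa = cohens_kappa(pathologist_1, pathologist_2)
print(f"agreement={agreement:.2f}, kappa={kappa:.2f}")
```

In this made-up example the two readers agree on six of ten slides, yet much of that agreement is what chance alone would produce, which is why raw agreement can overstate how consistent visual assessments are.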
Another problem is that outcomes often differ from what the grade and stage assessment would suggest. While about 50 percent of patients who get a lung adenocarcinoma diagnosis live for about five years, a small group will likely live 10 years.
The current study indicates that while humans are not able to extract enough data from a tissue sample to determine how the groups differ, the machine-learning computer can.
Researchers at Stanford University Medical Center used 2,186 microscope images of either adenocarcinoma or squamous cell carcinoma, two lung cancer types that are particularly difficult to distinguish between, to start training the software. The team used pictures from the Cancer Genome Atlas, a national database that also holds information about the grade and stage of the cancer, and on how long each patient lived after diagnosis.
The machine-learning software was eventually able to identify nearly 10,000 characteristics among the samples, far more than the few hundred a pathologist usually assesses. In addition to traditional measurements, such as cell shape and size, the software could assess the shape and texture of cell nuclei, and how neighboring cells were arranged in relation to each other.
“We began the study without any preconceived ideas, and we let the software determine which characteristics are important,” Snyder said. “In hindsight, everything makes sense. And the computers can assess even tiny differences across thousands of samples many times more accurately and rapidly than a human.”
The team let the software pick features that could be used to see the difference between cancerous and normal cells, identify cancer subtype, and predict the life expectancy of the patient. They then tested the approach on 294 lung cancer tissue samples from another database.
This group included patients who either survived for a long time after diagnosis or died shortly afterward. The machine-learning system effectively distinguished between the two sets of patients and accurately predicted their survival.
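The general workflow described here, letting software rank many candidate features, keeping the most informative ones, and then checking the result on separate samples, can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the data are synthetic random numbers rather than image measurements, the scoring rule and the nearest-centroid classifier are simple stand-ins chosen for brevity, and none of the function names come from the study.

```python
import random
import statistics

def make_samples(rng, n, n_features, informative, shift):
    """Synthetic stand-in for per-image feature vectors with a 0/1 label."""
    samples = []
    for i in range(n):
        label = i % 2
        features = [rng.gauss(shift * label if j in informative else 0.0, 1.0)
                    for j in range(n_features)]
        samples.append((features, label))
    return samples

def score_features(samples):
    """Score each feature by the gap between class means, relative to its spread."""
    scores = []
    for j in range(len(samples[0][0])):
        a = [f[j] for f, y in samples if y == 0]
        b = [f[j] for f, y in samples if y == 1]
        spread = statistics.pstdev(a + b) or 1.0
        scores.append(abs(statistics.mean(a) - statistics.mean(b)) / spread)
    return scores

def select_top(scores, k):
    """Indices of the k highest-scoring features."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

def fit_centroids(samples, selected):
    """Mean vector of the selected features for each class."""
    centroids = {}
    for label in (0, 1):
        rows = [[f[j] for j in selected] for f, y in samples if y == label]
        centroids[label] = [statistics.mean(col) for col in zip(*rows)]
    return centroids

def predict(features, selected, centroids):
    """Assign the class whose centroid is closest in the selected features."""
    x = [features[j] for j in selected]
    def dist(c):
        return sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return min(centroids, key=lambda label: dist(centroids[label]))

rng = random.Random(0)
informative = {0, 1, 2}  # assumption: only 3 of 20 features carry signal
train = make_samples(rng, 200, 20, informative, shift=2.0)
held_out = make_samples(rng, 100, 20, informative, shift=2.0)  # separate test set

selected = select_top(score_features(train), k=3)
centroids = fit_centroids(train, selected)
accuracy = sum(predict(f, selected, centroids) == y
               for f, y in held_out) / len(held_out)
print(sorted(selected), accuracy)
```

The point of the held-out set is the same as in the study's use of a second database: features are chosen and the model is fitted on one group of samples, and performance is judged only on samples the software has not seen.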
“Ultimately this technique will give us insight into the molecular mechanisms of cancer by connecting important pathological features with outcome data,” Snyder said.