A new study finds that an artificial intelligence (AI) tool — trained on roughly a million screening mammography images — identified breast cancer with approximately 90 percent accuracy when combined with analysis by radiologists (Wu N, et al. Deep Neural Networks Improve Radiologists’ Performance in Breast Cancer Screening. IEEE Trans Med Imaging. 2019 Oct 7. doi: 10.1109/TMI.2019.2945514).
The study examined the ability of a type of AI, a machine learning computer program, to add value to the diagnoses reached by a group of 14 radiologists as they reviewed 720 mammogram images.
“Our study found that AI identified cancer-related patterns in the data that radiologists could not, and vice versa,” says senior study author Dr. Krzysztof J. Geras. “AI detected pixel-level changes in tissue invisible to the human eye, while humans used forms of reasoning not available to AI,” adds Dr. Geras. “The ultimate goal of our work is to augment, not replace, human radiologists.”
In 2014, more than 39 million mammography exams were performed to screen women asymptomatic for breast cancer. Women whose test results yield abnormal mammography findings are referred for biopsy.
In the new study, the research team designed statistical techniques that let their program “learn” how to get better at a task without being told exactly how. Such programs build mathematical models that enable decision-making based on data examples fed into them, with the program getting “smarter” as it reviews more and more data.
Modern AI approaches, inspired by the human brain, use complex circuits to process information in layers, with each step feeding information into the next, and assigning more or less importance to each piece of information along the way.
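The layered processing described above can be illustrated with a minimal sketch. It is not the study's network: the layer sizes, weights, and function names below are made-up assumptions, chosen only to show how each layer's output feeds the next and how weights assign more or less importance to each input.

```python
import math


def dense(inputs, weights, biases):
    # Each output is a weighted sum of the inputs; a larger weight means
    # the corresponding input matters more to that output.
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    # Simple non-linearity: negative signals are set to zero.
    return [max(0.0, v) for v in values]


def sigmoid(z):
    # Squash a raw score into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, layers):
    # Pass the data through the layers in order, each feeding the next.
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return sigmoid(activations[0])


# Hypothetical toy weights: 3 inputs -> 2 hidden units -> 1 output.
TOY_LAYERS = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]

score = forward([1.0, 2.0, 3.0], TOY_LAYERS)
```

In a real system the weights are not written by hand as they are here; they are adjusted automatically during training on labeled examples.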
The authors trained their AI tool on many images matched with the results of biopsies performed in the past. Their goal was to reduce the number of biopsies needed. This can only be achieved, says Dr. Geras, by increasing the confidence that physicians have in the accuracy of assessments made for screening exams (for example, reducing false-positive and false-negative results).
For the current study, the research team analyzed images collected at NYU Langone Health over seven years, sifting through the collected data and connecting the images with biopsy results.
This effort created an extraordinarily large dataset for the AI tool to train on, consisting of 229,426 digital screening mammography exams and 1,001,093 images. Most databases used in studies to date have been limited to 10,000 images.
Thus, the researchers trained their neural network by programming it to analyze images from the database for which cancer diagnoses had already been determined. This meant that researchers knew the “truth” for each mammography image as they tested the tool’s accuracy, while the tool had to guess. Accuracy was measured as the frequency of correct predictions.
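As a rough illustration of accuracy as the frequency of correct predictions, the sketch below compares a tool's guesses against known labels. The 0.5 decision threshold, the function name, and the four example values are assumptions made for illustration; they are not data or settings from the study.

```python
def accuracy(predicted_probs, true_labels, threshold=0.5):
    # Fraction of cases where the thresholded guess matches the known "truth".
    if len(predicted_probs) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        raise ValueError("at least one labeled case is required")
    correct = sum(
        int(prob >= threshold) == label
        for prob, label in zip(predicted_probs, true_labels)
    )
    return correct / len(true_labels)


# Made-up example: four images with known labels (1 = cancer, 0 = no cancer).
example_probs = [0.9, 0.2, 0.6, 0.1]
example_labels = [1, 0, 0, 0]
example_accuracy = accuracy(example_probs, example_labels)  # 3 of 4 correct
```

Here the third image is a false positive: the tool's guess of 0.6 crosses the threshold although the known label is 0.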
In addition, the researchers designed the study AI model to first consider very small patches of the full resolution image separately to create a heat map, a statistical picture of disease likelihood. Then the program considers the entire breast for structural features linked to cancer, paying closer attention to the areas flagged in the pixel-level heat map.
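The two-stage idea can be sketched in a few lines, under heavy simplifying assumptions. Below, a placeholder scorer (mean pixel intensity) stands in for the trained patch-level model, and a heat-weighted average stands in for the whole-image stage. Neither is the authors' method, and all names are hypothetical.

```python
def mean_intensity(patch):
    # Placeholder patch scorer; the real tool would use a trained network here.
    flat = [value for row in patch for value in row]
    return sum(flat) / len(flat)


def patch_heatmap(image, patch_size, score_patch=mean_intensity):
    # Stage 1: score small, non-overlapping patches separately.
    rows, cols = len(image), len(image[0])
    heatmap = []
    for top in range(0, rows - patch_size + 1, patch_size):
        heat_row = []
        for left in range(0, cols - patch_size + 1, patch_size):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            heat_row.append(score_patch(patch))
        heatmap.append(heat_row)
    return heatmap


def attention_weighted_score(heatmap):
    # Stage 2 stand-in: summarize the whole image, letting flagged
    # (high-heat) regions contribute more to the final score.
    flat = [value for row in heatmap for value in row]
    total = sum(flat)
    if total == 0:
        return 0.0
    return sum(value * value for value in flat) / total


# Toy 4x4 "image" with one bright 2x2 region in the top-right corner.
toy_image = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
toy_heatmap = patch_heatmap(toy_image, patch_size=2)
toy_score = attention_weighted_score(toy_heatmap)
```

On this toy input only the top-right patch is flagged, so it alone determines the final score.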
Rather than have the researchers identify image features for their AI to search for, the tool discovers on its own which image features increase prediction accuracy. The team plans to further increase this accuracy by training the AI program on more data, perhaps even identifying changes in breast tissue that are not yet cancerous.
“The transition to AI support in diagnostic radiology should proceed like the adoption of self-driving cars: slowly and carefully, building trust, and improving systems along the way with a focus on safety,” says first author Nan Wu, a doctoral candidate at the NYU Center for Data Science.