Clinical Application of AI in Mammography: Insights from a Prospective Study

Yilmaz, GÜL; Seker, Mustafa; Guldogan, Nilgun; Turk, Ebru; Erdemli, Servet; Koyluoglu, Yilmaz; Sancak, Sehla; ARIBAL, MUSTAFA

doi:10.1016/j.acra.2025.05.025

Yilmaz E., Seker M. E., Guldogan N., Turk E. B., Erdemli S., Koyluoglu Y. O., ...

Academic Radiology, vol.32, no.9, pp.5016-5027, 2025 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 32 Issue: 9
Publication Date: 2025
DOI Number: 10.1016/j.acra.2025.05.025
Journal Name: Academic Radiology
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, CINAHL, EMBASE, MEDLINE
Page Numbers: pp.5016-5027
Keywords: AI, Breast Cancer, Mammography
Affiliated with Acıbadem Mehmet Ali Aydınlar University: Yes

Abstract

Rationale and Objectives: This prospective study evaluated the performance of AI in a diagnostic clinic setting and compared it with that of radiologists of varying experience.

Materials and Methods: The single-center study included 1063 patients undergoing diagnostic or screening mammography. Five radiologists with different experience levels assessed the images using the fifth edition of the BI-RADS lexicon. Standalone AI software assigned risk scores (0–100), with scores above 30.44 considered positive. AI risk assessments were compared with radiologists’ BI-RADS scores. Radiologists also re-evaluated AI-positive mammograms as a second look. Ground truth was established through histopathology and two years of follow-up.

Results: Right and left breasts were analyzed separately, so 2126 mammography images from 1063 women were evaluated. A total of 29 cancers were diagnosed in 28 women. Among all examinations, 2.44% (52/2126) were positive, of which 46.15% (24/52) were true positive. Standalone AI detected 82.75% (24/29) of cancers. The majority vote of radiologists was positive (BI-RADS 0, 4, or 5) in 8% (172/2126) of examinations and detected 89.65% (26/29) of cancers. The AUC was 94.7% (95% CI: 91.1–98.3) for majority voting and 94.4% (95% CI: 88.5–100) for AI; the difference was not statistically significant (p=0.79). The re-evaluation of AI-flagged images achieved an AUC of 94.8% (95% CI: 91.2–98.3), significantly different from the initial evaluation (p=0.015) but not from AI (p=0.74).

Conclusion: AI algorithms in diagnostic settings can serve as effective CAD systems, aiding in breast cancer detection and reducing inter-reader variability.

Data availability: The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.
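The decision rules and headline rates reported in the abstract can be sketched in a few lines of Python. This is an illustrative sketch, not the study's analysis code: the function names are hypothetical, the majority rule is assumed to mean "more than half of the readers", and the 52 positive examinations are assumed to be the AI-positive ones (consistent with the 24 true positives and the 24 cancers detected by AI). Only the counts, the 30.44 cutoff, and the positive BI-RADS categories come from the abstract.

```python
AI_THRESHOLD = 30.44          # from the abstract: scores above this are positive
POSITIVE_BIRADS = {0, 4, 5}   # from the abstract: BI-RADS 0, 4 and 5 count as positive


def ai_positive(risk_score: float) -> bool:
    """Return True when the AI risk score (0-100) exceeds the reported cutoff."""
    return risk_score > AI_THRESHOLD


def majority_positive(birads_scores: list[int]) -> bool:
    """Return True when more than half of the readers assign a positive category.

    Assumption: a simple majority rule; the abstract does not spell out the
    exact voting procedure.
    """
    positives = sum(1 for score in birads_scores if score in POSITIVE_BIRADS)
    return positives > len(birads_scores) / 2


def rate(numerator: int, denominator: int) -> float:
    """Express a proportion as a percentage."""
    return 100.0 * numerator / denominator


# Counts reported in the abstract (per-breast analysis).
TOTAL_BREASTS = 2126
CANCERS = 29
AI_FLAGGED = 52               # assumed to be the AI-positive examinations
AI_TRUE_POSITIVES = 24
READER_FLAGGED = 172          # positive by majority vote
READER_TRUE_POSITIVES = 26

ai_positive_rate = rate(AI_FLAGGED, TOTAL_BREASTS)
ai_sensitivity = rate(AI_TRUE_POSITIVES, CANCERS)
ai_ppv = rate(AI_TRUE_POSITIVES, AI_FLAGGED)
reader_positive_rate = rate(READER_FLAGGED, TOTAL_BREASTS)
reader_sensitivity = rate(READER_TRUE_POSITIVES, CANCERS)

if __name__ == "__main__":
    print(f"AI positive rate:     {ai_positive_rate:.2f}%")
    print(f"AI sensitivity:       {ai_sensitivity:.2f}%")
    print(f"AI PPV:               {ai_ppv:.2f}%")
    print(f"Reader positive rate: {reader_positive_rate:.2f}%")
    print(f"Reader sensitivity:   {reader_sensitivity:.2f}%")
```

The computed percentages match the reported ones up to rounding in the second decimal place. For example, a hypothetical reading `majority_positive([4, 5, 0, 1, 2])` counts three positive readers out of five and returns `True`.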