Age Stratified Risk Prediction of Invasive versus In-situ Breast Cancer: A Logistic Regression Model
Mehmet Ayvaci (1,2), Oguzhan Alagoz (1), Jagpreet Chhatwal (3), Mary Lindstrom (4), Houssam Nassif (5), Elizabeth S Burnside (2)
1 Industrial and Systems Engineering, UW-Madison
2 Radiology, UW-Madison
3 Merck Research Laboratories
4 Biostatistics
5 Computer Science, UW-Madison
Breast Anatomy
Breast profile:
• A ducts
• B lobules
• C dilated section of duct to hold milk
• D nipple
• E fat
• F pectoralis major muscle
• G chest wall/rib cage
Enlargement:
• A normal duct cells
• B basement membrane
• C lumen (center of duct)
Source: www.breastcancer.org
Progression of Breast Cancer
[Figure: normal duct → typical ductal hyperplasia → atypical ductal hyperplasia → ductal carcinoma in situ (DCIS) → invasive ductal carcinoma]
Age Stratification for Invasive vs In-situ Breast Cancer
• Primary modality of screening or diagnosis: mammography
• Performs differently in different age groups
• Sensitivity by age:
  • <40: 54%
  • 40-49: 77%
  • 50-65: 78%
  • >65: 81%
• Sensitivity by breast density:
  • 68% vs. 85%
  • Younger vs. older
Age Stratification for Invasive vs In-situ Breast Cancer
• Primary modality of detecting type of breast cancer: biopsy
• Incidence of DCIS has increased since adoption of mammography
• DCIS has a favorable prognosis: it will often not cause mortality for years
• PPV of biopsy: 20%
Age Stratification for Invasive vs In-situ Breast Cancer
• The invasive vs. DCIS distinction is important because:
  • The two require different treatment
  • Life expectancy differs between older and younger women
  • Overdiagnosis does not correspond to reduced mortality
  • Breast cancer is less aggressive in older women
  • Invasive procedures are more risky in older women
  • Resources could be better spent on more serious co-morbidities
Purpose and Methods
• Develop a risk prediction model for prospective differentiation of DCIS versus invasive breast cancer (logistic regression)
• Measure and compare model performance for different age groups (ROC curves)
Purpose and Methods Contd.
• Risk assessment tools: measure risk of invasive cancer given information
• Validation of the model: ROC & PR curves, statistical testing
• Markov decision processes: optimize sequential decision making in the context of breast cancer screening
• Clinical implications
Structure of Data Used
• NMD: National Mammography Database format
• Structured part:
  • Demographic factors
  • Mammographic descriptors (BIRADS descriptors)
• Free text:
  • Radiologists’ overall assessment of the mammogram, with some repetition of the structured part
  • Turned into structured format using Natural Language Processing
Methods: Processing Free Text
• Information retrieval from free text given a standardized lexicon
• Parse sentences to detect BIRADS descriptors using Natural Language Processing in Perl
• Tested on a manually populated set of 100
  • 97.7% precision
  • 95.5% recall
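The lexicon-matching and evaluation steps above can be sketched roughly as follows. This is an illustration in Python, not the Perl pipeline used in the study: the three lexicon entries, the helper names `extract_descriptors` and `precision_recall`, and the sample report are all hypothetical, and the sketch ignores negation ("no spiculated mass"), which a real parser would have to handle.

```python
import re

# Hypothetical mini-lexicon: descriptor name -> regex for its surface form.
# These entries are illustrative only, not the lexicon used in the study.
LEXICON = {
    "mass_margin_spiculated": re.compile(r"\bspiculated\b", re.IGNORECASE),
    "mass_shape_irregular": re.compile(r"\birregular\b", re.IGNORECASE),
    "calcification_pleomorphic": re.compile(r"\bpleomorphic\b", re.IGNORECASE),
}


def extract_descriptors(report: str) -> set[str]:
    """Return the lexicon descriptors whose pattern occurs in the report text."""
    return {name for name, pattern in LEXICON.items() if pattern.search(report)}


def precision_recall(predicted: set[str], gold: set[str]) -> tuple[float, float]:
    """Precision = TP / predicted, recall = TP / gold (0.0 when undefined)."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical report and manually assigned ("gold") descriptors.
report = "Irregular mass with spiculated margins in the upper outer quadrant."
gold = {"mass_margin_spiculated", "mass_shape_irregular", "calcification_pleomorphic"}

found = extract_descriptors(report)
precision, recall = precision_recall(found, gold)
print(sorted(found), precision, recall)
```

In this toy case every extracted descriptor is correct, but one gold descriptor is missed, so precision is higher than recall.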
Data in Detail
[Figure: free text mapped to features extracted using NLP]
Summary of Data
• 1475 diagnostic mammograms, 1378 patients
  • 1298 patients with a single mammogram
  • 81 patients with two mammograms
  • 5 patients with three mammograms
• 1063 invasive cases vs. 412 DCIS
• Age range 27 to 97, with mean 59.7 and standard deviation 13.4
Methods: Performing Logistic Regression
• Regress on a dichotomous outcome, where the patient is known to have a malignant condition, i.e.
  • Invasive or
  • DCIS
• Stratified data into 3 groups
  • Overall model, LR: 1475 records
  • Age less than 50, LR_young: 374 records
  • Age greater than 65, LR_old: 533 records
• Used stepwise regression to find the appropriate models. The possibility of interactions was investigated.
• Modeled quantity: P(Invasive | Demographic Factors, Mammographic Descriptors)
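As a rough illustration of the stratification and of the quantity being modeled, the sketch below assigns a record to an age group and evaluates a logistic model for the probability of invasive cancer. The function names, the two binary descriptors, and the coefficient values are hypothetical and are not the fitted values from the study. The sketch also reuses one coefficient set for every record, whereas separately fitted models (LR, LR_young, LR_old) would each have their own.

```python
import math


def age_stratum(age: float) -> str:
    """Age groups from the slide: below 50, above 65, and everything in between."""
    if age < 50:
        return "young"
    if age > 65:
        return "old"
    return "middle"


def invasive_probability(x: list[float], intercept: float, coefs: list[float]) -> float:
    """Logistic model: P(invasive | x) = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients for two binary descriptors (NOT fitted values).
INTERCEPT, COEFS = -0.5, [1.2, 0.8]

# Hypothetical records: age plus two 0/1 descriptor indicators.
records = [
    {"age": 42, "x": [1, 0]},
    {"age": 58, "x": [0, 0]},
    {"age": 71, "x": [1, 1]},
]

for record in records:
    p = invasive_probability(record["x"], INTERCEPT, COEFS)
    print(age_stratum(record["age"]), round(p, 3))
```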
Methods: Validation Technique
• n-fold cross-validation
• Leave-one-out
[Figure: the data set is split into folds 1, 2, 3, …, n; each fold serves once as the test fold while the remaining folds are the training folds; tested folds are merged for performance analysis]
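The fold-rotation idea can be sketched as below. The helper names `k_fold_indices` and `cross_validated_predictions`, and the `fit` and `predict` callables, are placeholders invented for this illustration. Folds are assigned by simple striding, which is an assumption; a real analysis would more likely shuffle the records first.

```python
def k_fold_indices(n_records: int, n_folds: int):
    """Yield (train, test) index lists so that every record is tested exactly once."""
    indices = list(range(n_records))
    for fold in range(n_folds):
        test = indices[fold::n_folds]  # every n_folds-th record, starting at `fold`
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


def cross_validated_predictions(n_records, n_folds, fit, predict):
    """Fit on each training split and merge the held-out predictions into one list."""
    merged = [None] * n_records
    for train, test in k_fold_indices(n_records, n_folds):
        model = fit(train)
        for i in test:
            merged[i] = predict(model, i)
    return merged


# Leave-one-out is the special case n_folds == n_records.
leave_one_out = list(k_fold_indices(5, 5))
print(leave_one_out[0])
```

Merging the held-out predictions gives one out-of-sample score per record, which is what the performance measures are then computed on.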
Methods: Measuring Performance
• ROC curve: sensitivity vs. 1 - specificity at all thresholds
• Sensitivity: true positive rate, a/(a+c)
• Specificity: true negative rate, d/(b+d)
• Thresholds: probability above which to call “Invasive”
• AUC: area under the curve
Here a, b, c and d are the counts of true positives, false positives, false negatives and true negatives.
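A small sketch of these definitions, using made-up labels and scores (1 = invasive, 0 = DCIS). The function names are hypothetical, and calling "invasive" when the score is at or above the threshold, rather than strictly above it, is an assumption made so that each observed score can serve as a threshold.

```python
def confusion_counts(y_true, scores, threshold):
    """Return (a, b, c, d) = (TP, FP, FN, TN); 'invasive' is called when score >= threshold."""
    a = b = c = d = 0
    for y, s in zip(y_true, scores):
        called_invasive = s >= threshold
        if called_invasive and y == 1:
            a += 1
        elif called_invasive:
            b += 1
        elif y == 1:
            c += 1
        else:
            d += 1
    return a, b, c, d


def roc_points(y_true, scores):
    """(1 - specificity, sensitivity) at every threshold; needs both classes present."""
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for t in thresholds:
        a, b, c, d = confusion_counts(y_true, scores, t)
        sensitivity = a / (a + c)
        specificity = d / (b + d)
        points.append((1 - specificity, sensitivity))
    return points


def auc(points):
    """Trapezoidal area under ROC points ordered by increasing 1 - specificity."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up example data, not taken from the study.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
print(points[0], points[-1], auc(points))
```

For this toy data, 8 of the 9 invasive/DCIS pairs are ranked correctly, so the area under the curve is 8/9.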
Results: LR
• Overall model significant at p-value < 0.01
• Not enough power to justify inclusion of interaction terms (over-fitting)
• Acceptable ROC
• Decreasing trend in error rates
Results: LR_young vs. LR_old
• Difference in AUC = 0.07
• Significant at p-value = 0.045
Results: LR_young vs. LR_old
• Improvement is in false negatives
In Summary
• Mammography is not perfect and performs better in older women.
• There is a need to discriminate between invasive cancer and DCIS to better manage breast disease in the context of age and other comorbidities
• An age-based risk prediction model is necessary for assessing the performance difference in discriminating invasive cancer vs. DCIS
• Such a model would enable physicians to make more informed decisions
• Demonstration of performance difference and varying risk factors in different age cohorts justifies
Future Work
• Risk assessment tools: measure risk of invasive cancer given information
• Validation of the model: ROC & PR curves, statistical testing; get in literature
• Markov decision processes: optimize sequential decision making in the context of breast cancer screening
• Clinical implications
Related talk: Using POMDPs to Determine the Optimal Mammography Screening Schedule From the Patient's Perspective. Presenting Author: Turgay Ayer, University of Wisconsin. Co-Author: Oguzhan Alagoz, Assistant Professor, University of Wisconsin-Madison
Questions? THANK YOU!