Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy

Inês Dutra, University of Porto, CRACS & INESC-Porto LA
Houssam Nassif, David Page, Jude Shavlik, Roberta Strigel, Yirong Wu, Mai Elezabi and Elizabeth Burnside, UW-Madison

Intro and Motivation
• USA:
  • 1 woman dies of breast cancer every 13 minutes
  • In 2011:
    • estimated 230,480 new cases of invasive cancer
    • 39,520 women are expected to die
• Source: U.S. Breast Cancer Statistics, 2011
AMIA 2011

Intro and Motivation
• Portugal:
  • Per year:
    • 4,500 new cases
    • 1,500 deaths (33%)
• Source: Liga Portuguesa Contra o Cancro - September 2011

Intro and Motivation
• Mammography
• MRI
• Ultrasound
• Biopsies

Intro and Motivation
• Image-guided percutaneous core needle biopsy
  • standard for the diagnosis of suspicious findings
  • Over 700,000 women undergo breast core biopsy in the US
  • Imperfect: biopsies can be inconclusive in 5-15% of cases
  • ≈ 35,000-105,000 of the above will require additional biopsies

Intro and Motivation
• Breast biopsies:
  • Fine needle breast biopsy
  • Stereotactic breast biopsy
  • Ultrasound-guided core biopsy
  • Excisional biopsy
  • MRI-guided

Objectives
• Hypotheses
  • Can we learn characteristics of inconclusive biopsies?
  • Can we improve the classifier by adding expert knowledge?

Terminology
• Inconclusive biopsies: “non-definitive”:
  • Discordant biopsies
  • High risk lesions
    • Atypical ductal hyperplasia
    • Lobular carcinoma in situ
    • Radial scar
    • etc.
  • Insufficient sampling

Materials and Methods
• Data collected
  • Oct 1st, 2005 to Dec 31st, 2008
• Tools
  • WEKA
  • Aleph
• Validation: leave-one-out cross validation
• Results tested for statistical significance

Data source
• 1384 image-guided core biopsies
  • 349 M
  • 1035 B
    • 925 (C)
    • 110 (NC)
      • 43 (D)
      • 51 (ARS)
      • 16 (I)
      • After eliminating redundancies: 94

Data distribution
• 94 non-concordant cases went through additional procedures (e.g., excision)
• 15 were “upgraded” from benign to malignant
• Our population: 15+/79-
• Task: find a distinction between upgrades and non-upgrades
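
The counts on the "Data source" and "Data distribution" slides can be cross-checked with a few lines of arithmetic. This is only a sketch: the nesting of the counts (M and B under the total, C and NC under B, and D, ARS and I under NC) is an assumption inferred from the fact that the numbers add up, and the dictionary keys simply reuse the slide abbreviations.

```python
# Counts copied from the slides; the parent/child nesting is an assumption.
counts = {"all": 1384, "M": 349, "B": 1035, "C": 925, "NC": 110,
          "D": 43, "ARS": 51, "I": 16}

# Each level of the assumed tree should sum to its parent.
level_checks = {
    "M + B == all": counts["M"] + counts["B"] == counts["all"],
    "C + NC == B": counts["C"] + counts["NC"] == counts["B"],
    "D + ARS + I == NC": counts["D"] + counts["ARS"] + counts["I"] == counts["NC"],
}

# Study population after eliminating redundancies: 15 upgrades, 79 non-upgrades.
positives, negatives = 15, 79
upgrade_rate = positives / (positives + negatives)
# Accuracy of a trivial classifier that always predicts "no upgrade".
majority_baseline = negatives / (positives + negatives)

for name, ok in level_checks.items():
    print(f"{name}: {ok}")
print(f"upgrade rate: {upgrade_rate:.1%}")
print(f"majority-class accuracy: {majority_baseline:.1%}")
```

The class imbalance is the practical point: always predicting "no upgrade" is already right for most of the 94 cases, so accuracy alone says little here and recall on the 15 upgrades matters more.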

Data Features

WEKA and Aleph
• Data cleaning, single table
• 15-fold stratified cross-validation (leave-one-out)
• WEKA (BayesTAN) and Aleph
  • Without expert knowledge
  • With expert knowledge
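
One plausible reading of "15-fold stratified cross-validation (leave-one-out)" is that, with 15 positive cases, each of the 15 folds holds out exactly one positive plus a share of the negatives. The sketch below illustrates that reading in plain Python; the round-robin assignment and the function name `stratified_folds` are illustrative assumptions, not the procedure WEKA or Aleph actually used.

```python
from collections import Counter

def stratified_folds(labels, k):
    """Assign each example a fold index, dealing each class out round-robin."""
    next_slot = Counter()          # how many examples of each class seen so far
    folds = []
    for y in labels:
        folds.append(next_slot[y] % k)
        next_slot[y] += 1
    return folds

# The population from the slides: 15 upgrades (1) and 79 non-upgrades (0).
labels = [1] * 15 + [0] * 79
folds = stratified_folds(labels, k=15)

# Count (fold, class) pairs to see what each held-out fold contains.
per_fold = Counter(zip(folds, labels))
for f in range(15):
    print(f"fold {f:2d}: {per_fold[(f, 1)]} positive, {per_fold[(f, 0)]} negative")
```

Under this scheme every test fold contains one upgrade and five or six non-upgrades, so each positive case is evaluated exactly once by a model that never saw it.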

WEKA and Aleph
• WEKA: Waikato Environment for Knowledge Analysis
  • Collection of machine learning algorithms and data mining tasks
  • use “propositionalized” data (flat table)
• Aleph: A Learning Engine for Proposing Hypotheses
  • Inductive Logic Programming (ILP) system
  • Knowledge representation: first order rules

Method
• Learn rules about how and why non-definitive cases are “upgraded”
• Other rules (simple) provided by a specialist
• Combine
• New rules

Modelling upgrades without expert knowledge
• Aleph learning on the 94 cases (15+/79-)
• Example of rule learned:
upgrade(A) IF
    bxneedlegauge(A,9) AND
    abndisappeared10(A,0) AND
    mass01(A,1) AND
    anotherlesionbxatsametime01(A,0).
[4+/0-]
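
To make the learned rule concrete, it can be read as a conjunction of attribute tests on a single biopsy record. The sketch below mirrors the rule as a Python predicate and counts its coverage in the [pos/neg] style used on the slide. The record layout (a dict keyed by predicate name plus a hypothetical `upgraded` field) and the three toy cases are invented for illustration and are not taken from the study data.

```python
def learned_rule(case: dict) -> bool:
    """Mirror of the Aleph rule: all four attribute tests must hold."""
    return (
        case.get("bxneedlegauge") == 9
        and case.get("abndisappeared10") == 0
        and case.get("mass01") == 1
        and case.get("anotherlesionbxatsametime01") == 0
    )

def coverage(rule, cases):
    """Return (positives covered, negatives covered) for a rule."""
    pos = sum(1 for c in cases if rule(c) and c["upgraded"])
    neg = sum(1 for c in cases if rule(c) and not c["upgraded"])
    return pos, neg

# Made-up records, only to exercise the predicate.
toy_cases = [
    {"bxneedlegauge": 9, "abndisappeared10": 0, "mass01": 1,
     "anotherlesionbxatsametime01": 0, "upgraded": True},
    {"bxneedlegauge": 14, "abndisappeared10": 0, "mass01": 1,
     "anotherlesionbxatsametime01": 0, "upgraded": False},
    {"bxneedlegauge": 9, "abndisappeared10": 0, "mass01": 0,
     "anotherlesionbxatsametime01": 0, "upgraded": False},
]

pos, neg = coverage(learned_rule, toy_cases)
print(f"[{pos}+/{neg}-]")
```

On the real data the slide reports [4+/0-] for this rule: it fires on four of the 15 upgrades and on none of the 79 non-upgrades.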

Examples of Expert rules
(1) Upgrade increases on stereo if calcifications and ADH are present
upgrade(A) IF pathdx(A,atypical_ductal_hyperplasia) AND calcifications(A,'a,f') AND biopsyprocedure(A,stereo).
(2) Upgrade increases on US (ultrasound) if high BIRADS category
upgrade(A) IF concordance(A,d) AND mammobirads(A,4B/4C/5).

Modelling upgrades with expert knowledge
• Example of rule learned:
upgrade(A) IF numOfSpecimens(A,gt6) AND rule1_2(A) AND rule1_3(A). [3+/1-]
rule1_2(A) IF pathdxabbr(A,adh) AND calcifications(A,f). [3+/4-]
rule1_3(A) IF pathdxabbr(A,adh) AND calcification_distribution(A,g). [3+/6-]
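
The coverage counts on this slide show why combining expert rules with a learned condition helps: each expert rule alone covers several negatives, while their conjunction with the specimen-count test covers only one. The sketch below restates the three rules as Python predicates and computes the precision implied by the reported [pos/neg] counts. The dict-based record layout and the `precision` helper are illustrative assumptions; the counts are the ones on the slide.

```python
def rule1_2(case: dict) -> bool:
    # Expert rule: ADH pathology with calcifications coded 'f'.
    return case.get("pathdxabbr") == "adh" and case.get("calcifications") == "f"

def rule1_3(case: dict) -> bool:
    # Expert rule: ADH pathology with calcification distribution coded 'g'.
    return (case.get("pathdxabbr") == "adh"
            and case.get("calcification_distribution") == "g")

def upgrade(case: dict) -> bool:
    # Learned rule: specimen-count test combined with both expert rules.
    return case.get("numOfSpecimens") == "gt6" and rule1_2(case) and rule1_3(case)

def precision(pos: int, neg: int) -> float:
    """Fraction of covered cases that are true upgrades."""
    return pos / (pos + neg)

# (positives covered, negatives covered) as reported on the slide.
reported = {"upgrade": (3, 1), "rule1_2": (3, 4), "rule1_3": (3, 6)}
for name, (pos, neg) in reported.items():
    print(f"{name}: precision = {precision(pos, neg):.2f}")
```

All three rules cover the same number of upgrades, so the gain from the combined rule comes entirely from excluding negatives.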

Results

Results: WEKA versus Aleph and human rules

Conclusions
• Both WEKA and Aleph can improve results when combining original data with expert knowledge (without tuning)
• WEKA improves on false positives and can be adjusted to achieve Recall of 100% with the need to re-examine around 60 women (out of 79)
• Aleph improves on false negatives and produces a better classifier with the advantage of using a language that is easily understandable by the specialist
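
The WEKA operating point described above can be turned into a rough confusion matrix. The sketch below assumes that "around 60" means exactly 60 false positives among the 79 non-upgrades, which is an approximation for illustration only; the exact figures are not given in the transcript.

```python
# Population from the slides.
positives, negatives = 15, 79

# Operating point tuned for 100% recall (assumed: exactly 60 false positives).
tp, fn = positives, 0
fp = 60
tn = negatives - fp

recall = tp / (tp + fn)              # share of upgrades that are caught
precision = tp / (tp + fp)           # share of flagged women who are upgrades
specificity = tn / (tn + fp)         # share of non-upgrades correctly cleared

print(f"recall={recall:.2f} precision={precision:.2f} specificity={specificity:.3f}")
print(f"women spared re-examination: {tn} of {negatives}")
```

Read this way, catching every upgrade still clears roughly a quarter of the non-upgraded women from re-examination, which is the trade-off the conclusion refers to.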

Future Work
• NLM project started this month
• Short term tasks:
  • Augment the current dataset
  • Review consistently misclassified examples
  • Possibly use association rules to produce a first subset of rules
• Medium term tasks:
  • Learn from the cases that were not upgraded
  • Develop the iterative cycle of learning and expert rule revision
• Long term tasks:
  • Use more sophisticated learning methods to produce better rules
  • Integrate biopsy attributes and the classifiers obtained into MammoClass (http://cracs.fc.up.pt/pt/mammoclass/)

Acknowledgments
• UW-Madison Medical School
• FCT-Portugal and Horus and DigiScope projects
• AMIA reviewers and organizers
• Special ack to the UW-Madison Medical School people who were never “scared” of computer technology and have always been very enthusiastic about using computational methodologies to help their work

CONTACT: ines@dcc.fc.up.pt
THANK YOU!

Experiment 1: Significance tests - p-values

Experiment 2: tuning
• 5x4 stratified cross validation
• Parameters tuned:
  • Minacc: 0.05 and 0.1
  • Minpos: 1, 2, 3
  • Noise: 0, 1, 2
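
The tuning grid on this slide is small enough to enumerate explicitly. The sketch below lists every combination of the three Aleph settings named above (`minacc`, `minpos`, `noise`) and estimates the number of training runs. It assumes that "5x4" means 5 outer folds, each with 4 inner folds used for tuning; the transcript does not spell this out, so treat the run count as a rough estimate.

```python
from itertools import product

# Values taken from the slide.
grid = {
    "minacc": [0.05, 0.1],   # minimum accuracy required of an accepted clause
    "minpos": [1, 2, 3],     # minimum number of positives a clause must cover
    "noise": [0, 1, 2],      # maximum number of negatives a clause may cover
}

# Cartesian product of all parameter values, one dict per setting.
settings = [dict(zip(grid, values)) for values in product(*grid.values())]

# Assumed nested scheme: 5 outer folds x 4 inner folds per candidate setting.
outer_folds, inner_folds = 5, 4
inner_runs = outer_folds * inner_folds * len(settings)

print(f"{len(settings)} parameter settings")
print(f"{inner_runs} inner training runs under the assumed 5x4 scheme")
print(settings[0], settings[-1])
```

Selecting the best setting on the inner folds and scoring it on the held-out outer fold keeps the tuning from leaking into the reported results, which matters with only 15 positive cases.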

Experiment 2: tuning, best parameters per fold

Experiment 2 - Results
p = 0.1, 0.3, 0.03

Summary

UW upgrades data
• 94 benign cases after biopsies were discussed
• 15 considered to be, in fact, malignant upgrades
• Tasks:
  • Learn characteristics of upgrades that do not appear in non-upgrades (interpretable rules)
  • Can we improve the classifier by adding expert knowledge?
