Applications of Machine Learning to Medical Imaging. Daniela S. Raicu, PhD Associate Professor, CDM DePaul University Email: draicu@cs.depaul.edu Lab URL: http://facweb.cs.depaul.edu/research/vc/. MS in CS from Wayne State University, Michigan. PhD in CS from Oakland University, Michigan.
Applications of Machine Learning to Medical Imaging Daniela S. Raicu, PhD Associate Professor, CDM DePaul University Email: draicu@cs.depaul.edu Lab URL: http://facweb.cs.depaul.edu/research/vc/
About me… • BS in Mathematics from University of Bucharest, Romania • MS in CS from Wayne State University, Michigan • PhD in CS from Oakland University, Michigan
My dissertation work • Research areas: Data Mining & Computer Vision • Dissertation topic: Content-based image retrieval • Research hypothesis: “A picture is worth thousands of words…” • “There is enough information in the image content to perform image retrieval whose similarity results correspond to the human perceived similarity”.
My dissertation work (cont) • Research hypothesis: • “There is enough information in the image content to perform image retrieval whose similarity results correspond to the human perceived similarity”. • Methodology: • 1) extract color image features, 2) define color-based similarity, 3) cluster images based on color, 4) retrieve similar images • Output: • Color-based CBIR for general purpose image datasets • Proof of hypothesis: • Google similar images: • http://similar-images.googlelabs.com/
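The methodology above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the quantized RGB histogram, the histogram-intersection similarity, and all function names are assumptions chosen for illustration, and the clustering step (3) is omitted for brevity.

```python
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def color_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Step 1: quantize each RGB channel and build a normalized joint histogram."""
    step = 256 / bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each channel value to a bin index in 0..bins_per_channel-1.
        rb = min(int(r / step), bins_per_channel - 1)
        gb = min(int(g / step), bins_per_channel - 1)
        bb = min(int(b / step), bins_per_channel - 1)
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
    total = float(len(pixels)) or 1.0
    return [count / total for count in hist]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Step 2: color-based similarity in [0, 1]; 1 means identical histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def retrieve(query: Sequence[float], database: Dict[str, List[float]], k: int = 3):
    """Step 4: return the k database images most similar to the query."""
    scored = [(histogram_intersection(query, h), name) for name, h in database.items()]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(name, score) for score, name in scored[:k]]


# Toy "images" given as flat pixel lists (hypothetical data).
red = [(250, 10, 10)] * 4
mostly_red = [(250, 10, 10)] * 3 + [(10, 10, 250)]
blue = [(10, 10, 250)] * 4

database = {"mostly_red": color_histogram(mostly_red), "blue": color_histogram(blue)}
results = retrieve(color_histogram(red), database, k=2)
```

In this toy example the all-red query ranks the mostly red image first, because three quarters of its pixels fall in the same color bin as the query.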
Towards an academic career • Assistant Professor at DePaul, 2002-2008 • Associate Professor, 2008-Present • Teaching areas & research interests: • data analysis, data mining, image processing, computer vision & medical informatics • Co-director of the Intelligent Multimedia Processing, Medical Informatics lab & the NSF REU Program in Medical Informatics
Outline Part I: Introduction to Medical Informatics • Medical Informatics • Clinical Decision Making • Imaging Modalities and Medical Imaging • Basic Concepts in Image Processing Part II: Advances in Medical Imaging Research • Computer-Aided Diagnosis • Computer-Aided Diagnostic Characterization • Texture-based Classification • Content-based Image Retrieval
Medical informatics research What is medical informatics? Medical informatics is the application of computers, communications and information technology and systems to all fields of medicine - medical care - medical education - medical research. MF Collen, MEDINFO '80, Tokyo
What is medical informatics? Medical informatics is the branch of science concerned with the use of computers and communication technology to acquire, store, analyze, communicate, and display medical information and knowledge to facilitate understanding and improve the accuracy, timeliness, and reliability of decision-making. Warner, Sorenson and Bouhaddou, Knowledge Engineering in Health Informatics, 1997
Clinical decision making Making sound clinical decisions requires: – right information, right time, right format Clinicians face a surplus of information – ambiguous, incomplete, or poorly organized Rising tide of information – Expanding knowledge sources 40K new biomedical articles per month Publicly accessible online health info Hundreds of pictures per scan for one patient
Clinical decision making: What is the problem? Man is an imperfect data processor – We are sensitive to the quantity and organization of information Army officers and pilots commit ‘fatal errors’ when given too much, too little, or poorly organized data The same is true for clinicians who ‘watch’ for events Clinicians are particularly susceptible to errors of omission
Clinical decision making: What is the problem? Humans are “non-perfectable” data processors - Better performance requires more time to process - Irony • Clinicians increasingly face productivity expectations • Clinicians face increasing administrative tasks
Subdomains of medical informatics (by Wikipedia) imaging informatics clinical informatics nursing informatics consumer health informatics public health informatics dental informatics clinical research informatics bioinformatics pharmacy informatics
What is medical imaging (MI)? The study of medical imaging is concerned with the interaction of all forms of radiation with tissue and the development of appropriate technology to extract clinically useful information (usually displayed in an image format) from observation of this technology. Sources of Images: • Structural/anatomical information (CT, MRI, US) - within each elemental volume, tissue-differentiating properties are measured. • Information about function (PET, SPECT, fMRI).
The imaging “chain” (diagram; stages shown): Signal acquisition, Raw data, Reconstruction, Filtering, Processing, Analysis, Quantitative output
Image analysis: Turning an image into data • Extraction modes: user-extracted qualitative features, user-extracted quantitative features, semi-automated, automated • Exam level: Feature 1, Feature 2, Feature 3, ... • Finding level: Feature 1, Feature 2, ...
Major advances in medical imaging These major advances can play a major role in early detection, diagnosis, and computerized treatment planning in cancer radiation therapy. • Image Segmentation • Image Classification • Computer-Aided Diagnosis Systems • Computer-Aided Diagnostic Characterization • Content-based Image Retrieval • Image Annotation
Computer-Aided Diagnosis • Computer-Aided Diagnosis (CAD) is a diagnosis made by a radiologist when the output of computerized image analysis methods has been incorporated into his or her medical decision-making process. • CAD may be interpreted broadly to incorporate both • the abnormality detection task and • the classification task: likelihood that the abnormality represents a malignancy
Motivation for CAD systems The amount of image data acquired during a CT scan is becoming overwhelming for human vision, and the overload of image data for interpretation may result in oversight errors. Computer-Aided Diagnosis for: • Breast Cancer • Lung Cancer • A thoracic CT scan generates about 240 section images for radiologists to interpret. • Colon Cancer • CT colonography (virtual colonoscopy) is being examined as a potential screening device (400-700 images)
CAD for Breast Cancer A mammogram is an X-ray of breast tissue used as a screening tool to search for cancer when there are no symptoms. A mammogram detects lumps, changes in breast tissue, or calcifications when they are too small to be found in a physical exam. • Abnormal tissue shows up as dense white on mammograms. • The left scan shows a normal breast while the right one shows malignant calcifications.
CAD for Lung Cancer • Identification of lung nodules in thoracic CT scans; the identification is complicated by the blood vessels • Once a nodule has been detected, it may be quantitatively analyzed as follows: • The classification of the nodule as benign or malignant • The evaluation of the temporal change in the nodule size.
CAD for Colon Cancer • Virtual colonoscopy (CT colonography) is a minimally invasive imaging technique that combines volumetrically acquired helical CT data with advanced graphical software to create two and three-dimensional views of the colon. Three-dimensional endoluminal view of the colon showing the appearance of normal haustral folds and a small rounded polyp.
Role of Image Analysis & Machine Learning for CAD • An overall scheme for computer-aided diagnosis systems
SoC Medical imaging research projects 1. Computer-aided characterization for lung nodules Goal: establish the link between computer-based image features of lung nodules in CT scans and visual descriptors defined by human experts (semantic concepts) for automatic interpretation of lung nodules Example: This lung nodule has a “solid” texture and has a “sharp” margin
Why computer-aided characterization? Example ratings from four radiologists: • Lobulation=4, Malignancy=5 “highly suspicious”, Sphericity=2 • Lobulation=1 “marked”, Malignancy=5 “highly suspicious”, Sphericity=4 • Lobulation=2, Malignancy=5 “highly suspicious”, Sphericity=5 “round” • Lobulation=5 “none”, Malignancy=5 “highly suspicious”, Sphericity=3 “ovoid” Ratings and boundaries differ across radiologists!
Computer-aided characterization • Research Hypothesis • “The working hypothesis is that certain radiologists’ assessments can be mapped to the most important low-level image features”. • Methodology • new semi-supervised probabilistic learning approaches that will deal with both the inter-observer variability and the small set of labeled data (annotated lung nodules). • Our proposed learning approach will be based on an ensemble of classifiers (instead of a single classifier as with most CAD systems) built to emulate the LIDC ensemble (panel) of radiologists.
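As a rough illustration of the ensemble idea above, the sketch below combines several classifiers into a distribution over ratings, loosely mirroring a panel of radiologists who may disagree. It is not the proposed semi-supervised method: the threshold-based toy classifiers, the feature values, and every name here are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

# A classifier maps a feature vector to a semantic rating (here 1..5).
Classifier = Callable[[Sequence[float]], int]


def make_threshold_classifier(feature_index: int, cutoffs: Sequence[float]) -> Classifier:
    """Toy stand-in for a trained model: the rating grows with one feature value."""
    def classify(features: Sequence[float]) -> int:
        value = features[feature_index]
        return 1 + sum(value > cutoff for cutoff in cutoffs)
    return classify


def panel_prediction(classifiers: List[Classifier], features: Sequence[float]) -> Dict[int, float]:
    """Return the fraction of ensemble members voting for each rating."""
    votes = Counter(clf(features) for clf in classifiers)
    n = len(classifiers)
    return {rating: count / n for rating, count in sorted(votes.items())}


def consensus_rating(distribution: Dict[int, float]) -> int:
    """Pick the most supported rating; ties go to the lower rating."""
    return max(distribution, key=lambda rating: (distribution[rating], -rating))


# Four ensemble members, one per simulated "radiologist" (hypothetical cutoffs).
panel = [
    make_threshold_classifier(0, [0.2, 0.4, 0.6, 0.8]),
    make_threshold_classifier(0, [0.1, 0.3, 0.5, 0.7]),
    make_threshold_classifier(1, [0.2, 0.4, 0.6, 0.8]),
    make_threshold_classifier(1, [0.25, 0.5, 0.75, 0.9]),
]

# Two hypothetical low-level image features for one nodule.
distribution = panel_prediction(panel, [0.65, 0.45])
rating = consensus_rating(distribution)
```

Keeping the full vote distribution, rather than only the consensus, preserves the disagreement among members, which is one simple way to represent inter-observer variability.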
Computer-aided characterization (cont.) Expected outcome: an optimal set of quantitative diagnostic features linked to the visual descriptors (semantic concepts). Significance: The derived mappings can serve to show the computer interpretation of the corresponding radiologist rating in terms of a set of standard and objective image features, automatically annotate new images, and augment the lung nodule retrieval results with their probabilistic diagnostic interpretations.
Computer-aided characterization • Preliminary results • NIH Lung Image Database Consortium (LIDC): • 149 distinct nodules from about 85 cases/patients; • four radiologists marked the nodules using 9 semantic characteristics on a scale from 1 to 5 except for calcification (1 to 6) and internal structure (1 to 4)
Computer-aided characterization • LIDC high level concepts & ratings
Computer-aided characterization • Low-level image features
Computer-aided characterization • Accuracy results
Computer-aided characterization • Challenges • Small number of training samples and large number of features (the “curse of dimensionality” problem) • Nodule size • Variation in the nodules’ boundaries • Different types of imaging acquisition parameters • Clinical evaluation: observer performance studies require collaboration with medical schools or hospitals
SoC Medical imaging research projects 2. Texture-based Pixel Classification - tissue segmentation - context-sensitive tools for radiology reporting (diagram; stages shown): Pixel Level Texture Extraction, Pixel Level Classification, Organ Segmentation
Texture-based Pixel Classification • Texture feature extraction: consider the texture in the neighborhood of the pixel of interest. • Capture texture characteristics by estimating the joint conditional probability of pixel-pair occurrences Pij(d,θ). • Pij denotes the normalized co-occurrence matrix specified by the displacement (d) and the angle (θ).
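A minimal sketch of the co-occurrence computation follows. It assumes the image (or the neighborhood window around a pixel) is already quantized to a small number of gray levels, and it expresses the displacement and angle as a pixel offset (dx, dy), so d=1 with θ=0° becomes dx=1, dy=0. The energy and cluster tendency formulas are common textbook definitions and may differ in detail from the ones used in the cited work.

```python
from typing import List, Sequence


def cooccurrence_matrix(image: Sequence[Sequence[int]], levels: int,
                        dx: int, dy: int) -> List[List[float]]:
    """Normalized co-occurrence matrix P[i][j] for the pixel offset (dx, dy).

    P[i][j] estimates the probability that a pixel with gray level i has a
    neighbor with gray level j at the given offset.
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    pairs = 0
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[image[y][x]][image[ny][nx]] += 1.0
                pairs += 1
    if pairs == 0:
        return counts
    return [[c / pairs for c in row] for row in counts]


def energy(p: Sequence[Sequence[float]]) -> float:
    """Sum of squared entries; high for uniform textures."""
    return sum(v * v for row in p for v in row)


def cluster_tendency(p: Sequence[Sequence[float]]) -> float:
    """Sum of (i + j - mu_i - mu_j)^2 * P[i][j] over all gray-level pairs."""
    n = len(p)
    mu_i = sum(i * p[i][j] for i in range(n) for j in range(n))
    mu_j = sum(j * p[i][j] for i in range(n) for j in range(n))
    return sum(((i - mu_i) + (j - mu_j)) ** 2 * p[i][j]
               for i in range(n) for j in range(n))


# Toy 4x4 window with 4 gray levels (hypothetical data).
window = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = cooccurrence_matrix(window, levels=4, dx=1, dy=0)
features = (energy(p), cluster_tendency(p))
```

For pixel-level classification, the same functions would be applied to a small window centered on each pixel, yielding one feature vector per pixel.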
Examples of Texture Images Texture images: original image, energy and cluster tendency, respectively. M. Kalinin, D. S. Raicu, J. D. Furst, D. S. Channin, "A Classification Approach for Anatomical Regions Segmentation", The IEEE International Conference on Image Processing (ICIP), Genoa, Italy, September 11-14, 2005.
Texture Classification of Tissues in CT Chest/Abdomen Example of Liver Segmentation (figure panels): Original Image; Initial Seed at 90%; Split & Merge at 85%; Split & Merge at 80%; Region growing at 70%; Region growing at 60%; Segmentation Result. (J.D. Furst, R. Susomboon, and D.S. Raicu, "Single Organ Segmentation Filters for Multiple Organ Segmentation", IEEE 2006 International Conference of the Engineering in Medicine and Biology Society (EMBS'06))
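The region-growing stage of this example can be sketched as below. The sketch assumes that the percentages in the figure are thresholds on a per-pixel probability of belonging to the organ, produced by the pixel-level classifier; that reading, the 4-connected growth rule, and the toy probability map are assumptions, and the split-and-merge stages are not shown.

```python
from collections import deque
from typing import List, Sequence, Set, Tuple

Point = Tuple[int, int]  # (row, column)


def region_grow(prob_map: Sequence[Sequence[float]], seed: Point,
                threshold: float) -> Set[Point]:
    """Grow a 4-connected region from the seed over pixels with probability >= threshold."""
    rows, cols = len(prob_map), len(prob_map[0])
    region: Set[Point] = set()
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        if (y, x) in region:
            continue
        if not (0 <= y < rows and 0 <= x < cols):
            continue
        if prob_map[y][x] < threshold:
            continue
        region.add((y, x))
        queue.extend([(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)])
    return region


# Hypothetical per-pixel organ probabilities.
prob_map: List[List[float]] = [
    [0.95, 0.92, 0.30, 0.10],
    [0.91, 0.75, 0.65, 0.20],
    [0.40, 0.72, 0.62, 0.15],
    [0.10, 0.20, 0.30, 0.85],
]

seed = (0, 0)
core = region_grow(prob_map, seed, 0.90)     # high-confidence seed region
grown_70 = region_grow(prob_map, seed, 0.70)  # relaxed threshold
grown_60 = region_grow(prob_map, seed, 0.60)  # relaxed further
```

Lowering the threshold only ever enlarges the region, and a high-probability pixel that is not connected to the seed, such as the bottom-right one here, stays outside it.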
Classification models: challenges (a) Optimal selection of an adequate set of textural features is a challenge, especially with the limited data we often have to deal with in clinical problems. Consequently, the effectiveness of any classification system will always be conditional on two things: (i) how well the selected features describe the tissues (ii) how well the study group reflects the overall target patient population for the corresponding diagnosis
Classification models: challenges (b) how other types of information can be incorporated into the classification models: - metadata - image features from other imaging modalities (need for image fusion) (c) how stable and general the classification models are
Content-based medical image retrieval (CBMS) systems • Definition of Content-based Image Retrieval: • Content-based image retrieval is a technique for retrieving images on the basis of automatically derived image features such as texture and shape. • Applications of Content-based Image Retrieval: • Teaching • Research • Diagnosis • PACS and Electronic Patient Records
Diagram of a CBIR system (components shown): Query Image, Feature Extraction, Image Features [D1, D2,…Dn], Image Database, Similarity Retrieval, Query Results, User Evaluation, Feedback Algorithm http://viper.unige.ch/~muellerh/demoCLEFmed/index.php
CBIR as a Diagnosis Aid An image retrieval system can help when the diagnosis depends strongly on direct visual properties of images in the context of evidence-based medicine or case-based reasoning.
CBIR as a Teaching Tool An image retrieval system will allow students/teachers to browse available data themselves in an easy and straightforward fashion by clicking on “show me similar images”. Advantages: - stimulate self-learning and a comparison of similar cases - find optimal cases for teaching • Teaching files: • Casimage: http://www.casimage.com • myPACS: http://www.mypacs.net
CBIR as a Research Tool • Image retrieval systems can be used: • to complement text-based retrieval methods • for visual knowledge management whereby the images and associated textual data can be analyzed together • multimedia data mining can be applied to learn the unknown links between visual features and diagnosis or other patient information • for quality control to find images that might have been misclassified
CBIR as a tool for lookup and reference in CT chest/abdomen • Case Study: lung nodules retrieval • Lung Imaging Database Resource for Imaging Research http://imaging.cancer.gov/programsandresources/InformationSystems/LIDC/page7 • 29 cases, 5,756 DICOM images/slices, 1,143 nodule images • 4 radiologists annotated the images using 9 nodule characteristics: calcification, internal structure, lobulation, malignancy, margin, sphericity, spiculation, subtlety, and texture • Goals: • Retrieve nodules based on image features: • Texture, Shape, and Size • Find the correlations between the image features and the radiologists’ annotations
Choose an image feature & a similarity measure M. Lam, T. Disney, M. Pham, D. Raicu, J. Furst, “Content-Based Image Retrieval for Pulmonary Computed Tomography Nodule Images”, SPIE Medical Imaging Conference, San Diego, CA, February 2007
CBIR systems: challenges • Type of features • image features: • - texture features: statistical, structural, model-based and filter-based • - shape features • textual features (such as physician annotations) • Similarity measures • - point-based and distribution-based metrics • Retrieval performance: • precision and recall • clinical evaluation
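To make the similarity and evaluation terms above concrete, here is a short sketch. Euclidean distance stands in for a point-based metric and one common form of the chi-square distance stands in for a distribution-based metric; these choices, the function names, and the nodule identifiers are illustrative assumptions and not necessarily the measures used in the cited study.

```python
import math
from typing import Sequence, Set, Tuple


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Point-based metric: compares two feature vectors element by element."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def chi_square_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Distribution-based metric: compares two histograms bin by bin."""
    return sum((x - y) ** 2 / (x + y) for x, y in zip(h1, h2) if x + y > 0)


def precision_recall(retrieved: Sequence[str], relevant: Set[str]) -> Tuple[float, float]:
    """Precision: fraction of retrieved items that are relevant.
    Recall: fraction of relevant items that were retrieved."""
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query result and ground truth for one query nodule.
retrieved = ["n1", "n2", "n3", "n4"]
relevant = {"n1", "n3", "n7"}
precision, recall = precision_recall(retrieved, relevant)
```

In this example two of the four retrieved nodules are relevant (precision 0.5) and two of the three relevant nodules are found (recall about 0.67).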