Machine Learning in BioMedical Informatics. SCE 5095: Special Topics Course Instructor: Jinbo Bi Computer Science and Engineering Dept. Course Information. Instructor: Dr. Jinbo Bi Office: ITEB 233 Phone: 860-486-1458 Email: jinbo@engr.uconn.edu
Machine Learning in BioMedical Informatics SCE 5095: Special Topics Course Instructor: Jinbo Bi Computer Science and Engineering Dept.
Course Information • Instructor: Dr. Jinbo Bi • Office: ITEB 233 • Phone: 860-486-1458 • Email: jinbo@engr.uconn.edu • Web: http://www.engr.uconn.edu/~jinbo/ • Time: Mon./Wed. 2:00pm – 3:15pm • Location: CAST 204 • Office hours: Mon. 3:30-4:30pm • HuskyCT • http://learn.uconn.edu • Login with your NetID and password • Illustration
Introduction of the instructor • Ph.D. in Mathematics • Previous work experience: • Siemens Medical Solutions Inc. • Department of Defense, Bioanalysis • Massachusetts General Hospital • Research Interests • Figure labels: subtyping, GWAS, color of flowers, cancer, psychiatric disorders, … • http://labhealthinfo.uconn.edu/EasyBreathing
Course Information • Prerequisite: Basics of linear algebra, calculus, and basics of programming • Course textbook (not required): • Introduction to Data Mining (2005) by Pang-Ning Tan, Michael Steinbach, Vipin Kumar • Pattern Recognition and Machine Learning (2006) by Christopher M. Bishop • Pattern Classification (2nd edition, 2000) by Richard O. Duda, Peter E. Hart and David G. Stork • Additional class notes and copied materials will be given • Reading material links will be provided
Course Information • Objectives: • Introduce students to the basic concepts of machine learning and the state-of-the-art literature in data mining/machine learning • Get to know some general topics in medical informatics • Focus on some high-demand medical informatics problems, with hands-on experience applying data mining techniques • Format: • Lectures, Labs, Paper reviews, A term project
Survey • Why are you taking this course? • What would you like to gain from this course? • What topics are you most interested in learning about from this course? • Any other suggestions? (Please respond before NEXT THUR. You can also log in to HuskyCT, download the MS Word file, fill it in, and email it to me.)
Grading • In-Class Lab Assignments (3): 30% • Paper review (1): 10% • Term Project (1): 50% • Participation (1): 10%
Policy • Computers • Assignments must be submitted electronically via HuskyCT • Make-up policy • If a lab assignment or a paper review assignment is missed, there will be a final take-home exam to make up • If two of these assignments are missed, an additional lab assignment and a final take-home exam will be used to make up.
Three In-class Lab Assignments • On the day an in-class lab assignment is given, the class meets in a computer lab and there is no lecture • Computer lab will be at ITEB 138 (TA reserve) • The assignment is due at the beginning of the class one week after the assignment is given • If the assignment is handed in one to two days late, 10 credits will be deducted for each additional day • Assignments will be graded by our teaching assistant
Paper review • Topics of papers for review will be discussed • Each student selects 1 paper in each assignment, prepares slides, and presents the paper in class in 8-15 min • The goal is to take a look at the state-of-the-art research work in the related field • The paper review assignment is on topics of state-of-the-art data mining techniques
Term Project • Possible project topics will be provided as links; students are encouraged to propose their own • Teams of 1-2 students can be created • Each team needs to give a presentation in the last 1-2 weeks of the class (10-15 min) • Each team needs to submit a project report • Definition of the problem • Data mining approaches used to solve the problem • Computational results • Conclusion (success or failure)
Final Exam • If you need a make-up final exam, the exam will be provided on May 1st (Wed.) • Take-home exam • Due on May 9th (Thur.)
Three In-class Lab Assignments • BioMedical Informatics Topics • So many • Cardiac Ultrasound image categorization • Computerized decision support for Trauma Patient Care • Computer assisted diagnostic coding
Cardiac ultrasound view separation Classification (or clustering) Apical 4 chamber view Parasternal long axis view Parasternal short axis view
Trauma Patient Care • 25 min of transport time/patient • High-frequency vital-sign waveforms (3 waveforms) • ECG, SpO2, Respiratory • Low-frequency vital-sign time series (9 variables) • Derived variables: ECG heart rate, SpO2 heart rate, SaO2 (arterial O2 saturation), respiratory rate • Measured variables: NIBP (systolic, diastolic, MAP), NIBP heart rate, end tidal CO2 • Discrete patient attribute data (100 variables) • Demographics, injury description, prehospital interventions, etc. • Vital signs used in decision-support algorithms: HR, RR, SaO2, SBP, DBP • Figure label: Propaq
Trauma Patient Care Heart Rate Respiratory Rate Saturation of Oxygen Blood Pressure Major Bleeding Make a prediction
Diagnostic coding • Figure: a hospital document DB (patient notes) and a diagnostic code DB (patient diagnoses and criteria, with example entries 428, 250, 414, 429, AMI, SCIP) feed a code database • Look up ICD-9 codes for diagnoses such as heart failure and diabetes • Uses: insurance reimbursement, statistics
Machine Learning / Data Mining • Data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information • The ultimate goal of machine learning is the creation and understanding of machine intelligence • The main goal of statistical learning theory is to provide a framework for studying the problem of inference, that is, of gaining knowledge, making predictions, and making decisions from a set of data.
Traditional Topics in Data Mining /AI • Fuzzy set and fuzzy logic • Fuzzy if-then rules • Evolutionary computation • Genetic algorithms • Evolutionary strategies • Artificial neural networks • Back propagation network (supervised learning) • Self-organization network (unsupervised learning, will not be covered)
Next Class • Continue with data mining topics • Review of some basics of linear algebra and probability
Last Class • Described the syllabus of this course • Talked about HuskyCT website (Illustration) • Briefly introduced 3 medical informatics topics • Medical images: cardiac echo view recognition • Numerical: Trauma patient care • Free text: ICD-9 diagnostic coding • Briefly introduced the definitions of data mining, machine learning, and statistical learning theory.
Challenges in traditional techniques • Lack of theoretical analysis of the behavior of the algorithms • Traditional techniques may be unsuitable due to • Enormity of data • High dimensionality of data • Heterogeneous, distributed nature of data • Figure labels: Statistics/AI, Machine Learning/Pattern Recognition, Soft Computing
Recent Topics in Data Mining • Supervised learning such as classification and regression • Support vector machines • Regularized least squares • Fisher discriminant analysis (LDA) • Graphical models (Bayesian nets) • others Draw from Machine Learning domains
Recent Topics in Data Mining • Unsupervised learning such as clustering • K-means • Gaussian mixture models • Hierarchical clustering • Graph based clustering (spectral clustering) • Dimension reduction • Feature selection • Compact feature space into low-dimensional space (principal component analysis)
Statistical Behavior • Many perspectives to analyze how the algorithm handles uncertainty • Simple examples: • Consistency analysis • Learning bounds (upper bound on test error of the constructed model or solution) • “Statistical” not “deterministic” • The upper bound holds with probability at least p: P(test error <= Upper_bound) >= p
Possible Tasks in Data Mining • Prediction tasks (supervised problem) • Use some variables to predict unknown or future values of other variables. • Description tasks (unsupervised problem) • Find human-interpretable patterns that describe the data. From [Fayyad et al.] Advances in Knowledge Discovery and Data Mining, 1996
Problems in Data Mining • Inference • Classification [Predictive] • Regression [Predictive] • Clustering [Descriptive] • Deviation Detection [Predictive]
Classification: Definition • Given a collection of examples (training set) • Each example contains a set of attributes; one of the attributes is the class. • Find a model for the class attribute as a function of the values of the other attributes. • Goal: previously unseen examples should be assigned a class as accurately as possible. • A test set is used to determine the accuracy of the model. Usually, the given data set is divided into training and test sets, with the training set used to build the model and the test set used to validate it.
Classification Example • Figure: a training set with two categorical attributes, one continuous attribute, and a class label is used to learn a classifier (the model), which is then applied to a test set
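The train/test workflow in the definition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy vital-sign data, the 75/25 split, and the choice of a 1-nearest-neighbor classifier are hypothetical and do not come from the slides.

```python
import math
import random


def nearest_neighbor_predict(train, x):
    """Return the class label of the training example closest to x."""
    closest = min(train, key=lambda example: math.dist(example[0], x))
    return closest[1]


def accuracy(train, test):
    """Fraction of test examples whose predicted class matches the true class."""
    correct = sum(nearest_neighbor_predict(train, x) == y for x, y in test)
    return correct / len(test)


# Hypothetical toy data: (heart rate, respiratory rate) -> class label.
examples = [
    ((62.0, 12.0), "normal"), ((70.0, 14.0), "normal"),
    ((66.0, 13.0), "normal"), ((74.0, 15.0), "normal"),
    ((118.0, 26.0), "high-risk"), ((125.0, 30.0), "high-risk"),
    ((110.0, 24.0), "high-risk"), ((130.0, 28.0), "high-risk"),
]

# Divide the given data into a training set (build the model)
# and a test set (validate it on previously unseen examples).
random.seed(0)
random.shuffle(examples)
split = int(0.75 * len(examples))
train_set, test_set = examples[:split], examples[split:]

test_accuracy = accuracy(train_set, test_set)
print(f"test accuracy: {test_accuracy:.2f}")
```

The two classes in this toy set are far apart, so the sketch mainly shows the mechanics: the model never sees the test examples while it is being built.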
Classification: Application 1 • High-Risk Patient Detection • Goal: Predict whether a patient will suffer a major complication after a surgical procedure • Approach: • Use patients' vital signs before and after the surgical operation. • Heart Rate, Respiratory Rate, etc. • Have expert medical professionals monitor patients and label which patients have complications and which do not. • Learn a model for the class of the after-surgery risk. • Use this model to detect potential high-risk patients for a particular surgical procedure
Classification: Application 2 • Face recognition • Goal: Predict the identity of a face image • Approach: • Align all images to derive the features • Model the class (identity) based on these features
Classification: Application 3 • Cancer Detection • Goal: To predict class (cancer or normal) of a sample (person), based on the microarray gene expression data • Approach: • Use expression levels of all genes as the features • Label each example as cancer or normal • Learn a model for the class of all samples
Classification: Application 4 • Alzheimer's Disease Detection • Goal: To predict class (AD or normal) of a sample (person), based on neuroimaging data such as MRI and PET • Approach: • Extract features from neuroimages • Label each example as AD or normal • Learn a model for the class of all samples Reduced gray matter volume (colored areas) detected by MRI voxel-based morphometry in AD patients compared to normal healthy controls.
Regression • Predict the value of a given continuous-valued variable based on the values of other variables, assuming a linear or nonlinear model of dependency. • Extensively studied in statistics and in the neural network field. • Examples: • Predicting sales amounts of a new product based on advertising expenditure. • Predicting wind velocities as a function of temperature, humidity, air pressure, etc. • Time series prediction of stock market indices.
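As a rough sketch of the linear case, the snippet below fits a straight line by ordinary least squares to the first example (sales versus advertising expenditure). The numbers are made up for illustration, and a single-variable linear model is an assumption, not something the slides prescribe.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising expenditure and resulting sales amounts.
advertising = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(advertising, sales)

# Predict the continuous target for a new, unseen expenditure level.
predicted_sales = slope * 6.0 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={predicted_sales:.2f}")
```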
Classification algorithms • K-Nearest-Neighbor classifiers • Naïve Bayes classifier • Neural Networks • Linear Discriminant Analysis (LDA) • Support Vector Machines (SVM) • Decision Tree • Logistic Regression • Graphical models
Clustering Definition • Given a set of data points, each having a set of attributes, and a similarity measure among them, find clusters such that • Data points in one cluster are more similar to one another. • Data points in separate clusters are less similar to one another. • Similarity Measures: • Euclidean Distance if attributes are continuous. • Other Problem-specific Measures
Illustrating Clustering • Euclidean Distance Based Clustering in 3-D space. • Intracluster distances are minimized • Intercluster distances are maximized
Clustering: Application 1 • High-Risk Patient Detection • Goal: Predict whether a patient will suffer a major complication after a surgical procedure • Approach: • Use patients' vital signs before and after the surgical operation. • Heart Rate, Respiratory Rate, etc. • Find patients whose symptoms are dissimilar from those of most other patients.
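A minimal sketch of Euclidean-distance clustering in 3-D space, using K-means as the example algorithm. The six points and the choice of initial centroids are hypothetical. Each iteration assigns every point to its nearest centroid and then moves each centroid to the mean of its cluster, which is what drives intracluster distances down.

```python
import math


def k_means(points, centroids, iterations=10):
    """Plain K-means with Euclidean distance; returns (centroids, clusters)."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical 3-D data with two visually obvious groups.
points = [
    (1.0, 1.0, 1.0), (1.2, 0.8, 1.1), (0.9, 1.1, 0.9),
    (5.0, 5.0, 5.0), (5.1, 4.9, 5.2), (4.8, 5.2, 4.9),
]
initial_centroids = [points[0], points[3]]

centroids, clusters = k_means(points, initial_centroids)
for centroid, cluster in zip(centroids, clusters):
    print(centroid, len(cluster))
```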
Clustering: Application 2 • Document Clustering: • Goal: To find groups of documents that are similar to each other based on the important terms appearing in them. • Approach: To identify frequently occurring terms in each document. Form a similarity measure based on the frequencies of different terms. Use it to cluster. • Gain: Information Retrieval can utilize the clusters to relate a new document or search term to clustered documents.
Illustrating Document Clustering • Clustering Points: 3204 Articles of Los Angeles Times. • Similarity Measure: How many words are common in these documents (after some word filtering).
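The two similarity measures mentioned for documents can be sketched as follows: a count of shared words after filtering, and a frequency-based (cosine) similarity. The three short documents, the stop-word list, and the use of cosine similarity in particular are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical word filter; a real system would use a much longer list.
STOP_WORDS = {"the", "a", "of", "in", "and", "to", "is"}


def term_frequencies(text):
    """Count how often each non-filtered word occurs in a document."""
    return Counter(w for w in text.lower().split() if w not in STOP_WORDS)


def shared_word_count(tf_a, tf_b):
    """Number of distinct words that appear in both documents."""
    return len(tf_a.keys() & tf_b.keys())


def cosine_similarity(tf_a, tf_b):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(tf_a[w] * tf_b[w] for w in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


doc1 = term_frequencies("the patient has heart failure and diabetes")
doc2 = term_frequencies("heart failure is common in the elderly patient")
doc3 = term_frequencies("stock market indices rose in the morning")

print(shared_word_count(doc1, doc2), cosine_similarity(doc1, doc2))
print(shared_word_count(doc1, doc3), cosine_similarity(doc1, doc3))
```

Either measure can be fed to a clustering algorithm; the cosine form has the advantage of not favoring long documents.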
Clustering algorithms • K-Means • Hierarchical clustering • Graph based clustering (Spectral clustering) • Semi-supervised clustering • Others
Basics of probability • An experiment (random trial) is a well-defined process with observable outcomes. • The set or collection of all outcomes of an experiment is called the sample space, S. • An event E is any subset of outcomes from S. • When all outcomes in S are equally likely, the probability of an event is P(E) = (number of outcomes in E) / (number of outcomes in S).
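A small sketch of the counting definition, using one roll of a fair six-sided die as a hypothetical experiment. It assumes every outcome in the sample space is equally likely, which is the condition under which the counting formula applies.

```python
from fractions import Fraction


def probability(event, sample_space):
    """P(E) = |E| / |S| for equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))


# Hypothetical experiment: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}
at_least_five = {5, 6}

p_even = probability(even, sample_space)
p_at_least_five = probability(at_least_five, sample_space)
print(p_even, p_at_least_five)
```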
Probability Theory • Apples and Oranges • X: identity of the fruit • Y: identity of the box • Assume P(Y=r) = 40%, P(Y=b) = 60% (prior) • P(X=a|Y=r) = 2/8 = 25% • P(X=o|Y=r) = 6/8 = 75% • P(X=a|Y=b) = 3/4 = 75% • P(X=o|Y=b) = 1/4 = 25% • Marginal: P(X=a) = 11/20, P(X=o) = 9/20 • Posterior: P(Y=r|X=o) = 2/3, P(Y=b|X=o) = 1/3
Probability Theory • Marginal Probability • Conditional Probability • Joint Probability
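The marginal and posterior values on this slide can be checked with exact fractions. The sketch below takes the slide's prior and conditional probabilities as given; the dictionary layout and function names are illustrative choices.

```python
from fractions import Fraction as F

# Prior over boxes: P(Y=r) = 40%, P(Y=b) = 60%.
prior = {"r": F(4, 10), "b": F(6, 10)}

# Conditional P(X | Y): fruit (a = apple, o = orange) given the box.
likelihood = {
    "r": {"a": F(2, 8), "o": F(6, 8)},
    "b": {"a": F(3, 4), "o": F(1, 4)},
}


def marginal(x):
    """P(X=x) = sum over boxes y of P(X=x | Y=y) * P(Y=y)."""
    return sum(likelihood[y][x] * prior[y] for y in prior)


def posterior(y, x):
    """P(Y=y | X=x) = P(X=x | Y=y) * P(Y=y) / P(X=x)."""
    return likelihood[y][x] * prior[y] / marginal(x)


print(marginal("a"), marginal("o"))
print(posterior("r", "o"), posterior("b", "o"))
```

Observing an orange raises the probability of the red box from the prior of 40% to 2/3, because oranges are much more common in that box.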
Probability Theory • Sum Rule: the marginal prob of X equals the sum of the joint prob of X and Y with respect to Y • Product Rule: the joint prob of X and Y equals the product of the conditional prob of Y given X and the prob of X
Illustration • Figure panels: joint distribution p(X,Y), marginals p(X) and p(Y), and conditional p(X|Y=1), shown for Y=1 and Y=2
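Written as equations, the two rules read as follows. The third line is Bayes' theorem, which is not stated on the slide but follows from the product rule together with the symmetry p(X, Y) = p(Y, X); it is the step used to turn a prior and a conditional into a posterior.

```latex
\begin{align}
p(X) &= \sum_{Y} p(X, Y) && \text{(sum rule)} \\
p(X, Y) &= p(Y \mid X)\, p(X) && \text{(product rule)} \\
p(Y \mid X) &= \frac{p(X \mid Y)\, p(Y)}{p(X)} && \text{(Bayes' theorem)}
\end{align}
```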