General Information
• Course Id: COSC6342 Machine Learning
• Time: TU/TH 10a-11:30a
• Instructor: Christoph F. Eick
• Classroom: AH123
• E-mail: ceick@aol.com
• Homepage: http://www2.cs.uh.edu/~ceick/
What is Machine Learning?
• Machine Learning is the study of algorithms that improve their performance at some task with experience
• Role of statistics: inference from a sample
• Role of computer science: efficient algorithms to
  • solve optimization problems
  • represent and evaluate the model for inference
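The definition above can be made concrete with a minimal Python sketch. It is not part of the course materials. It assumes a hypothetical task (learning a threshold on the interval [0, 1]), a hypothetical target value `TRUE_THRESHOLD`, and synthetic, randomly generated examples as the "experience". Performance is measured by how narrow the range of thresholds consistent with the data has become.

```python
import random

TRUE_THRESHOLD = 0.5  # hypothetical target concept: label is True when x >= 0.5


def version_space(examples):
    """Return (lo, hi), the range of thresholds consistent with the labeled examples."""
    lo, hi = 0.0, 1.0
    for x, label in examples:
        if label:
            hi = min(hi, x)   # positive example: the threshold is at most x
        else:
            lo = max(lo, x)   # negative example: the threshold is above x
    return lo, hi


def learn_threshold(examples):
    """Guess the midpoint of the consistent range."""
    lo, hi = version_space(examples)
    return (lo + hi) / 2


# Synthetic "experience": 1000 labeled points, seeded for repeatability.
rng = random.Random(0)
data = [(x, x >= TRUE_THRESHOLD) for x in (rng.random() for _ in range(1000))]

for n in (10, 100, 1000):
    lo, hi = version_space(data[:n])
    error = abs(learn_threshold(data[:n]) - TRUE_THRESHOLD)
    print(f"n={n:4d}  uncertainty={hi - lo:.4f}  error={error:.4f}")
```

Because each training set here extends the previous one, the uncertainty can only shrink as examples are added, and the error of the midpoint guess is never more than half of that uncertainty.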
Applications of Machine Learning
• Supervised Learning
  • Classification
  • Prediction
• Unsupervised Learning
  • Association Analysis
  • Clustering
• Preprocessing and Summarization of Data
• Reinforcement Learning
• Activities Related to Models
  • Learning parameters of models
  • Choosing/Comparing models
  • …
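To contrast the first two categories, here is a small Python sketch under stated assumptions. The one-dimensional data, the label names, and the helper functions are all hypothetical. The supervised part uses a nearest-centroid classifier, chosen only for brevity. The unsupervised part runs a two-cluster k-means on the same points with the labels withheld.

```python
def mean(values):
    return sum(values) / len(values)


# Supervised (classification): labels are given with the training points.
def fit_centroids(points, labels):
    """Learn one centroid per class from labeled points."""
    return {
        c: mean([p for p, l in zip(points, labels) if l == c])
        for c in sorted(set(labels))
    }


def classify(centroids, x):
    """Assign x to the class with the nearest centroid."""
    return min(centroids, key=lambda c: abs(x - centroids[c]))


# Unsupervised (clustering): only the points are given, no labels.
def two_means(points, iterations=10):
    """k-means with k=2, initialized at the smallest and largest point."""
    centers = [min(points), max(points)]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            groups[nearest].append(p)
        # Keep the old center if a group happens to be empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers


# Hypothetical toy data.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels = ["low", "low", "low", "high", "high", "high"]

centroids = fit_centroids(points, labels)
print(classify(centroids, 1.5), classify(centroids, 4.0))
print(two_means(points))
```

Both parts end up with two group centers, but only the supervised one can name the groups, because the names come from the labels.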
Prerequisites Background
• Probabilities
  • Distributions, densities, marginalization…
• Basic statistics
  • Moments, typical distributions, regression
• Basic knowledge of optimization techniques
• Algorithms
  • Basic data structures, complexity…
• Programming skills
• We provide some background, but the class will be fast paced
• Ability to deal with “abstract mathematical concepts”
Textbooks
• Textbook: Ethem Alpaydin, Introduction to Machine Learning, MIT Press, 2004.
• Recommended Textbooks:
  • Christopher M. Bishop, Pattern Recognition and Machine Learning, 2006.
  • Tom Mitchell, Machine Learning, McGraw-Hill, 1997.
Grading
• 3 Exams: 67-70%
• Project: 18-24%
• Homeworks: 10-15%
• Attendance: 1-2%
Remark: Weights are subject to change.
NOTE: PLAGIARISM IS NOT TOLERATED.
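As an illustration of how such weights combine into a course score, here is a short Python sketch. The specific weights are hypothetical picks inside the ranges listed above (the slide says the weights are subject to change), the component scores are made up, and treating "exams" as a single averaged score over the three exams is also an assumption.

```python
# Ranges from the grading slide, expressed as fractions.
RANGES = {
    "exams": (0.67, 0.70),
    "project": (0.18, 0.24),
    "homeworks": (0.10, 0.15),
    "attendance": (0.01, 0.02),
}

# Hypothetical weights inside those ranges; they must sum to 1.
WEIGHTS = {"exams": 0.68, "project": 0.20, "homeworks": 0.10, "attendance": 0.02}


def course_score(scores, weights=WEIGHTS):
    """Weighted sum of component scores, each on a 0-100 scale."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * scores[name] for name in weights)


# Made-up component scores; "exams" is assumed to be the average of the three exams.
scores = {"exams": 85.0, "project": 90.0, "homeworks": 80.0, "attendance": 100.0}
print(round(course_score(scores), 1))
```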
Topics Covered in 2009 (Based on Alpaydin)
• Topic 1: Introduction
• Topic 2: Supervised Learning
• Topic 3: Bayesian Decision Theory (excluding Belief Networks)
• Topic 4: Using Curve Fitting as an Example to Discuss Major Issues in ML
• Topic 5: Parametric Model Selection
• Topic 6: Dimensionality Reduction Centering on PCA
• Topic 7: Clustering 1: Mixture Models, K-Means and EM
• Topic 8: Non-Parametric Methods Centering on kNN and Density Estimation
• Topic 9: Clustering 2: Density-based Approaches
• Topic 10: Decision Trees
• Topic 11: Comparing Classifiers
• Topic 12: Combining Multiple Learners
• Topic 13: Linear Discrimination
• Topic 14: More on Kernel Methods
• Topic 15: Naive Bayes and Belief Networks
• Topic 16: Hidden Markov Models
• Topic 17: Sampling
• Topic 18: Reinforcement Learning
• Topic 19: Neural Networks
• Topic 20: Computational Learning Theory
Remark: Topics 14, 16, 17, 19, and 20 will likely be covered only briefly or skipped due to lack of time.
Course Project
• The project will center on the application of machine learning techniques to a challenging problem. It will be conducted in the window Feb. 12-April 11.
• You can either conduct some novel experiments by applying machine learning algorithm(s) to a challenging machine learning task or attempt a theoretical analysis.
• Findings of the project will be summarized in a report and in a brief presentation.
• The report must include a short survey of related work with the corresponding list of references.
March 31, 2009 Tentative ML Spring 2009 Schedule
Course Elements
• Total: 25-26 classes
  • 18 lectures
  • 2-3 classes for review and discussing homework problems
  • 2 classes will be allocated for student presentations
  • 3 exams
• Homeworks
  • individually graded
  • group graded
  • not graded (solutions will be discussed in lecture 7-9 days later)
Exams
• Exams will be open notes/textbook
• You will get a review list before each exam
• Exams will center (80% or more) on material that was covered in the lecture
• There will be a review prior to the second and third exams; the first exam will mostly center on basics
• Exam scores will be immediately converted into number grades
• No sample exams; sorry, I haven’t taught this course for a long time…
Other UH-CS Courses with Overlapping Contents
• COSC 6368: Artificial Intelligence
  • Strong Overlap: Decision Trees, Bayesian Belief Networks
  • Medium Overlap: Reinforcement Learning
• COSC 6335: Data Mining
  • Strong Overlap: Decision trees, SVM, kNN, Density-based Clustering
  • Medium Overlap: K-means, Decision Trees, Preprocessing/Exploratory DA, AdaBoost
• COSC 6343: Pattern Classification
  • Medium Overlap: all classification algorithms, feature selection; discusses those topics taking a different perspective.