Learn about machine learning algorithms, statistics, and applications. Topics include supervised and unsupervised learning, decision trees, neural networks, and more. Recommended background: probability, statistics, and programming. Textbooks provided.
General Information
• Course Id: COSC6342 Machine Learning
• Time: TU/TH 1-2:30p
• Instructor: Christoph F. Eick
• Classroom: AH301
• E-mail: ceick@aol.com
• Homepage: http://www2.cs.uh.edu/~ceick/
What is Machine Learning?
• Machine Learning is the study of algorithms that improve their performance at some task with experience.
• Role of Statistics: inference from a sample
• Role of Computer Science: efficient algorithms to
  • solve optimization problems
  • represent and evaluate the model for inference
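The phrase "improve their performance with experience" can be made concrete with a toy illustration of statistical inference from a sample (a sketch using only the Python standard library; the coin example, the bias value, and the `estimate` function are hypothetical, not course material):

```python
import random

# Estimating a coin's bias from simulated flips: the maximum-likelihood
# estimate (inference from a sample) typically gets closer to the true
# parameter as more "experience" (data) is observed.
random.seed(0)
TRUE_P = 0.7  # hypothetical true probability of heads

def estimate(n):
    """Maximum-likelihood estimate of P(heads) from n simulated flips."""
    flips = [random.random() < TRUE_P for _ in range(n)]
    return sum(flips) / n

for n in (10, 100, 10_000):
    print(n, round(estimate(n), 3))
```

With 10 flips the estimate can be far off; with 10,000 it is reliably close to 0.7, which is the sense in which more experience improves the learner.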
Applications of Machine Learning
• Supervised Learning
  • Classification
  • Prediction
• Unsupervised Learning
  • Association Analysis
  • Clustering
  • Preprocessing and Summarization of Data
• Reinforcement Learning and Adaptation
• Activities Related to Models
  • Learning parameters of models
  • Choosing/comparing models
  • Evaluating models (e.g., predicting their accuracy)
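The supervised/unsupervised distinction above can be sketched in a few lines (standard-library Python only; the toy data, `classify_1nn`, and `kmeans_2` are illustrative names, not course code — the course covers kNN and k-means later in the topic list):

```python
import math

# Toy 2-D points: supervised learning sees the labels, unsupervised does not.
train = [((1.0, 1.0), "A"), ((1.5, 1.8), "A"),
         ((5.0, 8.0), "B"), ((6.0, 9.0), "B")]

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def classify_1nn(x):
    """Supervised: predict the label of the nearest labeled training point."""
    return min(train, key=lambda t: dist(t[0], x))[1]

def kmeans_2(points, iters=10):
    """Unsupervised: split unlabeled points into 2 clusters (Lloyd's algorithm)."""
    c0, c1 = points[0], points[-1]  # crude initial centroids
    for _ in range(iters):
        g0 = [p for p in points if dist(p, c0) <= dist(p, c1)]
        g1 = [p for p in points if dist(p, c0) > dist(p, c1)]
        c0 = tuple(sum(v) / len(g0) for v in zip(*g0)) if g0 else c0
        c1 = tuple(sum(v) / len(g1) for v in zip(*g1)) if g1 else c1
    return g0, g1

print(classify_1nn((1.2, 1.1)))          # nearest labeled point has label "A"
g0, g1 = kmeans_2([p for p, _ in train])  # recovers the two groups, no labels used
print(len(g0), len(g1))
```

The classifier needs labels at training time; the clustering routine discovers the same grouping from geometry alone.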
Prerequisites
• Probability
  • Distributions, densities, marginalization, …
• Basic statistics
  • Moments, typical distributions, regression
• Basic knowledge of optimization techniques
• Algorithms
  • Basic data structures, complexity, …
• Programming skills
  • We provide some background, but the class will be fast-paced
• Ability to deal with abstract mathematical concepts
Textbooks
Required textbook: Ethem Alpaydin, Introduction to Machine Learning, MIT Press, 2010.
Mildly recommended textbooks:
• Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
• Tom Mitchell, Machine Learning, McGraw-Hill, 1997.
Grading (Spring 2011)
• 2 exams: 61-69%
• 3 projects and 4 homeworks: 35-40%
• Attendance: 1%
Remark: weights are subject to change.
NOTE: PLAGIARISM IS NOT TOLERATED.
Topics Covered in 2011 (Based on Alpaydin)
• Topic 1: Introduction to Machine Learning
• Topic 2: Supervised Learning
• Topic 3: Bayesian Decision Theory (excluding Belief Networks)
• Topic 5: Parametric Model Estimation
• Topic 6: Dimensionality Reduction, Centering on PCA
• Topic 7: Clustering 1: Mixture Models, K-Means, and EM
• Topic 8: Non-Parametric Methods, Centering on kNN and Density Estimation
• Topic 9: Clustering 2: Density-Based Approaches
• Topic 10: Decision Trees
• Topic 11: Comparing Classifiers
• Topic 12: Combining Multiple Learners
• Topic 13: Linear Discrimination, Centering on Support Vector Machines
• Topic 14: More on Kernel Methods
• Topic 15: Graphical Models, Centering on Belief Networks
• Topic 16: Applications of Machine Learning (Urban Driving, Netflix, etc.)
• Topic 17: Hidden Markov Models
• Topic 18: Reinforcement Learning
• Topic 19: Neural Networks
• Topic 20: Computational Learning Theory
Remark: Topics 17, 19, and 20 will likely be only briefly covered or skipped, due to lack of time.
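To give a flavor of Topic 10, the simplest possible decision tree is a depth-1 "stump" fit by exhaustive threshold search (a standard-library sketch; the `fit_stump` function and the toy data are hypothetical illustrations, not material from the course):

```python
def fit_stump(xs, ys):
    """Fit a depth-1 decision tree on one feature: return the
    (threshold, left_label, right_label) minimizing training errors."""
    best = None
    labels = set(ys)
    for t in xs:                      # candidate split thresholds
        for left in labels:           # label assigned when x <= t
            for right in labels:      # label assigned when x >  t
                errs = sum((x <= t and y != left) or (x > t and y != right)
                           for x, y in zip(xs, ys))
                if best is None or errs < best[0]:
                    best = (errs, t, left, right)
    return best[1:]

xs = [1.0, 2.0, 3.0, 8.0, 9.0]
ys = ["low", "low", "low", "high", "high"]
t, left, right = fit_stump(xs, ys)
print(t, left, right)  # splits the toy data perfectly at 3.0
```

Real decision-tree learners (as covered in the course) grow such splits recursively and use impurity measures like entropy or the Gini index instead of raw error counts, but the core idea, searching for the best threshold on a feature, is the same.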
Course Projects
• February 2011: Homework 1 (available Feb. 6); Individual Project on classification and prediction — learn how to obtain, use, and evaluate models (available Feb. 10).
• March/April 2011: Group Project, a survey of a subfield of Machine Learning; Homework 2 (available after Spring Break).
• Second half of April 2011: Individual Project (short) on Reinforcement Learning and Adaptation — learn how to act intelligently in an unknown/changing environment.
Course Elements
• Total: 25-26 classes
  • 18-19 lectures
  • 3-4 classes for review and discussing course projects
  • 2 classes allocated for student presentations
• 2 exams
• Graded and ungraded paper-and-pencil problems
Schedule (ML Spring 2011, as of April 14, 2011)
Green entries: other teaching material will be used.
Exams
• Exams will be open notes/textbook.
• You will get a review list before each exam.
• Exams will center (80% or more) on material that was covered in the lectures.
• Exam scores will be immediately converted into number grades.
• We only have 2009 sample exams; I taught this course only once recently.
Other UH-CS Courses with Overlapping Content
• COSC 6368: Artificial Intelligence
  • Strong overlap: Decision Trees, Bayesian Belief Networks
  • Medium overlap: Reinforcement Learning
• COSC 6335: Data Mining
  • Strong overlap: Decision Trees, SVMs, kNN, Density-Based Clustering
  • Medium overlap: K-Means, Decision Trees, Preprocessing/Exploratory Data Analysis, AdaBoost
• COSC 6343: Pattern Classification
  • Medium overlap: all classification algorithms, feature selection — discusses these topics from a different perspective.