General Information • Course Id: COSC6342 Machine Learning • Time: MO/WE 2:30-4p • Instructor: Christoph F. Eick • Classroom: SEC 201 • E-mail: ceick@aol.com • Homepage: http://www2.cs.uh.edu/~ceick/
What is Machine Learning? • Machine Learning is the study of algorithms that improve their performance at some task with experience • Role of Statistics: Inference from a sample • Role of Computer Science: Efficient algorithms for • solving optimization problems • learning, representing and evaluating models for inference
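The definition above ("improve their performance at some task with experience") can be illustrated with a toy learner. This is a minimal sketch under stated assumptions, not material from the course: the task (deciding whether a number in [0, 1) is at least 0.5), the names `learn_threshold` and `learning_curve`, and the fixed random seed are all hypothetical choices made for illustration. The learner keeps the interval between the largest negative and the smallest positive example it has seen and guesses the midpoint.

```python
import random

TRUE_THRESHOLD = 0.5  # assumed target concept: label is True when x >= 0.5


def label(x):
    """Ground-truth labeling used to generate the training examples."""
    return x >= TRUE_THRESHOLD


def learn_threshold(examples):
    """Estimate the threshold from labeled examples (x, y).

    Returns (estimate, gap), where gap is the width of the interval
    that is still consistent with every example seen so far.
    """
    lo, hi = 0.0, 1.0  # domain bounds before any example is seen
    for x, y in examples:
        if y:
            hi = min(hi, x)  # smallest positive example
        else:
            lo = max(lo, x)  # largest negative example
    return (lo + hi) / 2, hi - lo


def error_rate(estimate):
    """Misclassified fraction for x uniform on [0, 1)."""
    return abs(estimate - TRUE_THRESHOLD)


def learning_curve(sizes, seed=0):
    """Return (n, gap, error) for increasing amounts of experience n."""
    rng = random.Random(seed)
    data = [(x, label(x)) for x in (rng.random() for _ in range(max(sizes)))]
    curve = []
    for n in sizes:
        estimate, gap = learn_threshold(data[:n])
        curve.append((n, gap, error_rate(estimate)))
    return curve


if __name__ == "__main__":
    for n, gap, err in learning_curve([1, 10, 100, 1000]):
        print(f"n={n:5d}  gap={gap:.4f}  error={err:.4f}")
```

The error of the midpoint guess is at most half the gap, and the gap can only shrink as more examples arrive, so the guaranteed performance improves with experience even though the error itself may fluctuate from one sample size to the next.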
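Example of a Decision Tree Model
[Figure: training data table (attribute types: categorical, categorical, continuous, class) next to the decision tree model learned from it]
Splitting attributes: Refund, MarSt, TaxInc
Refund = Yes: NO
Refund = No: test MarSt
  MarSt = Married: NO
  MarSt = Single, Divorced: test TaxInc
    TaxInc < 80K: NO
    TaxInc > 80K: YES
Classification Model in General: f: {yes,no} × {married,single,divorced} × ℝ+ → {yes,no}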
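The decision tree on this slide (Refund at the root, then MarSt, then TaxInc) can be read as a function from the three attributes to a yes/no class. The sketch below writes it out directly; the function name `classify`, the lowercase string encoding of the attribute values, and the treatment of an income of exactly 80K (the slide only labels the branches "< 80K" and "> 80K") are assumptions made for illustration.

```python
def classify(refund, marital_status, taxable_income):
    """Apply the slide's decision tree and return the class, "yes" or "no".

    refund:          "yes" or "no"
    marital_status:  "married", "single" or "divorced"
    taxable_income:  non-negative number, in dollars
    """
    if refund not in ("yes", "no"):
        raise ValueError(f"unexpected refund value: {refund!r}")
    if marital_status not in ("married", "single", "divorced"):
        raise ValueError(f"unexpected marital status: {marital_status!r}")

    # Root split: Refund
    if refund == "yes":
        return "no"

    # Second split: MarSt
    if marital_status == "married":
        return "no"

    # Third split (single or divorced): TaxInc
    # Assumption: exactly 80K falls on the "no" side; the slide leaves it open.
    return "yes" if taxable_income > 80_000 else "no"


if __name__ == "__main__":
    print(classify("no", "divorced", 95_000))
    print(classify("yes", "single", 125_000))
```

Each path from the root to a leaf corresponds to one chain of `if` tests, which is why a learned tree is easy to inspect and explain.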
Machine Learning Tasks • Supervised Learning • Classification • Prediction • Unsupervised Learning and Summarization of Data • Association Analysis • Clustering • Preprocessing • Reinforcement Learning and Adaptation • Activities Related to Models • Learning parameters of models • Choosing/Comparing models • Evaluating Models (e.g. predicting their accuracy)
Prerequisites Background • Probabilities • Distributions, densities, marginalization… • Basic statistics • Moments, typical distributions, regression • Basic knowledge of optimization techniques • Algorithms • basic data structures, complexity… • Programming skills • We provide some background, but the class will be fast paced • Ability to deal with “abstract mathematical concepts”
Textbooks Textbook: Ethem Alpaydin, Introduction to Machine Learning, MIT Press, Second Edition, 2010. Mildly Recommended Textbooks: Christopher M. Bishop, Pattern Recognition and Machine Learning, 2006. Tom Mitchell, Machine Learning, McGraw-Hill, 1997.
Grading Spring 2014 • 2 Exams: 58-62% • 3 Projects and 2 HW: 38-41% • Attendance: 1% Remark: Weights are subject to change. NOTE: PLAGIARISM IS NOT TOLERATED.
Topics Covered in 2014 (Based on Alpaydin)
Topic 1: Introduction to Machine Learning
Topic 18: Reinforcement Learning
Topic 2: Supervised Learning
Topic 3: Bayesian Decision Theory (excluding Belief Networks)
Topic 5: Parametric Model Estimation
Topic 6: Dimensionality Reduction Centering on PCA
Topic 7: Clustering1: Mixture Models, K-Means and EM
Topic 8: Non-Parametric Methods Centering on kNN and density estimation
Topic 9: Clustering2: Density-based Approaches
Topic 10: Decision Trees
Topic 11: Comparing Classifiers
Topic 12: Combining Multiple Learners
Topic 13: Linear Discrimination Centering on Support Vector Machines
Topic 14: More on Kernel Methods
Topic 15: Graphical Models Centering on Belief Networks
Topic 16: Success Stories of Machine Learning
Topic 17: Hidden Markov Models
Topic 19: Neural Networks
Topic 20: Computational Learning Theory
Remark: Topics 17, 19, and 20 will likely be covered only briefly or skipped due to lack of time. For Topic 16 your input is appreciated!
Course Elements • Total: 26-27 classes • 18-19 lectures • 3 course projects • 2-3 classes for review and discussing course projects • 1-2 classes will be allocated for student presentations • 3 40-minute reviews • 2 exams • Graded and ungraded paper-and-pencil home problems • Course Webpage: http://www2.cs.uh.edu/~ceick/ML/ML.html
2014 Plan of Course Activities
Through March 15: Homework1; Individual Project1 (Reinforcement Learning and Adaptation: Learn how to act intelligently in an unknown/changing environment); Homework2.
We., March 5: Midterm Exam
March 16-April 5: Group Project2 (TBDL).
April 6-April 26: Homework3, Project3 (TBDL)
Mo., May 5, 2p: Final Exam
Remark: The schedule is the same as in 2013, except that reinforcement learning will be covered after the introduction. [Schedule: ML Spring 2013. Green: will use other teaching material]
Exams • Will be open notes/textbook • Will get a review list before the exam • Exams will center (80% or more) on material that was covered in the lecture • Exam scores will be immediately converted into number grades • We only have 2009, 2011 and 2013 sample exams; I taught this course only three times recently.
Other UH-CS Courses with Overlapping Contents • COSC 6368: Artificial Intelligence • Strong Overlap: Decision Trees, Bayesian Belief Networks • Medium Overlap: Reinforcement Learning • COSC 6335: Data Mining • Strong Overlap: Decision Trees, SVM, kNN, Density-based Clustering • Medium Overlap: K-means, Decision Trees, Preprocessing/Exploratory DA, AdaBoost • COSC 6343: Pattern Classification?!? • Medium Overlap: all classification algorithms, feature selection; discusses those topics from a different perspective.
Purpose of COSC 6342 • Machine Learning is the study of how to build computer systems that learn from experience. It intersects with statistics, cognitive science, information theory, artificial intelligence, pattern recognition and probability theory, among others. The course will explain how to build systems that learn and adapt using real-world applications. Its main themes include: • Learning how to create models from examples that classify or predict • Learning in unknown and changing environments • Theory of machine learning • Preprocessing • Unsupervised learning and other learning paradigms
Course Objectives COSC 6342 • Upon completion of this course, students • will know what the goals and objectives of machine learning are • will have a basic understanding of how to use machine learning to build real-world systems • will have sound knowledge of popular classification and prediction techniques, such as decision trees, support vector machines, nearest-neighbor approaches and regression • will learn how to build systems that explore unknown and changing environments • will get some exposure to machine learning theory, in particular how to learn models that exhibit high accuracies • will have some exposure to more advanced topics, such as ensemble approaches, kernel methods, unsupervised learning, feature selection and generation, and density estimation