Wu-Jun Li
Department of Computer Science and Engineering
Shanghai Jiao Tong University
Lecture 0: Course Overview
Web Search and Mining
General Information
• Instructor: Wu-Jun Li (李武军)
• Email: liwujun@cs.sjtu.edu.cn
• Homepage: http://www.cs.sjtu.edu.cn/~liwujun
• Office: Rm 3-537, SEIEE Building
• Office Hours: Thur 10:00am - 11:00am
• Course web site: http://www.cs.sjtu.edu.cn/~liwujun/course/wsm.html
• Teaching Assistant: TBD
• Lecture Time: Wed 10:00 - 10:45 & 10:55 - 11:40; Fri 12:55 - 13:40 & 14:00 - 14:45
• Lecture Venue: Rm 308, Rui-Qiu Chen Building (陈瑞球楼308)
Textbook
• Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008.
• The English reprint edition (英文影印版) can be bought through China-Pub (http://www.china-pub.com/193197).
• The book can also be downloaded from its website (http://nlp.stanford.edu/IR-book/information-retrieval-book.html).
Reference Books
• Bruce Croft, Donald Metzler, and Trevor Strohman. Search Engines: Information Retrieval in Practice. Addison Wesley, 2009. (The English reprint edition can be bought through China-Pub.)
• Bing Liu. Web Data Mining: Exploring Hyperlinks, Contents and Usage Data. Springer, 2006.
• Jiawei Han and Micheline Kamber. Data Mining: Concepts and Techniques. Morgan Kaufmann, Second Edition, 2006. (The English reprint edition can be bought through China-Pub.)
• Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, Second Edition, 2009. (http://www-stat.stanford.edu/~tibs/ElemStatLearn/index.html)
• Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
Course Topics
• Architecture of search engines
• The basics of information retrieval (IR)
  - index construction and compression; Boolean retrieval; vector space model; evaluation of IR systems; relevance feedback and query expansion
• Probabilistic IR and language models
• Data mining and machine learning (ML) basics
  - supervised learning; unsupervised learning; matrix factorization
• Graph mining, social search and recommender systems
Prerequisites
• Data structures
• Design and analysis of algorithms
• Linear algebra
• Probability theory
Grading Scheme
• In-class quizzes (30%)
• Homework (30%)
• Project + presentation (40%)
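As a rough illustration of how the three weights above combine, here is a minimal Python sketch. It assumes each component is scored on a 0-100 scale and that the final score is a plain weighted sum; the slides do not state either point, and the function name, dictionary keys, and sample scores are made up for the example.

```python
# Weights taken from the grading scheme: quizzes 30%, homework 30%, project 40%.
WEIGHTS = {"quizzes": 0.30, "homework": 0.30, "project": 0.40}


def final_score(scores):
    """Weighted sum of component scores (assumed to be on a 0-100 scale)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical component scores, for illustration only.
example = {"quizzes": 80, "homework": 90, "project": 85}
print(final_score(example))
```

Because the project carries the largest weight, a given change in the project score moves the final score more than the same change in quizzes or homework.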
Late Assignments
• Assignments turned in late will be penalized 20% per late day.
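The slide does not say exactly how the 20% is applied. The sketch below shows one possible reading, in which each late day removes 20% of the earned score and the result never drops below zero. This interpretation, the function name, and the parameter names are assumptions, not the instructor's stated rule.

```python
def late_adjusted(score, days_late, penalty_per_day=0.20):
    """Apply a linear late penalty (assumed reading), floored at zero."""
    # Fraction of the earned score that remains after the penalty.
    factor = max(0.0, 1.0 - penalty_per_day * days_late)
    return score * factor


# Hypothetical example: an assignment that earned 90 points.
for days in range(0, 6):
    print(days, late_adjusted(90, days))
```

Under this reading an assignment is worth nothing once it is five or more days late; a compounding penalty or one based on the maximum score would give different numbers.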
Academic Honor Code
• Honesty and integrity are central to academic work.
• All your submitted assignments must be entirely your own (or your own group's).
• Any student found cheating or plagiarizing will receive a final score of zero for this course.