Midterm Exam Review
General Information • Date: 3/13/2014 • Time: 11:00-12:20 • Location: 101 Davis • Closed book, closed notes
Topics • Doing Data Science text: Ch. 2 • Statistical inference, exploratory data analysis, and data science process • Population and samples, sample sizes • Data model • Statistical model • Algorithms • Fitting a model • Probability distributions • EDA: plots, graphs and summaries • 1 question
Topics (contd.) • Doing Data Science text: Ch. 3 • Comparison of algorithms and statistical models • Three basic algorithms • Linear regression • K-NN (supervised classification) • K-means (unsupervised clustering) • Intuitive idea • Algorithmic steps for each of these algorithms • Representative examples • Why and when would you use each of these algorithms? • 2 questions
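As a reminder of what "algorithmic steps" means for K-means, here is a minimal sketch of the usual assign-then-update loop in plain Python. It is a study aid, not the textbook's code: the function name `kmeans`, the toy `points`, and the choice to seed the centroids with the first k points are all assumptions made to keep the run small and deterministic.

```python
def kmeans(points, k, iters=100):
    """Toy K-means on a list of 2-D tuples."""
    # Assumption: seed centroids with the first k points (real runs usually pick them at random).
    centroids = list(points[:k])
    clusters = []
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2 + (p[1] - centroids[c][1]) ** 2,
            )
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster (keep it if the cluster is empty).
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                mx = sum(x for x, _ in members) / len(members)
                my = sum(y for _, y in members) / len(members)
                new_centroids.append((mx, my))
            else:
                new_centroids.append(old)
        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, k=2)
```

On this toy data the two groups separate cleanly even though both starting centroids sit in the lower-left group.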
Topics: Lin & Dyer’s text • Hadoop: HDFS as in Chapter 2 • MapReduce: MR data-flow including combiners and partitioners • 2 questions
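To review where combiners and partitioners sit in the MR data-flow, the sketch below simulates a word-count job in memory. It is plain Python, not the Hadoop API: `map_fn`, `combine`, `partition`, and `run_job` are hypothetical names, and the first-letter partitioning rule is an assumption chosen so the routing is easy to trace by hand (Hadoop's default partitioner hashes the key instead).

```python
from collections import defaultdict


def map_fn(line):
    # Mapper: emit (word, 1) for every word in one input line.
    return [(word, 1) for word in line.lower().split()]


def combine(pairs):
    # Combiner: pre-aggregate one map task's output locally to reduce shuffle traffic.
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def partition(key, num_reducers):
    # Partitioner: choose the reducer for a key.
    # Assumption: toy rule based on the first character, for traceability only.
    return ord(key[0]) % num_reducers


def run_job(splits, num_reducers=2):
    # One bucket per reducer; each bucket groups values by key (the shuffle).
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:  # each split stands in for one map task
        pairs = [kv for line in split for kv in map_fn(line)]
        for key, value in combine(pairs):
            buckets[partition(key, num_reducers)][key].append(value)
    # Reduce: each reducer sees its keys in sorted order and sums the values.
    return [{key: sum(values) for key, values in sorted(b.items())} for b in buckets]


result = run_job([["a b a"], ["b c"]])
```

Note that the first map task sends `("a", 2)` rather than two separate `("a", 1)` pairs: that is the combiner's saving. The partitioner then guarantees that every value for a given key reaches the same reducer.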
Bloomberg Tech Talk on ML • Building Intelligent solution • See the presentation • Up to slide#16 (No NLP or MT) • 1 question
Format • 5 questions, not equally weighted • HDFS: direct • Doing Data Science Ch. 2: direct • MR and K-NN: a little tricky • K-means: direct • Questions will test your understanding of the concepts • Example: what is the effect of large K vs smaller K in K-NN?
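One way to build intuition for the example question is to run K-NN on a tiny data set with different values of K. The sketch below is an illustration only: the 1-D training points, the labels, and the function name `knn_predict` are made up for this purpose.

```python
from collections import Counter


def knn_predict(train, query, k):
    # Take the k training points closest to the query, then vote on the label.
    nearest = sorted(train, key=lambda fl: abs(fl[0] - query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical data: class A on the left, class B on the right,
# plus one stray B point at 2.2 inside the A region.
train = [
    (1.0, "A"), (1.5, "A"), (2.0, "A"), (2.5, "A"), (3.0, "A"),
    (2.2, "B"), (6.0, "B"), (6.5, "B"),
]

noisy_k1 = knn_predict(train, 2.25, k=1)   # follows the single stray neighbour
noisy_k3 = knn_predict(train, 2.25, k=3)   # the stray point is outvoted
far_k1 = knn_predict(train, 6.2, k=1)      # local neighbourhood decides
far_k8 = knn_predict(train, 6.2, k=8)      # k = all points: overall majority decides
```

With a small K the prediction tracks individual neighbours, so one noisy point can flip it. With a very large K every query gets the overall majority class, so local structure is lost.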
Seating for the exam • Question, space for answer format • Designated seating: Will let you know the plan