Prepare for your midterm exam on data-intensive computing with a focus on topics like web services, MapReduce, and Hadoop architecture. Study materials, solve problems, and practice writing pseudo code to ensure success.
Midterm Review CSE4/587 B.Ramamurthy 1/1/2020
Exam Date
• October 25, 2011
• Location: 107 Talbert
• Please bring:
  • Pencils, pens and erasers.
• This is a closed-book exam.
• NO other material is allowed.
• No calculators/phones.
• Arrive on time; no extra time will be given if you arrive late.
Topics
• Defining data-intensive computing (as in The Fourth Paradigm, up to p. 19)
• Enabling Technologies (ET):
  • ET1: Web services
  • ET2: Special data structures and algorithms
  • NO GAE
• MapReduce model: components (Mapper, Reducer, Partitioner, Combiner); execution framework; shuffle and sort
• Hadoop (HDFS): as in the Yahoo site: Ch. 1, 2, 4; Ch. 5 only partitioner
• Problem solving with MR:
  • Chapters 1-4 in Lin and Dyer's text
  • Tom White's analysis of web logs (don't ask me for the handout; go find it)
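To make the roles of the MapReduce components concrete, here is a minimal single-process sketch of a word count job in Python. It is not Hadoop code: the function names (`mapper`, `combiner`, `partitioner`, `reducer`, `run_job`), the two-reducer setup, and the sample input are all assumptions chosen for illustration, and the shuffle and sort that the execution framework performs is imitated with an in-memory sort and group.

```python
from collections import defaultdict
from itertools import groupby
import zlib


def mapper(_doc_id, text):
    """Emit (word, 1) for every word in one input record."""
    for word in text.lower().split():
        yield word, 1


def combiner(word, counts):
    """Local aggregation on the map side: one partial sum per word per split."""
    yield word, sum(counts)


def partitioner(key, num_reducers):
    """Pick the reducer for a key. crc32 is used because it is stable across runs."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reducer(word, counts):
    """Sum all partial counts that arrive for a word."""
    yield word, sum(counts)


def group_by_key(pairs):
    """Imitate shuffle and sort: order pairs by key, then group their values."""
    for key, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield key, [value for _, value in group]


def run_job(splits, num_reducers=2):
    partitions = defaultdict(list)
    # Map phase: one mapper call per split, followed by its combiner.
    for doc_id, text in splits:
        for key, values in group_by_key(mapper(doc_id, text)):
            for k, v in combiner(key, values):
                partitions[partitioner(k, num_reducers)].append((k, v))
    # Reduce phase: each reducer sees its own partition, sorted and grouped by key.
    result = {}
    for r in range(num_reducers):
        for key, values in group_by_key(partitions[r]):
            for k, v in reducer(key, values):
                result[k] = v
    return result


# Hypothetical input: two splits, each holding a single line of text.
splits = [(0, "the quick brown fox"), (1, "the lazy dog the end")]
counts = run_job(splits)
```

In this sketch the combiner turns the two occurrences of "the" in the second split into a single pair ("the", 2) before the shuffle, so the reducer for "the" receives the values [1, 2] rather than [1, 1, 1].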
Questions
• Defining data-intensive computing: J. Gray
• Given a problem, solve it using MR
• Given an MR, provide a numerical example trace
• Best practices and design patterns described in the Lin & Dyer text
• Web services and Project 1
• Hadoop (HDFS) architecture
• Functions of various MR modules
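As practice for the "numerical example trace" style of question, the sketch below works through one design pattern from the Lin & Dyer text, in-mapper combining, applied to computing a per-key mean. The class and function names (`MeanMapper`, `reduce_mean`, `run`) and the input numbers are made up for illustration; the job is simulated in plain Python rather than run on Hadoop. The mapper emits (sum, count) pairs instead of means, because a mean of partial means is not in general the overall mean.

```python
from collections import defaultdict


class MeanMapper:
    """In-mapper combining: keep partial (sum, count) per key across map() calls."""

    def __init__(self):
        self.partial = defaultdict(lambda: [0, 0])

    def map(self, key, value):
        acc = self.partial[key]
        acc[0] += value  # running sum for this key
        acc[1] += 1      # running count for this key

    def close(self):
        # Emit one (sum, count) pair per key when the split is exhausted.
        for key, (total, count) in sorted(self.partial.items()):
            yield key, (total, count)


def reduce_mean(key, pairs):
    """Add up the partial sums and counts, then divide once."""
    total = sum(s for s, _ in pairs)
    count = sum(c for _, c in pairs)
    return key, total / count


def run(splits):
    shuffled = defaultdict(list)  # stands in for shuffle and sort
    for split in splits:
        m = MeanMapper()
        for key, value in split:
            m.map(key, value)
        for key, pair in m.close():
            shuffled[key].append(pair)
    return dict(reduce_mean(k, pairs) for k, pairs in sorted(shuffled.items()))


# Hypothetical input: two splits of (key, value) records.
splits = [
    [("a", 1), ("b", 4), ("a", 3)],
    [("a", 5), ("b", 6)],
]
means = run(splits)

# Trace:
#   mapper 1 emits  a -> (4, 2), b -> (4, 1)
#   mapper 2 emits  a -> (5, 1), b -> (6, 1)
#   after shuffle   a -> [(4, 2), (5, 1)], b -> [(4, 1), (6, 1)]
#   reducers emit   a -> 9 / 3 = 3.0,  b -> 10 / 2 = 5.0
```

Averaging the per-mapper means for key "a" would give (2.0 + 5.0) / 2 = 3.5 instead of 3.0, which is why the pairs are carried through to the reducer.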
How to study?
• Make a list of all material to study.
• Study the material.
• Practice writing pseudocode for the MRs.
• Use block diagrams and numerical examples when necessary.