MapReduce Programming and Cluster Accessing Instructions

Gang Luo, Sept. 2, 2010

Dataflow: (K1, V1) -> map -> (K2, V2) -> shuffle and sort -> (K2, List<V2>) -> reduce -> (K3, V3)

A Query Example (on Table1)
SELECT Year, MAX(Temperature) FROM Table1 WHERE AirQuality IN (0, 1, 4, 5, 9) GROUP BY Year

Implementation in MapReduce
• Map: selection + projection, e.g. (1998, 87, 2, …) -> (1998, 87)
• Reduce: aggregation (MAX), e.g. the values grouped under 1998 (such as 87, 94, 84, 78) -> (1998, 94)

Mapper
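The mapper code itself is not reproduced in this transcript. As a rough stand-in, here is a Hadoop-free sketch of the map step (selection + projection) in plain Java. The class name MaxTempMapSketch and the record layout "year,temperature,airQuality" are assumptions made for illustration; the real layout of Table1 is not given here. In an actual Hadoop job this logic would sit inside the map method of a Mapper class.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hadoop-free sketch of the map step. Not the original slide code.
// ASSUMPTION: each record is a comma-separated line "year,temperature,airQuality".
class MaxTempMapSketch {
    // Air-quality codes accepted by the WHERE clause of the example query.
    static final Set<Integer> GOOD_QUALITY = Set.of(0, 1, 4, 5, 9);

    // Returns (year, temperature) if the record passes the filter, otherwise empty.
    static Optional<Map.Entry<Integer, Integer>> map(String line) {
        String[] fields = line.split(",");
        if (fields.length < 3) {
            return Optional.empty();              // malformed record: skip it
        }
        try {
            int year = Integer.parseInt(fields[0].trim());
            int temperature = Integer.parseInt(fields[1].trim());
            int quality = Integer.parseInt(fields[2].trim());
            if (!GOOD_QUALITY.contains(quality)) {
                return Optional.empty();          // selection: drop bad readings
            }
            // projection: keep only the two columns the query needs
            Map.Entry<Integer, Integer> out = new SimpleEntry<>(year, temperature);
            return Optional.of(out);
        } catch (NumberFormatException e) {
            return Optional.empty();              // non-numeric field: skip it
        }
    }

    public static void main(String[] args) {
        System.out.println(map("1998,94,1"));     // passes the filter
        System.out.println(map("1998,87,2"));     // quality code 2 is filtered out
    }
}
```

Returning an Optional keeps the sketch free of any framework; in Hadoop the mapper would instead write the pair to its output context, or write nothing for a rejected record.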

Reducer
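The reducer code is also missing from this transcript. A minimal Hadoop-free sketch of the reduce step (aggregation with MAX) could look like the following. The class name MaxTempReduceSketch is hypothetical; in a real job the same loop would live in the reduce method of a Reducer class, which receives one key together with all values grouped under it.

```java
import java.util.Arrays;
import java.util.List;

// Hadoop-free sketch of the reduce step. Not the original slide code.
class MaxTempReduceSketch {
    // Receives one key (year) and every temperature grouped under it,
    // and returns the maximum temperature for that year.
    static int reduce(int year, Iterable<Integer> temperatures) {
        int max = Integer.MIN_VALUE;
        for (int t : temperatures) {
            max = Math.max(max, t);
        }
        return max;
    }

    public static void main(String[] args) {
        List<Integer> temps = Arrays.asList(87, 94, 84, 78);
        System.out.println(1998 + "\t" + reduce(1998, temps));
    }
}
```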

Driver
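In Hadoop, the driver is the program that configures the job (which Mapper and Reducer classes to use, where the input and output live) and submits it to the cluster. That code is not reproduced here, and it cannot run without a Hadoop installation, so the sketch below only imitates what the framework does between the pieces: run the map logic on every record, group the outputs by key, then reduce each group. The class name MaxTempLocalDriver and the record layout "year,temperature,airQuality" are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Local, single-process imitation of the map -> shuffle -> reduce pipeline.
// ASSUMPTION: records are comma-separated lines "year,temperature,airQuality".
class MaxTempLocalDriver {
    static final Set<Integer> GOOD_QUALITY = Set.of(0, 1, 4, 5, 9);

    static Map<Integer, Integer> run(List<String> lines) {
        // Map phase plus shuffle: emit (year, temperature) and group by year.
        Map<Integer, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            String[] f = line.split(",");
            if (f.length < 3) continue;
            int year, temp, quality;
            try {
                year = Integer.parseInt(f[0].trim());
                temp = Integer.parseInt(f[1].trim());
                quality = Integer.parseInt(f[2].trim());
            } catch (NumberFormatException e) {
                continue;                          // skip malformed records
            }
            if (!GOOD_QUALITY.contains(quality)) continue;
            grouped.computeIfAbsent(year, k -> new ArrayList<>()).add(temp);
        }
        // Reduce phase: one MAX per year.
        Map<Integer, Integer> result = new TreeMap<>();
        for (Map.Entry<Integer, List<Integer>> e : grouped.entrySet()) {
            int max = Integer.MIN_VALUE;
            for (int t : e.getValue()) max = Math.max(max, t);
            result.put(e.getKey(), max);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "1998,87,1", "1998,94,0", "1998,99,2", "1999,70,4");
        System.out.println(run(input));
    }
}
```

On a real cluster the grouping step is done by the framework across many machines; only the per-record and per-key logic is written by the programmer.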

Think more!
• What if we want to get the average temperature for a year?
• What if you are only interested in the temperature in Durham? (Assume the station ID at Durham is 212.)
You can answer a different query by changing the code only slightly.
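For the first question, one possible direction (a sketch, not the course's reference answer) is to keep the mapper as it is and change only the aggregation: the reduce step sums the values and counts them, then divides. The class name AvgTempReduceSketch is hypothetical. The second question would instead add one more condition to the selection in the map step, on whichever column holds the station ID.

```java
import java.util.Arrays;
import java.util.List;

// Hadoop-free sketch of a reduce step that computes AVG instead of MAX.
class AvgTempReduceSketch {
    // Receives one year and all temperatures grouped under it.
    static double reduce(int year, Iterable<Integer> temperatures) {
        long sum = 0;
        long count = 0;
        for (int t : temperatures) {
            sum += t;
            count++;
        }
        if (count == 0) {
            throw new IllegalArgumentException("no values for year " + year);
        }
        return (double) sum / count;
    }

    public static void main(String[] args) {
        List<Integer> temps = Arrays.asList(80, 90, 100);
        System.out.println(1998 + "\t" + reduce(1998, temps));
    }
}
```

Unlike MAX, an average of partial averages is generally not the overall average, so any pre-aggregation before the reduce step would need to carry (sum, count) pairs rather than averages.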

Hadoop Cluster
• Master node: hadoop21.cs.duke.edu
• Slave nodes: hadoop22.cs.duke.edu – hadoop36.cs.duke.edu
• Online job tracker*: hadoop21.cs.duke.edu:50030
• Online HDFS info*: hadoop21.cs.duke.edu:50070
*You cannot access these pages from outside the CS trusted network. Solutions:
1. ssh to any node and use lynx.
2. Build an “ssh -D port” connection to any node and set it as the proxy in your browser.

Now, let’s see how to compile and run a MapReduce job on the cluster. What I will be showing you is covered by the instructions at the course website: http://www.cs.duke.edu/courses/fall10/cps216/Project/cluster_instruction
