Hadoop System simulation with Mumak. Fei Dong, Tianyu Feng, Hong Zhang. Dec 8, 2010.
Hadoop System Simulation with Mumak
Fei Dong, Tianyu Feng, Hong Zhang
Dec 8, 2010
Agenda
• Objective
• Comparison between MRPerf and Mumak
• Modifications to Mumak
• Results and discussion
• Conclusion
Objective
• A large-scale distributed system has an enormous number of parameters.
• The running time of a user program depends non-linearly on these parameters.
• Predict the running time under various settings to help the user choose the “optimal” setting.
• We start by varying the most basic parameter: cluster size.
MRPerf and Mumak
• MRPerf
  • Built upon a network simulator
  • Calculates the task running time and network delay from physical parameters
  • Implements the Hadoop system in Tcl
  • Flexible in simulation
MRPerf and Mumak
[Figure: running time as a function of map slots per node and reduce slots per node; 4 nodes, double-rack data center (chunk size = 64 MB); simulated by MRPerf]
MRPerf and Mumak
[Figure: 4 nodes (chunk size = 64 MB); simulated by Mumak]
MRPerf and Mumak
• Mumak
  • Inherits the JobTracker class from Hadoop and defines only the simulation interface
  • Uses trace files to build the cluster topology and job story, then feeds them into the simulator
  • Can only reproduce previously finished experiments
  • Designed to verify and debug Hadoop system design
  • Simulates only the Map/Reduce tasks; the sort and shuffle phases are not simulated
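To make the trace-driven idea concrete, here is a minimal sketch of replaying recorded task durations on a cluster of a chosen size. It is not Mumak's code: the function name `replay_makespan` and its parameters are hypothetical, tasks are placed greedily on whichever slot frees up first, and data locality, heartbeats, shuffle, and sort are all ignored.

```python
import heapq


def replay_makespan(task_durations, num_nodes, slots_per_node):
    """Estimate job running time by replaying task durations on free slots.

    Hypothetical helper for illustration only: every task is started on
    the slot that becomes free earliest, in trace order.
    """
    total_slots = num_nodes * slots_per_node
    if total_slots <= 0:
        raise ValueError("the cluster needs at least one slot")

    # Min-heap holding the time at which each slot becomes free.
    free_at = [0.0] * total_slots
    heapq.heapify(free_at)

    finish = 0.0
    for duration in task_durations:
        start = heapq.heappop(free_at)   # earliest available slot
        end = start + duration
        heapq.heappush(free_at, end)     # slot is busy until `end`
        finish = max(finish, end)
    return finish


# Same recorded tasks, two assumed cluster sizes.
trace = [10.0] * 8
small_cluster = replay_makespan(trace, num_nodes=2, slots_per_node=2)
large_cluster = replay_makespan(trace, num_nodes=4, slots_per_node=2)
```

Because the durations come from a finished run, a replay like this can only rearrange what was already observed, which is the limitation noted in the slide above.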
MRPerf and Mumak
• The approach taken by MRPerf is better
  • Takes in parameters to estimate running time
  • Can make predictions
  • MRPerf simulates its own implementation of Hadoop
• The design of Mumak is better
  • Inherits source code from Hadoop
  • Easy to understand and to extend
• We decided to take the good parts of MRPerf and implement them in the Mumak framework
  • Modify the Rumen log to change the parameters
  • Modify the Mumak source code to add a network simulator
Implementation
• Simulate a different cluster size
  • Edit the Rumen log to change the data replication factor and locality
  • Modify the topology by adding or deleting nodes, for example going from 2 slave nodes to 6 slave nodes
  • The job tracker then assigns the tasks to the different nodes
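The topology change described above can be sketched as a small script. This is an illustration under assumptions, not the exact procedure used in the project: it assumes the topology trace is a JSON tree of nodes with `name` and `children` fields (root, then racks, then hosts), and the rack and host names are made up.

```python
import copy


def resize_rack(topology, rack_name, host_names):
    """Return a copy of `topology` whose rack `rack_name` holds `host_names`.

    Assumed layout: {"name": ..., "children": [rack, ...]} where each rack
    has the same shape and each host has "children": None.
    """
    new_topology = copy.deepcopy(topology)  # leave the input untouched
    for rack in new_topology.get("children") or []:
        if rack["name"] == rack_name:
            rack["children"] = [
                {"name": host, "children": None} for host in host_names
            ]
            return new_topology
    raise KeyError(f"rack not found: {rack_name}")


# Hypothetical two-slave, single-rack topology.
two_slaves = {
    "name": "<root>",
    "children": [
        {
            "name": "default-rack",
            "children": [
                {"name": "slave1", "children": None},
                {"name": "slave2", "children": None},
            ],
        }
    ],
}

# Grow the rack from 2 to 6 slave nodes.
six_slaves = resize_rack(
    two_slaves, "default-rack", [f"slave{i}" for i in range(1, 7)]
)
```

The resized structure can be written back out with `json.dump` and used in place of the original topology file. The job trace may still name only the original hosts as preferred locations, which is why locality has to be edited as well.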
Implementation
• Simulate network delay
  • We defined a simple network simulator interface
  • We modified the Mumak source code to add the network delay
  • Actually, the network delay can be ignored
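A network delay interface of this kind could look roughly like the sketch below. The class `SimpleNetwork`, its method, and the bandwidth and latency figures are assumptions for illustration; the slides do not show the interface that was actually added to Mumak.

```python
class SimpleNetwork:
    """Hypothetical network model: fixed latency plus size over bandwidth."""

    def __init__(self, bandwidth_bytes_per_s, latency_s):
        if bandwidth_bytes_per_s <= 0:
            raise ValueError("bandwidth must be positive")
        self.bandwidth_bytes_per_s = bandwidth_bytes_per_s
        self.latency_s = latency_s

    def transfer_delay(self, src, dst, num_bytes):
        """Seconds needed to move `num_bytes` from host `src` to host `dst`."""
        if src == dst:
            return 0.0  # data-local read: nothing crosses the network
        return self.latency_s + num_bytes / self.bandwidth_bytes_per_s


# Assumed link: 1 Gbit/s (125,000,000 bytes/s) with 1 ms latency.
network = SimpleNetwork(bandwidth_bytes_per_s=125_000_000, latency_s=0.001)

# Moving one 64 MB chunk between two different hosts.
chunk_bytes = 64 * 1024 * 1024
remote_delay = network.transfer_delay("slave1", "slave2", chunk_bytes)
local_delay = network.transfer_delay("slave1", "slave1", chunk_bytes)
```

A simulator would add the returned delay to the start time of a task that is not data-local. Under these assumed numbers a remote chunk read costs about half a second.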
Results and Discussion
• Limitations and future work
  • Sort phase time is not included
  • Only a single-rack topology was used
  • Predictions are not always consistent for the same job with the same configuration
Conclusion
• Our objective is to predict the running time under different parameters
• We took the methods of MRPerf and implemented them on Mumak
• More flexible and accurate prediction requires further modification of Mumak
  • Become independent of the trace file
  • Resolve the inconsistency between repeated predictions