A Framework for Monitoring and Measuring a Large-Scale Distributed System in Real Time
By Lei ZHAN
Aug 16th, 2013
Department of Information Engineering, The Chinese University of Hong Kong
Outline
• Background
• Framework Design
• Case Study
• Demonstration
• Future Work
Background
• Internet Services on Distributed Infrastructure
  • Content Delivery Network
  • P2P Systems
  • Data Centers
  • Cloud Computing Services
• Monitoring Framework
  • to guarantee reliable services and high quality of user experience
  • monitor and manage the deployed systems
Objectives
• Accuracy
• Real-time
• Visualization
• Scalability
Framework Design
Components – End Hosts
• Refer to a peer in a P2P system, a processing unit in a cloud, a data center in a CDN, etc.
• Deployed in a large-scale and distributed manner
• Measurement Data Resources
  • unique id for each End Host
  • generate a feedback message periodically
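
The End Host behaviour described above (a unique id plus a periodic feedback message) might be sketched as follows. This is only an illustration: the slides do not give a message format, so the field names (`upload_kbps`, `download_kbps`), the JSON encoding, the 60-second default period, and the `EndHost.maybe_report` helper are all assumptions.

```python
import json
import uuid
from dataclasses import asdict, dataclass


@dataclass
class FeedbackMessage:
    # Hypothetical fields; the slides only state that each End Host has a
    # unique id and reports periodically.
    host_id: str
    timestamp: float
    upload_kbps: float
    download_kbps: float

    def to_json(self) -> str:
        """Serialize the message; JSON is an assumed wire format."""
        return json.dumps(asdict(self), sort_keys=True)


class EndHost:
    def __init__(self, period_s: float = 60.0):
        self.host_id = uuid.uuid4().hex  # unique id for this End Host
        self.period_s = period_s         # assumed reporting period
        self._last_sent = None           # time of the last report, if any

    def maybe_report(self, now: float, upload_kbps: float, download_kbps: float):
        """Return a FeedbackMessage if a full period has elapsed, else None."""
        if self._last_sent is not None and now - self._last_sent < self.period_s:
            return None
        self._last_sent = now
        return FeedbackMessage(self.host_id, now, upload_kbps, download_kbps)
```

Passing the clock value in as `now` keeps the sketch testable; a deployed client would presumably read the system clock and send the message to the Coordinator.
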
Components – Coordinator
• Located between the End Hosts and the Feedback Servers
• Responsible for
  • collecting feedback messages from End Hosts
  • forwarding them to Feedback Servers
• Why a Coordinator?
  • single target for all End Hosts
  • makes the Feedback Servers more flexible
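
One way to picture the Coordinator's role is the sketch below. It assumes, purely for illustration, that the Coordinator picks a Feedback Server by hashing the End Host id; the slides do not say how messages are distributed, and the `Coordinator` class, its `forward` method, and the use of plain lists as server inboxes are hypothetical.

```python
import hashlib


class Coordinator:
    """Single entry point for End Hosts; hides the set of Feedback Servers."""

    def __init__(self, servers):
        # Each "server" is a plain list standing in for a server inbox.
        self.servers = list(servers)
        if not self.servers:
            raise ValueError("at least one Feedback Server is required")

    def _pick(self, host_id: str) -> int:
        # Assumed policy: a stable hash of the host id, so that all messages
        # from one End Host reach the same Feedback Server.
        digest = hashlib.sha256(host_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.servers)

    def forward(self, message: dict) -> int:
        """Forward one feedback message; return the chosen server index."""
        idx = self._pick(message["host_id"])
        self.servers[idx].append(message)
        return idx
```

Because End Hosts only ever address the Coordinator, the list of Feedback Servers can change without touching the clients, which is the flexibility the slide points to.
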
Components – Feedback Server
• Located between the Coordinator and the Monitoring Platform
• Responsible for
  • aggregating feedback messages from the Coordinator
  • responding to data requests from the Monitoring Platform
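
The two responsibilities of the Feedback Server can be sketched as an in-memory store that groups messages into time buckets and answers range requests. The bucket size, the `ingest` and `handle_request` names, and the dictionary message shape are assumptions for illustration, not details from the slides.

```python
from collections import defaultdict


class FeedbackServer:
    def __init__(self, bucket_s: int = 60):
        self.bucket_s = bucket_s            # assumed aggregation interval
        self._buckets = defaultdict(list)   # bucket index -> messages

    def ingest(self, message: dict) -> None:
        """Aggregate one forwarded feedback message into its time bucket."""
        key = int(message["timestamp"] // self.bucket_s)
        self._buckets[key].append(message)

    def handle_request(self, start_ts: float, end_ts: float) -> list:
        """Return messages with start_ts <= timestamp < end_ts, oldest first."""
        first = int(start_ts // self.bucket_s)
        last = int(end_ts // self.bucket_s)
        out = []
        for key in range(first, last + 1):
            # .get() avoids creating empty buckets for untouched intervals.
            for m in self._buckets.get(key, []):
                if start_ts <= m["timestamp"] < end_ts:
                    out.append(m)
        out.sort(key=lambda m: m["timestamp"])
        return out
```
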
Components – Monitoring Platform
• Provides
  • measurement data processing and analysis
  • visualization views of data statistics for the administrator
• Operates in
  • real-time mode: communicates with the Feedback Server
  • static mode: reads data from local log files
Framework Design
[Figure: framework architecture diagram. Edge labels: Feedback Messages, Feedback Messages, Aggregated Log Files, Request]
Case Study
• 2012 London Olympic Games
  • live broadcast through the Internet within HK
• P2P Video Streaming System
  • developed by ASTRI*
  • adopted by i-Cable**
* The Hong Kong Applied Science and Technology Research Institute (ASTRI) was founded by the Government of Hong Kong SAR in 2000 with a mission to enhance Hong Kong’s competitiveness in technology-based industries through applied research.
** i-Cable is an Internet service provider in Hong Kong, and is now one of Hong Kong's leading integrated communications companies.
Real-time Monitoring
• Whole Period
  • 17 days (July 27th – Aug 12th)
• Key Metrics
  • system statistics
    • number of new peers, total number of peers
    • average peer upload rate, average peer download rate
    • average peer contribution ratio
  • system performance
    • peer startup delay, peer continuity
    • quality of experience
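
As a rough illustration of the system statistics listed above, the sketch below averages per-peer rates. The slides do not define the metrics, so two things here are assumptions: the per-peer record fields, and the contribution ratio being taken as upload rate divided by download rate.

```python
def summarize(peers: list) -> dict:
    """Compute assumed system statistics from the current per-peer records."""
    n = len(peers)
    if n == 0:
        return {"total_peers": 0, "avg_upload_kbps": 0.0,
                "avg_download_kbps": 0.0, "avg_contribution_ratio": 0.0}
    avg_up = sum(p["upload_kbps"] for p in peers) / n
    avg_down = sum(p["download_kbps"] for p in peers) / n
    # Assumed definition: contribution ratio = upload / download per peer.
    # Peers with no download are skipped to avoid division by zero.
    ratios = [p["upload_kbps"] / p["download_kbps"]
              for p in peers if p["download_kbps"] > 0]
    avg_ratio = sum(ratios) / len(ratios) if ratios else 0.0
    return {"total_peers": n, "avg_upload_kbps": avg_up,
            "avg_download_kbps": avg_down, "avg_contribution_ratio": avg_ratio}


stats = summarize([
    {"upload_kbps": 100.0, "download_kbps": 400.0},
    {"upload_kbps": 300.0, "download_kbps": 400.0},
])
print(stats["avg_contribution_ratio"])  # 0.5
```
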
Monitoring Platform
• Playback in 2 Modes
• Visualization
  • 4 different views
    • Map View
    • District View
    • Histogram View
    • Timeline View
  • filtering & control
• More in the Demonstration
Measurement Results
Demonstration
• Monitoring Platform
  • operates in static mode
  • the data of Aug 2nd, 2012
  • 4 visualization views
Discussion (I)
• Measurement Result
  • window based statistics
    • identify End Host by its id
    • update records upon new feedback message
    • consider latest state inside the window as current state
  • time window moving average method for analysis
    • window size >= feedback message period
  • data synchronized at Feedback Servers
    • avoid synchronization problem of feedback messages
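
The window-based bookkeeping described above can be sketched as follows: records are keyed by End Host id, overwritten on each new feedback message, and a host counts as present only while its latest message falls inside the window. The `WindowStats` class, its method names, and the single numeric `value` per message are hypothetical simplifications.

```python
class WindowStats:
    def __init__(self, window_s: float, period_s: float):
        # The slide's constraint: the window must span at least one reporting
        # period, otherwise live hosts would drop out between two reports.
        if window_s < period_s:
            raise ValueError("window size must be >= feedback message period")
        self.window_s = window_s
        self._latest = {}  # host_id -> (timestamp, value)

    def update(self, host_id: str, timestamp: float, value: float) -> None:
        """Record a feedback message, keeping only the newest per host."""
        prev = self._latest.get(host_id)
        if prev is None or timestamp >= prev[0]:
            self._latest[host_id] = (timestamp, value)

    def snapshot(self, now: float):
        """Drop hosts silent for a full window; return (host count, mean)."""
        cutoff = now - self.window_s
        self._latest = {h: tv for h, tv in self._latest.items() if tv[0] > cutoff}
        values = [v for _, v in self._latest.values()]
        mean = sum(values) / len(values) if values else 0.0
        return len(values), mean


w = WindowStats(window_s=120, period_s=60)
w.update("a", 0, 100.0)
w.update("b", 50, 300.0)
w.update("a", 60, 200.0)   # replaces the earlier record for host "a"
print(w.snapshot(100))     # (2, 250.0)
print(w.snapshot(175))     # (1, 200.0): host "b" has left the window
```
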
Discussion (II)
• Real-time Delay
  • more Feedback Servers
  • log files at Feedback Servers
    • generate more frequently
    • compress before sending to Monitoring Platform
• Scalability
  • multiple Coordinators
  • more Feedback Servers
  • sampling on feedback messages
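
Two of the ideas above, compressing log files before transfer and sampling feedback messages, can be sketched with the Python standard library. The JSON-lines log format, the choice of gzip, and sampling by hashed host id (rather than per message) are assumptions made for this example; the slides do not specify any of them.

```python
import gzip
import hashlib
import json


def compress_log(records: list) -> bytes:
    """Encode records as JSON lines and gzip them for transfer."""
    payload = "\n".join(json.dumps(r, sort_keys=True) for r in records)
    return gzip.compress(payload.encode("utf-8"))


def decompress_log(blob: bytes) -> list:
    """Inverse of compress_log, as the Monitoring Platform might apply it."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line]


def keep_sample(host_id: str, rate: float) -> bool:
    """Deterministically keep roughly `rate` (0.0 to 1.0) of all hosts.

    Hashing the id means a given host is either always or never sampled,
    so its time series stays complete. This is an assumed policy.
    """
    digest = hashlib.sha256(host_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") < rate * 2**64
```

Sampling per host trades coverage for continuity; sampling individual messages instead would keep every host visible but thin out each host's history.
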
Future Work
• Generalize for other Systems
• IP Geo-location
  • Map View & District View
  • IP -> Physical Address
  • wired IP Geo-location
Q&A