240 likes | 258 Views
Explore datacenter computing strategies, storage hierarchy, networking topologies, power management, and application-level software. Learn about resource allocation models, memory management techniques, and network scheduling. Discover the challenges and advancements in datacenter operating systems.
E N D
Datacenter As a Computer Mosharaf Chowdhury EECS 582 – W16
Announcements • Midterm grades are out • There were many interesting approaches. Thanks! • Meeting on April 11 moved earlier to April 8 (Friday) • No reviews for the papers for April 8 • Meeting on April 13 moved later to April 15 (Friday) EECS 582 – W16
Mid-Semester Presentations • Strictly followed 20 minutes per group • 15 + 5 minutes Q/A • Four parts: motivation, approachoverview, current status, and end goal • March 21 • Juncheng and Youngmoon • Andrew • Dong-hyeon and Ofir • Hanyun, Ning, and Xianghan • March 23 • Clyde, Nathan, and Seth • Chao-han, Chi-fan, and Yayun • Chao and Yikai • Kuangyuan, Qi, and Yang EECS 582 – W16
Why is One Machine Not Enough? • Too much data • Too little storage capacity • Not enough I/O bandwidth • Not enough computing capability EECS 582 – W16
Warehouse-Scale Computers • Single organization • Homogeneity (to some extent) • Cost efficiency at scale • Multiplexing across applications and services • Rent it out! • Many concerns • Infrastructure • Networking • Storage • Software • Power/Energy • Failure/Recovery • … EECS 582 – W16
Architectural Overview Aggregation Memory Bus ToR PCIe Ethernet SATA Server EECS 582 – W16
Datacenter Networks • Traditional hierarchical topology • Expensive • Difficult to scale • High oversubscription • Smaller path diversity • … Core Agg. Edge EECS 582 – W16
Datacenter Networks • CLOS topology • Cheaper • Easier to scale • NO/low oversubscription • Higher path diversity • … Core Agg. Edge EECS 582 – W16
Storage Hierarchy • L1 cache • L2 cache • L3 cache • RAM • 3D Xpoint • SSD • HDD • Across machines, racks, and pods (https://www.youtube.com/watch?v=IWsjbqbkqh8) EECS 582 – W16
Power, Energy, Modeling, Building,… • Many challenges • We’ll focus primarily on software infrastructure in this class EECS 582 – W16
Datacenter Needs an Operating System • Datacenter is a collection of • CPU cores • Memory modules • SSDs and HDDs • All connected by an interconnect • A computer is a collection of • CPU cores • Memory modules • SSDs and HDDs • All connected by an interconnect EECS 582 – W16
Some Differences • High-level of parallelism • Diversity of workload • Resource heterogeneity • Failure is the norm • Communication dictates performance EECS 582 – W16
Three Categories of Software • Platform-level • Software firmware that are present in every machine • Cluster-level • Distributed systems to enable everything • Application-level • User-facing applications built on top EECS 582 – W16
Common Techniques EECS 582 – W16
Common Techniques EECS 582 – W16
Datacenter Programming Models • Fault-tolerance, scalable, and easy access to all the distributed datacenter resources • Users submit jobs to these models w/o having to worry about low-level details • MapReduce • Grandfather of big data as we know today • Two-stage, disk-based, network-avoiding • Spark • Common substrate for diverse programming requirements • Many-stage, memory-first EECS 582 – W16
Datacenter “Operating Systems” • Fair and efficient distribution of resources among many competing programming models and jobs • Does the dirty work so that users won’t have to • Mesos • Started with a simple question – how to run different versions of Hadoop? • Fairness-first allocator • Borg • Google’s cluster manager • Utilization-first allocator EECS 582 – W16
Resource Allocation and Scheduling • How do we divide the resources anyway? • DRF • Multi-resource max-min fairness • Two-level; implemented in Mesos and YARN • HUG: DRF + High utilization • Omega • Shared-state resource allocator • Many schedulers interact through transactions EECS 582 – W16
File Systems • Fault-tolerant, efficient access to data • GFS • Data resides with compute resources • Compute goes to data; hence, data locality • The game changer: centralization isn’t too bad! • FDS • Data resides separately from compute • Data comes to compute; hence, requires very fast network EECS 582 – W16
Memory Management • What to store in cache and what to evict? • PACMan • Disk locality is irrelevant for fast-enough network • All-or-nothing property: caching is useless unless all tasks’ inputs are cached • Best eviction algorithm for single machine isn’t so good for parallel computing • Parameter Server • Shared-memory architecture (sort of) • Data and compute are still collocated, but communication is automatically batched to minimize overheads EECS 582 – W16
Network Scheduling • Communication cannot be avoided; how do we minimize its impact? • DCTCP • Application-agnostic; point-to-point • Outperforms TCP through ECN-enabled multi-level congestion notifications • Varys • Application-aware; multipoint-to-multipoint; all-or-nothing in communication • Concurrent open-shop scheduling with coupled resources • Centralized network bandwidth management EECS 582 – W16
Unavailability and Failure • In a 10000-server DC, with 10000-day MTBF machines, one machine will fail everyday on average • Build fault-tolerant software infrastructure and hide failure-handling complexity from application-level software as much as possible • Configuration is one of the largest sources of service disruption • Storage subsystems are the biggest sources of machine crashes • Tolerating/surviving from failures is different from hiding failures EECS 582 – W16
What’s the most critical resource in a datacenter? • Why? EECS 582 – W16
Will we come back to client-centric models? • As opposed to server-centric/datacenter-driven model today • If yes, why and when? • If not, why not? EECS 582 – W16