Berkeley RAD Lab Technical Vision

Berkeley RAD Lab Technical Vision Armando Fox, Randy Katz, Michael Jordan, Dave Patterson, Scott Shenker, Ion Stoica RADS Retreat, June 2005

Outline • Overall Vision • Internet Services Vision (ServRADS) • Network Vision (NetRADS) • Internet Services Network architecture • Principles and Summary

Overarching Mantra: Enable a faster pace of network service innovation through new distributed system architectures that reduce operations cost by 2-3 orders of magnitude. The Challenge: • Software systems: Too much information => make sense of it through statistical learning & control theory • Network systems: Too little information => exploit better observation and monitoring in the network infrastructure to drive management processes

In practice this means … • Single person can write, deploy, operate the next-generation IT business (“the Fortune 1 million”) • Do for Internet apps what Web did for individual publishing • Gray’s challenge: planetary-scale distributed system operated by a single part-time operator • Goal: programmers focus on functionality; put the *ility in the platform • Could be built on utility computing, giving access to distributed physical resources • Integrated approach to network and server/service management Requires 100x-1000x reduction in TCO from today’s levels

What things are like today • World-scale services created and operated by expert teams • “Google-sized organization” to create a Google • Amazon’s book browsing, designed by programmers, is cumbersome • Browsing for housewares, designed by domain experts on mature infrastructure, more usable • We don’t know what the next “killer app” will be! • NOW project didn’t predict Internet search as a “killer app” for NOWs If we succeed, the next killer Internet app will be written, deployed, operated, at Google-like scales, by a single programmer

Focusing on lowering cost of ownership • Standard way to account for “where the money goes” in operating a deployed distributed application • Definition independent of who is operating the app • Operators per byte of storage or per CPU? No, doesn’t scale with technology changes • Operators per end-user served? (This is the figure of merit for e-tailers) • Operators per geographic region served? • Operators per $ spent on capital cost? • Operators per $ of revenue?

Enabling Technologies for Reducing TCO in ServRADS • Past successes • Microrebooting: fast recovery makes false positives tolerable • Pinpoint: using SLT (statistical learning theory) to detect and localize fine-grain failures • Visualization + SLT to help operators & earn their trust • Elements of technical vision • SLT and machine learning • Operator-centric visualization • Control theory • “Open source” failures database (sanitized, open failures & forensics repository)

Example scenarios • Helping operators make sense of instrumentation • Using ML techniques to localize failures (P. Bodik, E. Kiciman) • Using automatically-induced statistical models to identify likely causes of performance problems (S. Zhang, I. Cohen et al.) • Combining SLT with visualization for cross-checking problem reports and rapidly spotting potential problems visually • Automating problem identification based on stored signatures (S. Zhang, M. Goldszmidt, I. Cohen et al.) • Facilitating self-tuning/configuration • Using control theory to improve performance of a distributed streaming database (W. Xu) • Service placement in wide-area distributed system (D. Oppenheimer) • Microreboots (G. Candea) and microreplacement (S. Kawamoto) as low-cost prevention/repair strategies If false positive cost can be kept low, automate. Otherwise, help operator do her job.
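The closing rule of this slide (automate when false positives are cheap, otherwise assist the operator) can be read as an expected-cost comparison. The sketch below is only one possible reading: the function name, the `precision` parameter and both cost parameters are hypothetical and do not come from the slides.

```python
def should_automate(precision: float, false_alarm_cost: float, delay_cost: float) -> bool:
    """Decide whether to act on an alarm automatically.

    precision        -- assumed probability that an alarm is a real failure
    false_alarm_cost -- assumed cost of recovering when nothing was wrong
                        (e.g. seconds of downtime from a needless microreboot)
    delay_cost       -- assumed cost of leaving a real failure in place
                        until a human operator reacts
    """
    if not 0.0 <= precision <= 1.0:
        raise ValueError("precision must be in [0, 1]")
    expected_false_alarm_cost = (1.0 - precision) * false_alarm_cost
    expected_delay_cost = precision * delay_cost
    # Automate only when acting wrongly is cheaper, in expectation,
    # than waiting for the operator.
    return expected_false_alarm_cost < expected_delay_cost


# A cheap recovery action tolerates a mediocre detector...
print(should_automate(precision=0.5, false_alarm_cost=1.0, delay_cost=60.0))
# ...while an expensive one should be left to the operator.
print(should_automate(precision=0.5, false_alarm_cost=100.0, delay_cost=60.0))
```

Under these assumptions, a very cheap recovery action such as a microreboot pushes the decision toward automation even when the detector is often wrong.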

Services example: combining viz + SLT

Reduce TCO via Planetary-scale Abstractions • Inspiration: narrowly-focused planetary-scale abstractions whose design & implementation... • scale well: understand distributed scheduling, locality, symptoms of wide-area failures • monitorable and controllable (using SLT & linear CT) • retain precisely-quantifiable and “acceptable” semantics under partial-failure conditions • Examples of existing “narrow but powerful” services • MapReduce in Google understands data locality • Can easily imagine a “lossy” MapReduce, like online aggregation • queues/messaging in Yahoo, Amazon, others • User information database in Yahoo • Instrumentation collection & analysis services using Telegraph-CQ
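One way to picture a "lossy" aggregation with quantifiable semantics under partial failure is an aggregate that reports how much of the input it actually covered. This is a minimal sketch, not how MapReduce or online aggregation is implemented; `lossy_sum` and its return shape are hypothetical, and the scaled estimate assumes partitions are of similar size.

```python
from typing import Optional, Sequence, Tuple


def lossy_sum(partition_results: Sequence[Optional[float]]) -> Tuple[float, float, Optional[float]]:
    """Sum per-partition results, tolerating failed partitions.

    A failed or timed-out partition is represented by None.
    Returns (partial_sum, coverage, scaled_estimate), where coverage is the
    fraction of partitions that answered and scaled_estimate extrapolates
    the partial sum to all partitions (None if nothing answered).
    """
    if not partition_results:
        raise ValueError("need at least one partition")
    completed = [r for r in partition_results if r is not None]
    coverage = len(completed) / len(partition_results)
    partial = sum(completed)
    estimate = partial / coverage if coverage > 0 else None
    return partial, coverage, estimate


# Three of four partitions answered; the caller sees both the partial
# answer and how much of the data it represents.
print(lossy_sum([10.0, None, 30.0, 20.0]))
```

Returning the coverage alongside the answer is what keeps the degraded result quantifiable: the caller can decide whether 75% coverage is acceptable for its purpose.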

RADS Network Problem • Internet routing has proven to be robust • But … • Poor visibility: hard to determine health of the network • Routing policy interactions defeat propagation of useful diagnostic info: difficult to identify root cause problems • Slow reaction times to connectivity failures; operator intervention (across admin domains) increases cost of ownership • Key observation: network service failures attributed to unexpected traffic patterns • Approach: identify and protect “good” traffic • Mechanism deployed in network edge: • It’s where the servers and clients are located • Greatest need for lowering management costs • Administrative scope and responsibility is well-defined

iBoxes: New network element for Observe, Analyze, Act • Inspection-and-Action Boxes: deep multiprotocol packet inspection • No routing; observation & marking • Policing points: drop, fence, block [Figure: Enterprise Network Architecture]

Network-Level Observe-Analyze-Act • Observe • Packet, path, protocol, service invocation statistical collection and sampling: frequencies, latencies, completion rates • Construct the collection infrastructure • Analyze • Determine correlations among observations • “Normal” model discovery + anomaly detection • Exploit SLT • Act • Experiment to test correlations • Prioritize and throttle • Mark and annotate • Control theory? Distributed analyses and actions
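The "normal model discovery + anomaly detection" step can be illustrated with the simplest possible statistical model: learn a mean and standard deviation from a baseline window, then flag observations that fall far outside it. This is a sketch under that assumption only; the slides do not say which SLT technique is used, and the function names and the 3-sigma threshold are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def fit_normal_model(baseline: Sequence[float]) -> Tuple[float, float]:
    """Learn a (mean, standard deviation) model from baseline observations."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    return mean(baseline), stdev(baseline)


def is_anomalous(value: float, model: Tuple[float, float], threshold: float = 3.0) -> bool:
    """Flag a value more than `threshold` standard deviations from the mean."""
    mu, sigma = model
    if sigma == 0:
        # A constant baseline: any deviation at all is unusual.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical DNS lookup latencies (ms) observed during normal operation.
model = fit_normal_model([10, 11, 9, 10, 12, 10, 11, 9])
print(is_anomalous(10.5, model))  # within the normal range
print(is_anomalous(50, model))    # far outside it
```

In the Observe-Analyze-Act loop, `fit_normal_model` would sit in the Analyze stage and a positive `is_anomalous` result would feed the Act stage (prioritize, throttle, mark).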

Network Layer Mechanism: Annotations [Figure: protocol stack, top to bottom: Application, Presentation, Session, Transport, Annotation, Network, Link, Phy] • Enhance network visibility: disseminate observations, communicate actions, provide in-band network management actions, iBox-to-iBox communications • iBoxes label packets at annotation layer but do not rewrite packet contents • Annotations stack, must be removed from packets before delivery to A-layer unaware end nodes
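The two rules on this slide (iBoxes label packets without rewriting their contents, and stacked annotations are stripped before delivery to annotation-unaware end nodes) can be sketched as follows. The `Packet` class, the function names and the label strings are hypothetical; the slides do not define a wire format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Packet:
    payload: bytes                      # never modified by an iBox
    annotations: List[Tuple[str, str]] = field(default_factory=list)


def annotate(packet: Packet, ibox_id: str, label: str) -> Packet:
    """Push one (iBox, label) annotation onto the packet's annotation stack."""
    packet.annotations.append((ibox_id, label))
    return packet


def deliver(packet: Packet, receiver_is_annotation_aware: bool) -> Packet:
    """Strip all annotations before handing the packet to an unaware end node."""
    if not receiver_is_annotation_aware:
        packet.annotations.clear()
    return packet


pkt = Packet(payload=b"dns-query")
annotate(pkt, "I-S", "suspicious")   # first iBox on the path
annotate(pkt, "I-I", "throttle")     # second iBox stacks its own label
deliver(pkt, receiver_is_annotation_aware=False)
print(pkt)  # payload intact, annotation stack empty
```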

Scenario: Traffic Surge Inhibiting Network Services • DNS Server swamped by excessive request traffic • Observe: DNS time outs, Web access traffic slowed, but also higher than normal mail delivery latency implying busy server edge (correlation between Mail Server and DNS Server utilization?) • Root Cause: High DNS request rates generated by Spam Appliance triggered by mail surge [Figure: enterprise network with Internet Edge, Distribution Tier, Server Edge (Primary & Secondary DNS Servers, Mail Server, Spam Appliance) and Access Edge; iBoxes I-I, I-S and I-A]

Scenario • How Diagnosed? • I-S detects high link utilization but abnormally high DNS traffic • Stats from I-I: high mail traffic, low outgoing web traffic, in traffic high but link utilization not high • Stats from I-A: lower web traffic, no unusual mail origination • Problem localized to Server edge, but visibility limited: RADS can help [Figure: enterprise network with Internet Edge, Distribution Tier, Server Edge (Primary & Secondary DNS Servers, Mail Server, Spam Appliance) and Access Edge; iBoxes I-I, I-S and I-A]

Scenario • Possible Action Responses • Experiment: Redirect local DNS requests to Secondary DNS server: if these complete, can infer the server is the problem, not the network • Throttle: Due to MS-DNS correlation, block/slow email traffic at Server Edge: should expect reduced DNS server utilization [Figure: enterprise network with Internet Edge, Distribution Tier, Server Edge (Primary & Secondary DNS Servers, Mail Server, Spam Appliance) and Access Edge; iBoxes I-I, I-S and I-A]
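The throttle response rests on a measured correlation between Mail Server and DNS Server utilization. A minimal sketch of that check, assuming Pearson correlation over paired utilization samples: the 0.8 threshold, the action names and the sample values are hypothetical illustrations, not measurements from the scenario.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def choose_action(mail_util: Sequence[float], dns_util: Sequence[float],
                  threshold: float = 0.8) -> str:
    """Suggest throttling mail only when it tracks DNS load closely."""
    if pearson(mail_util, dns_util) > threshold:
        return "throttle-mail"
    return "keep-observing"


# Made-up utilization samples in which DNS load rises with mail load.
print(choose_action([0.2, 0.4, 0.6, 0.9], [0.25, 0.45, 0.7, 0.95]))
```

Correlation alone does not establish the root cause, which is why the slide pairs the throttle with an experiment: after throttling, DNS server utilization should drop if the inferred link is real.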

Embodying principles in a prototype • Platform architecture and prototype to enable rapid innovation in network services by non-experts • automatically accommodates scaling, provisioning, failure management • multi-datacenter (geoplexed) • observable networks connecting datacenters • potentially planetary scale • runs with minimal operator oversight • Prototype keeps various research projects focused on common goal and allows ongoing testing • Participation in standards processes to promote “best practices” in platform as open standards

[Figure: Reliable Adaptive Distributed Systems architecture. Labels: Operator; User; Prototype Applications; Programming Abstractions for Roll-back and wide-area distributed computations; SLT Services; Crash-only services + Observation Infrastructure for System SLT; Application-Specific Overlay Network; Checkable Protocols; Fast Detection & Route Recovery; Observation Infrastructure for network SLT; Client and Server, each with Distributed Middleware and a Router; iBoxes on the Edge Networks; Commodity Internet (Internet IP Network)]

[Figure: Generic iBox Architecture. Labels: Input Ports and Output Ports with Buffers; Classification Processors (CP); Action Processor (AP); Interconnection Fabric; “Tag” Mem; Rules & Programs]

Possible architecture [Figure labels: rack of app. server & application, e.g. J2EE; microrecovery actions; high-level effectors; control loops; high-level sensor data; SLT algorithms; T-CQ engine; preprocessed data; sanitized data; visualization; syndrome identification; externally-induced failures, workload changes, etc.; datacenter boundary, with links to and from other datacenters]

ServRADS: Observations & Summary • SLT algorithms make sense of large amounts of data • Classification, outlier/anomaly detection, clustering, etc. • Viz helps operator use “visual pattern recognition” to quickly spot problems and cross-check SLT models • Enables operator expertise to be quickly brought to bear • Builds operators’ trust in statistical/machine learning models • Challenge • Fundamental challenges associated with applying SLT to problem determination (coming up next session) • Unifying many techniques into a coherent approach - prototype platform as unifying artifact • Idea: capture best practices in TCO-optimized, planetary-scale abstractions

NetRADS: Observations & Summary • COPS: Paradigm for (more) automatically protecting critical resources when network is under stress • Checkable protocols: visible semantics • Observe network behavior: good (easy), bad (hard), suspicious • Protect services: throttle, redirect • Network management major contributor to TCO • NetRADS built on: • iBoxes: pervasive infrastructure for observation and action at the network level • Annotation Layer: for marking, control, inter-iBox communications • Integration with Internet service approach for service/server-level visibility and integrated management
