Berkeley-Helsinki Summer Course Lecture #12: Introspection and Adaptation Randy H. Katz Computer Science Division Electrical Engineering and Computer Science Department University of California Berkeley, CA 94720-1776
Outline • Introspection Concept and Methods • SPAND Content Level Adaptation • MIT Congestion Manager/TCP Layer Adaptation • ICAP Cache-Layer Adaptation
Outline • Introspection Concept and Methods • SPAND Content Level Adaptation • MIT Congestion Manager/TCP Layer Adaptation • ICAP Cache-Layer Adaptation
Introspection • From Latin introspicere, “to look within” • The process of observing the operations of one’s own mind with a view to discovering the laws that govern it • Within the context of computer systems • Observing how a system is used (observe): usage patterns, network activity, resource availability, denial of service attacks, etc. • Extracting a behavioral model from such use (discover) • Using this model to improve the behavior of the system (adapt), making it proactive rather than merely reactive to how it is used • Improves performance and fault tolerance, e.g., deciding when to make replicas of objects and where to place them
Introspection in Computer Systems • Locality of Reference • Temporal: objects that are used are likely to be used again in the near future • Geographic: objects near each other are likely to be used together • Exploited in many places • Hardware caches, virtual memory mechanisms, file caches • Object interrelationships • Adaptive name resolution • Mobility patterns • Implications • Prefetching/prestaging • Clustering/grouping • Continuous refinement of behavioral model
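To make the observe/discover/adapt cycle concrete, here is a minimal, hypothetical Python sketch (class name and policies invented, not from the lecture): an LRU cache exploits temporal locality, while learned co-access counts drive a simple prefetch hint.

```python
from collections import OrderedDict, defaultdict

class IntrospectiveCache:
    """Toy cache that exploits temporal locality (LRU eviction) and
    learns pairwise co-access counts to drive a simple prefetch hint."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.cache = OrderedDict()          # object id -> payload
        self.co_access = defaultdict(int)   # (prev, next) -> count
        self.last_access = None

    def access(self, obj_id, fetch):
        # Temporal locality: keep recently used objects, evict the LRU one.
        if obj_id in self.cache:
            self.cache.move_to_end(obj_id)
        else:
            if len(self.cache) >= self.capacity:
                self.cache.popitem(last=False)
            self.cache[obj_id] = fetch(obj_id)
        # Discover: record which object tends to follow which.
        if self.last_access is not None:
            self.co_access[(self.last_access, obj_id)] += 1
        self.last_access = obj_id
        return self.cache[obj_id]

    def prefetch_candidate(self):
        # Adapt: suggest the object most often seen right after the last one.
        followers = {b: c for (a, b), c in self.co_access.items()
                     if a == self.last_access}
        return max(followers, key=followers.get) if followers else None

cache = IntrospectiveCache()
cache.access("a.txt", fetch=lambda oid: f"contents of {oid}")
```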
Example: Wide-Area Routing and Data Location in OceanStore • Requirements • Find data quickly, wherever it might reside • Locate nearby data without global communication • Permit rapid data migration • Insensitive to faults and denial of service attacks • Provide multiple routes to each piece of data • Route around bad servers and ignore bad data • Repairable infrastructure • Easy to reconstruct routing and location information • Technique: Combined Routing and Data Location • Packets are addressed to GUIDs, not locations • Infrastructure gets the packets to their destinations and verifies that servers are behaving John Kubiatowicz
Two Levels of Routing • Fast, probabilistic search for “routing cache” • Built from attenuated Bloom filters • Approximation to gradient search • Not going to say more about this today • Redundant Plaxton Mesh used for underlying routing infrastructure: • Randomized data structure with locality properties • Redundant, insensitive to faults, and repairable • Amenable to continuous adaptation to adjust for: • Changing network behavior • Faulty servers • Denial of service attacks John Kubiatowicz
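The attenuated Bloom filters mentioned above can be sketched as follows. This is an illustrative Python toy (parameters and structure simplified, not the OceanStore implementation): each neighbor link keeps one Bloom filter per hop distance, so a lookup can follow the link whose shallowest level matches.

```python
import hashlib

class BloomFilter:
    """Plain Bloom filter: the building block of the attenuated filters."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha1(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def maybe_contains(self, key):
        # False positives are possible; false negatives are not.
        return all(self.bits[pos] for pos in self._positions(key))

class AttenuatedBloomFilter:
    """Per-link summary: level i covers objects reachable within i hops,
    so matching at a shallow level approximates a gradient search."""

    def __init__(self, depth=3, **kw):
        self.levels = [BloomFilter(**kw) for _ in range(depth)]

    def add(self, key, hops):
        for level in range(hops, len(self.levels)):
            self.levels[level].add(key)

    def first_matching_level(self, key):
        for i, bf in enumerate(self.levels):
            if bf.maybe_contains(key):
                return i          # smaller is "closer" along this link
        return None
```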
Basic Plaxton Mesh: Incremental suffix-based routing [Figure: mesh of nodes labeled by NodeID (0x79FE, 0x23FE, 0x993E, 0x43FE, 0x73FE, 0x44FE, 0xF990, 0x035E, 0x04FE, 0x13FE, 0x555E, 0xABFE, 0x9990, 0x239E, 0x73FF, 0x1290, 0x423E); numbered arrows (1–4) show successive hops, each resolving one more suffix digit] John Kubiatowicz
Use of Plaxton Mesh: Randomization and Locality John Kubiatowicz
Use of the Plaxton Mesh (Tapestry Infrastructure) • As in original Plaxton scheme: • Scheme to directly map GUIDs to root node IDs • Replicas publish toward a document root • Search walks toward root until a pointer is located (locality!) • OceanStore enhancements for reliability: • Documents have multiple roots (salted hash of GUID) • Each node has multiple neighbor links • Searches proceed along multiple paths • Tradeoff between reliability and bandwidth? • Routing-level validation of query results • Dynamic node insertion and deletion algorithms • Continuous repair and incremental optimization of links John Kubiatowicz
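A hedged sketch of the “salted hash of GUID” idea for multiple roots, in Python; the function name and the 16-bit node ID space are invented for illustration and do not reflect OceanStore’s actual hashing.

```python
import hashlib

def root_ids(guid, num_roots=4):
    """Map a document GUID to several root node IDs by salting its hash.
    Illustrative only (16-bit ID space); not OceanStore's actual scheme."""
    roots = []
    for salt in range(num_roots):
        digest = hashlib.sha1(f"{salt}:{guid}".encode()).digest()
        roots.append(int.from_bytes(digest[:2], "big"))
    return roots

# Each replica publishes a pointer toward every root; a search walks toward
# any root and stops as soon as it meets a pointer, so nearby replicas are
# found without reaching the root itself (locality).
print([hex(r) for r in root_ids("doc-42")])
```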
OceanStore Domains for Introspection • Network Connectivity, Latency • Location tree optimization, link failure recovery • Neighbor Nodes • Clock synchronization, node failure recovery • File Usage • File migration • Clustering related files • Prefetching, hoarding • Storage Peers • Accounting, archive durability, blacklisting • Meta-Introspection • Confidence estimation, stability Dennis Geels, geels@cs.Berkeley.edu
Common Functionality • These targets share some requirements: • High input rates • Watch every file access, heartbeat, packet transmission • Both short- and long-term decisions • Respond to changes immediately • Extract patterns from historical information • Hierarchical, Distributed Analysis • Low levels make decisions based on local information • Higher levels possess broader, approximate knowledge • Nodes must cooperate to solve problem • We can build shared infrastructure Dennis Geels, geels@cs.Berkeley.edu
Architecture for Wide-Area Introspection • Fast Event-Driven Handlers • Filter and aggregate incoming events • Respond immediately if necessary • Local Database, Periodic Analysis • Store historical information for trend-watching • Allow more complicated, off-line algorithms • Location-Independent Routing • Flexible coordination, communication Dennis Geels, geels@cs.Berkeley.edu
Event-Driven Handlers • Treat all incoming data as events: messages, timeouts, etc. • Leads to natural state-machine design • Events cause state transitions, finite processing time • A few common primitives could be powerful: average, count, filter by predicate, etc. • Implemented in “small language” • Contains important primitives for aggregation, database access • Facilitates implementation of introspective algorithms • Allows greater exploration, adaptability • Can verify security, termination guarantees • E.g., EVENT.TYPE=="file access" : increment COUNT in EDGES where SRC==EVENT.SRC and DST==EVENT.DST Dennis Geels, geels@cs.Berkeley.edu
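A minimal Python sketch of such an event-driven handler, assuming an invented event dictionary format and handler registry (not the actual “small language”); it mirrors the file-access example above by incrementing an edge count in an EDGES table.

```python
from collections import defaultdict

# Edge counts for the file co-access graph: (SRC, DST) -> count.
EDGES = defaultdict(int)

def handle_event(event, handlers):
    """Dispatch one incoming event to the handlers registered for its type.
    Handlers are small, bounded functions: filter, count, average, etc."""
    for handler in handlers.get(event["type"], []):
        handler(event)

def count_file_edge(event):
    # increment COUNT in EDGES where SRC==EVENT.SRC and DST==EVENT.DST
    EDGES[(event["src"], event["dst"])] += 1

handlers = {"file access": [count_file_edge]}
handle_event({"type": "file access", "src": "/etc/passwd", "dst": "/etc/shadow"},
             handlers)
```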
Local Database, Periodic Analysis • Database Provides Powerful, Flexible Storage • Persistent data allows long-term analysis • Standard interface for event handler scripting language • Leverage existing aggregation functionality • Considerable work from Telegraph Project • Can be lightweight • Sophisticated Algorithms Run On Databases • Too resource-intensive to operate directly on events • Allow use of full programming language • Security, termination still checkable; should use common mechanisms • E.g., expensive clustering algorithm operating over edge graph, using sparse-matrix operations to extract eigenvectors representing related files Dennis Geels, geels@cs.Berkeley.edu
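The slide’s clustering example can be illustrated with a small spectral-bipartition sketch in Python. It uses dense numpy matrices for brevity; the real analysis would use sparse-matrix operations and run periodically against the database rather than in the event path.

```python
import numpy as np

def related_file_groups(edge_counts, files):
    """Split files into two related groups using the sign of the graph
    Laplacian's second eigenvector (Fiedler vector). Dense toy version."""
    index = {f: i for i, f in enumerate(files)}
    A = np.zeros((len(files), len(files)))
    for (src, dst), count in edge_counts.items():
        i, j = index[src], index[dst]
        A[i, j] += count
        A[j, i] += count                    # treat co-access as symmetric
    L = np.diag(A.sum(axis=1)) - A          # unnormalized graph Laplacian
    _, eigvecs = np.linalg.eigh(L)          # eigenvectors, ascending eigenvalues
    fiedler = eigvecs[:, 1]
    group_a = [f for f in files if fiedler[index[f]] >= 0]
    group_b = [f for f in files if fiedler[index[f]] < 0]
    return group_a, group_b
```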
Location-Independent Routing • Not a very good name for a rather simple idea. Interesting introspective problems are inherently distributed. Coordination among nodes is difficult. Needed: • Automatically create/locate parents in aggregation hierarchy • Path redundancy for stability, availability • Scalability • Fault tolerance, responsiveness to fluctuation in workload • OceanStore data location system shares these requirements. This coincidence is not surprising, as each is an instance of wide-area distributed problem solving. • Leverage OceanStore Location/Routing System Dennis Geels, geels@cs.Berkeley.edu
Summary: Introspection in OceanStore • Recognize and share a few common mechanisms • Efficient event-driven handlers • More powerful, database-driven algorithms • Distributed, location-independent routing • Leverage common architecture to allow system designers to concentrate on developing & optimizing domain-specific algorithms Dennis Geels, geels@cs.Berkeley.edu
Outline • Introspection Concept and Methods • SPAND Content Level Adaptation • MIT Congestion Manager/TCP Layer Adaptation • ICAP Cache-Layer Adaptation
SPAND Architecture Mark Stemm
SPAND Architecture Mark Stemm
What is Needed • An efficient, accurate, extensible and time-aware system that makes shared, passive measurements of network performance • Applications that use this performance measurement system to enable or improve their functionality Mark Stemm
Issues to Address • Efficiency: What are the bandwidth and response time overheads of the system? • Accuracy: How closely does the predicted value match actual client performance? • Extensibility: How difficult is it to add new types of applications to the measurement system? • Time-awareness: How well does the system adapt to and take advantage of temporal changes in network characteristics? Mark Stemm
SPAND Approach: Shared Passive Measurements Mark Stemm
Related Work • Previous work to solve this problem • Use active probing of network • Depend on results from a single host (no sharing) • Measure the wrong metrics (latency, hop count) • NetDyn, NetNow, Imeter • Measure latency and packet loss probability • Packet Pair, bprobes • With Fair Queuing, measures “fair share” of bottleneck link b/w • Without Fair Queuing, unknown (min close to link b/w) • Pathchar • Combines traceroute & packet pair to find hop-by-hop latency & link b/w • Packet Bunch Mode • Extends back-to-back technique to multiple packets for greater accuracy Mark Stemm
Related Work • Probing Algorithms • Cprobes: sends small group of echo packets as a simulated connection (w/o flow or congestion control) • Treno: like above, but with TCP flow/congestion control algorithms • Network Probe Daemon: traces route or makes short connection to other network probe daemons • Network Weather Service: makes periodic transfers to distributed servers to determine b/w and CPU load on each Mark Stemm
Related Work • Server Selection Systems • DNS to map name to many servers • Either round-robin or load balancing • Boston University: uses cprobes, bprobes • Harvest: uses round trip time • Harvard: uses geographic location • Using routing metrics: • IPv6 Anycast • HOPS • Cisco Distributed Director • University of Colorado • IBM WOM: uses ping times • Georgia Tech: uses per-application, per-domain probe clients Mark Stemm
Comparison with Shared Passive Measurement • What is measured? • Others: latency, link b/w, network b/w • SPAND: actual response time, application specific • Where is it implemented? • Others: internal network, at server • SPAND: only in client domain • How much additional traffic is introduced? • Others: tens of Kbytes per probe • SPAND: small performance reports and responses • How realistic are the probes? • Others: artificially generated probes that don’t necessarily match realistic application workloads • SPAND: actual observed performance from applications Mark Stemm
Comparison with Shared Passive Measurement • Does the probing use flow/congestion control? • Others: no • SPAND: whatever the application uses (usually yes) • Do clients share performance information? • Others: no; sometimes probes are made on behalf of clients • SPAND: yes Mark Stemm
Benefits of Sharing and Passive Measurements • Two similarly connected hosts are likely to observe the same performance to distant hosts • Sharing measurements implies redundant probes can be eliminated Mark Stemm
Benefits of Passive Measurements Mark Stemm
Design of SPAND Mark Stemm
Design of SPAND • Modified Clients • Make Performance Reports to Performance Servers • Send Performance Requests to Performance Servers • Performance Servers • Receive reports from clients • Aggregate/post process reports • Respond to requests with Performance Responses • Packet Capture Host • Snoops on local traffic • Makes Performance Reports on behalf of unmodified clients Mark Stemm
Design of SPAND • Application Classes • Way in which an application uses the network • Examples: • Bulk transfer: uses flow control, congestion control, reliable delivery • Telnet: uses reliability • Real-time: uses flow control and reliability • (Addr, Application Class) is the target of a Performance Request/Report Mark Stemm
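A toy Python sketch of the performance server’s report/request path, keyed by (address, application class) as described above; the class name, windowing policy, and aggregation by mean are assumptions for illustration, not SPAND’s actual design.

```python
from collections import defaultdict
from statistics import mean

class PerformanceServer:
    """Toy SPAND-style performance server: clients report observed response
    times keyed by (destination address, application class); requests are
    answered with the mean of recent reports for that key."""

    def __init__(self, window=50):
        self.window = window
        self.reports = defaultdict(list)   # (addr, app_class) -> [seconds]

    def report(self, addr, app_class, response_time):
        history = self.reports[(addr, app_class)]
        history.append(response_time)
        del history[:-self.window]         # keep recent reports only (time-awareness)

    def request(self, addr, app_class):
        history = self.reports.get((addr, app_class))
        return mean(history) if history else None

server = PerformanceServer()
server.report("www.example.com", "bulk transfer", 2.3)
server.report("www.example.com", "bulk transfer", 1.9)
print(server.request("www.example.com", "bulk transfer"))   # ~2.1 seconds
```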
Issues • Accuracy • Is net performance stable enough to make meaningful Performance Reports? • How long does it take before the system can service the bulk of the Performance Requests? • In steady state, what % of Performance Requests does the system service? • How accurate are Performance Responses? • Stability • Performance results must not vary much with time • Implications of Connection Lengths • Short TCP connections dominated by round trip time; long connections by available bandwidth Mark Stemm
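A back-of-envelope Python calculation of why short connections are RTT-dominated while long ones are bandwidth-dominated; the slow-start model and initial window size are rough assumptions, not measurements.

```python
import math

def rough_transfer_time(size_bytes, rtt_s, bandwidth_bps, init_window=4 * 1460):
    """Back-of-envelope only: slow start roughly doubles the window each RTT,
    so small transfers pay mostly RTTs while large ones pay size/bandwidth."""
    slow_start_rounds = max(1, math.ceil(math.log2(size_bytes / init_window + 1)))
    return slow_start_rounds * rtt_s + size_bytes * 8 / bandwidth_bps

print(rough_transfer_time(10_000, 0.1, 1_000_000))       # short: RTT-dominated (~0.28 s)
print(rough_transfer_time(10_000_000, 0.1, 1_000_000))   # long: bandwidth-dominated (~81 s)
```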
Application of SPAND: Content Negotiation • Web pages look good on server LAN Mark Stemm
Implications for Distant Access, Overwhelmed Servers Mark Stemm
Content Negotiation Mark Stemm
Client-Side Negotiation Results Mark Stemm
Server-Side Dynamics Mark Stemm
Server-Side Negotiation: Results Mark Stemm
Content Negotiation Results • Network is the bottleneck for clients and servers • Content negotiation can reduce download times of web clients • Content negotiation can increase throughput of web servers • Actual benefit depends on fraction of negotiable documents Mark Stemm
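A hedged Python sketch of content negotiation driven by a predicted bandwidth (the kind of prediction SPAND would supply); the variant names, sizes, and 5-second budget are made up for illustration.

```python
def choose_variant(predicted_bandwidth_bps, variants):
    """Pick the richest page variant expected to download within a target
    time, given a predicted bandwidth. Thresholds are illustrative."""
    target_seconds = 5.0
    affordable = [v for v in variants
                  if v["size_bytes"] * 8 / predicted_bandwidth_bps <= target_seconds]
    # Fall back to the smallest variant if nothing fits the budget.
    return max(affordable, key=lambda v: v["size_bytes"]) if affordable else \
           min(variants, key=lambda v: v["size_bytes"])

variants = [
    {"name": "full-images", "size_bytes": 400_000},
    {"name": "thumbnails",  "size_bytes": 80_000},
    {"name": "text-only",   "size_bytes": 10_000},
]
print(choose_variant(56_000, variants)["name"])      # slow modem -> "text-only"
print(choose_variant(1_500_000, variants)["name"])   # fast link  -> "full-images"
```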
Outline • Introspection Concept and Methods • SPAND Content Level Adaptation • MIT Congestion Manager/TCP Layer Adaptation • ICAP Cache-Layer Adaptation
Internet Congestion Manager (Hari@MIT, Srini@CMU) • The Problem: • Communication flows compete for the same limited bandwidth resource (especially on slow start!), each implements its own congestion response, there is no shared learning, and the result is inefficient even within one end node • The Power of Shared Learning and Information Sharing [Figure: flows f1, f2, …, f(n) between Server and Client sharing the same path]
Adapting to the Network? • New applications may not use TCP • Implement new protocol • Often do not adapt to congestion: not “TCP-friendly” • Need system that helps applications learn and adapt to congestion [Figure: flow f1 from Server across the Internet to Client]
State of Congestion Control • Increasing number of concurrent flows • Increasing number of non-TCP apps Congestion Manager (CM): An end-system architecture for congestion management
The Big Picture [Figure: HTTP, Audio, Video1, and Video2 applications sit above the API; TCP1, TCP2, and UDP sit above the Congestion Manager, which sits above IP and maintains per-macroflow statistics (cwnd, rtt, etc.)] • All congestion management tasks performed in CM • Applications learn and adapt using API
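A toy Python sketch of per-macroflow state sharing: every flow to the same destination consults one shared congestion window and RTT estimate. The AIMD constants and class names are illustrative, not the CM’s actual algorithm.

```python
from collections import defaultdict

class MacroflowState:
    """Toy per-macroflow state: all flows from this host to one destination
    share a single congestion window and RTT estimate."""
    def __init__(self):
        self.cwnd_bytes = 4 * 1460
        self.srtt_s = None

    def on_ack(self, rtt_sample_s):
        # Additive increase plus a standard EWMA RTT estimate.
        self.cwnd_bytes += 1460
        self.srtt_s = rtt_sample_s if self.srtt_s is None else \
            0.875 * self.srtt_s + 0.125 * rtt_sample_s

    def on_loss(self):
        self.cwnd_bytes = max(2 * 1460, self.cwnd_bytes // 2)  # multiplicative decrease

macroflows = defaultdict(MacroflowState)   # destination address -> shared state

# Two flows (e.g., TCP1 and Video1) to the same server update the same entry,
# so a loss seen by either one slows both down.
macroflows["server.example.com"].on_ack(0.080)
macroflows["server.example.com"].on_loss()
```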
Problems • How does CM control when and whose transmissions occur? • Keep application in control of what to send • How does CM discover network state? • What information is shared? • What is the granularity of sharing? Key issues: API and information sharing
The CM Architecture [Figure: Applications (TCP, conferencing app, etc.) use the API; at the Sender, the CM comprises a Congestion Controller, Scheduler, and Prober; at the Receiver, a Congestion Detector and Responder; the two sides communicate via the CM protocol]
Feedback about Network State • Monitoring successes and losses • Application hints • Probing system • Notification API (application hints)
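A minimal Python stub suggesting how an application might feed successes and losses back through a notification API; the method names are invented and only loosely modeled on the CM interface (RFC 3124), not copied from it.

```python
class CongestionManagerStub:
    """Toy notification API in the spirit of the CM: applications report
    transmission outcomes (successes, losses) and ask when/how much to send."""

    def __init__(self):
        self.cwnd_bytes = 4 * 1460
        self.outstanding = 0

    def can_send(self, nbytes):
        return self.outstanding + nbytes <= self.cwnd_bytes

    def notify_sent(self, nbytes):
        self.outstanding += nbytes

    def update(self, nbytes_acked, lost):
        """Application hint: feedback about what the network did with the data."""
        self.outstanding = max(0, self.outstanding - nbytes_acked)
        if lost:
            self.cwnd_bytes = max(2 * 1460, self.cwnd_bytes // 2)
        else:
            self.cwnd_bytes += 1460

cm = CongestionManagerStub()
if cm.can_send(1460):
    cm.notify_sent(1460)
cm.update(1460, lost=False)   # receiver feedback drives the shared window
```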