Towards a Transparent Internet • Ehab Al-Shaer, School of Computer Science, DePaul University • Richard Yang, Dept. of Computer Science, Yale University • Yan Chen, EECS Department, Northwestern University
Short Bios • Yan Chen • Assistant professor at Northwestern University • DOE CAREER Award in 2005 • Microsoft Trustworthy Computing Award in 2004 & 2005 • TPC co-chair of IWQoS 07; PC member for Infocom, Mobicom, etc. • Ehab Al-Shaer • Associate professor and Director of MNLAB • Actively involved in network operation and management for more than 10 years • TPC co-chair of IM’07, the premier network management conference; PC member for INFOCOM, ICNP, IM/NOMS, ASIACCS
Motivations • The Internet has evolved into an uncooperative, ossified network of networks • Each network has to be treated as a black box • The performance of even neighboring networks is opaque • Inter-domain routing is based on policies, not performance • Applications have to resort to overlay networks, which are suboptimal • Diagnosis and fault location are extremely hard • Network configuration management is error-prone and expensive • Reactive configurations: tuned after deployment • Vulnerable: manually handled and subject to conflicts • Imperative and fragmented: several specific devices must be accessed to implement a single service goal
Proposed Solution: Transparent Internet • Every network shares its measurement and management information with other networks when necessary (glass box) • Performance: delay, loss, available bandwidth, etc. • Can be at link level • Management info • Configurations: QoS settings, traffic policing, firewalls, etc. • Traffic info: traffic matrices, traffic characteristics • Information sharing • As part of the inter-domain protocols: Transparent Gateway Protocols (TGP) • For other applications: leverage a DHT
Analogy to the Airline Alliance • When airlines compose multi-leg flights, they need more than just route info • Type of aircraft, # of vacancies, probability of on-time arrival, etc. • Such an open model is mutually beneficial • It provides the best flight composition for clients • Similarly, an open network model can provide the best communications for applications
Objectives I: Provide a completely transparent view of the Internet to networks and applications • Diagnosis and troubleshooting become much easier • No more Internet tomography needed • Flexible inter-domain routing • Not just based on policy or # of AS hops • Flexible metrics based on bandwidth, latency, etc. • Global traffic engineering • Each AS performs its own local traffic engineering • Provide AS path-level routing guidance • Unified framework through which applications query (push/pull) info as needed • Streaming media, content distribution • Anomaly/security applications
Objectives II: Provide autonomic, provable and proactive configuration management • Proactive verification: configuration is verified and translated to different vendor-specific devices • Proactive validation: test configuration changes on archived network traffic without interrupting the operational network • Autonomic configuration: from high-level “management objectives” to configuration parameters • Configurations are auto-tuned dynamically to achieve the “objectives” [Figure: auto-tuning and proactivity cycle; stage labels: Defining, Verifying, Validation, Deploying, Evaluation/Prediction, Optimizing]
Flexible Inter-domain Routing • Multiple routing paths with TGP • Incorporate measurement info into AS paths • Bandwidth-intensive and latency-sensitive applications can take different AS paths • Challenge: inter-domain routing based on bandwidth without making reservations • Solution: discretize the bandwidth for a good tradeoff between adaptation and stability • Stability is a classical routing problem and is not unique to TGP
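The discretization idea on this slide can be illustrated with a minimal sketch. It assumes a hypothetical set of bandwidth levels and a hypothetical path record format; neither comes from the slides, and TGP itself is not specified here. The point is only that small fluctuations inside one level do not trigger a new advertisement, while applications can still pick different AS paths by metric.

```python
# Hypothetical bandwidth levels in Mbps; the slides do not specify any.
LEVELS_MBPS = [0, 10, 100, 1000, 10000]


def discretize(bandwidth_mbps, levels=LEVELS_MBPS):
    """Return the largest level that does not exceed the measured bandwidth."""
    chosen = levels[0]
    for level in levels:
        if bandwidth_mbps >= level:
            chosen = level
    return chosen


def needs_update(last_advertised_mbps, measured_mbps):
    """Advertise again only when the discretized level actually changes."""
    return discretize(measured_mbps) != last_advertised_mbps


def path_bandwidth(link_bandwidths_mbps):
    """Bottleneck (minimum) of the discretized per-hop bandwidths."""
    return min(discretize(b) for b in link_bandwidths_mbps)


def pick_path(paths, prefer):
    """Choose an AS path by metric: prefer is "bandwidth" or "latency"."""
    if prefer == "bandwidth":
        return max(paths, key=lambda p: path_bandwidth(paths[p]["bw_mbps"]))
    return min(paths, key=lambda p: sum(paths[p]["delay_ms"]))


# Hypothetical per-hop measurements for two candidate AS paths.
EXAMPLE_PATHS = {
    "path_1": {"bw_mbps": [1200, 2500], "delay_ms": [20, 30]},
    "path_2": {"bw_mbps": [150, 400], "delay_ms": [5, 10]},
}
```

With these example values, a bandwidth-intensive application would be steered to `path_1` and a latency-sensitive one to `path_2`. Coarser levels give more stability and less adaptation; the right granularity is a design choice the slide leaves open.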
Global Traffic Engineering (TE) • In the current Internet, TE is executed within each AS, so only a local optimum is achieved • This allows the network to handle all possible traffic patterns within the network's ingress-egress capacity constraints (e.g., two-phase routing) • With global information, we can potentially achieve a global optimum (or a Nash equilibrium) • Each AS is a selfish individual • A center (or each AS) infers the Nash equilibrium • Each AS can play the Nash equilibrium, or attempt to benefit itself based on the inferred Nash equilibrium
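As an illustration of how a center (or each AS) might infer a Nash equilibrium from shared information, here is a minimal sketch using best-response dynamics on a toy routing game. Everything in it is an assumption for illustration: the player names, the two paths, and the linear cost model (path weight times number of ASes using the path) are hypothetical and are not taken from the slides.

```python
# Hypothetical per-path weights: cost of a path = weight * number of users.
WEIGHTS = {"path_A": 1.0, "path_B": 1.0}


def path_cost(path, choices, weights):
    """Cost seen by any AS using `path`, given everyone's current choices."""
    load = sum(1 for chosen in choices.values() if chosen == path)
    return weights[path] * load


def best_response(player, choices, weights):
    """Path that minimizes the player's own cost; keeps the current one on ties."""
    best_path = choices[player]
    best_cost = path_cost(best_path, choices, weights)
    for path in weights:
        trial = dict(choices)
        trial[player] = path
        cost = path_cost(path, trial, weights)
        if cost < best_cost:
            best_path, best_cost = path, cost
    return best_path


def is_equilibrium(choices, weights):
    """True if no AS can lower its own cost by switching paths unilaterally."""
    return all(best_response(p, choices, weights) == choices[p] for p in choices)


def find_equilibrium(choices, weights, max_rounds=100):
    """Let each AS switch to its best response in turn until nobody moves."""
    choices = dict(choices)
    for _ in range(max_rounds):
        changed = False
        for player in sorted(choices):
            new_path = best_response(player, choices, weights)
            if new_path != choices[player]:
                choices[player] = new_path
                changed = True
        if not changed:
            return choices
    raise RuntimeError("no equilibrium found within max_rounds")
```

Starting from both hypothetical ASes on `path_A`, `find_equilibrium({"AS_x": "path_A", "AS_y": "path_A"}, WEIGHTS)` moves one of them to `path_B` and then stops, because neither can improve further on its own. A Nash equilibrium found this way is not necessarily the global optimum, which is the distinction the slide draws.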
Example of Benefit of Global TE [Figure: topology of AS 1 to AS 5; two sources each send 1G of traffic to AS 1; link labels: 2G, 2G, 1G, 2G, 1G]
Example of Benefit of Global TE • Without Global TE [Figure: same topology; link labels: 0.5G, 0.5G, 0.5G, 1G, 1.5G, 2G, 1G, 2G, 2G, 1G]
Example of Benefit of Global TE • With Global TE [Figure: same topology; link labels: 1G, 1G, 1G, 1G, 2G, 1G, 2G, 2G, 1G]
Unified Transparency Framework for Various Functionalities • Sharing of anomaly/security-related measurements • Various characteristics of traffic: heavy hitters, heavy changes, histograms, etc. • Self-diagnosis to survivability • Adaptations • Routing adaptations at router level or application level
Practical Issues and Solutions • Incentives for information sharing • Mandatory for the next-generation Internet? • Alliance model for incremental growth • Security/cheating: trust but verify • Trust most of the shared info but verify it periodically • Much easier than current Internet tomography, unless many ASes collude • Verification is part of the protocol • Some fields in the packet headers are designed for that purpose
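The "trust but verify" policy can be sketched as follows. This is a minimal illustration under stated assumptions: the sampling rate, the relative tolerance, the failure threshold, and all names (`should_verify`, `within_tolerance`, `TrustLedger`) are hypothetical, and the slides do not describe how the verifier obtains its own measurement.

```python
import random
from collections import defaultdict


def should_verify(rng, rate=0.1):
    """Verify only a sampled fraction of reports (hypothetical rate)."""
    return rng.random() < rate


def within_tolerance(reported, measured, rel_tolerance=0.2):
    """True if the shared value is close enough to our own measurement."""
    if measured == 0:
        return reported == 0
    return abs(reported - measured) / abs(measured) <= rel_tolerance


class TrustLedger:
    """Counts failed verifications per AS; too many failures revoke trust."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = defaultdict(int)

    def record(self, as_name, reported, measured):
        """Check one report against our own measurement and log the result."""
        ok = within_tolerance(reported, measured)
        if not ok:
            self.failures[as_name] += 1
        return ok

    def is_trusted(self, as_name):
        return self.failures[as_name] < self.max_failures
```

A caller would draw `should_verify(random.Random(), rate)` per report and feed only the sampled ones to `TrustLedger.record`. A single verifier like this cannot detect the collusion case the slide mentions.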
Summary • The Transparent Internet turns black-box networks into “glass boxes” • Enables/improves many functionalities • Diagnosis and troubleshooting over the global Internet • Flexible inter-domain routing • Global traffic engineering • Provable and proactive configuration management to verify, validate and self-tune configurations without interrupting the main operational network
Summary • Configuring the current Internet is highly complex, unprovable and reactive • Our approach provides a fully proactive/autonomic configuration architecture to verify, validate and self-tune configurations without interrupting the main operational network • Our architecture uses a high-level, goal-oriented policy refinement approach
Objectives (cont.): Provide provable and proactive configuration for NGI • Automated configuration management: from high-level “management objectives” to configuration parameters • Example: allow “Sami” to access all web servers except the ones in the “accounting” department • QoS configuration is highly complex: FQ, shapers, RED classes • Correct and seamless network-wide configuration • Creating a unified configuration representation and verifying the mapping • Conflict detection and resolution • Set-and-test configuration validation framework • Especially important for mission-critical networks • Very useful for delay-sensitive applications • Self-managed Internet • Autonomic configuration: configurations are auto-tuned dynamically to achieve the “objectives” • Proactive configuration: configurations are auto-steered dynamically to avoid predicted problems • Searchable MIB configuration • Using MIB objects tagged with metadata and the semantic web to provide (1) “multi-view” configuration management and (2) MIB information fusion
Configuration Definition and Verification Architecture • Technical approach: the major security policy verification components include (1) Policy modeling: BDD representation for all network security policies (2) Consistency checking of global network policies (3) Goal-oriented verification: verifying certain user-defined service properties (4) Policy aggregation and translation to a high-level language, and distribution [Figure: architecture diagram; labels: High Level Security Definition Language, Global Configuration Query Language, Goal-oriented Security Policy, Policy Tactics, what, how, UPR, Global UPR, Aggregation, Policy translation validation, Goal Oriented Verification, Consistency Check, Policy Segmentation, Policy Translation & Distribution; devices: FW, Router, IPSec, Access Point, IDS/IPS]
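To make the consistency-checking step concrete, here is a minimal sketch of one kind of conflict: a later rule that can never fire because an earlier rule with the opposite action already matches all of its traffic (shadowing). This is a simple pairwise check, not the BDD-based policy modeling named on the slide, and the rule format and all addresses are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str           # "allow" or "deny"
    src: str              # source prefix in CIDR notation
    dst: str              # destination prefix in CIDR notation
    port: Optional[int]   # None means any port


def covers(outer, inner):
    """True if every packet matched by `inner` is also matched by `outer`."""
    return (
        ip_network(inner.src).subnet_of(ip_network(outer.src))
        and ip_network(inner.dst).subnet_of(ip_network(outer.dst))
        and (outer.port is None or outer.port == inner.port)
    )


def find_shadowed(rules):
    """Return (earlier, later) index pairs where `later` can never take effect."""
    conflicts = []
    for later_idx, later in enumerate(rules):
        for earlier_idx in range(later_idx):
            earlier = rules[earlier_idx]
            if covers(earlier, later) and earlier.action != later.action:
                conflicts.append((earlier_idx, later_idx))
    return conflicts


# Hypothetical addresses for the "Sami" objective: Sami's host, the web
# servers, and the accounting department's subnet inside the web range.
SAMI = "10.0.5.7/32"
WEB = "10.1.0.0/16"
ACCOUNTING = "10.1.3.0/24"

CORRECT_ORDER = [Rule("deny", SAMI, ACCOUNTING, 80), Rule("allow", SAMI, WEB, 80)]
WRONG_ORDER = [Rule("allow", SAMI, WEB, 80), Rule("deny", SAMI, ACCOUNTING, 80)]
```

In `WRONG_ORDER` the broad allow rule comes first, so the accounting exception is shadowed and `find_shadowed` reports the pair; in `CORRECT_ORDER` it reports nothing. A full checker would also need to handle partial overlaps and rules spread across several devices, which this sketch does not attempt.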
(3) Autonomic Programmable Network Control [Figure: PSA (problem-symptom-action) model; labels: Policies, PSA Model, Reasoning, Symptoms, Diagnose, Action selection, Evaluation, Actions, Feedback (Symptoms), Reporting/Visualization Tools; networks AS1 to AS5]
Problem Reasoning: Autonomic Programmable Network Control [Figure: PSA model; labels: Policies, PSA Model, Symptoms, Diagnose, Actions, Feedback (Symptoms), Reporting/Visualization Tools; networks AS1 to AS5]
Measurement Info to Share • Basic metrics • Delay, loss rate, capacity, available bandwidth • Demand (or traffic volume) and application types • Intra-AS measurement info • Link-level info • Queried only when necessary • Aggregated info • OD flow-level info • Path segment between entry and exit points in each AS • Inter-AS measurement info • General AS relationships • AS-level topology • Inter-AS link metrics
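A minimal sketch of how link-level metrics might be aggregated into the path-segment metrics listed above, so that an AS can share one record per entry/exit pair instead of every link. The record format is hypothetical, and the loss computation assumes losses on different links are independent.

```python
import math


def aggregate_segment(links):
    """Combine per-link metrics along one entry-to-exit path segment.

    Delay adds up, capacity and available bandwidth are limited by the
    bottleneck link, and the loss rate follows from the product of the
    per-link delivery probabilities (assuming independent losses).
    """
    delivery = math.prod(1.0 - link["loss"] for link in links)
    return {
        "delay_ms": sum(link["delay_ms"] for link in links),
        "loss": 1.0 - delivery,
        "capacity_mbps": min(link["capacity_mbps"] for link in links),
        "avail_mbps": min(link["avail_mbps"] for link in links),
    }


# Hypothetical link-level measurements inside one AS.
EXAMPLE_LINKS = [
    {"delay_ms": 2.0, "loss": 0.01, "capacity_mbps": 1000, "avail_mbps": 400},
    {"delay_ms": 5.0, "loss": 0.02, "capacity_mbps": 10000, "avail_mbps": 2500},
]
```

For `EXAMPLE_LINKS` the segment delay is 7.0 ms, the bottleneck capacity is 1000 Mbps with 400 Mbps available, and the loss rate is 1 - 0.99 × 0.98 = 0.0298. The same composition rules apply when concatenating segments across ASes into an AS path.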
Transparent Internet Architecture • Measurement info is combined with routing info and exported to neighboring ASes through TGP • A globally retrievable Management Information Base (MIB) is provided with a DHT • Network link-level monitoring
Methodology [Figure: iterative cycle of algorithm design, analytical evaluation, realistic simulation and PlanetLab tests] • Network topology • Web workload • Network end-to-end latency measurement
TGP MIB Dissemination Architecture • Leverage a Distributed Hash Table (Tapestry) for • Distributed, scalable location with guaranteed success • Search with locality [Figure: data plane and network plane; labels: data source, web server, SCAN server, client, replica, cache, always update, adaptive coherence, DHT mesh, Dynamic Replication/Update and Replica Management, Replica Location, Overlay Network Monitoring]
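The role of the DHT here, mapping each MIB key to a responsible node so that any participant can locate it, can be sketched with a simple consistent-hash ring. Note the substitution: Tapestry uses prefix-based routing with locality, which this sketch does not implement. The class names, key format and node names are all hypothetical.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a point on the identifier ring."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Consistent-hash ring: a key belongs to the first node clockwise."""

    def __init__(self, nodes):
        self._ring = sorted((_hash(node), node) for node in nodes)
        self._points = [point for point, _ in self._ring]

    def lookup(self, key):
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


class MibStore:
    """Toy MIB store: every record lives on the node the ring assigns it to."""

    def __init__(self, nodes):
        self.ring = Ring(nodes)
        self.tables = {node: {} for node in nodes}

    def publish(self, key, value):
        self.tables[self.ring.lookup(key)][key] = value

    def query(self, key):
        return self.tables[self.ring.lookup(key)].get(key)
```

For example, after `store = MibStore(["node_a", "node_b", "node_c"])` and `store.publish("AS2/link7/delay_ms", 4.2)`, any participant that knows the node list can call `store.query("AS2/link7/delay_ms")` and reach the same node without a central index.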
Adaptive Overlay Streaming Media [Figure: overlay nodes at Stanford, UC San Diego, UC Berkeley and HP Labs, with an X marked on the diagram] • Implemented with a Winamp client and a SHOUTcast server • Congestion introduced with a Packet Shaper • Skip-free playback: server buffering and rewinding • Total adaptation time < 4 seconds
Existing CDNs Fail to Address These Challenges • No coherence for dynamic content • Unscalable network monitoring: O(M × N), where M is the # of client groups and N is the # of server farms • Non-cooperative replication is inefficient
Problem Formulation • Subject to a given total replication cost (e.g., # of URL replicas) • Find a scalable, adaptive replication strategy that reduces the average access cost
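The formulation can be made concrete with a small sketch. It uses a plain greedy heuristic as a stand-in (the slides do not say which strategy the authors use, and greedy is neither guaranteed optimal nor particularly scalable). The budget is the number of extra URL replicas, and the distances, request rates and names are hypothetical.

```python
def avg_access_cost(demand, dist, placement):
    """Request-weighted average distance from each client to the nearest copy."""
    total_cost = 0.0
    total_requests = 0
    for (client, url), rate in demand.items():
        nearest = min(dist[client][server] for server in placement[url])
        total_cost += rate * nearest
        total_requests += rate
    return total_cost / total_requests


def greedy_replicate(demand, dist, origin, servers, budget):
    """Add up to `budget` replicas, each time taking the largest cost reduction."""
    placement = {url: {server} for url, server in origin.items()}
    for _ in range(budget):
        best = None
        best_cost = avg_access_cost(demand, dist, placement)
        for url in sorted(placement):
            for server in sorted(servers):
                if server in placement[url]:
                    continue
                placement[url].add(server)
                cost = avg_access_cost(demand, dist, placement)
                placement[url].remove(server)
                if cost < best_cost:
                    best, best_cost = (url, server), cost
        if best is None:
            break  # no remaining replica lowers the average cost
        placement[best[0]].add(best[1])
    return placement


# Hypothetical inputs: two servers, two client groups, two URLs.
SERVERS = ["s1", "s2"]
DIST = {"c1": {"s1": 1, "s2": 10}, "c2": {"s1": 10, "s2": 1}}
ORIGIN = {"/a": "s1", "/b": "s1"}
DEMAND = {("c1", "/a"): 10, ("c2", "/a"): 90, ("c2", "/b"): 5}
```

With a budget of one replica, the sketch copies `/a` to `s2`, since that URL carries most of the requests from the distant client group; the average access cost drops from 960/105 to 150/105.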