Distributed Systems, Network Protocols & Applications • Srinivasan Seshan • Computer Science Department • Carnegie Mellon University
Three Major Projects • Measurement and analysis of networks • Sensor networks • Distributed virtual reality
Measurement/Analysis of Networks • Selfish TCP behavior • Bottleneck discovery • Scaling properties of the Internet • Multihoming • People: • Aditya Akella, Jeff Pang, Bruce Maggs and Srinivasan Seshan • Anees Shaikh (IBM)
Measuring the Internet from Everywhere • What could you learn if you could… • Have a machine in almost every ISP • Collect routing information (E-BGP/I-BGP) from these ISPs • Be part of a significant fraction of all Web transfers • Be queried by almost every DNS server in the world • We have access to such a testbed: Akamai
Bottleneck Discovery • Where are bottlenecks in the Internet? • Ignoring access links • What is the capacity of these bottlenecks? • Initial results • There is a lot of available bandwidth in the Internet today • > 45 Mbps on 50% of paths! • Quantified relative benefit of using larger Tier-1 ISPs over smaller ISPs • Internal ISP links are bottlenecks more often than expected • Peering between ISPs not as significant a bottleneck as expected [Figure: stub networks connected through ISP1, ISP2, and more ISPs; where are the bottlenecks?]
Scaling Properties of the Internet • How will these bottlenecks change over time? • Analyzing the combination of Internet topology and routing • Identifying changes that are needed to make the Internet scale with hardware improvements from Moore’s law • Initial results • Congestion scales poorly in Internet-like graphs • Policy-routing does not worsen the congestion • Alleviation possible via simple, straightforward mechanisms • Uniformly scale all capacities? • Scale some links faster? • Moore’s-law-like scaling sufficient? [Figure: congested hot-spots]
Multihoming: the effective use of multiple ISPs by stub networks • How can stub networks like CMU route traffic around bottlenecks? • Using multiple ISPs can… • Improve performance and reliability of Internet connectivity • Make Internet routing robust to failures and attacks • But need… • Techniques for stub domains to choose providers • Monitoring tools to track changes in Internet performance • Dynamic control over chosen routes [Figure: CMU reaching a destination across the Internet through ISP 1 and ISP 2]
Multihoming • In a given metro area… • What maximum performance benefits can multihoming offer? • How can multihomed networks realize these benefits in practice? • Initial results • Multihoming helps, but not much beyond 4 providers • Careful choice necessary • Cannot just pick top individual performers • Performance can be 50% worse for a poor choice of providers • Future work • Reasons for observed performance benefit: can we relate route/ISP selection to bottleneck observations? • Impact of ISP cost structure: what are the best choices for a given cost? • How will Internet operation be affected by such “smart” routing?
Sensor Networks • IrisNet • People: • Suman Nath, Yan Ke, and Srinivasan Seshan • Phil Gibbons, Babu Pillai, Rahul Sukthankar (Intel Research)
What if Sensors Were Everywhere? • Persistent queries/triggered actions • Network monitoring • Packet sniffers as sensors • Show an image when you hear a honk • Characterization of human activity • Person Locator System • Is the cafeteria busy? • Where’s Fred?
Sensor Services • Need: infrastructure to simplify creation of sensor-enriched services • Remove deployment overhead • Provide a common shared infrastructure of sensors • Automate common tasks • Sensor reading collection and storage • Efficient query processing over readings • Address privacy concerns of users
IrisNet: Internet-scale Resource-Intensive Sensor Network Services • Sensor Networks: mote hardware; TinyOS, TinyDB, etc.; campus-scale; minimal sensor processing; energy is a key concern; scalar sensors; narrowly focused services; ad hoc wireless connectivity • IrisNet: PCs/PDAs; Linux, Java, XML, C++; Internet-scale; intensive sensor processing; powered nodes; multimedia sensors; wide variety of services; direct Internet connectivity
Example: Parking Space Finder • A distributed database maintains • Spot availability data • Address of parking spot • Meter description • Historical availability data • Query: Where is the cheapest empty parking spot near school? • Returns driving directions to the best spot
IrisNet Architecture [Figure: Parking Space Finder organizing agents (Downtown, University, Hill District) and Person Finder organizing agents (Amy-John, Kim-Steve, Tom-Zoe), connected over the Internet to sensing agents]
Design Decisions • Sensor feeds processed in an application-specific way near the source • Reduces demand on network • Requires relatively intensive processing on sensor device • Distributed, hierarchical XML database stores readings • Accommodates frequent updates to different readings • XML supports hierarchical and heterogeneous/evolving description of data • Hierarchical organization enables scalability and rich query styles • Challenges in database processing, image processing & distributed systems
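To make the hierarchical XML idea concrete, here is a minimal sketch of a Parking Space Finder style lookup over a single in-memory XML document. The element names, attributes, prices, and the `cheapest_empty_spot` helper are all hypothetical; the slides do not give IrisNet's schema or query interface, and a real deployment would spread this document across many nodes.

```python
import xml.etree.ElementTree as ET

# Hypothetical readings, organized city -> neighborhood -> block -> space.
READINGS = """
<city id="Pittsburgh">
  <neighborhood id="Downtown">
    <block id="1">
      <parkingSpace id="1" available="yes" price="2.00"/>
      <parkingSpace id="2" available="no" price="0.50"/>
    </block>
    <block id="2">
      <parkingSpace id="1" available="yes" price="1.25"/>
    </block>
  </neighborhood>
  <neighborhood id="University">
    <block id="1">
      <parkingSpace id="1" available="yes" price="0.75"/>
    </block>
  </neighborhood>
</city>
"""


def cheapest_empty_spot(root, neighborhood):
    """Return the cheapest available space in one neighborhood, or None."""
    hood = root.find(f"./neighborhood[@id='{neighborhood}']")
    if hood is None:
        return None
    candidates = []
    for block in hood.findall("block"):
        # Only spaces whose latest reading says they are free.
        for space in block.findall("parkingSpace[@available='yes']"):
            candidates.append(
                (float(space.get("price")), block.get("id"), space.get("id"))
            )
    if not candidates:
        return None
    price, block_id, space_id = min(candidates)
    return {"block": block_id, "space": space_id, "price": price}


root = ET.fromstring(READINGS)
best_downtown = cheapest_empty_spot(root, "Downtown")
```

Because the query names a neighborhood, only that subtree has to be searched, which is the property that would let a distributed version forward the query only to the nodes holding that part of the hierarchy.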
Distributed Virtual Reality • Distributed multiplayer games • People: • Ashwin Bharambe, Jeff Pang and Srinivasan Seshan
What do Multiplayer Games Look Like? • Large shared world • Composed of map information, textures, etc. • Populated by active entities: user avatars, computer AIs, etc. • Only parts of world relevant to particular user/player [Figure: game world containing Player 1 and Player 2]
Individual Player’s View • Interactive environment (e.g. door, rooms) • Live ammo • Monsters • Players • Game state
Current Game Architectures • Centralized client-server (e.g., Quake) • Every update sent to the server, which maintains the “true” state • Advantages/disadvantages + Reduces overall bandwidth requirements + State management, cheat proofing much easier - Bottleneck for computation and bandwidth (current games limited to about 6000 players) - Single point of failure - Response time limited by client-server latency • Distributed broadcast-based (e.g., DOOM) • Every update sent to all participants • Advantages/disadvantages + No central server - Waste of bandwidth - Synchronized game state: difficult for players to join at arbitrary times • Do not scale well
Large-Scale Distributed Games • Need to distribute responsibility for maintaining world state and running computer AIs • Avoid any single point of failure • Efficient use of available bandwidth • Every player only receives “relevant” updates, i.e., subscribes to updates • Solution: model game with Publish-Subscribe [Figure: virtual world with a player at (100,200) whose events carry x = 100, y = 200; the player's arena spans (50,250) to (150,150) and is expressed as the interest x ≥ 50, x ≤ 150, y ≥ 150, y ≤ 250]
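The interest and event values on this slide can be checked with a few lines of code. This is only a sketch of the matching rule, assuming inclusive bounds and dictionary-based messages; the names `matches` and `relevant_subscribers` are made up for illustration and are not from the slides.

```python
def matches(subscription, publication):
    """True if every attribute range in the subscription contains the
    corresponding value in the publication (bounds are inclusive)."""
    return all(
        attr in publication and low <= publication[attr] <= high
        for attr, (low, high) in subscription.items()
    )


def relevant_subscribers(subscriptions, publication):
    """Names of subscribers whose interests cover this publication."""
    return [
        name for name, sub in subscriptions.items()
        if matches(sub, publication)
    ]


# Interest from the slide: 50 <= x <= 150 and 150 <= y <= 250.
arena_interest = {"x": (50, 150), "y": (150, 250)}
# Event from the slide: a player at x = 100, y = 200.
event = {"x": 100, "y": 200}

subscriptions = {
    "player_in_arena": arena_interest,
    "player_elsewhere": {"x": (200, 300), "y": (0, 100)},  # hypothetical
}
receivers = relevant_subscribers(subscriptions, event)
```

Here the matching is done in one place; the point of the rest of the talk is how to do the same filtering without a central matcher.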
Publish-Subscribe Overview • Publishers produce publications • Subscribers register their interests via subscriptions • Key feature: subscription language • Rich database-like subscription languages (e.g. all publications with stock price > 100) • Subject/channel-based subscriptions (e.g. all publications on the IBM stock channel) • State-of-the-art • Centralized designs with rich subscriptions • Scalable distributed designs with channel-based subscriptions • Unscalable designs with rich subscriptions [Figure labels: Publications, Subscription]
Publish-Subscribe Critical Components • Subscription language • Subjects vs. attribute/values • Exact matches vs. regular expressions? • Routing mechanism • Where are subscriptions stored in the system? • How are publications routed so that they “meet” subscriptions? • How are publications delivered from this rendezvous point to subscribers?
Related Systems • Scribe, Herald • Scalable, but restricted subscription language • Siena, Gryphon • Flexible subscription language, but poor scalability due to message flooding • Delicate balance between expressiveness of language and scalability of routing
MERCURY: Subscription Language • SQL-like but more limited: a tradeoff to achieve scalability • Example: int x ≤ 200 • Enough to support range predicates • Need sortable attribute values • Sufficient for modeling games • Game arenas • Player statistics, etc. • How to support this subscription language scalably? • Use techniques derived from distributed hash tables (DHT) • Existing DHT-based designs only support exact-match lookup • Need range-based lookups • Eliminate the use of cryptographic hashes, so must explicitly handle load-balancing
MERCURY: Routing Protocol • Each node responsible for a range of attribute values • For each attribute, nodes arranged into a circle (an attribute hub) • Each node compares the value in a message to its range and routes along the circle [Figure: attribute hub Hx, a circle of four nodes responsible for the x ranges [0, 80), [80, 160), [160, 240), and [240, 320)]
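A rough sketch of routing within one attribute hub, using the four x ranges shown for Hx. It assumes each node knows only its successor on the circle, so a message takes a linear number of hops; the slides do not say how many neighbors a real node tracks. The `route` function and the index-based node identifiers are illustrative only.

```python
# Node ranges for hub Hx as listed on the slide, in circle order.
HUB_X = [(0, 80), (80, 160), (160, 240), (240, 320)]


def route(hub, start, value):
    """Forward a message around the circle until it reaches the node whose
    half-open range [low, high) contains value. Returns the node indices
    visited, ending with the responsible node."""
    if not any(low <= value < high for low, high in hub):
        raise ValueError(f"value {value} is outside the hub's ranges")
    path = [start]
    node = start
    while not (hub[node][0] <= value < hub[node][1]):
        node = (node + 1) % len(hub)  # hand off to the successor
        path.append(node)
    return path


# A message carrying x = 100 that enters the hub at the [240, 320) node.
hops = route(HUB_X, start=3, value=100)
```

With successor-only forwarding the worst case touches every node in the hub, which is presumably why the later Future Work slide lists cached pointers as a way to reduce overlay hops.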
Routing Illustrated • Send subscription to any one attribute hub • Send publications to all attribute hubs [Figure: hub Hx with node ranges [0, 80), [80, 160), [160, 240), [240, 320) and hub Hy with node ranges [0, 105), [105, 210), [210, 320); the subscription 50 ≤ x ≤ 150, 150 ≤ y ≤ 250 and the publication x = 100, y = 200 meet at a rendezvous point]
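The two rules on this slide can be simulated to see why a publication always meets a matching subscription. The sketch below makes one assumption the slide does not state: a subscription is stored at every node in its chosen hub whose range overlaps the subscription's range for that attribute. The `Overlay` class and its method names are hypothetical.

```python
# Hub layouts from the slide: node ranges are half-open [low, high).
HUBS = {
    "x": [(0, 80), (80, 160), (160, 240), (240, 320)],
    "y": [(0, 105), (105, 210), (210, 320)],
}


def matches(subscription, publication):
    """Inclusive range match on every subscribed attribute."""
    return all(
        low <= publication[attr] <= high
        for attr, (low, high) in subscription.items()
    )


class Overlay:
    def __init__(self, hubs):
        self.hubs = hubs
        self.stored = {}  # (attribute, node index) -> [(name, subscription)]

    def subscribe(self, name, subscription, hub_attr):
        """Store the subscription in ONE hub, at each overlapping node."""
        low, high = subscription[hub_attr]
        for idx, (node_low, node_high) in enumerate(self.hubs[hub_attr]):
            if node_low <= high and low < node_high:
                self.stored.setdefault((hub_attr, idx), []).append(
                    (name, subscription)
                )

    def publish(self, publication):
        """Send the publication to ALL hubs; match at each rendezvous node."""
        receivers = set()
        for attr, hub in self.hubs.items():
            for idx, (node_low, node_high) in enumerate(hub):
                if node_low <= publication[attr] < node_high:
                    for name, sub in self.stored.get((attr, idx), []):
                        if matches(sub, publication):
                            receivers.add(name)
        return receivers


overlay = Overlay(HUBS)
overlay.subscribe("arena", {"x": (50, 150), "y": (150, 250)}, hub_attr="x")
delivered = overlay.publish({"x": 100, "y": 200})
```

Since the publication visits every hub, it does not need to know which hub a subscriber picked; the cost is one routed copy of each publication per attribute.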
Why Not Use DHTs (and Cryptographic Hashing)? • Hashing is good for exact matches, e.g., DHTs • Want to support range queries • Possible approach • Hash each value in the range • Problems • Can only be used for discrete-valued attributes • Too many subscriptions • Example: int x ≥ 1, int x ≤ 10 becomes int x = 1, …, int x = 9, int x = 10
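The cost of the "hash each value" approach is easy to see in a small sketch. SHA-1 and the modulo placement below are stand-ins chosen for illustration, not something the slides specify, and `expand_range` and `dht_node` are hypothetical helper names.

```python
import hashlib


def expand_range(attr, low, high):
    """Turn one integer range subscription into exact-match subscriptions,
    one per value in [low, high]."""
    return [(attr, value) for value in range(low, high + 1)]


def dht_node(attr, value, num_nodes):
    """Place an exact-match key on a node by hashing it (assumed scheme)."""
    digest = hashlib.sha1(f"{attr}={value}".encode()).digest()
    return int.from_bytes(digest, "big") % num_nodes


# The slide's example: 1 <= x <= 10 becomes ten separate subscriptions.
exact_subscriptions = expand_range("x", 1, 10)
placements = {
    key: dht_node(*key, num_nodes=16) for key in exact_subscriptions
}
```

The number of stored entries grows with the width of the range, and neighboring values generally hash to unrelated nodes, so a range cannot be covered by contacting a few adjacent nodes. For a continuous attribute there is no finite list of values to enumerate at all.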
Future Work • Performance • Cached pointers: reduce number of overlay hops • Network-aware placement of nodes: delay competitive with centralized systems • Robustness: need to survive node failures • Workload: need system to self-tune to workload • Cheating: detecting various forms of cheating • Routing, subscriptions, state ownership
Future Work • Distributed VR faces challenges similar to those of many other distributed applications • Other applications we plan to explore: • Collaborative applications (whiteboard, shared applications, chat servers, etc.) • Distributed databases • Distributed simulation (ns-2) • …