
Lecture 3: State of the Art

www.psirp.org. D.Sc. Arto Karila, Helsinki Institute for Information Technology (HIIT), arto.karila@hiit.fi. T-110.6120 – Special Course on Data Communications Software: Publish/Subscribe Internetworking.


Presentation Transcript


  1. Lecture 3: State of the Art • www.psirp.org • D.Sc. Arto Karila, Helsinki Institute for Information Technology (HIIT) • arto.karila@hiit.fi • T-110.6120 – Special Course on Data Communications Software: Publish/Subscribe Internetworking

  2. Contents • Introduction • Guiding Principles • Future Internet Architecture • Protocols • Mechanisms • Publish/Subscribe paradigm • Design Considerations • Economics • Security • Trust • Privacy

  3. Introduction • The PSIRP project aims to solve some major issues of the current Internet by applying the information-centric publish/subscribe paradigm throughout the layers • In fact, many current applications are inherently pub/sub in nature: • Distribution of software and anti-virus updates • IPTV • BitTorrent • RSS feeds and more! • A clean-slate pub/sub architecture could serve such applications very well

  4. Introduction • To succeed, we must know the current state of the art, make use of it, and extend it in many areas of communication • In early 2008 a rather thorough state-of-the-art study was conducted and collected into a report (D2.1) • Development has not stopped there, and the wiki used for the study has lived on, but D2.1 presents a snapshot of the situation two years ago • Because of the breadth of the area, we had to focus on promising sub-areas

  5. Contents • Introduction • Guiding Principles • Future Internet Architecture • Protocols • Mechanisms • Publish/Subscribe paradigm • Design Considerations • Economics • Security • Trust • Privacy

  6. Guiding Principles • Our vision is based on these concepts: • Everything is information, which can be organized hierarchically to build complicated structures from simple elements • There are different forms of information reachability on all levels of the design and they can change in real-time • Control is given to the recipient of information, fixing the imbalance of powers inherent in TCP/IP • The state-of-the-art study was focused on issues that appear to serve these ideas

  7. Contents • Introduction • Guiding Principles • Future Internet Architecture • Protocols • Mechanisms • Publish/Subscribe paradigm • Design Considerations • Economics • Security • Trust • Privacy

  8. Scope • The goals were mapped into areas of investigation that seemed to be relevant • Future Internet Architecture • Protocols • Naming • Addressing • Routing • Multicast • Mechanisms • Compensation • Caching • Security • Network Coding

  9. Scope (cont’d) • Publish/Subscribe • Design considerations • Economics • Socio-economic aspects • Security must be designed into the architecture • Trust is an important aspect of networking • Privacy is of increasing importance

  10. Methodology • The methodology of the SoA study was dictated by the envisioned scope • The SoA was simply the first step towards understanding the relevant prior work • “A system as complex as the Internet can only be designed effectively if it is based on a core set of design principles, or tenets, that identify points in the architecture where there must be common understanding and agreement” [Cla2003]

  11. Methodology • The original Internet was created by people who shared the common goal of interconnecting their computing equipment • Computers were physically large, with extremely limited resources • You kept your data with you and not on the system => Communication was modeled to share resources point-to-point, NOT for many-to-many content sharing and retrieval • As the Internet has grown well beyond its envisioned scope, several of its limitations have become apparent • From the socio-economic point of view, solving tussles (conflicts of interest) is one of the key problems facing the future Internet • This leads to design for change [Cla2003] and the requirement of evolvability [Rat05] • The importance of trust (E2E => T2T)

  12. Naming • Currently naming usually happens at the service-level: • domain names, • e-mail addresses, • URIs etc. • The Domain Name System (DNS) defines a static, hierarchical namespace organized into a tree, where ICANN manages the top-level domains • The DNS namespace is decoupled from the (also hierarchical) IP address space

  13. Quick Discussion • What is good about DNS? • What is bad about DNS? • Why is DNS insufficient to support host mobility?

  14. Naming • DONA replaces domain names with self-certifying, two-part, hash-based names that name data (not hosts or interfaces) • [Ram2004a] proposes a new design for name resolution • [Ram2004b] proposes a prefix-matching DHT • In [Cal2007], channels are named with unique identifiers without hierarchy or centralized control • [Cro2003] introduces contexts – collections of homogeneous network elements • There are lots of different proposals
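
The self-certifying, two-part names mentioned for DONA can be illustrated with a minimal sketch. It assumes a name of the form P:L, where P is a hash of the publisher's public key and L is a label chosen by the publisher. SHA-256, the placeholder key bytes, and the function names are assumptions made for illustration only, and the signature check that a real receiver would also perform is omitted.

```python
import hashlib


def make_name(principal_public_key: bytes, label: str) -> str:
    """Build a two-part name P:L, where P is a hash of the principal's public key."""
    p = hashlib.sha256(principal_public_key).hexdigest()
    return f"{p}:{label}"


def key_matches_name(name: str, claimed_public_key: bytes) -> bool:
    """Check that the key offered alongside the data hashes to the P part of the name."""
    p, _, _label = name.partition(":")
    return hashlib.sha256(claimed_public_key).hexdigest() == p


# Placeholder bytes stand in for a real public key (assumption for illustration).
publisher_key = b"example-public-key-bytes"
name = make_name(publisher_key, "lecture3-slides")
```

Because the name itself commits to the key, a receiver needs no external authority to decide whether a key belongs to the named principal.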

  15. Addressing • Traditionally IP addresses are divided into classes A, B, and C • In 1993 Classless Inter-Domain Routing (CIDR) was introduced, with variable-length prefixes and aggregation of blocks • [And2007] proposes an address structure where the subnet prefix is replaced with a self-certifying Autonomous Domain identifier (AD) and the suffix with a self-certifying Host Identifier (EID), addresses now being of the form AD:EID • ROFL proposes routing on flat labels, in a totally topology-independent way (this does not scale)
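
As a small illustration of CIDR aggregation, the sketch below uses Python's standard `ipaddress` module to merge adjacent blocks into a shorter prefix. The example prefixes and the helper name `aggregate` are chosen for illustration only.

```python
import ipaddress


def aggregate(prefixes):
    """Collapse a list of CIDR prefixes into the smallest equivalent list."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


# Four adjacent /24 blocks can be announced as a single /22.
merged = aggregate(["10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24"])

# Non-adjacent blocks cannot be aggregated and stay separate.
separate = aggregate(["10.1.0.0/24", "10.1.2.0/24"])
```

Aggregation of this kind is what lets a provider advertise one route for many customer blocks, which is why provider-independent addressing works against it.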

  16. Addressing • In [Cal2007] nodes are anonymous and addressed through their incoming channels • In [Cro2003] specific addresses are bound to different addresses in different contexts • [Han2004] proposes seven steps towards an Internet resistant against DoS attacks – the first two calling for separation of client and server addresses and removal of globally reachable client addresses

  17. Inter-Domain Routing • Border Gateway Protocol (BGP) is suffering from serious scaling problems • Default-free zone • In June 2007, an APNIC router in Tokyo had ~225,000 routes! • Any change in a globally visible prefix causes Internet-wide route updates • The number of globally visible prefixes is growing for a number of reasons, such as: • Provider-independent addressing • Multi-homing of sites • Protecting against prefix hijacking

  18. Domain-Level Routing • To tackle BGP’s scaling issues [And2007] proposes to route at the domain level • Removal of path selection from packet-forwarding-level routing has been proposed • Explicit domain-level path construction fits with name-based routing (e.g. TRIAD) • [Lak2006] proposes providing the path selection function as a separate routing service • [Key2006] lets the sending host optimize path selection based on congestion information • NIRA [Yan2007] proposes a separate path discovery protocol for the up-graph, Name-to-Route Lookup Service (NRLS) for the downhill route, and allowing the endpoints to further negotiate end-to-end path selection

  19. Domain-Level Routing • Some of these functionalities are needed by multi-path capable transport protocols, such as the Stream Control Transmission Protocol (SCTP) [Ste2000] • [Fea2004] proposes removing the routing function from routers to allow for better domain-level control of routing policies and allow a more direct domain-level mechanism for inter-domain routing • ROFL uses domain-level source routes as the means to route packets between endpoints – the first packet of a session uses hierarchical DHT routing, but after that the endpoints can use NIRA-like [Yan2007] end-to-end domain-level path control

  20. Compact Routing • Routing table sizes and communication cost of BGP are increasing exponentially with the number of global prefixes [Kri2007] • Routing on AS numbers doesn’t offer a real solution to the growing complexity • Compact Routing aims to decrease the size of routing tables while allowing non-shortest paths to be used • Traditional shortest-path algorithms yield routing tables of size O(n log n) [Gav1996]

  21. Compact Routing • A routing scheme is said to be compact if it produces: • Logarithmic address and header sizes • Sub-linear routing table sizes • Stretch bounded by a constant • A compact routing scheme can be: • Specialized or universal (works on all graphs) • Name-dependent or name-independent • Two compact routing schemes with small stretch (3) are the non-hierarchical Cowen [Cow1999] and the Thorup-Zwick (TZ) [Tho2001] schemes • [Kri2004] focuses on the TZ scheme with Internet-like graphs
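
Stretch is the ratio between the length of the path a routing scheme actually uses and the length of the shortest path. The sketch below measures it on a tiny unweighted graph; the four-node topology and the detour through a node "l" (standing in for a landmark) are made up for illustration and are not taken from the Cowen or TZ schemes.

```python
from collections import deque


def shortest_hops(graph, src, dst):
    """Breadth-first search: hop count of a shortest path in an unweighted graph."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError("destination unreachable")


def stretch(graph, route):
    """Hops actually travelled divided by hops on a shortest path."""
    return (len(route) - 1) / shortest_hops(graph, route[0], route[-1])


# Hypothetical topology: a and c are direct neighbours, but the compact
# scheme is assumed to know only a detour through the node "l".
graph = {"a": ["c", "l"], "c": ["a", "b"], "l": ["a", "b"], "b": ["l", "c"]}
detour = ["a", "l", "b", "c"]
```

Here the detour takes three hops where one would do, so its stretch is 3; a scheme with stretch bounded by 3 guarantees that no route is worse than this.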

  22. Overlay Routing • In overlay routing the topology is formed over an underlying (usually IP) network • DHTs are examples of overlay routing • DHT techniques can be utilized e.g. in implementing non-hierarchical rendezvous • An example of DHT-based solutions is the Content Addressable Network (CAN) • CAN is based on a d-dimensional Cartesian space, each node having a coordinate zone that it is responsible for
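
The idea of a key space split into coordinate zones can be sketched as follows for d = 2. The hash-to-point mapping, the four fixed zones, and the node names are assumptions for illustration; a real CAN splits zones dynamically as nodes join and routes between neighbouring zones rather than looking the owner up in a global table.

```python
import hashlib


def key_to_point(key: str, dims: int = 2):
    """Hash a key to a point in the unit cube [0, 1)^dims (4 hash bytes per axis)."""
    digest = hashlib.sha256(key.encode()).digest()
    return tuple(
        int.from_bytes(digest[4 * i:4 * i + 4], "big") / 2 ** 32 for i in range(dims)
    )


# Each node owns a rectangular zone, given as (low, high) bounds per axis.
zones = {
    "node-A": ((0.0, 0.5), (0.0, 0.5)),
    "node-B": ((0.5, 1.0), (0.0, 0.5)),
    "node-C": ((0.0, 0.5), (0.5, 1.0)),
    "node-D": ((0.5, 1.0), (0.5, 1.0)),
}


def owner(point, zones):
    """Return the node whose zone contains the point."""
    for node, bounds in zones.items():
        if all(lo <= x < hi for x, (lo, hi) in zip(point, bounds)):
            return node
    raise LookupError("point is outside every zone")
```

Storing a (key, value) pair then amounts to hashing the key to a point and handing the pair to the owner of the zone that contains it.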

  23. CAN • A two-dimensional example

  24. Chord Ring • Greedy forwarding (compare with ROFL)
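
Greedy forwarding on a Chord ring can be sketched as below: each node forwards a lookup to the finger that most closely precedes the key, until the key falls between a node and its successor. The 6-bit identifier space and the node IDs are illustrative assumptions, and the global sorted node list is a simulation shortcut (a real node knows only its own fingers).

```python
from bisect import bisect_left

M = 6            # identifier bits (assumption), so the ring has 64 positions
RING = 2 ** M


def successor(nodes, ident):
    """First node at or clockwise after ident; nodes must be sorted."""
    i = bisect_left(nodes, ident % RING)
    return nodes[i % len(nodes)]


def fingers(nodes, n):
    """Finger k of node n points at successor(n + 2^k)."""
    return [successor(nodes, n + 2 ** k) for k in range(M)]


def between(x, a, b, inclusive_end=False):
    """True if x lies on the ring interval (a, b), or (a, b] if inclusive_end."""
    if inclusive_end and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b


def lookup(nodes, start, key):
    """Greedy forwarding; returns (node responsible for key, nodes visited)."""
    node, path = start, [start]
    while True:
        succ = successor(nodes, node + 1)
        if between(key, node, succ, inclusive_end=True):
            path.append(succ)
            return succ, path
        # Forward to the closest finger that precedes the key on the ring.
        node = next(f for f in reversed(fingers(nodes, node)) if between(f, node, key))
        path.append(node)


nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
owner_node, hops = lookup(nodes, start=8, key=54)
```

Each hop at least halves the remaining ring distance to the key, which is where the logarithmic lookup cost of this style of DHT comes from.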

  25. Pastry DHT • An example with hexadecimal identifiers

  26. Content-Based Pub/Sub Routing • Hosts subscribe to content by specifying filters on the events • The content of the message defines its ultimate destination • Subscribers use interest registration facility which sets up data delivery paths • Pub/sub has been proposed as a replacement for TCP/IP • This would change the economic model too
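
A minimal sketch of content-based matching: subscribers register filters over event attributes, and the content of each published event alone decides who receives it. The `Broker` class, the attribute names, and the single in-memory broker are hypothetical simplifications; in a distributed system the filters would also be used to set up delivery paths between brokers.

```python
class Broker:
    """Hypothetical in-memory broker that matches events against subscriber filters."""

    def __init__(self):
        self.subscriptions = []  # list of (subscriber name, filter predicate)

    def subscribe(self, subscriber, event_filter):
        """Register interest: event_filter is a predicate over an event dict."""
        self.subscriptions.append((subscriber, event_filter))

    def publish(self, event):
        """Return the subscribers whose filter accepts the event."""
        return [name for name, accepts in self.subscriptions if accepts(event)]


broker = Broker()
broker.subscribe("alice", lambda e: e.get("topic") == "stock" and e.get("price", 0) > 100)
broker.subscribe("bob", lambda e: e.get("topic") == "weather")
```

Note that the publisher never names a destination: `broker.publish(...)` takes only the event, and the set of receivers follows from the filters.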

  27. Content-Based Pub/Sub Routing • Filter-based event routing – pub/sub servers are organized into an acyclic tree • Multicast-based event routing – a multicast tree is built for every interest group • Kyra [Cao2004] combines the approaches using a two-level hierarchy • Within a clique (based on proximity) all nodes know each other • On a higher level minimum spanning trees to the cliques are built for various events

  28. Content-Based Pub/Sub Routing • Siena is a classic example of distributed content-based routing implemented in the application layer, coexisting with TCP/IP [Car2001] • Overlay networks allow more complex functionality to be implemented on top of IP • Good overlay routing configuration follows the placement of network-level routers

  29. Multicast • Multicast is vital for the efficient distribution of media (such as video) • IPv4 has class D addresses for multicast • DVMRP is an early multicast routing protocol • The topological map of OSPF allows MOSPF to operate with little overhead • Protocol Independent Multicast (PIM) works with any routing protocol in two modes: sparse (PIM-SM) and dense (PIM-DM) • In the local network, IGMP is used
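
The class D range corresponds to the prefix 224.0.0.0/4. A short check with Python's standard `ipaddress` module, using a helper name chosen for illustration, might look like this:

```python
import ipaddress

# Class D (IPv4 multicast) is the block whose first four bits are 1110.
CLASS_D = ipaddress.ip_network("224.0.0.0/4")


def is_class_d(address: str) -> bool:
    """True if the IPv4 address falls in the class D multicast range."""
    return ipaddress.ip_address(address) in CLASS_D
```

The standard library exposes the same test directly as the `is_multicast` property of an address object.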

  30. Multicast • Multicast is considered valuable but it is not supported in the Internet • The main reasons for this are its security and scalability issues • DVMRP and PIM-DM initially flood the network • Each multicast router requires a lot of state • The sender runs the risk of getting traffic back from a large group of recipients • [PAS1998] provides a summary of approaches emphasizing different goals

  31. Recent Trends in Multicast • There are many proposals for more scalable or more easily deployable multicast • These can be roughly divided into three groups: • Router-based • Host-based • Overlay (DHT)-based

  32. Mechanisms • Compensation • Caching • Security • Network Coding

  33. Compensation • To facilitate efficient use of resources by providing the “owner” with some assurance that he will eventually benefit from the use of his resource • Different forms of compensation: • Authorization • Community membership • Resource exchange • Sacrifice or evidence of deliberate waste of the user’s own resources • Payment or promise of future reimbursement

  34. Compensation • Types of transaction-related costs: • Immediate technical costs • Information search costs • Collateral costs associated with the use • Compare with Transaction Cost Economics: • Researching potential suppliers • Collecting information on prices • Negotiating contracts • Monitoring the supplier’s output • Legal costs incurred (contract breaches)

  35. Compensation • Weber, Biggart and Delbridge: exchange = voluntary agreement involving the offer of any sort of present, continuing, or future utility in exchange for utilities of any sort offered in return • Four categories of exchange systems: • Price System • Associative System • Moral System • Communal System

  36. Caching • [PIT2008] studies caching performance in nodes of a Delay Tolerant Network (DTN), providing ad-hoc communication services within (sparse) mobile user communities when end-to-end IP service is unavailable • The network acting as a distributed cache • Caching is needed to handle heavy traffic • The price of storage is dropping faster than the price of communication => caching is getting more tempting

  37. Storage vs. Transit Price • [Figure: price of raw disk space compared with Tier-1 Internet transit, up to 2009; price axis labelled from $100/MB down to $0.1/GB]

  38. Scope Security • In pub/sub architectures scopes control the spreading of information • [Fie2004] proposes an extension to Rebeca, a large pub/sub system, to support scopes • In [Far2002] access control is implemented with attribute certificates (ACs) used to identify nodes and their privileges

  39. Packet Layer Authentication • Each packet (or PDU at any layer) can be signed and the public key included • The authenticity of the packet can now be determined by any node on its route • This prevents the attacker from consuming a lot of resources with falsified packets • This area will be covered in more detail in the PSIRP Security Architecture lecture

  40. Transparency and Information Accountability • Social rules tend to result in compliance more often than abuse • This is due to the fact that the consequences of compliance usually are more pleasant than those of violation • If we can build this into the architecture, a large-scale system can be made reliable, robust, secure, trusted, and efficient • [Wei2007] introduces transparency and accountability as the attributes of information systems that could result in compliance and collaboration

  41. Network Coding • Communication through an unreliable and unpredictable channel is difficult • Transmission errors can lead to long delays and a large number of retransmissions • Network coding includes Forward Error Correction (FEC) as well as more modern rateless codes – i.e. digital fountain codes

  42. Reed-Solomon Codes • Among the most significant traditional codes is the Reed-Solomon code (N,K), which, with q^m symbols in its alphabet, can be decoded after receiving any K out of the N symbols sent • The message consists of K original symbols and N-K parity symbols

  43. Fountain Codes • Fountain techniques send randomly all the parts of a message with added redundancy • They are rateless since there is no limit on the number of encoded packets generated from the source message and it can change on the fly • The source can send as many encoded packets as necessary for the destination to decode the data • Among fountain codes are: • Random Linear Fountain Code, • Tornado Codes and • LT Fountain Code
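
The rateless idea can be illustrated with a small random linear fountain over GF(2): every encoded packet is the XOR of a random subset of source blocks, and the receiver keeps collecting packets until it can solve for all of them. Representing blocks as integers, the bitmask format, and the fixed random seed are simplifications assumed for this sketch; practical codes such as LT choose the subsets from a carefully designed degree distribution.

```python
import random


def encode_packet(blocks, rng):
    """One encoded packet: a random non-empty subset (as a bitmask) and its XOR."""
    mask = rng.randrange(1, 2 ** len(blocks))
    value = 0
    for i, block in enumerate(blocks):
        if (mask >> i) & 1:
            value ^= block
    return mask, value


class Decoder:
    """Incremental Gaussian elimination over GF(2), kept in reduced form."""

    def __init__(self, k):
        self.k = k
        self.rows = {}  # pivot bit -> (mask, value)

    def add(self, mask, value):
        # Cancel every pivot that is already known from the new packet.
        for pivot, (m, v) in self.rows.items():
            if (mask >> pivot) & 1:
                mask ^= m
                value ^= v
        if mask == 0:
            return  # packet was a combination of earlier ones: no new information
        pivot = mask.bit_length() - 1
        # Remove the new pivot from the rows stored so far.
        for p, (m, v) in list(self.rows.items()):
            if (m >> pivot) & 1:
                self.rows[p] = (m ^ mask, v ^ value)
        self.rows[pivot] = (mask, value)

    def done(self):
        return len(self.rows) == self.k

    def blocks(self):
        return [self.rows[i][1] for i in range(self.k)]


def transmit(blocks, seed=0):
    """Keep sending encoded packets until the receiver can decode everything."""
    rng = random.Random(seed)
    decoder = Decoder(len(blocks))
    sent = 0
    while not decoder.done():
        decoder.add(*encode_packet(blocks, rng))
        sent += 1
    return decoder.blocks(), sent


source = [0x11, 0x22, 0x33, 0x44, 0x55]
decoded, packets_sent = transmit(source)
```

The sender never needs to know which packets were lost: any sufficiently large set of received packets decodes, typically only slightly more than the number of source blocks.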

  44. XOR Coding • Intelligent mixing of packets can be used to increase network throughput • An example is the situation where two users of a Wi-Fi base station (router) exchange two messages • Without network coding we need four transmissions • With simple XOR coding we can do with only three transmissions

  45. XOR Coding • Message exchange without network coding – four transmissions

  46. XOR Coding • Message exchange with network coding – three transmissions
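
The three-transmission exchange can be traced in a few lines. The message contents and variable names are illustrative, and the sketch assumes both messages have the same length (shorter ones would need padding).

```python
def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Byte-wise XOR of two equal-length messages."""
    if len(x) != len(y):
        raise ValueError("messages must have equal length")
    return bytes(a ^ b for a, b in zip(x, y))


msg_from_alice = b"hello bob!"
msg_from_bob = b"hi, alice!"

# Transmission 1: Alice -> base station.  Transmission 2: Bob -> base station.
# Transmission 3: the base station broadcasts the XOR of the two messages.
broadcast = xor_bytes(msg_from_alice, msg_from_bob)

# Each user XORs the broadcast with the message it already knows (its own).
received_by_alice = xor_bytes(broadcast, msg_from_alice)
received_by_bob = xor_bytes(broadcast, msg_from_bob)
```

The saving comes from the broadcast nature of the wireless medium: one coded transmission serves both receivers at once.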

  47. Linear Network Coding • Linear network coding is rather like XOR coding, except that the XOR operation is replaced with linear combination of data • The recipient can decode the information having received m out of the n messages • Linear coding appears to work well with multicast, which makes it interesting for PSIRP
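
A toy instance of the m-out-of-n property with m = 2 and n = 3: two source symbols are sent as three linear combinations, and any two of them suffice to decode. The prime field GF(257), the coefficient vectors, and the symbol values are assumptions chosen for illustration; the modular inverse via `pow(x, -1, p)` needs Python 3.8 or later.

```python
from itertools import combinations

P = 257  # prime modulus (assumption); symbols are integers in 0..256


def combine(coeffs, symbols):
    """Linear combination of the source symbols over GF(P)."""
    return sum(c * s for c, s in zip(coeffs, symbols)) % P


def decode2(packet_a, packet_b):
    """Solve a 2x2 linear system over GF(P) with Cramer's rule."""
    (a, b), y1 = packet_a
    (c, d), y2 = packet_b
    det = (a * d - b * c) % P
    if det == 0:
        raise ValueError("coefficient vectors are linearly dependent")
    inv = pow(det, -1, P)
    x1 = ((y1 * d - b * y2) * inv) % P
    x2 = ((a * y2 - c * y1) * inv) % P
    return x1, x2


source = (42, 200)
# Any two of these coefficient vectors are linearly independent mod 257.
coefficient_vectors = [(1, 1), (1, 2), (1, 3)]
packets = [(cv, combine(cv, source)) for cv in coefficient_vectors]

# Every pair of received packets recovers the source symbols.
recovered = [decode2(p, q) for p, q in combinations(packets, 2)]
```

With XOR coding the only available coefficients are 0 and 1; moving to a larger field gives many more independent combinations, which is what makes the approach attractive for multicast.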

  48. Publish/Subscribe Paradigm • The starting point of our work is that event-based computing and the pub/sub paradigm are crucial for future services • RSS feeds can be seen as pub/sub • SIP is an example of event-based computing • Formal modeling of pub/sub systems and correctness of content-based routing protocols are examined in [Müh2002b] • A routing protocol is correct if it satisfies the safety and liveness requirements

  49. Contents • Introduction • Guiding Principles • Future Internet Architecture • Protocols • Mechanisms • Publish/Subscribe paradigm • Design Considerations • Economics • Security • Trust • Privacy

  50. Design Considerations • Economics • Security • Formal Modeling
