T-110.6120 Publish/Subscribe Internetworking
Helsinki University of Technology, Spring 2010
Lecture 5: Overview of the PSIRP architecture
by Arto Karila
Slides based on the PSIRP project review slides presented in Brussels on March 3, 2009 and deliverable D2.3: Architecture Definition, Component Descriptions, and Requirements
Introduction • Project objectives • Consortium • Working method • Development process
Project Objectives The key objectives of the PSIRP project are: • Define and develop a new inter-networking architecture based on the publish/subscribe (pub/sub) paradigm • Develop a reference implementation of the core of the new architecture • Analyse and measure the communications efficiency of the new paradigm using quantitative parameters • Evaluate qualitative parameters of the new architecture with a focus on security and on business incentives and models • Develop migration and deployment strategies Other aspects: • Follow the clean-slate design approach – take nothing for granted • Consider migration, e.g., via an overlay solution • Emphasize security and socio-economics • Publish the results, including source code, as much and as widely as possible • Engage with the Future Internet community, e.g., cooperate with FIRE (OneLab2) to test on a large scale
Consortium • The consortium was formed by people who shared a common vision and wanted to work together towards it • The partners of PSIRP are: • Helsinki University of Technology – Helsinki Institute for Information Technology (TKK-HIIT, FI) • RWTH Aachen University (RWTH Aachen, DE) • British Telecommunications Plc (BT, GB) • Oy L M Ericsson Ab (LMF, FI) • Nokia Siemens Networks Finland Oy (NSNF, FI) • Institute for Parallel Processing of the Bulgarian Academy of Sciences (IPP-BAS, BG) • Athens University of Economics and Business (AUEB, GR) • Ericsson Magyarorszag Kommunikacios Rendszerek K.F.T. (HU)
Working Method • Iterative working method • Define a pub/sub-based inter-networking architecture • Implement and validate the architecture • Revise the architecture and its implementation • Work packages and tasks are allocated to teams that are formed spontaneously, across organizational boundaries • The purpose is to release creativity in teams of people from different companies and nationalities • The approach has proven its strength and will be continued in the proposed Pursuit project
Development Process [Figure: iterative development cycle; stage labels: Initial Planning, Planning, Requirements, Analysis & Design, Implementation, Testing, Evaluation, Deployment]
Technical Overview • Are the fundamentals still valid? • Observation: It’s all about information • Hypothesis: Increased information requires information-centric network approaches • Approach: Clean-slate with late binding • Design methodology • Main design principles • Information centrism is key • Information concepts • Grouping information networks • Information structures
Are the Fundamentals Still Valid? Fundamentals of the Internet • Collaboration • Reflected in forwarding and routing • Cooperation • Reflected in trust among participants • Endpoint-centric services (mail, FTP, even web) • Reflected in E2E principle -> IP, full end-to-end reachability vs. Reality in the Internet Today • Phishing, spam, viruses • There is no trust any more! • Current economics favor senders • Receivers are forced to carry the cost of unwanted traffic • Information-centric services • Do endpoints really matter? • Endpoint-centric services move towards information retrieval through, e.g., CDNs -> IP with middle boxes & significant decline in trust in the Internet
Observation: It's All About Information • Internet Tomorrow: • Proliferation of dissemination & retrieval services, e.g., • context-aware services & sensors • aggregated news delivery • augmented real life • Personal information to grow tenfold in the next ten years (IBM, 2008) • Increase of personalized video services • e.g., YouTube, BBC iPlayer • Vision recognized by different initiatives & individuals • Internet of Things, Van Jacobson, D. Reed • Lack of interworking between silo solutions will slow innovation and development speed • Internet Today: • In 2006, the amount of digital information created was 1.288 × 10^18 bits • 99% of Internet traffic is information dissemination & retrieval (Van Jacobson) • HTTP proxying, CDNs, video streaming, … • Akamai's CDN accounts for 15% of traffic • Between 2001 and 2010, information will increase 1 million times, from 1 petabyte (10^15) to 1 zettabyte (10^21) • Social networking is information-centric • Most solutions exist in silos • Overlays over IP map information networks onto endpoint networks
Hypothesis: Increased Information Requires Information-centric Network Approaches Application developers care about information concepts • Creation of information topologies of various kinds -> Endpoint-centric networking structures are inadequate • Topological network changes too slow in timescale • Topological network boundaries too restrictive • Topological network boundaries often not aligned with information topologies • Overlaying possible but restricted in (developer) scalability -> If it is all about information, why not route on information?
Approach Clean-slate design… • Question ALL fundamentals • Challenge our thinking • Take nothing for granted, including industry structures • Clear vision …with late binding (to reality) • Consider migration and evolvability in separate work items • How to get our design into real deployments, e.g., overlay vs. IP replacement? • Even consider necessary evolution of industry (& regulatory) structures • How do industries need to evolve in certain scenarios?
Design Methodology • Combine bottom-up and top-down • Implementation matters but rationalization is necessary • Adaptive design is crucial • Information-centrism is expected to help for areas like metadata & policy • Aligns well with cycle-based project approach • Creates micro-cycles of question/remove [Figure: design methodology cycle; labels: SoA, Vision, Goals, Constraints, Add/Remove, Derive, Principles, Observe, Design Patterns & Considerations, Map, Question, Remove, Components, Specify, Choice, Implement, Instance, Deploy & evaluate, Deployment]
Main Design Principles • Information is multi-hierarchically organised • Information semantics are constructed as directed acyclic graphs (DAGs) • Information scoping • Mechanisms are provided that allow for limiting the reachability of information to selected parties • Scoped information neutrality • Within each information scope, data is delivered based only on a given (rendezvous) identifier • The architecture is receiver-driven • No entity shall be delivered data unless it has agreed to receive it beforehand [Figure labels grouping the principles: Information Hierarchies, Information reachability/scoping, Communication Model]
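The scoping, neutrality, and receiver-driven principles above can be illustrated with a small in-memory sketch. This is not PSIRP code: the class `ScopedBroker`, its methods, and the membership model are all hypothetical names chosen for illustration, and scope membership is assumed to be a simple set of party names.

```python
from collections import defaultdict


class ScopedBroker:
    """Toy matcher of publications to subscriptions (hypothetical, not PSIRP code)."""

    def __init__(self):
        # Scope id -> parties allowed to reach information in that scope.
        self._members = defaultdict(set)
        # (scope id, rendezvous id) -> delivery callbacks of subscribers.
        self._subscriptions = defaultdict(list)

    def allow(self, sid, party):
        self._members[sid].add(party)

    def _check_scope(self, sid, party):
        # Information scoping: reachability is limited to scope members.
        if party not in self._members.get(sid, set()):
            raise PermissionError(f"{party!r} is outside scope {sid!r}")

    def subscribe(self, party, sid, rid, deliver):
        self._check_scope(sid, party)
        self._subscriptions[(sid, rid)].append(deliver)

    def publish(self, party, sid, rid, data):
        self._check_scope(sid, party)
        # Scoped information neutrality: delivery depends only on (sid, rid),
        # never on the payload or on who published it.
        # Receiver-driven: only parties that subscribed beforehand get data.
        receivers = self._subscriptions.get((sid, rid), [])
        for deliver in receivers:
            deliver(data)
        return len(receivers)


if __name__ == "__main__":
    broker = ScopedBroker()
    broker.allow("family", "alice")
    broker.allow("family", "bob")
    inbox = []
    broker.subscribe("bob", "family", "picture", inbox.append)
    print(broker.publish("alice", "family", "picture", b"img"), inbox)
```

Note that the publisher never names a receiver; it only names the scope and the rendezvous identifier.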
Information Centrism is Key • Information is everything and everything is information • Bootstrap other concepts, e.g., identity, policy, … • Scopes build information networks • Policy is metadata • So is scope! • Producers and consumers need no internetwork-level addressing! [Figure: data items (Mail, Picture) placed in scopes (Company A, Family, Friends), each scope with a governance policy; parties shown: Spouse, Father, Friend, Colleague]
Information Concepts • Information • Smallest something • Information collections • Sets of semantically similar information • Information networks • Sets of information under some common governance • Information producer • Entity publishing information to a particular network • Information consumer • Entity subscribing to information in a particular network
Information Structures [Figure: layered information structures, from private information networks and information networks (labelled with SIds, including an algorithmic SId) through information collections (algorithmic RId) down to information items (RId1, RId2)]
Architecture • High-level architecture • Architecture process • Architectural entities • Identifiers • Algorithmic identifiers • Architectural processes • Architecture overview • Components • Application considerations • Conclusions
High-Level Architecture [Figure: two panels. Node Architecture labels: Apps (pub, sub), Service Model, Helper, Fragmentation, Caching, Error Ctrl, Rendezvous, Topology, Forwarding. Network Architecture labels: Rendezvous Network (RP … RP, ITF), TM, FN, Forwarding Network (several)] Legend: RP: Rendezvous point • ITF: Inter-domain topology formation • TM: Topology management • FN: Forwarding node
Architecture Process • The aim of the architecture work is to produce a coherent design for a publish/subscribe internetwork • Principles, framework, components, internetworking • Networks of information • Using pub/sub in all parts of the protocol suite • The architecture work is iterative • Interaction with the implementation and validation work packages • Encompasses both clean-slate and incremental overlay-based solutions • In practice, work is done in a number of architecture teams with participants from different work packages
Architectural Entities • Information items • Any data that can be labelled with an identifier • Items can have different identifiers depending on the level of abstraction • Application identifier, rendezvous identifier, forwarding identifier • Algorithmic identifiers • Information networks (or scopes) • Information items grouped into a network of information under a scope • Scopes can be interpreted at various levels of abstraction • Information subscribers and publishers • Domains • Administrative network areas that can be connected using an inter-domain forwarding architecture
Identifiers [Figure: identifier relationships; labels in reading order: Publish/Subscribe Data; Metadata (source is implementation-dependent); Includes...; Application Identifiers (AId); Resolved to...; Associated with...; Scope Identifiers (SId); Rendezvous Identifiers (RId); Resolved to...; Forwarding Identifiers (FId); Define...; Network Transit Paths]
Architectural processes • Rendezvous • The process of resolving higher level identifiers to lower level identifiers within a given scope. • Three simple cases: link-local, intra-domain, inter-domain • Topology management and formation • Management of data delivery topologies and forwarding graphs • Forwarding • Data delivery within a single administrative domain or across multiple domains. • Temporal forwarding identifiers for each publisher and subscriber are derived via the rendezvous and topology management processes • Various routing and forwarding protocols can be used, for example a new protocol replacing IP or IP-based overlays
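As a rough illustration of rendezvous as identifier resolution within a scope, the sketch below resolves a (scope identifier, rendezvous identifier) pair to a forwarding identifier across the three cases named above. Everything here is assumed for illustration: the class `RendezvousSystem`, the plain dictionaries, and in particular the policy of trying the narrowest level first are not taken from the PSIRP specification.

```python
# The three rendezvous cases, ordered from narrowest to widest.
# Trying them in this order is an assumption made for this sketch.
LEVELS = ("link-local", "intra-domain", "inter-domain")


class RendezvousSystem:
    """Hypothetical resolver from (SId, RId) to a forwarding identifier."""

    def __init__(self):
        # One table per level: (sid, rid) -> fid.
        self._tables = {level: {} for level in LEVELS}

    def register(self, level, sid, rid, fid):
        # In the architecture this entry would be derived by rendezvous and
        # topology management; here it is simply stored.
        if level not in self._tables:
            raise ValueError(f"unknown rendezvous level: {level}")
        self._tables[level][(sid, rid)] = fid

    def resolve(self, sid, rid):
        # Resolution is always relative to a scope: the same rid under a
        # different sid is a different key.
        for level in LEVELS:
            fid = self._tables[level].get((sid, rid))
            if fid is not None:
                return level, fid
        return None


if __name__ == "__main__":
    rv = RendezvousSystem()
    rv.register("inter-domain", "scope-a", "rid-1", "fid-far")
    rv.register("link-local", "scope-a", "rid-1", "fid-near")
    print(rv.resolve("scope-a", "rid-1"))
    print(rv.resolve("scope-b", "rid-1"))
```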
Architectural processes • Helper functions • Extensions to core functionality of the network architecture, such as management and transport • Network attachment • Discovery of network attachment points and network configuration
Architecture Overview [Figure: a publisher and a subscriber attached to forwarding edge nodes in different ASes; each AS has rendezvous, topology, and forwarding functions. Interaction labels: Publish, Subscribe, Create delivery path, Configure forwarding path, Data forwarding across forwarding nodes]
Component Wheel • The network architecture is based on a modular and extensible core called the PSIRP Component Wheel • Components may be decoupled in space, time, and context • Layerless protocol suite • Typical components include rendezvous, forwarding, helper functions, and caching • Applications may insert new components into the wheel, or request them, at runtime • The components are attached to the local blackboard • Shared components, publications, state • Pub/sub is used to signal changes to blackboard state
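The blackboard idea above, where components share state and learn about changes through pub/sub, might be sketched as follows. The names `Blackboard` and `CacheComponent` and the dictionary-based publication format are hypothetical; the slides do not define a concrete interface.

```python
from collections import defaultdict


class Blackboard:
    """Hypothetical local blackboard: shared state plus change notifications."""

    def __init__(self):
        self._state = {}
        self._watchers = defaultdict(list)

    def subscribe(self, key, callback):
        # A component registers interest in one key of the shared state.
        self._watchers[key].append(callback)

    def publish(self, key, value):
        # Update the shared state, then signal the change via pub/sub.
        self._state[key] = value
        for callback in list(self._watchers.get(key, [])):
            callback(key, value)

    def get(self, key):
        return self._state.get(key)


class CacheComponent:
    """Example component that can be attached to the wheel at runtime."""

    def __init__(self):
        self.store = {}

    def attach(self, blackboard):
        blackboard.subscribe("publication", self._on_publication)

    def _on_publication(self, key, publication):
        # Keep a copy of every publication seen on the blackboard.
        self.store[publication["rid"]] = publication["data"]


if __name__ == "__main__":
    board = Blackboard()
    cache = CacheComponent()
    cache.attach(board)  # inserted at runtime, no other component changes
    board.publish("publication", {"rid": "rid-1", "data": b"hello"})
    print(cache.store)
```

The publisher of the state change does not know which components react to it, which is one way to read "decoupled in space, time, and context".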
Compare w/ Haggle Architecture [Sco2006] J. Scott, P. Hui, J. Crowcroft, C. Diot, Haggle: A Networking Architecture Designed Around Mobile Users, IFIP WONS 2006, Les Menuires, France, 2006
Service Model and API We have considered four different classes of network services: • A low-level page model that exposes network forwarding and rendezvous/topology formation functions • Basic building block on top of the raw forwarding API • Pages, with no error or congestion control • Listen(FId), Forward(FId, meta-data, data) • Mapping of information items to FIds • Pub/sub API towards higher levels • A mid-level memory object model • Memory pages are mapped to publications • A mid-level channel model • Various high-level service models including shared state and document models
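A minimal in-memory sketch of the low-level page model follows, assuming that the Listen and Forward calls take a forwarding identifier and that a publication is split into fixed-size pages. The class `PageNetwork`, the helper `publish_as_pages`, the default page size of 4096 bytes, and the `seq`/`total` metadata fields are illustrative assumptions, not the PSIRP prototype API.

```python
from collections import defaultdict


class PageNetwork:
    """Hypothetical stand-in for the raw forwarding API."""

    def __init__(self):
        # Forwarding identifier -> handlers that listen on it.
        self._listeners = defaultdict(list)

    def listen(self, fid, handler):
        self._listeners[fid].append(handler)

    def forward(self, fid, metadata, data):
        # No error or congestion control: a page nobody listens for is
        # dropped silently, and nothing is retransmitted.
        handlers = self._listeners.get(fid, [])
        for handler in handlers:
            handler(metadata, data)
        return len(handlers)


def publish_as_pages(network, fid, payload, page_size=4096):
    """Map one publication to a sequence of pages and forward each page."""
    pages = [payload[i:i + page_size] for i in range(0, len(payload), page_size)]
    for seq, page in enumerate(pages):
        network.forward(fid, {"seq": seq, "total": len(pages)}, page)
    return len(pages)


if __name__ == "__main__":
    net = PageNetwork()
    received = {}

    def on_page(metadata, page):
        received[metadata["seq"]] = page

    net.listen("fid-1", on_page)
    count = publish_as_pages(net, "fid-1", b"0123456789", page_size=4)
    print(count, b"".join(received[i] for i in range(count)))
```

Reassembly, loss recovery, and ordering are left to the receiver here, which mirrors the point that higher-level models are built on top of this layer.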
APIs [Figure: API stack; labels: Higher-level APIs, Channel API, Memory Object API, Low-level page API]
Rendezvous • Many faces of rendezvous • Link-local, inter-domain, information services, communal services, … • The main requirements for this functionality are: • Scalability to Internet-like networks and data space sizes • Efficiency of operation, measured in signalling overhead and overall latency • Deployability • The network is defined in terms of domains and their interconnections • Interconnections between domains include upstream, transit, downstream • Rendezvous networks are units of deployment for the PSIRP rendezvous functionality. The rendezvous networks are formed by rendezvous nodes (RNs), organized as a BGP-like inter-domain hierarchy
Rendezvous Networks [Figure: rendezvous networks joined by an interconnection overlay; labels: RNA, RNB, RNC, RNB10, RNB100, RNC10, RNC100, RP, N, Sub, Pub, rendezvous signaling] Legend: RP: Rendezvous point • RNX: Rendezvous node • NX: End node • Sub: Subscriber • Pub: Publisher
Intra- and Inter-Domain Operation • Forwarding is configured by the rendezvous system with the help of the topology management function • Key challenge is scalability • Intra-domain forwarding • Initial work on a Bloom-filter-based forwarding mechanism indicates that it is useful for networks up to the size of a metropolitan area network • FIds identify partial distribution graphs • Minimal state in the routers • Inter-domain forwarding • Solution must take performance and policy requirements into account • Initial work on a DHT-based overlay and on understanding the implications for inter-domain routing • Inter-domain topology function
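One common way to realise Bloom-filter-based forwarding is to give every link a sparse bit pattern, OR the patterns of all links in the delivery tree into the forwarding identifier, and let each node forward on the links whose pattern is fully contained in it. The sketch below follows that general idea only; the filter width, the number of hash functions, the SHA-256-based link identifiers, and all function names are assumptions, not parameters from the PSIRP work.

```python
import hashlib

FILTER_BITS = 256   # assumed width of the in-packet filter
HASHES = 5          # assumed number of bits set per link


def link_id(name):
    """Derive a sparse bit pattern for a link from its (hypothetical) name."""
    pattern = 0
    for i in range(HASHES):
        digest = hashlib.sha256(f"{name}#{i}".encode()).digest()
        bit = int.from_bytes(digest[:4], "big") % FILTER_BITS
        pattern |= 1 << bit
    return pattern


def build_fid(tree_links):
    """Encode a partial distribution graph as the OR of its link patterns."""
    fid = 0
    for name in tree_links:
        fid |= link_id(name)
    return fid


def outgoing_links(fid, local_links):
    """Forwarding decision: the node only needs its own link patterns."""
    return [name for name in local_links if (link_id(name) & fid) == link_id(name)]


if __name__ == "__main__":
    fid = build_fid(["A->B", "B->C", "B->D"])
    # Node B checks each of its outgoing links against the packet's filter.
    print(outgoing_links(fid, ["B->C", "B->D", "B->E"]))
```

A link that is in the tree always matches, so there are no false negatives. A link outside the tree can match by chance, and such false positives become more likely as more links are added to the filter, which is consistent with the size limit mentioned above.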
Security and Packet Level Authentication (PLA) • Direct and indirect cryptographic association using identifiers and rendezvous • Public keys and one-way hash values • Algorithmic identifiers • Packet Level Authentication (PLA) is one candidate protocol for securing PSIRP network functions • We assume that per-packet public key cryptography operations are feasible at Internet scale because of new digital signature algorithms and advances in semiconductor technology • PLA is a novel solution for protecting the network infrastructure against various attacks (e.g., DoS) and thereby preserving availability • Each packet has a signature (ECC) • A good analogy for PLA is paper currency: anyone can verify the authenticity of a bill by using built-in security measures such as the watermark and hologram, without contacting the bank that issued it • The network should be able to fulfill its basic goal: to deliver valid packets of valid users in a reliable and timely manner in all situations
Prototype Implementation [Figure: implementation stack. Apps: PSIRP Apps, Legacy Apps (via Sockets API Emulator). Libraries: PSIRP Library API, upper layers (possibly Python). Kernel: PSIRP Kernel API, lower layers (C/C++), Packet Level Authentication (PLA). Underlying networks: Wi-Fi, GMPLS, Ethernet, IP]
Application Considerations • Application programming interfaces • Inherently information centric • 1-to-1 message stream • Similar to TCP/IP socket API • 1-to-N bidirectional connection • Similar to UDP over IP multicast • 1-to-N document distribution • Similar to CDN and P2P protocols • Examples • WWW functionality • BitTorrent-like content distribution
Example: Scoping in CS-comms. Using a scope to initiate client-server-like communications
Example: Invitation in CS-comms. Using an invitation to initiate client-server-like communications
Example: Adding a Node to Tree Simplified signaling for adding a node to a concast-like tree
Example: Interactions in the component wheel
Conclusions We have outlined an information-centric network architecture • Publish and subscribe are the basic primitives, making multicast the norm • Decoupling in space and time • Receiver-driven (subscriber has control) • Rendezvous as the primitive to connect publishers and subscribers across domains on multiple levels • Mapping to forwarding structures • Scoping of data into manageable sets • Architecture work is iterative • Implementation and evaluation are ongoing activities