Author: Pere Vilà
Supervisor: Josep Lluís Marzo
Departament d’Electrònica Informàtica i Automàtica

PhD Thesis: “Dynamic Management and Restoration of Virtual Paths in Broadband Networks based on Distributed Software Agents”
Universitat de Girona, May 2004

Acknowledgements
• This work has been partially supported by the Ministry of Science and Technology of Spain under contracts:
  • MCyT TIC2003-05567
  • MCyT TIC2002-10150-E
  • CICyT TEL99-0976
• And by the UdG research support fund:
  • UdG-DinGruRec2003-GRCT40

Contents
• Motivation
• Background
• Objectives
• Desired Characteristics
• Network Resource Management
• Proposed Dynamic Virtual Path Management Architecture based on Software Agents
• Analysis and Simulation Results
• Conclusions
• Future Work
• Related Publications

Motivation (1/5)
• BCDS group background at the start of this work:
  • Connection Admission Control in ATM
  • Routing and Multicast in ATM
  • Network Resource Management (NRM) in ATM networks
• Network Management
  • Deals with the proper utilisation of the network resources.
  • Objective: dispatch the maximum user traffic using the same network resources without service degradation.
  • The network technology should have the necessary mechanisms for resource reservations (ATM, MPLS, GMPLS).
• NRM at a packet / cell level
  • Buffer Management
  • Packet Scheduling
• NRM at a connection level
  • Bandwidth Management
  • Load Balancing

Motivation (2/5)
[Figure: Physical Network, Established Logical Paths, Logical or Virtual Network]
• Logical or Virtual Network concept:
  • Set of Logical Paths (LP) (allocated resources) that can be visualised as a virtual topology.
  • Constitutes a higher layer.
  • Independent of the physical network.
  • Users establish connections over this Logical Network.
  • Flexibility: it can be adapted as required.
• Advantages:
  • Allows the separation of services into different LPs.
  • Enables the establishment of Virtual Private Networks.
  • Facilitates several Network Management functions (e.g. fault protection).

Motivation (3/5)
• Example:
  [Figure: Physical Links, LP1, LP2 and User Connections over Node 1, Node 2 and Node 3]
• Need for adaptation. Nowadays this is performed in a centralised way:
  • Manually, by the human network managers.
  • Periodically, as an optimisation problem (network design given a set of constraints and traffic forecasts). For instance:
    • Morning / afternoon / night configurations
    • Every hour / day

Motivation (4/5)
Detected problems:
• Periodic Reconfigurations:
  • Reconfigurations do not coincide with the congestion episodes.
  • Predictions are difficult.
• Dynamic Reconfigurations:
  • Usually performed in a centralised way.
  • The volume of monitored information becomes a bottleneck.
  • Scalability problem.
• Dynamic Distributed Reconfigurations:
  • Lack of a Global Network View.
  • Sub-optimal solutions?

Motivation (5/5)
• Moreover:
  • There is a trend towards automating and distributing the network management functions.
  • Dynamic reconfigurations also have the problem of finding the right balance:
    • Reconfigurations that are too fast or too frequent may overwhelm network management.
  • Integration of the mechanisms that use the same resources (Logical Paths):
    • Dynamic Bandwidth Management
    • Fault Protection Mechanisms
• This is a complex distributed problem, and a natural area in which to use Software Agents.

Background
• We have focussed on the study of:
  • The Network Management Standards
    • OSI Management Framework
    • Telecommunications Management Network (TMN)
    • Simple Network Management Protocol (SNMP)
  • The characteristics of the Network Management Function Architectures
    • Centralised
    • Distributed / Local
    • Hybrid
  • Network Technologies with resource reservation mechanisms and the possibility of establishing a logical network
    • ATM
    • MPLS / GMPLS
  • Software Agents in Telecoms
    • Multi-Agent Systems
    • Mobile Agents

Network Management Standards
[Figure: OSI Reference Model. Labels: Manager (Client), NM Agent (Server), Operations, Notifications, Managed System, Management Information Base (MIB), Managed Objects (MO), Layer Managers (LM)]
• OSI Management Framework
  • Centralised
  • Use of standard protocols
• Telecommunications Management Network (TMN)
  • Use of an independent network
  • Hierarchical architecture
  • Use of OSI standards
  • Responsibility Model (layered)
• Simple Network Management Protocol (SNMP)
  • The most widely used (Internet)
  • Only defines protocols and MIBs
  • Centralised - Hierarchical

Network Management Function Architectures
Classification considering where the decision-making is placed:
• Centralised:
  • Global network view
  • Enables optimisation / planning
  • Enables a human interface
  • Low robustness
• Distributed:
  • No central manager
  • Decision-making equally distributed
  • No global network view
  • Fast response (short monitoring loop)
  • Robustness
  • Difficult human interaction
• Hybrid:
  • Combines centralised and distributed characteristics
Variants:
• Pure Centralised
  • Low scalability
• Hierarchical
  • High scalability
  • Delays
• Distributed without management centre
  • Collaboration
  • Scalability evaluation difficult
• Local
  • Use of local information only
  • High scalability
  • Limited to specific cases
• Distributed with management centre
  • Management by Delegation (MbD)
  • Mobile Agents
• Hierarchically distributed
• Distributed - Centralised

Network Technologies
[Figure: ATM example with a Physical Link carrying Cells labelled VPI=8 VCI=4 and VPI=9 VCI=5; MPLS example with Flows of IP packets over L2 carrying labels 17, 19, 25 and 27, grouped into LSP 1, LSP 2 and LSP 3]
• Need for mechanisms to establish Virtual or Logical Paths.
• Need for resource reservation mechanisms for these LPs.
• Asynchronous Transfer Mode (ATM)
  • An integrated services network
  • Fixed-length small packets (cells)
  • Two-level hierarchy:
    • Virtual Circuits (user connections)
    • Virtual Paths (constitute the Logical Network)
• Multi-Protocol Label Switching (MPLS) and Generalised MPLS (GMPLS)
  • Flexible approaches to deploy connection-oriented networks
  • Group user flows into Forwarding Equivalence Classes (FEC)
  • Label Switched Paths (LSP)

Software Agents in Telecoms
• Software agents: computer entities capable of acting autonomously, with a certain degree of expertise to deal with the external world and the ability to cooperate in some way with other agents.
• As network management is a complex distributed task, it is a natural area in which to apply software agents.
• Multi-Agent Systems (MAS)
  • Static MAS are suitable for most networks, but are usually used in reliable high-capacity core networks.
  • They usually have several types of agents.
• Mobile Agents
  • They can move between nodes and interact with the network element locally.
  • Suitable for networks with low throughput and/or availability.
  • Facilitate software upgrades and extensibility (service management).
• There are many software agent proposals in network management.

Multi-Agent Systems Examples
• HYBRID [Somers et al. 97]
  • Architecture based on a geographical hierarchical structure.
  • Upper layers delegate management to lower layers.
  • Uses many different complex agents performing all the management functions.
• Tele-MACS [Hayzelden 98]
  • Integration of reactive and planning agents in a layered structure.
  • Its main objective is the connection admission control function.
• IMPACT [Luo et al. 99]
  • Presents a multilayered structure of several types of agents.
  • Its objectives include admission control, routing, multiple service provider support, etc. It was tested on a real ATM test-bed.
• Others (reference: description):
  • [Eurescom P712 URL] [Corley et al. 2000]: Project P712, called “Intelligent and mobile agents and their applicability to service and network management”. Case studies including the investigation into how agents could enable customers, service providers and network providers to negotiate automatically for services and network resources.
  • [Bodanese and Cuthbert, 1999]: Resource management of mobile telephony networks. Use of hybrid agents for a distributed channel allocation scheme.
  • [Gibney et al., 1999]: Market-based call routing based on self-interested agents representing network resources without assuming a priori co-operation.
  • [Willmott and Faltings, 2000]: On-line QoS call routing in ATM networks. Hierarchical structure with a controller agent responsible for resource allocation decisions in disjoint local regions of the network.

Mobile Agents Examples
• Swarm Intelligence [Bonabeau et al. 99]
  • Systems inspired by the biological behaviour of social insects (e.g. ants, bees).
  • Ant Colonies for network routing problems [Di Caro and Dorigo 97] [Schoonderwoerd et al. 96].
• JAMES [Silva et al. 99]
  • Mobile agent platform (infrastructure) for network management.
  • Mobile agents start on the central manager, migrate through the network nodes and finally return to the central manager.
  • Each mobile agent has a specific itinerary and a set of missions.
• Others (reference: description):
  • [Halls and Rooney, 1998]: ATM switch control, connection admission control, similar to MbD.
  • [Gavalas et al., 2000]: Mobile agents and SNMP integration, SNMP table views, polling and filtering.
  • [White et al., 1998a] [White et al., 1998b]: Network modelling, fault management, Mobile Agent framework and architecture.
  • [Caripe et al., 1998]: Network-awareness applications, network monitoring.
  • [Sahai and Morin, 1999]: Mobile Agent framework targeted to mobile user applications where users are intermittently connected to the network through unreliable or expensive connections.
  • [Bohoris et al., 2000]: Dynamic service management and reconfiguration, focusing on service performance and fault tolerance management, among others.
  • [Du et al., 2003]: Framework for Mobile Agent-based distributed network management with a high integration with SNMP.

Proposal: Objectives / Contribution
• After the study we could conclude that:
  • Most of the proposed MAS are too complex and / or have scalability problems.
  • Other mechanisms focus only on a single NRM function.
• Proposal: a dynamic network resource management architecture for broadband core networks.
  • Coordination / integration of the main NRM mechanisms which make use of LPs.
  • Control of the number of reconfigurations. Reconfiguration is only necessary when a problem is detected.
  • Lightweight monitoring.
  • Monitoring, and the decision whether or not to change an LP, can be naturally distributed over the network nodes (~ distributed architecture).
  • The scenario suggests the use of static agents. Moreover, the use of Software Agents can also be seen as a design metaphor.

Proposal: Desired Characteristics
• Modularity, in the sense that such a mechanism can be activated or deactivated without disrupting the normal network operation, and also in the sense that this mechanism can be deployed only in several sections of a network.
• Robustness, in the sense that if part of the system crashes it does not take down the whole system.
• Scalability, i.e. when the network grows the architecture must not degrade its operation.
• Independence, in the sense that the system should not interfere with other network management systems.
• Simplicity / Flexibility, enabling easy updating and upgrading, and not representing too much load for the network elements.

Proposal: Network Resource Management
[Figure: Time Scale (ms, sec., min., hours, days) relating the User Pool, Connection Admission Control, Routing, Network Resource Management (Multi-Agent System) and Network Provisioning through the numbered interactions below]
  1) Connection demand
  2) Connection accepted or rejected
  3) Network performance problems
  4) Virtual Network reconfiguration
  5) Detected physical network bottlenecks
  6) Physical network upgraded
• Focus on the Logical Network Management:
  • The adaptation of the Virtual or Logical Paths
  • Short- to mid-term process
• Main tasks we consider:
  • Focus on the established LPs
  • Dynamic Bandwidth Management
  • Fault Protection
  • Spare Capacity Management
• Tasks we do not consider:
  • Initial Logical Network design
  • The establishment / release of LPs

Proposal: Dynamic Bandwidth Management
• Maximise the resource utilisation - minimise the Connection Blocking Probability (CBP).
• Two actions can usually be performed:
  • Re-allocation of bandwidth between LPs. Increase LP1 by:
    • Using available bandwidth.
    • Using underused bandwidth already assigned to other LPs (pre-emption).
  • Re-routing of LPs.
[Figure: two six-node networks (nodes 1 to 6) with LP1, LP2 and LP3, one for each action]

Proposal: Fault Protection and Spare Capacity
• Pre-planned mechanisms (fast mechanisms):
  • Establishment of alternative LPs (backup LPs) which become active in case of a failure (resource consuming).
  • Focus on end-to-end protection (physically disjoint LPs).
• Spare Capacity Management:
  • Minimisation of the bandwidth reserved for protection purposes.
  • Setting a desired protection level (e.g. against a single link failure at a time).
  • Setting LP priorities.
  • Sharing the capacity reserved for protection.
[Figure: Working LP1 and Working LP2 with their Backup LP1 and Backup LP2, and a link failure]
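The benefit of sharing the capacity reserved for protection can be illustrated with a minimal sketch. It assumes protection against one link failure at a time, so the backup LPs of working LPs that cannot fail together may share the same spare bandwidth. The names `LogicalPath` and `spare_capacity`, the link labels and the example values are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalPath:
    name: str
    bandwidth: int
    working_links: frozenset  # links used by the working LP
    backup_links: frozenset   # links used by its physically disjoint backup LP


def spare_capacity(lps, shared=True):
    """Spare bandwidth to reserve on each link for the backup LPs."""
    if not shared:
        # Dedicated protection: every backup LP keeps its own reservation.
        dedicated = defaultdict(int)
        for lp in lps:
            for link in lp.backup_links:
                dedicated[link] += lp.bandwidth
        return dict(dedicated)

    # needed[link][failed] = backup bandwidth activated on `link`
    # when the link `failed` goes down.
    needed = defaultdict(lambda: defaultdict(int))
    for lp in lps:
        for failed in lp.working_links:
            for link in lp.backup_links:
                needed[link][failed] += lp.bandwidth
    # Assumption: a single link fails at a time, so reserve the worst case.
    return {link: max(per_failure.values()) for link, per_failure in needed.items()}


lps = [
    LogicalPath("LP1", 4, frozenset({"1-2"}), frozenset({"1-3", "3-2"})),
    LogicalPath("LP2", 3, frozenset({"1-4"}), frozenset({"1-3", "3-4"})),
]
print(spare_capacity(lps, shared=False)["1-3"], spare_capacity(lps)["1-3"])
```

In this example both backup LPs cross link "1-3", but their working LPs use different links, so the shared reservation on "1-3" is the larger of the two bandwidths rather than their sum.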

Proposal: Architecture
• Two-level agent hierarchy with two types of software agents:
  • Network Monitoring (M) agents.
  • Network Performance (P) agents.
• They are situated at the network nodes:
  • One P agent at every node.
  • Several M agents at every node (at the initial nodes of the LPs).
• M agents are subordinated to P agents.
[Figure: P agent <Physical Links> <Node Control> <Partial Network View>; M agents <LP4> <bLP4>, <LP2> and <LP5>; Logical Paths LP 1 to LP 5 and Backup LP 4 over the Physical Links; Monitoring]

Proposal: Agent Distribution
• Proximity to the managed elements.
• Distribution of the processing load.
• Communication between the Software Agents and the Node Control System:
  • Direct communication through an Application Programming Interface (API).
  • Through an SNMP Agent (this allows the placement of the Software Agents outside the network element).
    • Enables the establishment of an independent control plane.
    • Sometimes a network element could have limited processing power.
[Figure: P-Agent and its M agents connected to the Node Control System through the API]

Proposal: Agent Distribution (continued)
[Figure: P-Agent and its M agents connected to the Node Control System through an SNMP Agent and its MIB]
[Figure: Control Plane with one P-Agent and several M agents per node; Transport Plane with one SNMP Agent and MIB per node]

Proposal: Monitoring (M) Agents
• Simple reactive agents with a stimulus/response behaviour.
• There is one M agent per unidirectional working LP, and they are under the supervision of the P agents.
• Each one controls a single LP at its origin (and its backup LP):
  • Detecting congestion through periodic monitoring.
  • The decision of considering the LP congested or not can be made using several mechanisms.
• Optionally:
  • They implement the switchover mechanism.
  • They co-ordinate a bidirectional communication.
• Lightweight monitoring
  • The monitored data should be a few simple parameters.
• Lightweight processes (threads)
  • Usually the M agents’ execution is halted, and they are only resumed when the monitoring period expires or when a failure is detected.
[Figure: M agent <LP5> monitoring LP 5]

Proposal: Performance (P) Agents
[Figure: P agent <Physical Links> <Node Control> <Partial Network View> with its M agents and a Physical Link]
• More complex collaborative agents.
• There is one P agent per network node.
• Their main objective is to maximise the utilisation of the bandwidth of the links they directly control.
  • Minimise the Connection Blocking Probability (CBP) for all the LPs starting at every particular node.
  • Check the different possibilities to increase a congested LP and decide the best action to take (by itself or in collaboration with other P agents).
• To perform its tasks, a P agent:
  • Controls the node status and the outgoing physical links.
  • Keeps track of the transit LPs.
  • Maintains the status of reserved and available bandwidth on the links.
  • Maintains a “Partial Network View”.
    • The P agents’ decisions are based on it.

Proposal: Collaboration and Partial Network View
• P agents’ decisions (distributed architecture) can be based on:
  • Local information only
    • Limited
    • High scalability
  • Whole network view
    • Powerful decisions
    • Low scalability
• In order to find a trade-off between these two options we choose that:
  • P agent communications are restricted to its physical neighbours:
    • Use of a signalling type of communication (hop by hop).
    • Direct collaboration is also restricted to the neighbours, but indirect collaboration with farther P agents is possible.
  • The P agents’ decisions are based on a Partial Network View:
    • Information which the P agent is sure about (directly controlled by itself).
    • Information that could be out of date, received from other P agents.
    • Asynchronously updated only when P agents exchange messages.
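A minimal sketch of such a Partial Network View is given below. It separates the information the P agent controls directly from the possibly outdated information learnt from other P agents, and it is only updated when a message arrives. The class names, the per-link state and the `reported_at` logical timestamp used to discard older reports are assumptions made for illustration; the slides do not specify how stale information is detected.

```python
from dataclasses import dataclass, field


@dataclass
class LinkState:
    available_bw: int
    reported_at: int  # hypothetical logical time of the report by the owning P agent


@dataclass
class PartialNetworkView:
    node: str
    own: dict = field(default_factory=dict)     # link -> LinkState, first-hand and certain
    remote: dict = field(default_factory=dict)  # link -> LinkState, possibly out of date

    def snapshot(self):
        """View attached to an outgoing message: own data plus what was learnt."""
        return {**self.remote, **self.own}

    def merge(self, received):
        """Update the view; called only when a message from a neighbour arrives."""
        for link, state in received.items():
            if link in self.own:
                continue  # never overwrite first-hand information
            known = self.remote.get(link)
            if known is None or state.reported_at > known.reported_at:
                self.remote[link] = state


pa1 = PartialNetworkView("Node 1", own={"1-2": LinkState(6, 10)})
pa2 = PartialNetworkView(
    "Node 2",
    own={"2-3": LinkState(4, 12)},
    remote={"1-2": LinkState(9, 3), "3-4": LinkState(2, 5)},
)
pa1.merge(pa2.snapshot())  # Pa1 receives a message from its neighbour Pa2
print(sorted(pa1.snapshot()))
```

After the exchange, Pa1 knows about links "2-3" and "3-4" without having contacted Node 3 directly, which is the indirect collaboration mentioned above, while its own entry for "1-2" is left untouched.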

Proposal: Partial Network View Example
• Asynchronous “Partial Network View” update:
  [Figure: message exchange between Pa1 (Node 1) and Pa2 (Node 2). Labels: Message from Pa1 to Pa2, Partial Network View of Pa1, Requested Action, Message from Pa2 to Pa1, Partial Network View of Pa2, Response]
• “Partial Network View” examples:
  [Figure: two views of a five-node network (Node 1 to Node 5) with LP 1 to LP 8, one for Pa1 and one for Pa5]

Proposal: Specific Mechanisms
• In order to test the proposed architecture, the required resource management mechanisms also have to be implemented.
• We propose several mechanisms with the aim of simplicity and scalability.
• Monitoring and congestion detection (Triggering Functions):
  • Rejected
  • CBP30
  • Load
• Bandwidth reallocation algorithms:
  • Free Bandwidth Only (FBO)
  • First Node Only (FNO)
  • Any Logical Path (ALP)
• Rerouting algorithms.
• Protection and Spare Capacity management.

Proposal: Monitoring and Congestion Detection
• This is the main task of the M agents. The important factors are:
  • The monitoring period (not too fast, not too slow).
  • How to decide whether an LP is congested.
  • The amount by which to increase the bandwidth of a congested LP (the “step size”).
• The monitored parameters for each LP are:
  • The number of Offered Connections (OC).
  • The number of Rejected Connections (RC).
  • The LP Load (L), as a percentage of the bandwidth allocated to connections.
• The Triggering Function (TF) is the mechanism that decides whether an LP is congested or not. We propose 3 simple TFs:
  • Rejected

    t   OC(t)   RC(t)   Counter(t)   Rejected(t, limit=5)
    0    0       0       0           LP OK
    1   15       2       2           LP OK
    2   22       3       3           LP OK
    3   34       3       0           LP OK
    4   48       6       3           LP OK
    5   62       9       6           LP Congested

  • CBP30
    [Figure: window of 30 connection slots numbered 1 to 30, three of them marked 1 and the rest 0]
    3/30 = 10% → LP Congested
  • Load
    90% allocated → LP Congested
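The three Triggering Functions can be sketched as follows. The slides only show the example values, so several details are assumptions and are marked as such in the comments: that the Rejected counter is reset by a monitoring period without new rejections and triggers when it exceeds the limit (this reproduces the Counter(t) column of the table), that CBP30 looks at the last 30 offered connections with a 10% limit, and that Load triggers at 90%. The class and function names are hypothetical.

```python
from collections import deque


class RejectedTF:
    """Rejected(t, limit): accumulates rejections over consecutive periods."""

    def __init__(self, limit=5):
        self.limit = limit
        self.counter = 0
        self.last_rc = 0

    def update(self, rc_total):
        new_rejections = rc_total - self.last_rc
        self.last_rc = rc_total
        # Assumption: a period without new rejections resets the counter.
        self.counter = self.counter + new_rejections if new_rejections > 0 else 0
        # Assumption: congestion is declared when the counter exceeds the limit.
        return self.counter > self.limit


class CBP30TF:
    """CBP over a sliding window of offered connections (assumed: the last 30)."""

    def __init__(self, limit_percent=10, window=30):
        self.limit_percent = limit_percent
        self.outcomes = deque(maxlen=window)  # 1 = rejected, 0 = accepted

    def offer(self, rejected):
        self.outcomes.append(1 if rejected else 0)
        # Integer comparison of rejected/window against the limit.
        return 100 * sum(self.outcomes) >= self.limit_percent * self.outcomes.maxlen


def load_tf(allocated_bw, lp_bw, limit_percent=90):
    """Load: congested when the allocated share of the LP reaches the limit."""
    return 100 * allocated_bw >= limit_percent * lp_bw


# Replaying the RC(t) column of the table above.
tf = RejectedTF(limit=5)
history = [(tf.update(rc), tf.counter) for rc in (0, 2, 3, 3, 6, 9)]
print(history)
```

Only small counters and a short window are kept per LP, which is in line with the lightweight monitoring that the M agents are meant to perform.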

Proposal: Bandwidth Reallocation
• When an LP is detected as congested, the P agent has to try to increase its initial bandwidth while avoiding re-routing.
• Messages are sent from P agents to their physical neighbours.
• We propose three mechanisms:
  • Free Bandwidth Only (FBO)
    • The only possibility is to try to increase the congested LP using available bandwidth from the physical links.
  • First Node Only (FNO)
    • This case is similar to the previous one but, only at the LP origin node, it is also possible to decrease the bandwidth of another LP starting at the same node and use it. The bandwidth of an LP can only be decreased if it is unused and if this is allowed (pre-emption).
  • Any LP (ALP)
    • It is possible to use the unused bandwidth of any other LP whose route partially coincides with that of the congested LP.
[Figure: LP 1, LP 2, LP 3 and LP 4 over Node 1, Node 2, Node 3 and Node 4]
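The first two mechanisms can be sketched as follows. This is a centralised simplification for illustration only: in the proposed architecture each P agent checks its own outgoing link and the request travels hop by hop. The `LP` record, the link labels and the rule that a donor LP must leave the origin node through the same first link are assumptions, and ALP is not covered.

```python
from dataclasses import dataclass


@dataclass
class LP:
    name: str
    route: list      # ordered links from the origin node, e.g. ["1-2", "2-3"]
    bandwidth: int   # bandwidth currently allocated to the LP
    used: int        # bandwidth occupied by user connections
    preemptible: bool = True


def fbo(lp, free_bw, step):
    """Free Bandwidth Only: grow `lp` by `step` if every link has room."""
    if any(free_bw[link] < step for link in lp.route):
        return False
    for link in lp.route:
        free_bw[link] -= step
    lp.bandwidth += step
    return True


def fno(lp, others, free_bw, step):
    """First Node Only: as FBO, plus pre-emption on the first link only."""
    if fbo(lp, free_bw, step):
        return True
    first = lp.route[0]
    if any(free_bw[link] < step for link in lp.route[1:]):
        return False  # pre-emption is not allowed beyond the origin node
    shortage = step - free_bw[first]
    donors = [o for o in others
              if o.preemptible and o.route[0] == first and o.bandwidth > o.used]
    if sum(o.bandwidth - o.used for o in donors) < shortage:
        return False
    for donor in donors:
        take = min(donor.bandwidth - donor.used, shortage)
        donor.bandwidth -= take
        for link in donor.route:  # the donor releases bandwidth along its route
            free_bw[link] += take
        shortage -= take
        if shortage == 0:
            break
    return fbo(lp, free_bw, step)


free_bw = {"1-2": 1, "2-3": 5}
lp1 = LP("LP1", ["1-2", "2-3"], bandwidth=4, used=4)
lp2 = LP("LP2", ["1-2"], bandwidth=6, used=2)
print(fbo(lp1, free_bw, 3), fno(lp1, [lp2], free_bw, 3))
```

In the example FBO fails because link "1-2" has only 1 free unit, while FNO succeeds by taking 2 unused units from LP2, which starts at the same node.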

Proposal: LP Rerouting (1/2)
• Not possible to use the Bandwidth Reallocation algorithms? Try rerouting the congested LP.
• There is no need for fast re-routing: the LP already exists.
• It is possible to use any existing routing algorithm, e.g. a constraint-based routing algorithm.
• The selected routing algorithm is applied at every node in a hop-by-hop manner to select just the next hop.
  • This routing algorithm uses the Partial Network View.
  • The calculated route and the Partial Network View are forwarded to the next hop.
[Figure: five-node network (Node 1 to Node 5) with LP1 congested; new route for LP1 calculated by Pa1]

Pa5 M Proposal: LP Rerouting (1/2) Contents Motivation Background Objectives Desired Characteristics Network Resource Management Proposed Architecture Specific Mechanisms Monitoring & Congestion Detection Bandwidth Reallocation LP Rerouting Restoration & Spare Capacity Man. Putting Everything Together Analysis & Simulations Conclusions Future Work Related Publications • Not possible to use the Bandwidth Reallocation algorithms? Try rerouting the congested LP. • No need for a fast re-routing. The LP already exists. • It is possible to use any existing routing algorithm, e.g. a constraint based routing algorithm. • The selected routing algorithm is applied on every node in a hop-by-hop manner to select just the next hop. • This routing algorithm use the Partial Network View • The calculated route and the Partial Network View is forwarded to the next hop LP1 Congested Node 3 Node 1 Node 2 Node 5 Node 4 New route for LP1 recalculated by Pa5

Pa2 M Proposal: LP Rerouting (1/2) Contents Motivation Background Objectives Desired Characteristics Network Resource Management Proposed Architecture Specific Mechanisms Monitoring & Congestion Detection Bandwidth Reallocation LP Rerouting Restoration & Spare Capacity Man. Putting Everything Together Analysis & Simulations Conclusions Future Work Related Publications • Not possible to use the Bandwidth Reallocation algorithms? Try rerouting the congested LP. • No need for a fast re-routing. The LP already exists. • It is possible to use any existing routing algorithm, e.g. a constraint based routing algorithm. • The selected routing algorithm is applied on every node in a hop-by-hop manner to select just the next hop. • This routing algorithm use the Partial Network View • The calculated route and the Partial Network View is forwarded to the next hop LP1 Congested Node 2 Node 3 Node 1 Node 5 Node 4 It is not possible to follow any path form Node 2

Pa4 M Proposal: LP Rerouting (1/2) Contents Motivation Background Objectives Desired Characteristics Network Resource Management Proposed Architecture Specific Mechanisms Monitoring & Congestion Detection Bandwidth Reallocation LP Rerouting Restoration & Spare Capacity Man. Putting Everything Together Analysis & Simulations Conclusions Future Work Related Publications • Not possible to use the Bandwidth Reallocation algorithms? Try rerouting the congested LP. • No need for a fast re-routing. The LP already exists. • It is possible to use any existing routing algorithm, e.g. a constraint based routing algorithm. • The selected routing algorithm is applied on every node in a hop-by-hop manner to select just the next hop. • This routing algorithm use the Partial Network View • The calculated route and the Partial Network View is forwarded to the next hop LP1 Congested Node 3 Node 1 Node 2 Node 5 Node 4 New route for LP1 calculated by Pa4

Proposal: LP Rerouting (1/2), final animation step
[Figure: same five-node network, LP1 congested; agent labels Pa1 and M. Final new route for LP1.]
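The hop-by-hop procedure on this slide can be sketched as follows. This is only an illustration under stated assumptions, not the thesis implementation: the function names, the dictionary used for the Partial Network View, and the example data are hypothetical, and a plain breadth-first search over links with enough free bandwidth stands in for "any existing routing algorithm". The sketch also gives up when a node knows no feasible continuation; it does not model what the agents do after the "not possible to follow any path" step shown in the figure.

```python
from collections import deque


def shortest_feasible_path(view, src, dst, required, visited):
    """BFS over links whose known free bandwidth covers `required`.

    `view` maps a directed link (a, b) to its known free bandwidth.
    Stand-in for any constraint-based routing algorithm.
    Returns a node list from src to dst, or None if no path is known.
    """
    queue = deque([[src]])
    seen = {src} | set(visited)
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for (a, b), free in sorted(view.items()):
            if a == node and b not in seen and free >= required:
                seen.add(b)
                queue.append(path + [b])
    return None


def reroute_hop_by_hop(local_views, src, dst, required):
    """Each node selects only the next hop, then forwards route and view."""
    route = [src]
    carried_view = {}
    node = src
    while node != dst:
        # Merge the view received so far with this node's own knowledge;
        # local entries override older, possibly stale, carried entries.
        carried_view = {**carried_view, **local_views.get(node, {})}
        path = shortest_feasible_path(
            carried_view, node, dst, required, route[:-1]
        )
        if path is None:
            return None  # no feasible continuation known at this node
        node = path[1]   # keep only the next hop of the computed path
        route.append(node)
    return route
```

In this sketch a stale entry held by the first node can be corrected by a later node: if node 1 believes link (2, 3) is free but node 2 knows it is full, node 2 recomputes the remainder of the route from its own, fresher view.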

Proposal: LP Rerouting (2/2)
• However, we propose a simple routing algorithm that takes into account that the Partial Network View could be incomplete and/or out of date.
• First, all the possible routes with enough resources for an enlarged LP are listed.
• Second, this list is ordered by giving a weight to each route. The best is selected.
• The route weight is a weighted mean that balances the importance of the route length and the available bandwidth on the route.
• The farther the link, the greater the possibility of having incorrect information in the Partial Network View, so a link's influence decreases with its distance.
[Figure: five-node example network, physical link bandwidth = 10 units, LP1 congested; LP labels in the figure: 7 units, 3 units, 2 units, 5 units, 4 units.]
Alternative routes for LP1 (5 units), both weighting factors equal to 1, ordered from larger to smaller weight:
  Best:   Ra = 1 2 4 3,   hops (H) = 3, per-link values 10, 8, 10
  Second: Rb = 1 2 5 4 3, hops (H) = 4, per-link values 10, 10, 6, 10
  Third:  Rc = 1 5 4 3,   hops (H) = 3, per-link values 5, 6, 10
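The two-step selection (filter the feasible routes, then order them by weight) can be sketched as below. The slide does not give the weight formula, so the one used here is an assumption made for illustration only: a weighted mean of a length term (1/H) and a bandwidth term in which link i, counted from the origin, is discounted by 1/i. The function and parameter names are hypothetical, and the per-link values of the example are assumed to be available bandwidth.

```python
def route_weight(link_bw, link_capacity, w_len=1.0, w_bw=1.0):
    """Assumed weight: shorter routes and more free bandwidth score higher.

    link_bw lists the available bandwidth of each link, nearest link first.
    Link i (1-based) is discounted by 1/i, so distant and possibly stale
    information counts less. Illustrative formula, not the thesis's.
    """
    hops = len(link_bw)
    length_term = 1.0 / hops
    discounts = [1.0 / i for i in range(1, hops + 1)]
    bw_term = sum(d * bw for d, bw in zip(discounts, link_bw)) / (
        sum(discounts) * link_capacity
    )
    return (w_len * length_term + w_bw * bw_term) / (w_len + w_bw)


def rank_routes(routes, required_bw, link_capacity):
    """Keep routes whose every link can carry required_bw, best first."""
    feasible = {
        name: bws for name, bws in routes.items() if min(bws) >= required_bw
    }
    return sorted(
        feasible,
        key=lambda name: route_weight(feasible[name], link_capacity),
        reverse=True,
    )


# Values read from the slide's example (10-unit links, LP1 needs 5 units).
routes = {"Ra": [10, 8, 10], "Rb": [10, 10, 6, 10], "Rc": [5, 6, 10]}
print(rank_routes(routes, required_bw=5, link_capacity=10))
```

With both weighting factors equal to 1, this assumed formula reproduces the ordering Ra, Rb, Rc given on the slide: the four-hop route Rb ranks above the three-hop route Rc because Rc's nearest links have little free bandwidth.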

Proposal: Preplanned Restoration & Spare Capacity
• Preplanned Restoration:
[Figure: nodes 1 to 5; a link failure affects LP1 and LP2; LinkFailed messages are shown for LP1 and for LP2, together with Backup LP1 and Backup LP2.]
• Spare Capacity Management:
[Figure: LP-1, LP-2 and LP-3.]
  Cannot share: backup LP1 and backup LP2.
  Can share: backup LP1 with backup LP3; backup LP2 with backup LP3.
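The sharing rule illustrated on this slide can be sketched as follows, assuming only single link failures are considered: two backup LPs can share spare capacity when their working LPs have no link in common, so that one failure never activates both. The function names, the example routes, and the bandwidth figures are hypothetical; the slide does not give the routes of LP1, LP2 and LP3.

```python
def links_of(path):
    """Undirected links traversed by a node path such as [1, 2, 3]."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def can_share_spare(working_a, working_b):
    """Backups may share spare capacity if no single link failure can
    hit both working LPs, i.e. the working paths share no link."""
    return not (links_of(working_a) & links_of(working_b))


def spare_needed(link_backups, working_paths, demand):
    """Spare bandwidth to reserve on one link carrying the given backups.

    For each possible single link failure, add up the backups it would
    activate; the reservation is the worst case over all failures.
    """
    failures = set().union(
        *(links_of(working_paths[lp]) for lp in link_backups)
    )
    return max(
        (
            sum(
                demand[lp]
                for lp in link_backups
                if f in links_of(working_paths[lp])
            )
            for f in failures
        ),
        default=0,
    )


# Hypothetical working routes: LP1 and LP2 both use link 2-3, LP3 does not.
working = {"LP1": [1, 2, 3], "LP2": [5, 2, 3], "LP3": [5, 4]}
demand = {"LP1": 4, "LP2": 3, "LP3": 5}
```

With these example values, a link carrying all three backups would need 7 units of spare capacity (the failure of link 2-3 activates the backups of LP1 and LP2) rather than the 12 units required without sharing.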

Proposal: Putting Everything Together
• Dynamic bandwidth management clearly needs to be coordinated with the protection/restoration mechanisms.
• It is proposed that the P agents follow several coordination rules:
• Bandwidth management:
  • Decreasing the bandwidth assigned to an LP is always possible and straightforward. However, when decreasing a backup LP, it must be checked whether the spare bandwidth is shared or not.
  • When an LP is increased, its backup LP has to be increased accordingly. We choose to start the increase procedure for both simultaneously and, if either of them cannot be increased by any means, to abort the procedure.
  • Due to the difficulties of a distributed scenario, we choose not to allow rerouting of the backup LPs. Rerouting is only allowed for the working LPs, always through link-disjoint paths.
• Link failure:
  • When a backup is used, it is considered a special situation, and none of the affected LPs and backup LPs can be increased, decreased, or rerouted.
  • This also saves coordination messages between P agents, so that they can devote all their attention to the failure situation.
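The coordination rules above can be read as a small set of guards around each operation. The sketch below is one possible reading, not the agents' actual code: the `LP` class, its fields, and the function names are hypothetical, `max_bandwidth` stands in for whatever free capacity a real agent would negotiate along the path, and the increase is checked up front rather than started on both LPs in parallel. Keeping the reservation of a shared backup untouched on a decrease is also an assumption; the slide only says that sharing must be checked.

```python
from dataclasses import dataclass


@dataclass
class LP:
    name: str
    bandwidth: float
    max_bandwidth: float   # hypothetical ceiling standing in for free capacity
    is_backup: bool = False
    frozen: bool = False   # True while a failure affects this LP


def increase_pair(working, backup, step):
    """Increase working and backup LP together, or change neither."""
    if working.frozen or backup.frozen:
        return False
    for lp in (working, backup):
        if lp.bandwidth + step > lp.max_bandwidth:
            return False   # abort: one of the two cannot grow
    working.bandwidth += step
    backup.bandwidth += step
    return True


def decrease(lp, step, spare_is_shared=False):
    """Decrease unless frozen; a shared backup keeps its reservation."""
    if lp.frozen:
        return False
    if lp.is_backup and spare_is_shared:
        return False       # assumed policy: others may rely on the spare
    lp.bandwidth = max(0.0, lp.bandwidth - step)
    return True


def may_reroute(lp):
    """Only working LPs not involved in a failure may be rerouted."""
    return not lp.is_backup and not lp.frozen


def on_link_failure(affected):
    """Freeze every affected working and backup LP."""
    for lp in affected:
        lp.frozen = True
```

Treating the failure state as a single `frozen` flag mirrors the rule that affected LPs can be neither resized nor rerouted, which is also what keeps the agents from exchanging coordination messages during the failure.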

Analysis and Simulations
• Comparison with other proposals based on Software Agents is difficult:
  • Lack of time to implement them.
  • Few details in the literature.
• Focus on testing the proposed mechanisms and the coordination rules.
• The offered connections are designed to cause congestion.
• Main parameters evaluated:
  • Connection Blocking Probability (CBP).
  • Number of P agent messages.
• We used several network topologies, depending on the test.

Simulation Development
• Simulation platform:
  • Distributed connection-level simulation (client/server, C++).
• Software Agents implementation:
  • Distributed agents (Java).
[Figure: Multi-Agent System processes (Java), with P-Agents (clients) and their M agents; Simulator processes (C++), with Node Emulators (servers) and Traffic Event Generators (clients) forming the simulated network. Connection labels: Java RMI; client/server socket.]

Exp. 1: Monitoring and Congestion Detection (1/2)
• Test of the Triggering Functions' behaviour:
  • Detection of congestion situations (function thresholds).
  • Evaluation for different monitoring periods.
  • Evaluation for different 'step sizes'.
• The simulated network is very simple:
  • The number of offered connections increases during the simulation time.
  • An LP bandwidth increase is always possible.
  • Homogeneous and heterogeneous connections.
[Figure: nodes 1 and 2, LP1, 200 Mbps.]
• Combination of many parameters:
  Parameter: Values
  Monitoring Period (s): 2, 5, 10, 20
  Step Size (Kbps): 500, 1000, 2000, 4000
  Rejected Limit (#connections): 1, 3, 5, 7
  CBP Limit (%): 10, 30, 50, 70
  Load Limit (%): 85, 90, 95, 99
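To make the role of the thresholds and the step size concrete, the sketch below shows three threshold-based triggers evaluated once per monitoring period. The exact definitions of the triggering functions are not given on this slide, so the semantics here are assumptions inferred from the parameter names (Rejected Limit, CBP Limit, Load Limit); the `PeriodStats` class and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PeriodStats:
    """Counters collected over one monitoring period (assumed fields)."""
    offered: int        # connection requests seen in the period
    rejected: int       # requests rejected in the period
    used_kbps: float    # bandwidth in use at the end of the period
    lp_kbps: float      # bandwidth currently assigned to the LP


def rejected_trigger(s, limit):
    """Congestion if at least `limit` connections were rejected."""
    return s.rejected >= limit


def cbp_trigger(s, limit_pct):
    """Congestion if the period's blocking probability reaches the limit."""
    if s.offered == 0:
        return False
    return 100.0 * s.rejected / s.offered >= limit_pct


def load_trigger(s, limit_pct):
    """Congestion if the LP load reaches the given percentage."""
    return 100.0 * s.used_kbps / s.lp_kbps >= limit_pct


def next_bandwidth(s, trigger, limit, step_kbps):
    """Grow the LP by one step when the chosen trigger reports congestion."""
    return s.lp_kbps + step_kbps if trigger(s, limit) else s.lp_kbps
```

For example, `next_bandwidth(stats, load_trigger, 95, 1000)` would add one 1000 Kbps step whenever the measured load is at or above 95 %. A shorter monitoring period calls this check more often, while a larger step size corrects congestion in fewer steps.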

Exp. 1: Monitoring and Congestion Detection (2/2)
• Results grouped by monitoring period and by step size:
[Figures: two plots for the homogeneous case and two plots for the heterogeneous case.]

Experiment 2: Bandwidth Reallocation
• As expected, in stationary situations there are few bandwidth reallocations, although congestion is detected and the P agents keep trying to reallocate bandwidth.
• This test examined how well the bandwidth reallocation mechanisms adapt the logical network when the level of offered connections changes over time.
• 7 edge nodes and 8 core nodes. 24 unidirectional LPs. 1 hour simulation.
[Figure: 15-node network topology (nodes 1 to 15).]
  Algorithm: # messages
  Free Bandwidth Only: 5478
  First Node Only: 13339
  Any Logical Path: 21402

Experiment 3: Logical Path Rerouting
• Comparison of a bandwidth reallocation algorithm (ALP) with and without rerouting.
• After the bandwidth reallocation mechanism initially adapts the logical network, the level of offered connections changes over time and several congested LPs are rerouted.
• 7 edge nodes and 8 core nodes. 34 unidirectional LPs. 1 hour simulation.
• Fewer offered connections than in the other experiments.
• The results are grouped by origin node.
[Figure: 15-node network topology (nodes 1 to 15).]
  Situation: # messages
  ALP + Rerouting: 1858
  ALP Only: 3814

Experiment 4: Putting Everything Together
• Finally, we performed several simulations in which the bandwidth reallocation and rerouting mechanisms were tested together with the restoration and spare capacity management mechanisms.
• The focus of these simulations was to test the coordination rules for the P agents.
• 7 edge nodes and 8 core nodes. 34 unidirectional LPs + 34 backup LPs. 1 hour simulation.
• Simulation of a link failure.
• Coordination:
  • The bandwidth mechanisms increase or decrease the working and backup LPs accordingly.
  • Rerouting is only allowed for the working LPs.
  • When a failure occurs and the backup is active, bandwidth changes to the affected LPs are not allowed.
[Figure: 15-node network topology (nodes 1 to 15).]
