Dynamic routing – QoS routing • Other approaches to QoS routing • Traffic Engineering • Practical Traffic Engineering
Other approaches • Reduce the load from updates: Use more efficient distribution methods • Trees instead of flooding • But setting up the tree has its own complexity and issues • Handle inaccurate state information: Crankback • If the path I try is not good (because of stale information), step back and try a different one • Path setup may now take a long time • Reduce the load from updates: • Introduce hierarchy (similar to IGP areas) • Avoid updates altogether: Probing [Chen, Nahrstedt 1998] • Send probes over multiple paths towards the destination • The probes collect information about the network conditions • The price is extra probe traffic
PNNI • The QoS routing component of ATM • The only standardized QoS routing protocol • Gone now that ATM is gone • Link state with strict hierarchy • Recursively create multiple peer-groups • Flooding only inside a peer-group • Parent floods information to its descendants too • Abstract nodes • Summarize QoS characteristics of a whole peer-group • Appears as a single node in the parent peer-group • A route is signaled using source routing • Source route is expanded as we enter a peer-group • Crankback • If signaling fails, back up to the entry point of the peer-group and try another path
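The crankback idea can be sketched in a few lines. This is a simplified illustration, not PNNI itself: it cranks back all the way to the source instead of to the entry point of a peer-group, and the names `try_path`, `setup_with_crankback`, and `actual_bw` are hypothetical.

```python
def try_path(path, actual_bw, demand):
    """Return the index of the first hop that rejects the call, or None if every hop admits it."""
    for i in range(len(path) - 1):
        link = (path[i], path[i + 1])
        if actual_bw.get(link, 0) < demand:
            return i
    return None


def setup_with_crankback(candidate_paths, actual_bw, demand):
    """Try the candidate paths in order; on a rejection, crank back and try the next one."""
    attempts = []
    for path in candidate_paths:
        failed_hop = try_path(path, actual_bw, demand)
        attempts.append((path, failed_hop))
        if failed_hop is None:
            # Every hop admitted the call: reserve the bandwidth along the path.
            for i in range(len(path) - 1):
                actual_bw[(path[i], path[i + 1])] -= demand
            return path, attempts
    return None, attempts


# The source believes A-B-D is fine, but link B-D actually has only 2 units left.
actual_bw = {("A", "B"): 10, ("B", "D"): 2, ("A", "C"): 10, ("C", "D"): 10}
chosen, attempts = setup_with_crankback([["A", "B", "D"], ["A", "C", "D"]], actual_bw, 5)
print(chosen, attempts)
```

Each failed attempt adds signaling delay, which is why path setup can take a long time with crankback.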
PNNI Routing Algorithm • What is advertised • Administrative weight • Available bandwidth • Loss rates • Delay • Delay variation • The routing algorithm was not specified in the protocol specification • Many proposals
Why is QoS routing “dead”? • Partly because per-flow QoS is hard to achieve • Int-serv lost to diff-serv • Partly because it is very hard to have QoS in the inter-domain • Everything we talked about is intra-domain • “Applications did not need it” argument • Applications never had the chance to use it; it never worked • It became traffic engineering • Different timescales • Offline algorithms • Traffic matrices • Potentially different optimization objectives • Still intra-domain though!
Traffic Engineering • Given • A traffic matrix • Demands between any two endpoints in my network • In practice demands between POPs • The network • Topology • Link sizes • Find • How to arrange the offered traffic into the network so as to optimize network performance
Problems • What should I optimize? • Do I really have the traffic matrix? • How easy is it to do this optimization algorithmically?
What should I optimize? • In QoS routing I was trying to maximize the traffic I could fit in the network • One request at a time • In TE I know all the traffic • I can optimize some global routing metrics • Minimize the overall cost of routing • Depends on how I define the cost of a link as a function of its load • A common function is one that makes the cost exponentially higher as the link approaches saturation • Minimizing this keeps traffic away from links that are close to saturation
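One way to picture such a link cost is a piecewise-linear convex function of utilization whose slope jumps sharply near saturation. The breakpoints and slopes below are illustrative assumptions (the slides do not fix them), and `link_cost` is a hypothetical name.

```python
# (utilization where the segment starts, slope of the segment); values are illustrative
SEGMENTS = [(0.0, 1), (1 / 3, 3), (2 / 3, 10), (0.9, 70), (1.0, 500), (1.1, 5000)]


def link_cost(load, capacity):
    """Cost of a link as a convex, steeply increasing function of its load."""
    u = load / capacity
    cost = 0.0
    for i, (start, slope) in enumerate(SEGMENTS):
        if u <= start:
            break
        end = SEGMENTS[i + 1][0] if i + 1 < len(SEGMENTS) else float("inf")
        # Add the part of this segment that the utilization actually covers.
        cost += slope * (min(u, end) - start)
    return cost * capacity


for load in (30, 50, 90, 100, 110):
    print(load, round(link_cost(load, 100), 1))
```

Because the penalty grows so quickly, a routing that minimizes the sum of these costs prefers to spread traffic rather than push any single link close to its capacity.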
How to get the traffic matrix • Traffic matrix: • Volume of data between all pairs of ingress/egress points of my network • Could be PoPs, customers, etc. • Hard to get the traffic matrix data • Packet counting is expensive • Sometimes I can count only total packets and not packets per destination • Even when I can count packets per destination I have to map destinations to egress points • Routing dependent • Changes in routing can dramatically change the traffic matrix • BGP hot-potato routing • Need to estimate the traffic matrix • Y is the table of link loads, A is the routing, X is the traffic matrix • Y = A * X • This is very under-determined: there are far fewer link loads than ingress/egress pairs, so too many solutions are possible
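A tiny example shows why Y = A * X is under-determined. The three-node line topology A-B-C and the names `ROUTING` and `link_loads` are assumptions made for illustration.

```python
# Demands (columns of the routing matrix): A->B, B->C, A->C
# Links (rows of the routing matrix):      A-B, B-C
ROUTING = [
    [1, 0, 1],  # link A-B carries A->B and A->C
    [0, 1, 1],  # link B-C carries B->C and A->C
]


def link_loads(routing, demands):
    """Compute Y = A * X with plain lists."""
    return [sum(a * x for a, x in zip(row, demands)) for row in routing]


x1 = [10, 10, 0]  # no end-to-end A->C traffic at all
x2 = [5, 5, 5]    # a third of the traffic goes end to end
print(link_loads(ROUTING, x1), link_loads(ROUTING, x2))
```

Both traffic matrices produce identical link loads, so link counters alone cannot tell them apart; with N nodes there are on the order of N x N unknowns but far fewer links.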
Traffic matrix estimation • Active research area • Probabilistic approaches • Start from an estimate of the traffic matrix • Assume some statistics for the traffic • These assumptions may not hold; real traffic does not follow these models closely • Refine the estimates • Choice models • Model each POP as making a choice of where to send its traffic • Gravity models • Traffic between POP a and POP b is • Proportional to the volume of traffic leaving a • Proportional to the volume of traffic entering b • Inversely proportional to their distance
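A minimal sketch of a gravity-model estimate follows. The normalization step (scaling so that the estimates add up to the total outgoing volume) is an assumption, as are the function name `gravity_estimate` and the sample numbers.

```python
def gravity_estimate(out_vol, in_vol, dist):
    """Estimate demand a->b as proportional to out_vol[a] * in_vol[b] / dist[(a, b)]."""
    raw = {}
    for a in out_vol:
        for b in in_vol:
            if a != b:
                raw[(a, b)] = out_vol[a] * in_vol[b] / dist[(a, b)]
    # Assumed normalization: make the estimates sum to the total traffic leaving all POPs.
    scale = sum(out_vol.values()) / sum(raw.values())
    return {pair: value * scale for pair, value in raw.items()}


pops = ["a", "b", "c"]
out_vol = {"a": 100, "b": 50, "c": 50}   # traffic leaving each POP
in_vol = {"a": 40, "b": 120, "c": 40}    # traffic entering each POP
dist = {(x, y): 1.0 for x in pops for y in pops if x != y}  # all POPs equally far apart
estimate = gravity_estimate(out_vol, in_vol, dist)
for pair in sorted(estimate):
    print(pair, round(estimate[pair], 1))
```

The estimate needs only per-POP totals, which are much easier to measure than per-pair volumes.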
General Routing problem • Network with N nodes and E edges • Traffic matrix T with a demand for each pair in N x N • Cost function C(e) • Dependent on the load of edge e • Find how to split traffic into flows to minimize • Cost = Sum of C(e) for all e in E • Can solve in polynomial time (e.g. as a linear program when the costs are piecewise linear) • If I can split flows arbitrarily
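With arbitrary splitting this is a multicommodity flow problem. One way to write it down, where the symbols f, \ell, out(v) and in(v) are notation introduced here and not taken from the slides:

```latex
% f^{st}_e : traffic of demand (s,t) carried on edge e
% \ell_e   : total load on edge e;  C(e) is evaluated at load \ell_e
\begin{align*}
\text{minimize}\quad & \sum_{e \in E} C(e) \\
\text{subject to}\quad
 & \ell_e = \sum_{(s,t)} f^{st}_e && \forall e \in E \\
 & \sum_{e \in \mathrm{out}(v)} f^{st}_e - \sum_{e \in \mathrm{in}(v)} f^{st}_e =
   \begin{cases}
     T(s,t)  & v = s \\
     -T(s,t) & v = t \\
     0       & \text{otherwise}
   \end{cases} && \forall (s,t),\ \forall v \in N \\
 & f^{st}_e \ge 0 && \forall (s,t),\ \forall e \in E
\end{align*}
```

The second constraint is flow conservation: each demand leaves its source, enters its destination, and is neither created nor lost at intermediate nodes.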
Unfortunately • In IP networks • Routing depends only indirectly on the link costs • My algorithm should find link costs • Cannot split flows arbitrarily • May not have enough paths • If I have ECMP, traffic is split equally among the equal-cost next hops • Destination based routing • Traffic to the same destination will follow the same path • With these constraints • Problem of finding IGP link costs so as to minimize the cost of routing is NP-complete
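The sketch below shows how IGP link weights, not the operator, decide where traffic goes: it computes link loads under destination-based shortest-path routing with an equal split over equal-cost next hops. It assumes positive integer weights and a connected topology; `dist_to` and `ecmp_link_loads` are hypothetical names.

```python
import heapq
from collections import defaultdict


def dist_to(dest, weights):
    """Dijkstra on reversed edges: shortest distance from every node to dest."""
    rev = defaultdict(list)
    for (u, v), w in weights.items():
        rev[v].append((u, w))
    dist, heap = {dest: 0}, [(0, dest)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale heap entry
        for u, w in rev[v]:
            if d + w < dist.get(u, float("inf")):
                dist[u] = d + w
                heapq.heappush(heap, (d + w, u))
    return dist


def ecmp_link_loads(weights, demands):
    """Load per directed link when every node splits equally over its shortest-path next hops."""
    loads = defaultdict(float)
    out_edges = defaultdict(list)
    for (u, v) in weights:
        out_edges[u].append(v)
    for dest in {t for (_, t) in demands}:
        dist = dist_to(dest, weights)
        inflow = defaultdict(float)
        for (s, t), volume in demands.items():
            if t == dest:
                inflow[s] += volume
        # Visit nodes farthest from dest first, so all inflow is known before splitting.
        for u in sorted(dist, key=dist.get, reverse=True):
            if u == dest or inflow[u] == 0:
                continue
            hops = [v for v in out_edges[u]
                    if v in dist and weights[(u, v)] + dist[v] == dist[u]]
            share = inflow[u] / len(hops)
            for v in hops:
                loads[(u, v)] += share
                inflow[v] += share
    return dict(loads)


weights = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 1, ("C", "D"): 1}
print(ecmp_link_loads(weights, {("A", "D"): 10}))
```

With equal weights the 10 units split 5/5 over the two paths; raising one weight moves all of the traffic to the other path. An unequal split such as 7/3 is not reachable by weights alone.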
Enter MPLS • MPLS can approximate the flow splitting properties • No destination routing anymore • Can control exactly what traffic goes into an LSP • And how this traffic is delivered to its destination • This connection-oriented nature is what makes MPLS (and ATM before it) good for traffic engineering • Of course there is some cost • Full mesh of LSPs • Higher administrative complexity
TE in practice • I have a 3-level network • Customer, aggregation and WAN routers • Three approaches for TE • IGP only • IGP+MPLS • Mostly IGP with the occasional LSP • For unequal cost forwarding • For temporarily repairing hot-spots • MPLS • Full mesh of LSPs • Compute paths • On-line • Off-line
Pros and Cons • IGP+MPLS • Mostly manual process • Error prone • It is not easy to patch up network problems with a few LSPs • And it may cause other problems • MPLS • Scaling of the full mesh • Can work at each of the 3 levels: WAN level full mesh scales OK, customer router full mesh could be a problem • With 100 customer routers I will have about 10,000 LSPs (100 x 99 = 9,900 unidirectional LSPs) • Can be more if I have separate LSPs for each Diff-Serv class • Signaling overhead • May hit the limit of LSPs in the transit routers
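The full-mesh arithmetic is easy to check. LSPs are unidirectional, so every ordered pair of routers needs one; the number of Diff-Serv classes used below is an assumed example value, and `full_mesh_lsps` is a hypothetical helper.

```python
def full_mesh_lsps(routers, classes=1):
    """Number of unidirectional LSPs in a full mesh, optionally one set per traffic class."""
    return routers * (routers - 1) * classes


print(full_mesh_lsps(100))             # one LSP per ordered router pair
print(full_mesh_lsps(100, classes=4))  # assumed: 4 Diff-Serv classes, one LSP each
print(full_mesh_lsps(20))              # a smaller mesh, e.g. at the WAN level
```

The count grows with the square of the number of routers, which is why a mesh among a few WAN routers is manageable while a mesh among all customer routers may not be.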
Off-line MPLS TE • Compute best LSP paths for the whole network • Signal them using RSVP • When something changes • Re-compute all the LSPs • Off-line allows for better control • No oscillations • Global view can optimize resource usage • But cannot respond to sudden traffic changes • Attacks • Flash crowds
IP TE is not impossible • Recent research has shown that it is possible to achieve solutions that are very close to the optimal using just IP • I do this by picking the right IGP weights for each link • But as we said this problem is NP-complete • Need to do a state space search in the link weight state space • With some tricks this is feasible • For real networks this can get within a few percent of the optimal flow-based routing!
IGP link weight space search • Typical state search problem • Can use a variety of methods • Steepest descent • Tabu search • Local vs. global minima • Start from a set of link weights, state1 • Compute the cost of routing for this set • This is expensive • Need to route all the traffic and measure how much load each link has in order to compute its cost • Modify one or more weights -> state2 • Compute new routing cost • Keep if new routing cost is better • Continue until … ?
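The search loop above can be sketched as a plain steepest descent over single-weight changes. Everything specific here is an assumption: the cost is the sum of squared link utilizations, the stopping rule is "no neighbor is better" (the slide leaves it open), weights are small positive integers, and the names are hypothetical.

```python
import heapq
from collections import defaultdict


def link_loads(weights, demands):
    """Route all demands on shortest paths with equal ECMP splitting; return load per link."""
    loads, rev, out = defaultdict(float), defaultdict(list), defaultdict(list)
    for (u, v), w in weights.items():
        rev[v].append((u, w))
        out[u].append(v)
    for dest in {t for _, t in demands}:
        dist, heap = {dest: 0}, [(0, dest)]
        while heap:  # Dijkstra towards dest on reversed edges
            d, v = heapq.heappop(heap)
            if d > dist[v]:
                continue
            for u, w in rev[v]:
                if d + w < dist.get(u, float("inf")):
                    dist[u] = d + w
                    heapq.heappush(heap, (d + w, u))
        inflow = defaultdict(float)
        for (s, t), volume in demands.items():
            if t == dest:
                inflow[s] += volume
        for u in sorted(dist, key=dist.get, reverse=True):
            if u == dest or inflow[u] == 0:
                continue
            hops = [v for v in out[u] if v in dist and weights[(u, v)] + dist[v] == dist[u]]
            for v in hops:
                loads[(u, v)] += inflow[u] / len(hops)
                inflow[v] += inflow[u] / len(hops)
    return loads


def routing_cost(weights, demands, capacity):
    """Assumed cost: sum of squared link utilizations (convex in the load)."""
    loads = link_loads(weights, demands)
    return sum((loads[e] / capacity[e]) ** 2 for e in weights)


def steepest_descent(weights, demands, capacity, max_weight=5):
    """Move to the best single-weight change until no neighbor improves the cost."""
    current = dict(weights)
    current_cost = routing_cost(current, demands, capacity)
    while True:
        best_move, best_cost = None, current_cost
        for link in sorted(current):
            for w in range(1, max_weight + 1):
                if w == current[link]:
                    continue
                candidate = dict(current)
                candidate[link] = w
                cost = routing_cost(candidate, demands, capacity)  # the expensive step
                if cost < best_cost:
                    best_move, best_cost = candidate, cost
        if best_move is None:
            return current, current_cost  # local minimum reached
        current, current_cost = best_move, best_cost


weights = {("A", "B"): 1, ("B", "D"): 1, ("A", "C"): 2, ("C", "D"): 1}
capacity = {link: 10 for link in weights}
best, cost = steepest_descent(weights, {("A", "D"): 10}, capacity)
print(best, cost)
```

In this toy case the search raises the weight of A-B so that both paths tie and ECMP splits the demand. On larger networks a descent like this can stop in a local minimum, and every cost evaluation requires re-routing all the traffic.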
Some tricks to speed up search • Avoid cycles • Remember states that were visited before and do not evaluate them again • Need to do this efficiently • Faster routing cost evaluation • Consider the effects of only large flows • Incremental SPFs • Find out which links are the ones that have a large impact on the cost and optimize for them only • Adaptation • Consider a dynamically sized “neighborhood” and explore it first before moving on • Neighborhood becomes smaller when I improve on the solution • Larger when I do not improve on the solution • Avoid local minima • Essentially repeat search starting from random places in the state search space
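Two of these tricks are small enough to sketch: remembering visited states and resizing the neighborhood. The class and function names are hypothetical, and the halving/doubling factors and size limits are assumptions chosen for illustration.

```python
class VisitedStates:
    """Remember weight settings that were already evaluated, to avoid cycling."""

    def __init__(self):
        self._seen = set()

    def check_and_add(self, weights):
        """Return True if this state was seen before; otherwise record it and return False."""
        # A canonical, hashable form of the state. Storing only hash(key) would save
        # memory at the price of occasionally skipping an unvisited state on a collision.
        key = tuple(sorted(weights.items()))
        if key in self._seen:
            return True
        self._seen.add(key)
        return False


def adapt_neighborhood(size, improved, smallest=1, largest=64):
    """Shrink the neighborhood after an improvement, grow it after a failure."""
    if improved:
        return max(smallest, size // 2)
    return min(largest, size * 2)


visited = VisitedStates()
state = {("A", "B"): 1, ("B", "D"): 3}
print(visited.check_and_add(state))        # first visit
print(visited.check_and_add(dict(state)))  # same weights again
print(adapt_neighborhood(8, improved=True), adapt_neighborhood(8, improved=False))
```

The set lookup takes constant expected time, which matters because the check runs before every expensive routing cost evaluation.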
When links fail? • All this TE is good when links do not fail • What happens in failures? • MPLS TE • Failures are handled by fast reroute • Some minimal optimization when determining backup LSPs • Global re-optimization if the failure lasts for too long • IGP weight optimization • It is possible to optimize weights to take into account single link failures • Other approaches: • Multi-topology traffic engineering • Optimize the weights in the topologies that are used for protecting from failures