Multicast Issues for Gigapop Operators • David Meyer • Gigapop Operators II Workshop • 26 June 1998
Agenda • Introduction • First Some Basic Technology • Basic Host Model • Basic Router Model • Data Distribution Concepts • What Are the Deployment Obstacles • What Are the Non-technical Issues • What Are the Technical Scaling Issues
Agenda (Cont.) • Potential Solutions (Cisco Specific) • Multi-level RP, Anycast Clusters, MSDP • Using Directory Services • Industry Solutions • BGMP and MASC • Possible Deployment Scenarios • References
Introduction—Level Set • This presentation focuses on large-scale multicast routing for Gigapops and their customers • Note that the problem is essentially the same as the inter-domain multicast routing problem • The problems/solutions presented are related to inter-enterprise or Gigapop deployment of IP multicast • The current set of deployed technology is sufficient for enterprise environments
Introduction—Why Would You Want to Deploy IP Multicast? • You don’t want the same data traversing your links many times— bandwidth saver • You want to join and leave groups dynamically without notifying all data sources—pay-per-view
Introduction—Why Would You Want to Deploy IP Multicast? • You want to discover a resource but don’t know who is providing it or if you did, don’t want to configure it— expanding ring search • Reduce startup latency for subscribers
Introduction—Why Would a Gigapop Want to Deploy IP Multicast? • All of the previous, plus revenue potential for deploying IP multicast • Initial applications • Radio station transmissions • Real-time stock quote service • Future applications • Distance learning • Entertainment
Basic Host Model • Strive to make the host model simple • When sourcing data, just send the data • Map network layer address to link layer address • Routers will figure out where receivers are and are not • When receiving data, need to perform two actions • Tell routers what group you’re interested in (via IGMP) • Tell your LAN controller to receive for link-layer mapped address
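The network-layer to link-layer mapping the host performs is mechanical: the low 23 bits of the IPv4 group address are spliced into the fixed Ethernet multicast prefix 01:00:5e. A minimal Python sketch:

```python
import ipaddress

def group_to_mac(group: str) -> str:
    """Map an IPv4 multicast group address to its Ethernet MAC address.

    The low 23 bits of the group address are copied into the fixed
    01:00:5e:00:00:00 multicast prefix; the top 5 variable bits of
    the group address are discarded.
    """
    ip = int(ipaddress.IPv4Address(group))
    low23 = ip & 0x7FFFFF                 # keep the low 23 bits only
    mac = 0x01005E000000 | low23          # splice into the IANA prefix
    return ":".join(f"{(mac >> s) & 0xFF:02x}" for s in range(40, -1, -8))
```

Because 5 bits are discarded, 32 distinct groups share each MAC address; for example 224.1.1.1 and 225.1.1.1 both map to 01:00:5e:01:01:01, which is why the host's IP layer must still filter on the full group address.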
Basic Host Model • Hosts can be receivers and not send to the group • Hosts can send but not be receivers of the group • Or they can be both
Basic Host Model • There are some protocol and architectural issues • Multiple IP group addresses map into a single link-layer address • You need IP-level filtering • Hosts join groups, which means they receive traffic from all sources sending to the group • Wouldn't it be better if hosts could say which sources they are willing to receive from?
Basic Host Model • There are some protocol and architectural issues (continued) • You can apply access control to sources, but you can't access-control receivers in a scalable way
Basic Router Model • Since hosts can send any time to any group, routers must be prepared to receive on all link-layer group addresses • And know when to forward or drop packets
Basic Router Model • What does a router keep track of? • Interfaces leading to receivers • Sources, when utilizing source distribution trees • Prune state, depending on the multicast routing protocol (e.g., dense mode)
Data Distribution Concepts • Routers maintain state to deliver data down a distribution tree • Source trees • Router keeps (S,G) state so packets can flow from the source to all receivers • Trades off low delay from source against router state
Data Distribution Concepts • Shared trees • Router keeps (*,G) state so packets flow from the root of the tree to all receivers • Trades off higher delay from source against less router state
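The (S,G) versus (*,G) trade-off can be pictured as entries in a router's multicast forwarding cache. A toy sketch (addresses and interface names are made up), assuming a dict keyed by (source, group) with "*" standing for any source:

```python
# Toy multicast forwarding cache: (source, group) -> forwarding state.
# "*" in the source position denotes shared-tree (*,G) state.
mroute = {
    ("171.69.58.81", "224.1.1.1"): {"iif": "e0", "oifs": {"e1", "e2"}},  # (S,G)
    ("*", "224.2.2.2"):            {"iif": "e0", "oifs": {"e1"}},        # (*,G)
}

def lookup(src: str, group: str):
    """Prefer specific (S,G) state; fall back to shared (*,G) state."""
    return mroute.get((src, group)) or mroute.get(("*", group))
```

More sources on source trees means more (S,G) entries; a shared tree amortizes the whole group into a single (*,G) entry at the cost of a detour through the RP.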
Data Distribution Concepts • How is the tree built? • On demand, in response to data arrival • Dense-mode protocols (PIM-DM and DVMRP) • MOSPF • Explicit control • Sparse-mode protocols (PIM-SM and CBT)
Data Distribution Concepts • Building distribution trees requires knowledge of where members are • Flood data to find out where members are not (dense-mode protocols) • Flood group membership information (MOSPF), and build the tree as data arrives • Send explicit joins and keep join state (sparse-mode protocols)
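The flood-and-prune style above can be contrasted with a toy simulation (topology and router names are invented): data is flooded down every branch, and branches with no receivers prune themselves off.

```python
# Toy flood-and-prune sketch over a tree rooted at the source's router.
links = {"R1": ["R2", "R3"], "R2": [], "R3": ["R4"], "R4": []}
receivers = {"R4"}  # only R4 has a directly attached group member

def flooded_then_pruned(root: str) -> set:
    """Flood everywhere, then prune branches with no receivers;
    return the routers left on the distribution tree."""
    def keep(r: str) -> bool:
        links[r] = [c for c in links[r] if keep(c)]  # drop dead branches
        return bool(links[r]) or r in receivers
    keep(root)
    on_tree, stack = set(), [root]
    while stack:
        r = stack.pop()
        on_tree.add(r)
        stack.extend(links[r])
    return on_tree
```

R2 is pruned (no receivers below it), while R3 stays on the tree purely to reach R4; sparse-mode protocols build the same tree with explicit joins instead of flooding first.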
Data Distribution Concepts • Construction of source trees requires knowledge of source locations • In dense-mode protocols you learn them when data arrives (at each depth of the tree) • Same with MOSPF • In sparse-mode protocols you learn them when data arrives on the shared tree (in leaf routers only) • Ignore since routing based on direction from RP • Pay attention if moving to source tree
Data Distribution Concepts • To build a shared tree you need to know where the core (RP) is • Can be learned dynamically in the routing protocol (Auto-RP, PIMv2) • Can be configured in the routers • Could use a directory service
Data Distribution Concepts • Source trees make sense for • Broadcast radio transmissions • Expanding ring search • Generic few-sources-to-many-receiver applications • High-rate, low-delay application requirements • Per source policy from a service provider’s point of view • Per source access control
Data Distribution Concepts • Shared trees make sense for • Many low-rate sources • Applications that don’t require low delay • Consistent policy and access control across most participants in a group • When most of the source trees overlap topologically with the shared tree
Deployment Obstacles—Non-Technical Issues • How to bill for the service • Is the service what runs on top of multicast? • Or is it the transport itself? • Do you bill based on sender or receiver, or both? • How to control access • Should sources be rate-controlled (unlike unicast routing) • Should receivers be able to receive at a specific rate only?
Deployment Obstacles—Non-Technical Issues • How to make your peers fan-out instead of you (reduce the replication factor in your own network) • Closest exit versus latest entrance—all a wash • How to avoid multicast from opening a lot of security holes • Network-wide denial-of-service attacks • Eavesdropping is simpler since receivers are unknown
Deployment Obstacles—Technical Issues • Source tree state will become a problem as IP multicast gains popularity • When policy and access control per source will be the rule rather than the exception • Group state will become a problem as IP multicast gains popularity • 10,000 three member groups across the Internet
Deployment Obstacles— Technical Issues • Hopefully we can upper bound the state in routers based on their switching capacity
Deployment Obstacles— Technical Issues • Gigapop customers are telling us they don’t want to depend on another customer’s (or gigapop) RP • Do we connect shared trees together? • Do we have a single shared tree across domains? • Do we use source trees only for inter-domain groups?
Deployment Obstacles— Technical Issues • Customers are telling us that the unicast and multicast topologies won't be congruent across domains • Due to physical/topological constraints • Due to policy constraints • We need an inter-domain routing protocol that distinguishes unicast versus multicast policy
How to Control Multicast Routing Table State in the Network? • Fundamental problem of learning group membership • Flood and prune: DVMRP, PIM-DM • Broadcast membership: MOSPF, Domain-Wide Reports (DWRs) • Rendezvous mechanism: PIM-SM, BGMP
Rendezvous Mechanism • Why not use sparse-mode PIM? • Where to put the root of the shared tree (the RP) • Third-party RP problem • If you did use sparse-mode PIM • Group-to-RP mappings would have to be distributed throughout the Internet
Rendezvous Mechanism • Let's try using sparse-mode PIM for inter-domain multicast • Look at four possibilities • Multi-level RP • Anycast clusters • MSDP • Use directory services
Connect Shared Trees Together—Multi-Level RP • Idea is to have a hierarchy of shared trees • Level-0 RPs are inside of domains • They propagate joins from routers to a Level-1 RP that may be in another domain • All level-0 shared trees get connected together via a Level-1 RP • If multiple Level-1 RPs, iterate up to Level-2 RPs
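The level-0-to-level-1 join propagation can be sketched as a simple parent chain (RP names are hypothetical):

```python
# Hypothetical multi-level RP hierarchy: each level-0 RP forwards
# (*,G) joins up to its level-1 parent, connecting per-domain trees.
parent_rp = {
    "rp0-domainA": "rp1-exchange",
    "rp0-domainB": "rp1-exchange",
    "rp1-exchange": None,  # root of the hierarchy
}

def join_path(rp: str) -> list:
    """Return the chain of RPs a (*,G) join traverses, bottom up."""
    path = []
    while rp is not None:
        path.append(rp)
        rp = parent_rp[rp]
    return path
```

Since every chain terminates at the same top-level RP, the third-party problem reappears at the root of the hierarchy.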
Connect Shared Trees Together—Multi-Level RP • Problems • Requires PIM protocol changes • If you don’t locate the Level-0 RP at the border, intermediate PIM routers think there may be two RPs for the group • Still has the third-party problem, there is ultimately one node at the root of the hierarchy • Data has to flow all the way to the highest-level RP
Connect Shared Trees Together—Anycast Clusters • Share the burden of being an RP among service providers • Each RP in each domain is a border router • Build RP clusters at interconnect points (or dense-mode clouds) • Group allocation is per cluster and not per-user or per-domain
Connect Shared Trees Together—Anycast Clusters • Closest border router in cluster is used as the RP • Routers in a domain will use the domain’s RP • Provided you have an RP for that group range at an interconnect point • If not, you use the closest RP at the interconnect point (could be RP in another domain)
Connect Domains Together—MSDP • If you can’t connect shared trees together easily, then don’t • Multicast Source Discovery Protocol • Different paradigm • Rather than getting trees connected, get sources known to all trees • Sounds non-scalable, but the trick is in the implementation
Connect Domains Together—MSDP • An RP in a domain has an MSDP peering session with an RP in another domain • Runs over TCP • Source-Active (SA) messages are sent to describe active sending sources in a domain • A logical topology is built for the sole purpose of distributing SA messages
Connect Domains Together—MSDP • How it works • Source goes active in a PIM-SM domain • Its packets get PIM-registered to the domain's RP • The RP sends an SA message to its MSDP peers • Other MSDP peers forward it to their peers, away from the originating RP • If a peer in another domain has receivers for the group the source is sending to, it joins toward the source (flood-and-join model)
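The flood half of flood-and-join can be sketched as SA flooding over a toy MSDP peer graph (peer names are invented, and real MSDP uses peer-RPF checks rather than the simple came-from filter here):

```python
# Toy MSDP Source-Active flooding over a hypothetical peering graph.
peers = {
    "RP-A": ["RP-B"],
    "RP-B": ["RP-A", "RP-C"],
    "RP-C": ["RP-B"],
}

def flood_sa(origin: str, source: str, group: str) -> set:
    """Return the set of RPs that learn about (source, group) via SAs."""
    learned, queue = set(), [(origin, None)]
    while queue:
        rp, came_from = queue.pop()
        if rp in learned:
            continue  # this RP already processed the SA
        learned.add(rp)
        for peer in peers[rp]:
            if peer != came_from:
                queue.append((peer, rp))
    return learned
```

Any RP in the learned set that has (*,G) receivers for the group can then send an (S,G) join directly toward the source.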
Connect Domains Together—MSDP • There is no shared tree across domains • Therefore each domain can depend solely on its own RP (no third-party problem) • SA state is not stored at each MSDP peer • You could encapsulate data in SA messages for low-rate bursty sources • You could have SA-caching peers to speed up join latency
Use Directory Services • You can use directory services to: • Enable a single shared tree across domains • Enable use of source tree only and avoid using a single shared tree across domains
Use Directory Services • How it works with a single shared tree across domains • Put RP in client’s domain • Optimal placement of the RP if the domain had a multicast source or receiver active • Policy for RP is consistent with policy for domain’s unicast prefixes • Use directory to find RP address for a given group
Use Directory Services • For example • Receiver host sends IGMP report for 224.1.1.1 • First-hop router performs DNS name resolution on • 1.1.1.224.pim.mcast.net • An A record is returned with the IP address of the RP • First-hop router sends PIM join message towards RP
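The group-to-name mangling in that lookup is just the group's octets reversed under a fixed suffix; a sketch (the pim.mcast.net zone is the scheme proposed on the slide, not a deployed service):

```python
def group_to_rp_name(group: str) -> str:
    """Build the reversed-octet DNS name used to look up a group's RP."""
    return ".".join(reversed(group.split("."))) + ".pim.mcast.net"
```

A first-hop router seeing an IGMP report for 224.1.1.1 would query 1.1.1.224.pim.mcast.net and send its PIM join toward whatever A record comes back.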
Use Directory Services • All routers get consistent RP addresses via DNS • When dynamic DNS is widely deployed it will be easier to change A records • In the meantime, use loopback addresses on routers and move them around in your domain
Use Directory Services • When domain group allocation exists, a domain can be authoritative for a DNS zone • 1.224.pim.mcast.net • 128/17.1.224.pim.mcast.net
Use Directory Services • Another approach—avoid using shared trees altogether • Build PIM-SM source trees across domains • Put multiple A records in DNS to describe sources for the group

    1.0.2.224.sources.pim.mcast.net IN CNAME dmm-home
                                    IN CNAME dino-home
    dmm-home                        IN A     171.69.58.81
    dino-home                       IN A     171.69.127.178
Standards Solutions • Ultimate scalability of both routing and group allocation can be achieved using BGMP/MASC • Use BGP4+ (MBGP) to deal with non-congruency issues
Border Gateway Multicast Protocol (BGMP) • Use a PIM-like protocol that runs between domains (BGP equivalent for multicast) • The protocol builds a shared tree of domains for a group • So we can use a rendezvous mechanism at the domain level • Shared tree is bi-directional • Root of shared tree of domains is at root domain
Border Gateway Multicast Protocol (BGMP) • Runs in routers that border a multicast routing domain • Runs over TCP, like BGP • Joins and prunes travel domain by domain • Can build unidirectional source trees • The MIGP (multicast interior gateway protocol) tells the border routers about group membership
Multicast Address Set Claim (MASC) • How does one determine the root domain for a given group? • Group prefixes are temporarily leased to domains • They are allocated out of a service provider's allocation, which it in turn gets from its upstream provider
Multicast Address Set Claim (MASC) • Claims for group allocation resolve collisions • Group allocations are advertised across domains • Lots of machinery for aggregating group allocations
Multicast Address Set Claim (MASC) • Tradeoff between aggregation and anticipated demand for group addresses • Group prefix allocations are not assigned to domains—they are leased • Application must be written to know that group addresses may go away • Work in progress