Learn multicast functionality, terminology, protocols, and troubleshooting with practical methodology and essential tools in a comprehensive tutorial session from the Joint Techs Meeting in 2005.
Multicast Troubleshooting Tutorial • Caren Litvanyi, litvanyi@grnoc.iu.edu • Joint Techs Meeting, Salt Lake City, Utah • February 2005
Tutorial Outline • Review IP multicast terminology and basic functionality. • Review how the most common multicast protocols in use today work. • Discuss some design issues. • Troubleshooting multicast methodology, particularly interdomain multicast. • Mention some tools and resources.
Unicast vs. Multicast • [Figure: two panels, Unicast and Multicast]
Multicast Building Blocks • The SENDERS send without worrying about receivers. • Packets are sent to a multicast address. • (224.0.0.0 - 239.255.255.255) • The RECEIVERS inform their local routers what they want to receive. • The routers build a tree backwards (reverse-path) towards the source, thus making sure the STREAMS make it to the correct receiving networks.
Essential Multicast Terminology • [Diagram: a source (e.g., a video server at 128.138.10.2) sends a multicast stream for group 233.12.24.11 down a distribution tree to the receivers.] • Source = sender; receivers = listeners = group members. • IP source = IP unicast addr; Ethernet source = MAC addr. • IP destination = IP multicast addr; Ethernet dest = MAC addr. • A few things to note here: the IP source address is the IP address of the server, BUT the destination address in the packet is NOT the IP address of a receiver. It is a multicast IP address (224.0.0.0 - 239.255.255.255). • Tree = the path taken by multicast data. Routing loops are not allowed, so there is always a unique series of branches between the root of the tree and the receivers.
(S,G) notation • For every multicast stream there must be two pieces of information: the source IP address, S, and the group address, G. • These correspond to the sender and receiver addresses in unicast. • This is generally expressed as (S,G). • Also commonly used is (*,G) - every source for a particular group.
Multicast Addressing • RFC 3171: 224.0.0.0 – 239.255.255.255 • Examples of Reserved & Link-local Addresses • 224.0.0.0 - 224.0.0.255 reserved & not forwarded • 224.0.0.1 - All local hosts • 224.0.0.2 - All local routers • 224.0.0.4 - DVMRP • 224.0.0.5 - OSPF • 224.0.0.6 - Designated Router OSPF • 224.0.0.9 - RIP2 • 224.0.0.13 - PIM • 224.0.0.18 - VRRP • 224.0.0.22 - All IGMP routers • 239.0.0.0 - 239.255.255.255 Administrative Scoping • 232.0.0.0/8 – available for SSM use • “Ordinary” multicasts don’t have to request a multicast address from IANA. Use GLOP space – RFC 2770.
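The ranges on this slide can be checked programmatically. The following is a minimal sketch using Python's standard `ipaddress` module; the function names `classify_group` and `glop_prefix` and the label strings are illustrative and not part of the tutorial. The GLOP helper assumes the RFC 2770 mapping, which places a 16-bit AS number in the middle two octets of 233.x.y.0/24.

```python
import ipaddress

# Ranges taken from the addressing slide; the labels are illustrative.
RANGES = [
    (ipaddress.ip_network("224.0.0.0/24"), "link-local (not forwarded)"),
    (ipaddress.ip_network("232.0.0.0/8"), "SSM"),
    (ipaddress.ip_network("233.0.0.0/8"), "GLOP"),
    (ipaddress.ip_network("239.0.0.0/8"), "administratively scoped"),
]


def classify_group(addr: str) -> str:
    """Return a coarse label for an IPv4 address based on the ranges above."""
    ip = ipaddress.ip_address(addr)
    if not ip.is_multicast:
        return "not multicast"
    for net, label in RANGES:
        if ip in net:
            return label
    return "other multicast"


def glop_prefix(asn: int) -> str:
    """Map a 16-bit AS number to its GLOP /24 (AS number in the middle two octets)."""
    if not 0 < asn < 65536:
        raise ValueError("GLOP covers 16-bit AS numbers only")
    return f"233.{asn >> 8}.{asn & 0xFF}.0/24"
```

For instance, `classify_group("224.0.0.13")` falls in the link-local block, while a unicast address such as 128.138.10.2 is reported as not multicast.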
Essential Multicast Protocols • Group Management Protocol (IGMPv2 or v3) - enables hosts to dynamically join/leave multicast groups. Receivers send group membership reports to the nearest router. • Multicast Routing Protocol (PIM-SM) - enables routers to build a delivery tree backwards from the receivers to the sender of a multicast stream. • [Diagram: receivers send membership reports; the reverse-path tree is built toward the senders; data flows from senders to receivers.]
Multicast Protocol Summary • Essential Protocols • IGMP - Internet Group Management Protocol is used by hosts and routers to tell each other about group membership. (Usually version 2) • PIM-SM - Protocol Independent Multicast - Sparse Mode is used to propagate forwarding state between routers. • Other Protocols (for interdomain) • MBGP - Multiprotocol Border Gateway Protocol is used to exchange routing information for inter-domain reverse-path forwarding (RPF) checking. • MSDP - Multicast Source Discovery Protocol is used to exchange active-source information.
IGMP Protocol Flow - Join a Group • [Diagram: a host announces "I want 230.0.0.1"; the router adds the group and forwards the stream.] • Router triggers group membership request to PIM. • Hosts can send unsolicited Join membership messages, called reports in the RFC (usually more than 1). • Or hosts can join by responding to the periodic query from the router.
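From the host side, an application normally does not build IGMP messages itself: it asks the operating system to join a group, and the kernel sends the membership report. The following is a minimal sketch with Python's standard `socket` module; the helper names `make_mreq`, `join_group`, and `leave_group` are illustrative, and any port number passed in is an arbitrary choice of the caller.

```python
import socket


def make_mreq(group: str, iface: str = "0.0.0.0") -> bytes:
    """Build the ip_mreq structure: group address, then local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(iface)


def join_group(group: str, port: int) -> socket.socket:
    """Open a UDP socket and join `group`; the kernel emits the IGMP report."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, make_mreq(group))
    return sock


def leave_group(sock: socket.socket, group: str) -> None:
    """Drop membership (an IGMPv2 host then sends a Leave) and close the socket."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP, make_mreq(group))
    sock.close()
```

A receiver for the group in the diagram could call `join_group("230.0.0.1", 5004)` and then read datagrams with `recvfrom`; watching the wire with a packet capture at that moment is a quick way to confirm the host really sends its report.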
IGMPv2 • Router: • sends Membership Query messages to All Hosts (224.0.0.1) • default query-interval = 125 seconds • router with lowest IP address is Querier (rest non-queriers) • If lower-IP address query heard, back off to non-querier state • Other Querier Present Interval default: (robust-count x query-interval) + (0.5 x query-response-interval) = 255 seconds • listens for reports (whether querier or not) and adds group to membership list for that interface • default query-response-interval = 10 seconds • timeout (Group member interval) default: • (robust-count x query-interval) + (1 x query-response-interval) = 260 seconds • robust-count - provides fine-tuning to allow for expected packet loss on a subnet. Default = 2 (tunable from 2-10) • Triggers group membership request to PIM.
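The two derived timers on this slide follow directly from the three tunables. The sketch below reproduces that arithmetic and the lowest-address querier election; the class and function names are illustrative, and the defaults are the ones quoted above.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Igmpv2Timers:
    robust_count: int = 2
    query_interval: float = 125.0          # seconds
    query_response_interval: float = 10.0  # seconds

    @property
    def group_membership_interval(self) -> float:
        """Time without reports after which the router times out a group."""
        return self.robust_count * self.query_interval + self.query_response_interval

    @property
    def other_querier_present_interval(self) -> float:
        """Time a non-querier waits before assuming the querier is gone."""
        return (self.robust_count * self.query_interval
                + 0.5 * self.query_response_interval)


def elect_querier(router_addresses):
    """The router with the numerically lowest IP address becomes the querier."""
    return min(router_addresses, key=ipaddress.ip_address)
```

With the defaults this gives 260 seconds and 255 seconds. Raising `robust_count` to 3 stretches the group timeout to 385 seconds, which is worth remembering when a stream keeps flowing longer than expected after the last receiver goes away.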
IGMPv2 • Host: • responds to router query with Membership Report messages to groups it is a member of (e.g.224.10.8.5) • waits 0-10 sec (default; specified in Query) • Hosts listen to other host reports • Only 1 host responds. Others become “idle-members.” • sends unsolicited Membership Reports (i.e., Join Messages) to group address (e.g. 224.10.8.5) • sends Leave messages to All Routers (224.0.0.2) • reports group membership ONLY – no sources. • Only the existence of local group members is known, not the actual members themselves (due to idle-member state).
IGMP Protocol Flow - Querier • [Diagram: every 125 sec the router sends a general query ("Still interested?") to 224.0.0.1; within 0-10 sec a member of group 230.0.0.1 answers ("Yes, me!") with a report for 230.0.0.1.] • Hosts respond to query to indicate (new or continued) interest in group(s) • only one host should respond per group • Hosts fall into idle-member state when same-group report heard. • After 260 sec with no response, router times out group.
IGMP Protocol Flow - Leave a Group • [Diagram: a host sends a Leave for <230.0.0.1> to 224.0.0.2 ("I don’t want 230.0.0.1 anymore"); the router asks "Anyone still want this group?" with group-specific queries for 230.0.0.1, 1 sec apart (re-transmit timer).] • Hosts that support IGMPv2 send Leave messages to all-routers group indicating group they’re leaving. • Router follows up with 2 group-specific query messages. • IGMPv1 hosts leave by not responding to queries (260 sec timeout).
Switches and Snooping • IGMP host reports (Joins) tell the router to start sending multicast traffic to the LAN, since one or more hosts on the LAN are members of the group. • In a conventional shared broadcast LAN using switches that have no multicast smarts, the traffic is flooded to all hosts. • With multiple high bandwidth multicast sources (e.g. video at 5 Mbps), this does not scale. • There are a few techniques used to deal with this...
IGMP Snooping • Implemented by several vendors. Support for IGMPv2 is common; support for IGMPv3 is becoming more common. • What happens at the MAC layer: • IGMP snoopers add a bridge table entry for each multicast group destination address(GDA) to each switch port that has the interested member's unicast source address(USA) already on it. • Remember that there are likely to be hubs or switches downstream of a given switch port, so more than one USA can be on a single port. • When an IGMP Leave is received, the GDA entries are pruned.
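The GDA that a snooping switch installs is derived from the group's IP address. The slide does not spell out the mapping; the sketch below assumes the standard IPv4 rule of prefix 01:00:5e followed by the low-order 23 bits of the group address. The function name `group_mac` is illustrative.

```python
import ipaddress


def group_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet group destination address."""
    ip = ipaddress.IPv4Address(group)
    if not ip.is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    low23 = int(ip) & 0x7FFFFF            # only 23 bits of the group survive
    mac = (0x01005E << 24) | low23        # fixed 01:00:5e prefix
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))
```

Because only 23 bits are kept, 32 different groups share each MAC address. For example, 230.0.0.1 maps to the same GDA as the all-hosts group 224.0.0.1, so a switch that filters purely on MAC address cannot tell the two apart.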
Why IGMP snooping is harder than it looks • The IGMP membership reports have to be captured from each host and suppressed to other hosts to prevent the others from going into idle-member state. Every interested host has to be spoofed into thinking it is the only member of the group, so that it actively sends membership reports. • The IGMP snooper then forwards one of these membership reports up to the router or makes up a fake membership report coming from one of: • the host • the switch’s management IP address, or • 0.0.0.0
Why IGMP snooping is harder than it looks, continued • Since multiple USAs can be on a port (via downstream switch), the switch has to actually do the IGMP membership query/timeout before pruning a port. • Since membership reports are sent to the same GDA as the (possibly high-bandwidth) multicast traffic, there is a potential for heavy loading of the switch CPU, unless you use more expensive ASICs that can separate the IGMP protocol messages from general traffic and route only the IGMP messages to the CPU. • The switch has to know which is the multicast router port. It does this by snooping for IGMP queries.
Join without IGMP snooping • [Diagram: router, switch, and hosts; group 230.0.0.1.] 1. Host A sends membership report ("I want 230.0.0.1"). 2. Switch floods it to all ports. 3. Router sends traffic (floods). 4. Host B wants to join. No IGMP message needed (idle-member).
Join with IGMP snooping • [Diagram: router, switch, and hosts; group 230.0.0.1.] 1. Host A sends membership report ("I want 230.0.0.1"). 2. Switch forwards it to router. 3. Router sends traffic. 4. Host B sends membership report. Switch suppresses it and adds port to bridge table.
Maintaining state w/IGMP snooping • [Diagram: router, switch, and hosts; General Query to 224.0.0.1, reports for 230.0.0.1.] 1. Router sends general query. 2. A&B both respond w/membership report (no idle member). 3. Switch sends one to router and suppresses one.
Leave with IGMP snooping • [Diagram labels: Leave "224.0.0.22 <230.0.0.1>"; query "230.0.0.1 ?"; "done".] 1. Host A sends Leave. 2. Switch spoofs G-specific query. 3. No reply, switch prunes port. (Nothing sent to router.)
Leave with IGMP snooping, cont’d • [Diagram labels: Leave "224.0.0.22 <230.0.0.1>" from host and from switch; queries "230.0.0.1 ?"; "done".] 1. Host B sends Leave. 2. Switch spoofs G-specific query. 3. No reply; switch prunes port. 4. Switch sends Leave to router. 5. Router sends 2 G-specific queries, gets no response, and prunes the group. (Queries may [not] be suppressed)
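The join and leave sequences on the snooping slides amount to a small per-group port table. The following is a simplified sketch of that bookkeeping only: the class `SnoopingSwitch` and its method names are hypothetical, port numbers are arbitrary, and the periodic query/refresh cycle and the switch's own group-specific query are assumed to happen outside this code.

```python
from collections import defaultdict


class SnoopingSwitch:
    """Toy model of the bridge-table state kept by an IGMP snooper."""

    def __init__(self, router_port):
        self.router_port = router_port
        self.members = defaultdict(set)  # group -> ports with interested hosts

    def on_report(self, port, group):
        """Record a join. Returns True if the report goes up to the router
        (first member of the group), False if it is suppressed."""
        first_member = not self.members[group]
        self.members[group].add(port)
        return first_member

    def on_leave_confirmed(self, port, group):
        """Call after the switch's group-specific query on `port` got no reply.
        Returns True if a Leave must be sent to the router (last member gone)."""
        self.members[group].discard(port)
        if not self.members[group]:
            del self.members[group]
            return True
        return False

    def egress_ports(self, group, ingress_port):
        """Ports that receive traffic for `group`: members plus the router port."""
        ports = set(self.members.get(group, ())) | {self.router_port}
        ports.discard(ingress_port)
        return ports
```

Walking hosts A and B through this model reproduces the slides: A's report is forwarded, B's is suppressed, A's leave stays local to the switch, and only B's leave reaches the router. When troubleshooting, comparing the switch's actual snooping table against this expected state is often the fastest way to spot a misbehaving snooper.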
Video Server Sourcing Multicast: conventional switch • [Diagram: a video server sends 230.0.0.1 into a switch, which sends it out every port.] • Multicast is just like broadcast: flooded out all ports.
Video Server Sourcing with multicast-aware switch • [Diagram: a video server sends 230.0.0.1 into a switch, which sends it only toward the router.] • Multicast traffic is forwarded only to mrouter ports (learned by snooping for IGMP queriers). • Exception: flood 224.0.0.0/24
Design Consequences for Networks • Be careful selecting/purchasing switches if you plan to support multicast. Try to do a test/eval before buying. Many vendors say they support IGMP, but how well varies widely. Also varies widely within same vendor. • Consider your physical topology design. Is it possible to put multicast-heavy subnets closer to the core, or on higher-class switches? Can you avoid switches and connect direct to a router? • Keep subnets small. Less churn in joins/leaves. • Check defaults. What is turned on and what is not?
Consequences for Troubleshooting • In general, multicast on the LAN is not as well understood as multicast on the WAN. • Bugs are common. • The horsepower of your switch(es) might matter. When snooping is enabled and CPU load is high, they may drop packets that shouldn’t be dropped. • Even without snooping, sometimes they step outside their bailiwick, trying to do non-Layer-2 tasks. • Management visibility into the switch may be limited. • Often testing to a host directly connected to a router can expose these problems.
PIM-SM Protocol Independent Multicast - Sparse Mode • The core multicast protocol: builds and tears down multicast trees. • “Protocol Independent” means independent of the protocol used to build the reachability table, not independent of IP. (More on reachability in a moment.) • “Sparse Mode” refers to the explicit join approach taken by PIM-SM — the protocol assumes that not everyone wants the data. • PIM also has a Dense Mode, which starts with the assumption that everyone does want the data. This is also known as a flood-and-prune approach. Not recommended!
Multicast “Routing” • Multicast routing can be thought of as the reverse of unicast forwarding. • Unicast forwarding is concerned with where the packet is going. • Multicast routing is concerned with where the packet will be coming from. • Multicast paths to receivers form a “tree”. The tree is built (or torn down) from the receiver back toward the source. This is easy to forget, but very important to remember.
Multicast “Routing” • Multicast forwarding topology is stored in outgoing interface lists (OILs). • On each router, PIM-SM maintains an OIL for each group for which it has downstream listeners. • Once the multicast distribution tree is built, multicast forwarding works similarly to unicast forwarding — but instead of using unicast forwarding tables to send packets out single interfaces, routers use OILs to send packets out multiple interfaces. • Multicast packets received from a given source on an incoming interface for a given group are sent out only on the interfaces specified in the appropriate outgoing interface list (OIL).
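The forwarding rule in the last bullet can be sketched as a table lookup. This is an illustrative model only: the `Route` type, the `forward` function, and the interface names used with it are hypothetical, and the state is keyed by (S,G) with a (*,G) fallback.

```python
from typing import NamedTuple


class Route(NamedTuple):
    iif: str         # expected incoming (RPF) interface
    oil: frozenset   # outgoing interface list


def forward(table, source, group, arrival_iface):
    """Return the interfaces a packet is copied to, or an empty set if dropped."""
    entry = table.get((source, group)) or table.get(("*", group))
    if entry is None:
        return set()                      # no state for this group: drop
    if arrival_iface != entry.iif:
        return set()                      # arrived on the wrong interface: drop
    return set(entry.oil) - {arrival_iface}
```

A packet from the expected interface is replicated once per OIL entry; the same packet arriving on any other interface is discarded. An empty OIL or an unexpected incoming interface are two of the first things to look for when a router receives a stream but does not pass it on.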
ASM: the original multicast service model • Packet transmission is based on UDP, so packet delivery is “best-effort”, with no loss detection or retransmission • A source can send multicast packets at any time, with no need to register or schedule transmissions. • Sources do not know the group membership. A group may have many sources and many members. • Group members may come and go at will, with no need to coordinate with a central authority. • And, critically, group members know only the group. They don’t need to know anything about sources — not even whether or not any sources exist. • This is the ASM paradigm. It requires sender registration and tree-switching.
Multicast Distribution Trees • In the original multicast service model, a connection between a source and a receiver is first set up by building an RPT from the receiver back to a Rendezvous Point (RP), then an SPT (source tree) from the RP back to the source. • Then, once data starts flowing to the receiver, an SPT is built directly from the receiver back to the source. • This is called “tree-switching”. • A special router adjacent to the receiver is responsible for this – the PIM Designated Router (DR). • Each multicast-enabled routed segment on your network has a PIM DR.
Designated Router (DR) • DR sends • “Join/Prune” messages toward the RP from receiver network • “Register” messages toward the RP from source network • Selecting the DR: • Neighboring PIM-SM routers multicast periodic “Hello” messages to each other (default is every 30 seconds; the hello-interval is tunable for faster convergence). • On receipt of a Hello message, a router stores the IP address and priority for that neighbor. • The router with highest IP address is selected as the DR, if the priorities match. • When DR goes down, a new one is selected by scanning all neighbors on the interface and choosing the one with the highest IP address.
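The DR selection rule can be written as a single comparison: highest priority first, with the highest IP address as the tie-breaker. The function name `elect_dr` and the neighbor tuples are illustrative. Addresses are compared numerically, since comparing them as strings would rank 10.0.0.9 above 10.0.0.10.

```python
import ipaddress


def elect_dr(neighbors):
    """Pick the DR from (ip_address, priority) pairs learned from Hello messages,
    with the local router included in the list."""
    winner = max(neighbors, key=lambda n: (n[1], ipaddress.ip_address(n[0])))
    return winner[0]
```

When the current DR disappears from the neighbor list, running the same function over the remaining neighbors gives the new DR.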
ASM RP Tree Join • [Diagram: receiver, RP; IGMPv2 host report followed by (*, G) Join along the RP tree.] • Receiver announces desire to join group G with IGMPv2 host report – (*,G). • DR sends PIM (*,G) Join toward the RP; subsequent routers do likewise.
ASM Sender Registration • [Diagram: source, RP, receiver; (S, G) Register (unicast) toward the RP and (S, G) Join toward the source; legend: traffic flow, RP tree, shortest path tree.] • Active source triggers DR to send (S,G) Register message to RP. • RP sends (S,G) Join to source.
ASM Sender Registration • [Diagram: source, RP, receiver; (S, G) Register (unicast) and (S, G) Register-Stop (unicast); legend: traffic flow, RP tree, shortest path tree.] • (S, G) traffic begins arriving at the RP via the SPT. • RP sends a Register-Stop back to the first-hop router to stop the Register process.
ASM Sender Registration • [Diagram: source, RP, receiver; legend: traffic flow, RP tree, shortest path tree.] • Source traffic flows natively along SPT to RP. • From RP, traffic flows down the RPT to the receiver.
ASM SPT Cutover • [Diagram: source, RP, receiver; (S, G) Join sent from the last-hop router; legend: traffic flow, RP tree, shortest path tree.] • Last-hop router joins the SPT.
ASM SPT Cutover • [Diagram: source, RP, receiver; (S, G) RP-bit Prune sent up the RPT; legend: traffic flow, RP tree, shortest path tree.] • Traffic begins flowing down the new branch of the SPT. • Additional (S, G) state is created along the RPT to prune off (S, G) traffic.
ASM SPT Cutover • [Diagram: source, RP, receiver; legend: traffic flow, RP tree, shortest path tree.] • (S,G) traffic flow is now pruned off of this branch of the RPT and is flowing to the receiver via the SPT. • Traffic for other sources may still be flowing down the RPT.
ASM SPT Cutover • [Diagram: source, RP, receiver; (S, G) Prune sent by the RP; legend: traffic flow, RP tree, shortest path tree.] • (S, G) traffic flow is no longer needed by the RP, so it prunes the flow of (S, G) traffic.
ASM SPT Cutover • [Diagram: source, RP, receiver; legend: traffic flow, RP tree, shortest path tree.] • (S, G) traffic is now flowing to the receiver only via a single branch of the SPT. • As long as the source remains active, its first-hop router sends Null-Register messages to the RP, enabling the RP to maintain a list of all active sources.
RP Options • Remember, the RP is used to “hook up” receivers with senders. Receivers only know group address. • Static RP • Recommended • Easy transition to Anycast-RP • Allows for a hierarchy of RPs • Auto-RP (Cisco proprietary) • Fixed convergence timers (slow) • Must flood RP mapping traffic • Bootstrap router (BSR) • Fixed convergence timers (slow) • Allows for a hierarchy of RPs
RP Options • In most cases, static RP is the best option: • simple: just tell every router the RP address (once!) • flexible: use a /32 on a loopback interface so it can be moved • scalable: add more instances of same RP address for redundancy, load splitting, topological localization, etc. • survivable: fail-over from one RP to another is as fast as IGP convergence • blessed: RFC 3446 (just 8 pages!) • Only use more complicated options if you really need to: • different RP(s) for different groups • see later Anycast-RP slides for details
Inter-domain ASM and MSDP • A PIM domain is a network in which all routers use the same RP for any given multicast group. • Inter-domain ASM requires another protocol: Multicast Source Discovery Protocol (MSDP). • Why? Because the receiver is restricted to sending only (*,G) joins to its RP. And its RP doesn’t know where the source is, because the source is registered to a different RP. MSDP is needed for the receiver's RP to find the (S,G). • Officially, MSDP is a temporary solution. We shall see.
MSDP Peers (inter-domain case) • MSDP establishes a neighbor relationship between MSDP peers • Peers connect using TCP port 639 • Peers send keepalives every 60 secs (fixed) • Peer connection reset after 75 seconds if no MSDP packets or keepalives are received • MSDP peers must have knowledge of multicast topology. • Required for peer-RPF checking of the RP address in the SA to prevent SA looping. Note that this is not the same thing as the multicast routing RPF check.
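The two peer timers above interact in a simple way: a keepalive is due after 60 seconds of sending nothing, and the session is reset after 75 seconds of receiving nothing. The sketch below models just that check; the class `MsdpPeer` and its field names are illustrative, and timestamps are plain seconds supplied by the caller.

```python
from dataclasses import dataclass

KEEPALIVE_INTERVAL = 60.0  # seconds, fixed
HOLD_TIME = 75.0           # seconds without any MSDP packet before reset


@dataclass
class MsdpPeer:
    last_received: float  # time the last MSDP packet or keepalive arrived
    last_sent: float      # time we last sent anything to the peer

    def keepalive_due(self, now: float) -> bool:
        return now - self.last_sent >= KEEPALIVE_INTERVAL

    def should_reset(self, now: float) -> bool:
        return now - self.last_received >= HOLD_TIME
```

The 15-second gap between the two values means a single lost or delayed keepalive is enough to reset the session, which is one reason flapping MSDP peerings are worth checking early when interdomain sources come and go.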
MSDP Operation: Flooding • Initial SA message sent when source DR first registers • May optionally encapsulate first data packet • Originating RP sends subsequent SA messages every 60 seconds, for as long as source remains active • Flooding • SA (source active) packets periodically sent to MSDP peers indicating: • source IP address of active streams • group multicast IP address of active streams • IP address of RP originating the SA • An RP originates SAs only for sources within its own domain!
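The three fields carried in an SA, and the rule that an RP originates SAs only for its own domain's sources, can be sketched as follows. The names `SourceActive` and `originate_sa` are illustrative, and the list of local prefixes is a hypothetical stand-in for however a real RP decides which sources belong to its domain.

```python
import ipaddress
from typing import NamedTuple, Optional


class SourceActive(NamedTuple):
    source: str      # source IP address of the active stream
    group: str       # multicast group address of the active stream
    origin_rp: str   # IP address of the RP originating the SA


def originate_sa(source, group, rp_address, local_prefixes) -> Optional[SourceActive]:
    """Return an SA for a registered source, or None if it is not a local source."""
    src = ipaddress.ip_address(source)
    if not any(src in ipaddress.ip_network(prefix) for prefix in local_prefixes):
        return None  # another domain's source: its own RP originates the SA
    return SourceActive(source, group, rp_address)
```

An RP whose SAs are missing at a remote peer is often failing a check like this one, for example because the source never registered or sits outside what the RP treats as local.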
MSDP Overview • [Diagram: five domains (A-E), each with an RP; the RPs are MSDP peers. Source s registers with its RP (Register 192.1.1.1, 224.2.2.2); Source Active (SA) messages for (192.1.1.1, 224.2.2.2) are flooded between the MSDP peers; receiver r sends Join (*, 224.2.2.2).]
MSDP Overview • [Diagram: the same five domains and MSDP peers; Join (S, 224.2.2.2) messages are sent hop by hop toward source s, and multicast traffic flows from s to receiver r.]