Learn about IPv4 multicast basics, group and source addressing, reachability, examples, lessons from MBONE, and applying these lessons to modern multicast deployment using Multicast Border Gateway Protocol (MBGP) and Protocol Independent Multicast (PIM) Sparse Mode.
Best Current Practices for IPv4 Multicast Deployment
Bill Nickless
nickless@mcs.anl.gov
http://www.mcs.anl.gov/home/nickless
What is Multicast? • A multicast sender simply sends its data, and intervening routers "conspire" to get the data to all interested listeners. (S. Deering) • Destination of IP multicast packets is a “Group” address, within 224.0.0.0/4.
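The group-address rule can be checked mechanically. The following is a minimal sketch using Python's standard `ipaddress` module; the helper name `is_multicast_group` is made up for illustration and is not part of the original slides.

```python
import ipaddress

# 224.0.0.0/4 is the IPv4 multicast ("Group") range named on the slide.
MULTICAST_RANGE = ipaddress.ip_network("224.0.0.0/4")


def is_multicast_group(address: str) -> bool:
    """Return True if the IPv4 address falls inside 224.0.0.0/4."""
    return ipaddress.ip_address(address) in MULTICAST_RANGE


if __name__ == "__main__":
    # Two group addresses and one ordinary unicast host address.
    for addr in ("224.2.177.155", "239.255.255.255", "140.221.11.103"):
        print(addr, is_multicast_group(addr))
```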
Notation • Specific source address(es): S • Specific group address(es): G • Specific source traffic for a group: (S,G) • All sources traffic for a group: (*,G) • Rendezvous Point RP
Any Source Multicast • Senders send multicast group-addressed packets. • Receivers register their interest in groups by way of IGMPv2 (*,G) Joins • Network keeps track of all senders for each group, and delivers packets from all senders to each interested Receiver.
Source Specific Multicast • Senders send multicast group-addressed packets. • Receivers register their interest in specific sources sending to specific groups by way of IGMPv3 (S,G) Joins (well, group membership reports….) • Receivers are responsible for specifying which Senders’ traffic they want to receive.
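The difference between the two service models comes down to what a receiver's join matches. The sketch below is a toy model only, not an IGMP implementation: the `Subscription` type and `wants_packet` helper are hypothetical, and the addresses are borrowed from elsewhere in this deck purely as sample values.

```python
from typing import Optional, Set, Tuple

# A subscription is (source, group). A source of None stands for the
# "*" in a (*,G) join, meaning "any sender to this group".
Subscription = Tuple[Optional[str], str]


def wants_packet(subscriptions: Set[Subscription], source: str, group: str) -> bool:
    """Return True if any subscription matches a packet from source to group."""
    return (None, group) in subscriptions or (source, group) in subscriptions


# Any Source Multicast: the receiver names only the group.
asm_receiver: Set[Subscription] = {(None, "224.2.177.155")}

# Source Specific Multicast: the receiver names the source and the group.
ssm_receiver: Set[Subscription] = {("140.221.34.1", "233.2.171.1")}

if __name__ == "__main__":
    print(wants_packet(asm_receiver, "128.163.3.214", "224.2.177.155"))
    print(wants_packet(ssm_receiver, "192.0.2.7", "233.2.171.1"))
```

In the ASM case the network has to discover every sender on the receiver's behalf; in the SSM case the receiver has already said which sender it wants.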
Reachability NOT DEFINED BY INTERNET STANDARDS
Reachability (Where To?) • NOT DEFINED BY INTERNET STANDARDS • Unicast reachability is interpreted by implementation and practice as: Send me IP packets with destination addresses that match this advertisement. • Think ‘show ip route’
Reachability (Whence?) • NOT DEFINED BY INTERNET STANDARDS • Multicast reachability is interpreted by implementation and practice as: Here’s where to get IP packets from sources that match this advertisement. • Think ‘show ip rpf’
Reachability Examples
terra% netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags Iface
140.221.11.103  0.0.0.0         255.255.255.255 UH    eth0
140.221.8.0     0.0.0.0         255.255.252.0   U     eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     lo
224.0.0.0       0.0.0.0         240.0.0.0       U     eth0
0.0.0.0         140.221.11.253  0.0.0.0         UG    eth0
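The "where to?" lookup a host performs over such a table is a longest-prefix match. Here is a minimal sketch, assuming the five routes from the netstat output with each Genmask rewritten as a prefix length; the `lookup` function is illustrative and does not reproduce how any particular kernel stores its routes.

```python
import ipaddress

# (prefix, gateway, interface) taken from the netstat output, with each
# Genmask converted to a prefix length (255.255.252.0 -> /22, and so on).
ROUTES = [
    ("140.221.11.103/32", "0.0.0.0", "eth0"),
    ("140.221.8.0/22", "0.0.0.0", "eth0"),
    ("127.0.0.0/8", "0.0.0.0", "lo"),
    ("224.0.0.0/4", "0.0.0.0", "eth0"),
    ("0.0.0.0/0", "140.221.11.253", "eth0"),
]


def lookup(destination: str):
    """Return (prefix, gateway, interface) of the most specific matching route."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, gateway, interface in ROUTES:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network, gateway, interface))
    # The default route matches everything, so matches is never empty here.
    network, gateway, interface = max(matches, key=lambda m: m[0].prefixlen)
    return str(network), gateway, interface


if __name__ == "__main__":
    for dest in ("140.221.11.103", "140.221.9.1", "224.2.177.155", "192.0.2.1"):
        print(dest, lookup(dest))
```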
Reachability Examples
Kiwi#show ip route 140.221.11.103
Routing entry for 140.221.8.0/22
  Known via "ospf 683", distance 110, metric 1117, type intra area
  Last update from 140.221.20.124 on GigabitEthernet5/0, 03:35:56 ago
  Routing Descriptor Blocks:
  * 140.221.20.124, from 140.221.47.6, 03:35:56 ago, via GigabitEthernet5/0
      Route metric is 1117, traffic share count is 1
Reachability Examples
Kiwi#show ip rpf 140.221.11.103
RPF information for terra.mcs.anl.gov (140.221.11.103)
  RPF interface: GigabitEthernet5/0
  RPF neighbor: stardust-msfc-20.mcs.anl.gov (140.221.20.124)
  RPF route/mask: 140.221.8.0/22
  RPF type: unicast (ospf 683)
  RPF recursion count: 0
  Doing distance-preferred lookups across tables
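The "whence?" interpretation turns into a Reverse Path Forwarding check: a multicast packet is accepted only if it arrives on the interface the router would use to reach the packet's source. The sketch below assumes a one-entry RPF table built from the output above; `RPF_TABLE`, `rpf_interface`, and `passes_rpf` are hypothetical names, and a real router consults several tables (unicast, M-BGP, static) rather than one dictionary.

```python
import ipaddress

# Single RPF entry copied from the "show ip rpf" output. The dictionary
# layout is an assumption made for this sketch.
RPF_TABLE = {"140.221.8.0/22": "GigabitEthernet5/0"}


def rpf_interface(source: str):
    """Return the interface of the longest matching RPF prefix, or None."""
    src = ipaddress.ip_address(source)
    best = None
    for prefix, interface in RPF_TABLE.items():
        network = ipaddress.ip_network(prefix)
        if src in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, interface)
    return best[1] if best else None


def passes_rpf(source: str, arrival_interface: str) -> bool:
    """Accept a packet only if it arrived on the RPF interface for its source."""
    expected = rpf_interface(source)
    return expected is not None and expected == arrival_interface


if __name__ == "__main__":
    print(passes_rpf("140.221.11.103", "GigabitEthernet5/0"))
    print(passes_rpf("140.221.11.103", "ATM3/0.145"))
```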
The Old MBONE • Excellent first approximation. • Used tunnels to encapsulate multicast traffic over unicast paths. • Routing done by user-space daemons running on general purpose Unix boxes. • Internet Group Management Protocol (IGMP) (Think Multicast ARP) • Pre-dates the World Wide Web (hence SDR)
Lessons Learned from MBONE • Distance Vector Multicast Routing Protocol (DVMRP) does not scale • Easy to create IP Multicast “amplifiers”. • Separate tunneled routing infrastructure not aligned with modern BGP Internetworking. • Flood & Prune does not scale • Examples: PIM-Dense Mode, DVMRP. • Not sensitive to available bandwidth. • Requires downstream routers that are smart and powerful enough to send prune messages.
Applying Those Lessons • Multicast Border Gateway Protocol. • Provides reachability and policy control for multicast routing, just as BGP does for unicast. • Protocol Independent Multicast (Sparse Mode) • Listeners receive traffic only when requested. • Forms multicast distribution trees. • Multicast Source Discovery Protocol • Finding active sources in other PIM Sparse Mode domains (usually other ASes).
Setting Reachability Policy: Multicast Border Gateway Protocol • RFC 2283 adds the MP_REACH_NLRI attribute to BGP-4. • Identifies a BGP route as unicast, multicast, or both. • When implemented in a router, all the standard BGP machinery is available for prefix filtering, preference setting, MEDs, AS-path length comparisons, etc. • M-BGP routes can be independent of the unicast BGP routes, allowing for different inter-AS unicast/multicast reachability.
Cisco M-BGP Configuration
router bgp 683
 network 130.202.0.0 nlri unicast multicast
 network 140.221.0.0 nlri unicast multicast
 neighbor 192.5.170.130 remote-as 145 nlri unicast multicast
 neighbor 192.5.170.130 description vBNS
 neighbor 192.5.170.130 soft-reconfiguration inbound
 neighbor 192.5.170.130 route-map from-vbns-lp-400 in
 neighbor 192.5.170.130 route-map to-vbns-med-10 out
Cisco M-BGP Configuration
route-map from-vbns-lp-400 permit 10
 match nlri unicast
 set local-preference 400
!
route-map from-vbns-lp-400 permit 15
 match as-path 145
 match nlri multicast
 set local-preference 400
!
route-map to-vbns-med-10 permit 10
 match ip address 50
 set metric 10
Cisco M-BGP Configuration
access-list 50 permit 140.221.0.0
access-list 50 permit 130.202.0.0
!
ip as-path access-list 145 deny _24_
ip as-path access-list 145 deny _293_
ip as-path access-list 145 deny _11537_
ip as-path access-list 145 permit .*
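The effect of as-path access-list 145 is to reject any multicast route whose AS path passes through AS 24, 293, or 11537 and to accept everything else. The sketch below approximates that with whole-token matching on a space-separated AS path. It is not a reimplementation of Cisco's regular-expression engine: the `_` delimiter also matches braces and commas in AS sets, which this sketch ignores, and `as_path_permitted` is a hypothetical helper.

```python
# AS numbers denied by "ip as-path access-list 145" in the configuration.
DENIED_ASNS = {24, 293, 11537}


def as_path_permitted(as_path: str) -> bool:
    """Return True unless a denied AS number appears as a whole token in the path.

    Matching whole tokens mirrors the intent of patterns such as _24_,
    which must not match AS 2410 or AS 124.
    """
    asns = {int(token) for token in as_path.split()}
    return asns.isdisjoint(DENIED_ASNS)


if __name__ == "__main__":
    # Both paths appear in the "show ip mbgp" output later in the deck.
    print(as_path_permitted("145 10490 10437"))
    print(as_path_permitted("24 145 10490 10437"))
```

Only permitted paths reach the `set local-preference 400` step of route-map entry 15; a denied path falls off the end of the route-map.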
Juniper M-BGP Configuration
routing-options {
    rib inet.2 {
        static {
            route 141.142.0.0/16 reject;
            route 141.142.109.0/25 next-hop 141.142.11.74;
            route 141.142.109.128/25 next-hop 141.142.11.74;
            route 141.142.104.0/24 next-hop 141.142.11.74;
            route 141.142.105.0/24 next-hop 141.142.11.74;
            route 141.142.108.0/24 next-hop 141.142.11.74;
        }
    }
}
Juniper M-BGP Configuration
routing-options {
    rib-groups {
        ifrg {
            import-rib [ inet.0 inet.2 ];
        }
        mcrg {
            export-rib inet.2;
            import-rib inet.2;
        }
        igp-rg {
            export-rib inet.0;
            import-rib [ inet.0 inet.2 ];
        }
    }
}
Juniper M-BGP Configuration
protocols {
    bgp {
        group anl {
            import [ bgp-anl-accept reject-all ];
            family inet {
                any;
            }
            export [ bgp-announce-ncsa reject-all ];
            peer-as 683;
            neighbor 206.220.243.21;
        }
    }
}
Monitoring M-BGP (Cisco)
Kiwi#show ip mbgp sum
BGP router identifier 192.5.170.2, local AS number 683
MBGP table version is 324285
4121 network entries and 12621 paths using 862335 bytes of memory
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
192.5.170.130   4   145   53420   20497   324285    0    0 5d14h    346
Kiwi#show ip mbgp 128.163.3.214
MBGP routing table entry for 128.163.0.0/16, version 323761
Paths: (3 available, best #2)
  24 145 10490 10437, (aggregated by 10437 128.163.55.253), (received-only)
    192.12.123.10 from 192.12.123.10 (198.10.80.66)
      Origin IGP, localpref 100, valid, external, atomic-aggregate
  145 10490 10437, (aggregated by 10437 128.163.55.253)
    192.5.170.130 from 192.5.170.130 (204.147.135.241)
      Origin IGP, localpref 400, valid, external, atomic-aggregate, best
  145 10490 10437, (aggregated by 10437 128.163.55.253), (received-only)
    192.5.170.130 from 192.5.170.130 (204.147.135.241)
      Origin IGP, localpref 100, valid, external, atomic-aggregate
Monitoring M-BGP (Juniper)
nickless@charlie> show bgp neighbor 206.220.243.21
Peer: 206.220.243.21+179 AS 683 Local: 206.220.243.160+1969 AS 1224
  [. . .]
  NLRI advertised by peer: inet-unicast inet-multicast
  NLRI for this session: inet-unicast inet-multicast
  Peer supports Refresh capability (2)
  Table inet.0 Bit: 10006
    Active Prefixes: 13
    Received Prefixes: 13
    Suppressed due to damping: 0
  Table inet.2 Bit: 20006
    Active Prefixes: 9
    Received Prefixes: 9
    Suppressed due to damping: 0
nickless@charlie> show route table inet.2 140.221.34.1
inet.2: 5046 destinations, 5046 routes (5045 active, 0 holddown, 1 hidden)
+ = Active Route, - = Last Active, * = Both
140.221.0.0/16     *[BGP/170] 2w5d 19:24:04, MED 0, localpref 1000
                      AS path: 683 I
                    > to 206.220.243.21 via at-1/0/0.683
                    [BGP/170] 3d 04:38:22, MED 0, localpref 60
                      AS path: 11537 683 I
                    > to 141.142.11.246 via so-2/2/0.0
                    [BGP/170] 1w0d 11:18:35, localpref 60
                      AS path: 145 683 I
                    > to 141.142.11.1 via at-1/0/0.145
                    [BGP/170] 2w5d 19:23:42, localpref 60
                      AS path: 38 683 I
                    > to 192.17.8.32 via at-1/0/0.38
                    [BGP/170] 4d 05:55:21, MED 5, localpref 20
                      AS path: 2914 683 I
                    > to 192.17.8.34 via at-1/0/0.2914
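The active route in this output is the one with the highest local preference. The sketch below replays just the first two steps of BGP route selection (highest local preference, then shortest AS path) over the five paths shown. It is a simplification under that stated assumption: real implementations continue with origin, MED, and further tie-breakers, and the `best_path` helper and dictionary layout are made up for this example.

```python
# The five inet.2 paths for 140.221.0.0/16 from the output above.
PATHS = [
    {"localpref": 1000, "as_path": [683], "next_hop": "206.220.243.21"},
    {"localpref": 60, "as_path": [11537, 683], "next_hop": "141.142.11.246"},
    {"localpref": 60, "as_path": [145, 683], "next_hop": "141.142.11.1"},
    {"localpref": 60, "as_path": [38, 683], "next_hop": "192.17.8.32"},
    {"localpref": 20, "as_path": [2914, 683], "next_hop": "192.17.8.34"},
]


def best_path(candidates):
    """Prefer the highest local preference, then the shortest AS path.

    Later tie-breakers used by real routers are deliberately omitted.
    """
    return min(candidates, key=lambda p: (-p["localpref"], len(p["as_path"])))


if __name__ == "__main__":
    chosen = best_path(PATHS)
    print(chosen["next_hop"], chosen["localpref"])
```

Because the multicast RPF lookup follows whichever route wins here, setting local preference on NLRI=Multicast routes is the main policy lever for steering multicast traffic.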
PIM Sparse Mode • RFC 2362 defines PIM Sparse Mode. • No PIM-SM activity until: • A host starts transmitting traffic (or) • A host subscribes to a group. • A Rendezvous Point (RP) is the root of the shared distribution tree for multicast traffic within a PIM Domain. • Given enough traffic, a source-based distribution tree is created. (Enough is typically anything greater than zero). • Inter-PIM Domain distribution trees are all source-based.
Multicast Source Discovery Protocol (MSDP) • Not yet an RFC (in Last Call stage). See http://www.ietf.org/html.charters/msdp-charter.html and ftp://ftp.ietf.org/internet-drafts/draft-ietf-msdp-spec-09.txt • Currently only covers IPv4. • PIM-SM RPs communicate through MSDP to find active multicast sources. • If “interested”, the RP initiates a PIM-SM Join towards each active source.
Reachability Redux • A BGP NLRI=Multicast route is a statement of reachability. • Inter-domain PIM-Sparse Mode Joins follow the BGP reachability topology. • MSDP forwarding between RPs follows the BGP reachability topology. • Not doing MSDP where you do M-BGP means that you’ve formed an MSDP “black hole”.
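One way to catch an MSDP black hole is to compare the set of neighbor ASes that exchange NLRI=Multicast routes with the set that have an MSDP peering. The sketch below compares per AS because, in this deck's configuration, the M-BGP neighbor (192.5.170.130) and the MSDP peer (204.147.128.141) are different addresses inside AS 145. The function name, the dictionary layout, and the second peering in the usage example are assumptions made for illustration.

```python
def msdp_black_holes(mbgp_multicast_peers, msdp_peers):
    """Return the sorted ASes that send us multicast routes but have no MSDP peering.

    Both arguments map a neighbor AS number to a peer address.
    """
    return sorted(set(mbgp_multicast_peers) - set(msdp_peers))


if __name__ == "__main__":
    # AS 145 comes from the Cisco configuration in this deck. The AS 64500
    # entry is a hypothetical second peering using a documentation address.
    mbgp = {145: "192.5.170.130", 64500: "192.0.2.1"}
    msdp = {145: "204.147.128.141"}
    print(msdp_black_holes(mbgp, msdp))
```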
Cisco PIM-SM w/ MSDP Configuration
interface ATM3/0.145 point-to-point
 description vBNS MBGP+PIM-SM+MSDP
 ip address 192.5.170.129 255.255.255.252
 ip pim border
 ip pim sparse-mode
 ip multicast ttl-threshold 32
 ip multicast boundary 10
ip msdp peer 204.147.128.141
ip msdp description 204.147.128.141 vBNS
ip msdp sa-filter in 204.147.128.141 list 111
ip msdp sa-filter out 204.147.128.141 list 111
ip msdp sa-request 204.147.128.141
ip msdp ttl-threshold 204.147.128.141 32
ip msdp cache-sa-state
access-list 10 deny 224.0.1.39 ! CISCO-RP-ANNOUNCE.MCAST.NET
access-list 10 deny 224.0.1.40 ! CISCO-RP-DISCOVERY.MCAST.NET
access-list 10 deny 239.0.0.0 0.255.255.255
access-list 10 permit 224.0.0.0 15.255.255.255
access-list 111 deny ip any host 224.0.2.2 ! SUN-RPC.MCAST.NET
access-list 111 deny ip any host 224.0.1.3 ! RWHOD.MCAST.NET
access-list 111 deny ip any host 224.0.1.24 ! MICROSOFT-DS.MCAST.NET
access-list 111 deny ip any host 224.0.1.22 ! SVRLOC.MCAST.NET
access-list 111 deny ip any host 224.0.1.2 ! SGI-DOG.MCAST.NET
access-list 111 deny ip any host 224.0.1.35 ! SVRLOC-DA.MCAST.NET
access-list 111 deny ip any host 224.0.1.60 ! HP-DEVICE-DISC.MCAST.NET
access-list 111 deny ip any host 224.0.1.39 ! CISCO-RP-ANNOUNCE.MCAST.NET
access-list 111 deny ip any host 224.0.1.40 ! CISCO-RP-DISCOVERY.MCAST.NET
access-list 111 deny ip any 239.0.0.0 0.255.255.255
access-list 111 deny ip 10.0.0.0 0.255.255.255 any
access-list 111 deny ip 127.0.0.0 0.255.255.255 any
access-list 111 deny ip 172.16.0.0 0.15.255.255 any
access-list 111 deny ip 192.168.0.0 0.0.255.255 any
access-list 111 permit ip any
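The address/mask pairs in these access lists use wildcard masks, where a 1 bit means "ignore this bit", the inverse of a netmask. A minimal sketch of that comparison follows; `wildcard_match` is a hypothetical helper, not a router API.

```python
import ipaddress


def wildcard_match(address: str, base: str, wildcard: str) -> bool:
    """Return True if address matches base under a Cisco-style wildcard mask.

    Bits set to 1 in the wildcard are ignored; all other bits must be equal.
    """
    addr_bits = int(ipaddress.IPv4Address(address))
    base_bits = int(ipaddress.IPv4Address(base))
    care_bits = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (addr_bits & care_bits) == (base_bits & care_bits)


if __name__ == "__main__":
    # Administratively scoped groups (239/8) are denied by both lists.
    print(wildcard_match("239.1.2.3", "239.0.0.0", "0.255.255.255"))
    # A global-scope group falls through to the final permit of list 10.
    print(wildcard_match("224.2.177.155", "224.0.0.0", "15.255.255.255"))
    # Sources in private address space should not appear in SA messages.
    print(wildcard_match("172.31.0.1", "172.16.0.0", "0.15.255.255"))
```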
Juniper PIM-SM w/ MSDP Config
protocols {
    pim {
        rib-group mcrg;
        rp {
            local {
                address 141.142.12.1;
            }
        }
        interface all {
            mode sparse;
            version 2;
        }
    }
}
Juniper PIM-SM w/ MSDP Config
protocols {
    msdp {
        rib-group mcrg;
        group anl {
            /* kiwi-loop.anchor.anl.gov */
            peer 192.5.170.2 {
                local-address 141.142.12.1;
            }
        }
    }
}
Monitoring MSDP and PIM-Sparse
• Verify that MSDP session has come up with your peer:
Kiwi#show ip msdp sum
MSDP Peer Status Summary
Peer Address     AS    State    Uptime/  Reset Peer Name
                                Downtime Count
204.147.128.141  145   Up       1d12h    11    cs.dng.vbns.net
nickless@charlie> show msdp peer 192.5.170.2
Peer address    Local address   State        Last up/down Peer-Group
192.5.170.2     141.142.12.1    Established  2w5d18h      anl
Monitoring MSDP and PIM-Sparse
• Verify that active sources are being discovered:
Kiwi#show ip msdp sa-cache 224.2.177.155
MSDP Source-Active Cache - 4020 entries
(128.197.160.27, 224.2.177.155), RP 204.147.128.141, MBGP/AS 145, 03:40:18/00:05:03
[…etc]
nickless@charlie> show msdp source-active group 233.2.171.1
Group address   Source address  Peer address     Originator   Flags
233.2.171.1     140.221.34.1    141.142.11.246   192.5.170.2  Accept
                                192.5.170.2      192.5.170.2  Accept
                                192.17.8.32      192.5.170.2  Accept
                                204.147.128.141  192.5.170.2  Accept
Monitoring MSDP and PIM-Sparse
• Verify that you are receiving traffic from those active sources, and are forwarding:
Kiwi#show ip mroute count 224.2.177.155 128.163.3.214
Forwarding Counts: Pkt Count/Pkts per second/Avg Pkt Size/Kilobits per second
Other counts: Total/RPF failed/Other drops(OIF-null, rate-limit etc)
Group: 224.2.177.155, Source count: 26, Group pkt count: 31060731
  RP-tree: Forwarding: 159/0/429/0, Other: 72/0/0
  Source: 128.163.3.214/32, Forwarding: 7089/0/480/0, Other: 6/0/0
Kiwi#show ip mroute 224.2.177.155 128.163.3.214
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, C - Connected, L - Local, P - Pruned
       R - RP-bit set, F - Register flag, T - SPT-bit set, J - Join SPT,
       M - MSDP created entry, X - Proxy Join Timer Running
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
(128.163.3.214, 224.2.177.155), 03:55:28/00:03:22, flags: MT
  Incoming interface: ATM3/0.145, RPF nbr 192.5.170.130, Mbgp
  Outgoing interface list:
    ATM0/0.216, Forward/Sparse, 03:55:28/00:03:08
    ATM0/0.200, Forward/Sparse, 03:55:28/00:02:04
nickless@charlie> show multicast route group 233.2.171.1 \
    source-prefix 140.221.34.1 extensive
Group          Source prefix     Act Pru NHid Packets  IfMismatch T/O
233.2.171.1    140.221.34.1 /32  A   F   68   1829657  0          355
    Upstream interface: at-1/0/0.683
    Session name: Static Allocations
nickless@charlie> show multicast route group 233.2.171.1 \
    source-prefix 140.221.34.1 extensive
Group          Source prefix     Act Pru NHid Packets  IfMismatch T/O
233.2.171.1    140.221.34.1 /32  A   F   68   1830512  0          355
    Upstream interface: at-1/0/0.683
    Session name: Static Allocations
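Running the same command twice and comparing the Packets column is the quickest evidence that the (S,G) entry is actually forwarding. A minimal sketch of that comparison, using the two samples shown; the function name is hypothetical, and the time between the two samples is not given in the output, so no rate is computed.

```python
def packets_forwarded(first_sample: int, second_sample: int) -> int:
    """Return how many packets were forwarded between two counter readings."""
    if second_sample < first_sample:
        # A smaller second reading suggests the counter was cleared or the
        # multicast route entry was rebuilt between the two samples.
        raise ValueError("counter went backwards; entry may have been reset")
    return second_sample - first_sample


if __name__ == "__main__":
    # Packets column from the two "show multicast route" samples above.
    delta = packets_forwarded(1829657, 1830512)
    print(delta)  # 855
```

A delta of zero on a group with a known active source points upstream: the traffic is not arriving at this router at all.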
nickless@charlie> show pim join 233.2.171.1 extensive
Group          Source         RP      Flags
[. . .]
233.2.171.1    140.221.34.1           sparse,spt-pending
    Upstream interface: at-1/0/0.683
    Upstream State: Local RP, Join to Source
    Downstream Neighbors:
        Interface: ge-1/1/0.103
            141.142.0.14 State: Join Flags: S Timeout: 182
        Interface: gr-1/2/0.0
            141.142.11.74 State: Join Flags: S Timeout: 208
Other Tips • ATM peerings are best done with point-to-point subinterfaces. (What’s a Designated Router in the context of an ATM exchange point, anyway?) • MSDP Source Actives are made from PIM Register messages. If you’re not sending MSDP SA messages for a source, you may have a problem with the Designated Router for that source.
More Tips • MSDP encapsulates data in its Source Active messages (just like they were encapsulated in the PIM Sparse Mode Register messages). This was done primarily to support SDR. • It is possible for MSDP to work while PIM-SM is not working, so you can’t always count on SDR to verify multicast routing.
Debugging Multicast • You must have: • at least one constantly active source • at least one constantly active receiver • Start near the receiver • Identify the PIM-SM Designated Router • Verify IGMP state in the Designated Router • Look for (S,G) state in the Designated Router
Debugging Multicast • Follow the Reverse Path Forwarding (RPF) from the Designated Router back towards the source • Verify PIM-SM has been configured on each interface along the RPF, because that determines the forwarding tree topology. • Check (S,G) state in each router. • Check (S,G) counters in each router.
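The hop-by-hop procedure can be expressed as a walk along the RPF path that stops at the first router without (S,G) state. The sketch below assumes the RPF neighbor and state information have already been collected by hand from each router; the router names, the dictionary layout, and the function name are all hypothetical.

```python
def first_router_missing_state(start, upstream, has_sg_state):
    """Walk from the Designated Router towards the source.

    upstream maps each router to its RPF neighbor (None at the last hop).
    has_sg_state maps each router to True if it holds (S,G) state.
    Returns the first router lacking state, or None if every hop has it.
    """
    visited = set()
    router = start
    while router is not None and router not in visited:
        visited.add(router)  # guards against an accidental RPF loop in the input
        if not has_sg_state.get(router, False):
            return router
        router = upstream.get(router)
    return None


if __name__ == "__main__":
    # Hypothetical three-hop path: receiver-side DR, a core router, a border router.
    upstream = {"dr": "core", "core": "border", "border": None}
    state = {"dr": True, "core": True, "border": False}
    print(first_router_missing_state("dr", upstream, state))
```

The router this returns is where to check the PIM-SM interface configuration and the (S,G) counters first.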
Debugging Multicast • If the source is external to your PIM Domain: • Verify that you have an MSDP SA for that source. • Verify that the M-BGP Next Hop is: • A PIM Sparse Mode neighbor • An MSDP peer • Verify that you’re actually choosing the NLRI=Multicast route as your preferred RPF path. (hello BGP distance)
Debugging Multicast • What if nobody can hear your source? • Verify that the (S,G) shows up at your RP. • Verify that your RP is MSDP announcing the source, and that it shows up in your peer’s MSDP SA cache. • Verify your PIM-SM adjacency with your peer. • Verify that you have your peer’s interface in the outgoing list for the (S,G). • Verify that packet counters show traffic going out.
The Beacon: Test Signal • Testing Multicast requires active sessions • http://dast.nlanr.net/projects/beacon • In Java, so runs anywhere
The Beacon: Issues • Shows current state only. • Archive state over time? • How to visualize evolving state? Inherently a 3-dimensional problem, since state is 2D already. • Server scaling problems with O(40) beacons. • Currently seeing O(70) beacons at any time. • Assumes Any Source Multicast model.
Core Multicast Building Blocks • M-BGP: RFC 2283 is implemented by Juniper and Cisco in all major releases. AG community has used Juniper/Cisco the most. • MSDP: Implemented by Juniper, Cisco, Foundry... • PIM-Sparse Mode: RFC 2362 is implemented by a whole raft of vendors, including Cisco, Juniper, Foundry, Extreme, Marconi, etc.
Edge Multicast Building Blocks • IGMPv2 is widely available in Layer 2 and Layer 3 devices, and in most host operating systems. • IGMPv3 is coming soon to support SSM: • Available in Layer 3 devices from Cisco and Juniper. • IGMPv3 will be available in Windows XP (Whistler). • Ugly hack workarounds exist (URD et al).
North American IP Multicast Status • ESNet, Abilene, vBNS+, and NREN all running M-BGP, MSDP, and PIM-SM amongst themselves and with their customers/peers. • Regional and Institutional networks are currently the most common stumbling blocks for multicast apps. • STARTAP in Chicago is an international IP multicast meeting point. • International / commercial networks are coming online.