Inter-Peer NOC Communication Mike Hughes mike@linx.net
Scene Setting: Straw Poll • Who here in this room does peering? • Have you ever had issues resolving problems with your peerings? • Difficulties contacting peers, finding the right contact, communication problems? • Do you maintain a local db of contacts? • Why? Issues with freshness of data? • When a peer needs to talk to you, where does their call/email arrive? • Main NOC contact? Dedicated peering contact? “Customer Care”? • Some names have been changed to protect the innocent… and guilty…
Why do you go peering? • Long term money savings • Less Transit • Lower latency, better performance • Traffic Control • Diversity, Reliability • Presence …and so on…
Where’s the problem? • Poor inter-peer communication seems to be common • Friendly IX operator called in to “mediate” • Communication hitting the wrong place • Customer NOCs • IX Operator • IP address maintainer (e.g. whois contact)
Identifying the right contact • Sources of information: • Whois queries to databases • IXP-maintained NOC and Peering contact db • Internal databases • Third-party voluntary databases • http://puck.nether.net/netops list • peeringdb.com • All above are vulnerable to information “rot”
How to drive RIPEdb/RA, etc • Some really subtle differences in the implementations • RIPE expects “AS” before an AS number! • Which contacts are useful • Which objects to look up • Like the Peer ASN, not the Peer IP address! • Why can’t ASN be logged in adjacency changes on routers? • This seems to drive IP-based lookups
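Since some whois servers expect the "AS" prefix and others are more forgiving, a lookup script can normalise the peer ASN before querying. This is a small sketch, not any registry's official client; the function name `normalize_asn` is an assumption made for illustration.

```python
import re


def normalize_asn(value):
    """Return an AS number in the 'AS<number>' form, e.g. 'AS3856'.

    Accepts 3856, '3856', 'as3856' or 'AS3856'. Anything else, such as
    an IP address, is rejected so it is not sent as an ASN query.
    """
    text = str(value).strip().upper()
    match = re.fullmatch(r"(?:AS)?(\d+)", text)
    if match is None:
        raise ValueError(f"not an AS number: {value!r}")
    number = int(match.group(1))
    if not 0 < number < 2**32:
        raise ValueError(f"AS number out of range: {value!r}")
    return f"AS{number}"
```

Rejecting dotted input up front also guards against the habit of pasting the peer IP address into the query.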
Drive the Data Sources Properly! • Example: using WHOIS queries • “Oh, I have an outage on WAIX, I’ll look up the IP address”
$ whois -h whois.arin.net 198.32.212.11 | less
…
OrgName: Exchange Point Blocks
…
RTechHandle: WM110-ARIN
RTechName: Manning, Bill
RTechPhone: +1-310-322-8102
RTechEmail: bmanning@karoshi.com
Bad Data Enters the System • “Okay, I’ll phone Bill Manning” • But all Bill did was give WAIX some v4 space • Bill doesn’t run WAIX, and isn’t an operational contact for WAIX • So, Bill either ignores your voicemail, or tells you to call someone else • Whatever – it’s added delay, increased frustration – it’s how not to do it
Driving Whois Properly • Always look up the PEER ASN • Not the IP address! • It’s a BGP problem, we use ASNs in BGP
$ whois -h whois.ra.net AS3856 | less
aut-num: AS3856
as-name: UNSPECIFIED
descr: Packet Clearing House www.pch.net
admin-c: Bill Woodcock
tech-c: Bill Woodcock
remarks: peering@pch.net, +1 866 BGP PEER
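The ASN lookup above can also be scripted. The sketch below speaks plain WHOIS (a query line followed by CRLF on TCP port 43, reading until the server closes the connection) and splits the reply into attribute/value pairs. The names `whois_query` and `parse_attributes` are illustrative assumptions, the sample text is copied from the slide, and continuation lines are ignored for brevity.

```python
import socket


def whois_query(server, query, port=43, timeout=10.0):
    """Send one WHOIS query and return the server's full text reply."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection: reply complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_attributes(text):
    """Collect 'attribute: value' lines into a dict of value lists."""
    fields = {}
    for line in text.splitlines():
        # Skip blanks, server comments, and indented continuation lines.
        if not line or line[0] in "%#" or line[0].isspace() or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields.setdefault(key.strip(), []).append(value.strip())
    return fields


# Sample reply taken from the slide, so parsing can be tried offline.
SAMPLE = """aut-num: AS3856
as-name: UNSPECIFIED
descr: Packet Clearing House www.pch.net
admin-c: Bill Woodcock
tech-c: Bill Woodcock
remarks: peering@pch.net, +1 866 BGP PEER
"""

PARSED = parse_attributes(SAMPLE)
# Live use (needs network access): parse_attributes(whois_query("whois.ra.net", "AS3856"))
```

Note that the useful operational contact here sits in `remarks`, not in `admin-c` or `tech-c`, so a script should show the whole object instead of picking one field.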
So you’ve found the contact • How do they respond to you? • Confusing recursive call trees? • Recalcitrant ticketing systems? • First-line NOC – “Is it switched on?” • “You’re not a customer, go away” • Once negotiated, peering is an engineering relationship • So backbone ops, not “customer care”
Expectations of Peer Contacts • Choose your points of contact carefully • Big problems with • What’s peering/BGP/WAIX? • Are you a customer? • What’s your circuit ID? • Go away, you aren’t a customer • All serious no-no’s – be nice to your peers!
PCH INOC-DBA Phones • PCH operate a “dial by ASN” NOC hotline system • They run the SIP registry/proxy • “Bring your own” SIP compliant phone • The idea is that it should get through to someone clueful • No call-trees, no music-on-hold • http://www.pch.net/inoc-dba/
Suggested Role Contacts • Peering@ • For setting up new peerings, changing existing ones, no 24x7 expectation • Shouldn’t go exclusively to sales@ ;-) • NOC@ • Reaches your 24x7 NOC, which is either BGP friendly and has enable, or knows when, how and where to escalate • Support@ • Is generally your “customer-care”/call center
Getting the message across • Okay, so you’ve made contact • Now, make your point • Provide the peer with useful information • Start with the subject line • Be informative, who, when, what • Messages like “Help” and “Peering down” aren’t helpful
How not to do it…
-----Original Message-----
From: Joe Schmoe <schmoe@noc.foo.com>
Sent: Wednesday, January 25, 2006 5:41 PM
Subject: Maintenance Notification
Dear Peers, …
• Where? How does it affect me? • All detail buried in wordy message body • When? No TZ stamp! • Help me handle my huge NOC inbox!
Example: Useful Subject Headers
AS7132’s preferred subject line format:
<IX location> - <peer writing to/ASN> - <peer writing from/ASN> - <what is the issue> - <date of initial correspondence> - <time of initial message>
Example subject line:
Equinix-Ashburn - RCN/6079 - SBC/7132 - new session turn-up - 29-Mar-06 - 9:45 am EST
Thanks to Ren Provo
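A subject line in this format is easy to generate, so nobody has to remember the field order at 3 am. The following is a minimal sketch; the function name `peering_subject` and its parameters are assumptions for illustration, and the timezone label is passed in explicitly so it can never be left off.

```python
from datetime import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def peering_subject(ix, to_name, to_asn, from_name, from_asn,
                    issue, when, tz_label):
    """Build '<IX> - <to/ASN> - <from/ASN> - <issue> - <date> - <time>'."""
    date_part = f"{when.day:02d}-{MONTHS[when.month - 1]}-{when.year % 100:02d}"
    hour = when.hour % 12 or 12  # 12-hour clock: 0 and 12 both show as 12
    ampm = "am" if when.hour < 12 else "pm"
    time_part = f"{hour}:{when.minute:02d} {ampm} {tz_label}"
    return " - ".join([
        ix,
        f"{to_name}/{to_asn}",
        f"{from_name}/{from_asn}",
        issue,
        date_part,
        time_part,
    ])


SUBJECT = peering_subject(
    "Equinix-Ashburn", "RCN", 6079, "SBC", 7132,
    "new session turn-up", datetime(2006, 3, 29, 9, 45), "EST",
)
```

Month names come from a fixed table instead of `strftime("%b")` so the result does not change with the system locale.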
Look clueful
Subject: Traffic Drop
Dear Peer, We suddenly noticed a 300Mb drop in traffic on our connection to the PIE-IX. Can you investigate, and help us find where the traffic has gone? Regards, …
• What does this say about your peer? • Don’t you think they look silly? • Run tools to help you answer these questions yourself • Netflow, MAC accounting, etc.
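With per-peer counters from Netflow or MAC accounting, the "where has the traffic gone?" question can be answered locally before mailing anyone. The sketch below compares two snapshots of per-peer rates; the function `biggest_drops`, the 10 Mb/s threshold, and the sample figures (documentation-range ASNs) are hypothetical, and collecting the counters from the router is left out.

```python
def biggest_drops(before, after, min_drop_mbps=10.0):
    """List (peer, drop in Mb/s) for peers whose traffic fell noticeably.

    before and after map a peer identifier to its traffic rate in Mb/s.
    A peer missing from 'after' is treated as having dropped to zero.
    """
    drops = []
    for peer, old_rate in before.items():
        delta = old_rate - after.get(peer, 0.0)
        if delta >= min_drop_mbps:
            drops.append((peer, delta))
    # Largest drop first, so the likely culprit is at the top.
    return sorted(drops, key=lambda item: item[1], reverse=True)


# Hypothetical snapshots taken before and after the drop was noticed.
BEFORE = {"AS64500": 320.0, "AS64501": 80.0, "AS64502": 45.0}
AFTER = {"AS64500": 20.0, "AS64501": 78.0, "AS64502": 50.0}

DROPS = biggest_drops(BEFORE, AFTER)
```

A result naming one peer turns a vague "traffic drop" mail into a specific question for the right NOC.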
How to escalate • Check your equipment first • Ask your peer - “What’s up?” • Often you can resolve a problem bilaterally • Go to the IX only if you need to • Not all IX operators can provide a 24x7 contact • When to escalate a customer fault • Don’t stonewall customer reports • Don’t point them to the IX operator • Co-ordinate directly with your peers
How the IXP Op can help • Provide an up-to-date list of IX participants and their NOC/Peering contact information • Usually password protected • Help break comms deadlock • Help fix “dead ends” • Otherwise, they can only help with “physical” problems • “link down”, packet loss, broken cables, packet corruption to all destinations connected to the IXP
In Summary • Keep your own information up to date • Whois db objects, third party dbs • Make sure your peering and NOC contacts are appropriate • No-one likes call-trees and holding • Find the right contacts at your peers • Be nice to your peers!
Thanks • mike@linx.net