Guidelines for maintaining effective peer-to-peer communications in the realm of network operations, involving standardization, notification protocols, and subject line formats for efficient issue tracking.
Peering Operations: NOC-to-Peer Communications Standards
Ren Provo, ren.provo@att.com
AT&T Internet Services
Common Peering Expectations
• Peer must have a 24x7 NOC with internal trouble ticket capability and email & phone coverage.
• Commit to joint capacity planning and ongoing review.
• Security concerns should be worked out jointly unless under court order or denial of service (DoS) attack.
• Neither party should discuss the peer's network with clients or media.
• A peer can be a customer of a service, provided that service is not Internet access.
• Post your guidelines! http://www.att.com/peering/
Top Reasons Not to Discuss Existing Peering Relationships
• Peering relationships evolve daily.
• Legal or regulatory efforts may be underway.
• Alliance, merger & acquisition activity supersedes most current and/or posted peering policies.
• Most peering relationships in the US are bound by NDAs.
• Some peers consider this press release activity and insist on joint review, especially if their logo is included.
• Listing some but not all peers raises eyebrows about ‘missing’ peers, so lists should not be supplied.
Peering Contacts Should Be Standardized
• Peering@ - Policy/business hours. The alias should reach a person who can advise on the process for new peering and on augments to existing peerings, and who can support communication and contact updates with peering partners.
• Network Operations Center – 24x7. The alias should reach a person (or persons) who can work with the internal peering team to resolve maintenance and outage concerns. Should BGP routing/peering be outside the scope of the NOC role, the NOC should be able to escalate peering issues promptly to BGP/peering-friendly engineers.
Reasons to Communicate
• Maintenance notifications are encouraged but not required. To maintain a positive relationship and keep contacts current, make the effort if a session will be affected for 15 minutes or more.
• If you know you will trigger alarms, interrupt monitoring, or change a MAC address in a way that will halt RRD, MRTG, etc., supply a ‘heads-up’ rather than waste your peer network's cycles.
• Discovery of a network issue impacting a few or all peers.
Subject Line Format – Why So Picky?
• Blank subject lines are just plain sloppy!
• ‘bgp session down’ or ‘emergency reload’ is not helpful when sent to 1,000s of people on mailing lists and/or NOC distributions.
• IX location in human-readable vs. router-name format. Why force your peers to research your naming schema?
• Peer writing to/ASN & peer writing from/ASN – why?
  - Mergers, acquisitions, marketing whims… include the ASN to reduce the guessing game.
  - Chronic issues and ongoing threads are easier to track with searchable subject lines.
  - Some networks have multiple ASNs, and identifying the sub-network will expedite resolution.
Use Subject Lines Wisely
• Purpose: Help your peers track & rapidly resolve outages.
• AS7132’s preferred subject line format: <IX location in human-readable vs. router-name format>: <peer writing to/ASN> & <peer writing from/ASN> - <what is the general issue> - <date of initial outreach> - <time of initial message>
• Example subject line: Equinix-Dallas: RCN/6079 & AT&T/7132 – new session - 29-Mar-06 - 9:45 am EST
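The format above can be assembled mechanically so that every notice from a NOC looks the same. The following is a minimal sketch, not a tool from the presentation: the `PeerSubject` class, its field names, and the plain " - " separator are assumptions made for illustration (the slide examples mix hyphens and en-dashes).

```python
from dataclasses import dataclass
from datetime import datetime

# English month abbreviations, spelled out to avoid locale-dependent output.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


@dataclass
class PeerSubject:
    """Hypothetical container for the pieces of the preferred subject line."""
    ix_location: str   # human-readable IX location, e.g. "Equinix-Dallas"
    to_name: str       # peer being written to
    to_asn: int
    from_name: str     # peer writing the message
    from_asn: int
    issue: str         # general issue, e.g. "new session"
    when: datetime     # date/time of initial outreach
    tz_label: str      # time zone label stated explicitly, e.g. "EST"

    def render(self) -> str:
        # Date as D-Mon-YY, matching the slide examples (29-Mar-06).
        date = f"{self.when.day}-{MONTHS[self.when.month - 1]}-{self.when.year % 100:02d}"
        # 12-hour clock with am/pm, as in "9:45 am".
        hour = self.when.hour % 12 or 12
        ampm = "am" if self.when.hour < 12 else "pm"
        time = f"{hour}:{self.when.minute:02d} {ampm} {self.tz_label}"
        return (f"{self.ix_location}: {self.to_name}/{self.to_asn} & "
                f"{self.from_name}/{self.from_asn} - {self.issue} - {date} - {time}")


if __name__ == "__main__":
    subject = PeerSubject("Equinix-Dallas", "RCN", 6079, "AT&T", 7132,
                          "new session", datetime(2006, 3, 29, 9, 45), "EST")
    print(subject.render())
```

Keeping the time zone as an explicit label, rather than deriving it, mirrors the slide's example, where the sender states the zone in the subject itself.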
State the General Issue
• In large organizations the message will likely be forwarded for assistance based on the issue at hand. Help prevent emails that rope in half an organization.
• Suggested ‘general issues’ include:
  - session down – fine after listing where/who
  - zero prefix – this does trigger alarms in some systems
  - inconsistent prefix/origin
  - move to new connection – more common for publics
  - upgrading capacity to facility
  - emergency maintenance
  - scheduled maintenance
List the Date & Time that the Issue Was Discovered
• Assist data correlation with tools, reports, timelines, etc.
• Remember that peers are global, and email headers usually contain only the sender's local time, not the time at the location of the incident.
• Email threads get forwarded and sometimes get stuck in a human queue for periods of time.
• Chronic outage issues are easier to track.
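To illustrate why an explicit zone matters for global peers, here is a small sketch that refuses a time with no zone attached and prints the incident time alongside its UTC equivalent. Adding UTC is an assumption of this example, not something the slides prescribe; the `describe_incident_time` helper and the fixed-offset `CST` zone are likewise hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-6 offset labeled "CST"; this ignores daylight saving.
CST = timezone(timedelta(hours=-6), "CST")


def describe_incident_time(local: datetime) -> str:
    """Return the incident time with its zone label and the UTC equivalent."""
    if local.tzinfo is None:
        # A naive time is ambiguous to a peer on another continent.
        raise ValueError("attach a time zone to the incident time")
    utc = local.astimezone(timezone.utc)
    return f"{local:%Y-%m-%d %H:%M} {local.tzname()} ({utc:%Y-%m-%d %H:%M} UTC)"


if __name__ == "__main__":
    print(describe_incident_time(datetime(2006, 1, 15, 14, 30, tzinfo=CST)))
```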
Subject Line Format Examples
• NOTA-Miami: Tel. Italia/6762 & AT&T/7132 – emergency maintenance – 15-Jan-06 – 2:30 pm Central
  - Reason: Card failed where private peer session was in use.
• PAIX-Palo Alto: TNZI/4648 & AT&T/7132 – migrate session – 27-Feb-06 – 1:15 pm Central
  - Reason: We discovered they were closer to our gear in the PAIX Suite than at the SIX.
• Equinix-Dallas: ELI/5650 & AT&T/7132 – session down – 5-Dec-05 – 12:30 am Central
  - Reason: This was unscheduled and helped mark the start of an outage for ELI to troubleshoot.
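Because subjects like the examples above are regular, they are also easy to search and parse, which is part of the case for being picky. The sketch below is one possible parser under stated assumptions: the `SUBJECT_RE` pattern and `parse_subject` function are hypothetical names, and the pattern assumes fields are separated by a hyphen or en-dash with whitespace on both sides.

```python
import re
from typing import Optional

# Hypothetical pattern for "<IX>: <to>/<ASN> & <from>/<ASN> - <issue> - <date> - <time>".
# Separators may be "-" or an en-dash (\u2013), with whitespace on both sides, so
# hyphens inside "Equinix-Dallas" or "15-Jan-06" are not treated as separators.
SUBJECT_RE = re.compile(
    r"^(?P<ix>[^:]+):\s*"
    r"(?P<to_name>.+?)/(?P<to_asn>\d+)\s*&\s*"
    r"(?P<from_name>.+?)/(?P<from_asn>\d+)"
    r"\s+[-\u2013]\s+(?P<issue>.+?)"
    r"\s+[-\u2013]\s+(?P<date>\d{1,2}-[A-Za-z]{3}-\d{2})"
    r"\s+[-\u2013]\s+(?P<time>.+)$"
)


def parse_subject(subject: str) -> Optional[dict]:
    """Split a subject line into its fields, or return None if it does not match."""
    match = SUBJECT_RE.match(subject.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # ASNs are numeric, so store them as integers for easy comparison.
    fields["to_asn"] = int(fields["to_asn"])
    fields["from_asn"] = int(fields["from_asn"])
    return fields


if __name__ == "__main__":
    example = ("NOTA-Miami: Tel. Italia/6762 & AT&T/7132 \u2013 emergency maintenance "
               "\u2013 15-Jan-06 \u2013 2:30 pm Central")
    print(parse_subject(example))
    print(parse_subject("bgp session down"))  # unstructured subject: no match
```

A subject such as ‘bgp session down’ yields no match, which is the practical cost of an unstructured subject line: nothing to filter, group, or correlate on.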
Create a Peering Page
• Peering ‘Guidelines’ or ‘Requirements’ are not ‘RULES’.
• Peers will have business drivers to balance with network needs.
• Keep it simple, easy to maintain and straightforward.
• State the facts in an easy-to-follow format.
• Keep in mind that many peers use translation tools in locations where English is not their first language. Overly complex ‘rules’ will only add to your overhead.
• Include the URL in your .signature file as a FAQ pointer to avoid rehashing common details. It can’t hurt!
Peeringdb.com – Use it.
• Peeringdb.com is a valuable asset.
• Keep your info current.
• Push Richard Steenbergen for new & improved features.
• Cheers!