This panel discussion offers guidance on building a Network Operations Center (NOC), covering customer expectations, staffing requirements, organizational structure, tools, and funding. It includes a case study from the Indiana University/Global NOC.
How to Build a NOC
Quilt NOC Workshop Panel Discussion: Indiana University/Global NOC, WiscNet, Pacific Northwest GigaPop
October 3, 2007
Customers and Expectations
• Who are your customers and what are their expectations/SLAs?
  • Campus, University System, StateNet, GigaPoP, RON, National Backbone, International Connections
  • 24x7, Business Hours, Best Effort
  • Problem Resolution, Triage, Problem Identification, Service Desk
Supported Services
• In addition to networking, what other services does your organization support?
  • Computer Operations
  • Support Center
  • Security Response
  • Grid Computing
  • Consulting
Monitoring and Troubleshooting
• How large and complex is your network?
  • Large and Simple, or Small and Complex
• Types of Networking Supported
  • Optical DWDM
  • Layer 2 – Switch Network
  • Layer 3 – Routed Network
• How many hats are being worn? Incestuous operational and support relationships?
Monitoring and Troubleshooting
• What level of troubleshooting and/or monitoring will your NOC do?
  • Troubleshooting is based on the delegation of responsibility within a NOC organization and/or other related NOCs
  • Monitoring is based on SLAs and the health of the network and services provided
Monitoring and Troubleshooting
• How will you communicate outages and planned work to customers?
  • Phone
  • Email / Listserv
  • Web page announcements
  • RSS
Staffing
• Staffing requirements due to SLAs or after-hours service response policies
  • 24x7, Business Hours, Best Effort
Staffing
• Service Hours - Hours of coverage? Not all NOCs need to be 7x24x365, but what about holidays? Weekends? On-call?
Staffing
• What level of staff needs to be present, and when?
  • Tier One: Service Desk (Call Center), Customer Service, Problem Assessment, Network Knowledgeable
  • Tier Two: Engineering, Problem Resolution, Perform Maintenance
  • Tier Three: Advanced Engineering, Complex Problem Resolution, Escalation Point, Network Planning
Staffing
• Means of responding to issues when the NOC is not staffed 24x7?
  • Another group within the organization answers phone and email, watches monitoring, and contacts the on-call staff
  • Monitoring sends a message directly to the on-call pager
  • Outsource after-hours coverage
Organizational Structure
• What staffing tiers/hierarchy will you have for support? Techs? Leads? NEs?
Organizational Structure
• Escalation practices and policies
  • When to move a ticket to an escalation group or person within an organization
  • When to inform key personnel within the organization or network supported about outages/problems
Organizational Structure
• Writing/updating procedures, training manuals, etc.
• Who is charged with this? When is it accomplished?
  • NOC personnel in conjunction with their other responsibilities
  • Dedicated resources
NOC Location
• What is your facility like?
• Does your facility have any unique or particular advantages?
• How do you want to arrange your staff? Separate offices? "War room"?
NOC Funding
• How is your organization funded?
  • University funds
  • State appropriations
  • GigaPoP / RON revenue
  • Contracts, grants
Tools
• How will you track trouble tickets?
  • Enterprise-wide system shared at the university or state-wide level
  • Proprietary system supported by the NOC and/or other related support groups
  • Commercial application or homegrown
Tools
• How will you track customer information? (Database needs, CRM?)
  • Ticketing system
  • Database
  • Web or Wiki information repository
Tools
• How will you monitor and troubleshoot? Tools, specifically.
  • Network monitoring system like Nagios, WhatsUp Gold, HP OpenView
  • Weather Maps
  • MRTG
Tools
• Are you writing any of your own tools?
• Who will maintain your applications?
Reporting
• What are the key metrics for a NOC?
• How will you measure these?
  • Uptime availability
  • Nodes monitored
  • Trouble tickets
  • Phone calls
  • Emails
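As one illustration of how the "uptime availability" metric above might be measured, the sketch below computes availability as the share of a reporting period not covered by outages. The function name, the outage tuple format, and the sample dates are assumptions made for this example; they do not come from the panel or from any particular ticketing system.

```python
from datetime import datetime


def availability_percent(period_start, period_end, outages):
    """Return uptime as a percentage of the reporting period.

    `outages` is a list of (start, end) datetime pairs. Overlapping
    outages are merged so shared downtime is only counted once.
    """
    total = (period_end - period_start).total_seconds()
    if total <= 0:
        raise ValueError("reporting period must have positive length")

    # Clip each outage to the reporting period and sort by start time.
    clipped = sorted(
        (max(start, period_start), min(end, period_end))
        for start, end in outages
        if end > period_start and start < period_end
    )

    down = 0.0
    cur_start = cur_end = None
    for start, end in clipped:
        if cur_end is None or start > cur_end:
            # A new, non-overlapping outage: close out the previous one.
            if cur_end is not None:
                down += (cur_end - cur_start).total_seconds()
            cur_start, cur_end = start, end
        else:
            # Overlaps the current outage: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        down += (cur_end - cur_start).total_seconds()

    return 100.0 * (total - down) / total


# Hypothetical example: two overlapping outages (3 hours combined)
# in a 30-day month.
month_start = datetime(2007, 9, 1)
month_end = datetime(2007, 10, 1)
sample_outages = [
    (datetime(2007, 9, 10, 10), datetime(2007, 9, 10, 12)),
    (datetime(2007, 9, 10, 11), datetime(2007, 9, 10, 13)),
]
print(f"{availability_percent(month_start, month_end, sample_outages):.2f}%")  # 99.58%
```

Whether planned maintenance counts as downtime is a policy decision tied to the SLA; this sketch treats every interval passed in as downtime.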
NOC Evolution
• What factors have determined operational changes for your organization?
  • New services, expanded hours, increased number of customers, new equipment types, deeper skill level
Building a NOC
Indiana University/Global NOC Case Study
Steve Peck
October 3, 2007
Customers and Expectations
• Who are your customers and what are their expectations/SLAs?
  • Indiana University
  • Indiana GigaPoP & IP Grid
  • I-Light (state of Indiana Higher Ed)
  • Internet2
  • National LambdaRail
  • CIC OmniPoP
Customers and Expectations
• Who are your customers and what are their expectations/SLAs?
  • TransPAC2
  • AMPATH
  • MAN LAN
  • HOPI
  • Connecticut Education Network (pending)
  • OneNet (consulting)
Supported Services
• In addition to networking, what other services does your organization support?
  • REN-ISAC (Security service)
  • Open Science Grid (Grid monitoring service)
Monitoring and Troubleshooting
• How large and complex is your network? YES!!!
• Types of Networking Supported
  • Optical DWDM
  • Layer 2 – Switch Network
  • Layer 3 – Routed Network
• How many hats are being worn? Yes!
• Incestuous operational and support relationships? Yes!
Monitoring and Troubleshooting
• What level of troubleshooting and/or monitoring will your NOC do?
  • Within the division of work between our Service Desk and Network Engineering groups, our NOC is able to perform all levels of troubleshooting and monitoring. This ranges from simple Layer 3 connections to complex DWDM systems.
Monitoring and Troubleshooting
• How will you communicate outages and planned work to customers?
  • Email / Listserv
  • Web page announcements
  • RSS Feeds
  • Web and iCalendar based Maintenance/Outage Calendars
  • Phone (in limited circumstances)
Staffing
• Service Hours - Hours of coverage? Not all NOCs need to be 7x24x365, but what about holidays? Weekends? On-call?
  • Service Desk: 24x7x365
  • Engineering: Business Hours & On-Call
  • Systems Engineering: Business Hours & On-Call
Staffing
• Staffing requirements due to SLAs or after-hours service response policies
  • Service Desk: 24x7x365
  • Engineering: Business Hours & On-Call
  • Systems Engineering: Business Hours & On-Call
Staffing
• What level of staff needs to be present, and when?
  • Service Desk: 24x7x365
  • Engineering: Business Hours & On-Call
  • Systems Engineering: Business Hours & On-Call
Staffing
• Means of responding to issues when the NOC is not staffed 24x7?
  • We have an on-call rotation for the Engineering and Systems Engineering groups, as well as for Service Desk Supervisors.
Organizational Structure
• What staffing tiers/hierarchy will you have for support? Techs? Leads? NEs?
• Service Desk
  • 2 Shift Supervisors (Day & Night shifts)
  • 5 Senior Technicians (at least one on every shift)
  • 13 Technicians (including hourly staff)
  • 5 Off Front-Line support personnel
Organizational Structure
• What staffing tiers/hierarchy will you have for support? Techs? Leads? NEs?
• Network Engineering
  • 17 Network Engineers
  • Network Engineering Team
  • Network Planning Team
• Systems Engineering
  • 7 Systems Engineers (+1 open position)
  • Application Developers
  • System Administrators
Organizational Structure
• Escalation practices and policies
  • The Service Desk has 15 minutes to assess a problem or outage before escalating to Engineering.
  • Standard “escalation” processes for outages and problems (immediate, 1 hour, 4 hours, 12 hours, etc.)
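The timing rules on this slide can be expressed as a small lookup, sketched below. The 15-minute assessment window and the immediate / 1 hour / 4 hour / 12 hour steps come from the slide; the recipient labels, the function names, and the assumption that every step is measured from the start of the outage are hypothetical, since the slide does not say who is notified at each step.

```python
from datetime import timedelta

# From the slide: the Service Desk has 15 minutes to assess a problem.
ASSESSMENT_WINDOW = timedelta(minutes=15)

# Thresholds from the slide; the recipient labels are placeholders,
# not the actual IU/Global NOC contact lists.
NOTIFICATION_STEPS = [
    (timedelta(0), "immediate contacts"),
    (timedelta(hours=1), "1-hour contacts"),
    (timedelta(hours=4), "4-hour contacts"),
    (timedelta(hours=12), "12-hour contacts"),
]


def must_escalate_to_engineering(ticket_age):
    """True once the assessment window has elapsed."""
    return ticket_age >= ASSESSMENT_WINDOW


def notifications_due(outage_age):
    """Return every notification step whose threshold has been reached."""
    return [who for threshold, who in NOTIFICATION_STEPS if outage_age >= threshold]


print(must_escalate_to_engineering(timedelta(minutes=10)))
print(notifications_due(timedelta(hours=2)))
```

Keeping the thresholds in one table makes it easy to give different networks different schedules, which matters when several customers with different SLAs share one NOC.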
Organizational Structure
• Writing/updating procedures, training manuals, etc.
  • NOC personnel in conjunction with their other responsibilities (Service Desk & Engineering)
  • Dedicated resources were recently hired to focus on the internal documentation environment
NOC Location
• What is your facility like?
  • State of the art
• Does your facility have any unique or particular advantages?
  • Showpiece for tours, on the edge of downtown Indianapolis, close to the State Capitol
• How do you want to arrange your staff? Separate offices? "War room"?
  • War room (for the most part), plus offices for appropriate staff
NOC Funding
• How is your organization funded?
  • Contracts, grants
  • University funds
  • State appropriations
  • GigaPoP revenue
Tools
• How will you track trouble tickets?
  • Footprints ticketing system (manufactured by Numara) for all GRNOC networks & projects
  • Peregrine for all IU campus related tickets
Tools
• How will you track customer information? (Database needs, CRM?)
  • Ticketing system
  • Great Database (developed in-house)
  • Web or Wiki information repository
Tools
• How will you monitor and troubleshoot? Tools, specifically.
  • Monitoring: AlertMon, a homegrown web-based alert interface, links Nagios to the Footprints ticketing system
  • Visualization: Weather Maps, Utilization Graphs (SNAPP)
  • Management: GRNOC Database and linked systems (RADIUS, DNS, etc.)
  • Other special-purpose tools (examples: Spanning Tree state map, Juniper Firewall Filter Grapher, Syslog Analysis Scripts, prefix list diff checker, etc.)
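To give a sense of how small a special-purpose tool such as a prefix list diff checker can be, here is a minimal sketch using Python's standard `ipaddress` module. It is not the GRNOC tool: the input format (one prefix per line, with optional `#` comments), the function names, and the sample prefixes are all assumptions for illustration.

```python
import ipaddress


def parse_prefixes(lines):
    """Parse one prefix per line, ignoring blank lines and '#' comments."""
    prefixes = set()
    for line in lines:
        text = line.split("#", 1)[0].strip()
        if text:
            # strict=False accepts entries with host bits set, e.g. 192.0.2.1/24.
            prefixes.add(ipaddress.ip_network(text, strict=False))
    return prefixes


def diff_prefix_lists(old_lines, new_lines):
    """Return (added, removed) prefixes as sorted lists of strings."""
    old = parse_prefixes(old_lines)
    new = parse_prefixes(new_lines)

    # Sort IPv4 before IPv6; mixed-version networks cannot be compared directly.
    def sort_key(net):
        return (net.version, net.network_address, net.prefixlen)

    added = [str(net) for net in sorted(new - old, key=sort_key)]
    removed = [str(net) for net in sorted(old - new, key=sort_key)]
    return added, removed


# Hypothetical prefix lists built from documentation address ranges.
old_list = ["192.0.2.0/24", "198.51.100.0/24  # customer A"]
new_list = ["192.0.2.0/24", "203.0.113.0/24", "2001:db8::/32"]
added, removed = diff_prefix_lists(old_list, new_list)
print("added:", added)
print("removed:", removed)
```

Parsing into network objects rather than comparing raw strings means that formatting differences, such as stray whitespace or host bits, do not show up as spurious changes.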
Tools
• Are you writing any of your own tools?
  • Yes. Large deployment of custom-developed and open source software, with a sprinkling of commercial software.
• Who will maintain your applications?
  • Systems Engineering Team: Software Developers and System Administrators.
Tools: Monitoring
• AlertMon: “Big Board” front-end to the monitoring system
• Nagios: open source network monitoring system with custom-developed plug-ins and monitoring agents. Monitors a variety of services: BGP session status, interface up/down, IS-IS adjacency, router/switch CPU load, MSDP status, etc.
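For readers who have not written a Nagios plug-in, the sketch below shows the general shape of one for the router/switch CPU load case. It follows the standard Nagios plug-in conventions (exit codes 0, 1, 2, and 3 for OK, WARNING, CRITICAL, and UNKNOWN, and a single status line with optional performance data after a `|`). The thresholds and the function name are assumptions, and how the CPU reading is obtained (for example via SNMP) is deliberately left out; this is not one of the GRNOC's plug-ins.

```python
# Standard Nagios plug-in exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_cpu(load_percent, warn=80.0, crit=95.0):
    """Map a CPU load reading to a (exit_code, status_line) pair.

    `load_percent` is assumed to have been fetched elsewhere; None or an
    out-of-range value is reported as UNKNOWN.
    """
    if load_percent is None or not 0 <= load_percent <= 100:
        return UNKNOWN, "CPU UNKNOWN - no valid reading"

    if load_percent >= crit:
        code = CRITICAL
    elif load_percent >= warn:
        code = WARNING
    else:
        code = OK

    # Text before '|' is shown to operators; text after is performance data
    # in the form label=value[UOM];warn;crit;min;max.
    line = (
        f"CPU {LABELS[code]} - load {load_percent:.0f}% "
        f"| cpu={load_percent:.0f}%;{warn:.0f};{crit:.0f};0;100"
    )
    return code, line


code, line = check_cpu(42)
print(line)  # CPU OK - load 42% | cpu=42%;80;95;0;100
```

A real plug-in would print the status line and then exit with the returned code (for instance `sys.exit(code)`), since Nagios reads the state from the process exit status.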
Tools: Monitoring
• Nagios Monitoring System
Tools: Weathermaps
• “Mini Maps” embedded in NOC web sites
Tools: Management
• GRNOC Database
  • Network Management System including:
    • Contact Management
    • Device Management
    • Inventory
    • RADIUS-based authentication
    • DNS record generation
    • Configuration archiving
    • Tied to utilization measurement system
    • Circuit Management (Layer 0, Layer 1, Layer 2)
    • IP Address Management
    • Services Management
Tools: GRNOC DB
Reporting
• What are the key metrics for a NOC?
• How will you measure these?
  • Uptime availability
  • Services monitored
  • Trouble tickets
  • Phone calls
  • Emails