HEAnet Ltd provides free broadband connectivity to Irish schools, giving them onward access to the Internet and educational networks. Managed services cover network, security, and e-mail. The NOC monitors the network with SmokePing, Nagios, Rancid, Cacti, and Netflow tools. A provisioning system generates CPE router configurations and provisions services such as DNS and web hosting.
NOC Tools Donal O’Cearbhaill HEAnet Ltd.
Ireland’s National Education and Research Network • Provides Internet services to Irish Universities • 2005 - Broadband for Schools
Broadband for Schools • Free ‘always on’ broadband connectivity to Schools • 3 Year Agreement • Dept of Education/Dept of Communication/TIF • 3,925+ Schools • 7 Access Providers • HEAnet backbone network • Onward connectivity to Internet & Educational Networks • HEAnet Managed Services: Network; Security; E-Mail
Challenges • 4,000 schools • Highly contended links • A lot of satellite connections • SLA/Contract enforcement
Monitoring/ISP Infrastructure • 28 Debian/Ubuntu servers • 4 Fibrenetix disk arrays • Disk based backup • rsync & application level dumps • Syslog nodes • PostgreSQL database • Aggregation Routers • 7301 • PPPoE • GRE • Border/Services Routers • 6500, 3750
Tools • SmokePing • Nagios • Rancid • Cacti • Netflow
SmokePing • Latency measurement tool • Runs probes in parallel • >3,800 hosts • RRD backend • Reporting • Historical view • Acceptance testing • Tuning • FPing timeouts decreased • Total number of probes reduced • Satellite frequency reduced
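As an illustration of what a latency measurement round boils down to, the sketch below summarises one set of probe results into a median round-trip time and a loss percentage. It is not SmokePing's own code: the function name `summarize_probes` and the sample values are hypothetical, and a timed-out probe is represented here as `None`.

```python
from statistics import median

def summarize_probes(rtts_ms):
    """Summarise one round of pings; None marks a probe that timed out."""
    if not rtts_ms:
        raise ValueError("at least one probe is required")
    answered = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(answered)) / len(rtts_ms)
    return {
        # Median is less sensitive than the mean to a single slow reply.
        "median_ms": median(answered) if answered else None,
        "loss_pct": loss_pct,
    }

# Hypothetical round of five probes to a satellite-connected school.
print(summarize_probes([612.0, 598.5, None, 640.2, 605.1]))
```

Sending fewer probes per round, as the tuning bullets describe, shrinks the input list here and with it the time each round takes.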
Nagios • 4,131 services on 3,905 hosts • Among the top 5 installations by number of hosts listed on nagios.org • Populated by SmokePing and memcache • Nagios runs checks serially • >1 hour vs. 15 mins • Nagios populates • sidebar alarms • Schools Up Graph
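One way to populate Nagios from results gathered elsewhere is to submit them as passive checks. The sketch below only formats a line for the Nagios external command `PROCESS_SERVICE_CHECK_RESULT`; whether the deck's setup used exactly this mechanism is an assumption, and the host name, service name, and output text are hypothetical.

```python
import time

# Standard Nagios plugin return codes.
STATES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}

def passive_result(host, service, state, output, now=None):
    """Format one passive service check result for the Nagios command file."""
    ts = int(time.time() if now is None else now)
    # Fields are separated by ';', so the output text must not contain one.
    return (f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{host};{service};{STATES[state]};{output}")

# Hypothetical result derived from a latency probe.
print(passive_result("school-0001", "PING", "CRITICAL", "100% packet loss"))
```

Each formatted line would then be written to the Nagios external command file configured on the server, which lets an external poller do the slow network work in parallel.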
Rancid • Really Awesome New Cisco confIg Differ • 3,296 Router configs • Maintains history of changes • Mails changes
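The "maintains history, mails changes" idea can be sketched as a diff between two saved copies of a router config. Rancid itself relies on a version control system for this; the Python `difflib` version below is only a stand-in, and the config lines and router name are hypothetical.

```python
import difflib

def config_diff(old, new, name):
    """Return a unified diff between two config snapshots ('' if identical)."""
    return "".join(difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"{name} (previous)",
        tofile=f"{name} (current)",
    ))

# Hypothetical snapshots of one CPE router config.
previous = "hostname school-0001\ninterface Dialer0\n ip mtu 1492\n"
current = "hostname school-0001\ninterface Dialer0\n ip mtu 1452\n"

diff = config_diff(previous, current, "school-0001")
if diff:
    print(diff)  # this text would form the body of the change mail
```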
Cacti • 3,900 hosts • Data gathering • SNMP • External Perl scripts • Graph templating • Database driven • Cricket: 27 mins • Perl • Cacti: <5 mins • Cactid • Custom multithreaded C application
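The gap between the two pollers comes largely from waiting on the network one host at a time versus many at once. The sketch below shows the concurrent pattern in Python with a thread pool; it is not Cactid (which the slide describes as a multithreaded C application), and `poll_host` is a hypothetical stand-in that sleeps instead of issuing SNMP requests.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def poll_host(host):
    """Hypothetical stand-in for one SNMP poll; sleeps to mimic network wait."""
    time.sleep(0.01)
    return host, {"ifInOctets": len(host)}  # placeholder counter value

def poll_all(hosts, workers=20):
    """Poll every host using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order; dict() keys the results by host.
        return dict(pool.map(poll_host, hosts))

hosts = [f"school-{i:04d}" for i in range(100)]
start = time.time()
results = poll_all(hosts)
print(f"polled {len(results)} hosts in {time.time() - start:.2f}s")
```

With 20 workers the run time is governed by the slowest hosts in each batch rather than the sum of all waits, which matters on a network with many high-latency satellite links.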
Netflow • NfSen is a graphical web based front end for the nfdump netflow tools • Query abuse reports • Usage reporting
Reporting
Gigabytes downloaded by schools on 22/03/07: 332
Gigabytes uploaded by schools on 22/03/07: 48
Total MegaBytes downloaded for Digiweb Satellite: 12834
Total MegaBytes uploaded for Digiweb Satellite: 1202
Total MegaBytes downloaded for Digiweb Wireless: 77578
Total MegaBytes uploaded for Digiweb Wireless: 10217
Total MegaBytes downloaded for ESATBT ADSL: 54352
Total MegaBytes uploaded for ESATBT ADSL: 6632
Total MegaBytes downloaded for HSData Wireless: 3047
Total MegaBytes uploaded for HSData Wireless: 575
…..
• Daily Reports • DNS log reporting • Report infected PCs • Top MX lookups • Misconfigurations • Active Directory • Netflow • IPs • Schools usage
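Per-provider totals like those in the report above can be produced by summing byte counts grouped by access provider. The sketch below is a minimal illustration, not the original reporting script: the record layout, the sample figures, and the choice of 1 MB = 1,000,000 bytes are all assumptions.

```python
from collections import defaultdict

MB = 1_000_000  # assumption: the slides do not say decimal or binary megabytes

def usage_report(records):
    """records: iterable of (provider, bytes_down, bytes_up) tuples."""
    totals = defaultdict(lambda: [0, 0])
    for provider, down_bytes, up_bytes in records:
        totals[provider][0] += down_bytes
        totals[provider][1] += up_bytes
    lines = []
    for provider in sorted(totals):
        down, up = totals[provider]
        lines.append(f"Total MegaBytes downloaded for {provider}: {down // MB}")
        lines.append(f"Total MegaBytes uploaded for {provider}: {up // MB}")
    return lines

# Hypothetical per-school byte counts, already tagged with a provider.
sample = [
    ("Provider A", 3_000_000, 1_000_000),
    ("Provider A", 2_500_000, 0),
    ("Provider B", 1_000_000, 999_999),
]
print("\n".join(usage_report(sample)))
```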
Logging • Syslog server per PoP • Servers • Routers • Logcheck • Logfile scanner • IP to school identifier • Mapping IP to school
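Mapping an IP address in a log line to a school identifier is a prefix lookup. The sketch below shows the idea with Python's `ipaddress` module; the prefixes (taken from the 192.0.2.0/24 documentation range), the school identifiers, and the function names are hypothetical, and a real deployment would load the mapping from the authoritative database.

```python
import ipaddress
import re

# Hypothetical prefix-to-school table.
SCHOOL_PREFIXES = {
    "192.0.2.0/29": "school-0001",
    "192.0.2.8/29": "school-0002",
}
NETWORKS = [(ipaddress.ip_network(p), s) for p, s in SCHOOL_PREFIXES.items()]
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def school_for_ip(ip):
    """Return the school identifier owning this address, or None."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return None  # matched the regex but is not a valid address
    for net, school in NETWORKS:
        if addr in net:
            return school
    return None

def tag_line(line):
    """Append '[school-id]' after every known school IP in a log line."""
    def annotate(match):
        school = school_for_ip(match.group(0))
        return f"{match.group(0)} [{school}]" if school else match.group(0)
    return IPV4.sub(annotate, line)

print(tag_line("denied smtp from 192.0.2.10 to 198.51.100.7"))
```

A linear scan is adequate for a sketch; with thousands of prefixes a sorted structure or a database lookup would be the more likely choice.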
Server Monitoring • SSH keys • Sharing keys/fingerprints • High overhead • SNMP • Less configurable • Memcache • Local Perl script • Easy to roll out • Load • Disk Space • Monitor Processes
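The memcache approach amounts to each server running a small local script that gathers a few numbers and writes them to a shared cache under a known key. The slides mention a Perl script; the sketch below shows the same pattern in Python, with a plain dict standing in for the memcache client. The key format and field names are hypothetical, and process monitoring is left out for brevity.

```python
import json
import os
import shutil
import socket
import time

def collect_status(path="/"):
    """Gather load average and disk usage for the local machine."""
    usage = shutil.disk_usage(path)
    try:
        load1 = os.getloadavg()[0]  # 1-minute load average (Unix only)
    except (AttributeError, OSError):
        load1 = None
    return {
        "host": socket.gethostname(),
        "checked_at": int(time.time()),
        "load1": load1,
        "disk_used_pct": round(100.0 * usage.used / usage.total, 1),
    }

def publish(cache, status):
    """Store the status as JSON; 'cache' is any dict-like stand-in for memcache."""
    key = "status:" + status["host"]  # hypothetical key scheme
    cache[key] = json.dumps(status)
    return key

cache = {}  # a real deployment would use a memcache client here
key = publish(cache, collect_status())
print(key, cache[key])
```

Because the script only writes to the cache, the monitoring host needs no SSH keys or SNMP access to the server; it simply reads the key.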
Memcache • Distributed memory caching system • Low overhead • Speed up dynamic database-driven websites by caching data and objects in memory • Developed for LiveJournal • Slashdot • Wikipedia • SourceForge • Schools • Nagios • Maps • Server status
Subversion • Modern replacement for CVS • Provisioning System • Configs • ViewCVS • Checkins get mailed • Schools-noc • Scripts stored on every server • Automatically updated • cron.d
Sidebar • Nagios polled every minute • Populated into memcache • Sidebar alarms • Pubcookie single sign-on
Provisioning System • Services provisioned • CPE router config • Nagios • RADIUS • Cacti • Cisco ACS (TACACS+) • SmokePing • Fortigate (Content filtering) • Maps • DNS • Webhosting
Provisioning System • Text::Template templating system • Data stored in authoritative database • PostgreSQL’s INET type is brilliant! • Perl scripts generate configlets • Added to Subversion • Perl/Shell provisioning agents handle service restarts etc. • Ability to stop all provisioning
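The template-plus-database approach can be sketched as follows. The original uses Perl's Text::Template and PostgreSQL's INET type; here Python's `string.Template` and `ipaddress` module stand in for them, and the template text, row layout, and host name are hypothetical.

```python
import ipaddress
from string import Template

# Hypothetical configlet template for a CPE router LAN interface.
CPE_TEMPLATE = Template(
    "hostname $hostname\n"
    "interface FastEthernet0\n"
    " ip address $lan_ip $lan_mask\n"
)

def render_configlet(row):
    """row: one database record; 'lan' is address/prefix, as an INET column could hold."""
    lan = ipaddress.ip_interface(row["lan"])
    return CPE_TEMPLATE.substitute(
        hostname=row["hostname"],
        lan_ip=lan.ip,         # e.g. 192.0.2.1
        lan_mask=lan.netmask,  # e.g. 255.255.255.248 for a /29
    )

# Hypothetical row as it might come back from the authoritative database.
print(render_configlet({"hostname": "school-0001", "lan": "192.0.2.1/29"}))
```

Keeping address and prefix length in one typed value means the netmask is derived rather than stored separately, so the two cannot disagree in the generated config.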
Random things we’ve encountered • Predictable traffic levels • SmokePing, Nagios and Cricket/Cacti take a lot of tuning to monitor our network • Difficult to achieve both high bandwidth and a high level of reliability with a transparent content filter