Building Technology for Storage Systems Monitoring • Intermountain Healthcare • Thomas Gwyn Dunbar • Thomas.Dunbar@imail.org
References & Introduction • http://content.healthaffairs.org/content/30/6/1185.full.html • nagios.org, etc. • Nagios: Building Enterprise-Grade Monitoring Infrastructures for Systems and Networks, 2nd ed., David Josephsen • The UNIX Programming Environment, Kernighan & Pike • After Virtue, 3rd ed., Alasdair MacIntyre • Purgatorio, Dante (since Nagios ain't gonna insist on sainthood)
IHC and IT • Intermountain Healthcare is an internationally recognized, nonprofit system of 22 hospitals, a Medical Group with more than 185 physician clinics, and an affiliated health insurance company, SelectHealth. Our 33,000 employees serve patients and plan members in Utah and southeastern Idaho. IHC has an annual budget of around 5 billion dollars. • Datacenters in Plano, TX; Salt Lake City, UT; and Ogden, UT provide high-availability systems with over 5 petabytes of storage (over 12,000 spindles), using IBM DS8000 for tier 1 and NetApp for other storage. In-house developed applications run on top of multiple Oracle databases over 15 TB in size. • CA Service Desk / CA Spectrum / xMatters; Nagios
Storage’s Nagios Servers • While the SA team is moving away from Nagios, Storage is moving to it: • Using Nagios 3.5, with check_mk and pnp4nagios • DNX, if need be • Our own servers, for business reasons • Integration with CA Spectrum/Service Desk, etc.
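As an illustration of what a storage check feeding Nagios and pnp4nagios might look like, here is a minimal sketch of a plugin-style function in Python. It follows the standard Nagios plugin conventions (exit codes 0 to 3, and performance data after the `|` character, which is what pnp4nagios graphs). The volume name `vol_example`, the thresholds, and the function name `check_usage` are assumptions made for the example; they are not taken from the talk.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_usage(label, used_pct, warn=80.0, crit=90.0):
    """Return (exit_code, output_line) for a percent-used reading.

    label, warn and crit are hypothetical; a real check would take them
    from command-line arguments and read used_pct from the storage system.
    """
    if used_pct >= crit:
        state = CRITICAL
    elif used_pct >= warn:
        state = WARNING
    else:
        state = OK
    # Perfdata format: label=value[UOM];warn;crit;min;max
    perfdata = f"{label}={used_pct:.1f}%;{warn:.0f};{crit:.0f};0;100"
    line = f"{STATE_NAMES[state]} - {label} is {used_pct:.1f}% full | {perfdata}"
    return state, line


# Example reading (made-up value).
state, line = check_usage("vol_example", 85.0)
print(line)
```

A real plugin would print the line and then exit with `state` as its process exit code, since Nagios reads the exit code to decide the service state.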
Storage Hardware • Brocade switches, IBM DS SAN, SVC & NetApp
This Talk’s Perspective • Comprehensive monitoring is a major, site-specific application. • Major applications become very difficult to replace (e.g. air traffic control, IHC systems). • Hence, let’s consider fundamentals.
Worldviews • What we look through, not what we look at • Tempts us to think it is the only way to see • Scientific: what can we know, and how • Technological: what can we build, and how • Context
Strategies • Building and Growth • Inputs and Feedback • Planning • Personality
Building Technology • Coherence • Clarity • Continuity
Spectrum of Traps • EventMessage: Thu 05 Sep, 2013 - 14:47:23 - Device ********** of type NetAppONTAPDev is no longer responding to primary management requests (e.g. SNMP) • CA Spectrum and Nagios
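One way the two systems could be tied together is to turn a Spectrum event message like the one above into a Nagios passive host check. The sketch below parses the message with a regular expression and formats a `PROCESS_HOST_CHECK_RESULT` external command line. The host name `filer01` stands in for the masked device name, and the rule that "no longer responding" maps to host state DOWN is an assumption for the example, not something stated in the talk.

```python
import re

# Matches the event text shown on the slide; the "EventMessage:" prefix is optional.
EVENT_RE = re.compile(
    r"^(?:EventMessage:\s*)?"
    r"(?P<when>\w{3} \d{2} \w{3}, \d{4} - \d{2}:\d{2}:\d{2}) - "
    r"Device (?P<device>\S+) of type (?P<dev_type>\S+) (?P<detail>.+)$"
)

HOST_UP, HOST_DOWN = 0, 1  # Nagios host status codes for passive results


def parse_event(message):
    """Split an event message into its fields, or return None if it does not match."""
    match = EVENT_RE.match(message.strip())
    return match.groupdict() if match else None


def to_passive_host_check(event, now):
    """Format a Nagios external command line; now is the submission time in epoch seconds."""
    # Assumption: a lost management connection is reported as host DOWN.
    status = HOST_DOWN if "no longer responding" in event["detail"] else HOST_UP
    output = f"{event['dev_type']} {event['detail']}"
    return f"[{now}] PROCESS_HOST_CHECK_RESULT;{event['device']};{status};{output}"


# Hypothetical device name in place of the masked one.
msg = ("Thu 05 Sep, 2013 - 14:47:23 - Device filer01 of type NetAppONTAPDev "
       "is no longer responding to primary management requests (e.g. SNMP)")
print(to_passive_host_check(parse_event(msg), now=1378392443))
```

The resulting line would be written to the Nagios external command file, and the named host has to exist in the Nagios configuration with passive checks enabled for the result to be accepted.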
Graphing: Time Series • Down the road…correlation
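As a rough idea of what correlation between two graphed series could mean in practice, the sketch below computes the Pearson correlation coefficient of two equal-length series sampled at the same timestamps. The metric names and sample values are made up for illustration; the talk does not say which metrics would be correlated or by what method.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 samples")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical samples taken at the same timestamps.
latency_ms = [2.1, 2.4, 3.0, 4.2, 5.1]
queue_depth = [4, 5, 8, 12, 15]
print(round(pearson(latency_ms, queue_depth), 3))
```

A value near 1 or -1 suggests the two series move together or in opposition; it does not by itself say which one drives the other.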