
Advanced Management Technologies For Exchange 5.5

Advanced Management Technologies For Exchange 5.5. Greg Todd, Program Manager, NT Solutions Group, BMC Software, Inc. Agenda: current issues with problem diagnosis; application availability timeline; theory of root cause analysis (RCA); primer on RCA; how RCA can help you today.


Presentation Transcript


  1. Advanced Management Technologies For Exchange 5.5 • Greg Todd, Program Manager, NT Solutions Group, BMC Software, Inc.

  2. Agenda • Current issues with problem diagnosis • Application availability timeline • Theory of root cause analysis (RCA) • Primer on RCA • How RCA can help you today • Demos of RCA on Exchange 5.5 • Systems management vision • Management maturity curve • The future of Exchange management

  3. The Business Problem • Event automation is the #1 priority of IT executives • Problem diagnosis is a critical aspect that requires attention • Wasted time: 80% of downtime is spent diagnosing, 20% is spent fixing • Wasted resources: diagnosis is often a finger-pointing exercise • Frustrated users: users have no idea what to expect • Source: Gartner, 1998

  4. Application Availability Timeline • Points on the timeline: Point of Failure (PoF), Point of Notification (PoN), Point of Diagnosis (PoD), Point of Recovery (PoR), Point of Postmortem (PoP) • Phases: Monitoring, Analysis, Recovery, Evolution

  5. Application Availability Timeline • Application violating service level • Time axis: PoF, PoN, PoD, PoR, PoP • Phases: Monitoring, Root Cause Analysis, Recovery, Evolution

  6. Application Availability Timeline • Application violating service level: significant decrease • Faster service restoration • Diagnosis time reduced • Time axis: PoF, PoN, PoD, PoR, PoP • Phases: Monitoring, Root Cause Analysis, Recovery, Evolution

  7. Benefits Of RCA • Based on well-established theories • Quicker problem resolution • Problem isolation saves resources to address the real problem • Symptom filtering allows administrator to ignore sympathetic events • Performs tests to find the root cause • Far superior to rules-based approach • Key enabler to make systems self-sufficient • Provides impact analysis capability

  8. RCA: Key concepts • Symptoms are problems to be investigated • Faults are the root causes of these symptoms • Tests are active tasks which gather information • RCA is a problem analysis methodology geared towards finding the real cause of a problem and preventing it from happening again.

  9. Rules-Based Approach Vs. RCA • Rules-based: symptom received; possible causes looked up in a fixed table of rules; set of possible causes presented to user; only suggested actions can be provided to user • Root cause analysis: symptom received; possible causes determined from a generic fault model; each cause is tested against suspects; actual root cause is presented to user after suspects are eliminated; specific actions can be provided to user
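
The contrast on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PATROL implementation: the names `CANDIDATE_CAUSES`, `rules_based`, `root_cause_analysis`, and `SIMULATED_TESTS` are hypothetical, the same candidate list stands in for both the fixed rule table and the generic fault model, and the tests are stubs that return canned results.

```python
from typing import Callable, Dict, List

# Hypothetical candidate causes per symptom. For simplicity the same table
# stands in for the rule table and for the fault model.
CANDIDATE_CAUSES: Dict[str, List[str]] = {
    "queue_growth": [
        "cpu_usage_high",
        "memory_bottleneck",
        "mta_down",
        "network_problem",
    ],
}


def rules_based(symptom: str) -> List[str]:
    """Look the symptom up and return every possible cause, untested."""
    return list(CANDIDATE_CAUSES.get(symptom, []))


def root_cause_analysis(
    symptom: str, tests: Dict[str, Callable[[], bool]]
) -> List[str]:
    """Run one test per suspect and keep only the causes a test confirms."""
    suspects = CANDIDATE_CAUSES.get(symptom, [])
    return [cause for cause in suspects if tests[cause]()]


# Stub tests with canned results (assumption: only the CPU test fails).
SIMULATED_TESTS: Dict[str, Callable[[], bool]] = {
    "cpu_usage_high": lambda: True,
    "memory_bottleneck": lambda: False,
    "mta_down": lambda: False,
    "network_problem": lambda: False,
}

print(rules_based("queue_growth"))
print(root_cause_analysis("queue_growth", SIMULATED_TESTS))
```

The rules-based call hands the user all four candidates, while the RCA call eliminates the suspects whose tests pass cleanly and returns only the confirmed cause.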

  10. Root Cause Analysis For Exchange Server • Three components that work synergistically: Exchange Server, Windows NT, IP network

  11. High-Level RCA Architecture • Enterprise Console • Mid-Level Manager • Managed Nodes (three shown)

  12. RCA Architecture • Diagram components: Enterprise Console, Mid-Level Manager, Managed Nodes, mid-level agent, protocol layer, Agent Request Broker (ARB), custom ARB, Realtime Event Proxy (RTEP), Bridge, Javalink Bridge, RCA Engine, Diagnostic KM, Monitor KM, other KMs • BMC PATROL Exchange Server and OS KMs

  13. Root Cause Analysis: Sample problem • Diagram labels: Exchange Server A, Exchange Server B, Exchange Server C, Exchange Server D, Inbound Server, Outbound Server, two Bridgehead Servers, Firewall, Remote Office Exchange Server, T1 Link to Remote Office, Inbound Messages, Outbound Messages, To Internet • Legend: Internal Mail, Internal & Internet Mail, Internet Mail

  14. PATROL RCA: Sample problem • Symptom received by model • Queue growth alarms from multiple Exchange servers: queue growth on Servers A, B, C, and D • Suspected root causes found in model: CPU usage high, memory bottlenecks, MTA down on target machine, network problem

  15. PATROL RCA: Sample problem • Suspected root causes tested: CPU usage high, memory bottlenecks, MTA down on target machine, network problem • Root cause isolated • CPU usage high on bridgehead
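
The sample problem on slides 14 and 15 can be walked through with a small Python sketch. It is a simplified illustration, not the PATROL RCA engine: the topology in `NEXT_HOP` (every server forwarding mail through one bridgehead) is an assumption, `OBSERVED_FAULTS` simulates test results, and all function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Suspect = Tuple[str, str]  # (machine, fault type)

# Assumed topology: each Exchange server forwards mail to this next hop.
NEXT_HOP: Dict[str, str] = {
    "A": "bridgehead",
    "B": "bridgehead",
    "C": "bridgehead",
    "D": "bridgehead",
}

# Fault types listed on the slide as possible causes of queue growth.
FAULT_TYPES: List[str] = [
    "cpu_usage_high",
    "memory_bottleneck",
    "mta_down",
    "network_problem",
]

# Simulated ground truth standing in for real diagnostic tests.
OBSERVED_FAULTS = {("bridgehead", "cpu_usage_high")}


def suspects_for(server: str) -> List[Suspect]:
    """Queue growth on a server makes every fault on its next hop a suspect."""
    target = NEXT_HOP[server]
    return [(target, fault) for fault in FAULT_TYPES]


def run_test(suspect: Suspect) -> bool:
    """Stub test: a real test would query the target machine."""
    return suspect in OBSERVED_FAULTS


def diagnose(symptom_servers: List[str]) -> Dict[Suspect, List[str]]:
    """Group symptoms by suspect, test each suspect once, keep confirmed ones."""
    candidates: Dict[Suspect, List[str]] = defaultdict(list)
    for server in symptom_servers:
        for suspect in suspects_for(server):
            candidates[suspect].append(server)
    return {s: servers for s, servers in candidates.items() if run_test(s)}


result = diagnose(["A", "B", "C", "D"])
for (machine, fault), servers in result.items():
    print(f"Root cause: {fault} on {machine}; explains queue growth on {servers}")
```

Grouping the four queue growth alarms under one confirmed fault is also a small example of symptom filtering: the administrator sees one root cause instead of four sympathetic events.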

  16. Demo: Simple RCA Scenario

  17. Sample Generic Fault Model

  18. Sample Specific Fault Model

  19. Sample Specific Fault Model: Close-up

  20. Demo: RCA Engine Causal Directed Graphs

  21. Demo: Root Cause Analysis • Exchange, NT, IP Network

  22. Demo: Impact Analysis • Exchange, NT, IP Network

  23. Benefits Of RCA • Based on well-researched theories • Quicker problem resolution • Problem isolation saves resources to address the real problem • Symptom filtering allows administrator to ignore sympathetic events • Performs tests to find the root cause • Far superior to rules-based approach • Key enabler to make systems self-sufficient • Provides impact analysis capability

  24. Systems Management Vision • Where’s all this stuff going?

  25. Phases Of Management Maturity • Phases, from lowest to highest: MONITOR, MANAGE, CONTROL, STABILIZE, VIRTUALIZE • Based on commonly known process control theory • Applies directly to management of complex software systems

  26. Maturity Phases: MONITOR • Monitoring is plumbing • Included with Windows 2000 and Exchange 2000 • Server-centric data and event collection • Monitors component and system data • No awareness of other systems or apps • Basic alerting, scripting, and actions • WMI, PerfMon, HealthMon, Exchange 2000 monitoring

  27. Maturity Phases MANAGE • Application-specific and server-centric • View and take action on components • Availability and performance monitoring • Rich reporting • Application SLA definition • ASAP resolution when out of compliance • Most correlation done in your head • Some tools have reached this level • Key enabler to Control phase

  28. Maturity Phases CONTROL • Places system automation in control • Provides holistic view of systems • Enables high level of SLA compliance • Quick problem diagnosis • Action <--> Reaction • Proactive correction before users feel impact • Management automation maturing

  29. Maturity Phases STABILIZE • Provides utility-level service • Reliable as electric, telephone, water • Assures continuous application service • Clusters • Built-in fault tolerance, re-routing, workload management • Failure does not impact service • Prediction / impact analysis • Awareness of impact on SLAs caused by planned changes

  30. Maturity Phases VIRTUALIZE • The system learns how to intelligently deal with various issues • Automatic everything • Actions and responses for the IT group • Alerts and communications • Acquires and stores knowledge for future reference • Uses policy engines to control actions • Systems become truly self-sufficient • User becomes self-serviced

  31. Virtualization Example: Problem Research Assistant • Correlates problem root cause diagnoses with: • Previous resolutions: presents the user with previous remedies based on exact matches or best guess • On-line technical documentation: integrates with vendor-supplied support documentation (e.g. Microsoft Knowledge Base articles) • Technical Support Request Generator: formats required user information and diagnosed fault into a support request, according to vendor-specific templates

  32. Virtualization Example: Problem Research Assistant • Diagram components: Problem Research Assistant, Correlation Backend, Bridge, RCA Server, Domain Models, IP Reachability Analyzer, Help, Diagnosed Faults, Support Requests, Previous Resolutions, Online Technical Articles, Problem Response History Repository

  33. RCA Takes Management To The Next Level • Maturity phases: MONITOR, MANAGE, CONTROL, STABILIZE, VIRTUALIZE • Many players, many choices • Root Cause Analysis

  34. Summary • GOAL: No interruptions in service • RCA is key to Exchange availability • Accelerates the diagnosis process • Can assess impact of failures beforehand • Not unreasonable to achieve “five 9’s” • RCA paves the way to virtualization • Managed systems that learn and adapt • You never have to intervene • Free to invest more time in proactivity • RCA is in beta now!!
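
To put the “five 9’s” goal in concrete terms, the short Python sketch below converts an availability target into a yearly downtime budget. The function name `allowed_downtime_minutes` is hypothetical, and the calculation assumes a 365-day year with availability measured around the clock.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def allowed_downtime_minutes(nines: int) -> float:
    """Yearly downtime budget for an availability of `nines` nines.

    For example, nines=5 means 99.999% availability.
    """
    unavailability = 10.0 ** -nines
    return MINUTES_PER_YEAR * unavailability


for n in (3, 4, 5):
    minutes = allowed_downtime_minutes(n)
    print(f"{n} nines: {minutes:.2f} minutes of downtime per year")
```

Five nines leaves a budget of roughly 5.26 minutes per year, which is why the time spent between the point of failure and the point of diagnosis matters so much.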

  35. Call To Action • Demand sophistication and simplicity in Exchange management solutions • Solutions that learn • Solutions that are easy to use • Start thinking of Exchange availability in terms of utility-level service • Consider where to implement RCA in your current environment • Bring along those whom you service • Take care of your users • Communicate with them as you progress
