
Managing Complex Systems in BT: Challenges and Solutions

Explore BT's approach to managing end-to-end systems, application events, business processes, component monitoring, and more. Learn about BT's Matrix Architecture challenges, solutions for service design and operation, and how they align SLAs with business requirements.



Presentation Transcript


  1. BT – Managing Complex Systems. Ian Johnston & John Palmer. BCS Kingston & Croydon Branch presentation, 26/02/08

  2. Presentation Objectives
     • Approach to managing e2e systems
     • A standard for application events
     • Business process and component transaction monitoring
     • Order tracking and jeopardy
     • Leveraging the value of monitoring, e.g. ASGs, Service and Capacity, etc.
     • Managing COTS products, e.g. BEA, Siebel

  3. The BT experience
     • BT architecture: SOA, built from linked reusable capabilities
     • Our position has been driven by experience in monitoring complex distributed architectures.
     • Configuring toolsets to monitor e2e is unachievable for large enterprises: maintenance is expensive or impossible.
     • This has led us along the Design route, which now parallels ITIL's Service Design concepts.

  4. BT’s Matrix Architecture

  5. BT Matrix Architecture Challenges - Service Design
     Service Level Management
       • SLAs aligned to business requirements
       • BT's outsourcing strategy
     Availability
       • Understanding CE requirements
       • Response times
     Capacity Management
       • Accurate measurement of transaction volumes
       • Response times broken down by capability
     IT Service Continuity management
       • Dynamic deployment in virtualised environments
       • Physical and geographic resilience
     SLM
       • Defining measurements & targets, e.g. volumes, response times
       • Aligning SLAs with UCs
     Capacity Management
       • Procedures to ensure customer targets are met
     Business Continuity management
       • Deployment designs to ensure resilience
     Availability management
       • Measure e2e availability broken down to capabilities

  6. BT Matrix Architecture Challenges - Service Operation
     Operational management
       • How to assess the impact of application events and prioritise them by business process and IT service?
     Application management
       • Routing of PRs to the appropriate support groups?
       • Analysing high volumes of events in log files?
     Technical management
       • Pinpointing root cause across multiple shared capabilities
     Metrics
       • Stepped changes in volumes, errors and response times?
       • Impact of changes, e.g. trend in error rates
       • Measuring operational efficiency, e.g. txns vs. failures

  7. BT Matrix Architecture Challenges – E2E Design
     [Diagram: end-to-end order flow "SF – Provide – Progress – pt 1" (Place Order) for an end customer; incorporates Flow Stream / Manage / Monitor / Director. Step labels: Create ServiceID, Place Order, Build Port, Get Tie Cable Mapping, Build VC (RADIUS, B-RAS, VCI, etc.), Update (SMPF ID, Installation DN, etc.), Complete Activation email, to "Close Order" sub-process. Status labels: Pending, Assigned, Acknowledged (SMPF ID), Committed, Installed, Completed. A network capacity shortfall goes into an error queue for manual processing.]

  8. BT Approach – Application event standard
     [Diagram: fields of the application event standard: business process, business transaction, event type, time, application, host, server, component capability, business keys, e2e correlation key.]
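As an illustration of slide 8, the fields it names could be carried in a single event record. This is a minimal sketch only: the class name `ApplicationEvent`, the field names, the pipe-delimited log format and all sample values are assumptions made for the example, not BT's actual standard.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ApplicationEvent:
    """One application event; field names are illustrative, not BT's schema."""
    business_process: str
    business_transaction: str
    event_type: str
    time: datetime
    application: str
    host: str
    server: str
    component_capability: str
    e2e_correlation_key: str
    business_keys: dict = field(default_factory=dict)

    def to_log_line(self) -> str:
        """Render the event as one pipe-delimited line of key=value pairs."""
        parts = [
            f"time={self.time.isoformat()}",
            f"business_process={self.business_process}",
            f"business_transaction={self.business_transaction}",
            f"event_type={self.event_type}",
            f"application={self.application}",
            f"host={self.host}",
            f"server={self.server}",
            f"component_capability={self.component_capability}",
            f"e2e_correlation_key={self.e2e_correlation_key}",
        ]
        # Business keys are appended in sorted order so lines are stable.
        parts += [f"key.{k}={v}" for k, v in sorted(self.business_keys.items())]
        return "|".join(parts)


# Hypothetical sample values, chosen only to show the shape of a record.
event = ApplicationEvent(
    business_process="Provide",
    business_transaction="Place Order",
    event_type="error",
    time=datetime(2008, 2, 26, 9, 30, tzinfo=timezone.utc),
    application="order-entry",
    host="host01",
    server="server-a",
    component_capability="Build Port",
    e2e_correlation_key="ORD-0001",
    business_keys={"order_id": "0001"},
)
print(event.to_log_line())
```

Keeping the e2e correlation key and the business keys on every event is what would let events from different capabilities be joined later without per-application parsing rules.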

  9. BT Matrix Architecture Solution - Service Design
     SLM
       • Agile design workshops to build in measures to support SLAs
     Availability
       • Agile capability workshops to build in measures for monitoring of capacity, implemented by APIs
       • Standardised events for common error conditions such as interface failures
     IT Service Continuity
       • Dynamic reports of services and deployment profile (host/server distribution)

  10. BT Matrix Architecture Solution - Service Operation
     Operational management
       • Event correlation (by service and transaction identifiers)
       • Impact (problem scenario and guided action)
       • Performance bottlenecks
       • Support group checklists (quick wins)
     Application management
       • Improved routing of PRs to the appropriate support groups, provided by the e2e view
       • We can analyse high volumes of events by restricting the types of events and providing summarisation
     Technical management
       • Diagnosis: root cause (e2e location and standard error)
     Metrics
       • Summarisation and granularity inherent in the standard
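The event correlation and root-cause items on slide 10 can be sketched as grouping events by their e2e correlation key and taking the earliest error in each chain as the candidate e2e location. The dictionary keys (`e2e_key`, `time`, `event_type`, `capability`), the function names and the sample events are all hypothetical, and treating the earliest error as the root-cause candidate is a simplifying assumption of this sketch.

```python
from collections import defaultdict


def correlate(events):
    """Group events by e2e correlation key, each group ordered by time."""
    groups = defaultdict(list)
    for ev in events:
        groups[ev["e2e_key"]].append(ev)
    # ISO 8601 timestamps in the same timezone sort correctly as strings.
    return {key: sorted(chain, key=lambda e: e["time"])
            for key, chain in groups.items()}


def first_error(chain):
    """Return the earliest error event in a correlated chain, or None."""
    return next((e for e in chain if e["event_type"] == "error"), None)


# Hypothetical events from three capabilities handling two orders.
events = [
    {"e2e_key": "ORD-1", "time": "2008-02-26T09:30:05", "event_type": "error",
     "capability": "Build VC"},
    {"e2e_key": "ORD-1", "time": "2008-02-26T09:30:01", "event_type": "info",
     "capability": "Place Order"},
    {"e2e_key": "ORD-2", "time": "2008-02-26T09:31:00", "event_type": "info",
     "capability": "Place Order"},
    {"e2e_key": "ORD-1", "time": "2008-02-26T09:30:03", "event_type": "error",
     "capability": "Build Port"},
]

for key, chain in correlate(events).items():
    err = first_error(chain)
    location = err["capability"] if err else "no error"
    print(key, "->", location)
```

Because every capability emits the same key, the join needs no knowledge of individual applications, which is the point the slide makes about the e2e view improving routing to support groups.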

  11. BT Application Monitoring Standard

  12. Outsourcing Supplier Contracts
     1. Monthly views to identify any stepped changes in
       • Volumes, response times, error rates
     2. Weekly views of top 5-10 transactions showing
       • Distribution of volumes, variance in response times, peaks and spikes
       • Any worsening trends in errors and thresholds
     3. Monthly analysis of error messages showing
       • Volumes of errors, e.g. aborts, application, business, etc.
       • Breakdown by business process, IT service and component transaction
       • Corresponding traps and CR/DRs using AlarmMis
     4. Ad-hoc investigations to review
       • Loadings and relative performance across servers
       • Real-time transaction analysis
       • Drill-down diagnostics
       • COTS, platform and network root cause analysis
     5. Service management process to review
       • Capacity
       • Supplier's (e.g. Siebel, WLS) and applications development group's CRs and DRs
       • PRs against remedial activities
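The monthly view in item 1 of slide 12 looks for stepped changes in volumes, response times and error rates. One simple way to flag such a step is to compare each month with the previous one, as in the sketch below. The function name `stepped_changes`, the 25% threshold and the sample figures are assumptions for illustration; the slides do not say how BT defines a stepped change.

```python
def stepped_changes(monthly_values, threshold=0.25):
    """Flag months whose value differs from the previous month by more than
    `threshold` (a fraction, e.g. 0.25 for 25%).

    Returns a list of (month_index, relative_change) tuples.
    """
    flagged = []
    for i in range(1, len(monthly_values)):
        prev, cur = monthly_values[i - 1], monthly_values[i]
        if prev == 0:
            # A relative change from zero is undefined; skip that month.
            continue
        change = (cur - prev) / prev
        if abs(change) > threshold:
            flagged.append((i, change))
    return flagged


# Hypothetical monthly transaction volumes (in thousands).
volumes = [100, 105, 160, 158]
for month, change in stepped_changes(volumes):
    print(f"month {month}: {change:+.0%} against previous month")
```

The same function could be applied unchanged to monthly mean response times or error rates, since it only assumes one numeric value per month.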

  13. What is the BT experience? Key messages
     • Define a standard for application events
     • Instrumentation by design, built into matrix capabilities
     • Implementation by using agile design workshops
     • Exploitation of the toolset, supported by supplier contracts
     • The application monitoring standard promotes effective problem management by integration with the enterprise's diagnostic toolsets

  14. [Diagram labels: Events; Performance; Hunter Integration Console; System & Application Trap Definitions; Management Frameworks; COTS Monitoring definitions, e.g. Siebel, BEA, Oracle; Remote Operation; Business Process & Application txn Monitoring]
     • Flexible & agile
     • Uses COTS out-of-the-box
     • Rapid development & deployment
     • Any management frameworks
     • Low maintenance
