Service Operation

elma

Presentation Transcript


  1. Service Operation

  2. Agenda/Learning Objectives • Main goals, objectives & business value of Service Operation • Generic concepts & definitions • Event • Alert • Incident • Impact, Urgency & Priority • Service Request • Problem • Workaround • Known Error • Known Error Database

  3. Agenda/Learning Objectives • Key Principles & Models • Conflicting balance in Service Operation • IT Services (external) vs. Technology component (internal) • Stability vs. Responsiveness • Quality of service vs. Cost of Service • Reactive vs. Proactive • Processes • Incident Management • Event Management • Request Fulfillment • Problem Management • Access Management

  4. Agenda/Learning Objectives • Functions • Service Desk • Technical Management • IT Operations Management • IT Operations Control • Facilities Management • Application Management

  5. Goal • The goal of Service Operation is to coordinate and carry out the activities and processes required to deliver and manage services at agreed levels to business users and customers • Service Operation is also responsible for the ongoing management of the technology that is used to deliver and support services

  6. Primary Goals, Objectives and Benefits Service Operation

  7. Primary Goals & Objectives • Manage and deliver services at agreed levels • Manage and maintain the technology that is used to deliver and support services • Enable Continual Service Improvement by monitoring performance, assessing metrics and gathering data • Coordinate and execute the processes and activities required to deliver the agreed levels of service to the business

  8. Scope • Service value is modelled in Service Strategy • The cost of service is designed, predicted and validated in Service Design and Service Transition • Measures for optimisation are identified in Continual Service Improvement

  9. Key Principles & Models Service Operation

  10. Achieving Balance • Conflict arises because constant, agreed levels of service need to be delivered in a continually evolving technical and business environment • Getting the balance wrong can mean a service that is too expensive, unable to meet business requirements or unable to respond in good time • (Diagram labels: Focus 1, Focus 2)

  11. Internal View vs. External View • Technology Components (Internal) – the components that underpin the ability to deliver the services • Different teams or departments manage the technology, so each tends to focus on achieving good performance and availability of ‘its’ systems • IT Services (External) – how customers/users experience services • Customers/users don’t worry about the details of what technology is used to manage services • Their only concern is that the services are delivered as required and agreed • (Diagram labels: Extreme focus on Internal, Extreme focus on External)

  12. Stability vs. Responsiveness • Balance between no change (stability), which may ignore changing business requirements, and too frequent change (responsiveness), which may fail to provide stable services that meet business needs • For example, a Business Unit requires additional IT Services, more capacity and faster response times • Responding to this type of change without impacting other services is a significant challenge • Many IT organizations are unable to achieve this balance and tend to focus on either the stability of the IT Infrastructure or the ability to respond to changes quickly • (Diagram labels: Extreme focus on Stability, Extreme focus on Responsiveness)

  13. Quality vs. Cost • Too much focus on quality – delivers more than necessary at higher cost • Too much focus on cost – delivers on or under budget, but at the risk of sub-standard services • Service Level Requirements (and good SLAs) can be used to deliver service at appropriate cost and avoid “over sizing” • Achieving a balance will ensure delivery of the level of service necessary to meet Business requirements at an optimal cost • (Diagram labels: Extreme focus on Cost, Extreme focus on Quality)

  14. Reactive vs. Proactive • Reactive (fire-fighting) – does not act unless prompted by an external driver • Proactive – always looking for ways to improve the current situation • Continually scans, looking for potentially impacting changes • Seen as positive behavior but can be expensive • Achieving a balance between reactive and proactive requires: • Formal, integrated Problem and Incident Management processes • Ability to prioritize technical faults and demands • Ongoing involvement from Service Level Management in Service Operation • (Diagram labels: Extremely Reactive, Extremely Proactive)

  15. Service Operation Processes Service Operation

  16. Service Operation Process

  17. Impact, Urgency & Priority

  18. Useful Definition • Event • Any detectable or discernible occurrence that has significance for the management of a CI or IT Service • Types of Events include: • Information events – e.g. a batch job has finished successfully • Warning events – e.g. a disk drive is 90% full • Exception events – e.g. a server is not responding to a poll • Alert • A warning or notice that a threshold has been reached, something has changed, or a failure has occurred • Alerts are often created and controlled by System Management tools • Can be an event which Event Management has interpreted as requiring action, e.g. a threshold on CPU usage has been exceeded
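
The three event types above can be sketched in code. This is a minimal illustration only: the `Event` fields, the `classify` and `needs_alert` names, and the rule that every non-informational event raises an alert are assumptions made for the example; only the 90% disk figure and the "not responding to a poll" case come from the slide.

```python
from dataclasses import dataclass
from enum import Enum


class EventType(Enum):
    INFORMATION = "information"
    WARNING = "warning"
    EXCEPTION = "exception"


@dataclass
class Event:
    source: str           # CI or service that raised the event (hypothetical field)
    responding: bool      # did the CI answer the last poll?
    disk_used_pct: float  # current disk usage, 0-100


# Threshold borrowed from the slide's "disk drive is 90% full" example.
DISK_WARNING_PCT = 90.0


def classify(event: Event) -> EventType:
    """Map an event onto one of the three types named on the slide."""
    if not event.responding:
        return EventType.EXCEPTION
    if event.disk_used_pct >= DISK_WARNING_PCT:
        return EventType.WARNING
    return EventType.INFORMATION


def needs_alert(event: Event) -> bool:
    """Assumed rule: anything other than an information event raises an alert."""
    return classify(event) is not EventType.INFORMATION
```

For instance, `classify(Event("srv-1", False, 10.0))` yields `EventType.EXCEPTION`, while a responding CI at 40% disk usage is classified as informational and raises no alert.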

  19. Useful Definition • Service Request • A request from a user for information, or advice, or for a standard change or for access to an IT Service • e.g. reset password, provide standard IT Services for a new user • Service requests are usually handled by the service desk, and do not require an RFC to be submitted • Incident • Unexpected interruption or reduction in quality of an IT service • Failure of a CI that has not yet impacted service is also an Incident

  20. Useful Definition • Problem • A cause of one or more Incidents • The cause is not usually known at the time the problem record is created • Workaround • A temporary way of overcoming a difficulty and restoring full or limited service (to reduce the impact) • For example, by restarting a failed configuration item • Workarounds for problems are documented in known error records • Workarounds for incidents that do not have associated problem records are documented in the incident record

  21. Useful Definition • Known Error • A problem that has a documented root cause & a workaround • Known errors are created and managed throughout their lifecycle by Problem Management • Known Error Database • A database containing all known error records • This database is created and maintained by Problem Management, and used by both Incident and Problem Management • Part of an organization’s SKMS
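
As a rough sketch of how Incident Management might consult a Known Error Database, the example below stores known error records and matches an incident description against them. The class and field names and the simple substring match are assumptions for illustration; a real KEDB would sit inside a service management tool with far richer search.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class KnownError:
    error_id: str
    symptom: str      # short text used for matching (hypothetical field)
    root_cause: str   # documented root cause
    workaround: str   # documented workaround


class KnownErrorDatabase:
    """In-memory stand-in for a KEDB maintained by Problem Management."""

    def __init__(self) -> None:
        self._records: Dict[str, KnownError] = {}

    def add(self, record: KnownError) -> None:
        self._records[record.error_id] = record

    def find(self, incident_description: str) -> Optional[KnownError]:
        """Return the first record whose symptom appears in the description."""
        text = incident_description.lower()
        for record in self._records.values():
            if record.symptom.lower() in text:
                return record
        return None
```

A first-line analyst could call `find()` with the user's description and, on a match, apply the recorded workaround instead of escalating.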

  22. Event Management Service Operation

  23. Definition • Event – any detectable or discernible occurrence that has significance for the management of a CI or IT Service • An Alert can be an Event which Event Management has interpreted as requiring action, e.g. a threshold on CPU usage has been exceeded • Event Management vs. Monitoring • The two areas are very closely related, but slightly different in nature • Event Management works with occurrences that are specifically generated to be monitored • Monitoring is broader: it tracks these occurrences, but it will also actively seek out conditions that do not generate Events

  24. Objectives & Purpose • Event Management provides the ability to detect events, make sense of them, and initiate the appropriate control action • Event Management provides a mechanism for early detection of incidents • In many cases it is possible for the incident to be detected and assigned to the appropriate group for action before any actual service outage occurs • Event Management provides a basis for automated operations, thus increasing efficiency and allowing expensive human resources to be used for more innovative work • Basis for operational monitoring and control, and entry point for many Service Operation activities

  25. Types of Events

  26. Roles • It is unnecessary to appoint a specific Event Manager • Event Management activities are delegated to the Service Desk or IT Operations Management • Technical and Application Management must ensure that staff are adequately trained and have access to the appropriate tools to perform these tasks

  27. Roles

  28. Incident Management Service Operation

  29. Goal • The primary goal of the Incident Management process is to restore normal service operation as quickly as possible and minimize the adverse impact on business operations, thus ensuring that the best possible levels of service quality and availability are maintained • ‘Normal service operation’ is defined here as service operation within Service Level Agreement (SLA) limits

  30. Objectives & Purpose • To restore normal service operation as quickly as possible • Minimize the impact on business operation • Maintain optimal levels of service quality & availability • To manage the lifecycle of incidents

  31. Scope • Incident Management covers anything (any event or occurrence) that disrupts, or could disrupt, a service • Incidents can be generated by: • User notification • Tools (e.g. HP OpenView) • Event notification (important note: not all events will become incidents, as many classes of events are not related to disruptions at all, but are indicators of normal operation or are simply informational) • IT technical staff • Incidents are reported to and managed by the Service Desk

  32. Definitions

  33. Incident Model • Many incidents are not new (they involve dealing with something that has happened before and may well happen again) • Many organizations will find it helpful to pre-define ‘standard’ incident models and apply them to appropriate incidents when they occur • An incident model is a way of predefining the steps that should be taken to handle a particular type of incident in an agreed way • The incident model should include: • The steps to be taken to handle the incident • Responsibilities: who should do what • Timescales and thresholds for completion of the actions • Escalation procedures: who should be contacted and when
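
The four elements an incident model should include (steps, responsibilities, timescales, escalation) map naturally onto a small data structure. The sketch below is one possible representation; the class names, field names and per-step escalation contact are assumptions for illustration, not part of the source material.

```python
from dataclasses import dataclass, field
from datetime import timedelta
from typing import List, Optional


@dataclass
class Step:
    description: str       # what has to be done
    responsible: str       # who should do it
    time_limit: timedelta  # timescale for completing the step
    escalate_to: str       # who to contact if the limit is exceeded


@dataclass
class IncidentModel:
    name: str
    steps: List[Step] = field(default_factory=list)

    def total_time_limit(self) -> timedelta:
        """Sum of all step limits, i.e. the longest the model should take."""
        return sum((s.time_limit for s in self.steps), timedelta())

    def escalation_contact(self, step_index: int, elapsed: timedelta) -> Optional[str]:
        """Return the escalation contact if the step has overrun, else None."""
        step = self.steps[step_index]
        return step.escalate_to if elapsed > step.time_limit else None
```

A hypothetical "password lockout" model might contain a 10-minute identity check owned by the service desk followed by a 30-minute unlock step owned by second-line support.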

  34. Timescales • Timescales must be agreed for all incident handling stages (these will differ depending upon the priority level of the incident) • These will be based on the overall incident response and resolution targets stated within SLAs • They should also be captured as targets within Operational Level Agreements (OLAs) and Underpinning Contracts (UCs) • Tools should be used to automate timescales and escalate incidents as required • Support groups must be informed of the defined timescales
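
To make the link between priority and timescales concrete, the sketch below derives a priority from impact and urgency and checks an incident against a resolution target. Both tables are hypothetical: the matrix values, the target durations and the 80% escalation threshold are illustrative assumptions, and real figures would come from the organization's SLAs.

```python
from datetime import datetime, timedelta

# Hypothetical priority matrix: (impact, urgency) -> priority, 1 = highest.
PRIORITY_MATRIX = {
    ("high", "high"): 1, ("high", "medium"): 2, ("high", "low"): 3,
    ("medium", "high"): 2, ("medium", "medium"): 3, ("medium", "low"): 4,
    ("low", "high"): 3, ("low", "medium"): 4, ("low", "low"): 5,
}

# Hypothetical resolution targets per priority; real values come from the SLA.
RESOLUTION_TARGET = {
    1: timedelta(hours=1),
    2: timedelta(hours=8),
    3: timedelta(hours=24),
    4: timedelta(hours=48),
    5: timedelta(days=5),
}


def priority(impact: str, urgency: str) -> int:
    """Look up the priority for a given impact and urgency."""
    return PRIORITY_MATRIX[(impact, urgency)]


def is_breached(logged_at: datetime, now: datetime, prio: int) -> bool:
    """True once the elapsed time exceeds the resolution target."""
    return now - logged_at > RESOLUTION_TARGET[prio]


def should_escalate(logged_at: datetime, now: datetime, prio: int,
                    warn_fraction: float = 0.8) -> bool:
    """Assumed rule: escalate once a set fraction of the target has elapsed."""
    return now - logged_at >= RESOLUTION_TARGET[prio] * warn_fraction
```

With these assumed values, a high-impact, medium-urgency incident gets priority 2; seven hours after logging it should be escalated but has not yet breached its eight-hour target.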

  35. The Process

  36. The Process

  37. The Process

  38. Major Incidents • A separate procedure, with shorter timescales and greater urgency, must be used for ‘major’ incidents • A definition of what constitutes a major incident must be agreed and ideally mapped on to the overall incident prioritization system • Special Major Incident teams may be convened directly under or reporting to the Incident Manager • May run in parallel with Problem Management but service restoration must remain the priority

  39. Escalation

  40. The Process

  41. Metrics • Total number of incidents (as a control measure) • Size of current incident backlog • Breakdown of incidents at each stage (e.g. logged, WIP, closed, etc.) • Number and percentage of major incidents • Mean elapsed time to achieve incident resolution or circumvention, broken down by impact code • Percentage of incidents handled within agreed response time (incident response time) • Targets may be specified in SLAs, for example, by impact and urgency codes • Average cost per incident • Number of incidents reopened and as a percentage of the total • Number and percentage of incidents incorrectly assigned • Number and percentage of incidents incorrectly categorized • Percentage of incidents closed by the service desk without reference to other levels of support • Number and percentage of incidents processed per service desk agent • Number and percentage of incidents resolved remotely, without the need for a visit • Number of incidents handled by each incident model
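
A few of the metrics listed above can be computed directly from incident records. The sketch below covers the backlog size, the percentage handled within the response target, the percentage reopened and the mean resolution time; the `IncidentRecord` fields and status names are assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Dict, List, Optional


@dataclass
class IncidentRecord:
    status: str                           # "logged", "wip" or "closed" (assumed names)
    resolution_time: Optional[timedelta]  # None while unresolved
    within_response_target: bool
    reopened: bool


def percent(part: int, whole: int) -> float:
    """Percentage helper that avoids dividing by zero."""
    return 100.0 * part / whole if whole else 0.0


def summarize(records: List[IncidentRecord]) -> Dict[str, object]:
    """Compute a handful of the incident metrics named on the slide."""
    total = len(records)
    resolved = [r.resolution_time for r in records if r.resolution_time is not None]
    mean_resolution = sum(resolved, timedelta()) / len(resolved) if resolved else None
    return {
        "total": total,
        "backlog": sum(1 for r in records if r.status != "closed"),
        "pct_within_response_target": percent(
            sum(1 for r in records if r.within_response_target), total),
        "pct_reopened": percent(sum(1 for r in records if r.reopened), total),
        "mean_resolution_time": mean_resolution,
    }
```

The same pattern extends to the other percentages in the list (incorrectly assigned, resolved remotely and so on) by adding one boolean field per measure.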

  42. Challenges • Having the ability to detect incidents as early as possible • Ensuring all incidents are logged (convincing both users and technical staff) • Availability of information: Problems & Known Errors • Integration into: • Configuration Management: use the CMS to determine relationships between CIs & find the history of CIs • SLM: to correctly assess impact and priority • SLM: use defined escalation procedures

  43. Critical Success Factors • A good Service Desk is key to successful Incident Management • Clearly defined targets to work to – as defined in SLAs • Adequate customer-oriented and technically trained support staff with the correct skill levels, at all stages of the process • OLAs and UCs that are capable of influencing and shaping the correct behaviour of all support staff • Effective Problem Management process (reduce the volume of incidents)

  44. Value to the Business • The ability to detect and resolve incidents results in higher availability of the service, which in turn means less downtime to the business • The ability to align IT activity to business priorities • This is because Incident Management includes the capability to identify business priorities and allocate resources as necessary • Incident Management is highly visible to the business, and it is therefore easier to demonstrate its value than most areas in Service Operation • For this reason, Incident Management is often one of the first processes to be implemented in service management projects

  45. Roles • Incident Manager • Drive efficiency & effectiveness • Produce management information • Manage work of incident support staff (1st & 2nd line) • Monitor effectiveness of process & recommend improvement • Develop & maintain Incident Management systems • Manage major incidents • Develop & maintain the process & procedures • First-line Support • Carried out by the service desk function

  46. Roles • Second-line Support • Normally a group with greater, but still general, technical skills than the Service Desk • Handles many of the less complicated incidents • Benefits from being co-located with the Service Desk, as communication and access are improved • Third-line Support • Specialist internal and external technical groups • Concentrate on more difficult incidents

  47. Request Fulfillment Service Operation

  48. Purpose, Goal, Objectives • Request Fulfillment is the process of dealing with service requests from users. The objectives of the Request Fulfillment process include: • To provide a channel for users to request and receive standard, pre-defined, pre-authorised services • To provide information to users and customers about the availability of services and the procedure for obtaining them • To source and deliver the components of requested standard services (e.g. licenses and software media) • To assist with general information, complaints or comments

  49. Basic Concepts • Many service requests will be frequently recurring, so a predefined process flow (Request Model) can be devised to aid consistency, control and safety • This is similar in concept to Incident Models but applied to service requests • Service requests will usually be satisfied by implementing a standard change • The value of Request Fulfillment • Provides quick and effective access to standard services which business staff can use to improve their productivity • Effectively reduces the bureaucracy involved in requesting and receiving access to existing or new services
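
The idea of a Request Model can be sketched as a catalogue of predefined step lists keyed by request type. Everything named here is hypothetical: the catalogue entries, the step wording, and the choice to hand unrecognised requests to change management are assumptions for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical catalogue of standard, pre-authorised requests and their steps.
REQUEST_MODELS: Dict[str, List[str]] = {
    "password_reset": ["verify user identity", "reset password", "confirm with user"],
    "new_user_setup": ["create account", "assign standard services", "notify user"],
}


def route_request(request_type: str) -> Tuple[str, List[str]]:
    """Return the handling route and the predefined steps for a request.

    Requests found in the catalogue follow their Request Model; anything else
    is (by assumption) passed on to change management for assessment.
    """
    steps = REQUEST_MODELS.get(request_type)
    if steps is None:
        return ("change_management", [])
    return ("request_model", list(steps))
```

Calling `route_request("password_reset")` returns the predefined three-step flow, so the service desk can fulfil the request consistently without a separate approval.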
