Explore the critical role Service Operation plays in delivering value to customers in the ITIL Service Lifecycle. Learn how effective communication, event management, and processes like incident and problem management contribute to operational success.
ITIL SERVICE LIFECYCLE SERVICE OPERATION
ITIL LIFECYCLE – STAGES – SERVICE OPERATION Service Operation is where the value of the services being provided is first realized by the customer. During Service Operation, the day-to-day operation of the processes that manage the services takes place. It is also where performance metrics for the services are gathered and reported. The Service Operation phase of the Service Lifecycle is concerned with ensuring that services operate within agreed parameters. When service interruptions do occur, Service Operation is charged with restoring service as quickly as possible and with minimizing the impact to the business. Service Operation is the only lifecycle phase in which value is actually realized by customers. Whereas all other phases of the Service Lifecycle contribute to and enable value, it is only experienced during Service Operation.
ITIL LIFECYCLE – STAGES – SERVICE OPERATION Service Operation also adds business value by:
• Ensuring that services are operated within expected performance parameters
• Restoring services quickly in the event of service interruption
• Minimizing impact to the business in the event of service interruption
• Providing a focal point for communication between users and the Service Provider organization
ITIL LIFECYCLE – STAGES – SERVICE OPERATION During Service Operation, communication is especially critical. ITIL stresses the importance of communication:
• Between users and the IT Service Provider
• Between customers and the IT Service Provider
• Between different processes, functions, teams, etc. within the IT Service Provider
• Between the IT Service Provider and its suppliers
ITIL LIFECYCLE – STAGES – SERVICE OPERATION – PROCESSES Service Operation stage processes are:
• Event Management
• Incident Management
• Problem Management
• Request Fulfillment Management
• Access Management
• Application Lifecycle Management
ITIL LIFECYCLE – STAGES – SERVICE OPERATION – PROCESSES Service Operation stage functions are:
• Service Desk
• Technical Management
• IT Operations Management
• Application Management
ITIL SERVICE LIFECYCLE SERVICE OPERATION – EVENT MANAGEMENT
EVENT MANAGEMENT
• An event can be defined as any detectable or discernible occurrence that has significance for the management of the IT Infrastructure or the delivery of IT service and evaluation of the impact a deviation might cause to the services.
• Events are typically notifications created by an IT service, Configuration Item (CI) or monitoring tool.
• Effective Service Operation is dependent on knowing the status of the infrastructure and detecting any deviation from normal or expected operation. This is provided by good monitoring and control systems, which are based on two types of tools:
  • Active monitoring tools that poll key CIs to determine their status and availability. Any exceptions will generate an alert that needs to be communicated to the appropriate tool or team for action.
  • Passive monitoring tools that detect and correlate operational alerts or communications generated by CIs.
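As a rough illustration of active monitoring as described above, the Python sketch below polls a set of CIs and produces one alert for each CI that fails its status check. The `ConfigurationItem` class, the `check` probe, and the alert text are hypothetical names chosen for this example; they are not defined by ITIL, and a real tool would probe devices over the network rather than call a local function.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ConfigurationItem:
    """Hypothetical CI record: a name plus a status probe."""
    name: str
    check: Callable[[], bool]  # returns True when the CI is available


def poll(cis: List[ConfigurationItem]) -> List[str]:
    """Actively poll each CI and return one alert per exception found."""
    alerts = []
    for ci in cis:
        try:
            available = ci.check()
        except Exception:
            # Assumption: a probe that fails to run counts as "unavailable".
            available = False
        if not available:
            alerts.append(f"ALERT: {ci.name} is not responding")
    return alerts


if __name__ == "__main__":
    cis = [
        ConfigurationItem("mail-server", lambda: True),
        ConfigurationItem("print-server", lambda: False),
    ]
    for alert in poll(cis):
        print(alert)
```

A passive tool would work the other way round: instead of calling `check`, it would receive and correlate messages that the CIs send on their own.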
EVENT MANAGEMENT Event Management is the process of managing trigger events, which ITIL defines as alerts or notifications created by an IT service, configuration item, or monitoring tool. During this process, trigger events occur and are then detected and filtered. If the trigger events are determined to be significant, they generate an incident, problem, or change request (work events). If the trigger event is determined to be just an alert, it is assigned to responsible personnel, reviewed, and then closed.
EVENT MANAGEMENT • The ability to detect events, make sense of them and determine the appropriate control action is provided by Event Management. Event Management is therefore the basis for Operational Monitoring and Control. • In addition, if these events are programmed to communicate operational information as well as warnings and exceptions, they can be used as a basis for automating many routine Operations Management activities, for example executing scripts on remote devices, or submitting jobs for processing, or even dynamically balancing the demand for a service across multiple devices to enhance performance. • Event Management therefore provides the entry point for the execution of many Service Operation processes and activities. • In addition, it provides a way of comparing actual performance and behavior against design standards and SLAs. As such, Event Management also provides a basis for Service Assurance and Reporting; and Service Improvement. This is covered in detail in the Continual Service Improvement publication.
EVENT MANAGEMENT The objectives:
• Detect Events, make sense of them, and determine the appropriate control action
• Event Management is the basis for Operational Monitoring and Control
EVENT MANAGEMENT – BASIC CONCEPTS • Event An alert or notification created by any IT Service, Configuration Item or monitoring tool. For example a batch job has completed. Events typically require IT Operations personnel to take actions, and often lead to Incidents being logged. • Event Management The Process responsible for managing Events throughout their Lifecycle.
EVENT MANAGEMENT – BASIC CONCEPTS There are many different types of events:
• Events that signify regular operation
  • Notification that a scheduled workload has completed
  • An e-mail has reached its intended recipient
• Events that signify an exception
  • A user attempts to log on to an application with the incorrect password
  • A device’s CPU is above the acceptable utilization rate
• Events that signify unusual, but not exceptional, operation
  • A server’s memory utilization reaches within 5% of its highest acceptable performance level
EVENT MANAGEMENT – BASIC CONCEPTS Examples of event categories. Informational: An informational event does not require any action and does not represent an exception. Such events are typically stored in the system or service log files and kept for a predetermined period. They are typically used to check on the status of a device or service, or to confirm the successful completion of an activity. Examples of informational events include:
• A user logs onto an application
• A job in the batch queue completes successfully
• A device has come online
• A transaction is completed successfully
EVENT MANAGEMENT – BASIC CONCEPTS Examples of event categories. Warning: A warning is an event that is generated when a service or device is approaching a threshold. Warnings are intended to notify the appropriate person, process or tool so that the situation can be checked and the appropriate action taken to prevent an exception. Warnings are not typically raised for a device failure. Examples of warnings are:
• Memory utilization on a server is currently at 65% and increasing. If it reaches 75%, response times will be unacceptably long and the OLA for that department will be breached.
• The collision rate on a network has increased by 15% over the past hour.
EVENT MANAGEMENT – BASIC CONCEPTS Examples of event categories. Exception: An exception means that a service or device is currently operating abnormally (however that has been defined). Typically, this means that an OLA and SLA have been breached and the business is being impacted. Exceptions could represent a total failure, impaired functionality or degraded performance. Examples of exceptions include:
• A server is down
• Response time of a standard transaction across the network has slowed to more than 15 seconds
• A segment of the network is not responding to routine requests
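The three categories can be sketched as a simple threshold check. The Python example below is only an illustration: it borrows the 65% and 75% memory utilization figures from the warning example as default thresholds, and the `Category`, `classify` and `response_for` names, along with the response strings, are assumptions made for this sketch rather than anything ITIL prescribes.

```python
from enum import Enum


class Category(Enum):
    INFORMATIONAL = "informational"
    WARNING = "warning"
    EXCEPTION = "exception"


def classify(utilization: float,
             warning_at: float = 65.0,
             exception_at: float = 75.0) -> Category:
    """Classify a utilization reading (in percent) against two thresholds."""
    if utilization >= exception_at:
        return Category.EXCEPTION
    if utilization >= warning_at:
        return Category.WARNING
    return Category.INFORMATIONAL


def response_for(category: Category) -> str:
    """Hypothetical mapping from category to a typical response."""
    if category is Category.EXCEPTION:
        return "raise an incident, problem or change"
    if category is Category.WARNING:
        return "alert the responsible person or tool"
    return "write to the log"


if __name__ == "__main__":
    for reading in (40.0, 68.5, 91.0):
        category = classify(reading)
        print(f"{reading}% -> {category.value}: {response_for(category)}")
```

In practice the thresholds would come from the relevant OLA or SLA rather than from hard-coded defaults.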
[Figure: Event Management - Logging and Filtering. Diagram labels: Event, Filter, Information, Warning, Exception.]
[Figure: Event Management - Managing Exceptions. Diagram labels: Exception, Incident / Problem / Change?, Incident (Incident Management), Problem (Problem Management), RFC (Change Management).]
[Figure: Event Management - Information & Warnings. Diagram labels: Information, Log, Warning, Alert, Auto Response, Human Intervention, Incident / Problem / Change?, Incident, Problem, RFC.]
Event Management - Roles
• Event Management roles are filled by people in the following functions:
  • Service Desk
  • Technical Management
  • Application Management
  • IT Operations Management
ITIL SERVICE LIFECYCLE SERVICE OPERATION – INCIDENT MANAGEMENT
INCIDENT MANAGEMENT Incident: An incident is any occurrence which causes or may cause interruption or degradation to an IT Service. The usual priority when an incident occurs must be to restore normal service as quickly as possible, with minimum disruption to the users. An incident is defined as an unplanned, unexpected or unexplained disruption in service: any event which is not part of the standard operation of a service and which causes or may cause an interruption to or a reduction in the quality of the service that is provided. E.g. a mail server not responding to incoming or outgoing messages.
INCIDENT MANAGEMENT Incident Management is concerned with the rapid restoration of services and with minimization of impact to the business. In most but not all cases the Incident Management process is owned and executed by the Service Desk.
INCIDENT MANAGEMENT Within ITIL, Incident Management consists of a number of basic activities or steps:
• Detection – The incident becomes known by any mechanism, e.g. user call, system alert, etc.
• Logging – Details of the incident are recorded in the incident management system. All incidents must be fully logged and date/time stamped, regardless of whether they are raised through a Service Desk telephone call or detected automatically via an event alert.
• Classification – The incident is categorized according to predefined criteria for the purpose of facilitating diagnosis and prioritizing its handling relative to other incidents.
• Prioritization – The impact and urgency of the incident are determined and factored together to determine its relative priority among other incidents.
INCIDENT MANAGEMENT Within ITIL, Incident Management consists of a number of basic activities or steps:
• Initial Diagnosis – If the incident has been routed via the Service Desk, the Service Desk analyst must carry out initial diagnosis, using diagnostic scripts and known error information to try to discover the full symptoms of the incident and to determine exactly what has gone wrong. The analyst then uses the collected symptoms to search the Knowledge Base for an appropriate solution. If possible, the analyst resolves the incident and closes it once the resolution is confirmed as successful.
• Escalation – If necessary, the incident may be forwarded to the appropriate handling group.
INCIDENT MANAGEMENT Within ITIL, Incident Management consists of a number of basic activities or steps:
• Investigation and Diagnosis – Additional details regarding the incident are gathered and used along with tools such as the Known Error Database to attempt resolution.
• Resolution and Recovery – Service is restored and users are provided assistance to allow them to resume work.
• Closure – Successful resolution of the incident is verified with the user, the incident resolution details are recorded, and the incident is flagged as being closed in the incident management system.
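The Prioritization step, in which impact and urgency are factored together, can be sketched as a small lookup. The Python example below assumes one common convention: three levels for impact and urgency, combined into a priority from 1 (highest) to 5 (lowest) by adding the two levels. The scale, the formula and the `Level`, `Incident` and `work_queue` names are assumptions for illustration; an organization would define its own matrix.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Level(IntEnum):
    HIGH = 1
    MEDIUM = 2
    LOW = 3


@dataclass
class Incident:
    summary: str
    impact: Level
    urgency: Level

    @property
    def priority(self) -> int:
        # Assumed convention: 1 is the highest priority, 5 the lowest.
        return int(self.impact) + int(self.urgency) - 1


def work_queue(incidents: List[Incident]) -> List[Incident]:
    """Order incidents so that the highest priority comes first."""
    return sorted(incidents, key=lambda incident: incident.priority)


if __name__ == "__main__":
    incidents = [
        Incident("One user cannot print", Level.LOW, Level.LOW),
        Incident("Mail server not responding", Level.HIGH, Level.HIGH),
        Incident("Payroll report is slow", Level.HIGH, Level.LOW),
    ]
    for incident in work_queue(incidents):
        print(f"P{incident.priority}: {incident.summary}")
```

Because Python's `sorted` is stable, incidents with equal priority keep the order in which they were logged.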
ITIL SERVICE LIFECYCLE SERVICE OPERATION – REQUEST FULFILLMENT
REQUEST FULFILLMENT MANAGEMENT A service request is a request from a user for information or advice, or for a standard change, or for access to an IT service. The purpose of Request Fulfillment is to enable users to request and receive standard services; to source and deliver these services; to provide information to users and customers about services and procedures for obtaining them; and to assist with general information, complaints and comments. Request Fulfillment Management is the process that manages service requests received from the users. It is important to distinguish between incidents and service requests. Incidents are unplanned, and any change needed to resolve them must be approved through Change Management. A service request, on the other hand, is a request that has a standard procedure for response and is pre-approved by Change Management. All requests should be logged and tracked. The process should include appropriate approval before fulfilling the request.
REQUEST FULFILLMENT MANAGEMENT The term ‘Service Request’ is used as a generic description for many varying types of demands that are placed upon the IT Department by the users. Many of these are actually small changes (low risk, frequently occurring, low cost, etc.), for example:
• A request to change a password
• A request to install an additional software application onto a particular workstation
• A request to relocate some items of desktop equipment
• A question requesting information
Their scale and frequent, low-risk nature means that they are better handled by a separate process, rather than being allowed to congest and obstruct the normal Incident and Change Management processes.
REQUEST FULFILLMENT MANAGEMENT The objectives of the Request Fulfillment process include:
• To provide a channel for users to request and receive standard services for which a pre-defined approval and qualification process exists
• To provide information to users and customers about the availability of services and the procedure for obtaining them
• To source and deliver the components of requested standard services (e.g. licences and software media)
• To assist with general information, complaints or comments
REQUEST FULFILLMENT MANAGEMENT Roles (not usually dedicated staff):
• Service Desk staff
• Incident Management staff
• Service Operations teams
ITIL SERVICE LIFECYCLE SERVICE OPERATION – PROBLEM MANAGEMENT
PROBLEM MANAGEMENT Problem: A problem is the unknown underlying cause of one or more incidents. A problem is NOT just a particularly serious incident. E.g. a mail server is not responding to incoming or outgoing messages, and the root cause is later identified as a loss of power: the server was accidentally unplugged while other servers were being unplugged and relocated to another part of the building. ITIL recommends a clear demarcation between incident control and problem management. If the help desk cannot resolve an incident, it is progressed to problem management.
PROBLEM MANAGEMENT Error: An error is the known underlying cause of one or more incidents. Known Error: A known error is the known cause of an incident for which a workaround also exists.
PROBLEM MANAGEMENT A problem is a cause of one or more incidents. The cause is not usually known at the time a problem record is created, and the problem management process is responsible for further investigation. The key objectives of Problem Management are to prevent problems and resulting incidents from happening, to eliminate recurring incidents and to minimize the impact of incidents that cannot be prevented. Problem Management includes diagnosing causes of incidents, determining the resolution, and ensuring that the resolution is implemented. Problem Management also maintains information about problems and the appropriate workarounds and resolutions.
PROBLEM MANAGEMENT Problems are categorized in a similar way to incidents, but the goal is to understand causes, document workarounds and request changes to permanently resolve the problems. Workarounds are documented in a Known Error Database, which improves the efficiency and effectiveness of Incident Management. Although Incident and Problem Management are separate processes, they are closely related and will typically use the same tools, and may use similar categorization, impact and priority coding systems. This will ensure effective communication when dealing with related incidents and problems.
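To illustrate how a Known Error Database can speed up Incident Management, the Python sketch below matches the symptoms reported for an incident against stored known errors and returns the best-matching workaround. The record layout, the symptom keywords and the overlap-based matching are all assumptions made for this example; real tools typically rely on richer search and categorization.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class KnownError:
    """Hypothetical Known Error record."""
    error_id: str
    symptoms: Set[str]
    workaround: str


def find_known_error(reported: Set[str],
                     kedb: List[KnownError]) -> Optional[KnownError]:
    """Return the known error sharing the most symptoms, or None if none match."""
    best: Optional[KnownError] = None
    best_overlap = 0
    for known_error in kedb:
        overlap = len(reported & known_error.symptoms)
        if overlap > best_overlap:
            best, best_overlap = known_error, overlap
    return best


if __name__ == "__main__":
    kedb = [
        KnownError("KE-1", {"printer", "form feed"},
                   "Advance the paper with the form feed button"),
        KnownError("KE-2", {"printer", "driver", "new user"},
                   "Reinstall the printer driver after first logon"),
    ]
    match = find_known_error({"printer", "driver"}, kedb)
    print(match.workaround if match else "No known error; escalate")
```

When no known error matches, the incident would follow the normal escalation path and may lead to a new problem record.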
PROBLEM MANAGEMENT Problem Management is broadly divided into two major sub-processes: . Reactive Problem Management, which is charged with responding to problems as they arise in the environment, usually driven by the Incident Management process. . Proactive Problem Management, which is charged with proactively seeking out improvements to services and infrastructure before incidents occur.
PROBLEM MANAGEMENT Technical problems can exist without impacting the user. However, if they are not spotted and dealt with before an incident occurs, they can have a big impact on the availability of IT Services. Problems experienced by users:
• The printer won’t form-feed paper. The user has to advance the paper by using the form feed button.
• Each time a new user logs onto a computer, they have to reinstall the printer driver.
• Windows applications crash intermittently without an error message. The computer will restart and work properly afterwards.
PROBLEM MANAGEMENT Technical problems:
• Disk space usage is erratic. Sometimes a considerable amount of disk space is available, but at other times little is available. There is no obvious reason and no impact on the users so far.
• A network card is creating lots of unnecessary traffic on the network, which could eventually reduce the bandwidth available, leading to a slow response to network requests.
PROBLEM MANAGEMENT Roles:
• Problem Manager
• Supported by technical groups:
  • Technical Management
  • IT Operations
  • Application Management
  • Third-party suppliers
PROBLEM MANAGEMENT The benefits of taking a formal approach to problem management include the following:
• Improved quality of the IT service. A high quality, reliable service is good for the business/organization.
• Incident volume reduction. Problem management is instrumental in reducing the number of incidents that interrupt the business/organization every day.
• Permanent solutions. There will be a gradual reduction in the number and impact of problems and known errors as those that are resolved stay resolved.
• Improved organizational learning. The problem management process is based on the concept of learning from past experience. The process provides the historical data to identify trends, and the means of preventing failures and of reducing the impact of failures, resulting in improved productivity.
• A better first time fix rate at the Service Desk. Problem management enables the Service Desk to know how to deal with problems and incidents that have previously been resolved and documented.
ITIL SERVICE LIFECYCLE SERVICE OPERATION – ACCESS MANAGEMENT
ACCESS MANAGEMENT The purpose of the Access Management process is to provide the rights for users to be able to access a service or group of services, while preventing access to non-authorized users. Access Management helps to manage confidentiality, availability and integrity of data and intellectual property. Access Management is concerned with identity (unique information that distinguishes an individual) and rights (settings that provide access to data and services). The process includes verifying identity and entitlement, granting access to services, logging and tracking access, and removing or modifying rights when status or roles change. It has also been referred to as Rights Management or Identity Management in different organizations.
ACCESS MANAGEMENT Access Management is effectively the execution of both Availability and Information Security Management, in that it enables the organization to manage the confidentiality, availability and integrity of the organization’s data and intellectual property. Access Management ensures that users are given the right to use a service, but it does not ensure that this access is available at all agreed times – this is provided by Availability Management. Access Management is a process that is executed by all Technical and Application Management functions and is usually not a separate function. However, there is likely to be a single control point of coordination, usually in IT Operations Management or on the Service Desk. Access Management can be initiated by a Service Request through the Service Desk.
ACCESS MANAGEMENT – BASIC CONCEPTS
• Access refers to the level and extent of a service’s functionality or data that a user is entitled to use.
• Identity refers to the information about a user that distinguishes them as an individual and which verifies their status within the organization. By definition, the Identity of a user is unique to that user.
• Rights (also called privileges) refer to the actual settings whereby a user is provided access to a service or group of services. Typical rights, or levels of access, include read, write, execute, change, delete.
• Services or service groups. Most users do not use only one service, and users performing a similar set of activities will use a similar set of services. Instead of providing access to each service for each user separately, it is more efficient to grant each user, or group of users, access at the same time to the whole set of services that they are entitled to use.
• Directory Services refers to a specific type of tool that is used to manage access and rights.
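The idea of granting rights through service groups rather than per user and per service can be sketched as follows. This Python example is only an illustration: the group names, users, services and the two lookup tables are hypothetical, and a real organization would hold this data in a directory service rather than in code.

```python
from typing import Dict, Set

# Hypothetical service groups: group name -> {service: rights granted}.
SERVICE_GROUPS: Dict[str, Dict[str, Set[str]]] = {
    "all-staff": {"email": {"read", "write"}, "intranet": {"read"}},
    "finance-staff": {"ledger": {"read", "write"}, "payroll": {"read"}},
}

# Hypothetical identities: each unique user ID maps to its groups.
USER_GROUPS: Dict[str, Set[str]] = {
    "alice": {"all-staff", "finance-staff"},
    "bob": {"all-staff"},
}


def rights_for(user: str, service: str) -> Set[str]:
    """Collect the rights a user holds on a service via all of their groups."""
    rights: Set[str] = set()
    for group in USER_GROUPS.get(user, set()):
        rights |= SERVICE_GROUPS.get(group, {}).get(service, set())
    return rights


def is_allowed(user: str, service: str, right: str) -> bool:
    """Return True only if the user holds the requested right on the service."""
    return right in rights_for(user, service)


if __name__ == "__main__":
    print(is_allowed("alice", "ledger", "write"))
    print(is_allowed("bob", "ledger", "read"))
```

Changing a user's role then means editing one entry in `USER_GROUPS`, which matches the point about modifying rights when status or roles change.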