What a DBA should know about ITIL Mike Sniezek
Agenda • Introduction and Disclaimer • Why - What - When – How – Where - ITIL • Where you fit in • Nothing Like an Example • Questions Mike Sniezek – BMC Software
Introduction and Disclaimer • My perspective and information come from spending time with the IT organizations of many companies, both large and small but, honestly, mostly enormous ones. • They are all looking at or moving to a set of standards, ITIL being the most popular. • You need to justify a standard by the problems it solves. Every organization has problems. In my experience, the biggest value of implementing a standard is finding all the problems you didn’t know about. DBAs find a lot of problems.
Why Change • Leaps of faith • “This should work,” or even “I hope this works!” • Proactive monitoring? Why? The users always let us know • “When this happens again, I’ll remember” • “It must be the developer; everything looks fine” • “I don’t need any advice from that guy” • “We just need more stuff: hardware, software…”
Standards – WHY? • Quality Systems • ISO 900x, TQM, EFQM, Six Sigma, Malcolm Baldrige, Theory of Constraints, Statistical Process Control, Deming, etc. • Process Frameworks • IT Infrastructure Library, Meta Models, IBM Processes, EDS Digital Workflow, Microsoft MOF, Telecom Ops Map, etc. • What is not defined cannot be controlled • What is not controlled cannot be measured • What is not measured cannot be improved
Motivation For ITIL - WHY • The mystery between business and information management. • When upper management does not understand why IT is doing something it’s a mystery. When IT does not understand why upper management is doing something it’s a mistake. • Money or the concept of making money is often lost between internal organizations because of lack of communication and especially lack of process.
IT Culture • Technology focused – isolation, even between IT groups. • Service focused – looks at an application end to end, but few applications live in isolation. • Customer focused – a strategy centered on the customer; an internal department is treated just like an outsourced group of services. • Business focused – IT is viewed as a partnership with the business.
WHY do I care? • You meet the SLAs and no one is really happy. • You can’t really discuss that project because Jack’s away. • You spend forever fixing the wrong problem. (What do I mean?) • I hate surprises. • Complexity of applications and their interaction across multiple platforms and data types • Everything requires 24 x 7 • Data is now growing exponentially • 4000 servers need an upgrade by Wednesday • Nothing is getting smaller or fewer. • More data – greater complexity – less time – fewer resources
Objective • Three Key Objectives of ITIL • Align IT services to meet the needs of business and customers • Improve quality of IT service delivery • Reduce the long-term cost of services • IT is complex, and a common methodology is needed to manage today’s level of complexity and to cope with the increase in complexity in the future. ITIL also promotes a common language for communication within a business.
ITIL – What is IT? • It’s a bunch of books. • It’s a library of defined best practices or processes, of which you can choose to implement some or all. • ITIL is recognized as the de facto standard for IT Service Management • ITIL is a best practices framework. Kind of… • ITIL has a strong relationship with the ISO 9000 quality framework
Language - Defined • Incident. • Any event that is not part of the standard operation of a service and causes, or may cause, an interruption to, or a reduction in, the quality of service. • Problem. • The undiagnosed root cause of one or more incidents. • Known error. • An incident or problem for which the root cause is known and a temporary workaround or a permanent alternative has been identified. If a business case exists, an RFC will be raised but, in any event, it remains a known error unless it is permanently fixed by a change. • Major incident. • An incident with a high impact, or potentially high impact, which requires a response that is above and beyond that given to normal incidents. Typically, these incidents require cross-company coordination, management escalation, the mobilization of additional resources, and increased communications.
“Houston, we have a problem.” • “No, you have an incident.” • “No… Houston, we have a problem!” • A common language avoids confusion.
ITIL – The Big Picture
Support and Service • ITIL is organized into a series of sets, which are themselves divided into two main areas: service support and service delivery. • Service Delivery • What services must the data center provide to the business to adequately support it? • Service Support • How does the data center ensure that the customer has access to the appropriate services?
Service Support • Service Support comprises those disciplines that enable IT services to be provided effectively. They are broadly concerned with delivering and supporting IT services that are appropriate to the business requirements of the organization. • Service Support is divided into: • Change Management • Release Management • Problem Management • Incident Management • Configuration Management • Service Desk
Service Delivery • Service Delivery is the management of the IT services themselves, and involves a number of management practices to ensure that IT services are provided as agreed between the Service Provider and the Customer. Essentially, service providers need to offer business users adequate support: Service Delivery covers those issues which must be taken into consideration to ensure this. • Service Delivery is divided into: • IT Financial Management • IT Continuity Management • Capacity Management • Availability Management • Service Level Management
Service with a - Consistent Definitions • Service request. • A request for new or altered service. The types of service request vary between organizations, but common ones include requests for information (RFI), procurement requests, and service extensions. Requests for change (RFC) may also be included as part of a service request. • Service. • A business function deliverable by one or more IT service components (hardware, software, and facility) for business use. • Service catalog. • A comprehensive list of services, including priorities of the business and corresponding SLAs. • Service level agreement. • A written agreement documenting the required levels of service. The SLA is agreed on by the IT service provider and the business, or the IT service provider and a third-party provider. • Service level management. • The process of defining and managing, through monitoring, reporting, and reviewing, the required and expected level of service for the business in a cost-effective manner. • Service level objectives. • Objectives within an SLA detailing specific key expectations for that service.
Service Support • Service Desk: single point of contact • Incident Mgmt: “Hey, this happens a lot” • Problem Mgmt: is this an error or a problem? • Change Mgmt: approve and manage • Release Mgmt: make sure it gets done • Configuration Mgmt: keeping track of CIs (CMDB)
Responsibilities – No change • Database Security • Database Change Management • Database Performance • SQL • State of Tables • Parameters and Sizing • Database Recovery • Application Failure • Disaster Recovery • Database or software failure • Interaction with Development
DBA and Service – Example • Problem Management has found that response time on a customer information system is consistently bad around the middle of every month, and then the problem suddenly goes away. • How do they know? From the number of help desk incidents registered against the customer information system. • What has it got to do with you? The database you manage is listed as an asset of the CI in the CMDB. • What are you supposed to do? • Believe it or not, they are trying to turn this problem into an error, even a known error. • You investigate and note that a great number of updates hit that database and that the reorganization is scheduled for the third week of every month.
Example Continued - Now we have a Known Error • What can make this error go away? • Based on your knowledge, you go with a REORG on some primary tables every week instead of once a month. • A Request For Change (RFC) would go to Change Management • Implementation Management would ensure the change was implemented • A Post Implementation Review would be scheduled • You would still do all the same tasks you do today without ITIL. • With ITIL all of this is tracked. • Also tracked: • Mean-Time-To-Repair (MTTR) • Mean-Time-Between-Failure (MTBF)
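As a rough illustration of the two metrics named on this slide, the sketch below derives MTTR and MTBF from a list of outage windows. The outage timestamps and the 30-day observation period are made-up sample values, and the convention used (MTTR as average downtime per outage, MTBF as average uptime per outage) is an assumption; your service management tooling may define them differently.

```python
from datetime import datetime, timedelta

def mttr_mtbf(outages, period_start, period_end):
    """Return (MTTR, MTBF) as timedeltas for a list of (down, up) outage windows.

    Assumed convention: MTTR = total downtime / number of outages,
    MTBF = total uptime / number of outages.
    """
    if not outages:
        raise ValueError("need at least one outage to compute the means")
    downtime = sum((up - down for down, up in outages), timedelta())
    uptime = (period_end - period_start) - downtime
    return downtime / len(outages), uptime / len(outages)

# Hypothetical sample data: two database outages in a 30-day window.
start = datetime(2024, 1, 1)
end = start + timedelta(days=30)
outages = [
    (datetime(2024, 1, 10, 9, 0), datetime(2024, 1, 10, 11, 0)),   # 2 hours down
    (datetime(2024, 1, 20, 14, 0), datetime(2024, 1, 20, 18, 0)),  # 4 hours down
]

mttr, mtbf = mttr_mtbf(outages, start, end)
print("MTTR:", mttr)
print("MTBF:", mtbf)
```

With the sample data, six hours of downtime over two outages gives an MTTR of three hours; the remaining 714 hours of uptime give an MTBF of 357 hours.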
Incident Management • “Restore normal service operation as quickly as possible and minimize the adverse impact on business operations, thus ensuring that the best possible levels of service quality and availability are maintained” • Starts with a reported problem or complaint, via the help desk, service desk, etc. • The DBA might get a call • Reactive, break-fix • Database down, database slow, job failure, schema changes, add users • Service desk, call center, ticketing system, P1, SEV-1 • 24x7, remote access, VPN
Problem Management • “Minimize the adverse impact of Incidents and Problems on the business that are caused by errors within the IT infrastructure, and to prevent recurrence of Incidents related to these errors” • Remember my example: they find the problems and turn them into errors • Some DBAs may be directly involved with this group • The DBA may be pulled in on recurring problems • Proactive: root cause analysis, post-mortem, trend analysis • This is separate from Incident Management
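The trend analysis mentioned here, like the mid-month slowdown in the earlier example, can be as simple as counting incidents per configuration item by day of the month. The sketch below is only an illustration: the record layout, the CI names, and the threshold are hypothetical, not taken from any particular service desk tool.

```python
from collections import Counter
from datetime import date

def incidents_by_day_of_month(incidents, ci_name):
    """Count incidents for one configuration item, grouped by day of month."""
    return Counter(opened.day for ci, opened in incidents if ci == ci_name)

def busiest_days(counts, threshold):
    """Return the days of the month whose incident count meets the threshold."""
    return sorted(day for day, n in counts.items() if n >= threshold)

# Hypothetical help desk records: (configuration item, date opened).
incidents = [
    ("customer-info-db", date(2024, 1, 14)),
    ("customer-info-db", date(2024, 1, 15)),
    ("customer-info-db", date(2024, 2, 14)),
    ("customer-info-db", date(2024, 2, 15)),
    ("customer-info-db", date(2024, 3, 15)),
    ("customer-info-db", date(2024, 3, 2)),
    ("payroll-db", date(2024, 1, 15)),
]

counts = incidents_by_day_of_month(incidents, "customer-info-db")
hot_days = busiest_days(counts, threshold=2)
print(hot_days)
```

Days that keep showing up across months are the ones worth comparing against batch schedules, reorganizations, and other recurring database work.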
Configuration Management • “Provide accurate information on configurations and their documentation to support all the other Service Management processes” • They own everything, and when a change goes in they are responsible for making sure the CMDB reflects it • They will also hold a Post Implementation Review (PIR), and someone from the DB group should be involved • The DBA knows how and when the database is accessed and should get this information to this group • How does Server A differ from Server B? • Who has access to Server A? • Environment: UNIX, mainframe, Windows… • Which patches have been applied to this CRM environment? • When does the support contract expire? • Who is the business owner?
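A question such as “How does Server A differ from Server B?” becomes a simple comparison once each CI is recorded with consistent attributes. The sketch below assumes CIs are plain dictionaries; the attribute names and values are hypothetical and do not reflect the schema of any real CMDB product.

```python
def diff_cis(ci_a, ci_b):
    """Return {attribute: (value_in_a, value_in_b)} for attributes that differ.

    An attribute missing from one CI is reported with None on that side.
    """
    keys = ci_a.keys() | ci_b.keys()
    return {
        key: (ci_a.get(key), ci_b.get(key))
        for key in sorted(keys)
        if ci_a.get(key) != ci_b.get(key)
    }

# Hypothetical CI records for two database servers.
server_a = {"os": "UNIX", "patch_level": "PATCH-7", "business_owner": "Finance"}
server_b = {"os": "UNIX", "patch_level": "PATCH-5", "support_contract_ends": "2025-06-30"}

differences = diff_cis(server_a, server_b)
for attribute, (a_value, b_value) in differences.items():
    print(f"{attribute}: A={a_value!r} B={b_value!r}")
```

The comparison is only as good as the data behind it, which is why the DBA’s knowledge of how the database is actually configured and accessed needs to reach this group.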
Change Management • “Ensure that standardized methods and procedures are used for efficient and prompt handling of all Changes, in order to minimize the impact of Change related incidents upon service quality, and consequently to improve the day-to-day operations of the organization” • A change will have been defined as an RFC, and if it involves a database you should have someone involved in reviewing it • Change Management is not project management; however, it is responsible for the change being done and being done on time • The DBA will be involved, if not on the CAB itself • Risk analysis, ROI analysis • Pre-test plan, pre-communication plan, pre-signoffs • Backup plan, execution plan, backout plan • Post-test plan, post-communication plan, post-signoffs • Documentation updates, contingency plan updates
Release Management • “Design and implement efficient procedures for the distribution and installation of Changes to I.T. Systems” • Release Management makes sure all the players responsible for the change are involved and coordinates release schedules, testing, etc. with the various departments • Again, this is not project management • The DBA will definitely work with this group • Installs, upgrades, patches, database change management, impact, rollback, etc.
Service Level Management • “Maintain and improve I.T. Service quality, through a constant cycle of agreeing, monitoring and reporting upon I.T. Service achievements and instigation of actions to eradicate poor service – in line with business or cost justification” • Good old SLAs and Operational Level Agreements (OLAs) • Should maintain a service catalog to understand whether agreements are being met • Should always be running a Service Improvement Program: how to do things better or cheaper • Today the DBA will be under some SLA but should have a better understanding of priority • Service level agreements, operational level agreements, satisfaction surveys
Financial Management • “Provide cost-effective stewardship of the I.T. assets and resources used in providing I.T. Services” • Budgeting and accounting: what it costs to do business • Everybody spends money; these guys track it • Hardware, software, personnel, facilities, service contracts • TCO, ROI, budgeting, accounting, charging • Server consolidation, Standard Edition, colocation, Linux, open source
Capacity Management • “Ensure that cost-justifiable I.T. capacity always exists and that it is matched to the current and future needs of the business” • What they need now to run the business and what they will need in the future • Measure of workload: how are we doing now? • Keep track of all resources, including employees and IT assets • Performance management lands here: monitoring and tuning • Should have a Capacity Management Database for analysis • The DBA will be involved in all of these tasks and will provide data and advice • Monitoring, tuning, capacity planning, demand management
Continuity Management • “Support the overall Business Continuity Management process by ensuring that the required I.T. technical and service facilities (including computer systems, networks, applications, technical support and Service Desk) can be recovered within required, and agreed, business timescales” • Responsible for understanding the impact of the applications and components of your IT department • You will be involved in the Business Impact Analysis of your databases and their use by applications • Need to know all the assets, the threats, and possible vulnerabilities • The DBA will provide plans and criteria • Disaster recovery, contingency planning, application error or fallback • Fire, earthquake, flood, power failure
Availability Management • “Understand the Availability requirements of the business and plan, measure, monitor and continuously improve the Availability of the I.T. Infrastructure, services and supporting organization to ensure that these requirements are met consistently” • Design for availability: measure MTTR and MTBF and, my favorite, MTBSI • DBA: what do you need to get where the business wants to go? • Availability is Job #1! • High availability, hardware redundancy, RAC, SAN, NAS, active-passive configuration • Backups! Backups! Backups! • Test! Test! Test!
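To make the relationship between these three measures concrete, here is a minimal sketch. It assumes the common ITIL reading in which MTBF is average uptime between failures, MTTR is average downtime per failure, and MTBSI (mean time between system incidents) is their sum; the input values are sample numbers only.

```python
def mtbsi(mtbf_hours, mttr_hours):
    """Mean time between system incidents: one full uptime-plus-downtime cycle."""
    return mtbf_hours + mttr_hours

def availability(mtbf_hours, mttr_hours):
    """Fraction of each cycle during which the service is up."""
    return mtbf_hours / mtbsi(mtbf_hours, mttr_hours)

# Sample values: the database runs 499 hours on average between failures
# and takes 1 hour on average to restore.
cycle = mtbsi(499, 1)
uptime_fraction = availability(499, 1)
print(f"MTBSI: {cycle} hours, availability: {uptime_fraction:.1%}")
```

The same availability can come from rare long outages or frequent short ones, which is why MTBSI is worth tracking alongside the percentage.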
Security Management • “The process of Security Management is required to establish the necessary logical and physical security measures to ensure (Confidentiality, Integrity and Availability) C.I.A. of IT Systems and information.” • Determine confidentiality, integrity and availability of data • Physical security, technical security and procedural security • DBA: day-to-day and compliance standards • Database access and authorities • Application access, database access, and change management • Audit access • Rules, regulations and compliance
What are the ITIL deliverables / goals of the DBA function? • Operational management tools • Management reports and information • Exception reviews and reports • Review and audit reports • Operational Document Library • A stable, secure and resilient infrastructure • A log or database of all operational events, alerts and alarms • Fail-over and disaster recovery testing schedule • Operational work schedules
War Story – Small Company (IT <100) • Pre ITIL • No standards for logging issues • No standard tool • Poor prioritization or escalation of issues • Lost tickets • No loop back on changes • Changes were implemented with no measurement • Phase 1 Incident Management (first 9 months) • Single point of contact • Standard tool across IT • Restricted access to create tickets • Assigned an Incident Process owner and coordinator
War Story - Small Company • Phase 2 Implemented Change Management • Uniquely identify each change • Use an RFC (request for change) • Phase 3 Implement Change Management (2 years later) • Refine the RFC • Create a CAB (change advisory board) • Assign a process owner and coordinator • Prioritize and categorize • Urgent – Standard – Minor – Medium – Major • Approval process
War Story - Small Company • Two years later • Got real formal • Acknowledgement – you can always do better • Proof • 95% of all incidents through one single point of contact • 97% reduction of outstanding incidents in one year • Year one – 23% reduction in incidents • Year two – 35% reduction in incidents • Year three – 40% reduction in incidents
Small Company – Savings? • Cost Per Incident in dollars • Level 1 – 25 • Level 2 – 200 • Level 3 – 500 • Reduced number of calls 2500 • Handled without escalation 1400 • Savings approx. 250,000.00 per year • What’s wrong with this view of savings? Where are the trucks?
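The slide does not show how the figure of roughly 250,000 was reached. The sketch below works through the numbers under two assumptions that are mine, not the presenter’s: that every avoided call would have cost the Level 1 rate, and that every incident handled without escalation would otherwise have gone to Level 2.

```python
# Figures taken from the slide (dollars per incident, incidents per year).
COST_PER_INCIDENT = {1: 25, 2: 200, 3: 500}
CALLS_AVOIDED = 2500
RESOLVED_WITHOUT_ESCALATION = 1400

# Assumption: each avoided call would have been handled at Level 1.
avoided_call_savings = CALLS_AVOIDED * COST_PER_INCIDENT[1]

# Assumption: each incident kept at Level 1 would otherwise have reached Level 2.
escalation_savings = RESOLVED_WITHOUT_ESCALATION * (
    COST_PER_INCIDENT[2] - COST_PER_INCIDENT[1]
)

print("Avoided calls:", avoided_call_savings)
print("Avoided escalations:", escalation_savings)
print("Combined:", avoided_call_savings + escalation_savings)
```

Under these assumptions the avoided escalations alone come to 245,000, close to the slide’s figure, and adding the avoided calls gives 307,500. Other readings of the numbers are possible.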
Implement What • Adopting ITIL might require • only adopting some new terminology, or • completely reinventing the organization • It might focus on a single ITIL process, or • include all ten • Staffing • It might be staffed exclusively with internal resources or might rely heavily on expert assistance. • The approach taken to ITIL adoption will depend on the level and nature of that adoption.
Prioritize your time? • I – Activities that are Important and Urgent e.g. Incident Management • II – Activities that are Important but not Urgent e.g. Configuration Management • III – Activities that are not Important but Urgent • IV – Activities that are not Important and not urgent
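The four quadrants above reduce to two yes/no questions per activity. The sketch below is just one way to write that down; the function name and the sample activities are illustrative, and deciding what counts as important or urgent is still up to you.

```python
def quadrant(important, urgent):
    """Map the two yes/no answers to the quadrant numbering used on the slide."""
    if important and urgent:
        return "I"
    if important:
        return "II"
    if urgent:
        return "III"
    return "IV"

# Sample activities, using the slide's own two examples.
activities = {
    "Incident Management": (True, True),
    "Configuration Management": (True, False),
}
for name, (important, urgent) in activities.items():
    print(name, "->", quadrant(important, urgent))
```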
DBA ITIL Procedures • Rate each ITSM focus area • Rate the quality of each deliverable • Decide what level you want to reach • Determine how much work is involved • Determine how much time you need
Summary • ITIL is coming to your organization or has already arrived. It’s better to be ahead of the curve… get certified. • Your organization is dependent on IT. The better IT delivers services, the better your business will do. • Your job function will not really change with ITIL best practices, but they will improve the quality of your organization. • The implementation of ITIL is not cheap or easy. However, the IT world we work in is not getting less complicated, nor is it shrinking.
Q & A • Did you learn anything today? • QUESTIONS • “It is not necessary to change. Survival is not mandatory.” - Deming • If you think of one later, my email is mike_sniezek@bmc.com