Change Management Practitioners Forum Thursday Nov 29, 2012 Jon Dowell Shawn McKenzie. Agenda. Housekeeping Introductions Assessing change risk “Is that really the worst that can happen?” Responsiveness versus control "How quick is too quick?" Mitigating change risk
Change Management Practitioners Forum Thursday Nov 29, 2012 Jon Dowell Shawn McKenzie
Agenda Housekeeping Introductions Assessing change risk “Is that really the worst that can happen?” Responsiveness versus control "How quick is too quick?" Mitigating change risk "Just say NO?" Next Steps
Housekeeping • Welcome to Gibsons Energy • Fire Alarms & Washrooms • Katharina Stephens
Facilitators Jon Dowell • Senior Consultant with KSLD Consulting. • 15 years of experience solving I.T. mysteries. • Facilitation and critical thinking during: • Major Incidents • Problem investigations • Project quality assessments prior to go-live • Project warranty periods • Training and mentoring • Critical thinking • Root cause analysis • Impact assessments • Potential risks associated with requests for change • KSLD Consulting specializes in I.T. Problem Management and problem solving for today’s busy world.
Facilitators Shawn McKenzie • Founding partner of SignalFire Inc, an independent ITSM and Organizational Change Management consulting practice • Over 12 years' experience implementing IT, OCM and Process Optimization solutions for clients in: • Oil & Gas • Telecommunications • Government • Mining • Utilities • Certified ITIL ver. 2 Master and ver. 3 Expert • Prosci Change Management certified (Sorrento, Italy) • COBIT Practitioner certified • Focus on process before tools and technology • ITIL Foundations instructor for over 250 students with 99% pass rate • itSMF board member for the last 5 years
Objectives of the Practitioner Forum • To facilitate sharing of information and experiences between like-minded practitioners • To provide an opportunity for networking • To grow the level of knowledge of participants
How we operate • We try to meet quarterly as a group • The group drives the future agenda based on interest levels • We respect the differences in our levels of experience • We participate and share freely
Assessing change risk “Is that really the worst that can happen?” Jon Dowell
Typical Answers… “Is that really the worst that can happen?” • Server may not reboot clean • Require hard boot • Software installation may not work on first try • Reboot & retry reinstall • Users may not log off at start of change window • Force disconnect users
Typical Answers… “Is that really the worst that can happen?” • Entire network could go down in a cascade failure • Millions lost!!! • Server suffers a major hardware failure during reboot and never restarts • Millions lost!!! • Ice storm takes out all our power & data centres for months • Millions lost!!!
Real Answers… “Is that really the worst that can happen?” • Human error (fat fingered change) • Second set of eyes (buddy system) • Pre-build script in copy/paste • All related assets not identified • Minimal number of services impacted • Have teams on stand-by • Rollback required (restore to pre-change stage) • Time to restore
True Risk Understanding… “Is that really the worst that can happen?” “What could happen?” Key Steps • Identify Potential Problems • Identify Likely Causes • Take Preventative Action • Plan Contingent Actions • Kepner-Tregoe – Potential Problem Analysis
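The four key steps can be sketched as a simple worksheet record. This is a minimal illustration, not part of the Kepner-Tregoe material: the class name `PotentialProblem`, its field names, and the `missing_steps` helper are all hypothetical, and the pairing of the example problem with its cause and preventive action is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PotentialProblem:
    """One worksheet row: a potential problem plus the three follow-up steps."""
    description: str
    likely_causes: List[str] = field(default_factory=list)
    preventive_actions: List[str] = field(default_factory=list)
    contingent_actions: List[str] = field(default_factory=list)

    def missing_steps(self) -> List[str]:
        """Return the names of the analysis steps that have no entries yet."""
        steps = {
            "likely causes": self.likely_causes,
            "preventive actions": self.preventive_actions,
            "contingent actions": self.contingent_actions,
        }
        return [name for name, entries in steps.items() if not entries]


# Illustrative row for a core switch replacement (pairing is an assumption).
switch_fails = PotentialProblem(
    description="New switch does not work",
    likely_causes=["No power bar outlets preventing start up"],
    preventive_actions=["Check rack prior to start of change"],
)

# Lists the steps still to be filled in before the change is submitted.
print(switch_fails.missing_steps())
```

A reviewer could run a check like `missing_steps()` over every row of a change request to spot potential problems that have a cause listed but no contingency planned.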
What could happen… State Action… (Change Short Description) • Kepner-Tregoe – Potential Problem Analysis
What could happen… State Action… Network Core Switch Replacement • Users may remain logged in during replacement: HIGH / LOW • Entire network cascade failure: V. LOW / HIGH • New switch does not work: MED / HIGH • Applications do not work properly after change: MED / MED • Kepner-Tregoe – Potential Problem Analysis
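One way to turn the paired ratings into a priority order is sketched below. The slide does not label the two ratings; this sketch assumes the first is probability and the second is impact. The numeric scale in `LEVELS` and the multiply-to-score rule are also assumptions chosen for illustration, not something the method prescribes.

```python
# Hypothetical numeric scale for the rating labels used on the slide.
LEVELS = {"V. LOW": 1, "LOW": 2, "MED": 3, "HIGH": 4}

# (potential problem, first rating, second rating) as listed on the slide.
# Assumption: first rating = probability, second rating = impact.
problems = [
    ("Users may remain logged in during replacement", "HIGH", "LOW"),
    ("Entire network cascade failure", "V. LOW", "HIGH"),
    ("New switch does not work", "MED", "HIGH"),
    ("Applications do not work properly after change", "MED", "MED"),
]


def score(probability: str, impact: str) -> int:
    """Combine the two ratings into one number (assumed rule: multiply)."""
    return LEVELS[probability] * LEVELS[impact]


# Highest combined score first, so planning effort goes there first.
ranked = sorted(problems, key=lambda p: score(p[1], p[2]), reverse=True)

for description, probability, impact in ranked:
    print(f"{score(probability, impact):>2}  {probability} / {impact}  {description}")
```

Under these assumptions the very unlikely cascade failure ranks last despite its high impact, which matches the deck's point that the dramatic "millions lost" scenarios are rarely where mitigation effort is best spent. A different scale or combining rule would change the order.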
What could happen… State Action… Network Core Switch Replacement No power bar outlets preventing start up Check rack prior to start of change Roll back to old switch by re-installing into rack, re-cabling, & turning on. (1 hr) Misconfigured OS causing switch to not communicate Test in lab. Copy config to text file and paste into production Network loop causing infinite loop / broadcast storm Second set of eyes during cabling Second set of eyes. Test in lab. Copy/paste config Use DRP application service to complete critical work (8 hrs) Port mismatch Routing incorrect Second set of eyes. Test in lab. Copy/paste config Response too quick for application to process Test in lab • Kepner-Tregoe – Potential Problem Analysis
Responsiveness versus control “How quick is too quick?” Shawn McKenzie
Change Management is Schizophrenic Balance of control and responsiveness To efficiently and promptly handle changes . . . To manage and control changes to the live environment using standard methods and process . . . while minimizing the risk and impact of change related issues *
Building consensus on The Balance How can we communicate the importance of risk control without creating a fear of approval "Red Tape"? Responsiveness to Business need is important, but how would a business-hours Change-related outage affect Operations? Raising the Spectre of a Black Monday
Building consensus on The Balance Would having Change Implementations 6 days a week during business hours create too much uncertainty? How would going to planned weekly deployment windows change the culture of IT Operations? How about every other week? Do Change Risk assessments imply Tiers of Risk/Reward?
Next Steps • Future Sessions - Change • January 2013 • April 2013 • Oct 2013 (Align with conference?) • Future Sessions - Problem • February 2013 • May 2013 • Sept 2013 • November 2013 (Align with conference?) • Future Sessions - ??? • March 2013 • Future Change Practitioner Topics? • Process Review and challenges along the way… • Profile of a Change Manager – what makes a good change manager • Supporting Tools? (ITSM Suite, CMDB) • Other???
itSMF Upcoming Events! • Problem Management Practitioner Event • Jorge Wong & Harry Contos • Thursday, Dec 6 • Gibsons Energy
Real Problems • Lack of co-ordination between Change and Configuration management • Inadequate risk and impact assessment undertaken • Inaccurate / missing configuration information in CMDB so risk of wrong decisions • Incomplete scope of assessment (security, business impact, availability, capacity, continuity) • Staff complacency – manipulate the system • Urgent changes not being appropriately tested • Process seen as bureaucratic and burdensome • Instituting process control over contractor support personnel or specific segments / applications • Lack of tools to efficiently track changes • Scope too wide for resources available to handle