Cloudpocalypse
We put “fail” in failover
Vlad Mazek, MCSE
CEO, Own Web Now Corp
vlad@ownwebnow.com
facebook.com/vladmmd
@vladmazek
Cell: (407) 536-VLAD
Agenda
• Summary of events
• What to tell your clients about the outage
• Our current network design
• What failed?
• What we are doing to address it
So what failed?
ATS (Automatic Transfer Switch): an electrical switch that moves the load from its primary power source to a standby source when the primary fails.
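To make the definition concrete, here is a minimal Python sketch of the decision an ATS makes. It is a toy model for illustration only, not the data center's actual equipment or logic; the class name, field names, and the `switch_ok` flag are all hypothetical. It shows why a failed switch matters more than a failed source: if the switch itself is down, neither source can reach the load.

```python
from dataclasses import dataclass


@dataclass
class TransferSwitch:
    """Toy model of an automatic transfer switch (hypothetical names)."""

    primary_ok: bool = True   # primary (utility) feed is available
    standby_ok: bool = True   # standby (e.g. generator) feed is available
    switch_ok: bool = True    # assumption: the switch hardware itself is healthy

    def active_source(self) -> str:
        # A failed switch cannot route either source to the load.
        if not self.switch_ok:
            return "none"
        # Prefer the primary source whenever it is available.
        if self.primary_ok:
            return "primary"
        # Otherwise fail over to the standby source.
        if self.standby_ok:
            return "standby"
        return "none"
```

For example, `TransferSwitch(primary_ok=False).active_source()` fails over to the standby source, while `TransferSwitch(primary_ok=False, switch_ok=False).active_source()` leaves the load without power even though the standby source is healthy.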
Summary of Events
• 12:04 Power failure
• 1:34 ATS replacement advised by DC
• 2:00 Partial power restored
• 4:10 First ETA issued: 6:30 PM
• 4:30 Emergency systems start coming online
• 4:46 DC offers additional details on the problem
• 5:10 Restored Exchange 2010 clusters
• 7:10 DC restores power
Impact
• This is the first major issue with the Dallas DC in over a decade
• We moved our critical systems to Dallas from California and Florida due to the weather and power issues
• This has adjusted our roadmap for service delivery
Agenda
• Extend LiveArchive to a second DC
• Extend Exchange 2010 hosting to additional data centers
• Improve our communications across partner networks
• Facebook: ExchangeDefender
• Twitter: @xdnoc @ExchangDefender
What can I tell my clients?
• Power issues happen.
• There will be a partial refund.
• There is no additional support cost.
• The company is going to improve the solution.
• The uptime record thus far has been impressive.
• Complex systems lead to complex problems, and aren’t you glad you don’t have to worry about it?
What next?
• Look for an email from me in the morning.
• Advise customers about LiveArchive.
• Stay tuned for network enhancements.
• Keep the issue in perspective: this isn’t Microsoft’s fault or general negligence/incompetence; it’s a massive failure.
Something funny…
You know why I don’t trust the cloud? It’s still powered by guys whose butt cracks show when they squat to fix an electrical issue.