Reducing Business Risk with Ken
Resilience in One Slide
• Failures will happen
  • Acceptable: Looks like slow performance
  • Unacceptable: Manual intervention, lost work
• Ken
  • Reduces set of unacceptable failures
  • Reduces scope of affected parties
• Ken is infrastructure
  • Don’t need to code resilience into each app
• Ken resilience is composable
  • Independent services written to Ken
  • Composition is resilient when they interact
More Detail
• But not too much
Problem
• SOA (service-oriented architecture) is distributed by its nature
• Failures will occur
  • Processing: OS panic, CPU failure
  • Network: Partition, message replay
  • Loss of entire data center
• Failure types
  • Tolerated: Looks like slow performance at worst
  • Untolerated: Manual intervention, lost work
• Want to minimize untolerated failures
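To make the "message replay" failure concrete, here is a minimal sketch of one generic way a receiver can turn a replayed message into a tolerated failure: remember the highest sequence number processed per sender and drop anything at or below it. The `Receiver` class, its fields, and the per-sender sequence numbers are assumptions made for illustration; the slides do not describe how Ken handles replay, and this is not Ken's API.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Hypothetical receiver that processes each message at most once."""

    # Highest sequence number already processed, keyed by sender.
    last_seen: dict = field(default_factory=dict)
    # Payloads that were actually processed, in order.
    processed: list = field(default_factory=list)

    def deliver(self, sender: str, seq: int, payload: str) -> bool:
        """Process a message once; return False if it is a replay.

        Assumes each sender numbers its messages 0, 1, 2, ... and that
        they arrive in order (an assumption of this sketch).
        """
        if seq <= self.last_seen.get(sender, -1):
            return False  # replay: already handled, safe to drop
        self.last_seen[sender] = seq
        self.processed.append(payload)
        return True


receiver = Receiver()
receiver.deliver("billing", 0, "charge order 17")
receiver.deliver("billing", 0, "charge order 17")  # replayed by the network
print(receiver.processed)
```

From the caller's point of view the replay costs only a little time, which puts it in the "tolerated" column rather than the "untolerated" one.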
Ken Reduces Cost and Risk
• Implemented in the infrastructure
  • Reduces development costs and resilience errors
• Restartable on different hardware
  • Handles permanent hardware failures
  • Handles loss of entire data centers
• Composable resilience for free
  • When two Ken services start interacting, the composition has the same resilience with no extra work
Issues
• Only for event-driven programs
  • Avoids risk of transient deadlocks in production
• Interactions with non-Ken components
  • Handled with drivers
• Need libraries that obey Ken rules
  • Most do, but must review to be sure
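The event-driven restriction is easier to picture with a small example. The sketch below shows the general shape of such a program: a handler runs to completion on one event at a time, state is checkpointed between events, and outputs are held back until the turn that produced them has finished. All names here (`TurnLoop`, `counter_handler`, the in-memory checkpoint) are hypothetical and chosen for illustration; this is a generic pattern under those assumptions, not Ken's actual interface or implementation.

```python
import copy
from collections import deque


class TurnLoop:
    """Hypothetical event loop: one event per turn, checkpoint between turns."""

    def __init__(self, state, handler):
        self.state = state
        self.handler = handler                  # called as handler(state, event, send)
        self.checkpoint = copy.deepcopy(state)  # last consistent state
        self.inbox = deque()
        self.outbox = []                        # messages released to the outside

    def post(self, event):
        self.inbox.append(event)

    def run(self):
        while self.inbox:
            event = self.inbox[0]
            pending = []                        # outputs held until the turn commits
            try:
                self.handler(self.state, event, pending.append)
            except Exception:
                # Failed turn: restore the checkpoint, keep the event queued,
                # and release none of the turn's outputs.
                self.state = copy.deepcopy(self.checkpoint)
                raise
            # Turn finished: commit the state first, then release outputs.
            self.inbox.popleft()
            self.checkpoint = copy.deepcopy(self.state)
            self.outbox.extend(pending)


def counter_handler(state, event, send):
    """Example handler: never blocks, just updates state and sends a message."""
    state["count"] += event
    send(f"count is now {state['count']}")


loop = TurnLoop({"count": 0}, counter_handler)
loop.post(2)
loop.post(3)
loop.run()
print(loop.state, loop.outbox)
```

Because the handler never blocks waiting on another component, the state between events is always consistent, which is what makes it a natural point to checkpoint and, in principle, to restart elsewhere.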
Ken Status
• Parts of Ken for C/C++ in production use
  • Indigo
  • Google’s V8
• Waterken for Java
  • Passed security review run by David Wagner (UCB)
  • Easier to write resilient, secure programs in Java
  • Trade-off: most programmers no longer need to learn how to write reliable code for distributed systems, but the Waterken open source code must be enhanced and supported
Next Steps
• Identify resilience today
  • Categorize tolerated vs. untolerated failures
  • Evaluate impact of untolerated failures
• Compare with Ken resilience
  • Many untolerated failures affect subset of users
  • Evaluate relative business risk
• Evaluate issues with both approaches
• Decide to use Ken or not
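The first step, categorizing failures, can be sketched as a small inventory that applies the deck's own rule: a failure is untolerated if it needs manual intervention or loses work. The `Failure` record, the `summarize` helper, and the example entries are all hypothetical placeholders; a real inventory would come from reviewing the actual system.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Failure:
    """One row of a hypothetical failure inventory."""

    name: str
    manual_intervention: bool  # does recovery need a person?
    lost_work: bool            # is any user work lost?

    @property
    def tolerated(self) -> bool:
        # Tolerated means neither manual intervention nor lost work,
        # i.e. at worst it looks like slow performance.
        return not (self.manual_intervention or self.lost_work)


def summarize(failures):
    """Count failures in each category."""
    return Counter("tolerated" if f.tolerated else "untolerated" for f in failures)


# Placeholder entries for illustration only, not measurements of any system.
inventory = [
    Failure("OS panic on one node", manual_intervention=False, lost_work=False),
    Failure("network partition", manual_intervention=True, lost_work=False),
    Failure("loss of data center", manual_intervention=True, lost_work=True),
]
print(summarize(inventory))
```

Running the same inventory against the current design and against a Ken-based design gives two tallies to compare when evaluating relative business risk.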