Towards A Configuration Specification Language for Internet Systems Archana Ganapathi (archanag@cs.berkeley.edu)
Motivation – Internet Services
• Failures impact availability
  • End user satisfaction
  • Economic repercussions
• Predominant causes
  • Human operator
  • Software
[Oppenheimer et al. Architecture, operation, and dependability of large-scale Internet services: three case studies. IEEE Internet Computing, special issue on Global Deployment of Data Centers, September/October 2002.]
Recap: Service Failure Cause
[Figure: service failure causes. Online total: 61 failures in 12 months. Content total: 56 failures in 3 months.]
[Failure Analysis of Two Internet Services, Winter 2003 ROC Research Group Retreat, Granlibakken, CA, January 2003.]
Case Study of Misconfigurations
~25 problems from Online & Content
• Errors in component-specific configuration
• Multi-component configuration inconsistency
• Non-configuration failure solvable by reconfiguration?
Configuration Scenarios
• Never intended
  • Unacceptable behavior
• Anticipated and tested
  • Problems with solutions (e.g. recovery code)
• Anticipated but not tested
  • Rare occurrence, high cost of testing
• Never anticipated
  • New/evolving environments/interactions
Configuration Tools
Apple Netinstall, BCFG, BCONFIG, BigFix, Cfengine, EDG Fabric Management, Grid Weaver, HP Utility DataCentre, ISconf, Jumpstart/Kickstart, LCFG, Microsoft SMS, Netcool, Novadigm Radia, NPACI Rocks, Psgconf, Quattor, Radmind, REMBO, Rdist, RPM, Rsync, SmartFrog, SUE, System Imager, SysTracker, Tivoli, Unison, Xhier, Zenworks
Configuration Languages
Windows Registry:
  [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\10.0\Word\InstallRoot]
  "Path"="C:\\Program Files\\Microsoft Office\\Office10\\"
Shell Script:
  if (! $?YPDOMAIN && -r $LOGHOME/.domainname) then
    setenv YPDOMAIN `cat $LOGHOME/.domainname`
    if ("$YPDOMAIN" == "") unsetenv YPDOMAIN
  endif
XML:
  <server>
    <server-name>oski</server-name>
    <num-connections>3</num-connections>
  </server>
Configuration Needs
• Account for Human Component
• Dynamic Monitoring of System Functionality
• Authenticate Privacy and Integrity
• Programmatic Manipulation of Configuration Data
• Domain Independence
Configuration Needs (contd.)
• User Intent rather than Low Level Assembly Language
• Intra-Configuration Constraints (Consistency)
• Inter-Configuration Constraints (Conformity)
• Formalization and Automatic Derivation
Desired Language Features
• Descriptive:
  • Capture inter- and intra-component interactions
  • User intent and assertions for proper behavior
  • Expressions for failure models & recovery code
  • Temporal event relationships
• Prescriptive:
  • Recovery mechanisms for anticipated events
  • “Software TDR”
[Figure: system diagram. Labels: Service Requests, Response, System spec, Services spec, Event logs, Internet System, Configuration Generator, Error Models, Configuration Files/Software, Operator modifications, “Learning” Model.]
LISA Framework
• Formal models for configurations in IS
• Recovery handlers
• Assertions & consistency checking
• Coverage/utilization
• Uncover pitfalls in configuration APIs
• Dependence analysis
• Conformity checks
• Use LISA verification modules to authenticate changes
LISA Statement Structure
pre_condition ==> rule_body
• pre_condition: temporal sequences
• rule_body: action handlers invoked when the pattern matches
Example:
• pre_condition: “A->B: ping” is not followed by “B->A: ‘I’m alive’” within 5 sec
• rule_body: A should time out and try C instead.
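The ping example can be sketched in Python as a rule object that pairs a pre_condition with a rule_body. This is an illustration only, not LISA itself: the `Event` record, the `Rule` class, the `horizon` parameter (end of the observed log), and the handler's message text are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Event:
    t: float   # timestamp in seconds (hypothetical log format)
    src: str
    dst: str
    msg: str


def unanswered_pings(log: List[Event], horizon: float, timeout: float = 5.0) -> List[Event]:
    """pre_condition: a ping whose reply did not arrive within `timeout` seconds.

    Pings whose timeout window extends past `horizon` are skipped,
    because their outcome is not yet known.
    """
    matches = []
    for ping in log:
        if ping.msg != "ping" or ping.t + timeout > horizon:
            continue
        answered = any(
            e.msg == "I'm alive"
            and e.src == ping.dst
            and e.dst == ping.src
            and ping.t < e.t <= ping.t + timeout
            for e in log
        )
        if not answered:
            matches.append(ping)
    return matches


@dataclass
class Rule:
    pre_condition: Callable[[List[Event], float], List[Event]]
    rule_body: Callable[[Event], str]

    def run(self, log: List[Event], horizon: float) -> List[str]:
        # Invoke the action handler once per matched pattern.
        return [self.rule_body(m) for m in self.pre_condition(log, horizon)]


def try_alternate(ping: Event) -> str:
    # rule_body: the sender times out and tries C instead.
    return f"{ping.src}: timed out waiting for {ping.dst}, trying C"


log = [
    Event(0.0, "A", "B", "ping"),
    Event(2.0, "B", "A", "I'm alive"),   # answered within 5 s
    Event(10.0, "A", "B", "ping"),       # never answered
]
rule = Rule(unanswered_pings, try_alternate)
actions = rule.run(log, horizon=20.0)
```

Only the second ping matches the pre_condition, so the handler fires once. Keeping the matcher and the handler as separate callables mirrors the split between the two sides of `==>`.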
Language Features
• IS events and transactions
• Specify event order and transactions
• Temporal sequences with references to past and future
• Logic connectives (and, or, not operators)
• Repetition, concatenation and overlap of sequences
• Sequence vs con-sequence
LISA Syntax
  LISA_Statement ::= Assertion Action
  Action         ::= ==> {<ok message>, <recovery code>} | ε
  Assertion      ::= assert Property @ ISA_clk ;
  Property       ::= Sequential_Expression | Logical_Expression | Temporal_Operation
LISA Operators
Logical: and (&), or (|), not (~)
Sequential: concatenation (;), overlap (:)
Implication:
  ->    logical if or sequential implication
  <->   logical iff implication
  =>    temporal ‘next’ implication
Extended Regular Expressions:
  *     0 or more repetitions
  +     1 or more repetitions
  ?     optional
  []    count qualifier
LISA Semantics
Semantics are defined by a model represented by the triple $\langle A, F, S \rangle$.
• $A$ is a non-empty set of atomic propositions.
• $S$ is a finite set of states.
• $F$ is a function that maps each state from $S$ to the alphabet $2^{A}$, giving the set of atomic propositions valid in that state: $F : S \to 2^{A}$
$f \models b$: Boolean expression $b$ holds under the truth assignment represented by $f$.
  $f \models b \iff b \in f$
  $f \models \lnot b \iff f \not\models b$
  $f \models b_1 \,\&\, b_2 \iff f \models b_1 \text{ and } f \models b_2$
  $f \models b_1 \mid b_2 \iff f \models b_1 \text{ or } f \models b_2$
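The satisfaction rules above translate almost directly into a recursive evaluator. The following Python sketch assumes a representation that is not part of the slides: an atomic proposition is a string, a compound expression is a tuple `(operator, operand, ...)`, and a truth assignment f is a frozenset of the propositions that hold.

```python
from typing import FrozenSet, Tuple, Union

# Hypothetical encoding: "ping" is atomic, ("&", x, y) is a conjunction, etc.
Expr = Union[str, Tuple]


def holds(f: FrozenSet[str], b: Expr) -> bool:
    """Return True when Boolean expression b holds under truth assignment f."""
    if isinstance(b, str):
        return b in f                       # atomic case: b is a member of f
    op = b[0]
    if op == "~":
        return not holds(f, b[1])
    if op == "&":
        return holds(f, b[1]) and holds(f, b[2])
    if op == "|":
        return holds(f, b[1]) or holds(f, b[2])
    raise ValueError(f"unknown operator: {op!r}")


# F maps each state to the set of atomic propositions valid in it.
F = {
    "s0": frozenset({"ping"}),
    "s1": frozenset({"ping", "pong"}),
}
expr = ("&", "ping", ("~", "pong"))         # ping & ~pong
result_s0 = holds(F["s0"], expr)
result_s1 = holds(F["s1"], expr)
```

Here `ping & ~pong` holds in state `s0` but not in `s1`, where `pong` is also valid.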
Examples
• If a is True intermittently or continuously for 3 ISA_cycles, then b must be True within the following 4 ISA_cycles, unless c happened in the meantime.
    assert always (a[1..3] => b[1..4] | c) @ISA_clk
• Byzantine fault tolerance: check that n > 3f always holds [Castro & Liskov].
    assert always (up_nodes > 3*const_f)
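An `always` assertion over a simple state predicate, such as the Byzantine fault tolerance check, can be sketched as a scan over per-cycle observations. The trace format (one dictionary per ISA cycle), the `always` helper, and the value chosen for `CONST_F` are assumptions for illustration.

```python
from typing import Callable, Dict, Iterable, Optional

State = Dict[str, int]   # hypothetical per-cycle snapshot of monitored values


def always(prop: Callable[[State], bool], trace: Iterable[State]) -> Optional[int]:
    """Return the index of the first cycle where prop is false, or None."""
    for i, state in enumerate(trace):
        if not prop(state):
            return i
    return None


CONST_F = 1   # assumed number of tolerated faulty nodes


def bft_quorum(state: State) -> bool:
    # Mirrors: assert always (up_nodes > 3*const_f)
    return state["up_nodes"] > 3 * CONST_F


trace = [{"up_nodes": 4}, {"up_nodes": 4}, {"up_nodes": 3}, {"up_nodes": 4}]
first_violation = always(bft_quorum, trace)
```

With one tolerated fault, the assertion needs at least four nodes up, so the third cycle (index 2) is reported as the first violation.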
Examples (contd.)
• Network property to guarantee “free of routing loops”: at most one entry in table, count less than number of nodes in network.
    assert always {(seqa < seqb) | (seqa = seqb & hop_a > hop_b)}
• Perfect failure detector protocol for completely synchronous systems [Fetzer]: to verify the status of a system component c, a configuration process asserts function ISA_f(c) == “up”.
    function ISA_f (component c) {
      send ping to c;
      wait
        on receive pong from c return “up”;
        after 2*τ return “crashed”;
    }
    always (on receive ping from sender send pong to sender);
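The ISA_f pseudocode can be approximated in Python with a message queue standing in for the network. This is a sketch under stated assumptions: `send_ping` and `inbox` are hypothetical stand-ins for the transport, τ is passed as `tau` in seconds, and messages other than "pong" are ignored until the 2*τ deadline.

```python
import queue
import time
from typing import Callable


def isa_f(send_ping: Callable[[], None], inbox: "queue.Queue[str]", tau: float) -> str:
    """Return "up" if a pong arrives within 2*tau seconds of the ping, else "crashed"."""
    send_ping()
    deadline = time.monotonic() + 2 * tau
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return "crashed"
        try:
            msg = inbox.get(timeout=remaining)
        except queue.Empty:
            return "crashed"
        if msg == "pong":
            return "up"
        # Any other message is ignored; keep waiting until the deadline.


# A responsive component answers the ping immediately.
live_inbox: "queue.Queue[str]" = queue.Queue()
status_live = isa_f(lambda: live_inbox.put("pong"), live_inbox, tau=0.01)

# A crashed component never answers, so the detector times out after 2*tau.
dead_inbox: "queue.Queue[str]" = queue.Queue()
status_dead = isa_f(lambda: None, dead_inbox, tau=0.01)
```

The detector is only "perfect" under the synchrony assumption stated on the slide: a live component must always answer within 2*τ.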
LISA to Verilog
IS-dictation: Within 1 to 3 ISA_cycles after ISA_event ping occurs, ISA_event pong must occur.
    assert always {~ping; ping} -> {~pong[1..3]; pong} @(ISA_clk)
Verilog program (hand-written; non state-machine model):
    always @(ping)
    begin
      repeat (1) @(ISA_clk);            // wait one ISA_clk event after ping changes
      fork: P
        begin                           // branch 1: pong arrives
          @(pong);
          $display($time,,"Computer up");
          disable P;
        end
        begin                           // branch 2: two more ISA_clk events pass first
          repeat (2) @(ISA_clk);
          $display($time,,"Computer crashed");
          disable P;
        end
      join
    end
Deployment Run-time
Consider ISA_clock = 2*τ
    time     ping   pong
    τ        0      0
    3*τ      1      0
    5*τ      0      1
    7*τ      1      1      *** assertion failure 5*τ ► 7*τ
    9*τ      0      0
    11*τ     1      0
    13*τ     0      1
    15*τ     1      0
    17*τ     0      0
    19*τ     0      0
    21*τ     0      0
    23*τ     0      0      *** assertion failure 13*τ ► 21*τ
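The run-time trace can be replayed against the ping/pong assertion with a small checker. The sketch below is not the planned LISA compiler or monitor; it assumes that `->` is an overlapping implication (the consequent starts in the same cycle that completes the antecedent), a reading chosen because it reproduces the two reported failures. The function name and the list encoding of the trace are likewise assumptions.

```python
from typing import List, Tuple


def check_ping_pong(ping: List[int], pong: List[int], max_wait: int = 3) -> List[Tuple[int, int]]:
    """Check {~ping; ping} -> {~pong[1..max_wait]; pong} on sampled signals.

    Returns (start, fail) sample-index pairs: `start` is where the antecedent
    began and `fail` is where the consequent could no longer be satisfied.
    """
    failures = []
    for i in range(1, len(ping)):
        if not (ping[i - 1] == 0 and ping[i] == 1):
            continue                      # antecedent {~ping; ping} did not match
        low, j = 0, i
        # Count consecutive cycles with pong low, starting at the ping cycle.
        while j < len(pong) and pong[j] == 0 and low < max_wait:
            low += 1
            j += 1
        if j >= len(pong):
            continue                      # trace ended before a verdict
        if low == 0 or pong[j] == 0:
            failures.append((i - 1, j))   # pong high too early, or still low too late
    return failures


# Samples taken at τ, 3τ, 5τ, ..., 23τ (ISA_clock = 2*τ).
ping = [0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0]
pong = [0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

failures = check_ping_pong(ping, pong)
# Convert sample indices back to multiples of τ: sample k is taken at (2k+1)*τ.
failure_times = [(2 * s + 1, 2 * e + 1) for s, e in failures]
```

The resulting `failure_times` list is `[(5, 7), (13, 21)]`, matching the two assertion failures marked in the trace (5*τ ► 7*τ and 13*τ ► 21*τ).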
LISA Future Work
• Implement LISA to Verilog compiler
• Implement Internet Service event monitor with simulated events (anticipatory event sequences)
• Incorporate dynamic “learning” phase
• Deploy at actual Internet Service sites
Need Data... Please Help
• What configuration tasks are regularly performed and why
• Good/bad “event sequences”
• Types and impact of configuration failures
• Desired language features for system configuration