The Time-Triggered Architecture Krishnakumar B kitty@dre.vanderbilt.edu Institute for Software Integrated Systems Vanderbilt University, Nashville, TN
Outline of Talk • Overview of TTA • Architecture Model • Design Principles • Communication • Fault Tolerance • Design Methodology • Questions?
Time-Triggered Architecture • Treatment of physical time as a first-order quantity • Provides fault-tolerant global time base • Decomposes a large application into: • Clusters • Nodes • Combination of both • Use global time to specify interfaces between nodes • Communication and agreement protocols
Model of Time • Time progresses along a dense timeline • Duration – Interval delimited by two instants • Event occurs at an instant • E.g. observation of state • Time-stamping • Assign the state of the node-local global time to the event • How to synchronize clocks?
Sparse Time Base • Continuum of time is partitioned • Infinite sequence of alternating durations of activity & silence • Duration of the activity interval > precision of clock synchronization • All events that occur within an interval of activity considered simultaneous • External representation of time
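A minimal sketch of the sparse time base, assuming integer microsecond timestamps and fixed activity and silence durations. The function names and the example durations are illustrative, not part of TTA. Events that fall into a silence interval are reported as `None` here; how a real system handles them is outside this sketch.

```python
from typing import Optional


def sparse_tick(t_us: int, activity_us: int, silence_us: int) -> Optional[int]:
    """Return the index of the activity interval containing t_us.

    The timeline is partitioned into alternating activity and silence
    intervals, starting with an activity interval at t = 0.
    Returns None if t_us falls into a silence interval.
    """
    period = activity_us + silence_us
    index, offset = divmod(t_us, period)
    return index if offset < activity_us else None


def simultaneous(t1_us: int, t2_us: int, activity_us: int, silence_us: int) -> bool:
    """Two events count as simultaneous if they share an activity interval."""
    a = sparse_tick(t1_us, activity_us, silence_us)
    b = sparse_tick(t2_us, activity_us, silence_us)
    return a is not None and a == b


# Hypothetical durations: 100 us of activity followed by 400 us of silence.
ACTIVITY_US, SILENCE_US = 100, 400
same = simultaneous(10, 90, ACTIVITY_US, SILENCE_US)        # both in interval 0
different = simultaneous(90, 510, ACTIVITY_US, SILENCE_US)  # intervals 0 and 1
```

Two timestamps that differ by less than the clock precision can still land in the same activity interval, which is why the slide requires the activity interval to be longer than the precision of clock synchronization.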
RT Entities and RT Images • TTA system • Node, Communication Network Interface, Host • Time domain and value domain
RT Entities and RT Images (Contd…) • Real-Time Entities • State variables used to model dynamics of system • Change their state as time progresses • Mix of both static and dynamic attributes • E.g. flow of a liquid in a pipe, temperature of a valve • Observation • State of RT entity at a particular instant t_obs • Observation = <Name, Value, t_obs> • Real-Time Image • Temporally accurate picture of RT entity at instant t • Duration between time of observation and instant t < d_acc • An observation remains valid forever; the validity of an image expires with time
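The observation triple and the temporal-accuracy condition on an RT image can be sketched as follows. This is an illustration only: the class and function names, the microsecond unit, and the sample values are assumptions. `t_obs` is the observation instant and `d_acc` the accuracy interval from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """<Name, Value, t_obs>: immutable, so it stays valid forever."""
    name: str
    value: float
    t_obs: int  # global time of observation, in microseconds (assumed unit)


def is_temporally_accurate(obs: Observation, now_us: int, d_acc_us: int) -> bool:
    """An image built from obs is accurate while (now - t_obs) < d_acc."""
    return 0 <= now_us - obs.t_obs < d_acc_us


# Hypothetical example: a valve temperature observed at t = 1000 us.
obs = Observation(name="valve_temperature", value=71.5, t_obs=1000)
fresh = is_temporally_accurate(obs, now_us=1400, d_acc_us=500)  # 400 < 500
stale = is_temporally_accurate(obs, now_us=1500, d_acc_us=500)  # 500 is not < 500
```

The observation itself never changes; only its usability as an image of the current state expires.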
State-Information vs Event-Information • State attribute • Property of a RT entity at a particular instant • State Information • (state variable, value, time of observation) • Idempotent, at-least-once semantics • Sender-side – Not consumed • Receiver-side – Update-in-place, non-consuming read • Event • Sudden change of state of an RT Entity at an instant • Event Information • (state variable, value difference, time of event) • Exactly-once semantics • Sender-side – Consumed on sending • Receiver-side – Queued and consumed on reading
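The difference in receiver-side semantics can be sketched with two small channel classes. The class names and the tank-level example are hypothetical; the sketch only shows why a duplicated state message is harmless while a duplicated event message is not.

```python
from collections import deque


class StateChannel:
    """Receiver side for state information: update in place, non-consuming read."""

    def __init__(self):
        self._latest = None

    def write(self, value):
        self._latest = value  # overwrites the previous version

    def read(self):
        return self._latest  # reading does not consume the value


class EventChannel:
    """Receiver side for event information: queued, consumed on reading."""

    def __init__(self):
        self._queue = deque()

    def write(self, delta):
        self._queue.append(delta)

    def read(self):
        return self._queue.popleft() if self._queue else None


# A duplicated state message is idempotent: the level is still 10.
state = StateChannel()
state.write(10)
state.write(10)
level_from_state = state.read()

# A duplicated event message (+2 delivered twice) corrupts the result,
# which is why event information needs exactly-once delivery.
events = EventChannel()
events.write(+2)
events.write(+2)
level_from_events = 8
while (delta := events.read()) is not None:
    level_from_events += delta
```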
Structure of TTA • Node • Self-contained unit • Communication system • Replicated channels • Autonomous • Executes periodically • a priori TDMA schedule • Fetch Instant • Reads state message from CNI • Delivery instant • Delivers it to CNI of all other nodes of cluster • Overwriting previous version of state message • Fetch, delivery instants in message scheduling table
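One TDMA round driven by a message scheduling table might look like the sketch below. It assumes that each node's CNI is a dictionary holding the latest state message per sender, and it collapses fetch and delivery into sequential steps rather than timed actions. `Slot`, `run_round` and the node names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """One row of the message scheduling table (times in microseconds)."""
    sender: str
    fetch_us: int
    delivery_us: int


def run_round(schedule, cni):
    """Execute one TDMA round.

    cni maps node -> {sender -> latest state message}; cni[n][n] is the
    message node n currently offers for sending.
    """
    for slot in sorted(schedule, key=lambda s: s.fetch_us):
        # Fetch instant: read the state message from the sender's CNI.
        message = cni[slot.sender][slot.sender]
        # Delivery instant: overwrite the previous version in every other CNI.
        for node in cni:
            if node != slot.sender:
                cni[node][slot.sender] = message
    return cni


# Hypothetical three-node cluster with an a priori schedule.
schedule = [Slot("A", 0, 20), Slot("B", 25, 45), Slot("C", 50, 70)]
cni = {"A": {"A": 1}, "B": {"B": 2}, "C": {"C": 3}}
run_round(schedule, cni)
```

Because the schedule is fixed in advance, no node needs to request the bus; the communication system acts on its own when the fetch and delivery instants arrive.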
Interconnection topology • TTA-bus • Replicated passive buses • Each node has 3 subsystems • Node, 2 guardians • Spatial proximity faults • Fail-safe vs fail-operational • TTA-star • Independent guardians • n+2 packages vs 3n • Reshape physical signals & resilient to Slightly-off-specification (SOS) faults • Additional monitoring, better EMI characteristics
Design Principles of TTA • Consistent Distributed Computing Base • Unification of Interfaces – Temporal Firewalls • Composability • Scalability • Transparent Fault Tolerance • Openness
Consistent Distributed Computing Base • Distributed algorithms dependent on consistent data • TTA exploits short error detection latency of protocol • Error-detection at protocol level • Distributed agreement (membership) algorithm • Checking membership of all nodes to ascertain correct operation • Detect faulty outgoing link • Violation of fault-hypothesis • Distributed agreement protocol unable to reach conclusion • Result: Clique avoidance algorithm is activated
Unification of Interfaces – Temporal Firewalls • Uni-directional data-flow interfaces • Elementary – Uni-directional control flow • Composite – Bi-directional control flow • TTA CNI is an elementary interface • Control-error propagation prevented by design • Interface called temporal firewall
Different Interfaces of a Node • Real-Time Service (RS) Interface • Provides timely real-time services to node environment • Must satisfy temporal specification under all conditions • Affects temporal composability • Diagnostic & Maintenance (DM) Interface • Opens channel to internals of a node • Useful in configuring node parameters • Retrieve node parameters for fault diagnosis • Doesn’t affect temporal composability • Configuration Planning (CP) Interface • Connect node to other nodes of a system • Used during integration phase to generate “glue” • Not time critical
Composability • Independent development of nodes • Differentiate between node and architecture design • Precise specification of all node services => independent design of nodes • Stability of prior services • Validated service of a node should be unaffected by integration of node into a system • Constructive Integration • n nodes already integrated => addition of node n+1 doesn’t affect previous n nodes • Replica determinism • All members have same externally visible state • Produce same output messages at most d time units apart
Scalability • Complexity of system should not increase with growth of system • In TTA, CNIs provide abstraction • Encapsulate properties of environment • Only essential properties available to nodes • Example - Gateway nodes
Transparent Fault-Tolerance • Active redundancy by replication and voting • Active replication is complex • Shouldn’t be done at application level • TTA provides dedicated Fault-Tolerance layer • Fault-tolerant CNI (FTU-CNI)
Openness • Standardize interfaces • TTA interfaces submitted for standardization by OMG • Inter-operation with CORBA clients • RS, DM and CP interfaces available at the ORB level
Communication • Deliver information between CNIs • Within interval delimited by fetch and delivery instants • TTP/C Protocol • Autonomous, fault-tolerant, TDMA based transport • Fault-tolerant clock synchronization • Membership service • Inform every node about “health” of every other node • Doubles as multicast acknowledgment • Used in implementing fault-tolerant clock synchronization • Clique avoidance to detect and eliminate the formation of cliques when fault-hypothesis is violated
Communication (contd…) • TTP/A protocol • Time-triggered field-bus protocol of TTA • Connects low-cost smart transducers to a node of TTA • Two types of rounds – Master/Slave (MS) & Multi-partner (MP) • MS – Read/write records from IFS to implement DM and CP • MP – Periodic, implements the RS service
Event Message Channels & Performance • Event message channels • Created by allocating portion of TT communication • Push-pull model for events • Filter service & Garbage collection service • Performance of TTA • Time distribution needs inter-frame gap of 5 μs • 80% bandwidth utilization => 20 μs for send-phase • 40,000 messages / second • 10 clients => 250 μs sampling period => 4 kHz loop • Amount of data • 5 Mbps => 12 bytes / 20 μs • 1 Gbps => 2400 bytes / 20 μs
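The performance figures follow from simple arithmetic, sketched here with exact fractions. The sketch assumes that "80% bandwidth utilization" means the send phase fills 80% of each slot, and that "10 clients" means ten nodes sending one message each per round. All variable names are illustrative.

```python
from fractions import Fraction

GAP_US = Fraction(5)          # inter-frame gap from the slide
UTILIZATION = Fraction(4, 5)  # 80% of each slot carries data
NODES = 10                    # assumed: one message per node per round

# The gap is the remaining 20% of the slot, so slot = gap / (1 - utilization).
slot_us = GAP_US / (1 - UTILIZATION)
send_us = slot_us - GAP_US
messages_per_second = Fraction(1_000_000) / slot_us

# Each node sends once per round, so a round lasts NODES slots.
sampling_period_us = NODES * slot_us
loop_rate_hz = Fraction(1_000_000) / sampling_period_us


def bytes_per_send_phase(bits_per_second: int) -> Fraction:
    """Raw number of bytes that fit into one send phase at the given bit rate."""
    return Fraction(bits_per_second) * send_us / 1_000_000 / 8


low_rate_bytes = bytes_per_send_phase(5_000_000)
high_rate_bytes = bytes_per_send_phase(1_000_000_000)
```

The raw results are 12.5 bytes at 5 Mbps and 2500 bytes at 1 Gbps. The slide quotes 12 and 2400 bytes, which is slightly lower; this sketch does not try to account for that difference.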
Fault Tolerance • Fault Hypothesis • States types and number of faults that the system should tolerate • TTA-star cluster • Can tolerate an arbitrary failure of a single node • Single faulty unit detected by membership protocol • Isolated within two rounds (for single fault) • Fault-tolerant Units – Triple Modular redundancy
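Triple modular redundancy in a fault-tolerant unit can be illustrated with an exact-match majority voter. This is a generic sketch, not the TTA fault-tolerance layer itself. The function name is hypothetical, and exact-match voting assumes replica determinism, so that correct replicas produce bit-identical results.

```python
from collections import Counter


def tmr_vote(a, b, c):
    """Return the value produced by at least two of three replicas.

    Raises ValueError if all three replicas disagree, which means more
    than one replica failed and the single-fault hypothesis is violated.
    """
    value, count = Counter([a, b, c]).most_common(1)[0]
    if count < 2:
        raise ValueError("no majority: more than one replica disagrees")
    return value


# One arbitrarily faulty replica is masked by the other two.
masked = tmr_vote(42, 42, 7)
```

A voter like this masks one arbitrary replica failure without the application noticing, which matches the single-fault hypothesis stated on the slide.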
Fault Tolerance (contd…) • Till now assumed that environment complies with fault-hypothesis • If environment violates fault hypothesis • TTA activates never-give-up strategy • Initiated by TTP/C protocol in combination with application • Only when necessary resources are unavailable to provide minimum required service • Redundant transducers • Requires two independent TTP/A field buses
Design Methodology • Architecture Design • Decompose into clusters and nodes • Can use top-down or bottom-up • Specify CNIs of nodes in both the temporal & value domains • Node design • Delivery and fetch instants • Used as pre-condition and post-condition by applications • Validation • Formal methods for consistent distributed computing base algorithms • Reproducible, observed without probe effect, DM interface
Concluding Remarks • Autonomous clusters and nodes • Global time used to specify interfaces among nodes • Two-phased design • Architecture and Component (Node) design • Take advantage of global time • Currently occupies a niche position • Time considered a nuisance in mainstream computing • Real-Time is an integral part of real-world • Cannot be abstracted away