TU Wien. The Time-Triggered Architecture for Real-Time Systems H. Kopetz TU Wien. http://stf.rgai.hu. Outline. Introduction System Architecture Time-Triggered Protocols Composability--Temporal Firewalls Fault Tolerance Conclusion. Our Goal.
The Time-Triggered Architecture for Real-Time Systems • H. Kopetz • TU Wien
Outline • Introduction • System Architecture • Time-Triggered Protocols • Composability--Temporal Firewalls • Fault Tolerance • Conclusion
Our Goal • Our goal is to facilitate the systematic design of large dependable control systems out of components. The interactions of the components are realized by the exchange of messages across interfaces to a real-time communication system. • The driving forces for the composition of a large System of Systems (SOS) out of a set of components (component systems) are: • Cognitive complexity reduction in order to reduce the design and development effort • Reuse of components: The components may be newly designed according to a given architectural style or may be already existing systems (legacy systems). • Simplified diagnostics and repair.
Report on US Air Traffic Control • In February 1997, the United States General Accounting Office (GAO) published a report to the Secretary of Transportation, Mr. F. Pena, about the design and implementation of the new air traffic control system in the US. • The author of the report was Dr. R. B. Stillman, Chief Scientist for Computers and Telecommunications.
ATC Plagued by Problems • “To illustrate, the long-time centerpiece of this modernization program--the Advanced Automation System (AAS)--was restructured in 1994 after estimated costs tripled from $2.5 billion to 7.6 billion and delays in putting significantly less-than-promised system capabilities into operation were expected to run 8 years or more.” • “For example, the per-unit cost estimate for the Voice Switching and Control System increased 522 percent, and the first site implementation was delayed 6 years from the original estimate.” • Source: GAO Report to the Secretary of Transportation, February 3, 1997, p.24
Principal Findings of GAO Report • An architecture is the centerpiece of sound system development and maintenance. • FAA is developing a logical architectural component for ATC modernization and evolution. • FAA lacks a technical architectural component to guide and constrain ATC modernization and evolution. • Without a technical ATC architecture, costly system incompatibilities have resulted and will continue. • FAA lacks an effective management structure for developing and enforcing an ATC systems architecture.
What is a Technical System Architecture? • A technical system architecture is a framework for the construction of a system that constrains an implementation in such a way that the ensuing system is understandable, maintainable, extensible, and can be built cost-effectively.
Technical System Architecture (II) • Architectural style: An architecture must provide rules and guidelines for the partitioning of a system into subsystems and for the design of the interactions among the subsystems. • Composability: An architecture must provide a framework for the systematic construction of a system out of subsystems (components). • Property Match: Components must comply with the architectural style to avoid a property mismatch at the component interfaces. • Elegance: An architecture must constrain an implementation in such a way that the ensuing system is understandable, maintainable, extensible, and can be built cost-effectively--in other words, it is elegant. • Architecture Design is Interface Design
Property Mismatches at Interfaces • Property: Example • Physical, electrical: Line interface, plugs • Communication protocol: CAN versus J1850 • Syntactic: Endianness of data • Flow control: Implicit or explicit, information push or pull • Incoherence in naming: Same name for different entities • Data representation: Different styles for data representation, different formats for date • Temporal: Different time bases, inconsistent time-outs • Dependability: Different failure mode assumptions • Semantics: Differences in the meaning of the data
Size versus Mental Effort to Understand • If the mental effort required to understand a particular system function grows with the system size, there is an inherent limitation to the size of the systems we can build. • [Figure: mental effort (complexity) plotted against system size, with human mental capability as the limit.]
Complexity and Size • Large systems can only be built if the effort required to understand the system operation, i.e., the complexity of the system, remains under control as the system grows. • The effort to understand any particular system function should remain constant, and should be independent of the system size. • A large system contains many more different functions than a small system. • The effort needed to understand all functions of a large system grows with the system size. • The design effort must be guided by a technical system architecture.
Summary: A Good Distributed Architecture • provides a framework and guidelines for the composition of a system out of nearly autonomous components (subsystems) without the occurrence of property mismatches. • defines an architectural style. • specifies the type of interactions among the components across well-defined and small interfaces. It thus builds structure by weak inter-component coupling and strong intra-component coupling. • provides interfaces that are flexible enough to support the intended functions, but rigid enough to act as error containment boundaries. • is based on already familiar orthogonal concepts that are used recursively. • is scalable without limits.
Technology Trend to Distributed Systems • System on a Chip (SOC) as the component: A complete computer system, including CPU, memory, I/O, communication controller, operating system, and application software, can be implemented on a single silicon die, e.g., Motorola “Golden Oak”. • Smart Sensors: Sensing element, signal processing, calibration, diagnosis, and communication control on a single die. • On-Chip Oscillators for low-cost nodes: cheap, but imprecise • COTS: Commercial off-the-shelf components comprising hardware and software • Integrated Fault Tolerance: to mask faults, e.g., SEU (single event upsets)--New failure modes of SOCs
Economics of Silicon • Silicon real-estate requirements (today, i.e., in the year 2002): • ARM core 32-bit CPU: 1 mm² • Infineon 256 Mbit DRAM: < 100 mm², i.e., about 320 kbyte of DRAM per 1 mm² • The marginal production cost of 1 mm² of silicon is on the order of 10 US cents (cost at silicon foundry TSMC). • The cost of packaging, testing, pins, and power supply is significant and often dominant. • The marginal production cost of a 100 mm² silicon chip is on the order of 10 US $. • One man-minute of work buys how many megabytes of RAM?
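The closing question on this slide can be worked through with the figures quoted above. The sketch below is only a back-of-the-envelope calculation: the hourly labor rate is an assumption that does not appear on the slide, and packaging, testing, and power-supply costs (which the slide calls often dominant) are deliberately ignored.

```python
# Figures quoted on the slide (year 2002).
DRAM_BITS = 256 * 2**20        # Infineon 256 Mbit DRAM
DRAM_AREA_MM2 = 100.0          # die area, upper bound from the slide
COST_PER_MM2_USD = 0.10        # marginal production cost of 1 mm2 of silicon

# Assumption, NOT from the slide: fully loaded labor cost per hour.
HOURLY_RATE_USD = 60.0


def kbyte_per_mm2() -> float:
    """DRAM density in kbyte per mm2 (binary kbyte, so slightly above 320)."""
    return DRAM_BITS / 8 / 1024 / DRAM_AREA_MM2


def megabytes_per_man_minute(hourly_rate_usd: float = HOURLY_RATE_USD) -> float:
    """Megabytes of bare DRAM silicon bought by one minute of labor."""
    budget_usd = hourly_rate_usd / 60           # cost of one man-minute
    area_mm2 = budget_usd / COST_PER_MM2_USD    # silicon area that budget buys
    return area_mm2 * kbyte_per_mm2() / 1024


if __name__ == "__main__":
    print(f"{kbyte_per_mm2():.1f} kbyte per mm2")
    print(f"{megabytes_per_man_minute():.1f} Mbyte per man-minute")
```

Under the assumed rate of 60 US $ per hour, one man-minute pays for about 10 mm² of silicon, which is a few megabytes of DRAM at 2002 densities.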
Time-Triggered Architecture (TTA) • Safety without compromises • No single point of failure • Formal analysis of critical functions • Composability: • Building systems out of prevalidated components--Component reuse • Fully specified operational interfaces in the temporal domain and value domain • Two-level design methodology • Flexibility • Flexible reuse of existing components
TTA Overview • [Figure: hosts (H) attached to the RT communication system (digital, on a sparse time base) and, via transducers (TR), to the controlled object (analog or digital, dense time base). Legend: H = Host, TR = Transducer; the hosts meet the communication system at a data sharing interface.]
Design Principles of the TTA • Establishment of a Consistent Distributed Computing Base • Global Time at every Node • Temporal Accuracy of Real-time Data • Distinction between State and Event Observations • Interfaces specified in the domains of time and value • Transparent Fault Tolerance
Validity of Real-Time Data • How long is the observation “The traffic light is green” temporally accurate? • The validity of real-time data is time dependent.
Definition: Temporal Accuracy • The temporal accuracy of a RT image is defined by referring to the recent history of observations of the related RT entity. A recent history $RH_i$ at time $t_i$ is an ordered set of time points $\langle t_i, t_{i-1}, t_{i-2}, \ldots, t_{i-k} \rangle$, where the length of the recent history • $d_{acc} = t_i - t_{i-k}$ • is called the temporal accuracy. Assume that the RT entity has been observed at every time point of the recent history. A RT image is temporally accurate at the present time $t_i$ if • $\exists\, t_j \in RH_i : \mathrm{Value}(\text{RT image at } t_i) = \mathrm{Value}(\text{RT entity at } t_j)$
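A minimal sketch of how such a check might look in code, assuming the usual reading of the definition: the RT image is temporally accurate if its current value equals the value the RT entity had at some observation within the recent history of length d_acc. The function name, the tuple layout, and the sample data are hypothetical.

```python
from typing import Any, Sequence, Tuple

# (time of observation, observed value of the RT entity)
Observation = Tuple[float, Any]


def is_temporally_accurate(image_value: Any,
                           history: Sequence[Observation],
                           now: float,
                           d_acc: float) -> bool:
    """True if image_value matches the RT entity at some observation
    taken within the recent history [now - d_acc, now]."""
    return any(now - d_acc <= t <= now and value == image_value
               for t, value in history)


# Traffic-light example: observations of the RT entity over time.
light_history = [(0.0, "red"), (5.0, "green"), (9.0, "green")]

if __name__ == "__main__":
    # Accurate: the light was observed green within the last 2 time units.
    print(is_temporally_accurate("green", light_history, now=10.0, d_acc=2.0))
    # Not accurate: "red" was last observed 10 time units ago.
    print(is_temporally_accurate("red", light_history, now=10.0, d_acc=2.0))
```

The shorter d_acc is chosen, the faster an image must be refreshed to remain usable, which is why the validity of real-time data is time dependent.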
State and Event Observation • An observation is a state observation, if the value of the observation contains the full or partial state of the RT-entity. The time of a state observation denotes the point in time when the RT-entity was sampled. • An observation is an event observation, if the value of the observation contains the difference between the “old state” (the last observed state) and the “new state”. The time of the event information denotes the point in time of the L-event of the “new state”.
Example of State and Event Observation • State observation (blue): • <Name of RT entity, Time of observation, full value> • The flow is at 5 l/sec at 10:45 a.m. • Event observation (red): • <Name of event, Time of event occurrence, state difference> • The flow changed by 1 l/sec at 10:45 a.m. • [Figure: RT entity and RT image.]
Message • A message is an atomic data structure that is formed for the purpose of inter-component communication. The endpoints of the communication are the component interfaces. • In the temporal domain, a message can be characterized by • The message send instant, i.e. the instant when the first bit of the message leaves the sender. • The message receive instant, i.e., the instant when the last bit of the message arrives at the receiver.
Interface • The interface between two subsystems (cluster, component, etc.) is characterized by • Its data properties, i.e., the structure and semantics of the data items crossing the interface • Its temporal properties, i.e., the temporal conditions that have to be satisfied by the interface: control and temporal data validity. • The functional intent, i.e., the assumptions about the functions of the interfacing partner • In a non-real-time computer system, there is little concern about the temporal properties.
Distributed System Interfaces • [Figure: Component A and Component B each have an interface view of the communication system and exchange messages through it.]
Elementary vs. Composite Interface • Consider a unidirectional data flow between two subsystems (e.g., data flow from sensor node to processing node). • We distinguish between: • Elementary interface: data flows from A to B and control flows in one direction only. Example: state message in a DPRAM. • Composite interface: data flows from A to B, but control flows in both directions. Example: queue of event messages. • Elementary interfaces are inherently simpler than composite interfaces.
Information Push vs. Information Pull • Information Push Interface: The information producer pushes information onto the information consumer (e.g., telephone, interrupt). • Information Pull Interface: The information consumer requests information when required (e.g., email). • What is better in real-time systems?--For whom?
State Message versus Event Message • State Message: A periodic message that contains state observations (synchronous). Message handling: update in place and non-consuming read. Periodic state messages can be implemented as an elementary interface (no dependence of sender on receivers) with error detection at the receiver. • Event Message: A message that contains event observations (asynchronous). Message handling: exactly-once semantics, realized by message queues. Requires a composite interface (dependence of sender on receivers) for error detection at the sender. • (Compare “sampled message” and “queued message” in ARINC)
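The two message-handling disciplines can be contrasted in a few lines. This is an illustrative sketch only: the class names `StatePort` and `EventPort` and the bounded-queue policy are assumptions, not part of the TTA or ARINC specifications.

```python
from collections import deque


class StatePort:
    """State message: update in place, non-consuming read."""

    def __init__(self):
        self._value = None

    def write(self, value):
        self._value = value          # new state overwrites the old one

    def read(self):
        return self._value           # reading leaves the value in place


class EventPort:
    """Event message: queued, each event is consumed exactly once."""

    def __init__(self, capacity: int):
        self._queue = deque()
        self._capacity = capacity

    def send(self, event) -> bool:
        # The sender depends on the receiver: if the receiver does not
        # drain the queue, the sender sees an overflow (composite interface).
        if len(self._queue) >= self._capacity:
            return False
        self._queue.append(event)
        return True

    def receive(self):
        return self._queue.popleft() if self._queue else None


if __name__ == "__main__":
    flow = StatePort()
    flow.write(5)
    flow.write(6)
    print(flow.read(), flow.read())      # a lost update is repaired by the next one

    changes = EventPort(capacity=2)
    changes.send(+1)
    changes.send(-1)
    print(changes.send(+1))              # queue full: the sender must react
    print(changes.receive(), changes.receive(), changes.receive())
```

A writer to `StatePort` never needs to know whether anyone reads, whereas a sender to `EventPort` must handle the overflow case, which is the dependence of sender on receivers named on the slide.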
Time Triggered (TT) vs. Event Triggered (ET) • A Real-Time system is Time Triggered (TT) if the control signals, such as • sending and receiving of messages • recognition of an external state change • are derived from the progression of a (global) time. • A Real-Time system is Event Triggered (ET) if the control signals are derived from the occurrence of events, e.g., • termination of a task • reception of a message • an external interrupt
Basic Elements of the TTA • Assumes existence of a sparse global time and contains the following four basic elements: • Interface: a data-sharing boundary between two communicating subsystems that contains temporally accurate state observations. • Communication subsystem: transports real-time data in the form of state messages from an output interface to an input interface within a given time. • Host computer: Reads input data from an input interface (information pull), performs a data transformation and writes output data into an output interface (information push) within a given a priori known duration. • Transducer: Transforms output data from an interface into a form required by the system environment and transforms data from the environment into the form required by an input interface.
A Time-Triggered Architecture (TTA) Node • [Figure: a TTA node. The host computer, including application software, has two Communication Network Interfaces (CNI): the interface to the transducers, carrying control signals and data items to and from the controlled object, and the interface to other nodes, carrying messages to and from the real-time communication system.]
TTP - Principle of Operation • TTP generates a global time-base • Media access is controlled by TDMA, based on this time • Acknowledgement implicit by membership • Error detection is at the receiver, based on the a priori known receive time of messages • State agreement between sender and receiver is enforced by extended CRC calculation • Every message header contains 3 mode change bits that allow the specification of up to seven successor modes
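The first bullets of this slide can be sketched as follows: with a static TDMA schedule, the global time alone determines who may send, and a receiver can detect a missing or mistimed frame because it knows when the frame was due. This is a simplified illustration, not the TTP protocol itself; the function names, the integer time units, and the acceptance window are assumptions.

```python
from typing import List, Optional, Tuple


def slot_owner(global_time: int, slot_len: int, schedule: List[str]) -> str:
    """Return the node that owns the TDMA slot containing global_time."""
    round_len = slot_len * len(schedule)
    return schedule[(global_time % round_len) // slot_len]


def check_slot(schedule: List[str], slot_len: int, slot_index: int,
               arrival: Optional[int], window: int) -> Tuple[str, bool]:
    """Receiver-side check for one slot.

    The sender is identified by the slot, not by the frame content, and
    it counts as alive only if a frame arrived within `window` time units
    of the a priori known slot start."""
    sender = schedule[slot_index % len(schedule)]
    expected = slot_index * slot_len
    alive = arrival is not None and abs(arrival - expected) <= window
    return sender, alive


if __name__ == "__main__":
    schedule = ["A", "B", "C"]
    print(slot_owner(250, 100, schedule))             # node owning time 250
    print(check_slot(schedule, 100, 1, 102, 5))       # B sent on time
    print(check_slot(schedule, 100, 1, None, 5))      # B stayed silent
```

Because silence in a slot is itself information, every correct frame doubles as a life sign, which is how acknowledgement can be implicit in the membership.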
Sparse Time Base • If the occurrence of events is restricted to some active intervals of duration $\varepsilon$, with an interval of silence of duration $\Delta$ between any two active intervals, then we call the time base $\varepsilon/\Delta$-sparse, or sparse for short.
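A small sketch of a sparse time base, writing eps for the duration of an active interval and delta for the duration of the silence between two active intervals. It assumes, purely for illustration, that active intervals start at integer multiples of eps + delta and that time is counted in integer ticks.

```python
from typing import Optional


def active_interval_index(t: int, eps: int, delta: int) -> Optional[int]:
    """Index of the active interval containing tick t, or None if t
    falls into an interval of silence.

    Assumption: active interval n covers [n*(eps+delta), n*(eps+delta)+eps).
    """
    period = eps + delta
    index, offset = divmod(t, period)
    return index if offset < eps else None


def same_instant(t1: int, t2: int, eps: int, delta: int) -> bool:
    """Two events in the same active interval are treated as simultaneous."""
    i1 = active_interval_index(t1, eps, delta)
    i2 = active_interval_index(t2, eps, delta)
    return i1 is not None and i1 == i2


if __name__ == "__main__":
    eps, delta = 2, 8
    for t in (0, 1, 2, 9, 10, 21):
        print(t, active_interval_index(t, eps, delta))
```

Events are only permitted where the function returns an index, so any two events either share an index or are separated by at least one full interval of silence.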
Uniform Time Format--OMG Standard • External time format (8 bytes): elapsed seconds since the start of the epoch, January 6, 1980 at 0:00:00 UTC (GPS base) • Time horizon: $2^{40}$ seconds • Time granularity: $2^{-24}$ sec, about 60 nanoseconds, determined by the precision of GPS
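The 8-byte layout can be read as a 64-bit fixed-point number: 40 bits of whole seconds since the GPS epoch followed by 24 bits of binary fraction. The sketch below packs and unpacks such a value; the big-endian byte order and the function names are assumptions made for illustration, not taken from the standard.

```python
from typing import Tuple

SECONDS_BITS = 40     # time horizon: 2**40 seconds
FRACTION_BITS = 24    # granularity: 2**-24 seconds

GRANULARITY_NS = 1e9 / 2**FRACTION_BITS


def encode(seconds: int, fraction: int) -> bytes:
    """Pack whole seconds since the epoch and a 24-bit binary fraction
    into 8 bytes (byte order is an assumption: big-endian)."""
    if not 0 <= seconds < 2**SECONDS_BITS:
        raise ValueError("seconds outside the 40-bit time horizon")
    if not 0 <= fraction < 2**FRACTION_BITS:
        raise ValueError("fraction outside the 24-bit range")
    return ((seconds << FRACTION_BITS) | fraction).to_bytes(8, "big")


def decode(raw: bytes) -> Tuple[int, int]:
    """Inverse of encode: return (seconds, fraction)."""
    value = int.from_bytes(raw, "big")
    return value >> FRACTION_BITS, value & (2**FRACTION_BITS - 1)


if __name__ == "__main__":
    print(f"granularity: {GRANULARITY_NS:.1f} ns")
    stamp = encode(123456789, 42)
    print(stamp.hex(), decode(stamp))
```

One tick of the fraction field is 2^-24 s, which is where the figure of about 60 nanoseconds comes from.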
Time and State • In abstract system theory (Mesarovic, p.45), the notion of state is introduced in order to separate the past from the future: • “The state enables the determination of a future output solely on the basis of the future input and the state the system is in. In other words, the state enables a “decoupling” of the past from the present and future. The state embodies all past history of a system. Knowing the state “supplants” knowledge of the past. Apparently, for this role to be meaningful, the notion of past and future must be relevant for the system considered.” • A precise concept of time is a prerequisite for a precise concept of state.
Global Interactions versus Local Processing • In the TTA, the locus of temporal control is in the communication system. • In ET systems, the locus of temporal control is in the host computers. • [Figure: host computers, some with attached I/O, each connected through a CNI to a communication controller with its MEDL (CC+MEDL).]
TTP-Controller • [Figure: the host CPU connects through the CNI in DPRAM, with a TTP time interrupt, to the TTP controller; the controller contains the protocol engine and the TTP control data in the MEDL and drives the replicated TTP bus.]
Use of A Priori Knowledge • The a priori knowledge about the behavior is used to improve: • Error detection: It is known a priori when a node has to send a message (life sign for membership). • Message identification: The point in time of message transmission identifies a message (reduction of message size). • Flow control: It is known a priori how many messages will arrive in a peak-load scenario (resource planning). • For event-triggered asynchronous architectures, there exists an impossibility result: ‘It is impossible to distinguish a slow node from a failed node!’ This makes the solution to the membership problem very difficult.
Continuous State Agreement • The internal state of a TTP controller (C-state) is formed by the • Time • Operational Mode, and • Membership • The protocol will only work properly if sender and receiver contain the same state. • Therefore TTP contains mechanisms to guarantee continuous state agreement (extended CRC checksum) and to avoid clique formation (counts of positive and negative CRC checks).
TTP/A Objectives • Composability and Testability • Latency Guarantee for State Estimation • Good Error Detection for fail-safe operations • Use of Standard UARTs (8 data bits with parity) • High Data Efficiency (>50 %) and small latency • Single wire (10 kbit/s) or twisted pair operation • Clock Synchronization better than 1 msec
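The idea behind an extended CRC can be illustrated as follows: the sender computes the checksum over the message and its own C-state, but transmits only the message and the checksum. A receiver whose C-state differs computes a different checksum and rejects the frame. In this sketch `zlib.crc32` is a stand-in, since the real TTP controller uses its own CRC and frame layout; the C-state encoding and function names are assumptions.

```python
import zlib
from typing import Tuple


def c_state_bytes(time: int, mode: int, membership: int) -> bytes:
    """Serialize a C-state (time, operational mode, membership vector).
    The field widths are illustrative assumptions."""
    return (time.to_bytes(4, "big")
            + mode.to_bytes(1, "big")
            + membership.to_bytes(2, "big"))


def send(payload: bytes, c_state: bytes) -> Tuple[bytes, int]:
    """Build a frame: the CRC covers payload and C-state, but the
    C-state itself is not transmitted."""
    return payload, zlib.crc32(payload + c_state)


def accept(frame: Tuple[bytes, int], c_state: bytes) -> bool:
    """Receiver recomputes the CRC with its OWN C-state; the frame is
    accepted only if payload and C-state both agree with the sender's."""
    payload, crc = frame
    return zlib.crc32(payload + c_state) == crc


if __name__ == "__main__":
    sender_state = c_state_bytes(time=1000, mode=1, membership=0b0111)
    frame = send(b"flow=5", sender_state)
    print(accept(frame, sender_state))                             # same C-state
    print(accept(frame, c_state_bytes(1000, 1, 0b0011)))           # membership differs
```

Counting how many frames pass and fail this check in a round gives a node the evidence it needs to notice that it belongs to a minority clique.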
Fault-Tolerant Sensor Connection • [Figure: hosts forming a Fault Tolerant Unit (FTU) on the TTP/C bus; each host also attaches to a TTP/A bus whose nodes connect to the sensors of the controlled object.] • Legend: TTP/A = TTP/A master controller • TTP/C = TTP/C controller • A = TTP/A slave node interfacing to sensors and actuators
TTA and the CORBA Architecture • [Figure: Object A and Object B communicate through the Object Request Broker (ORB at A, ORB at B) using GIOP communication; the Time-Triggered Architecture connects to the ORB at the TTA CNI.] • CORBA Facilities: Time, Internationalization, Domain Specific (e.g., Banking, Health Care) • CORBA Services: Naming, Transaction, Security, Persistent State, Event Notification, and more
Integration of TT and ET Services--the Options • (i) Parallel: The time axis is divided into two parallel windows, where one window is used for TT, the other for ET. Two media access protocols are needed, one TT, the other ET. • [Figure: time axis with alternating TT and ET windows.] • (ii) Layered: The ET service is implemented on top of a TT protocol. A single time-triggered media access protocol is needed. • [Figure labels: Loss of Temporal Composability; Loss of Global Bandwidth Sharing.] • What are the consequences for global time and state?
Architecture Design is Interface Design • A good interface within a distributed real-time system • is precisely specified in the value domain and in the temporal domain, • provides the relevant abstractions of the interfacing subsystems and hides the irrelevant details, • leads to minimal coupling between the interfacing subsystems, • limits error propagation across the interface, • conforms to the established architectural style • and thus introduces structure into a system.
Composability • Compose: “to make or form by combining things, parts, or elements” • Composition: “the act of combining parts or elements to form a whole” (Webster Encyclopedic Dictionary, 1989, p. 302) • Composability: “The ease of forming a whole by combining parts” • Parts: The component systems or the components • Whole: A system of systems (SOS). • A composition brings into existence new emerging services of the SOS that are more than the sum of the prior services of the components. • These emerging services are the result of the integration of the component systems.
What is a “Component”? • In our context, a component is a complete computer system that is time aware. It consists of • The hardware • The system and application software • The internal state • The component interacts with its environment by the exchange of messages via interfaces.
Closed Component vs. Open Component • Closed Component: Contains no local interface to the real world, but can contain local interfaces to other closed components. Semi-closed if it is time-aware. • Open Component: Contains an interface to the real world. Semi-open if no control signals are accepted from the real world (e.g., a sampling system). • The real world has an unbounded number of properties.
Interfaces of a Component • [Figure: a component containing the application software, with its interfaces:] • Diagnostic and Management Interface (Boundary Scan in Hardware Design) • Linking Interface (LIF), relevant for composability • Local Interfaces • Configuration Planning Interface