TU Wien. Component-Based Design of Industrial Control System H. Kopetz TU Wien September 2006. The Message. Today, in 2006, computer technology is in the middle of a major paradigm shift, similar to the transition from the Mainframe to the Personal Computer twenty-five years ago.
TU Wien • Component-Based Design of • Industrial Control System • H. Kopetz • TU Wien • September 2006
The Message • Today, in 2006, computer technology is in the middle of a major paradigm shift, similar to the transition from the Mainframe to the Personal Computer twenty-five years ago.
Outline • Introduction • Technology Developments • What is a Component? • Composition Framework • Example: Time-Triggered Architecture • Future Developments • Conclusion
Technology Developments • Hardware • Communication • Software • User Expectations
Estimated Parameters of an SoC around 2015
• Parameter | 2004 | 2007 | 2015
• Feature size (nm) | 90 | 65 | 20
• DRAM Mbits/mm2 | 10 | 40 | 100
• SRAM Mbits/mm2 | 0.2 | 0.8 | 8
• Million transistors/mm2 | 1 | 4 | 40
• Chip size (mm2) | 200 | 200 | 200
• Frequency (GHz) | 2 | 8 | 50
• Cost/mm2 (cents) | 10 | 10 | 10
• Cost per transistor (cents) | 10 | 2.5 | 0.25
• Number of CPUs/mm2 | 5 | 20 | 200
• Cost (cents) per CPU, ARM 7 (200k) | 2 | 0.5 | 0.05
• MTTF/chip, permanent (years) | 1000 | 1000 | 100
• MTTF/chip, transient (years) | 1 | 0.8 | <0.01
The Technology Landscape--Hardware • The limits of Moore’s law are becoming visible: power dissipation, physical feature size, reliability. • The performance increase of a single processor from one generation to the next is proportional only to the square root of the increase in silicon area (Pollack’s rule). • The transient failure rate of sub-micron devices is increasing [both single-event upset (SEU) and single-event transient (SET)]. • Multi-computer chips (SoC) are appearing. The development cost of such a chip can exceed 100 million dollars--mass markets are needed to justify this level of investment. • It is getting more and more difficult to mobilize and finance the engineering resources required to design a billion-transistor SoC.
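A small worked illustration of Pollack's rule, with P denoting single-processor performance and A the silicon area (both symbols are introduced here for illustration and do not appear on the slides):

```latex
P \propto \sqrt{A}
\quad\Longrightarrow\quad
\frac{P(2A)}{P(A)} = \sqrt{2} \approx 1.41
```

Doubling the area of one processor would thus buy roughly 41% more performance, whereas spending the same area on a second processor of the original size can approach a factor of two on workloads that parallelize well. This is one way to read the move toward multi-computer chips.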
From the International Roadmap of Semiconductors • Architectural means to mitigate the consequences of component failures might become a necessity when using the upcoming submicron devices, as stipulated in the latest 2005 International Roadmap of Semiconductors, p. 6: • Relaxing the requirement of 100% correctness for devices and interconnects may dramatically reduce the costs of manufacturing, verification and test. Such a paradigm shift is likely forced in any case by technology scaling, which leads to more transient and permanent failures of signals, logic values, devices and interconnects.
The Technology Landscape--Communication • The widespread availability of a wireless communication infrastructure enables the ad-hoc detection and integration of services without any physical action--the coming of situation-aware systems--wireless sensors and actuators. • Seamless integration of Radio Frequency Identification (RFID) technology with embedded devices (e.g., cell phones). • Flexible transmission technologies (e.g., spread spectrum, ultra wide band, frequency hopping) and waveform-agile transmission methodologies (cognitive radio) allow multiple users to share a given frequency band with minimal interference and reduce the power required per bit transmitted. • It must be possible to integrate heterogeneous communication subsystems with minimal effect on the overall architecture of a large control system.
The Technology Landscape--Embedded Software • Uncontrolled system complexity: The costs of design, verification, integration and maintenance of large systems are becoming prohibitive. • The investment in software is more long-lasting than the investment in hardware. • Security becomes a key issue as embedded devices are integrated into the Internet, particularly in wireless systems. • The clear distinction between software and hardware is disappearing, e.g., power dissipation is also becoming a software issue, and not only in battery-operated devices. • Component-based design elevates the design process to a higher level of abstraction--but many key issues are still open, e.g., the precise specification of component interfaces, the identification of fault-containment units.
New Software Implementation Choices FPGA (Software) CPU Software • Algorithm Complexity • Ease of Programming • Greater Data Volumes • Higher Execution Speed (speedup 100X) From: R. Chamberlain, Embedding Applications within a Storage Appliance, Proc. HPEC 2005, p. 2 • Example: Look for keywords in a set of documents: • (A and B) or (C and D) • Search for the occurrence of A, B, C, D in the FPGA, combine the results in software
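The split described in the example can be sketched in plain Python. This is only an illustration of the division of labor: the first function stands in for the occurrence search that the slide assigns to the FPGA, the second is the Boolean combination done in software. The search terms and sample documents are hypothetical.

```python
import re

# Hypothetical search terms standing in for the slide's A, B, C, D.
TERMS = {"A": "clock", "B": "sync", "C": "bus", "D": "arbiter"}

def occurrence_flags(document, terms):
    """Stage 1 (the part the slide places in the FPGA): one flag per term."""
    words = set(re.findall(r"\w+", document.lower()))
    return {name: term in words for name, term in terms.items()}

def query(flags):
    """Stage 2 (software): combine the flags as (A and B) or (C and D)."""
    return (flags["A"] and flags["B"]) or (flags["C"] and flags["D"])

docs = [
    "Clock sync is maintained by the protocol",
    "The bus needs an arbiter",
    "A clock without a bus",
]
hits = [d for d in docs if query(occurrence_flags(d, TERMS))]
print(hits)
```

The data-intensive scan touches every byte of every document, while the query logic sees only four flags per document, which is why the first stage is the natural candidate for hardware.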
User Expectations • In a large embedded control system that consists of a vast assembly of networked components that must operate 24 hours per day, 365 days per year, the occurrence of transient and permanent failures of components and interconnects must be considered the norm, not the exception. Future systems must thus include strategies and mechanisms that ensure that the reliability of the user-perceived system services remains at an acceptable level despite the occurrence of these failures. • In an ambient intelligence scenario, where a multitude of diverse embedded devices is fielded in a home, it cannot be expected that the end user is willing to spend her/his time and effort to troubleshoot a misbehaving distributed embedded system. A system must thus be capable of diagnosing its own faults and guiding an untrained user to repair the system with minimal effort.
Integrity-Level of Application Domains Control Systems
The Dilemma • The consumer electronics (CE) domain has the size to support the large development costs needed to build powerful SoCs. • Since in the near future there is no need to harden CE chips to mitigate the consequences of ambient cosmic radiation, the CE industry will not pay extra for hardening their chips. • Architectural mitigation strategies have to be developed such that replicated mass-market chips can be used to build high-integrity control systems.
Summary: Need for a Higher-Level Design Process • The difference between software design and hardware design is disappearing--we need a design process that captures both domains. • The hardware base is becoming less reliable, but the system’s services must be more reliable. Fault tolerance will become the norm, not the exception. It must be supported at the architecture level and by the design process. • Physical parameters of the execution environment (e.g., the generated heat) must be considered in algorithm design. • The design process must be elevated to a higher level of abstraction to substantially increase designer productivity and widen the design choices for the implementations.
Component-Based Design: What is a Component? • Agreement at an abstract level: • A component is a building block in the construction of large systems. • Is this building block • A software unit of independent deployment (Szyperski--software component) or • a hardware-software unit that has behavior and state (system component)? • We need a clear concept of a component and a composition framework that supports the interactions among components.
System Component: Software plus Container • [Figure labels: Application Software Module, API] • Only the Software plus Container exhibits temporal properties
System Component Characteristics • A (system) component (sometimes called a host, processing element (PE), IP core or tile) has the following characteristics: • It performs a computation controlled by software or a hardware state machine. • It is aware of the progression of real-time. • It supports one or more message-based interfaces and contains state. • Every interface must contain at least one port for sending and receiving messages and can contain more. • At any instant, only one message can be sent/received at a port.
Different Types of System Components Local Interfaces--Open Components • The Communication Network Interfaces (CNI) of all three different types of system components should have the same syntax, timing and semantics. For a user, it should not be discernible which type of system component is behind the CNI.
Model Driven Design: From the PIM to the PSM • Domain Specific Application Model (e.g. expressed in UML) • Platform Independent Model (PIM) expressed in a High-Level Language • Platform Specific Model (PSM)
The Key Issue: Interfaces of a System Component • Diagnostic and Management Interface (Boundary Scan in Hardware Design) • Local Interfaces to Environment • Linking Interface (LIF): Offers the services of a component. Relevant for Composability • Configuration Planning Interface
The LIF Specification Hides the Implementation • Hidden inside the component: Hardware, Operating System, Middleware, Programming Language, WCET, Scheduling, Memory Management, etc. • Linking Interface Specification (In Messages, Out Messages, Temporal, Coordination, Interface State, Meaning--Interface Model)
Linking Interface (LIF) Specification • Operational Specification of the Messages: • Operational Input Interface Specification • Syntactic Specification (e.g. by IDL) • Temporal Specification (receive instant) • Input Assertions • Operational Output Interface Specification • Syntactic Specification (e.g. by IDL) • Temporal Specification (send instant) • Output Assertions • Coordination Patterns • Semantic Specification: • Meaning of the data elements: Means-and-ends interface model that explains the data transformation of the component • Interface State of the Interface Model
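The operational part of a LIF specification can be pictured as a small data structure. The sketch below is an assumption-laden illustration, not the notation used in the talk: the class name, the periodic timing model, and the wheel-speed message are all hypothetical. It checks an incoming message against its syntactic specification, its temporal specification (receive instant), and an input assertion.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass(frozen=True)
class MessageSpec:
    """Hypothetical operational specification of one LIF message."""
    fields: Dict[str, type]            # syntactic specification
    period_us: int                     # temporal specification: period
    phase_us: int                      # temporal specification: instant within period
    assertion: Callable[[dict], bool]  # input or output assertion

    def check(self, msg: dict, instant_us: int, jitter_us: int = 0) -> bool:
        # Syntax: exactly the declared fields, each of the declared type.
        syntactic = (msg.keys() == self.fields.keys() and
                     all(isinstance(msg[k], t) for k, t in self.fields.items()))
        # Timing: distance of the instant from the nearest specified instant.
        offset = (instant_us - self.phase_us) % self.period_us
        temporal = min(offset, self.period_us - offset) <= jitter_us
        # The assertion is only evaluated if syntax and timing are fine.
        return syntactic and temporal and self.assertion(msg)

# Hypothetical input message: wheel speed every 10 ms, 2 ms into the period.
wheel_speed = MessageSpec(
    fields={"wheel_speed_rpm": int},
    period_us=10_000,
    phase_us=2_000,
    assertion=lambda m: 0 <= m["wheel_speed_rpm"] <= 10_000,
)

print(wheel_speed.check({"wheel_speed_rpm": 1200}, instant_us=12_000))  # on time
print(wheel_speed.check({"wheel_speed_rpm": 1200}, instant_us=15_000))  # wrong instant
```

Nothing in such a specification refers to the operating system, scheduler, or hardware behind the interface, which is the point of hiding the implementation behind the LIF. The semantic part (the interface model) is not captured by this sketch.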
Semantic Specification: Interface Model • Specifying the concepts that are behind the names of the data structures. • The interface model specifies the relationship between the input messages, the output messages, the interface state and time. • The interface model must be expressed with concepts that are familiar to the conceptual world of the intended users. • The interface model of an open component must include the context of use, i.e. a (constrained) model of the environment. • The brittleness of natural language cannot be avoided in open components. • Meta-level specifications often remain informal--formalization increases the precision, but at the same time increases the distance to reality (Chargaff). • Beware of pseudo-formalism.
Services of the Composition Framework • The composition framework that links the LIFs of the components is realized by a real-time communication network which must have the following properties: • Timely transport of messages from any output port to any input port • High performance with minimal power consumption • Determinism to enable fault-masking by TMR • Protection of the network and the correct nodes from a failing node (hardware or software error) • Integrated diagnostics to detect component misbehavior
The Key Decision concerning the Network • Competition vs. Cooperation
Conflict Resolution in the Network • Only a single message can be received at a receive port at an instant: • Competition--Arbitration or Message Store Needed: • A network access control protocol ensures that only one sender may send to a receiver at an instant and, in case of conflict, exercises back pressure on the sender, based on priority (e.g., CAN) or on a random process (e.g., Ethernet bus). • Conflicting messages are stored in the network (e.g., switched Ethernet) and sent at a later time, when the line to the receiver is free. • Cooperation--no Arbitration and no Message Store Needed: • Senders cooperate among each other to establish transmission slots that are free of conflict at the receiver (e.g., TTP).
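The two strategies can be contrasted in a few lines of Python. This is a simplified sketch under stated assumptions: the function names, node names, and slot length are hypothetical, priority arbitration is reduced to "lowest identifier wins" (as on CAN), and the cooperative case is reduced to a static cyclic slot table.

```python
def arbitrate(pending_ids):
    """Competition: the lowest identifier wins; the losers must back off and retry."""
    winner = min(pending_ids)
    return winner, sorted(p for p in pending_ids if p != winner)

def tdma_schedule(senders, slot_us):
    """Cooperation: every sender owns a fixed slot in a cyclic round."""
    return {sender: i * slot_us for i, sender in enumerate(senders)}

def conflict_free(schedule, slot_us):
    """True if no two slot start instants are closer than one slot length."""
    starts = sorted(schedule.values())
    return all(b - a >= slot_us for a, b in zip(starts, starts[1:]))

winner, deferred = arbitrate({0x20, 0x10, 0x30})
schedule = tdma_schedule(["node1", "node2", "node3"], slot_us=100)
print(hex(winner), [hex(d) for d in deferred])
print(schedule, conflict_free(schedule, slot_us=100))
```

In the competitive case the decision is taken at run time, for every conflict. In the cooperative case the check is done once, when the schedule is built, and nothing needs to be decided or stored while messages are in flight.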
An Epistemological Issue • Access to common resources by multiple clients requires arbitration or a priori coordination. Coordination can be achieved by reference to a global time-base. • Establishing temporal order is much more expensive than maintaining temporal order. • Once a proper global time has been established, it can be used to order events consistently in the temporal domain. • In a system with global time, the difficult ordering problem has to be solved only once, at system startup, to achieve the initial synchronization of the clocks. Maintaining the clock synchronization is relatively easy. • In a system without global time, the difficult problem has to be solved every time temporal order is required (arbitration).
Performance and Power • According to Richardson et al. (2006): • Arbitration consumes time. Example: the AMBA bus from ARM. Since AMBA arbitrations consume three cycles from request to address transmission and four from request to data transmission, the induced latency overhead may become significant. • There is no need for arbitration, back-pressure or intermediate storage of data packets in a synchronous network. The power consumption is thus reduced compared with asynchronous networks. • Richardson, R.D. et al., A Hybrid SoC Interconnect with Dynamic TDMA-Based Transaction-Less Buses and On-Chip Networks, Proc. of the 19th International Conference on VLSI Design, 2006, IEEE Press
Example for Determinism: Airplane on Takeoff • Consider an airplane with a three-channel flight control system taking off from a runway: • Channel 1: Take off, Accelerate Engine • Channel 2: Abort, Stop Engine • Channel 3 (Fault): Take off, Stop Engine • Majority: Take off, Stop Engine
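The takeoff example can be replayed with a simple majority voter. This is a minimal sketch: the voter and the string encoding of the channel outputs are assumptions made for illustration. Channels 1 and 2 are both internally consistent but have reached different decisions, so the single faulty channel 3 tips each vote in a different direction.

```python
from collections import Counter

def vote(values):
    """Return the strict-majority value, or raise if there is none."""
    value, count = Counter(values).most_common(1)[0]
    if count * 2 <= len(values):
        raise ValueError("no majority")
    return value

channels = [
    ("take off", "accelerate engine"),  # channel 1
    ("abort",    "stop engine"),        # channel 2
    ("take off", "stop engine"),        # channel 3 (faulty)
]

decision = vote([c[0] for c in channels])
engine = vote([c[1] for c in channels])
print(decision, "/", engine)
```

The voted result combines "take off" with "stop engine", a pair that no correct channel produced. Voting masks a fault only if the correct replicas are guaranteed to reach the same result, which is why determinism is listed as a required property of the network.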
Determinism of a Communication Channel • A communication channel is called deterministic if (as seen by an omniscient external observer): • The receive order of the messages is the same as the send order. The send order among all messages is established by the temporal order of the send instants of the messages as observed by an omniscient observer. • If the send instants of n (n > 1) messages are the same, then an order of the n messages will be established in an a priori known manner. • Two correctly operating independent deterministic communication channels will always deliver messages in the same order.
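One way to picture this definition is as a sort key. In the sketch below, messages are ordered by send instant, and messages with identical send instants are ordered by sender identifier. The tie-break by sender identifier is an assumption chosen for illustration; the definition only requires that the rule is known a priori.

```python
def delivery_order(messages):
    """Order (send_instant, sender_id, payload) tuples deterministically.

    Primary key: send instant. Tie-break: sender_id (an a priori known rule,
    assumed here for illustration).
    """
    return [payload for _, _, payload in sorted(messages, key=lambda m: (m[0], m[1]))]

# The same three messages as observed, in different arrival order, on two channels.
red_channel = [(5, 2, "B"), (5, 1, "A"), (3, 7, "C")]
blue_channel = [(3, 7, "C"), (5, 1, "A"), (5, 2, "B")]

print(delivery_order(red_channel))
print(delivery_order(blue_channel))
```

Both channels deliver C, then A, then B, even though A and B were sent at the same instant, so replicas listening on different channels see identical input sequences.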
Determinism--Temporal Order is Obvious • [Figure: messages A and B on the Red Channel and on the Blue Channel, plotted against Real Time]
Determinism: Simultaneity--Who Wins? • [Figure: messages A and B on the Red Channel and on the Blue Channel, plotted against Real Time] • Determinism: If A wins on the blue channel then A must also win on the red channel
Simultaneity: A Fundamental Problem • The ordering of simultaneous events is a fundamental problem of computer science: • Hardware level: metastability • Node level: semaphore operation • Distributed system: ordering of messages • There are two ways to solve the simultaneity problem within a distributed system: • Distributed consensus--takes real-time and requires bandwidth (atomic broadcast) • Sparse time
The Time-Triggered Architecture • The Time-Triggered Architecture (TTA) provides an execution environment for real-time applications. It is • a distributed architecture that provides a fault-tolerant sparse global time-base of high precision at every node. • a deterministic architecture that supports fault tolerance by replication, where a node can be a single-chip computer (SoC). • an integrated architecture, where different distributed application subsystems (DASes) up to the highest criticality class can be integrated into a single framework. • a generic architecture, which can be deployed in different application domains (e.g., automotive, aerospace, train signaling, process control, multimedia). • Kopetz, H., Bauer, G., The Time-Triggered Architecture, Proc. of the IEEE, Vol. 91, Jan. 2003, pp. 112-126
Fault Tolerant Sparse Time Base in the TTA • If the occurrence of events is restricted to some active intervals of duration π, with an interval of silence of duration Δ between any two active intervals, then we call the timebase π/Δ-sparse, or sparse for short. • In a sparse time base, instants can be represented by integers.
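The integer representation of instants can be made concrete with a small sketch. It assumes, for illustration only, that active intervals and intervals of silence alternate with fixed durations starting at time zero; the function names and the nanosecond values are hypothetical.

```python
def sparse_tick(t_ns, active_ns, silence_ns):
    """Integer index of the active interval containing t_ns, or None during silence."""
    period = active_ns + silence_ns
    tick, offset = divmod(t_ns, period)
    return tick if offset < active_ns else None

def next_active_start(t_ns, active_ns, silence_ns):
    """Earliest instant >= t_ns at which an event may be generated."""
    period = active_ns + silence_ns
    tick, offset = divmod(t_ns, period)
    return t_ns if offset < active_ns else (tick + 1) * period

# Hypothetical time base: 100 ns active, 900 ns silence.
print(sparse_tick(1050, 100, 900))        # inside active interval number 1
print(sparse_tick(1500, 100, 900))        # inside an interval of silence
print(next_active_start(1500, 100, 900))  # deferred to the next active interval
```

Two events either carry the same integer, in which case they count as simultaneous, or different integers, in which case their temporal order is unambiguous for every observer. That is what makes the sparse time base an alternative to running a consensus protocol.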
TTA Eight-Byte Time Format: OMG Standard • 5 Bytes • Horizon about 30 000 years • Epoch starts on January 6, 1980 (Origin of GPS Time) • Precision about 60 nanoseconds
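The figures on this slide are consistent with an eight-byte fixed-point value that uses 5 bytes for full seconds and the remaining 3 bytes for binary fractions of a second. That split is inferred here from the stated horizon and precision and should be read as an assumption; the function names are hypothetical.

```python
SECOND_BITS = 40    # 5 bytes for full seconds (from the slide)
FRACTION_BITS = 24  # assumed: remaining 3 bytes hold fractions of a second

def encode(seconds_since_epoch: float) -> int:
    """Convert seconds since the epoch to a 64-bit tick count."""
    ticks = round(seconds_since_epoch * (1 << FRACTION_BITS))
    if not 0 <= ticks < (1 << (SECOND_BITS + FRACTION_BITS)):
        raise OverflowError("instant outside the representable horizon")
    return ticks

def decode(ticks: int) -> float:
    """Convert a tick count back to seconds since the epoch."""
    return ticks / (1 << FRACTION_BITS)

granularity_ns = 1e9 / (1 << FRACTION_BITS)
horizon_years = (1 << SECOND_BITS) / (365.25 * 24 * 3600)
print(f"granularity: {granularity_ns:.1f} ns, horizon: {horizon_years:.0f} years")
```

Under this assumption one tick is 2^-24 s, just under 60 ns, and 2^40 s spans somewhat more than 30 000 years, which matches the orders of magnitude quoted on the slide.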
Triple-Modular Redundancy (TMR) in the TTA • Triple Modular Redundancy (TMR) is the generally accepted technique for the mitigation of component failures at the system level: • [Figure: components A and B, each triplicated as A/1, A/2, A/3 and B/1, B/2, B/3, every replica with its own voter]
Triple Modular Redundancy is Supported by the TTA • The following architectural services that are needed to implement Triple Modular Redundancy (TMR) are supported by the TTA: • Provision of an Independent Fault-Containment Region for each one of the replicas • Timely and Deterministic Operation • Synchronization Infrastructure • Multicast communication • Replicated Communication Channels • Support for Voting
Systems of Systems do not have a Single Top • The services of a large RT control system (e.g., the computer system onboard a car or an airplane) can be partitioned into a set of nearly autonomous subsystems, which we call Distributed Application Subsystems (DASes). DASes communicate via gateways. • Examples of DASes onboard a car are • The body electronics DAS (doors, lights, climate control etc.) • The power train control DAS • The multimedia DAS.
Example of DASes onboard a Car DAS-Distributed Application Subsystem
Integrated Architecture • A number of technical and economic advantages could be realized if the different DASes were integrated into a single architecture: • Cost savings by the reduction of the number of ECUs, sensors and wiring points (which also results in an increase in hardware reliability). • Better integration of functions--more flexibility. • Implementation of fault tolerance simplified. • But: • Independence of individual DASes compromised--increased potential for error propagation from one DAS to another DAS. • Integration increases complexity and complicates diagnostics. • Allocation of responsibility more difficult.
The TTA is an Integrated Platform Architecture • Distributed Application Subsystems (DAS): DAS A, DAS B, DAS C, DAS D • Every DAS has its own encapsulated execution environment (possibly its own processor, memory and I/O). • DECOS Platform Interface Layer (PIL): • Encapsulation Services • Event-Triggered Communication • Virtual Channels • Hidden Gateways • Provision of Legacy Interfaces • Application Diagnosis Support • Core Services (done for TTP): • Timely and Deterministic Transmission • Fault-Tolerant Clock Synchronization • Fault Isolation • Determinism to support TMR • FCR-Diagnosis (Membership) • Different Implementation Choices, e.g., TTP, TT Ethernet • Technology invariant interface
Different Time-Triggered Communication Protocols • The TTA is not bound to a single communication protocol, as long as the core services are available: • The Time-Triggered Protocol (TTP) is the most mature protocol; it has been formally analyzed and is used in the aerospace domain. • The automotive industry has defined its own version of a time-triggered protocol, FlexRay. • For high-bandwidth applications, a standard-compatible extension to Ethernet, TT Ethernet, has been developed.
Why TT Ethernet? • At present, the communication system of the Time-Triggered Architecture (TTA) is controlled by the TTP/C protocol, which provides the core services. • The scope of the TTA would be substantially widened if Ethernet could be deployed as the communication system within the TTA. • Is it possible to augment Ethernet in such a way that COTS Ethernet users (we call them ET Ethernet users) and TT Ethernet users can operate in parallel?
Purpose of TT Ethernet • The purpose of TT Ethernet is to provide a uniform communication system for all types of distributed non-real-time and real-time applications, from very simple uncritical data acquisition tasks, through multimedia systems, up to safety-critical control applications such as fly-by-wire or drive-by-wire. • It should be possible to upgrade an application from standard TT Ethernet to a safety-critical configuration with minimal changes to the application software.
Legacy Integration • TT Ethernet is required to be fully compatible with existing Ethernet systems in hardware and software: • Message format in full conformance with the Ethernet standard • Standard Ethernet traffic must be supported in all configurations • Existing Ethernet controller hardware can coexist with TT Ethernet controllers within the same cluster • Special TT controllers are required if the additional services (e.g., clock sync, membership) of TT Ethernet are used.
Principles of Operation of TT Ethernet • Distinction between two Message Categories: Event Messages (Standard (ET) Ethernet Messages) originating in the open world and State Messages (Time-Triggered (TT) Ethernet Messages) originating in a closed world. • TT messages are scheduled and the schedules are assumed to be free of conflicts. • TT messages are sent at a predetermined instant on a sparse timebase and arrive within an a priori known delay with minimal jitter. • Conflict resolution: Preemption of an ET message by the Switch in case it is in the way of a TT message. • Automatic Retransmission of a preempted ET message by the Switch.
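The conflict-resolution rule can be illustrated with a toy model of a single switch output port. This is a sketch under simplifying assumptions and not a description of the actual switch implementation: time is an abstract integer, the TT schedule is conflict-free, ET frames are served first-come first-served, and a preempted ET frame is retransmitted from the start once the port is free. All names are hypothetical.

```python
def switch_port(tt_frames, et_frames):
    """Simulate one output port.

    tt_frames: list of (scheduled_start, duration, name), assumed conflict-free.
    et_frames: list of (arrival, duration, name).
    Returns a log of (start, end, name, status) tuples.
    """
    log = []
    tt = sorted(tt_frames)
    queue = sorted(et_frames)
    now = 0
    while tt or queue:
        next_tt = tt[0][0] if tt else float("inf")
        if queue and max(now, queue[0][0]) < next_tt:
            arrival, duration, name = queue[0]
            start = max(now, arrival)
            if start + duration <= next_tt:
                # The ET frame fits before the next TT instant.
                log.append((start, start + duration, name, "sent"))
                queue.pop(0)
                now = start + duration
            else:
                # The ET frame is in the way: cut it off at the TT instant.
                # It stays in the queue and is retransmitted later.
                log.append((start, next_tt, name, "preempted"))
                now = next_tt
        else:
            # TT frames always leave exactly at their scheduled instant.
            start, duration, name = tt.pop(0)
            log.append((start, start + duration, name, "sent"))
            now = start + duration
    return log

log = switch_port(tt_frames=[(100, 20, "TT1")], et_frames=[(90, 30, "ET1")])
for entry in log:
    print(entry)
```

In this run the ET frame starts at 90, is preempted at 100 so that the TT frame leaves exactly on schedule, and is retransmitted in full from 120 to 150. The TT frame experiences no queuing delay, while the ET sender needs no knowledge of the TT schedule.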