Real Time Systems. Introduction. Goal: Analyze the features and constraints imposed by the real time part of such systems Method: Talk about the following subjects Speed of response Failures Scheduling and interrupts Semaphores and Synchronization Software Dynamic Redundancy.
Introduction • Goal: Analyze the features and constraints imposed by the real time part of such systems • Method: Talk about the following subjects • Speed of response • Failures • Scheduling and interrupts • Semaphores and Synchronization • Software Dynamic Redundancy
Real Time System • Definition (Oxford Dictionary of Computing) • Any system in which the time at which the output is produced is significant. This is usually because the input corresponds to some movement in the physical world and the output has to relate to that same movement. The lag between input time and output time must be sufficiently small for acceptable timeliness. • The “physical world” clause limits the system type • The “acceptable timeliness” requirement restricts it even more
RT Characteristics • Timeliness: system has to perform “on-time” • Several types of “timeliness” • Short Time (nanosec-µsec-msec), e.g. Cruise Control System, Broad Band Telecommunication, Games, etc. • Medium Time (msec-sec): Teller Systems, Intensive Care Units, Gas and Petrol Pumps, etc. • Long Time (sec-hours): Chemical Reaction Controllers, Oven Temperature Controllers, etc.
RT Characteristics • Dynamic Internal Structure • Ability of system to cope with changing external conditions • Telecom line severed → rerouting • Disk malfunction → use mirror disk • Temperature too high → disconnect line, activate emergency program • Net result → dynamic creation and deletion of executables (to cope with memory, CPU speed, etc.)
RT Characteristics • Reactiveness • RT systems are usually “event driven”. They react to events which may come in any order and at various times • The reactiveness is a measure of their ability to do so prior to their total failure. • Disk parity, unknown file, disruption of communication at different stages, Telephone switches, insertion and removal of credit cards, various timeouts, etc. • Ability of RT systems to react to various events in any order. It is a set of allowed sequences which together with timing yields the desired reaction of the system
RT Characteristics • Architecture - The way the system is built • A single activity system • Usually one task with predefined sequences, only one of which can work at a time • Concurrent systems • A system which may perform a few tasks at a time • A quasi-concurrent system uses a single CPU shared by many tasks which “seem” to work in parallel • A concurrent system uses many CPUs, each running a different task. This type usually uses the same memory for the different CPUs. This is a true concurrent system • A distributed concurrent system uses many independent computers (which may reside at different sites) linked by fast communication (a fly-by-wire plane is a good example)
Distributed systems • Distributed architecture is suitable for complex systems • Redundancy (CPU memory and mass storage) • Parallel calculations • Depend heavily on communications • System start may depend on the order different CPUs come on line • Latency may depend on channels traffic • Complex protocols may be counter productive
RT Characteristics • Real world systems are a superposition of the above • Design of such systems is therefore complex and error prone • Requires knowledge, expertise and ingenuity to achieve a good solution • Needs methods and some theory to ensure success
Timeliness Issues • Reaction Time is the total time it takes to get a reaction from the system • Reaction time = Latency + Service time • Latency is the time it takes to start reacting • Time to start interrupt routine • Time to position disk reading head (why?) • May well be a function of system state! • Service Time is the time it takes to complete the response calculation • Mathematical calculation • Data transfer
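The relation Reaction time = Latency + Service time can be illustrated with a minimal Python sketch. The class name `Timing`, the function `meets_deadline` and all the numbers are assumptions made up for illustration; they only show how the same service time can pass or miss a deadline depending on the latency, which depends on system state.

```python
from dataclasses import dataclass


@dataclass
class Timing:
    latency_ms: float  # time until the system starts reacting
    service_ms: float  # time to compute and deliver the response

    @property
    def reaction_ms(self) -> float:
        # Reaction time = Latency + Service time
        return self.latency_ms + self.service_ms


def meets_deadline(timing: Timing, deadline_ms: float) -> bool:
    """Return True when the total reaction time fits within the deadline."""
    return timing.reaction_ms <= deadline_ms


# Hypothetical numbers: identical service time, different system states.
idle = Timing(latency_ms=0.5, service_ms=2.0)  # interrupt served at once
busy = Timing(latency_ms=4.0, service_ms=2.0)  # e.g. waiting for a disk head

print(idle.reaction_ms, meets_deadline(idle, deadline_ms=3.0))
print(busy.reaction_ms, meets_deadline(busy, deadline_ms=3.0))
```

Since latency may be a function of system state, a deadline analysis of this kind has to use the worst-case latency, not the typical one.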
Real Time Systems • Systems vary with the deadline requirements • Hard RT systems must meet the deadline regardless of state • Soft RT systems may have some leeway • Not required to meet the deadline exactly, only approximately • May be probabilistic (90% within x, etc.) • Consistent delays may cause failures (queue filling up!) • Part hard & part soft = Firm
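The difference between a hard deadline and a probabilistic soft deadline ("90% within x") can be sketched as follows. The function names, the sample response times and the 20 ms deadline are hypothetical values chosen for illustration only.

```python
def fraction_on_time(response_times_ms, deadline_ms):
    """Fraction of responses that arrived within the deadline."""
    if not response_times_ms:
        raise ValueError("no measurements")
    on_time = sum(1 for t in response_times_ms if t <= deadline_ms)
    return on_time / len(response_times_ms)


def meets_hard(response_times_ms, deadline_ms):
    """Hard RT: every single response must meet the deadline."""
    return all(t <= deadline_ms for t in response_times_ms)


def meets_soft(response_times_ms, deadline_ms, required_fraction=0.9):
    """Soft RT (probabilistic): a given fraction must meet the deadline."""
    return fraction_on_time(response_times_ms, deadline_ms) >= required_fraction


# Hypothetical measurements in msec; one response is badly delayed.
samples = [12, 15, 9, 11, 14, 10, 13, 48, 12, 11]

print(meets_hard(samples, deadline_ms=20))
print(meets_soft(samples, deadline_ms=20))
```

With these numbers the single late response breaks the hard requirement while the soft requirement is still satisfied.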
Real Time Systems • Remember: Regardless of type a late answer is a wrong answer • Better late than never is not acceptable! • Additional categories of RT system exist
Examples • Travel Agent versus Airline central CPU • RT system, but as long as the delay in answers is less than 30-60 sec. everybody is happy • Soft RT system • [Diagram: terminals (T) connected through a multiplexor and modems to a front end CPU and the central computer]
Travel Agent Example Analysis • Input: completed forms (queries or abort) • Output: forms, display of DB data, DB updates • Latency: varies, depends on rate of requests • Service Time: variable (depends on rate and length of output) • Reaction time: variable (a reasonable delay is OK) • Soft RT system
Nuclear Detonation Probe • Important data available within 2.5 msec. • Measurements made at explosion site and transmitted within these 2.5 msec • After that the device melts • HARD RT system. • System is reactive due to sensors sensing the explosion.
Car Cruise Control • Reactive? • Hard? • What are sensible times for sampling and control?
Oven Temperature Control • Reactive? • Hard? • Time Constants? • What else can be done?
Getting the External Stimuli • In order for the computer to react it has to “read” the stimulus • Two methods exist: Interrupts and Polling • Input to a computer usually comes in through an I/O port • Interrupt is a hardware mechanism attached to an I/O port external to the cpu which senses changes in the port status and interrupts the cpu current work
Getting the External Stimuli • Handling the interrupt consists of reading the I/O port • Interrupt handlers do this • Interrupts allow better utilization of the CPU as no cpu cycles are used to probe the I/O port status and deciding what to do • Polling consists of continuously probing the port status and deciding what to do
Getting the External Stimuli • I/O ports usually have one data word (byte) and one status word (byte) • Interrupts are coupled to the status word • Polling continuously reads the status word • Status word is device dependent but has at least one ready bit and one error bit • For output: ready means ready for next data • For input: ready means new data is available in the data word to be read in
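A polling loop over a status word with a ready bit and an error bit might look like the sketch below. Everything here is simulated: `SimulatedPort`, the bit positions `READY_BIT` and `ERROR_BIT`, and the "new byte every third tick" behaviour are assumptions for illustration, since real status words are device dependent.

```python
READY_BIT = 0x01  # assumed position of the ready bit
ERROR_BIT = 0x02  # assumed position of the error bit


class SimulatedPort:
    """Stand-in for an input port with one data word and one status word."""

    def __init__(self, incoming):
        self._incoming = list(incoming)
        self.status = 0
        self.data = 0
        self._ticks = 0

    def tick(self):
        # Simulate the device: every third tick a new byte arrives.
        self._ticks += 1
        if (self._ticks % 3 == 0 and self._incoming
                and not self.status & READY_BIT):
            self.data = self._incoming.pop(0)
            self.status |= READY_BIT

    def read_data(self):
        # Reading the data word clears the ready bit.
        value = self.data
        self.status &= ~READY_BIT
        return value


def poll_input(port, count, max_polls=1000):
    """Poll the status word until `count` data words have been read."""
    received = []
    polls = 0
    while len(received) < count and polls < max_polls:
        port.tick()
        polls += 1
        if port.status & ERROR_BIT:
            raise OSError("device reported an error")
        if port.status & READY_BIT:
            received.append(port.read_data())
    return received, polls


data, polls = poll_input(SimulatedPort([0x41, 0x42]), count=2)
print(data, polls)
```

In this run most polls find nothing to read, which is the CPU time an interrupt-driven design would save.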
Example: Controlling Sale of Gas • Need to control • Pump flow (mainly stop) • Credit card reading • Gas quantity • Sales receipts • Back-office integration: update the DB of the gas station • Gas restocking quantities: issue computerized orders for gas buying (example) • Analyze the needed sub-systems, their type and method of input reading • Assess time constants
Last Example Analysis • This is a mixed system: part hard, part soft • Needs many tasks to perform functionality • Tasks to work concurrently • Needs a Real Time Operating System (RTOS)
RTOS Qualifications • Fast Response Capability • Stability under transient overload (stress) • Inter-task communication facility • Available high level programming language • Low level programming language • Requirement analysis tool • Hardware and Software debugging tools
Multi-Tasking RTOS • Why buy an RTOS • Available facilities tested • Behavior under stress known • Amount of testing reduced • Additional memory and overhead (time!) needed (disadvantage) • In-house development • No available facilities • No ready tools for development; prepare all by yourself • Huge testing amount • Less memory and overhead (advantage)
Patient Monitoring Example • [Data flow diagram: Local Station (Patient monitoring) and Central Station (Central monitoring and alarm), with the processes Collect patient details, Update log, Date time & polling interval producer, Produce patient report] • Data flows: 0 - poll local station • 1 - patient vital signs • 2 - bed ID • 3 - personal details • 4 - vital signs limits • 5 - patient data • 6 - time data • 7 - time information • 8 - present patient information
Example (continued) • This is a quasi concurrent system • The number of local stations is of paramount (big) importance • An RTOS will switch between the beds within a specified time (this is the hard RT part) • All other tasks will use queues to transfer information • All tasks will have to co-operate in order to get timely correct data
Example (continued) • By using an RTOS all the timing interrupts etc are taken care of by the RTOS • Same is true for task scheduling • Information transfer is achieved by using the system supplied queues • Developers have to concentrate on writing the application not on testing the interrupts and the stress conditions!
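Information transfer through system-supplied queues can be sketched with Python's standard `queue` and `threading` modules standing in for the RTOS facilities. The task names `local_station` and `central_station`, the bed numbers and the readings are hypothetical; a real RTOS would provide its own queue and task primitives.

```python
import queue
import threading


def local_station(bed_id, readings, out_q):
    """Producer task: send each reading, tagged with its bed ID."""
    for reading in readings:
        out_q.put((bed_id, reading))
    out_q.put((bed_id, None))  # end marker for this station


def central_station(in_q, n_stations):
    """Consumer task: collect readings until every station has finished."""
    log = []
    finished = 0
    while finished < n_stations:
        bed_id, reading = in_q.get()
        if reading is None:
            finished += 1
        else:
            log.append((bed_id, reading))
    return log


def run_demo():
    q = queue.Queue()
    beds = {1: [72, 75], 2: [80]}  # hypothetical heart-rate readings
    producers = [
        threading.Thread(target=local_station, args=(bed, vals, q))
        for bed, vals in beds.items()
    ]
    for p in producers:
        p.start()
    log = central_station(q, len(producers))
    for p in producers:
        p.join()
    return log


print(sorted(run_demo()))
```

Because each message carries its bed ID as a single queue item, data from different patients cannot be garbled together even though the producers run concurrently.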
Example (continued) • We concentrate now on the “Local Station” • Vital signs are: Blood Pressure; Body Temperature; Heart Beat Rate; Respiration Rate; Sweating. A transducer is needed for each to enable reading the information into the computer. • Minimal time constants are ~300 msec • Reading the ports takes only a few msec → the system can accommodate many concurrent patients
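A rough capacity estimate follows from the two figures on this slide. The sketch below assumes 5 msec per port read (the slide only says "a few") and one read for each of the five vital signs; it also ignores task switching and other RTOS overhead, so the result is an upper bound rather than a design figure.

```python
def max_patients(period_ms, read_ms, signs_per_patient):
    """Upper bound on patients that can be sampled once per period."""
    per_patient_ms = read_ms * signs_per_patient
    return int(period_ms // per_patient_ms)


# 300 msec time constant from the slide; 5 msec per read is an assumption.
print(max_patients(period_ms=300, read_ms=5, signs_per_patient=5))  # 12
```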
A Look at RTOS Operation • RTOS must • Allocate system resources taking into account the deadlines which produce higher priorities • Schedule different tasks according to priorities • Provide for task synchronization and exclusion (in our example we want data of a single patient at a time, not garbled by other patients data) • Provide for data sharing between tasks (report is made of different bits of information coming from shared data regions) • Must provide for external events (interrupts)
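One possible way to let deadlines produce priorities is earliest-deadline-first ordering, sketched below with a heap. This is an illustrative policy choice, not necessarily what any particular RTOS does; the class `ReadyQueue`, the task names and the deadlines are hypothetical.

```python
import heapq


class ReadyQueue:
    """Ready tasks ordered by deadline: the nearest deadline runs first."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker keeps insertion order for equal deadlines

    def add(self, deadline_ms, name):
        heapq.heappush(self._heap, (deadline_ms, self._seq, name))
        self._seq += 1

    def next_task(self):
        """Remove and return the most urgent task name, or None if empty."""
        if not self._heap:
            return None
        _deadline, _seq, name = heapq.heappop(self._heap)
        return name


def dispatch_order(tasks):
    """Return the order in which (name, deadline_ms) tasks would be run."""
    ready = ReadyQueue()
    for name, deadline_ms in tasks:
        ready.add(deadline_ms, name)
    order = []
    task = ready.next_task()
    while task is not None:
        order.append(task)
        task = ready.next_task()
    return order


print(dispatch_order([
    ("update_log", 1000),
    ("poll_bed", 300),
    ("report", 5000),
    ("alarm", 50),
]))
```

In the patient monitoring example this would put the alarm and bed polling (the hard part) ahead of logging and report production (the soft part).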
Outline of a Multi-Task RTOS • User Level • System tasks • Application tasks • OS Level • Resource allocation & mgmt • Command processor • I/O subsystem • File mgmt • Task scheduling & dispatch • Interrupt service routines • Real time clock
Another RTOS Diagram • [Diagram: Application Programs, Executive, Kernel, Real World Interface]
Another Look (cont.) • Kernel - Detailed, computer-dependent functions that supply the executive with its functionality • Executive - Control system functions which • allocate system resources (memory, scheduling, file mgmt, mutual exclusion semaphores, synchronization, etc.) • Applications call executive functions to perform tasks, so they depend only on the system (the executive), not on the computer • Real World Interfaces - Functions that deal with the special I/O of the system (written according to executive requirements)
Pros/Cons • Pros • Standard software drives any system • Scaling of system becomes easier • Testing simpler • RTOS provides: programmable timers, configurable I/O services, queues, semaphores • RTOS contains standard subroutines for: • Serial communications, A/D and D/A converters, keyboard controllers • Cons • Uses more memory and may be slower
Embedded Computer systems • Definition: • Any RT system in which a computer is used as an internal component that monitors and controls its operation is an embedded system • An embedded system has a computer which may or may not use an RTOS • It may have severe constraints on size, power, memory, etc. • Usually has a very rudimentary (simple) user interface
Embedded Systems - Closer look • External constraints are extremely important in embedded systems • Three main areas influence the decisions • Environment • Physical, Operational, Electrical • Performance • Speed of response, Failure mode recovery • Interface (the need to communicate with external devices)
Embedded Systems - Closer look • Physical Environment Considerations • Really anything may affect the system • Temperature, Humidity, Shock and Vibration, Space constraints (size and shape), Weight limits • Temperature • Some ranges are given: • Industrial range 0 to 50 °C • Military range -55 to 125 °C • Oil exploration 0 to 200 °C • Space exploration -55 to 200 °C • The different ranges call for special insulation and/or heat removal, affecting the electrical power needed. The smaller the device, the easier it is to cope with its requirements • Display units, hard disks and user terminals do not function well at extreme temperatures → need for a protected environment
Environment (continued) • Humidity • Humidity and salt cause corrosion and short circuits → protection is needed and the number of choices available to the designer is much reduced • Vibration, Size, Weight and Shape • These constraints are found in hand held units, space probes, special aircraft, submarines, armored vehicles, etc. • Once all factors are combined the number of choices is very small, which invariably reduces the possibility of using ready made systems, libraries, etc.
Environment (continued) • Electrical Environment • Many automatic systems depend on an internal power supply → power should be saved! • Limits on memory (and/or special RAM) • Small displays (difficult to read, and primitive) • Requirement for a “stand by” state • Cosmological probes may actually hibernate! • Operational Environment • On turn on the system is expected to start up correctly • On turn off the system is expected to turn itself off in a safe manner → BIT (built-in test) for start up, special care in turning off memory, ensure disk heads do not crash on disk plates, etc.
Environment (continued) • General Requirements • Start and forget (once started it has to complete its mission safely and/or continue forever!) • Very long operational life • Software may have to be upgraded without changing the system • Bugs may have to be fixed. To reduce corrections, a strategy of easy testing and maintainability will affect the primary design • However, the last requirement is often “forgotten” and finally costs more than the whole system development
System Failures • Any system will fail (including well designed ones) • Problems may be categorized as: • H/W fault (short circuit, severed lines etc.) • S/W fault (an error escaping testing) • A transient fault (overload causes information to be lost) • A permanent fault (due to some unforeseen condition system cannot recover)
System Failures • It may be impossible to recover from H/W faults, however one has to try → design for it! • S/W faults have to be fixed, however the system has to recover → reload the sick task and restart (hope that conditions have changed) • System has to recover from transient faults
System Failures • [Diagram: regions 1, 2 and 3 of the system state space] • 1 - Excess stress domain • 2 - Stress domain from which recovery is possible • 3 - Operational domain, part of which is the service region; the unused part is additional domain for later use
Region Definitions • Operational Domain • The totality of the state space which the system may visit in the course of its normal operation • A space in which it functions according to requirements • Tolerable Stress Domain • The totality of the state space in which the system must survive without damage and from which it must recover and return to the operational domain
Region Definitions • Excess Stress Domain • The state space outside the normal and tolerable stress spaces. A domain where the system’s behavior and safety cannot be guaranteed. • The system is either protected from this possibility or degrades gracefully • Graceful degradation should give time for external help to arrive.
Coping with system failure • When no recovery is possible (past the graceful degradation) the system has to be put in a fail-safe state. • Close files, turn off heat, tell user, stop communications so they can be restarted, try to restart (a preset no. of times) • When possible, graceful degradation is called for. • Remove part of functionality, read only most important messages, increase reaction time
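The "try to restart a preset number of times, then go to a fail-safe state" policy can be sketched as below. The function `run_with_recovery`, its parameters and the default of three restarts are hypothetical; what the fail-safe action actually does (close files, turn off heat, tell the user) is left to the caller.

```python
def run_with_recovery(task, enter_fail_safe, max_restarts=3):
    """Run `task`; on failure restart it up to `max_restarts` times.

    If every attempt fails, call `enter_fail_safe` and return None.
    """
    attempts = 0
    while attempts <= max_restarts:
        try:
            return task()
        except Exception:
            # The task is "sick": count the failure and restart it,
            # hoping that conditions have changed.
            attempts += 1
    enter_fail_safe()
    return None


def always_fails():
    raise RuntimeError("sensor bus not responding")


actions = []
result = run_with_recovery(always_fails, lambda: actions.append("fail-safe"))
print(result, actions)
```

A task that fails only transiently returns its result as soon as one attempt succeeds, and the fail-safe action is never called.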
Coping with system failure • Full Fault Tolerant Systems • Are not affected by failures up to a point • Are very expensive as the system is duplicated • May need an arbitrator computer to manage the double (triple or more) redundancy • Such systems have a very large MTBF (but it is finite!) • The best bet is a graceful Service Degradation, needs a lot of planning!
External Device Interfacing • External devices and their interfaces are one of the main sources of failures in RT systems • Usually the size of the hardware and its cost are dominated by the interface electronics • When the system is mainly a computer (one or two standard devices) a watchdog with master and slave computers is a good and inexpensive solution
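The watchdog idea can be sketched in software as follows: the master must "kick" the watchdog regularly, and if it stays silent for longer than the timeout the expiry action (for example, handing control to the slave) fires once. The class `Watchdog`, the 100 msec timeout and the explicit `now_ms` arguments are assumptions for illustration; a real watchdog is normally a hardware timer.

```python
class Watchdog:
    """Fires `on_expire` once if no kick arrives within `timeout_ms`."""

    def __init__(self, timeout_ms, on_expire):
        self.timeout_ms = timeout_ms
        self.on_expire = on_expire
        self._last_kick_ms = 0
        self.expired = False

    def kick(self, now_ms):
        # Called by the master computer to show that it is still alive.
        self._last_kick_ms = now_ms

    def check(self, now_ms):
        # Called periodically (e.g. from a timer tick) to test for silence.
        if not self.expired and now_ms - self._last_kick_ms > self.timeout_ms:
            self.expired = True
            self.on_expire()
        return self.expired


events = []
wd = Watchdog(timeout_ms=100, on_expire=lambda: events.append("switch to slave"))
wd.kick(now_ms=0)
wd.kick(now_ms=90)
print(wd.check(now_ms=150))  # master kicked 60 msec ago
print(wd.check(now_ms=200))  # master silent for 110 msec
print(events)
```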
External Device Interfacing • For I/O dominated systems (aircraft, missiles, smart bombs, life sustaining systems, etc.) • Doubling the number of computers requires doubling all external devices • Very expensive • Different external devices, especially when D/A and A/D converters are used, give rise to different values → need an arbitrator • Done only when money is of secondary importance
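One simple arbitration scheme for redundant analog readings is median voting: with three or more channels the median is not affected by a single wild value, and channels that stray too far from it can be flagged. This is one possible technique, not necessarily the one intended here; the function `vote`, the tolerance and the readings are hypothetical.

```python
import statistics


def vote(readings, tolerance):
    """Return (agreed value, indices of channels that disagree with it)."""
    if len(readings) < 3:
        raise ValueError("median voting needs at least three channels")
    agreed = statistics.median(readings)
    suspects = [
        index for index, value in enumerate(readings)
        if abs(value - agreed) > tolerance
    ]
    return agreed, suspects


# Three A/D channels measuring the same quantity; channel 2 has failed.
print(vote([20.1, 20.3, 35.0], tolerance=0.5))
```

With only two channels the arbitrator can detect a disagreement but cannot tell which channel is wrong, which is why triple redundancy is common.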
S/W Treatment of Faults • C++ and other languages have mechanisms to cope with software exceptions (TRY-THROW-CATCH) • Exception handling starts with a test • How do we know the H/W malfunctioned? • How to detect: sensor failure, illegal action by operator, garbled message display, etc. • Assuming we caught an error, the more difficult part is: what to do
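The "test, throw, catch" pattern can be sketched in Python, whose `raise` and `try`/`except` play the roles of C++ `throw` and `try`/`catch`. The plausibility range of 30 to 45 °C, the `SensorFault` exception and the policy of keeping the last good value are assumptions for illustration; deciding what to do after the catch remains the hard, application-specific part.

```python
class SensorFault(Exception):
    """Raised when a sensor reading is missing or implausible."""


def read_temperature(raw_value, low=30.0, high=45.0):
    # The exception starts with a test: is the reading plausible at all?
    if raw_value is None:
        raise SensorFault("no reading from sensor")
    if not low <= raw_value <= high:
        raise SensorFault(f"reading {raw_value} outside {low}..{high}")
    return raw_value


def monitor(raw_values):
    """Return (last plausible reading, number of faults detected)."""
    last_good = None
    faults = 0
    for raw in raw_values:
        try:
            last_good = read_temperature(raw)
        except SensorFault:
            # Assumed policy: keep the last good value and count the fault.
            # A real system would also log it and possibly raise an alarm.
            faults += 1
    return last_good, faults


print(monitor([36.6, None, 120.0, 37.1]))
```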