Ardea: Dynamic Reconfigurable Control Architecture

Explore the run-time behavior of the Ardea framework, a distributed embedded control system emphasizing fault tolerance and real-time deadlines. Learn about the Ardea hardware architecture, software module dependency graphs, fault tolerance mechanisms, and more.

Presentation Transcript


  1. Run-Time Behavior of Ardea: A Dynamically Reconfigurable Distributed Embedded Control Architecture Osamah A. Rawashdeh and James E. Lumpp, Jr. Department of Electrical and Computer Engineering University of Kentucky Lexington, KY

  2. Outline • Motivation/Background • Objective and Contributions • Ardea Framework Overview • Ardea Hardware Architecture • Software Module Dependency Graphs • Ardea Fault Tolerance • Runtime Behavior • Summary and Conclusion

  3. Embedded Control • Distributed embedded control system • Mission critical tasks • Non-mission critical tasks • Array of sensors and actuators • Set of computing resources • Interconnection network

  4. Distributed State • Distributed Systems as HPC • Long Running Computations • Nodes and Links Fail • Rollback/Roll-forward • Save process state • Save in-transit messages • Limit interaction with the outside world • Redo the part that had the problem • Real-Time deadlines

  5. Motivation • No building block can be error free so we have to tolerate faults in embedded/real-time HW/SW • Traditional Techniques: • TMR • N-version redundancy • (ad-hoc) failover approaches • Disadvantages: • Wasted resources (cost, size, weight, power) • Software complexity • Quantifying failures

  6. BIG BLUE • Flight profile (figure labels): Wing Deployment, Aircraft Separation, Vehicle Launch, Parachute, Landing; altitude markers at 17,000 ft, 58,000 ft, 63,000 ft, 86,000 ft, 89,000 ft • BIG BLUE: Baseline Inflatable-wing Glider, Balloon-Launched Unmanned Experiment. • Ongoing project at UK to develop a test bed for Mars airplane technology. • ~40 undergraduate students involved per year.

  7. UAV Research • BIG BLUE is funded by NASA Workforce Development Program. • Dependable UAVs for Homeland Security • BIG BLUE III, with inflatable only wings, and a UAV for entry in the AUVSI 3rd Annual Student UAV Competition. • “READY” UAV

  8. BIG BLUE II Architecture • Mission Controller • Auto-Sequencing • Data Acquisition • Ground Communication • Flight Controller • Control Glider • Chute Control • Monitor System Status • Deploy Recovery Chute • Camera Driver • Capture Images • Store to NVRAM • Shared Memory Space • Mailbox-Based Messaging

  9. Dependable Systems • Dependability: trustworthiness of a system allowing reliance to be justifiably placed on its services • Failures: deviation of the service provided from compliance with specifications • Faults: the cause of failures • Failure semantics: omission, timing, response, and crash • Hardware versus software faults • Fault Tolerance: ability to continue operation despite failures Figure 1 - Page 6

  10. Traditional Fault Tolerance • Fault tolerance entails fault detection and subsequent handling • Fault tolerance requires redundancy: • Static redundancy (spatial redundancy) • Modular redundancy • Design Diversity • Dynamic redundancy (temporal redundancy) • Recovery blocks • Failover programming

  11. Reconfiguration Based FT • Run-time reconfiguration FT is feasible in distributed embedded systems: • Cost, size, power constraints • Availability of non-critical resources • Graceful degradation: a loss of or reduction in the quality of services a system provides in response to faults • Graceful degradation for distributed embedded systems is a new research area

  12. Approach • Graceful Degradation • Hardware/Software faults degrade performance instead of causing system failure. • Resources dedicated to non-critical functions serve as backup resources for critical functions. • No need to consider every failure combination at design time. Objective: To develop a framework for specifying gracefully degrading distributed embedded systems.

  13. The Challenges • How to specify a dynamically reconfiguring system that includes static and dynamic redundancies as well as graceful degradation abilities • How to manage the redundancies • What infrastructure is needed to run these dynamic applications

  14. The Challenges • Software module location independence • Moving object code • Routing module I/O data • Fault Recognition • User/application code reported faults • Ardea built-in fault detection • Tracking status/availability of HW and SW resources • Configuration Management • Tracking resource availability (HW and SW) • Finding new configurations (mapping of modules to PEs) • Deploying new configurations (starting, stopping, and restarting modules) • Managing state data variables • Reconfiguration time and critical deadlines (multiple system reconfiguration policies support reconfiguring before deadlines are missed; if a deadline is missed, the system fail-stops)

  15. The Ardea Framework • Ardea – Automatically Reconfigurable Distributed Embedded Architectures • Ardea herodias – The Great Blue Heron, a wading bird of the heron family Ardeidae, common all over North and Central America. This is the largest North American heron.

  16. HW Architecture Overview • Processing Elements (PEs) - Homogeneous set of processors - Real-time OS. - Local management tasks (scheduler, network interface, loader) • I/O Devices - Sensors and actuators - Hosted by PEs • Communication Network - Broadcast Network - Bandwidth and Latency • System Manager - Fault tolerant by other means - Tracks status of resources - Finds and deploys configurations

  17. Micro C/OS-II • Portable, ROMable, scalable, preemptive, real-time, multi-tasking, priority-based kernel. • Source available, ANSI C and free for academic use. • Ported to 40+ architectures (8 to 64 bit) since 1992. • Meets RTCA DO-178B Level A • Uses 4% CPU and 3 KB - 30 KB RAM

  18. CAN Aerospace • Stock Flight Systems • NASA Langley AGATE/SATS • NASA Ames SOFIA

  19. Silicon Labs C8051F040X • 25 MIPS pipelined 8051 • Integrated CAN 2.0B controller • 64 kB of Flash, 4 kB of SRAM, external memory interface • Mixed Signal • Dual UARTs, SMBus and SPI serial interfaces • MicroC/OS and Ardea CAN layer were ported to this MCU

  20. Ardea Overview • Software is developed in a modular fashion • Mobile software modules can have several implementations with different resource requirements and output qualities • Dependencies among modules are graphically captured in software module dependency graphs (DGs) specifying application operating modes and execution parameters • A set of networked processors for running application software

  21. Ardea Overview – cont. • A global system manager tracks status of hardware and software resources • System manager computes new system configurations (a mapping of software modules onto processing elements) • Local management tasks are responsible for OS scheduling and data routing • Target applications: real-time distributed embedded control/periodic applications

  22. Application Software Specification • Dependency graphs show the periodic flow of information from sensors to actuators (i.e., data pipelines) • Graph nodes: software modules, data variables, I/O devices, and dependency gates • Software modules: • Executable code schedulable on a processing element • Suspended while input(s) unavailable • Produce and consume data variables • Attributes: worst case execution time and rate factor

  23. Data Exchange • Data variables: • Application data between software modules • State data variables are local to a software module • Management data variables contain data consumed by the system manager • Attributes: • Size • Quality value or function Figure 5 - Page 19

  24. Specifying Dependencies • Dependency gates: • “k-out-of-n OR” gates: n > 0, 0 ≤ k ≤ n • “AND”: all input required • “XOR”: only one input required • “DEMUX”: for fanning out • OR gates can be specified to distribute inputs
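
The gate conditions on this slide can be sketched as small predicates over input availability. This is a minimal illustration, not Ardea's implementation: the function names are hypothetical, and AND is modeled here as the n-out-of-n case of the OR gate. XOR and DEMUX are left out because the slide does not say how an input is selected or how outputs are fanned out.

```python
from typing import Sequence


def k_out_of_n_or(inputs: Sequence[bool], k: int) -> bool:
    """Return True when at least k of the n inputs are available.

    Mirrors the slide's constraint: n > 0 and 0 <= k <= n.
    """
    n = len(inputs)
    if n == 0 or not 0 <= k <= n:
        raise ValueError("require n > 0 and 0 <= k <= n")
    return sum(1 for available in inputs if available) >= k


def and_gate(inputs: Sequence[bool]) -> bool:
    """AND: all inputs required (assumed equivalent to n-out-of-n)."""
    return k_out_of_n_or(inputs, len(inputs))


# Two of three redundant sensor inputs are available.
sensors = [True, False, True]
print(k_out_of_n_or(sensors, 2))  # a 2-out-of-3 gate is satisfied
print(and_gate(sensors))          # an AND gate is not
```

With k = 0 the gate is always satisfied, which would make every input optional for the consumer.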

  25. DG with Node Attributes • yaw_cntrl1: Exec_T = 900 cycles, Rate_factor = 1:5 • out1, out2: Criticality = critical, Priority = 1, Rate = 10 Hz, State = Enabled • rud_Angle1: Size = 2 bytes, Quality = 1 • mag_drv1, mag_drv2: Exec_T = 300 cycles, Rate_factor = n/a • yaw_history: Size = 8 bytes, Quality = n/a • servo1_drv, servo2_drv: Exec_T = 200 cycles, Rate_factor = 1:1 • yaw1, yaw2: Size = 2 bytes, Quality = 1, 2 • yaw_cntrl2: Exec_T = 400 cycles, Rate_factor = 1:2 • rud_Angle2: Size = 2 bytes, Quality = 2
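
One way to hold these node attributes in code is sketched below. The class and field names are hypothetical. Following the attribute lists on the two preceding slides, nodes with an execution time are treated as software modules and nodes with a size as data variables. The values are copied from the slide, with `None` standing in for "n/a".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SoftwareModule:
    node_id: str
    exec_cycles: int            # Exec_T from the slide, in cycles
    rate_factor: Optional[str]  # e.g. "1:5"; None where the slide says n/a


@dataclass(frozen=True)
class DataVariable:
    node_id: str
    size_bytes: int
    quality: Optional[int]      # None where the slide says n/a


MODULES: List[SoftwareModule] = [
    SoftwareModule("yaw_cntrl1", 900, "1:5"),
    SoftwareModule("yaw_cntrl2", 400, "1:2"),
    SoftwareModule("mag_drv1", 300, None),
    SoftwareModule("mag_drv2", 300, None),
    SoftwareModule("servo1_drv", 200, "1:1"),
    SoftwareModule("servo2_drv", 200, "1:1"),
]

VARIABLES: List[DataVariable] = [
    DataVariable("yaw1", 2, 1),
    DataVariable("yaw2", 2, 2),
    DataVariable("yaw_history", 8, None),
    DataVariable("rud_Angle1", 2, 1),
    DataVariable("rud_Angle2", 2, 2),
]


def total_variable_bytes(variables: List[DataVariable]) -> int:
    """Sum the sizes of all data variables in the graph."""
    return sum(v.size_bytes for v in variables)


print(total_variable_bytes(VARIABLES))
```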

  26. Ardea Fault Tolerance • Specifying static redundancy • Modular redundancy (e.g., TMR) • N-version programming • Specifying dynamic redundancy • Rollback • Roll forward • Checkpointing • Specifying graceful degradation • Multi-version software modules • Shedding non-critical services • Reducing update/output rate of services

  27. Ardea Fault Detection & Handling • Failure detection of sensors, actuators and software modules is the responsibility of application software • Ardea built-in fault detection: • PE crash failures detected by heartbeat messages • Network link failures detected and handled as PE failures • Software module crashes detected locally by module execution monitors • Critical output modules detect missed deadlines • Fault Handling: masking, reconfiguration, or fail-stop
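
Heartbeat-based detection of PE crash failures can be sketched as follows. This is a generic timeout monitor under stated assumptions, not Ardea's code: the class name, the timeout value, and the use of explicit timestamps are all illustrative choices.

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Tracks the last heartbeat time per PE and reports silent PEs."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_seen: Dict[str, float] = {}

    def beat(self, pe_id: str, now: float) -> None:
        """Record a heartbeat message received from a PE at time `now`."""
        self.last_seen[pe_id] = now

    def failed(self, now: float) -> List[str]:
        """Return PEs whose last heartbeat is older than the timeout."""
        return sorted(
            pe for pe, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


monitor = HeartbeatMonitor(timeout_s=0.5)
monitor.beat("pe0", now=0.0)
monitor.beat("pe1", now=0.0)
monitor.beat("pe0", now=0.4)      # pe1 stays silent
print(monitor.failed(now=0.7))    # pe1 has exceeded the timeout
```

Passing timestamps in explicitly keeps the sketch deterministic; a real monitor would read a system tick counter instead.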

  28. Sensor Fault Detection

  29. Actuator Fault Detection

  30. Software Fault Detection

  31. Triple Modular Redundancy
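
The figure for this slide is not part of the transcript. As a generic illustration of triple modular redundancy, and not necessarily how Ardea specifies it in a dependency graph, a majority voter over three replica outputs might look like this. The function name is hypothetical.

```python
from collections import Counter
from typing import Hashable


def tmr_vote(a: Hashable, b: Hashable, c: Hashable) -> Hashable:
    """Return the value at least two of the three replicas agree on.

    Raises ValueError when all three outputs differ, since a single
    fault can then no longer be masked.
    """
    value, count = Counter([a, b, c]).most_common(1)[0]
    if count < 2:
        raise ValueError("no majority among replica outputs")
    return value


# One replica produces a wrong value; the voter masks it.
print(tmr_vote(10, 10, 11))
```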

  32. Ardea Runtime Behavior • Supporting mobile software modules (moving object code, scheduling/unscheduling, and data re-routing) • Tracking resource availability • Finding configurations • Deploying configurations • Managing state data variables

  33. The System Manager

  34. Processing Elements (PEs) • Memory Loader: copies code into program memory • Scheduler: starts and stops execution of modules • Network Interface: handles public data variables (data routing)

  35. Mobility and Data Routing • Module I/O data passed through mailboxes • Data routing transparent to modules • Starting, stopping of modules Figure 26 - Page 61
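
A minimal sketch of mailbox-based routing that stays transparent to modules: producers publish by variable name only, and a routing table decides which PE-local mailbox receives the value. The `Router` class, its methods, and the policy of moving undelivered data along with a relocated module are assumptions made for illustration.

```python
from collections import defaultdict, deque
from typing import Any, Deque, Dict, List, Optional, Tuple


class Router:
    """Delivers published data variables to consumer mailboxes."""

    def __init__(self) -> None:
        self.location: Dict[str, str] = {}                 # module -> PE
        self.consumers: Dict[str, List[str]] = defaultdict(list)
        self.mailboxes: Dict[Tuple[str, str], Deque[Any]] = defaultdict(deque)

    def place(self, module: str, pe: str) -> None:
        """Place or move a module; carry over its unread mailbox data."""
        old = self.location.get(module)
        self.location[module] = pe
        if old is not None and old != pe:
            pending = self.mailboxes.pop((old, module), deque())
            self.mailboxes[(pe, module)].extend(pending)

    def subscribe(self, module: str, variable: str) -> None:
        self.consumers[variable].append(module)

    def publish(self, variable: str, value: Any) -> None:
        """Producers name the variable only, never the destination PE."""
        for module in self.consumers[variable]:
            if module in self.location:
                self.mailboxes[(self.location[module], module)].append(value)

    def receive(self, module: str) -> Optional[Any]:
        box = self.mailboxes[(self.location[module], module)]
        return box.popleft() if box else None


router = Router()
router.place("yaw_cntrl1", "pe0")
router.subscribe("yaw_cntrl1", "yaw1")
router.publish("yaw1", 42)
router.place("yaw_cntrl1", "pe1")   # module moves; producer is unaffected
router.publish("yaw1", 43)
print(router.receive("yaw_cntrl1"), router.receive("yaw_cntrl1"))
```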

  36. Scheduling and Unscheduling • Starting, stopping, and restarting modules • Restarting requires: • State preservation • Unprocessed data preservation
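
The two restart requirements, state preservation and unprocessed data preservation, can be sketched with a snapshot taken when a module is stopped. The `Module` class and the `stop`/`restart` helpers are hypothetical, and a running history list stands in for a state data variable.

```python
import copy
from typing import Any, Dict, List, Optional


class Module:
    """A toy module with one state variable and a queue of unread inputs."""

    def __init__(self, name: str,
                 state: Optional[Dict[str, Any]] = None,
                 pending: Optional[List[Any]] = None) -> None:
        self.name = name
        self.state = state if state is not None else {"history": []}
        self.pending = pending if pending is not None else []

    def step(self) -> bool:
        """Consume one pending input into the state; False if none left."""
        if not self.pending:
            return False
        self.state["history"].append(self.pending.pop(0))
        return True


def stop(module: Module) -> Dict[str, Any]:
    """Snapshot the state data and the inputs not yet processed."""
    return {
        "name": module.name,
        "state": copy.deepcopy(module.state),
        "pending": list(module.pending),
    }


def restart(snapshot: Dict[str, Any]) -> Module:
    """Recreate the module (possibly on another PE) from a snapshot."""
    return Module(snapshot["name"],
                  copy.deepcopy(snapshot["state"]),
                  list(snapshot["pending"]))


original = Module("yaw_cntrl1")
original.pending.extend([1, 2, 3])
original.step()
moved = restart(stop(original))
moved.step()
print(moved.state["history"], moved.pending)
```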

  37. Reconfiguration Policies • Two configuration finding algorithms: • High-fidelity (NP-hard) to find high-utility configurations • Low-fidelity (fast) to ensure running of critical services • Response based on criticality of detected/reported fault • Deploying configurations starting from the sensor side of a DG
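
A fast, low-fidelity configuration finder could be as simple as a greedy placement that handles critical modules first and sheds non-critical ones that do not fit. The sketch below is an assumed heuristic, not the algorithm from the presentation. The module records, the integer load units, and the choice of returning `None` to signal fail-stop are all illustrative.

```python
from typing import Dict, List, Optional


def find_configuration(modules: List[dict],
                       capacity: Dict[str, int]) -> Optional[Dict[str, str]]:
    """Greedily map modules onto PEs, critical modules first.

    Returns a module -> PE mapping, or None when a critical module
    cannot be placed (the caller would then fail-stop).
    """
    if not capacity:
        return None
    remaining = dict(capacity)
    mapping: Dict[str, str] = {}
    # Critical modules first, then larger loads first within each class.
    ordered = sorted(modules, key=lambda m: (not m["critical"], -m["load"]))
    for m in ordered:
        pe = max(remaining, key=remaining.get)  # PE with most spare capacity
        if remaining[pe] >= m["load"]:
            mapping[m["id"]] = pe
            remaining[pe] -= m["load"]
        elif m["critical"]:
            return None
        # Non-critical modules that do not fit are shed.
    return mapping


modules = [
    {"id": "yaw_cntrl", "critical": True, "load": 50},
    {"id": "servo_drv", "critical": True, "load": 30},
    {"id": "camera", "critical": False, "load": 60},
]
print(find_configuration(modules, {"pe0": 100, "pe1": 100}))
print(find_configuration(modules, {"pe0": 100}))  # after losing pe1
```

In the second call the non-critical camera module is dropped while both critical modules keep running, which is the graceful degradation behavior described earlier in the talk.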

  38. Reconfiguration Policies

  39. Ardea Benefits • More flexible fault tolerance at reduced cost • Ability to analyze reconfigurable architectures using DGs • Simplified debugging and maintenance • Runtime system testing • Graceful upgrade and repair • Reduction of design errors • Software reusability

  40. Current Work • Applying techniques to a UAV for AUVSI student UAV competition. • Avionics system for BIG BLUE IV. • “READY” UAV Project • Expand bus via wireless link to the ground: • Rapid prototyping • Minimize risk to hardware • Flexible Reconfiguration

  41. Conclusion • Graceful degradation in distributed embedded systems is a new research area currently focusing on either abstract modeling or on non-real-time/non-critical systems • Ardea provides a structured framework for the design and implementation of real-time systems • Dependency graphs for application software specification • A software layer supporting relocatable software modules, fault recognition, and handling
