A Testing Framework for Reproducible Execution and Race Condition Detection in Real-time Embedded Systems
Ken Chen, JSC; Eric Wong, UT at Dallas; Yann-Hang Lee, ASU
Motivation • Real-time embedded systems are widely deployed in NASA missions • Manned or unmanned space vehicles • They often exhibit time-dependent, non-deterministic behavior and are thus extremely difficult to test • Threads may interact unpredictably due to scheduling and synchronization • Interaction with the physical environment can be unpredictable, for example interrupts, timers, and changes in sensor values • How can the temporal behavior of real-time embedded systems be verified in the presence of non-determinism? Put differently: will the software behave similarly when the interval between the arrivals of two interrupt events is 1 sec, 2 sec, 3 sec, ...?
Challenges • How to characterize a non-deterministic execution caused by temporal dependency? • How to control an otherwise non-deterministic execution so that it can be reproduced (for debugging and test analysis)? • How to derive the possible deviations of a non-deterministic execution?
A Testing Framework • A platform-independent approach to deterministic execution • Can trace/replay an execution or force a specified test sequence to be exercised • Follows the same synchronization and IO event sequences • Time-stamped events to recoup timing information • Works at a higher level of abstraction • A systematic approach to derive possible deviations of a non-deterministic execution • Based on static/dynamic code analysis • Does not require any formal specification of the system behavior
The Testing Process • [Diagram: conduct a test run; synchronization and I/O event trace; reproducible execution; dynamic and race analyses; race variants]
Overview of Tool Environment • [Diagram: instrumentation; run test cases in target environment; dynamic analysis (execution flow, timing, synchronization, and I/O operations); static analysis (control flow and data dependence); model of events and program execution; model deduction from multiple test runs; create new event occurrences from uncovered intervals; timing and race condition verification; analysis]
Record/Replay Framework • An execution trace is needed for race condition analysis (event type and event sequence, plus timing) • Record the event sequence between threads and with the environment • Replay to reproduce the identical sequence or (relative) timing • Dynamic analysis such as coverage and slicing, test case generation, debugging • Related work • Software Instruction Counter (counts backward branches and subroutine calls) • Deterministic Java Replay Utility (KVM, an interpreter) • Complete System Simulation (instruction set simulation) • Time Machine (register context)
System Architectures • Relative versus Exact Replay Execution • [Diagram: two software stacks, labeled Relative Execution (OS_level Replay) and Exact Execution (Interrupt Replay). Each shows App 1, App 2, App 3, and a System Task above a System Call Recorder, VxWorks, IO Driver, and Board Support Packages; a System Call Generator also appears]
OS_level Record/Replay • Record event start and end marks • During replay, execution defers until the next event • During replay, all results are returned from the buffer • Cooperative execution based on event order • [Diagram: an event log holding events 1 to 8, shown once during recording and once during replaying]
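The replay rules on this slide can be illustrated with a small sketch. It is not the VxWorks implementation: `EventLog`, `call`, and `replay_demo` are hypothetical names, Python threads stand in for tasks, and only the event order and results are logged (start and end marks are not modeled). During replay, a task that reaches a call before its turn is deferred until the log says its event is next, and the result comes from the log instead of the real call.

```python
import threading

class EventLog:
    """Records (task, call, result) tuples, or replays them in recorded order."""

    def __init__(self, recorded=None):
        self.replaying = recorded is not None
        self.events = list(recorded) if recorded else []
        self.cursor = 0                 # index of the next event to replay
        self.replayed = []              # events in the order they were replayed
        self.cond = threading.Condition()

    def _my_turn(self, task):
        return self.cursor >= len(self.events) or self.events[self.cursor][0] == task

    def call(self, task, name, func, *args):
        with self.cond:
            if not self.replaying:
                result = func(*args)    # really perform the call and log it
                self.events.append((task, name, result))
                return result
            # Replay: defer this task until the next logged event is its own.
            self.cond.wait_for(lambda: self._my_turn(task))
            if self.cursor >= len(self.events):
                raise RuntimeError("no recorded event left for " + task)
            event = self.events[self.cursor]
            assert event[1] == name, "replay diverged from the recording"
            self.cursor += 1
            self.replayed.append(event)
            self.cond.notify_all()
            return event[2]             # result comes from the buffer

def replay_demo():
    rec = EventLog()                    # recording run: A, B, A
    rec.call("A", "read", lambda: 10)
    rec.call("B", "read", lambda: 20)
    rec.call("A", "read", lambda: 30)

    rep = EventLog(recorded=rec.events)
    results = {"A": [], "B": []}

    def task(name, n_calls):
        for _ in range(n_calls):
            # The lambda is never invoked during replay.
            results[name].append(rep.call(name, "read", lambda: -1))

    # Start B first on purpose; the log still forces the order A, B, A.
    threads = [threading.Thread(target=task, args=("B", 1)),
               threading.Thread(target=task, args=("A", 2))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return rep.replayed, results
```

Calling `replay_demo()` returns the replayed event order and the per-task results; both match the recording regardless of how the threads are scheduled.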
Framework • [Diagram: the target system (IXP1200) runs App 1, App 2, App 3, a System Task, and a Record/Replay Task on VxWorks with an IO Driver and Board Support Packages; other labels are Serial Link, Workstation, Ethernet, Router, and Server]
Design Considerations for AERCam • Scheduling • Preemptive Priority Scheduling • Ready, Pending, Delay, Suspend • Execution Context • Priority 0 to 255, System Tasks use 0 to 100 • Task Name • IPC • Application versus Interrupt Context • Timeout • Signals • Synchronous versus Asynchronous • Generation & Delivery • Current implementation supports semTake, semGive, msgQSend, msgQReceive, signal, kill, read, and write
Interrupt Record/Replay • Record interrupts, system calls, and context switches • Replay the exact execution sequence • Re-execute all system calls • [Diagram with two panels, Interrupt Recording and Interrupt Replaying. Labels include Reserved Memory, IPC Recorder, Interrupt Recorder, VxWorks _intEnt, Device ISR, IRQ 2, Scheduler, Vector Table, Event Log (Interrupt Event, IPC 1 Event, IPC 2 Event), VxWorks Exception Handler, Task Manager, Breakpoint, and Program Code]
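The idea of exact replay can be shown with a toy model. Everything here is an assumption made for illustration: the "program" is a loop, its loop counter stands in for a position in the instruction stream, and `run` and `hardware` are hypothetical names. The recording run logs where each interrupt landed; the replay run injects the same interrupts at the same positions and re-executes everything else.

```python
import random

def run(n_steps, irq_source):
    """Toy program: irq_source(step) returns an IRQ number or None."""
    state, log = 0, []
    for step in range(n_steps):
        irq = irq_source(step)
        if irq is not None:
            log.append((step, irq))     # record where the interrupt landed
            state = state * 2 + irq     # ISR: its effect depends on timing
        state += 1                      # ordinary application work
    return state, log

def hardware(seed):
    """Stand-in for asynchronous devices: fires at pseudo-random steps."""
    rng = random.Random(seed)
    return lambda step: rng.choice([1, 2]) if rng.random() < 0.2 else None

# Recording run against the simulated hardware.
recorded_state, event_log = run(50, hardware(seed=7))

# Replay run: no hardware, interrupts are injected from the log.
replay_table = dict(event_log)          # step -> IRQ to inject
replayed_state, replayed_log = run(50, replay_table.get)
```

Because the interrupts are delivered at the recorded positions, the replay run ends in the same state as the recording run.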
RESTA • A tool suite for Real-time Embedded Software Testing and Analysis • Challenges • A massive amount of execution trace is collected that is not only complex but also difficult to interpret • An urgent need to provide a methodology, supported by a tool, to automatically analyze the data and present them from different views for better understanding • Twofold objectives • Visually re-create the program execution to gain insight into its dynamic behavior • Present data retrieved from the execution trace in different perspectives to aid in quality assurance, performance improvement, etc.
Our Approach • Take advantage of the concept of “Data Visualization” • Experience has shown that graphical visualization helps significantly in understanding complex phenomena and large amounts of complex data • Through different visualizations of the execution trace, various graphs are generated to help deduce what really happened during program execution • Features in RESTA • Message Graph • Race Condition Graph • Semaphore Graph • Task Active Graph • Program Execution Graph (Coverage Summary and Source Code Display Graphs)
Design and Implementation Philosophy (1) • Portability • RESTA is implemented in Java, which can run on many different platforms such as Windows, UNIX, and Linux. • Scalability • Many existing tools are not very scalable. Their limitations are readily apparent when large, complex data are analyzed. • Data collected in our studies can be large and complex. • Good Visualization • All visual displays should be intuitively meaningful. • Ease of Understanding • Information presented by each graph should be self-explanatory and obvious to its users • Even if a user has to spend a little effort to understand a graph the first time, the same user should easily recall what he or she learned when seeing the graph again
Design and Implementation Philosophy (2) • Ease of Use • The use of a tool should reduce, not increase, either the stress or the boredom of its users • Provide interactive, mouse-click- or menu-oriented interfaces for invoking different features with customized ... • Diversity • No single graph can offer a full view of the behavior and the data associated with the program execution • Provide different graphs to represent views from different perspectives • Extensibility • New features for different views will continuously be included • RESTA should adopt a flexible architecture/design to make such extension easy and feasible
Message Graph (1) • Displaying message-passing between different tasks • Assume a program with seven tasks running concurrently on a single CPU and the following message-passing between different tasks
(1) Task 1 sends a message to Task 2
(2) Task 3 sends a message to Task 2
(3) Task 2 receives the message sent by Task 3 at Step (2)
(4) Task 2 receives the message sent by Task 1 at Step (1)
(5) Task 1 sends a message to Task 3
(6) Task 3 receives the message sent by Task 1 at Step (5)
(7) Task 6 sends a message to Task 3
(8) Task 7 sends a message to Task 3
(9) Task 3 receives the message sent by Task 7 at Step (8)
(10) Task 3 receives the message sent by Task 6 at Step (7)
Race Condition Graph (1) • Displaying possible race conditions due to message-passing between different tasks • Due to the synchronization among different senders and receivers • Two receivers with possible race conditions are highlighted in red
Race Condition Graph (2) • Clicking on the red arrow of the first receiver displays its race conditions as follows • The receiver is surrounded by a red square. • The two competing senders are circled in green. • The corresponding source code of this receiver and the two senders is displayed in red and green, respectively, in another pop-up window. • [Figure: race conditions with respect to the first receiver, highlighted in red]
Race Condition Graph (3) • Clicking on the red arrow of the second receiver displays its race conditions as follows • [Figure: the receiver (Task 3) may receive a message from Tasks 1, 6, or 7]
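The message races in this example can be reproduced with a short sketch. The slides do not say how RESTA finds them, so this uses a generic happens-before check with vector clocks as a stand-in: a send competes for a receive if it targets the same task, was not consumed by an earlier receive of that task, and is not causally after the receive. The names `TRACE`, `happens_before`, and `competing_senders` are hypothetical; the trace is steps (1) to (10) of the Message Graph example.

```python
from collections import defaultdict

# ("send", sender, receiver) or ("recv", receiver, sender actually received from)
TRACE = [
    ("send", 1, 2), ("send", 3, 2), ("recv", 2, 3), ("recv", 2, 1),
    ("send", 1, 3), ("recv", 3, 1), ("send", 6, 3), ("send", 7, 3),
    ("recv", 3, 7), ("recv", 3, 6),
]

def happens_before(a, b):
    """True if vector clock a is causally before vector clock b."""
    return a != b and all(n <= b.get(task, 0) for task, n in a.items())

def competing_senders(trace):
    clock = defaultdict(dict)   # task -> current vector clock
    sends = []                  # (sender, receiver, clock at send)
    consumed = set()            # indices into `sends` already received
    recvs = []                  # (receiver, clock at receive, consumed before it)
    for kind, task, peer in trace:
        clock[task][task] = clock[task].get(task, 0) + 1
        if kind == "send":
            sends.append((task, peer, dict(clock[task])))
            continue
        before = set(consumed)
        # Match the oldest unconsumed message from `peer` to `task`.
        i = next(i for i, (src, dst, _) in enumerate(sends)
                 if src == peer and dst == task and i not in consumed)
        consumed.add(i)
        for t, n in sends[i][2].items():        # merge the sender's clock
            clock[task][t] = max(clock[task].get(t, 0), n)
        recvs.append((task, dict(clock[task]), before))

    result = defaultdict(set)
    for receiver, rclock, before in recvs:
        candidates = {src for i, (src, dst, sclock) in enumerate(sends)
                      if dst == receiver and i not in before
                      and not happens_before(rclock, sclock)}
        if len(candidates) > 1:                 # more than one sender could win
            result[receiver] |= candidates
    return {task: sorted(srcs) for task, srcs in result.items()}

races = competing_senders(TRACE)
```

For this trace, `races` is `{2: [1, 3], 3: [1, 6, 7]}`: Task 2 has two competing senders, and Task 3 may receive from Tasks 1, 6, or 7, which agrees with the two receivers highlighted in the Race Condition Graph.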
Semaphore Graph • Displaying how tasks take and give semaphores • Example: mutual-exclusion semaphores • Task 2 takes SEM1 at t_a and gives it at t_g • Task 3 waits for SEM1 from t_b to t_g before it can take the semaphore at t_g
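The hold and wait intervals that such a graph draws can be derived from a take/give trace. The sketch below is illustrative only: `semaphore_intervals` is a hypothetical helper, waiting tasks are served in FIFO order (an assumption; a real RTOS may queue by priority), and the numeric times 1, 3, 6, and 8 are invented stand-ins for the symbolic times in the example.

```python
def semaphore_intervals(events):
    """events: (time, task, op, sem) sorted by time, op is "take" or "give".

    A "take" event marks the moment the task calls take; it may then block.
    Returns (holds, waits), each a list of (task, sem, start, end).
    """
    owner = {}      # sem -> (task, time acquired)
    waiting = {}    # sem -> list of (task, time the wait began)
    holds, waits = [], []
    for time, task, op, sem in events:
        if op == "take":
            if sem in owner:
                waiting.setdefault(sem, []).append((task, time))
            else:
                owner[sem] = (task, time)
        elif op == "give":
            holder, since = owner.pop(sem)
            holds.append((holder, sem, since, time))
            if waiting.get(sem):
                nxt, began = waiting[sem].pop(0)    # FIFO hand-off (assumption)
                waits.append((nxt, sem, began, time))
                owner[sem] = (nxt, time)
    return holds, waits

# Task 2 takes SEM1 at 1 and gives it at 6; Task 3 asks for it at 3.
events = [(1, 2, "take", "SEM1"), (3, 3, "take", "SEM1"),
          (6, 2, "give", "SEM1"), (8, 3, "give", "SEM1")]
holds, waits = semaphore_intervals(events)
```

Here `waits` contains the single interval in which Task 3 is blocked on SEM1 (from 3 to 6), and `holds` contains the two ownership intervals.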
Task Active Graph • Displaying when each task is active
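One way to obtain the data behind such a graph is to turn a context-switch log into per-task active intervals. This is a minimal sketch under assumptions: the log format (time, task that starts running) and the helper name `active_intervals` are hypothetical, and a single CPU is assumed so that exactly one task is active at a time.

```python
def active_intervals(switches, end_time):
    """switches: list of (time, task), meaning `task` starts running at `time`."""
    intervals = {}
    following = switches[1:] + [(end_time, None)]
    for (start, task), (stop, _) in zip(switches, following):
        # A task stays active until the next context switch.
        intervals.setdefault(task, []).append((start, stop))
    return intervals

timeline = active_intervals([(0, "T1"), (4, "T2"), (7, "T1")], end_time=10)
```

Each entry of `timeline` is the list of bars that would be drawn for one task.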
Program Execution Graph: Coverage Summary • Providing a visualization of how the program is executed by each task • Coverage Summary Graph • Report code coverage (basic block and decision) for each task • Other criteria such as all-synchronizable-sender-receiver-pairs, all-concurrency-paths, etc. • Coverage with respect to the entire program • Coverage with respect to the modules executed by each task
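Basic-block coverage per task, with respect to the entire program, reduces to a set computation. The sketch below assumes block identifiers are available from instrumentation; `coverage_summary` and the sample block sets are hypothetical, and decision coverage or the concurrency criteria listed above are not modeled.

```python
def coverage_summary(all_blocks, executed_by_task):
    """Return (percent covered per task, percent covered by all tasks together)."""
    total = len(all_blocks)
    per_task = {task: 100.0 * len(blocks & all_blocks) / total
                for task, blocks in executed_by_task.items()}
    covered = set().union(*executed_by_task.values()) & all_blocks
    return per_task, 100.0 * len(covered) / total

all_blocks = set(range(1, 11))                       # ten basic blocks
executed = {"T1": {1, 2, 3, 4}, "T2": {3, 4, 5, 6, 7}}
per_task, overall = coverage_summary(all_blocks, executed)
```

Blocks executed by several tasks count once in the overall figure, so the overall coverage is generally less than the sum of the per-task values.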
Program Execution Graph: Source Code Display • Code in white has already been executed by the task • Code in color is prioritized by how much it would increase coverage • Priorities are computed using a dominator/superblock analysis • Priorities are displayed as numbers in the color spectrum at the top
Summary • Our overall objective is to provide reproducible execution and graphical displays/summaries of various analyses, in order to study program behavior and improve its quality, dependability, safety, performance, etc. • Current work • A layered testing environment • Finalize the “prefix test sequence plus non-deterministic run” method to identify various execution paths caused by timing variants (in compliance with an input event constraint model) • Enhance the current version of RESTA by including additional analyses • Conduct a case study on AERCam (Autonomous Extravehicular Robotic Camera) • Work with the NASA JSC quality assurance engineering team to apply our research results to the AERCam real-time simulation/testing environment, thereby realizing autonomous operation capabilities with a high level of assurance • Publish and present our research results