Alternative Majority-Voting Methods for Real-Time Computing Systems

Paper by: Kang G. Shin and James W. Dolter
Presented by: Chad Meador

Introduction
• The need for reliable computers in support of real-time control applications is growing quickly.
• Errors are traditionally masked in these systems by:
  • Replicating the application and system tasks.
  • Distributing tasks on independent hardware.
  • Combining the replicated results.

Introduction
• Two methods to combine the replicated results are:
  • A synchronous vote applied to the replicated output.
    • Advantage: simpler implementation.
    • Disadvantage: requires synchronization overhead.
  • A combination of the output in an asynchronous fashion.
    • Advantage: takes full advantage of the asynchronous nature of distributed environments.
    • Disadvantage: lacks application independence.

Implemented Synchronous Vote Architectures
• Fault Tolerant Multi-Processor (FTMP)
• Fault Tolerant Processor (FTP)
• Software Implemented Fault Tolerance Computer (SIFT)
• Difference – where to place the synchronous vote and how often to perform it.

FTMP and FTP
• Vote on information entering the processors from replicated buses.
• Allows the vote to be performed in hardware, transparent to the application software.
• Requires processors to be in tight synchronization and thus cannot be extended to a distributed environment.

SIFT
• A time-frame structure is used: messages are exchanged and a vote is performed.
• The time frame places restrictions on the structure of the application software and hampers application independence.

Asynchronous Voting
• Run replicated tasks asynchronously and vote on the results.
• Suffers from the problem of specifying and implementing suitable algorithms to derive a single output request from the multiple output requests produced by the replicated tasks.

Asynchronous Voting
• Requests can differ since processors run asynchronously with respect to one another and sample the input independently.
• Once an algorithm is developed, the resulting systems require great care for even minor modifications.
• Thus, they are highly application dependent.

Synchronous & Asynchronous Compromise
• Two techniques incorporate advantages from both synchronous and asynchronous voting:
  • Quorum-Majority Voting (QMV)
  • Comparative-Majority Voting (CMV)
• They provide a compromise between a tightly synchronized system with high overhead and an asynchronous system that lacks suitable algorithms for combining data.

Problem Statement
• Primary goal – provide an environment in which real-time applications interfaced to life-critical functions can be executed correctly in the presence of malfunctioning hardware.
• Correct operation involves masking up to a given number of faults while satisfying the real-time constraints of the applications.
• Secondary goal – provide application independence, allowing a broader class of problems to be handled.

Questions To Be Addressed
• If the tight synchronization of a system is relaxed, which is desirable to lower synchronization overhead, then:
  • “When and how does the system sample the input sensors to provide a single input datum to each of the replicated tasks?”
  • “What action should be taken if a task is not yet ready when the input is sampled?”
  • “When and how does the system resolve the replicated output data and issue a single output action honoring the individual requests?”

Assumed Target Architecture In QMV And CMV Development

Processor Pool
• Each processor has its own clock, but the system is neither completely asynchronous nor tightly synchronized.
• This is achieved through infrequent clock synchronization in software.
• Clocks in any two processors may differ by at most some maximum skew.
• There must also be a fault-tolerant software clock synchronization algorithm allowing any necessary system resynchronization.
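The bounded-skew condition can be stated concretely. The following is only a minimal sketch: the names `within_skew` and `max_skew` and the microsecond readings are illustrative assumptions, and the check is not the clock synchronization algorithm itself.

```python
from itertools import combinations

def within_skew(clock_readings, max_skew):
    """Return True if every pair of clock readings differs by at most max_skew."""
    return all(abs(a - b) <= max_skew for a, b in combinations(clock_readings, 2))

# Hypothetical readings (microseconds) taken from four processors at one instant.
readings = [1_000_000, 1_000_040, 999_980, 1_000_015]

print(within_skew(readings, max_skew=100))  # True: the largest difference is 60
print(within_skew(readings, max_skew=50))   # False: 1_000_040 - 999_980 = 60 > 50
```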

Semi-intelligent I/O Controllers
• Most important addition.
• Must know which tasks will request service for devices under their control.
• Must maintain lists of requests and resolve the requests into a single action.
• This addition serves to insulate the I/O devices from the actions of a single processor.

Secure Communication
• Must be available for use in processor-I/O control messages.
• Must have the capability of detecting the origin and any modifications of the messages.
• Achieved using a suitable encryption scheme.
• Secure communication allows the I/O controllers to detect multiple messages from faulty processors.
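The slides say only that a "suitable encryption scheme" is used. As one possible illustration of detecting both origin and modification, the sketch below substitutes a keyed message authentication code (HMAC-SHA256 from the Python standard library) with one secret key per processor. The key table, processor names, and message layout are all hypothetical.

```python
import hashlib
import hmac

def sign(key: bytes, payload: bytes) -> bytes:
    """Compute an authentication tag for payload under the sender's key."""
    return hmac.new(key, payload, hashlib.sha256).digest()

def verify(keys: dict, sender: str, payload: bytes, tag: bytes) -> bool:
    """Accept a message only if the tag matches the claimed sender's key."""
    key = keys.get(sender)
    if key is None:
        return False  # unknown origin
    return hmac.compare_digest(sign(key, payload), tag)

# Hypothetical per-processor keys known to the I/O controller.
keys = {"P1": b"key-for-p1", "P2": b"key-for-p2"}
msg = b"task=7;req=3;value=42"
tag = sign(keys["P1"], msg)

print(verify(keys, "P1", msg, tag))                       # genuine message
print(verify(keys, "P2", msg, tag))                       # wrong claimed origin
print(verify(keys, "P1", b"task=7;req=3;value=43", tag))  # modified payload
```

Because each accepted message is bound to exactly one sender, a controller built this way could also count how many requests a given processor has issued.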

Replicated Sensors
• Sensors obtaining replicated data must be placed in different I/O controllers.
• This ensures that failure of an I/O controller does not affect the ability of an application task to obtain data from sensors.
• Data from replicated sensors is handled in the applications and not by the I/O controllers, preserving application independence and allowing flexibility in managing redundant sensors.

Real-Time Task Structure
• Primary interest is in real-time applications.
• The scope of the applications is restricted to an assumed structure of real-time tasks that hinges on two assumptions:
  • Real-time applications can be naturally decomposed into tasks with three phases.
  • Real-time applications have a single-source/single-sink I/O requirement.

Task Structure

The Three Phases
• Input Phase – Input operations collect the data from a single (but replicated) input source.
• Computation Phase – Begins once the data is available; this phase is further assumed to be non-preemptive.
• Output Phase – The task concludes with an output to a single device.
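The three phases can be pictured as a task skeleton. This is a minimal sketch under the single-source/single-sink assumption; `run_task` and its three callables are hypothetical names, not an interface from the paper.

```python
from typing import Callable, List

def run_task(read_input: Callable[[], float],
             compute: Callable[[float], float],
             write_output: Callable[[float], None]) -> None:
    """Run one task instance through its three phases in order."""
    datum = read_input()       # input phase: one datum from a single source
    result = compute(datum)    # computation phase: runs to completion
    write_output(result)       # output phase: one write to a single device

# Illustrative stand-ins for a sensor read and an actuator write.
outputs: List[float] = []
run_task(lambda: 21.0, lambda x: 2 * x, outputs.append)
print(outputs)  # [42.0]
```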

Request Tag Structure

QMV and CMV Naming Requests
• The OS and underlying architecture must support several naming conventions used in the implementation of QMV and CMV.
• Tasks are uniquely identified with task IDs.
• Replicated tasks have replication IDs.
• Request IDs are a combination of the address offset from the virtual address space and the number of I/O requests from the particular image of the task.
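The naming scheme above can be sketched as a pair of record types. The field names, types, and the `same_logical_request` helper are assumptions made for illustration; the actual tag layout is not given in the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RequestId:
    address_offset: int  # offset within the task's virtual address space
    io_count: int        # number of I/O requests issued so far by this image

@dataclass(frozen=True)
class RequestTag:
    task_id: int         # identifies the task
    replication_id: int  # identifies which replicated image sent the request
    request_id: RequestId

    def same_logical_request(self, other: "RequestTag") -> bool:
        """True when two tags name the same request from different images."""
        return (self.task_id == other.task_id
                and self.request_id == other.request_id)

a = RequestTag(task_id=7, replication_id=0, request_id=RequestId(0x40, 3))
b = RequestTag(task_id=7, replication_id=1, request_id=RequestId(0x40, 3))
print(a.same_logical_request(b))  # True: same task and request, different image
```

Under this reading, an I/O controller would group incoming requests by task ID and request ID, and use the replication ID to tell the images apart.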

Quorum Majority Voting (QMV)
• Ensures that I/O operations are triggered only on the behavior of a quorum of proper participants.
• For each request of service, a quorum of the replicated images must issue the request before an action takes place.
• Once quorum is established, the nature of the operation is decided:
  • For input, the sample is taken and sent.
  • For output, the received data values are voted upon and the resulting action taken.

Quorum Majority Voting (QMV)
• To tolerate k faulty task images, the non-faulty images must be able to control establishment of a quorum.
• Once quorum is formed, the non-faulty images must be able to dominate the majority vote for the selection of the operation.

Quorum Majority Voting (QMV)
• 3k+1 replicated images are required to tolerate the behavior of k faulty images.
• 2k+1 non-faulty images are needed in the case that all k faulty images choose to abstain from making a request.
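One way to read the QMV counts is the sketch below for an output request. It assumes a quorum of 2k+1 distinct images, only the first request from each image being counted, and an exact-match vote that needs k+1 agreeing values. The function name `qmv_resolve` and these details are illustrative assumptions, not the paper's specification.

```python
from collections import Counter

def qmv_resolve(requests, k):
    """Resolve one output request under a quorum-then-vote rule.

    requests: (replication_id, value) pairs in arrival order.
    Returns the voted value once 2k+1 distinct images have issued the
    request, or None if no quorum forms or no value has k+1 votes.
    """
    quorum = 2 * k + 1
    seen = {}
    for replication_id, value in requests:
        if replication_id not in seen:  # repeats from one image are ignored
            seen[replication_id] = value
        if len(seen) == quorum:
            winner, count = Counter(seen.values()).most_common(1)[0]
            return winner if count >= k + 1 else None
    return None

# k = 1: four images (3k+1), quorum of three (2k+1); image 3 is faulty.
print(qmv_resolve([(0, 10), (3, 99), (1, 10)], k=1))           # 10
print(qmv_resolve([(0, 10), (1, 10), (2, 10)], k=1))           # 10, image 3 abstains
print(qmv_resolve([(3, 99), (3, 99), (3, 99), (0, 10)], k=1))  # None, no quorum
```

With at most k faulty images inside any quorum of 2k+1, at least k+1 members are non-faulty, which is why the correct value can win the vote.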

Comparison Majority Voting (CMV)
• In contrast to QMV, CMV requires only 2k+1 replicated images, but handles requests for I/O operations differently.
• In CMV, I/O controllers wait for k+1 requests for I/O with the same data.
  • Input Request – Data is sampled and sent to all 2k+1 images.
  • Output Operation – Data is sent to the specified device.
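The CMV rule for an output operation can be sketched as grouping requests by their data and acting as soon as one group holds k+1 distinct images. The name `cmv_resolve` and the use of exact equality to compare data are assumptions for illustration.

```python
from collections import defaultdict

def cmv_resolve(requests, k):
    """Return the first data value requested by k+1 distinct images.

    requests: (replication_id, value) pairs in arrival order.
    Returns None if no value gathers k+1 distinct supporters.
    """
    supporters = defaultdict(set)  # data value -> images that requested it
    for replication_id, value in requests:
        supporters[value].add(replication_id)
        if len(supporters[value]) == k + 1:
            return value
    return None

# k = 1: three images (2k+1); image 2 is faulty.
print(cmv_resolve([(2, 99), (0, 10), (1, 10)], k=1))  # 10
print(cmv_resolve([(2, 99), (2, 99), (0, 10)], k=1))  # None, repeats do not count
```

Since at most k images are faulty, no incorrect value can collect k+1 distinct supporters, so the controller does not need to wait for a full quorum.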

Comparison Majority Voting (CMV)
• Disadvantages of CMV:
  • I/O controllers have to partition incoming requests into equivalence classes as defined by the data of the requests.
  • These operations are not trivial and could increase the hardware complexity compared to the requirements for QMV.

Timing Bounds Conclusions
• Time bounds suggest that it is possible to reduce the overhead for QMV to below that for synchronous voting.
• All of the bounds assume that the computation phase is non-preemptive and that there is no multi-tasking.

Conclusions
• Two techniques provide a compromise between a tightly synchronized and an asynchronous system.
• QMV and CMV are most applicable to distributed real-time systems with single-source/single-sink tasks.
• All real-time systems eventually have to resolve their inputs into a single action at some stage.

Questions?
