Alternative Majority-Voting Methods for Real-Time Computing Systems
Paper by: Kang G. Shin and James W. Dolter
Presented by: Chad Meador
Introduction
• The need for reliable computers in support of real-time control applications is growing quickly.
• Errors are traditionally masked in these systems by:
  • Replicating the application and system tasks.
  • Distributing the tasks on independent hardware.
  • Combining the replicated results.
Introduction
• Two methods to combine the replicated results are:
  • A synchronous vote applied to the replicated output.
    • Advantage: simpler implementation.
    • Disadvantage: requires synchronization overhead.
  • A combination of the output in an asynchronous fashion.
    • Advantage: takes full advantage of the asynchronous nature of distributed environments.
    • Disadvantage: lacks application independence.
Implemented Synchronous Vote Architectures
• Fault Tolerant Multi-Processor (FTMP)
• Fault Tolerant Processor (FTP)
• Software Implemented Fault Tolerance Computer (SIFT)
• Difference: where to place and how often to perform the synchronous vote.
FTMP and FTP
• Vote on information entering the processors from replicated buses.
• Allows the vote to be performed in hardware, transparent to the application software.
• Requires processors to be in tight synchronization and thus cannot be extended to a distributed environment.
SIFT
• A time-frame structure is used.
• Messages are exchanged and a vote is performed.
• The time frame places restrictions on the structure of the application software and hampers application independence.
Asynchronous Voting
• Run replicated tasks asynchronously and vote on the results.
• Suffers from the problem of specifying and implementing suitable algorithms to derive a single output request from the multiple output requests produced by the replicated tasks.
Asynchronous Voting
• Requests can differ because processors run asynchronously with respect to one another and sample the input independently.
• Once an algorithm is developed, the resulting systems require great care for even minor modifications.
• Thus, they are highly application dependent.
Synchronous & Asynchronous Compromise
• Two techniques incorporate advantages from both synchronous and asynchronous voting:
  • Quorum-Majority Voting (QMV)
  • Comparative-Majority Voting (CMV)
• They provide a compromise between a tightly synchronized system with high overhead and an asynchronous system that lacks suitable algorithms for combining data.
Problem Statement
• Primary goal: provide an environment in which real-time applications interfaced to life-critical functions can be executed correctly in the presence of malfunctioning hardware.
• Correct operation involves masking up to a given number of faults while satisfying the real-time constraints of the applications.
• Secondary goal: provide application independence, allowing a broader class of problems to be handled.
Questions To Be Addressed
• Relaxing the tight synchronization of a system is desirable because it lowers synchronization overhead, but it raises three questions:
  • “When and how does the system sample the input sensors to provide a single input datum to each of the replicated tasks?”
  • “What action should be taken if a task is not yet ready when the input is sampled?”
  • “When and how does the system resolve the replicated output data and issue a single output action honoring the individual requests?”
Assumed Target Architecture In QMV And CMV Development
Processor Pool
• Each processor has its own clock, but the system is neither completely asynchronous nor tightly synchronized.
• This is achieved through infrequent clock synchronization in software.
• Clocks in any two processors may differ by at most some maximum skew.
• There must also be a fault-tolerant software clock synchronization algorithm that allows any necessary system resynchronization.
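The maximum-skew condition can be written as a one-line check. This is only an illustrative sketch: the function name, the time unit, and the sample readings are assumptions and do not come from the paper.

```python
def within_max_skew(clock_readings, max_skew):
    """Return True if every pair of clocks differs by at most max_skew.

    The largest pairwise difference is (max - min), so comparing the two
    extreme readings covers all pairs. Expects at least one reading.
    """
    return max(clock_readings) - min(clock_readings) <= max_skew


# Hypothetical readings (in seconds) taken from three processors.
readings = [10.000, 10.002, 9.999]
print(within_max_skew(readings, max_skew=0.005))
```

A resynchronization would be triggered whenever this check is about to fail; how that is detected in a fault-tolerant way is outside the scope of this sketch.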
Semi-intelligent I/O Controllers
• The most important addition to the architecture.
• Must know which tasks will request service for devices under their control.
• Must maintain lists of requests and resolve the requests into a single action.
• This addition serves to insulate the I/O devices from the actions of a single processor.
Secure Communication
• Must be available for use in processor-I/O control messages.
• Must have the capability of detecting the origin and any modifications of the messages.
• Achieved using a suitable encryption scheme.
• Secure communication allows the I/O controllers to detect multiple messages from faulty processors.
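The slides do not name the encryption scheme. As one possible illustration of detecting both origin and modification, the sketch below uses a keyed message authentication code (HMAC-SHA256) with one secret key per processor. The choice of HMAC, the key table, and the processor names are all assumptions made for the example.

```python
import hashlib
import hmac


def sign(key: bytes, payload: bytes) -> bytes:
    """Compute an authentication tag for payload under the sender's key."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify(keys: dict, sender: str, payload: bytes, tag: bytes) -> bool:
    """Check that payload was tagged with the claimed sender's key.

    Fails if the sender is unknown, if the payload was altered, or if the
    tag was produced with a different processor's key.
    """
    key = keys.get(sender)
    if key is None:
        return False
    return hmac.compare_digest(sign(key, payload), tag)


# Hypothetical key table held by an I/O controller.
keys = {"p0": b"secret-key-0", "p1": b"secret-key-1"}
tag = sign(keys["p0"], b"open valve")
print(verify(keys, "p0", b"open valve", tag))   # genuine message
print(verify(keys, "p1", b"open valve", tag))   # wrong claimed origin
print(verify(keys, "p0", b"close valve", tag))  # modified payload
```

Because each accepted message is bound to one sender, a controller can also count at most one request per processor, which is what lets it notice repeated messages from a faulty one.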
Replicated Sensors
• Sensors obtaining replicated data must be placed in different I/O controllers.
• This ensures that failure of an I/O controller does not affect the ability of an application task to obtain data from sensors.
• Data from replicated sensors is handled in the applications and not by the I/O controllers, preserving application independence and allowing flexibility in managing redundant sensors.
Real-Time Task Structure
• Primary interest is in real-time applications.
• The scope of the applications is restricted to an assumed structure present in real-time tasks, which hinges on two assumptions:
  • Real-time applications can be naturally decomposed into tasks with three phases.
  • Real-time applications have a single-source/single-sink I/O requirement.
Task Structure
The Three Phases
• Input Phase: input operations collect the data from a single (but replicated) input source.
• Computation Phase: once data is available, this phase begins, with the further assumption that the phase is non-preemptive.
• Output Phase: the task is concluded with an output to a single device.
Request Tag Structure
QMV and CMV Naming Requests
• The OS and underlying architecture must support several naming conventions used in the implementation of QMV and CMV.
• Tasks are uniquely identified with task IDs.
• Replicated tasks have replication IDs.
• Request IDs are a combination of the address offset from the virtual address space and the number of I/O requests from the particular image of the task.
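One way to picture these naming conventions is as a small record attached to each I/O request. The field names, integer types, and the `match_key` helper below are assumptions for illustration; the slides do not give the actual tag layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestTag:
    """Hypothetical tag carried by an I/O request."""
    task_id: int          # identifies the task
    replication_id: int   # identifies which replicated image sent it
    address_offset: int   # offset in the task's virtual address space
    request_count: int    # number of I/O requests issued by this image

    def request_id(self):
        """Request ID: address offset combined with the request count."""
        return (self.address_offset, self.request_count)

    def match_key(self):
        """Key shared by all images issuing the same logical request
        (assumed here to be task ID plus request ID)."""
        return (self.task_id, self.request_id())


a = RequestTag(task_id=5, replication_id=0, address_offset=0x40, request_count=3)
b = RequestTag(task_id=5, replication_id=1, address_offset=0x40, request_count=3)
print(a.match_key() == b.match_key())  # same logical request, different images
```

Under this reading, an I/O controller would group incoming requests by `match_key()` and use `replication_id` to tell the images apart.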
Quorum Majority Voting (QMV)
• Ensures that I/O operations are triggered only on the behavior of a quorum of proper participants.
• For each request of service, a quorum of the replicated images must issue the request before an action takes place.
• Once quorum is established, the nature of the operation is decided:
  • For input, the sample is taken and sent.
  • For output, the received data values are voted upon and the resulting action taken.
Quorum Majority Voting (QMV)
• To tolerate k faulty task images, the non-faulty images must be able to control the establishment of a quorum.
• Once a quorum is formed, the non-faulty images must be able to dominate the majority vote for the selection of the operation.
Quorum Majority Voting (QMV)
• 3k+1 replicated images are required to tolerate the behavior of k faulty images.
• 2k+1 non-faulty images are needed so that a quorum can still form when all k faulty images choose to abstain from making a request.
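The QMV output path can be sketched as follows. The sketch assumes a quorum size of 2k+1, which is one reading of the counts above: with 3k+1 images, the 2k+1 non-faulty ones can form a quorum alone, and any quorum of 2k+1 contains at least k+1 non-faulty images, enough to win the vote. The class and method names are hypothetical, and non-faulty images are assumed to request identical values.

```python
from collections import Counter


class QmvOutputRequest:
    """Collects output requests for one logical request under QMV."""

    def __init__(self, k):
        self.k = k
        self.quorum = 2 * k + 1      # assumed quorum size
        self.values = {}             # replication ID -> requested value
        self.done = False

    def submit(self, replica_id, value):
        """Record a request; return the voted value once quorum is reached.

        Returns None while waiting, for repeated requests from the same
        image, and for requests arriving after the action was taken.
        """
        if self.done or replica_id in self.values:
            return None
        self.values[replica_id] = value
        if len(self.values) < self.quorum:
            return None
        self.done = True
        winner, count = Counter(self.values.values()).most_common(1)[0]
        # A majority of the quorum is k+1 matching values.
        return winner if count >= self.k + 1 else None


# k = 1: four images, one of which (ID 1) is faulty.
req = QmvOutputRequest(k=1)
print(req.submit(0, 10))  # waiting for quorum
print(req.submit(1, 99))  # faulty value, still waiting
print(req.submit(2, 10))  # quorum of 3 reached, vote decides
```

Note that the faulty image can neither trigger the action early nor block it by abstaining, which is the point of requiring 3k+1 images.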
Comparison Majority Voting (CMV)
• In contrast to QMV, CMV requires only 2k+1 replicated images, but handles requests for I/O operations differently.
• In CMV, I/O controllers wait for k+1 requests for I/O with the same data.
  • Input request: data is sampled and sent to all 2k+1 images.
  • Output operation: data is sent to the specified device.
Comparison Majority Voting (CMV)
• Disadvantages of CMV:
  • I/O controllers have to partition incoming requests into equivalence classes as defined by the data of the requests.
  • These operations are not trivial and could increase the hardware complexity as compared to the requirements for QMV.
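The partitioning step can be sketched in software as a table of equivalence classes keyed by request data, with the action taken as soon as one class holds k+1 distinct images. This is an illustrative sketch only: the class name and interface are hypothetical, and the request data is assumed to be hashable and compared for exact equality.

```python
from collections import defaultdict


class CmvOutputRequest:
    """Collects output requests for one logical request under CMV."""

    def __init__(self, k):
        self.threshold = k + 1
        self.classes = defaultdict(set)  # data value -> replication IDs
        self.seen = set()                # images that already requested
        self.done = False

    def submit(self, replica_id, value):
        """Record a request; return the value once k+1 images agree on it.

        Returns None while no class is large enough, for repeated
        requests from one image, and after the action has been taken.
        """
        if self.done or replica_id in self.seen:
            return None
        self.seen.add(replica_id)
        self.classes[value].add(replica_id)
        if len(self.classes[value]) >= self.threshold:
            self.done = True
            return value
        return None


# k = 1: three images, one of which (ID 1) is faulty.
req = CmvOutputRequest(k=1)
print(req.submit(0, 7))  # one request with data 7
print(req.submit(1, 8))  # faulty data forms its own class
print(req.submit(2, 7))  # class for 7 reaches k+1 = 2
```

The per-value bookkeeping in `classes` is the extra work that QMV avoids, which is where the added controller complexity comes from.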
Timing Bounds Conclusions
• Time bounds suggest that it is possible to reduce the overhead for QMV to below that for synchronous voting.
• All of the bounds assume that the computation phase is non-preemptive and that there is no multi-tasking.
Conclusions
• The two techniques, QMV and CMV, provide a compromise between a tightly synchronized system and an asynchronous system.
• QMV and CMV are most applicable to distributed real-time systems with single-source/single-sink tasks.
• All real-time systems eventually have to resolve their inputs into a single action at some stage.
Questions?