Checking Computation of Numerical Functions by the Use of Functional Equations. REC 2006 NSF Workshop on Reliable Engineering Computing. F. Vainstein and C. Jones.
Presentation Summary • Background • Fault tolerance • Computing • Numerical Functions • Theory • Finding checking polynomials • The general method • A program developed by this research • Some examples • Considerations for deployment • Future directions
Fault Tolerance Grace in response to the unexpected • Withstands failures • Exhibits desirable behavior • Does not endanger life (military, transportation, medical) • Preserves scientific investment (space, supercomputing) • Meets consumer expectations
Fault Tolerance Can Be Critical Military: Global Hawk Science: Gravity Probe B Exploration: Mars Opportunity Rover Civilian: Airbus A380
Methods for Fault Tolerance • Modular redundancy: backup systems take over in the event the primary unit fails • Replication with voting: duplicate function blocks and compare their outputs; the majority wins • Error-correcting codes: Reed-Solomon, parity checks, … • Algorithm-based fault tolerance (ABFT): encodes data and augments the algorithm to detect errors
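As an illustration of replication with voting, here is a minimal Python sketch of triple modular redundancy. The function names (`majority_vote`, `tmr`, `square`, `faulty_square`) are hypothetical and chosen only for this example; real systems vote in hardware and must also handle timing and tolerance issues that this sketch ignores.

```python
from collections import Counter

def majority_vote(outputs):
    """Return the value reported by a strict majority of replicas, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None

def tmr(replicas, x):
    """Run every replica on the same input and vote on the results."""
    return majority_vote([replica(x) for replica in replicas])

def square(x):
    return x * x

def faulty_square(x):
    # Hypothetical faulty unit: always off by one.
    return x * x + 1

# Two healthy replicas outvote the faulty one.
result = tmr([square, square, faulty_square], 7)
```

Integer outputs are used so that exact comparison is meaningful; with floating-point replicas a tolerance-based comparison would be needed.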
A Complex System: The Space Shuttle Total number of parts > 600,000 Total Weight = 4,500,000 pounds Cost to move one pound of cargo = $20,000 Budget = $3.3 billion / year
Modular Redundancy: Space Shuttle. Space shuttle avionics, from Redundancy Management Techniques for Space Shuttle Computers, Sklaroff, IBM Journal of Research and Development, 1976.
Complex System: The Microprocessor Intel Pentium 4 Prescott Core Number of transistors > 125 million Transistor size = 90nm Pipeline = 31 stages Development Budget = $4.2 billion/year “Never in the history of mankind has it been possible to produce so many wrong answers so quickly.” Carl-Erik Froeberg
What Does a Microprocessor Do? • ALU: the arithmetic logic unit performs math and logic functions. • Math coprocessors were big business for Intel and others in the 1980s. • Today, most processors incorporate a math coprocessor or emulator for numerical calculations. • Move data from one memory location to another • Make decisions and jump to a new set of instructions
IBM FPU Core. "Scientific codes typically spend much of their time in common numerical subroutines - about 70% of a phase retrieval application, for example, is spent in the Fast Fourier Transform alone." M. Turmon, Annual Report for FY 2001 Final Report Algorithm-Based Fault Tolerance, NASA-JPL, Remote Exploration and Experimentation Project. Image legend: Dark Blue: Interface, Decode and Issue; Pink: Pipe Management and Data Forwarding; Yellow: Arithmetic Pipe; Aqua: Load/Store Pipe
Numerical Functions Numbers from numbers • Degrees to radians • Cosine • Hyperbolic Sine • ArcSine • SINC function • Next positive power of 2 • Linear interpolation • Root finding • Gaussian • Mod • Greatest Common Divisor • Absolute value • Minimum • Maximum • Round to next integer • Return the fractional part of a value • Clip in a saturation fashion • Wrapping for integers • Log • Fast Fourier Transform (FFT) • Numerical Differentiation • Kalman Filtering
Numerical Functions in Action: 1 • IMAGE PROCESSING: The FIDO Mars Exploration Rover (MER) relies on detailed panoramic views in its operation for near real-time tasks: determination of exact location, navigation, science target identification, and mapping. • WEATHER MODELING: Roe, K., et al., High Resolution Weather Monitoring for Improved Fire Management, 2001, Maui HPCC. Real-time analysis of environmental information for prediction of fire behavior.
Numerical Functions in Action: 2 NON-LINEAR CONTROL SYSTEMS Brennan, S., Integrated Chassis Control for Vehicles, 2000 SCIENTIFIC SUPERCOMPUTING U. Landman, et al., Large-scale classical molecular dynamics, 2001, Georgia Tech
Background Summary: • Computing is at the heart of most modern systems • Fault tolerance is a concern – especially for mission and safety critical systems • The computation of numerical functions is a critical area of computing
Notable Work in Numerical Result Checking • M. Blum, R. Rubinfeld: Self-Testing/Correcting with Application to Numerical Problems, 1990 • M. Blum, H. Wasserman: Reflections on the Pentium Division Bug, 1995; Software Reliability Via Runtime Result Checking, 1997 • Promoted numerical checking • A motivation for result checking. These works used functional equations, but no general method existed.
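To make the idea of checking a result with a functional equation concrete, the following Python sketch tests an exponential routine against the identity exp(x + y) = exp(x) · exp(y). This is a generic illustration of the approach and is not taken from the cited papers; `check_exp` and `buggy_exp` are hypothetical names, and the tolerance and number of trials are arbitrary assumptions.

```python
import math
import random

def check_exp(f, x, trials=5, rel_tol=1e-9, seed=0):
    """Check f at x using the functional equation f(x + y) == f(x) * f(y).

    Returns False as soon as one randomly shifted instance of the
    equation fails within the given relative tolerance.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        y = rng.uniform(-1.0, 1.0)
        if not math.isclose(f(x + y), f(x) * f(y), rel_tol=rel_tol):
            return False
    return True

def buggy_exp(x):
    # Hypothetical fault: a constant 0.1% gain error in the result.
    return 1.001 * math.exp(x)

ok_reference = check_exp(math.exp, 0.5)
ok_buggy = check_exp(buggy_exp, 0.5)
```

A check of this kind only catches errors that violate the chosen equation; a function that is wrong but still satisfies the identity would pass.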
An Algebraic Method for Fault Tolerance. 1991 – Feodor Vainstein, Georgia Tech: Error Detection and Correction in Numerical Computations by Algebraic Methods. Developed a general theory for generating functional equations. Showed that many numerical functions have functional equations and that computations of such functions can be verified by checking polynomials, a novel technique based on algebraic concepts such as the transcendence degree of field extensions.
Contribution of This Work: A Method for Practical Numerical Checking • Developed a software method for finding checking polynomials. • Treated the case of functions that are not polynomially checkable. • User-friendly program for hardware/software engineering • Design considerations
Algebra*: Fields *S. Lang, Algebra, Addison-Wesley, 1965
Theory: Other Cases. We also considered • Functions over various fields • Polynomially checkable (PC) and linearly checkable (LC) functions of several variables • Partially polynomially checkable functions. The focus of the present work is a practical method for determining approximate checking polynomials for real-valued functions of a single variable, both PC and non-PC.
Least Squares Estimation. The least squares estimation technique is used to estimate parameters and to fit data. Because some functions are not PC, the fit is generalized to yield approximate checking polynomials for non-PC functions. There are other methods, but this one was chosen to • Add robustness • Develop a practical process • Treat all polynomially checkable functions
Application of Least Squares Estimation: 1 The problem of finding a checking polynomial can be reduced to the following optimization problem. Let
Software Implementation of Least Squares Estimation: 1 Solve the matrix equation:
Software Implementation of Least Squares Estimation: 2 The coefficients of the checking polynomial are then in vector X: Those values can be used to find the value of the delta function: The deviation shows how good our approximation is.
The Matlab Function: • Solves the least squares estimation problem • Finds the delta function value for a range of k • Returns the checking polynomial coefficients for the best (smallest error) delta function • Plots the error over the function domain for the best delta • Plots deviation for a range of k • Simulink with DSP Builder generates VHDL and deploys to an Altera FPGA (Xilinx is similar)
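The workflow listed above can be sketched roughly as follows. The exact objective and matrix equation are not reproduced in this transcript, so this Python sketch (not the Matlab program itself) assumes a linear checking relation f(x + k·h) + Σ a_i · f(x + i·h) ≈ 0 with the leading coefficient fixed to 1, a fixed shift `h`, and the RMS residual as the deviation. All names (`solve`, `fit_linear_check`, `H`, `xs`) are hypothetical.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fit_linear_check(f, xs, h, k):
    """Least-squares fit of a_0..a_{k-1} (assumed form, see lead-in) so that
    f(x + k*h) + sum_i a_i * f(x + i*h) is as close to zero as possible."""
    rows = [[f(x + i * h) for i in range(k)] for x in xs]
    target = [-f(x + k * h) for x in xs]
    # Normal equations: (R^T R) a = R^T target.
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    a = solve(gram, rhs)
    residuals = [sum(ai * ri for ai, ri in zip(a, r)) - t
                 for r, t in zip(rows, target)]
    deviation = math.sqrt(sum(e * e for e in residuals) / len(residuals))
    return a, deviation

H = 0.5                                # assumed shift between samples
xs = [0.1 * j for j in range(60)]      # sample points over part of the domain
fits = {k: fit_linear_check(math.sin, xs, H, k) for k in (1, 2)}
best_k = min(fits, key=lambda k: fits[k][1])
```

For the sine function the fit with k = 2 should recover coefficients close to 1 and −2·cos(h), with a deviation near rounding error. The sketch does not guard against larger k, where the columns become linearly dependent and the normal equations turn singular.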
Example: SINE Function Plots The sine function is linearly checkable (LC)
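One way to see why sine admits a linear check is the standard identity sin(x) − 2·cos(h)·sin(x + h) + sin(x + 2h) = 0, which holds for every x once the shift h is fixed. The Python sketch below uses that identity as a run-time check; the shift `SHIFT`, the tolerance, and the names `residual`, `passes_check`, and `faulty_sin` are assumptions made for illustration, and the presentation's own checking polynomial may take a different form.

```python
import math

SHIFT = 0.1                                        # assumed fixed shift h
COEFFS = (1.0, -2.0 * math.cos(SHIFT), 1.0)        # constants for this shift

def residual(f, x, h=SHIFT, coeffs=COEFFS):
    """Linear combination of shifted values; zero for an exact sine."""
    return sum(c * f(x + i * h) for i, c in enumerate(coeffs))

def passes_check(f, x, tol=1e-12):
    """Accept the computation if the residual is within rounding error."""
    return abs(residual(f, x)) <= tol

def faulty_sin(x):
    # Hypothetical fault: a small error on part of the domain only.
    return math.sin(x) + (1e-6 if x > 1.0 else 0.0)

good = all(passes_check(math.sin, 0.05 * j) for j in range(100))
caught = not passes_check(faulty_sin, 0.95)
```

The check costs two extra function evaluations and a few multiply-adds, and it flags the fault at x = 0.95 because the shifted evaluations fall in the faulty region.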
Why Matlab Matlab (MATrix LABoratory) • Matrix-oriented programming environment • Code can compile to C/C++ • Built-in routines for data analysis and visualization • GUI/Web publishing support • A popular environment for technical computing http://www.gtrep.gatech.edu/undergradlabs/labman/CheckingPolynomial