Preprocessing Input Data to Augment Fault Tolerance in Space Applications
Jayakrishnan K. Nair, Zahava Koren, Israel Koren, C. Mani Krishna
Architecture and Real-Time Systems Lab – University of Massachusetts, Amherst
Motivation • Applications in harsh environments • Onboard processing of huge amounts of sensor data in real time • Vital to anticipate and counter faults preemptively • Example: Space systems vulnerable to many faults • Bombardment by charged particles in space • Alpha Particles • Cosmic Rays • Power Glitches and Stray Capacitance effects • Crosstalk at CCD sensors in the detector array of imaging systems
Data Faults • Advanced real-time applications in hostile environments • High likelihood of input data faults • Data faults occur at the source, in transit from the source, or while in memory • We focus on input data errors • Re-running the process or switching to a secondary is useless, as the input remains the same • Current schemes can handle process faults well, but not input data faults • Input precision and reliability are vital to good performance • Corruption at the input translates to unreliable, imprecise output
Proposed Solution – Input Preprocessing • Input data can be preprocessed to detect and dynamically recover from input errors • Use inherent redundancy in natural data and application semantics • Spatial, Spectral and Temporal Correlation • Dynamic Preprocessing algorithms • Application-specific, use domain knowledge on input datasets • Statistically analyze input data to find potential outliers • Use locality modeling of data in space, spectrum and/or time • Use absolute theoretical bounds on natural data • Automatically adjust to changing turbulence in data • Better results with more cohesive datasets • Reduce false alarms (pseudo-corrections)
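The slide above describes the preprocessing idea in general terms; below is a minimal sketch (not from the original slides) of the statistical-outlier step, assuming a simple 1-D sliding-window median/MAD rule. The window size and threshold k are illustrative choices only.

```python
import numpy as np

def flag_outliers(data, window=9, k=4.0):
    """Flag samples that deviate from their local median by more than
    k robust standard deviations (MAD-based), exploiting the natural
    spatial/temporal correlation of sensor data."""
    data = np.asarray(data, dtype=float)
    pad = window // 2
    padded = np.pad(data, pad, mode='edge')
    flags = np.zeros(len(data), dtype=bool)
    for i in range(len(data)):
        win = padded[i:i + window]
        med = np.median(win)
        mad = np.median(np.abs(win - med)) + 1e-12  # guard against zero spread
        flags[i] = abs(data[i] - med) > k * 1.4826 * mad
    return flags
```

A robust median/MAD estimator is used in this sketch because corrupted samples would otherwise inflate a plain mean/standard-deviation estimate and mask the very outliers being sought.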
Next Generation Space Telescope • A deep-space telescope spacecraft to replace Hubble • Detectors sample once every 1000 s and are exposed to heavy radiation • Limited downlink bandwidth (6 GB/day) -> onboard processing • COTS-processor-based system -> increased vulnerability • Cosmic rays can corrupt pixel data: these corruptions must be cleaned • Multiple readouts during each baseline (N = 64) • This redundancy is used to identify and recover from transient effects * Ref: NASA
Input Analytical Model • Gaussian Correlation Model (GCM): the difference between consecutive pixel intensities follows a Gaussian distribution: μ(i+1) = μ(i) + ε_i, where μ(i) are the pristine pixel values in a dataset and ε_i is a Gaussian random variable with zero mean and a standard deviation σ representative of simulated NGST datasets
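As a rough illustration of the GCM, the sketch below (using the μ/ε notation reconstructed above, with an arbitrary starting intensity mu0 as an assumption) generates a pristine pixel trace as a Gaussian random walk.

```python
import numpy as np

def simulate_gcm(n_pixels, sigma, mu0=1000.0, seed=None):
    """Generate a pristine 1-D pixel trace under the Gaussian Correlation
    Model: mu(i+1) = mu(i) + eps_i, with eps_i ~ N(0, sigma)."""
    rng = np.random.default_rng(seed)
    steps = rng.normal(0.0, sigma, size=n_pixels - 1)
    return mu0 + np.concatenate(([0.0], np.cumsum(steps)))
```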
Fault Models • Uncorrelated model: bitflips occur independently with a fixed probability ρ₀ • Correlated model: block faults affecting contiguous memory regions show a correlated pattern • Correlation in both the vertical and horizontal directions is considered • The probability ρ_corr(R) increases with the length R of a run of bitflips: ρ_corr(R) = Σ_{j=1}^{R} (ρ_ini)^j, where ρ_ini is the probability of initializing a fresh run and R is the length of the longer run of the two directions
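A minimal sketch of the two fault models, assuming 16-bit pixel words; the run-length expression is implemented exactly as reconstructed above, so it should be read as an approximation of the original correlated model rather than a definitive implementation.

```python
import numpy as np

def inject_uncorrelated(pixels, rho0, n_bits=16, seed=None):
    """Uncorrelated model: flip each bit of each pixel independently
    with a fixed probability rho0."""
    rng = np.random.default_rng(seed)
    out = np.asarray(pixels, dtype=np.uint16).copy()
    for b in range(n_bits):
        flips = rng.random(out.shape) < rho0
        out[flips] ^= np.uint16(1 << b)
    return out

def p_corr(rho_ini, run_length):
    """Correlated model: probability associated with a run of bitflips,
    growing with the run length R (as reconstructed from the slide)."""
    return sum(rho_ini ** j for j in range(1, run_length + 1))
```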
Algo_NGST for Dynamic Preprocessing • Application-specific for NGST, uses temporal correlation • Dynamic statistical analysis to obtain a voter matrix • Pixels are paired with their immediate neighbors in front and behind, within a pixel window of fixed width, for least mean distance • Indices of the turbulence across the data are obtained • Voters are filtered based on a sensitivity parameter in [1, 100] • Used for trading off effectiveness against computational overhead • Three bit windows are identified using dynamic bitmasks • Window A is the most stable bit window and contains the MSBs • Window C contains LSBs that change with every pixel and is hence ignored • Window B, in the middle, is checked against a temporal model for bitwise consistency
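The sketch below illustrates only the bit-window idea, not the full Algo_NGST voter-matrix procedure; the toggle-rate thresholds and the 16-bit word width are hypothetical choices, not values taken from the slides.

```python
import numpy as np

def classify_bit_windows(readouts, stable_thresh=0.01, noisy_thresh=0.5,
                         n_bits=16):
    """For each bit position, measure how often it toggles between
    consecutive readouts of the same pixel, then label the position
    A (stable MSB window), C (noisy LSB window, ignored) or
    B (middle window, checked against the temporal consistency model)."""
    data = np.asarray(readouts, dtype=np.uint16)          # shape (N, ...)
    bits = ((data[..., None] >> np.arange(n_bits)) & 1).astype(np.int8)
    toggles = np.abs(np.diff(bits, axis=0))               # 1 where a bit changed
    toggle_rate = toggles.reshape(-1, n_bits).mean(axis=0)
    return np.where(toggle_rate < stable_thresh, 'A',
                    np.where(toggle_rate > noisy_thresh, 'C', 'B'))
```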
Image Smoothing Algorithms • Optimal Median Smoothing • Each pixel is replaced by the median of a sliding window • More robust than mean smoothing • Bitwise Majority Voting • Each bit in a pixel is replaced by a majority vote over the corresponding bit position in a sliding window • Preserves bit-wise information in the uncorrupted bits
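Minimal 1-D sketches of the two smoothing schemes described above; the window width and 16-bit word size are illustrative assumptions, and the original algorithms operate on NGST readout data rather than a generic 1-D array.

```python
import numpy as np

def median_smooth(pixels, window=5):
    """Replace each pixel by the median of a sliding window (1-D version)."""
    pad = window // 2
    padded = np.pad(np.asarray(pixels, dtype=float), pad, mode='edge')
    return np.array([np.median(padded[i:i + window])
                     for i in range(len(pixels))])

def bitwise_majority_vote(pixels, window=5, n_bits=16):
    """Replace each bit by a majority vote over the same bit position in a
    sliding window, preserving bit-wise information in uncorrupted bits."""
    pad = window // 2
    padded = np.pad(np.asarray(pixels, dtype=np.uint16), pad, mode='edge')
    out = np.zeros(len(pixels), dtype=np.uint16)
    for i in range(len(pixels)):
        win = padded[i:i + window]
        for b in range(n_bits):
            if int(((win >> b) & 1).sum()) * 2 > window:  # majority of ones
                out[i] |= np.uint16(1 << b)
    return out
```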
Precision Improvement for GCM datasets [Plot: relative error in dataset (%) vs. probability of a bitflip in data] A promising reduction factor in the average relative error of the input, in the range of ~50 to ~1000, is obtained for the practical range 0 < ρ₀ < 10%
Computational Overhead The sensitivity parameter can be adjusted to scale the algorithm and achieve an appropriate balance between correction capability and computational overhead
Results for correlated input faults [Plot: relative error in dataset (%) vs. probability of a bitflip in data] The two smoothing algorithms perform very similarly, but Algo_NGST yields better performance across all fault probabilities by reducing false alarms
Orbital Thermal Imaging Spectroscope • OTIS • Reads radiation reflected by the Earth's surface at various wavelengths • Computes emissivity and temperature for each coordinate • Input and output are represented as three-dimensional floating-point arrays • Unlike NGST, there is no temporal redundancy • Spectral correlation – unreliable, as it falls sharply outside a band • Spatial correlation with locality bounds – usable for preprocessing
OTIS Datasets * Ref: E. Ciocca • Three distinctive datasets from OTIS • Blob: broad areas of unchanging temperature, high correlation • Stripe: prominent vertical turbulence, other regions benign • Spots: plethora of spots, turbulence distributed over the entire region • Assumptions for preprocessing • Exceptions occur as trends, never as single outliers • Single-bit anomalies are faults • Any theoretically out-of-bound value is a fault
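A small sketch of how the three preprocessing assumptions could be applied to one OTIS scan line; the bounds and spike threshold are placeholders, not values from the slides.

```python
import numpy as np

def flag_otis_faults(row, lo_bound, hi_bound, spike_thresh):
    """Apply the stated assumptions to one scan line: values outside the
    theoretical bounds are faults, and isolated single-pixel spikes are
    faults, while multi-pixel trends are kept as genuine exceptions."""
    row = np.asarray(row, dtype=float)
    faults = (row < lo_bound) | (row > hi_bound)       # out-of-bound values
    for i in range(1, len(row) - 1):
        left, right = row[i - 1], row[i + 1]
        # single outlier: far from both neighbours, which agree with each other
        if (abs(row[i] - left) > spike_thresh and
                abs(row[i] - right) > spike_thresh and
                abs(left - right) <= spike_thresh):
            faults[i] = True
    return faults
```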
Performance Comparison for “Blob” [Plot: relative error in dataset (%) vs. probability of a bitflip in data] A very high gain in precision is obtained when bitflips are present in highly correlated data.
Performance Comparison for “Stripe” [Plot: relative error in dataset (%) vs. probability of a bitflip in data]
Performance Comparison for “Spots” [Plot: relative error in dataset (%) vs. probability of a bitflip in data]
OTIS Results for correlated input faults [Plot: relative error in dataset (%) vs. probability of a bitflip in data] Beyond a certain fault probability, the preprocessing generates an excessive number of false positives
Conclusions • Input Faults in Space systems • Process-fault tolerance schemes cannot handle input faults • Input Preprocessing • Inherent Redundancy at input for proactive error correction • Natural Correlation in Temporal, Spectral or Spatial locality • Application-specific preprocessing algorithms for dynamic recovery • Use application semantics and domain knowledge of input data • Results • Works well for uncorrelated and correlated faults • Significant improvements in input precision for varying fault probabilities and statistically diverse datasets
Thank You URL: http://www.ecs.umass.edu/ece/realtime Architecture and Real-Time Systems Lab – University of Massachusetts, Amherst