NPP Science Data Segment (SDS)
Data Depository and Distribution Element (SD3E)
Delta Design Review for CERES
August 26, 2008
NASA GSFC, B8 Auditorium

Evelyn Ho, Evelyn.L.Ho@nasa.gov
Ryan Gerard, Ryan.Gerard@nasa.gov
SD3E Overview (1 of 2)

[SDS architecture diagram. SD3E sits at the center, acquiring RDRs, SDRs, EDRs, TDRs, IPs, LUTs, and algorithms from the NESDIS IDPS, NOAA ADS (CLASS), and NESDIS NSIPS, and maintaining a 32-day data buffer for RDRs, SDRs, and EDRs plus 5 days of Retained Intermediate Products. Data is picked up by the Land, Ocean, Atmosphere (U. of Wisconsin), Sounder (JPL), and Ozone PEATEs and the NICSE; CERES RDRs and VIIRS subset SDRs flow onward to CERES Data Production and the CERES Science Users. Responsibilities noted on the diagram:
• CERES Data Production: generate climate-quality data products to characterize global climate change.
• NICSE: assess and validate VIIRS radiometric and geometric calibration pre- and post-launch; provide calibration update recommendations; coordinate PEATE calibration-related analysis and anomaly characterization.
• Land PEATE: support EDR assessment; support NICSE CV assessment pre-/post-launch; provide data analysis tools to the Science Team; develop and demonstrate algorithm enhancements.
• I&TSE: a mini-version of the IDPS; demonstration of algorithm enhancements; generation of Intermediate Products, if necessary; hosts IDPS production algorithms/software.
• Sounder PEATE (JPL): support EDR assessment; provide data analysis tools to the Science Team; develop and demonstrate algorithm enhancements; assess and validate CrIS and ATMS calibration pre- and post-launch (TBD).
• Ozone PEATE: support EDR assessment; provide data analysis tools to the Science Team; develop and demonstrate algorithm enhancements; assess and validate OMPS radiometric and geometric calibration pre- and post-launch; perform OMPS Limb cal/val; support OMPS Limb operations; generate OMPS Limb SDRs/EDRs; transition the OMPS Limb algorithm to NOAA for operations (new due to Level 1 requirements changes).
• PSOE: overall management direction and science guidance to the PEATEs; coordinates EDR assessment, algorithm, and calibration recommendations for submission to the NPOESS Algorithm CCB.
Other entities shown: NPP Science Team, NESDIS C3S (calibration tasking), IPO, CLASS, CasaNOSA (pre-launch data), and the Ozone Community (OMPS Limb SDR/EDR). Not all data flows are shown.]
SD3E Overview (2 of 2)
• SD3E – SDS Data Depository and Distribution Element
• Performs data requests and acquisition from the major data providers
  • Acquires data from NOAA/CLASS, NSIPS, and NESDIS/IDPS
  • Will now handle CERES RDRs like other RDRs
• Provides ~32 days of "rolling storage" for pick-up by the PEATEs and the NICSE
• No data processing of NPP products
  • No reformatting
  • No aggregation
  • No subsetting
SD3E Level 3 Requirements
• Acquire products [Req. 3.1.1.1 – 3.1.1.18]
  • RDRs from IDPS
  • xDRs, selected IPs, operational algorithms and source code, software and documentation, calibration products, and official ancillary/auxiliary data from ADS
  • Retained IPs from NSIPS
• Verify data integrity [Req. 3.1.2.1]
• Manage data requests from the PEATEs/NICSE [Req. 3.1.3.1 – 3.1.3.10]
• Provide data access to the PEATEs/NICSE [Req. 3.1.4.4]
• Manage the 32-day (TBD) data store [Req. 3.1.4.1 – 3.1.4.3]
• Respond to PSOE management direction and provide status reports [Req. 3.1.4.5 – 3.1.4.6]
• General requirements [Req. 3.6.1 – 3.6.2, 3.6.7, 3.7.1 – 3.8.3.1]
Context Data Flow Diagram

[Context-level data flow diagram. The SD3E process (P1) exchanges data with five external entities:
• Users (PEATEs, NICSE): submit data product requests; receive NPP products, data delivery reports, file errors, and request status.
• IDPS: receives IDPS product requests and status; provides xDRs.
• NSIPS: receives NSIPS data product requests and status; provides Retained IPs.
• ADS: receives ADS product requests and status; provides xDRs, selected IPs, operational algorithms, calibration products, calibration coefficients, official ancillary data, and software and documentation.
• PSOE: provides management direction; receives status reports.
A digital signature flow is also labeled on the diagram.]
Level 1 Data Flow Diagram

[Level 1 data flow diagram for the CERES data path. The NESDIS IDPS delivers all RDRs, and selected SDRs, IPs, EDRs, and TDRs arrive via NOAA CLASS/NSIPS data subscription, to RIPS/SD3E. SD3E provides the Land PEATE with telemetry, science, and diagnostic RDRs, VIIRS SDRs for channels 5, 7, 9, and 10–16, and the Aerosol EDR. The Land PEATE delivers to the SDS ASDC: telemetry, science, and diagnostic RDRs (HDF5), subsetted VIIRS SDRs for channels 5, 7, 9, and 10–16 (HDF4), and the Aerosol EDR (HDF4).]
Gap Analysis
• Add CERES product types to the SD3E product definition table
  • CERES science RDRs
  • CERES diagnostic RDRs
  • CERES telemetry RDRs
• Additional data volume/disk space (see the worked calculation below)
  • Ingest an additional 2500.48 KiB/day (2083.74 KiB/day + 20% HDF overhead)
  • ~78.14 MiB of storage for 32 days
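A minimal sketch of the data-volume arithmetic above; the 20% HDF overhead factor and the 32-day retention are the values quoted on the slide.

```python
# CERES RDR volume estimate for the SD3E 32-day rolling store.
BASE_KIB_PER_DAY = 2083.74   # CERES RDR volume before packaging overhead
HDF_OVERHEAD = 0.20          # 20% HDF overhead from the slide
RETENTION_DAYS = 32          # SD3E rolling-store length

ingest_kib_per_day = BASE_KIB_PER_DAY * (1 + HDF_OVERHEAD)   # ~2500.48 KiB/day
storage_mib = ingest_kib_per_day * RETENTION_DAYS / 1024     # ~78.14 MiB
print(f"Ingest: {ingest_kib_per_day:.2f} KiB/day, 32-day store: {storage_mib:.2f} MiB")
```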
Results of RMA and Throughput Study
• SDS Level 3 Requirement
  • 3.7.3 Availability: The NPP SDS shall have an Operational Availability, Ao, of 0.95.
• Background: The SDS and its associated elements have only been "research grade."
• Risk: Data latency when the SD3E system is down.
• Will the current requirement be sufficient to meet CERES requirements?
Results of RMA and Throughput Study
• Assumptions
  • Five-year mission: 43800 hours
  • Operating time: 10000 hours
  • Time to repair disks: 288 hours (2 weeks)
  • Average time to perform maintenance/resolve problems: 168 hours (1 week)
  • Average time to bring the system back up (nominal scenario): 50 minutes
    • Time for operator to notice the problem: 5 minutes
    • Time to take down the system/clean up: 25 minutes
    • Time to bring the DB back up: 10 minutes
    • Time to bring SD3E back up: 10 minutes
Single Disk Metrics
• P(single disk fails during mission) = mission hours / disk MTBF
  = 43800 hr / 1,000,000 hr = 0.0438 = 4.4%
• P(single disk is down at a given moment during the mission)
  = P(single disk fails during mission) × P(disk has not yet been replaced)
  = 0.0438 × (288 hr / 43800 hr) = 0.000288 = 0.028%
  (assumes 2 weeks to repair a drive)
Tier Metrics (1 of 2)
• Assumptions:
  • Disks are placed in tiers of size 10
  • 2 or more disks must fail in the same tier for downtime to occur
• P(no disks are down in a given tier)
  = P(disk 1 never goes down) × P(disk 2 never goes down) × … × P(disk 10 never goes down)
  = (0.9997)^10 = 0.997
Tier Metrics (2 of 2)
• P(exactly 1 disk is down in a tier)
  = P(disk 1 is down) × P(disks 2…10 are up) × number of choices for which disk is down
  = (0.000288)(0.9997)^9(10) = 0.0029
• P(2 or more disks are down in a tier)
  = 1 − P(no disks are down) − P(1 disk is down)
  = 1 − 0.997 − 0.0029 = 0.0001
Whole System Disk Metrics
• P(2 or more disks are down in ANY tier)
  = number of tiers × P(2 or more disks fail in a single tier)
  = (50)(0.0001) = 0.005
• Availability = 1 − P(2 or more disks are down in ANY tier)
• Ao of the entire disk array = 0.995
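A minimal sketch of the disk-availability chain from the three preceding slides (single disk → tier → whole array). Intermediate values are rounded as on the slides, so the final figure matches the presented Ao of 0.995.

```python
# Disk-array availability estimate following the slides' calculation steps.
MISSION_HOURS = 43800        # 5-year mission
DISK_MTBF_HOURS = 1_000_000  # assumed single-disk MTBF
REPAIR_HOURS = 288           # 2 weeks to replace a failed disk
DISKS_PER_TIER = 10
NUM_TIERS = 50

# Single-disk metrics
p_fail_mission = MISSION_HOURS / DISK_MTBF_HOURS                 # 0.0438
p_disk_down = p_fail_mission * (REPAIR_HOURS / MISSION_HOURS)    # ~0.000288

# Tier metrics: a tier is down only if 2+ of its 10 disks are down
p_up = 1 - p_disk_down
p_none_down = round(p_up ** DISKS_PER_TIER, 3)                   # ~0.997
p_one_down = round(DISKS_PER_TIER * p_disk_down
                   * p_up ** (DISKS_PER_TIER - 1), 4)            # ~0.0029
p_two_or_more = 1 - p_none_down - p_one_down                     # ~0.0001 with slide rounding

# Whole-array availability
p_any_tier_down = NUM_TIERS * p_two_or_more                      # ~0.005
ao_disk_array = 1 - p_any_tier_down
print(f"Ao of disk array: {ao_disk_array:.3f}")                  # 0.995
```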
Results of RMA and Throughput Study

[Reliability block diagram: redundant database servers DB1/DB2 in parallel, servers S1/S2 in parallel, server S3, redundant controllers C1/C2 in parallel, and disk array D1, all in series.]
Results of RMA and Throughput Study
• System Reliability = [DB1 + DB2 − (DB1)(DB2)] × [S1 + S2 − (S1)(S2)] × S3 × [C1 + C2 − (C1)(C2)] × D1
  • Probability of system failure = 0.00702
  • System Reliability = 1 − 0.00702 = 0.99298
• Operational Availability: Ao = Operating time / (Operating time + Total down time), where:
  • Configuration MTBF = 11447.8 hours
  • Mean Time Between Maintenance (MTBM) = 10000 hours
  • Mean Down Time (MDT) = 168 hours
  • Average down-time events over 10000 hours = 0.8735
  • Average down time = 146.75 hours
  • Ao = 10000 / (10000 + 146.75) = 0.9855
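A sketch of the system-level arithmetic on this slide. The individual component reliabilities (DB1, S1, …) are not given here, so the parallel/series combination is shown only as a helper formula; the Ao figures use the MTBF/MTBM/MDT values quoted above.

```python
# Operational availability arithmetic from the slide's quoted figures.

def parallel(a: float, b: float) -> float:
    """Reliability of two redundant components: the pair is up if either is up."""
    return a + b - a * b

# System reliability structure from the block diagram (component values not given):
#   R_sys = parallel(DB1, DB2) * parallel(S1, S2) * S3 * parallel(C1, C2) * D1

MTBM = 10000.0          # mean time between maintenance, hours
MDT = 168.0             # mean down time per event, hours
CONFIG_MTBF = 11447.8   # configuration MTBF, hours

downtime_events = MTBM / CONFIG_MTBF      # ~0.8735 events over 10000 hours
total_downtime = downtime_events * MDT    # ~146.75 hours
ao = MTBM / (MTBM + total_downtime)       # ~0.9855
print(f"Ao = {ao:.4f}")
```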
Throughput Estimates
• Demonstrated performance:
  • Ingest rate: 4.05–4.4 TB/day
  • Tests ranged between 102660 and 146064 files
  • File sizes ranged between 500 KB and 300 MB
  • Tested with 2 simultaneous FTP pulls (simulating the PEATEs)
• Hardware
  • Single disk controller (S2A8000)
  • 2 Dell servers (2650/2850)
Throughput Estimates
• Throughput estimate: 8–12 TB per day
• Rationale: Memory and CPU utilization averaged approximately 95% during the throughput tests, and disk usage was near 90%. The next hardware procurement addresses these bottlenecks with:
  • 4 high-performance servers (each with quad-core processors)
  • Dedicated database servers
  • 2 disk controllers (with 3 times the performance of the current disk controller)
Nominal Latency

[Nominal latency timeline across the IDPS, SD3E (in-bound/out-bound), the Land PEATE, and the ASDC: the IDPS pushes RDRs to SD3E, which ingests them on a short recurring cycle ("Ingest Every 5" in the original figure); an estimated every 12 hours, the Land PEATE polls the SD3E out-bound directory and pulls the data for ingest and delivery to the ASDC.]
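A hedged sketch of the pull pattern in the timeline above: a PEATE periodically lists an SD3E pick-up directory and fetches files it has not yet ingested. The host name, path, login, and poll interval below are illustrative assumptions, not values from the design.

```python
# Illustrative PEATE-style directory poll against an SD3E pick-up area.
# Server, directory, credentials, and interval are hypothetical.
import ftplib
import time

SD3E_HOST = "sd3e.example.nasa.gov"   # hypothetical host
PICKUP_DIR = "/outbound/ceres_rdr"    # hypothetical pick-up directory
POLL_SECONDS = 12 * 3600              # ~12-hour poll cycle from the timeline

seen = set()  # filenames already pulled

def poll_once() -> None:
    with ftplib.FTP(SD3E_HOST) as ftp:
        ftp.login()                   # anonymous login for the sketch
        ftp.cwd(PICKUP_DIR)
        for name in ftp.nlst():
            if name not in seen:
                with open(name, "wb") as out:
                    ftp.retrbinary(f"RETR {name}", out.write)
                seen.add(name)        # hand off to local ingest here

while True:
    poll_once()
    time.sleep(POLL_SECONDS)
```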
Build 3 Plans
• Add CERES RDR products to the SD3E product definition table
• Obtain sample CERES RDR products to test transfer, ingest, verification, and storage
• Test interfaces from IDPS/ADS to SD3E to the Land PEATE to CERES
Build 4 Plans
• Continue testing interfaces from the NESDIS IDPS to SD3E to the Land PEATE to CERES
• Continue performance and throughput testing from SD3E to the Land PEATE to CERES
SD3E Capabilities by NCT3
• Receive subscription and ad-hoc requests via the web interface and the machine-to-machine interface
• Coalesce subscriptions from the PEATEs/NICSE (see the sketch below)
• Determine missing products
• Determine products for reorder
• Submit product requests to the mini-IDPS using the DDS GUI
• Ingest, verify, and store xDRs, ancillary/auxiliary data, IPs, and calibration products
• Ingest, verify, and store CERES RDRs
• Notify the operator of products for manual reorder
• Operator GUIs to monitor system status and resources
• Report generation capability
• Database replication
• Ingest of RDR packaged file naming (primary product ID first)
• Work-around to ingest RDR packaged file naming (sorted product IDs)
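A minimal sketch of the subscription coalescing and missing-product bookkeeping listed above, assuming simple set semantics; the names and data structures are illustrative, not the SD3E implementation.

```python
# Illustrative sketch: coalesce per-subscriber requests into one request set,
# then diff against what has been ingested to find missing products / reorders.
from typing import Dict, Set

def coalesce_subscriptions(subs: Dict[str, Set[str]]) -> Set[str]:
    """Union the product types requested by each PEATE/NICSE subscriber."""
    needed: Set[str] = set()
    for requested in subs.values():
        needed |= requested
    return needed

def missing_products(needed: Set[str], ingested: Set[str]) -> Set[str]:
    """Products that were requested but have not been ingested yet."""
    return needed - ingested

# Example: two subscribers with overlapping requests (hypothetical product IDs)
subs = {
    "land_peate": {"CERES-science-RDR", "VIIRS-SDR"},
    "nicse": {"VIIRS-SDR", "CERES-telemetry-RDR"},
}
needed = coalesce_subscriptions(subs)
to_reorder = missing_products(needed, ingested={"VIIRS-SDR"})
print(sorted(to_reorder))   # products to request or reorder from the provider
```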
Schedule
• Build 3 delivery: 11/05/08
  • Handle CERES RDRs
  • Handle RDR packaging file naming (including work-around)
• Build 3 I&T: 11/06/08 – 11/21/08
• Hardware Augmentation Part 1: 08/01/08
• Regression Testing: 12/22/08 – 01/09/09
• Performance Testing: 11/04/08 – 11/18/08
• Build 4: 11/06/08 – 12/01/09
• NCT3: 06/09/09 – 06/15/09
Issues/Concerns
• No issues or concerns