
Mission Support Team Storage Architectures


Presentation Transcript


  1. Mission Support Team Storage Architectures Presented by Ken Gacke, SAIC* U.S. Geological Survey, EROS Data Center, Sioux Falls, SD July 2004 * Work performed under U.S. Geological Survey contract 03CRCN0001

  2. Storage Architecture Agenda • Online/Nearline Storage Architecture • System Backup Architecture • Onsite Short-term system recovery • Offsite Disaster Recovery • Archive Storage • Long term data preservation

  3. Storage Architecture Agenda • Online/Nearline Storage Architecture • System Backup Architecture • Onsite Short-term system recovery • Offsite Disaster Recovery • Archive Storage • Long term data preservation

  4. Storage Architecture • Online • Direct Attached Storage (DAS) • Just a Bunch Of Disks (JBOD): intermediate processing • Redundant Array of Independent Disks (RAID): database, web/FTP, and product generation • Network Attached Storage (NAS): office automation • Storage Area Network (SAN): clustered file system for high-performance processing • Nearline: online disk cache with a high-performance tape backend (a sketch of the tier-to-workload mapping follows below)
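The pairing of storage tier to workload on this slide can be captured as a simple lookup. A minimal Python sketch; the workload labels are paraphrased for illustration:

# Illustrative mapping of workload type to the storage tier the slide
# pairs it with; the workload labels are paraphrased for this sketch.
TIER_FOR_WORKLOAD = {
    "intermediate processing": "DAS (JBOD)",
    "database/web/ftp/product generation": "DAS (RAID)",
    "office automation": "NAS",
    "high-performance clustered processing": "SAN",
    "bulk, infrequently accessed data": "Nearline (disk cache + tape)",
}

def recommend_tier(workload: str) -> str:
    # Fall back to the slide 7 evaluation process for anything unlisted.
    return TIER_FOR_WORKLOAD.get(workload, "evaluate via weighted decision matrix")

print(recommend_tier("office automation"))   # -> NAS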

  5. Storage Architecture • EDC's historical nearline experience: • EPOCH, AMASS (1987–1993): optical • AMASS (1992–2000): Quantum DLT 2000 • UniTree (1992–2001): StorageTek 3480/3490/D-3/9840 • DMF, AMASS, LAM (2000–present): StorageTek 9840/9940B

  6. Storage Architecture • Multi-Tiered Storage Vision • Online • Supported configurations: • DAS – local processing such as image processing • NAS – data sharing such as office automation • SAN – production processing • Data accessed frequently • Nearline • Integrated within the SAN • Scalable for large datasets and infrequently accessed data • Multiple copies and/or offsite storage

  7. Storage Architecture Decisions • Optimized by each individual program and program manager rather than by the enterprise • Requirements factors: • Reliability – data preservation • Performance – data access • Cost – $/GB, engineering support, O&M • Scalability – data growth, multi-mission support, etc. • Compatibility with the current architecture • Options are evaluated and recommended through engineering white papers and weighted decision matrices (a scoring sketch follows below)
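The weighted decision matrices mentioned above reduce to a simple score: each option is rated per criterion, and the ratings are combined using the criteria weights. A minimal sketch, assuming hypothetical weights and ratings (the real values came from the EDC engineering white papers):

# Sketch of weighted-matrix scoring; weights and ratings are invented
# for illustration and do not reproduce any actual EDC evaluation.
CRITERIA = {              # criterion -> weight (weights sum to 1.0)
    "reliability": 0.40,
    "performance": 0.25,
    "cost":        0.20,
    "scalability": 0.15,
}

def weighted_score(ratings):
    # ratings: criterion -> rating on a 1-5 scale
    return sum(CRITERIA[c] * r for c, r in ratings.items())

candidates = {
    "RAID option A": {"reliability": 5, "performance": 4, "cost": 2, "scalability": 4},
    "RAID option B": {"reliability": 4, "performance": 5, "cost": 4, "scalability": 3},
}

for name in sorted(candidates, key=lambda n: -weighted_score(candidates[n])):
    print(f"{name}: {weighted_score(candidates[name]):.2f}")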

  8. High Performance RAID Weighted Matrix

  9. Bulk RAID Weighted Matrix

  10. CR1 Storage in Terabytes – May 2004

  11. CR1 SAN/Nearline Architecture (diagram) • DMF server on the SAN via 2Gb and 1Gb Fibre Channel, with Ethernet to product distribution • Disk cache file systems: /dmf/edc 68GB, /dmf/doqq 547GB, /dmf/guo 50GB, /dmf/pds 223GB, /dmf/pdsc 1100GB • Tape drives: 8x9840, 2x9940B (a policy sketch of the nearline tier follows below)
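DMF's role in this architecture is hierarchical storage management: files live in the disk cache while active and migrate to 9840/9940B tape as the cache fills. A hedged sketch of that policy, assuming made-up watermark thresholds and a placeholder tape writer (this is not the SGI DMF API):

import os

HIGH_WATER = 0.90   # assumed: begin migration above 90% cache usage
LOW_WATER  = 0.70   # assumed: stop once usage falls below 70%

def cache_usage(path):
    st = os.statvfs(path)
    return 1.0 - st.f_bavail / st.f_blocks

def files_by_last_access(path):
    # Oldest-accessed files first: the best migration candidates.
    found = []
    for root, _, names in os.walk(path):
        for n in names:
            p = os.path.join(root, n)
            found.append((os.stat(p).st_atime, p))
    return [p for _, p in sorted(found)]

def copy_to_tape(path):
    # Placeholder for the tape-writing step; real DMF also leaves a
    # stub behind so the file is recalled transparently on next access.
    print(f"would migrate to tape: {path}")

def migrate(cache="/dmf/pdsc"):   # cache file system named on the slide
    if cache_usage(cache) < HIGH_WATER:
        return
    for path in files_by_last_access(cache):
        copy_to_tape(path)
        if cache_usage(cache) < LOW_WATER:
            break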

  12. Future Seamless/Silo Architecture (diagram) • DMF FTP server (lxs37) • PDS • Tape library: 8x9840, 3x9940B • Data servers: TP9300S 3TB, TP9400 • Web/Extract servers • Connectivity: Ethernet, CIFS mounts

  13. Storage Architecture Agenda • Online/Nearline Storage Architecture • System Backup Architecture • Onsite Short-term system recovery • Offsite Disaster Recovery • Archive Storage • Long term data preservation

  14. System Backup Architecture • ITS is responsible for generating system backups to maintain system integrity • Promotes a centralized data backup solution to the projects • Legato is used for automated system backups on the Unix (SGI, Sun, Linux) platforms • ArcServe is used for automated system backups on the Windows-based platforms • Fully automated backup solution • Tapes located within the tape library • Retention period is three months

  15. System Backup Architecture • Unix Servers • Weekly full backups with daily incrementals: • System partitions • Local and third-party software packages • Databases • DORRAN, Earth Explorer, Inventory, Seamless • Legato Oracle Module for very large databases • Quarterly full backups with daily incrementals • RAID datasets (DRG, Browse, Anonymous FTP) • Backups that exclude image files and large files (see the filter sketch below) • User data file systems
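The "exclusion of image files and large files" rule for the RAID datasets amounts to a file filter. A minimal sketch, with an assumed extension list and size cutoff (the real exclusions lived in the backup client configuration):

import os

IMAGE_EXTS = {".tif", ".img", ".jpg"}    # assumed image extensions
MAX_BYTES  = 500 * 1024 * 1024           # assumed 500 MB cutoff

def include_in_backup(path):
    # Skip image files and anything over the size cutoff.
    if os.path.splitext(path)[1].lower() in IMAGE_EXTS:
        return False
    return os.path.getsize(path) <= MAX_BYTES

def backup_list(tree):
    # Yield every file under the tree that passes the filter.
    for root, _, names in os.walk(tree):
        for n in names:
            p = os.path.join(root, n)
            if include_in_backup(p):
                yield p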

  16. System Backup Architecture • Windows Servers • Typically full backups with daily incremental (no exclusions) • Workstations and PCs • Generally, no system backups • Production workstations within CR1 are backed up (International, WBV)

  17. System Backup Resources • Sun E450 (4 CPU) running Legato for Unix • Dell 2550 (2 CPU) running ArcServe for Windows • StorageTek L700 library with six SDLT 220 drives • Overland Storage NEO 4100 library with three LTO-2 drives

  18. Legato Monthly Data Backups

  19. Offsite Backup Architecture • ITS is responsible for generating offsite backups for disaster recovery • Mission-essential data is written to media and stored offsite • An LTO-2 tape is generated once per week • Data is written in an open format (tar; a minimal sketch follows below) • Retention period is three months • Projects currently using offsite storage: • DORRAN • Inventory • EarthExplorer • Digital Archive • LAS Source Code • Web Servers
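Because the offsite copies use tar, they can be restored on any system without proprietary backup software. A minimal sketch with Python's standard tarfile module; the dataset paths are hypothetical:

import datetime
import tarfile

datasets = ["/data/dorran", "/data/inventory"]   # hypothetical paths

stamp = datetime.date.today().isoformat()
with tarfile.open(f"offsite-{stamp}.tar", "w") as tar:
    for d in datasets:
        tar.add(d)   # adds the directory tree recursively

# Restore anywhere with:  tar -xf offsite-YYYY-MM-DD.tar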

  20. Storage Architecture Agenda • Online/Nearline Storage Architecture • System Backup Architecture • Onsite Short-term system recovery • Offsite Disaster Recovery • Archive Storage • Long term data preservation

  21. Archive Storage • Digital Archive Media Trade Study • To analyze offline digital archive technologies and recommend the next EDC archive media of choice. • Criteria in decreasing order of importance: • Reliability: A second copy will reduce risk somewhat, but a reliable technology is mandatory. Reliability is proven over time. • Performance: High capacity saves significant space and high transfer rates speed up transcription. • Cost: The actual drive cost is fairly insignificant, but the media cost is quite important.

  22. Archive Media Weighted Matrix FY04 Revision

  23. System Overview • Quantity of data to be copied (2 copies):

Data Set   Scenes      Data Volume        DCTs / HDTs    9940B
MSS-P      65,128      3.2 terabytes      118 DCTs       36
MSS-A      262,088     9.5 terabytes      277 DCTs       100
TM-A       13,733      3.6 terabytes      108 DCTs       40
TM-R       386,934     102.2 terabytes    2,357 DCTs     1,040
TM-R       ~150,000    ~40.4 terabytes    ~7,500 HDTs    420
Total      877,883     158.9 terabytes    10,758 tapes   1,636

• Number of HDTs currently transcribed on TMACS to DCT: 30,500 • Quantity of HDTs that can be landfilled after conversion: 38,000+ (a quick check of the tape counts follows below)
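A quick sanity check on the table: two copies of 158.9 terabytes on 1,636 STK 9940B cartridges implies roughly 194 GB per tape, consistent with the 9940B's 200 GB native capacity:

# Back-of-the-envelope check of the tape counts in the table above.
total_tb   = 158.9    # total data volume from the table
copies     = 2
tapes_9940 = 1636     # 9940B cartridge count from the table

gb_per_tape = total_tb * 1000 * copies / tapes_9940
print(f"{gb_per_tape:.0f} GB per tape")   # ~194 GB vs. 200 GB native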

  24. Big Changes • Your Order is in What Box?!

  25. Impact • 38,000 HDTs (1 copy) fill 13 semis • Fewer than 1,800 STK 9940B tapes (2 copies) fit in ½ the cargo space of an SUV

  26. Impact • 38,000 HDTs (1 copy) fill 13 semis • Fewer than 1,800 STK 9940B tapes (2 copies) occupy 1/3 the space of an STK silo
