Mission Support Team Storage Architectures • Presented by Ken Gacke, SAIC* • U.S. Geological Survey, EROS Data Center, Sioux Falls, SD • July 2004 • * Work performed under U.S. Geological Survey contract 03CRCN0001
Storage Architecture Agenda • Online/Nearline Storage Architecture • System Backup Architecture • Onsite Short-term system recovery • Offsite Disaster Recovery • Archive Storage • Long term data preservation
Storage Architecture • Online • Direct Attached Storage (DAS) • Just a Bunch Of Disks (JBOD): intermediate processing • Redundant Array of Independent Disks (RAID): database, Web/ftp, and product generation • Network Attached Storage (NAS): office automation • Storage Area Network (SAN): clustered file system for high performance processing • Nearline: online disk cache with high performance tape backend
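The nearline idea on this slide, an online disk cache in front of a tape backend, can be illustrated with a small sketch. This is not the actual policy of DMF, AMASS, or any other product named in this deck: the `CachedFile` record, the `pick_for_migration` function, the `target_fill` parameter, and the least-recently-accessed rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachedFile:
    """Hypothetical record for a file resident in the disk cache."""
    path: str
    size_gb: float
    last_access_day: int  # day number of the most recent access (larger = more recent)


def pick_for_migration(files, cache_capacity_gb, target_fill=0.8):
    """Choose files to release from disk (after a tape copy exists).

    Assumed policy: release the least recently accessed files first,
    stopping once cache usage is at or below the target fill level.
    """
    used = sum(f.size_gb for f in files)
    limit = cache_capacity_gb * target_fill
    selected = []
    for f in sorted(files, key=lambda item: item.last_access_day):
        if used <= limit:
            break
        selected.append(f)
        used -= f.size_gb
    return selected
```

Calling `pick_for_migration(files, cache_capacity_gb=100, target_fill=0.5)` returns the oldest files whose removal brings usage down to 50 GB; an empty list means the cache is already below the target.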
Storage Architecture • EDC’s historical nearline experience: • EPOCH, AMASS – (1987 – 1993) • Optical • AMASS – (1992 – 2000) • Quantum DLT 2000 • UniTree – (1992 – 2001) • StorageTek 3480/3490/D-3/9840 • DMF, AMASS, LAM (2000 – Present) • StorageTek 9840/9940B
Storage Architecture • Multi Tiered Storage Vision • Online • Supported Configurations • DAS – Local processing such as image processing • NAS – Data sharing such as office automation • SAN – Production processing • Data accessed frequently • Nearline • Integrated within SAN • Scalable for large datasets & infrequently accessed data • Multiple Copies and/or Offsite Storage
Storage Architecture Decisions • Optimized by individual program and program manager, not the enterprise. • Requirements Factors • Reliability – Data Preservation • Performance – Data Access • Cost – $/GB, Engineering Support, O&M • Scalability – Data Growth, Multi-mission, etc. • Compatibility with current Architecture • Evaluated and recommended through engineering white papers and weighted decision matrices.
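A weighted decision matrix of the kind mentioned on this slide can be sketched in a few lines. The five criteria come from the slide; the weights, the two candidate options, and their scores are hypothetical placeholders, not values from any EDC white paper.

```python
def weighted_score(weights, scores):
    """Sum of weight * score over all criteria; both dicts must share the same keys."""
    if weights.keys() != scores.keys():
        raise ValueError("weights and scores must cover the same criteria")
    return sum(weights[c] * scores[c] for c in weights)


def rank_options(weights, options):
    """Return option names ordered from highest to lowest weighted score."""
    return sorted(
        options,
        key=lambda name: weighted_score(weights, options[name]),
        reverse=True,
    )


# Hypothetical weights for the slide's requirement factors (higher = more important).
WEIGHTS = {
    "reliability": 5,
    "performance": 4,
    "cost": 3,
    "scalability": 2,
    "compatibility": 1,
}

# Hypothetical candidates, each scored from 1 (poor) to 5 (good) per criterion.
OPTIONS = {
    "option_a": {"reliability": 5, "performance": 3, "cost": 2,
                 "scalability": 4, "compatibility": 5},
    "option_b": {"reliability": 3, "performance": 5, "cost": 4,
                 "scalability": 3, "compatibility": 2},
}
```

`rank_options(WEIGHTS, OPTIONS)` orders the candidates by total score. Changing a weight and re-ranking is a quick way to see how sensitive a recommendation is to the assumed priorities.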
CR1 SAN/Nearline Architecture (diagram) • DMF Server • Disk Cache: /dmf/edc 68GB, /dmf/doqq 547GB, /dmf/guo 50GB, /dmf/pds 223GB, /dmf/pdsc 1100GB • Tape Drives: 8x9840, 2x9940B • Product Distribution • Links: 2Gb Fibre, 1Gb Fibre, Ethernet
Future Seamless/Silo Architecture (diagram) • DMF • FTP (lxs37) • PDS • Tape Library: 8x9840, 3x9940B • Data Servers • TP9300S 3TB, TP9400 • Web/Extract • Links: Ethernet, CIFS Mount
Storage Architecture Agenda • Online/Nearline Storage Architecture • System Backup Architecture • Onsite Short-term system recovery • Offsite Disaster Recovery • Archive Storage • Long term data preservation
System Backup Architecture • ITS is responsible for generating system backups to maintain system integrity. • Promote centralized data backup solution to the Projects • Legato is used for automated system backups for the Unix (SGI, SUN, Linux) platforms. • ArcServe is used for automated system backups for the Windows based platform. • Fully automated backup solution • Tapes located within tape library • Retention period is three months
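The three-month retention period can be expressed as a simple date check. This sketch assumes "three months" means calendar months, with the day clamped to the end of shorter months. The function names are hypothetical and are not part of Legato or ArcServe.

```python
import calendar
from datetime import date


def months_before(day, months):
    """Return the date `months` calendar months before `day`, clamped to month end."""
    index = day.year * 12 + (day.month - 1) - months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def is_expired(written_on, today, retention_months=3):
    """True if a backup written on `written_on` has aged out of the retention period."""
    return written_on < months_before(today, retention_months)
```

A tape written exactly on the cutoff date is still retained under this reading. A site could equally define retention as a fixed number of days.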
System Backup Architecture • Unix Server • Weekly Full backups with daily incremental: • System partitions • Local and third party software packages • Databases • DORRAN, Earth Explorer, Inventory, Seamless • Legato Oracle Module for Very Large Databases • Quarterly Full backups with daily incremental • RAID Datasets (DRG, Browse, Anonymous FTP) • Backups with exclusion of image files and large files • User data file systems
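The two Unix schedules above (weekly fulls and quarterly fulls, each with daily incrementals) can be sketched as functions that map a date to a backup level. The slides do not say which day the full backups run, so the Sunday weekly full and the first-day-of-quarter full are assumptions, and the function names are hypothetical.

```python
from datetime import date

FULL_WEEKDAY = 6  # Assumption: weekly fulls run on Sunday (date.weekday() has Monday == 0).


def weekly_level(day):
    """Backup level for system partitions, software packages, and databases."""
    return "full" if day.weekday() == FULL_WEEKDAY else "incremental"


def quarterly_level(day):
    """Backup level for RAID datasets: full on the first day of a quarter (assumed)."""
    is_quarter_start = day.month in (1, 4, 7, 10) and day.day == 1
    return "full" if is_quarter_start else "incremental"
```

Keeping the two schedules as separate functions mirrors the slide's split between system data and the large RAID datasets.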
System Backup Architecture • Windows Servers • Typically full backups with daily incremental (no exclusions) • Workstations and PCs • Generally, no system backups • Production workstations within CR1 are backed up (International, WBV)
System Backup Resources • Dell 2550 (2 CPU): ArcServe, Windows • Sun E450 (4 CPU): Legato, Unix • Overland Storage NEO 4100, three LTO-2 drives • StorageTek L700, six SDLT 220 drives
Offsite Backup Architecture • ITS is responsible for generating offsite backups for disaster recovery • Mission essential data written to media and stored offsite • LTO-2 tape generated once per week • Data written in an open format (tar) • Retention period is three months • Projects currently using offsite storage • DORRAN • Inventory • EarthExplorer • Digital Archive • LAS Source Code • Web Servers
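Writing data in an open format such as tar means any standard tool can read it back later. As a minimal sketch, Python's standard `tarfile` module can produce an uncompressed archive. The function name and the idea of writing to an ordinary file path (rather than a tape device) are assumptions for illustration and do not describe the actual ITS procedure.

```python
import os
import tarfile


def write_offsite_archive(source_dir, archive_path):
    """Write source_dir into an uncompressed tar archive and return its member names."""
    base = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(archive_path, "w") as tar:
        # tar.add() recurses into directories by default.
        tar.add(source_dir, arcname=base)
    # Re-open the archive to confirm it is readable and list what was stored.
    with tarfile.open(archive_path, "r") as tar:
        return sorted(tar.getnames())
```

Reading the archive back immediately after writing is a cheap sanity check that the copy destined for offsite storage is usable.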
Storage Architecture Agenda • Online/Nearline Storage Architecture • System Backup Architecture • Onsite Short-term system recovery • Offsite Disaster Recovery • Archive Storage • Long term data preservation
Archive Storage • Digital Archive Media Trade Study • To analyze offline digital archive technologies and recommend the next EDC archive media of choice. • Criteria in decreasing order of importance: • Reliability: A second copy will reduce risk somewhat, but a reliable technology is mandatory. Reliability is proven over time. • Performance: High capacity saves significant space and high transfer rates speed up transcription. • Cost: The actual drive cost is fairly insignificant, but the media cost is quite important.
Archive Media Weighted Matrix FY04 Revision
System Overview • Quantity of data to be copied:

Data Set   Scenes      Data Volume         DCTs / HDTs     9940B (2 copies)
MSS-P      65,128      3.2 terabytes       118 DCTs        36
MSS-A      262,088     9.5 terabytes       277 DCTs        100
TM-A       13,733      3.6 terabytes       108 DCTs        40
TM-R       386,934     102.2 terabytes     2,357 DCTs      1,040
TM-R       ~150,000    ~40.4 terabytes     ~7,500 HDTs     420
Total      877,883     158.9 terabytes     10,758 tapes    1,636

• Number of HDTs currently transcribed on TMACS to DCT: 30,500 • Quantity of HDTs that can be landfilled after conversion: 38,000+
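The 9940B column can be sanity-checked with a rough lower bound on cartridge count. This sketch assumes a native (uncompressed) capacity of 200 GB per 9940B cartridge and decimal terabytes; the function name is hypothetical. It ignores per-tape overhead and any rule about keeping data sets on separate tapes, so it gives a floor rather than a prediction.

```python
import math
from decimal import Decimal

CARTRIDGE_GB = 200  # Assumption: 9940B native (uncompressed) capacity per cartridge.


def min_cartridges(volume_tb, copies=2, cartridge_gb=CARTRIDGE_GB):
    """Lower bound on cartridges needed for `copies` copies of `volume_tb` terabytes."""
    # Decimal avoids binary floating-point rounding on values such as 158.9.
    total_gb = Decimal(str(volume_tb)) * 1000 * copies
    return math.ceil(total_gb / cartridge_gb)
```

Under these assumptions the floor for the 158.9-terabyte total comes out somewhat below the 1,636 cartridges listed on the slide, which is the direction one would expect if tapes are not packed perfectly full.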
Big Changes • Your Order is in What Box?!
Impact • 38,000 HDTs (1 copy): 13 semis • < 1,800 STK 9940B (2 copies): ½ cargo space in SUV
Impact • 38,000 HDTs (1 copy): 13 semis • < 1,800 STK 9940B (2 copies): 1/3 space of STK silo