This study delves into the evolving landscape of mass storage, highlighting implications for image archives. It explores the exponential growth in archive sizes and traces the historical evolution of storage technologies, from punch cards to current tape and hard drive systems. The future potential of local hard drives and grid storage as alternatives to traditional tape libraries is also discussed, along with emerging technologies such as nano-storage and holographic data storage. The vision of future image archives emphasizes easily accessible data for processing and storage options for raw data, metadata, and processing code. This comprehensive analysis provides valuable insights for navigating the dynamic realm of image archiving.
Coming revolutions in mass storage: implications for image archives
Christopher D. Elvidge, Ph.D.
NOAA-NESDIS National Geophysical Data Center
E/GC2, 325 Broadway, Boulder, Colorado 80305 USA
Email: chris.elvidge@noaa.gov
and
Dr. Mikhail ZHIZHIN
Head of Information Technologies Lab
Institute of Physics of the Earth and Geophysical Center, Russian Academy of Science, Moscow, Russia
Email: jjn@wdcb.ru
APAN eScience Workshop – July 6, 2004
Image Archive Sizes Continuing to Grow Rapidly
• For example, from 1992 to 2004 satellite image ingest at NOAA-NGDC ran from six to ten GB per day.
• Once launched (~2007), NPP will produce about 2 TB of data per day, which NOAA will archive.
• During the NPOESS era (2010-2020+) there will be three satellites, each producing 2 TB of data per day, with NOAA responsible for the archive.
• There are many other examples.
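To put the ingest rates above on a yearly scale, the arithmetic can be sketched as follows. This is only an illustration: the function name, the 365-day year, and the decimal conversion (1 TB = 1000 GB) are assumptions, and the rates are the ones quoted on the slide.

```python
# Rough yearly archive growth from the daily ingest rates quoted above.
# Assumptions (not from the slide): 365-day year, decimal units (1 TB = 1000 GB).

DAYS_PER_YEAR = 365


def yearly_ingest_tb(satellites: int, tb_per_day: float = 2.0) -> float:
    """Return the yearly ingest in TB for a number of satellites at a given daily rate."""
    return satellites * tb_per_day * DAYS_PER_YEAR


# 1992-2004 ingest at NOAA-NGDC: six to ten GB per day, converted to TB per year.
legacy_low_tb = 6 / 1000 * DAYS_PER_YEAR
legacy_high_tb = 10 / 1000 * DAYS_PER_YEAR

npp_tb = yearly_ingest_tb(satellites=1)     # NPP: one satellite at 2 TB/day
npoess_tb = yearly_ingest_tb(satellites=3)  # NPOESS era: three satellites at 2 TB/day

print(f"1992-2004: {legacy_low_tb:.2f} to {legacy_high_tb:.2f} TB per year")
print(f"NPP:       {npp_tb:.0f} TB per year")
print(f"NPOESS:    {npoess_tb:.0f} TB per year")
```

Under these assumptions a single NPP-class satellite adds 730 TB per year, and three NPOESS satellites add 2190 TB (about 2.2 PB) per year, compared with a few TB per year during 1992-2004.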
Abridged History of Storage (http://www.disk-tape-data-recovery.com/storage-history.htm)
• Punch cards – back when snakes had legs.
• Ticker tape – faster than punch cards.
• Magnetic tape – IBM introduced its first magnetic tape drive in 1952.
• 1956 – IBM introduces the 305 RAMAC (Random Access Method of Accounting and Control), the first magnetic hard disk storage system. The RAMAC stored 5 megabytes (MB) of data, was the size of two large refrigerators and cost $10,000 per MB; the device could store 5 million characters of data on 50 disks, each 24 inches in diameter. Each disk could hold an equivalent of 25,000 punch cards.
[Figure: RAMAC – the first hard drive – 1956]
Abridged History of Storage (http://www.columbia.edu/acis/history/media.html)
• 9-track tapes – workhorse of image archives from the 1960s to the early 1990s. 50 MB at 1600 bpi.
• IBM MSS cartridge (1982) – held 50 MB.
• Tape strip from IBM Data Cell (mid-1960s) – 0.2 MB.
Current Standard – Tape Library System
• Used by NASA, NOAA, USGS and many others.
• Tape is widely regarded as the standard for at least another ten years.
[Figure: Storage Technology 9310 robotic tape silo, circa 1999. It can hold 6000 IBM 3590 tapes; at 20 GB each that is about 120 TB native (the quoted ~300 TB presumably assumes compression).]
LTO Tape Growth Path Already Planned (http://www.lto-technology.com/newsite/index.html)
[Figure: LTO generation roadmap, with a "Currently Available" marker]
Alternative to Tape Library Systems: Use “Local” Hard Drives Instead of Tape
• Approximate price parity between tape and hard drives.
• Allows faster access.
• Several design options (SAN, NAS).
• Hard drive capacity already in the 200 GB range and has been projected to reach 20 TB.
• Data may be more easily corrupted.
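As a rough sense of scale for the drive capacities mentioned above, the sketch below counts how many drives one year of NPOESS-era ingest (three satellites at 2 TB per day, from the earlier slide) would fill. It counts raw capacity only. The function name, the 365-day year, and decimal units are assumptions, and redundancy, spares, and file system overhead are ignored.

```python
# How many drives would one year of NPOESS-era ingest fill?
# Assumptions (not from the slides): 365-day year, decimal units, raw capacity
# only (no RAID or mirroring overhead, no spare drives).

def drives_needed(archive_gb: int, drive_gb: int) -> int:
    """Return the smallest whole number of drives whose raw capacity covers archive_gb."""
    return -(-archive_gb // drive_gb)  # integer ceiling division


# Three satellites x 2 TB/day x 365 days, expressed in GB.
yearly_archive_gb = 3 * 2000 * 365

todays_drives = drives_needed(yearly_archive_gb, drive_gb=200)        # 200 GB drives
projected_drives = drives_needed(yearly_archive_gb, drive_gb=20_000)  # 20 TB drives

print(f"200 GB drives per year of ingest: {todays_drives}")
print(f"20 TB drives per year of ingest:  {projected_drives}")
```

With these figures the count drops from 10,950 drives of 200 GB to 110 drives of 20 TB for the same year of data.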
Alternative to Tape Library Systems: Use “Local” Hard Drives Instead of Tape
• http://www.acmqueue.org/modules.php?name=Content&pa=showpage&pid=43
• http://www.firingsquad.com/hardware/building_budget_storage_server/
• http://www.archive.org/web/petabox.php
• http://nbd.sourceforge.net/
• http://www.storage.ibm.com/software/virtualization/sfs/
• http://www.microsoft.com/windowsserver2003/techinfo/overview/san.mspx
• http://www.enterprisestorageforum.com/technology/features/article.php/947551
• http://www.enterprisestorageforum.com/technology/features/article.php/981191
• http://www.cse.ohio-state.edu/~jain/refs/san_refs.htm
• http://www.brocade.com/san/pdf/whitepapers/SANvsNASWPFINAL3_01_01.pdf
Alternative to Tape Library Systems: Use “GRID” Hard Drives Instead of Tape
• Approximate price parity between tape and hard drives.
• Allows faster access.
• Several design options.
• Hard drive capacity already in the 200 GB range and has been projected to reach 20 TB.
• Community ownership may lead to more collaborations?
• Data may be more easily corrupted.
• Agencies may also choose to build a stand-alone archive to ensure long term data preservation.
• See essay http://isec.pl/papers/juggling_with_packets.txt
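The corruption concern raised for disk-based archives is commonly addressed with checksum manifests: record a digest for every file when it is ingested, then re-check the digests periodically. The slides do not describe any particular scheme, so the following is a minimal sketch under that assumption. The function names are hypothetical, and SHA-256 is just one reasonable digest choice.

```python
# Minimal checksum-manifest sketch for detecting silent corruption on disk.
# Assumption: this technique is not described in the slides; function names
# and the choice of SHA-256 are illustrative only.
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: str) -> dict:
    """Map each file under root (by relative path) to its SHA-256 digest."""
    base = Path(root)
    return {
        str(path.relative_to(base)): sha256_of(path)
        for path in sorted(base.rglob("*"))
        if path.is_file()
    }


def find_corrupted(root: str, manifest: dict) -> list:
    """Return relative paths that are missing or whose digest no longer matches."""
    base = Path(root)
    damaged = []
    for relative_path, expected in manifest.items():
        candidate = base / relative_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            damaged.append(relative_path)
    return damaged
```

In use, `build_manifest` would run once at ingest and its result would be stored away from the drive it describes. A scheduled job could then call `find_corrupted` and restore any flagged file from another copy.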
Nano-Storage Technology Still Emerging
• Molecular-scale nanowire memory cells promise unprecedented data storage http://www.azonano.com/news_old.asp?newsID=122
• Big Blue says breakthrough means millipede may crawl out of lab http://www.smalltimes.com/document_display.cfm?section_id=53&document_id=7860
Holographic Data Storage Still Emerging
[Video: InPhase promotional video]
Implementations of Nano and Holographic Data Storage
• Tape
• CD-like disks
• Hard drives
Greater storage density – lower costs – but implementation routes likely to extend current forms.
Vision of Future Image Archives
• Data easily accessed – readily processed
• Combination of data from multiple sites / multiple sources
• Copies of source data and processing tools kept on long term storage media
Storage Options in Future Image Archives
• GRID Storage (working subsets of archive): Raw Data, Metadata, Processing Code, Higher Level Products, Experimental Products, Assessments
• Tape Library Systems / Network Storage: Raw Data, Metadata, Processing Code, Higher Level Products
• Long term Survivable Storage, a.k.a. Data Vault: Raw Data, Metadata, Processing Code
Storage Options in Future Image Archives
• Widely Held Data: Raw Data, Metadata, Processing Code, Higher Level Products, Experimental Products, Assessments
• Open Storage Facility: Raw Data, Metadata, Processing Code, Higher Level Products
• Data Vault: Raw Data, Metadata, Processing Code
[Diagram axis: Number of Users]
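One way to read the two diagrams above is as a mapping from storage tier to the classes of content each tier holds. The sketch below encodes that reading as a small lookup table. The pairing of tiers with content lists is an interpretation of the flattened diagram text, and the dictionary and function names are hypothetical.

```python
# Storage tiers and the content classes each one holds, as read from the
# diagrams above. Assumption: the tier-to-content pairing is inferred from
# the slide text; names here are illustrative only.

CORE = ["Raw Data", "Metadata", "Processing Code"]

ARCHIVE_TIERS = {
    # Working subsets of the archive, widely held by many users.
    "GRID Storage": CORE + ["Higher Level Products", "Experimental Products", "Assessments"],
    # Operational access from data centers.
    "Tape Library / Network Storage": CORE + ["Higher Level Products"],
    # Long term survivable storage.
    "Data Vault": list(CORE),
}


def tiers_holding(content_class: str) -> list:
    """Return the tiers (in table order) that keep a copy of the given content class."""
    return [tier for tier, contents in ARCHIVE_TIERS.items() if content_class in contents]


print(tiers_holding("Raw Data"))
print(tiers_holding("Experimental Products"))
```

Read this way, raw data, metadata, and processing code appear in every tier, while experimental products and assessments live only in the widely distributed working copies.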
Regional Resources
• Singapore Data Storage Institute: established by the Agency for Science, Technology & Research, or A*STAR (then known as the National Science & Technology Board), and the National University of Singapore (NUS) http://www.dsi.a-star.edu.sg/research/spintronics.html
• Others?
Conclusions
Advances in storage capacity and reductions in cost will allow archive storage to diversify, with copies held to meet specific objectives:
• Widely distributed collections used in current projects.
• Tape and hard drive media to provide operational access from data centers.
• Long term “survivable” storage: two or more copies on highly durable media to preserve data for hundreds of years, with the ability to survive technological collapse and the reengineering of read capacity.