Storage Resource Broker: Case Studies
George Kremenek, kremenek@sdsc.edu
Projects
Digital Sky Project (NPACI) {NVO (NSF)}
Hayden Planetarium Simulation & Visualization
ASCI - Data Visualization Corridor (DOE)
Visual Embryo Project (NLM)
Long Term Archiving Project (NARA)
Information Power Grid (NASA)
Particle Physics Data Grid (DOE) {GriPhyN (NSF)}
Biomedical Informatics Research Network (NIH)
RoadNet (NSF)
Grid Portal (NPACI)
NSDL – National Science Digital Library (NSF)
Knowledge Network for Biocomplexity (NSF)
Tera Scale Computing (NSF)
Hyper LTER
Earth System Sciences – CEED, Bionome, SIO Explorer
Education – Transana (NPACI)
Mol Science – JCSG, AfCS
Digital Libraries – ADL, Stanford, UMichigan, UBerkeley, CDL
Problem
Transfer and share terabytes of data across the Internet.
Simulation data produced at NCSA, visualized at SDSC, validated at AMNH, NCSA, UVa and other sites, and consumed at AMNH.
Data sizes ranged from 3 TB to 10 TB.
Other sites (CalTech, BIRN) were used as cache resources.
Hayden Planetarium Project: "A Search for Life: Are We Alone?"
The animations were produced for the new planetarium show "A Search for Life: Are We Alone?", narrated by Harrison Ford. The show opened Saturday, March 2nd, 2002.
Sites involved in the project:
AMNH = American Museum of Natural History
NCSA = National Center for Supercomputing Applications
SDSC = San Diego Supercomputer Center
University of Virginia
CalTech, NASA, UCSD
Hayden Credits
People involved:
AMNH: Anthony Braun (Producer), Carter Emmart (Director), Erik Wesselak, Clay Budin, Ryan Wyatt, Mordecai Mac Low (Asst. Curator, Dept. of Astrophysics)
NCSA: Stuart Levy, Bob Patterson
SDSC: David R. Nadeau, Erik Enquist, George Kremenek, Larry Diegel, Eva Hocks
U. Virginia: Prof. John F. Hawley
Hayden Data in SRB
Disk accretion: simulation run at SDSC by John Hawley. Data stored in SRB.
Jet imagery: images from the Hubble Space Telescope. Data stored in SRB.
Flight path: planned at NCSA and AMNH. Data stored in SRB.
Hayden Data Flow (diagram): simulation data from NCSA (SGI, 2.5 TB UniTree) moved to SDSC (IBM SP2, 7.5 TB GPFS, 7.5 TB HPSS) for visualization, with CalTech and BIRN as additional resources; production parameters, movies, and images were exchanged with AMNH (NYC); visualization also at UVa.
Hayden, Data Involved
ISM (interstellar medium): simulation run by Mordecai Mac Low of AMNH at NCSA; 2.5 terabytes sent from NCSA to SDSC. Data stored in SRB (HPSS, GPFS).
Ionization: simulation run at AMNH; 117 gigabytes sent from AMNH to SDSC. Data stored in SRB.
Star motion: simulation run at AMNH by Ryan Wyatt; 38 megabytes sent from AMNH to SDSC.
Hayden Totals
Data total: 3 × 2.5 TB = 7.5 TB
Files: 3 × 9,827 files + miscellaneous files
Duration: December 2001 through February 2002
Hayden Conclusions
The SRB was used as a central repository for all original, processed, and rendered data.
Location transparency was crucial for data storage, data sharing, and easy collaboration.
The SRB was successfully used for a commercial project under an "impossible" production deadline dictated by the marketing department.
Collaboration across sites was made feasible with the SRB.
Advanced Simulation and Computing (ASCI)
Area: advanced computation, three-dimensional modeling, simulation, and visualization.
Problem: evaluate the SRB as an advanced data handling platform for the DOE data visualization corridor.
Requirements: the SRB must work well with HPSS for handling large files as well as large numbers of small files, with data movement in "bulk" by researchers.
ASCI and DataCutter
ASCI is currently evaluating the DataCutter technology in SRB.
DataCutter, developed by the University of Maryland and Ohio State University, handles subsetting and filtering of multidimensional data (sketched below).
ASCI is interested in the integration of DataCutter with SRB for the advanced visualization corridor.
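A minimal sketch of the kind of subsetting and filtering DataCutter provides, assuming a simple in-memory 3-D volume; the function names (subset_volume, threshold_filter), the array shape, and the threshold are hypothetical, and this is not DataCutter's or SRB's actual API:

    # Sketch of DataCutter-style subsetting and filtering of a multidimensional
    # dataset (hypothetical API, not the real DataCutter or SRB interfaces).
    import numpy as np

    def subset_volume(volume, x_range, y_range, z_range):
        """Return the hyperslab of a 3-D dataset selected by three index ranges."""
        (x0, x1), (y0, y1), (z0, z1) = x_range, y_range, z_range
        return volume[x0:x1, y0:y1, z0:z1]

    def threshold_filter(block, minimum):
        """Pass only cells at or above a threshold; everything else is zeroed."""
        return np.where(block >= minimum, block, 0.0)

    # Example: pull one 32^3 region out of a 128^3 volume and filter it.
    volume = np.random.rand(128, 128, 128).astype(np.float32)
    region = subset_volume(volume, (0, 32), (48, 80), (96, 128))
    filtered = threshold_filter(region, 0.9)

The point of running such filters near the data is that only the selected subset has to cross the network to the visualization application.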
ASCI Data Flow (diagram): data movement across three hosts, from applications and SRB clients (local file system, data cache), through an SRB server, to an SRB server with MCAT (Oracle) and HPSS.
ASCI People
ASCI project (LLNL): Celeste Matarazzo, Punita Sinha
Storage Resource Broker (SDSC): Michael Wan, Arcot Rajasekar, Reagan Moore
DataCutter (Univ. of Maryland, OSU): Joel Saltz, Tahsin Kurc, Alan Sussman
ASCI
Timeline: 1999 - Dec 2002
Data sizes: very large files (multi-GB); large numbers of small files (over a million); total size exceeding 2 TB for each run.
SRB solution: highly integrated SRB/HPSS interoperation; the SRB data mover protocol was adapted to the HPSS parallel mover protocol.
ASCI Parallel Protocol
The HPSS server directs the parallel data transfer scheme, using the HPSS class-of-service feature.
The SRB server uses HPSS's parallel mover protocol.
Transfer rates of up to 40 MB/sec and speedups of 2 to 5 times can be achieved using multiple threads (see the sketch below).
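The slides do not spell out the adapted mover protocol itself; the sketch below only illustrates the general idea behind the multi-threaded speedup, assuming a plain file split into byte ranges. send_segment and parallel_put are hypothetical names, and the network send is left as a placeholder:

    # Sketch of segmented, multi-threaded file transfer (illustrative only;
    # not the SRB or HPSS parallel mover protocol).
    import os
    from concurrent.futures import ThreadPoolExecutor

    def send_segment(path, offset, length, stream_id):
        """Placeholder for one mover stream: read its byte range and ship it."""
        with open(path, "rb") as f:
            f.seek(offset)
            data = f.read(max(0, length))
        # A real mover would write `data` to its own network connection here.
        return stream_id, len(data)

    def parallel_put(path, num_streams=4):
        """Split a file into num_streams byte ranges and move them concurrently."""
        size = os.path.getsize(path)
        chunk = (size + num_streams - 1) // num_streams
        with ThreadPoolExecutor(max_workers=num_streams) as pool:
            jobs = [pool.submit(send_segment, path, i * chunk,
                                min(chunk, size - i * chunk), i)
                    for i in range(num_streams)]
            return [job.result() for job in jobs]

Each stream keeps its own connection busy, which is where a multi-stream speedup over a single stream comes from.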
ASCI Small Files
Ingesting a very large number of small files into SRB is time consuming if the files are ingested one at a time; this was greatly improved with bulk ingestion.
Ingestion was broken down into two parts: the registration of files with MCAT, and the I/O operations (file I/O and network data transfer). Multi-threading was used for both the registration and the I/O operations.
A new utility, Sbload, was created for this purpose. It reduced the ASCI benchmark time for ingesting ~2,100 files from ~2.5 hours to ~7 seconds (a rough sketch of the bulk approach follows).
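Sbload's internals are not shown in the slides; as a rough sketch of the bulk idea under stated assumptions (an in-memory stand-in for MCAT, hypothetical names such as register_batch and bulk_ingest):

    # Sketch of bulk ingestion: batched catalog registration plus threaded I/O
    # (illustrative only; Sbload's real implementation and the MCAT schema differ).
    from concurrent.futures import ThreadPoolExecutor

    BATCH_SIZE = 500        # hypothetical registration batch size
    catalog = []            # stand-in for MCAT; the real catalog lives in a database

    def register_batch(entries):
        """One catalog round trip registers a whole batch of files."""
        catalog.extend(entries)          # real code: a single multi-row insert

    def copy_file(path):
        """Placeholder for the per-file transfer into the storage resource."""
        return path, True

    def bulk_ingest(paths, threads=8):
        # Batch the metadata registrations ...
        for i in range(0, len(paths), BATCH_SIZE):
            register_batch([{"logical_name": p} for p in paths[i:i + BATCH_SIZE]])
        # ... and overlap the data copies with a thread pool.
        with ThreadPoolExecutor(max_workers=threads) as pool:
            return list(pool.map(copy_file, paths))

    # ~2,100 files: one-at-a-time ingestion pays ~2,100 catalog round trips;
    # batching pays ~5 and overlaps the copies.
    results = bulk_ingest([f"run42/file{i:05d}.dat" for i in range(2100)])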
ASCI Conclusions
A very large number (2 million) of small and average-sized files can be ingested into SRB (HPSS) in a short time.
Sbload (with bulk SRB registration) can load and register up to 300 files a second.
Sbload will be included in the next SRB release and can be used with other resources as well.
Digital Sky
2MASS (Two Micron All Sky Survey): Bruce Berriman, John Good, Wen-Piao Lee (IPAC, Caltech)
NVO (National Virtual Observatory): Tom Prince (Caltech), Roy Williams (CACR, Caltech), John Good (IPAC, Caltech)
SDSC – SRB: Arcot Rajasekar, Mike Wan, George Kremenek, Reagan Moore
Digital Sky - 2MASS
http://www.ipac.caltech.edu/2mass
The input data was on tapes in random order.
Ingestion took nearly 1.5 years, running almost continuously.
SRB performed a spatial sort on data insertion, using the 800 GB disk cache in front of the HPSS containers.
Digital Sky Data Ingestion (diagram): input tapes from the telescopes (10 TB) and the star catalog in Informix (Sun) at IPAC, Caltech; SRB on a Sun E10K at SDSC with an 800 GB data cache in front of HPSS.
Digital Sky Data Ingestion
4 parallel streams (4 MB/sec per stream), running 24×7×365.
Total: 10+ TB; 5 million 2 MB images in 147,000 containers.
Ingestion speed was limited by the input tape reads; only two tapes per day could be read.
The workflow incorporated persistent features to deal with network outages and other failures (see the sketch below).
The C API was used for fine-grained control and to manipulate and insert metadata into the Informix catalog at IPAC, Caltech.
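The "persistent features" are only named, not described; the sketch below shows one plausible pattern under that assumption: progress is checkpointed to a file and transient failures are retried, so an interrupted run resumes where it stopped. The file names, function names, and retry policy are hypothetical, not the actual 2MASS pipeline:

    # Sketch of a resumable ingestion loop with retries (an assumption about the
    # workflow's "persistent features", not the actual 2MASS pipeline code).
    import json, os, time

    CHECKPOINT = "ingest_checkpoint.json"   # hypothetical progress file

    def load_done():
        if os.path.exists(CHECKPOINT):
            with open(CHECKPOINT) as f:
                return set(json.load(f))
        return set()

    def save_done(done):
        with open(CHECKPOINT, "w") as f:
            json.dump(sorted(done), f)

    def ingest_one(image):
        """Placeholder for copying one image into its SRB container."""
        return True

    def run(images, retries=3, backoff_seconds=60):
        done = load_done()
        for image in images:
            if image in done:
                continue                         # already ingested before a restart
            for attempt in range(retries):
                try:
                    ingest_one(image)
                    done.add(image)
                    save_done(done)
                    break
                except OSError:
                    time.sleep(backoff_seconds)  # wait out a network or HPSS hiccup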
Data Sorting
Sorting of 5 million files on the fly: input tape files arrive in temporal order, while the stored SRB containers are in spatial order, because scientists view and analyze the data by sky neighborhood.
Data flow: files from tape are streamed to SRB, which puts them into the proper "bins" (containers).
Container cache management was a big problem: files from a single tape may go into more than 1,000 bins, and cache space limitations (300-800 GB) caused a lot of thrashing.
An SRB daemon managed the cache using watermarks (a minimal sketch follows).
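A minimal sketch of the binning-plus-watermarks idea, assuming hypothetical 10-degree sky bins, LRU eviction, and made-up watermark values; the SRB daemon's real container management is not documented here:

    # Sketch of spatial binning into containers with watermark-driven cache
    # management (illustrative; not the SRB daemon's actual algorithm).
    from collections import OrderedDict

    HIGH_WATER = 800 * 2**30      # start flushing above ~800 GB of cached containers
    LOW_WATER  = 300 * 2**30      # ... and stop once back under ~300 GB

    cache = OrderedDict()         # container id -> bytes currently in the disk cache
    cache_bytes = 0

    def container_for(ra, dec):
        """Map an image's sky coordinates to its spatial bin (container)."""
        return (int(ra) // 10, int(dec) // 10)   # hypothetical 10-degree bins

    def flush(container_id):
        """Placeholder: migrate a cold container from the disk cache to HPSS."""
        pass

    def ingest(ra, dec, size):
        global cache_bytes
        cid = container_for(ra, dec)
        cache[cid] = cache.get(cid, 0) + size
        cache.move_to_end(cid)                   # mark as most recently used
        cache_bytes += size
        if cache_bytes > HIGH_WATER:             # high watermark crossed
            while cache_bytes > LOW_WATER and cache:
                old_id, old_size = cache.popitem(last=False)   # evict the LRU bin
                flush(old_id)
                cache_bytes -= old_size

    ingest(ra=210.7, dec=-12.3, size=2 * 2**20)  # e.g. one 2 MB image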
Digital Sky Data Retrieval (diagram): an average of 3,000 images a day; web servers (Suns) and the Informix catalog at IPAC, Caltech, plus web servers (Suns, SGIs) at JPL, retrieve images through SRB on a Sun E10K at SDSC, backed by HPSS (800 GB cache, 10 TB).
Digital Sky Apps (diagram): processing the 10 TB collection on thousands of nodes (e.g. sky mosaics, classification); applications on the IBM SP2 (DTF) at SDSC access the data through SRB on a Sun E15K, backed by HPSS and 10+ TB of shared SAN disks.
DigSky Conclusion
SRB can handle a large number of files; metadata access still takes less than half a second.
Replication of large collections: a single command for geographical replication.
On-the-fly sorting (out-of-tape sorting).
Availability of data otherwise not possible: near-line access to 5 million files (10 TB), successfully used for web access and large-scale analysis daily.
Thank you for your attention. Any questions? http://www.npaci.edu/dice/srb