SRB as data grid solution (Chinese version). Arun Jagatheesan, arun@sdsc.edu, San Diego Supercomputer Center. SRB Workshop, National Center for High-performance Computing (NCHC), Taiwan, August 3, 2004.
SRB? SRB = Storage Resource Broker
More Chinese Oops, don’t know any more Chinese to continue
SRB as data grid solution (English version) Arun Jagatheesan arun@sdsc.edu San Diego Supercomputer Center
Talk Outline • Introduction to problem statement(s) • How SRB is the solution • SRB project history • SRB team • SRB architecture (from the architect himself)
What problem, why SRB solution? • Why are people using SRB? • What problems did it solve for them? • Who are these people? • Did they use it because they liked Arun?
Southern California Earthquake Center • Build community digital library • Manage simulation and observational data • Anelastic wave propagation output • 10 TB, 1.5 million files • Provide web-based interface • Support standard services on digital library • Manage data distributed across multiple sites • USC, SDSC, UCSB, SDSU, SIO • Provide standard metadata • Community-based descriptive metadata • Administrative metadata • Application-specific metadata
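As a rough illustration of the three metadata categories listed for the SCEC library, the sketch below attaches them to a file record and filters a collection by a descriptive attribute. All field names, file names, and values are hypothetical; the slide does not specify the SCEC schema, and this is not SRB code.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    logical_name: str
    descriptive: dict = field(default_factory=dict)     # community-based descriptive metadata
    administrative: dict = field(default_factory=dict)  # e.g. owning site (assumed field)
    application: dict = field(default_factory=dict)     # application-specific metadata


def find(records, **descriptive_filters):
    """Return records whose descriptive metadata matches every filter."""
    return [
        r for r in records
        if all(r.descriptive.get(k) == v for k, v in descriptive_filters.items())
    ]


# Hypothetical records; names and attributes are made up for illustration.
records = [
    FileRecord("run1/wave_0001.dat", {"kind": "simulation"}, {"site": "SDSC"}, {"timestep": 1}),
    FileRecord("obs/station_42.dat", {"kind": "observation"}, {"site": "USC"}, {}),
]

matches = find(records, kind="simulation")
print([r.logical_name for r in matches])
```

Keeping the three categories in separate dictionaries is only one possible layout; it makes it easy to search on community metadata without touching administrative fields.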
SCEC Data Management Technologies • Portals • Knowledge interface to the library, presenting a coherent view of the services • Knowledge Management Systems • Organize relationships between SCEC concepts and semantic labels • Process management systems • Data processing pipelines to create derived data products • Web services • Uniform capabilities provided across SCEC collections • Data grid • Management of collections of distributed data • Computational grid • Access to distributed compute resources • Persistent archive • Management of technology evolution
NASA Data Grids • NASA Information Power Grid • NASA Ames, NASA Goddard • Distributed data collection using the SRB • ESIP federation • Led by Joseph JaJa (U Md) • Federation of ESIP data resources using the SRB • NASA Goddard Data Management System • Storage repository virtualization (Unix file system, Unitree archive, DMF archive) using the SRB • NASA EOS Petabyte store • Storage repository virtualization for EMC persistent store using the Nirvana version of SRB
[Figure: TeraGrid site and network diagram] TeraGrid: 13.6 TF, 6.8 TB memory, 900 TB network disk, 10 PB archive • ANL: 1 TF, 0.25 TB memory, 25 TB disk • Caltech: 0.5 TF, 0.4 TB memory, 86 TB disk • NCSA: 6+2 TF, 4 TB memory, 400 TB disk • SDSC: 4.1 TF, 2 TB memory, 500 TB SAN • Networks shown: vBNS, Abilene, MREN, Calren, ESnet, NTON, HSCC, Starlight (OC-3, OC-12, and OC-48 links)
NIH BIRN SRB Data Grid • Biomedical Informatics Research Network • Access and analyze biomedical image data • Data resources distributed throughout the country • Medical schools and research centers across the US • Stable, high-performance, grid-based environment • Coordinate data sharing • Federate collections • Support data mining and analysis
SDSC SRB User Community (Major US) • National Science Digital Library (NSDL) • National Optical Astronomy Observatory (NOAO) • ROADNet • Purdue University • SCCOOS, USA • Scientific Rich Media Archive • Salk Institute • Strand Map Service, USA • UC Berkeley Library • UCSD Library • University of Houston • Persistent Archives Testbed • University of Wisconsin, Madison • WebBase, Stanford University • Yale University Library • BaBar, Stanford Linear Accelerator Center (SLAC) • California Digital Library (CDL) • Center for Integrated Space Weather Modeling (CISM) • CVC, Visualization Portal • LDC Data Storage • NIH Biomedical Informatics Research Network (BIRN) • NSF Southern California Earthquake Center (SCEC) • National Archives and Records Administration (NARA) • National Aeronautics and Space Administration Centers (NASA) • National Virtual Observatory (NVO) • Npackage, NSF Middleware Initiative (NMI)
SDSC SRB User Community • Academia Sinica, Taiwan • Australian National University • Bio-Lab, University of Genoa, Italy • Council for the Central Laboratory of the Research Councils (CCLRC), UK • CC-IN2P3, France • Distributed Framework, Singapore • Distributed Aircraft Maintenance Environment (DAME), UK • eMinerals Project, UK • eScience, Belfast Center • Fraunhofer ITWM, Germany • High Energy Accelerator Organization, KEK, Japan • K* Grid Computing, Korea • KEK Computing Center, Japan • Lyon, France • NorGrid, Norway • Nanyang Data Grid, Singapore • Queensland University of Technology (QUT), Australia • Rutherford Appleton Laboratory (RAL), UK • T-Systems, Germany • UK eScience Project, UK • UniGrid, Poland • UMK, Poland • Virtual Laboratory for eScience, Netherlands
What problem, why SRB solution? • Why are people using SRB? • What problems did it solve for them? • Who are these people? • Did they use it because they liked Arun?
Why do they use SRB? • Distributed unstructured data management • Data grids, digital libraries, persistent archives, workflow/dataflow pipelines, knowledge generation • Distributed data storage provisioning • Common logical namespace for data and storage • Data publication • Browsing and discovery of data in collections • Data preservation • Management of technology evolution
Total data brokered by SDSC SRB • 358 TB + 324 TB = 682 TB
Looking back… • 1995: MDAS Project by DARPA • 1998: SRB releases • 2000: Arun joins SRB • Only after that does SRB become a hit (lucky guy, just kidding) • 2000+: Multiple client interfaces, many more functionalities, multiple projects across the world • 2004: NCHC and its end users in Taiwan demonstrate significant interest in SRB (through this workshop)
Physical Layer (Real World) • Distributed digital entities • Heterogeneous and distributed storage resources • Autonomous Organizations • Distributed Users, distributed authentication • Heterogeneous authorization schemes • Users; sub-organizations; organizations/enterprises; virtual organizations
Data Grid Transparencies/Virtualizations (bits, data, information, ...) • Inter-organizational Information Storage Management • Virtual Data Transparency • Data Replica Transparency • Data Identifier Transparency • Storage Location Transparency • Storage Resource Transparency • [Figure labels: myActiveNeuroCollection, patientRecordsCollection; image.cgi, image.wsdl, image.sql; image_0.jpg…image_100.jpg; E:\srbVault\image.jpg, /users/srbVault/image.jpg; Select … from srb.mdas.td where...]
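The identifier, replica, and storage location transparencies named on this slide can be illustrated with a minimal sketch, assuming a hypothetical in-memory catalog. This is not the SRB API or its catalog schema: the class names, resource names, and the logical path are illustrative assumptions, and only the two physical paths are taken from the slide. The idea shown is that a client asks for a logical name and the catalog picks a reachable physical copy.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    resource: str       # hypothetical storage resource name
    physical_path: str  # where the bits actually live
    online: bool = True


@dataclass
class LogicalCatalog:
    """Toy stand-in for a metadata catalog; not the real SRB interface."""
    entries: dict = field(default_factory=dict)

    def register(self, logical_name, replica):
        # One logical name may point at many physical copies.
        self.entries.setdefault(logical_name, []).append(replica)

    def resolve(self, logical_name):
        # Return the first replica whose resource is marked online.
        for replica in self.entries.get(logical_name, []):
            if replica.online:
                return replica
        raise LookupError(f"no online replica for {logical_name}")


catalog = LogicalCatalog()
name = "/myActiveNeuroCollection/image.jpg"  # assumed logical path
catalog.register(name, Replica("windows-disk", r"E:\srbVault\image.jpg", online=False))
catalog.register(name, Replica("unix-disk", "/users/srbVault/image.jpg"))

chosen = catalog.resolve(name)
print(chosen.resource, chosen.physical_path)
```

Because the client only ever holds the logical name, a replica can move or a resource can go offline without the client changing anything, which is the point of the transparencies listed above.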
We are SRB • Arun is here! (shameless self-promotion) • Not in picture: many students