SRB Tutorial NPACI All Hands Meeting 1999
WWW • Exchange of information • specifically text, images and multi-media • hyper-links to navigate through documents • search engines for indexing • Not friendly for exchange of meta-information - yet • Not easy to integrate with computation
DATA • Data - any body of information that can be used for computation and communication • Scientific Data • data from experiments • images (scans) • genetic strings (DNA) • simulation
METADATA • Meta Data - data that qualifies data • information that captures the semantics of data • date of experiment, reactants used, result obtained, … • useful for communication and computation • Example • Titanic • Leonardo DiCaprio • James Cameron
Data Handling System • Require knowledge of file name • Distributed file systems • Persistent object environments • Require special interface for data access • Database systems • Local solution with well-known file name • Data migration systems
Data-Intensive Computing • Support new modes of science • Enable analysis of very large data sets • Improve the ability to conduct science • Build discipline specific data collections • Build tools that decrease time needed to transfer information • Automate information discovery • Enable Information Based Computing
Information Based Computing • Enable information discovery from scientific applications • Metadata Catalogs • Enable data management and access to heterogeneous, distributed data sources • Storage Resource Brokers • Provide scalable systems, terabyte data access from petabyte archives • Parallel I/O
Common Middleware • Distributed computing environment for remote execution of procedures • Distributed data handling environment for access to archives, databases and file systems • Inter-realm authentication system • Distributed information discovery • Collaboration environment
Evolution of Data Handling Environment • Tightly coupled database / archival storage • Metadata catalog implemented to identify resources • Separation of data identification from dataset access • Separation of services from repositories to improve interoperability • Integration with digital library technology
What is SRB? • SRB is an Intelligent Data Access System • SRB provides federated access to datasets • SRB provides protocol transparency to diverse and distributed storage systems • SRB provides location transparency to distributed datasets • SRB provides access transparency to remote user
What is SRB? • Extends File Systems • Extends Database Systems • Extends I/O protocol • Extends WWW • Extends Digital Library Systems
SRB Concepts(1) • Provide Scalability • Hosts • Resource Types • Resources • Collections • Data Objects - size and number • Users & Groups • Methods • MetaData
SRB Concepts(2) • Provide Logical Abstractions • srbSpace - an abstract storage space • Resource Types - resource defined by properties • Resources - resource identified by name and type • multiple resources tied together as a single resource • Collections - abstraction over directory structure • distributed & curated • Datasets - identified by properties • Users - authenticated across hosts/networks • Domain - abstraction over physical domains • Metadata Schema/Attributes
SRB Concepts(3) • Provide Uniform Interfaces • Uniform API to Resources - archival, file and DB • Uniform API to federated Resources • Uniform Access to Collections & Datasets • Uniform Authentication across SRB space
SRB Concepts(4) • Replication of Datasets • Access Control Lists • Ticket-based Access • Auditing • Authentication and Encryption (SEA) • Server-side proxy Operations • Metadata-based Discovery • Rich Interface - programmatic & interactive
What is MCAT? • Cataloging System • Metadata Repository • Digital Object Metadata • type, format, lineage, usage methods, domain-specific attributes, collection info, etc • System-level Metadata • access control, audit trails, location, replication, resource types, user groups, etc • Schema-level Metadata • ontology, relationships among attributes/schemas, semantics of attributes, etc • Uniform Access and Federation interface
The Storage Resource Broker is Middleware • [Diagram: Application (SRB client), SRB Server, MCAT, Distributed Storage Resources] • Distributed Storage Resources (database systems, archival storage systems, file systems, ftp) • DB2, Oracle, Illustra, ObjectStore • HPSS, UniTree • UNIX, ftp
The SRB Process Model • [Diagram: Application (Host, port), SRB Master (port), SRB agents, MCAT]
Federated SRB Operation • [Diagram: Application, two SRB servers, two SRB agents, and MCAT, with interactions numbered 1 to 6]
SRB Space • [Diagram: SRB servers together with DR, DL, MC, and CP nodes] • DR - Data Repository • DL - Digital Library • MC - Meta Catalog • CP - Comp Process / SRB Client
SRB V1.0 Features • Multi-platform (clients and servers) • SunOS/Solaris, AIX, Cray C90, DEC OSF • API and command line interfaces • “Low-level” and “high-level” APIs • Storage systems supported • DB2, Illustra, UniTree, HPSS, UNIX • Support for federated servers • Released early September, 1997
SRB V1.1 Features • In beta in DOCT. To be released in January, 1998 • Ported to additional platforms - SGI, Cray T3E • Incorporates the SDSC Encryption and Authentication (SEA) Library • Ticket-based access control • Graphical user interface - SRBTool • Additional storage systems supported • Oracle, ObjectStore, ftp, http • Oracle-based MCAT • Support for proxy operations, e.g. move, copy, replicate • Data replication using Logical Storage Resource
New SRB Features • Java-based SRB browser • C++ API • SRBIO - C library for redirecting stdio • Proxy functions for meta data extraction • System Monitor for remote auto-startup • System Parameters stored in MCAT
MCAT: Metadata Catalog • Stores metadata about • Users, Data sets, Resources, Methods • Provides “collection” abstraction • Stores detailed access control information • Maintains audit trail information on data sets • Implemented as a relational database with referential integrity constraints (currently uses DB2, ported to Oracle)
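The idea of a catalog kept as a relational database with referential integrity constraints can be sketched in a few lines. This is only an illustration under stated assumptions: it uses SQLite rather than DB2 or Oracle, and the table names, column names, and the `register_dataset` helper are hypothetical, not the MCAT schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical three-table catalog: datasets must point at a known
# user and a known resource.
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE resources (
    resource_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE,
    type        TEXT NOT NULL
);
CREATE TABLE datasets (
    dataset_id  INTEGER PRIMARY KEY,
    collection  TEXT NOT NULL,
    name        TEXT NOT NULL,
    owner_id    INTEGER NOT NULL REFERENCES users(user_id),
    resource_id INTEGER NOT NULL REFERENCES resources(resource_id),
    UNIQUE (collection, name, resource_id)
);
""")

conn.execute("INSERT INTO users (user_id, name) VALUES (1, 'demo')")
conn.execute(
    "INSERT INTO resources (resource_id, name, type) "
    "VALUES (1, 'unix-demo', 'unix file system')"
)


def register_dataset(collection, name, owner_id, resource_id):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO datasets (collection, name, owner_id, resource_id) "
            "VALUES (?, ?, ?, ?)",
            (collection, name, owner_id, resource_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = register_dataset("/home/demo", "scan01.img", 1, 1)
# Resource 99 was never registered, so the foreign key rejects this row.
rejected = register_dataset("/home/demo", "scan02.img", 1, 99)
dataset_count = conn.execute("SELECT COUNT(*) FROM datasets").fetchone()[0]
```

The constraint, rather than application code, is what keeps a dataset entry from pointing at a resource the catalog does not know about.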
MCAT Architecture • [Diagram components] • MCAT Interface Functions • Schema to MAPS Convertor • MAPS to Schema Convertor • MAPS Initialization • MAPS Semantics • Answer Extractor & Cursor Control • Dynamic Query Generator • Schema Initialization • Schema Semantics • Oracle Query System • DB2 Query System
Federated Catalog Architecture • [Diagram: MCAT together with catalogs CAT-1 and CAT-2] • MCAT: MAPS, Semantics & Definitions, Local Routines, Internal Catalogs, External CATALOG Interface • CAT-1 and CAT-2: CATALOG MAPS Interface, Local Interface, Semantics & Definitions, Local Routines, CATALOG
New MCAT Features • Meta-Schema to hold System and User meta data schema information • Extensible meta data schema • Distributed meta data schema • Metadata exchange Interface Protocol • MAPS - Metadata Attribute Presentation Structure • query, update and result structures • Close to Z39.50
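A query structure expressed in attribute names, which a catalog then turns into SQL, might look roughly like the sketch below. It is an assumption-laden illustration, not the MAPS format: the `ATTRIBUTE_COLUMNS` mapping, the `build_query` function, the condition tuples, and the `datasets` table are all hypothetical.

```python
import sqlite3

# Hypothetical mapping from presentation-level attribute names
# to columns of a local catalog table.
ATTRIBUTE_COLUMNS = {
    "data name": "name",
    "collection": "collection",
    "data type": "data_type",
    "owner": "owner",
}
ALLOWED_OPERATORS = {"=", "<>", "like"}


def build_query(select_list, condition_list):
    """Turn attribute names and (attribute, operator, value) conditions into SQL."""
    columns = [ATTRIBUTE_COLUMNS[attribute] for attribute in select_list]
    clauses, params = [], []
    for attribute, operator, value in condition_list:
        if operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {operator}")
        # Column names come from the fixed mapping; values are bound as parameters.
        clauses.append(f"{ATTRIBUTE_COLUMNS[attribute]} {operator} ?")
        params.append(value)
    sql = "SELECT " + ", ".join(columns) + " FROM datasets"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE datasets (collection TEXT, name TEXT, data_type TEXT, owner TEXT)"
)
conn.executemany(
    "INSERT INTO datasets VALUES (?, ?, ?, ?)",
    [
        ("/home/demo", "scan01.img", "tiff image", "demo"),
        ("/home/demo", "notes.txt", "ascii text", "demo"),
        ("/home/other", "scan02.img", "tiff image", "demo"),
    ],
)

sql, params = build_query(
    ["data name"],
    [("collection", "=", "/home/demo"), ("data type", "like", "%image%")],
)
rows = conn.execute(sql, params).fetchall()
```

Because callers name attributes rather than columns, the same request could in principle be translated for a catalog with a different local schema by swapping the mapping.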
New MCAT Features (contd.) • Core Schema Implemented • MCAT Core - Data, Resources, Users and Methods • Dublin Core • IV Core - Image Visualization attributes • Web-based Prototype User Interface • extensible schema functions • query, insert and update of meta data • integrated presentation of meta data and data
SRB Data Replication Support • Replication via Resource Set definition • Replication support integrated into write function • srbObjReplicate API can be used for post facto replication • Synchronous replication across all sites. Can choose any k out of n • Can choose specific replica on read operation
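One way to read "synchronous replication, any k out of n" is sketched below: the write is attempted on all n resources and counts as successful only if at least k of them accept it. That reading is an assumption, and `replicated_write`, `make_writer`, and the resource names are hypothetical; none of this is SRB code.

```python
def replicated_write(data, replicas, k):
    """Write data to every replica; succeed only if at least k writes succeed.

    replicas is a list of (name, write_function) pairs.
    """
    if not 1 <= k <= len(replicas):
        raise ValueError("k must be between 1 and the number of replicas")
    written = []
    for name, write in replicas:
        try:
            write(data)
            written.append(name)
        except OSError:
            # An unavailable resource does not abort the other writes.
            continue
    if len(written) < k:
        raise OSError(f"only {len(written)} of the required {k} replicas were written")
    return written


def make_writer(store, fail=False):
    """Build a write function over an in-memory store; fail simulates an outage."""
    def write(data):
        if fail:
            raise OSError("resource unavailable")
        store["object"] = data
    return write


stores = {"hpss": {}, "unix": {}, "db2": {}}
replicas = [
    ("hpss", make_writer(stores["hpss"])),
    ("unix", make_writer(stores["unix"], fail=True)),
    ("db2", make_writer(stores["db2"])),
]

written = replicated_write(b"payload", replicas, k=2)
print(written)  # ['hpss', 'db2']
```

With k equal to n the write behaves as strict replication across all sites; a smaller k trades completeness for availability.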
NWS Data Replication (DOCT) • [Diagram: Application, MCAT, and SRB servers across SAIC, SDSC, Caltech, and NCSA; logical resources LogRsrc1 and LogRsrc2; storage systems HPSS, HPSS, Oracle, DB2, Unix]
SEA (SDSC Encryption & Authentication) • Developed as part of DOCT • Designed for Supercomputing/MetaComputing Environment • Based On RSA Public/Private Keys and RC5 Encryption Algorithm • Integrated into SRB • Being integrated into ‘pftp’ & ‘hsi’ - for Remote HPSS Access
SEA Features • Secure User/Process Authentication Across Network (TCP Sockets) • Optional Encryption As Independent Function • Simple API • Batch Support - Long-term Certificates • Adjustable Key Lengths (Speed/Security Tradeoff) • User-Adjustable Encryption Levels (Speed/Security Tradeoff) • Multiple Initial User Registration Methods (Set By Administrator) • Self-Introduction • Trusted Host • Password • Available for Cray T90, C90, T3E, SunOS, Solaris, IRIX, OSF1, AIX, CS6400, NextStep • More Information: http://www.sdsc.edu/~schroede/sea.html
Ticket-based Access Control • Owner can request ticket for a data set • Ticket can be issued for a data set or a collection • Ticket controls access by • time-period (start and expire timestamps) • number of accesses (count) • user names (any, single, or group users) • Non-registered Users can also access using tickets • Useful for sharing data and access through the web • Tickets generated and stored in MCAT • Currently supports read-only tickets
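The three controls listed above (time period, access count, user names) can be modelled in a short sketch. The `Ticket` class, its field names, and the path-prefix rule for collection tickets are hypothetical illustrations, not the SRB implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import FrozenSet, Optional


@dataclass
class Ticket:
    target: str                              # a data set or a collection
    start: datetime                          # ticket is valid from this time
    expire: datetime                         # ... and invalid from this time on
    remaining: int                           # number of accesses still allowed
    users: Optional[FrozenSet[str]] = None   # None means any user

    def redeem(self, user, target, now, operation="read"):
        """Return True and consume one access if the request is permitted."""
        if operation != "read":              # only read-only tickets are modelled
            return False
        in_collection = target.startswith(self.target.rstrip("/") + "/")
        if target != self.target and not in_collection:
            return False
        if not (self.start <= now < self.expire):
            return False
        if self.users is not None and user not in self.users:
            return False
        if self.remaining <= 0:
            return False
        self.remaining -= 1
        return True


t0 = datetime(1999, 2, 1, 12, 0)
ticket = Ticket(target="/home/demo", start=t0, expire=t0 + timedelta(days=7), remaining=2)

results = [
    ticket.redeem("guest", "/home/demo/scan01.img", t0 + timedelta(hours=1)),
    ticket.redeem("guest", "/home/demo/scan01.img", t0 + timedelta(days=8)),   # expired
    ticket.redeem("guest", "/home/other/x", t0 + timedelta(hours=1)),          # wrong target
    ticket.redeem("guest", "/home/demo/scan01.img", t0 + timedelta(hours=2), "write"),
    ticket.redeem("guest", "/home/demo", t0 + timedelta(hours=3)),
    ticket.redeem("guest", "/home/demo", t0 + timedelta(hours=4)),             # count used up
]
```

Only a permitted read decrements the count, so rejected requests do not use up the ticket.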
SRB API • Programmatic API • High-level API • Low-level API • SRB Manager API • Command Level Interface - Scommands • Graphical User Interface - srbBrowser • Web Utilities
SRB API Interface • [Diagram: Application, MCAT, SRB Master]
High & Low-level API • Low-level API • talks to resource drivers • no registration of data sets in MCAT • no authentication through MCAT • User provides all information • High-level API • Uses low-level API to access resources • Registers data management information in MCAT • Uses MCAT for authentication and meta information • Uses MCAT for resource and data discovery • Access/store data in remote SRB
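The layering described above can be shown with a toy model: a low-level driver that needs the caller to supply everything, and a high-level layer that records placement in a catalog and resolves it on access. `MemoryDriver`, `ObjectLayer`, and the resource names are hypothetical stand-ins; the real API is the C function set listed on the following slides.

```python
class MemoryDriver:
    """Low-level layer: one storage system; the caller supplies the full path."""

    def __init__(self):
        self.files = {}

    def create(self, path, data):
        self.files[path] = data

    def read(self, path):
        return self.files[path]


class ObjectLayer:
    """High-level layer: registers placement in a catalog and uses it on access."""

    def __init__(self, drivers):
        self.drivers = drivers    # resource name -> low-level driver
        self.catalog = {}         # (collection, name) -> (resource, path)

    def create(self, collection, name, resource, path, data):
        # Store through the low-level driver, then register the placement.
        self.drivers[resource].create(path, data)
        self.catalog[(collection, name)] = (resource, path)

    def read(self, collection, name):
        # The caller names only the collection and object; the catalog
        # supplies the resource and physical path.
        resource, path = self.catalog[(collection, name)]
        return self.drivers[resource].read(path)


drivers = {"unix-demo": MemoryDriver(), "archive-demo": MemoryDriver()}
layer = ObjectLayer(drivers)
layer.create("/home/demo", "scan01.img", "archive-demo", "/arch/0001", b"pixels")

data = layer.read("/home/demo", "scan01.img")
```

A low-level caller would have to call `drivers["archive-demo"].read("/arch/0001")` directly, which is exactly the knowledge the high-level layer hides.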
Low-level API • srbFileOpen(conn, storType, host, fileName, mode) • srbFileCreate(conn, storType, host, fileName, mode) • srbFileClose(conn, fd) • srbFileUnlink(conn, storType, host, fileName) • srbFileRead(conn, fd, buffer, length) • srbFileWrite(conn, fd, buffer, length) • srbFileSeek(conn, fd, offset, whence) • srbFileSync(conn, fd) • srbFileStat(conn, storType, host, fileName, statBuf) • srbFileMkdir(conn, storType, host, dirName, mode) • srbFileRmdir(conn, storType, host, dirName, mode) • srbFileChmod(conn, storType, host, fileName, mode)
Low-Level API (contd …) • srbDbLobjOpen(conn, storType, resourceLoc, positionName, mode) • srbDbLobjCreate(conn, storType, resourceLoc, positionName, mode) • srbDbLobjClose(conn, dd) • srbDbLobjUnlink(conn, storType, host, fileName) • srbDbLobjRead(conn, dd, buffer, length) • srbDbLobjWrite(conn, dd, buffer, length) • srbDbLobjSeek(conn, dd, offset, whence)
High-level API • srbObjOpen(conn, objChar, mode, collectionName) • srbObjCreate(conn, objName, objType, resourceName, collectionName, pathName, size) • srbObjClose(conn, od) • srbObjUnlink(conn, objChar, collectionName) • srbObjRead(conn, od, buffer, length) • srbObjWrite(conn, od, buffer, length) • srbObjSeek(conn, od, offset, whence) • srbObjMove(conn, objChar, collectionName, newResourceName, newPathName) • srbObjReplicate(conn, objChar, collectionName, newResourceName, newPathName) • srbObjProxyOpr(conn, Operation, sourceDesc, targetDesc)
High-Level API (contd …) • srbGetDatasetInfo(conn, objChar, collectionName, resultStruct, requiredNumber) • srbGetMoreInfo(resDesc, resultStruct, requiredNumber) • srbGetDataDirInfo(conn, conditionList, selectList, resultStruct) • srbModifyDataset(conn, objId, collectionName, newValue1, newValue2, modifyType, resourceName, pathName) • srbCreateCollect(conn, parentCollectionName, childCollectionName) • srbListCollect(conn, CollectionName, flag, resultStruct) • srbModifyCollect(conn, CollectionName, newValue1, newValue2, newValue3, modifyType) • srbModifyUser(conn, newValue1, newValue2, modifyType) • srbSetAuditTrail(conn, setValue)
System Manager API • srbChkMdasAuth(conn, userName, userAuth, domain) • srbChkMdasSysAuth(conn, userName, userAuth, domain) • srbRegisterUser(conn, userName, domain, password, userType, userAddress, userPhone, userEmail) • srbRegisterUserGrp(conn, userGrpName, userGrpPassword, userGrpType, userGrpAddress, userGrpPhone, userGrpEmail)
srbBrowser - An SRB Graphical Interface • A Java GUI • Interfaces with SRB servers using the client API library • Performs most SRB operations - cp, replicate, import, export, metadata query, etc. • [Diagram: USER, Java GUI, SRB Agent, MCAT, Proxy operation] • Obtains user’s metadata information via SRB • Invokes SRB operations
SRB Command Line Interface • [Diagram: USER, Environment File, SRB Agent, MCAT, Proxy operation] • SRB “shell” commands: Sls, Scp, Scat, Sput, Sget, ...
Scommands • Sinit - initialize S-environment • Sexit - clean up • Sman - get manpage for Scommand • Scat - display srbObject on screen • Sput - copy local file into srbSpace • Sget - copy srbObject to local space • Sappend - append to srbObject • Srename - change srbObject name • Srm - remove srbObject • Schmod - change/grant access to srbObject • Scd - change collection • Spwd - display current collection • Sls - list collection • Smkdir - make new collection • Srmdir - remove old collection • SgetD - get srbObject information • SgetR - get resource information • SgetU - get user information • SmodD - modify srbObject info • SmodU - modify user info • Stoken - get native type information • Scopy - copy srbObject in another collection and under another name • Sreplicate - clone object in new resource - same internal id • Smove - move srbObject to new collection or resource
Scommands (contd …) • ingestUser - adding a new user or group • ingestResource - adding a new resource • ingestLogicalResource - making a new resource grouping • addLogicalResource - adding to a resource grouping • ingestLocation - adding new location information • ingestToken - adding new native types (e.g. resourceType, objectType, userType, domainName, ActionType, . . .)
Scommands • Sls • Sls [-h] [-L number] [-Y number] [-r|-f] [collection ...] • Sls [-L number] [-Y number] srbObj … • Sput • Sput [-p] [-D dataType] [-R resourceName] [-P pathName] localFileName ... TargetName • Sput [-p] [-D dataType] [-R resourceName] [-P pathName] -i TargetName • Sget • Sget [-C_n ] [-p] srbObj ... localFile • Sreplicate • Sreplicate [-Cn] [-p] [-R resourceName] [-P pathName] srbObj ...
SRBIO • open, creat, read, write, close, lseek • fopen, fread, fwrite, fclose, fseek, fflush • fgetc, fgets, fputc, fputs, getc, putc, ungetc, rewind • vfprintf, fprintf, fscanf
Web Utilities • Sgetw - copies an SRBobject to the server site • Sputw - copies a local file into SRBspace • Scatw - displays an SRBobject in the browser (handles types) • Slsw - displays information about SRBobjects