SRB Space iRODS: A Rule Oriented Data Management System
Beyond the Storage Resource Broker • SRB is a data management system for large-scale data • Logical name space -- Independence from Physical Pin Downs • Integrated Data and Metadata Management • Uniform Access Interfaces • Caters to multiple tasks and paradigms • Data grid Federations for distributed and replicated data handling • Cooperating Autonomous Virtual Organizations (VO) • Persistent Archives for long-term preservation • Building light, dim and dark archives • Digital Libraries for semantically searchable data sharing • multiple domains with collection-level functionalities • Server-side Operations for performing data-intensive operations • Data subsetting, data fusion, administrative management • Used in large-scale systems in production
What Next? • SRB is quite complex, with many functions and operations • > 90 commands with many options • several hundred unique operations • The intelligence is hard-coded • extensions/modifications require extreme care • but the modules are fairly robust and reusable • SRB is a one-size-fits-all architecture • everyone gets the same code base • Users want more functionality • increased customizability • as small a footprint as necessary • easy for them to modify • independence from developers • functionality to fit policies and not the other way around!
What Do the Users Want? • Innovative Access Control • Sometimes by groups, sometimes by users & sometimes by roles • Based on their login type (how they got authenticated) • Third-party authorization by an outside authority agent • Dynamically changeable access control • Access Control Lists, Denial lists, over-rides, … • Ticket-based short-term and controlled access • Data Placement Strategies • Completely user controlled: user preference policies • Completely Administration controlled: site policies • Group-based policies • Over-rides, exceptions • Based on Data characteristic or Collection characteristic • Policies for staging, caching, archiving, purging, synchronization, … • Ingestion Policies • Check for authenticity, anonymization • Pre- and post-processing • Replication policies, metadata extraction policies, permission policies, … • And others …
Rule Oriented Data Management • Adaptive Middleware Architecture • Customizable and Flexible: User Configurable • Administratively Simpler: Admin Configurable • Builds upon the experience of the SRB Data Grid • Rule-oriented Programming • Well-defined set of functionalities: Micro services • Define Rules which chain micro-services • Work-flow of micro services • Define Rule Application Condition • Define Recoverability for failure management • Administrators can set site policies • Users can encode their preferences • Groups can set their process requirements • Control actions at collection level, format level, user level, resource level, …
Rules and Constraints • Rule-based • Lower-level Functions are composed of micro-services • Higher-level Functions are composed of rules of lower-level micro-services • Rules are interpreted using a rule engine • Customizability • Problems with rule composition • Integrity checks to make sure rules do not break higher-level functionalities • Declarative programming • Rules define semantics • Operational programming • Rule invocation provides procedural interpretation • Rules can be used as “checks and balances” to make sure that collections are self-consistent • Example: a rule makes two copies of each file • Constraint checking: can be used to see if the collection is consistent with this rule
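The two-copies example can be read as a constraint check over a collection. A minimal Python sketch, assuming a hypothetical in-memory catalog that maps each logical file name to the resources holding a replica; the catalog layout and the names `find_violations` and `catalog` are illustrative and not part of SRB or iRODS:

```python
def find_violations(catalog, required_copies=2):
    """Return logical names that have fewer distinct replicas than required."""
    return sorted(
        name
        for name, replicas in catalog.items()
        if len(set(replicas)) < required_copies
    )


# Hypothetical catalog: logical name -> resources that hold a copy.
catalog = {
    "/home/demo/a.dat": ["disk1", "tape1"],
    "/home/demo/b.dat": ["disk1"],
}

violations = find_violations(catalog)
print(violations)
```

The same rule that drives replication at ingestion time can thus be replayed later as a read-only audit of the collection.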
[Figure: Rule-Oriented Data Systems Framework. Labeled components: Client Interface, Admin Interface, Rule Invoker, Service Manager, Rule Modifier Module, Config Modifier Module, Metadata Modifier Module, Consistency Check Modules (three, one labeled Rule), Engine, Rule Base, Current State, Confs, Meta Data Base, Resource-based Services, Metadata-based Services, Micro Service Modules, Resources]
Rules Flow [Flowchart. Steps shown: Application Client Call or Server Call • Find Appropriate Rules • Select First/Next Rule (Failure: No More Rules) • Condition Check (True/False) • Execute Next MicroService/Action (Success? Yes/No) • Execute Recovery MicroService/Action • Success: No More MS/A]
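The rules flow can be sketched as a small loop. This is a minimal Python illustration, not the actual engine: the rule layout (a dict with `condition` and `steps`), the `MicroServiceError` exception, and the choice to run recovery steps in reverse order for every step attempted so far are all assumptions made for the example.

```python
class MicroServiceError(Exception):
    """Raised by a micro-service to signal failure (hypothetical)."""


def apply_action(action, rules, session):
    """Try each rule for `action` in order; return True on the first success."""
    for rule in rules.get(action, []):
        if not rule["condition"](session):
            continue  # condition is false: select the next rule
        attempted = []  # recovery steps for micro-services tried so far
        try:
            for service, recovery in rule["steps"]:
                attempted.append(recovery)
                service(session)
            return True  # success: no more micro-services/actions
        except MicroServiceError:
            # Assumption: undo in reverse order, then fall through to next rule.
            for recovery in reversed(attempted):
                recovery(session)
    return False  # failure: no more rules


# Hypothetical micro-services that only record what they did.
def create_file(s): s["log"].append("create")
def remove_file(s): s["log"].append("remove")
def register_ok(s): s["log"].append("register")
def register_fail(s): raise MicroServiceError("catalog unavailable")
def noop(s): pass


rules = {
    "ingestObject": [
        {"condition": lambda s: s["userDept"] == "sdsc",
         "steps": [(create_file, remove_file), (register_fail, noop)]},
        {"condition": lambda s: True,
         "steps": [(create_file, remove_file), (register_ok, noop)]},
    ]
}

session = {"userDept": "sdsc", "log": []}
ok = apply_action("ingestObject", rules, session)
print(ok, session["log"])
```

In this run the first rule fails at registration, its file creation is undone, and the second rule then completes the action.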
Sample Rules
• ingestObject(*F)
  • createFile(*F), registerFile(*F).
• ingestObject(*F)
  • $userDept == sdsc OR $userDept == sio
  • createFile(*F), registerFile(*F), computeChkSum(*F), !, findBackUpRsrc(*F, *R), replicateFile(*F, *R), computeCheckSum(*F, *R), compareCheckSum(*F).
• ingestObject(*F)
  • $dataType == FITS Image
  • createFile(*F), registerFile(*F), extractFITSMetadata(*F).
Format of a Rule • Action :- Condition | MS1, …, MSn | RMS1, …, RMSn • Action to be performed • Condition checked to see if the rule is applicable • If applicable, micro services {1, …, n} are executed • If any micro service fails, the recovery micro service(s) are executed to maintain transactional capability • Examples (micro service, then its recovery micro service): • createFile(*F), recovered by removeFile(*F) • ingestMetadata(*F,*M), recovered by rollback • Caveats: • More than one rule can define an action • MSi and RMSi can be actions • Micro services can pass parameters
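To make the rule format concrete, here is a minimal Python sketch that splits one rule line into its four parts. It assumes the textual layout `Action :- Condition | MS1, …, MSn | RMS1, …, RMSn` exactly as written on the slide and that the condition contains no `|`; the `Rule` class, `parse_rule`, `split_calls`, and the example rule line are illustrative, not the actual rule-engine parser.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    action: str
    condition: str
    services: list    # MS1..MSn, in execution order
    recoveries: list  # RMS1..RMSn, paired by position with services


def split_calls(text):
    """Split on commas that are not inside parentheses."""
    parts, depth, current = [], 0, ""
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current.strip())
            current = ""
        else:
            current += ch
    parts.append(current.strip())
    return [p for p in parts if p]


def parse_rule(line):
    """Parse 'Action :- Condition | MS... | RMS...' into a Rule."""
    head, body = line.split(":-", 1)
    condition, services, recoveries = (p.strip() for p in body.split("|"))
    return Rule(head.strip(), condition,
                split_calls(services), split_calls(recoveries))


# Made-up rule line built from the micro-service names on the slide.
rule = parse_rule(
    "ingestObject(*F) :- $dataType == FITS Image "
    "| createFile(*F), ingestMetadata(*F,*M) "
    "| removeFile(*F), rollback"
)
print(rule)
```

Keeping the recovery list positionally aligned with the micro-service list lets an engine look up the recovery step for whichever micro-service failed.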
AMA & ROP • A New Paradigm in Middleware Development • Higher level Services composed of Micro-services • Customizable at multiple levels • Glass Box Architecture • Can explain what happens • Semantics can be checked • Run-time Version Control • Combines multiple paradigms • Workflow systems, active databases, rule-based execution, transaction systems, data grids and remote execution of services • Flexible Management • Administrative ease • Triggers for handling low/high water marks • Periodic Job execution – backup, archive, usage control,…
Components of Rule System • Actions • Name Space of Actions • Client Call Maps to Actions • Micro Services • Well-defined Server-side Procedures and Functions • Rule • Definitions for Actions • Workflow of what to do • Composed of Actions and Micro Services • Invoked to execute an Action • Rule Base • Set of Rules • Each User Community can choose its own rule base • Data Components • Blackboard Architecture • Used by Micro Services, Actions and Rules
Data Components of Rule System • Persistent Data Attributes: # • Has an external name space • Mapping to internal database attributes • Persists across sessions • Session Data Attributes: $ • Has an external name space • Mapping to internal data structures • Used by micro-services/actions inside a session • Side Effects Set: % • Changes effected outside the system • File Created, File Copied, Email Sent, … • Well-defined name space of activities
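The three data components can be pictured as one shared blackboard. A minimal Python sketch, assuming plain dictionaries and a list stand in for the database, the session structures, and the side-effects set; the attribute names (`objPath`, `dataSize`, `fileCreated`) and the `create_file` micro-service are hypothetical:

```python
from dataclasses import dataclass, field


@dataclass
class Blackboard:
    persistent: dict = field(default_factory=dict)    # '#': survives sessions
    session: dict = field(default_factory=dict)       # '$': one session only
    side_effects: list = field(default_factory=list)  # '%': external activities

    def end_session(self):
        """Drop session attributes; persistent data and side effects remain."""
        self.session.clear()


def create_file(board, path):
    """Hypothetical micro-service that touches all three components."""
    board.session["objPath"] = path
    board.persistent["dataSize:" + path] = 0
    board.side_effects.append(("fileCreated", path))


board = Blackboard()
create_file(board, "/home/demo/a.dat")
board.end_session()
print(board)
```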
Micro Services • Compiled Functions • Short and Well-defined functionality • Should have clear semantics • Work on $, #, % • Examples: • Metadata Extraction for DICOM • Access Control Permission Changed to User • Replicate a file from Source to Destination
Semantics • Micro Service Semantics • Input/Output Variables (in terms of $) • Input: what is needed • Output: what gets changed • Persistent Changes (in terms of #) • Updates to Databases • Activities Performed (in terms of %) • External Activities Performed
Semantics • Rule Semantics • Based on component micro services • Action Semantics • Based on corresponding rules • Only one rule's semantics applies
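One way to read "rule semantics based on component micro services" is as a fold over the chain. The Python sketch below assumes each micro-service declares its `$` inputs and outputs, `#` updates, and `%` activities as sets, and that a rule's inputs are those not already produced by an earlier step; the `Semantics` class, `compose`, and all attribute names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Semantics:
    inputs: frozenset      # '$' variables needed
    outputs: frozenset     # '$' variables changed
    persistent: frozenset  # '#' attributes updated
    activities: frozenset  # '%' external activities performed


def compose(chain):
    """Derive a rule's semantics from its micro-services, in order."""
    inputs, outputs, persistent, activities = set(), set(), set(), set()
    for ms in chain:
        inputs |= ms.inputs - outputs  # needs not met by earlier steps
        outputs |= ms.outputs
        persistent |= ms.persistent
        activities |= ms.activities
    return Semantics(frozenset(inputs), frozenset(outputs),
                     frozenset(persistent), frozenset(activities))


# Hypothetical declarations for two micro-services.
create_file = Semantics(
    frozenset({"$objPath"}), frozenset({"$fileDesc"}),
    frozenset(), frozenset({"%fileCreated"}),
)
register_file = Semantics(
    frozenset({"$objPath", "$fileDesc"}), frozenset(),
    frozenset({"#dataName"}), frozenset(),
)

rule_semantics = compose([create_file, register_file])
print(rule_semantics)
```

Since only one rule fires for an action, the action's semantics at run time would be the composed semantics of that one rule.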
Middleware • Software providing complex distributed applications/services • Client-server • Peer-to-peer • Web servers, Content Managers, Databases, Application Servers, … • Client access through common protocols • RPC, Message-oriented, Object Request Broker, WSDL or service-oriented • Middleware provides a specific set of services
Middleware • Normal middleware is a black box • Exposes a set of interfaces/service definitions • No customization • System Developer has complete control • A service will have very few configurability options, even in open-source middleware • Applications are developed on top of middleware
Adaptive Middleware Architecture • Similar to normal middleware • Provides a set of services • Has a well-defined access protocol • AMA is not a Black Box • Admin/User Customizable Service • Tweak services to achieve alternate goals • Can explain at a high level what is happening • One can compare two AMA services to see how they differ • Useful for verification and analysis
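Because a service is defined by a visible chain of micro-services, two variants can be compared step by step. A minimal Python sketch, assuming each service is represented simply as a list of micro-service names; `compare_chains` is a hypothetical helper and the two chains reuse names from the sample rules:

```python
def compare_chains(a, b):
    """Report micro-services present in one chain but not in the other."""
    return {
        "only_in_a": [step for step in a if step not in b],
        "only_in_b": [step for step in b if step not in a],
    }


basic_ingest = ["createFile", "registerFile"]
fits_ingest = ["createFile", "registerFile", "extractFITSMetadata"]

diff = compare_chains(basic_ingest, fits_ingest)
print(diff)
```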
Adaptive Middleware Architecture • External View: Logical Name Space • Persistent Memory: Database • Transient Memory: Variables • External Side-effects • Interaction with the outside world • Ex. File is created, Email is sent • Services, Methods, Actions • Rules, Workflow • Internal View: Programmatic View • Changes in DB Tables, internal variables/structures • Procedures, Methods and Functions • Drivers, Protocols • Users, Resources, Data Objects and the methods affecting them • Mapping • External to Internal • Capturing Semantics of Services and Rules • Validation, Analysis, Introspection