Storage Resource Broker Actors and Applications in Kepler
Nandita Mangal, Efrat Jaeger-Frank, Ilkay Altintas, Chien-Yi Hou, Lucas Gilbert, Arcot Rajasekar
What is a Scientific Workflow?
• Combination of
  • data integration, analysis, and visualization steps
  • larger, automated "scientific process pipelines"
• Mission of scientific workflow systems
  • Promote "scientific discovery" by providing tools and methods to generate scientific workflows
  • Create an extensible and customizable graphical user interface for scientists from different scientific domains
  • Support computational experiment creation, execution, sharing, reuse and provenance
  • Design frameworks which define efficient ways to connect to the existing data and integrate heterogeneous data from multiple resources
  • Make technology useful through the user's monitor!
Kepler is a Problem-Solving Environment for Scientific Workflows
• Ptolemy is a laboratory for investigating design.
• Kepler is a scientific workflow system (www.kepler-project.org)
  • Builds upon the open-source Ptolemy II framework
Kepler is also a cross-project collaboration
• Collaborating projects: Ptolemy II, Griddles, SKIDL, Resurgence, SRB, CIPRes, NLADR, LOOKING
• Contributor names and funding info are at the Kepler website.
• New contributor: Cheshire (UK Text Mining Center)
Kepler Workflow: Actors (Actor-Oriented Design)
• Actor
  • Encapsulation of parameterized actions
  • Interface defined by ports and parameters
• Port
  • Communication between input and output data
  • Without call-return semantics
• Model of computation
  • Communication semantics among ports
  • Flow of control
  • Implementation is a framework
• Sample Actors
  • Web Service Client & Web Services Harvester
  • Grid Actors: Globus Job Runner, GridFTP-based file access, Proxy Certificate Generator
  • SRB Actors
  • Command Line Tools: local, ssh, scp, ftp, etc.
  • Interaction with Nimrod and APST
  • Object Ring Buffer Communication
  • Imaging, Gridding, Vis Support
  • …more generic and domain-oriented actors…
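A minimal sketch of the actor idea above, written in Python rather than against the Ptolemy II or Kepler classes. The names Port, Scale, fire and factor are hypothetical and chosen only for illustration. It shows an actor whose interface is a parameter plus two ports, and a port that sends tokens one way with no return value (no call-return semantics).

```python
from collections import deque


class Port:
    """One-way channel endpoint (hypothetical, not a Ptolemy II class)."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()   # tokens waiting on an input port
        self.receivers = []    # input ports fed by this output port

    def connect(self, other):
        self.receivers.append(other)

    def send(self, token):
        # The sender gets nothing back: tokens are only queued downstream.
        for receiver in self.receivers:
            receiver.queue.append(token)

    def has_token(self):
        return bool(self.queue)

    def get(self):
        return self.queue.popleft()


class Scale:
    """Hypothetical actor: multiplies each input token by a parameter."""

    def __init__(self, factor):
        self.factor = factor            # parameter
        self.input = Port("input")      # input port
        self.output = Port("output")    # output port

    def fire(self):
        # One firing consumes one token and produces one token.
        if self.input.has_token():
            self.output.send(self.input.get() * self.factor)


sink = Port("sink")
scale = Scale(factor=3)
scale.output.connect(sink)
scale.input.queue.append(14)   # a token arrives on the input port
scale.fire()
result = sink.get()
```

The actor never calls the downstream component directly. When and how often fire() runs is left to whatever schedules the workflow, which is the role the slides give to the model of computation.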
Directors are WF Engines
• Implement different computational models
• Define the semantics of
  • execution of actors and workflows
  • interactions between actors
• Ptolemy and Kepler are unique in combining different execution models in heterogeneous models!
• Kepler is extending Ptolemy directors with specialized ones for web service based workflows and distributed workflows.
• Computational models include:
  • Dataflow
  • Time Triggered
  • Synchronous/reactive model
  • Discrete Event
  • Wireless
  • Process Networks
  • Rendezvous
  • Publish and Subscribe
  • Continuous Time
  • Finite State Machines
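To illustrate how a director, not the actors, decides execution semantics, here is a small Python sketch. DataDrivenDirector and OneShotDirector are invented for this example and are far simpler than any real Ptolemy II director; the point is only that the same two-actor graph behaves differently under different directors.

```python
from collections import deque


class Actor:
    """Hypothetical actor: applies func to one queued token per firing."""

    def __init__(self, name, func):
        self.name = name
        self.func = func
        self.inbox = deque()
        self.downstream = []
        self.outputs = []

    def ready(self):
        return bool(self.inbox)

    def fire(self):
        out = self.func(self.inbox.popleft())
        self.outputs.append(out)
        for actor in self.downstream:
            actor.inbox.append(out)


class DataDrivenDirector:
    """Keeps firing any actor that has input until no actor is ready."""

    def run(self, actors):
        trace = []
        progress = True
        while progress:
            progress = False
            for actor in actors:
                if actor.ready():
                    actor.fire()
                    trace.append(actor.name)
                    progress = True
        return trace


class OneShotDirector:
    """Fires each ready actor exactly once, in the listed order."""

    def run(self, actors):
        trace = []
        for actor in actors:
            if actor.ready():
                actor.fire()
                trace.append(actor.name)
        return trace


def build():
    """Builds the same graph each time: double -> inc, with two input tokens."""
    double = Actor("double", lambda x: 2 * x)
    inc = Actor("inc", lambda x: x + 1)
    double.downstream.append(inc)
    double.inbox.extend([1, 2])
    return [double, inc]


data_driven = build()
data_driven_trace = DataDrivenDirector().run(data_driven)

one_shot = build()
one_shot_trace = OneShotDirector().run(one_shot)
```

Under the data-driven director both input tokens flow through the graph, while the one-shot director processes only the first one. The actors are identical in both runs.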
Designing Data Grid Workflows with Kepler & SRB
Data Grid workflows automate the process of ingesting, transferring and processing data in the Grid environment.
• Kepler-SRB:
  • Efficient models for various Data Grid application problems
  • Efficient storage management
  • Extends to various scientific disciplines
  • Third-party transfers
  • Persistent archiving
  • Secure, optimized file transfers
  • Server-side data processing: SProxy
  • Data access to diverse repositories using a single namespace
Utilizing SRB Functionality in Kepler
• Developing SRB Actor Interfaces
  • Actors use the SRB JARGON Java API
  • Connecting to and disconnecting from the user's SRB space is done via specialized SRBConnect actors, which create an authenticated socket connection using a connection pool.
  • Actors accessing the same SRB space share the same connection socket by passing a connection token through their I/O ports via channels.
• Connection: SRB host, port, username, password, domain
• SRBFileSystem: Connection Token
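The token-passing pattern above can be sketched as follows. This is plain Python, not the JARGON API: Account, FakeConnection, ConnectionPool, srb_connect and list_step are hypothetical stand-ins, and the host name, port and credentials are placeholders. The sketch only shows that one pooled connection is opened and the same token is handed from step to step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    """The five connection fields listed on the slide (values are placeholders)."""
    host: str
    port: int
    username: str
    password: str
    domain: str


class FakeConnection:
    """Stand-in for an authenticated session; not a JARGON class."""
    opened = 0  # counts how many connections were actually created

    def __init__(self, account):
        FakeConnection.opened += 1
        self.account = account
        self.closed = False


class ConnectionPool:
    """Hands out one live connection per account and reuses it."""

    def __init__(self):
        self._by_account = {}

    def acquire(self, account):
        conn = self._by_account.get(account)
        if conn is None or conn.closed:
            conn = FakeConnection(account)
            self._by_account[account] = conn
        return conn


def srb_connect(pool, account):
    """Hypothetical SRBConnect-like step: emits a connection token."""
    return pool.acquire(account)


def list_step(token, path):
    """Uses the received token and forwards it unchanged to the next step."""
    listing = f"{token.account.username}@{token.account.host}:{path}"
    return token, listing


pool = ConnectionPool()
account = Account("srb.example.org", 5544, "demo", "secret", "demo-domain")

token = srb_connect(pool, account)
token, first = list_step(token, "/home/demo")
token, second = list_step(token, "/home/demo/data")
again = srb_connect(pool, account)  # a second connect reuses the pooled session
```

Passing the token along the data path keeps authentication in one place, so downstream steps need no credentials of their own.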
SRB Actors in Kepler
• Data Access and Transfer Actors
  • SPut/SGet use parallel put/get approaches as provided by the JARGON API
• Streaming Actors
  • Stream Put/Get read and write files from and to SRB as a sequence of byte arrays.
• SRB Proxy Operations
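Streaming a file as a sequence of byte arrays can be sketched like this. The functions stream_get and stream_put are hypothetical names, and in-memory io.BytesIO buffers stand in for the SRB file and the local file; the chunk size of 4096 bytes is an arbitrary choice for the example.

```python
import io


def stream_get(source, chunk_size=4096):
    """Yields the source as successive byte arrays of at most chunk_size bytes."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        yield chunk


def stream_put(chunks, destination):
    """Writes each incoming byte array and returns the total bytes written."""
    total = 0
    for chunk in chunks:
        destination.write(chunk)
        total += len(chunk)
    return total


payload = bytes(range(256)) * 40          # 10240 bytes of sample data
remote = io.BytesIO(payload)              # stand-in for a file stored in SRB
local = io.BytesIO()                      # stand-in for the local destination

chunk_sizes = [len(c) for c in stream_get(io.BytesIO(payload), 4096)]
written = stream_put(stream_get(remote, 4096), local)
```

Because the data moves chunk by chunk, a downstream step can start working before the whole file has arrived, and memory use stays bounded by the chunk size.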
SRB Actors in Kepler
• Metadata Actors
  • Get the physical location of SRB logical file paths
  • Get the user-defined metadata for an SRB file path
  • Add user-defined metadata to an SRB file path
  • Query metadata: get all files satisfying the conditions
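A rough sketch of the "query metadata" idea, assuming conditions are expressed as (attribute, operator, value) triples. The catalog contents, paths, attribute names and the query function are all made up for illustration; the real actors query the SRB metadata catalog, not a Python dictionary.

```python
import operator

# Supported comparison operators for this sketch only.
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

# Hypothetical user-defined metadata, keyed by logical file path.
catalog = {
    "/home/demo/run1.dat": {"station": "A", "year": 2004},
    "/home/demo/run2.dat": {"station": "B", "year": 2005},
    "/home/demo/run3.dat": {"station": "A", "year": 2006},
}


def query(catalog, conditions):
    """Returns the sorted paths whose metadata satisfies every condition."""
    matches = []
    for path, attrs in catalog.items():
        if all(
            attr in attrs and OPS[op](attrs[attr], value)
            for attr, op, value in conditions
        ):
            matches.append(path)
    return sorted(matches)


station_a = query(catalog, [("station", "=", "A")])
recent_a = query(catalog, [("station", "=", "A"), ("year", ">", 2004)])
```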
SRB Actors in Kepler
• Server-Side Processing Actor: SProxyCommand
  • A special actor which wraps the SRB Spcommand.
  • It enables executing any server-side command, deployed on an SRB space, on SRB-stored data.
  • It further helps in models involving third-party transfers as well as shipping and handling computational models.
Third Party Transfers By Tim H. Wong
Real-Time Sensor Data Access & Visualization Nandita Mangal
Data Transfer & Reliable Replication: UCSD-TV Digital Media Archival
Checksum (workflow diagram labels)
• ChecksumBefore
• Recording the checksum as metadata on the SRB file path
• Upload to SRB
• ChecksumAfter
• Data replication via the SProxy actor
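One way to read the checksum workflow is sketched below. Several things here are assumptions: the slides do not name a checksum algorithm (SHA-256 is used), the exact order of the steps in the diagram is not recoverable, and plain dictionaries stand in for the SRB store, its metadata and the replica. The function names checksum and archive are hypothetical.

```python
import hashlib


def checksum(data):
    """SHA-256 hex digest (the algorithm is an assumption of this sketch)."""
    return hashlib.sha256(data).hexdigest()


def archive(path, data, store, metadata, replica):
    """Uploads data, verifies it, records the checksum, then replicates."""
    before = checksum(data)                 # ChecksumBefore
    store[path] = bytes(data)               # Upload to SRB (dict stand-in)
    after = checksum(store[path])           # ChecksumAfter
    if before != after:
        raise IOError(f"checksum mismatch for {path}")
    metadata[path] = {"checksum": after}    # record checksum as metadata
    replica[path] = store[path]             # replicate only verified data
    return after


store, metadata, replica = {}, {}, {}
digest = archive(
    "/home/demo/clip.mpg", b"example media bytes", store, metadata, replica
)
```

Replicating only after the two checksums agree means a corrupted upload is never copied further.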
Summary
• Kepler is good at:
  • Integrating data, programs, and computing resources
  • Capturing your ideas and realizing them
  • Supporting computational experiment creation, execution, sharing, and reuse
  • Quickly prototyping scientific workflows
  • Building streamlining applications
• Combination of Kepler and SRB
  • Flexible and powerful Data Grid applications
Future Work
• Add more SRB Scommands
  • Serror: to display errors
  • SgetColl: to display information on SRB data objects
• An experimental SRB domain
• Optimize data transfers
• Provide connections at system level
• Improve usability
• New workflows as solutions to well-known data management challenges
Questions?
http://www.sdsc.edu
http://kepler-project.org