DISCWorld, Virtual Data Grids and Grid Applications
Paul Coddington
Distributed & High Performance Computing Group
Department of Computer Science, University of Adelaide
Adelaide SA 5005, Australia
Email: paulc@cs.adelaide.edu.au
December 2002
Background
• Distributed and High-Performance Computing Group
• Started 1996 at University of Adelaide, PDC joined Aug 1997
• Ken Hawick, Andrew Wendelborn, Francis Vaughan, Kevin Maciunas
• Originally part of Research Data Networks CRC
• Research into metacomputing middleware and on-line data archives
• DHPC Bangor started in 2000 by Ken Hawick
• Research areas:
  • Metacomputing (grid computing)
  • Java for High-Performance Computing
  • Parallel computing and parallel algorithms
  • Cluster computing
  • Scientific applications of DHPC
DISCWorld Project
• Distributed Information Systems Control World (DISCWorld) is a metacomputing middleware system being developed by DHPC.
• Mainly a vehicle for research into metacomputing systems, but also developing software and applications.
• Object-oriented, written in Java; provides access to well-known services.
• Still work in progress -- design work done, various modules in different states of completion.
• DISCWorld - high-level, but ideas not fully implemented.
• Globus - very low-level, limited capabilities, but the de facto standard.
• Would like to use high-level DISCWorld ideas, but utilize the grid tools, protocols, etc. being developed around Globus.
• Current work includes:
  • chains or process networks of services for remote distributed processing
  • transparent access to "active" distributed hierarchical file systems
  • integration with Globus tools (using the Java CoG kit)
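The "chains of services" idea above can be sketched in Java, the language DISCWorld is written in. This is a minimal illustration of composing a processing pipeline, not the actual DISCWorld service API; the `ServiceChain` class and the example service lambdas are invented for this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

public class ServiceChain {
    // Run a chain of services: the output of each service feeds the next,
    // as in a process network of remote processing steps.
    public static String run(List<UnaryOperator<String>> chain, String input) {
        String data = input;
        for (UnaryOperator<String> service : chain) {
            data = service.apply(data);
        }
        return data;
    }

    public static void main(String[] args) {
        // Hypothetical services: a format conversion followed by a crop.
        List<UnaryOperator<String>> chain = Arrays.asList(
            s -> s + "->converted",
            s -> s + "->cropped");
        System.out.println(run(chain, "raw-image"));
    }
}
```

In a real deployment each stage would be a remote service invocation rather than a local lambda; the point is only that a request names a composition of well-known services, not a single program.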
Virtual Data Grids
• Data grids - where storing, searching and accessing large (e.g. Pbyte) distributed data sets is at least as important as data processing.
• High-energy physics, astronomy, satellite data, biological data, …
• DHPC area of interest - On-Line Data Archives (OLDA) project.
• Distributed "active" data archives - or virtual data grids.
  • Servers don't just provide "passive" static data from files.
  • Can provide smart data pre-processing services (data reduction, conversion, etc.).
  • Server(s) generate data on the fly, or access a cached copy.
  • Specify data services or requirements (e.g. metadata), not filenames or URLs.
  • Transparently access "best" copy from distributed replicas.
• Distributed Active Resource arChitecture (DARC)
• International Virtual Data Grid Laboratory (IVDGL) work on virtual data grids
• Example:
  • user specifies required satellite image using metadata (time, region, satellite)
  • DARC node searches distributed archives, finds "nearest" copy, requests data
  • server does format conversion, georectification, cropping, ...
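The "transparently access the best copy" step above can be sketched as a replica selection over candidate hosts. This is an illustrative Java fragment, not DARC or IVDGL code; the hostnames and cost figures are invented, and a real data grid would consult a replica catalog and live network metrics rather than a static map.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class ReplicaSelector {
    // Pick the "nearest" replica: the candidate host with the smallest
    // access cost (e.g. estimated network distance or transfer time).
    public static String nearest(Map<String, Integer> replicaCost) {
        return Collections.min(replicaCost.entrySet(),
                Map.Entry.comparingByValue()).getKey();
    }

    public static void main(String[] args) {
        // Hypothetical replicas of the same satellite image.
        Map<String, Integer> replicas = new HashMap<>();
        replicas.put("archive.adelaide.edu.au", 5);
        replicas.put("archive.example.org", 120);
        System.out.println(nearest(replicas));
    }
}
```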
Active Data Using GASS
• Legacy applications can access data grid resources using filenames (URLs), e.g.:
  https://host:port/filename?program=Truncation&offset=1&length=100
  https://host:port/Truncation?filename=myfile&offset=&length=100
• Servlets using HTTPS
[Diagram: Active GASS Client, Active GASS Server with Job Manager, and Remote GASS Server across Hosts A, B and C, using a Host Table]
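The URL convention shown on this slide encodes an active-data request (a program plus its parameters) into an ordinary filename, which is what lets legacy applications use it. A minimal sketch of building such a URL, assuming the query-parameter layout from the slide; the `ActiveUrl` class itself is invented for illustration:

```java
public class ActiveUrl {
    // Build an active-data request URL in the slide's first form:
    //   https://host:port/filename?program=...&offset=...&length=...
    // The server interprets the query string as a processing request
    // rather than serving the raw file.
    public static String request(String host, int port, String file,
                                 String program, int offset, int length) {
        return "https://" + host + ":" + port + "/" + file
             + "?program=" + program
             + "&offset=" + offset
             + "&length=" + length;
    }

    public static void main(String[] args) {
        System.out.println(request("host", 443, "myfile", "Truncation", 1, 100));
    }
}
```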
DARC
• Distributed Active Resource Architecture (DARC)
• Allows building of distributed storage devices that support active data
• Peer-to-peer approach; each machine runs a DARC node
• User- (or system-) supplied Data Resources (DRs)
[Diagram: DARC nodes on Hosts A, B and C, each holding Data Resources (DR), communicating over TCP]
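The peer-to-peer structure above can be sketched as nodes that serve their own data resources and fall back to asking peers. This is a toy illustration of the topology only, not the DARC implementation: the `DarcNode` class, its single-hop lookup, and the in-memory map standing in for a Data Resource are all assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DarcNode {
    // In-memory stand-in for this node's Data Resources (DRs).
    private final Map<String, String> resources = new HashMap<>();
    private final List<DarcNode> peers = new ArrayList<>();

    public void register(String name, String data) { resources.put(name, data); }
    public void addPeer(DarcNode peer) { peers.add(peer); }

    // Serve locally if possible, otherwise ask direct peers
    // (one-hop lookup only, for brevity).
    public String lookup(String name) {
        String local = resources.get(name);
        if (local != null) return local;
        for (DarcNode peer : peers) {
            String hit = peer.resources.get(name);
            if (hit != null) return hit;
        }
        return null;
    }
}
```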
Active Data Using DARC
• Integration of DARC with Globus tools
• Allows DARC to use GASS, GridFTP, Replica Catalog
• Allows Globus grid applications to access DARC data resources
[Diagram: Active GASS Client, GASS Server Proxy with GSI, and DARC nodes with file systems and host tables across Hosts A, B and C]
Mobile Metacomputing Middleware
• m.Net 3G mobile network testbed in Adelaide city (North Terrace).
• Collaborative project to provide metacomputing middleware for mobile devices (e.g. iPAQ, phones) - starting next year.
• DISCWorld metacomputing ideas fit well in a mobile environment:
  • provide thin clients with access to a set of well-known remote services
  • provide resource brokering in a dynamic environment
  • Java implementation
• Middleware handles low-level network details:
  • dynamic network environment - mobile user, dropouts, handovers
  • 3G, regular mobile, 802.11 wireless, docking station
  • quality of service (adding a software layer to interface to the IP stack)
• Allow users (or applications) to specify policies for services, tasks, priorities, costs.
• Provide adaptation for dynamic network, user policies, QoS, cost.
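One way to picture the "policies for costs" and QoS adaptation mentioned above is a link-selection policy: pick the cheapest available network that still meets the application's bandwidth requirement. This is a hypothetical sketch; the `NetworkPolicy` class, the link names, and the bandwidth/cost numbers are invented, and a real middleware layer would read these from the live network state.

```java
import java.util.HashMap;
import java.util.Map;

public class NetworkPolicy {
    // Each link maps to {bandwidthKbps, costPerMinute}.
    // Return the cheapest link meeting the minimum bandwidth, or null.
    public static String choose(Map<String, int[]> links, int minKbps) {
        String best = null;
        int bestCost = Integer.MAX_VALUE;
        for (Map.Entry<String, int[]> e : links.entrySet()) {
            int bandwidth = e.getValue()[0];
            int cost = e.getValue()[1];
            if (bandwidth >= minKbps && cost < bestCost) {
                best = e.getKey();
                bestCost = cost;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, int[]> links = new HashMap<>();
        links.put("802.11", new int[]{11000, 1}); // fast and cheap when in range
        links.put("3G", new int[]{384, 10});      // slower, more expensive
        System.out.println(choose(links, 300));
    }
}
```

Adaptation then amounts to re-running the policy whenever the set of available links changes (handover, dropout, docking).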
Campus Computational Grid
• Many different compute resources available on campus:
  • Supercomputers (SGI box, PC cluster with Ethernet, Sun and PC clusters with Myrinet)
  • Several small clusters (Sun Netra, Alpha, Linux PC, JavaStation, …)
  • Student labs (Windows PCs, iMacs with OS X)
  • Desktop workstations
• Student labs are probably the largest computational resource!
• A mixture of non-interoperable cluster management systems, each with pros and cons; significant effort to install and maintain:
  • Condor - desktop workstations and Windows PCs (but not good for parallel machines)
  • Proprietary CMS (e.g. on Sun cluster)
  • PBS - other parallel computers
  • Only Sun Grid Engine currently ported to Mac OS X
• How to integrate this heterogeneous mix of compute resources and management systems to make them easily and transparently accessible to a variety of users?
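The integration question above is partly a matching problem: route each job to a resource whose capabilities cover what the job needs. A toy sketch under assumed names (the `ResourceMatcher` class, resource names, and capability tags are all invented for illustration):

```java
import java.util.Map;
import java.util.Set;

public class ResourceMatcher {
    // Return the first resource advertising the capability the job needs
    // (e.g. "myrinet", "shared-memory", "serial"), or null if none does.
    public static String match(String need, Map<String, Set<String>> resources) {
        for (Map.Entry<String, Set<String>> e : resources.entrySet()) {
            if (e.getValue().contains(need)) {
                return e.getKey();
            }
        }
        return null;
    }
}
```

A real broker would also weigh load, queue length, and user policy, but even this first-fit form shows why a common capability description across Condor, PBS, and the proprietary CMSs is needed before transparent access is possible.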
Problems with Campus Grids
• Could integrate with Globus, but how to submit jobs?
  • Not globusrun - too low level.
• Ideally users would submit jobs as they do now - with shell scripts, PBS job scripts, Condor job submission files - and have them run on any resource (whether or not it runs PBS, Condor, etc.).
• But currently requires something like: CMS script -> RSL/globusrun -> CMS/scheduler
• Globus (mostly) handles the second translation, but not the first.
• Non-trivial - a CMS combines job specification/execution and resource request/brokering, but Globus separates the two.
• How to match jobs with appropriate resources (e.g. which jobs need shared memory, Myrinet, Ethernet, no comms)?
• How to interface and cycle-share with external grid resources?
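The missing first translation (CMS script -> RSL) can be sketched as a mapping from a few common PBS-style job parameters onto Globus RSL attributes. The RSL attribute names (`executable`, `count`, `maxWallTime`) are real GRAM RSL attributes, but the `PbsToRsl` class, the parameter keys, and the coverage are illustrative; a real translator would need to handle far more of each system's job language, which is exactly why the slide calls this non-trivial.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PbsToRsl {
    // Map a few PBS-style job parameters to a Globus RSL string of the
    // form &(attribute=value)(attribute=value)...
    public static String translate(Map<String, String> pbs) {
        StringBuilder rsl = new StringBuilder("&");
        rsl.append("(executable=").append(pbs.get("executable")).append(")");
        if (pbs.containsKey("nodes")) {
            rsl.append("(count=").append(pbs.get("nodes")).append(")");
        }
        if (pbs.containsKey("walltime")) {
            rsl.append("(maxWallTime=").append(pbs.get("walltime")).append(")");
        }
        return rsl.toString();
    }

    public static void main(String[] args) {
        Map<String, String> pbs = new LinkedHashMap<>();
        pbs.put("executable", "/home/user/sim");
        pbs.put("nodes", "4");
        System.out.println(translate(pbs));
    }
}
```

Note what the sketch cannot express: the PBS script's resource request doubles as a scheduling directive, while in Globus the brokering decision has to be made separately before `globusrun` is invoked.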
Grid Applications
• Lattice Gauge Theory
  • Centre for the Subatomic Structure of Matter (CSSM)
  • International Lattice Data Grid for sharing simulation data
• Bioinformatics
  • National Centre for Plant Functional Genomics
  • Molecular Biosciences department
  • APGrid biogrid project
• High-energy physics
  • Collaboration between CSSM and Jefferson Lab in the US
  • Data analysis and results
• Access Grid
• Computational chemistry
• CANGAROO gamma ray telescope
  • Collaboration between Australia and Japan
  • Link to national/international Virtual Observatory projects