QCDgrid User Interfaces James Perry, Andrew Jackson, Stephen Booth, Lorna Smith EPCC, The University Of Edinburgh
QCDgrid Summary • QCDgrid project is developing a data and compute grid for scientists in the UKQCD collaboration • data storage grid has been up and running for some months now • job submission system is in early stages of development • software developed is released as open source • builds on Globus 2.0, eXist XML database and various other technologies • For more information on the project in general, see Lorna’s talk this afternoon!
User Requirements • User interface is naturally driven by users’ requirements • most QCDgrid users have a good understanding of computers • for them, advanced scripting capabilities are more important than user-friendly GUIs • powerful command line interface is top priority for QCDgrid software • GUIs also useful for some operations • for example, searching and browsing the metadata catalogue • C and Java APIs facilitate integration with software in many different programming languages
Datagrid Interfaces • Data management aspect of grid consists of two distinct parts • Low level data replication deals with files themselves • files at this level are just blocks of binary data – they could contain anything • Globus replica catalogue maps logical filenames to actual physical locations • Metadata catalogue associates some meaningful, structured information with each file • allows users to search for data more easily • maps interesting characteristics of data (structured as XML) to logical filenames
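The two layers described on this slide can be pictured as two lookup tables chained together. The following is a minimal in-memory sketch only, not QCDgrid code: the class name, method names and location strings are hypothetical, and the real system uses the Globus replica catalogue and an XML database rather than hash maps.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the two-level lookup; not part of QCDgrid.
public class TwoLevelCatalogueSketch {
    // Replica catalogue role: logical filename -> physical locations.
    private final Map<String, List<String>> replicas = new HashMap<>();
    // Metadata catalogue role: logical filename -> descriptive fields.
    private final Map<String, Map<String, String>> metadata = new HashMap<>();

    public void addReplica(String logicalName, String physicalLocation) {
        replicas.computeIfAbsent(logicalName, k -> new ArrayList<>()).add(physicalLocation);
    }

    public void describe(String logicalName, String field, String value) {
        metadata.computeIfAbsent(logicalName, k -> new HashMap<>()).put(field, value);
    }

    // Metadata search: logical names whose field has exactly this value, sorted.
    public List<String> search(String field, String value) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e : metadata.entrySet()) {
            if (value.equals(e.getValue().get(field))) {
                hits.add(e.getKey());
            }
        }
        Collections.sort(hits);
        return hits;
    }

    // Replica lookup: physical locations for a logical name (empty if unknown).
    public List<String> locate(String logicalName) {
        return replicas.getOrDefault(logicalName, new ArrayList<>());
    }

    public static void main(String[] args) {
        TwoLevelCatalogueSketch c = new TwoLevelCatalogueSketch();
        c.addReplica("gridfile", "nodeA:/data/f1");   // made-up locations
        c.addReplica("gridfile", "nodeB:/data/f1");
        c.describe("gridfile", "machine", "machineA");
        for (String name : c.search("machine", "machineA")) {
            System.out.println(name + " -> " + c.locate(name));
        }
    }
}
```

A search therefore never touches physical locations directly: it yields logical names, which the replica layer then resolves.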
Low Level Data Grid Interface • Low level operations provided by command line tools and C API • Java interface using JNI also available • SRM-compliant interface to some functionality • Fairly small set of basic operations • put a file or directory on the grid • get a file or directory from the grid • delete a file or directory • list files on grid • register interest in a file or directory • User must have a valid Globus proxy initialised
Data Grid Example Commands
• Some example data grid commands:
    put-file-on-qcdgrid /home/username/myfile gridfile
• puts the local file ‘myfile’ onto the grid under logical name ‘gridfile’
• replication software will take care of deciding where to store the file, adding replica catalogue entries, etc.
    get-file-from-qcdgrid -R griddir /tmp/mydir
• gets directory ‘griddir’ from the grid, storing it in local directory /tmp/mydir
• ‘-R’ switch means ‘recursive’, works with most QCDgrid commands
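Because the commands are meant to be driven from scripts, a wrapper program only needs to assemble the right argument list. The sketch below builds the two command lines shown above; the class and method names are hypothetical, and only the command names and the ‘-R’ switch come from the slide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that assembles argument lists for the commands above.
public class QcdgridCommandSketch {
    // Builds: put-file-on-qcdgrid <localPath> <gridName>
    static List<String> putCommand(String localPath, String gridName) {
        List<String> argv = new ArrayList<>();
        argv.add("put-file-on-qcdgrid");
        argv.add(localPath);
        argv.add(gridName);
        return argv;
    }

    // Builds: get-file-from-qcdgrid [-R] <gridName> <localPath>
    static List<String> getCommand(String gridName, String localPath, boolean recursive) {
        List<String> argv = new ArrayList<>();
        argv.add("get-file-from-qcdgrid");
        if (recursive) {
            argv.add("-R"); // recursive switch, as described on the slide
        }
        argv.add(gridName);
        argv.add(localPath);
        return argv;
    }

    public static void main(String[] args) {
        // Only prints the command lines; nothing is executed here.
        System.out.println(String.join(" ", putCommand("/home/username/myfile", "gridfile")));
        System.out.println(String.join(" ", getCommand("griddir", "/tmp/mydir", true)));
    }
}
```

On a machine with the QCDgrid tools installed and a valid Globus proxy, such a list could be handed to `new ProcessBuilder(argv)` to run the command; that step is left out here so the sketch runs anywhere.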
More Example Commands
• More example commands...
    qcdgrid-list
• lists all files on grid by logical name
    i-like-this-file interestingdata
• registers interest in the file with logical name ‘interestingdata’
• replication system takes this into account, tries to store files close to where they are most often wanted
    qcdgrid-delete olddata
• removes all copies of the file ‘olddata’ from grid
Data Grid APIs
• APIs provide similar functionality
• Example:
    QCDgridClient grid = QCDgridClient.getClient(true);
    String logicalFile = "gridfile";       // logical name on the grid
    File physicalFile = new File("localfile");  // local destination
    grid.getFile(logicalFile, physicalFile);
Metadata • Problem: logical file names may not be meaningful • users may have trouble finding data • Solution: metadata catalogue • associate some meaningful information with each file on the grid • including date produced, machine used, code used, actual physical parameters • users can then search on these fields • metadata is XML, stored in eXist XML database • queried using XPath query language • Command line, GUI and Java interfaces (via standard XMLDB API) available
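To make the idea of searching metadata with XPath concrete, here is a small sketch using the XPath support in the Java standard library rather than eXist or the XMLDB API. The XML layout and element names (`catalogue`, `entry`, `logicalName`, `machine`, `year`) are invented for illustration and are not the QCDgrid schema.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative only: queries a tiny, made-up metadata document with XPath.
public class MetadataXPathSketch {
    // Hypothetical metadata; the real schema is defined by the project.
    static final String CATALOGUE =
        "<catalogue>"
      + "<entry><logicalName>gridfile</logicalName>"
      + "<machine>machineA</machine><year>2002</year></entry>"
      + "<entry><logicalName>olddata</logicalName>"
      + "<machine>machineB</machine><year>2001</year></entry>"
      + "</catalogue>";

    // Evaluates an XPath expression and returns the text of each matching node.
    static List<String> select(String xml, String xpathExpr) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
            xpathExpr, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws XPathExpressionException {
        // Logical names of all entries produced in 2002.
        System.out.println(select(CATALOGUE, "/catalogue/entry[year='2002']/logicalName"));
    }
}
```

The query result is a list of logical filenames, which is exactly what the low level data grid needs in order to fetch the matching files.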
Metadata Interface: Commands
• Command line functionality currently limited to 3 operations
• submit
• remove
• update schema
• Examples:
    java QCDgridMetadataClient localhost:8080/exist \
        updateSchema newschema.xsd
    java QCDgridMetadataClient localhost:8080/exist \
        submit newfile.xml newdocumentid
Metadata Interface: GUI • Metadata browser GUI allows users to easily search for the data they want • XPath queries can be built using simple graphical input methods • GUI generated automatically from current schema • when schema is updated, GUI updates itself • matching data can be easily retrieved from the grid
Searching MDC, Step 1 • Main browser window gives a list of saved queries • these are stored in the user’s profile • support for ‘libraries’ of queries is planned
Searching MDC, Step 2 • Creating a new query • first a node in the XML document structure must be selected from the tree • tree is automatically generated from schema when browser starts up • e.g. to find all the data produced on a certain date, user should select the ‘date’ node
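One way a browser could derive its list of selectable nodes from a schema is to read the element declarations out of the XSD. The sketch below is an assumption about how such a step might look, not the QCDgrid browser's implementation; the sample schema is made up, and only `xs:element` declarations carrying a `name` attribute are collected.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustrative only: lists element names declared in an XML Schema document.
public class SchemaNodesSketch {
    static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";

    // Hypothetical schema with a 'date' and a 'machine' node under 'entry'.
    static final String SAMPLE_SCHEMA =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='entry'><xs:complexType><xs:sequence>"
      + "<xs:element name='date' type='xs:string'/>"
      + "<xs:element name='machine' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Returns declared element names in document order.
    static List<String> elementNames(String xsd) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed for namespace-based lookup
        Document doc = factory.newDocumentBuilder()
            .parse(new ByteArrayInputStream(xsd.getBytes(StandardCharsets.UTF_8)));
        NodeList decls = doc.getElementsByTagNameNS(XSD_NS, "element");
        List<String> names = new ArrayList<>();
        for (int i = 0; i < decls.getLength(); i++) {
            String name = ((Element) decls.item(i)).getAttribute("name");
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames(SAMPLE_SCHEMA));
    }
}
```

Because the list is rebuilt from the schema each time, a schema update changes the selectable nodes without any change to the browser code, which matches the behaviour described for the GUI.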
Searching MDC, Step 3 • Once node has been selected, predicate must be specified • this is just an XPath term for criteria for matching node data • predicate can be entered as raw XPath if desired • most users will want to make use of form to simplify process
Searching MDC, Step 3 cont... • More complex queries can be created relatively easily • in this example, the query is extended to search for data from 2 years • for most queries, knowledge of XPath is not required
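The form-based approach amounts to turning a selected node plus a few field values into an XPath predicate. A minimal sketch of that translation follows, assuming a hypothetical `year` child element and exact-match comparison; the actual form fields and generated XPath in the browser may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: builds an XPath query from form-style inputs.
public class XPathPredicateSketch {
    // nodePath: the node selected in the tree; childName: field to compare;
    // acceptedValues: values to match, joined with 'or'.
    static String buildQuery(String nodePath, String childName, List<String> acceptedValues) {
        if (acceptedValues.isEmpty()) {
            return nodePath; // no predicate: match every such node
        }
        List<String> terms = new ArrayList<>();
        for (String value : acceptedValues) {
            if (value.indexOf('\'') >= 0) {
                // Keep the sketch simple: refuse values that would break quoting.
                throw new IllegalArgumentException("value must not contain a single quote");
            }
            terms.add(childName + "='" + value + "'");
        }
        return nodePath + "[" + String.join(" or ", terms) + "]";
    }

    public static void main(String[] args) {
        // Search for data from two years, as in the slide's example.
        System.out.println(buildQuery("/catalogue/entry", "year", Arrays.asList("2001", "2002")));
    }
}
```

Users who know XPath can still type the predicate directly; the builder only covers the common case of matching one field against a few values.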
Searching MDC, Step 4 • New query now appears on list • from here, queries can be managed • queries can be combined together • or can be submitted to the database backend
Searching MDC, Step 5 • Matching metadata documents are displayed • XML is parsed into easy-to-read, expandable tree format • corresponding data files can be fetched from grid at the press of a button
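Turning an XML document into a tree view is a straightforward recursive walk over its DOM. The sketch below renders the tree as indented text instead of a GUI widget; the sample document and the output layout are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Illustrative only: renders an XML document as an indented text tree.
public class MetadataTreeSketch {
    static String render(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        StringBuilder out = new StringBuilder();
        append(doc.getDocumentElement(), 0, out);
        return out.toString();
    }

    private static void append(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  "); // two spaces per nesting level
        }
        out.append(node.getNodeName());
        NodeList children = node.getChildNodes();
        boolean hasElementChild = false;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                hasElementChild = true;
            }
        }
        if (!hasElementChild) {
            // Leaf element: show its text value next to the name.
            String text = node.getTextContent().trim();
            if (!text.isEmpty()) {
                out.append(": ").append(text);
            }
        }
        out.append('\n');
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                append(children.item(i), depth + 1, out);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical metadata document.
        System.out.print(render(
            "<entry><machine>machineA</machine><date><year>2002</year></date></entry>"));
    }
}
```

In a GUI, each recursion level would become an expandable node instead of an indented line.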
Job Submission • QCDgrid job submission still in very early stages • As with data management, users require command interface that can be used from scripts • integration with data grid will simplify user interface • unlike plain Globus, job input, output and error streams can be redirected to and from the user’s console • this allows for interactive jobs on the grid – useful for debugging etc. • GUI or web portal interface may be added later
Example Commands
• Early prototype of job submission software is up and running
• Syntax quite similar to globus-job-run
• Example commands:
    qcdgrid-job-submit qcdtest.epcc.ed.ac.uk \
        /bin/date
    qcdgrid-job-submit doorstopper.epcc.ed.ac.uk \
        /usr/bin/program arg1 arg2 arg3 \
        --fetch-from-qcdgrid gridName localName
Administrator Interface • Previous slides have focussed on normal end users’ experience • QCDgrid software also provides tools to aid in administration • commands to add and remove grid nodes, and change the state of existing nodes • commands for building and maintaining the Globus replica catalogue • commands for maintaining directory of grid users • Admin GUI to integrate many of these functions is a possibility
Some Admin Commands
• Administrators are identified by their certificate subjects
• Must have a valid proxy with subject listed in the config file before executing these commands
    add-qcdgrid-node newnode.ed.ac.uk Edinburgh \
        /home/qcdgrid
    disable-qcdgrid-node notworking.ed.ac.uk
    verify-qcdgrid-rc
    setup-security.sh adduser James Perry \
        /O=certificate/O=subject/CN=jtp \
        jamesp@epcc.ed.ac.uk
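The admin check described above boils down to testing whether the proxy's certificate subject appears in a configured list. The sketch below assumes, purely for illustration, a config format with one subject per line and `#` comments; the real QCDgrid config file format is not shown in these slides, and the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: checks a certificate subject against a list of admins.
public class AdminSubjectCheckSketch {
    // Assumed format: one subject per line, blank lines and '#' comments ignored.
    static Set<String> parseAdminSubjects(String configText) {
        Set<String> subjects = new HashSet<>();
        for (String line : configText.split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !trimmed.startsWith("#")) {
                subjects.add(trimmed);
            }
        }
        return subjects;
    }

    // True if the proxy's subject is one of the configured administrators.
    static boolean isAdmin(Set<String> adminSubjects, String proxySubject) {
        return adminSubjects.contains(proxySubject.trim());
    }

    public static void main(String[] args) {
        String config = "# grid administrators\n/O=certificate/O=subject/CN=jtp\n";
        Set<String> admins = parseAdminSubjects(config);
        System.out.println(isAdmin(admins, "/O=certificate/O=subject/CN=jtp"));
        System.out.println(isAdmin(admins, "/O=certificate/O=subject/CN=someoneelse"));
    }
}
```

An exact string match is the simplest possible policy; it is used here only to show where the subject from the proxy and the list from the config file meet.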
Interface Summary • Low level data grid has command line interface and APIs • Metadata catalogue mainly accessed through browser GUI • this also integrates with low level data grid • Job submission currently usable from command line only • possible GUI/web portal in future • Various admin tools exist or are in development • Better integration of the different parts of the project is planned