Accounting for the Grid: Usage Records and a Resource Usage Service
Acknowledgements... • Work presented is the output of two Global Grid Forum Working Groups • Usage Record Working Group (UR-WG) • Resource Usage Service Working Group (RUS-WG) • I was involved mainly in the RUS-WG • Work was funded through the UK e-Science Markets for Computational Services (MCS) project • The recent implementation of the RUS and much of the material presented today are from John Ainsworth, University of Manchester.
Accounting on the Grid? Q. Why is it different from HPC Center accounting? A. As with accounting for an HPC Center, we need to track usage on more than one machine, but: • users have single sign-on, so we need to work with X509 Distinguished Names... • ...and local usernames may differ from machine to machine • Also, some machines are at (and run by) different organizations
How do we do this? (1) • We know that different batch systems produce different accounting records • As many formats as batch systems (similar content) • But aggregating these directly is hard • Also, need to cope with single sign-on (X509) • So first, we create a standard accounting record representation (Usage Record) • Defined by the GGF UR-WG • This is defined as an XML Schema. The spec. and XML Schema are at: • http://www.psc.edu/~lfm/Grid/UR-WG/ • The work of this group is nearly completed • Specification is now stable
Example Usage Record
<UsageRecord xmlns="http://www.gridforum.org/2003/ur-wg"
             xmlns:urwg="http://www.gridforum.org/2003/ur-wg"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <RecordIdentity urwg:recordId="JSS-UNIQUE-ID" urwg:createTime="2003-08-13T18:56:56Z" />
  <JobIdentity>
    <GlobalJobId>green147989</GlobalJobId>
    <LocalJobId>147989</LocalJobId>
  </JobIdentity>
  <UserIdentity>
    <LocalUserId>wwmarko</LocalUserId>
    <ds:KeyInfo xmlns="http://www.w3.org/2000/09/xmldsig#" xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
      <X509Data>
        <X509SubjectName>CN=john ainsworth, L=MC, OU=Manchester, O=eScience, C=UK</X509SubjectName>
      </X509Data>
    </ds:KeyInfo>
  </UserIdentity>
...continued!
  <JobName>------</JobName>
  <Status>completed</Status>
  <TimeDuration urwg:type="cpuTimeRequested">PT1800S</TimeDuration>
  <TimeDuration urwg:type="wallTimeRequested">PT1800S</TimeDuration>
  <TimeInstant urwg:type="timeSubmitted">2004-11-29T06:47:30</TimeInstant>
  <Processors>1</Processors>
  <ProjectName>cs5015</ProjectName>
  <Host>green</Host>
  <CpuDuration>PT0.0S</CpuDuration>
  <WallDuration>PT1S</WallDuration>
  <StartTime>2004-11-29T06:48:33</StartTime>
  <EndTime>2004-11-29T06:48:34</EndTime>
  <MachineName>green</MachineName>
  <SubmitHost>wren</SubmitHost>
  <Queue>normal</Queue>
  <Resource urwg:description="quoteReference">contract1234</Resource>
  <Resource urwg:description="contractNumber">escience</Resource>
</UsageRecord>
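To show how a consumer might read such a record, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The `RECORD` string is a cut-down copy of the example above (the `ds:` prefix is written out explicitly, which is equivalent to the default-namespace form on the slide), and the `summarise` helper and its output keys are illustrative names, not part of the Usage Record specification.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as they appear in the example record.
NS = {
    "urwg": "http://www.gridforum.org/2003/ur-wg",
    "ds": "http://www.w3.org/2000/09/xmldsig#",
}

# Cut-down copy of the example record; only a few elements are kept.
RECORD = """\
<UsageRecord xmlns="http://www.gridforum.org/2003/ur-wg"
             xmlns:urwg="http://www.gridforum.org/2003/ur-wg">
  <RecordIdentity urwg:recordId="JSS-UNIQUE-ID"
                  urwg:createTime="2003-08-13T18:56:56Z"/>
  <UserIdentity>
    <LocalUserId>wwmarko</LocalUserId>
    <ds:KeyInfo xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
      <ds:X509Data>
        <ds:X509SubjectName>CN=john ainsworth, L=MC, OU=Manchester, O=eScience, C=UK</ds:X509SubjectName>
      </ds:X509Data>
    </ds:KeyInfo>
  </UserIdentity>
  <MachineName>green</MachineName>
  <WallDuration>PT1S</WallDuration>
</UsageRecord>
"""


def summarise(record_xml):
    """Hypothetical helper: pull a few fields out of one usage record."""
    root = ET.fromstring(record_xml)

    def text(path):
        node = root.find(path, NS)
        return node.text if node is not None else None

    identity = root.find("urwg:RecordIdentity", NS)
    return {
        # Prefixed attributes are stored under their "{uri}name" key.
        "record_id": identity.get("{%s}recordId" % NS["urwg"]),
        "local_user": text("urwg:UserIdentity/urwg:LocalUserId"),
        "subject_dn": text(".//ds:X509SubjectName"),
        "machine": text("urwg:MachineName"),
        "wall_duration": text("urwg:WallDuration"),
    }
```

Note that the record carries both the local username and the X509 subject name, which is what lets usage from machines with different local accounts be attributed to one Grid identity.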
How do we do this? (2) • Next, we need somewhere to store the records • Something that we can push records into, and pull them back out of • So we define a standard Web Service interface (Resource Usage Service) • Defined by the GGF RUS-WG • Service interface is based on “plain” Web Services, i.e. it is compliant with the WS-I Basic Profile 1.0 • This is defined as WSDL with XML Schema. The spec. is being updated prior to going to the GGF Editor • http://www-unix.gridforum.org/mail_archive/rus-wg/maillist.html • The work of this group is nearly completed • Specification is now stable
How do I work this? • Specs are all very well, but what about running it? • There is an implementation of the RUS • Also a record spooler for uploading records • Built at ESNW in Manchester • Will be maintained by LeSC in London • Will receive continued support through the UK’s Open Middleware Infrastructure Institute (OMII) • Current version is downloadable: • http://www.sve.man.ac.uk/Research/AtoZ/MCS/RUS/
How do I generate records? • This is the trickiest part... • To some extent, this is scheduler-specific • Platform LSF can generate UR format directly • For OpenPBS/PBSPro, you can use SourceForge’s PBSAccounting • http://pbsaccounting.sourceforge.net • The complex part is getting the X509 Distinguished Name into the record (for Grid jobs) • Need to tweak Globus jobmanagers
Service Interface (1) • Write Operations • insertUsageRecords(UsageRecord[]), replaceUsageRecords(RecordAndId[]) • deleteRecords(XpathQuery), deleteSpecificRecord(RecordId[]) • modifyUsageRecordPart, updateUsageRecordPart (not implemented) • Read Operations • extractRecords(XpathQuery), extractSpecificRecords(RecordId[]) • extractUsageByGlobalUserId, extractUsageByMachineName, extractUsageBySubmitHost
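To make the XPath-based read operations concrete, here is a rough in-memory sketch. It is not the RUS implementation: `InMemoryStore`, `make_record` and the Python method names are hypothetical stand-ins for the operations listed above, and it relies on the limited XPath subset supported by Python's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

UR_NS = "http://www.gridforum.org/2003/ur-wg"
NS = {"urwg": UR_NS}


def make_record(machine, submit_host, project):
    """Build a tiny usage record holding only three elements (illustrative)."""
    rec = ET.Element("{%s}UsageRecord" % UR_NS)
    for tag, value in (("MachineName", machine),
                       ("SubmitHost", submit_host),
                       ("ProjectName", project)):
        ET.SubElement(rec, "{%s}%s" % (UR_NS, tag)).text = value
    return rec


class InMemoryStore:
    """Hypothetical stand-in for the service's record store."""

    def __init__(self):
        self._root = ET.Element("records")

    def insert_usage_records(self, records):
        # Mirrors insertUsageRecords(UsageRecord[]), without any checks.
        self._root.extend(records)

    def extract_records(self, xpath_query):
        # Mirrors extractRecords(XpathQuery); the query is evaluated
        # relative to the container element that holds all records.
        return self._root.findall(xpath_query, NS)

    def extract_usage_by_machine_name(self, machine):
        # A convenience query is just a fixed XPath with one parameter.
        # Real code would have to guard against quotes in `machine`.
        return self.extract_records(
            "urwg:UsageRecord[urwg:MachineName='%s']" % machine)
```

For example, after inserting records for machines `green` and `newton`, `store.extract_usage_by_machine_name("green")` returns only the `green` records, and the same result can be obtained through `extract_records` with the equivalent XPath string.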
Service Interface (2) • Management • retrieveConfiguration • updateConfiguration • Faults • RUSProcessingFault • RUSUserNotAuthorised • RUSInputFault
Security Model • Role-based security • Specified through access control file (XML) (Cached) • Administrator • Unrestricted read/write authorization • ResourceManager • Restricted read/write authorization • Requires a ResourceDescription to specify the resources for which the RM has permission • ResourceTypes are urwg:MachineName, urwg:SubmitHost, urwg:ProjectName and Domain • Authorization for a record determined by Logical AND between different ResourceTypes, logical OR within values of same ResourceType • All other users denied both read and write access
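The AND/OR rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the dict-based representation of a ResourceDescription, exact string matching for every ResourceType (including Domain), and denying a ResourceManager with an empty description are all choices made for this example, not details taken from the implementation.

```python
def is_authorised(role, permitted, record):
    """Decide whether a caller may read or write one record.

    role      -- "Administrator", "ResourceManager", or anything else
    permitted -- dict mapping a ResourceType to the set of allowed values
                 (a stand-in for the ResourceDescription)
    record    -- dict mapping a ResourceType to the record's value
    """
    if role == "Administrator":
        return True            # unrestricted read/write
    if role != "ResourceManager":
        return False           # all other users are denied
    if not permitted:
        return False           # assumption: no description, no access
    # Logical AND across ResourceTypes, logical OR within one type's values.
    return all(record.get(rtype) in allowed
               for rtype, allowed in permitted.items())
```

With `permitted = {"urwg:MachineName": {"green", "newton"}, "urwg:ProjectName": {"cs5015"}}`, a record from `green` in project `cs5015` is allowed, while a record from `green` in any other project is refused because the ProjectName condition fails.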
Configuration • Mandatory Record Elements • A record must contain these elements for it to be valid for this RUS • Resolves the “everything is optional” problem inherent in the Usage Record specification
RUSUsageRecord • Internal wrapper around UsageRecord • Adds elements • RUSId • RecordHistory • Audit trail of record insertion and modification • Records who and when in StoredBy and ModifiedBy elements
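One way to picture the wrapper is as a small data structure that carries the original record plus an identifier and an audit trail. The sketch below is an assumption-laden illustration: the class layout, the use of a random UUID for the RUSId, and the `store`/`modify` method names are invented for this example; only the element names RUSId, RecordHistory, StoredBy and ModifiedBy come from the slide.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class HistoryEntry:
    action: str      # "StoredBy" or "ModifiedBy"
    who: str         # e.g. the caller's X509 Distinguished Name
    when: datetime


@dataclass
class RUSUsageRecord:
    usage_record_xml: str
    # Assumption: the RUSId is generated by the service; a UUID is one option.
    rus_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    record_history: list = field(default_factory=list)

    @classmethod
    def store(cls, usage_record_xml, who):
        """Wrap a newly inserted record and note who stored it, and when."""
        wrapped = cls(usage_record_xml)
        wrapped.record_history.append(
            HistoryEntry("StoredBy", who, datetime.now(timezone.utc)))
        return wrapped

    def modify(self, new_xml, who):
        """Replace the payload and append a ModifiedBy entry to the trail."""
        self.usage_record_xml = new_xml
        self.record_history.append(
            HistoryEntry("ModifiedBy", who, datetime.now(timezone.utc)))
```

Keeping the history as an append-only list means the original StoredBy entry is never overwritten by later modifications.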
InsertUsageRecords • Check user authorization for record • Validate record against schema • Check mandatory elements are present • Check the record is not a duplicate • Insert into database
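The insert steps above can be sketched as a single function. Several things here are assumptions made for illustration: the `MANDATORY` list, the use of a plain dict as the database, duplicate detection by `urwg:recordId`, and the root-element check standing in for real XML Schema validation (the Python standard library has no XSD validator). The returned strings reuse the fault and error-code names mentioned elsewhere in these slides.

```python
import xml.etree.ElementTree as ET

UR_NS = "http://www.gridforum.org/2003/ur-wg"
NS = {"urwg": UR_NS}

# Hypothetical configuration of mandatory record elements for this service.
MANDATORY = ("urwg:RecordIdentity", "urwg:UserIdentity", "urwg:MachineName")


def insert_usage_record(db, record_xml, is_authorised):
    """Run one record through the insert checks; return a status string.

    db            -- dict used as a stand-in for the database
    is_authorised -- callable taking the parsed record, returning True/False
    """
    try:
        root = ET.fromstring(record_xml)   # parsing is needed by every step
    except ET.ParseError:
        return "InvalidRecord"
    # 1. Check user authorization for the record.
    if not is_authorised(root):
        return "RUSUserNotAuthorised"
    # 2. Validate against the schema (stand-in: root element check only).
    if root.tag != "{%s}UsageRecord" % UR_NS:
        return "InvalidRecord"
    # 3. Check that the mandatory elements are present.
    if any(root.find(path, NS) is None for path in MANDATORY):
        return "InvalidRecord"
    # 4. Check the record is not a duplicate (assumed key: recordId).
    record_id = root.find("urwg:RecordIdentity", NS).get("{%s}recordId" % UR_NS)
    if record_id is None:
        return "InvalidRecord"
    if record_id in db:
        return "DuplicateRecord"
    # 5. Insert into the database.
    db[record_id] = record_xml
    return "OK"
```

Doing the cheap checks before touching the store keeps rejected records from ever being written, and inserting the same record twice yields `DuplicateRecord` on the second call.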
Implementation notes • Started with WS-Security, but moved to TLS • More widely available • Extended set of error codes • Added InvalidRecord and DuplicateRecord (used in response for insert and replace) • Database stores each record as a document • Xindice single document size limitation • Developed web-based query client • Developed a Perl usage record spooler
Final Comments • This is mature work, which is being deployed in the UK, at Manchester, and other sites • The work is based on emerging standards from the Grid community • The implementation has a future, including development and support • Any questions?