Stephen Booth s.booth@ed.ac.uk GridSafe Overview
Grid-SAFE • JISC-funded project to build a general-purpose accounting/monitoring solution. • http://gridsafe.forge.nesc.ac.uk/ • Builds on the accounting subsystem of the SAFE user administration system used by the UK national facilities HPCx and HECToR.
Challenges • Need to work with different HPC technologies • Different batch systems • Different middleware • Need to work with wide variety of different local policies. • Need to work with both grids and local HPC resources. • One solution won’t fit all potential users • Build kit of parts • Pre-built solutions for common deployment scenarios. • Key aims • Modular design, individual functions can be deployed independently • Behaviour can be customised using plug-ins to implement different service policies.
Data Formats • System can consume accounting data in a variety of formats. • Each format has a plug-in parser module • New formats can be supported by writing additional parser plug-ins. • Data is stored in an SQL database. • Additional policy plug-ins can augment the parser to customise behaviour. • (Diagram labels: Raw Data, Parser, Policy, Policy, Policy, DB)
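The flow on this slide (a parser plug-in per format, policy plug-ins augmenting each record, storage in an SQL database) could be sketched roughly as follows. This is a minimal illustration only, not Grid-SAFE's actual API: the registry, the `toy-csv` format, the `usage` table and all function names are assumptions made for the example.

```python
import sqlite3
from typing import Any, Callable, Dict, Iterable, List

Record = Dict[str, Any]
Parser = Callable[[str], Iterable[Record]]   # raw text -> records
Policy = Callable[[Record], Record]          # record -> augmented record

PARSERS: Dict[str, Parser] = {}              # hypothetical registry: format name -> parser


def register_parser(name: str):
    """Decorator that registers a parser plug-in under a format name."""
    def wrap(func: Parser) -> Parser:
        PARSERS[name] = func
        return func
    return wrap


@register_parser("toy-csv")                  # invented format, for illustration only
def parse_toy_csv(text: str) -> Iterable[Record]:
    for line in text.strip().splitlines():
        user, cpus, wall = line.split(",")
        yield {"user": user, "cpus": int(cpus), "wall": int(wall)}


def cpu_seconds_policy(rec: Record) -> Record:
    """Example policy: add a new property before the record is stored."""
    return {**rec, "cpu_seconds": rec["cpus"] * rec["wall"]}


def load(db: sqlite3.Connection, fmt: str, text: str, policies: List[Policy]) -> int:
    """Parse text with the plug-in for fmt, apply policies, store the records."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS usage "
        "(user TEXT, cpus INT, wall INT, cpu_seconds INT)"
    )
    count = 0
    for rec in PARSERS[fmt](text):
        for policy in policies:
            rec = policy(rec)
        db.execute(
            "INSERT INTO usage VALUES (:user, :cpus, :wall, :cpu_seconds)", rec
        )
        count += 1
    return count
```

Supporting a new format in this sketch means registering one more parser function; the loading code and the policies stay unchanged.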
Parser • System can support multiple input formats at the same time. • Currently supported parsers • OGF-UR XML • SGE accounting logfile • PBS accounting logfile • EGEE JobManager logfile • Etc. • New parsers are easy to write
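As an illustration of how small such a parser can be, here is a sketch for one line of a PBS/Torque-style accounting log, assuming the common layout `timestamp;record type;job id;key=value key=value ...`. It is not the Grid-SAFE parser; the function name is hypothetical, and values containing spaces are not handled.

```python
from typing import Dict, Optional


def parse_pbs_line(line: str) -> Optional[Dict[str, str]]:
    """Parse one accounting line; return None for anything but a job-end record."""
    parts = line.rstrip("\n").split(";", 3)
    if len(parts) != 4:
        return None                      # malformed line
    timestamp, rec_type, job_id, message = parts
    if rec_type != "E":                  # keep only job-end ("E") records
        return None
    record = {"timestamp": timestamp, "job_id": job_id}
    for token in message.split():        # assumption: no spaces inside values
        key, sep, value = token.partition("=")
        if sep:
            record[key] = value
    return record
```

Each returned dictionary keeps the log's own key names (for example `user` or `resources_used.walltime`), so mapping them onto database columns is left to a later step.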
OGF-UR support • OGF-UR XML is supported as an interchange format • Parser plug-in to parse OGF-UR • Export module to format internal data as OGF-UR • Grids may want to use only this format for central accounting. • Local instances could use raw data and generate UR for central processing. • Various grid communities seem to interpret OGF-UR differently and/or make additional requirements beyond those in the schema • Required fields • Different charging models • Different global username models • OGF-UR spec allows extensions. • Specification will also evolve over time. • Parser/exporter highly configurable to support variations/extensions.
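An export step of this kind might look like the sketch below, which formats one internal record as a small usage-record XML document. The namespace URI and element names follow my reading of the OGF UR 1.0 schema and should be checked against the specification; only a handful of fields are emitted, and the input dictionary keys are invented for the example.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the OGF Usage Record 1.0 schema; verify before relying on it.
URF_NS = "http://schema.ogf.org/urf/2003/09/urf"


def to_usage_record(rec: dict) -> str:
    """Format a (hypothetical) internal record as a minimal usage-record document."""
    ET.register_namespace("urf", URF_NS)

    def q(tag: str) -> str:
        return f"{{{URF_NS}}}{tag}"      # qualified name in ElementTree notation

    root = ET.Element(q("JobUsageRecord"))
    identity = ET.SubElement(root, q("RecordIdentity"))
    identity.set(q("recordId"), rec["record_id"])
    job = ET.SubElement(root, q("JobIdentity"))
    ET.SubElement(job, q("LocalJobId")).text = rec["job_id"]
    user = ET.SubElement(root, q("UserIdentity"))
    ET.SubElement(user, q("LocalUserId")).text = rec["user"]
    # Durations are written as ISO 8601 durations, e.g. PT3600S for one hour.
    ET.SubElement(root, q("WallDuration")).text = f"PT{rec['wall_seconds']}S"
    return ET.tostring(root, encoding="unicode")
```

Community-specific extensions or extra required fields would be added as further elements, which is where a configurable exporter becomes useful.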
Use in the grid • (Diagram with boxes labelled Grid accounting, Site accounting and Independent UR Generator, connected by links labelled XML)
Report generation module • Reports can be generated on demand from the web interface • Grid-SAFE uses XML templates to define reports • Can generate unified reports over multiple data tables containing different types of data • Tables/charts • Parameterised reports (e.g. to select user or project). • Supports reports in multiple output formats • PDF • HTML • CSV • XML
Report generation speed • Performance of report generation a particular issue • Number of database records key to this. • Need to utilise database effectively. Not acceptable to read all records into memory. • ~1,000,000 record database table not a problem. • Current National HPC systems within this range. • Throughput clusters often have significantly larger record counts due to large numbers of small short jobs. • Old data can be moved to separate tables. • Support for Daily aggregates via policy plug-in • Builds secondary accounting table combining similar records. • For ECDF 51 million records -> 35 thousand aggregates
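To illustrate the point about using the database effectively, the sketch below lets the SQL engine do the grouping and summing, so that only one row per user reaches the application instead of every job record. The `job_record` table and its columns are assumptions for the example, not the Grid-SAFE schema.

```python
import sqlite3
from typing import Iterator, Tuple


def usage_by_user(
    db: sqlite3.Connection, start: int, end: int
) -> Iterator[Tuple[str, int, int]]:
    """Yield (user, job count, total CPU seconds) for jobs ending in [start, end)."""
    cursor = db.execute(
        "SELECT user, COUNT(*), SUM(cpu_seconds) "
        "FROM job_record "
        "WHERE end_time >= ? AND end_time < ? "
        "GROUP BY user ORDER BY user",
        (start, end),
    )
    # Iterate over the cursor rather than calling fetchall(), so rows are
    # consumed one at a time.
    for row in cursor:
        yield row
```

For scale, the ECDF figures quoted on the slide (51 million records reduced to 35 thousand aggregates) correspond to a reduction of roughly 1,450-fold in the number of rows a report has to scan.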
Policy plug-ins • Allow behaviour to be customised to local requirements • Generate new properties • E.g. charge values • Trigger additional processing • Decrement charging allocations • Generate aggregate records • Etc. • New policies can be written for specific requirements
Aggregation Policy • Generates Aggregated records • Each time a new record is loaded • Corresponding aggregate is located/created • Aggregate values updated • The raw data is also kept and can be used in reports if required. • Aggregate data can be regenerated if required.
ClassificationPolicy • Converts selected fields from raw accounting data into references to separate database tables. • Reduces data footprint. • Augmenting information can be added to these tables. • Example tables (from diagram): User, Institution, URRecord, UnixGroup, Site, DailyAggregate
DerivedPolicy • Defines new properties as expressions over existing properties • E.g. (EndTime-StartTime)*CPUs • These expressions can then be used in reports.
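One way to evaluate such expressions is a small arithmetic evaluator that looks up names in the record. The sketch below uses Python's `ast` module and supports only `+`, `-`, `*`, `/`, names and numeric constants; it illustrates the idea and makes no claim about Grid-SAFE's own expression syntax.

```python
import ast
import operator

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expr: str, record: dict):
    """Evaluate an arithmetic expression over the properties of one record."""
    def walk(node):
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Name):
            return record[node.id]       # property lookup by name
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError(f"unsupported expression element: {ast.dump(node)}")

    return walk(ast.parse(expr, mode="eval").body)
```

With a record of `StartTime=100`, `EndTime=160` and `CPUs=4`, the slide's example `(EndTime-StartTime)*CPUs` evaluates to 240.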
LinkPolicy • Merges data from different sources • E.g. batch system logs and middleware logs. • Each data source is parsed to its own table. • Primary table parsed first. • LinkPolicy added to secondary data source. • Locates corresponding primary record • Adds cross reference or copies additional properties to primary
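The linking step can be pictured as a keyed lookup from each secondary record to its primary record. The sketch below works on in-memory dictionaries for brevity; the `job_id` key and the `global_user` property are hypothetical names, and a real deployment would presumably match on whatever fields both sources share.

```python
from typing import Dict, Iterable, List, Sequence


def link(
    primary: Dict[str, dict],
    secondary_records: Iterable[dict],
    key: str = "job_id",
    copy: Sequence[str] = ("global_user",),
) -> List[dict]:
    """Copy selected properties from secondary records onto matching primary records.

    primary maps the key value to the primary record (parsed first).
    Returns the secondary records for which no primary record was found.
    """
    unmatched = []
    for sec in secondary_records:
        prim = primary.get(sec.get(key))
        if prim is None:
            unmatched.append(sec)        # primary record missing or not yet loaded
            continue
        for prop in copy:
            if prop in sec:
                prim[prop] = sec[prop]
    return unmatched
```

Returning the unmatched records makes it possible to retry them later, for example when the batch system log arrives after the middleware log.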
Web Services • RUPI • Current proposal from OGF RUS-WG • Web service for the upload of XML usage records. • Grid-SAFE has an implementation of the current upload service (RUPI). • RUQI • Currently working on a proposal for a query specification • Aims • Easy to implement in different code bases. • Provide sufficient functionality for efficient report generation. • Long-term aim to provide a reporting portal that can query any system that implements this interface.