AstroGrid-D Monitoring

AstroGrid-D Meeting @ AIP, 25.-26.2.2008. Frank Breitling, Stephan Braune. Contents: host monitoring (for compute resources) and robotic telescope monitoring; for each, the status, the goals until the end of the project, and the perspectives beyond the project.

Presentation Transcript


  1. AstroGrid-D Monitoring
     AstroGrid-D Meeting @ AIP, 25.-26.2.2008
     Frank Breitling, Stephan Braune

  2. Contents
     - Host Monitoring (for compute resources)
       - Status
       - Goals until the end of the project
       - Perspectives beyond the project
     - Robotic Telescope Monitoring
       - Status
       - Goals until the end of the project
       - Perspectives beyond the project

  3. Host Monitoring Status
     - An AGD monitoring solution has existed since Dec. 2007
     - It builds on:
       - Audit Logging provided by Globus Toolkit V4.0.5 and later
       - PostgreSQL database (DB)
       - DB triggers
       - Usage Records (UR) XML format (http://staff.psc.edu/lfm/PSC/Grid/UR-WG/)
       - XML2RDF XSLT
       - Stellaris
       - SPARQL queries
     - A test setup has been running at the AIP since Dec. 2007

  4. AGD Monitoring Architecture
     [Architecture diagram. Labels: User Workstation, Globus Client (globusrun_ws, globus_job_run), Globus grid resource, Audit Database, Trigger, curl, Stellaris RDF database, Browser, SPARQL queries, Timelines]
     - Earlier: status information via EPR files and monitoring.pl

  5. Activation of Audit Logging in Globus for WS GRAM (globusrun-ws)
     Changes in the Globus Toolkit configuration:
     - In $GLOBUS_LOCATION/container-log4j.properties:

         ...
         # GRAM AUDIT
         log4j.category.org.globus.exec.service.exec.StateMachine.audit=DEBUG, AUDIT
         log4j.appender.AUDIT=org.globus.exec.utils.audit.AuditDatabaseAppender
         log4j.appender.AUDIT.layout=org.apache.log4j.PatternLayout
         log4j.additivity.org.globus.exec.service.exec.StateMachine.audit=false

     - Output goes to a database (PostgreSQL or MySQL). The database connection has to be declared in $GLOBUS_LOCATION/etc/gram-service/jndi-config.xml:

         <resource ...>
           <resourceParams>
             ...
             <parameter>
               <name>url</name><value>jdbc:mysql://<host>[:port]/auditDatabase</value>
             </parameter>
             <parameter><name>user</name><value>globus</value></parameter>
             <parameter><name>password</name><value>foo</value></parameter>
             ...
           </resourceParams>
         </resource>

     - The table is updated whenever a job is started or changes its status (contrary to SAGAS)
     - The database content is converted into Usage Record format and sent to Stellaris via DB triggers

  6. Activation of Audit Logging in Globus for Pre WS GRAM (globus-job-run)
     Changes in the Globus Toolkit configuration:
     - In $GLOBUS_LOCATION/log4j.properties:

         ...
         # GRAM AUDIT
         log4j.category.org.globus.exec.service.exec.StateMachine.audit=DEBUG, AUDIT
         log4j.appender.AUDIT=org.globus.exec.utils.audit.AuditDatabaseAppender
         log4j.appender.AUDIT.layout=org.apache.log4j.PatternLayout
         log4j.additivity.org.globus.exec.service.exec.StateMachine.audit=false

     - Text file output has to be configured in $GLOBUS_LOCATION/etc/globus-job-manager.conf:

         -audit-directory /tmp/globus

     - The file is converted into Usage Record format and sent to Stellaris via a cron job

  7. Audit Fields in PostgreSQL DB

  8. DB Trigger
     - The triggers are installed in the PostgreSQL DB using:

         audit=# \i trigger.sql

     - Documentation is available on the AGD intranet: http://mintaka.aip.de:8080/lenya/intranet/live/workpackages/wg2/GRAM_audit_logging.pdf

         CREATE FUNCTION update_stellaris() RETURNS "trigger" AS $update_stellaris$
           use strict; use URI; use Net::hostent; use XML::Writer;
           use HTTP::Request; use LWP::UserAgent;

           # Derive the record id (hex-encoded URI query) and the canonical host name
           my $job_grid_id = URI->new($_TD->{new}{job_grid_id});
           my $id = unpack("H*", $job_grid_id->query());
           my $host = gethost($job_grid_id->host())->name();

           # Build the Usage Record XML from the new row
           my $usage_record = "";
           my $writer = XML::Writer->new(OUTPUT => \$usage_record, NEWLINES => 1, UNSAFE => 1);
           $writer->xmlDecl("UTF-8");
           $writer->startTag("JobUsageRecord", "xmlns" => "http://www.gridforum.org/2003/ur-wg#", ...);
           $writer->startTag("RecordIdentity");
           $writer->dataElement("LocalJobId", $_TD->{new}{local_job_id});
           $writer->endTag("RecordIdentity");
           .....
           $writer->raw($_TD->{new}{job_description});
           $writer->dataElement("success_flag", $_TD->{new}{success_flag});
           $writer->dataElement("finished_flag", $_TD->{new}{finished_flag});
           $writer->endTag("JobUsageRecord");
           $writer->end();

           # Send the record to Stellaris with an HTTP PUT
           my $req = HTTP::Request->new("PUT",
             "http://stellaris.astrogrid-d.org/files/hosts/".$host."/urs/".$id,
             HTTP::Headers->new(Content_Length => length($usage_record)),
             $usage_record);
           my $ua = LWP::UserAgent->new();
           my $res = $ua->request($req);
           .....
           return;
         $update_stellaris$ LANGUAGE plperlu;

         CREATE TRIGGER update_stellaris_trig
           BEFORE INSERT OR UPDATE ON gram_audit_table
           FOR EACH ROW EXECUTE PROCEDURE update_stellaris();
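The core steps of the trigger (derive a record id and host from job_grid_id, build a Usage Record, PUT it to Stellaris) can be sketched outside the database in Python. This is only an illustration under assumptions: the function names, the `resolve` parameter, and the example job_grid_id used in the usage note are hypothetical, and only the fields visible in the trigger excerpt are written; the elided parts of the original are not reconstructed.

```python
import socket
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

UR_NS = "http://www.gridforum.org/2003/ur-wg#"
STELLARIS_BASE = "http://stellaris.astrogrid-d.org/files/hosts"


def record_id(job_grid_id):
    """Hex-encode the query part of the URI, like Perl's unpack("H*", ...)."""
    return urlparse(job_grid_id).query.encode("utf-8").hex()


def target_url(job_grid_id, resolve=socket.getfqdn):
    """Build the PUT target: <base>/<host>/urs/<id>.

    `resolve` maps the host in the URI to the name used in the path;
    the default does a DNS lookup (assumption, mirroring gethost()->name()).
    """
    host = resolve(urlparse(job_grid_id).hostname)
    return f"{STELLARIS_BASE}/{host}/urs/{record_id(job_grid_id)}"


def build_usage_record(row):
    """Serialize the fields shown in the trigger excerpt; `row` is a dict."""
    ET.register_namespace("", UR_NS)
    root = ET.Element(f"{{{UR_NS}}}JobUsageRecord")
    identity = ET.SubElement(root, f"{{{UR_NS}}}RecordIdentity")
    ET.SubElement(identity, f"{{{UR_NS}}}LocalJobId").text = str(row["local_job_id"])
    for field in ("success_flag", "finished_flag"):
        ET.SubElement(root, f"{{{UR_NS}}}{field}").text = str(row[field])
    return ET.tostring(root, encoding="unicode")


def put_record(job_grid_id, row):
    """Send the record with HTTP PUT (performs a real network request)."""
    request = urllib.request.Request(
        target_url(job_grid_id),
        data=build_usage_record(row).encode("utf-8"),
        method="PUT",
    )
    return urllib.request.urlopen(request)
```

For a made-up job_grid_id such as `https://grid.example.org:8443/wsrf/services/ManagedExecutableJobService?abc`, `record_id` returns the hex string of `abc`, and `target_url` appends it under `/urs/` for the resolved host.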

  9. SPARQL Queries for Usage Statistics
     - Query 1: the 25 most recently created jobs

         PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
         PREFIX ur: <http://www.gridforum.org/2003/ur-wg#>
         PREFIX x2r: <http://www.astrogrid-d.org/2007/08/14-xml2rdf#>
         SELECT ?job_grid_id ?GlobalUserName ?SubmitHost ?executable
                ?creation_time ?StartTime ?EndTime ?wdv ?Count ?CPU_Time
         WHERE { graph ?g {
           ?n1 ur:JobIdentity ?JobIdentity .
           ?JobIdentity ur:job_grid_id ?job_grid_id .
           ?n1 ur:UserIdentity ?UserIdentity .
           ?UserIdentity ur:GlobalUserName ?GlobalUserName .
           ?n1 ur:creation_time ?creation_time .
           ?n1 ur:SubmitHost ?SubmitHost .
           OPTIONAL { ?n1 ur:StartTime ?StartTime . ?n1 ur:EndTime ?EndTime . }
           OPTIONAL { ?n1 ur:WallDuration ?wall_duration . ?wall_duration x2r:value ?wdv . }
           OPTIONAL { ?n1 ur:Resource ?res . ?res x2r:value ?executable . }
           OPTIONAL { ?n1 ur:Count ?Count . }
           OPTIONAL { ?n1 ur:CPU_Time ?CPU_Time . }
         }}
         ORDER BY DESC(?creation_time)
         LIMIT 25

     - Query 2: summed CPU time per user, executable and submit host

         PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
         PREFIX ur: <http://www.gridforum.org/2003/ur-wg#>
         PREFIX x2r: <http://www.astrogrid-d.org/2007/08/14-xml2rdf#>
         SELECT distinct ?GlobalUserName ?executable ?SubmitHost sum(?CPU_Time)
         WHERE { graph ?g {
           ?n1 ur:JobIdentity ?JobIdentity .
           ?JobIdentity ur:job_grid_id ?job_grid_id .
           ?n1 ur:UserIdentity ?UserIdentity .
           ?UserIdentity ur:GlobalUserId ?GlobalUserName .
           ?n1 ur:SubmitHost ?SubmitHost .
           ?n1 ur:CPU_Time ?CPU_Time .
           OPTIONAL { ?n1 ur:Resource ?res . ?res x2r:value ?executable . }
         }}
         ORDER BY ?GlobalUserId
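The aggregation expressed by the second query (summing CPU time per user, executable and submit host) can also be done on the client side over rows returned by the first query. The Python sketch below assumes that each result row is available as a dict keyed by the query's variable names and that CPU_Time has already been converted to a number of seconds; both are assumptions, as is the function name.

```python
from collections import defaultdict


def cpu_time_per_user(rows):
    """Sum CPU_Time per (GlobalUserName, executable, SubmitHost).

    `rows` is an iterable of dicts; `executable` may be missing because
    it is OPTIONAL in the query. Rows without CPU_Time are skipped.
    The result is ordered by user name.
    """
    totals = defaultdict(float)
    for row in rows:
        if row.get("CPU_Time") is None:
            continue
        key = (row["GlobalUserName"], row.get("executable"), row["SubmitHost"])
        totals[key] += float(row["CPU_Time"])
    return dict(sorted(totals.items(), key=lambda item: item[0][0]))
```

Grouping in the client avoids relying on aggregate support in the query engine, at the cost of transferring every matching row.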

  10. Retrieving Usage Statistics via Stellaris

  11. Goals until the end of the project
      - Integrate monitoring info into the Timeline and Resource Map
      - Provide more SPARQL query templates (see svn://svn.gac-grid.org/software/monitoring/host/)
      - Provide improved documentation and installation instructions
      - Include all AGD institutes and resources in the monitoring
      - Move from test to production mode, i.e. solve the remaining problems

  12. Solve unstable DB connection
      - Audit Logging establishes a DB connection only once, i.e. the first time a job is submitted to Globus
      - If the DB goes down, the connection is lost and no further data is received => a restart of the Globus Container is necessary
      - Solution: we have informed the GT developers via the mailing lists gt-user & gram-user
      - Bug report: http://bugzilla.globus.org/globus/show_bug.cgi?id=5863
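The failure mode described here (a connection opened once and never re-established) and one possible remedy can be illustrated with a small Python sketch. It is not how Globus audit logging is implemented and not the fix requested in the bug report; the class name, the `connect` factory and the connection's `write` method are hypothetical.

```python
import time


class ReconnectingWriter:
    """Writes records through a connection that is reopened after a failure."""

    def __init__(self, connect, retries=3, delay=0.0):
        if retries < 1:
            raise ValueError("retries must be at least 1")
        self._connect = connect    # factory returning an object with write()
        self._retries = retries
        self._delay = delay        # seconds to wait between attempts
        self._conn = None

    def write(self, record):
        """Write `record`; return the number of attempts that were needed."""
        last_error = None
        for attempt in range(1, self._retries + 1):
            try:
                if self._conn is None:
                    self._conn = self._connect()
                self._conn.write(record)
                return attempt
            except ConnectionError as exc:
                # Drop the broken connection so the next attempt reconnects.
                last_error = exc
                self._conn = None
                time.sleep(self._delay)
        raise last_error
```

With such a wrapper a database restart costs at most a few retried writes instead of requiring a restart of the writing service.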

  13. Add missing fields in audit logging
      Some important information is not provided by audit logging:
      - Global job id (UUID format)
      - Resource usage information as reported by the UNIX time command, i.e.:
        (i) the elapsed real time
        (ii) the user CPU time
        (iii) the system CPU time
      - End time of the job, in the same format as creation_time
      - Name of the submission client
      - Name of the execution host (and maybe also the number of used CPUs)
      - Solution: we have informed the GT developers via the mailing lists gt-user & gram-user
      - Bug report: http://bugzilla.globus.org/globus/show_bug.cgi?id=5864
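The three resource usage values named above (elapsed real time, user CPU time, system CPU time) can be collected for a child process as in the following Python sketch. It only illustrates what such fields would contain; the function name and the returned dict keys are hypothetical, and this is not part of Globus audit logging.

```python
import os
import subprocess
import time


def measure(cmd):
    """Run `cmd` and report its real, user and system time in seconds.

    Uses the children_* fields of os.times(), which accumulate the CPU
    time of child processes that have been waited for.
    """
    wall_start = time.monotonic()
    before = os.times()
    completed = subprocess.run(cmd, check=False)
    after = os.times()
    return {
        "real": time.monotonic() - wall_start,
        "user": after.children_user - before.children_user,
        "sys": after.children_system - before.children_system,
        "returncode": completed.returncode,
    }
```

For example, `measure(["sleep", "1"])` on a UNIX system reports a real time of about one second with user and system times close to zero.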

  14. Add Usage Record (UR) format
      - Audit logging is not compatible with the UR format, the OGF standard for monitoring information
      - Currently we construct URs via database triggers
      - Solution: we have informed the GT developers via the mailing lists gt-user & gram-user
      - Bug report: http://bugzilla.globus.org/globus/show_bug.cgi?id=5865

  15. Simplify installation procedure
      Currently:
      - PostgreSQL has to be recompiled with Perl support
      - DB triggers have to be installed
      - Globus configuration is necessary
      Solution: we want to simplify the installation process, maybe with a Globus helper package

  16. Upgrade to Stellaris V 0.2.0
      - Currently a few problems also exist with Stellaris V 0.2.0
      - We continue testing and report every problem to Mikael Högqvist

  17. Perspectives beyond the project
      - Define a common policy about data privacy, since AGD resources are shared with other grid communities (e.g. LRZ), which might have different restrictions on the logging of user information
      - Suggest the AGD monitoring solution to other grid communities

  18. Robotic Telescopes Status
      - New vision of the RT project, as reflected by the new name: OpenTel
      - Corresponding project page: http://www.gac-grid.org/project-products/RoboticTelescopes.html
      - OpenTel is an open network for robotic telescopes. Open means:
        - open standards
        - open source
        - open for telescopes to join
      - OpenTel is for professional and amateur astronomers
      - OpenTel is currently the only open network and therefore a unique and promising approach in robotic astronomy

  19. Project History
      Progress so far:
      - D2.4 Static metadata: FB, done (15.5.2007)
      - D2.7 Dynamic metadata / Monitoring: FB, 66% complete, publication expected in March
      - D5.3 First Integration of RTs: FB, done (31.7.2007)
      Goals until end of project:
      - D5.5 Resource Broker: TR, work in progress. FB will help.
      - D5.8 Scheduler: FB, TR, Thomas G., to be done

  20. Monitoring / Dynamic Metadata
      Monitoring a network of robotic telescopes (Deliverable 2.7):
      - STELLA-I & II as info providers for Stellaris
      - Same database triggers as for host monitoring
      - RDF Calendar format is used for scheduling info (understood by RDF tools)
      - Trigger templates can be easily adjusted for other telescopes
      - Software is collected in a package called "ottools"
      - Timeline showing the observation schedule directly from the STELLA DB (http://photon.aip.de:25000/timeline/telescopes.html)
      - Timeplot showing weather information (tbd)

  21. Goals until the end of the project
      - Provide a general solution for the integration of other telescopes. This requires:
        - Metadata management based on user certificates
        - Software package with tools and templates (ottools): svn://svn.gac-grid.org/software/OpenTel/ottools
        - Comprehensive documentation
      - Improved user interfaces:
        - Timeline & Timeplot with a menu for the selection of telescopes, time windows, etc.
        - Timeplot displaying new metadata of time series (temperature, seeing, etc.)
        - Resource map displaying dynamic metadata
      - Resource Broker (D5.5)
      - Scheduler (D5.8)
      - Integrate STELLA-I & STELLA-II
      - First observation via the grid

  22. Perspectives beyond the project
      - Improve the software, in particular the scheduler
      - Perform more grid observations and more testing
      - Perform first network observations
      - Integrate more telescopes, in particular from hobby astronomers. Software contributions would be welcome
      - Collaborate with other networks such as the LCOGT
      - Attract and collaborate with the amateur astronomy and open source communities
      - Find an OpenTel logo
