SAM Aggregated Topology Provider pedro.andrade@cern.ch 5 June 2013 IT/SDC/MI section meeting
History
• Development started in 2009, during the EGEE project, by Steve Traylen, James Casey, David Collados, and others
• Many features and improvements added by the BARC team during 2011-12
• Maintained by me in recent months
Overview
• ATP scope is:
  • Aggregate grid topology info from different sources
  • Single authoritative source of grid topology info
  • Manage groups of resources from VO perspective
Architecture
[Diagram. Components shown: GOCDB, BDII, OSG, Central Operations Portal, VO Feeds, ATP Sync, MyWLCG ATP API, MyWLCG ATP WEB, ORACLE, MYSQL]
Architecture
• ATP is composed of 3 main packages:
  • ATP Sync: a Python-based package that periodically synchronizes data from various topology providers. It also includes the PL/SQL for Oracle/MySQL and the Django model.
  • MyWLCG ATP Web: a front-end for ATP developed in Django. It provides a web interface to display and find the topologies of grid resources.
  • MyWLCG ATP API: a front-end for ATP developed in Django. It provides programmatic feeds that expose ATP data through JSON/XML interfaces.
Input
• CIC Portal:
  • VOs
  • VOMS
  • VO contacts
• GOCDB (EGI):
  • Sites, Services
  • Flavours, Downtimes
  • Site and region contacts
• RSV (OSG):
  • Sites, Services
  • Flavours, Downtimes
  • Capacity
• GStat:
  • Capacity
• REBUS:
  • WLCG federations
  • WLCG tiers
• BDII:
  • Service endpoints
  • Services/VOs mapping
  • MPI info
• VO Feeds:
  • VO groups of services
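The input list above can be written down as a small lookup table, which makes it easy to ask which sources supply a given kind of data. This is only an illustrative sketch: the grouping follows the bullet order on the slide, and the names `INPUTS` and `providers_of` are invented here, not part of ATP.

```python
# Sources and the kinds of data each one feeds into ATP, as listed on the slide.
# INPUTS and providers_of are hypothetical names used only for this sketch.
INPUTS = {
    "CIC Portal": ["VOs", "VOMS", "VO contacts"],
    "GOCDB (EGI)": ["Sites", "Services", "Flavours", "Downtimes",
                    "Site and region contacts"],
    "RSV (OSG)": ["Sites", "Services", "Flavours", "Downtimes", "Capacity"],
    "GStat": ["Capacity"],
    "REBUS": ["WLCG federations", "WLCG tiers"],
    "BDII": ["Service endpoints", "Services/VOs mapping", "MPI info"],
    "VO Feeds": ["VO groups of services"],
}


def providers_of(entity):
    """Return the sources that supply the given kind of data, in slide order."""
    return [source for source, entities in INPUTS.items() if entity in entities]


if __name__ == "__main__":
    for entity in ("Downtimes", "Capacity", "MPI info"):
        print(entity, "<-", ", ".join(providers_of(entity)))
```

Several kinds of data arrive from more than one source, which is where the aggregation work sits.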
Clients
• ATP WEB: POEM, NCG
• ATP DB: MRS, ACE, MyWLCG
Source Code
• Repo: http://svnweb.cern.ch/world/wsvn/sam/trunk/atp/
• Doc: http://sam-doc.web.cern.ch/sam-doc/atp/doc/build/html/
Configuration
• Default configuration structure distributed in ATP package:
  • atp_synchro.conf: main configuration file
  • atp_db.conf: database connection configuration
  • atp_logging_files.conf: location of log configuration file
  • atp_logging_parameters_config.conf: log configuration
  • roc.conf: list of enabled regions
  • vo_feeds.conf: list of enabled VO feeds
Execution
• Cronjob running ATP daemon:
    [root@samnag031 ~]# cat /etc/cron.d/atp-sync
    50 * * * * edguser [ -f /var/lock/subsys/atp_synchro ] && ( /usr/bin/atp_synchro -d /etc/atp/atp_db.conf -c /etc/atp/atp_synchro.conf -l /etc/atp/atp_logging_files.conf ) > /dev/null 2>&1
• ATP sync execution is structured in synchronizers:
    [root@samnag031 ~]# cat /etc/atp/atp_synchro.conf
    cic_portal = Yes
    gocdb_topology = Yes
    gocdb_downtime = Yes
    osg = Yes
    osg_downtime = Yes
    gstat = Yes
    bdii = Yes
    vo_feeds = Yes
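As a quick way to see which synchronizers a given atp_synchro.conf turns on, the `name = Yes` lines can be read with a few lines of Python. This is a sketch under assumptions, not ATP's own parser: it assumes one `name = value` pair per line, treats `Yes` case-insensitively, and skips blank lines and `#` comments. The function name `enabled_synchronizers` is invented for this example.

```python
# Contents of atp_synchro.conf exactly as shown on the slide.
SLIDE_CONF = """\
cic_portal = Yes
gocdb_topology = Yes
gocdb_downtime = Yes
osg = Yes
osg_downtime = Yes
gstat = Yes
bdii = Yes
vo_feeds = Yes
"""


def enabled_synchronizers(conf_text):
    """Return the names of synchronizers set to Yes, in file order.

    Hypothetical helper: the real ATP parsing rules are not shown in the
    slides, so comment and case handling here are assumptions.
    """
    enabled = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not a key/value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if value.strip().lower() == "yes":
            enabled.append(name.strip())
    return enabled


if __name__ == "__main__":
    print(enabled_synchronizers(SLIDE_CONF))
```

Setting an entry to anything other than `Yes` leaves that synchronizer out of the returned list.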
Logs
• Log of last execution: /var/log/atp/atp.log
• Log of all executions: /var/log/atp/atp_full.log (logrotate)
• Errors are also sent to system logging
• Six logging levels:
  • CRITICAL, ERROR, WARNING, INFO, DEBUG, NOTSET
  • Default configuration is INFO (20)
• Standard log file line:
  • “2012-03-22 15:24:02,308 - ATP - INFO - CIC - Execution - Starting”
  • CIC: synchronizer name (e.g. CIC, GOCDB Topology, VOFeeds, etc.)
  • Execution: task type (e.g. configuration, validation, execution)
  • Starting: action description
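The level names and the value 20 for INFO match Python's standard `logging` module, so a line in the format above can be reproduced with a plain formatter. The sketch below assumes ATP uses `logging` roughly this way; the format string and the helpers `make_logger` and `log_step` are illustrative guesses, not taken from the ATP source.

```python
import logging
import sys

# Assumed format string: timestamp - logger name - level - message.
LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"


def make_logger(stream, level=logging.INFO):
    """Build an 'ATP' logger writing to the given stream (hypothetical helper)."""
    logger = logging.getLogger("ATP")
    logger.setLevel(level)
    logger.handlers.clear()      # avoid duplicate handlers on repeated calls
    logger.propagate = False     # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger.addHandler(handler)
    return logger


def log_step(logger, level, synchronizer, task, action):
    """Emit one line as 'synchronizer - task type - action description'."""
    logger.log(level, "%s - %s - %s", synchronizer, task, action)


if __name__ == "__main__":
    log = make_logger(sys.stdout)
    log_step(log, logging.INFO, "CIC", "Execution", "Starting")
    # Suppressed: DEBUG (10) is below the default threshold INFO (20).
    log_step(log, logging.DEBUG, "CIC", "Execution", "Details")
```

With the threshold at INFO, only the first call produces a line; lowering the level to DEBUG would show both.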
Debug Tips
• The atp.log is quite useful to understand problems:
  • It will at least help to locate the affected synchronizer
• However, ATP is based on many PL/SQL procedures/functions:
  • SQL Developer will help ;)
• ATP synchronizes from distinct external data sources, so execution can fail due to “invalid” or “not available” input data:
  • Check the “validation” tag in atp.log to understand which data source was not reachable or was providing invalid data
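Checking the validation entries can be scripted. The sketch below splits each log line into the fields of the standard line format and keeps validation entries logged at WARNING or above. It assumes failed validations are logged at those levels, which the slides do not state, and `LINE_RE` and `failed_validations` are names made up for this example.

```python
import re
import sys

# Fields of the standard line: timestamp - logger - level - synchronizer - task - action.
# The last separator may be a hyphen or an en dash.
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"(?P<logger>\S+) - (?P<level>[A-Z]+) - "
    r"(?P<synchronizer>.+?) - (?P<task>\w+) [-\u2013] (?P<action>.*)$"
)

# Assumption: a failed validation is logged at one of these levels.
PROBLEM_LEVELS = {"WARNING", "ERROR", "CRITICAL"}


def failed_validations(lines):
    """Yield (synchronizer, action) for validation entries at WARNING or above."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a standard ATP log line
        if (match.group("task").lower() == "validation"
                and match.group("level") in PROBLEM_LEVELS):
            yield match.group("synchronizer"), match.group("action")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python script.py /var/log/atp/atp.log
    with open(sys.argv[1]) as handle:
        for synchronizer, action in failed_validations(handle):
            print(synchronizer, "->", action)
```

The synchronizer name in each result points to the data source to inspect first.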
Problems
• No support for other topology entities
  • Designed to monitor only services
• Services check
  • Strict dependency on services declared in GOCDB, OIM
• Duplication of PL/SQL code
  • Difficult to manage two versions for Oracle and MySQL
• Complex relational database model
  • e.g. isdeleted flags
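To illustrate why isdeleted flags add complexity, here is a generic soft-delete sketch using an in-memory SQLite database. The table, columns, and hostnames are invented for the example and do not reflect the actual ATP schema; the point is only that rows are flagged rather than removed, so every query has to remember the filter.

```python
import sqlite3


def build_demo_db():
    """Create a toy 'service' table with an isdeleted flag (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE service ("
        " id INTEGER PRIMARY KEY,"
        " hostname TEXT NOT NULL,"
        " flavour TEXT NOT NULL,"
        " isdeleted TEXT NOT NULL DEFAULT 'N')"
    )
    conn.executemany(
        "INSERT INTO service (hostname, flavour) VALUES (?, ?)",
        [("ce01.example.org", "CE"), ("se01.example.org", "SRM")],
    )
    return conn


def soft_delete(conn, hostname):
    """Flag the row instead of deleting it, so the record is kept."""
    conn.execute("UPDATE service SET isdeleted = 'Y' WHERE hostname = ?", (hostname,))


def active_services(conn):
    """Every reader must filter on the flag, or deleted rows leak into results."""
    rows = conn.execute(
        "SELECT hostname FROM service WHERE isdeleted = 'N' ORDER BY hostname"
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = build_demo_db()
    soft_delete(db, "ce01.example.org")
    print(active_services(db))
    print(db.execute("SELECT COUNT(*) FROM service").fetchone()[0])
```

Forgetting the `isdeleted = 'N'` condition in any one query is enough to return resources that no longer exist.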
Suggestions
• ATP was started in Sep 2009… 4 years ago
  • Perhaps it is ready for retirement
• The grid topology is always evolving
  • Perhaps less focus on state, and more on history
• Support for two RDBMS is hard
  • Perhaps no RDBMS can be even better