MySQL Replication and ApMon. Marco MEONI. Alice Offline weekly meeting Thursday 2nd December 2004. New MonALISA Repository server. http://alimonitor.cern.ch:8080 Pentium IV 2.8GHz, 2GB RAM, GB network. Outline. MySQL Replication ApMon Repository Web Services MonALISA@SC2004
MySQL Replication and ApMon Marco MEONI Alice Offline weekly meeting Thursday 2nd December 2004
New MonALISA Repository server • http://alimonitor.cern.ch:8080 • Pentium IV 2.8GHz, 2GB RAM, GB network
Outline • MySQL Replication • ApMon • Repository Web Services • MonALISA@SC2004 • Documentation
Repository DB Replication
Diagram: Data Replication. MASTER DB and SPARE DB linked by Online Replication (hosts aliweb01.cern.ch and alimonitor.cern.ch); uses: Data collecting and Grid Monitoring, Grid Simulation. Components shown: ROOT, CARROT, MonALISA Agents, Repository Web Services, AliEn API, LCG Interfaces, WNs monitoring (UDP), Web Repository.
Current Situation:
• 7+ GB of performance information, 24.5M records
• During DCs, data from ~1K monitored parameters arrive every 2-3 minutes
Why Replicate • Need a hot (on-line) spare • Scalable load balancing • Make non-disturbing backups • Separate environments (i.e. AliEn monitoring and Grid simulation)
Replication Basics
• Master records all update queries in the transaction (binary) log
• Slaves read the transaction log from the Master and run the queries locally
• Master server: write/read queries; Slave server: read queries only
• A Master can have many Slaves
• A Slave can have only one Master
• A server can be both a Master and a Slave (as in the Master/Slave/Slave topology)
• Masters are mostly unaware of their Slaves
• Topologies: Master/Slave(s), Multi-Master "Ring", Master/Slave/Slave
• Replication is asynchronous!
Replication Setup
1) Configure replication account on the master
2) Enable binary (transaction) log on the master (my.cnf file)
    server-id = #
    log-bin = filename
    binlog-do-db = dbname
    binlog-ignore-db = dbname
    max_binlog_size = #
3) Snapshot master
4) Install snapshot on the slave
5) Setup replication options on slave (my.cnf file)
    server-id = #
    master-host = hostname
    master-user = username
    master-password = password
    master-port = #
    master-connect-retry = #
    max_relay_log_size = #
6) Restart the slave (check the error log)
Diagram: Master Server (binary log) and Slave Server
Replication Functioning (4.0+)
1) Slave Relay (I/O) thread connects to the master to copy queries locally (relay log)
2) Master BinLog thread sends the statements recorded in its binary log to the slave that asks for them
3) Slave SQL thread reads the local relay log (rather than connecting to the master as in MySQL 3.23) and executes the updates it contains
Diagram: Master Server (write/read queries, binary log) and Slave Server (read queries only, relay log)
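The three steps above can be sketched as a toy in-memory model. This is only an illustration of the data flow, not MySQL code: the `Update`, `Master` and `Slave` types and the `ioStep`/`sqlStep` names are hypothetical, and a "set key to value" pair stands in for a logged SQL update.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One logged update; a key/value assignment stands in for an SQL statement.
struct Update {
    std::string key;
    int value;
};

// Hypothetical master: applies writes locally and appends them to its binary log.
struct Master {
    std::map<std::string, int> data;
    std::vector<Update> binlog;

    void write(const std::string& key, int value) {
        data[key] = value;
        binlog.push_back({key, value});
    }
};

// Hypothetical slave: copies binlog events into a relay log, then replays them.
struct Slave {
    std::map<std::string, int> data;
    std::vector<Update> relayLog;
    std::size_t fetched = 0;   // binlog position already copied
    std::size_t applied = 0;   // relay-log position already executed

    // I/O thread step: copy the events this slave has not fetched yet.
    void ioStep(const Master& m) {
        for (; fetched < m.binlog.size(); ++fetched)
            relayLog.push_back(m.binlog[fetched]);
    }

    // SQL thread step: execute the pending relay-log events locally.
    void sqlStep() {
        for (; applied < relayLog.size(); ++applied)
            data[relayLog[applied].key] = relayLog[applied].value;
    }
};
```

Because `write` returns before any slave has run `ioStep` and `sqlStep`, a slave's data can lag behind the master's, which is the sense in which replication is asynchronous.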
Replication Management
Useful Tools
• mysqlbinlog: converts binary/relay log to normal SQL
• mysqlsnapshot: creates snapshot for setting up slaves (http://jeremy.zawodny.com/mysql)
MySQL commands
• START/STOP SLAVE to start/stop replication safely on a slave
• SHOW MASTER STATUS / SHOW SLAVE STATUS
• Force master to block until the slave catches up
Known Problems
• Binary (or relay) logs can use up all your disk space!
• Pay attention to MySQL version compatibility
ApMon • ApMon (Application Monitoring) is a set of flexible APIs that any application can use to send monitoring information to MonALISA services • The monitored data is sent as UDP datagrams, encoded in XDR (eXternal Data Representation), to one or more hosts running MonALISA services
ApMon for AliEn monitoring
• User applications can periodically report any type of information the user wants to collect, monitor or use in the MonALISA framework to trigger alarms or activate decision agents
• We want to incorporate our ApMon application into the AliEn process monitor running at WN level
• WN monitoring allows gathering performance information such as network traffic and job status one level deeper than at present (CE monitoring)
• ApMon tests successfully completed in both C++ and Java
• It is easy to use in complex data processing programs as well as from scripts or utility programs
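ApMon usage: C++ example

    #include <stdio.h>      // fprintf
    #include <stdlib.h>     // rand, srand, RAND_MAX
    #include <time.h>       // time
    #include <unistd.h>     // sleep
    #include <stdexcept>    // runtime_error
    #include "ApMon.h"

    using namespace std;

    int main(int argc, char **argv) {
        char *filename = "destinations.conf";   // destinations of the datagrams
        int nDatagrams = 20;
        double myval;
        int i;

        srand(time(NULL));
        try {
            ApMon apm(filename);
            for (i = 0; i < nDatagrams; i++) {
                // Random value in [0, 2] used as the monitored parameter
                myval = 2 * (double)rand() / RAND_MAX;
                try {
                    apm.sendParameter("ApMonCluster", NULL,
                                      "myMonitoredParameter", myval);
                } catch (runtime_error &e) {
                    fprintf(stderr, "Send operation failed: %s\n", e.what());
                }
                sleep(1);
            }
        } catch (runtime_error &e) {
            // ApMon initialization failed: report it instead of ignoring it
            fprintf(stderr, "ApMon initialization failed: %s\n", e.what());
            return 1;
        }
        return 0;
    }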
ApMon packets
• The routines provided by ApMon handle the encoding of the monitoring data in the XDR representation and the building and sending of the UDP datagrams
• XDR is used because it is cross-platform and works transparently on big-endian and little-endian machines: client and server can be implemented in any language
• XML might be the obvious alternative, but an XML package is many times larger than the same data in XDR, which gives the smallest footprint possible
• The addresses of the MonALISA services to which ApMon sends the monitoring data are specified in configuration files or in web pages
• MonALISA services listen on a specified port (8884 by default)
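To make the footprint argument concrete, here is a minimal sketch of XDR encoding for the basic types a monitoring sample needs: 32-bit integers and doubles go out most significant byte first, and strings are length-prefixed and zero-padded to a 4-byte boundary. The helper names and the field layout in `encodeSample` are assumptions made for illustration; this is not the actual ApMon packet format. It also assumes the host uses 64-bit IEEE 754 doubles.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// XDR int: 4 bytes, most significant byte first, whatever the host byte order.
void xdrPutInt(std::vector<uint8_t>& buf, int32_t v) {
    uint32_t u = static_cast<uint32_t>(v);
    for (int shift = 24; shift >= 0; shift -= 8)
        buf.push_back(static_cast<uint8_t>((u >> shift) & 0xFF));
}

// XDR string: 4-byte length, the bytes, then zero padding to a multiple of 4.
void xdrPutString(std::vector<uint8_t>& buf, const std::string& s) {
    xdrPutInt(buf, static_cast<int32_t>(s.size()));
    buf.insert(buf.end(), s.begin(), s.end());
    std::size_t pad = (4 - s.size() % 4) % 4;
    buf.insert(buf.end(), pad, 0);
}

// XDR double: the 8 bytes of the IEEE 754 value, most significant byte first.
void xdrPutDouble(std::vector<uint8_t>& buf, double d) {
    static_assert(sizeof(double) == 8, "assumes 64-bit IEEE 754 doubles");
    uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    for (int shift = 56; shift >= 0; shift -= 8)
        buf.push_back(static_cast<uint8_t>((bits >> shift) & 0xFF));
}

// Hypothetical sample layout (NOT the real ApMon wire format):
// cluster name, node name, parameter name, value.
std::vector<uint8_t> encodeSample(const std::string& cluster,
                                  const std::string& node,
                                  const std::string& param,
                                  double value) {
    std::vector<uint8_t> buf;
    xdrPutString(buf, cluster);
    xdrPutString(buf, node);
    xdrPutString(buf, param);
    xdrPutDouble(buf, value);
    return buf;
}
```

With this layout, a sample with three short names and one value fits in a few dozen bytes, since the only overhead is the length prefixes and padding rather than markup.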
Repository Web Services • Alternative to ApMon for Web repository purposes: MonALISA agents are not needed and data is stored directly into the DB repository • WSs provide interoperability between various software applications running on various platforms • By piggybacking on HTTP, WSs can work through many common firewall security measures
WS example

    import org.apache.axis.client.Call;
    import org.apache.axis.client.Service;
    import javax.xml.namespace.QName;

    public class repositoryWSClient {
        public static void main(String [] args) {
            ...
            String endpoint = "http://alimonitor.cern.ch:8080/axis2/services/MLWebService1";
            Service service = new Service();
            Call call = (Call) service.createCall();
            // Name the remote operation and point the call at the service endpoint
            call.setOperationName(new QName(endpoint, "directInsert"));
            call.setTargetEndpointAddress( new java.net.URL(endpoint) );
            // Send one typed sample to the DB repository
            retval = (Integer) call.invoke( new Object[] { hostName, clusterName, siteName,
                                            monitoredParameterName, value, timeStamp } );
            ...
        }
    }

• Trivial to send typed data to the DB repository in the format hostName/siteName/monitoredParameterName/value/timeStamp
MonALISA@SC2004
http://ultralight.caltech.edu/sc2004/BandwidthRecord
http://pr.caltech.edu/media/Press_Releases/PR12620.html
NEWS RELEASE
For Immediate Release
November 24, 2004
World Network Speed Record Quadrupled
For the second consecutive year, the "High Energy Physics" team of physicists … led by the California Institute of Technology … joined forces at the Supercomputing 2004 (SC04) … Their demonstration of "High Speed TeraByte Transfers for Physics" achieved a throughput of 101 gigabits per second (Gbps) to and from the show floor, which exceeds the previous year's mark of 23.2 Gbps, set by the same team, by a factor of more than four. …
… The team used the MonALISA (MONitoring Agents using a Large Integrated Services Architecture) system developed at Caltech to monitor and display the real-time data for all the network links used in the demonstration, as illustrated in the figure. MonALISA (monalisa.caltech.edu) is a highly scalable set of autonomous self-describing agent-based subsystems which are able to collaborate and cooperate in performing a wide range of monitoring tasks for networks and Grid systems, as well as the scientific applications themselves. Detailed results for the network traffic on all the links used are available at: boson.cacr.caltech.edu:8888.