
Failover Procedures for operational tools COD-15 PARALLEL SECTIONS Lyon, 7 February 2008

Alessandro Cavalli, Alfredo Pagano (INFN/CNAF, Bologna, Italy). Failover Procedures for operational tools COD-15 PARALLEL SECTIONS Lyon, 7 February 2008. GOCDB. People present: Alessandro, Cristina, Cyril, Gilles, Guillaume, Kai, Osman


Presentation Transcript


  1. Alessandro Cavalli, Alfredo Pagano (INFN/CNAF, Bologna, Italy). Failover Procedures for operational tools. COD-15 PARALLEL SECTIONS, Lyon, 7 February 2008

  2. GOCDB (Lyon, 7 Feb 2008)
     • People present: Alessandro, Cristina, Cyril, Gilles, Guillaume, Kai, Osman
     • The two main points to work on, in parallel, in the near future:
       • A quick solution to get some level of failover ASAP
       • The final failover solution
     • First quick solution:
       • Alessandro's quick-and-dirty idea is to automate what was done manually for the CIC portal in the past:
         • Close write access to the DB
         • Issue the “export” command to create a full dump (schema, tables, indexes, etc.)
         • Reopen full access to the DB
         • Compress and transfer the dump, verifying it with a checksum
         • Apply the dump on the backup DB side (CNAF)
       • The idea is to pick the least disruptive moment of the day (e.g. during working hours in America and the Pacific) and to automate all this as a cron job on both sides (RAL + CNAF). On an Oracle RAC cluster, it may have to run on only one of the cluster nodes
       • Given the small DB size and the experience with the CIC portal, the main DB should be unavailable for only a few minutes (well under 5)
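The quick-solution steps above can be sketched as a small Python driver. This is only an illustration under stated assumptions, not the procedure actually used at RAL or CNAF: `close_writes`, `export_dump` and `open_writes` are hypothetical callables standing in for the real database commands, SHA-256 is an arbitrary choice for the checksum, and a local file copy stands in for the real transfer to the backup site.

```python
import gzip
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compress_dump(dump: Path) -> Path:
    """Gzip the dump next to the original file and return the archive path."""
    target = dump.with_name(dump.name + ".gz")
    with open(dump, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


def ship_dump(dump: Path, dest_dir: Path, transfer=shutil.copy2) -> Path:
    """Compress, transfer and verify a dump; raise if the checksums differ.

    `transfer(src, dst)` is an assumption: any callable that copies the
    archive and returns the destination path (here a plain local copy).
    """
    archive = compress_dump(dump)
    expected = sha256_of(archive)
    received = Path(transfer(archive, dest_dir / archive.name))
    if sha256_of(received) != expected:
        raise IOError(f"checksum mismatch for {received}")
    return received


def run_snapshot(close_writes, export_dump, open_writes, dump, dest_dir):
    """Run the steps in order; write access is reopened even if export fails."""
    close_writes()              # hypothetical: close write access on the DB
    try:
        export_dump(dump)       # hypothetical: create the full dump file
    finally:
        open_writes()           # hypothetical: reopen full access
    return ship_dump(dump, dest_dir)
```

Reopening write access in a `finally` clause keeps the main DB from staying locked if the export fails, and compressing only after access is reopened keeps the write-closed window as short as possible. Applying the dump at the backup side is left out of the sketch.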

  3. GOCDB (2) (Lyon, 7 Feb 2008)
     • For the other, more definitive failover approach, it was agreed that:
       • We want to set it up between RAL and CNAF
       • It should refresh data more frequently, possibly with real-time synchronization
       • It would be a read-only DB anyway
     • We could give Streams a try:
       • Guillaume is confident that it is achievable, even as a first step (avoiding the dirty solution)
       • Alessandro is more worried, because of feedback from the CNAF DBAs about how tricky Streams is, and because many other DBA teams are reluctant to support it
     • Connection method:
       • From the operations tools (READ ONLY): Oracle-style connection properties can be used, with the “FAILOVER = ON” attribute. The RO replica can also be included, as long as we can guarantee sufficiently fresh data on the replica DB
       • From the web interface (READ WRITE): we must take special care to detect when the web interface is using the RO replica because the main DB is down; it must then notify the user that updating data is NOT possible
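As a rough illustration of the connection method, the sketch below builds an Oracle Net connect descriptor that lists the main DB first and the read-only replica second, with `FAILOVER=ON`, and adds a helper the web interface could use to decide whether updates must be disabled. The host names, port and service name are placeholders, not the real RAL or CNAF endpoints. The sketch also assumes that both databases expose the same service name and that the application can find out which host it ended up connected to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbEndpoint:
    """One database listener; read_only marks the failover replica."""
    host: str
    port: int = 1521
    read_only: bool = False


def connect_descriptor(endpoints, service_name):
    """Build a connect descriptor that tries the endpoints in the given order."""
    addresses = "".join(
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={e.host})(PORT={e.port}))"
        for e in endpoints
    )
    return (
        "(DESCRIPTION="
        f"(ADDRESS_LIST=(FAILOVER=ON)(LOAD_BALANCE=OFF){addresses})"
        f"(CONNECT_DATA=(SERVICE_NAME={service_name})))"
    )


def updates_allowed(endpoints, connected_host):
    """Return False when the session landed on a read-only endpoint."""
    for endpoint in endpoints:
        if endpoint.host == connected_host:
            return not endpoint.read_only
    raise ValueError(f"unknown database host: {connected_host}")


# Placeholder endpoints: the main DB is listed first, the RO replica second.
ENDPOINTS = [
    DbEndpoint("gocdb-main.example.org"),
    DbEndpoint("gocdb-replica.example.org", read_only=True),
]
```

With `LOAD_BALANCE=OFF` the addresses are tried in listed order, so the replica is used only when the main DB cannot be reached. When `updates_allowed(...)` returns `False`, the web interface would show its "updates not possible" notice instead of the edit forms.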

  4. GOCDB (3) (Lyon, 7 Feb 2008)
     • Cristina will report everything we discussed to Keir (GOCDB Oracle admin), so that he can form his own opinion
     • Soon, right after the COD, we will have a phone meeting with Keir and Alfredo to go into more detail and set up the collaboration to produce some results for the first-level solution
     • Fruitful talk with Osman: he is getting data from GOCDB for the CIC portal with materialized views. This must be investigated ASAP, because with the “REFRESH COMPLETE” clause we might quite easily get a periodic snapshot at CNAF without ANY added feature or configuration at RAL
     • As soon as possible, Alessandro and Alfredo will study the materialized view to be set up from CNAF against RAL
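To make the materialized-view idea concrete, here is a minimal sketch that generates the kind of DDL one might run on the CNAF side. The table name, database link name and refresh interval are hypothetical, and the statement has not been checked against the actual GOCDB schema. It only shows the shape of a `REFRESH COMPLETE` snapshot pulled over a database link.

```python
import re

# Simple unquoted identifiers only; anything else is rejected outright.
_IDENT = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*$")


def snapshot_ddl(table, db_link, refresh_hours=24):
    """Return DDL for a periodically, fully refreshed copy of a remote table.

    `table` and `db_link` are hypothetical names supplied by the caller.
    """
    for name in (table, db_link):
        if not _IDENT.match(name):
            raise ValueError(f"not a simple identifier: {name!r}")
    if refresh_hours <= 0:
        raise ValueError("refresh_hours must be positive")
    return (
        f"CREATE MATERIALIZED VIEW {table}_MV\n"
        "  REFRESH COMPLETE\n"
        f"  START WITH SYSDATE NEXT SYSDATE + {refresh_hours}/24\n"
        f"  AS SELECT * FROM {table}@{db_link}"
    )
```

For example, `snapshot_ddl("SITES", "RAL_LINK", refresh_hours=6)` yields a statement that rebuilds a local copy of a remote `SITES` table every six hours. A complete refresh re-reads the whole table each time, which suits a small DB and, as the slide notes, needs nothing added at the source side.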

  5. CIC portal (Lyon, 7 Feb 2008)
     • People: Alessandro and Gilles
     • We only had time to focus on:
       • Which components of the CIC portal would be seriously affected by switching to a READ ONLY DB failover mode?
       • Is it worth the effort? (Do we get a usable portal, compared with the effort needed to get there?)
     • Affected portal elements:
       • COD dashboard: should work (except NOTEPAD & HANDOVER)
       • VO ID cards: READ ONLY
       • SITE/ROC reports: NOT working
       • BROADCAST TOOL: works without archiving (can we accept that?)
       • S.D. notification: NOT working
     • Conclusions:
       • While we make progress with GOCDB over the next few weeks, we could try to apply a similar solution to the CIC portal
       • On the “is it worth the effort” question: does anyone have comments, as a portal developer or as a portal user?
     • Real-time discussion with Gilles:
       • He can produce some interesting statistics on the percentage of READ and WRITE requests
       • The impression is that it is worth the effort: lower priority than GOCDB, but let’s give it a try

  6. Actions (Lyon, 7 Feb 2008)
     • GOCDB: some results have to be achieved in the next 2 weeks (by Feb 22)
     • GOCDB: first level of failover, either manual or automatic depending on the progress made, ready at CNAF (end of March)
     • CIC portal: TBD, depending also on GOCDB progress
     • Contact Emir again: some test results for the next f2f COD
     • Update the wiki
