
Building your own DBaaS



Presentation Transcript


  1. Ruben Gaspar – CERN (speaker), Dawid Wojcik – CERN, Ignacio Coterillo – CERN, Daniel Gomez – CERN. Building your own DBaaS. UKOUG Database Server SIG Meeting, 29th January 2013

  2. Outline • CERN and Databases • Architecture • Hardware overview • Empowering users • Functionality overview • Conclusions

  3. CERN • European Organization for Nuclear Research, founded in 1954 • 20 Member States, 7 Observer States + UNESCO and the EU • 60 non-member states collaborate with CERN • 2400 staff members work at CERN, plus some 10 000 researchers from institutes world-wide

  4. LHC and Experiments • Large Hadron Collider (LHC) – particle accelerator used to collide beams at very high energy • 27 km long circular tunnel • Located ~100 m underground • Protons currently travel at 99.9999972% of the speed of light • Collisions are analysed using special detectors and software in the experiments dedicated to the LHC • New particle discovered, consistent with the Standard Model Higgs boson!

  5. WLCG • The world’s largest computing grid • More than 25 petabytes of data stored and analysed every year • Over 68 000 physical CPUs • Over 305 000 logical CPUs • 157 computer centres in 36 countries • More than 8000 physicists with real-time access to LHC data

  6. Oracle at CERN • Relational DBs play a key role in the LHC production chains • Accelerator logging and monitoring systems • Online acquisition; offline data (re)processing, data distribution and analysis • Grid infrastructure and operation services • Monitoring, dashboards, etc. • Data management services • File catalogues, file transfers, etc. • Metadata and transaction processing for the tape storage system • Database on Demand service: Oracle & MySQL instances (more later)

  7. Database as a Service – Rationale • Empowering the CERN IT and research community • Users can request and manage different database instances (currently MySQL and Oracle single instance) • Aimed at medium-sized, long-term projects • Users are provided with a self-service portal • Ease of administration • Integrated backup & recovery • Monitoring solution • One-click patching

  8. Database as a Service principles • Scalable • Provide flexible and cost-effective Database as a Service • Owners are grouped by a mailing group (access authorization) • Owners receive full DBA privileges* on their instances • Owners are responsible for ensuring that their systems, and the use of their systems, are fully compliant with the Rules of CERN Computing Facilities (including security) • The “Database on Demand” (DBoD) service provides OS administration and support for the self-service portal functionality • The DBoD service does not provide DBA or application support

  9. DBaaS providers add-on

  10. Private cloud model • Reuse existing virtualization infrastructure and know-how – cost efficient • Improve operations • Standardization • Consolidation – migrate existing DBs to the DBoD service • Reuse tools and management frameworks • HA • via virtualization (live migration) • Oracle Clusterware • Master/slave replication (MySQL only)

  11. Architecture • Virtualization • Oracle VM (2.2, 3.0.2 and 3.1.1) on Linux x86_64 • Typical VM size: 2 cores, 8 or 16 GB RAM • Physical server: usually runs several instances • Storage • NFS over 10 Gigabit Ethernet • Configuration management • Open-source Quattor Toolkit • CERN is currently adopting Puppet • Management framework • Syscontrol – developed at CERN • Custom development, mainly Perl + Bash, about 10k lines of code • Web self-service portal

  12. [Diagram: architecture overview + physical servers]

  13. Hardware – servers • Dell PowerEdge M610 blades: 2x quad-core Intel Xeon @ 2.53 GHz, 48 GB RAM • Transtec database server (next release): 2x six-core Intel Xeon @ 2.26 GHz, 128 GB RAM • Connected to the NetApp cluster over 10GbE

  14. Hardware – storage • Moving from Data ONTAP 7-Mode to Cluster-Mode • QPI = Intel QuickPath Interconnect (measured ~3x the memory bandwidth of the front-side bus)

  15. Hardware – storage II • [Diagram: DBoD instances connect over 10GbE to the NetApp cluster (dual SAS loops, 2x 3GbE for diagnostics); datafiles, binary/redo logs and controlfiles live on separate volumes]

  16. Shared instances • DBoD supports more than one MySQL instance on one VM • Sharing CPU • Sharing MySQL/Oracle binaries • Separate buffer pools (pre-allocated memory) • Separate NFS volumes • Independent backup and restore (see the sketch below)
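A minimal sketch of how two shared instances might coexist on one VM using mysqld_multi – the group names, ports, paths and volume layout are illustrative assumptions, not the actual DBoD configuration:

    # /etc/my.cnf – hypothetical layout for two shared MySQL instances
    [mysqld_multi]
    mysqld     = /usr/bin/mysqld_safe
    mysqladmin = /usr/bin/mysqladmin

    # Instance 1: own port, socket, NFS volume and pre-allocated buffer pool
    [mysqld1]
    port                    = 5500
    socket                  = /var/lib/mysql/inst1/mysql.sock
    datadir                 = /nfs/vol_inst1/data    # separate NFS volume
    innodb_buffer_pool_size = 5G

    # Instance 2: shares the binaries, everything else is independent
    [mysqld2]
    port                    = 5501
    socket                  = /var/lib/mysql/inst2/mysql.sock
    datadir                 = /nfs/vol_inst2/data
    innodb_buffer_pool_size = 5G

Each instance can then be started and stopped on its own (mysqld_multi start 1, mysqld_multi stop 2), which is what makes per-instance backup and restore possible.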

  17. MySQL • Currently running MySQL Community Edition 5.5 • InnoDB as the preferred storage engine – backup & recovery • Binary logs enabled • ACID (atomicity, consistency, isolation, durability) – innodb_flush_log_at_trx_commit=1, sync_binlog and innodb_flush_method=O_DIRECT • Using innodb_buffer_pool_size of ~5GB • Using thread cache (big gain for some clients) • Using query cache (query_cache_size=768M) • Performance Schema is enabled by default • These settings are gathered in the sketch below
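Collected into my.cnf, those settings would look roughly as follows; sync_binlog is shown as 1 (the usual full-durability value) and the thread cache size is illustrative, since the slide states neither:

    # my.cnf – sketch of the settings named above (values per the slides
    # where given; sync_binlog=1 and thread_cache_size are assumptions)
    [mysqld]
    log-bin                        = mysql-bin   # binary logs enabled
    innodb_flush_log_at_trx_commit = 1           # flush redo log at every commit
    sync_binlog                    = 1           # assumed; slide gives no value
    innodb_flush_method            = O_DIRECT    # bypass the OS cache for datafiles
    innodb_buffer_pool_size        = 5G          # ~5GB per the slide
    query_cache_size               = 768M
    thread_cache_size              = 64          # illustrative value
    performance_schema             = ON          # enabled by default on DBoD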

  18. Oracle • 11gR2 • Archivelog mode • Scheduler job for automatic archivelog clean-up • COST (Class of Secure Transports) to restrict instance registration [MOS note ID 1453883.1] • SQLNET.CRYPTO_CHECKSUM_SERVER=required & SQLNET.ENCRYPTION_SERVER=required • filesystemio_options='SETALL' (see the sketch below)
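As a sketch, the two network settings above live in sqlnet.ora on the server, while filesystemio_options is an instance parameter (the SCOPE shown is an assumption about how it would typically be set):

    # sqlnet.ora – enforce integrity checking and encryption server-side
    SQLNET.CRYPTO_CHECKSUM_SERVER = required
    SQLNET.ENCRYPTION_SERVER      = required

    -- instance parameter: SETALL enables both direct and asynchronous I/O
    ALTER SYSTEM SET filesystemio_options = 'SETALL' SCOPE = SPFILE;

filesystemio_options is a static parameter, so the SPFILE change takes effect at the next instance restart.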

  19. Empowering users • Self-service portal • Instance administration (status, start, stop) • Manage configuration and logs • MySQL: download/upload my.cnf, download slow query log • Oracle: download trace files • Set up backups (automatic or manual) or request a restore • One-button instance upgrade (one-button system upgrade coming) • Access to monitoring information • Behind the scenes: • J2EE web application running on central web servers • ZK Framework 5.0.11 (Ajax-based) • SSO (Single Sign-On) + SSL for authentication/authorisation • JDBC + Apache DBCP connection pooling via JNDI • JAX-WS 2.2 for SOAP web services • Web app (Java, ZUML, JavaScript, CSS, etc.): ~12 500 lines of code

  20. Backup solution • MySQL instances running in binary log mode (InnoDB recommended) • Oracle instances running in archivelog mode • Backups based on storage snapshots • Full online DB backups done in just a few seconds • Manual and/or automatic (scheduled) • Small storage overhead (depends on instance activity) • Point-in-time recovery – easy with snapshots and binary/archive logs • Snapshots can be configured to be sent to tape (DR)

  21. Backup management • Backup configuration panel • Backup procedure (sketched below): 1) mysql> FLUSH TABLES WITH READ LOCK; mysql> FLUSH LOGS; (or Oracle> alter database begin backup;) 2) take a new storage snapshot 3) mysql> UNLOCK TABLES; (or Oracle> alter database end backup;) – the instance resumes • … some time later the cycle repeats with a new snapshot
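A minimal shell sketch of the MySQL side of this procedure, assuming a hypothetical snap_create helper that triggers the storage snapshot (the real service drives this through its Syscontrol framework):

    #!/bin/bash
    # Sketch: consistent storage-snapshot backup of one MySQL instance.
    # The read lock must be held by the same session while the snapshot
    # runs, so everything happens inside a single mysql client session;
    # '\!' runs a shell command from within that session.
    mysql -u root <<'EOF'
    FLUSH TABLES WITH READ LOCK;
    FLUSH LOGS;
    \! /usr/local/bin/snap_create dbod_inst1
    UNLOCK TABLES;
    EOF

Because the snapshot itself takes on the order of a second, the tables stay locked only very briefly (see the next slide).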

  22. [Chart: FLUSH TABLES WITH READ LOCK completes in ~1 second]

  23. Instance restore • Owners can request point-in-time recoveries • Full restore takes just a few seconds • Recovery time depends on the number of binary logs/redo logs to replay/apply • Warning: snapshots taken after the one used for recovery are lost • A sketch of the MySQL case follows
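An illustrative sketch of such a MySQL point-in-time recovery; the snap_restore helper, instance name, log file name and target time are all hypothetical, and in practice the portal drives the whole sequence:

    #!/bin/bash
    # Sketch: PITR = revert to a snapshot, then replay binary logs.
    /usr/local/bin/snap_restore dbod_inst1 snap_daily_0400   # revert datafiles
    mysqld_safe --defaults-file=/etc/my.cnf &                # start the instance
    # Replay binary logs written after the snapshot, up to the target time
    mysqlbinlog --stop-datetime="2013-01-29 10:00:00" \
        /nfs/vol_inst1/binlogs/mysql-bin.000042 | mysql -u root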

  24. Instance restore • [Timeline diagram: automatic and manual snapshots of the data files along the time axis; a point-in-time recovery reverts to a chosen snapshot and replays binary logs forward towards “now”]

  25. Framework monitoring • Management server • Queries its jobs table regularly (based on Oracle dbms_scheduler) • Informs admins in case of: • Pending jobs not executed • Timed-out jobs • Failed jobs • Lines of code: SQL + PL/SQL, ~1300 • A sketch of such a check follows
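A sketch of the kind of query this implies, against a hypothetical jobs table – the real schema, job states and thresholds are not shown in the slides:

    -- Flag jobs an admin should hear about (table and columns illustrative)
    SELECT job_id, job_type, state, submitted_at
      FROM dbod_jobs
     WHERE (state = 'PENDING' AND submitted_at < SYSDATE - 1/24)  -- waiting > 1h
        OR (state = 'RUNNING' AND started_at   < SYSDATE - 2/24)  -- timed out
        OR  state = 'FAILED';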

  26. Instance monitoring • Evaluated several monitoring products • OEM with the Pythian plug-in • MySQL Enterprise Monitor • Monitoring server runs RACMon (in-house development) • Implemented in Python (~13k lines) and PHP (~15k lines) • Availability and performance monitoring system for Oracle DBs, MySQL, NAS storage and the VM infrastructure • ~30 MySQL metrics stored in the monitoring DB, collected via mysql> show status (see the sketch below) • Selected AWR metrics stored for Oracle instances • Admins are notified via email (and SMS if needed) about • Availability problems • Performance issues (OS-level and DB checks)
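Metric collection for MySQL amounts to something like the following; the metric names here are a small illustrative subset and the monitoring account is hypothetical:

    #!/bin/bash
    # Sketch: sample a few of the ~30 'show status' counters as
    # timestamped lines, ready to be loaded into the monitoring DB
    mysql -u monitor -N -e "SHOW GLOBAL STATUS" |
      grep -E '^(Threads_connected|Questions|Slow_queries|Innodb_buffer_pool_reads)' |
      while read name value; do
          echo "$(date +%s) $name $value"
      done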

  27. Monitoring interface

  28. One-button upgrades • DBoD admins prepare the upgrade scripts • The complete upgrade process is scripted and tested • Upgrades across one or several minor versions are possible • Owners can decide to upgrade at their convenience – one-button upgrade (sketched below): • Instance is stopped • Binaries are upgraded • shared instances must be upgraded at the same time • Instance is restarted • All post-installation tasks executed
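Schematically, and assuming the tar-file binary layout mentioned on the CRS slide further down (paths, version and symlink scheme are illustrative, not the actual DBoD scripts):

    #!/bin/bash
    # Sketch: one-button upgrade = stop, swap binaries, restart, post-install
    BASEDIR=/usr/local/mysql
    NEW=mysql-5.5.29-linux2.6-x86_64

    mysqladmin -u root shutdown                 # 1. stop the instance
    tar xzf /dist/$NEW.tar.gz -C /usr/local     # 2. unpack the new binaries
    ln -sfn /usr/local/$NEW $BASEDIR            #    switch the basedir symlink
    mysqld_safe --basedir=$BASEDIR &            # 3. restart the instance
    mysql_upgrade -u root                       # 4. post-installation tasks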

  29. HA: Oracle VM • CERN has more than two years’ experience running Oracle VM in production • Easy scale-out of VM pools • 10 Gbit Ethernet – one network only (NAS over NFS) • Production on OVM 2.2, currently testing OVM 3.1.1 and 3.2.1 • Live migration for HW interventions and host OS upgrades

  30. HA: Oracle CRS (being tested) • Well-known clusterware in the IT-DB group • Requires modifying /etc/init.d/mysql • Special case: if mysqld is killed suddenly, the pid file is not removed • Stop/start operations should be done via crsctl • Start-up after a server boot is handled as well • Tar-file binaries allow several versions to be installed side by side • mysqld_safe, mysqlbinlog, mysqladmin, … invoked from the right distribution and the right environment: --basedir=$BASEDIR --bindir=$BINDIR --datadir=$DATADIR • Two basic configurations: • 2-node cluster → critical applications • >2-node cluster → optimised resource utilisation

  31. HA: Oracle CRS • The resource action script implements check/start/stop/clean • HOSTING_MEMBERS + SERVER_POOLS used to assign instances to nodes • A VIP per MySQL instance • Timeouts must leave enough time for crash recovery (see the sketch below)
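A sketch of registering one MySQL instance as a Clusterware resource along those lines; the resource name and attribute values are assumptions:

    # Register a MySQL instance with Oracle Clusterware (illustrative values)
    crsctl add resource dbod.inst1.mysql -type cluster_resource \
      -attr "ACTION_SCRIPT=/etc/init.d/mysql, CHECK_INTERVAL=30, \
             START_TIMEOUT=600, STOP_TIMEOUT=600, \
             HOSTING_MEMBERS=node1 node2, PLACEMENT=restricted"

    # From then on, stop/start always goes through the clusterware
    crsctl start resource dbod.inst1.mysql

The generous START_TIMEOUT is the “enough time for crash recovery” point above: after a crash, mysqld may have to roll the InnoDB logs forward before it accepts connections.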

  32. HA: Replication • MySQL master → slave replication • Monitoring • Based on a “ping” table, since show slave status is not reliable (see the sketch below) • Percona script pt-table-checksum (manual run) • “Automated” role change • my.cnf variables: • Disaster recovery + upgrades
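The “ping” table approach amounts to writing a heartbeat row on the master and reading it on the slave; a minimal sketch with an illustrative table name:

    -- On the master, a scheduled job refreshes a heartbeat row
    UPDATE dbod.ping SET ts = NOW() WHERE id = 1;

    -- On the slave, replication lag = staleness of the replicated heartbeat
    SELECT TIMESTAMPDIFF(SECOND, ts, NOW()) AS lag_seconds
      FROM dbod.ping WHERE id = 1;

This measures end-to-end replication delay directly, rather than trusting Seconds_Behind_Master from show slave status.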

  33. HA comparison • In all cases sysbench was generating some OLTP load

  34. Important clients • Hosting ~45 instances at CERN for IT and the experiments • Drupal content management system • BOINC – LHC@home • CERN Document Server • Audio/video conferencing and webcast service • HammerCloud for the experiments • Piwik (open-source web analytics software) • OpenStack Nova • Trac for Subversion • PVSS, FishEye, …

  35. Summary • Many lessons learned during the design and implementation of the DBoD service • Building Database as a Service helped the CERN DB group to • Gain experience with MySQL • Provide a solution for Oracle databases with special needs, e.g. Unicode character sets • Improve tools and operations • Standardize on tools and frameworks • Consolidate

  36. Acknowledgements • DBoD team: Ignacio Coterillo, Daniel Gomez, Dawid Wojcik • IT-DB group colleagues, especially our virtualization experts!
