100 likes | 285 Views
Continuous DB integration testing with RAT. „RATCOIN”. Goals. Ensure database stability in agile environment Safeguard against unmanaged changes Perform checks on regular basis Detect new ORA errors or performance degradation and alert if issues found
E N D
Goals • Ensure database stability in agile environment • Safeguard against unmanaged changes • Perform checks on regular basis • Detect new ORA errors or performance degradation and alert if issues found • Allow ad-hoc checks for scheduled changes • Assumptions • 12.1+ database versions supported only • Dedicated test environment (at least 2 node RAC) • Use Workload Capture with STS in production • Use SPA and Workload Replay on test cluster
Overview Production environment Test environment Workload Capture + STS Automated Recovery Environment Synchronization Workload Replay + SPA Ratcoin Repositoryand Engine Compare and Alert!
Workload Capture + STS Automated Recovery On test cluster Recover database with RMAN to SCN of the capture Reuse scripts developed by B&R team Use flashback or NAS snapshots to allow multiple replays Mount the volume with captured workload • Capture 8-24h workload in production • Create STS during capture (all captured SQLs) • Note SCN of the capture • Use dedicated NAS volume for workload files • Export AWR to the Ratcoinrepository
Test Env Synchronization Workload Replay + SPA Run replay, export AWR to Ratcoinrepository Run SPA on all statements, export results to the repository One baseline Replay + SPA just after capture Regular Replays + SPA on weekly basis(?), just after env. re-synch • OS packages version deployed with Puppet (same version as PROD) • Fresh (adjusted) spfile generated from PROD • Database binaries for CPU patching • Storage changes / patches – automatic • Schema changes: e.g. indexes, mviews– write custom sync code
Compare and Alert! • Generate workload replay and SPA comparison reports • Between the current replay and the baseline • Between the current and previous replay • Store them for review (if needed) • Look directly into AWR and SPA views to detect errors and performance issues • Rule based discovery of all new ORA errors and performance degradation • Send e-mail alerts to concerned DBAs
Ratcoin Repository • Stores Ratcoin metadata and run results • List of databases, environment configurations • Capture and Replay schedule and history • AWR data for captures and replays • STS and SPA results • Custom environment changes for ad-hoc runs • Findings and alerts (optional)
Ratcoin Engine • Ratcoin engine controls • Regular workload captures and replays, SPA executions • Ad-hoc replays for custom checks • Test environment synchronization before replays • Collection of AWR and SPA results after replays • ExecuteSPA comparisons • current to baseline and current to previous • Problem detection and alerts (optional)
Usage of Enterprise Manager • View results • Replay comparisonreports • SPA comparison reports • EMCLI • Can we use EMCLI to configure and run workload captures and replays? • Can we use EMCLI to run SPA tasks? • Alerting • Use custom metrics to alert from replay AWR data and SPA results
To Do List • Test if STS capture works with workload capture on RAC clusters – in 12.1 version only • Get and install 2 machines for the test RATCOIN1 cluster • Order dedicated NAS volume for workload captures • Test reuse of B&R scripts for DB recovery on the new cluster (to a specific SCN) • Check if we can ensure consistent versions of Puppet environment • Test replay and SPA reporting with EM • Verify if EMCLI can be used for configuring and running workload captures, replays and SPA tasks • Work on code to synchronize spfile, DB binaries and schema changes • Design and implement RATCOIN repository • Design EM custom metrics and configure alerting from AWR/SPA • Design and implement RATCOIN engine