ASAP RDF SGP RDF 1.2 and 1.3 Transfer of Information Simon.Whitworth@hp.com
RDF ASAP SGP - Introduction
• ASAP: Availability Stats and Performance
• Network, system and application monitoring through states and metrics
• User interface is through a Windows GUI, the NSK CI and EMS event generation
• Standard subsystem monitoring is included with the ASAP product (CPU, Disk, Expand, Spooler, TMF, etc.)
• Highly extensible through published APIs that let users write their own SGPs (Stats Gathering Processes)
• RDF monitoring is provided by an SGP (ASAPRDF) developed by RDF development and released with RDF/IMPX 1.3
• The ASAP 2.0 release includes the required support for the RDF SGP
ASAP RDF SGP Architecture
[Diagram. Labels shown: ASAP collection node (ASAPMON, ASAPCOL, stats database); ASAP Client; RDF primary node and RDF backup node, each with ASAPMON and ASAPRDF; RDF processes Monitor, Extractor, Receiver, Purger and Updater.]
Configuring ASAPRDF
• ASAPRDF is distributed with the rest of the RDF product
• No configuration changes are required to the RDF environment to enable ASAP support
• RDF monitoring is enabled through the ASAP CI using the SET command, e.g.
    SET RDF ON, OBJECT $SYSTEM.RDF.ASAPRDF, PARAMETERS "RATE 10"
  starts RDF monitoring with status being collected every 10 minutes
• ASAP SGPs are persistent programs that are automatically restarted by the ASAP monitor
• By default the SGP will autodiscover each RDF subsystem on the local node
• Autodiscovery can be overridden using the MONITOR command in the CI, e.g.
    MONITOR RDF RDF05->RDF07
Autodiscovery
• At startup the SGP attempts to autodiscover all the RDF environments on the local node
• It searches $SYSTEM for all CONFIG files with a file code of 721 (T5864 config files are ignored, based on the key length of the config file)
• Clean up any old environments before you start!
• The SGP keeps an internal table of all the environments it finds
• The control subvol, backup node and primary/backup flag are stored for each environment
• When RDF environments are added or removed, the SGP must be stopped and restarted so that it refreshes the table and monitors the new set
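The discovery steps above can be sketched as follows. This is a minimal illustration in Python, not the SGP's actual code: the input records, the `is_t5864` flag (standing in for the key-length check, whose value the slides do not give) and the `RdfEnvironment` and `discover` names are all assumptions made for the example.

```python
from dataclasses import dataclass

RDF_CONFIG_FILE_CODE = 721  # file code named on the slide


@dataclass
class RdfEnvironment:
    """One row of the (hypothetical) internal environment table."""
    control_subvol: str
    backup_node: str
    is_primary: bool


def discover(files):
    """Build the environment table from candidate file records.

    Each record is a hypothetical dict with the keys: subvol, name, code,
    is_t5864, backup_node and is_primary. The real SGP tells T5864 files
    apart by key length; a boolean flag stands in for that check here.
    """
    table = {}
    for f in files:
        # Only CONFIG files with the RDF file code are candidates.
        if f["name"] != "CONFIG" or f["code"] != RDF_CONFIG_FILE_CODE:
            continue
        # T5864 config files are ignored.
        if f["is_t5864"]:
            continue
        table[f["subvol"]] = RdfEnvironment(
            control_subvol=f["subvol"],
            backup_node=f["backup_node"],
            is_primary=f["is_primary"],
        )
    return table
```

Because the table is built once, a sketch like this also shows why the SGP has to be restarted to pick up added or removed environments: nothing refreshes the table after startup.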
Gathering and reporting RDF status
• The ASAP Monitor triggers the SGP to report status at the time interval specified in the SET RDF command
• The SGP works through the RDF subsystem table and sends a status message to each RDF process in the subsystem on that node
• The primary node reports the status of the Monitor and Extractor(s)
• The backup node reports the status of the Receiver(s), Image trails, Purger and Updaters
• The status is the same as the information obtained through an RDFCOM STATUS RDF command (the same messages are used)
• ASAP allows each piece of information to be 'ranked' against user-defined 'objectives' before it is passed to the collector (out of scope of this TofI)
• Rankable values are error, RTD time, primary and backup CPUs, and priority
• Status information for all the subsystems on the node is passed back to the ASAP collector
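One gathering cycle, as described above, might look like the sketch below. It is an illustration only: `REPORTED_BY_ROLE`, `gather` and the `query_status` callable are hypothetical names, and the real SGP obtains status by messaging the RDF processes rather than by calling a Python function.

```python
# What each side of an RDF environment reports, per the slide.
REPORTED_BY_ROLE = {
    "primary": ("Monitor", "Extractor"),
    "backup": ("Receiver", "Imagetrail", "Purger", "Updater"),
}


def gather(environments, query_status):
    """Collect one status report per environment and reported item.

    environments: iterable of (environment_name, role) pairs, where role is
                  "primary" or "backup" for the local node.
    query_status: hypothetical callable (environment_name, item) -> status
                  string, standing in for the status message exchange.
    """
    reports = []
    for env_name, role in environments:
        for item in REPORTED_BY_ROLE[role]:
            reports.append((env_name, item, query_status(env_name, item)))
    return reports
```

The returned list corresponds to the batch of status information that is passed back to the ASAP collector at the end of the cycle.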
ASAP CI RDF Command

RDF [/out <file>/] [\*     ] [[.] *       ] [, <options>]
                   [\<node>] [[.] <domain>]

where options is one or more of:
  CPU      Displays RDF domain/metric values in the specified CPU.
  DETAIL   Displays RDF Metric values based on FORMAT command settings.
  SAMPLES  Number of samples to display.
  STATE    Displays RDF Metric values and their objective states.
  TIME     Show stats starting at a time other than the current time.
  VOLUME   Displays the associated disk volumes.

+rdf \rdf06,detail
\RDF06 Domain\Name\Hierarchy     Status           Date  Time  Error RTDTime   PCpu BCpu Pri
-------------------------------- ---------------- ----- ----- ----- --------- ---- ---- ---
Rdf05->Rdf06\Imagetrail\$Data12  Running           9/27  9:28     0   0:00:00    0    0   0
Rdf05->Rdf06\Imagetrail\$Data13  Running           9/27  9:28     0   0:00:00    0    0   0
Rdf05->Rdf06\Imagetrail\$Data14  Running           9/27  9:28     0   0:00:00    0    0   0
Rdf05->Rdf06\Purger\$R5pg        Running           9/27  9:28     0   0:00:00    1    2 180
Rdf05->Rdf06\Receiver\$R5r0      Running           9/27  9:28     0   0:00:00    1    2 180
Rdf05->Rdf06\Receiver\$R5r1      Running           9/27  9:28     0   0:00:00    1    3 180
Rdf05->Rdf06\Receiver\$R5r2      Running           9/27  9:28     0   0:00:00    3    2 180
Rdf05->Rdf06\Updater\$R5u0       Updt off          9/27  9:28     0   0:00:00    0    0   0
Rdf05->Rdf06\Updater\$R5u1       Updt off          9/27  9:28     0   0:00:00    0    0   0
Rdf05->Rdf06\Updater\$R5u2       Updt off          9/27  9:28     0   0:00:00    0    0   0
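If the DETAIL output is captured to a file (for example with /out), each row can be split into fields with a small parser. The sketch below is an assumption-laden illustration, not an ASAP tool: it assumes the column layout shown in the sample above, that the hierarchy name contains no spaces, and that only the Status column may contain spaces (as in "Updt off"). The names `RdfDetailRow` and `parse_detail_row` are made up for the example.

```python
from typing import NamedTuple


class RdfDetailRow(NamedTuple):
    hierarchy: str   # e.g. Rdf05->Rdf06\Receiver\$R5r1
    status: str      # may contain a space, e.g. "Updt off"
    date: str
    time: str
    error: int
    rtd_time: str
    pcpu: int
    bcpu: int
    pri: int


def parse_detail_row(line):
    """Split one data row of the sample DETAIL display into typed fields."""
    tokens = line.split()
    if len(tokens) < 9:
        raise ValueError(f"not a detail row: {line!r}")
    hierarchy = tokens[0]
    # The last seven tokens are fixed columns; whatever sits between the
    # hierarchy name and those columns is the status text.
    date, time, error, rtd_time, pcpu, bcpu, pri = tokens[-7:]
    status = " ".join(tokens[1:-7])
    return RdfDetailRow(hierarchy, status, date, time, int(error),
                        rtd_time, int(pcpu), int(bcpu), int(pri))
```

Parsing from the right keeps two-word states intact. Splitting the hierarchy on the backslash then yields the domain, the object type and the object name, e.g. `row.hierarchy.split("\\")`.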
States reported
• The following RDF process states can be reported by the SGP:
    Running
    Stopped
    Aborted
    Updt Off
    TKOver active
    TKOver complete
    TKOver part comp
Program structure
• Subsystem-specific code is embedded within a standard SGP shell
• The shell code comes from the ASAP product, T0402
• When the timer pops, the ASAP monitor sends a message to the SGP; the shell then invokes the subsystem-specific procedure process^stat^sum^request to collect and report the status information
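The shell-plus-subsystem-procedure split can be pictured as a simple callback arrangement. The Python sketch below only illustrates that structure: `SgpShell`, `on_monitor_message` and the message contents are hypothetical, and `process_stat_sum_request` is merely a Python-style spelling of the procedure name on the slide, not the real T0402 interface.

```python
class SgpShell:
    """Generic shell: receives monitor messages, delegates the real work."""

    def __init__(self, process_stat_sum_request):
        # The subsystem-specific procedure is plugged in at construction.
        self._process_stat_sum_request = process_stat_sum_request

    def on_monitor_message(self, message):
        """Called when the monitor's timer pops and a request arrives."""
        return self._process_stat_sum_request(message)


def rdf_process_stat_sum_request(message):
    """Stand-in for the RDF-specific code that collects and reports status."""
    return {"subsystem": "RDF", "request": message, "status": "collected"}


shell = SgpShell(rdf_process_stat_sum_request)
reply = shell.on_monitor_message("stats request")
```

The point of the design is that the shell stays the same for every SGP, so only the plugged-in procedure differs between subsystems.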
Troubleshooting
• Check that the correct versions of ASAP and RDF are in use (RDF IMP V1.3 works only with ASAP V2.0; an ASAP 2.1 compatible IPM is to follow)
• Check that the RDF SGP is actually running on both the RDF primary and backup nodes
• If no stats are being reported, check RDF status with RDFCOM from the RDF primary system using STATUS RDF
• If autodiscovery is being used, find all the code 721 RDF CONFIG files:
    FILEINFO $SYSTEM.*.CONFIG
  This will show the environments that should be reporting stats
• If using MONITOR RDF for specific RDF environments, check what is being monitored with MONITOR RDF, LIST
• Manually stopping the $<AsapIA>N process causes it to be restarted and to repeat autodiscovery