KC CMG 2004 MXG Update

1. IFAs a/k/a zAAPs - what they are, where's CPU time captured?
2. MXG 22.08 required for SAS Version 9.1 - other SAS issues.
3. ASMTAPES/MXGTMNT - Allocation Recovery Monitor
4. BUILDPDB MVS runtime elongation if ANY output to tape
5. MXG 22.04 required for CICS/TS 2.3
6. MXG 22.06 required for DB2 Parallel
7. If you are using IRD, you must install MXG 22.02 or later.
8. Still OS/390? MXG 22.07 required for z890, 21.04 for z990.
9. MXG/SAS V9.1 benchmarks Windows XP, Linux RH8, z/OS 1.4.1
10. Running out of memory "Below the Bar"

H. W. "Barry" Merrill, PhD
Merrill Consultants
Dallas, TEXAS
www.mxg.com
Wednesday, October 6, 2004
Interactive session by phone
1. What are IFAs, a/k/a zAAPs?

For z/OS on z890s and z990s, IFA was the internal name of zAAPs:
- zAAPs: zSeries Application Assist Processors
- all of the RMF/SMF field names use IFA
- zAAP is the marketing name for these engines.

IFAs are engines that
- execute only Java code
- are not included in MSU capacity

With IFAs you
- can add new Java applications, or offload current Java to IFAs
- can increase hardware capacity without increasing software costs
- can force all Java workload to execute only on IFA engines, maximizing the offload of work from your CP engines, or
- can let Java work execute on CPs when they are not busy:
  - can let it run at the priority of the Service Class, or
  - can run it even lower than discretionary.
- keywords IFACROSSOVER and IFAHONORPRIORITY let you choose
Where does IFA CPU time show up?

What's really important about these details on CPU measurement? That the data exists at all: IBM has been extremely pro-active and, at the NYC SHARE meeting two weeks ago, provided much of this data well before delivery of z/OS 1.6, which is the prerequisite for using the IFA engines. The information below is taken from those SHARE presentations plus follow-up from IBMers, with additional valuable input from Cheryl Watson.

Service Units: In all records that contain Service Units, the TCB Service Units, which formerly contained only the service units due to TCB time on CP engines, now include the SUs from both IFAs and CPs. If you use Service Units for billing, your billing units may increase when Java work runs on IFAs. Why? Because WLM manages based on service units: an all-Java workload could have zero TCB service units while using 100% of the IFAs, and WLM wouldn't know if IFA service weren't included in TCB.
SMF 30.
- Existing CPUTCBTM TCB CPU time does NOT include IFA CPU time.
- Existing billing based on TCB CPU time will not change (but CPU time on IFAs will not be recovered until you change your billing code).
- New IFACPUTM is the CPU time on IFAs:
  - you can charge IFA CPU time at a different rate than CP CPU time, or
  - you could add IFACPUTM to the total CP CPU time in CPUTM and charge both CPU times at the same rate.
- On z990s the CPs and IFAs run at the same speed.
- On z890s the IFAs run at full speed (365 MIPS) while the CPs run at 28 different "speeds", as low as 27 MIPS: IFA CPU time is normalized back to the CP speed of the z890 model.
- MXG will NOT add IFACPUTM into the existing CPUTM variable.
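The two billing choices for IFA time can be sketched in a few lines of Python. This is only an illustration: CPUTM and IFACPUTM are the MXG variables named on the slide, while the function name and the rates are hypothetical.

```python
def charge(cputm_sec, ifacputm_sec, cp_rate, ifa_rate=None):
    """Return the charge for one job or step.

    cputm_sec    -- CP CPU seconds (MXG CPUTM, which excludes IFA time)
    ifacputm_sec -- IFA CPU seconds (MXG IFACPUTM)
    cp_rate      -- hypothetical charge per CP CPU second
    ifa_rate     -- hypothetical charge per IFA CPU second;
                    None means "bill IFA time at the CP rate"
    """
    if ifa_rate is None:
        ifa_rate = cp_rate
    return cputm_sec * cp_rate + ifacputm_sec * ifa_rate


# Same rate for both kinds of CPU time.
same_rate = charge(100.0, 50.0, cp_rate=2.0)

# Cheaper rate for IFA CPU time.
split_rate = charge(100.0, 50.0, cp_rate=2.0, ifa_rate=0.5)
```

Until billing code does one or the other, the IFA seconds are simply not charged, which is the "not recovered" case on the slide.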
New variable IFECPUTM ('eligible') contains the CP CPU time that was executed on a CP but was eligible to run on an IFA. (IFECPUTM is included in CPUTCBTM.)
- With z/OS 1.6 and Java SDK V1.4, IFECPUTM measures exactly the Java CP CPU time that could be offloaded to an IFA.
- The zAAP Projection Tool (WSC White Paper WP100431) can be used now on z/OS 1.4+ to estimate current CP CPU time for Java apps.

Bad news: While IFA CPU time is available in the SMF 30 record, the CPUCOEFF, SU_SEC, and R723NFFI (normalization) factors were NOT, but an IBM APAR will add the factors, so we will be able to back out IFA Service Units from the total IFA+CP Service Units in CPUUNITS in type 30s.

Maybe bad news: There are concerns as to the repeatability of CPU metrics, especially since Java work can run on either IFAs or CPs when IFACROSSOVER=YES is specified. There are additional concerns about the repeatability of the normalization factors on z890s. But data from real sites running real Java work will answer the magnitude of these concerns.

The good news: since so few sites are actually running any Java on their current z/OS systems, future concerns won't impact the current chargeback, and you can benchmark and measure your work's repeatability before IFAs are placed in production!
SMF 70.
- TYPE70 MVS System data flags each LCPU; the new IFATYPnn variable indicates whether an engine is a CP or an IFA.
- The PCTCPUBY calculation and related capacity metrics are still based ONLY on the CP engines, as has always been the case.
- The PCTIFABY calculation and new IFA capacity metrics are added to the TYPE70 dataset so IFA capacity can be tracked separately.
- The IBM RMF CPU Activity report has separate lines for CPs and IFAs.
- TYPE70PR PR/SM LPAR data still has only 'ICF' in SMF70CIN for ICFs, IFLs (Linux), or IFA engines. RMF development has acknowledged that true identification of each engine type is an urgent requirement, but for now you can only count how many CPs and how many "other" engines are available for each LPAR in the PR/SM dataset.
SMF 72.
- R723CCPU/CPUUNITS will contain the sum of IFA and CP Service Units!
- CPUTCBTM is calculated from R723CCPU, SU_SEC, and CPUCOEFF.
- BUT: MXG variable CPUTCBTM will still contain ONLY the CP CPU time, so existing CP capacity measurements and capture ratios are unchanged: MXG subtracts the new R723IFAT/IFACPUTM variable (IFA CPU time) from the R723CCPU/CPUUNITS total when creating MXG's CPUTCBTM.
- R723IFCT/IFECPUTM contains the CP CPU time that was IFA-eligible.

The bad news: The RMF Workload Report shows the sum of CP+IFA time in its "TCB" line, which will no longer match CPUTCBTM in MXG. The RMF report does have a new "IFA" line with the IFA CPU time (IFACPUTM), and new APPL% CP, APPL% IFACP, and APPL% IFA values.

The R723IFAT and R723IFCT times are actually converted from IFA service units using the new R723NFFI normalization factor.
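The type 72 arithmetic can be sketched roughly in Python. It assumes the usual relationship that CPU service units equal CPU seconds times SU_SEC times CPUCOEFF; the function names and sample values are illustrative and are not MXG code.

```python
def tcb_seconds_from_units(cpuunits, su_sec, cpucoeff):
    # Assumed relationship: service units = seconds * SU_SEC * CPUCOEFF,
    # so seconds = units / (SU_SEC * CPUCOEFF).
    return cpuunits / (su_sec * cpucoeff)


def cp_only_tcb_seconds(cpuunits, su_sec, cpucoeff, ifacputm):
    # CPUUNITS now covers CP + IFA work, so remove the IFA seconds
    # to keep a CP-only TCB time comparable with earlier data.
    return tcb_seconds_from_units(cpuunits, su_sec, cpucoeff) - ifacputm


# Made-up sample values, chosen only to keep the arithmetic obvious.
total_tcb = tcb_seconds_from_units(1_000_000, 10_000.0, 1.0)
cp_tcb = cp_only_tcb_seconds(1_000_000, 10_000.0, 1.0, ifacputm=40.0)
```

With these numbers the RMF-style "TCB" figure would be 100 seconds while the CP-only figure is 60, which is the mismatch described in the bad news.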
SDSF display.
- JOB-level data for CPU% includes both CP and IFA CPU percent, but a new IFA CPU column will eventually be added.
- The Bad News: The CPU percent busy field in the top line of the display DOES (incorrectly?) include IFAs in the denominator: a two-CP, one-IFA system with 100% in both CPs and 0% in the IFA displays CPU Busy of 66% on SDSF, because the capacity was based on three total engines, not the two CP engines.
- Unrelated: it was just observed that the JOB %CPU in SDSF includes "Initiator" CPU time (CPITCBTM/CPISRBTM, see ADOC30), which is CPU time before your program actually starts executing, so you can see CPU time while watching a job that ultimately never records any CPU time in the IEF374I messages!
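The 66% figure in the SDSF example follows directly from the choice of denominator. A minimal Python sketch, with a hypothetical helper name:

```python
def pct_busy(busy_by_engine):
    # Average percent busy across the engines supplied.
    return sum(busy_by_engine) / len(busy_by_engine)


cp_busy = [100.0, 100.0]   # two CP engines, both 100% busy
ifa_busy = [0.0]           # one idle IFA engine

# Denominator includes the IFA (three engines), as in the SDSF top line.
all_engines = pct_busy(cp_busy + ifa_busy)

# Denominator limited to the CP engines.
cp_only = pct_busy(cp_busy)
```

Here `all_engines` is about 66.7, which truncates to the 66% shown on the slide, while `cp_only` is 100.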
IEF374I step end and IEF376I job end joblog messages contain the sum of CP and IFA CPU time, but IBM intends to add an IFA value or, more likely, create a new IEFnnnI message with IFA CPU time.

CICS TASCPUTM already includes the J9 Java TCB's CPU time, available separately in the J9CPUTTM variable, but that is the total Java CPU time for the transaction. It won't tell how much was CP versus how much was IFA time. You will have the SMF 30s for each CICS region, and interval records, to measure how much time is CP, IFA, and IFA-eligible, at least at the region level. You may have to use those percentages in your chargeback code.

IMS Message CPU time: unconfirmed, but it is expected that the IMS 07 log record contains both TCB and IFA CPU time.
2. Support for SAS Version 8.2, 9.1, 9.1.2, and 9.1.3

The good news:
- Most importantly, there are no data incompatibilities between V8 and V9. Data libraries and catalogs built with V8 can be read and written with SAS V9, and libraries and catalogs built with V9 can be read and written with SAS V8.
- Remove MEMSIZE=64M from your CONFIG member (use the MXG CONFIGV9 member); REGION=0M or 80M controls SAS virtual storage in V8+.
- V8+: Use SAS member (BATCH) instead of (BATCHXA) in the //CONFIG concatenation in your JCL.
- V9+: Use ENTRY=SAS instead of ENTRY=SASHOST (or use the MXGSASV8 or MXGSASV9 JCL procedure example).
The bad news: MXG 22.08 is required for safe execution with SAS V9.1.2 or V9.1.3.
- While MXG 22.07 had critical revisions for SAS 9.1.2, more design changes were discovered in V9.1.2 that required more MXG changes.
- Plus, errors in SAS when Syncsort's add-on product PROC SYNCSORT is used corrupted INFORMATs and caused fatal errors in BUILDPDB. I removed all MXG INFORMAT statements faster than they could examine their error, so you can use PROC SYNCSORT with MXG even though they have not fixed their error.
- The errors are with the PROC SYNCSORT add-on (it prints "PROC SYNCSORT" on the SAS log); there are no errors with the Syncsort SORT product itself, and the fix will be from SAS, not Syncsort.

Critical actions for you to run MXG with SAS V9.1+:
- Install MXG 22.08,
- use MXGSASV9 and CONFIGV9 from 22.08, and
- run the UTILS2ER utility against all of your SAS programs to see if any lines conflict with the S2=0 option that replaced the S2=72 option set by MXG previously.
Specific Changes related to SAS V9.1.2 and MXG execution:

CONFIGV9: NOTHREADS required for SAS V9.1.2, fixed in 9.1.3. Change 22.207.
- SAS Note SN-12943 reports incorrect results, with no error message, from PROC MEANS, SUMMARY, REPORT, and TABULATE.
- Only on "MVS", and only if threading is used; the V9 default is THREADS.
- While fixed in 9.1.3, I chose to force NOTHREADS in CONFIGV9.
- Can use OPTIONS=THREADS with 9.1.3 on // EXEC to change.

CONFIGV9: NLSCOMPATMODE required: SAS V9 changed the default to NONLSCOMPATMODE.
- Only on "MVS" (thus far!); doesn't fail if LOCALE is ENGLISH/blank.
- But with LOCALE=GERMAN_GERMANY or another non-blank, non-ENGLISH value, every @ symbol causes an error at compile time.
- Extensive discussion in text of Change 22.129 for NLS and LOCALE.
CONFIGV9: S2=0 option now required; MXG previously used S2=72 in CONFIGxx.
- Only on "MVS". Extensive discussion in Change 22.123.
- S2 sets the line size of source read by %INCLUDE or AUTOCALL.
- The V9 MVS SASMACRO library was changed from RECFM=FB to RECFM=VB:
  - there is no standard for line size of SAS-provided %macro text
  - new macros were written by ASCII folks, with line length 255
  - rather than make the authors correct them, RECFM was changed to VB.
- BUT: RECFM=VB has an entirely different meaning for S2 than FB:
    S2=72 with FB ==> read only first 72.
    S2=72 with VB ==> START IN 72!!!!
- MXG had always specified S2=72 to protect you from line numbers.
- S2=0 ==> look at last 8 columns to see if line numbers exist.
- All MXG code is only 72 positions, so S2=0 is no-risk to MXG.
- BUT: If you have SAS code with mixed blanks and numbers, S2=0 will cause your code to syntax error.
- So: the new UTILS2ER utility will read all of your source libraries and identify any exposures in your SAS programs.
CONFIGV9: V6SEQ may still be required with SAS V9.1, V9.1.2.
- Only on "MVS". SN-012437 and Change 22.108 discuss.
- SAS V9.1 and V9.1.2 create corrupted and unreadable datasets, with no error at create time, and the data is unrecoverable, if V7SEQ, V8SEQ, or V9SEQ are used.
- The SAS Hot Fix in SN-012437 does correct the error for V9.1/9.1.2.
- BUT: I can't guarantee you have that hot fix installed, so the MXG SEQENGINE default was again set back to V6SEQ in 22.05.
- But: V6SEQ failed with long-length variables, so Change 22.108 shortened all from-MVS variables.
- MXG has had numerous iterations on SEQENGINE, mostly because unnecessary compression was done.
MXGSASV9: MVS JCL Example has new symbolics for NLS/LOCALE options.

  XX='EN' - Default Language Value (ENGLISH)
  YY='W0' - Default Encode Value (USA)

'DEW3' is for most GERMAN, but 'DEWB' is for SWITZERLAND. You must look at the SAS JCL proc built by your SAS installer to find the correct XX and YY values, and then set them as your MXGSASV9 JCL Procedure defaults.

  //CONFIG   DD DSN= ... CNTL(BAT&YY.)
  //SASAUTOS DD DSN= ...&YY..AUTOLIB
  //SASHELP  DD DSN= ...&XX.&YY..SASHELP
  //         DD DSN= ...EN&YY..SASHELP
  //SASMSG   DD DSN= ...&XX.&YY..SASMSG
  //         DD DSN= ...EN&YY..SASMSG

New DD statements for TRNSHLP, ENCDHLP and TKMVSENV were added.
ASCII-execution code change: EBCDIC character variables INPUT with $VARYING had hex zeros where they should have had blanks, because of a SAS V9 Design Change in the $VARYING informat.

$VARYING has always returned a "raw" $CHAR string that must be converted if the string is EBCDIC text, using:

  INPUT VARIABLE $VARYINGnn. LENTEXT @;
  VARIABLE=INPUT(VARIABLE,$EBCDICnn.);

but when LENTEXT was less than nn, the "pad" of '80'x was found on SAS ASCII platforms, so the statement

  VARIABLE=TRANSLATE(VARIABLE,' ','80'x);

was added to translate the unexpected/undocumented '80'x. Now, also undocumented, in V9 the "pad" of '00'x is returned! So an additional

  VARIABLE=TRANSLATE(VARIABLE,' ','00'x);

had to be added 511 times in 55 members.
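For readers who do not have SAS at hand, the effect of the two TRANSLATE statements can be imitated in Python. This is only an analogy: the helper name and the sample bytes are made up, and Python's cp037 codec stands in for the $EBCDIC conversion.

```python
def blank_pads(text):
    # Same idea as the two SAS TRANSLATE calls: turn the unexpected
    # '80'x and '00'x pad characters into blanks.
    return text.replace("\x80", " ").replace("\x00", " ")


# EBCDIC "ABC" (C1 C2 C3) in an 8-byte field padded with '00'x,
# as described for V9 when LENTEXT is shorter than the field.
raw = bytes([0xC1, 0xC2, 0xC3]) + b"\x00" * 5

converted = raw.decode("cp037")   # EBCDIC-to-text conversion
cleaned = blank_pads(converted)   # pad characters become blanks
```

After the cleanup, `cleaned` is "ABC" followed by five blanks, so comparisons against blank-padded text behave as they would on the mainframe.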
3. ASMTAPES/MXGTMNT ML-29 - ALLOCATION RECOVERY MONITOR

Brand new in MXG 21.04 dated Aug 25, 2003, enhanced ASMTAPES at ML-29:
- new SMF subtype created by MXGTMNT
- new MXG dataset PDB.TYPETARC automatically created
- records each Allocation Recovery: Job, How Long Delayed, etc.
- Allocation Recovery: a job must wait because there is no tape drive
- applies to real and virtual tapes
4. BUILDPDB MVS runtime elongation if ANY output to tape
- like CICSTRAN or DB2ACCT
- or even if output is sequential format on DASD.
- introduced in MXG 19.19, when %VGETENG was added for RMFINTRV to test if a //SPIN DD existed.
- no elongation if no tape or sequential format output
- PROC SQL with FROM DICTIONARY.MEMBERS is used in VGETENG to get the ENGINE of //SPIN, but no WHERE LIBNAME=SPIN clause was used; it read all LIBNAMEs to populate DICTIONARY.MEMBERS
- If CICSTRAN and DB2ACCT are both multi-vol on tape, the log has:

                                          hh:mm-hh:mm
    SMF Opened, read started              14:25
    CICSTRAN Mount-Dismount 5 vols        14:24-16:06
    DB2ACCT Mount-Dismount 2 vols         14:24-15:25
    SMF Closed, read completed            16:12
    VGETENG-remount/read 2 DB2ACCT vols   16:17-16:30
    VGETENG-remount/read 5 CICSTRAN vols  16:40-17:09

    Total Elapsed time: 164 minutes with re-read.
    VGETENG: wasted 52 minutes mounting and rereading

- And PROC SQL prints NO messages about CICSTRAN/DB2ACCT
- The only clue to the elongation is those extra tape mounts.
- Fixed in MXG 22.01+; can circumvent with FREE=CLOSE
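The 164-minute and 52-minute figures can be reproduced from the log timestamps. A small Python check, assuming the total runs from the SMF open (14:25) to the last dismount (17:09) and the wasted time from the first VGETENG remount (16:17) to that same last dismount; the helper name is hypothetical.

```python
from datetime import datetime


def minutes_between(start, end):
    # Whole minutes between two same-day hh:mm timestamps.
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


# SMF read started 14:25; last VGETENG dismount 17:09.
total_elapsed = minutes_between("14:25", "17:09")

# First VGETENG remount 16:17; last VGETENG dismount 17:09.
vgeteng_wasted = minutes_between("16:17", "17:09")
```

Under those assumptions `total_elapsed` is 164 and `vgeteng_wasted` is 52, matching the slide.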
5. MXG 22.04 is required to support CICS/TS 2.3 SMF records
- MXG 21.04 supported UTILEXCL to read CICS/TS 2.3 SMF 110s. If you used UTILEXCL to read your CICS/TS 2.3 Dictionary to create IMACEXCL, the IMACEXCL correctly read SMF 110 data.
- You MUST use UTILEXCL if there were EXCLUDEd fields.
- You SHOULD use UTILEXCL always, as it only outputs the variables that exist in your current release(s) of CICS.
- But if you read SMF 110 records with MXG 21.04 thru 22.03, and all fields were present, the TASCPUTM and many other variables were completely wrong, and there were no error messages.
- All my CICS/TS 2.3 test data had EXCLUDEd fields!
6. MXG 22.06 is required for DB2 Parallel
- CPU time for DB2 Parallel Trans was not output (i.e., it was lost, and could be very large) in DB2ACCT.
- Code in MXG Exit Members EXDB2ACC/EXDB2ACP/EXDB2ACB deleted all obs with DB2PARTY='P', which was wrong because those obs contain the DB2TCBTM for parallel events.
- New DB2PARTY='R' for the Roll-Up observations was also added.
- Extensive DB2 Technical Note in Newsletter FORTY-FIVE and additional documentation in Change 22.121 text.
7. If you are using IRD, you must install MXG 22.02 or later:

Full Support for IRD (Intelligent Resource Director) in all CPU-related datasets. IRD support was incremental in MXG:

    Datasets           When           MXG Version  Change
    ASUM70PR/ASUMCEC   Sep 22, 2003   21.05        21.170
    TYPE70PR           Mar 11, 2004   22.01        22.011
    TYPE70,RMFINTRV    Mar 22, 2002   22.02        22.050

PCTCPUBY in TYPE70 and RMFINTRV was wrong in any interval when IRD varied CPUs offline. I'm embarrassed, since PCTCPUBY is the second most important variable in all of MXG (CPUTM for billing is the most important). This is the first PCTCPUBY error in MXG's TWENTY-YEAR history! When all engines remained online, however, there was no error.
8. Still OS/390? MXG 22.07 is required for z890s and 21.04 for z990s.
- IBM changed the CPUTYPE value:
    z990 - 2084x
    z890 - 2086x
- Only impacts MSU variables that MXG had to set via a table lookup based on CPU TYPE for OS/390.
- With z/OS, MSU fields are in the SMF records, so there is no table lookup required.
9. MXG/SAS V9.1 benchmarks Windows XP, Linux RH8, z/OS 1.4:
- Linux and Windows runs on the same AMD 1400 1.5GHz, 500MB RAM
- z/OS runs on IBM 2064-210.
- An 842 MB SMF file was used.
- TYPE30 DATA step cost, and PROC SORT TYPE30D (3.4 GB):

                        LINUX      WINDOWS    z/OS
    Data Step
      Elapsed Time      4:27.70    7:40.02    11:03.36
      CPU time          3:59.46    3:57.64     5:56.70

    PROC SORT
      SORTSIZE DEFAULT  48MB       64MB       MAX
      Elapsed           15:39.82   28:19.89   12:28.98
      User CPU           5:01.43    4:02.93    5:23.16

      SORTSIZE          200MB      200MB
      Elapsed           15:26.10   28:19.89
      User CPU           5:01.02    4:02.93

      SORTSIZE          400MB      400MB
      Elapsed           19:12.40   35:05.38
      User CPU           5:02.79    4:15.81
Benchmark observations:
- SAS V9.1 under LINUX significantly outperforms Windows XP in both the DATA step and in the SORTS.
- SAS V9.1 under z/OS is significantly slower than either LINUX or Windows for the DATA step, but PROC SORT on z/OS, which used SYNCSORT, is better than Windows or Linux.
- but: SYNCSORT on Windows product tests with SAS V9.0 was significantly better than the V9.0 SAS Internal SORT. (See Newsletter FORTY benchmarks).
- SYNCSORT on Windows was not available for these runs.
- SORTSIZE on ASCII does impact elapsed time:
  - elongation occurs if SORTSIZE is too large or too small
  - SAS V9.1 defaults were increased from the old 2MB default and seem fine for this 3.5 GB sort.
Benchmark conclusions:
- Real message: the repeatability and reliability of SAS:
  - 5-10 minutes to read 1 GB SMF file to create a PDB lib
  - 15 minutes to sort a 4 GB dataset
  NO MATTER WHAT PLATFORM YOU CHOOSE FOR SAS.
- BUT: this is NOT a capacity comparison of these platforms:
  - running MXG on ASCII requires dedicated hardware, at least during the creation of PDB libraries
  - neither Windows nor Linux/unix has Workload Manager
  - BUILDPDB blocks out other users on a shared unix platform
- Gives you confidence that MXG and SAS will continue to measure computer systems, no matter where SAS runs.
- Your MXG and SAS skills are transferable across platforms.
- You can keep your job!
10. Running out of storage "Below the bar" is catastrophic:
- Failing system had to be removed from the SYSPLEX
- SYSPLEX recovery took 3 minutes, halting all systems
- Caused by twenty parallel sorts with SYNCSORT
- HIPER APAR OA03577 from IBM for RSM/SRM, plus fix from SYNCSORT for z/OS 1.1 release, if you have lots of sorts with DSM enabled.
- Sort jobs fixed 99% of page frames between 16MB and the 2GB "Bar"
- RSM failed to detect the page shortage "below the bar", so SRM did not take any action.
- APAR addresses the RSM problem so that SRM takes action
- SYNCSORT fix limits the storage that it fixes.
- Only occurs with SYNCSORT's global DSM option enabled:
  - Turning off DSM limits job to the VSCORET parm
  - Turning off DSM helps overall system, but can elongate run times for large sorts.