To Compress or not to Compress? Chuck Hopf
What is your precious?
• Gollum says every data center has something that is precious or hard to come by:
  • CPU Time
  • DASD Space
  • Run Time
  • IO
  • Memory
Lots of talk
• On the LISTSERVE: does compression use more CPU? Does it save DASD space?
• On the LISTSERVE: what is the best BUFNO= to use with MXG?
Testing the theories
• Built two tests
• COMPRESS=NO, varying BUFNO across 2, 10, 15, and 20
• COMPRESS=YES, again varying BUFNO across the same values
An Epiphany!
• What if you run with COMPRESS=NO, send the output to PDB as a temporary dataset, and then at the end turn on COMPRESS=YES and do a PROC COPY INDD=PDB OUTDD=PERMPDB NOCLONE; ?
• That would eliminate all of the compression during the reading and writing of the interim datasets but still create a compressed PDB.
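A minimal sketch of that idea, assuming PDB is allocated as a temporary library and PERMPDB as the permanent one (the DD names come from the slide; the allocations and the processing step in the middle are not shown and are only placeholders here). With NOCLONE, the copy takes its compression setting from the current COMPRESS= option rather than from the uncompressed input datasets.

```sas
/* Sketch only: PDB (temporary) and PERMPDB (permanent) are assumed   */
/* to be allocated already; the allocations are not shown.            */
OPTIONS COMPRESS=NO;            /* interim datasets are written uncompressed */

/* ... BUILDPDB processing writes its output to PDB here ... */

OPTIONS COMPRESS=YES;           /* applies only to datasets created from now on */
PROC COPY INDD=PDB OUTDD=PERMPDB NOCLONE;  /* NOCLONE: use current COMPRESS= */
RUN;
```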
So there are now 3 Tests!
• TEST=NO: COMPRESS=NO
• TEST=NO/YES: COMPRESS=NO but final PDB is compressed
• TEST=YES: COMPRESS=YES
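One way the option settings for each run could be parameterized is sketched below. SETTEST is a hypothetical helper macro, not part of MXG and not necessarily how the original tests were driven; it only sets the two options being compared and writes them to the log.

```sas
/* Hypothetical helper: SETTEST is not part of MXG or of the original tests. */
%MACRO SETTEST(COMPRESS=NO, BUFNO=10);
  OPTIONS COMPRESS=&COMPRESS BUFNO=&BUFNO;
  %PUT NOTE: RUNNING WITH COMPRESS=&COMPRESS BUFNO=&BUFNO;
%MEND SETTEST;

%SETTEST(COMPRESS=NO, BUFNO=2);   /* TEST=NO at the first BUFNO value */
```

Each of the three tests would then be repeated for BUFNO values 2, 10, 15, and 20, with TEST=NO/YES adding the final copy to a compressed PDB.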
Conclusions?
• Running with COMPRESS=NO and then copying to a compressed PDB optimizes permanent DASD space and uses very little additional CPU.
• Even better, use the LIBNAME option to turn compression on only where you want it:
  • LIBNAME PDB COMPRESS=YES; /* zOS only */
• Memory requirements increase with BUFNO, but only modestly, and BUFNO values above 10 show very little additional benefit.
Caveats!
• BLKSIZE matters. SAS procs are sometimes built with a BLKSIZE of 6160 on WORK. This radically affects the IO counts. Use the recommended BLKSIZE=DASD(OPT) and leave the DCB attributes off of SAS datasets.
• REGION may have to be increased: use REGION=0M and be sure you are using the MXG defaults for MEMSIZE.
• All of this applies to zOS, not to ASCII platforms.
So What About ASCII?
• Using the same data, tests were run with SAS 9.2 on a Win 7 system:
  • 1.5GB memory
  • Dell 4600, P4 2.7GHz
Wow!
• COMPRESS=YES outperforms COMPRESS=NO!
• BUFNO makes some difference but not a lot, and BUFNO=10 looks to be optimal
• Difference is in seconds, not minutes
• But… there is something we don’t understand in the memory numbers
• Runs faster under Win 7 than under zOS
  • But that does not include download time
So What Should You Do?
• It depends on what your ‘precious’ is
• Running zOS
  • Optimal for CPU and DASD is COMPRESS=NO with a copy to a compressed dataset at the end, or setting the COMPRESS=YES option on the LIBNAME statement
  • Optimal for CPU is COMPRESS=NO
  • Optimal for DASD is COMPRESS=YES
  • BUFNO=10 is optimal for run time
• Running ASCII
  • Optimal for CPU and DASD is COMPRESS=YES
JCL
//* SAMPLE JCL TO RUN BUILDPDB WITH COMPRESS=NO AND COMPRESS AT
//* THE END USING PROC COPY
//S1       EXEC MXGSASV9
//PDB      DD DSN=MXG.PDB(+1),SPACE=(CYL,(500,500)),
//            DISP=(,CATLG,DELETE)
//SPININ   DD DSN=MXG.SPIN(0),DISP=OLD
//SPIN     DD DSN=MXG.SPIN(+1),SPACE=(CYL,(500,500)),
//            DISP=(,CATLG,DELETE)
//CICSTRAN DD DSN=MXG.CICSTRAN(+1),SPACE=(CYL,(500,500)),
//            DISP=(,CATLG,DELETE)
//DB2ACCT  DD DSN=MXG.DB2ACCT(+1),SPACE=(CYL,(500,500)),
//            DISP=(,CATLG,DELETE)
//SMF      DD DSN=YOUR.SMF.DATA,DISP=SHR
//SYSIN    DD *
OPTIONS COMPRESS=NO BUFNO=10;
LIBNAME PDB COMPRESS=YES;
LIBNAME SPIN COMPRESS=YES;
%LET SPININ=SPININ;
%UTILBLDP(
  MACKEEPX=
    MACRO _LDB2ACC DB2ACCT.DB2ACCT %
    MACRO _KDB2ACC COMPRESS=YES %
    MACRO _KCICTRN COMPRESS=YES %
  ,
  SPINCNT=7,
  SPINUOW=2,
  OUTFILE=INSTREAM);
%INCLUDE INSTREAM;
JCL is in the 27.10 SOURCLIB as JCLCMPDB
Why UTILBLDP?
• Allows you to add data sources to BUILDPDB without having to edit the macros in the SOURCLIB.
• Allows you to suppress data sources like 110, DB2, and TYPE74 and process them in other jobs, again without editing the macros.
• Flexibility
Example
OPTIONS COMPRESS=NO BUFNO=10;
LIBNAME PDB COMPRESS=YES;
LIBNAME SPIN COMPRESS=YES;
%LET SPININ=SPININ;
%UTILBLDP(
  USERADD=42,
  SUPPRESS=110 DB2,
  SPINCNT=7,
  OUTFILE=INSTREAM);
%INCLUDE INSTREAM;
RUN;
MXG User Experience
• Running MXG with WPS instead of SAS
• Data from multiple platforms
• Processed under two virtualization products
• Also, a comparison of SAS/PC and WPS on zLinux
PC/SAS VMWARE/Windows versus PC/SAS Hyper-V/Windows
(data from four platforms; three installation “groups”: PROD/QA/DEV)

Data From          VMWARE(PROD)   Hyper-V(PROD)
Unix               00:05:30       00:10:56
zOS                00:01:30       00:04:54
zVM/Linux          00:03:07       00:08:08
Windows Servers    02:43:08       09:32:57

Data From          VMWARE(QA)     Hyper-V(QA)
Unix               00:00:31       00:04:18
zOS                00:01:27       00:02:46
zVM/Linux          00:01:02       00:07:06
Windows Servers    00:41:24       02:34:19

Data From          VMWARE(DEV)    Hyper-V(DEV)
Unix               00:00:43       00:02:42
zOS                00:00:21       00:01:42
zVM/Linux          00:01:08       00:03:34
Windows Servers    00:09:06       00:38:47

Processing of performance data collected from Unix, zVM/Linux, zOS and Windows.
PC/SAS versus LNX/WPS
• PC/SAS VMWARE/Windows versus WPS zVM/Linux
• PC/SAS under VMWARE takes 2:43:08 (hh:mm:ss) to process the data from “Windows Servers”; the WPS zVM/Linux environment does the same work in 1:30:00.
• That is, the mainframe WPS zVM/Linux run takes about 45% less elapsed time than PC/SAS VMWARE/WIN.
• This is most likely due to the extra I/O bandwidth the mainframe has compared to the Windows environment.
• The results for Windows would probably be better if WIN2008 had been used.
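The 45% figure follows from the two elapsed times quoted above. As a quick check, a small DATA step can redo the arithmetic; the variable names are illustrative, and only the two times come from the slide.

```sas
/* Elapsed times are taken from the slide; variable names are illustrative. */
DATA _NULL_;
  SAS_SECS = '02:43:08'T;   /* PC/SAS under VMWARE, in seconds     */
  WPS_SECS = '01:30:00'T;   /* WPS under zVM/Linux, in seconds     */
  PCT_LESS = 100 * (SAS_SECS - WPS_SECS) / SAS_SECS;
  PUT 'Elapsed time reduction (%): ' PCT_LESS 5.1;
RUN;
```

That is 9,788 seconds against 5,400 seconds, a reduction of roughly 44.8%, which rounds to the 45% stated.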
PC/SAS versus WPS on z
• PC/SAS under Hyper-V
• WPS under zVM/Linux on z10
Z10: SAS versus WPS
• zOS/SAS versus zOS/WPS to run MXG
• 30% more I/Os for SAS
• TCB for WPS = 551,423
• TCB for SAS = 551,273
• NOTES:
  • WPS version 2.4.0.1 and SAS 9.1.3
  • MXG from FEB 2009