Java on z/OS: A fresh look
Scott Chapman
American Electric Power
Important notes
• I don’t really like Java as a language
• I’m not a Java expert
• Results presented herein may be installation-dependent
• There are a lot of moving parts here
• I understand there’s zAAP on zIIP
  • “zAAP” used generically here
• All trademarks of IBM, Oracle, and everybody else hereby recognized
Why Java on z/OS?
• Because programmers want to use it
  http://xkcd.com/801/
Why Java on z/OS
• Because it enables open source projects that are cool/useful/interesting
• Key trick: run the JVM in ASCII
  • -Dfile.encoding=ISO8859-1
  • Many things will just run with that run-time option!
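The effect of the file.encoding option can be checked from inside the JVM. The sketch below is an illustration only: the class name EncodingCheck and the helper hexBytes are made up here, and it assumes the platform default on z/OS is an EBCDIC code page such as IBM1047. It prints the active default charset and shows how the same character encodes to different bytes in ISO-8859-1 and, when the JVM ships that charset, in IBM1047.

```java
import java.nio.charset.Charset;

// Hypothetical helper class (not from the slides) for inspecting the JVM's encoding.
class EncodingCheck {

    /** Hex form of text encoded in the named charset, or null if this JVM lacks that charset. */
    static String hexBytes(String text, String charsetName) {
        if (!Charset.isSupported(charsetName)) {
            return null;
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(Charset.forName(charsetName))) {
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // With -Dfile.encoding=ISO8859-1 these should report an ASCII-compatible charset.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());
        // 'A' is 0x41 in ISO-8859-1 and 0xC1 in EBCDIC code page 1047.
        System.out.println("A in ISO-8859-1 = " + hexBytes("A", "ISO-8859-1"));
        System.out.println("A in IBM1047    = " + hexBytes("A", "IBM1047"));
    }
}
```

Running it once with and once without -Dfile.encoding=ISO8859-1 shows whether the option actually took effect for a given JVM.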
What about a GUI?
• Turns out that just works too!
• Start Xming X server on your PC
  • Check the “No Access Control” option
• Set the DISPLAY environment variable
• Run the code
S147774:/u/s147774: >export DISPLAY=10.97.131.15:0
S147774:/u/s147774: >java -Xmx320m -jar ga33.jar
Debugging JavaScript code running in Helma on the mainframe, with the GUI connected to Xming on my laptop. Works better than I expected.
Why Java on z/OS
• Because it enables more programming language choices
• JavaScript built in to Java 6
  • Rhino interpreter from Mozilla
• In theory, should be able to run any JVM-based language (I haven’t tested these)
  • Jython
  • Groovy
  • Clojure
  • Scala
  • Ruby (via JRuby)
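Java 6 exposes its built-in JavaScript interpreter through the standard javax.script API. The sketch below is an assumption about how one might call it, not something taken from the slides; the class name ScriptDemo is hypothetical. It returns null when no JavaScript engine is registered, which is the case on newer JDKs that no longer bundle one.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Hypothetical demo class: evaluates a JavaScript expression from Java.
class ScriptDemo {

    /** Evaluates the source with the JVM's JavaScript engine; returns null if none is available. */
    static Object evalJavaScript(String source) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("JavaScript");
        if (engine == null) {
            return null; // no JavaScript engine registered in this JVM
        }
        try {
            return engine.eval(source);
        } catch (ScriptException e) {
            throw new IllegalStateException("Script failed: " + source, e);
        }
    }

    public static void main(String[] args) {
        Object result = evalJavaScript("6 * 7");
        System.out.println(result == null ? "No JavaScript engine in this JVM" : "6 * 7 = " + result);
    }
}
```

The same ScriptEngineManager lookup works for other JSR 223 languages when their engine jars are on the classpath.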
Why Java on z/OS
• It may perform better
  • If you are on a sub-capacity machine
• It may save you money
  • Pretty unlikely
  • Only if you can take some work away from your peaks
How cheap are zAAP/zIIPs?
• $100K/SE (z196, zEC12)
• How much is $100K?
• Consider adding 1 engine to z196-710:
  • 710 = 10,250 MIPS, 1191 MSUs
  • 711 = 11,073 MIPS, 1286 MSUs
  • 710+1 zIIP = 10,302+1,000 MIPS
• z/OS (base) at this level costs $62/MSU
• Scenario B, z/OS base goes up almost $6K/month
• zIIP costs < 17 months of z/OS Base
  • Not to mention features, DB2, CICS, etc.
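The "almost $6K/month" and "< 17 months" figures follow from the numbers on the slide: the MSU difference between the 710 and the 711, multiplied by $62 per MSU, compared against the $100K engine price. A small sketch of that arithmetic, with a made-up class name and the slide's prices plugged in as inputs:

```java
// Hypothetical calculator; all dollar and MSU inputs come from the slide above.
class ZiipBreakEven {

    /** Extra MSUs incurred by moving to the larger general-purpose configuration. */
    static int msuDelta(int msuBefore, int msuAfter) {
        return msuAfter - msuBefore;
    }

    /** Monthly software cost increase for the extra MSUs. */
    static double monthlyIncrease(int msuDelta, double dollarsPerMsu) {
        return msuDelta * dollarsPerMsu;
    }

    /** Months of avoided software cost needed to pay for one specialty engine. */
    static double breakEvenMonths(double enginePrice, double monthlyIncrease) {
        return enginePrice / monthlyIncrease;
    }

    public static void main(String[] args) {
        int delta = msuDelta(1191, 1286);                 // 95 MSUs
        double monthly = monthlyIncrease(delta, 62.0);    // 5,890 dollars per month
        double months = breakEvenMonths(100000.0, monthly);
        System.out.printf("MSU delta: %d, monthly increase: $%.0f, break-even: %.1f months%n",
                delta, monthly, months);
    }
}
```

This covers z/OS base only; as the slide notes, other MSU-priced software would shorten the break-even period further.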
What about accessing z/OS services?
• JZOS classes to easily access z/OS-specific constructs
  • z/OS datasets
  • RACF
  • Respond to operator commands
  • Access JES Spool
Ways to Run Java on z/OS
• WebSphere
• CICS
• DB2 Stored Procedures
• Batch
• Started Tasks
• Unix shell
Batch / Started Task options
• BPXBATC
  • BPXBATCH (traditional alias)
  • BPXBATSL (local spawn alias)
  • Traditional approach
  • Difficulty with 100-byte JCL Parm
• JZOS
  • Ships with z/OS
  • Avoids 100-byte parm limit
  • Adds a lot of flexibility
zAAP vs. GCP time
• Watch the normalization factor!
  • Most SMF values not normalized
  • Tools/reports may normalize for you
• Consider IFAHONORPRIORITY=NO
  • Avoid using GCPs to help zAAPs
  • Can result in >99% of Java CPU time executed on zAAP
SDSF zAAP vs. GCP columns
This data comes from RMF

JOBNAME   CPU-Time  GCP-Time  zAAP-Time  zACP-Time  zAAP-NTime
P3SR01BS   1514.11      9.53     772.02       2.26     1501.82
P3SR01AS   1706.50     12.82     868.75       1.95     1690.00
P3SR01B     788.55    197.66     281.64       1.53      547.87
P3SR01A     763.01    192.47     272.33       1.10      529.77
P3SR02A    2953.37    422.62    1188.79       5.39     2312.56
P3SR02B    3051.88    437.74    1226.02       6.55     2385.00
P3SR01AS   7281.39     62.56    3698.72      11.47     7195.17
P3SR02BS   2805.58    123.85    1316.22      22.15     2560.45
P3SR01BS   7783.21     63.38    3955.54      14.38     7694.77
P3SR02AS   2591.27    118.60    1216.36      10.74     2366.21
RTMSERVE   2661.39      3.85    1363.45       1.03     2652.34

Callouts on the columns (left to right): TCB + SRB, real, zAAP on GCP, normalized
SMF 30 Accounting
• BPXBATCH vs. BPXBATSL vs. JZOS
  • Important due to spawned OMVS tasks
• Single step job results:
  • BPXBATSL: 1 step, 1 job record
  • BPXBATCH: 6 step, 4 job records
    • CPU time collected on type OMVS records
  • JZOS: 2 step, 2 job records
    • CPU time almost completely on JOB types
Some interesting calculations
• zAAPn = SMF30_TIME_ON_IFA * SMF30ZNF / 256
• percent work done on zAAP = zAAPn / (zAAPn + SMF30CPT + SMF30CPU)
  • (“Generosity” or “offload” factor)
• percent zAAP sent to GCP = SMF30_TIME_IFA_ON_CP / (SMF30_TIME_ON_IFA + SMF30_TIME_IFA_ON_CP)
  • (“Fallback” percentage: can be <1%, although some fallback is normal and expected)
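The three calculations translate directly into code. The sketch below is a plain restatement of the formulas on the slide; the class name ZaapMath is hypothetical, the parameters are named after the SMF 30 fields the slide uses, and the sample values in main are invented for illustration rather than taken from real SMF data.

```java
// Hypothetical helper; the formulas are the ones on the slide, the inputs are made up.
class ZaapMath {

    /** zAAPn = SMF30_TIME_ON_IFA * SMF30ZNF / 256 (zAAP time normalized to GCP speed). */
    static double normalizedZaap(double timeOnIfa, int znf) {
        return timeOnIfa * znf / 256.0;
    }

    /** "Offload" factor: percent of the work done on zAAP. */
    static double offloadPercent(double zaapN, double smf30cpt, double smf30cpu) {
        return 100.0 * zaapN / (zaapN + smf30cpt + smf30cpu);
    }

    /** "Fallback" percentage: zAAP-eligible time that ran on a GCP. */
    static double fallbackPercent(double timeOnIfa, double timeIfaOnCp) {
        return 100.0 * timeIfaOnCp / (timeOnIfa + timeIfaOnCp);
    }

    public static void main(String[] args) {
        // Invented sample: 100 s on zAAP, normalization factor 512/256 = 2.0.
        double zaapN = normalizedZaap(100.0, 512);
        System.out.printf("zAAPn    = %.2f s%n", zaapN);
        System.out.printf("offload  = %.2f %%%n", offloadPercent(zaapN, 8.0, 2.0));
        System.out.printf("fallback = %.2f %%%n", fallbackPercent(100.0, 1.0));
    }
}
```

Keeping the normalization in one method makes it harder to mix normalized and raw zAAP seconds in the same ratio, which is the mistake the "watch the normalization factor" warning is about.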
Other SMF records
• RMF records
  • Look for breakdown of processor types for both hardware and report / service classes
• WAS 120 records
  • New subtype 9s for WAS 7+ much better!
• HIS type 113 records
  • GCP vs. zAAP vs. zIIP
What about performance?
• Java on the mainframe has a history of performance problems
• Java is inherently “heavy” due to the JVM
  • Scott’s Law: “The easier you make it on the programmer, the harder it is on the system”
• Today’s z hardware and software are up to the task!
  • (But you probably want zAAPs!)
Heard at WAS Week 200x…
• “Our goal is to get JVM startup time down to about 1 second.”
• Seemed like a stretch at the time!
  • WAS startup took several minutes
Today: WAS Servant Startup <1 min
15.49.15 STC14327 ---- MONDAY, 18 APR 2011 ----
15.49.15 STC14327 $HASP373 P3SR02AS STARTED
15.49.15 STC14327 IEFUSI BPXBATSL-P3ASRU ABOVE REGION SET TO 1536MB
15.49.15 STC14327 IEF403I P3SR02AS - STARTED - TIME=15.49.15
15.49.16 STC14327 +BBOO0004I WEBSPHERE FOR Z/OS SERVANT PROCESS
                  P3CELL/P3NODEA/P3SR02/P3SR02A IS STARTING.
15.49.16 STC14327 +BBOO0239I WEBSPHERE FOR Z/OS SERVANT PROCESS p3cell/p3nodea/p3sr02a IS
                  STARTING.
15.49.16 STC14327 +BBOO0308I SERVANT PROCESS P3CELL/P3NODEA/P3SR02/P3SR02A IS EXECUTING
                  IN 64-BIT ADDRESSING MODE.
15.49.16 STC14327 +BBOM0007I CURRENT CB SERVICE LEVEL IS build level 7.0.0.12
                  (cf121027.08) release WAS70.ZNATV date 07/09/10 11:02:02.
...
15.49.56 STC14327 +BBOO0222I: WSVR0001I: Server SERVANT PROCESS p3sr02a open for
                  e-business
15.49.57 STC14327 +BBOO0020I INITIALIZATION COMPLETE FOR WEBSPHERE FOR Z/OS SERVANT
                  PROCESS P3SR02A.
15.49.57 STC14327 +BBOO0248I INITIALIZATION COMPLETE FOR WEBSPHERE FOR Z/OS SERVANT
                  PROCESS P3CELL/P3NODEA/P3SR02/P3SR02A.
Not much in that particular servant
Today: HelloWorld in <2 seconds
10.08.55 JOB47259 IEF403I S147774B - STARTED - TIME=10.08.55
10.08.57 JOB47259 -                                --TIMINGS (MINS.)--          ----PAGING COUNTS---
10.08.57 JOB47259 -JOBNAME  STEPNAME PROCSTEP  RC  EXCP   CPU   SRB  CLOCK  SERV  PG  PAGE  SWAP  VIO
10.08.57 JOB47259 -S147774B RUNOMVS            00    59   .00   .00    .02  2524   0     0     0    0
10.08.57 JOB47259 IEF404I S147774B - ENDED - TIME=10.08.57
10.08.57 JOB47259 -S147774B ENDED. NAME-BPXBATCH TEST TOTAL CPU TIME= .00 TOTAL ELAPSED TIME= .02
10.08.57 JOB47259 $HASP395 S147774B ENDED
z10 EC 504 with zAAP

Output
Hello Scott
Java runtime: IBM Corporation 1.6.0, vm version 2.4
Running on: s390 z/OS 01.10.00
Running for: S147774
Classpath: /usr/lpp/java/J6.0/lib:/usr/lpp/java/IBM/J1.3/l

JCL
//RUNOMVS EXEC PGM=BPXBATCH,
// PARM='SH java -Xms32M -Xmx32M HelloWorldApp Scott'
//SYSOUT DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//STDENV DD *
//STDOUT DD SYSOUT=*
//STDERR DD SYSOUT=*
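The slides show the output of HelloWorldApp but not its source. The following is a guess at a program that would produce output of that shape, built only from standard Java system properties; which property fed which output line in the original is an assumption, as are the method names greeting and report.

```java
// Hypothetical reconstruction: the original HelloWorldApp source is not in the slides.
class HelloWorldApp {

    static String greeting(String name) {
        return "Hello " + name;
    }

    /** Builds a report resembling the job output, using standard system properties. */
    static String report(String name) {
        String nl = System.getProperty("line.separator");
        return greeting(name) + nl
                + "Java runtime: " + System.getProperty("java.vendor") + " "
                + System.getProperty("java.version")
                + ", vm version " + System.getProperty("java.vm.version") + nl
                + "Running on: " + System.getProperty("os.arch") + " "
                + System.getProperty("os.name") + " "
                + System.getProperty("os.version") + nl
                + "Running for: " + System.getProperty("user.name") + nl
                + "Classpath: " + System.getProperty("java.class.path");
    }

    public static void main(String[] args) {
        // The JCL above passes the name as the first argument ("Scott").
        String name = args.length > 0 ? args[0] : "World";
        System.out.println(report(name));
    }
}
```

A program this small mostly measures JVM startup, which is the point of the timing on the slide.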
Small machine
10.51.53 JOB10901 IEF403I S147774B - STARTED - TIME=10.51.53
10.52.04 JOB10901 -                                --TIMINGS (MINS.)--          ----PAGING COUNTS---
10.52.04 JOB10901 -JOBNAME  STEPNAME PROCSTEP  RC  EXCP   CPU   SRB  CLOCK  SERV  PG  PAGE  SWAP  VIO
10.52.04 JOB10901 -S147774B RUNOMVS            00    86   .00   .00    .18  2252   0     0     0    0
10.52.04 JOB10901 IEF404I S147774B - ENDED - TIME=10.52.04
10.52.04 JOB10901 -S147774B ENDED. NAME-BPXBATCH TEST TOTAL CPU TIME= .00 TOTAL ELAPSED TIME= .18
10.52.04 JOB10901 $HASP395 S147774B ENDED
z10 BC E02 without zAAPs
Not surprising that ~50 MIPS engines can’t keep up with 450 / 900 MIPS engines
What about doing real work?
• Days of assuming it will run faster on your PC are over
  • Have seen H2 perform better on z/OS
• Still, it is Java, it’s not CPU-free
• Performance may depend on:
  • zAAP and GCP capacity
  • System settings (USS, zFS, WLM)
  • Application code
  • Java settings (heap size, GC policy)
  • Random luck
Application code
• Application code is always important
  • Regardless of the language!
• BufferedReader or ZFile?
  • Classic “it depends”
  • BufferedReader seems like it should be faster
  • But they provide different results: byte array vs. string
  • What you want to do with the result may impact which is best for any given situation
• Java has lots of similar but slightly different ways of doing things
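The "byte array vs. string" difference can be shown without z/OS. In the sketch below, readLines uses BufferedReader and yields decoded strings, while readRecords reads fixed-length records as raw bytes from a plain InputStream. The second method is only a stand-in for record-oriented reading in the style of JZOS ZFile; it does not use the JZOS API, and the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical comparison of two read styles; readRecords is a stand-in, not the ZFile API.
class ReadStyles {

    /** Line-oriented read: each result is already decoded into a String. */
    static List<String> readLines(Reader source) {
        List<String> lines = new ArrayList<>();
        try {
            BufferedReader reader = new BufferedReader(source);
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    /** Record-oriented read: each result is a raw byte[] the caller must still decode. */
    static List<byte[]> readRecords(InputStream source, int recordLength) {
        if (recordLength <= 0) {
            throw new IllegalArgumentException("recordLength must be positive");
        }
        List<byte[]> records = new ArrayList<>();
        byte[] buffer = new byte[recordLength];
        int filled = 0;
        try {
            int n;
            while ((n = source.read(buffer, filled, recordLength - filled)) != -1) {
                filled += n;
                if (filled == recordLength) {   // a full record has been assembled
                    records.add(buffer.clone());
                    filled = 0;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;                          // a trailing partial record is dropped
    }
}
```

If the program only copies or filters records, the byte form avoids a decoding step; if it needs text operations, the decoding cost is paid either way, which is why the answer depends on what happens to the result.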
Heap settings
• Heap settings always seen as an issue
• Size is the usual suggestion
  • Is bigger always better?
  • Does anybody know how much heap they really need? (no)
  • Min / Max sizes same or different?
• Garbage collection policy options
Memory is an issue
• Java’s memory usage can be an issue
  • “Requirements” for 100s of MBs are not unusual
  • Often “requirements” seem to be a SWAG
  • Java heap size can’t be reliably predicted from the code & expected volumetrics
• Test with reasonable numbers before assuming the requirements are real
  • Be sure to get all processing scenarios!
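One low-effort way to test a heap "requirement" is to have the application report what it actually uses at interesting points in each processing scenario. The sketch below relies only on java.lang.Runtime; the class name HeapSnapshot and where you would call it from are assumptions.

```java
// Hypothetical helper for logging observed heap use during a test run.
class HeapSnapshot {

    /** Bytes currently in use on the heap (committed minus free). */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Upper bound the heap may grow to; reflects -Xmx when it is set. */
    static long maxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    static String summary() {
        long mb = 1024L * 1024L;
        return "heap used " + (usedBytes() / mb) + " MB of max " + (maxBytes() / mb) + " MB";
    }

    public static void main(String[] args) {
        System.out.println("before: " + summary());
        byte[][] blocks = new byte[16][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[1024 * 1024];   // hold 16 MB so the change is visible
        }
        System.out.println("after:  " + summary() + " (" + blocks.length + " blocks held)");
    }
}
```

The used figure includes garbage that has not been collected yet, so it overstates the live set; verbose GC output gives a better picture of what survives each collection.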
Garbage Collection Options (IBM Java 6)
• optthruput – default
  • Probably best for batch
• gencon – generational / concurrent
  • maybe good for large heap, transactional workloads (WAS)
• optavgpause – reduces long pauses
• subpool – “improved” object allocation
• For important workloads, may want to test all of them at various sizes
• Lots of other heap/gc options too
  • See IBM JDK Diagnostics Guide!
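When comparing policies, it helps to have the test workload print its own GC totals at the end of each run. The sketch below uses the standard java.lang.management API; the class name GcReport is hypothetical, and the collector names it prints depend on the JVM and policy in use. On the IBM JVM the policy is selected with the -Xgcpolicy option, for example -Xgcpolicy:gencon.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical end-of-run report of GC activity for comparing policies and heap sizes.
class GcReport {

    /** One line per collector: name, number of collections, accumulated collection time. */
    static List<String> collectorSummaries() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Generate some short-lived garbage so the collectors have something to do.
        long checksum = 0;
        for (int i = 0; i < 200000; i++) {
            checksum += new byte[1024].length;
        }
        System.out.println("allocated bytes: " + checksum);
        for (String line : collectorSummaries()) {
            System.out.println(line);
        }
    }
}
```

Running the same job with each policy at two or three heap sizes and comparing these totals alongside CPU time is one way to do the "test all of them" step.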
For some workloads, heap size may not matter
Too small of a heap can cause CPU increase
There might be a slight benefit to a fixed heap size
Heap size most important, but GC Policy also can be significant
Don’t mess with the JIT!
Could be good for certain workloads
So what’s the random thing?
• Much more variation in CPU time measurements with today’s CPUs
  • Superscalar pipeline and cache issues
• Seems to impact my Java work more than I expected
  • Consistently ran same workload
  • Extremely lightly utilized LPAR
  • Lightly utilized zAAPs
  • Same variability over time
• So I tried some more tests…
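A rough way to see this kind of variation is to run an identical workload several times in one JVM and compare the CPU time per run. The sketch below is not the test used in the talk: the workload, the class name CpuVariability, and the spread metric (max minus min, as a percent of min) are all assumptions chosen for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical harness for observing run-to-run CPU time variation of identical work.
class CpuVariability {

    /** Spread of the samples as a percentage of the smallest one; NaN if not computable. */
    static double spreadPercent(long[] samples) {
        if (samples.length == 0) {
            return Double.NaN;
        }
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long s : samples) {
            min = Math.min(min, s);
            max = Math.max(max, s);
        }
        return min <= 0 ? Double.NaN : 100.0 * (max - min) / min;
    }

    /** Fixed amount of arithmetic; the result is returned so the JIT cannot discard the loop. */
    static long workload() {
        long sum = 0;
        for (int i = 0; i < 5000000; i++) {
            sum += (i * 31L) % 7;
        }
        return sum;
    }

    /** CPU nanoseconds consumed by the current thread for each of the runs. */
    static long[] measure(int runs) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isCurrentThreadCpuTimeSupported()) {
            throw new UnsupportedOperationException("thread CPU time not supported on this JVM");
        }
        long[] cpuNanos = new long[runs];
        long sink = 0;
        for (int r = 0; r < runs; r++) {
            long start = threads.getCurrentThreadCpuTime();
            sink += workload();
            cpuNanos[r] = threads.getCurrentThreadCpuTime() - start;
        }
        System.out.println("checksum: " + sink);
        return cpuNanos;
    }

    public static void main(String[] args) {
        long[] samples = measure(10);
        for (long s : samples) {
            System.out.println(s / 1000000 + " ms CPU");
        }
        System.out.printf("spread: %.1f %%%n", spreadPercent(samples));
    }
}
```

The first runs include JIT compilation, so they are expected to cost more; the variation the talk describes is between separate executions of the same job, which this in-process loop only approximates.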
Charts: One zAAP | Two zAAPs | Zero zAAPs
Why is this?
• I don’t know, but best guess is CPU cache and memory access effects
• But I thought I’d look at the 113 records to see if I could find anything interesting…
Data from test period 1 (one zAAP). In these charts, Proc 0 = GCP and Proc 2 = zAAP.
• Chart: seems to confirm our SMF30 data
• Chart: L1.5 improvement corresponds to dip in machine usage
• Chart: dip in GCP TLB miss overhead due to machine being less busy
My Guesses…
• My test Java workloads were too cache and superscalar friendly
  • Perhaps makes it more susceptible to pipeline hazards
  • But:
    • Wouldn’t the REXX workload be even more superscalar and cache friendly?
    • Why were the 113 measurements so consistent?
• Or Java is really doing variable amounts of work?
• Or… something isn’t right someplace?
• Take away: Java CPU measurements might be more variable than you expect
Most recent testing
• Repeated testing later in the year
  • z/OS 1.12 vs. 1.10
  • 1 year more recent Java 6 (Fall 2010 vs. Fall 2009)
• Still saw variability, but worst of it was closer to 25-30% instead of upwards of 75%
• Saw similar variability when testing on a z9 with zAAPs
• Saw at least one instance in a production LPAR with similar variability (in 3 executions of the same job, the 1st consumed just over half as much CPU as the later runs)
• Could not readily replicate on a WSC system running under z/VM
Summary
• Java enables all sorts of cool things you might not have thought could run on the mainframe
• Mainframe’s Java performance not significantly worse than any other platform
  • (Assuming adequate zAAP capacity)
• Lots of tuning knobs for Java
• Java CPU time measurements might be more variable than you expect