390 likes | 497 Views
ColeSoft Marketing Presents. Event Profiling vs. State Profiling under z/OS. By David Day, Sr. Engineer. And brought to you by:. Event Profiling is:. An entirely new technology for more accurately measuring resource consumption by z/OS-based application programs, such as:. Started Tasks
E N D
ColeSoft Marketing Presents Event Profiling vs. State Profiling under z/OS By David Day, Sr. Engineer And brought to you by:
Event Profiling is: An entirely new technology for more accurately measuring resource consumption by z/OS-based application programs, such as: • Started Tasks • Batch jobs • DB2 • IMS (soon) • CICS (even sooner) Resources are critical. Let’s not waste any.
State Profilers are mature products Examples: • Strobe, by Compuware; • Freeze-Frame, by Macro 4; • ASG-TriTune, by Allen Systems Group/CA; • APA, by IBM. These are highly-regarded products. They ARE the market. In such a mature market, you don’t introduce new technology unless it’s MUCH: • Better • Faster • Cheaper! And that’s just what ColeSoft did! So, what is a “State” profiler?
State Profilers: • Periodically stop the address space; • Interrogate the contents; • Write out data; • Start the address space. • They can do this about 1000 times per second. State profilers capture the “state” of an address space.
State Profilers Advantages: • Information is accurate. • An accepted solution. • Mature products with huge penetration. • Disadvantages: • Directly affect execution time. • Don’t capture enough data. • Are “blind” to many interesting events. • Cost an arm and a leg.
Event profiling: • Instead of stopping/starting an address space, Event profilers monitor continuously at the processor level. • NO presence in the address space. • NO system HOOKS of any kind. • Capture FAR more information. • “See” what others are “blind” to. • Do NOT cost an arm and a leg. Really. is the world’s first Event Profiler! (patent pending, etc.)
The End Thank you very much for your time. Isn’t Event Profiling cool? Call (800) XDC-5150 for more information or write tobshimizu@colesoft.com Oh, you want to know MORE? OK…
How an “E. P.” works z/OS processes millions of interrupts per hour for each processor. As part of interrupt processing, Trace Records are created and written to the system Trace Buffers. An E. P. captures Trace Records, filters them, and makes a time-stamped log file from them. Then, it writes them to VSAM ESDS files for report processing. Because of this architecture, an E. P.: • Is absolutely non-invasive; • Gathers MILLIONS of records per run; • Can “see” things that other profilers are “blind” to.
SVC execution by count and elapsed time SRB execution time Page Faults I/O Elapsed time Memory events (GETMAIN/FREEMAIN Storage Obtain/Release) Lock Suspends Enqueues by SVC and ISGENQ Program Call statistics Latch Contention CPU Contention Wait Times What does an Event profiler capture? Things that Capacity Planners, Measurement folks and Software Engineers find really useful: YOU get more granular information to work with than ever before!
Right down to the details… • If you care to, you can report at the individual Trace record level. • You can also invoke SLIP and an E. P. will create trace records for all: • Branch trace • Instruction Fetch • You can’t get more definitive than that! This is “High-definition” profiling! • An E. P.’s filters let you drill down right to the PSW level. • You see more. Better tool = better results. • An E. P. is unique, more powerful and far less expensive than other such tools.
The End Thank you very much for your time. Call (800) XDC-5150 for more information or write tobshimizu@colesoft.com YOU STILL don’t believe us? OK…
Let’s go deeper.What is the Trace Table? • Each processor on the system image has a unique Trace Table. • They are a series of buffers linked in a circular chain. • The default buffer size is 64 4K buffers. • In z/OS V1R10 the Trace Table was increased to 256 4K buffers, and was moved into 64-bit storage. • Sites can choose to increase the size of the Trace Table buffers with an Operator command. In fact, we ENCOURAGE it! • There are two kinds of Trace Records: • Implicit • Explicit
Explicit Trace Records contain: Time-stamp Type of Interrupt TCB Address HASN/PASN/SASN value Are created by Software-issued TRACE commands. Reference: SYS1.MACLIB (IHATTE) Implicit Trace Records contain: NO time stamp Type of interrupt Instruction address HASN/PASN/SASN value Are created by the microcode upon certain events. Reference: POPS manual, Chapter 4 Trace Record anatomy An E. P. “interpolates” in order to derive a reliable time-stamp for Implicit Trace Records.
Explicit External Interrupts Start Sub-channel I/O Interrupts Program Checks SVCs Implicit Program Call Program Return Program Transfer Examples of Trace Records An E. P. LIKES interrupts; the more the better. The more interrupts, the more opportunities z/XPF gets to process the Trace Buffers, and the more records you will get in your reports. PR/SM gives z/XPF 40 interrupts per second all by itself. When an E. P. is active, it takes a few interrupts for itself also. So, what’s the catch?
An E. P.’s “Challenges”: • Under a heavy work-load, z/OS Trace Tables can “wrap” very quickly. • If an E.P. can’t get access to Trace Tables often enough, it loses data. • Therefore, any E. P. needs to execute at high priority. • We accomplish this by running at Execution Class “SYSSTC” and as a Global SRB. • However, the E. P. has a very small footprint, because z/OS does all the work. • An E. P. “sips” system resources itself, because the data is just there. • Remember, Event Profilers have no HOOKS to the OS, and NO presence in the target address space. It’s an elegant solution.
The End Thank you very much for your time. Call (800) XDC-5150 for more information or write tobshimizu@colesoft.com Oh, you want PROOF? OK. Lets see a case study…
Here’s a story, which we’ll entitle: “E. P. and the misplaced buffer” • A customer’s engineers decided to move a series of buffers into 64-bit storage in order to reduce I/O. • Instead, they saw CPU consumption rise dramatically. • State profilers were used, but they could not isolate this problem. What the heck? • The E. P. DID find the problem. Here we go:
Top CPU Consumers Report(before the buffer change) First, our customers ran z/XPF’s Top CPU Consumer’s report. Module DKEXP is what we’re interested in. WORK LOAD CSECT BEGIN END TOTAL % UNIT MODULE OFFSET OFFSET EVENTS TOTAL THIS CPU GROUP EVENTS BATCHTCB LOADMOD1 WAITLIST 000130 000137 400,679 40.31 BATCHTCB LOADMOD1 TIMEUSED 000060 000067 276,920 27.86 BATCHTCB LOADMOD1 DKEXCP 000060 000067 138,131 13.89 BATCHTCB LOADMOD1 WAIT 000030 000037 4,492 00.45 BATCHTCB LOADMOD1 GETMAIN 0000D8 0000DF 1,113 00.11 DKEXP manipulates the application’s buffers. This is what the engineers wanted to optimize.
Top CPU Consumers Report(after buffer change) Surprise! After moving the buffers to 64-bit storage, DKEXP consumed FAR more CPU cycles. WORK LOAD CSECT BEGIN END TOTAL % UNIT MODULE OFFSET OFFSET EVENTS TOTAL THIS CPU GROUP EVENTS BATCHTCB LOADMOD1 DKEXCP 000060 000067 380,707 25.99 BATCHTCB LOADMOD1 WAITLIST 000130 000137 375,069 25.60 BATCHTCB LOADMOD1 TIMEUSED 000060 000067 259,604 17.72 BATCHTCB LOADMOD1 WAIT 00030 000037 4,324 00.29 BATCHTCB LOADMOD1 GCORE64 000250 000257 1,363 00.09 Why is there a difference between these two runs?
Top CPU Consumers Report Before moving the buffers, DKEXCP appeared in its usual place in the list of CPU consumers, with IOSVSSCQ right below it. WORK UNIT .................................................. BATCHTCB LOAD MODULE ................................................. LOADMOD1 CSECT ....................................................... DKEXCP BEGIN OFFSET LOCATION ....................................... 00000060 END OFFSET LOCATION ......................................... 00000067 TOTAL EVENTS THIS OFFSET .................................. 138,131 PERCENT OF TOTAL CPU EVENTS ................................. 13.89 NBR OF MODULES REFERENCED FROM THIS OFFSET ................ 56 LOAD MODULE PROGRAM OFFSET BREAKOUT VALUE ................. 5 LOAD MODULE ................................................. IOSVSSCQ TOTAL EVENTS THIS LOAD MODULE ............................. 1,132 PERCENT OF LOCATION EVENTS .................................. 00.81 PERCENT OF TOTAL CPU EVENTS ................................. 00.11 LOAD MODULE ................................................. IAXPQ TOTAL EVENTS THIS LOAD MODULE ............................. 836 PERCENT OF LOCATION EVENTS .................................. 00.60 PERCENT OF TOTAL CPU EVENTS ................................. 00.08
Top CPU Consumers Report IGC00231, which is called by DKEXCP, appears in ITS normal place. So far, so good. LOAD MODULE ................................................. IGC0023I TOTAL EVENTS THIS LOAD MODULE ............................. 449 PERCENT OF LOCATION EVENTS .................................. 00.32 PERCENT OF TOTAL CPU EVENTS ................................. 00.04 LOAD MODULE ................................................. IRASRCHP TOTAL EVENTS THIS LOAD MODULE ............................. 123 PERCENT OF LOCATION EVENTS .................................. 00.08 PERCENT OF TOTAL CPU EVENTS ................................. 00.01 LOAD MODULE ................................................. BPXMIDIE TOTAL EVENTS THIS LOAD MODULE ............................. 37 PERCENT OF LOCATION EVENTS .................................. 00.02 PERCENT OF TOTAL CPU EVENTS ................................. 00.00
Top CPU Consumers Here is the break-out report after the buffer change: WORK UNIT ................................................... BATCHTCB LOAD MODULE ................................................. LOADMOD1 CSECT ....................................................... DKEXCP BEGIN OFFSET LOCATION ....................................... 00000060 END OFFSET LOCATION ......................................... 00000067 TOTAL EVENTS THIS OFFSET .................................. 380,707 PERCENT OF TOTAL CPU EVENTS ................................. 25.99 NBR OF MODULES REFERENCED FROM THIS OFFSET ................ 72 LOAD MODULE PROGRAM OFFSET BREAKOUT VALUE ................. 5 LOAD MODULE ................................................. IGC0023I TOTAL EVENTS THIS LOAD MODULE ............................. 249,805 PERCENT OF LOCATION EVENTS .................................. 65.61 PERCENT OF TOTAL CPU EVENTS ................................. 17.05 Now, module DKEXCP is “heaviest user” in the Top CPU Consumer’s report. IGC00231 is just below it in the report. I/O has increased dramatically! WHAT is going on?
Here’s what was observed: • Csect DKEXCP had jumped to the top of the CPU Consumers report. • Within DKEXCP, it was obvious that IGC0023I had gone from less than 500 events to ¼ million. • The next slide contains data from the LOAD MODULE SUMMARY Report. • Lets see what activity is taking place inside IGC0023I.
TOP CPU ConsumersHere is the break-out report after the buffer change: WORK UNIT ................................................... BATCHTCB LOAD MODULE ................................................. LOADMOD1 CSECT ....................................................... DKEXCP BEGIN OFFSET LOCATION ....................................... 00000060 END OFFSET LOCATION ......................................... 00000067 TOTAL EVENTS THIS OFFSET .................................. 380,707 PERCENT OF TOTAL CPU EVENTS ................................. 25.99 NBR OF MODULES REFERENCED FROM THIS OFFSET ................ 72 LOAD MODULE PROGRAM OFFSET BREAKOUT VALUE ................. 5 LOAD MODULE ................................................. IGC0023I TOTAL EVENTS THIS LOAD MODULE ............................. 249,805 PERCENT OF LOCATION EVENTS .................................. 65.61 PERCENT OF TOTAL CPU EVENTS ................................. 17.05 LOAD MODULE ................................................. IAXV6 TOTAL EVENTS THIS LOAD MODULE ............................. 2,560 PERCENT OF LOCATION EVENTS .................................. 00.67 PERCENT OF TOTAL CPU EVENTS ................................. 00.17
Load Module Activity Report LOAD MODULE ............................................... IGC0023I EXECUTING UNDER WORK UNIT ................................. BATCHTCB DATA CAPTURE RECORD NUMBER, FIRST EVENT ................... 4,355 DATA CAPTURE TIME, FIRST EVENT ..................... 14.17.06:34.7535 DATA CAPTURE RECORD NUMBER, LAST EVENT .................... 2,260,369 DATA CAPTURE TIME, LAST EVENT ...................... 14.21.45:83.1730 TOTAL OBSERVED EVENTS, THIS LOAD MODULE ................... 252,040 PERCENT OF WORK UNIT TOTAL ................................ 19.37 PERCENT OF SEGMENT TOTAL .................................. 15.83 RATE PER SECOND ........................................... 900.14 TOTAL NUMBER OF MEMORY MANAGEMENT EVENTS .................. 125,397 SVC 120--GETMAIN/FREEMAIN ................................. 3 BR ENTER GETMAIN/FREEMAIN ................................. 6 PROGRAM CALL IARV64 ....................................... 125,388 Now we can see that IGC0023I calls SVC 120, and does so a WHOLE LOT. DKEXCP is “killing” the system with Program Calls to module IARV64. AHA!
Why did event profiling succeed when the others had failed? Why didn’t the State profilers identify the problem? State profilers acquire a local Lock, and use STATUS STOP (an SVC) in order to “freeze” the address space for measurement. Memory management events ALSO need the local lock. So, the State profilers were “blind” to this problem. The Event Profiler caught it right away. Any profiler that needs the Local Lock to perform its logic during the active part of the interval will never observe any activity in any application when that application owns the Lock. Nanner-nanner!.
The desktop GUI Interface The Event Profiler can present its information on the mainframe via ISPF screens, OR on the “wintel” desktop, via z/XPF-PC. • The E. P. can generate .csv records, which are downloaded to the PC platform via TCP/IP. • z/XPF-PC then “massages” the data and presents it to you in a clear, intuitive way. You can drill down/zoom, pan around – as you might expect. • When you’ve gotten the “big picture” you can go back to z/XPF’s detailed reports for more concise information. • Today, z/XPF-PC only looks at three of the major reports in z/XPF. • Tomorrow, you will be able to do ALL your work on the desktop! • That means soon. Like in a month or two!
So, E. P. users have options They can use The E. P.’s mainframe-based ISPF interface: OR…
E. P. users have options, cont’d They can use the z/XPF-PC interface: OK, let’s wrap this up. It’s time to summarize.
What’s so great about Event profiling? • Event profiling has NO effect on the target address space. • Event profiling needs no HOOK to the operating system. • Event profiling captures data that is just there in the OS. • Event profiling captures MUCH MORE data than State profilers. • Event profiling “sees” MUCH more than State profilers do. • Therefore, Event profiling offers finer measurement to people who are concerned with efficiency. Better tool = better result. • You know absolutely what’s going on. • Your programmers and vendors have nowhere to hide! • Because E. P. technology is “new”, we’re humble. That means the E. P. isn’t going to cost an arm and a leg! Is the world’s first Event profiler.
And just who is “ColeSoft? • ColeSoft has 30 years in the mainframe ISV industry. • We’re more of an engineering company, and less of a marketing organization. • Our flagship product is z/XDC, the world’s most comprehensive debugging tool for z/OS-based software engineers who write in Assembler language. • z/XDC is a go-anywhere/do anything diagnostic tool. • Our customers ARE the ISV community. Just about all of them. • We have happy customers all over the world. • Back to z/XPF!
Want to know more? • Seeing is believing. • We offer free, Internet-based demonstrations. • We offer FREE on-site education for trial customers. • We’ll work with you to help you get the most out of the product. • There is no obligation to purchase. • We’d like to make a living. • We’re not here to create an empire. • Bob Shimizu (800) XDC-5150 • Please visit: www.colesoft.com, or www.z-xpf.com
The End This time we mean it. Thanks! Thank you very much for your time. Call (800) XDC-5150 for more information or write tobshimizu@colesoft.com
Why are you still here?Get back to work, or call us! (800) XDC-5150bshimizu@colesoft.com