PROC UNIVARIATE vs. PROC SUMMARY: A Comparison of Performance
Background
• For many of the common things I do, PROCs UNIVARIATE and SUMMARY can accomplish similar results
• Many years ago, someone suggested I use PROC UNIVARIATE because it had more functions
• They claimed that both procedures performed about the same
• I didn’t bother to check that out
• Unless I needed something that could be done only with PROC SUMMARY, I got in the habit of using PROC UNIVARIATE
More Background
• Several months ago, I was becoming frustrated with how long it was taking to run some large PROC UNIVARIATEs for simple functions (like SUM, MEAN, MIN, MAX, etc.)
• They were also using a lot of CPU
• There had to be a better way
My First Experiment
• Wrote DATA steps to do simple functions
• Benchmarked the DATA steps against PROC UNIVARIATE steps
• Compared output results to ensure integrity
• Ran tests using SAS on both mainframe and PC
• The results were surprising
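The slides do not show the test code, so the following is only a minimal sketch of how such a benchmark might be set up. The data set WORK.TEST, the variable AMOUNT, and all output names are hypothetical. It assumes a single numeric variable and the four simple statistics SUM, MEAN, MIN, and MAX, and uses the FULLSTIMER option so that the log reports elapsed and CPU time for each step.

```sas
/* Hypothetical test data: one numeric variable, one million rows. */
data work.test;
   call streaminit(12345);
   do id = 1 to 1000000;
      amount = rand('uniform') * 100;
      output;
   end;
run;

/* Write detailed elapsed and CPU time for every step to the log. */
options fullstimer;

/* PROC UNIVARIATE version: request only the four simple statistics. */
proc univariate data=work.test noprint;
   var amount;
   output out=work.uni_stats
          sum=total mean=average min=smallest max=largest;
run;

/* DATA step version: one pass over the data, keeping running values. */
data work.ds_stats (keep=total average smallest largest);
   set work.test end=last;
   retain smallest largest;
   if not missing(amount) then do;
      total + amount;                     /* sum statement retains TOTAL */
      n + 1;                              /* count of nonmissing values  */
      smallest = min(smallest, amount);   /* MIN ignores missing values  */
      largest  = max(largest, amount);
   end;
   if last then do;
      if n > 0 then average = total / n;
      output;
   end;
run;

/* Integrity check: report any value differences between the results. */
proc compare base=work.uni_stats compare=work.ds_stats criterion=1e-9;
   var total average smallest largest;
run;
```

The comparison uses a small tolerance because the two steps may accumulate the sum in a different order. One edge case in this sketch: if every value of AMOUNT is missing, the DATA step reports a total of 0, whereas the procedure reports a missing sum.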
Results of First Test
• Compared to PROC UNIVARIATE, the DATA step showed:
  • 95% reduction in elapsed time
  • 99% reduction in CPU time
• Decided to also run tests comparing PROC SUMMARY
Results of First Test
• Compared to PROC UNIVARIATE, PROC SUMMARY showed:
  • 94% reduction in elapsed time
  • 96% reduction in CPU time
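For reference, a PROC SUMMARY step that produces the same four simple statistics might look like the sketch below. This is an illustration under assumptions, not the code behind the numbers above: WORK.TEST, AMOUNT, and the output names are hypothetical, and the generated data stands in for whatever was actually tested.

```sas
/* Hypothetical test data: one numeric variable, one million rows. */
data work.test;
   call streaminit(12345);
   do id = 1 to 1000000;
      amount = rand('uniform') * 100;
      output;
   end;
run;

/* Write detailed elapsed and CPU time for every step to the log. */
options fullstimer;

/* PROC SUMMARY writes its results to a data set and prints nothing */
/* by default. _TYPE_ and _FREQ_ are automatic output variables.    */
proc summary data=work.test;
   var amount;
   output out=work.sum_stats (drop=_type_ _freq_)
          sum=total mean=average min=smallest max=largest;
run;
```

Running a PROC UNIVARIATE step on the same data with the same OUTPUT statistics, and then comparing the two log entries, would give the elapsed and CPU figures needed for a percentage reduction.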
Overall Test Results
• Ran many tests on several types of data
• DATA step vs. PROC UNIVARIATE
  • Elapsed time was 71% to 95% lower
  • CPU time was 74% to 99% lower
• PROC SUMMARY vs. PROC UNIVARIATE
  • Elapsed time was 72% to 94% lower
  • CPU time was 76% to 96% lower
• In tests where PROC MEANS was also run, results were similar to PROC SUMMARY
  • Sometimes a little less CPU and elapsed time, sometimes a little more
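The similarity between PROC MEANS and PROC SUMMARY is not surprising: the two procedures compute the same statistics and differ mainly in their defaults (PROC MEANS prints a report unless NOPRINT is specified, while PROC SUMMARY does not print by default). The sketch below shows the two written to produce identical output data sets. It uses the small SASHELP.CLASS sample data purely to illustrate the equivalence, so it is not suitable for timing, and the output names are hypothetical.

```sas
/* PROC MEANS with NOPRINT: suppress the printed report. */
proc means data=sashelp.class noprint;
   var height;
   output out=work.means_stats (drop=_type_ _freq_)
          sum=total mean=average min=smallest max=largest;
run;

/* PROC SUMMARY: same statements, no printed report by default. */
proc summary data=sashelp.class;
   var height;
   output out=work.summary_stats (drop=_type_ _freq_)
          sum=total mean=average min=smallest max=largest;
run;

/* Confirm that the two output data sets hold the same values. */
proc compare base=work.means_stats compare=work.summary_stats;
run;
```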
Other Observations
• DATA steps performed slightly better than PROCs SUMMARY and MEANS for simple functions, but not as well on more complex functions
• Most tests were run on both the mainframe and the PC
• Elapsed time and CPU improvement percentages (vs. PROC UNIVARIATE) were usually similar on both platforms
• The tests were run on an older, slower mainframe and a new Windows 7 PC
• For each test, the same data and parameters were run on both the mainframe and the PC
• The PC generally ran 80-95 percent faster than the mainframe on the same tests (for tested functions) and used 85-95 percent less CPU