Old Car, But New Engine (Proof of Concept Study) Haibin Shu Duramed Research, Bala Cynwyd, PA A subsidiary of Barr Pharmaceuticals
Overview SUGI is returning to Orlando, where four years ago, at the 27th annual meeting, Billy Clifford (SAS Institute) claimed that …
Overview (cont’d) Primary users of SPDE are characterized by: • A strong SAS background • Solid familiarity with the SPDE tuning knobs • Knowledge of disk and CPU configurations
Overview (cont’d) However, if you just graduated from University of Philadelphia, then…
Overview (cont’d) Message to take home – Yes, you can!!
Topics to Be Covered • Introduction of Computing Technology • Features in SPDE • Comparison Between Base Engine and SPDE • Application • Conclusion • Q & A
Introduction of Computing Technology • Threading • Parallelism • I/O Efficiency Footnote: CPU usage is not a major concern!
Features in SPDE SPDE (Scalable Performance Data Engine) was designed to address the new requirements of computing technology
Features in SPDE (cont’d) • BIG (> 2 GB) Allows > 32,000 variables Allows up to 2^63 - 1, or > 9 * 10^18, records
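For scale, the record limit quoted above works out as follows. This is plain arithmetic added for illustration, not a figure taken from the slides:

```latex
2^{63} - 1 = 9{,}223{,}372{,}036{,}854{,}775{,}807 \approx 9.22 \times 10^{18}
```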
Features in SPDE (cont’d) • Partitioned Data Sets xyz.dpf.c_meta.#.#.spds9
Features in SPDE (cont’d) • Partitioned Data Sets - example
Features in SPDE (cont’d) • Parallelism Sort, Index, Where I/O Loads
Features in SPDE (Summary) • Separate data and data descriptor • Two index files: global index and segment index • Volume span • Partition
Comparisons Between SPDE and Base Engine (cont’d) Testing environment: Windows 2000 SP4, dual CPU, 2 GB RAM
Comparisons Between SPDE and Base Engine (cont’d) Testing data: # of records ~110 K # of variables = 36 Size = 1.3 GB
Comparisons Between SPDE and Base Engine (cont’d) Testing data (cont’d): Indexed keys: Site Number, Subject Number Data SPDE_lib.xyz (index=(Site # Subject ID));
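A minimal sketch of how the two indexed keys might be declared when the table is written. The library path, the source table `work.raw`, and the variable names `sitenum` and `subjid` are assumptions for illustration; the slides do not give the actual names.

```sas
/* Hypothetical path and names, for illustration only */
libname spde_lib spde 'c:\meta';

/* Create one simple index per key while the table is written */
data spde_lib.xyz (index=(sitenum subjid));
    set work.raw;   /* assumed source table containing sitenum and subjid */
run;

/* Ask SAS to write informational notes, including index usage, to the log */
options msglevel=i;
```

Listing two names inside `index=( )` creates two simple indexes rather than one composite index, which matches the "indexed keys" wording on the slide.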
Comparisons Between SPDE and Base Engine (cont’d) Test data (cont’d):
Comparisons Between SPDE and Base Engine (cont’d) Test data (cont’d): (CPU, Real Time)
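One way to collect both figures is to let SAS report them in the log. The sketch below assumes the `lib_spde` library is already assigned and reuses the table name `lib_spde.all` from the slides; whether the original study collected its CPU figures this way is not stated.

```sas
/* Report real time and CPU time for every step in the log */
options fullstimer;

/* Read the whole table once without writing any output */
data _null_;
    set lib_spde.all;
run;
```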
Comparisons Between SPDE and Base Engine (cont’d) Test data (cont’d): (distribution)
%do j=1 %to &n;
  %let begin=%sysfunc(time());
  data lib_spde.one;
    set lib_spde.all;
    where sitenum=scan("&sites",&i);
  run;
  %let end=%sysfunc(time());
  %let _%scan(&sites,&i)_&j=%sysfunc(sum(&end,-&begin));
%end;
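The fragment above relies on an outer loop over `&i` that the slide does not show. A self-contained sketch of such a timing harness might look like the following. The macro name `time_sites`, its parameters, and the site values in the call are hypothetical, and `datetime()` is swapped in for `time()` so that a run crossing midnight does not produce a negative duration.

```sas
/* Hypothetical wrapper: the macro name and parameters are assumptions */
%macro time_sites(sites=, n=3);
    %local i j site begin end;
    %do i = 1 %to %sysfunc(countw(&sites));
        %let site = %scan(&sites, &i);
        %do j = 1 %to &n;
            %let begin = %sysfunc(datetime());
            data lib_spde.one;
                set lib_spde.all;
                where sitenum = "&site";   /* sitenum is treated as character */
            run;
            %let end = %sysfunc(datetime());
            /* Write the elapsed seconds for this site and repetition to the log */
            %put NOTE: site=&site run=&j elapsed=%sysevalf(&end - &begin) seconds;
        %end;
    %end;
%mend time_sites;

/* Example call with made-up site numbers */
%time_sites(sites=101 102 103, n=3)
```

Repeating each subset `n` times per site is what allows the spread of real times, and not just their average, to be compared between engines.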
Comparisons Between SPDE and Base Engine (cont’d) Test data (cont’d): double size = 2.6 GB
Comparisons Between SPDE and Base Engine (cont’d) Test data (cont’d): Observations 1. Real time with SPDE was approximately one third of that with the Base engine. 2. Real time with SPDE was much more stable than with the Base engine.
Application – Patient Profiles • What it looks like -
Application – Patient Profiles Where site =? Where patient =? One Data Set (SPDE) Data Modules ODS
Application – Patient Profiles (cont’d) • SPDE: makes a big difference! Real time was reduced from > 6 hrs to ~ 1 hr.
Application – Patient Profiles (cont’d) • Simple set-up: libname pp spde 'c:\meta' datapath=('d:\data') indexpath=('c:\index');
Application – Patient Profiles (cont’d) • Simple set-up (cont’d): data pp._rpt; set pp.all (where=(site id = ? & pat = ?)); run;
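The `?` placeholders would typically be filled in per profile. A minimal sketch, assuming character variables named `sitenum` and `pat` and made-up values; the slide shows the site variable only as "site id", so these names are guesses.

```sas
/* Assign the SPDE library as on the set-up slide */
libname pp spde 'c:\meta' datapath=('d:\data') indexpath=('c:\index');

%let site = 101;    /* hypothetical site value */
%let pat  = 1001;   /* hypothetical patient value */

/* Pull one patient's records; the WHERE= option can use the indexes */
data pp._rpt;
    set pp.all (where=(sitenum = "&site" and pat = "&pat"));
run;
```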
Conclusion - Efficiency was gained through SPDE in an ordinary SAS environment - Default options work!
Contact Haibin Shu Associate Director, Programming Duramed Research, Inc 1 Belmont Ave, 11th Floor Bala Cynwyd, PA 19004 Email: hshu@barrlabs.com
Q & A Any questions?