IML Workshop 2.0
May 21, 2003
Charlie Hallahan
IML Workshop Programming Environment*
• Program editor, output window, error log window
• Program editor color-codes IML syntax
• Editable output window
• Multiple independent program environments
• Multithreaded
* These notes are based on a tutorial at SUGI 28 presented by Simon Smith and Rick Wicklin of SAS Institute.
Demo
The IML Workshop Environment
[Screenshot with four labeled windows: plot window, program editor, data table window, output window]
Programming with IMLPlus
• Use same syntax as for PROC IML
• Execute data/procedure steps from PROC IML
• Create high-level statistical graphics
• Call modules without explicitly loading them
• Call external functions written in C/Fortran/Java
Graphics Programming with IMLPlus
• Create and manipulate graphics using objects
• Use declare statement to define variables that refer to objects
    declare ScatterPlot plot;
• Call Create method to construct new object
    plot = ScatterPlot.Create("Example", x, y);
• Use dot syntax to invoke methods
    plot.ShowAxisReferenceLines(YAXIS);
Demo
Creating Graphs Using IMLPlus, Part 1
    /* BasicGraphs2.iml */
    x = 1:200;
    e = 30 * normal( x );
    y = x + e;
    declare ScatterPlot splot;
    splot = ScatterPlot.Create( "Example", x, y );
    declare Histogram hist;
    hist = Histogram.Create( "Error Distribution", e );
These graphs are not dynamically linked.
Demo
Dynamic Linking Infrastructure
• DataObject and DataView classes are the backbone
• DataObject contains the data being displayed
• DataView renders the data in a specific way
[Diagram: a ScatterPlot and a Histogram]
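The split between a data holder and its views can be pictured with a small Python sketch. This is an analogy only, not IMLPlus code: the names `SharedData`, `View`, `attach`, `select`, and `refresh` are hypothetical, and the real DataObject and DataView classes may work differently internally. The idea is that one object owns the observations and the selection state, and every attached view re-renders when that state changes.

```python
class SharedData:
    """Owns the observations and the current selection (hypothetical name)."""

    def __init__(self, columns):
        self.columns = columns      # dict: variable name -> list of values
        self.selected = set()       # indices of the selected observations
        self._views = []            # views to notify when the selection changes

    def attach(self, view):
        self._views.append(view)

    def select(self, indices):
        # Update the shared selection, then tell every attached view.
        self.selected = set(indices)
        for view in self._views:
            view.refresh()


class View:
    """Stands in for a graph of one variable; records what it would highlight."""

    def __init__(self, data, var):
        self.data = data
        self.var = var
        self.highlighted = []
        data.attach(self)

    def refresh(self):
        values = self.data.columns[self.var]
        self.highlighted = [values[i] for i in sorted(self.data.selected)]


data = SharedData({"X": [1, 2, 3, 4], "E": [0.5, -1.2, 0.3, 2.0]})
scatter = View(data, "X")
hist = View(data, "E")

data.select([1, 3])          # one selection, made once on the shared data
print(scatter.highlighted)   # [2, 4]
print(hist.highlighted)      # [-1.2, 2.0]
```

Because both views read the same selection, a change made through one object shows up in every graph built on it, which is the behavior the slides call dynamic linking.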
The DataObject Class
• Methods for adding data from matrices
• Methods for loading SAS data sets
• Methods for accessing data/metadata
• Every graph is based on a DataObject
The DataView Class Hierarchy
DataView
    DataTable
    Plot
        Plot2D
            BarChart
            BoxPlot
            ContourPlot
            Histogram
            LinePlot
            MosaicPlot
            PolygonPlot
            ScatterPlot
        Plot3D †
            RotatingPlot
Creating Graphs Using IMLPlus, Part 2
    /* BasicGraphs3.iml */
    x = 1:200;
    e = 30 * normal( x );
    y = x + e;
    declare DataObject dobj;
    dobj = DataObject.Create( "Data", {"X", "Y", "E"}, x`||y`||e` );
    declare ScatterPlot splot;
    splot = ScatterPlot.Create( dobj, "X", "Y" );
    declare Histogram hist;
    hist = Histogram.Create( dobj, "E" );
These graphs are dynamically linked.
Demo
Color-Coding Observations by Interval Data
    …
    colormatrix = {255 0 0, 255 255 0, 255 0 0};
    run ColorCode(dobj,"E",colormatrix,9);
    splot.SetMarkerSize(7);
IMLPlus Module Library
• Collection of prewritten modules
• Similar to IMLMLIB modules
• DrawLegend module adds legends to a graph
• DoDialog… and DoMessageBox… modules enable interaction with the user
Graphing Data Stored in SAS Data Sets
    declare DataObject dobj;
    dobj = DataObject.CreateFromFile("Business.sas7bdat");
    declare ScatterPlot plot;
    plot = ScatterPlot.Create(dobj,"Sales","Profits");
    plot.SetObsLabelVar("Company");
Demo
Creating Graphs Using the GUI
• Use File menu to open SAS data set (data table appears)
• Use Plot menu to create graphs
• Can use graph's GUI to customize appearance
• Can copy and paste graphs into other Windows apps
Modifying Graph Properties Using GUI
• Graph window divided into Plot Area and Graph Area
• Right-click area to change properties
• Right-click axis to modify axis properties
• Right-click axis title to change text/font
Local Selection
• Technique for using graphs to filter data
• Terminology: Selector view, Observer view
• Each graph is either Selector view or Observer view
• Selector views enable user to select observations
• Observer views display either union or intersection of selections
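The union and intersection rule for Observer views can be sketched in a few lines of Python. This is an illustration only, not the IMLPlus API: the function name `observer_display` and its `mode` argument are hypothetical, and each selector view's selection is assumed to be a set of observation indices.

```python
def observer_display(selections, mode="intersection"):
    """Return the observation indices an observer view would show.

    selections: one collection of selected indices per selector view.
    mode: "union" or "intersection" (hypothetical parameter).
    """
    sets = [set(s) for s in selections]
    if not sets:
        return set()
    if mode == "union":
        return set().union(*sets)
    if mode == "intersection":
        return set.intersection(*sets)
    raise ValueError("mode must be 'union' or 'intersection'")


# Two selector views, each with its own selection.
view_a = {1, 2, 3, 5}
view_b = {2, 3, 4}

print(sorted(observer_display([view_a, view_b], "intersection")))  # [2, 3]
print(sorted(observer_display([view_a, view_b], "union")))         # [1, 2, 3, 4, 5]
```

An intersection narrows the data to observations that satisfy every selection at once, while a union shows everything picked in any selector view.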
Local Selection Basics
[Diagram: two Selector views, each with an arrow pointing to one Observer view (intersection)]
Executing Procedures from IMLPlus
• Execute any SAS statements from IMLPlus
• Use submit and endsubmit statements to delineate SAS code block
• Use data sets to pass data
• Write modules that use procedures
IMLPlus Module to Compute Skewness
    /* CalcSkewness.iml */
    start CalcSkewness( x );
       create InputDataSet var {x};
       append;
       close InputDataSet;
       submit;
          proc univariate data=InputDataSet noprint;
             var x;
             output out=OutputDataSet skewness=skewness;
          quit;
       endsubmit;
       use OutputDataSet;
       read var {skewness};
       close OutputDataSet;
       return (skewness);
    finish;

    v = { 2, 4, 6, 3, 1 };
    y = CalcSkewness( v );
    print y;
Output:
    Y
    0.5901287
Demo
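The printed value can be checked by hand. The Python sketch below assumes the procedure reports the bias-adjusted sample skewness, n / ((n-1)(n-2)) * sum((x - mean)^3) / s^3, where s is the sample standard deviation with divisor n-1. The function name `sample_skewness` is hypothetical and the code is not part of the original program.

```python
import math


def sample_skewness(values):
    """Bias-adjusted sample skewness (assumed to match the slide's output)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    # Sample standard deviation with divisor n - 1.
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if s == 0:
        raise ValueError("skewness is undefined for constant data")
    third = sum((v - mean) ** 3 for v in values)
    return n / ((n - 1) * (n - 2)) * third / s ** 3


v = [2, 4, 6, 3, 1]
print(f"{sample_skewness(v):.7f}")  # 0.5901287, matching the slide
```

For these five values the mean is 3.2, the sum of squared deviations is 14.8, and the sum of cubed deviations is 10.08, which gives the figure shown in the output.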
Putting the Pieces Together
• Filtering Data Using Graphics
• Interacting with Programs
• Integrating Statistical Results and Graphics
Interacting with Programs
• Choose aspects of analysis at run-time
• Data sets, parameters, PROC options, …
• Display dialog boxes to prompt for input
• Present menu of choices for analysis
Interacting with Programs
    /* SummaryStatsCalculator.iml
       This program allows you to choose a data set,
       to select a numeric variable of interest, and
       to see summary statistics for that variable.
       The statistics are MIN, MAX, N, NMISS,
       MEDIAN, MEAN, STD, and SUM */
    run GetPersonalFilesDirectory( path );
    dir = path + "Data Sets";
    ok = DoDialogGetOpenDataSetFileName( dataset,
            "Choose a Data Set", dir );
    if ok=false then do;
       reset log;
       print "Abort. No data set chosen.";
       abort;
    end;
    declare DataObject dobj;
    dobj = DataObject.CreateFromFile( dataset );
Interacting with Programs
    /* get list of numeric variables */
    numVars = dobj.GetNumVar();
    /* blank strings of length 32, so that names are not truncated */
    varNames = j(numVars,1,"                                ");
    do i = 1 to numVars;
       name = dobj.GetVarName(i);
       if dobj.IsNumeric(name) then
          varNames[i] = name;
    end;
    numericVarIndex = loc( varNames ^= " " );
    numNumericVars = ncol( numericVarIndex );
    if numNumericVars=0 then do;
       reset log;
       print "Abort. The data set does not contain",
             "any numeric variables. Abort";
       abort;
    end;
    varNames = varNames[numericVarIndex];
Interacting with Programs
    /* display list in a dialog */
    ok = DoDialogGetListItem( selection,
            "Numeric Variables",
            "Choose a variable to display summary statistics",
            varNames );
    varName = varNames[selection];
    dobj.GetVarData( varName, x );
    statNames = {'MIN', 'MAX', 'N', 'NMISS',
                 'MEDIAN', 'MEAN', 'STD', 'SUM' };
    stats = j(8,1,.);
    stats[1] = min( x );
    stats[2] = max( x );
    stats[3] = nrow( x );
    stats[4] = ncol( loc(x=.) );
    stats[5] = median( x );
    stats[6] = x[:];
Interacting with Programs
    nonMissing = stats[3]-stats[4]; /* N - NMISS */
    if nonMissing > 1 then do;
       xCenter = x - stats[6];
       stats[7] = sqrt( xCenter[##] / (nonMissing-1) );
    end;
    stats[8] = sum( x );
    print stats[rowname=statNames colname=varName];
    if dobj.IsNominal(varName) then
       BarChart.Create( dobj, varName );
    else
       Histogram.Create( dobj, varName );
Demo
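The eight statistics in the program above can be mirrored in a short Python sketch, which may help when checking its output by hand. This is not IMLPlus code. It assumes that missing values (represented here as `None`) are excluded from every statistic except N, that N counts all rows as `nrow( x )` does above, and that STD uses the divisor (N - NMISS - 1). The function name `summary_stats` is hypothetical.

```python
import math
import statistics


def summary_stats(values):
    """Compute MIN, MAX, N, NMISS, MEDIAN, MEAN, STD and SUM for one variable.

    None marks a missing value (an assumption of this sketch).
    """
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("no nonmissing values")
    n_total = len(values)                 # N counts every row, missing or not
    n_miss = n_total - len(present)
    mean = sum(present) / len(present)
    std = None                            # stays missing with fewer than 2 values
    if len(present) > 1:
        centered_ss = sum((v - mean) ** 2 for v in present)
        std = math.sqrt(centered_ss / (len(present) - 1))
    return {
        "MIN": min(present),
        "MAX": max(present),
        "N": n_total,
        "NMISS": n_miss,
        "MEDIAN": statistics.median(present),
        "MEAN": mean,
        "STD": std,
        "SUM": sum(present),
    }


stats = summary_stats([2, 4, None, 6, 3, 1])
for name, value in stats.items():
    print(name, value)
```

With one missing value among six rows, N is 6 and NMISS is 1, so the standard deviation is computed from the five remaining values with divisor 4.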
Integrating Statistical Results and Graphics
• Call procedures in other SAS products without losing state of IML program
• Read results into IML Workshop
• Use results in program or to augment graph
• Example: Add nonparametric smoother with confidence limits to ScatterPlot
Add Kernel Density Estimation to Histogram
    /* KDE.iml: create a histogram */
    declare DataObject dobj;
    dobj = DataObject.CreateFromFile( "Colleges.sas7bdat" );
    declare Histogram hist;
    hist = Histogram.Create( dobj, "RoomBd" );
    hist.ShowDensity(); /* default is frequency */
    /* set up libref to data set directory */
    run GetPersonalFilesDirectory( dir );
    dir = dir + "Data Sets";
    dir = "d:\charlie\iml Workshop";
    libname dataDir (dir);
Add Kernel Density Estimation to Histogram
    /* compute kernel density estimation */
    /* This KDE syntax is for SAS 8.2 */
    submit;
       ods exclude all; /* turn off printing */
       proc kde data=dataDir.Colleges out=KDEOut;
          var RoomBd;
       quit;
    endsubmit;
    /* read the output from the KDE procedure */
    use KDEOut;
    read all var{RoomBd density};
    close KDEOut;
    /* overlay on existing Histogram of RoomBd */
    hist.DrawUseDataCoordinates();
    hist.DrawLine( RoomBd, density );
Demo
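For readers who want to see what a kernel density estimate is, the Python sketch below evaluates a Gaussian kernel estimate on a grid. It is not what PROC KDE runs: the procedure chooses its own bandwidth and grid, whereas here the bandwidth is passed in by hand, and the sample values and the name `gaussian_kde` are made up for illustration.

```python
import math


def gaussian_kde(sample, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each grid point.

    The bandwidth is supplied by the caller (an assumption of this sketch).
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    n = len(sample)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - x) / bandwidth) ** 2) for x in sample)
        for g in grid
    ]


# Made-up sample; the grid extends well past the data on both sides.
sample = [3.1, 4.0, 4.4, 5.2, 6.8]
step = 0.05
grid = [-5.0 + i * step for i in range(401)]   # -5.0 to 15.0
density = gaussian_kde(sample, grid, bandwidth=1.0)

# A density should integrate to about 1 (simple Riemann sum).
area = sum(density) * step
print(round(area, 3))
```

The pairs (grid, density) play the same role as the RoomBd and density columns read from KDEOut: they are the x and y coordinates of the curve drawn over the histogram, which is why the histogram is switched to the density scale first.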
Add Nonparametric Smoother to ScatterPlot
    /* TPSpline.iml */
    /* set up libref to data set directory */
    run GetPersonalFilesDirectory( dir );
    dir = dir + "Data Sets";
    dir = "d:\Charlie\IML Workshop";
    libname dataDir (dir);
    /* fit thin-plate spline and ask for 95% confidence
       limits for the predicted mean. */
    submit;
       ods exclude DataSummary FitSummary FitStatistics;
       proc tpspline data=dataDir.Colleges(where=(PubPriv=1));
          model PctPhD=(SAT);
          output out=pred predicted lclm uclm;
       quit;
    endsubmit;
    /* read work.pred */
    declare DataObject dobj;
    dobj = DataObject.CreateFromServerDataSet( "work.pred" );
Add Nonparametric Smoother to ScatterPlot
    /* manufacture names output by proc */
    xVar = "SAT";
    yVar = "PctPhD";
    predName = "p_" + yVar;
    lclName = "lclm_" + yVar;
    uclName = "uclm_" + yVar;
    /* create scatter plot of Y versus X */
    declare ScatterPlot plot;
    plot = ScatterPlot.Create( dobj, xVar, yVar );
    plot.SetAxisViewRange(YAXIS, 0, 100);
    /* get predicted values and confidence limits */
    dobj.Sort( xVar );
    dobj.GetVarData( xVar, x );
    dobj.GetVarData( predName, pred );
    dobj.GetVarData( lclName, lower95 );
    dobj.GetVarData( uclName, upper95 );
Add Nonparametric Smoother to ScatterPlot
    /* plot the predicted values in red */
    plot.DrawUseDataCoordinates();
    plot.DrawSetPenColor( RED );
    plot.DrawLine( x, pred );
    /* plot the confidence limits in blue */
    plot.DrawSetPenColor( BLUE );
    plot.DrawLine( x, lower95 );
    plot.DrawLine( x, upper95 );
    /* Add legend at Inside Right Bottom position */
    labels = { "Predicted" "95% Lower CLM" "95% Upper CLM" };
    colors = RED || BLUE || BLUE;
    run DrawLegend( plot, labels, 14, colors, SOLID, NULL, -1, "IRB" );
Demo