Yiming Peng Laboratory for Interdisciplinary Statistical Analysis Department of Statistics, Virginia Tech http://www.lisa.stat.vt.edu/ Nov, 2013 Introduction to Using JMP®
Outline • Introduction • Getting Started • Managing Data • Visualizing Data • Creating Summary Statistics • Performing Basic Statistical Analysis • Saving and Exporting Results • Resources
About JMP® • JMP was developed by SAS Institute Inc., Cary, NC • Using JMP statistical software, you can • Interact with your graphs and data to discover patterns and relationships in your data • See how the data and the model work together to produce the statistics • Perform statistical summary and analysis • No need to write computer code
JMP Download and Installation • JMP license information • All Virginia Tech researchers may purchase products directly from the Software Distribution Office at the Torgersen end of Torgersen Bridge • Price: $7 + tax • JMP 10 is available for both Windows and Mac • http://www2.ita.vt.edu/software/student/products/sas/jmp/index.html
Prerequisites • Before you begin using JMP, note the following information: • You can use many JMP features, such as data manipulation, graphs, and scripting features, without any statistical knowledge • A basic understanding of statistical concepts, such as mean and variance, is recommended • Analytical features require statistical knowledge appropriate for the feature
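As a quick refresher on the two concepts named above, here is a minimal Python sketch (not JMP code). The scores are invented for illustration, and the variance shown is the sample variance, which divides by n - 1.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [480, 520, 560, 600, 640]

mean = statistics.mean(scores)          # arithmetic average
variance = statistics.variance(scores)  # sample variance (divides by n - 1)

print(mean, variance)
```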
JMP Terminology • JMP platforms use these windows: • Launch windows, where you set up and run your analysis • Report windows, showing the output of your analysis • Report windows normally contain the following items: • A graph of some type (such as a scatterplot or a histogram) • Specific reports that you can show or hide using the disclosure button • Platform options that are located within red triangle menus
JMP Home Window (Windows Only) • Alt + Tab to switch among different windows • Ctrl + Q to quit
JMP Data Table • You can enter, view, edit, and manage data using data tables • In a data table, each variable is a column, and each observation is a row • To create a new data table: • Select File > New > Data Table • Ctrl + N • Click on the first icon below the File menu
JMP Data Table • This shows an empty data table with no rows and one numeric column, labeled Column 1
Entering Data • Manually: • Move the cursor onto a cell, click in the cell, and enter a value • Construct a formula to calculate column values: • Open the formula editor by right-clicking the name of the column to which you want to apply the formula and selecting Formula… • Or double-click the column name, then select Column Properties > Formula > Edit Formula • Select an empty formula element in the formula editing area by clicking it
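To make the idea of a formula column concrete, the following Python sketch (an analogy, not JMP's formula editor or scripting language) treats a data table as a list of rows and fills a new column by evaluating one expression per row. The helper `add_formula_column`, the column name "SAT Total", and the values are all hypothetical.

```python
# A data table as a list of rows; each row maps column name -> value.
# The values are invented for illustration.
table = [
    {"State": "Virginia", "SAT Verbal": 510, "SAT Math": 500},
    {"State": "Ohio", "SAT Verbal": 530, "SAT Math": 540},
]


def add_formula_column(rows, name, formula):
    """Evaluate `formula` on every row and store the result in column `name`."""
    for row in rows:
        row[name] = formula(row)


# The "formula" is applied row by row, like a column formula in a data table.
add_formula_column(table, "SAT Total", lambda r: r["SAT Verbal"] + r["SAT Math"])
print([row["SAT Total"] for row in table])
```

As with a formula column, the new values are derived from other columns in the same row rather than typed in by hand.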
Importing Data • You can import many file formats into JMP by default. For example: • Comma-separated (.csv) • .dat files that consist of text • Microsoft Excel 1997–2003 (.xls) • Microsoft Excel 2007 and 2010 (.xlsx) • Plain text (.txt) • SAS versions 6–9 on Windows (.sd2, .sd5, .sd7, .sas7bdat) • SPSS files (.sav)
Import from Excel Files • Select File > Open or press Ctrl + O • Or select all data in the Excel spreadsheet, copy, switch to JMP, create a new data table, and select Edit > Paste with Column Names • Exercise: • Open the SAT.xlsx Excel file in JMP • In the Open Data File window, change ‘All JMP Files’ to ‘All Files’ • Copy and paste the data in SAT.xlsx into a JMP data table
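Pasting "with column names" means the first line of the copied data supplies the column names and the remaining lines supply the rows. A rough Python analogy, using the standard `csv` module on an invented comma-separated snippet (not the actual SAT file):

```python
import csv
import io

# Stand-in for a small exported file; the contents are invented.
raw = "State,SAT Verbal,SAT Math\nVirginia,510,500\nOhio,530,540\n"

reader = csv.DictReader(io.StringIO(raw))  # the first line supplies column names
rows = list(reader)
columns = reader.fieldnames

print(columns)
print(rows[0])
```

Note that `csv` reads every value as text, so numeric columns still need converting, which is the kind of decision an import step has to make for each column.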
Data Table Panels • There are three data table panels • Table panel • Columns panel • Rows panel • The data table panels are arranged to the left of the data grid • These panels contain information about the table and its contents
JMP Modeling Types • The modeling type of a variable can be one of the following types, shown with its corresponding icon: • Continuous • Ordinal • Nominal • When you import data into JMP, it predicts which modeling types to use • Character data is considered nominal • Numeric data is considered continuous • To change the modeling type, click on the modeling type icon next to the variable and make your selection
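The import rule above (character data becomes nominal, numeric data becomes continuous) can be sketched in Python as follows. This is a simplification for illustration, not JMP's actual import logic, and the function name `guess_modeling_type` is hypothetical. Ordinal is never guessed here; it has to be chosen by hand.

```python
def guess_modeling_type(values):
    """Return 'Continuous' if every non-missing value parses as a number, else 'Nominal'."""
    present = [v for v in values if v not in ("", None)]
    try:
        for v in present:
            float(v)
    except (TypeError, ValueError):
        return "Nominal"
    return "Continuous"


print(guess_modeling_type(["510", "500", ""]))    # numeric text with a missing cell
print(guess_modeling_type(["Virginia", "Ohio"]))  # character data
```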
Access Sample Data Tables • All of the examples in the JMP documentation suite use sample data. To access JMP’s sample data tables, • Select Help > Sample Data. From here, you can: • Open the sample data directory • Open an alphabetized list of all sample data tables • Search for a sample data table within a category • Alternatively, the sample data tables are installed in the following directory: • On Windows: C:\Program Files\SAS\JMP\10\Samples • On Macintosh: /Library/Application Support/JMP/10/Samples
Adding Rows • To add one or multiple new empty rows, you can take one of the following actions: • Select Rows > Add Rows • Double-click an empty row number area below the last row to add that many empty rows • Double-click the gray lower triangular area in the upper left corner of the data grid. In the Add Rows… window, • Enter the number of rows to add • Specify where you would like to add them • Right-click in an empty row below the last row, and select Add Rows… • Enter the number of rows to add
Deleting Rows • To delete rows from the data grid, you can do one of the following: • Highlight the rows that you want to delete, then select Rows > Delete Rows • Right-click on the row numbers and select Delete Rows
Adding Columns • To add one or multiple new empty columns, you can take one of the following actions: • Select Cols > New Column • Double-click the empty space to the right of the last data table column • Select Cols > Add Multiple Cols… (or double-click the gray upper triangular area in the upper left corner of the data grid). In the Add Multiple Cols… window, • Enter the number of columns to add • Specify if they are to be grouped • Select a data type • Enter their location • Select the initial data values
Deleting Columns • To delete columns from the data grid, you can do one of the following: • Highlight the columns that you want to delete, then select Cols > Delete Columns • Right-click on the column names and select Delete Columns
Selecting/Deselecting Rows • Select or deselect rows: • Select Rows > Row Selection > Go to Row… to select a certain row number • Select Rows > Row Selection > Select All Rows • Select Rows > Clear Row States • Hold down Shift and click the gray lower triangular area in the upper left corner of the data grid to select all rows. Click again to deselect • To clear all highlights in the data table, press the ESC key on your keyboard
Selecting/Deselecting Columns • Select or deselect columns: • Select Cols > Go To… to select a certain column number or name • Hold down Shift and click the gray upper triangular area in the upper left corner of the data grid to select all columns. Click again to deselect • To clear all highlights in the data table, press the ESC key on your keyboard
Selecting Cells with Specific Values • Selecting cells that match the currently highlighted cell • Highlight the cells that contain the value(s) that you want to locate • Select Rows > Row Selection > Select Matching Cells • Selecting cells that contain specific values • Select Rows > Row Selection > Select Where
Show/Hide Data • You can hide rows and columns so that they are still included in analyses but do not appear in plots and graphs. To do so: • Select Hide/Unhide from the Rows menu or Cols menu • A mask icon appears beside the hidden row number or the column name, indicating that the row or column is hidden • To unhide rows or columns, select Hide/Unhide again
Include/Exclude Data • You can exclude data from calculations in analyses. For most platforms, excluded data are not hidden in plots. To exclude data: • Select Exclude/Unexclude from the Rows menu or Cols menu • A circle with a strikethrough appears beside either the row number or the column name, indicating that the row or column is excluded and not analyzed • To unexclude rows or columns, select Exclude/Unexclude again
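The two row states are easy to confuse: hiding affects what is drawn, excluding affects what is calculated. A minimal Python sketch of that distinction (an illustration, not how JMP stores row states; the `Row` class and the values are hypothetical):

```python
from dataclasses import dataclass


@dataclass
class Row:
    value: float
    hidden: bool = False    # not drawn in plots, still analyzed
    excluded: bool = False  # left out of calculations, still drawn


# Invented values: one ordinary row, one hidden row, one excluded row.
rows = [Row(500), Row(540, hidden=True), Row(900, excluded=True)]

plotted = [r.value for r in rows if not r.hidden]
analyzed = [r.value for r in rows if not r.excluded]
mean_analyzed = sum(analyzed) / len(analyzed)

print(plotted, analyzed, mean_analyzed)
```

To remove a point from both the plot and the calculation, a row needs both states set.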
Data Filter • The Data Filter gives you a variety of ways to identify subsets of data • Using Data Filter commands and options, you interactively select complex subsets of data, hide these subsets in plots, or exclude them from analyses • Select Rows > Data Filter
Data Filter • Exercise: Select data for Virginia • Open SAT data in JMP • Select Rows > Data Filter • Select State and click Add • Let’s check Select for Virginia • Can also check Show or Include • De-select? Click Clear • Choose another variable? Click Start Over
Data Filter • To select/show/include continuous variables such as time or weight, • Use sliders to control selection • Drag the end sliders to select the range you want • Need specific end points? Click on those values
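Conceptually, a data filter combines level choices for categorical columns with ranges for continuous columns. The Python sketch below illustrates that idea only; the function `data_filter`, its parameters, and the rows are hypothetical, and treating range end points as inclusive is an assumption.

```python
def data_filter(rows, levels=None, ranges=None):
    """Return rows matching every categorical and range condition.

    levels: {column: set of allowed values}
    ranges: {column: (low, high)}, both end points inclusive
    """
    levels = levels or {}
    ranges = ranges or {}
    kept = []
    for row in rows:
        in_levels = all(row[c] in allowed for c, allowed in levels.items())
        in_ranges = all(lo <= row[c] <= hi for c, (lo, hi) in ranges.items())
        if in_levels and in_ranges:
            kept.append(row)
    return kept


# Invented rows for illustration.
rows = [
    {"State": "Virginia", "SAT Math": 500},
    {"State": "Virginia", "SAT Math": 610},
    {"State": "Ohio", "SAT Math": 540},
]
virginia = data_filter(rows, levels={"State": {"Virginia"}})
mid_range = data_filter(rows, ranges={"SAT Math": (520, 620)})
print(len(virginia), [r["State"] for r in mid_range])
```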
Histograms • Histograms visually display the distribution of your data • For categorical (nominal or ordinal) variables, the histogram shows a bar for each level of the ordinal or nominal variable • For continuous variables, the histogram shows a bar for grouped values of the continuous variable • Select Analyze > Distribution
Histograms • Exercise: Create a histogram for SAT Math • Open SAT data in JMP • Select Analyze > Distribution • In the Select Columns box, select SAT Math, click Y, Columns, then click OK
Histograms • Interacting with the histogram • Change the orientation: • Click on the ▼ red triangle menu > Histogram Options > Vertical • Display the count within each bar: • Click on the ▼ red triangle menu > Histogram Options > Show Counts • Rescale the axis (continuous variables only): • Click and drag on an axis to rescale it • Hover over the axis until you see a hand, double-click on the axis, and set the parameters in the X Axis Specification window • Resize histogram bars (continuous variables only): • Click on the ▼ red triangle menu > Histogram Options > Set Bin Width • Hover over the axis until you see a hand, double-click on the axis, and set the increment in the X Axis Specification window
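Changing the bin width changes how continuous values are grouped into bars, and therefore the count in each bar. A small Python sketch of that grouping, assuming bins of the form [start, start + width) anchored at a chosen origin (the exact bin boundaries JMP uses may differ; the function `bin_counts` and the scores are invented):

```python
import math
from collections import Counter


def bin_counts(values, bin_width, origin=0.0):
    """Count values per bin [origin + k*width, origin + (k+1)*width)."""
    counts = Counter()
    for v in values:
        k = math.floor((v - origin) / bin_width)
        counts[origin + k * bin_width] += 1
    return dict(sorted(counts.items()))


scores = [480, 495, 510, 520, 560, 610]  # invented values
narrow = bin_counts(scores, bin_width=50)
wide = bin_counts(scores, bin_width=100)
print(narrow)
print(wide)
```

The same six values produce four bars at width 50 and three bars at width 100, which is why the shape of a histogram can change noticeably when the bars are resized.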
Histograms • Interacting with the histogram • Clicking on a histogram bar highlights the bar and selects the corresponding rows in the data table • The appropriate portions of all other graphical displays also highlight the selection
Scatterplots • Select Analyze > Fit Y by X • Exercise: Plot SAT Verbal vs. SAT Math • Select Analyze > Fit Y by X • Click SAT Verbal in the Select Columns box, then click Y, Response • Click SAT Math in the Select Columns box, then click X, Factor • Click OK
Scatterplots • Interacting with the scatterplots • Suppose we are interested in the point with both SAT Math and SAT Verbal greater than 600 • Point at it and click on it • The point is highlighted • The corresponding row (row 274) is also highlighted in the data table
Scatterplots • Interacting with the scatterplots • Suppose we are interested in all the points with both SAT Math and SAT Verbal > 580 • Shift-click on all the points that satisfy this condition • Or, drag a box over all these points • To deselect, Ctrl-click
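Dragging a box over the upper-right corner of the plot is equivalent to testing each row against a threshold on both axes. A minimal Python sketch of that test, with invented points and a hypothetical helper `select_box` (JMP does this interactively; no code is needed there):

```python
def select_box(points, x_min, y_min):
    """Return 1-based row numbers of (x, y) points with x > x_min and y > y_min."""
    return [i for i, (x, y) in enumerate(points, start=1) if x > x_min and y > y_min]


# (x, y) pairs standing in for two score columns; the values are invented.
points = [(500, 510), (590, 600), (610, 570), (650, 640)]
selected = select_box(points, 580, 580)
print(selected)
```

The returned row numbers correspond to the rows that would be highlighted in the data table.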
Scatterplots • Interacting with the scatterplots • Color the selected points red and change the symbol to an empty circle • Right click on the scatterplot • Row Colors • Row Markers • etc.
Scatterplots • Interacting with the scatterplots • Suppose those highlighted points are considered ‘outliers’ and need to be removed from the plot (or the analysis) • Right click on the scatterplot • Row Hide • Row Exclude • ▼ Red triangle menu > Script > Redo Analysis to update the plot
Scatterplot Matrix • Using the Scatterplot Matrix platform, you can assess the relationships between multiple variables simultaneously • A scatterplot matrix is an ordered collection of bivariate graphs • Select Graph > Scatterplot Matrix • Or select Analyze > Multivariate Methods > Multivariate (continuous data only) • Exercise: • Help > Sample data > Iris • Select Sepal length, Sepal width, Petal length, and Petal width and click Y, Columns • Select Species and click Group • Click OK
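To see why a scatterplot matrix is an "ordered collection of bivariate graphs", it helps to list the column pairs. The Python sketch below lists each pair once, which corresponds to a triangular layout; whether a given matrix shows each pair once or twice depends on its layout options, so treat the count as an assumption about a triangular display.

```python
from itertools import combinations

columns = ["Sepal length", "Sepal width", "Petal length", "Petal width"]

# Each unordered pair of columns corresponds to one bivariate panel.
panels = list(combinations(columns, 2))
print(len(panels))
for x, y in panels:
    print(f"{y} vs. {x}")
```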
Scatterplot Matrix • To make the groupings stand out, you can: • From the ▼ red triangle menu, select Density Ellipses • From the ▼ red triangle menu, select Shaded Ellipses
Scatterplot 3D • The Scatterplot 3D platform shows the values of numeric columns in the associated data table in a rotatable, 3D view • Select Graph > Scatterplot 3D • Exercise: • Help > Sample data > Iris • Select Graph > Scatterplot 3D • Select Sepal length, Sepal width, Petal length, and Petal width and click Y, Columns • Click OK
Scatterplot 3D • Information Displayed on the Scatterplot 3D Report
Scatterplot 3D • Normal Contour Ellipsoids • Exercise: Grouped normal contour ellipsoids • The ellipsoids cover 75% of the data points and are 50% transparent • The contours are color-coded based on species • Help > Sample data > Iris • Select Graph > Scatterplot 3D • Select Sepal length, Sepal width, Petal length, and Petal width and click Y, Columns • Click OK • ▼ Red triangle menu > Normal Contour Ellipsoids • Select Grouped by Column • Select Species • Type 0.75 next to Coverage • Type 0.5 next to Transparency • Click OK
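The slides do not give a formula for these ellipsoids. Assuming they follow the usual multivariate normal construction (an assumption, not something stated in the source), the contour with coverage $p$ for one group in the three plotted dimensions is the set

```latex
\[
  E_p = \left\{ x \in \mathbb{R}^3 :
        (x - \hat{\mu})^{\top} \hat{\Sigma}^{-1} (x - \hat{\mu})
        \le \chi^2_{3,\,p} \right\},
  \qquad p = 0.75,
\]
```

where $\hat{\mu}$ and $\hat{\Sigma}$ are the sample mean vector and covariance matrix of the group (here, one Species), and $\chi^2_{3,\,p}$ is the $p$-quantile of the chi-square distribution with 3 degrees of freedom. Under this construction, if the group is roughly trivariate normal, about 75% of its points fall inside the ellipsoid.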
Scatterplot 3D • Example of Grouped Normal Contour Ellipsoids
Scatterplot 3D • If we select Nonpar Density Contour instead of Normal Contour Ellipsoids, we can create nonparametric density contours
Variability Charts • The variability charts are used when we have multiple categorical x variables and one y variable • Select Analyze > Quality and Process > Variability / Attribute Gauge Chart • Exercise: • Help > Sample data > Car Physical Data • Select Variability / Attribute Gauge Chart • Select Weight as Y, Response, and Country and Type as X, Grouping • Click OK
Variability Charts • From the ▼ red triangle menu, you can • Connect Cell Means (blue lines are added) • Uncheck Show Range Bars (easier to see points) • Show Group Means (purple lines are added)
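The quantities these options draw can be computed by hand. The Python sketch below uses invented weights (not the Car Physical Data values) and assumes that a "cell" is one Country and Type combination and a "group" is one Country; that reading of the chart is an assumption.

```python
from collections import defaultdict
from statistics import mean

# (Country, Type, Weight) rows; the values are invented for illustration.
rows = [
    ("Japan", "Small", 2200), ("Japan", "Small", 2400),
    ("Japan", "Medium", 3000),
    ("USA", "Small", 2600),
    ("USA", "Medium", 3200), ("USA", "Medium", 3400),
]

cells = defaultdict(list)   # one cell per (Country, Type) combination
groups = defaultdict(list)  # one group per Country

for country, car_type, weight in rows:
    cells[(country, car_type)].append(weight)
    groups[country].append(weight)

cell_means = {k: mean(v) for k, v in cells.items()}           # joined by cell-mean lines
cell_ranges = {k: max(v) - min(v) for k, v in cells.items()}  # length of each range bar
group_means = {k: mean(v) for k, v in groups.items()}         # group-mean lines

print(cell_means)
print(cell_ranges)
print(group_means)
```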
Bubble Plots • A bubble plot is a scatter plot that represents its points as circles, or bubbles. You can use bubble plots to: • dynamically animate bubbles using a time variable, to see patterns and movement across time • use size and color to clearly distinguish between different variables • Bubble plots can produce dramatic visualizations and readily show patterns and trends • Select Graph > Bubble Plot