1 / 15

Combining the Power of R and Excel: RExcel A LISA Short Course February 2012

Combining the Power of R and Excel: RExcel A LISA Short Course February 2012. Matthew Lanham Ph.D. Student, Business Information Technology M.S. Student, Statistics. As you come in, please get materials here: https ://filebox.vt.edu/users/lanham/LISA/. Motivation for this course :.

fallon
Download Presentation

Combining the Power of R and Excel: RExcel A LISA Short Course February 2012

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Combining the Power of R and Excel: RExcelA LISA Short CourseFebruary 2012 Matthew Lanham Ph.D. Student, Business Information Technology M.S. Student, Statistics As you come in, please get materials here: https://filebox.vt.edu/users/lanham/LISA/

  2. Motivation for this course: Two Facts • Excel is the most prevalent software used for data storage and analysis. There are a lot of built in statistical functions in Excel along in addition to the “Analysis ToolPak.” • R is a free and open source program, and one of the most powerful and the fastest-growing statistics programs. Why not use them both together!! This is you with Excel This is you with Excel + R Outcome from this course: I hope to have provided you some examples that you might incorporate in your own work that might prove beneficial.

  3. Lets get started: 1) Double-click the RExcel2010 with Rcommander Icon This will open Excel and Rcommander. R commander is like using the standard R GUI, but looks a bit different. You will find R in the Excel Ribbon as well.

  4. Part 1 Transferring data between R and Excel • RExcel Drop-down • Close R – Will close the open instance of R and Rcommander as well • Run Code – Will run R code • Get R Value (Array or Dataframe) – Gets data • Put R Value (Array or Dataframe) – Defines a cell or range for R • Get R Output – Retrieves code output from R to Excel • Set R working dir – Define the folder location you want to work from on your PC. • Load R file – Used to load a data set or .R file • Copy code – copies code in Excel • Debug R – If checked, this will open a debugger if an error occurs • Error log – This will show you all the R errors • Options – Offers a few basic options • Set R sever – allows to select the server type, server name (for remote servers), and R process name (for servers from a serverpool). • RExcel Help – Takes you here: file:///C:/Program%20Files%20%28x86%29/RExcel/doc/RExcel.html • Rhelp – Takes you here: http://127.0.0.1:18357/doc/html/index.html • Rcommander – Opens Rcommander with menus in the Excel Ribbon or in Rcommander. • Demo worksheets – There are five demos for learning how to use the software • Mark calc cells – If activated, this will mark all cells containing calculated results with a special marker in the upper left corner • About RExcel • Data from Excel to R • Data from R to Excel

  5. Part 1 Functions, Arrays, and Dataframes • Advantages • Use Excel as a container for dependencies • Use R code functions without lengthy “IF” statements • Allows automatic recalculations via Excel’s computation engine(R will not do this by itself) See RExcelExamples workbook, Part1 tab

  6. Regression: Excel and R Part 2 Excel • Excel Functions TREND(Y-range, X-range, X-value for prediction) function LINEST(Y-range, X-range, Const, Stats) array function • Excel’s Analysis TookPak Data -> Data Analysis -> Regression -> Then fill in the dialog box (see example sheet) R • Use Rcommander "Statistics" -> "Fit models" -> "Linear regression.." • Use R code via RExcel • Benefits of each: Use what you like and is more advantageous to your problem • The Excel functions automatically update • Analysis TookPak outputs the statistics in a nice readable table • Rcommander has nice drop-down menus • R provides plots that are not easily available via Excel alone • R is more extensible and allows more advanced modeling

  7. Part 2 Regression: Assumption Review Gauss-Markov Theorem Tells us that our OLS estimators (our intercept and slope) are unbiased and have minimum variance among all linear unbiased estimators IF… Two assumptions: (1) Independence => = 0 (2) Equal variance (aka. Homoscedasticity, same finite variance) => To make tests inferences, do statistical tests, and create confidence intervals, we need to assume a third condition: (3) Error is normally distributed => )

  8. Part 3 Regression: Assumption investigation (a.k.a. Diagnostics) • Linear relationship among Sales and Advertising looks fine. • What do you think about our independence assumption? • What do you think about the constant finite variance assumption? • What about normality? • Anything else stand out?

  9. Part 3 Regression: Fit without influential points Here we see our new fitted line, in addition to how well our model performed at estimating sales.

  10. Part 3 Regression: More diagnostics • Linear relationship among Sales and Advertising is fine. • What do you think about our independence assumption? • What do you think about the constant finite variance assumption? • What about normality? • Anything else stand out?

  11. Part 3 Regression: Interpretation The F and Significant F tell us that our slope is statistically significant. Meaning, it is highly unlikely it is 0.

  12. Part 4 Using R commander • Obtain data sets from R libraries or load in your own • Nice drop-down for basic statistics and plots (code prints to R commander window) • Common distributions are available via drop-down See RExcelExamples workbook, Part4 tab

  13. Part 5 Using built-in R commander plug-ins These plots are useful, but somewhat dull. The code that generates these will show up in the R commander window (very useful for newbies). Like plotting in Excel, you can get what you need by default, but you’ll probably have to modify the graph a bit. Lets look at RmcdrPlugin.HH XY conditioning plot (HH) Side-by-side Boxplot

  14. Additional References • http://rcom.univie.ac.at/RExcelDemo/

More Related