
Bare-Bones R

Bare-Bones R. A Brief Introductory Guide Thomas P. Hogan University of Scranton 2010 All Rights Reserved. Citation and Usage. This set of PowerPoint slides is keyed to Bare-Bones R: A Brief Introductory Guide, by Thomas P. Hogan, SAGE Publications, 2010.


Presentation Transcript


  1. Bare-Bones R A Brief Introductory Guide Thomas P. Hogan University of Scranton 2010 All Rights Reserved

  2. Citation and Usage This set of PowerPoint slides is keyed to Bare-Bones R: A Brief Introductory Guide, by Thomas P. Hogan, SAGE Publications, 2010. All are welcome to use and/or adapt the slides without seeking further permission but with the usual professional acknowledgment of source.

  3. Part 1: Base R • 1-1 What is R • A computer language, with orientation toward statistical applications • Relatively new • Growing rapidly in use

  4. 1-2 R’s Ups and Downs • Plusses • Completely free, just download from Internet • Many add-on packages for specialized uses • Open source • Minuses • Obscure terms, intimidating manuals, odd symbols, inelegant output (except graphics)

  5. 1-3 Getting Started: Loading R • Have Internet connection • Go to http://cran.r-project.org/ • R for Windows screen, click “base” • Find, click on download R • Click Run, OK, or Next for all screens • End up with R icon on desktop

  6. At http://cran.r-project.org/

  7. Downloading Base R [Figs 1.1 – 1.4] • Click on Windows • Then in next screen, click on “base” • Then screens for Run, OK, or Next • And finally “Finish” • will put R icon on desktop

  8. What You Should Have when clicking on R icon: Rgui and R Console, ending with R prompt (>) [Fig 1.5]

  9. The R prompt (>) • > This is the “R prompt.” It says R is ready to take your command.

  10. 1-4 Using R as Calculator • Enter these after the prompt, observe output >2+3 >2^3+(5) >6/2+(8+5) >2 ^ 3 + (5)

  11. More as Calculator • You can copy and paste, but don’t include the > • Use # at end of command for notes, e.g. > (22+ 34+ 18+ 29+ 36)/5 #Calculating the average, aka mean • R as calculator: Not very useful
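
The calculator commands on the two slides above can be collected into a short script, typed without the > prompt. This is only a minimal sketch; the object name avg is an illustrative assumption and does not appear in the slides.

```r
# Basic arithmetic; R follows the usual order of operations
2 + 3            # 5
2^3 + (5)        # 13
6 / 2 + (8 + 5)  # 16
2 ^ 3 + (5)      # 13 again: blanks inside a command do not matter

# Text after # is ignored by R, so it can hold notes
avg <- (22 + 34 + 18 + 29 + 36) / 5  # calculating the average, aka mean
avg                                  # 27.8
```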

  12. 1-5 Creating a Data Set • > Scores = c(22, 34, 18, 29, 36) • c means “concatenate” in R; in plain English, “treat as data set” • Now do: >Scores • R will print the data set

  13. Important Rules • We created a variable • Variable names are case sensitive • No blanks in name (can use _ or . to join words, but not -) • Start with a letter (cap or lc) • Can use <- instead of =

  14. Another variable • Create SCORES, using <- > SCORES<-c(122, 134, 118, 129, 124) • NB: SCORES is different from Scores • Check with: >SCORES >Scores
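
The naming rules can be checked directly. The sketch below reuses the two variables from the slides; the names Test_Scores and test.scores are hypothetical, chosen only to illustrate the allowed joining characters.

```r
Scores = c(22, 34, 18, 29, 36)         # = and <- both assign
SCORES <- c(122, 134, 118, 129, 124)

# Names are case sensitive, so these are two separate variables
identical(Scores, SCORES)              # FALSE

# Underscore and period are allowed inside a name; a hyphen is not,
# because R would read Test-Scores as a subtraction
Test_Scores <- Scores
test.scores <- Scores
```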

  15. Non-numeric Data • Enclose in quotes, single or double (plain straight quotes, not the curly quotes a word processor inserts) • Separate entries with comma • Example: > names = c("Mary", "Tom", "Ed", "Dan", "Meg")
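
A short sketch of the non-numeric example, with two standard base R checks (length and class) added for illustration:

```r
# Character data: each entry in straight quotes, separated by commas
names = c("Mary", "Tom", "Ed", "Dan", "Meg")
names           # prints the five names

length(names)   # 5 entries
class(names)    # "character", i.e. non-numeric data
```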

  16. Saving Stuff • To exit: either X or quit ( ) • Brings up this screen: • Do what you want: Yes or No • Do Yes, • then re-open R, get Scores & names

  17. Special Note on Saving • Previous slide assumes you control computer • If not, use File, Save Workspace, name file, click Save • Works much like saving a file in Microsoft • To retrieve, do File, Load Workspace, find file, click Open

  18. 1-6 Using R Functions: Simple Stuff • Commands for mean, sd, summary (NB: function names case sensitive) • mean(Scores) • sd(Scores) • summary(Scores) • Command for correlation • cor(Scores,SCORES)
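
Put together as a script, the four function calls on this slide look as follows. This is a sketch using the Scores and SCORES variables created earlier in the deck.

```r
Scores <- c(22, 34, 18, 29, 36)
SCORES <- c(122, 134, 118, 129, 124)

mean(Scores)          # 27.8
sd(Scores)            # sample standard deviation (n - 1 in the denominator)
summary(Scores)       # minimum, quartiles, median, mean, maximum

# Correlation needs two variables of the same length
cor(Scores, SCORES)
```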

  19. R functions • A zillion of ‘em • R’s big strength, most common use • For examples, on the menu bar: • Help • R functions (text) • Enter name of a function (e.g., sd) • Yields lots (!) of information

  20. 1-7 Reading in Larger Data Sets • In Excel, enter (or download) the SATGPA20 file • Save as .xls • Then save as Text (tab delimited) file • Will have .txt extension

  21. … Larger Data Sets: The read.table command • Now read into R like this: >SATGPA20R=read.table("E:/R/SATGPA20.txt", header =T) • Need exact path, in quotes • header = T • T or TRUE, F or FALSE • Depends on opening line of file
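
Since the SATGPA20 file is not reproduced in these slides, the sketch below writes a small tab-delimited file to a temporary location and reads it back with read.table. The three rows, the column names ID, SATM, and GPA, and the object name demo are made up purely for illustration.

```r
# Hypothetical stand-in for a tab-delimited text file saved from Excel
tmp <- tempfile(fileext = ".txt")
writeLines(c("ID\tSATM\tGPA",
             "1\t520\t3.10",
             "2\t610\t3.45",
             "3\t480\t2.90"), tmp)

# header = TRUE because the opening line holds variable names;
# sep = "\t" states explicitly that the columns are tab separated
demo <- read.table(tmp, header = TRUE, sep = "\t")
demo        # print the data set
str(demo)   # show each variable and its type
```

With a real file, the temporary path would be replaced by the exact path in quotes, for example the one printed by file.choose().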

  22. The file.choose ( ) command • At > enter file.choose ( ) • Accesses your system’s files, much like Open in Microsoft • Find the file, click on it • R prints the exact path in R Console • Can copy and paste into read.table

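  23. Checking what you’ve got: • Enter >SATGPA20R • Then >mean (SATGPA20R) (worked in older versions of R; in current versions this returns NA with a warning, so use >summary(SATGPA20R) or, if all columns are numeric, >colMeans(SATGPA20R)) • Try >mean (GPA) (gives an error message until the data set is attached)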

  24. The attach Command • To access individual variables, do this: >attach(SATGPA20R) • Now try: >mean(GPA)

  25. The data.frame Command • Let’s create these 3 variables with c > IQ = c(110, 95, 140, 89, 102) > CS = c(59, 40, 62, 40, 55) > WQ = c(2, 4, 5, 1, 3) • Then put them together with: >All_Data = data.frame(IQ, CS, WQ) • Check with: >colMeans(All_Data) (older versions of R gave the same result with >mean(All_Data); current versions return NA with a warning)
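
The data.frame steps can be sketched as below. The sketch uses colMeans for the per-variable means and also shows the $ and with() forms, which reach a single variable without attach. These two alternatives are an addition here, not something the slides cover.

```r
IQ = c(110, 95, 140, 89, 102)
CS = c(59, 40, 62, 40, 55)
WQ = c(2, 4, 5, 1, 3)

# Combine the three variables into one data set
All_Data = data.frame(IQ, CS, WQ)
All_Data

colMeans(All_Data)         # IQ 107.2, CS 51.2, WQ 3.0

# Reaching one variable inside the data set without attach()
mean(All_Data$IQ)          # 107.2
with(All_Data, mean(CS))   # 51.2
```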

  26. 1-8 Getting Help • >help(sd) • >example(sd) • On R Console: Help, R functions (text), enter function name, click OK • Reminder: function names case sensitive

  27. R’s “function” terms • R language: function(arguments) • Plain English: Do this (to this) or Do this (to this, with these conditions)
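
As a sketch of the “do this (to this, with these conditions)” pattern, the calls below use mean and round. The trim and digits arguments are standard arguments of those base R functions, picked here only as examples of “conditions”; the names trimmed and rounded_sd are illustrative.

```r
Scores <- c(22, 34, 18, 29, 36)

# Do this (to this)
mean(Scores)                        # 27.8

# Do this (to this, with these conditions):
# trim = 0.2 drops the lowest and highest 20% of values before averaging
trimmed <- mean(Scores, trim = 0.2)
trimmed                             # mean of 22, 29, 34

# Functions can be nested: round the standard deviation to 2 decimals
rounded_sd <- round(sd(Scores), digits = 2)
rounded_sd                          # 7.69
```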

  28. 1-9 Dealing with Missing Data • NB: It’s a pain in R! • Key items • In data, enter NA for a missing value • In (most) commands, use na.rm=T

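  29. Examples for missing data >Data=c(2,4,6,NA,10) >mean(Data, na.rm=T) • Add to the SATGPA20 file 21 1 NA NA NA 3.14 23 2 1 NA NA 2.86 Etc. and create new file SATGPA25R • Then >mean(SATGPA25R, na.rm=T) (in current versions of R, which no longer average a whole data frame with mean, use >colMeans(SATGPA25R, na.rm=T) if all columns are numeric) • Note exception for cor function (use="complete.obs")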
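
The missing-data behavior can be sketched with the Data vector from the slide plus two short made-up vectors, x and y, which are hypothetical and serve only to show the cor exception.

```r
Data <- c(2, 4, 6, NA, 10)
mean(Data)                 # NA: one missing value makes the result missing
mean(Data, na.rm = TRUE)   # 5.5: the NA is removed first

# Hypothetical paired data with a missing value in each variable
x <- c(1, 2, NA, 4, 5)
y <- c(2, 4, 6, NA, 10)

cor(x, y)                          # NA
# cor has no na.rm argument; it uses the use argument instead
r <- cor(x, y, use = "complete.obs")
r                                  # 1: the complete pairs lie on a straight line
```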

  30. 1-10 Using R Functions: Hypothesis tests • Be sure you have an active data set (SATGPA25R), using attach if needed • Then, to test male vs. female on SATM: >t.test(SATM~SEX) # note tilde~ • Examples of changing defaults: >t.test(SATM~SEX, var.equal=TRUE, conf.level=0.99)

  31. Hypothesis tests: Chi-square • Using SEX and State variables in SATGPA25R • chisq.test (SEX, State)
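
Because the SATGPA25R file is not included here, the sketch below runs the t-test and chi-square commands on ten made-up cases. All values of SATM, SEX, and State are hypothetical, invented only so the calls run, and the object name tt is illustrative.

```r
# Hypothetical data: 10 cases, SEX and State coded 1 or 2
SATM  <- c(520, 610, 480, 700, 560, 590, 640, 500, 530, 620)
SEX   <- c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2)
State <- c(1, 2, 1, 2, 1, 2, 1, 2, 2, 1)

# Compare SATM across the two SEX groups; note the tilde
tt <- t.test(SATM ~ SEX)   # default: variances not assumed equal
tt

# Changing the defaults
t.test(SATM ~ SEX, var.equal = TRUE, conf.level = 0.99)

# Chi-square test of association between two categorical variables.
# With so few cases R warns that the approximation may be poor.
chisq.test(SEX, State)
```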

  32. 1-11 R Functions for Commonly Used Statistics
      function         calculates this
      mean ( )         mean
      median ( )       median
      mode ( )         storage mode of an object, e.g. "numeric" (not the statistical mode; base R has no built-in function for that)
      sd ( )           standard deviation
      range ( )        minimum and maximum values (the two ends of the range)
      IQR ( )          interquartile range
      min ( )          minimum value
      max ( )          maximum value
      cor ( )          correlation
      quantile ( )     percentile
      t.test ( )       t-test
      chisq.test ( )   chi-square
      NB1: See notes in text for details
      NB2: R contains many more functions
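
A few of these functions behave differently from what their names suggest. The sketch below shows this on a small made-up vector x. The helper stat_mode is a hypothetical function written for this illustration, not part of R.

```r
x <- c(3, 5, 5, 7, 9, 5, 3)   # hypothetical data

mode(x)                # "numeric": the storage type, not the most common value

# Hypothetical helper: the most frequent value(s) in a numeric vector
stat_mode <- function(v) {
  counts <- table(v)                                  # frequency of each value
  as.numeric(names(counts)[counts == max(counts)])    # value(s) with top count
}
stat_mode(x)           # 5

range(x)               # 3 9: minimum and maximum
diff(range(x))         # 6: the range as a single number
IQR(x)                 # interquartile range
quantile(x, 0.5)       # 50th percentile, same as median(x)
```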

  33. 1-12 Two Commands for Managing Your Files > ls ( ) Will list the objects (variables, data sets) currently saved in your workspace > rm (name) Insert the object’s name; this will remove it from the workspace NB: R has many such commands
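
A minimal sketch of the two commands; the object name temp_value is hypothetical.

```r
Scores <- c(22, 34, 18, 29, 36)
temp_value <- 42

ls()                   # lists the objects in the workspace, including both above

rm(temp_value)         # remove one object by name
exists("temp_value")   # FALSE: it is gone
exists("Scores")       # TRUE: other objects are untouched
```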

  34. 1-13 R Graphics • R graphs: good, simple • Let’s start with hist and boxplot with the SATGPA25R file >hist(SATM) >boxplot(SATM) >boxplot(SATV, SATM) • R Graphics window opens, need to minimize to get R Console

  35. More Graphics: plot • Create these variables >RS=c(12,14,16,18,25) >MS=c(10,8,16,12,20) • Now do this: >plot(RS, MS)

  36. Line of Best Fit • Do these for the RS and MS variables: > lm(MS~RS) # lm means linear model > res=lm(MS~RS) # stores the fitted model in res (the name is arbitrary) > abline(res) # read as ‘a-b’ line; adds the fitted line to the open plot
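
The plot and line-of-best-fit steps from the last two slides can be sketched as one script. The coef call is an addition that prints the intercept and slope of the fitted line.

```r
RS <- c(12, 14, 16, 18, 25)
MS <- c(10, 8, 16, 12, 20)

plot(RS, MS)          # scatterplot: RS on the X axis, MS on the Y axis

res <- lm(MS ~ RS)    # linear model predicting MS from RS
coef(res)             # intercept about -0.74, slope about 0.82

abline(res)           # draw the fitted line on the existing plot
```

Note that abline adds to a plot that is already open, so plot must be run first.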

  37. Controlling Your Graphics: A Brief Look • R has many (often obscure) ways for controlling graphics; we’ll look at a few • Basically, we’ll change “defaults” Examples (try each one): • Limits (ranges) for X and Y axes >plot(RS, MS, xlim = c(5,25), ylim = c(5,25))

  38. Controlling Graphs: More Examples • Plot characters: >plot(RS, MS, pch=3) • Line widths >plot(RS, MS, pch=3, lwd=5) • Axis labels >plot(RS, MS, xlab = "Reading Score", ylab = "Math Score") • You can put them all together in one command
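
As a sketch of putting the options together, the command below combines the axis limits, plot character, line width, and axis labels shown on the last two slides.

```r
RS <- c(12, 14, 16, 18, 25)
MS <- c(10, 8, 16, 12, 20)

# All the changed defaults in a single plot command
plot(RS, MS,
     xlim = c(5, 25), ylim = c(5, 25),   # limits for X and Y axes
     pch = 3,                            # plot character: a plus sign
     lwd = 5,                            # line width of the plotted symbols
     xlab = "Reading Score",             # axis labels, in straight quotes
     ylab = "Math Score")
```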

  39. Part 2: R Commander • 2-1 What is R Commander? • Point and click version of R • Uses (and prints) base R commands • Loading: Easy – it’s just a package • See next slide

  40. Loading Rcmdr • On R Gui, top menu bar click Packages, then Install package(s). Pick a CRAN mirror site (nearby), click OK. From the list of packages, scroll to Rcmdr, highlight it, click OK • After it loads, do these: • Check with: >library ( ) • Activate with: >library (Rcmdr)

  41. Rcmdr’s extra packages • Scary message when first activating Rcmdr: • Just click Yes – and take a break

  42. The R Commander Window • You get the R Commander window with • Script window • Output window (incl Submit button) • Message window

  43. 2-2 R Commander Windows and Menus (** = most important for us) • File • Edit • Data ** • Statistics ** • Graphs ** • Models • Distributions • Tools • Help

  44. Our Lesser Used Menus • File [Table 2.1] • Much like in Microsoft • Manage files • Edit [Table 2.2] • Much like in Microsoft • Can do with right click of mouse

  45. Our Lesser Used Menus (cont) • Models Mostly more advanced stats • Distributions • Tools • Load packages • Options – change output defaults • Help • Searchable index • R Commander manual

  46. 2-3 The Data Menu (very important) (Submenus for creating/getting data sets) • New data set – create new data set • Load data set – only for existing .rda data • Import data – import from various file types • Data in packages – not important for us

  47. Data Menu (cont.) (Submenus for managing data sets) • Active data set • Do stuff with current data set • Manage variables in active data set • Do stuff with variables in current data set

  48. New data set [Fig. 2.3] • Click on it, brings up spreadsheet • Name it SampleData

  49. New data set (cont) • Enter these data:
      var1  var2  var3
      2     1     5
      5     4     7
      3     7     8
      6     8     9
      9     2     9
      • Then kill window with X • Note: SampleData in Active Data Set

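  50. Now Try These • View active data set • Edit active data set • In Script window, type* • mean(SampleData) • sd (SampleData) (both worked on a whole data frame in older versions of R; in current versions use colMeans(SampleData) and sapply(SampleData, sd)) • mean(var1) [gives error message] • attach(SampleData) (lowercase a; function names are case sensitive) • mean(var1) * When typing do not include >, do hit Submit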
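
For readers working outside R Commander, the same SampleData set can be built in base R. The sketch below assumes the five rows listed on the previous slide and uses colMeans and sapply for per-variable results.

```r
# The SampleData values, entered with data.frame instead of the spreadsheet
SampleData <- data.frame(var1 = c(2, 5, 3, 6, 9),
                         var2 = c(1, 4, 7, 8, 2),
                         var3 = c(5, 7, 8, 9, 9))

colMeans(SampleData)       # var1 5.0, var2 4.4, var3 7.6
sapply(SampleData, sd)     # standard deviation of each variable

# One variable at a time, without attach()
with(SampleData, mean(var1))   # 5
```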
