
Basic Statistical Analysis of a Vector in R

Learn how to calculate statistics on a vector in R using high-temperature data for Philadelphia in August. Explore functions such as mean, median, sd (standard deviation), min, max, quantile, summary, sort, and length, and create visualizations such as histograms and boxplots.

caraa

Presentation Transcript


  1. Basic statistics on a vector in R

  2. Suppose we had a vector of daily high temperatures in Philadelphia in August:
     • high_temps <- c(90, 87, 89, 90, 88, 86, 91, 89, 88, 85, 83, 88, 80, 83, 87, 89, 89, 91, 93, 92, 92, 92, 76, 79, 78, 75, 79, 85, 83, 88, 86)
     • We will consider another time how to move data between Excel and R, but for now just copy the code above into R.
     • For now we just want to focus on some of the simple statistical functions built into R that can act on a vector.

  3. Open RStudio. The Environment pane may still hold objects left over from an earlier session.

  4. Click the X on the Script pane’s tab to close the old script.

  5. Click on the List icon (upper right) and switch to Grid. Grid view lets you pick and choose what to clear out, though this time we will clear everything.

  6. Check the “objects” and use the broom icon to clear them out.
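The same cleanup can be done from the console. A minimal sketch, in which `old_value` is a hypothetical leftover object created only for the demonstration:

```r
old_value <- 42    # hypothetical leftover object, for illustration only
ls()               # lists the visible objects, including "old_value"

rm(list = ls())    # removes every visible object in the current workspace
ls()               # the workspace is now empty
```

Unlike the Grid view, this removes everything at once, so it suits the "clear it all out" case on this slide.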

  7. Start a new script

  8. Save it

  9. Copy and paste the data into RStudio and run it:
     • high_temps <- c(90, 87, 89, 90, 88, 86, 91, 89, 88, 85, 83, 88, 80, 83, 87, 89, 89, 91, 93, 92, 92, 92, 76, 79, 78, 75, 79, 85, 83, 88, 86)
     • The keyboard shortcut for copying is Ctrl-c (copies whatever is highlighted).
     • The keyboard shortcut for pasting is Ctrl-v.
     • Then run: place the cursor on the line (or highlight it) and click on the Run icon (or type Ctrl-Enter).

  10. Result of paste and run. Note that the tab turns red, which means that the current version differs from the saved version. Click on the Save (floppy disk) icon if you want to save the data you just pasted in.

  11. Use the mean function to calculate the average

  12. Use the median function

  13. Use the sd (standard deviation) function
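The calls on slides 11 to 13 might look like the following; a minimal sketch that re-creates `high_temps` so it runs on its own. The names `avg`, `mid`, and `spread` are illustrative, and the values in the comments were worked out by hand from the 31 temperatures.

```r
high_temps <- c(90, 87, 89, 90, 88, 86, 91, 89, 88, 85, 83, 88, 80, 83, 87, 89,
                89, 91, 93, 92, 92, 92, 76, 79, 78, 75, 79, 85, 83, 88, 86)

avg    <- mean(high_temps)     # arithmetic mean: 2671 / 31, about 86.16
mid    <- median(high_temps)   # middle (16th) value of the sorted data: 88
spread <- sd(high_temps)       # sample standard deviation (divides by n - 1)

print(c(mean = avg, median = mid, sd = spread))
```

The median sits above the mean here because a handful of cooler days late in the month pull the average down.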

  14. Use the min() function – minimum

  15. Use the max() function – maximum
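A short sketch of the two calls on slides 14 and 15, with `range()` added as a related base R function that returns both values at once. The variable names are illustrative.

```r
high_temps <- c(90, 87, 89, 90, 88, 86, 91, 89, 88, 85, 83, 88, 80, 83, 87, 89,
                89, 91, 93, 92, 92, 92, 76, 79, 78, 75, 79, 85, 83, 88, 86)

lowest  <- min(high_temps)     # coolest high of the month: 75
highest <- max(high_temps)     # warmest high of the month: 93
both    <- range(high_temps)   # minimum and maximum together: 75 93

print(both)
```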

  16. The quantile function (R-speak) can be used to determine the first quartile (Excel-speak)

  17. The quantile function can be used to determine the third quartile
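Base R has no separate quartile function; the first and third quartiles are simply the 0.25 and 0.75 quantiles. A minimal sketch, using R's default interpolation method (`type = 7`); other `type` values can give slightly different answers on the same data.

```r
high_temps <- c(90, 87, 89, 90, 88, 86, 91, 89, 88, 85, 83, 88, 80, 83, 87, 89,
                89, 91, 93, 92, 92, 92, 76, 79, 78, 75, 79, 85, 83, 88, 86)

q1 <- quantile(high_temps, 0.25)   # first quartile: 83
q3 <- quantile(high_temps, 0.75)   # third quartile: 89.5

# Both quartiles in a single call
quantile(high_temps, c(0.25, 0.75))
```

The result is a named vector (the names are "25%" and "75%"); wrap it in `unname()` if only the number is wanted.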

  18. The summary function is a quick way to get a number of standard statistical measures on a set of data
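As a sketch, one call to `summary()` returns the minimum, first quartile, median, mean, third quartile, and maximum computed separately on the earlier slides. The name `s` is illustrative.

```r
high_temps <- c(90, 87, 89, 90, 88, 86, 91, 89, 88, 85, 83, 88, 80, 83, 87, 89,
                89, 91, 93, 92, 92, 92, 76, 79, 78, 75, 79, 85, 83, 88, 86)

s <- summary(high_temps)
print(s)

# Individual entries can be pulled out by name
s[["Median"]]    # 88
s[["3rd Qu."]]   # 89.5
```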

  19. Use sort() to order the data

  20. Finding the second smallest
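One way to carry out slides 19 and 20, sketched under the assumption that the slide indexed into the sorted vector. R vectors are indexed from 1, so the second smallest value is element 2 of the sorted data. `sorted_temps` is an illustrative name.

```r
high_temps <- c(90, 87, 89, 90, 88, 86, 91, 89, 88, 85, 83, 88, 80, 83, 87, 89,
                89, 91, 93, 92, 92, 92, 76, 79, 78, 75, 79, 85, 83, 88, 86)

sorted_temps <- sort(high_temps)     # ascending order; high_temps is unchanged
print(sorted_temps)

second_smallest <- sorted_temps[2]   # 76
print(second_smallest)
```

`sort(high_temps, decreasing = TRUE)` gives the reverse order if that is more convenient.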

  21. Use length to determine the sample size

  22. Getting the second largest
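A sketch of slides 21 and 22 together: `length()` gives the sample size `n`, and the second largest value is then element `n - 1` of the sorted vector. Whether the original slide used this exact expression is an assumption.

```r
high_temps <- c(90, 87, 89, 90, 88, 86, 91, 89, 88, 85, 83, 88, 80, 83, 87, 89,
                89, 91, 93, 92, 92, 92, 76, 79, 78, 75, 79, 85, 83, 88, 86)

n <- length(high_temps)                    # sample size: 31, one per day of August
second_largest <- sort(high_temps)[n - 1]  # next-to-last sorted value: 92

print(c(n = n, second_largest = second_largest))
```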

  23. Making a histogram
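A minimal histogram sketch using base R's `hist()`. The title and axis label are illustrative choices, not taken from the original slide.

```r
high_temps <- c(90, 87, 89, 90, 88, 86, 91, 89, 88, 85, 83, 88, 80, 83, 87, 89,
                89, 91, 93, 92, 92, 92, 76, 79, 78, 75, 79, 85, 83, 88, 86)

# hist() draws the plot and also returns the bin breaks and counts
h <- hist(high_temps,
          main = "Philadelphia high temperatures, August",
          xlab = "High temperature")

h$breaks   # bin boundaries chosen by R
h$counts   # number of days falling in each bin
```

R picks the bin boundaries automatically; the `breaks` argument of `hist()` can be used to suggest a different number of bins.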

  24. Use boxplot to obtain a “box and whiskers” display of the data
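A minimal sketch of the boxplot call; the title and axis label are illustrative. The box spans the first to third quartile, the line inside marks the median, and the whiskers reach the most extreme points not flagged as outliers.

```r
high_temps <- c(90, 87, 89, 90, 88, 86, 91, 89, 88, 85, 83, 88, 80, 83, 87, 89,
                89, 91, 93, 92, 92, 92, 76, 79, 78, 75, 79, 85, 83, 88, 86)

# boxplot() draws the plot and invisibly returns the numbers behind it
b <- boxplot(high_temps,
             main = "Philadelphia high temperatures, August",
             ylab = "High temperature")

b$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
```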

  25. boxplot.stats() gives the statistics used in making a boxplot
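Note that R is case-sensitive, so the function must be typed in lower case. A short sketch of the call and the pieces of the list it returns; the values in the comments were worked out by hand for this data set. The name `bs` is illustrative.

```r
high_temps <- c(90, 87, 89, 90, 88, 86, 91, 89, 88, 85, 83, 88, 80, 83, 87, 89,
                89, 91, 93, 92, 92, 92, 76, 79, 78, 75, 79, 85, 83, 88, 86)

bs <- boxplot.stats(high_temps)

bs$stats   # whisker, hinge, median, hinge, whisker: 75 83 88 89.5 93
bs$n       # number of non-missing observations: 31
bs$out     # points beyond the whiskers; empty for this data
```

With the default `coef = 1.5`, a point counts as an outlier when it lies more than 1.5 times the box length beyond a hinge.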

  26. Here’s what happened when I changed the first temperature to 100 and re-ran all the code. Now you see outliers.
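The experiment on this slide can be sketched as follows. Here the change is made on a copy, under the hypothetical name `hot_temps`, so the original vector stays intact; the slide instead edited the data and re-ran the script.

```r
high_temps <- c(90, 87, 89, 90, 88, 86, 91, 89, 88, 85, 83, 88, 80, 83, 87, 89,
                89, 91, 93, 92, 92, 92, 76, 79, 78, 75, 79, 85, 83, 88, 86)

hot_temps <- high_temps    # hypothetical copy, for illustration only
hot_temps[1] <- 100        # change the first temperature from 90 to 100

flagged <- boxplot.stats(hot_temps)$out   # values beyond the whiskers
print(flagged)

boxplot(hot_temps, ylab = "High temperature")   # flagged points drawn separately
```

The hinges stay at 83 and 89.5, so the upper cutoff is 89.5 + 1.5 × 6.5 = 99.25, and 100 falls just beyond it.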
