280 likes | 405 Views
http://www.r-project.org/. Brief Introduction to “R” 5 th International Seminar on Climate System and Climate Change 14-25 July 2008, Lanzhou University, Lanzhou, China Francis Zwiers, Climate Research Division, Environment Canada. http://www.r-project.org/misc/acpclust.R. Getting started.
E N D
http://www.r-project.org/ Brief Introduction to “R”5th International Seminar on Climate System and Climate Change14-25 July 2008, Lanzhou University, Lanzhou, ChinaFrancis Zwiers, Climate Research Division, Environment Canada http://www.r-project.org/misc/acpclust.R
ASTD/CRD Getting started • R Gui and R console open on your desktop when you start R • help.start() for help • q() to quit
ASTD/CRD >help.start() • html based • uses your browser
ASTD/CRD packages • much of what you will need is in the “stats” package • see also “base” and “graphics” packages
ASTD/CRD stats package • “lm” fits linear models • “linear” in statistics means a model like y=a+bx2+εthat is linear in its parameters (a and b)
ASTD/CRD >help(“lm”)
ASTD/CRD • Enter full path or use Browse button Set your working directory … • Click on File • Choose Change dir • Choose the “R-intro” directory
ASTD/CRD Doing simple things • usels()to find out what is in your workspace • Use load() to restore a previously used workspace • type a variable name to see its contents
ASTD/CRD more simple things … • list a variable • mean() • var() • summary() 50th percentile (or median) mean (or average) 75th percentile 25th percentile
ASTD/CRD summary(x) • 6 numbers that summarize the sample x1, …, xn • Minimum (smallest value) • 25th percentile • 25% of values ≤ to this number, 75% of values ≥ this number • Median (or 50th percentile) • 50% of values ≤ to this number, 50% of values ≥ this number • Mean (or average) • 75th percentile • 75% of values ≤ to this number, 25% of values ≥ this number • Maximum (largest value)
ASTD/CRD What is in “obs”? • Annual mean surface temperature anomalies in ALA+WNA combined • 1950-1999 • Expressed relative to 1961-1990 Detail from IPCC AR4 WG1, Fig 11.11
ASTD/CRD More simple things • hist() draws histograms • freq controls type (frequency or probability) • many parameters to control appearance • ylimcontrols scale on y-axis
ASTD/CRD A less simple thing … • add a plot of the normal (or Gaussian) distribution • use curve() anddnorm() • is this a good way to decide if the “obs” are Gaussian?
ASTD/CRD A better way to judge if observations come from a normal distribution … • draw a “quantile-quantile” plot comparing observations with the normal distribution • use qqnorm() • add qqline()
ASTD/CRD plot() • a simple line plot() • types can be • “p” (points), • “l” (lines), • “b” (both), • “h” (histogram), • “s” (stair steps), • etc.
ASTD/CRD Fit a trend line with lm() • fit linear model y=a+bx+ε withlm() • store the results in trend_fit • get a summary of results with summary.lm()
ASTD/CRD Add the fitted trend line to the plot • abline()adds lines to plots • Needs slope and intercept • found in variable trend_fit$coefficients • Estimated trend is ~0.21°C/10-yr
ASTD/CRD Trying out a quadratic trend • create year2 vector • fit linear model y=a+bt+ct2+ε • new summary
ASTD/CRD A bit of standard manipulation • Concatenate two vectors • Reorganize into a matrix (2 columns, filled in column order) • Plot original observations and fitted quadratic (as points and a curve) • Specify types • Specify plotting characters
ASTD/CRD c(), rep() • c() can be used to create a vector of consecutive values • c() can be used to concatenate vectors, etc. • rep()can be used to repeat values
ASTD/CRD Summary of some basic commands • q() to stop • help.start() to start html documentation interface • help(“xxxx”) to get help on function xxxx • File button to set working directory • ls() to list contents of work space • load(“xxxx”) to load workspace xxxx (or use File button) • scan() to read data from a file • x to list x • mean(x) to calculate mean of x • var(x) to calculate variance of x • summary(x) to produce 6 number summary of x • hist(x) to display a histogram of x • qqnorm(x) to display a normal quantile plot of x • qqplot() is used to display quantile plots for other distributions • plot() to produce a scatter plot or line plot • lm() to fit a linear model such as a linear trend • c() to concatenate numbers of vectors together • rep() to repeat a value or a vector a specified number of times • matrix() to convert a vector into a matrix • matplot() to plot multiple curves on the same plot • Need to organize data into a matrix first
ASTD/CRD Getting data into R – another example • Homogenized Canadian Station Data • http://www.cccma.bc.ec.gc.ca/hccd/index.shtml • Plot trend at in annual mean temperature at Thunderbay A
ASTD/CRD scan() • Skip first two lines • Use comma as separator • 18 items per year • Year • 12 monthly values • Annual mean • 4 seasonal values
ASTD/CRD Organize data for analysis • Form matrix • Year is in column 1 • Annual mean is in column 14 • -9999.9 indicates a missing value for 2004
ASTD/CRD Organize data for analysis … and plot • Select non-missing years • Plot with labels • Add trend line by calling lm() within the call to abline()
ASTD/CRD Summary • Powerful, with matrix-vector manipulation and display capabilities • Can easily define your own functions • Control structures, looping, etc • Free • Open source • Well documented • Has the latest stuff • statisticians are contributing actively • ETCCDI is disseminating it’s indices and homogenization software via R • http://cccma.seos.uvic.ca/ETCCDI/
ASTD/CRD Photo: F.Zwiers Thank you