Oswald for Windows
Lancaster University, 7-9 June 1998

David Smith, S-PLUS Product Manager
Phone: +44 (0)1276 452299 ext. 212
Fax: +44 (0)1276 451224
E-mail: dsmith@mathsoft.co.uk

MathSoft International, Knightway House, Park Street, Bagshot, Surrey GU19 5AQ, United Kingdom

Oswald/Windows

Outline - Day 1
• 0915-1045 Intro to S-PLUS and Oswald
• 1045-1115 coffee
• 1115-1245 hands-on session
• 1245-1400 lunch
• 1400-1530 Longitudinal data
• 1530-1600 afternoon tea
• 1600-1730 hands-on session

Outline - Day 2
• 0915-1045 Exploring longitudinal data
• 1045-1115 coffee
• 1115-1245 hands-on session
• 1245-1400 lunch
• 1400-1530 Basic modelling
• 1530-1600 afternoon tea
• 1600-1730 hands-on session

Outline - Day 3
• 0915-1045 Advanced modelling
• 1045-1115 coffee
• 1115-1245 hands-on session
• 1245-1400 lunch
• 1400-1530 Other topics
• 1530-1600 afternoon tea
• 1600-1730 free laboratory session

Day 1: Introduction to S-PLUS and Oswald

Day 1 Session 1
• What is S-PLUS?
• Oswald overview
• Starting S-PLUS for Windows
• The S command language
• Fundamentals -- expressions and assignments
• Vectors, matrices and lists
• Calling functions

S-PLUS History
• 1980s: S language developed by AT&T Bell Labs
• Added-value S-PLUS distributed by StatSci Inc
• 1992: MathSoft acquires StatSci, Axum
• 1997: S-PLUS 4.0 released
  • Axum graphics, S language, Office GUI
• 1998: S-PLUS 4.5 (Windows) and S-PLUS 5.0 (Unix) released
• Best-regarded statistical research tool in the industry
• 50,000+ users world-wide

S-PLUS 4.5
For exploratory data analysis and statistical modelling
• From basic statistics to cutting-edge statistics
• Customisable and extensible GUI
• For data analysts and researchers of all levels
• Powerful, object-oriented S language
• With thousands of built-in functions
• Total control of data and graphics
• Worldwide S-PLUS users group

Book Recommendations
• Modern Applied Statistics with S-PLUS
  • W.N. Venables and B.D. Ripley
  • Springer 1997 (second edition)
• The New S Language: A Programming Environment for Data Analysis and Graphics
  • Becker, Chambers & Wilks
  • Wadsworth & Brooks/Cole 1988
• Analysis of Longitudinal Data
  • Diggle, Liang & Zeger
  • Oxford University Press 1994

Oswald - Introduction
• Object-oriented Software for the Analysis of Longitudinal Data

Longitudinal Data
• A (largish) number of subjects
  • people, households, rats, cows, …
• A measurement taken repeatedly on each subject over a (smallish) number of times
  • Example: daily blood pressure on 65 patients taken over 2 weeks; 30 patients on a control drug, 35 on a new anti-hypertensive drug.

Oswald features
• New data type for longitudinal data
• Subsample selection
• Creating replicated covariates
• Parallel plots
• Shadow plots
• Available for Windows and Unix
• Analysis of longitudinal data
  • Kernel smoothing
  • Rice-Silverman C-V
  • Variogram
  • Mixed modelling
  • Dropout modelling
• Comes with GEE and ALR

Using Oswald
• From the S-PLUS prompt:
    > library(oswald)
• The Oswald functions and datasets are now available.

Starting S-PLUS

Object Browser
• S-PLUS is object-oriented
  • data, functions, analysis results, plots, ...
  • persistence (press DEL to delete)
• The Object Browser organises all the objects you create
  • Left pane -- objects by classification
  • Right pane -- list/components of objects
• Completely customisable

Getting Data
• Data … Select Data
• Select existing data, or import data from a file
• Dialog appears at startup (by default)
• Import data: (later)
• Existing data: enter the name of an S-PLUS object in the Name box and click OK.

Data Windows
• Data appears in a data window
  • Like a spreadsheet, but column-oriented
  • Includes column names and row labels
• Click on a column name to select a column
  • Shift-click to select a block of columns
  • Ctrl-click for non-adjacent columns
  • Order of selection is important!
• Selected columns are used for plots, analysis, ...

Simple graphs
• Open the 2D or 3D plot palette
• Select column(s) for plotting from the data window
  • First ctrl-click -- x axis
  • Second ctrl-click -- y axis
  • (Third ctrl-click -- z axis)
• Click on the palette button corresponding to the desired plot

Trellis Graphs
• You can condition any plot by the value of any other variable in the data
• Create a plot
• Select the conditioning column in the data window, and drag it to the title area
  • Be sure to start dragging from the data
  • Cursor changes to + when ready to drop
• Plot is redrawn with panels for subranges of the conditioning variable

Example: Ozone data
• Select Data
• Histogram of Ozone
• Scatterplot: Temp vs Ozone
  • With loess line
  • Trellis conditioning on Wind
• 3-D plot: Temp vs Wind vs Ozone
  • Vary colour by Radiation
• Export graphics

The Command Window
• Behind all of the menus and toolbars lies the S command language
• A complete programming language
  • expressions, data types, loops, functions, …
  • object-oriented
• We will mostly use Oswald by entering commands in the S language in the Command Window
• Click the Command Window toolbar button to open it.

S Language: absolute basics
• Enter an expression at the prompt >
• S-PLUS prints out the result (usually)
    > 1+1
    [1] 2
    > pi
    [1] 3.141593
    > sin((1+sqrt(5))/2*pi)
    [1] -0.9320324

Secondary prompt
• If you fail to complete the expression, continue at the '+' prompt
    > exp(1/sqrt(2) -
    + 0.5)
    [1] 1.230114
• If you get stuck, enter lots of ')'s
    > log(1+sqrt(5 + sin(2)
    + )))))
    Syntax error: No opening parenthesis, before ")" at this point:
    log(1+sqrt(5 + sin(2)

Vectors
• Many functions return vectors:
    > rnorm(6)
    [1] -1.4217905  0.7963978 -0.2487190 -0.8279492
    [5]  0.7495827  1.0976972
• The [n] on the left gives the index of the first element printed on that row.
• A single number is a vector of length 1:
    > mean(rpois(50, mean=5))
    [1] 4.74

Making vectors
• Sequences: a:b
    > 1:10
    [1] 1 2 3 4 5 6 7 8 9 10
    > 50:45
    [1] 50 49 48 47 46 45
• Concatenation: c(v1,v2,v3,…)
    > c(6,2,1)
    [1] 6 2 1
    > c(1:5,5:1)
    [1] 1 2 3 4 5 5 4 3 2 1

Assignment
• Store results in objects with <- ("gets")
    > a <- 1:10
• The result is not printed until you enter the object name by itself
    > a
    [1] 1 2 3 4 5 6 7 8 9 10

Object Names
• Valid object names may contain only
  • Letters: abcXYZ
  • Numbers: 0123456789 (but not as the first character)
  • Dot: .
• Valid names: y Y weight x.var LD50 .my.dat sum
• Invalid names: m-1 heart_rate 9th T F NA

Handling objects
• List all your objects:
    > objects()
    [1] ".Last.value"    ".Random.seed"
    [3] ".ldats.options" "a"
• Objects are persistent and remain until removed (even if you quit S-PLUS)
    > remove("a")
    > a
    Error: Object "a" not found
• Or, use the Object Browser

Objects as variables
• Objects can be used in expressions
    > a <- 1:5
    > sum(a)
    [1] 15
    > b <- c(a, 10)
    > length(b)
    [1] 6
    > 2*b
    [1] 2 4 6 8 10 20

Vector arithmetic
• Scalar functions work elementwise:
    > a <- 1:4
    > sqrt(a)
    [1] 1.000000 1.414214 1.732051 2.000000
• Scalar and vector arithmetic:
    > 2*a
    [1] 2 4 6 8
    > 2*a + log(a)
    [1] 2.000000 4.693147 7.098612 9.386294

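As a small illustration of the elementwise rules on this slide, the sketch below computes a sample mean and variance by hand. The data vector x is made up for the example, and the code is written to run in R as well as S-PLUS (an assumption, since only the functions shown in these slides plus var are used).

```r
# Hypothetical data, not from the course
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

n <- length(x)
xbar <- sum(x) / n                 # sample mean
dev <- x - xbar                    # scalar subtracted from every element
s2 <- sum(dev^2) / (n - 1)         # sample variance, built elementwise

print(xbar)
print(s2)
print(var(x))                      # built-in function, for comparison
```

The whole calculation needs no loop: each arithmetic operator is applied to every element of the vector in one step.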
Logical vectors
• Expressions with relational operators return logical vectors
    a==b   a>5   b<=0   a!=0
• T is True, F is False
    > a <- rnorm(5)
    > a
    [1] -0.1495632  0.1389647  2.3571278  2.1495335
    [5] -1.8157597
    > a < 0
    [1] T F F F T

Missing values -- NA
• The missing value is represented by NA
    > a <- c(1, NA, 2)
• Most operations on NA return NA
    > a + 1
    [1] 2 NA 3
    > sum(a)
    [1] NA
• is.na checks for missing values
    > is.na(a)
    [1] F T F
• Use is.na(a), not a==NA

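A short sketch of why the comparison a==NA is not a substitute for is.na. It assumes S-PLUS treats a comparison with NA the way R does, returning NA for every element; the code itself runs in R.

```r
a <- c(1, NA, 2)

eq <- a == NA                 # comparing anything with NA gives NA
print(eq)

miss <- is.na(a)              # a proper logical vector
n.missing <- sum(miss)        # logical values count as 1 (true) and 0 (false)
print(n.missing)
```

Because eq contains no true or false values at all, it cannot be used to locate the missing entries, whereas is.na(a) can.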
Vector indexing
• Use [] to select elements of a vector
    > a <- c(2,3,5,7,11,13,17)
    > a[1]
    [1] 2
    > a[3:5]
    [1] 5 7 11
    > a[c(1,3,5)]
    [1] 2 5 11
• Negative indices remove elements
    > a[-(1:3)]
    [1] 7 11 13 17

Logical indices
• A logical index selects the elements where the index is true
• The index should be as long as the indexed vector
    > b <- rnorm(8)
    > b[b<=0]
    [1] -0.04100 -0.86600 -0.07229
    > log.b <- log(b)
    Warning messages:
    NAs generated in: log(x)
    > b[is.na(log.b)]
    [1] -0.04100 -0.86600 -0.07229
    > log.b[!is.na(log.b)]
    [1] 0.7242 0.9154 0.6976 -0.1279 0.2408

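The slide above uses random numbers, so its output cannot be reproduced exactly. The sketch below repeats the idea with a fixed, made-up vector so the result can be checked by eye; it is meant to run in R as well as S-PLUS.

```r
# Hypothetical values chosen for the example
b <- c(-0.5, 1.2, 3.4, -2.0, 0.7)

keep <- b > 0                      # logical index, same length as b
pos <- b[keep]                     # the positive values
where <- (1:length(b))[keep]       # their positions in b
log.pos <- log(pos)                # safe: no non-positive values remain

print(pos)
print(where)
```

Filtering before calling log avoids generating NAs in the first place, which is an alternative to removing them afterwards with is.na.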
Replacement
• You can also use [] on the left-hand side of an assignment
    > a <- sample(1:10)
    > a
    [1] 2 1 10 9 7 8 3 6 4 5
    > a[5] <- NA
    > a
    [1] 2 1 10 9 NA 8 3 6 4 5
    > a[is.na(a)] <- 0
    > a
    [1] 2 1 10 9 0 8 3 6 4 5

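Replacement with a logical index is a common way to recode data. The sketch below is one possible use, with invented blood-pressure readings in which -99 is assumed to be a missing-value code; neither the data nor the code -99 comes from the course.

```r
# Hypothetical readings; -99 is an assumed missing-value code
bp <- c(120, -99, 135, 128, -99)

bp[bp == -99] <- NA                       # recode the marker as NA
observed.mean <- mean(bp[!is.na(bp)])     # mean of the observed values only

bp.filled <- bp
bp.filled[is.na(bp.filled)] <- observed.mean   # replace NAs by that mean

print(bp)
print(bp.filled)
```

Filling with the mean is used here only to show the syntax; whether it is a sensible treatment of missing data depends on the analysis.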
Calling functions
• Functions are called like this:
    function.name(argument, argument, ...)
• Functions always return a value
  • NULL represents no value
• Function arguments have a:
  • position (first, second, …)
  • name
  • default value (sometimes)

Example: rep
    rep(x, times = inferred, length.out = inferred)
    > rep(1:3,2)
    [1] 1 2 3 1 2 3
    > rep(1:3,length=8)
    [1] 1 2 3 1 2 3 1 2
    > rep(1:3,c(3,2,1))
    [1] 1 1 1 2 2 3
    > rep()
    Error in rep: Argument "x" is missing, with no default: rep()
    Dumped

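To make the distinction between argument position and argument name concrete, here is a minimal sketch using rep. It relies on arguments being matched by name (including abbreviated names such as length for length.out), which is how R behaves and is assumed to hold in S-PLUS as well.

```r
by.position <- rep(1:2, 3)           # x is first, times is second
by.name <- rep(times=3, x=1:2)       # named arguments may come in any order
short <- rep(1:3, length=5)          # length abbreviates length.out

print(by.position)
print(by.name)
print(short)
```

The first two calls give the same result, since naming the arguments removes any dependence on their order.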
Example: seq
    seq(from=1, to=end, by=1, length=inferred, along=NULL)
    > seq(1,5)
    [1] 1 2 3 4 5
    > seq(10, 20, length=6)
    [1] 10 12 14 16 18 20
    > seq(to=100, by=15, length=7)
    [1] 10 25 40 55 70 85 100
    > seq(length=10)
    [1] 1 2 3 4 5 6 7 8 9 10

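rep and seq together can lay out the index columns of a longitudinal data set. The sketch below does this for a design like the blood-pressure example (65 patients, 14 daily measurements). The variable names are made up, and this is only an illustration in plain S, not how Oswald itself stores such data.

```r
n.subj <- 65      # patients
n.times <- 14     # daily measurements over 2 weeks

# Each subject number repeated once per measurement: 1 1 ... 1 2 2 ...
subject <- rep(1:n.subj, rep(n.times, n.subj))

# The measurement days, cycled once per subject: 1 2 ... 14 1 2 ...
day <- rep(seq(1, n.times), n.subj)

print(length(subject))
print(subject[13:16])
print(day[13:16])
```

Row 15 of this layout is the first measurement on the second patient, so subject[15] is 2 and day[15] is 1.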
Matrices
    matrix(data=NA, nrow=inferred, ncol=inferred, byrow=F, dimnames=NULL)
    > X <- matrix(1:10, nrow=2)
    > X
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    3    5    7    9
    [2,]    2    4    6    8   10
    > dim(X)
    [1] 2 5

Matrix indices
    > X2 <- matrix(1:15, nrow=3, byrow=T)
    > X2[2,3]
    [1] 8
    > X2[1:2,]
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    2    3    4    5
    [2,]    6    7    8    9   10
    > X2[2:3, 3:5]
         [,1] [,2] [,3]
    [1,]    8    9   10
    [2,]   13   14   15
    > X2[,1]
    [1] 1 6 11
    > X2[,1,drop=F]
         [,1]
    [1,]    1
    [2,]    6
    [3,]   11

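The difference between X2[,1] and X2[,1,drop=F] is easiest to see by inspecting dim. The sketch below does that and also combines a logical row index with a numeric column index; it is written to run in R as well as S-PLUS.

```r
X2 <- matrix(1:15, nrow=3, byrow=T)

v <- X2[, 1]                 # result is a plain vector
m <- X2[, 1, drop=F]         # result stays a 3 x 1 matrix

# Rows whose first column exceeds 1, columns 1 and 5 only
sub <- X2[X2[, 1] > 1, c(1, 5)]

print(dim(v))                # NULL: a vector has no dimensions
print(dim(m))
print(sub)
```

Keeping the matrix shape with drop=F matters when the result is passed on to matrix operations that expect two dimensions.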
Transpose and multiply
• Transpose: t(X)
    > X <- cbind(rep(1,5),1:5)
    > t(X)
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    1    1    1    1
    [2,]    1    2    3    4    5
• Matrix multiply: %*%
    > t(X) %*% X
         [,1] [,2]
    [1,]    5   15
    [2,]   15   55

Matrix algebra
• Matrix inverse: solve
    > solve(t(X) %*% X)
         [,1] [,2]
    [1,]  1.1 -0.3
    [2,] -0.3  0.1
• Decompositions
  • Eigenvector (eigen)
  • Singular value (svd)
  • Cholesky (chol)
    > eigen(solve(t(X) %*% X))
    $values:
    [1] 1.1831 0.0169

    $vectors:
            [,1]   [,2]
    [1,]  0.9637 0.2669
    [2,] -0.2669 0.9637

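The matrix X used on these slides has the form of a regression design matrix (a column of ones and a covariate), so t, %*% and solve are enough to compute least-squares coefficients from the normal equations. The sketch below uses a made-up response y lying exactly on the line 1 + 2x. It illustrates the matrix operations only and is not a description of how S-PLUS fits models internally.

```r
x <- 1:5
X <- cbind(rep(1, 5), x)          # same X as on the slides
y <- 1 + 2 * x                    # hypothetical response, exactly linear

XtX <- t(X) %*% X
beta <- solve(XtX) %*% t(X) %*% y # normal-equations solution

print(beta)                       # intercept and slope
print(solve(XtX) %*% XtX)         # numerically the 2 x 2 identity
```

Since y has no noise, the recovered coefficients should equal the intercept 1 and slope 2 up to rounding error.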
Lists
• An ordered collection of arbitrary objects
    > mylist <- list(dat=1:5, name="short",
    + loc=c(1.0,-2.5))
    > mylist
    $dat:
    [1] 1 2 3 4 5

    $name:
    [1] "short"

    $loc:
    [1] 1.0 -2.5

    > mylist$name
    [1] "short"
    > mylist[[3]]
    [1] 1.0 -2.5
    > names(mylist)
    [1] "dat" "name" "loc"

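A list is a convenient container for everything known about one subject. The sketch below builds such a record with invented component names and values, reads components by name and by position, and adds a new component by assignment (a feature of the S language not shown on the slide).

```r
# Hypothetical record for one subject
subj <- list(id=7, group="control", bp=c(120, 118, 125))

grp <- subj$group                  # access by name
readings <- subj[[3]]              # access by position

subj$mean.bp <- mean(subj$bp)      # assignment adds a fourth component

print(names(subj))
print(subj$mean.bp)
```

Components can hold different types and lengths, which is what distinguishes a list from a vector or matrix.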
Getting help
• Help on functions
    ?sample
    help(sample)
• General help: Help menu
  • Function index, keyword search
  • On-line manuals (Users/Programmers/Statistics)
  • Visual demo

Stop!
• To cancel the current command, press ESC
  • Cancel long outputs
  • Abort long computations
• To quit from S-PLUS (command prompt):
    > q()
• Or File … Exit from the menu bar

Day 1 Session 2
• Practical
• S-PLUS tutorial

Day 1 Session 3
• Reading data and data import
• Data frames and factors
• ldframe objects
  • adding columns
  • sub-selections
  • summary
  • ssv/tsv

Reading in data
• File … Import Data … From File
• Choose file and file type from the dialog

Data import options
• Select the Options tab for import options
• By default, ASCII files are assumed to have
  • column labels on row 1 (blank rows ignored)
  • data from row 2 onwards
  • columns separated by whitespace or comma
• Change defaults with import options
  • First data line, separators, column spec …
• NB: .dat files are assumed to be Gauss files!