
Introduction to S-Plus

Introduction to S-Plus. by Francesco Ferretti Analysis of Biological Data Course Winter term 2007 Dalhousie University. Introduction. S-plus and R are statistical programs using the S language. Developed in the Bell Labs of AT&T in 1970s by Rick Becker, John Chambers and Allan Wilks


Presentation Transcript


  1. Introduction to S-Plus by Francesco Ferretti Analysis of Biological Data Course Winter term 2007 Dalhousie University

  2. Introduction • S-Plus and R are statistical programs that use the S language. • S was developed at AT&T Bell Labs in the 1970s by Rick Becker, John Chambers and Allan Wilks. • In 1987 Douglas Martin, at the University of Washington, founded the company that is now Insightful Corporation. He made S more popular and compatible with many hardware platforms, and provided the necessary support for technical and statistical problems. S became S-Plus. • In 1997 the R project started. R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand. It is similar to S-Plus and freely available.

  3. S-Plus and R • Flexible and powerful statistical programs • Particularly appealing for their graphical capabilities • Can be problematic with large amounts of data; SAS is more powerful in these cases

  4. GUI (Graphical User Interface) • Main toolbar and several windows • Object Explorer: overview of what is available on the system • Computational Engine: data frames, lists, matrices, vectors • Interface Objects: search path, menu items, toolbars, dialogs • Document Objects (outputs): graph sheets, scripts and reports • The Object Explorer displays all the objects in your working directory

  5. GUI (Graphical User Interface) • Import data: File>Import Data>From File • Export data: File>Export Data>To File; choose among the data frames present in your working directory, then give a location and an extension • Creating graphs: highlight a dataset in the Object Explorer, select variables (Ctrl-select), click on 2D Plots, and choose the preferred graph type • Saving graphs: the default format is *.sgr (S-Plus graph sheet); alternatively, choose your preferred picture format with File>Export Graph, specify the location, name and extension, then click OK

  6. GUI (Graphical User Interface) • Summary statistics • From the Object Explorer select a data frame • On the main toolbar select Statistics>Summary Statistics • Select the data, variables and statistics to be shown, then click OK

  7. Programming mode • Gives access to the full potential and flexibility of S-Plus. Highly recommended! • While the GUI can perform many of the S-Plus commands and functions, programming mode lets you solve potentially all the problems you will encounter in data manipulation, analysis and plotting • Command window: can be used step by step, interactively • Writing functions: using a text editor (Notepad, Emacs, EditPlus, etc.) or directly on the command line

  8. Command line (the basics) • S-Plus is case sensitive • # starts a comment • ? calls help • q() quits S-Plus • <- is the assignment operator; it associates a value or a function with a variable name
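The basics above can be tried at the prompt. This is a minimal sketch in R syntax, which shares these basics with S-Plus; the object names `height` and `Height` are made up for illustration.

```r
# Names are case sensitive: 'height' and 'Height' are two different objects.
height <- 180          # <- assigns the value 180 to the name 'height'
Height <- 165          # a separate object, because of the capital H

same <- height == Height   # compare the two values

# ?mean                # would open the help page for mean()
# q()                  # would quit the session
```

Everything after a `#` on a line is ignored, so comments can follow a command on the same line.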

  9. Use of S-Plus in programming mode • Calculator: * / + -, =, log, exp, sqrt, ^, sin, cos; it follows the usual arithmetic rules, with * / evaluated before + - and () before * / • Manipulating data • Fitting models to data • Plotting graphs
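A short illustration of the calculator use and the precedence rules, written in R syntax. The variable names are only there so the results can be inspected afterwards.

```r
a <- 2 + 3 * 4        # multiplication first: 2 + 12
b <- (2 + 3) * 4      # parentheses first: 5 * 4
p <- 2^3              # exponentiation
r <- sqrt(16)         # square root
e <- log(exp(1))      # log() is the natural logarithm, so this undoes exp()
t <- sin(0) + cos(0)  # trigonometric functions take radians
```

Typing a name such as `a` at the prompt prints its value.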

  10. Logical Values • Boolean values: TRUE, FALSE • < (less than), > (greater than), <= (less than or equal to), >= (greater than or equal to), == (equal to), != (not equal to) • Conditional expressions and operators: if, else, ifelse, & (and), | (or)
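The comparison and logical operators work element by element on vectors. The sketch below uses a made-up vector `len` (hypothetical body lengths) to show them together with `ifelse` and `if`/`else`.

```r
len <- c(120, 310, 95, 400)              # hypothetical measurements

over.100  <- len > 100                   # TRUE TRUE FALSE TRUE
mid.range <- len > 100 & len < 350       # both conditions must hold
extreme   <- len < 100 | len >= 400      # at least one condition holds

# ifelse() chooses a value for each element
size.class <- ifelse(len >= 300, "large", "small")

# if/else works on a single TRUE or FALSE
if (sum(over.100) == length(len)) {
  msg <- "all over 100"
} else {
  msg <- "some at or below 100"
}
```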

  11. Brackets • () enclose the arguments of functions and group arithmetic calculations • [] index objects: after x<-c(1,5,7,8), x[3] returns 7 • {} enclose groups of commands: function bodies, if/else statements, loops
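The three kinds of brackets can be seen side by side in a small sketch that reuses the vector `x` from the slide; the other names are illustrative.

```r
x <- c(1, 5, 7, 8)        # () enclose the arguments of c()

third     <- x[3]         # [] pick the third element
first.two <- x[1:2]       # a range of positions
over.five <- x[x > 5]     # a logical condition inside []

total <- 0
for (i in 1:length(x)) {  # {} group the commands of the loop body
  total <- total + x[i]
}
```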

  12. S-Plus common objects • Vector: ordered group of numbers or strings • x<-c(45,29,27) • z<-c(180,180,165) • y<-c("Hall","Francesco","Sara") • Matrix: "rectangular layout of cells, each one containing a value" • AH<-matrix(c(45,29,27,180,180,165),nrow=3) • AH<-matrix(c(x,z),nrow=3) • Array: multidimensional matrix • Data frame • AHP<-data.frame(x,z,y) • List: groups together data that do not have the same structure. Outputs and summaries are returned as lists, and you can access or use parts of them.
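The objects listed above can be built and inspected as follows. This is a sketch in R syntax using the slide's own vectors; the array `arr` and the list `info` are made-up examples.

```r
x <- c(45, 29, 27)
z <- c(180, 180, 165)
y <- c("Hall", "Francesco", "Sara")

# Matrix: values fill the columns first, so x is column 1 and z is column 2
AH <- matrix(c(x, z), nrow = 3)
cell <- AH[2, 1]                  # row 2, column 1

# Data frame: columns may have different types
AHP <- data.frame(x, z, y)
heights <- AHP$z                  # extract one column by name

# Array: a matrix with more than two dimensions (here 2 x 3 x 4)
arr <- array(1:24, dim = c(2, 3, 4))

# List: components of different length and type
info <- list(ages = x, people = y, n = length(x))
second <- info$people[2]          # access part of a list
```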

  13. Functions • A function is a set of commands performed on specified variables • For a vector x of four values, y<-mean(x) is equivalent to y<-(x[1]+x[2]+x[3]+x[4])/4, to y<-sum(x)/4, and to y<-sum(x)/length(x) • You can build your own functions • On the command line: SD<-function(x){sqrt(var(x))}; the function is saved in your working directory and can then be called as SD(x)
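As a sketch of how user-written functions build on each other, the example below defines the slide's `SD` and then a second, hypothetical function `CV` (coefficient of variation) with a default argument. It is written for R; the comparison with the built-in `sd()` is noted only in a comment because that name is R's, not necessarily S-Plus's.

```r
# Standard deviation, as on the slide
SD <- function(x) {
  sqrt(var(x))
}

# Hypothetical helper: coefficient of variation, as a percentage by default
CV <- function(x, percent = TRUE) {
  ratio <- SD(x) / mean(x)
  if (percent) 100 * ratio else ratio
}

x <- c(1, 5, 7, 8)
s      <- SD(x)                    # in R this matches the built-in sd(x)
cv.pct <- CV(x)                    # uses the default percent = TRUE
cv.raw <- CV(x, percent = FALSE)   # overrides the default
```

The value of the last expression evaluated in the body is what the function returns, so no explicit return statement is needed here.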

  14. Functions • Anatomy of a function: function name, arguments, and the body of the function (a set of commands) • Create a file with an .s extension (file.s, a sort of library where you can store one or more functions) • Open an editor • Write the function:
      # this function creates the dataset "buddy" and
      # plots its variables one against the other
      buddy<-function(){
        x<-c(2,3,5,6,8,10)
        y<-c(4,6,10,12,16,20)
        buddy<-data.frame(x,y)
        plot(buddy$x,buddy$y,xlab="x",ylab="y",type="l")
        print(buddy)
      }
  • Save the file as an .s file: c:\buddy.s • Load the file with source("c:\\buddy.s") • Access the function by calling it as buddy()
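The save-then-source workflow can be tried without touching c:\ by writing to a temporary file. This is a sketch for R: `tempfile()` with `fileext` is an R function, and the function `double.it` is a made-up stand-in for whatever the file would contain.

```r
# Write a small function definition to a temporary .s file
f <- tempfile(fileext = ".s")
writeLines(c(
  "# doubles every element of a vector",
  "double.it <- function(v) {",
  "  2 * v",
  "}"
), f)

# Read the file and run the commands in it; this defines double.it()
source(f)

doubled <- double.it(c(2, 3, 5))
```

In a file path written as a string, each backslash has to be doubled, which is why the slide uses "c:\\buddy.s".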

  15. Use of S-Plus in programming mode (manipulation of data) • Datasets are rarely ready for analysis as they come • Importing datasets: read.table() • Subsetting objects • Creating new variables • seq(), rep(), sort(), unique(), length() • Merging and binding datasets: merge(), cbind(), rbind()
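The manipulation steps listed above might look like this in practice. It is a sketch in R syntax on made-up data: the data frames `catch` and `ages` and all their column names are hypothetical, and the file is a temporary one written just so that `read.table()` has something to import.

```r
# Hypothetical dataset, written to a temporary comma-separated file
catch <- data.frame(id = 1:4,
                    species = c("a", "b", "a", "c"),
                    len = c(120, 310, 95, 400))
f <- tempfile()
write.table(catch, f, sep = ",", row.names = FALSE)

# Importing
dat <- read.table(f, header = TRUE, sep = ",")

# Subsetting rows with a logical condition
big <- dat[dat$len > 100, ]

# Creating a new variable
dat$len.m <- dat$len / 100

# Helper functions
kinds   <- unique(dat$species)      # distinct values
ordered <- sort(dat$len)            # increasing order
steps   <- seq(1, 10, by = 3)       # 1 4 7 10
twice   <- rep(c(1, 2), times = 2)  # 1 2 1 2

# Merging on a shared key, and binding rows or columns
ages    <- data.frame(id = c(1, 2, 3, 4), age = c(5, 12, 3, 20))
both    <- merge(dat, ages, by = "id")
stacked <- rbind(ages, data.frame(id = 5, age = 8))
wide    <- cbind(ages, sex = c("f", "m", "f", "m"))
```

`merge()` matches rows by the values of the key column, while `cbind()` and `rbind()` simply glue objects side by side or on top of each other, so the caller has to make sure the rows or columns line up.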

  16. Graphical analysis • Plotting to the active device (an S-Plus window or a file): pdf.graph(file="",horizontal=""), postscript(file="",horizontal=""), graphsheet(file="",format="") • Important functions: par(), plot(), hist(), boxplot(), pairs()
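Plotting to a file follows an open, draw, close pattern. The sketch below uses `postscript()`, which exists in both S-Plus and R, so that it runs in R; `pdf.graph()` and `graphsheet()` are S-Plus devices and are not available in R, where `pdf()` plays a similar role. The data and the temporary file name are made up.

```r
x <- c(2, 3, 5, 6, 8, 10)
y <- c(4, 6, 10, 12, 16, 20)

out <- tempfile(fileext = ".ps")            # hypothetical output file
postscript(file = out, horizontal = FALSE)  # open the file device (portrait)

par(mfrow = c(1, 2))                        # two panels side by side
plot(x, y, xlab = "x", ylab = "y", type = "l")
hist(y, main = "")

dev.off()                                   # close the device to finish the file
```

Until the device is closed, the file may be incomplete, so the closing call should not be skipped.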

  17. Fitting a model to data • Take the SharkLife data • Summarize the data: summary() • EDA (Exploratory Data Analysis): pairs(), hist(), boxplot(), plot() • Fit a linear regression model between Lmax and birth.size: model1<-lm() • Check the model using statistics and plots: summary(model1), plot(model1)
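The workflow above can be sketched as follows. The SharkLife dataset is not reproduced in the slides, so the values here are invented purely for illustration, and treating Lmax as the response and birth.size as the predictor is an assumption.

```r
# Invented stand-in for the SharkLife data (values are NOT real)
SharkLife <- data.frame(
  birth.size = c(20, 35, 50, 60, 75, 90),
  Lmax       = c(95, 150, 205, 255, 310, 365)
)

# Summary of the data
data.summary <- summary(SharkLife)

# Exploratory plots, best run in an interactive session:
# pairs(SharkLife)
# hist(SharkLife$Lmax)
# plot(SharkLife$birth.size, SharkLife$Lmax)

# Fit the linear regression (assumed direction: Lmax explained by birth.size)
model1 <- lm(Lmax ~ birth.size, data = SharkLife)

# Check the model
fit.summary <- summary(model1)   # coefficients, standard errors, R-squared
estimates   <- coef(model1)      # intercept and slope
# plot(model1)                   # diagnostic plots, interactive session
```

The formula `Lmax ~ birth.size` reads "Lmax as a function of birth.size"; the `data` argument tells `lm()` where to find both columns.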

  18. Programming mode • Script window • Mode where you can write programs, run them and keep track of your operations for future work • File>New>Script File

  19. Useful Reference Books • The Basics of S-Plus by Krause A. and Olson M. • Statistical Computing with S-Plus by Crawley M.J. • Modern Applied Statistics with S-Plus by Venables W.N. and Ripley B.D. • …much more on the internet
