
Introduction to R for Biological Computing

Jeff Krause (Shodor)


Presentation Transcript


  1. Introduction to R for Biological Computing
     • Jeff Krause (Shodor)

  2. What is R?
     • The R Project for Statistical Computing
     • Free, high-level interpreted language for statistical computing and visualization
     • Open-source implementation of the S language (S-PLUS is the commercial implementation)
     • Robust and flexible language that facilitates rapid development of computational science tools
     • Extensible, with a large, dedicated user community that includes prominent researchers
     • >2000 user-contributed packages containing functions and data for specific applications in statistics, data analysis, data mining, visualization & graphing, numerical simulation, optimization, sequence analysis, parallel computing, …
     • Command-line (console) interface in the base installation
     • The R Commander: A Basic-Statistics GUI for R
        • CRAN Rcmdr package page
        • R-Forge Rcmdr package page

  3. Why R for biology?
     • It’s free!
     • Statistical data analysis
        • Clinical data
        • Statistical genetics
        • Bioinformatics
     • Easily extensible
        • Committed users have extended its capabilities

  4. Introductory R resources
     • Web sites
        • The R Project homepage
        • The R wiki – many great resources, including the “Getting Started” site
        • Contributed documentation page (cran.r-project.org/other-docs.html)
        • An Introduction to R
        • Statistics Using R with Biological Examples
        • Applied Statistics for Bioinformatics Using R – The author’s stated goal is to bridge the gap between
        • Using R for scientific computing
        • Ecology and epidemiology in the R programming environment
        • R and Octave
        • Matlab/R Reference
     • Articles
     • Books
     • Courses

  5. First steps with R
     • Downloading and installing
     • Starting R
     • Getting help
        • “?foo”, help(foo)
        • Help menu – “html help” (“R help” on Mac)
        • Rseek – Google-powered search site for all things R
     • The basics
        • Assignment
        • Arithmetic
        • Vectors, random numbers, sort
        • Time series as a vector
        • Plotting
        • Matrices and arrays
        • Loops, scripts
     • Add-in package installation
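The basics listed on this slide can be tried directly at the R console. The following is a minimal sketch, not the presenter's own material: the variable names and the population numbers are made up for illustration, and the help and package-installation calls are left commented out because they open a viewer or need a network connection.

```r
# Assignment: `<-` binds a value to a name
x <- 5
y <- 2

# Arithmetic on single numbers
total <- x + y        # 7
ratio <- x / y        # 2.5
power <- x^y          # 25

# Arithmetic is element-wise on vectors
v <- c(3, 1, 4, 1, 5)
doubled <- v * 2      # 6 2 8 2 10

# Random numbers: set a seed so the draws are reproducible
set.seed(42)
r <- runif(10)        # ten uniform draws on [0, 1]
r_sorted <- sort(r)   # ascending order

# A time series stored as a plain vector: hypothetical population
# counts that double at each time step
pop <- 10 * 2^(0:5)   # 10 20 40 80 160 320

# Matrices are filled column by column unless byrow = TRUE
m <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns
mt <- t(m)                   # transpose: 3 rows, 2 columns

# Arrays generalise matrices to more than two dimensions
a <- array(1:24, dim = c(2, 3, 4))

# A for loop that accumulates a running sum
running <- numeric(length(v))
acc <- 0
for (i in seq_along(v)) {
  acc <- acc + v[i]
  running[i] <- acc
}
# running is now 3 4 8 9 14, the same as cumsum(v)

# Plotting opens a graphics device, so only do it interactively
if (interactive()) {
  plot(pop, type = "b", xlab = "Time step", ylab = "Population")
}

# Getting help and installing an add-in package (commented out on purpose)
# ?sort
# help(sort)
# install.packages("ISwR")
# library(ISwR)
```

Saving these lines in a file and running it with source() or Rscript is one way to turn console experiments into a script.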

  6. Topic-specific resources
     • Introductory & basic statistics
        • Books
           • Introductory Statistics with R
              • R package ISwR contains data and functions from the book
           • Introduction to Probability with R
           • A First Course in Statistical Programming with R – Goes through an introduction to the language, programming and graphics, then works through MCMC simulation, computational linear algebra and numerical optimization
              • Solutions to selected exercises
           • Modern Applied Statistics with S
              • R package MASS contains data and functions from the book
     • Bioinformatics & genomics
        • Books
           • Applied Statistics for Bioinformatics Using R – Free 272-page mini-text PDF
           • Bioconductor Case Studies
           • Bioinformatics and Computational Biology Solutions Using R and Bioconductor
           • Computational Genome Analysis
        • Packages
           • Bioconductor (www.bioconductor.org) – set of packages for analysis of genomic data
           • seqinr

  7. Numerical simulation
     • Books
        • An Introduction to Scientific Programming and Simulation, Using R
           • spuRs – R package containing functions and datasets from the book
        • Computer Simulation and Data Analysis in Molecular Biology and Biophysics
           • Describes the use of functions from a variety of R packages including:
        • Dynamic Models in Biology – Along with their supplemental Lab Manual for working in R
           • Their supplemental materials page includes resources for building simulations described in the text in both R and MATLAB
     • Epidemiology, ecology & population biology
        • Books
           • A Practical Guide to Ecological Modelling – Along with their supplement: “Using R for scientific computing”
        • Packages
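To give a flavour of the kind of simulation these books cover, here is a small sketch in base R of discrete-time logistic population growth. It is not taken from any of the listed texts, and the growth rate, carrying capacity, and starting population are hypothetical values chosen only for illustration.

```r
# Discrete-time logistic growth:
#   N[i + 1] = N[i] + r * N[i] * (1 - N[i] / K)
r_growth <- 0.5   # hypothetical per-step growth rate
K <- 100          # hypothetical carrying capacity
steps <- 50       # number of time steps to simulate

N <- numeric(steps + 1)   # pre-allocate the trajectory
N[1] <- 5                 # hypothetical initial population

for (i in 1:steps) {
  N[i + 1] <- N[i] + r_growth * N[i] * (1 - N[i] / K)
}

# Plot the trajectory when running interactively
if (interactive()) {
  plot(0:steps, N, type = "l", xlab = "Time step", ylab = "Population size")
}
```

With these values the population rises smoothly and levels off at the carrying capacity; larger growth rates in this discrete model can produce oscillations instead.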

  8. High-performance and parallel computing with R
     • CRAN task page
     • Rmpi can be used with the LAM/MPI, MPICH / MPICH2, Open MPI, and Deino MPI implementations
     • GridR package by Wegener et al. can be used in a grid computing environment via a web service, via ssh, or via Condor or Globus
     • rsprng package by Li, a random-number generator for parallel computing
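The packages on this slide need an MPI or grid installation, so the sketch below swaps in the `parallel` package that ships with current versions of R. It is an illustration of the same ideas (farming work out to workers and giving each worker its own random-number stream), not an example of the Rmpi, GridR, or rsprng APIs. The function `estimate_pi` and the worker and sample counts are hypothetical.

```r
library(parallel)

# Hypothetical task: Monte Carlo estimate of pi from n random points
# in the unit square (fraction falling inside the quarter circle, times 4)
estimate_pi <- function(n) {
  x <- runif(n)
  y <- runif(n)
  4 * mean(x^2 + y^2 <= 1)
}

# Start two local worker processes
cl <- makeCluster(2)

# Give each worker an independent, reproducible random-number stream
clusterSetRNGStream(cl, iseed = 123)

# Run four independent estimates, spread across the workers
estimates <- parLapply(cl, rep(10000, 4), estimate_pi)

# Always shut the workers down when finished
stopCluster(cl)

# Combine the per-task results into one estimate
pi_hat <- mean(unlist(estimates))
```

Setting up separate random-number streams matters because workers that share a seed would otherwise draw identical numbers, which is the problem rsprng addresses in an MPI setting.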
