Introduction to Systems Biology

Introduction to Systems Biology. Tom MacCarthy Math Tower 1-101 maccarth@ams.sunysb.edu Office hours Tue/Fri 10-12 Course website: www.ams.sunysb.edu/faculty/~maccarth. Systems Biology. Systems Biology implies holistic (whole system) view of biological systems

Presentation Transcript


  1. Introduction to Systems Biology Tom MacCarthy Math Tower 1-101 maccarth@ams.sunysb.edu Office hours Tue/Fri 10-12 Course website: www.ams.sunysb.edu/faculty/~maccarth

  2. Systems Biology • Systems Biology implies a holistic (whole-system) view of biological systems • The study of the interactions between the components of biological systems, and how these interactions give rise to the function and behavior of that system • Antithesis of the reductionist approach (study of components in isolation) • In practice, Systems Biology usually involves • Mathematical modeling • Generating (lots of) experimental data • Statistical data analysis • Here we will be dealing with computational aspects

  3. Growth of biological data • First draft of the human genome announced June 26, 2000 (published February 2001) • Cost: approximately $3 billion • Consists of ~3 billion nucleotides (C, G, A, T) • First phase of the 1000 Genomes Project published Oct 28, 2010 • Cost: $30-50 million

  4. Growth of biological data • The reduced cost of sequencing has led to many other projects such as: • The Cancer Genome Atlas • The 1000 Plant Genomes project • Comparable increases in other types of data, for example gene expression data, which is now increasingly measured via sequencing technologies.

  5. Systems Biology • The availability of large and varied amounts of biological data has created a need for computational tools for manipulation and analysis. • Mathematical modeling can be used to generate or test novel hypotheses • Example: Transcription factor networks in blood cell differentiation *** This course is introductory and inter-disciplinary therefore my apologies to specialists ***

  6. DNA → RNA → Protein • Most cells contain DNA (deoxyribonucleic acid) • Genes are segments of DNA that contain the necessary information for making proteins. • Proteins are molecules with specific cellular functions

  7. Gene regulation [Figure labels: TF1, TF2, TF3, TF4, Gene expression] • At any given moment genes may or may not be producing protein • Proteins called transcription factors (TFs) control the level of activation (or “expression”) of each gene. • Genes have regulatory regions which contain short DNA sequences (or “motifs”) that are recognized by the TFs. • → in this way TFs activate or repress gene expression

  8. Gene regulatory networks • Transcription factors themselves are proteins • They are activated/repressed by other TFs (or by themselves) • In this way they form gene regulatory networks [Figure labels: TF X, TF Y, TF Z; TF binding site; Gene X coding region; Gene Y coding region; transcription/translation; intermediate/s; activation; activation/repression]

  9. Blood cell differentiation GATA-1 and PU.1 are transcription factors that control erythroid and myeloid development, respectively, during blood cell differentiation. The two proteins have been shown to function in an antagonistic fashion, with GATA-1 repressing PU.1 activity during erythropoiesis (red blood cells) and PU.1 repressing GATA-1 function during myelopoiesis (macrophages, etc.).

  10. Where are GATA-1 and PU.1 binding? ChIP-Seq was used to detect where in the (mouse) genome GATA-1 and PU.1 are binding

  11. Where are GATA-1 and PU.1 binding? Find 151 myelo-lymphoid genes that are occupied by GATA-1 and PU.1 and that are positively regulated by PU.1 and repressed by GATA-1, for example:

  12. Mathematical modeling • It was already known that GATA-1 and PU.1 are mutually antagonistic. • It was also known that PU.1 represses GATA-1 targets. • Last piece of the puzzle: GATA-1 also represses PU.1 targets • Question: What are the consequences of mutual repression of the targets on gene expression dynamics? • We can compare a mathematical model with and without the repression of the targets

  13. Mathematical modeling • A system of four coupled non-linear ordinary differential equations is used to model the GATA-1-PU.1 regulatory network [Figure labels: GATA-1, PU.1, GATA-1 target, PU.1 target] • We manipulated the rate constants to evaluate the different network architectures. For example, as Kir → ∞ the mutual antagonism (GATA-1 ↔ PU.1) disappears. Similarly, Kit modulates the cross-regulation of the targets

  14. Mathematical modeling We used matlab to simulate the system
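The transcript does not reproduce the model equations, so the following is only a minimal sketch of how such a four-variable system could be simulated in Octave/Matlab. The functional forms, the synthesis rates `aG` and `aP`, the decay rate `d`, and the initial conditions are all assumptions chosen for illustration; only the roles of Kir (regulator cross-repression) and Kit (target cross-repression) follow the slides.

```octave
% Illustrative model only: the equations below are assumed, not the authors'.
% State vector y = [G; P; gT; pT]:
%   G  = GATA-1, P = PU.1, gT = GATA-1 target, pT = PU.1 target.
aG  = 2;    % assumed synthesis rate of GATA-1 (favored lineage)
aP  = 1;    % assumed synthesis rate of PU.1
d   = 1;    % assumed first-order decay rate for all species
Kir = 1;    % cross-repression constant between the regulators
Kit = 1;    % cross-repression constant acting on the targets

% Repression enters as 1/(1 + x/K); as K grows the factor tends to 1,
% so that repression disappears.
rhs = @(t, y) [ aG/(1 + y(2)/Kir) - d*y(1); ...
                aP/(1 + y(1)/Kir) - d*y(2); ...
                (y(1)/(1 + y(1)))/(1 + y(2)/Kit) - d*y(3); ...
                (y(2)/(1 + y(2)))/(1 + y(1)/Kit) - d*y(4) ];

y0 = [0.1; 0.1; 0; 0];               % assumed initial concentrations
[t, y] = ode45(rhs, [0 50], y0);     % integrate long enough to settle

ratio = y(end, 3) / y(end, 4);       % steady-state gT/pT
fprintf('Steady-state gT/pT ratio: %g\n', ratio);
```

Setting `Kir` or `Kit` to a very large value (for example `1e6`) removes the corresponding repression, which is one way to compare the different network architectures.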

  15. Mathematical modeling

  16. Mathematical modeling • We systematically modulated (1) the mutual antagonism between GATA-1 and PU.1 and (2) the mutual antagonism between the targets • For every point in the plane we evaluate the steady-state ratio gT/pT • The model behavior illustrates that mutual inhibition and repression of opposing downstream targets act synergistically to maximize the gT/pT ratio
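A parameter scan of this kind can be sketched as a double loop over the two repression constants. As before, the equations and the values of `aG`, `aP`, `d`, and the grid `K` are assumptions made for illustration, not the published model; a large constant stands in for "no repression".

```octave
% Illustrative scan: assumed equations, not the authors' model.
aG = 2; aP = 1; d = 1;        % assumed synthesis and decay rates
K  = [1 10 1000];             % grid of repression constants (large = weak)
R  = zeros(numel(K));         % R(i,j): gT/pT at Kir = K(i), Kit = K(j)

for i = 1:numel(K)
  for j = 1:numel(K)
    Kir = K(i);
    Kit = K(j);
    % y = [GATA-1; PU.1; GATA-1 target; PU.1 target]
    rhs = @(t, y) [ aG/(1 + y(2)/Kir) - d*y(1); ...
                    aP/(1 + y(1)/Kir) - d*y(2); ...
                    (y(1)/(1 + y(1)))/(1 + y(2)/Kit) - d*y(3); ...
                    (y(2)/(1 + y(2)))/(1 + y(1)/Kit) - d*y(4) ];
    [~, y] = ode45(rhs, [0 50], [0.1; 0.1; 0; 0]);
    R(i, j) = y(end, 3) / y(end, 4);   % steady-state ratio at this point
  end
end

disp(R);   % rows: Kir = 1, 10, 1000; columns: Kit = 1, 10, 1000
```

In this toy version the corner with both repressions strong, `R(1,1)`, exceeds the corners where only one is strong, `R(1,end)` and `R(end,1)`, which in turn exceed the corner with neither, `R(end,end)`.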

  17. Systems Biology in practice These results suggest that the dual mechanism provides, in comparison to either cross-inhibition or target inhibition alone, more robust suppression of an alternative gene expression program during lineage specification. The example illustrates the highly multi-disciplinary nature of much modern biological research, here combining: 1. High-throughput techniques (ChIP-Seq) 2. Data analysis 3. Mathematical modeling to test hypotheses. Many times, the hypothesis might come first from the mathematical model

  18. Why Matlab and R? • Computational tools are indispensable for doing this kind of research • In many cases students are held back by lack of computational skills • Matlab and R are both interpreted languages, i.e. there is no separate compilation step • This generally makes them slower than compiled languages • Both have an enormous number of extension packages • Octave is a free Matlab “clone” and is available for Windows, Mac and Linux • Both languages can be used interactively, but it is more powerful to write programs.

  19. Matlab • Advantages • Matlab allows one to easily perform numerical calculations and visualize the results. • Many additional libraries for statistics, signal processing, image processing, etc. • Note Matlab has the Symbolic Math Toolbox, core Octave does not • Disadvantages • Slow, but can be improved via vectorization • Matlab is not well suited to complex software projects (limited object-oriented support)
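As a small illustration of vectorization, the sketch below computes the same sum of squares with an explicit loop and with a single array expression. The variable names and the problem size are arbitrary choices for this example.

```octave
x = 1:100000;

% Loop version: one element at a time.
tic;
total_loop = 0;
for k = 1:numel(x)
  total_loop = total_loop + x(k)^2;
end
t_loop = toc;

% Vectorized version: element-wise power, then a single sum.
tic;
total_vec = sum(x .^ 2);
t_vec = toc;

fprintf('loop: %g s, vectorized: %g s\n', t_loop, t_vec);
```

Both versions give the same result; the timings will vary by machine and interpreter, but the vectorized form is typically faster.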

  20. Octave download and libraries To download octave for your home PC or laptop, go to: http://octave.sourceforge.net/ To install a package, from within octave, run: pkg install package_file_name.tar.gz For list of packages choose “Packages” from top menu:

  21. Course outline • 1. Learning to program in Matlab/octave • 2. Applications in Mathematical Biology, including: • Elementary image processing • Linear regression • Markov processes and Fisher-Wright model • Difference equations • Ordinary differential equations • 3. R programming • 4. Statistics and Bioinformatics using R • Linear models • Statistical hypothesis testing and linear models • Expression data analysis • Analysis of high-throughput sequencing data

  22. Further reading • There isn’t yet a good Systems Biology textbook that I’m aware of. • I do not recommend this one →

  23. Further reading • “MATLAB Programming for Engineers” by Stephen J. Chapman (Brooks/Cole) • “Mathematical Models in Biology” by Elizabeth S. Allman and John A. Rhodes (Cambridge Univ Press) • “Introductory Statistics with R” by Peter Dalgaard (Springer) • “Bioinformatics and Functional Genomics” by Jonathan Pevsner (Wiley)

  24. Octave • To start octave, open a terminal window and enter the command “octave”

  25. Octave basics • Getting help • Within octave type • help <command>, e.g. “help sort” • User-friendly online help available at http://www.mathworks.com/help/techdoc/ • GNU Octave help: http://www.gnu.org/software/octave/doc/interpreter/
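For instance, after reading `help sort` one might try the function on a small vector. This is just an illustrative session; the variable names are arbitrary.

```octave
% At the prompt, "help sort" prints the documentation for sort.
v = [3 1 2];

sorted = sort(v);                          % ascending order
[sorted_desc, idx] = sort(v, 'descend');   % descending, with original indices

disp(sorted);
disp(sorted_desc);
disp(idx);
```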

  26. Octave basics • Files and directories • A MATLAB script file (called an M-file) is a plain-text (ASCII) file that contains one or more MATLAB commands and, optionally, comments. • The file is saved with the extension ".m". • When the filename (without the extension) is issued as a command in MATLAB, the file is opened, read, and the commands are executed as if input from the keyboard. • Download the file calc_area.m from the course website • http://www.ams.stonybrook.edu/~maccarth/teaching.shtml • Place the file in subdirectory “work”
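The contents of calc_area.m are not shown in the transcript. A script of this kind might look like the following hypothetical sketch, which assumes only that the file defines a radius and computes the area of a circle.

```octave
% calc_area.m (hypothetical contents; the real course file may differ)
% Computes the area of a circle from its radius.
radius = 2;                      % assumed starting value
area = pi * radius^2;
fprintf('Area of a circle of radius %g is %g\n', radius, area);
```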

  27. MATLAB Script Files • The preceding file is executed by issuing a MATLAB command: >> calc_area • This single command causes MATLAB to look in the current directory, and if a file calc_area.m is found, open it and execute all of the commands. • If MATLAB cannot find the file in the current working directory, an error message will appear.

  28. MATLAB Script Files • When the file is not in the current working directory, a cd or chdir command may be issued to change the directory. >> cd ~/work >> calc_area

  29. Octave basics • The search path • Matlab/Octave also uses a search path to find M-files • The M-files are organized in directories, which Matlab/Octave searches in order • To add a directory to the search path: • addpath('<directory_name>'); e.g. addpath('~/work') • savepath; • You should now be able to run calc_area.m even if it is not in your current directory; simply type: • calc_area • Now open the file calc_area.m with 'gedit' • Applications – Accessories – Text Editor • Change the radius to 3 and re-run 'calc_area'
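The effect of the search path can be tried without touching your own files. The sketch below creates a temporary directory, writes a small script into it, adds the directory to the path, and runs the script by name. The script name `demo_area` and the variable `demo_result` are made up for this example.

```octave
% Create a temporary directory containing a small script file.
d = tempname();                  % unique temporary path
mkdir(d);
fid = fopen(fullfile(d, 'demo_area.m'), 'w');
fprintf(fid, 'radius = 3;\ndemo_result = pi * radius^2;\n');
fclose(fid);

addpath(d);                      % the directory is now on the search path
demo_area;                       % found via the path, not the current directory
disp(demo_result);

rmpath(d);                       % remove it again to leave the path unchanged
```

Here `savepath` is deliberately not called, so the change lasts only for the current session.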