Introduction to Systems Biology • 詹鎮熊, Postdoctoral Researcher, Department of Computer Science and Information Engineering, National Taiwan University
Features of a system • Components • Interrelated components • Boundary • Purpose • Environment • Interfaces • Input • Output • Constraints
Life's Complexity Pyramid • System • Functional modules • Building blocks • Components (Z. N. Oltvai and A.-L. Barabási, Science 298, 763 (2002))
Biosphere • Ecosystem • Community • Population • Organism • Organ system • Tissue • Cell • Molecule • Atom
Organism – Cell – Organelle – Molecules • The human body is made up of trillions of cells • Each cell has: • 46 chromosomes • 2 meters of DNA • 3 billion bases (A, T, G, C) • 20,000 to 30,000 genes
Bottom-up • From genes to phenotypes • If the genome can be fully sequenced, can we uncover all the secrets hidden in the DNA?
Genomics (Genome) • Human Genome Project • Other Genome Projects • Mouse • Fly • Dog • Worm • Bacteria • … • Most recently … Cat
Human Genome Project • Sequence the whole genomes of several individuals • Competition between Celera and NIH • Took over a decade • Draft in 2000, complete in 2003
The next stage: HapMap • HapMap is a catalog of common genetic variants that occur in human beings • It describes: • what these variants are • where they occur in our DNA • and how they are distributed among people within populations and among populations in different parts of the world
Personalized genome • James Watson (454 Life Sciences) • Craig Venter (Venter Institute) • 23andMe (backed by Google, focus on social/family relationships) • Navigenics (focus on medical conditions) • Personal Genome Project (PGP, Harvard)
Proteomics (Proteome) • Catalog all proteins (and their relationships) in a spatially and temporally confined system • Identities of these proteins • Quantities • Variants of these proteins • Alternative splicing forms • Post-translational modifications (Phosphorylation, Methylation, Ubiquitination, …)
Fluorescence Resonance Energy Transfer (FRET) • Co-localization (interaction) between protein-protein and protein-DNA pairs
Transcriptome • Identify all transcription factors (TFs) functioning in a specific spatially and temporally confined system • Identify all genes regulated by specific TFs • ChIP-chip • TRANSFAC database
Chromatin Immunoprecipitation (ChIP) • A well-established procedure used to investigate interactions between DNA-binding proteins and DNA in vivo
Interactome • Catalog all interactions (protein-protein or protein-DNA) within an organism • Yeast Two-Hybrid • Co-immunoprecipitation (co-IP) • Mass Spectrometry • FRET • …
Metabolomics (Metabolome) • “systematic study of the unique chemical fingerprints that specific cellular processes leave behind” • Collection of all metabolites in a biological organism
Analytical methods for metabolomics • Separation • Gas Chromatography (GC) • High performance liquid chromatography (HPLC) • Capillary electrophoresis (CE) • Detection • Mass Spectrometry • Nuclear magnetic resonance (NMR) spectroscopy
Glycomics • Oligosaccharide • Glycoprotein/Proteoglycan • Proteins attached to oligosaccharides • Important to cell recognition • Cancer targeting • Influenza
Model Organisms • Yeast (S. cerevisiae) • Worm (C. elegans) • Fruit Fly (D. melanogaster) • Mouse (M. musculus)
Monitoring the System • High-throughput monitoring of gene expression • Microarray • Protein microarray • GC/HPLC/MS/Tandem MS • Phenotype/Disease
Phenotypes • Lethality • Synthetic lethal • Developmental • Morphological • Behavioral • Diseases
Genotypes and Phenotypes • genotype + environment → phenotype • genotype + environment + random variation → phenotype
Importance of Computer Models • Interactions in the cell are too complex to handle with pen and paper • With high-throughput tools, biology shifts from descriptive to predictive • Computers are required to store, process, assemble, and model all high-throughput data into networks
Types of Computer Models • Chemical Kinetic Model • Defined by concentrations of different molecular species in the cell • Represented with a number of equations • Some processes may be stochastic • Simplified Discrete Circuit • Network with nodes and arrows • Nodes represent quantity or other attributes • Directed edges represent effect of nodes on other nodes
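The two model types above can be illustrated with a minimal sketch. It assumes a toy reaction A → B with a made-up rate constant for the kinetic model, and a hypothetical three-node circuit (X activates Y, Y activates Z, Z represses X) for the discrete model. The function names, species, and parameter values are all illustrative and do not come from the slides.

```python
def simulate_kinetics(a0, b0, k, dt, steps):
    """Forward-Euler integration of the toy reaction A -> B (rate constant k)."""
    a, b = a0, b0
    trajectory = [(a, b)]
    for _ in range(steps):
        flux = k * a * dt  # mass-action amount converted during this step
        a, b = a - flux, b + flux
        trajectory.append((a, b))
    return trajectory


def step_circuit(state, rules):
    """Synchronously update every node of a Boolean circuit."""
    return {node: rule(state) for node, rule in rules.items()}


# Hypothetical circuit: each rule is the effect of a directed edge.
RULES = {
    "X": lambda s: not s["Z"],  # Z represses X
    "Y": lambda s: s["X"],      # X activates Y
    "Z": lambda s: s["Y"],      # Y activates Z
}

# Chemical kinetic model: concentrations of A and B over time.
kinetic_trajectory = simulate_kinetics(a0=1.0, b0=0.0, k=0.5, dt=0.01, steps=1000)

# Simplified discrete circuit: on/off states of X, Y, Z over time.
circuit_state = {"X": True, "Y": False, "Z": False}
circuit_trajectory = [circuit_state]
for _ in range(6):
    circuit_state = step_circuit(circuit_state, RULES)
    circuit_trajectory.append(circuit_state)
```

The kinetic model tracks continuous concentrations, while the circuit only tracks whether each node is on or off. With these rules the circuit returns to its starting state after six synchronous updates.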
Different Mathematical Formulations • Differential Equations • Linear (ordinary) • Partial • Stochastic • S-Systems • Power-law formulation • Captures complicated dynamics • Parameter estimation is computationally intensive
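As a rough illustration of the power-law formulation, an S-system is commonly written as dX_i/dt = α_i ∏_j X_j^g_ij − β_i ∏_j X_j^h_ij. The sketch below integrates a hypothetical two-variable system with forward Euler. The parameter values are invented for illustration, and a production model would normally use an adaptive ODE solver instead.

```python
import math


def s_system_rates(x, alpha, beta, g, h):
    """Return dX_i/dt = alpha_i * prod_j X_j**g_ij - beta_i * prod_j X_j**h_ij."""
    rates = []
    for i in range(len(x)):
        production = alpha[i] * math.prod(xj ** g[i][j] for j, xj in enumerate(x))
        degradation = beta[i] * math.prod(xj ** h[i][j] for j, xj in enumerate(x))
        rates.append(production - degradation)
    return rates


def integrate(x0, alpha, beta, g, h, dt=0.001, steps=20000):
    """Forward-Euler integration of the S-system from the initial state x0."""
    x = list(x0)
    for _ in range(steps):
        dx = s_system_rates(x, alpha, beta, g, h)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


# Hypothetical parameters: X1 is produced at a constant rate and degraded
# linearly; X2 is produced in proportion to X1**0.5 and degraded linearly.
ALPHA = [2.0, 1.0]
BETA = [1.0, 1.0]
G = [[0.0, 0.0], [0.5, 0.0]]  # kinetic orders of the production terms
H = [[1.0, 0.0], [0.0, 1.0]]  # kinetic orders of the degradation terms

steady_state = integrate([1.0, 1.0], ALPHA, BETA, G, H)
```

For these parameters the analytical steady state is X1 = 2 and X2 = √2, which gives a quick check on the integration. Fitting the exponents g and h to data is the parameter-estimation step the slide describes as computationally intensive.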
Model details • Selection of genes, gene products, and other molecules to be included • Cellular compartments: nucleus, Golgi, or other organelles • Too much detail may introduce more noise • A minimal model able to predict system properties (mRNA level, growth rate, etc.) is sufficient
Construct Model from Global Patterns • Microarray gene expression patterns: Up-regulated/down-regulated • Gene expression profiles under different conditions: Tumor/normal, cell cycle, drug treatment, … • Methods: • Bayesian inference • Machine learning (clustering, classification) • …
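One simple way to turn expression measurements into up-regulated/down-regulated patterns is to threshold the log2 fold change between two conditions. The sketch below assumes made-up intensity values, placeholder gene names, and a two-fold cutoff. It is a first step only and does not implement the Bayesian or machine-learning methods listed above.

```python
import math


def classify_regulation(control, treated, threshold=1.0):
    """Label each gene by its log2 fold change (treated vs. control)."""
    calls = {}
    for gene, baseline in control.items():
        log2_fc = math.log2(treated[gene] / baseline)
        if log2_fc >= threshold:
            calls[gene] = "up"
        elif log2_fc <= -threshold:
            calls[gene] = "down"
        else:
            calls[gene] = "unchanged"
    return calls


# Made-up expression intensities for three placeholder genes.
CONTROL = {"geneA": 100.0, "geneB": 80.0, "geneC": 50.0}
TREATED = {"geneA": 400.0, "geneB": 20.0, "geneC": 55.0}

calls = classify_regulation(CONTROL, TREATED)
```

A threshold of 1.0 on the log2 scale corresponds to a two-fold change. Real analyses typically add replicates and a statistical test before calling a gene up- or down-regulated.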
Tools for Simulation • E-Cell • Cell Illustrator • Virtual Cell • Standardization efforts: • BioJake • SBML (Systems Biology Markup Language) • Facilitates the exchange of models
E-Cell System • Software for constructing object models equivalent to a cell system or a part of one • Employs the Structured Variable-Process model (previously called the Substance-Reactor model, or SRM) • Objects: • Variables, Processes, Systems
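To make the three object types concrete, here is a minimal sketch of how Variables, Processes, and Systems could relate to one another. This is not the E-Cell API. The class layout, field names, and the toy reaction S → P are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Variable:
    """A quantity that changes over time, e.g. the amount of a molecule."""
    name: str
    value: float


@dataclass
class Process:
    """A rule that changes Variables according to its stoichiometry."""
    name: str
    stoichiometry: Dict[str, int]  # variable name -> coefficient
    rate: Callable[[Dict[str, Variable]], float]


@dataclass
class System:
    """A compartment grouping Variables and the Processes acting on them."""
    name: str
    variables: Dict[str, Variable] = field(default_factory=dict)
    processes: List[Process] = field(default_factory=list)

    def step(self, dt: float) -> None:
        # Evaluate all rates first so every Process sees the same state.
        rates = [p.rate(self.variables) for p in self.processes]
        for process, rate in zip(self.processes, rates):
            for var_name, coeff in process.stoichiometry.items():
                self.variables[var_name].value += coeff * rate * dt


# Hypothetical compartment with one reaction S -> P at rate 0.1 * [S].
cytoplasm = System(
    name="cytoplasm",
    variables={"S": Variable("S", 10.0), "P": Variable("P", 0.0)},
    processes=[
        Process(
            name="conversion",
            stoichiometry={"S": -1, "P": 1},
            rate=lambda v: 0.1 * v["S"].value,
        )
    ],
)
for _ in range(100):
    cytoplasm.step(dt=0.1)
```

Keeping state (Variables) separate from the rules that change it (Processes) lets the same compartment be reused with different reaction sets.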
Computational Databases • Protein-protein interaction • DIP, BIND, MIPS, MINT, IntAct, POINT, BioGRID • Protein-DNA interaction • TRANSFAC, SCPD • Metabolic pathways • KEGG, EcoCyc, WIT, Reactome • Gene Expression • GEO, ArrayExpress, GNF, NCI60, commercial • Gene Ontology
Network Biology • The entities within a system form intertwined complex networks • Genes • Proteins • Metabolites • External factors…
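A network of this kind can be represented as a graph whose nodes are genes, proteins, or metabolites and whose edges are interactions. The sketch below builds an undirected adjacency map from a made-up edge list and counts each node's interaction partners. The node names (P1, M1, and so on) are placeholders, not real molecules.

```python
from collections import defaultdict


def build_network(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return dict(neighbors)


def degrees(network):
    """Number of interaction partners for each node."""
    return {node: len(partners) for node, partners in network.items()}


# Made-up interactions among placeholder proteins (P*) and a metabolite (M1).
EDGES = [
    ("P1", "P2"),
    ("P1", "P3"),
    ("P1", "P4"),
    ("P2", "P3"),
    ("P4", "M1"),
]

network = build_network(EDGES)
node_degrees = degrees(network)
hub = max(node_degrees, key=node_degrees.get)  # most connected node
```

In this toy network P1 has the most partners. Highly connected nodes like this are often called hubs.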