BIOLOGICAL NETWORKS Woochang Hwang
BIOLOGICAL NETWORKS • Introduction • Biological Networks • Protein-Protein Interaction Networks • Signaling & Metabolic Pathway Networks • Expression Networks • Biological Networks’ Properties • Databases • Discussion • STM Clustering Model
Bioinformatics • Informatics: its carrier is a set of digital codes and a language. In its manifestation in the space-time continuum, it has utility (e.g. to decrease the entropy of an open system). • Bioinformatics: the essence of life is information (i.e. from digital code to the emerging properties of biosystems). • Bioinformatics is the study of the information content of life.
Genomics Proteomics Structural Proteomics Functional Proteomics Protein-Protein Interaction & Networking Structure Determination Database / Knowledge Source Protein Expression Post-translational Modification Homology Modeling Database / Knowledge Source Proteomics
From the particular to the universal A.-L. Barabási & Z. Oltvai, Science, 2002
BIOLOGICAL NETWORK Networks are found in biological systems of varying scales: 1. Evolutionary tree of life 2. Ecological networks 3. Expression networks 4. Regulatory networks - genetic control networks of organisms 5. The protein interaction network in cells 6. The metabolic network in cells … more biological networks
Why Study Networks? • It is increasingly recognized that complex systems cannot be described in a reductionist view. • Understanding the behavior of such systems starts with understanding the topology of the corresponding network. • Topological information is fundamental in constructing realistic models for the function of the network.
Biological Network Model • Network • A graph: a set of nodes connected by edges. • Node • Protein, peptide, or non-protein biomolecule. • Edges • Biological relationships, e.g. interactions, regulations, reactions, transformations, activations, inhibitions.
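The node-and-edge model above can be sketched as a small adjacency structure in which every edge carries a relationship label. The class name `BioNetwork`, its methods, and the example molecule names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


class BioNetwork:
    """Graph of biomolecules whose edges carry a relationship label."""

    def __init__(self):
        # node -> {neighbor: relationship label}
        self.adj = defaultdict(dict)

    def add_edge(self, a, b, relation, directed=False):
        """Record a relationship; undirected edges are stored in both directions."""
        self.adj[a][b] = relation
        if directed:
            self.adj.setdefault(b, {})  # make sure the target node exists
        else:
            self.adj[b][a] = relation

    def neighbors(self, node):
        """Nodes reachable from `node` over one stored edge, sorted by name."""
        return sorted(self.adj[node])

    def degree(self, node):
        """Number of stored edges leaving `node` (out-degree for directed edges)."""
        return len(self.adj[node])


# Hypothetical example: one physical interaction and one directed activation.
net = BioNetwork()
net.add_edge("ProteinA", "ProteinB", "interaction")
net.add_edge("ProteinA", "GeneX", "activation", directed=True)
```

Keeping the label on the edge lets the same structure hold interaction, regulatory, and pathway relationships side by side.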
Biological Network Model • It is usually represented by a 2-D diagram with characteristic symbols linking the protein and non-protein entities. • A circle indicates a protein or a non-protein biomolecule. • A symbol in between indicates the nature of the molecule-molecule process (activation, inhibition, association, dissociation, etc.)
Proteins in a cell • There are thousands of different active proteins in a cell acting as: • enzymes, which catalyze the chemical reactions of metabolism • components of cellular machinery (e.g. ribosomes) • regulators of gene expression • Certain proteins play specific roles in special cellular compartments. • Others move from one compartment to another as “signals”.
Protein Interactions • Proteins often perform a function as part of a complex rather than as a single protein. • Knowing whether two proteins interact can help us discover unknown proteins’ functions: • If the function of one protein is known, the function of its binding partners is likely to be related: “guilt by association”. • Thus, a good method for detecting interactions allows us to use a small number of proteins with known function to characterize new proteins.
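One simple way to read "guilt by association" is as a majority vote over the annotated binding partners of an uncharacterized protein. The sketch below assumes that reading; the function name `predict_function`, the protein identifiers, and the function labels are made up for illustration and are not taken from the slides.

```python
from collections import Counter


def predict_function(protein, interactions, known_functions):
    """Return the most common known function among a protein's partners, or None."""
    partners = set()
    for a, b in interactions:
        if a == protein:
            partners.add(b)
        elif b == protein:
            partners.add(a)
    # Only partners with a known annotation get a vote.
    votes = Counter(known_functions[p] for p in partners if p in known_functions)
    if not votes:
        return None
    # Break ties deterministically: highest count first, then alphabetical.
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))[0][0]


# Hypothetical data: P1 is uncharacterized, three of its partners are annotated.
interactions = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P5", "P6")]
known_functions = {"P2": "DNA repair", "P3": "DNA repair", "P4": "transport"}
prediction = predict_function("P1", interactions, known_functions)
```

A protein whose partners are all unannotated gets no prediction, which is why coverage of this approach depends on how many interactions have been detected.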
Protein Interactions P. Uetz, et al. Nature, 2000; Ito et al., PNAS, 2001; …
Yeast Protein Interaction Network Nodes: proteins Links: physical interactions (binding)
Signaling & Metabolic Pathway Network • A pathway can be defined as a modular unit of interacting molecules that fulfills a cellular function. • Signaling Pathway Networks • In biology, a signal or biopotential is an electric quantity (voltage, current, or field strength) caused by chemical reactions of charged ions. • Signaling can refer to any process by which a cell converts one kind of signal or stimulus into another. • Another use of the term lies in describing the transfer of information between and within cells, as in signal transduction. • Metabolic Pathway Networks • A metabolic pathway is a series of chemical reactions occurring within a cell, catalyzed by enzymes, resulting in either the formation of a metabolic product to be used or stored by the cell, or the initiation of another metabolic pathway.
Regulatory Network • A collection of DNA segments (genes) in a cell which interact with each other and with other substances in the cell, thereby governing the rates at which genes in the network are transcribed into mRNA.
Expression Network • A network representation of genomic data. • Inferred from genomic data, e.g. microarray experiments.
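The slide does not say how the network is inferred. One common approach, assumed here purely for illustration, is to link two genes when their expression profiles are strongly correlated. The threshold of 0.9, the gene names, and the expression values below are all hypothetical, and the sketch assumes no profile is constant (a constant profile would give zero variance).

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_edges(profiles, threshold=0.9):
    """Link every gene pair whose absolute correlation reaches the threshold."""
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical expression levels measured under four conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [4.0, 1.0, 3.5, 1.2],
}
edges = coexpression_edges(profiles)
```

Here only geneA and geneB rise and fall together, so they are the only pair that ends up linked.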
BIOLOGICAL NETWORK PROPERTY • Interaction Network • Pathway Network • Regulatory Network • Expression Network
Biological Networks Properties • Power law degree distribution: Rich get richer • Small World: A small average path length • Mean shortest node-to-node path • Robustness: strongly resistant to random failures but vulnerable to targeted attacks • Hierarchical Modularity: A large clustering coefficient • How many of a node’s neighbors are connected to each other
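Two of the quantities named above, the clustering coefficient and the mean shortest node-to-node path, can be computed directly from an adjacency map. The following is a minimal sketch for a small undirected graph; the four-node example graph is hypothetical.

```python
from collections import deque


def clustering_coefficient(adj, node):
    """Fraction of pairs of `node`'s neighbors that are themselves connected."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def average_path_length(adj):
    """Mean shortest-path length over all ordered pairs of reachable nodes."""
    total, pairs = 0, 0
    for source in adj:
        # Breadth-first search gives shortest hop counts from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical undirected graph: a triangle A-B-C with a pendant node D on A.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
```

In this example node B has a clustering coefficient of 1 because its two neighbors are linked, while D has 0 because it has only one neighbor.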
Power Law Network • PREFERENTIAL ATTACHMENT on Growth: the probability that a new vertex will be connected to vertex i depends on the connectivity $k_i$ of that vertex: $\Pi(k_i) = k_i / \sum_j k_j$
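Growth with preferential attachment can be sketched as follows: each new node links to existing nodes with probability proportional to their current degree. The function name, the parameters `n` and `m`, and the fully connected starting core are illustrative assumptions rather than details from the slides.

```python
import random
from collections import Counter


def preferential_attachment_graph(n, m, seed=0):
    """Grow a graph to n nodes; every new node attaches to m distinct existing nodes."""
    rng = random.Random(seed)
    # Assumed starting point: a fully connected core of m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Each node appears in `stubs` once per edge end, so a uniform draw
    # from `stubs` picks a node with probability proportional to its degree.
    stubs = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in sorted(targets):
            edges.append((new, t))
            stubs.extend((new, t))
    return edges


edges = preferential_attachment_graph(200, 2)
degree = Counter(v for edge in edges for v in edge)
```

Early nodes have more chances to collect links, which is the "rich get richer" effect that produces hubs.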
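Power Law Network (Scale Free) • Figure panels: (a) Random Networks, (b) Power law Networks; labels: the Barabási-Albert [BA] model, ER Model, WS Model, Actors, Power Grid, WWW. • In random networks, the probability of finding a highly connected node decreases exponentially with k; in power-law (scale-free) networks it decays as a power law, $P(k) \sim k^{-\gamma}$.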
Small World Property • A small average path length • Any node can be reached within a small number of edges, 4~5 hops.
Power Law Network • Power-law degree distribution & small-world phenomena are also observed in: • communication networks • web graphs • research citation networks • social networks • Classical Erdős-Rényi type random graphs do not exhibit these properties: • Links between pairs of a fixed set of nodes are picked uniformly at random • Maximum degree is logarithmic in network size • No hubs to make short connections between nodes
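For comparison, a classical random graph can be sketched in a few lines: every possible pair of nodes is linked independently with the same probability. The function names and the parameter values (500 nodes, link probability 0.01) are illustrative assumptions.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=0):
    """Link each of the n*(n-1)/2 node pairs independently with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


def max_degree(n, edges):
    """Largest number of links attached to any single node."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return max(degree) if degree else 0


edges = erdos_renyi(500, 0.01)
largest = max_degree(500, edges)
```

Because every pair is treated identically, degrees stay close to the average and no node grows into a hub, in contrast to preferential attachment.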
Attack Tolerance (figure label: node failure) • Complex systems maintain their basic functions even under errors and failures (cell mutations; Internet router breakdowns)
Attack Tolerance • Robust: for $\gamma < 3$, removing nodes at random does not break the network into islands. • Very resistant to random attacks, but attacks targeting key nodes are more dangerous. • Figure axes: Max Cluster Size, Path Length.
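The difference between losing an arbitrary node and losing a key node can be illustrated by measuring the largest connected cluster after removal. The sketch below uses a hypothetical hub-and-spoke network; the function names are illustrative, and the targeted attack simply removes the highest-degree nodes of the original graph.

```python
def largest_component_size(adj, removed=frozenset()):
    """Size of the largest connected cluster once `removed` nodes are deleted."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        # Depth-first traversal of the cluster containing `start`.
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


def targeted_attack(adj, count):
    """Pick the `count` most connected nodes of the original graph."""
    ranked = sorted(adj, key=lambda node: len(adj[node]), reverse=True)
    return set(ranked[:count])


# Hypothetical network: one hub linked to ten otherwise unconnected leaves.
adj = {"hub": {f"leaf{i}" for i in range(10)}}
for i in range(10):
    adj[f"leaf{i}"] = {"hub"}

after_leaf_failure = largest_component_size(adj, {"leaf0"})
after_hub_attack = largest_component_size(adj, targeted_attack(adj, 1))
```

Losing a leaf barely changes the largest cluster, while losing the hub leaves only isolated nodes.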
Protein Interaction Network H. Jeong, S.P. Mason, A.-L. Barabasi & Z.N. Oltvai, Nature, 2001
Protein Interaction Network • The yeast protein interaction network seems to reveal some basic graph-theoretic properties: • The frequency of proteins having interactions with exactly k other proteins follows a power law. • The network exhibits the small-world phenomenon: any node can be reached within a small number of hops, usually 4 or 5. • Robustness: the network is strongly resistant to random failures but vulnerable to targeted attacks.
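Checking the first property starts with the degree distribution: the fraction of proteins that have exactly k interaction partners. The sketch below computes it from an edge list and estimates the decay exponent with a straight-line fit on log-log axes. That fit is a crude estimator assumed here for illustration, not the method used in the cited study, and nodes with no interactions are not counted because they never appear in the edge list.

```python
from collections import Counter
from math import log


def degree_distribution(edges):
    """Map each degree k to the fraction of nodes having exactly k partners."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    counts = Counter(degree.values())
    n = len(degree)
    return {k: c / n for k, c in sorted(counts.items())}


def loglog_slope(dist):
    """Least-squares slope of log P(k) against log k; roughly -gamma for a power law."""
    xs = [log(k) for k in dist]
    ys = [log(p) for p in dist.values()]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical interaction list: one protein binding three others.
edges = [("hub", "p1"), ("hub", "p2"), ("hub", "p3")]
dist = degree_distribution(edges)
```

On real interaction data the fit needs several distinct degree values; a distribution with a single degree value has no slope to estimate.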
Hierarchical Modularity E. Ravasz et al., Science, 2002
Hierarchical Modularity Protein Networks Metabolic Networks E. Ravasz et al., Science, 2002
Implications From Observations • Biological complexity: $\#\,\text{states} \sim 2^{\#\,\text{genes}}$. • Protein hubs critical for cells, 45%. • Infections will target highly connected nodes. • Cascading node failures could cause a critical problem. • Developing drugs and treatments with novel strategies, such as targeting key nodes, is indispensable.
Protein Databases • Swiss-Prot (non-redundant database): • Release 41.0, 11/4/2003: 124,464 entries. • Release 41.5, 23/4/2002: 125,236 entries. • TrEMBL (translations of EMBL nucleotide sequences not yet integrated into Swiss-Prot): • Release 23.7, 17/4/2003: 863,248 entries • This number keeps growing rapidly, mainly due to large-scale sequencing projects.
Protein Interaction Databases • Species-specific • FlyNets - Gene networks in the fruit fly • MIPS - Yeast Genome Database • RegulonDB - A DataBase On Transcriptional Regulation in E. coli • SoyBase • PIMdb - Drosophila Protein Interaction Map database • Function-specific • Biocatalysis/Biodegradation Database • BRITE - Biomolecular Relations in Information Transmission and Expression • COPE - Cytokines Online Pathfinder Encyclopaedia • Dynamic Signaling Maps • EMP - The Enzymology Database • FIMM - A Database of Functional Molecular Immunology • CSNDB - Cell Signaling Networks Database
Protein Interaction Databases • Interaction type-specific • DIP - Database of Interacting Proteins • DPInteract - DNA-protein interactions • Inter-Chain Beta-Sheets (ICBS) - A database of protein-protein interactions mediated by interchain beta-sheet formation • Interact - A Protein-Protein Interaction database • GeneNet (Gene networks) • General • BIND - Biomolecular Interaction Network Database • BindingDB - The Binding Database • MINT - a database of Molecular INTeractions • PATIKA - Pathway Analysis Tool for Integration and Knowledge Acquisition • PFBP - Protein Function and Biochemical Pathways Project • PIM (Protein Interaction Map)
Pathway Databases • KEGG (Kyoto Encyclopedia of Genes and Genomes) • http://www.genome.ad.jp/kegg/ • Institute for Chemical Research, Kyoto University • PathDB • http://www.ncgr.org/pathdb/index.html • National Center for Genome Resources • SPAD: Signaling PAthway Database • Graduate School of Genetic Resources Technology, Kyushu University • Cytokine Signaling Pathway DB • Dept. of Biochemistry, Kumamoto Univ. • EcoCyc and MetaCyc • Stanford Research Institute • BIND (Biomolecular Interaction Network Database) • UBC, Univ. of Toronto
KEGG • Pathway Database: Computerize current knowledge of molecular and cellular biology in terms of the pathway of interacting molecules or genes. • Genes Database: Maintain gene catalogs of all sequenced organisms and link each gene product to a pathway component • Ligand Database: Organize a database of all chemical compounds in living cells and link each compound to a pathway component • Pathway Tools: Develop new bioinformatics technologies for functional genomics, such as pathway comparison, pathway reconstruction, and pathway design
Discussion • Problems • Network Inference • Microarrays, protein chips, other high-throughput assay methods • Function prediction • The function of 40-50% of the new proteins is unknown • Understanding biological function is important for: • Study of fundamental biological processes • Drug design • Genetic engineering • Functional module detection • Cluster analysis • Topological Analysis • Descriptive and Structural • Locality Analysis • Essential Component Analysis • Dynamics Analysis • Signal Flow Analysis • Metabolic Flux Analysis • Steady State, Response, Fluctuation Analysis • Evolution Analysis • Biological networks are very rich networks, but the available information about them is very limited, noisy, and incomplete. • Discovering underlying principles is very challenging.
University at Buffalo, The State University of New York • Signal Transduction Model Based Functional Module Detection Algorithm for Protein-Protein Interaction Networks • Woochang Hwang (1), Young-Rae Cho (1), Aidong Zhang (1), Murali Ramanathan (2) • (1) Department of Computer Science and Engineering, State University of New York at Buffalo • (2) Department of Pharmaceutical Sciences, State University of New York at Buffalo