Networks
A series of entities or NODES (genes, proteins, metabolites, individuals, ecosystems, etc.) and the interactions or EDGES between them.
• Directed graph: connections have directionality (e.g. kinase – substrate connections)
• Undirected graph: connections have no directionality (e.g. protein-protein binding)
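As a rough illustration of the node/edge definitions, a network can be stored as an adjacency map. The sketch below is a minimal Python version; the function name `build_graph` and the kinase/substrate labels are hypothetical and only serve the example.

```python
def build_graph(edges, directed=False):
    """Return an adjacency map {node: set of neighbors} from (source, target) pairs."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set())  # make sure the target is registered as a node
        if not directed:
            # An undirected edge is stored in both directions.
            adjacency[target].add(source)
    return adjacency


# Hypothetical edges, for illustration only.
kinase_substrate = [("KinaseA", "SubstrateB"), ("KinaseA", "SubstrateC")]

directed_graph = build_graph(kinase_substrate, directed=True)
undirected_graph = build_graph(kinase_substrate, directed=False)

print(directed_graph["SubstrateB"])    # no outgoing edges in the directed version
print(undirected_graph["SubstrateB"])  # linked back to the kinase in the undirected version
```

In the directed version a substrate has no outgoing edges, whereas in the undirected version the same edge can be followed from either end.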
Network Analysis
Goal: to turn a list of genes/proteins/metabolites into a network to capture insights about the biological system
Today:
1. Types of high-throughput data amenable to network analysis
2. Network theory and its relationship to biology
Physical Interactions: protein-protein interactions
Data from:
1. Large-scale yeast two-hybrid assay: recovers binary (1:1) interactions (Giorgini & Muchowski Gen. Biol. 2005)
2. Protein immunoprecipitation & mass-spec identification: recovers complexes (a peptide-tagged protein is immunoprecipitated, then mass spectrometry is used to identify the recovered proteins)
3. Literature curation
Nature 2005 Y2H + literature curation
Protein Arrays
Proteins or antibodies immobilized onto a solid surface
• Antibody arrays: for identification & quantification of fluorescently labeled proteins in complex mixtures … proteins bind to the immobilized Ab.
• Functional arrays: for measuring protein function
* ppi: detect binding of fluorescent protein to immobilized peptides/proteins
* kinase targets: detect phosphorylation of immobilized peptides/proteins by query kinase
* ligand binding: detect DNA/carbohydrate/small molecule bound to immobilized proteins
• Reverse-phase arrays (lysate arrays): cells are lysed in situ and the immobilized cell lysate is screened
Challenges:
• Large-scale protein purification
• Protein structure/stability requirements vary widely (unlike DNA)
• Conditions for protein function vary widely
• Protein epitope/binding domain must be displayed properly
From Hall, Ptacek, & Snyder review 2007
High-throughput identification of gene/protein function: Functional Genomics
Gene knock-out libraries: a library of single-gene deletions covering every gene
• Done in yeast, E. coli, other fungi/bacteria
• S. cerevisiae libraries: homozygous deletion (nonessential genes) OR heterozygous deletion of all genes
• Each gene is replaced with a short, unique ‘barcode’ sequence
• Strains can be phenotyped individually (screening) OR selected for particular phenotypes
• Strains surviving the selection can be read out on DNA arrays designed against the barcode sequences
Yeast deletion library used to:
• Identify ‘essential’ yeast genes and genes required for normal growth
• Identify genes required for survival of particular conditions/drugs
• Study features of functional genomics, gene networks, etc.
* Screened deletion libraries for >700 conditions
* Found a ‘phenotype’ for nearly all yeast genes
* Characterized which genes could be functionally profiled by which assays (e.g. phenotype, gene expression, etc.)
Challenges:
• Difficult to probe ‘essential’ and slow-growing strains
• Cells are likely to pick up secondary mutations that compensate for the missing gene (e.g. chromosomal aneuploidy in yeast)
Science 2010 Pairwise deletions to measure genetic interactions for 75% of yeast genes
High-throughput identification of gene/protein function: Functional Genomics
RNAi knock-down libraries (C. elegans, flies, humans)
Small double-stranded RNAs (dsRNAs) complementary to the mRNA can be injected (or fed) … these are targeted by the RNAi pathway to inhibit mRNA stability/translation of the target gene … which knocks down protein abundance/function
Challenges:
• Doesn’t work for all genes/dsRNAs
• Doesn’t work in all tissues
• Delay in protein decrease; timing is different for different proteins
Image from David Shapiro
Insights from whole-genome knockout / knockdown studies
* Screens for genes important for specific phenotypes/processes
* Identifying off-target drug effects
* Clustering of genes based on common phenotypes from knockdowns
* Clustering/analysis of phenotypes with similar underlying genetics/processes
* Integrative analysis with genomic expression, etc.
* Network analysis
Network structures
• Random network: each node has roughly an equal number of connections k, distributed according to a Poisson distribution
• Scale-free network: some nodes have few connections, other nodes (‘hubs’) have many connections (distributed according to a power law)
• Directed vs. undirected graphs
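For reference, the two degree distributions named above are usually written as follows. This is the standard textbook form, not something taken from the slides; the symbol \gamma (the power-law exponent) is introduced here for illustration, and P(k) denotes the probability that a node has k connections.

```latex
% Random network: Poisson degree distribution with mean degree <k>
P(k) = \frac{e^{-\langle k \rangle} \, \langle k \rangle^{k}}{k!}

% Scale-free network: power-law degree distribution with exponent \gamma
P(k) \propto k^{-\gamma}
```

The Poisson form falls off very quickly away from the mean, so highly connected nodes are essentially absent, while the power-law tail decays slowly enough for a few hubs to appear.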
Network Terminology
• Connectivity (degree) k: number of connections of a given node (<k> is the average degree over all nodes)
• Degree distribution P(k): probability that a selected node has k connections
• Shortest path l: fewest number of links connecting two given nodes (<l> is the average shortest path between all node pairs)
• Clustering coefficient C: fraction of the possible links among the k neighbors of Node X that actually exist, C = 2n / (k(k-1)), where n is the # of links connecting the neighbors together
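The terms above can be computed on a small undirected graph with a few lines of Python. This is a minimal sketch on a made-up five-node graph (nodes A to E are hypothetical); the clustering coefficient uses the common normalized form 2n / (k(k-1)), where n is the number of links among the k neighbors.

```python
from collections import deque


def degree(graph, node):
    """Connectivity k: number of connections of a node."""
    return len(graph[node])


def shortest_path_length(graph, start, goal):
    """Fewest links between two nodes, found by breadth-first search (None if unreachable)."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def clustering_coefficient(graph, node):
    """Fraction of possible links among a node's neighbors that actually exist."""
    neighbors = list(graph[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbors[j] in graph[neighbors[i]]
    )
    return 2 * links / (k * (k - 1))


# Hypothetical undirected toy graph, stored as {node: set of neighbors}.
graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}

average_degree = sum(degree(graph, n) for n in graph) / len(graph)
print(degree(graph, "A"))                     # 3
print(average_degree)                         # 2.0
print(shortest_path_length(graph, "B", "E"))  # 3 (B-A-D-E)
print(clustering_coefficient(graph, "A"))     # 1 of 3 possible neighbor links exists
```

Node A has three neighbors (B, C, D) but only the B-C link exists among them, so its clustering coefficient is 1/3.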
Scale-free Networks
• Connectivity: most nodes have few connections but are joined by ‘hub’ nodes with many connections
• ‘Small world’ effect: each node can be connected to any other node through relatively few connections
• ‘Disassortative’: hubs tend NOT to directly connect to one another
• ‘Robust’: network structure remains despite random node removal (up to 80% removal!)
• ‘Hub vulnerability’: network structure is particularly reliant on a few nodes (hubs), so targeted removal of hubs breaks it apart
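The contrast between robustness and hub vulnerability can be explored with a small simulation. The sketch below is illustrative only: it assumes a preferential-attachment growth rule as one common way to obtain a hub-dominated network (the slides do not say how the networks were generated), and the sizes, the 20% removal fraction, and all function names are arbitrary choices for the example.

```python
import random


def preferential_attachment_graph(n, m, rng):
    """Grow an undirected graph in which each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    # Start from a small fully connected seed of m + 1 nodes.
    graph = {i: set() for i in range(m + 1)}
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            graph[i].add(j)
            graph[j].add(i)
    # Each node appears in this list once per edge it takes part in.
    endpoints = [node for node, nbrs in graph.items() for _ in nbrs]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        graph[new] = set()
        for target in targets:
            graph[new].add(target)
            graph[target].add(new)
            endpoints.extend([new, target])
    return graph


def largest_component_fraction(graph, removed):
    """Fraction of the surviving nodes that sit in the largest connected component."""
    alive = set(graph) - set(removed)
    if not alive:
        return 0.0
    seen = set()
    best = 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbor in graph[node]:
                if neighbor in alive and neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        best = max(best, size)
    return best / len(alive)


rng = random.Random(0)  # fixed seed so the run is repeatable
graph = preferential_attachment_graph(n=500, m=2, rng=rng)
n_remove = len(graph) // 5  # remove 20% of the nodes (arbitrary choice)

random_removed = rng.sample(sorted(graph), n_remove)
hubs_removed = sorted(graph, key=lambda node: len(graph[node]), reverse=True)[:n_remove]

random_fraction = largest_component_fraction(graph, random_removed)
hub_fraction = largest_component_fraction(graph, hubs_removed)
print(f"largest component after random removal: {random_fraction:.2f}")
print(f"largest component after hub removal:    {hub_fraction:.2f}")
```

Under these assumptions, removing the most-connected nodes should leave a noticeably smaller connected core than removing the same number of randomly chosen nodes, which is the qualitative point of the ‘Robust’ and ‘Hub vulnerability’ bullets.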
Networks Challenges
• Identifying relevant subnetworks
• Integrating multiple data types (see #1 above)
• Capturing temporary interactions and dynamic relationships
• Using network structure/subnetworks to infer new insights about biology
Networks Challenges
• Infer hypothetical functions based on network connectivity
• Reveal new connections between functional groups and complexes
• Identify motifs and understand motif behaviors (more next time)
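One simple way to infer a hypothetical function from network connectivity is a "guilt by association" vote among a node's annotated neighbors. The sketch below is only one possible approach, not a method named in the slides; the gene names, annotations, and the function `predict_function` are all hypothetical.

```python
from collections import Counter


def predict_function(graph, annotations, node):
    """Return the most common annotation among a node's annotated neighbors,
    or None if no neighbor is annotated."""
    votes = Counter(
        annotations[neighbor] for neighbor in graph[node] if neighbor in annotations
    )
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical interaction network and annotations, for illustration only.
graph = {
    "geneX": {"gene1", "gene2", "gene3"},
    "gene1": {"geneX"},
    "gene2": {"geneX"},
    "gene3": {"geneX"},
    "geneY": set(),
}
annotations = {
    "gene1": "osmotic stress response",
    "gene2": "osmotic stress response",
    "gene3": "ribosome biogenesis",
}

print(predict_function(graph, annotations, "geneX"))  # osmotic stress response
print(predict_function(graph, annotations, "geneY"))  # None
```

A majority vote like this ignores edge confidence and network distance, so it is best read as a starting hypothesis rather than a prediction.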
Inferred NaCl-activated Signaling Network
• Inferred network: 430 proteins, 1,199 edges
• Starting network: 5,855 proteins, 25,906 edges
Figure legend: Kinase; Transcription Factor; Target Gene/Module
Debbie Chasman & Mark Craven
Orthologs of human disease genes are enriched in the network
• 430 proteins
• 188 have one-to-one human orthologs
• 95% of ‘reviewed’ orthologs are disease associated
Figure legend: Disease-associated ortholog; Human ortholog not linked to disease