How can you benefit from the Bioinformatics Resource? Can (John) Bruce, Ph.D., Associate Director, Bioinformatics Resource, Keck Biotechnology Laboratory
The Bioinformatics Core • Created within the Keck Lab at the request of the Yale School of Medicine, July 2007. • Director: Hongyu Zhao, Ph.D.; Associate Directors: Can Bruce, Ph.D., and Yong Kong, Ph.D. • The facility is located at Sterling Hall of Medicine. • Commercial software packages provided free by the Core are available to Yale researchers 24/7.
Services • Access to a large number of widely used commercial and open-source bioinformatics programs. • Fee-based consultation services for well-defined bioinformatics analyses. • Collaborative projects requiring a longer-term commitment of time and effort.
Available programs • DNA/protein sequence analysis: Lasergene and Gene Construction Kit. • Pathway analysis: Ingenuity Pathway Analysis and MetaCore. • Protein structure modeling: Sybyl, a protein structure modeling and visualization program. • Mass spectrometry data analysis: GPMAW. • Pipelining programs: Pipeline Pilot and VIBE.
Examples of Current Collaborations • Pathway analysis on proteomics data (Yale/NIDA Proteomics Center Project and Yale/NHLBI Proteomics Center Project investigators). • Development of an algorithm for identification of phosphorylation sites from tandem mass spectrometry data (E. Gulcicek in Keck Proteomics). • Molecular modeling of MAP kinase-ligand interactions (B. Turk in Pharmacology). • Sequence analysis for defining an invention claim for the Office of Collaborative Research.
Microarray analysis software • GeneSpring GX provides visualization and advanced statistical analysis for gene expression data. • Partek Genomics Suite provides advanced statistics and interactive data visualization designed for gene expression analysis, exon expression analysis, promoter tiling array analysis, chromosomal copy number analysis, and SNP analysis.
Sequence Analysis Software • DNASTAR Lasergene, a comprehensive suite of programs for analysis of DNA/RNA/protein sequences, including sequence editing, sequence assembly, sequence alignment, primer design, protein structure prediction, and gene detection and annotation. • Gene Construction Kit 2.5, a tool for designing, drawing, and annotating DNA sequences, especially plasmid constructs.
PIPELINING PROGRAMS This Pipeline Pilot pipeline takes a Swiss-Prot sequence from a Web portal and generates a results page with four tabs: summary data, a sequence feature map, chemical structures of substrates, and BLAST results.
PATHWAY ANALYSIS • MetaCore (from GeneGo). • Ingenuity Pathways Analysis 3.1 (from Ingenuity Systems). • Both are integrated software suites for functional analysis. • Based on a proprietary, manually curated database of human protein-protein, protein-DNA, and protein-compound interactions, metabolic and signaling pathways, and the effects of bioactive molecules. • MetaCore can be integrated with other software packages such as GeneSpring, Resolver, Expressionist, Pipeline Pilot, EndNote, and Cytoscape. • Ingenuity can be integrated with GeneSpring, Partek Genomics Suite, SAS JMP Genomics, and Spotfire.
Direct Interactions Algorithm Draws direct interactions between selected objects. No additional objects are added to the network.
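The idea of drawing only direct interactions between selected objects can be sketched as taking the subgraph induced by the selection. This is a minimal illustration on a made-up network, not MetaCore's implementation; the edge list, node names, and the function `direct_interactions` are all assumptions introduced here.

```python
# Toy directed interaction network; every name here is a placeholder.
EDGES = [
    ("A", "B"),
    ("B", "C"),
    ("C", "D"),
    ("A", "D"),
    ("D", "E"),
]


def direct_interactions(edges, selected):
    """Keep only edges whose two endpoints are both in the selection."""
    chosen = set(selected)
    return [(src, dst) for src, dst in edges if src in chosen and dst in chosen]


network = direct_interactions(EDGES, ["A", "C", "D"])
print(network)  # [('C', 'D'), ('A', 'D')]
```

No node outside the selection can appear in the result, which matches the "no additional objects" behavior described above.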
Self-regulatory Networks Finds the shortest directed paths containing transcription factors between the genes in your gene list (best used for a small number of targets).
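One way to picture a shortest directed path that contains a transcription factor is to join the shortest route from a source gene to a TF with the shortest route from that TF to the target gene, and keep the best TF. The sketch below does this with breadth-first search on a toy graph. The graph, the names `G1`, `TF1`, `X`, and both functions are hypothetical, and the actual MetaCore procedure may differ.

```python
from collections import deque

# Toy directed network; all node names are placeholders.
GRAPH = {
    "G1": ["TF1", "X"],
    "TF1": ["G2"],
    "X": ["G2", "G3"],
    "G2": [],
    "G3": [],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the node list or None if unreachable."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


def shortest_path_via_tf(graph, source, target, tfs):
    """Shortest source -> TF -> target route over all candidate TFs."""
    best = None
    for tf in tfs:
        first = shortest_path(graph, source, tf)
        second = shortest_path(graph, tf, target)
        if first is None or second is None:
            continue
        route = first + second[1:]  # drop the duplicated TF node
        if best is None or len(route) < len(best):
            best = route
    return best


print(shortest_path_via_tf(GRAPH, "G1", "G2", ["TF1"]))  # ['G1', 'TF1', 'G2']
```

Every ordered pair of genes needs its own searches, which is one reason an approach like this is easier to apply to a small number of targets.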
Expand by one (not suitable for large collections of targets)
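A one-step expansion can be sketched as adding every object that interacts directly with a selected object. The example below is an illustration under that assumption; the edge list and the function `expand_by_one` are invented for this sketch and are not taken from MetaCore.

```python
# Toy interaction list; S1 and S2 are the selected objects, N1..N5 are placeholders.
EDGES = [
    ("S1", "N1"),
    ("S1", "N2"),
    ("N3", "S1"),
    ("S2", "N4"),
    ("N4", "N5"),
]


def expand_by_one(edges, selected):
    """Return the selection plus every node one interaction away from it."""
    chosen = set(selected)
    expanded = set(chosen)
    for src, dst in edges:
        if src in chosen or dst in chosen:
            expanded.update((src, dst))
    return expanded


grown = expand_by_one(EDGES, ["S1", "S2"])
print(sorted(grown))  # ['N1', 'N2', 'N3', 'N4', 'S1', 'S2']
```

Two seeds already grow to six nodes in this toy case. With a large selection in a densely connected database the result can become too crowded to read, which is consistent with the warning on this slide.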
Auto expand Draws sub-networks around the selected objects, stopping the expansion when the sub-networks intersect.
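The stop-when-they-intersect behavior can be sketched as growing one sub-network per selected object, one step per round, until any two of them share a node. This is only a rough model: the stopping rule, the `max_rounds` cap, the toy edges, and the function names are assumptions made for illustration.

```python
from itertools import combinations

# Toy chain S1 - A - B - S2 - C - D; all names are placeholders.
EDGES = [("S1", "A"), ("A", "B"), ("B", "S2"), ("S2", "C"), ("C", "D")]


def grow(edges, nodes):
    """Add every node that shares an edge with the current node set."""
    grown = set(nodes)
    for src, dst in edges:
        if src in nodes or dst in nodes:
            grown.update((src, dst))
    return grown


def auto_expand(edges, seeds, max_rounds=10):
    """Grow one sub-network per seed until any two sub-networks overlap."""
    subnets = {seed: {seed} for seed in seeds}
    for _ in range(max_rounds):
        if any(a & b for a, b in combinations(subnets.values(), 2)):
            break
        subnets = {seed: grow(edges, nodes) for seed, nodes in subnets.items()}
    return subnets


result = auto_expand(EDGES, ["S1", "S2"])
print(sorted(result["S1"]))  # ['A', 'B', 'S1']
print(sorted(result["S2"]))  # ['A', 'B', 'C', 'D', 'S2']
```

In this toy run the two sub-networks meet after two rounds, so expansion stops before the whole graph is absorbed around each seed.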
Pathway Creation Algorithms in MetaCore (2) • Analyze Network: Creates a list of possible networks, ranked according to how many objects in the network correspond to the user's list of genes, how many nodes are in the network, and how many nodes are in each smaller network. • Analyze Transcription Network: similar to the above, but the sub-networks created are centered on TFs. • Analyze Networks (Transcription Factors): focuses on the presence of TFs at end nodes. • Analyze Networks (Receptors): focuses on the presence of receptors at the end point of a network.
Analyze Network Algorithm Generates sub-networks highly saturated with selected objects. Sub-networks are ranked by a P-value and G-score and interpreted in terms of Gene Ontology. Example: a proteomics experiment on the effect of drug infusion on plasma proteins (P<1e-18).
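A P-value for how saturated a sub-network is with selected objects is commonly computed with a hypergeometric test: the chance of seeing at least the observed overlap if the sub-network's nodes were drawn at random from the database. The sketch below shows that standard calculation. Whether MetaCore uses exactly this formula is an assumption, the G-score is not modeled, and the function name and parameters are hypothetical.

```python
from math import comb


def hypergeom_pvalue(total, listed, network_size, overlap):
    """P(X >= overlap) for a random network of network_size nodes.

    total        -- number of objects in the database
    listed       -- how many of those are on the user's list
    network_size -- number of nodes in the sub-network
    overlap      -- nodes in the sub-network that are on the user's list
    """
    upper = min(listed, network_size)
    favorable = sum(
        comb(listed, k) * comb(total - listed, network_size - k)
        for k in range(overlap, upper + 1)
    )
    return favorable / comb(total, network_size)


# Toy numbers: 10 objects, 3 on the list, a 3-node network made entirely of list members.
print(hypergeom_pvalue(10, 3, 3, 3))  # 1/120, about 0.0083
```

Smaller values mean the overlap is less likely to arise by chance, so sub-networks would be ranked from the smallest P-value upward.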
Analyze Networks (Transcription Factors) Algorithm: an example Favors network construction where the end nodes of transcriptionally regulated pathways are present in the original gene list. Example from an mRNA expression data set comparing healthy and lesional skin (P=7.2e-46).
Analyze Networks (Receptors) Algorithm: an example Favors network construction where the end point of a pathway leads to a receptor (through “receptor binding”) and the starting point of the pathway (a transcription factor, ligand, etc.) is present in the original gene list, regardless of whether the end-point receptor is in the list.
Transcription Regulation Algorithm Generates sub-networks centered on transcription factors. Sub-networks are ranked by a P-value and interpreted in terms of Gene Ontology. Example: 13 targets/14 nodes, P=7.3e-31.
Immune response: Histamine H1 receptor signaling in immune response (p=1e-4)
Network-disease associations 1) Carcinoma (72% coverage, p=3.3e-10) 2) Neoplasms, connective and soft tissue (42% coverage, p=8e-10)
Use of Pathway Analysis in Candidate Gene Identification • 1061 genes are located in the mapped region for the disease. • 360 genes are up- or down-regulated by >2x. • Other up- or down-regulated genes. • 17 receptor ligand genes are important “input” nodes to pathways formed by genes with changed expression: FGF2, WNT5A, Tenascin-C, EGF, IL1RN, BDNF, TGF-beta2, FGF2, OSF-2, CSPG4 (NG2), IL-8, ENA-78, GCP2, SLIT2, SLIT3, Activin beta A, Annexin I.
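Selecting the genes that are up- or down-regulated by more than 2x amounts to a fold-change filter on the ratio of treated to control expression. The sketch below shows one simple version; the gene names, intensity values, and the function `changed_genes` are invented for illustration and are not data from this study.

```python
# Hypothetical (control, treated) intensities; values are made up.
EXPRESSION = {
    "geneA": (100.0, 250.0),  # 2.5x up
    "geneB": (100.0, 150.0),  # 1.5x up
    "geneC": (200.0, 80.0),   # 2.5x down
    "geneD": (100.0, 200.0),  # exactly 2x, not strictly more than 2x
}


def changed_genes(expression, fold=2.0):
    """Return genes whose treated/control ratio changes by more than `fold`."""
    selected = {}
    for gene, (control, treated) in expression.items():
        ratio = treated / control
        if ratio > fold or ratio < 1.0 / fold:
            selected[gene] = ratio
    return selected


print(sorted(changed_genes(EXPRESSION)))  # ['geneA', 'geneC']
```

A real analysis would usually add a statistical test and multiple-testing correction on top of the ratio cutoff; this sketch covers only the threshold step.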
Pathway analysis narrows down the number of candidate genes for disease • Genes labeled in the network: ErbB2, PECAM1, DDX5, BCAS3, microRNA1, RARalpha, MUL, VHR, WIP, NIK, Plakoglobin, HEXIM1, Prohibitin, STAT5A, STAT3, Clathrin, PSME3, PSMC5, FGF2, IL1RN. • Other up- or down-regulated genes. • 360 genes up- or down-regulated by >2x. • The genes from the mapped region of interest are able to form interaction pathways going through the receptor ligands identified by the first analysis.
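The narrowing step can be pictured as keeping only those genes from the mapped region that connect, within a few interactions, to one of the receptor ligands, passing only through genes with changed expression. The sketch below is a hypothetical reconstruction of that idea and not the procedure used in the study; the edges, the placeholder names, the `max_steps` limit, and both functions are assumptions.

```python
from collections import deque

# Made-up interactions between placeholder genes.
EDGES = [
    ("region1", "changed1"),
    ("changed1", "ligand1"),
    ("region2", "other"),
    ("other", "ligand1"),
]


def connects_to_ligand(adjacency, gene, ligands, allowed, max_steps=3):
    """Breadth-first search from gene to any ligand via allowed intermediates."""
    seen = {gene}
    queue = deque([(gene, 0)])
    while queue:
        node, steps = queue.popleft()
        if node in ligands and node != gene:
            return True
        if steps == max_steps:
            continue
        for nxt in adjacency.get(node, ()):
            if nxt not in seen and (nxt in allowed or nxt in ligands):
                seen.add(nxt)
                queue.append((nxt, steps + 1))
    return False


def narrow_candidates(edges, region_genes, ligands, changed, max_steps=3):
    """Keep region genes that can reach a ligand through changed genes."""
    adjacency = {}
    for a, b in edges:  # treat interactions as undirected for this sketch
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return [
        gene
        for gene in region_genes
        if connects_to_ligand(adjacency, gene, set(ligands), set(changed), max_steps)
    ]


candidates = narrow_candidates(
    EDGES, ["region1", "region2", "region3"], ["ligand1"], ["changed1"]
)
print(candidates)  # ['region1']
```

Here `region2` is dropped because its only route runs through a gene without changed expression, and `region3` is dropped because it has no interactions in the toy network at all.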
A caveat Not every gene belongs to a pathway in the database…
Why Pathway Analysis Software? • A learning tool • Study a group of gene products. • A data analysis tool • Which pathways are particularly affected? • What disease has similar biomarkers? • A hypothesis generation tool • Can provide insight into the mechanism of regulation of your genes. Which is the likely causative agent for the observed changes? What is likely to happen as a result of these changes? • Suggests effects of gene knock-ins or knock-outs. • Suggests side effects of drugs. • Can highlight new phenomena that need further investigation. What does the program not explain?