Advanced Bioinformatics Core (ABC)
Chang, Chuan-Hsiung (張傳雄); Chen, Chen-Hsin (陳珍信); Hsu, Chun-nan (許鈞南); Li, Kuo-Bin (李國彬); Yang, Ueng-Cheng (楊永正)
Our niches: R&D is the basis for service and collaboration
• Interdisciplinary collaboration
• Vision-based R&D
Units: Information Technology (IT), Comparative Bioinformatics (CB), Functional Bioinformatics (FB), Genomic Statistics (GS)
Activities: data collection, databases, workflow, analysis tools, interpretation
Source data: genome-related information
Target applications: preventive medicine, personalized medicine
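Service requests are handled through a single point of contact
Service types: platform analysis, customized service, collaborative service
Flowchart labels (in slide order): user; task-force team meeting; registration; IT, CB, FB, GS; single point of contact; analysis results posted online; progress report; registration; user; completed; understand the problem; no; follow-up consultation; yes; preliminary planning; integrate analysis results; quality control; explain the analysis methods to the user; fail; pass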
Comparative bioinformatics
• Bacterial Genome Annotation System (bGAS)
• Genome Comparison Tools (including CAGO, CAMP, CICP)
Gene variation related
• A functional analysis and selection tool for SNP in large scale association study (FastSNP)
Alternative splicing related
• Putative Alternative Splicing database (PALSdb)
• Integrated splicing variants database (ISVdb)
Gene expression related
• Bacterial gene expression database (BGEdb)
• Microarray Annotation and Profile (MAP)
• Cross-Hybridization Analysis Network of Gene Expression (CHANGE)
Pathway related
• Pathway Knowledge Management System (PKMS)
Phenotype related
• Bacteria: Bacterial phenotype database (BPdb)
• Cellular level: Integrated RNAi database
• Organismal level: Genotype to Phenotype (G2P)
Disease candidate gene databases
• Spinocerebellar ataxia candidate gene database (SCAdb)
• STR-related disease database (STRRDdb)
• Disease associated gene database (DAGdb)
• Encyclopedia of Hepatocellular Carcinoma genes Online (EHCO)
Utilities
• Gene Name Service (GNS)
Consultation service: http://consult.binfo.org.tw/
Online service: http://abc.binfo.org.tw/
Cancers | Infectious diseases | Highly heritable diseases
Cancers: liver cancer, lung cancer, breast cancer
The same strategy may be applied to all types of cancer
New method: top-down
Diagram labels: Gene variation; Genome; Risk factor; Genotype; Pathway analysis; Literature mining; Disease
Value-added information and tools
• Gene variation
  • Functional Analysis and Selection Tool for SNP (FastSNP) in large scale association study
• Alternative splicing
  • Putative Alternative Splicing (PALS) db
  • Integrated splicing variant (ISV) db
• Pathway analysis
  • Pathway Knowledge Management System (PKMS)
• Phenotypes
  • Disease Associated Gene (DAG) db
  • Genotype to Phenotype (G2P) db
  • Integrated RNAi db
FastSNP
Two ways to collect information: web wrapper agent and text mining
Diagram labels: Gene Symbol; Gene Name Service; SNP rsID; Candidate Gene Approach; SNP Search; Single SNP (batch); Text mining; Novel SNP; ESEfinder; Agent Starter; RESCUE-ESE; Function Report; TFSEARCH; dbSNP; PolyPhen; Swiss-Prot; Prioritization; Ensembl; NCBI GenBank; Chromosome
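The "Prioritization" step can be pictured as ranking SNPs by their most serious predicted functional effect. The sketch below is only an illustration of that idea: the effect categories, their weights, and the SNP identifiers are hypothetical and are not FastSNP's actual ranking scheme.

```python
# Hypothetical effect weights for illustration; NOT FastSNP's real scheme.
EFFECT_WEIGHT = {
    "nonsense": 5,
    "missense": 4,
    "splicing_regulation": 3,
    "tf_binding_site": 2,
    "intronic": 1,
}

def prioritize(snps):
    """Sort SNPs from highest to lowest weight, using each SNP's worst effect."""
    def score(snp):
        # Unknown effects count as 0; an empty effect list also scores 0.
        return max((EFFECT_WEIGHT.get(e, 0) for e in snp["effects"]), default=0)
    # Ties are broken by identifier so the order is deterministic.
    return sorted(snps, key=lambda s: (-score(s), s["id"]))

# Made-up SNP records with made-up annotations.
snps = [
    {"id": "snp_1", "effects": ["intronic"]},
    {"id": "snp_2", "effects": ["tf_binding_site", "missense"]},
    {"id": "snp_3", "effects": ["splicing_regulation"]},
    {"id": "snp_4", "effects": []},
]
ranked = [s["id"] for s in prioritize(snps)]
print(ranked)
```

In a real pipeline the effect annotations would come from the sources named on the slide (for example PolyPhen, ESEfinder, or TFSEARCH) rather than from a hand-written list.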
World’s most accurate automatic gene name identification from biomedical literature BioCreAtIvE - Critical Assessment for Information Extraction in Biology http://biocreative.sourceforge.net/
Common strategy to discover the disease mechanism (cancer)
• Raw data: experiment vs. control (genotyping or gene expression)
• Differences
• Patterns: look for major factor
• Mechanisms: distinguish cause & effect, form hypothesis
• Design therapeutic intervention
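The first step of this strategy, finding differences between experiment and control, can be sketched for gene expression data as a per-gene log2 fold change. This is a minimal illustration rather than ABC's pipeline: the gene names, intensity values, and the threshold of 1.0 are made up for the example.

```python
import math

def log2_fold_changes(experiment, control):
    """Return log2(experiment / control) for genes measured in both samples."""
    return {
        gene: math.log2(experiment[gene] / control[gene])
        for gene in experiment
        # Skip genes missing from the control or with non-positive intensity.
        if gene in control and experiment[gene] > 0 and control[gene] > 0
    }

def changed_genes(fold_changes, threshold=1.0):
    """Genes whose absolute log2 fold change meets the (assumed) threshold."""
    return sorted(g for g, fc in fold_changes.items() if abs(fc) >= threshold)

# Hypothetical expression intensities (arbitrary units).
experiment = {"GENE_A": 800.0, "GENE_B": 100.0, "GENE_C": 50.0}
control = {"GENE_A": 200.0, "GENE_B": 110.0, "GENE_C": 200.0}

fc = log2_fold_changes(experiment, control)
candidates = changed_genes(fc)
print(candidates)
```

A real analysis would add replicates, normalization, and a statistical test before looking for patterns among the candidate genes.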
Design
• MIAME checklist
• GESDAS (Gene Expression Study Design and Analysis Suite)
Analysis
• SMD (Stanford Microarray Database)
• GESDAS
• MAP (Microarray Annotation and Profile)
• IPIR (Integrated Protein Interaction Resource)
• CHANGE (Cross Hybridization Analysis Network of Gene Expression)
• SpliceGear and ChangeGear
Interpretation
• PKMS (Pathway Knowledge Management System)
• Integrated RNAi database
• DAG db (Disease Associated Gene database)
• G2P (Genotype to Phenotype)
Results
• Six cancer-related publications in 2006
• More than 400 gene expression microarrays for cervical, lung, breast, and other cancers were analyzed with ABC’s tools
Microarray study design http://gears.stat.sinica.edu.tw/MIAME/MIAME.php
Genomic Statistics Unit for Complex Diseases in the NRPGM Advanced Bioinformatics Core
Enhancing the web platform:
• cDNA: image plots (new)
• Affymetrix: MM larger than PM (new)
Expanding GESDAS to a more comprehensive platform “Gene-Environment Analysis Refining System” (GEARS) for general biomarkers (not open yet)
Integrated Protein Interaction Resource (IPIR) => Microarray Annotation and Profile (MAP) => Pathway Knowledge Management System (PKMS)
Figure panels: no PPI expansion; with PPI expansion
Legend: red = ER+; green = ER-; yellow = ER+ and ER-
World’s Most Accurate Protein Subcellular Localization Image Classifier (July 2006 – Present)
Previous best result: 83%
Our preliminary result: 93%
Publications
• Y.-S. Lin et al. Boosting Multi-Class Learning with Repeating Codes. In TAAI 2006 Conference on Artificial Intelligence and Applications, December 2006.
• C.-C. Lin et al. Boosting Multiclass Learning with Repeating Codes for Protein Subcellular Localization. Submitted, 2007.
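To put the two accuracy figures in perspective, the move from 83% to 93% can be expressed as a relative reduction in classification error. The sketch uses only the two numbers on the slide; the function name is illustrative.

```python
def relative_error_reduction(old_accuracy, new_accuracy):
    """Fraction of the old error rate that the new result removes."""
    old_error = 1.0 - old_accuracy
    new_error = 1.0 - new_accuracy
    return (old_error - new_error) / old_error

# Error falls from 17% to 7%, i.e. about 59% of the previous errors are removed.
reduction = relative_error_reduction(0.83, 0.93)
print(f"{reduction:.0%}")
```

This assumes both figures were measured on comparable data, which the slide does not state.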
Cancers | Infectious diseases | Highly heritable diseases
Infectious diseases
Taiwan Pathogenic Microorganism Gene Database (TPMGD) for CDC, Taiwan
Integrate sequence with epidemiology information Dec. 2005 – Dec. 2006
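The same system is open to the academic community under the name EpiNet
Screenshot captions:
• User role switched successfully
• Publishing announcements and managing news
• Managing search and display fields
• Data administrator privileges
• Adding and managing database content
• Data query and browsing
• Personal workspace operations and sequence analysis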
The next generation bioinformatics tool for biomedical scientists: Web service & workflow tool
Comparative bioinformatics tools
• bGAS (bacterial Genome Annotation System)
• Integrated Comparative Analysis Platform (iCAP) for Genomic Data
Example: Vibrio vulnificus strain-specific plasmid genomes
Cancers | Infectious diseases | Highly heritable diseases
Research method
Diagram labels: Genome; Disease gene; Genes; Candidate genes; Candidate region; Chromosome; Linkage analysis; Genotyping; Schizophrenia; Disease
Example of providing integrated service: searching for disease-associated gene variations
• Integrate information & primer design (FB)
• Collect information (IT, FB)
• Sequencing Core
• Priority setting (FB)
• Integrate information, perform quality control (FB)
• Look for gene variation (CB)
• Candidate gene variation & disease phenotype (GS)
Gene variation detection and gene-gene interaction
Design
• FastSNP
• ISV db
• PALS db
• PipMaker pipeline
• Primer3
Analysis
• PolyPhred pipeline
• GAP (Generalized Association Plots) analysis
Results
• 60 primer pairs were designed
• 18,000 sequences were compared
• 103 variation sites were found
  • 68 were not reported before
  • 20 variation sites may be related to phenotype (more samples are needed)
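The counts above imply a few derived figures, worked out in the short sketch below. It uses only the numbers on the slide. The slide does not say whether the 20 phenotype-related sites are drawn from the 68 novel sites or from all 103, so that relationship is left out.

```python
# Figures taken directly from the slide.
total_sites = 103   # variation sites found
novel_sites = 68    # not reported before

# Derived values.
previously_reported = total_sites - novel_sites   # 35 sites already known
novel_fraction = novel_sites / total_sites        # about two thirds

print(previously_reported, f"{novel_fraction:.0%}")
```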
Synergy is emerging from collaboration
• Help a single project to integrate different types of information
• Make new observations by integrating data from different users
Diagram labels: gene expression; sequencing; genotyping; gene-related information; proteomics; ABC; PET gene probe; phenotype-related information; RNAi; mouse mutagenesis