Tripal: a Construction Toolkit for Online Genome Databases

Stephen P. Ficklin 1, Lacey-Anne Sanderson 2, Chun-Huai Cheng 1, Margaret Staton 3, Sook Jung 1, Taein Lee 1, Il-Hyung Cho 4, Kirstin E. Bett 2, Dorrie Main 1

1 Washington State University, Pullman, WA, USA
2 University of Saskatchewan, Saskatoon, SK, Canada
3 Clemson University Genomics Institute, SC, USA
4 Saginaw Valley State University, University Center, MI, USA

Contact: dorrie@wsu.edu

Brief Overview

Tripal is a platform that simplifies construction of online genomic databases. With the increase in data from new sequencing technologies and downstream data analysis, the need for online visualization and data mining is ever increasing. The need for skilled web developers and IT professionals, coupled with the complexities of project and data management, creates an obstacle for many research communities and individual labs. Additionally, online genomic databases are expected to provide access to raw data, analysis results, data mining tools, and cross references to larger or companion databases, and to include outreach content and perhaps social networking capabilities.

Tripal is intended to reduce these complexities by coupling the strengths of GMOD Chado, a relational database schema for biological data, and Drupal, a popular Content Management System (CMS). Tripal provides a web interface that includes a Chado installer and data loaders for ontologies (controlled vocabularies), GFF files, and FASTA files. Web pages are automatically generated for organisms, genomic features, biological libraries, and stock collections. Web pages can be enriched with analysis results from BLAST, KAAS/KEGG, InterProScan, and Gene Ontology (GO). Tripal can be used “as is” but also allows for complete customization. PHP-based template files are provided for all data types to allow for precise customizations as required by the community.
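To give a feel for what a GFF loader such as the one mentioned above has to handle, here is a minimal sketch in Python of parsing a single GFF3 record. It is not Tripal's loader code (Tripal is built on Drupal and its own loaders are not shown in this poster); the names `GffFeature` and `parse_gff3_line` and the sample record are hypothetical and exist only for illustration. The sketch relies only on the GFF3 convention of nine tab-separated columns, with `.` marking an empty value and the last column holding `key=value` pairs separated by semicolons.

```python
from dataclasses import dataclass
from typing import Dict, Optional
from urllib.parse import unquote


@dataclass
class GffFeature:
    """One GFF3 record (hypothetical container, not a Tripal or Chado type)."""
    seqid: str
    source: str
    type: str
    start: int
    end: int
    score: Optional[float]
    strand: str
    phase: Optional[int]
    attributes: Dict[str, str]


def parse_gff3_line(line: str) -> Optional[GffFeature]:
    """Parse one GFF3 line; return None for blank lines, comments and directives."""
    line = line.rstrip("\n")
    if not line.strip() or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols

    # Column 9 holds semicolon-separated key=value pairs; values are URL-escaped.
    attributes: Dict[str, str] = {}
    if attrs != ".":
        for pair in attrs.split(";"):
            if not pair:
                continue
            key, _, value = pair.partition("=")
            attributes[unquote(key)] = unquote(value)

    return GffFeature(
        seqid=unquote(seqid),
        source=source,
        type=ftype,
        start=int(start),
        end=int(end),
        score=None if score == "." else float(score),
        strand=strand,
        phase=None if phase == "." else int(phase),
        attributes=attributes,
    )


# Hypothetical record, invented for this example only.
example = (
    "scaffold_1\texample\tmRNA\t1200\t4800\t.\t+\t.\t"
    "ID=mrna0001;Name=demo%20transcript;Parent=gene0001"
)
feature = parse_gff3_line(example)
print(feature.type, feature.start, feature.end, feature.attributes["Name"])
# prints: mRNA 1200 4800 demo transcript
```

A real loader would go further: it would resolve the `type` column against a controlled vocabulary and use `Parent` attributes to link features such as an mRNA to its gene before writing anything to the database.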
A well-developed Tripal API provides a uniform set of variables and functions for accessing any and all data within the Chado database. Currently, Tripal supports visualization of only a subset of the current Chado schema, but further development is underway. Meanwhile, others can use the Tripal API to develop their own extensions, which can in turn be made available for anyone to use. These custom extensions, the Tripal package, and support resources such as an active mailing list can be found on the Tripal website (http://tripal.sourceforge.net). Currently, Tripal is in use for several genome websites, including the Citrus Genome Database, the Cacao Genome Database, Pulse Crops Genomics and Breeding, the Hardwood Genomics Project, and more.

Organism Pages. Site administrators can easily create pages with unique content for each organism.

Feature Pages. Genomic features from whole genome assemblies, unigene assemblies, or other analyses can be added to the database using web-based loaders. A page like the one pictured can be made available for each feature. This example from the Citrus Genome Database is for an mRNA sequence. A structural view of the gene from GBrowse is shown, and additional information about this gene is available through the right-hand resources sidebar.

Stock/Germplasm Pages. Stock and germplasm collections can be managed and viewed using Tripal. The screenshot, taken from the KnowPulse website, contains details about a specific stock, including properties, synonyms, genotypes, and more.

Analysis Pages. All data imported through Tripal has an “analysis” page describing how the data was obtained.

Functional Data. Reports for Gene Ontology (GO) annotations can be displayed. Users can select from all available analyses used to map GO terms.

Functional Data. Tripal supports import and display of BLAST, InterProScan, and KEGG results. All are loaded through common file formats such as XML. Tripal easily supports loading of results generated through Blast2GO.

Sites Using Tripal
Fagaceae Genomics Web
Citrus Genome Database
Pulse Crop Genomics & Breeding
Marine Genomics Project
Cacao Genome Database
Cool Season Food Legume Genome
Genome Database for Vaccinium
Hardwood Genomics Project

Sites Migrating to Tripal
Genome Database for Rosaceae
Cotton Genome Database

Resources
Mailing List
Tutorials
Demo Site
Online Documentation
http://tripal.sourceforge.net/

Reference & Acknowledgements

Stephen P. Ficklin, Lacey-Anne Sanderson, Chun-Huai Cheng, Margaret Staton, Taein Lee, Il-Hyung Cho, Sook Jung, Kirstin E. Bett, Dorrie Main. Tripal: a Construction Toolkit for Online Genome Databases. Database, Sept 2011, Vol. 2011.

Tripal is funded indirectly through various agencies and groups. We are extremely thankful for this support. For a listing of these funding agencies, please see the Tripal paper referenced above. The GMOD group provides logistical support.