Title: GeneWiz browser: An Interactive Tool for Visualizing Sequenced Chromosomes. By Peter F. Hallin, Hans-Henrik Stærfeldt, Eva Rotenberg, Tim T. Binnewies, Craig J. Benham, and David W. Ussery. Published in Standards in Genomic Sciences (2009) 1: 204-215. Citation count: 35.
Background
• After more than 15 years of genome sequencing, the public genome databases hold more than a thousand sequenced genomes.
• Analyzing multiple genomes across different species is useful to biologists for a broad range of interests, especially to:
  • identify phylogenetic relationships and the genomic regions that cause pathogenicity in humans and animals
  • find new target genes of industrial and economic value
• Analysis tools of this kind are scarce and often require both analytical and programming knowledge, so multi-genome analysis is not always easy across the broad range of biological research.
Function of the GeneWiz browser
• The GeneWiz browser visualizes genomic data of prokaryotic chromosomes.
• The tool provides several functions:
  • visualizing whole-genome homology of genes and proteins in a reference strain compared with other strains or species
  • visualizing physical DNA properties, such as curvature, along the chromosome
  • identifying repeat sequences along the chromosome
• Custom numerical data, such as gene expression and regulation data, can also be plotted.
• The web-interface service provides an interoperable method for carrying out whole-genome visualization.
Implementation of GeneWiz browser
• The tool converts numerical information into color-encoded lanes, using either a linear scale with a fixed minimum and maximum or a dynamic scale based on standard deviations.
• DNA properties, computed with various established methods, indicate regions with particular biological functions
• Mapping of homologous genes with BLAST (Basic Local Alignment Search Tool)
• Mapping of short sequencing reads with weighted coverage
• Custom lanes with pre-processed data provided by users
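The two color scales named on this slide can be illustrated with a minimal Python sketch. It is not the GeneWiz implementation: the function names, the clamping of out-of-range values, the default of three standard deviations, and the white-to-blue gradient are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def linear_scale(values, lo, hi):
    """Map values to [0, 1] using a fixed minimum and maximum.

    Values outside [lo, hi] are clamped (an assumption of this sketch).
    """
    span = hi - lo
    return [min(1.0, max(0.0, (v - lo) / span)) for v in values]


def stddev_scale(values, n_sd=3.0):
    """Map values to [0, 1] over a range of mean +/- n_sd standard deviations.

    The range adapts to the data, so the same lane definition can be reused
    for genomes with different value distributions.
    """
    avg = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # Constant data: place every value at the middle of the scale.
        return [0.5 for _ in values]
    return linear_scale(values, avg - n_sd * sd, avg + n_sd * sd)


def to_rgb(t, low=(255, 255, 255), high=(0, 0, 255)):
    """Interpolate between two RGB colors for a scaled value t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(low, high))


# Example: one lane of hypothetical per-position values.
lane = [0.0, 5.0, 10.0, 20.0]
fixed_colors = [to_rgb(t) for t in linear_scale(lane, 0.0, 10.0)]
dynamic_colors = [to_rgb(t) for t in stddev_scale(lane)]
```

With the fixed scale, every value at or above the maximum gets the same saturated color; with the standard-deviation scale, the color range follows the spread of the data in the lane.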
Workflow of GeneWiz browser
• The web interface has two parts:
  • the client, written as a Java applet, which obtains the data remotely from the server
  • the server, written in Perl/CGI, with a compiled C program handling access to the binary data files
• All input/output objects are defined in a separate XSD (XML Schema Definition) file within the WSDL file.
• MySQL on the server stores the pre-binned data for each zoom level.
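Pre-binning per zoom level can be sketched roughly as follows. The slide does not say how GeneWiz bins its data, so the averaging rule, the doubling of bin size at each level, and the names `bin_means` and `prebin_zoom_levels` are assumptions for illustration only.

```python
def bin_means(values, bin_size):
    """Average consecutive runs of bin_size values; the last bin may be shorter."""
    bins = []
    for start in range(0, len(values), bin_size):
        chunk = values[start:start + bin_size]
        bins.append(sum(chunk) / len(chunk))
    return bins


def prebin_zoom_levels(values, levels):
    """Precompute one binned track per zoom level.

    Level 0 keeps full resolution; each further level doubles the bin size
    (an assumed scheme), so a zoomed-out view reads a short, ready-made list
    instead of re-averaging the whole chromosome on every request.
    """
    return {level: bin_means(values, 2 ** level) for level in range(levels)}


# Example with a hypothetical per-position track.
track = [1, 3, 5, 7]
zoom_tables = prebin_zoom_levels(track, 3)
```

Storing the result of each level, for example as one database table per level, trades disk space for fast responses when the client zooms or pans.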
The maximum uniqueness quality is shown for the actual reads (green-to-blue lane), plotted along the reference genome.
• The good correspondence between the in-silico and experimental reads in this figure suggests little bias towards particular chromosomal regions when read coverage is around 40 times.
BLAST comparison of 14 closely related bacterial chromosomes.
• The figure indicates that a few of the bacteria that do not cause infection in humans show a strong tendency to have deleted the pathogenicity islands.
A final example illustrates how marks indicating unique DNA physical properties can be combined with known regulatory elements and gene annotations to draw a more complete picture of gene expression in a particular region.
Summary
• Most biologists consider visualization of multidimensional genomic information necessary, but find analytical tools relatively difficult to use.
• The GeneWiz browser improves on the many existing tools that are command-line programs generating publication-quality static images and vector graphics for genomic visualization. It offers:
  • easy navigation with the mouse
  • a zooming function that lets users interpret genomic information at varying scales
  • an automated workflow that users can call directly from the client
• The tool is relevant to many pangenomic (across sequenced species) and metagenomic (across unsequenced species) studies, giving a quick overview of clusters of insertion sites, genomic islands and overall homology between a reference sequence and a data set.