IGR-ANNOT: A Multiagent System for InterGenic Regions Annotation
Sandro Camargo, João Valiati, Luis Otávio Álvares, Paulo Engel, Sergio Ceroni
Introduction
• The exponential growth of genomic data has made computational tools for analyzing these data indispensable.
• Sequencing a new genome does not answer every question about the organism. Progress is more likely to come from comparing the genomes of different organisms.
Introduction
• There are many tools and techniques for comparing complete genomes and coding regions, but there is a lack of techniques for comparing the non-coding regions of DNA, which contain regulatory elements.
• Many of the differences between species may be attributed to changes in the regulation of transcription and translation.
• Transcription and translation are often regulated via elements that lie in intergenic regions.
InterGenic Regions
• An intergenic region is defined as the sequence between the translational stop of one gene and the translational start of the next gene.
• Obtaining the intergenic regions of an organism requires:
  • the complete genome of the organism (the nucleotide sequence)
  • information about the coding regions (start and stop positions, orientation, and name).
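The definition above can be sketched in a few lines of Python (the original tool was written in Perl). This is a minimal sketch, not the IGR-ANNOT implementation: the names `Gene`, `IntergenicRegion`, `intergenic_regions`, and `extract` are hypothetical, coordinates are assumed to be 1-based and inclusive, and the genome is treated as linear, so a region wrapping around the origin of a circular chromosome is not reported.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gene:
    name: str
    start: int   # 1-based, inclusive position of the first coding base
    stop: int    # 1-based, inclusive position of the last coding base
    strand: str  # "+" (forward) or "-" (reverse)


@dataclass
class IntergenicRegion:
    start: int
    stop: int
    prev_gene: Gene
    next_gene: Gene


def intergenic_regions(genes: List[Gene]) -> List[IntergenicRegion]:
    """Return the gaps between neighbouring genes on a linear sequence."""
    ordered = sorted(genes, key=lambda g: g.start)
    regions: List[IntergenicRegion] = []
    prev = ordered[0] if ordered else None
    for nxt in ordered[1:]:
        # A gap exists only if the next gene starts after the furthest
        # stop seen so far (this skips overlapping and nested genes).
        if nxt.start - 1 >= prev.stop + 1:
            regions.append(
                IntergenicRegion(prev.stop + 1, nxt.start - 1, prev, nxt)
            )
        if nxt.stop > prev.stop:
            prev = nxt
    return regions


def extract(genome: str, region: IntergenicRegion) -> str:
    """Slice the nucleotides of a region (1-based, inclusive coordinates)."""
    return genome[region.start - 1:region.stop]
```

Given the gene list and the genome string, `intergenic_regions` yields one record per gap together with its two flanking genes, which is the information a later naming step would need.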
InterGenic Regions
• We chose to work with GenBank files because they contain all the information needed to identify coding regions, and from that information the intergenic regions can be inferred.
InterGenic Regions
• The GenBank feature table format is based on a tabular approach and consists of the following items:
  • Feature Key: a single word or abbreviation indicating the functional group;
  • Location: instructions for finding the feature;
  • Qualifiers: auxiliary information about the feature.
InterGenic Regions
Key             Location/Qualifiers
CDS             23..400
                /product="alcohol dehydrogenase"
                /gene="adhI"
An example of a feature in the feature table.
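As an illustration of the three items (feature key, location, qualifiers), the example entry above could be split apart as follows. This is a simplified sketch with a hypothetical function name, `parse_feature`; it assumes the location sits on the first line and each qualifier fits on one line, whereas real GenBank files may wrap long values across several lines.

```python
import re
from typing import Dict, Tuple


def parse_feature(entry: str) -> Tuple[str, str, Dict[str, str]]:
    """Split one feature-table entry into (key, location, qualifiers)."""
    lines = [ln.strip() for ln in entry.strip().splitlines() if ln.strip()]
    # First line: the feature key followed by the location.
    key, location = lines[0].split(None, 1)
    qualifiers: Dict[str, str] = {}
    for ln in lines[1:]:
        # Qualifier lines look like /name="value"; the value is optional.
        match = re.match(r'/(\w+)(?:=(.*))?$', ln)
        if match:
            name, value = match.group(1), match.group(2) or ""
            qualifiers[name] = value.strip('"')
    return key, location, qualifiers
```

For the entry shown above, the function returns the key `CDS`, the location `23..400`, and a dictionary holding the `product` and `gene` qualifiers.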
InterGenic Regions
• Intergenic region naming convention: IGR-O-G1-G2, where O = {F|R|B|X} depends on the orientations of the previous and next genes, and G1 and G2 are the names of the coding regions whose regulatory information the intergenic region contains.
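A name in the IGR-O-G1-G2 pattern could be assembled as sketched below. The slide lists the codes F, R, B, and X but does not say which orientation pair each one stands for, so the table `ASSUMED_CODES` is purely a hypothetical reading and not the authors' rule; the function name `igr_name` and the gene name `geneB` are likewise illustrative. A different mapping can be passed through the `codes` parameter.

```python
from typing import Dict, Tuple

# ASSUMPTION: the source does not define F, R, B and X. This mapping is a
# hypothetical reading used only to make the sketch runnable.
ASSUMED_CODES: Dict[Tuple[str, str], str] = {
    ("+", "+"): "F",  # both flanking genes on the forward strand
    ("-", "-"): "R",  # both flanking genes on the reverse strand
    ("-", "+"): "B",  # flanking genes point away from the region
    ("+", "-"): "X",  # flanking genes point toward the region
}


def igr_name(
    prev_name: str,
    prev_strand: str,
    next_name: str,
    next_strand: str,
    codes: Dict[Tuple[str, str], str] = ASSUMED_CODES,
) -> str:
    """Build an IGR-O-G1-G2 name from the two flanking genes."""
    code = codes[(prev_strand, next_strand)]
    return f"IGR-{code}-{prev_name}-{next_name}"
```

With the assumed table, `igr_name("adhI", "+", "geneB", "+")` gives `IGR-F-adhI-geneB`.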
InterGenic Regions
• Intergenic regions are written to the GenBank file using the misc_feature feature key.
• According to the GenBank file format description, this feature key is used to annotate regions of biological interest that cannot be described by any other feature key.
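A misc_feature entry for one intergenic region might be produced as in this sketch. It follows the usual GenBank flat-file layout (feature key starting in column 6, location and qualifiers starting in column 22). Storing the region name in a `/note` qualifier is an assumption, since the slides do not say which qualifier IGR-ANNOT uses; the function name `misc_feature_entry` and the sample label are hypothetical.

```python
def misc_feature_entry(start: int, stop: int, label: str) -> str:
    """Format a two-line misc_feature entry for a GenBank feature table."""
    # Feature key begins at column 6; padding it to 16 characters places
    # the location at column 22.
    key_line = " " * 5 + "misc_feature".ljust(16) + f"{start}..{stop}"
    # ASSUMPTION: the region name is stored in a /note qualifier.
    note_line = " " * 21 + f'/note="{label}"'
    return key_line + "\n" + note_line


if __name__ == "__main__":
    print(misc_feature_entry(401, 560, "IGR-F-adhI-geneB"))
```

Files extended this way remain ordinary GenBank files, so they can still be opened in existing viewers.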
IGR-ANNOT Engineering Process
• The multiagent approach is particularly attractive for this problem because:
  • the information content is heterogeneous;
  • the information can be distributed;
  • much of the annotation work for each gene can be done by different laboratories, using different methodologies to annotate information about genes.
• We used MASE and AgentTool to model the agents.
IGR-ANNOT Engineering Process
• User Interface Agent (UIA)
• File Reader Agents (FRA)
• Gene Agents (GA)
• InterGenic Regions Agents (IGRA)
• File Writer Agents (FWA)
IGR-ANNOT Engineering Process
• To implement this architecture we used the Perl language, so the system can run on any suitable platform.
• Perl has many features, such as its string manipulation facilities, that make it a very attractive language for working with DNA sequences.
• In addition, there are complete packages for implementing multiagent systems.
Results and Discussion
• We have used IGR-ANNOT extensively to create intergenic region annotations for several genomes of the Mycoplasmataceae family.
• To obtain a graphical view of the annotation created by our tool, we used the Artemis tool.
• The next figures show the Mycoplasma hyopneumoniae 232 genome.
Conclusions
• The system is now in successful use by biologists at UFRGS.
• The output of IGR-ANNOT provides an easy way to compare intergenic regions among different organisms.
• Despite the positive results achieved so far on genomes of the Mycoplasmataceae family, further tests will be performed, mainly on more complex genomes.
Future Work
• Create an environment for comparing intergenic regions.
• IGR-ANNOT will be made publicly available to other biologists on the web at www.inf.ufrgs.br/~scamargo, in the software section.