
ARPAnno: a dedicated web tool for Annotation of Actin Related Proteins

Actin sequence. Actin subdomain 1, 2, and 3, 4. Deletion. Insertion. Specific Insertion. Specific residue or motif. Hot spot of insertion/deletion. A. Actin. ARP1. ARP2. ARP3. ARP4. ARP5. ARP6. ARP7. ARP8. ARP9. ARP10. ARP11.


Presentation Transcript


Jean Muller1,3, Yukako Oma2, Laurent Vallar3, Evelyne Friederich3, Olivier Poch1 and Barbara Winsor2

1 Laboratoire de Biologie et Génomique Structurales, IGBMC, CNRS/INSERM/ULP, BP 163, 67404 Illkirch cedex, France.
2 Laboratoire Modèles Levure de Pathologies Humaines, FRE2375, IPCB, CNRS, 21 rue Descartes, 67084 Strasbourg, France.
3 Laboratoire de Biologie Moléculaire, d'Analyse Génique et de Modélisation, CRP-Santé, 42, rue du Laboratoire, L-1911, Luxembourg.

jean.muller@igbmc.u-strasbg.fr

Introduction

Actin Related Proteins (ARPs) are key players in major biological processes important for cell life. In cytoskeletal activities, the ARP2/3 complex is essential for actin dynamics, and ARP1 and ARP11 are involved in microtubule-based vesicle trafficking. In nuclear functions (transcriptional activation, tumor suppression…), ARP4-ARP9 are components of many chromatin modulation complexes (SWI2/SNF2, SWR1, HAT).

Conventional actins and ARPs together define a large family of homologous proteins, the actin superfamily, which shares a tertiary structure known as the "actin fold". Since 1997 (Poch and Winsor), the unified classification of ARPs has comprised 11 families, ordered primarily by decreasing sequence similarity to conventional actin: ARP1 is the most similar and ARP11 the least similar.

Because ARP and actin sequences are closely related, it is frequently difficult to annotate ARP sequences unambiguously using classical database searches. Discriminative tools that distinguish ARPs from actin are therefore of high interest for understanding the mechanisms in which these proteins are involved.
An initial dataset has been defined, forming the basis of a multiple alignment of all ARP sequences. This set allows us to characterise each ARP family (sequence identity, specific residues and insertions, phylogenetic distribution) and to implement ARPAnno (http://bips.u-strasbg.fr/ARPAnno), a web server dedicated to ARP sequence annotation.

Initial set

In-depth searches of the protein database (Uniprot) were run to retrieve the maximum number of different ARP sequences, using for each family distinct queries from distantly related organisms (i.e. H. sapiens, D. melanogaster and S. cerevisiae) and the PipeAlign program (http://bips.u-strasbg.fr/PipeAlign: blastp, ballast, DbClustal, Rascal, DPC). In total, 73340 proteins were detected, representing 4200 non-redundant and "non-fragment" sequences. Unrelated sequences and proteins with ≤ 15% amino acid identity were not included in the final alignment.

ARP families characterisation

1 Basic sequence analysis

[Figure: percent identity per ARP family. Panels: initial percent identity used to classify ARP families (% identity to a group of 29 actins); mean ARP family percent identity to reference actin; mean percent identity inside a family.]

• High-quality ARP Multiple Alignment of Complete Sequences (MACS) containing 692 sequences, of which 146 are ARPs.
• Decreasing percent identity to reference actin (RefID) from ARP1 to ARP11.
• High family conservation (FamID) for ARP1-3, the main cytoplasmic ARPs, in contrast to the nuclear ARPs and the most divergent ARP10 and ARP11 families.
• Increased number of ARP sequences in the protein database (Uniprot), from 29 (1997) to 146 (July 2004). The families fall into 3 groups: >19 sequences for ARP1-4; >10 for ARP5, ARP6 and ARP8; and ≤ 10 for ARP7, ARP9, ARP10 and ARP11.
• Assessment of the classification into 11 ARP families.

2 Definition of ARP family features

• The alignment highlights specific features such as conserved residues or motifs and insertions for ARP1-9. No specific features have been defined for the divergent ARP10 and ARP11.
• 4 hot spots of insertions (A, B, C, D) can be seen at positions peripheral to the core fold.
• Creation of an ARP family Knowledge Filter, which is a cornerstone of the ARP annotation process.

3 Distribution of ARP families among eukaryotes

The presence and absence distribution among eukaryotes is cross-validated using proteome searches (blastp in Uniprot) and genome exploration (tblastn) in 19 different organisms ranging from T. pseudonana (algae) to H. sapiens (mammals).

• Actin is present in all eukaryotic organisms explored.
• Presence and absence patterns reveal pairs of ARPs (ARP2 with ARP3, ARP4 with ARP6, and ARP5 with ARP8). This strongly correlates with the biological data available for ARP-containing complexes.
• ARP4 and ARP6 are present in all organisms tested. Nuclear ARPs are the minimum package for eukaryotic organisms.
• S. pombe and Y. lipolytica have no ARP7 but are the only yeasts out of 31 to have a second ARP4 (ARP4*).

ARPAnno web server

Web interface: http://bips.u-strasbg.fr/ARPAnno

Input: a Fasta sequence, for example an unknown potential actin-like protein:

>Q5ZM58_CHICK Hypothetical protein.
MESYDVIANQPVVIDNGSGVIKAGFAGDQIPKYCFPNYVGRPKHVRVMAGALEGDIFIGPKAEEHRGLLSIRYPMEHGIVKDWNDMERIWQYVYSKDQLQTFSEEHPVLLTEAPLNPRKNRERAAEVFFETFNVPALFISMQAVLSLYATGRTTGVVLDSGDGVTHAVPIYEGFAMPHSMRIDIAGRDVSRFLRLYLRKEGYDFHTTSEFEIVKTIKERACYLSINPQKDETLETEKAQYYLPDGSTIEIGSARFRAPELLFRPDLIGEECEGLHEVLVFAIQKSDMDLRRTLFSNIVLSGGSTLFKGFGDRLLSEVKKLAPKDVKIRISAPQERLYSTWIGGSILASLDTFKKMWVSKKEYEEDGARAIHRKTF

A multi-step process

1 blastp: local alignment with blastp and determination of the families eligible for the next step, using GID (global percent identity) and pCover (percent sequence coverage).
2 clustalw: global alignment with the reference alignment of the eligible families using clustalw. A coloured multiple alignment is available.
3 Knowledge Filter: filtering for specific residues and motifs (pDR, percent of specific residues) and for insertions (pDI, percent of specific insertions).
4 ScoreARP: calculation of one score for each eligible family and determination of the most suitable ARP family. Results are presented as a table.

Validation

• All 146 available ARP sequences have been correctly annotated.
• 68 new sequences from a recent version of Uniprot: 36 conventional actins, 3 orphans, 6 ARP1, 7 ARP2, 6 ARP3, 8 ARP4, 1 ARP9 and 1 ARP10, from diverse organisms such as Y. lipolytica, D. hansenii, P. tetraurelia, X. tropicalis or G. gallus.

Conclusions and perspectives

• The development of a high-quality multiple alignment of ARP sequences permits the validation of the ARP classification and the definition of family features (residues and insertions).
• The major ARP families are the nuclear ARP4 and ARP6.
• The correlation of the distribution of ARPs among organisms with functional data is a benchmark case for phylogenetic profiling methods.
• ARPAnno, a new web server for the unambiguous identification of ARP sequences, is available.
• In future: keep the ARP MACS up to date and add some structural features to ARPAnno.
• Extend the genome exploration.

References

Altschul, S.F., et al. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25, 3389-3402.
Plewniak, F., et al. (2003). PipeAlign: A new toolkit for protein family analysis. Nucleic Acids Res 31, 3829-3832.
Poch, O., and Winsor, B. (1997). Who's who among the Saccharomyces cerevisiae actin-related proteins? A classification and nomenclature proposal for a large family. Yeast 13, 1053-1058.
Thompson, J.D., et al. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22, 4673-4680.
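The first step of the multi-step process (blastp followed by selection of eligible families using GID and pCover) can be pictured with a minimal sketch. The poster does not give the cut-offs or say how GID is computed, so everything below is an assumption for illustration: the `Hit` record, the function names, the thresholds and the example numbers are hypothetical, and GID is simply taken as an input value.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One blastp hit against a reference sequence (hypothetical record)."""
    family: str        # ARP family of the reference sequence that was hit
    gid: float         # global percent identity, taken as given
    aligned_len: int   # number of query residues covered by the alignment


def p_cover(hit: Hit, query_len: int) -> float:
    """Percent of the query sequence covered by the local alignment."""
    return 100.0 * hit.aligned_len / query_len


def eligible_families(hits, query_len, min_gid=25.0, min_cover=50.0):
    """Families with at least one hit passing both cut-offs.

    The default thresholds are placeholders, not ARPAnno's values.
    """
    return sorted({
        h.family for h in hits
        if h.gid >= min_gid and p_cover(h, query_len) >= min_cover
    })


# Made-up hits for a query of 400 residues.
hits = [
    Hit("ARP1", gid=55.0, aligned_len=360),   # passes both cut-offs
    Hit("ARP10", gid=18.0, aligned_len=300),  # identity too low
    Hit("ARP4", gid=30.0, aligned_len=120),   # coverage too low (30%)
]
print(eligible_families(hits, query_len=400))  # ['ARP1']
```

Only the families returned here would go on to the clustalw alignment and the Knowledge Filter in the later steps.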
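The two identity measures used in the basic sequence analysis, RefID (mean identity of a family to reference actin) and FamID (mean identity inside a family), can be sketched from rows of a multiple alignment. This is an illustration under stated assumptions: the poster does not say which identity convention was used, so the version below ignores columns where either sequence has a gap, and the short sequences are made-up toy data, not real actin or ARP sequences.

```python
from itertools import combinations
from statistics import mean

GAP = "-"


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two rows of the same alignment.

    Columns where either row has a gap are skipped (one common convention).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(x, y) for x, y in zip(a, b) if x != GAP and y != GAP]
    if not compared:
        return 0.0
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


def ref_id(family, reference):
    """Mean percent identity of the family members to the reference row."""
    return mean(percent_identity(seq, reference) for seq in family)


def fam_id(family):
    """Mean pairwise percent identity inside a family (needs >= 2 rows)."""
    return mean(percent_identity(a, b) for a, b in combinations(family, 2))


# Toy aligned rows, for illustration only.
reference = "MDDEIAALVV"
family = ["MDDEVAALVV", "MDDE-AALIV"]

print(round(ref_id(family, reference), 1))  # 89.4
print(round(fam_id(family), 1))             # 88.9
```

Other conventions (for example dividing by the alignment length or by the shorter sequence) give different numbers, so values computed this way are not directly comparable to those on the poster.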
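The pairing of ARP families by presence and absence patterns is an instance of phylogenetic profiling, which the following sketch illustrates. The profiles are invented for five unnamed organisms and do not reproduce the poster's 19-organism table; the function names and the zero-mismatch criterion are likewise assumptions chosen for the example.

```python
from itertools import combinations

# Hypothetical presence (1) / absence (0) profiles across five organisms.
profiles = {
    "ARP2": (1, 1, 0, 1, 0),
    "ARP3": (1, 1, 0, 1, 0),
    "ARP5": (1, 0, 1, 1, 0),
    "ARP8": (1, 0, 1, 1, 0),
    "ARP7": (0, 1, 0, 0, 0),
}


def hamming(p, q):
    """Number of organisms in which the two profiles disagree."""
    return sum(a != b for a, b in zip(p, q))


def co_occurring_pairs(profiles, max_mismatches=0):
    """Family pairs whose profiles differ in at most max_mismatches organisms."""
    return sorted(
        (a, b)
        for a, b in combinations(sorted(profiles), 2)
        if hamming(profiles[a], profiles[b]) <= max_mismatches
    )


print(co_occurring_pairs(profiles))
# [('ARP2', 'ARP3'), ('ARP5', 'ARP8')]
```

A profile that is present in every organism matches any other such profile trivially, so in practice pairs of this kind need support from functional data rather than from the profile alone.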
Acknowledgments: Ministère de la Culture, de l’Enseignement Supérieur et de la Recherche du Luxembourg, Fonds National de la recherche du Luxembourg, CNRS, INSERM, France
