ORDered ALignment Information Explorer. Sequence clustering. Alignment editor. “barcode” = schematic alignment. Conservation computation. 3D viewer. Phylogenetic tree. Features editor. => sequence / structure / function / evolution cross-talks.
Exploring Alignment Information up to the Residue Level
[Figure: alignment information explored along three axes]
- Taxa : single taxa level, clustering level, global level
- Alignment positions : full length, domains, motifs, secondary structures, ..., residues
- Contexts : 3D structure, conservation, phylogeny
Ordalie and Alignments :
Reads ALN, MSF, TFA, RSF, Macsims/XML and ORD file formats.
What is an alignment ?
- description of the alignment (NorMD score, date, etc.)
- set of sequences
    - generic information (length, EC, phylogeny, ...)
    - features (PFAM-A, PROSITE, BLOCK, etc.)
- clustering = groups of sequences
- conservation scores based on clustering
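The "What is an alignment ?" list can be read as a small data model. The sketch below is only an illustration of that reading; every class and field name is hypothetical and does not come from Ordalie itself.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """A feature attached to one sequence (hypothetical layout)."""
    source: str          # e.g. "PFAM-A", "PROSITE", "BLOCK"
    start: int           # first alignment column covered (1-based, inclusive)
    end: int             # last alignment column covered (inclusive)
    label: str = ""


@dataclass
class Sequence:
    """One aligned sequence with its generic information and features."""
    name: str
    residues: str                                   # aligned string, '-' marks a gap
    info: dict = field(default_factory=dict)        # e.g. length, EC, phylogeny
    features: list = field(default_factory=list)    # list of Feature


@dataclass
class Alignment:
    """Description + sequences + clustering + conservation, as on the slide."""
    description: dict = field(default_factory=dict)   # e.g. NorMD score, date
    sequences: list = field(default_factory=list)     # list of Sequence
    clustering: dict = field(default_factory=dict)    # group name -> sequence names
    conservation: dict = field(default_factory=dict)  # group name -> per-column scores

    def width(self):
        """Number of alignment columns (0 for an empty alignment)."""
        return len(self.sequences[0].residues) if self.sequences else 0

    def group_members(self, group):
        """Sequences belonging to one cluster, in alignment order."""
        wanted = set(self.clustering.get(group, []))
        return [s for s in self.sequences if s.name in wanted]
```

Keeping conservation keyed by group mirrors the last bullet: the scores depend on the clustering, so they would need recomputing whenever the groups change.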
Sequence editing and clustering editing apply to the current alignment; the result either overwrites the current MACSIM or creates a new one.
Inside :
- Ordalie parameters (colors, fonts, thresholds, ...)
- Description of the alignment (name, NorMD score, creation date, ...)
- Original set of aligned sequences
    - general information (length, pI, mol. weight, ...)
    - features (Pfam domain, secondary structures, ...)
    - AA sequence
- Coordinates of 3D structures corresponding to PDB entries
- Description of 3D objects (representation type, colors, etc.)
M 1 – original alignment : Original Sequences set
M 2 – macsims clustering : Macsims Clustering, Original Sequences set -> original conservation
M 3 – new clustering : Clustering 1, Sequences set 1 -> conservation
M 4 – edit sequences : Clustering 1, Edit Sequences -> conservation
M 5 – clust. + edit : Clustering 2, Edit Sequences -> conservation
ORD : file format
- SQLite database
- accessible through SQL statements
- ODBC compatible
- platform independent
- lightweight
- contains all Ordalie data
- preferences
- performances
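Because an ORD file is an SQLite database, any SQLite client should be able to inspect it. The slides do not describe the table layout, so this minimal sketch only lists whatever tables exist, using Python's standard `sqlite3` module; the function name `list_tables` and any file name passed to it are illustrative.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the names of all tables in an SQLite file, sorted alphabetically."""
    # Open read-only through a file: URI, so a mistyped path raises an error
    # instead of silently creating a new, empty database.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling `list_tables("toto.ord")` on a real ORD file would show which tables Ordalie actually writes; from there, ordinary `SELECT` statements apply.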
Modes :
- features
- search
- pairwise identity
- sequences editor
- features editor
- clustering
- trees
- conservation
- superposition
Clustering :
• Zone selection :
    - Whole alignment
    - By feature
    - User defined
• Criteria :
    - % identity
    - pI
    - Length
    - Composition (amino acid, physico-chemical groups)
• Clustering methods :
    - Manual clustering by inserting/removing separators
    - Hierarchical classification + Secator
    - Kmeans + DPC
    - Mixture model + AIC
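To make the "% identity" criterion concrete, here is a minimal sketch that groups aligned sequences whose pairwise identity reaches a cutoff. It uses plain single-linkage grouping, not Secator, DPC or AIC, and it assumes identity is counted only over columns where neither sequence has a gap; Ordalie's own definition may differ. All function names are hypothetical.

```python
def percent_identity(a, b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = same = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are ignored
        compared += 1
        if x == y:
            same += 1
    return 100.0 * same / compared if compared else 0.0


def cluster_by_identity(seqs, threshold=80.0):
    """Single-linkage grouping: two sequences share a group if a chain of
    pairs with identity >= threshold connects them.

    seqs maps sequence name -> aligned string. Returns a sorted list of groups.
    """
    names = list(seqs)
    parent = {n: n for n in names}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if percent_identity(seqs[a], seqs[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())
```

Single linkage is the simplest choice here; a hierarchical classification as named on the slide would additionally need a rule for where to cut the tree, which is the role the slide gives to Secator.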
Conservation Methods :
- Threshold
    - Global Identity -> 100% identity
    - Global Conserved -> >80% identity
    - Group Identity -> 100% identity in group
- Mean Distance : cf. ClustalX
- Vector Norm : based on a vectorial (polarity, volume) representation of amino acids
- Liu2 : based on Blosum62
- Entropy : takes gaps and physico-chemical properties of AA into account
Validity of score clustering ?
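The three threshold rules listed above can be sketched per alignment column as follows. This is an illustration under stated assumptions, not Ordalie's implementation: gaps count as an ordinary symbol in the frequencies, an all-gap column or group is never reported as conserved, and the function name and return layout are hypothetical.

```python
from collections import Counter


def classify_column(alignment, groups, col, conserved_cutoff=0.8, gap="-"):
    """Apply the threshold rules to one alignment column.

    alignment: dict mapping sequence name -> aligned string
    groups:    list of lists of sequence names (the clustering)
    Returns a dict with:
      "global": "identity" (100%), "conserved" (> cutoff) or None
      "identical_groups": indices of groups whose members all share one residue
    """
    residues = [seq[col] for seq in alignment.values()]
    symbol, count = Counter(residues).most_common(1)[0]

    global_label = None
    if symbol != gap:  # assumption: a column dominated by gaps is not conserved
        if count == len(residues):
            global_label = "identity"
        elif count / len(residues) > conserved_cutoff:
            global_label = "conserved"

    identical_groups = []
    for index, members in enumerate(groups):
        seen = {alignment[name][col] for name in members}
        if len(seen) == 1 and gap not in seen:
            identical_groups.append(index)

    return {"global": global_label, "identical_groups": identical_groups}
```

A column can be variable globally yet identical inside each group, which is the case that makes conservation scores depend on the clustering.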
Key Usage Points :
Always leave a mode before entering a new one.
Sequences selection : Windows-style
- <Button-1> selects a sequence
- <Control-Button-1> adds the current sequence to the selection
- <Shift-Button-1>
Zone selection :
- All (button)
- selecting a feature : <Control-Button-1>
- manually :
    - <Button-1> for starting point
    - <Button-3> for ending point
    - <Shift-Button-3> to delete a selected zone
TODO List :
Short term :
- Bugs, if any ... ;-)
- group naming
- project handling
- MacOS X version
- documentation and tutorials
- publication
Long term :
- Bugs, if any ... ;-)
- on-line web services
- on-line Macsims calculation
- on-line sequence, information, feature updating
- 3D surface mapping of features
- ...
Running Ordalie :
On surf/lameX :
- setordalie
- ordalie <filename>
- ordalie <filename> option value option value
File formats : MSF, TFA, ALN, RSF, XML/Macsims and ORD
Conversion : ordalie toto.msf –convert ALN -> toto.aln