Multiple Sequence Alignments
Build a Sequence Alignment • Sequences are usually aligned automatically • MUSCLE, PRANK, CLUSTAL etc. • Also possible 'manually' using tools such as JalView • Hopefully, these demonstrations will highlight that • Alignment is "trivial" (at one level, at least) • only involves putting gap characters in the right places
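The point that alignment "only involves putting gap characters in the right places" can be illustrated with a minimal Python sketch. The helper name `insert_gaps`, the toy sequences and the chosen gap columns are all hypothetical; deciding where the gaps belong is the hard part, and that is what tools such as MUSCLE do.

```python
def insert_gaps(sequence, gap_columns, gap_char="-"):
    """Return `sequence` padded so that a gap occupies each listed column
    (0-based) of the aligned result. Hypothetical helper for illustration."""
    gaps = set(gap_columns)
    total = len(sequence) + len(gaps)
    if any(col < 0 or col >= total for col in gaps):
        raise ValueError("gap column outside the aligned length")
    residues = iter(sequence)
    aligned = []
    for column in range(total):
        # Emit a gap in the requested columns, otherwise the next residue.
        aligned.append(gap_char if column in gaps else next(residues))
    return "".join(aligned)


# Toy sequences (not real data): seq_b lacks two residues found in seq_a.
seq_a = "MVLSPADK"
seq_b = "MVLSDK"

aligned_a = insert_gaps(seq_a, [])
aligned_b = insert_gaps(seq_b, [4, 5])

print(aligned_a)
print(aligned_b)
```

Both rows end up the same length, and removing the gap characters recovers the original sequences unchanged.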
Build an Automatic MSA Search Internet for "EBI Muscle" http://www.ebi.ac.uk/Tools/msa/muscle/
Build an Automatic MSA Copy and paste sequences in FASTA format http://www.embl.de/~seqanal/courses/commonCourseContent/sequences/verySimilarHemoglobins_unaligned.fasta Click "Submit"
Build an Automatic MSA Wait for result to be returned Click "Download Alignment File" to reach plain-text version of alignment
Build an Automatic MSA Download file or copy-paste text into text editor to store alignment on local computer View alignment in MSA viewer (e.g. JalView) etc.
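Once the alignment is stored locally it can also be read programmatically. The sketch below assumes the alignment was saved in FASTA format (other output formats would need a different parser); `parse_fasta`, `is_rectangular` and the toy text are hypothetical names and data, not part of any tool mentioned above.

```python
def parse_fasta(text):
    """Parse FASTA text into {name: sequence}; the name is the first word
    after '>' on each header line."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            words = line[1:].split()
            name = words[0] if words else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def is_rectangular(alignment):
    """An alignment is only valid if every row has the same length."""
    return len({len(seq) for seq in alignment.values()}) <= 1


# Toy aligned FASTA text (not real hemoglobin data).
toy_text = """>seq1 toy example
MVLS--DK
>seq2
MVLSPADK
"""

alignment = parse_fasta(toy_text)
print(alignment, is_rectangular(alignment))
```

A rectangular check like this is a quick way to confirm that the file holds an alignment rather than the unaligned input sequences.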
Choosing an MSA tool Different tools designed for different tasks • CLUSTALX, MUSCLE, PROBCONS • divergent protein sequences • NAST • multiple alignment of 16S rRNA genes • PRANK • multiple alignment of relatively similar DNA sequences in an evolutionary context • EXPRESSO(3DCoffee) • multiple alignment of protein sequences, some of which have 3D structural information • MAUVE, Enredo • multiple alignment of genomes • and many others...
CLUSTALX Colouring Scheme • Only one of many possible colouring schemes • Good at highlighting variation in conservation between • Designed for red/green colour-blindness extract from an alignment of p53 proteins
CLUSTALX Colouring Scheme Amino acids with similar properties drawn with the same colour • e.g. basic residues arginine (R) and lysine (K) (extract from an alignment of p53 proteins)
CLUSTALX Colouring Scheme Residues only coloured... • ... if some proportion of residues in the column have the same property • e.g. lysine (K) in columns with: • only "a few" other basic residues (uncoloured) • "many" other basic residues (coloured) • EXCEPT for P and G, which are always coloured (extract from an alignment of p53 proteins)
CLUSTALX Colouring Scheme • Hydrophobic: L V I M F W A C • Polar: N T S Q • Acidic: D E • Large Aromatic Polar: H Y • Basic: K R • Secondary-structure breaking: G P (CLUSTALX help file fully describes the default colouring rules)
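The colouring idea from these slides can be sketched in Python as follows. This is a simplified illustration, not the actual CLUSTALX rule set: the residue groups are taken from the slide above, but the single `threshold` of 0.6 is an assumed value, and the real per-colour rules are in the CLUSTALX help file.

```python
# Residue groups as listed on the slide.
GROUPS = {
    "hydrophobic": set("LVIMFWAC"),
    "polar": set("NTSQ"),
    "acidic": set("DE"),
    "large aromatic polar": set("HY"),
    "basic": set("KR"),
}
ALWAYS_COLOURED = set("GP")  # secondary-structure breaking residues


def group_of(residue):
    """Return the property group of a residue, or None (e.g. for gaps)."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    return None


def is_coloured(residue, column, threshold=0.6):
    """Colour a residue if enough of its column shares its property.
    `threshold` is an assumed value, not the CLUSTALX default."""
    if residue in ALWAYS_COLOURED:
        return True
    group = group_of(residue)
    if group is None or not column:
        return False
    same = sum(1 for r in column if r in GROUPS[group])
    return same / len(column) >= threshold


print(is_coloured("K", "KKRKA"))  # "many" other basic residues
print(is_coloured("K", "KLLVA"))  # only "a few" other basic residues
print(is_coloured("G", "GLLVA"))  # G is always coloured
```

With these toy columns, the lysine is coloured in the mostly basic column and left uncoloured in the mostly hydrophobic one, while glycine is coloured regardless of its neighbours.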
Common Patterns - Buried Beta-Strand Response regulator receiver domain http://tardis.nibio.go.jp/cgi-bin/homstrad/showpage.cgi?family=response_reg&disp=str
Common Patterns - Amphipathic Partially-Buried Alpha-Helices Response regulator receiver domain http://tardis.nibio.go.jp/cgi-bin/homstrad/showpage.cgi?family=response_reg&disp=str
Common Patterns - Amphipathic Beta Strands ubiquitin conjugating enzyme
Common Patterns - Non-Globular Sequence • Different, more strongly biased sequence composition (relative to equal representation of each of the 20 amino acids) • Sometimes more variable sequence (more substitutions, more gaps) than globular/structured regions
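One way to put a number on "biased composition" is the Shannon entropy of the residue frequencies in a region. This is a minimal sketch of one possible measure, not a method named in the slides; `composition_entropy` and the toy sequences are hypothetical. Equal use of all 20 amino acids gives the maximum of log2(20), about 4.32 bits, and strongly biased regions score lower.

```python
import math
from collections import Counter


def composition_entropy(sequence):
    """Shannon entropy (bits) of the residue composition, ignoring gaps
    and any other non-letter characters."""
    residues = [r for r in sequence.upper() if r.isalpha()]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy sequences (not real proteins).
balanced = "ACDEFGHIKLMNPQRSTVWY"  # each amino acid once
biased = "PQPQPQPQ"                # only two residue types

print(composition_entropy(balanced))
print(composition_entropy(biased))
```

In practice such a score would be computed over a sliding window along each sequence, so that low-entropy stretches stand out against the globular regions around them.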
Identifying Mis-Aligned Regions • Identify a region of a sequence that you think is misaligned • Look at patterns of conservation, and sequences which • Decide how you would "fix" this misalignment
Unusual Sequences: Examples • Short/fragmented sequences • With CLUSTALX "Quality" -> "Show Low-Scoring Segments" switched on • Unusual pattern of "conservation"
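Short or fragmented sequences can also be spotted outside CLUSTALX by counting gaps per row. The sketch below is not the CLUSTALX low-scoring-segment calculation; it is a simple stand-in, and `gap_fraction`, `flag_fragments`, the 0.5 cut-off and the toy alignment are all assumptions made for illustration.

```python
def gap_fraction(aligned_sequence, gap_char="-"):
    """Fraction of alignment columns in which this row holds a gap."""
    if not aligned_sequence:
        return 0.0
    return aligned_sequence.count(gap_char) / len(aligned_sequence)


def flag_fragments(alignment, max_gap_fraction=0.5):
    """Return names of rows that are mostly gaps (assumed cut-off: 0.5)."""
    return [
        name
        for name, seq in alignment.items()
        if gap_fraction(seq) > max_gap_fraction
    ]


# Toy alignment (not real data): one row covers only a short stretch.
toy_alignment = {
    "full_a": "MVLSPADKTN",
    "full_b": "MVLSGEDKSN",
    "fragment": "------DKT-",
}

print(flag_fragments(toy_alignment))
```

Rows flagged this way are candidates for removal or separate treatment before the alignment is used further, since they contribute little to most columns.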
Using MSAs to Improve Prediction of Linear Motifs • Demonstration and Exercise