Bioinformatics – a definition?
The design, construction and use of software tools to generate, store, annotate, access and analyse data and information relating to Molecular Biology.
OR
Biologists doing “stuff” with computers?
Here we consider the use of Bioinformatics tools rather than their design and construction.
Here we consider the access and analysis of data and information items rather than their generation, storage or annotation.
Software Tools for Sequence Analysis
General Packages: Packages that offer a comprehensive range of bioinformatics tools for sequence analysis. Most researchers would expect to use such packages at some time.
Specialised Packages: Packages that offer tools for a particular type of analysis. Used intensely by researchers in the relevant area, not at all by everyone else.
WWW Resources: Tools whose nature inclines them to be primarily accessed over the network.
These categorisations are very general. Many specialist programs are incorporated into the general packages. Most things can be done at a web site somewhere.
Sequence Analysis – an Overview
Overview diagram covering Nucleic Acid Sequences and Protein Sequences. Topics shown, in slide order: Sequencing Project Management; Database Retrieval; Restriction Mapping; Primer Design; DNA/RNA Folding; Nucleic Acid Sequence Analysis; Database Retrieval; Seeking Coding Regions; Database Similarity Searching; Translation to Amino Acids; Pairwise Sequence Comparison; Multiple Sequence Alignment; Protein Sequence Analysis; Prediction of Function; Structure Prediction; Motifs and Patterns; Phylogeny; Structure Analysis.
Software Tools for Sequence Analysis
General Packages:
Open source; UNIX only; several GUIs (Java, WWW, X); comprehensive.
Open source; Windows, MacOS X, UNIX; reasonable GUI including interactive graphical output; not comprehensive but allows access to EMBOSS.
Software Tools for Sequence Analysis
General Packages:
Commercial: expensive; Windows PCs or Macintoshes; good GUIs.
Other options: public domain; Windows, Macintosh, UNIX; modern intuitive GUI; access remote databases.
Software Tools for Sequence Analysis
WWW Resources: Database Retrieval
Sequence Retrieval System (SRS), Bioscience AG
Retrieves MUCH more than sequences.
Core elements free to academic sites.
Implemented in many places.
It is possible to integrate analysis tools.
Elements of SRS are incorporated into EMBOSS.
Software Tools for Sequence Analysis
WWW Resources: Database Retrieval
Entrez
Retrieves MUCH more than sequences.
Access to NCBI databases only.
Entrez client software available by anonymous ftp.
Most general packages include tools to access local sequence databases.
EMBOSS programs can access sequences from remote SRS servers.
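As a rough illustration of retrieving a sequence from the NCBI databases over the network, the sketch below builds a request URL for NCBI's public E-utilities `efetch` service and parses FASTA text of the kind it returns. The slides do not describe this interface. The accession `NM_000546`, the helper names `build_efetch_url` and `parse_fasta`, and the choice of the `nucleotide` database are assumptions made for the example, and no network request is actually sent.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities efetch service.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, db="nucleotide", rettype="fasta"):
    """Return an efetch URL for one record (hypothetical helper, no network access)."""
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    return EFETCH_URL + "?" + urlencode(params)


def parse_fasta(text):
    """Split FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


if __name__ == "__main__":
    print(build_efetch_url("NM_000546"))  # example accession, an assumption
    print(parse_fasta(">demo sequence\nACGT\nTTGA\n"))
```

To fetch a real record, the URL could be passed to `urllib.request.urlopen` and the decoded response handed to `parse_fasta`. NCBI asks clients to keep request rates low.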
Databases
Databases are available from WWW sites and are highly interlinked.
Clinical and Mutation: OMIM, MGMD
Bibliographic: PubMed
Raw Sequence: as accessed for “sequence retrieval”
Databases
Sequence Databases: contain both raw sequence data and annotation.
DNA Sequences: EMBL (European Molecular Biology Laboratory); GenBank (NCBI); RefSeq (NCBI); DDBJ (DNA Data Bank of Japan)
Protein Sequences: RefSeq (NCBI); PIR; TrEMBL (GenPept)
Databases
Databases are available from WWW sites and are highly interlinked.
Clinical and Mutation: OMIM, MGMD
Bibliographic: PubMed
Raw Sequence: as accessed for “sequence retrieval”
Alignments and Patterns: as generated by analysis software
Databases
Alignments and Patterns: Alignments
Aligned protein families. Composed of a number of sections.
Aligned protein domains. Automatically generated from protein sequence databases.
Conserved “blocks” of protein alignments. Used to compute scoring schemes for protein comparisons.
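To make the link between conserved blocks and scoring schemes concrete, here is a minimal sketch of one common approach: count the residue pairs seen in each ungapped column of a block, then score each pair by the log-odds of its observed frequency against the frequency expected by chance. The slides do not say which method is used, so this is an assumption. The function name `log_odds_scores` and the toy block are hypothetical, and the sequence clustering or weighting that real scoring matrices rely on is left out.

```python
import math
from collections import Counter
from itertools import combinations


def log_odds_scores(block):
    """Score residue pairs from an ungapped block of equal-length sequences.

    Returns {(a, b): score} with a <= b, in half-bit units (2 * log2 ratio).
    """
    # Count every unordered residue pair within each aligned column.
    pair_counts = Counter()
    for column in zip(*block):
        for a, b in combinations(column, 2):
            pair_counts[tuple(sorted((a, b)))] += 1

    total_pairs = sum(pair_counts.values())
    observed = {pair: n / total_pairs for pair, n in pair_counts.items()}

    # Background residue frequencies implied by the observed pairs.
    background = Counter()
    for (a, b), freq in observed.items():
        if a == b:
            background[a] += freq
        else:
            background[a] += freq / 2
            background[b] += freq / 2

    scores = {}
    for (a, b), freq in observed.items():
        # Chance expectation: p_a^2 for identical pairs, 2 * p_a * p_b otherwise.
        expected = background[a] ** 2 if a == b else 2 * background[a] * background[b]
        scores[(a, b)] = 2 * math.log2(freq / expected)
    return scores


if __name__ == "__main__":
    toy_block = ["AC", "AC", "AG"]  # made-up data, for illustration only
    for pair, score in sorted(log_odds_scores(toy_block).items()):
        print(pair, round(score, 2))
```

Pairs that occur more often in the blocks than chance predicts get positive scores. Pairs never observed are simply absent from the result, so a practical scheme would need pseudocounts or far more data.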
Databases
Alignments and Patterns: Patterns
Patterns are largely derived from the conserved portions of aligned protein families.
Representations of single motifs. Now comprising both simple patterns and HMM profiles.
Representations of patterns of motifs (fingerPRINTS).
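A "simple pattern" for a single motif can be treated as a restricted regular expression. The sketch below assumes the PROSITE-style notation, which the slides do not name: elements are joined by `-`, `x` means any residue, `[ST]` means one of the listed residues, `{P}` means any residue except those listed, `(n)` or `(n,m)` gives a repeat count, and `<` and `>` anchor the pattern to the sequence ends. The helper names `pattern_to_regex` and `find_motifs` and the test sequence are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Translate a PROSITE-style pattern string into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = repeat = ""
        if element.startswith("<"):      # anchor at the start of the sequence
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchor at the end of the sequence
            suffix, element = "$", element[:-1]
        if element.endswith(")"):        # repeat count such as x(2) or x(2,4)
            element, _, count = element[:-1].partition("(")
            repeat = "{" + count + "}"
        if element == "x":               # any residue
            core = "."
        elif element.startswith("{"):    # any residue except those listed
            core = "[^" + element[1:-1] + "]"
        else:                            # a single residue or a [..] class
            core = element
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def find_motifs(pattern, sequence):
    """Return (1-based start, matched text) for every match, overlaps included."""
    regex = re.compile("(?=(" + pattern_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # N-glycosylation site pattern, applied to a made-up sequence.
    print(pattern_to_regex("N-{P}-[ST]-{P}"))
    print(find_motifs("N-{P}-[ST]-{P}", "MKNGSAANPSL"))
```

The lookahead wrapper lets overlapping occurrences be reported, which a plain `finditer` would skip. Profiles and HMMs score matches probabilistically and are not covered by a translation like this.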
Databases
Databases are available from WWW sites and are highly interlinked.
Clinical and Mutation: OMIM, MGMD
Bibliographic: PubMed
Raw Sequence: as accessed for “sequence retrieval”
Alignments and Patterns: as generated by analysis software
Structural: PDB
Integrated: Ensembl