Tools and Datasets: Exploring the tools of the trade
Sequence Databases • Understanding EMBL Entries • Understanding SWISS-PROT Entries
General Concepts and Methods • Predictions and Validation
Maxim 17.1: Recognise the difference between validating a model and testing it for self-consistency.
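The distinction can be illustrated with a deliberately naive sketch. It assumes that "self-consistency" means re-testing a model on the data used to build it, and that "validation" means testing on independent data. The lookup-table model, the sequences and the labels are all invented for illustration.

```python
from collections import Counter

def train_lookup_model(examples):
    """Build a model that simply memorises its training examples.

    Unseen inputs get the most common training label (hypothetical design).
    """
    table = dict(examples)
    default = Counter(label for _, label in examples).most_common(1)[0][0]
    return lambda seq: table.get(seq, default)

def accuracy(model, examples):
    """Fraction of examples for which the model's prediction is correct."""
    return sum(model(seq) == label for seq, label in examples) / len(examples)

# Made-up training data: short sequences labelled as containing a site or not.
training = [("AAAT", "site"), ("CCGG", "none"), ("AATT", "site"),
            ("GGCC", "none"), ("GCGC", "none")]
# Made-up independent data that the model has never seen.
held_out = [("ATAT", "site"), ("TTAA", "site"),
            ("CGCG", "none"), ("TATA", "site")]

model = train_lookup_model(training)
print("self-consistency:", accuracy(model, training))
print("validation:", accuracy(model, held_out))
```

A model that scores perfectly on its own training data has only been shown to be self-consistent. The much lower score on the independent set is the figure that says something about its predictive value.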
Maxim 17.2: Generally, False Negative predictions are considered more acceptable than False Positives.
[Figure: Assessment/Validation Procedure and Possible Outcomes]
Maxim 17.3: A False Negative can be recovered later: we can come back next year and find the ones we missed. A False Positive costs time now: we study it this year, only to find out that the time was wasted. This is why False Negatives are usually preferred, but it all depends on the circumstances.
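The four possible outcomes of an assessment can be tallied as in the sketch below. The function names and the example predictions are illustrative assumptions, not taken from the slides; sensitivity and specificity are the standard summary measures of how many false negatives and false positives a predictor produces.

```python
def tally_outcomes(predicted, actual):
    """Count true/false positives and negatives for paired boolean lists."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for p, a in zip(predicted, actual):
        if p and a:
            counts["TP"] += 1      # predicted, and really there
        elif p and not a:
            counts["FP"] += 1      # predicted, but not there
        elif not p and a:
            counts["FN"] += 1      # missed
        else:
            counts["TN"] += 1      # correctly rejected
    return counts

def sensitivity(counts):
    """Fraction of real positives that were found: TP / (TP + FN)."""
    return counts["TP"] / (counts["TP"] + counts["FN"])

def specificity(counts):
    """Fraction of real negatives that were rejected: TN / (TN + FP)."""
    return counts["TN"] / (counts["TN"] + counts["FP"])

# Made-up example: six candidate predictions against the known truth.
predicted = [True, True, False, False, True, False]
actual = [True, False, True, False, True, False]
counts = tally_outcomes(predicted, actual)
print(counts, sensitivity(counts), specificity(counts))
```

A predictor tuned to avoid False Positives will tend to trade away sensitivity for specificity. Whether that trade is acceptable depends on the circumstances.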
Maxim 17.4: Sometimes all those false positives are maybe, just maybe, trying to tell you something. So, if you aspire to a Nobel prize ...
Maxim 17.5: Use a fast, if inaccurate, algorithm to protect your slow, accurate second-stage algorithm.
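One way to picture this two-stage arrangement is the sketch below. It assumes a cheap shared k-mer count as the fast first stage and a basic local alignment score (Smith-Waterman style, with linear gap penalties) as the slow second stage. The function names, scoring values, threshold and sequences are all illustrative choices, not the method of any particular tool.

```python
def shared_kmers(a, b, k=3):
    """Fast, crude filter: number of distinct k-mers the two sequences share."""
    kmers_a = {a[i:i + k] for i in range(len(a) - k + 1)}
    kmers_b = {b[i:i + k] for i in range(len(b) - k + 1)}
    return len(kmers_a & kmers_b)

def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Slow, accurate stage: best local alignment score by dynamic programming."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

def two_stage_search(query, database, min_shared=2):
    """Run the slow scorer only on sequences that pass the fast filter."""
    hits, slow_calls = [], 0
    for name, seq in database.items():
        if shared_kmers(query, seq) < min_shared:
            continue               # rejected cheaply; may be a false negative
        slow_calls += 1
        hits.append((name, local_alignment_score(query, seq)))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits, slow_calls

# Made-up sequences for illustration only.
database = {"a": "MKTAYIAKQR", "b": "GGGGGGGGGG", "c": "MKTAYLAKQR"}
hits, slow_calls = two_stage_search("MKTAYIAKQR", database)
print(hits, "slow-stage calls:", slow_calls)
```

The filter spares the expensive stage from scoring the unrelated sequence. The cost is that anything the filter rejects is never examined, so a filter that is too strict turns directly into false negatives.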
[Figure: An overview of tRNA: 2D, 3D and Gene Structure]
Introducing Bioinformatics Tools http://www.ncbi.nlm.nih.gov/Education/
ClustalW http://www-igbmc.u-strasbg.fr/BioInfo/ ftp://ftp.ebi.ac.uk/pub/software
[Figure: ClustalX operating under Windows XP]
Algorithms and Methods
$ gzip -d clustalw1.83.UNIX.tar.gz
$ tar -xvf clustalw1.83.UNIX.tar
$ cd clustalw1.83
$ make
$ ./clustalw
$ ./clustalw -h
$ ./clustalw -INFILE=../MerAHMAs_MerP.swp -OUTFILE=../Mer.aln
Maxim 17.6: Exactly which BLAST is best depends on the circumstances.
Installing NCBI-BLAST
$ cd
$ mkdir blast
$ cp blast-2.2.6-ia32-linux.tar.gz blast
$ cd blast
$ gzip -d blast-2.2.6-ia32-linux.tar.gz
$ tar -xvf blast-2.2.6-ia32-linux.tar

Configuration file (.ncbirc in the home directory), pointing BLAST at its data directory:
[NCBI]
Data="/home/michael/blast/data"
Preparation of database files for faster searching
$ mkdir databases
$ cd databases
$ mv ../All_Mer_Proteins.fsa .
$ ../formatdb -i All_Mer_Proteins.fsa -p T -o T -n Merproteins
$ blastall -p blastp -d databases/Merproteins -i test_seq.fsa
$ sed 's/sw|/sp|/' All_Mer_Proteins.fsa > Mer_db.prot
$ ../formatdb -i Mer_db.prot -p T -o T -n Merproteins
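The sed step rewrites the database tag in each FASTA header from sw| to sp| before the database is rebuilt. A rough Python equivalent is sketched below. The function name and the sample records (including the accession) are made up for illustration, and unlike the sed command this version touches header lines only.

```python
def fix_swissprot_tag(lines):
    """Yield FASTA lines with the first 'sw|' in each header changed to 'sp|'."""
    for line in lines:
        if line.startswith(">"):
            # Only the first occurrence is replaced, as with sed's s/sw|/sp|/.
            line = line.replace("sw|", "sp|", 1)
        yield line

# Hypothetical records; the accession and second header are invented.
sample = [
    ">sw|P00000|MERA_SHIFL example description\n",
    "MKTAYIAKQR\n",
    ">gi|1|other\n",
]
for line in fix_swissprot_tag(sample):
    print(line, end="")
```

To process a real file, the same generator could be fed an open file handle and its output written to a new file, leaving the original untouched as the sed command does.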
The different types of BLAST search
$ fastacmd -d databases/Merproteins -I
$ fastacmd -d databases/Merproteins -s MERA_SHIFL
$ blastclust -d databases/Merproteins | head