LSM2104 Project Guidelines 18th February 2003
Outline • Step 0: How to use MS-FrontPage to create web page(s) • Go through Step 1 of the project • Using a known gene to BLAST against the B. pseudomallei genome itself • Report the location of the CDS • Go through Step 2 of the project • Using unknown genes from related organisms to BLAST against the B. pseudomallei genome • Find and report the location of the CDS • Using ClustalW to do a multiple sequence alignment (MSA)
Step 0: Using MS-FrontPage • Inserting email address • Inserting images & background • Inserting hyperlinks • Bookmark(s) within the page • Link(s) to other web pages • NOTE: uploading to the web server will be taught in a later session
Step 1: Find and locate a known B. pseudomallei gene on the genome • Go to GenBank/Swiss-Prot: http://www3.ncbi.nlm.nih.gov/ • Type in the keywords: pseudomallei flagellin complete cds • Look at the entry: AF030239 • Extract the protein sequence (in FASTA format) to use as the BLAST query
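The FASTA step above can be sketched in Python. This is a minimal illustration, not part of the course tools: the function names `parse_fasta` and `format_fasta` are hypothetical, and the residues in `example` are placeholders, not the real flagellin sequence from AF030239.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def format_fasta(header, sequence, width=60):
    """Write one record as FASTA text with fixed-width sequence lines."""
    lines = [">" + header]
    for i in range(0, len(sequence), width):
        lines.append(sequence[i:i + width])
    return "\n".join(lines)


# Placeholder residues for illustration only (hypothetical record).
example = ">example_protein hypothetical record\nMKTAYIAK\nQRQISFVK\n"
records = parse_fasta(example)
```

The sequence part of a record (the second element of each tuple) is what gets pasted into the BLAST query box; the header line is optional for most web forms.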
BLAST Overview • blastp: an amino acid query sequence against a protein sequence database • blastn: a nucleotide query sequence against a nucleotide sequence database • blastx: a nucleotide query sequence translated in all reading frames against a protein sequence database • tblastn: a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames • tblastx: the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database
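The overview above amounts to a lookup on the query type and the database type. A small sketch of that lookup follows; the function `choose_blast_program` and its `translate_both` flag are hypothetical names used only for illustration.

```python
# (query type, database type) -> BLAST program, as listed in the overview.
BLAST_PROGRAMS = {
    ("protein", "protein"): "blastp",
    ("nucleotide", "nucleotide"): "blastn",
    ("nucleotide", "protein"): "blastx",
    ("protein", "nucleotide"): "tblastn",
}


def choose_blast_program(query_type, database_type, translate_both=False):
    """Return the BLAST program name for a query/database combination.

    translate_both=True selects tblastx, which translates both a
    nucleotide query and a nucleotide database in six frames.
    """
    if translate_both:
        if (query_type, database_type) != ("nucleotide", "nucleotide"):
            raise ValueError("tblastx needs a nucleotide query and database")
        return "tblastx"
    return BLAST_PROGRAMS[(query_type, database_type)]


# A protein query against a genome (nucleotide) database, as in this project.
program = choose_blast_program("protein", "nucleotide")
```

For this project the query is a protein and the database is the genome, which is why tBLASTn is the program used in the following steps.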
How to BLAST • Perform tBLASTn at http://sf01.bic.nus.edu.sg/blast/blast.html • Output explanation (http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/glossary2.html): Score, Expectation value, Identities, Positives, Frame • Report the location of the CDS
Your Assigned Gene for Step 1 • Must include at least this gene: Q9RGS8, non-hemolytic phospholipase C, from Swiss-Prot
Step 2: Find and locate unknown genes in Burkholderia pseudomallei • Minimum: must include at least the 2 assigned genes from the Burkholderia genus (genes not yet known in B. pseudomallei) • Extract the protein sequences and run tBLASTn against the B. pseudomallei genome • Find and report the locations of the CDS • Use ClustalW to confirm your findings
Reporting location of CDS • [Figure: BLAST results; labels: B. pseudomallei genome, complete CDS, matched fragment, query sequence]
Pick a known gene from a related organism in the same genus • For example: N-acyl homoserine lactone synthase (AAK70351) from Burkholderia stabilis
http://sf01.bic.nus.edu.sg/extract/ • Extension of ends • Direction
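The "Direction" option matters because a gene can lie on either strand of the genome. For a gene on the reverse strand, the coding sequence is the reverse complement of the extracted region. A minimal sketch, assuming plain a/c/g/t input (the helper name `reverse_complement` is illustrative and is not the extract tool's own code):

```python
# Translation table mapping each base to its complement, in both cases.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


# A region ending in "cat" reads "atg" (a start codon) on the other strand.
flipped = reverse_complement("cat")
```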
BpsChrom2 Burkholderia pseudomallei chromosome 2 (1177310-1178131 ) atcgcgccgcgcgcgcgaaacacgagcccctgtctgccgagccgcacgagcggcaggcgt tcggcgaacgacgggaacgcgacggcgatgcgggtttcgccggcatcgaacgtagggagc atcgcgcgaaataccgttgaatggtccacggtgtagaggtctccttgaatgacgaacggc gcggccccgaagcggggcgaccgggcgcgctcaggcggcttcggcgggcggcgcgcacag cagcgggtcgagatcgagcgcggcgagcgtttgcgcgtcgaggtcgatccagcacgcgac gaccatgcgcccgtcgatctgctgcgcgggccccgcccggtgcgcgtgcacgccgatccg gcggaacaggcgctccatgctcagaaacgtcacgccgatcagttgcttcgcgccaagccg cgcggcgcactcgacgacggcggcgagcatcggccgcaccgcccaggccgggttgccgcc cccggccggatcctcggcgttcgcggcgaagcgcgacaattcccagacggcggcggattg cggcaacggcatgtcttgcgcgaccagcgtcgggaacagttccttcagcagatacgggcg ggtcgtcggcagcagccgggcgcagccgcagatttccccgtcgtcgtcgcgggcgaacac atagacggtatcgtcgcgatcgtactgatcccgctcgaacccttcgcttgccgacggcag tttccagccgagctgctcgacgaaaactcggtgccgataaaggcccagatcagccgccaa gtcgctcggcaggcgcccgtcgccatgaacgaaagttcgcat
How to use ClustalW • Input sequences: 4 flagellin genes from different organisms, taken from GenBank and saved into one file in FASTA format: • AJ496283. Legionella longbe...[gi:22553065] • AJ496282. Fluoribacter boze...[gi:22553063] • AF030239. Burkholderia pseu...[gi:3337408] • AF307102. Borrelia parkeri ...[gi:11095317] • Copy and paste into the window
Symbols Found in the Results • '*' indicates positions which have a single, fully conserved residue • ':' indicates that one of the following 'strong' groups is fully conserved: STA, NEQK, NHQK, NDEQ, QHRK, MILV, MILF, HY, FYW • '.' indicates that one of the following 'weaker' groups is fully conserved: CSA, ATV, SAG, STNK, STPA, SGND, SNDEQK, NDEQHK, NEQHRK, FVLIM, HFY
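The symbol rules above can be written out as a short Python sketch that labels each column of an alignment. The group lists are taken from the slide; the function names are hypothetical, and treating any column containing a gap ('-') as unmarked is an assumption of this sketch.

```python
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK",
                 "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def conservation_symbol(column):
    """Return '*', ':', '.' or ' ' for one alignment column (a string)."""
    residues = set(column.upper())
    if "-" in residues:
        return " "  # assumption: columns with gaps get no symbol
    if len(residues) == 1:
        return "*"  # single, fully conserved residue
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"  # all residues fall inside one 'strong' group
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."  # all residues fall inside one 'weaker' group
    return " "


def conservation_line(aligned_sequences):
    """Build the symbol line for equal-length aligned sequences."""
    columns = zip(*aligned_sequences)
    return "".join(conservation_symbol("".join(col)) for col in columns)


# Toy alignment (made-up residues, for illustration only).
line = conservation_line(["MSTW", "MSAC", "MTAD"])
```

In the toy alignment, the first column is identical in all three sequences, the second and third columns stay within the STA group, and the last column mixes unrelated residues.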
Your minimum genes for Step 2 • Include at least these 2 genes (GenBank entries from the Burkholderia genus): • AF525414 • AF333004
Step 1 (cont.) • Go to http://sf01.bic.nus.edu.sg/extract/e2.html • Type in the start and end base numbers (e.g. extend each end by 150 nucleotides) • Start position: (2902870 – 150) • End position: (2904966 + 150) • Explain why and when you need to add to or subtract from the positions
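The arithmetic on this slide can be checked with a few lines of Python. One likely reason for extending both ends is that the BLAST match may cover only part of the CDS, so the true start and stop codons can lie outside the matched fragment. The function below is a hypothetical helper, and clamping at position 1 and at an optional genome length is an assumption added for safety.

```python
def extend_region(start, end, flank=150, genome_length=None):
    """Widen a 1-based genome region by `flank` bases on each side."""
    if start > end:
        # Reverse-strand hits may be reported with start > end.
        start, end = end, start
    new_start = max(1, start - flank)
    new_end = end + flank
    if genome_length is not None:
        new_end = min(genome_length, new_end)
    return new_start, new_end


# Positions from the slide, extended by 150 nucleotides on each side.
region = extend_region(2902870, 2904966)
```

With the slide's numbers, the extended region runs from 2902720 to 2905116.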
Step 1 (cont.) • Extract the nucleotide sequence from the output • Translate it by pasting the sequence at http://sf01.bic.nus.edu.sg/translate/ • Extract the amino acid sequence • Run BLAST again on the extracted sequence at http://sf01.bic.nus.edu.sg/blast/blast.html
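To see what the translation step does, here is a minimal six-frame translation sketch using the standard genetic code. It is an illustration only and not the code behind the translate page; the function names are hypothetical, and codons containing characters other than A, C, G, T are shown as 'X' by assumption.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate DNA from offset `frame` (0, 1 or 2); '*' marks a stop."""
    dna = dna.upper()
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


def six_frame_translations(dna):
    """Return translations keyed by frame: +1..+3 forward, -1..-3 reverse."""
    complement = str.maketrans("ACGT", "TGCA")
    reverse = dna.upper().translate(complement)[::-1]
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna, offset)
        frames[-(offset + 1)] = translate(reverse, offset)
    return frames


# Toy sequence (made up): its reverse strand reads ATG GCC TAA.
frames = six_frame_translations("TTAGGCCAT")
```

The frame that gives a long stretch without '*' is usually the one to carry forward into the second BLAST search.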