Genetic Sequence Alignment Study

Explore gene sequences, compare alignments, and analyze conservation levels using BLAST and ClustalW. Examine amino acid alignments on Pfam and practice modifying a Java program.

Presentation Transcript


  1. CS177 homework, assigned March 2
     • this can be either a group or an individual assignment, whichever is easier for you
     • this will cover multiple alignment, html, and java
     • this is a lot of stuff, so it is due in 2 weeks
     • the balance should start shifting from assigned homework to doing your projects

  2. “vertical” multiple alignment
     • pick a gene, find the human mRNA RefSeq (i.e., NM_XXXX), and query NCBI nucleotides using BLAST
     • see if you get hits to 5 or 6 different species
       • if not, try another gene
     • pick out one good hit (i.e., low p value and pretty long) for each species (including the original human RefSeq)
     • submit these sequences to ClustalW
     • visually identify three columns in the alignment: one that exemplifies high conservation, one moderate conservation, and one poor conservation
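The "high / moderate / poor" call on a column can also be made roughly with a few lines of Java. This is only a sketch for checking your visual picks: the five aligned rows are invented, the class and method names are hypothetical, and the 0.9 and 0.5 cutoffs are arbitrary assumptions rather than anything ClustalW defines.

```java
import java.util.HashMap;
import java.util.Map;

class ColumnConservation {

    /**
     * Fraction of rows that share the most common residue in one column.
     * Gap characters ('-') are never counted as agreeing with anything.
     */
    static double topFraction(String[] rows, int col) {
        Map<Character, Integer> counts = new HashMap<>();
        int best = 0;
        for (String row : rows) {
            char c = Character.toUpperCase(row.charAt(col));
            if (c == '-') {
                continue; // skip gaps
            }
            int n = counts.merge(c, 1, Integer::sum);
            best = Math.max(best, n);
        }
        return (double) best / rows.length;
    }

    /** Assumed cutoffs: >= 0.9 is "high", >= 0.5 is "moderate", else "poor". */
    static String classify(double fraction) {
        if (fraction >= 0.9) return "high";
        if (fraction >= 0.5) return "moderate";
        return "poor";
    }

    public static void main(String[] args) {
        // Invented toy alignment: one row per species, all rows the same length.
        String[] rows = {
            "ATGCCA",
            "ATGCTA",
            "ATGCGA",
            "ATGT-A",
            "ATGAAA",
        };
        for (int col = 0; col < rows[0].length(); col++) {
            double f = topFraction(rows, col);
            System.out.printf("column %d: %.2f %s%n", col + 1, f, classify(f));
        }
    }
}
```

In the toy data, column 4 has three of five rows agreeing, so it falls in the "moderate" band, while column 5 has no two rows agreeing and falls in "poor". To try it on real output, paste the aligned rows from your own ClustalW run in place of the toy rows.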

  3. “horizontal” multiple alignment
     • go to the Pfam site
       • http://www.sanger.ac.uk/Software/Pfam/
     • look it over until you are completely confused
     • here is my example
       • run through it
       • then do your own example
     • enter “fibrin” in the “keywords” box
     • click on “kringle”
     • click on “view species tree”
     • click in the box next to Homo sapiens and then click “view selected species alignment”
     • meditate on what you are seeing
       • amino acids, not nucleotides
       • uses the one-letter symbol format for amino acids
     • can you make any observations about the multiple alignments?
     • try it in a second browser window for gorilla and compare with human
     • ditto for mouse
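When comparing the human alignment with gorilla or mouse, it can help to line up two rows and mark where they agree. The sketch below does that for two aligned one-letter amino acid strings. Both strings are invented for illustration (they are not real Pfam kringle sequences), and the class and method names are hypothetical.

```java
class AlignedRowCompare {

    /** Counts columns where both rows hold the same residue; gaps never match. */
    static int identicalColumns(String a, String b) {
        if (a.length() != b.length()) {
            throw new IllegalArgumentException("aligned rows must have equal length");
        }
        int same = 0;
        for (int i = 0; i < a.length(); i++) {
            char x = a.charAt(i);
            if (x != '-' && x == b.charAt(i)) {
                same++;
            }
        }
        return same;
    }

    /** Builds a marker line: '*' under identical columns, ' ' elsewhere. */
    static String markerLine(String a, String b) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < a.length(); i++) {
            char x = a.charAt(i);
            sb.append(x != '-' && x == b.charAt(i) ? '*' : ' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Invented rows in one-letter amino acid code; '-' marks a gap.
        String human = "CYHGNGQSYRG";
        String other = "CYRGNG-SYRG";
        System.out.println(human);
        System.out.println(other);
        System.out.println(markerLine(human, other));
        System.out.println(identicalColumns(human, other) + " of "
                + human.length() + " columns identical");
    }
}
```

Swap in two rows copied from the alignments you actually see in the two browser windows; the rows must come from the same alignment so that the columns correspond.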

  4. java
     • take a look at the NCBI_STRUCTURES.java program
     • go to the web site from the last homework
       • http://java.sun.com/j2se/1.3/docs/api/index.html
     • see if you can find something in the web site that helps you make sense out of one or two things in the NCBI_STRUCTURES.java program
       • hint: look at the URLConnection class
     • just spend 30 or 40 minutes on this. don’t get too frustrated now - you will have plenty of time for that once you get a real job
     • think of this as a growth experience that builds character
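As a starting point for reading about URLConnection, here is a minimal sketch of how that class is typically used to read a page line by line. It is not the code from NCBI_STRUCTURES.java, which is not reproduced here; the class name FetchPage is hypothetical, and the try-with-resources syntax is newer than the 1.3 API documentation linked above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class FetchPage {

    /** Opens a connection to the URL and returns the response body as lines. */
    static List<String> readLines(URL url) throws IOException {
        URLConnection conn = url.openConnection();
        List<String> lines = new ArrayList<>();
        // getInputStream() makes the actual connection and returns the body.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java FetchPage <url>");
            return;
        }
        // The address comes from the command line; nothing is hard-coded here.
        for (String line : readLines(URI.create(args[0]).toURL())) {
            System.out.println(line);
        }
    }
}
```

The same method works for `file:` URLs, so you can point it at a saved copy of a page while experimenting instead of hitting a live server each time.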

  5. java and html - we will do this in class next week, but you will also have to do it on your own
     • the NCBI_STRUCTURES.java program can be used as a prototype for this part
     • go to the NCBI web site
     • view the html source code underlying the web page
     • locate the form action POST stuff
       • look at the stuff that happens between the <form and the </form> tags
     • copy the html source into a file, change POST to GET, and save it as NCBI.html
     • open another browser window, and read in NCBI.html using the File menu
     • perform a query type of your choice
       • this will not actually work since POST is expected, but notice the stuff in the URL window that is exposed by using GET
     • copy the html source into a file, change the NCBI URL into the URL for my cgi program testloop.cgi, and save it as NCBIecho.html
     • repeat the last 3 steps, and see if the echo is the same as the GET
     • modify NCBI_STRUCTURES.java
       • make it put out what you need for your query
       • modify the part that does the parsing (i.e., the line with <dd>) to make it relevant for parsing your output
       • hint: figure out the modification by looking at the real output html source from a real query at the real NCBI site
       • if you cannot figure out how to modify the parsing, then at least comment it out entirely or you will not see any output!!
     • remember that the java program is run as:
         java NCBI_STRUCTURES inputfilename
       • inputfilename is the name of the input file that has 4 or 5 gene names to test out
     • remember that first you need to run javac NCBI_STRUCTURES.java to get NCBI_STRUCTURES.class
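The two ideas in this part, what GET exposes in the URL and how a line filter such as the `<dd>` test picks out results, can be sketched in Java as follows. Everything specific here is an assumption for illustration: the field names `db` and `term`, the sample HTML lines, the identifier 1ABC, and the class name are all hypothetical, and the real form fields must be read from the NCBI page source.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FormQuerySketch {

    /** Encodes form fields the way a browser appends them to the URL for GET. */
    static String toQueryString(Map<String, String> fields)
            throws UnsupportedEncodingException {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), "UTF-8"));
        }
        return sb.toString();
    }

    /** Keeps only the lines that contain the marker, e.g. "<dd>". */
    static List<String> linesContaining(List<String> html, String marker) {
        List<String> kept = new ArrayList<>();
        for (String line : html) {
            if (line.contains(marker)) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Hypothetical field names; use the ones found inside the real form.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("db", "structure");
        fields.put("term", "fibrin kringle");
        System.out.println("?" + toQueryString(fields));

        // Hypothetical response lines standing in for real query output.
        List<String> html = Arrays.asList("<dl>", "<dd>1ABC</dd>", "</dl>");
        System.out.println(linesContaining(html, "<dd>"));
    }
}
```

With POST, the same name=value pairs travel in the request body instead of the URL, which is why switching the form to GET makes them visible in the address bar. If your output does not use `<dd>`, change the marker to whatever tag surrounds the results in the real page source.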
