
Regulatory Genomics Lab

Regulatory Genomics Lab. Saurabh Sinha. PowerPoint by Casey Hanson. In this exercise, we will use Galaxy to manipulate a ChIP track for BIN in D. melanogaster, subject peak sets to the MEME suite, and compare MEME motifs with Fly Factor Survey motifs for BIN.


Presentation Transcript


  1. Regulatory Genomics Lab. Saurabh Sinha. PowerPoint by Casey Hanson. Regulatory Genomics Lab v1 | Saurabh Sinha

  2. Exercise. In this exercise, we will do the following: • Use Galaxy to manipulate a ChIP track for BIN in D. melanogaster. • Subject peak sets to the MEME suite. • Compare MEME motifs with Fly Factor Survey motifs for BIN. • Subject the peak set to a gene set enrichment test.

  3. Step 0: Shared Desktop Directory. For viewing and manipulating files on the classroom computers, we provide a shared directory in the following folder on the desktop: classes/mayo In today’s lab, we will be using the following folder in the shared directory: classes/mayo/sinha

  4. Computational Prediction of Motifs. In this exercise, we will upload a ChIP track of the transcription factor BIN in Drosophila melanogaster to Galaxy. After performing various file manipulations, we will use the MEME suite to identify a motif from the top 100 ChIP regions. Subsequently, we will compare our predicted motif with the experimentally validated motif for BIN at Fly Factor Survey.

  5. Step 1: Upload BIN ChIP Track to Galaxy. Log in to Galaxy at main.g2.bx.psu.edu/ and click Get Data, then Upload File. Upload our ChIP file: classes/mayo/sinha/BIN_Fchip_s11_1000.gff Set the File Format to gff. Set Genome to dm3. Click Execute.

  6. Step 2: Sort ChIP Track By Score. Click on Filter and Sort, then Sort. Under Sort Dataset, select our ChIP track. Under on column, select c6 (column 6). Under with flavor, select Numerical Sort. Under everything in, select Descending order. Click Execute.

  7. Step 3: Obtain Top 100 ChIP Regions. Click on Text Manipulation, then Select First. Under Select first, enter 100 lines. Under from, select our sorted ChIP data. Click Execute.
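
Steps 2 and 3 amount to a numerical descending sort on GFF column 6 followed by keeping the first 100 lines. The same logic can be sketched outside Galaxy as below. This is a minimal illustration, not the Galaxy implementation; the function name `top_regions_by_score` and the three example records are made up for demonstration.

```python
def top_regions_by_score(gff_lines, n=100):
    """Return the n GFF records with the highest score (column 6)."""
    records = []
    for line in gff_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and GFF comment/header lines
        fields = line.split("\t")
        try:
            score = float(fields[5])  # column 6 is index 5
        except (IndexError, ValueError):
            continue  # skip records with a missing or "." score
        records.append((score, line))
    # Numerical sort, descending, then keep the first n records.
    records.sort(key=lambda r: r[0], reverse=True)
    return [line for _, line in records[:n]]


# Hypothetical records, for illustration only.
example = [
    "chr2L\tchip\tpeak\t100\t200\t3.5\t.\t.\tid=a",
    "chr2L\tchip\tpeak\t300\t400\t9.1\t.\t.\tid=b",
    "chr3R\tchip\tpeak\t500\t650\t6.0\t.\t.\tid=c",
]
top2 = top_regions_by_score(example, n=2)
```

To apply it to the lab file, one would pass an open file handle for BIN_Fchip_s11_1000.gff and leave `n` at 100.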

  8. Step 4: Extract DNA of Top 100 ChIP Regions. Click on Fetch Sequences. Click on Extract Genomic DNA. Under Fetch sequences for intervals in, select our top 100 ChIP regions. Set Interpret features when possible to No. Set Source for Genomic Data to Locally cached. Set Output data type to FASTA. Click Execute.
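
What the extraction in Step 4 does can be sketched as slicing the genome at each interval and writing one FASTA record per region. The sketch below assumes the genome is already loaded as a dictionary from chromosome name to sequence string (Galaxy uses its cached dm3 build instead), treats GFF coordinates as 1-based and inclusive, and ignores strand. The function name `extract_fasta` and the toy genome are hypothetical.

```python
def extract_fasta(gff_lines, genome):
    """Build FASTA text for each GFF interval from an in-memory genome dict."""
    out = []
    for line in gff_lines:
        fields = line.rstrip("\n").split("\t")
        chrom = fields[0]
        start = int(fields[3])  # GFF start, 1-based inclusive
        end = int(fields[4])    # GFF end, 1-based inclusive
        # Convert to Python's 0-based, end-exclusive slice.
        seq = genome[chrom][start - 1:end]
        out.append(f">{chrom}:{start}-{end}\n{seq}")
    return "\n".join(out) + "\n"


# Toy genome and interval, for illustration only.
toy_genome = {"chrX": "ACGTACGTAC"}
fasta = extract_fasta(["chrX\tchip\tpeak\t2\t5\t1.0\t.\t.\t."], toy_genome)
```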

  9. Step 5: Download The Data. When finished, click on the download icon to save the file to our desktop. This has already been done for you. The resulting sequence is in the following file: classes/mayo/sinha/BIN_top_100.fasta

  10. Step 6: Submit to MEME. DO NOT RUN THIS NOW. MEME TAKES A VERY LONG TIME. In this step, we will submit the sequences to MEME. Go to the following address: http://meme.nbcr.net/meme/cgi-bin/meme.cgi Enter your email address and upload your sequences file. Leave the other parameters at their defaults. Click Start Search.

  11. Step 7A: Analyzing MEME Results. Go to the following web address: http://nbcr-222.ucsd.edu/opal-jobs/appMEME_4.9.01371501018575720728765/meme.html The webpage contains a summary of MEME’s findings. It is also available in the shared directory: classes/mayo/sinha/MEME.htm Let’s investigate the top hit.

  12. Step 7B: Analyzing MEME Results. To the right is a LOGO of our predicted motif, showing the per-position relative abundance of each nucleotide. At the bottom are the aligned regions in each of our sequences that helped produce this motif. As the p-value increases (becomes less significant), matches show greater divergence from our LOGO.
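
The "per-position relative abundance" that a LOGO displays can be computed directly from a set of aligned sites. The following is a minimal sketch of that calculation, not MEME's own code (MEME also applies statistical modelling that is not shown here). The aligned sites are invented for illustration and are not taken from the MEME output.

```python
from collections import Counter


def position_frequencies(sites):
    """Return, for each alignment column, the relative frequency of A, C, G, T."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all aligned sites must have the same length")
    matrix = []
    for i in range(length):
        counts = Counter(s[i].upper() for s in sites)
        total = sum(counts.values())
        matrix.append({base: counts.get(base, 0) / total for base in "ACGT"})
    return matrix


# Hypothetical aligned sites, for illustration only.
sites = ["TGTTTAC", "TGTTTGC", "TATTTAC", "TGTTTAC"]
pfm = position_frequencies(sites)
```

A column where one nucleotide has frequency close to 1 corresponds to a tall letter in the LOGO; columns with mixed frequencies appear short.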

  13. Step 7C: Analyzing MEME Results. Other predicted motifs do not seem as plausible.

  14. Step 8A: Comparison with Experimentally Validated Motif for BIN. FlyFactorSurvey is a database of TF motifs in Drosophila melanogaster. Go to the following link to view the motif for BIN: http://pgfe.umassmed.edu/ffs/TFdetails.php?FlybaseID=FBgn0045759

  15. Step 8B: Comparison with Experimentally Validated Motif for BIN. [Figure panels: the best MEME motif, its reverse complement, and the actual BIN motif.] There is strong agreement between the actual motif and the reverse complement of MEME’s best motif. This indicates MEME was actually able to find the motif from the top 100 ChIP regions for this TF.
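
A motif found on one strand is equivalent to its reverse complement on the other strand, which is why the comparison above uses the reverse complement of the MEME motif. A minimal sketch of the operation, for both a plain sequence and a frequency matrix stored as a list of per-position dictionaries (an assumed representation, not a MEME file format):

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
_SWAP = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def reverse_complement_matrix(matrix):
    """Reverse the column order and swap A<->T, C<->G in each column."""
    return [
        {_SWAP[base]: p for base, p in column.items()}
        for column in reversed(matrix)
    ]


rc_seq = reverse_complement("GTAAACA")

# Toy two-column matrix, for illustration only.
toy = [
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
    {"A": 0.0, "C": 0.5, "G": 0.5, "T": 0.0},
]
rc_toy = reverse_complement_matrix(toy)
```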

  16. Gene Set Enrichment Analysis. In this exercise, we will extract the nearby genes for each one of the ChIP peaks for BIN. We will then subject the nearby genes to enrichment analysis tests on various Gene Ontology gene sets utilizing DAVID.

  17. Step 9A: Acquire Nearby Genes. In this step, we will acquire all genes in Drosophila melanogaster. Select Get Data and UCSC Main.

  18. Step 9B: Acquire Nearby Genes. Ensure the following settings are configured. Click get output and then send query to Galaxy.

  19. Step 9C: Acquire Nearby Genes. Go back to Galaxy. Select Operate on Genomic Intervals, then select Fetch Closest non-overlapping interval feature.

  20. Step 9D: Acquire Nearby Genes. Under For every interval feature in, select our original ChIP track. Under Fetch closest features from, select the UCSC genes track we just downloaded. Click Execute.
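
The idea behind fetching the closest non-overlapping feature can be sketched as follows: for each peak, skip genes that overlap it, measure the gap to every remaining gene on the same chromosome, and keep the smallest. This is a simplified illustration; the Galaxy tool's handling of ties, strand, and direction options may differ. The gene names and coordinates are hypothetical.

```python
def closest_non_overlapping(peak, genes):
    """Return (name, distance) of the nearest gene that does not overlap the peak.

    peak  : (chrom, start, end)
    genes : iterable of (chrom, start, end, name)
    Returns (None, None) if no non-overlapping gene is on the same chromosome.
    """
    p_chrom, p_start, p_end = peak
    best_name, best_dist = None, None
    for g_chrom, g_start, g_end, name in genes:
        if g_chrom != p_chrom:
            continue
        if g_end < p_start:        # gene lies entirely to the left of the peak
            dist = p_start - g_end
        elif g_start > p_end:      # gene lies entirely to the right of the peak
            dist = g_start - p_end
        else:
            continue               # overlapping features are excluded
        if best_dist is None or dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, best_dist


# Hypothetical peak and transcripts, for illustration only.
peak = ("chr2L", 1000, 1200)
genes = [
    ("chr2L", 100, 900, "CG0001-RA"),
    ("chr2L", 1100, 1500, "CG0002-RA"),
    ("chr2L", 1250, 2000, "CG0003-RA"),
    ("chr3R", 1201, 1300, "CG0004-RA"),
]
nearest = closest_non_overlapping(peak, genes)
```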

  21. Step 10A: Cut Out Genes. The resulting file has the list of nearby genes in CG format in the 12th column. We are only interested in the genes, so we need to cut them out using the Cut tool. Under Text Manipulation, click Cut.

  22. Step 10B: Cut Out Genes. For Cut Columns, type c12 to denote column 12. Under Delimited By, select Tab. Under From, select the track we just generated: the intersection of the ChIP peaks and FlyBase genes. Click Execute.
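
Cutting column c12 from a tab-delimited file is equivalent to splitting each line on tabs and keeping the 12th field. A minimal sketch, with a made-up input row; lines with fewer than 12 fields are skipped here, which is a choice of this sketch rather than documented Galaxy behaviour:

```python
def cut_column(lines, column=12, delimiter="\t"):
    """Return the given 1-based column from each delimited line."""
    values = []
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        if len(fields) >= column:
            values.append(fields[column - 1])
    return values


# Hypothetical 12-field row, for illustration only.
row = "\t".join(f"f{i}" for i in range(1, 13))
cut = cut_column([row, "too\tshort"])
```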

  23. Step 11A: Convert IDs. The resulting file from the previous analysis is located in: classes/mayo/sinha/cg_transcripts.txt The enrichment tool we will use doesn’t accept genes in this format. We will use the FlyBase ID converter to convert these CG transcript IDs into FlyBase transcript IDs.

  24. Step 11B: Convert IDs. Go to http://flybase.org/static_pages/downloads/IDConv.html Upload our cg_transcripts.txt file and hit Go. On the next page, click FlyBase Hit List and choose where to save the file.

  25. Step 12A: Gene Set Enrichment - DAVID. The file from the previous analysis is available here: classes/mayo/sinha/fb_transcripts.txt With the correct IDs for transcripts of genes near ChIP peaks, we now wish to perform a gene set enrichment analysis on various gene sets. A tool that allows us to do this from a web interface is DAVID, located at the following address: http://david.abcc.ncifcrf.gov/tools.jsp

  26. Step 12A: Gene Set Enrichment - DAVID. We will perform a Gene Set Enrichment Analysis on our transcript list (gene list) and see which GO categories are significantly enriched. Click Choose File and select our fb_transcripts.txt file. Under Select Identifier, select FLYBASE_TRANSCRIPT_ID. Under Step 3: List Type, check Gene List. Click Submit List.

  27. Step 12B: Gene Set Enrichment - DAVID. On the next page, select Functional Annotation Chart. Our gene set seems to be enriched in the BP_FATGO category! This is consistent with the activity of the BIN transcription factor in the literature.
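
The statistical question behind an enrichment chart is whether a GO category contains more of our genes than expected by chance. A common way to frame this is a one-sided hypergeometric test, sketched below. DAVID reports an EASE score, a modified Fisher exact p-value, so its numbers will not match this plain version exactly. The function name and the counts in the example are illustrative only.

```python
from math import comb


def hypergeom_enrichment_p(list_hits, list_size, background_hits, background_size):
    """P(X >= list_hits) when drawing list_size genes from the background.

    list_hits       : genes in our list that belong to the category
    list_size       : total genes in our list
    background_hits : genes in the background that belong to the category
    background_size : total genes in the background
    """
    max_hits = min(list_size, background_hits)
    total = comb(background_size, list_size)
    tail = sum(
        comb(background_hits, k)
        * comb(background_size - background_hits, list_size - k)
        for k in range(list_hits, max_hits + 1)
    )
    return tail / total


# Toy counts, for illustration only: all 5 listed genes fall in a
# category that holds 5 of the 10 background genes.
p_toy = hypergeom_enrichment_p(5, 5, 5, 10)
```

A small p-value suggests the category is over-represented among the genes near ChIP peaks; in practice the p-values for many categories would also need correction for multiple testing.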
