Data Analysis Project Advanced Bioinformatics BIF-30806 2013
Set Up
• Basic and Advanced Project
• Available data sets
• Deliverables
• Literature
• Groups
• Schedule week 3 & 4
Purpose
• Build a software pipeline to perform a transcriptome analysis
• Write code to connect tools and do input/output conversions
• The code is developed on one data set, but should also run on different input (e.g. a different species)
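One way to organise such a pipeline is a small driver script that assembles the command line for each tool and hands one step's output to the next. The sketch below is only an illustration: it is written in Python rather than the Perl named on the slides, the `Sample` record, the function name, and all file paths are hypothetical, and it builds the commands without running them. The `-o`, `-p`, and `-G` options are the usual TopHat/Cufflinks options for output directory, thread count, and annotation file, but they should be checked against the versions installed on the server.

```python
from dataclasses import dataclass
from pathlib import Path
import shlex


@dataclass
class Sample:
    """Hypothetical description of one input data set."""
    name: str
    reads: str          # FASTQ file with the RNA-seq reads
    bowtie_index: str   # basename of the Bowtie index of the genome
    annotation: str     # GTF file with the gene models


def build_commands(sample, out_root="results", threads=4):
    """Return the TopHat and Cufflinks command lines for one sample."""
    tophat_dir = Path(out_root) / sample.name / "tophat"
    cufflinks_dir = Path(out_root) / sample.name / "cufflinks"

    tophat = ["tophat", "-o", tophat_dir.as_posix(), "-p", str(threads),
              "-G", sample.annotation, sample.bowtie_index, sample.reads]

    # TopHat writes its alignments to accepted_hits.bam in its output
    # directory; that file is the input of the Cufflinks step.
    bam = tophat_dir / "accepted_hits.bam"
    cufflinks = ["cufflinks", "-o", cufflinks_dir.as_posix(),
                 "-p", str(threads), "-G", sample.annotation, bam.as_posix()]
    return [tophat, cufflinks]


if __name__ == "__main__":
    # Switching species only means describing a different Sample.
    yeast = Sample("yeast", "yeast_reads.fastq", "index/yeast", "yeast.gtf")
    for command in build_commands(yeast):
        print(shlex.join(command))
```

Keeping everything species-specific in one record is what lets the same code run on different input; the commands could later be executed with `subprocess.run(command, check=True)`.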
Basic Project
• Which are the most highly expressed genes (top 100) in your species of interest under a single condition (or in a single tissue)?
• Can you find a correlation between gene expression and transcript properties, such as GC content, transcript length, intron length, codon usage, or others?
• [Optional] Can you visualize the highly expressed genes in an interaction network?
TOOLS: TopHat, Cufflinks, Perl scripts, and possibly others.
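The first two questions could be approached roughly as follows. This is a minimal sketch in Python (not the Perl named on the slides), assuming the expression values come from a Cufflinks `genes.fpkm_tracking` file with `tracking_id` and `FPKM` columns; all function names are hypothetical, and Pearson correlation is just one possible measure.

```python
import csv
import math


def read_fpkm(lines):
    """Map tracking_id -> FPKM from the lines of a tab-delimited
    Cufflinks tracking file (an open file handle works as `lines`)."""
    reader = csv.DictReader(lines, delimiter="\t")
    return {row["tracking_id"]: float(row["FPKM"]) for row in reader}


def top_expressed(fpkm, n=100):
    """Return the n (gene, FPKM) pairs with the highest FPKM."""
    return sorted(fpkm.items(), key=lambda item: item[1], reverse=True)[:n]


def gc_content(sequence):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)
```

With a dictionary of transcript sequences at hand, the GC content per gene can be paired with its FPKM and passed to `pearson`; the same function works for transcript length or any other numeric property.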
Advanced Project
• Which transcripts/genes are differentially expressed between the two conditions?
• Can you find out what the functions of these genes are?
• Can you give a biological explanation of why these genes are differentially expressed under the conditions in your experiment?
• [Optional] In your data set, can you find modules of co-expressed genes? Try to use the WGCNA package.
• [Optional] Can you find a functional description and explanation for the identified modules?
• [Optional] To what extent are the modules conserved in a closely related species?
TOOLS: TopHat, Cufflinks, Cuffdiff, WGCNA, Perl scripts, and possibly others
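For the first question, a starting point could be to filter the Cuffdiff output. The sketch below is in Python for illustration (the slides name Perl) and assumes a tab-delimited `gene_exp.diff` file with the columns `gene_id`, `status`, `log2(fold_change)`, `q_value`, and `significant`; the function name and the fold-change cutoff are hypothetical choices, not part of the assignment.

```python
import csv


def significant_genes(lines, min_abs_log2fc=1.0):
    """Return (gene_id, log2 fold change, q-value) for tested genes that
    Cuffdiff marks as significant, sorted by increasing q-value.

    `lines` is any iterable of text lines, e.g. an open file handle.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    hits = []
    for row in reader:
        # Skip genes that were not tested successfully or not significant.
        if row["status"] != "OK" or row["significant"] != "yes":
            continue
        log2fc = float(row["log2(fold_change)"])
        # Assumed extra filter: keep only changes of at least the cutoff.
        if abs(log2fc) >= min_abs_log2fc:
            hits.append((row["gene_id"], log2fc, float(row["q_value"])))
    return sorted(hits, key=lambda hit: hit[2])
```

The resulting gene list can then be used to look up functional annotation for each gene.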
You have a choice
• Start on the basic or the advanced project
• Of course, the basic project can be extended with elements of the advanced project
• Group members should talk to each other and discuss their choice with Harm/Sandra.
Deliverables per group
• Pipeline code; all input/output has to be stored in the “group directory” on the server
• Final presentation (20 minutes)
• Each group member must prepare and present some slides (5 min per person)
Deliverables per person
• Project report
  • All the work done in the project (intro, M&M, results, discussion/conclusion)
  • Appendix A: your contribution to the group effort
  • Appendix B: personal reflection on the project
• Contribution to group presentation
  • Prepare and present some slides (5 min per person)
• The code that you have written
Data
• On server: /course/project/
  • Arabidopsis
  • Yeast
• Other data/species of your choice
  • Use, for example, the NCBI Sequence Read Archive (SRA)
Literature
• See course website
Groups
• See course website
Schedule week 3 & 4
• Presentations
  • Tue (26-2) afternoon: presenting project plan
  • Fri (1-3) afternoon: presenting progress
  • Fri (8-3) all day: final presentation
• Deadline report & code
  • Sunday March 10, 23:59
  • So, your report has to be in before Monday!
  • Email your report to “project@bioinformatics.nl”