BIF-30806 Group Project: Progress Report
Group (A)rabidopsis: David Nieuwenhuijse, Matthew Price, Qianqian Zhang, Thijs Slijkhuis
Species: C. elegans
Project: Advanced (+Basic)
Results so far
• David Nieuwenhuijse
  • GeneID and GO term extraction tool
  • Cytoscape GO enrichment analysis
  • Searching for an automatic GO enrichment tool for the pipeline
• Qianqian Zhang
  • Created a shell script for running Cuffdiff, gffread and SAMtools
  • Obtained gene lists of the most differentially expressed genes and the highest expressed genes
  • Visualization of differentially expressed genes with the cummeRbund package: density plot, scatter plot, volcano plot, p-value distribution plot, MA plot, etc.
  • Basic statistics of differentially expressed genes
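The gene-list step can be illustrated with a minimal Python sketch. It assumes the tab-separated column layout commonly seen in Cuffdiff's `gene_exp.diff` output (`gene`, `status`, `log2(fold_change)`, `q_value`, `significant`). The function name `top_differential_genes`, the ranking by absolute log2 fold change, and the toy rows are illustrative assumptions and not the group's actual script or data.

```python
import csv
import io
import math

def top_differential_genes(diff_text, n=10):
    """Return up to n significant genes, ranked by absolute log2 fold change."""
    reader = csv.DictReader(io.StringIO(diff_text), delimiter="\t")
    hits = []
    for row in reader:
        # Keep only tests that completed and were flagged as significant.
        if row["status"] != "OK" or row["significant"] != "yes":
            continue
        log2fc = float(row["log2(fold_change)"])
        # Skip infinite or undefined fold changes (zero expression in one sample).
        if math.isinf(log2fc) or math.isnan(log2fc):
            continue
        hits.append((row["gene"], log2fc, float(row["q_value"])))
    hits.sort(key=lambda hit: abs(hit[1]), reverse=True)
    return hits[:n]

# Toy input with made-up values, only to show the expected layout.
HEADER = ["test_id", "gene_id", "gene", "locus", "sample_1", "sample_2",
          "status", "value_1", "value_2", "log2(fold_change)", "test_stat",
          "p_value", "q_value", "significant"]
ROWS = [
    ["XLOC_1", "XLOC_1", "geneA", "I:1-100", "s1", "s2", "OK",
     "10", "40", "2.0", "3.1", "0.001", "0.01", "yes"],
    ["XLOC_2", "XLOC_2", "geneB", "I:200-300", "s1", "s2", "OK",
     "10", "11", "0.14", "0.2", "0.8", "0.9", "no"],
    ["XLOC_3", "XLOC_3", "geneC", "II:1-100", "s1", "s2", "OK",
     "80", "10", "-3.0", "-4.0", "0.0001", "0.001", "yes"],
    ["XLOC_4", "XLOC_4", "geneD", "II:500-600", "s1", "s2", "OK",
     "0", "5", "inf", "0", "0.001", "0.01", "yes"],
]
TOY_DIFF = "\n".join("\t".join(fields) for fields in [HEADER] + ROWS)

for gene, log2fc, q_value in top_differential_genes(TOY_DIFF, n=2):
    print(gene, log2fc, q_value)
```

Ranking by q-value instead of fold change would only require changing the sort key. Which criterion fits best depends on how "most differentially expressed" is defined for the project.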
Results so far
• Matthew Price
  • Script for listing the top 100 expressed genes
  • Script for determining GC content, transcript length and intron length
  • Script for computing the correlation between each transcript property and the expression level
• Thijs Slijkhuis
  • Created a shell script that:
    • Downloads the source files
    • Converts SRA files into FASTQ files
    • Performs bowtie2-build
    • Performs tophat
    • Performs cufflinks
  • Programmed a script that sorts Cuffdiff output by p-value (significance of differential expression) and extracts the gene names from it
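The transcript-property scripts could look roughly like the sketch below: GC content per sequence, plus a Pearson correlation between a property and the expression level. The slides do not say which correlation measure was used, so Pearson is an assumption here, as are the function names `gc_content` and `pearson`. Ambiguous bases such as N are counted in the sequence length.

```python
import math

def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long value lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Toy sequences and expression values, made up for illustration only.
transcripts = {"t1": "ATGCGC", "t2": "ATATAT", "t3": "GGGCCC"}
expression = {"t1": 12.0, "t2": 3.5, "t3": 20.1}

names = sorted(transcripts)
gc_values = [gc_content(transcripts[name]) for name in names]
expr_values = [expression[name] for name in names]
print(pearson(gc_values, expr_values))
```

The same `pearson` helper can be reused for transcript length or intron length by swapping the first list.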
Issues/Challenges
• Co-expressed gene modules
  • WGCNA package not usable in our case
  • Using the cummeRbund package to get heatmaps instead
• GO enrichment analysis
  • Not many genes are annotated in the GO database.
  • Gene IDs of the differentially expressed genes are not compatible with the NCBI database.
• Transcript sequences
  • Not all expressed transcripts in the .gtf file can be matched to their corresponding sequence in the FASTA file.
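One way to quantify the transcript-matching issue is to list the transcript IDs that occur in the GTF file but have no FASTA record. The sketch below assumes that each GTF attribute column carries a `transcript_id "..."` entry and that each FASTA header starts with the transcript ID as its first whitespace-separated token. Both are assumptions about the files, and the helper names are hypothetical.

```python
import re

# Matches the transcript_id attribute in the ninth GTF column.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')

def gtf_transcript_ids(gtf_lines):
    """Collect all transcript IDs found in the attribute column of GTF lines."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            ids.add(match.group(1))
    return ids

def fasta_ids(fasta_lines):
    """Collect the first token of every FASTA header line."""
    return {
        line[1:].split()[0]
        for line in fasta_lines
        if line.startswith(">") and line[1:].strip()
    }

def unmatched_transcripts(gtf_lines, fasta_lines):
    """Return the sorted transcript IDs present in the GTF but absent from the FASTA."""
    return sorted(gtf_transcript_ids(gtf_lines) - fasta_ids(fasta_lines))

# Toy records with made-up IDs, for illustration only.
toy_gtf = [
    'I\tCufflinks\ttranscript\t1\t100\t.\t+\t.\tgene_id "G1"; transcript_id "T1.1";\n',
    'I\tCufflinks\ttranscript\t200\t300\t.\t+\t.\tgene_id "G2"; transcript_id "T2.1";\n',
]
toy_fasta = [">T1.1 gene=G1\n", "ATGCGC\n"]

print(unmatched_transcripts(toy_gtf, toy_fasta))
```

If the two files use different ID schemes, for example with or without a version suffix, the IDs would need to be normalised before comparison.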