
Quality filtering using ConvertGerald; convert reads to FASTA format

Quality filtering using ConvertGerald; convert reads to FASTA format. Extract coordinates for exon, transcript, gene, and non-coding regions. Align reads to the genome (with Eland software), allowing up to 10 perfect alignments or up to 5 alignments with 1 mismatch.

kennan

Presentation Transcript


  1. Quality filtering using ConvertGerald; convert reads to FASTA format
     • Input: raw reads from Illumina sequencer
     • Input: genome sequence and annotation provided by the maize sequencing project
     • Extract coordinates for exon, transcript, gene, and non-coding regions
     • Align reads to the genome (with Eland software), allowing
        • up to 10 perfect alignments
        • up to 5 alignments with 1 mismatch
        • up to 2 alignments with 2 mismatches
     • Weight the alignments of each read by a fraction 1/n, where n = number of best alignments for the read
     • Add the alignments to each exon
     • Add up exon coverage for each transcript
     • Assess background level as the sum of alignments to non-coding regions
     • Expression level of a gene is assigned to be that of its most highly expressed transcript isoform
     • Statistical analysis (to identify significantly expressed genes)
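The weighting and roll-up steps on slide 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the presenter's actual code: the input layout (`read_hits`, `transcript_exons`, `gene_transcripts`) and all function names are hypothetical, and "best alignments" is assumed to mean the alignments with the fewest mismatches for a read. An exon id of `None` stands for an alignment that falls outside any annotated exon.

```python
from collections import defaultdict

# Limits from the slide: maximum number of best alignments tolerated
# for a read whose best hits have 0, 1, or 2 mismatches.
MAX_BEST_ALIGNMENTS = {0: 10, 1: 5, 2: 2}


def weighted_exon_counts(read_hits):
    """Sum 1/n-weighted alignments per exon.

    read_hits maps a read id to a list of (exon_id, mismatches) tuples.
    exon_id is None when the alignment does not overlap an exon.
    (Hypothetical input layout, chosen for illustration.)
    """
    exon_cov = defaultdict(float)
    for hits in read_hits.values():
        if not hits:
            continue
        # Assumption: the best alignments are those with the fewest mismatches.
        best_mm = min(mm for _, mm in hits)
        best = [exon for exon, mm in hits if mm == best_mm]
        n = len(best)
        if n > MAX_BEST_ALIGNMENTS.get(best_mm, 0):
            continue  # read is too repetitive, skip it
        for exon in best:
            if exon is not None:
                exon_cov[exon] += 1.0 / n
    return dict(exon_cov)


def transcript_coverage(exon_cov, transcript_exons):
    """Add up exon coverage for each transcript."""
    return {
        tx: sum(exon_cov.get(exon, 0.0) for exon in exons)
        for tx, exons in transcript_exons.items()
    }


def gene_expression(tx_cov, gene_transcripts):
    """Assign each gene the level of its most highly expressed isoform."""
    return {
        gene: max((tx_cov.get(tx, 0.0) for tx in txs), default=0.0)
        for gene, txs in gene_transcripts.items()
    }
```

With a read that aligns equally well to two exons, each exon receives 0.5; a uniquely aligned read contributes 1.0 to its exon. The gene level is then the maximum, not the sum, of its transcript totals, as the slide states.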

  2. Run GERALD pipeline to convert Bustard quality scores and sequences to a single FASTQ format
     • Input: raw reads from Illumina sequencer
     • Align reads to the genome (with TopHat software, which uses the Bowtie mapper)
        • http://tophat.cbcb.umd.edu
        • Pre-built indexes for the A. thaliana genome (splice junctions), but can build for any reference genome
        • Uses GFF files provided by TAIR
     • Output = .SAM file and RPKM
     • Statistical analysis (to identify significantly expressed genes)
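Slide 2 lists RPKM as an output without defining it. RPKM (reads per kilobase of transcript per million mapped reads) is commonly computed as 10^9 × C / (N × L), where C is the number of reads mapped to a gene, N is the total number of mapped reads, and L is the gene or transcript length in base pairs. A minimal sketch follows; the function name and arguments are hypothetical, and the slide does not say how the presenter's pipeline derives the value.

```python
def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase per million mapped reads.

    read_count:         reads mapped to the gene or transcript (C)
    length_bp:          gene or transcript length in base pairs (L)
    total_mapped_reads: all mapped reads in the sample (N)
    """
    if length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (length_bp * total_mapped_reads)
```

For example, 500 reads on a 2,000 bp transcript in a sample of 10 million mapped reads gives `rpkm(500, 2000, 10_000_000)`, which is 25.0.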
