Genome Analyzer Software
Jordan Stockton
Genome Analyzer Computer Hardware
Customer Selected Server
• Automated Data Transfer
• Real-Time Data QC
• 8 cores
• 32 GB RAM
• 9 TB RAID Storage
• ~$50K
Illumina Preconfigured Server
Can store raw data for 4-5 paired-end reads
Firecrest – Image Analysis
TIFF Image Files → Intensity Files
Firecrest
Identify clusters and extract intensities
• Band-pass filter
  • Sharpens and enhances clusters
• Threshold
  • Cuts out background noise
• Maxima detection
  • Finds clusters
[Figure labels: Maximum, Threshold, Cluster]
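The three steps on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not Firecrest's actual implementation: the function names, the box-filter band-pass, and the window sizes are all assumptions made for the example.

```python
def box_blur(img, radius):
    """Mean filter over a square window, clipped at the image edges."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out


def find_clusters(img, threshold, small=0, large=2):
    """Return (row, col) positions of local maxima above `threshold`.

    ASSUMPTION: the band-pass step is approximated by the difference
    between a fine and a coarse box blur (hypothetical parameters).
    """
    fine = box_blur(img, small)    # radius 0 leaves the image unchanged
    coarse = box_blur(img, large)  # slowly varying background estimate
    # Band-pass filter: keep detail at the scale of a cluster
    band = [[f - c for f, c in zip(fr, cr)] for fr, cr in zip(fine, coarse)]
    h, w = len(band), len(band[0])
    peaks = []
    for y in range(h):
        for x in range(w):
            v = band[y][x]
            if v <= threshold:
                continue  # threshold: cut out background noise
            neighbours = [band[j][i]
                          for j in range(max(0, y - 1), min(h, y + 2))
                          for i in range(max(0, x - 1), min(w, x + 2))
                          if (j, i) != (y, x)]
            if all(v > n for n in neighbours):  # maxima detection
                peaks.append((y, x))
    return peaks
```

Calling `find_clusters(image, threshold=20)` on a small list-of-lists image returns the pixel positions of the bright spots that stand out from their local background.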
Firecrest Offsets
Adjusts scale and registration of image

# Offsets/offsets.txt
 0.00  0.00   0.00000   0.00000
 0.32  1.41   0.00069   0.00068
-0.01  1.82  -0.00123  -0.00125
 0.14  1.59  -0.00097  -0.00092

[Figure labels: X, Y, dX, dY]
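As a rough illustration of how such a file might be used, the sketch below parses the four rows and applies one row to a pixel coordinate. The column meaning is an assumption, not something the slide states: the columns are read here as X shift, Y shift, X scale (dX), and Y scale (dY), and the transform `x + shift + scale * x` is a guess at "scale and registration". The names `parse_offsets` and `register` are hypothetical.

```python
OFFSETS_TXT = """\
# Offsets/offsets.txt
0.00 0.00 0.00000 0.00000
0.32 1.41 0.00069 0.00068
-0.01 1.82 -0.00123 -0.00125
0.14 1.59 -0.00097 -0.00092
"""


def parse_offsets(text):
    """Parse whitespace-separated numeric rows, skipping '#' comment lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        rows.append(tuple(float(tok) for tok in line.split()))
    return rows


def register(x, y, row):
    """Map a coordinate through one offsets row.

    ASSUMPTION: row = (x_shift, y_shift, x_scale, y_scale); the real
    column order and transform used by Firecrest may differ.
    """
    x_shift, y_shift, x_scale, y_scale = row
    return x + x_shift + x_scale * x, y + y_shift + y_scale * y
```

Under this reading, the all-zero first row leaves coordinates unchanged, which would be consistent with one image serving as the reference for the others.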
Bustard – Base Calling
Intensity Files → Sequence Files
Bustard
Base with highest corrected intensity is called
[Figure labels: C A C G T]
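The rule on this slide amounts to taking, at each cycle, the base whose corrected intensity is largest. A minimal sketch, assuming the corrected intensities are already available as one dict per cycle (the function names and data layout are hypothetical, and the correction step itself is not shown):

```python
def call_base(intensities):
    """Return the base with the highest corrected intensity.

    `intensities` maps each of 'A', 'C', 'G', 'T' to a number.
    """
    return max("ACGT", key=lambda base: intensities[base])


def call_read(cycles):
    """Call one base per cycle and join the calls into a read."""
    return "".join(call_base(cycle) for cycle in cycles)
```

Each element of `cycles` describes one sequencing cycle for a single cluster, so `call_read` yields that cluster's sequence.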
Gerald - Alignment
GEneration of Recursive Analyses Linked by Dependency
Sequence Files → Output Files
Gerald
GEneration of Recursive Analyses Linked by Dependency
Filtering removes low quality base calls
Chastity: Default value > 0.6
• Other Filters:
  • Purity
  • Similarity
  • Neighbor
  • Neighborhood
[Figure labels: IA, IB]
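Chastity is commonly defined as the brightest intensity divided by the sum of the brightest and second-brightest intensities, which is presumably what the IA and IB labels on the slide refer to. The sketch below applies that definition with the 0.6 default. Treating IA and IB as the top two intensities is an assumption, and the function names are hypothetical. How Gerald combines the per-cycle values across a read is not covered here.

```python
def chastity(intensities):
    """Brightest intensity over the sum of the two brightest.

    ASSUMPTION: IA is the highest and IB the second-highest intensity
    of the four channels at one cycle.
    """
    ia, ib = sorted(intensities, reverse=True)[:2]
    total = ia + ib
    return ia / total if total > 0 else 0.0


def passes_chastity(intensities, threshold=0.6):
    """True when the call clears the default chastity threshold."""
    return chastity(intensities) > threshold
```

A clean signal such as `[900, 100, 50, 20]` scores well above 0.6, while two nearly equal channels such as `[500, 450, 10, 5]` fall below it and would be filtered out.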
Features for Paired End Runs
In ELAND (alignment software)
• Calculate median insert size
• Align both reads and record multiple hits for each
• Discard alignments not within three standard deviations of median insert length
[Figure labels: 160 bp insert (VALID), 220 bp insert (VALID), 1220 bp insert (REJECTED)]
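The insert-size rule can be sketched as follows. This is an illustration only: the slide does not say how ELAND estimates the standard deviation, so the plain population standard deviation over all observed inserts is an assumption, as is the function name.

```python
import statistics


def classify_pairs(insert_sizes, n_sd=3.0):
    """Flag each insert size as valid (True) or rejected (False).

    A pair is kept when its insert size lies within `n_sd` standard
    deviations of the median insert size.
    ASSUMPTION: the standard deviation is taken over all observed
    inserts; ELAND may use a different, more robust estimate.
    """
    median = statistics.median(insert_sizes)
    sd = statistics.pstdev(insert_sizes)
    return [(size, abs(size - median) <= n_sd * sd) for size in insert_sizes]
```

With a sample of inserts clustered around 200 bp plus one 1220 bp outlier, the 160 bp and 220 bp pairs are kept and the 1220 bp pair is discarded, matching the labels on the slide.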
Supported Platforms
• Linux – And only Linux!
• Will more prominently state in manuals & data sheets.
Other Common Platforms
• Considered “non-supported configurations.”
• Cygwin
  • Will work with Pipeline already installed on Cygwin-Windows systems
  • New users should be discouraged
• Solaris (and other Unix) Platforms
  • Common at genome centers, pharma, possibly unavoidable.
  • Need to communicate unsupported platform status.
Publicly Available Analysis Tools
Group: Rene Warren et al; British Columbia
• Software: SSAKE – Assembly of short reads
• Downloadable: Yes, free
• Reference: http://bioinformatics.oxfordjournals.org/cgi/content/full/23/4/500
• Contact: rwarren@bcgsc.ca
Group: Pavel Pevzner, Mark Chaisson; UCSD San Diego
• Software: Euler – Genomic Assembly
• Downloadable: Free for academic groups
• http://nbcr.sdsc.edu/euler/
Group: Jonathan Butler, David Jaffe et al; Broad Institute
• Software: ALLPATHS – Assembly optimized for highly polymorphic regions
• Downloadable: Yes, free
Publicly Available Analysis Tools
Group: Barbara Wold, Rick Myers
• Software: ChIP-Seq Peak Finder
• Downloadable: Yes, free
• http://woldlab.caltech.edu/html/software/
Group: Cancer Genome Anatomy Project
• Software: SAGE DGED Tool
• Available Online
• http://cgap.nci.nih.gov/SAGE/SDGED_Wizard?METHOD=SS10,LS10&ORG=Hs