dbEST: 56,645 reads
Sanger: 141,097 reads
Roche 454: 1,333,444 reads
AB1 file? NO / YES
129,195 reads
11,902 reads
Base call (phred)
CjCon1 and coding information † (prot4EST)

Vector/adaptor/chloroplast detection and trimming (SeqClean)
Vector/adaptor/E. coli/chloroplast masking and trimming (cross_match and SeqClean)
Vector/adaptor/E. coli masking and trimming (cross_match and SeqClean)
Adaptor/chloroplast masking and trimming (cross_match and SeqClean)
Classifying reads according to properties (origin of cDNA and sequencing direction)

Gene index: Arabidopsis (AGI), Sunflower (HAGI), Tobacco (NTGI), Oak (OGI), Rice (OSGI), Pine (PGI), Spruce (SGI)

118,319 reads from AB1 files
1,201,150 reads
Assembly within each group (MIRA)
Analysis of SSR frequency (misa.pl) and GC content
Contigs
Hybrid assembly (MIRA)
CjCon1
SSR detection (misa.pl)
92,541 debris
81,284 contigs
4,059 SSRs in 3,694 contigs

Selection for full-length cDNA candidates
Quality trimming
Group A: 56,273 reads
Group B: 11,831 reads
Group C: 173,816 sequences
Full-length peptide prediction (FrameDP)
SSR detection (misa.pl)
8,166 SSR-containing sequences
Gene ontology annotation (Blast2GO)
Homology search (BlastX)
Clustering (BlastCLUST)
Clustering (CD-HIT-EST)
read2marker
Uniprot_sprot
3,644 full-length peptides for matrix construction
4,067 unique sequences
Gene set enrichment analysis (FatiGO)
Chloroplast protein sequences
Peptide prediction (prot4EST)
simTrans.pl
costructSMAT.pl
p4e.pl
111 primer pairs

MISA, misa.pl, p3_in.pl, primer3, p3_out.pl
81,284 predicted peptide sequences
2,889 SSR primer pairs in 2,772 sequences
Identifying coding regions (Fasty35)
CjCon1
In silico PCR (ipcress)
Designed primer coordinates
Prediction of primer location
3,220 total PCR products
Prediction of SSR location †
266 primer pairs discarded
2,623 unique PCR amplification primer pairs selected
Clustering iPCRess products (BlastCLUST)
2,412 unique sequences and primer pairs
EST-SSR marker sequences (Moriguchi et al. 2003 & 2009)
BlastN
2,371 unique loci
96 primer pairs selected
Duplication check (Blast)
96 primer pairs selected
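
The "Analysis of SSR frequency (misa.pl) and GC content" step can be illustrated with a minimal Python sketch. This is not misa.pl and does not reproduce its output format; it only shows the general idea of scanning a sequence for perfect tandem repeats and computing GC content. The function names `gc_content` and `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are hypothetical, chosen for illustration, because the thresholds used in the pipeline are not stated here.

```python
import re

# Hypothetical minimum repeat counts per motif length (assumed for
# illustration; the thresholds used in the pipeline are not stated).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def gc_content(seq):
    """Return the fraction of G and C among the A, C, G, T bases of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return gc / total if total else 0.0


def is_compound_of_shorter_unit(motif):
    """True if motif is itself a repeat of a shorter unit, e.g. 'ATAT' or 'AAA'."""
    n = len(motif)
    return any(
        n % k == 0 and motif == motif[:k] * (n // k)
        for k in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Find perfect SSRs; return sorted (start, motif, repeat_count) tuples.

    start is a 0-based offset into seq.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{n}) captures one motif copy; \1{m,} requires at least
        # m further copies immediately after it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_compound_of_shorter_unit(motif):
                continue  # already reported under the shorter motif length
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    example = "GGC" + "AG" * 8 + "TTCA" + "CTT" * 5 + "G"
    print(find_ssrs(example))
    print(round(gc_content(example), 3))
```

Counting the sequences that return at least one hit would give a figure comparable in kind to the "SSR-containing sequences" counts in the chart. The numbers will differ from the reported ones unless the same thresholds and repeat classes are used.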