Genboree Microbiome Workbench 16S Workshop Part I March 11th, 2014 Julia Cope Emily Hollister Kevin Riehle
Genboree 16S Workshop • Learning Objectives • Students should be able to take .sff files and user supplied information and produce: • Metadata File • PCoA • Classification Distribution • Expectations • Apply topics learned today before next meeting • Be able to discuss where issues arise • Be able to move knowledgeably through the whole Genboree Workflow
Genboree 16S Workshop Part II • Learning Outcomes • Newer database version of RDP – How to take advantage? • Students should take user .sff files and user created metadata file and produce: (I can provide files if needed.) • PCoA (QIIME) • Classification Distribution (RDP) • Expectations • Apply topics learned in tutorial • Be able to discuss where in the process issues arose • Have a hypothesis about your data issues if they happen
Workshop Outline • 16S • Metadata File • Genboree Workbench Workflow • Account • Group • Database • Project • Loading your files/samples/sequences (and linking) • QIIME • RDP • How to get help • Wrap Up and Preparation for 2nd Installment
Resources • Genboree Home Screen • http://genboree.org • Tutorials are located in the Genboree Commons • You must be signed in to open the following link • http://genboree.org/theCommons/projects/mw-march-2014 • Tutorial 1 Data Set: • http://www.genboree.org/microbiome/include/data/tutorial_sequence_file.sff.gz • Tutorial 2 Data Set: • http://genboree.org/theCommons/attachments/3545/Tutorial_2.zip • Projects are accessed through the Genboree Workbench
16S • What is it? • What part is being sequenced? • Here? • Elsewhere? • How is this accomplished? • DNA to bead to light • Intro. to flow data and .sff file content • OUTPUT is an .sff file • Aside on zipping methods and large file transfers
16S • What is it? 16 Svedberg (rRNA of the small ribosomal subunit) • What part is being sequenced? • Here? TCMC sequences the V5-V3 by 454 • Elsewhere? V3-V5, V1-V3, V9, V7-V9, and many more • Know your variable regions Allmetrics.net Sales Material Tortoli E Clin. Microbiol. Rev. 2003;16:319-354
16S • How is this accomplished? • DNA to bead to light http://cage.unl.edu/equipmentsoftware.shtml 454 Life Sciences Sales Materials
16S • How is this accomplished? • DNA to bead to light • Intro to flow data and .sff file content • OUTPUT is an .sff file • Standard Flowgram Format • All reads are structured as linker-tag-primer • Provides both identity and quality information http://cage.unl.edu/equipmentsoftware.shtml Allmetrics.net Sales Material
Genboree Workflow Meta-data • Take one step back from the Genboree Workflow and talk about input files. • What do you do with your files? .sff From: Genboree.org help files
Genboree Workflow Meta-data • What do you do with many files? • Genboree takes .zip, .gzip, .txt, and .sff files • Compressed files are easier and faster to move • Multiple files are easier to move when compressed together in an archive • Metadata files are very small and do not need compression • .sff files should be archived and compressed
Metadata Files • What data must you have? • How should it be formatted for Genboree? • What can you include? • How to make it tab-delimited • Include variable region or primer? • Directional awareness on primers
Metadata Files • What data must you have? • name • barcode • region or proximal & distal • First column must begin with # • #No_spaces_are_allowed_in_column_names_0123456789 • How should it be formatted for Genboree? • Tab delimited • What can you include? • How to make it tab-delimited? • Include variable region or primer? • Directional awareness on primers
Metadata Files • How to determine which to include: variable region or primers • Directional awareness on primers • Demo of making and saving as tab-delimited
Metadata Files - Demo • Select the data above and Copy. • Paste into Excel or an open-source spreadsheet program. Be sure all entries are free of spaces and special characters and that all samples have the same number of columns. Avoid the column titles "state" and "type". • Save As and select tab-delimited. • Name your file in a clear and consistent manner.
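The checks described in the demo can also be run outside a spreadsheet. The following is a minimal Python sketch, not part of Genboree: the function names, sample names, barcodes, and the region value are made up for illustration, and only the rules stated on the slides are checked (first column name begins with #, no "state" or "type" columns, no spaces, equal column counts).

```python
import csv
import io

# Column titles the slides say to avoid.
FORBIDDEN_COLUMNS = {"state", "type"}


def check_metadata(header, rows):
    """Return a list of problems; an empty list means none were detected."""
    problems = []
    if not header or not header[0].startswith("#"):
        problems.append("first column name must begin with '#'")
    for name in header:
        if name.lower() in FORBIDDEN_COLUMNS:
            problems.append(f"column name not allowed: {name}")
    for line_no, row in enumerate([header] + rows, start=1):
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} columns, found {len(row)}"
            )
        for value in row:
            if any(ch.isspace() for ch in value):
                problems.append(f"line {line_no}: whitespace in entry {value!r}")
    return problems


def to_tab_delimited(header, rows):
    """Serialize the header and rows as tab-delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical example data; real barcodes and regions come from your run.
header = ["#name", "barcode", "region", "sample_type"]
rows = [
    ["sample_1", "ACGTACGT", "V3V5", "stool"],
    ["sample_2", "TGCATGCA", "V3V5", "stool"],
]
```

Calling check_metadata(header, rows) before to_tab_delimited(header, rows) gives a quick list of anything to fix before the file is saved and uploaded.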
Metadata Files • How to determine variable region vs. primer inclusion • Directional awareness of primers • If you aren’t sure, ask! • What are these files often called: mapping, metadata, oligos, or linker-primer file. (Many others possible.) Allmetrics.net Sales Material
Metadata Files • Another example: Tutorial Set 2 Metadata • What possible issues may arise with this metadata file?
Metadata Files • Another example • What possible issues may arise with this metadata file? • Change name => #name (or any first entry beginning with #) • Change tag => barcode • Change type => sample_type (do not name columns ‘type’ or ‘state’) • Demo: making and saving as tab-delimited
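The three column renames listed above can be sketched in Python as follows. This is an illustration only: the fix_header helper is hypothetical, and the example header (including the extra "region" column) is assumed rather than taken from the Tutorial 2 file.

```python
# Renames taken from the slide: name => #name, tag => barcode, type => sample_type.
RENAMES = {"name": "#name", "tag": "barcode", "type": "sample_type"}


def fix_header(header_line):
    """Rename the problem columns in a tab-delimited header line."""
    columns = header_line.rstrip("\n").split("\t")
    return "\t".join(RENAMES.get(col, col) for col in columns)


# Assumed example header; the real file may have other columns.
original = "name\ttag\ttype\tregion"
fixed = fix_header(original)
```

Columns not named in RENAMES pass through unchanged, so the same helper can be applied to a file that already uses some of the expected names.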
7zip • Zipping methods and large file transfers • Compression and archiving of files • Uncompressing in an easy to use format for PCs • Demo compressing • .sff (s) • http://www.7-zip.org/ From: 7-zip.org
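For those who prefer a script to the 7-Zip interface, archiving and compressing several .sff files can be sketched with Python's standard zipfile module. This is an alternative to the demo, not the workshop's method; the function name and the folder and archive names in the usage note are hypothetical.

```python
import zipfile
from pathlib import Path


def archive_sff_files(folder, archive_path):
    """Compress every .sff file in folder into one .zip archive.

    Returns the names of the files that were added.
    """
    sff_files = sorted(Path(folder).glob("*.sff"))
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for sff in sff_files:
            # Store only the file name so the archive has no nested folders.
            archive.write(sff, arcname=sff.name)
    return [sff.name for sff in sff_files]
```

A call such as archive_sff_files("my_run", "my_run_sff.zip") would produce a single compressed file to upload in place of many separate .sff files.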
Genboree Workflow • Create Group • Create Database • Create Project • Upload Files • Create Samples (Sample Import using metadata file) • Link Samples to Sequence Files (Sample File Linker) • QC and Attach Sequences (Sequence Import) • QIIME • RDP