flow charts make science less hard
Importing honey bee data into lightning 3 Click here to get started
Once the information reloads based on your settings, click on your organism - Kingdom - Group - Subgroup
species selection
species selection assembly
Checkpoint: data retrieved
• Now it’s time for you to decide what you want to do with your data set.
• You may want to:
  • Alter its format (unzip, %GC, etc.)
  • Send it to a supercomputer
Compressed data Uncompressed Lightning 3
From here the folder is sent recursively to the remote directory using scp with the -r option:
scp -r ~/Desktop/Primary_Assembly tut_user2@lightning3.its.iastate.edu:/data003/GIFTEACH/BCB660/foldername
This command is best thought of as three parts in one:
• scp -r ~/Desktop/Primary_Assembly
  • what you want to move
• tut_user2@lightning3.its.iastate.edu
  • where you want to move it
• :/data003/GIFTEACH/BCB660/foldername
  • where it goes once it gets there
You’re now ready to log into lightning3.
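The three-part breakdown above can be sketched by building the command from named pieces. The variable names (src, host, dest, scp_cmd) are illustrative and not from the slides. The command is only printed here, so nothing is copied until you run it yourself.

```shell
# The three parts of the scp command, using the values shown on the slide.
src="$HOME/Desktop/Primary_Assembly"          # what you want to move
host="tut_user2@lightning3.its.iastate.edu"   # where you want to move it
dest="/data003/GIFTEACH/BCB660/foldername"    # where it goes once it gets there

# Assemble and print the command instead of running it.
scp_cmd="scp -r $src $host:$dest"
echo "$scp_cmd"
```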
• lightning3 address
• password
• change directory to the folder containing Primary_Assembly
• change directory to Primary_Assembly/
• change directory to placed_scaffolds
• Type this long command
This last command takes everything with gz in the name and decompresses each file.
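The long command itself is not reproduced in this text, so the following is only a minimal sketch of one way to decompress every .gz file in a directory, not necessarily the command from the slide. The demo directory and file names (demo_dir, chr1.fa.gz, chr2.fa.gz) are hypothetical so the sketch can be run safely anywhere.

```shell
# Hypothetical demo directory with two small gzipped FASTA files.
demo_dir=$(mktemp -d)
printf '>scaffold1\nACGT\n' | gzip > "$demo_dir/chr1.fa.gz"
printf '>scaffold2\nGGCC\n' | gzip > "$demo_dir/chr2.fa.gz"

# Find every file whose name ends in .gz and decompress it in place.
find "$demo_dir" -type f -name '*.gz' -exec gunzip {} +

# The .gz files are now replaced by their uncompressed versions.
ls "$demo_dir"
```

On lightning3 you would point find at placed_scaffolds instead of the demo directory.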
Unzipping file at lightning3
• Permission may be denied. If so, enter:
chmod -R 777 Primary_Assembly/
• This should grant permission on every file
• Then re-enter the long command
We want to take all the individual scaffold files and put them into one file, so the GC program runs on 1 file instead of 16 files.
• cat *.fa* > ApisMellifera_4.5.fasta
• Here's the breakdown of this command:
• cat - concatenate, so take all these files
• * - wildcard, placed around the key letters
• .fa - key letters
• > - sends the output to a file
• ApisMellifera_4.5.fasta - this is the name of our file
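A quick way to check the concatenation is to count the FASTA header lines (those starting with ">") in the combined file. This is a minimal sketch on made-up data: the directory and the two tiny input files (work_dir, chr1.fa, chr2.fa) are hypothetical stand-ins for the 16 scaffold files.

```shell
# Hypothetical stand-ins for the individual scaffold files.
work_dir=$(mktemp -d)
printf '>Group1\nACGTAC\n' > "$work_dir/chr1.fa"
printf '>Group2\nGGGCCC\n' > "$work_dir/chr2.fa"

# Same pattern as on the slide: every file matching *.fa* goes into one file.
cat "$work_dir"/*.fa* > "$work_dir/ApisMellifera_4.5.fasta"

# One header line per sequence, so this prints the number of sequences.
grep -c '^>' "$work_dir/ApisMellifera_4.5.fasta"
```

Note that the output name also matches *.fa*, so if the command is run a second time in the same directory, the combined file itself becomes one of the inputs. Deleting it before re-running avoids that.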
To convert this to GC content
• ./percentGC ApisMellifera_4.5.fasta
• './percentGC' is a program that turns ApisMellifera_4.5.fasta into a table format.
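The percentGC program is not shown here, so the following is only a sketch of what such a calculation might look like in awk, not the actual program. The function name percent_gc, the demo file, and the column layout (name, percent GC, length) are all assumptions made for illustration.

```shell
# Hypothetical demo FASTA with two short sequences.
gc_dir=$(mktemp -d)
printf '>seqA\nGGCC\nAATT\n>seqB\nGGGA\n' > "$gc_dir/demo.fasta"

# Print one tab-separated row per sequence: name, percent GC, length.
# This is an illustrative stand-in, not the course's percentGC program.
percent_gc() {
  awk '
    function emit() {
      if (name != "") printf "%s\t%.2f\t%d\n", name, (len ? 100 * gc / len : 0), len
    }
    /^>/ { emit(); name = substr($1, 2); gc = 0; len = 0; next }
    {
      seq = toupper($0)                # count lowercase bases too
      len += length(seq)               # total bases seen for this sequence
      gc  += gsub(/[GC]/, "", seq)     # gsub returns how many G/C it replaced
    }
    END { emit() }
  ' "$1"
}

percent_gc "$gc_dir/demo.fasta"
```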
Open this file in R
• > honeyBeeGC <- read.table("ApisMellifera_4.5.gc")
• > ls(honeyBeeGC)
• This should read
• [1] "V1" "V2" "V3"
• If you want a histogram
• > hist(honeyBeeGC$V2, breaks=seq(0, 100, length.out=1000))
• use join if the files have a common field
• use cat to glue FASTA files together in order: cat file1 file2 file3 > redirectedOutput
• transliterating: http://stackoverflow.com/questions
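As a minimal sketch of the join idea, two tab-separated tables that share a sequence name in the first field can be merged into one. The file names and values (gc.tsv, len.tsv, seqA, seqB) are made up for illustration.

```shell
# Two hypothetical tables keyed on the sequence name in field 1.
join_dir=$(mktemp -d)
printf 'seqB\t75.00\nseqA\t50.00\n' > "$join_dir/gc.tsv"
printf 'seqA\t8\nseqB\t4\n' > "$join_dir/len.tsv"

# join expects both inputs to be sorted on the join field.
sort -k1,1 "$join_dir/gc.tsv"  > "$join_dir/gc.sorted"
sort -k1,1 "$join_dir/len.tsv" > "$join_dir/len.sorted"

# Use a literal tab as the field separator for input and output.
tab=$(printf '\t')
join -t "$tab" "$join_dir/gc.sorted" "$join_dir/len.sorted" > "$join_dir/joined.tsv"
cat "$join_dir/joined.tsv"
```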
• list of species
• unzipping gzips
• current process to generate our .gc format file
• percentGC Abdf.fasta > gctemp
• seqlen.awk Abr.fasta > sitemp_tabs
cut out the first two fields of gctemp and output them to a temp file
• cut -f1-2 gctemp_tabs
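A minimal sketch of the cut step on made-up data follows. The rows and the output name gc_first_two are hypothetical. cut splits on tabs by default, so -f1-2 keeps the first two tab-separated columns.

```shell
# Hypothetical three-column, tab-separated table.
cut_dir=$(mktemp -d)
printf 'seqA\t50.00\t8\nseqB\t75.00\t4\n' > "$cut_dir/gctemp_tabs"

# Keep only fields 1 and 2 and send them to a temporary file.
cut -f1-2 "$cut_dir/gctemp_tabs" > "$cut_dir/gc_first_two"
cat "$cut_dir/gc_first_two"
```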