Progress report. 9.17 PJ Lab. Cross-project. TCGA/ICGC database. "CMDI-UK" "LIHM-FR" "LUSC-CN" "RECA-CN". "PAAD-US". Clinical plot. gridExtra::arrangeGrob(emptyPlot, emptyPlot, plotDD, plotD, ncol=3, widths=c(.20,.05,.80)).
Progress report 9.17 PJ Lab
TCGA/ICGC database "CMDI-UK" "LIHM-FR" "LUSC-CN" "RECA-CN" "PAAD-US"
gridExtra::arrangeGrob(emptyPlot, emptyPlot, plotDD, plotD, ncol=3, widths=c(.20,.05,.80))
gridExtra::arrangeGrob(emptyPlot, emptyPlot, plotDD, plotD, ncol=3, widths=c(.05,.20,.80))
barplot(ego2, showCategory = 6, split = 'ONTOLOGY', font.size = 8, horiz = FALSE) + facet_grid(. ~ ONTOLOGY, scales = 'free') + scale_colour_gradient2(low = "red", mid = "white", high = "blue", midpoint = 7.562602e-05)
To do list • TCGA/ICGC project plot • Enrichplot
Progress report 10.01 PJ Lab
Question • Oncotator or VEP • pVac-seq • Change the ANNOVAR RefSeq • Use NetMHC analysis
Future work • E.L.K • pVac-seq • Gromacs
Progress report 11.26 PJ Lab
test file output all_epitopes.tsv Example file output all_epitopes.tsv
pVac-seq output pvacseq run --iedb-install-directory /home/jacky831006/ -e 11 ~/vep_data/VCF/test2.vcf test HLA-A*01:01,HLA-A*01:02 NetMHC ~/pvac_test/test3
test.all_epitopes.tsv
HLA Allele: HLA-A*01:01
Peptide Length: 11
Sub-peptide Position: 3
Mutation Position: 9
MT Epitope Seq: KSLPGGLDTVV
WT Epitope Seq: KSLPGGLDAVV
Best MT Score Method: NetMHC
Best MT Score: 31067.37
Corresponding WT Score: 31004.92
Corresponding Fold Change: 0.998
pVac-seq • Binding filter • Coverage filter • Transcript support level filter • Top score filter
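The binding filter and the fold-change column can be illustrated on the example row from all_epitopes.tsv. This is only a minimal sketch, not pVACseq's own code: the 500 nM cutoff, the strict "<" comparison, and the helper names fold_change and passes_binding_filter are assumptions made for illustration. The fold change is taken as WT score divided by MT score, which reproduces the 0.998 shown in the example output.

```python
import csv
import io

# Assumed binding cutoff in nM; the slides do not state the value used.
BINDING_THRESHOLD_NM = 500.0

# Example row from the slides (only the columns needed here).
EXAMPLE_TSV = (
    "MT Epitope Seq\tWT Epitope Seq\tBest MT Score Method\t"
    "Best MT Score\tCorresponding WT Score\n"
    "KSLPGGLDTVV\tKSLPGGLDAVV\tNetMHC\t31067.37\t31004.92\n"
)


def fold_change(wt_score, mt_score):
    """WT / MT score ratio; values above 1 mean the mutant binds better."""
    return wt_score / mt_score


def passes_binding_filter(row, threshold=BINDING_THRESHOLD_NM):
    """Keep an epitope only if its best mutant score is below the cutoff."""
    return float(row["Best MT Score"]) < threshold


rows = list(csv.DictReader(io.StringIO(EXAMPLE_TSV), delimiter="\t"))
kept = [r for r in rows if passes_binding_filter(r)]

for r in rows:
    fc = fold_change(float(r["Corresponding WT Score"]), float(r["Best MT Score"]))
    print(r["MT Epitope Seq"], round(fc, 3), "kept" if r in kept else "filtered out")
```

Under these assumptions the example epitope (31067.37 nM) is far above the cutoff, so it would not survive a binding filter.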
Transcript support level filter The GENCODE TSL provides a consistent method of evaluating the level of support that a GENCODE transcript annotation is actually expressed in humans.
Question Expression data Coverage data
Creating a phased VCF of proximal variants By default, pVACseq will evaluate all somatic variants in the input VCF in isolation. As a result, if a somatic variant of interest has other somatic or germline variants in proximity, the calculated wildtype and mutant protein sequences might be incorrect because the amino acid changes of those proximal variants were not taken into account.
Future work • pVac-seq • hail
Progress report 2018.12.10 PJ Lab
pVac-seq (1.0.7) test.combined.parsed.tsv
HLA Allele: HLA-A*01:01
Peptide Length: 11
Sub-peptide Position: 3
Mutation Position: 9
MT Epitope Seq: KSLPGGLDTVV
WT Epitope Seq: KSLPGGLDAVV
Best MT Score Method: NetMHC
Best MT Score: 31067.37
Corresponding WT Score: 31004.92
Corresponding Fold Change: 0.998
NetMHC 78 HLA allele sequences
The tables below show the allele-specific thresholds for the 38 most common HLA-A and HLA-B alleles, representative of the nine major supertypes. The tables can also be downloaded as an RTF file (see attached file)
Different threshold pvacseq 1.1.5
pvacseq run --iedb-install-directory /home/jacky831006/ -e 11 ~/vep_data/VCF/ACC-US.vcf test HLA-A*01:01,HLA-A*02:01,HLA-A*02:02,HLA-A*02:03,HLA-A*02:06 NetMHC ~/pvac_test/ACC-US
5 alleles, 1000 mutations: 21 mins
ACC 19000 -> 37000 mutations *10 *73
Total 37*21*10*73*78/5 mins = 147474.6 hrs = 6144.775 days
Other Algorithm pvacseq 1.1.5
NetMHC • Total 37*21*10*73*78/5 mins = 147474.6 hrs = 6144.775 days MHCflurry • Total 1.5 hrs*10*73*80/5 = 17520 hrs = 730 days
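The two totals above can be re-derived step by step. The sketch below only restates the arithmetic from the slides; the multipliers 10 and 73 are copied as given (their meaning is not stated here), and the variable names are illustrative.

```python
MIN_PER_HOUR = 60
HOURS_PER_DAY = 24

# NetMHC: 21 min per 1000 mutations for a batch of 5 alleles,
# scaled to 37 (thousand mutations), the slide's multipliers 10 and 73,
# and 78 alleles processed 5 at a time.
netmhc_minutes = 37 * 21 * 10 * 73 * 78 / 5
netmhc_hours = netmhc_minutes / MIN_PER_HOUR
netmhc_days = netmhc_hours / HOURS_PER_DAY

# MHCflurry: 1.5 hours per run, same multipliers, 80 alleles in batches of 5.
# The product is already in hours, so no minute conversion is needed.
mhcflurry_hours = 1.5 * 10 * 73 * 80 / 5
mhcflurry_days = mhcflurry_hours / HOURS_PER_DAY

print(f"NetMHC:    {netmhc_hours:.1f} hrs = {netmhc_days:.3f} days")
print(f"MHCflurry: {mhcflurry_hours:.1f} hrs = {mhcflurry_days:.1f} days")
```

The NetMHC product is in minutes (8,848,476 min), which is why it is divided by 60 before the day conversion, whereas the MHCflurry product starts from 1.5 hours and is therefore already in hours.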
Question nohup pvacseq run --iedb-install-directory /home/jacky831006/ -e 11 ~/vep_data/VCF/ACC-US.vcf test HLA-A*01:01,HLA-A*02:01,HLA-A*02:02,HLA-A*02:03,HLA-A*02:06 NetMHC ~/pvac_test/ACC-US &
Progress report 2018.12.24 PJ Lab
Test tensorflow Tensorflow -> Tensorflow(GPU)
Question Time 1.5hr -> 2hr up
Allele WT binding affinity MT binding affinity Fold Change Protein structure
Future work • Run MHCflurry by gpu • Remove the duplicate mutation in all projects
Progress report 2019.1.07 PJ Lab
Speedup suggestion
• Number of variants in your VCF: Split the VCF into smaller subsets and process each one individually, in parallel.
• Number of transcripts for each variant: Use the --pick option when running VEP to annotate each variant with the top transcript only.
• The --fasta-size parameter value: When using a local IEDB install, increase the size of this parameter.
• Number of prediction algorithms, epitope lengths, and HLA-alleles
• --downstream-sequence-length parameter value: Reduce the value of this parameter.
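The first suggestion, splitting the VCF into smaller subsets, could be sketched as follows. This is a hedged illustration, not part of pVACseq: the function name split_vcf, the output naming scheme, and the chunk size are assumptions, and the input is assumed to be a plain (uncompressed) VCF. Every chunk repeats the full header so it remains a valid VCF by itself.

```python
from pathlib import Path


def split_vcf(vcf_path, out_dir, records_per_chunk):
    """Split a plain-text VCF into chunks, copying the header into each one.

    Returns the list of chunk paths in the order they were written.
    """
    vcf_path = Path(vcf_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    header = []   # all lines starting with '#'
    chunk = []    # variant records waiting to be written
    written = []  # paths of the chunks produced so far

    def flush():
        # Write the pending records (if any) as the next numbered chunk.
        if not chunk:
            return
        out = out_dir / f"{vcf_path.stem}.part{len(written) + 1:04d}.vcf"
        out.write_text("".join(header + chunk))
        written.append(out)
        chunk.clear()

    with vcf_path.open() as fh:
        for line in fh:
            if line.startswith("#"):
                header.append(line)
            else:
                chunk.append(line)
                if len(chunk) >= records_per_chunk:
                    flush()
    flush()  # write the final, possibly shorter, chunk
    return written
```

Each resulting file can then be passed to its own pvacseq run, for example one background job per chunk, and the per-chunk outputs combined afterwards.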
VCF processing • bcftools • vcftools