Development of Viral Epitope Repertoires Adi Yablonka Jennifer Benichou Project Advisor: Prof. Yoram Louzoun Computational Biology Track 2010 Final Project
Motivation and Goals • HIV: • Many attempts to develop a vaccine against HIV have failed. • This is due to the virus's rapid mutation rate, which enables it to evade immune recognition. • HBV: • The HBV genome has overlapping coding sequences (CDSs). • Analyzing the implications of mutations that affect overlapping genes (with regard to HBV’s evolution and its interaction with the immune system) may also help us learn about similar viruses with overlapping genes (e.g. HPV). • Our goal is to characterize changes and viral trade-off preferences in the epitope distribution along the CDS of different proteins.
Background – Antigen Presentation • An epitope is the part of the antigen that is recognized by the immune system. • Proteasome – degrades proteins within the cell into peptides about 9 AA long. • TAP – delivers cytosolic peptides into the ER, where they bind to MHC class I molecules. • MHC-I – found in every nucleated cell; its function is to display fragments of proteins from within the cell to T cells.
Background - HIV • HIV attacks CD4+ cells. The spikes on the surface of the virus particle stick to CD4 and allow the viral envelope to fuse with the cell membrane. • After fusion, the envelope is left behind and the viral reverse transcriptase (RT) converts the RNA genome into DNA. The DNA is then transported to the cell nucleus and integrated into the human genome by the viral integrase. • The HIV provirus may lie dormant within a cell for a long time. When the cell becomes activated, human as well as HIV genes are transcribed using human enzymes. • The messenger RNA is then transported out of the nucleus and used as a blueprint for translation and replication.
Background – HIV Genes • Regulatory Proteins: • Tat – controls transactivation of all HIV proteins. • Rev – the differential regulator of expression of virus protein genes. • Accessory Proteins: • Nef – negative regulatory factor, retards HIV replication. • Vif – infectivity factor gene. • Vpr – undetermined function. • Vpu – required for efficient viral replication and release. • Structural Proteins: • GAG – codes for various proteins necessary to protect the virus. Has 3 parts: MA (matrix), CA (capsid), and NC (nucleocapsid). • POL – codes for the enzymes necessary for virus replication. Has 3 parts: PR (protease), IN (integrase), and RT (reverse transcriptase). • ENV – codes for the envelope of the virus. Has two parts: SU (surface envelope, gp120) and TM (transmembrane envelope, gp41).
Background - HBV • HBV is a small enveloped virus with a partially double-stranded circular DNA genome. It is the only member of the Hepadnaviridae family that infects humans. • The HBV genome contains 4 main genes: • Core – encodes the capsid protein. • Pol – encodes a polymerase with reverse transcriptase activity. • Surface – encodes the small, medium and large ER intermembrane proteins. • X – thought to have transcription regulation activity. • The HBV genome has 4 ORFs: the entire Surface protein, the C-terminus of Core and the N-terminus of X overlap with Polymerase.
Background – Viral Epitopes • Previous work has shown that HIV tends to decrease the number of epitopes in regulatory proteins, which predominate in the initial stages of replication. • In HBV, on the other hand, the protein copy number, more than the expression time, seems to affect the epitope density.
HLA Polymorphism • The advantage of a mutation that removes an epitope is usually lost when the virus transfers to a new host with different HLA alleles. • Therefore, we expect a high turnover of mutations in potential epitopes in the new host after the transfer. • Mutations affecting the cleavage sites (flanking regions) do not depend on the HLA allele and will therefore keep providing the virus with this advantage in the new host as well.
Algorithm • Multiple Sequence Alignment • Phylogenetic Tree • PreProcessing of Sequences • DNA-based Mutation Positioning Within the AA Sequences • Translation of DNA Sequences • Peptibase • Mutation Characterization for all Alleles
MSA • Phylogenetic Tree • PreProcessing of Sequences • Mutation Positioning • Translation • Peptibase • Mutation Characterization MSA and Phylogenetic Tree • The input DNA sequences are aligned using MUSCLE 3.6. The sequences were retrieved from the LANL HIV Database. • A genetically distant ‘Outgroup’ sequence is added to properly position the root of the tree and reconstruct the ancestral sequences. • The ‘Outgroup’ sequence for the HIV dataset was selected from SIV.
MSA and Phylogenetic Tree • The alignment is used to build a phylogenetic tree using the Maximum Parsimony method (Phylip 3.69). • The intermediate sequences built by the program reflect the changes that occurred within the coding sequence of the viral protein. • The phylogenetic tree shows the epitope development of the virus.
PreProcessing of Sequences • The sequences reconstructed by the Phylip program may contain ambiguous nucleotides. • These nucleotides are resolved from the bottom of the tree upwards, so that the resolution relies on the original input sequences. • Reconstructed sequences containing an early stop codon remained in the tree, but were not taken into account in the analysis.
DNA-based Mutation Positioning Within the AA Sequences • Mutations between each sequence and its direct descendant were noted at the DNA level. • Each such mutation was then associated with the matching amino acids in the translated sequences. • Example: mutation C → A, between AA1 in father and AA1 in son. • Example: mutation G → - (deletion), between AA2 in father and AA1 in son.
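The positioning step can be sketched roughly as below. This is a minimal illustration, not the project's actual code: the function names are hypothetical, the two sequences are assumed to be aligned to the same length with "-" for gaps, and a gap is assumed to be assigned to the codon of the last real nucleotide before it (which reproduces the G → - example, AA2 in father and AA1 in son).

```python
def aa_index(ungapped_seen, is_gap):
    """1-based amino acid index for an alignment column.

    ungapped_seen: number of real nucleotides before this column.
    Assumption: a gap is assigned to the codon of the last real
    nucleotide that precedes it.
    """
    if is_gap:
        return max(ungapped_seen - 1, 0) // 3 + 1
    return ungapped_seen // 3 + 1


def position_mutations(father_dna, son_dna):
    """List (father_nt, son_nt, father_aa, son_aa) for every differing column."""
    if len(father_dna) != len(son_dna):
        raise ValueError("sequences must be aligned to the same length")
    seen_father = seen_son = 0  # real (non-gap) nucleotides consumed so far
    mutations = []
    for f, s in zip(father_dna, son_dna):
        if f != s:
            mutations.append(
                (f, s, aa_index(seen_father, f == "-"), aa_index(seen_son, s == "-"))
            )
        seen_father += f != "-"
        seen_son += s != "-"
    return mutations
```

For the hypothetical aligned pair `CGTGAA` (father) and `AGT-AA` (son), this reports a C → A change in AA1 of both sequences and a G → - deletion between AA2 of the father and AA1 of the son.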
Translation of DNA Sequences and Upload to Peptibase • All DNA sequences (input and intermediate) were: • translated to AAs. • uploaded to the Peptibase server. • The Peptibase server was developed by our lab and is used to predict epitopes within AA sequences. • The analysis in Peptibase is performed on the 31 most frequent HLA alleles, taking into account the allele frequency in the human population.
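As an illustration of the translation step, here is a minimal sketch using the standard genetic code. The function names are hypothetical, and two choices are assumptions rather than the project's documented behavior: alignment gaps are dropped before translation, and codons containing ambiguous nucleotides are translated to "X". The second helper flags the early stop codons that excluded reconstructed sequences from the analysis.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 0; '*' marks a stop codon."""
    dna = dna.replace("-", "").upper()  # assumption: gaps are simply removed
    protein = []
    for i in range(0, len(dna) - 2, 3):
        # Codons with ambiguous nucleotides are not in the table -> 'X'.
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


def has_early_stop(protein):
    """True if a stop codon appears anywhere before the last position."""
    return "*" in protein[:-1]
```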
Peptibase • Given an AA sequence, Peptibase uses 3 cut-offs on a 9-mer AA sliding window to predict its epitopes: • Cleavage by the Proteasome • Binding to TAP • Binding to MHC-I • For each 9-mer, cleavage, TAP and MHC-I binding scores are computed. • 9-mers passing all three stages are defined as epitopes.
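The three-stage filter can be pictured with the sketch below. It is not Peptibase's implementation or API: the scoring functions are hypothetical placeholders supplied by the caller, and both the cut-off values and the "score at or above the cut-off passes" direction are assumptions. Real cleavage scoring also looks at the residues flanking the 9-mer, which this simplified version ignores.

```python
def predict_epitopes(protein, cleavage_score, tap_score, mhc_score,
                     cutoffs, window=9):
    """Return (start, peptide) for every window passing all three stages.

    cleavage_score, tap_score, mhc_score: hypothetical callables that
    map a peptide string to a number.
    cutoffs: (cleavage_cutoff, tap_cutoff, mhc_cutoff).
    """
    epitopes = []
    for start in range(len(protein) - window + 1):
        peptide = protein[start:start + window]
        scores = (cleavage_score(peptide), tap_score(peptide), mhc_score(peptide))
        # A 9-mer is called an epitope only if it passes every stage.
        if all(score >= cutoff for score, cutoff in zip(scores, cutoffs)):
            epitopes.append((start, peptide))
    return epitopes
```

In the project this prediction was repeated per HLA allele; here that would correspond to calling the function once per allele with an allele-specific `mhc_score`.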
Mutation Characterization • A mutation at the nucleotide level may either change the resulting amino acid (replacement) or not (silent). • We defined 9 types of replacement mutations: • E2N • F2N • N2N • E2F • F2F • N2F • E2E • F2E • N2E • [Figure: the sequence PGRAFYATGEITGDIR annotated with epitope (E), flanking (F) and non-epitope (N) regions.]
Mutation Characterization • The mutation type is based on the original affiliation of the amino acid in the father sequence and its new affiliation in the son sequence (epitope, flanking region, or non-epitope region). • For example, an E2N mutation occurred in a nucleotide that belonged to an epitope in the father sequence, and resulted in the loss of this epitope in the son sequence.
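The naming scheme is simple enough to sketch directly. The helper names below are hypothetical; the region labels "E", "F" and "N" for the mutated amino acid in the father and in the son are assumed to be already known from the epitope prediction.

```python
from collections import Counter

REGIONS = ("E", "F", "N")  # epitope, flanking region, non-epitope
ALL_TYPES = [f"{a}2{b}" for a in REGIONS for b in REGIONS]  # the 9 types


def mutation_type(father_region, son_region):
    """Name a replacement mutation by its region in father and son, e.g. 'E2N'."""
    if father_region not in REGIONS or son_region not in REGIONS:
        raise ValueError("regions must be 'E', 'F' or 'N'")
    return f"{father_region}2{son_region}"


def count_types(region_pairs):
    """Count mutation types over (father_region, son_region) pairs."""
    return Counter(mutation_type(f, s) for f, s in region_pairs)
```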
Results – HIV (Full Balance) • Full Balance Calculation: • Epitope: E2N + E2F – N2E – F2E • Flanking: F2N + F2E – N2F – E2F • Non-Epitope: N2E + N2F – E2N – F2N • The results were normalized by the average length of the proteins.
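A small sketch of the balance calculation, assuming the mutation-type counts are available as a dictionary. The function name is hypothetical, and the normalization is assumed here to be a plain division by the average protein length. Each balance counts mutations leaving a region type minus mutations entering it, so the three balances always sum to zero.

```python
def full_balance(counts, average_length=None):
    """Compute the epitope, flanking and non-epitope balances.

    counts: dict mapping mutation type (e.g. 'E2N') to its count.
    average_length: optional average protein length; assumption: the
    normalization is a simple division by this value.
    """
    c = lambda key: counts.get(key, 0)
    balance = {
        "epitope": c("E2N") + c("E2F") - c("N2E") - c("F2E"),
        "flanking": c("F2N") + c("F2E") - c("N2F") - c("E2F"),
        "non_epitope": c("N2E") + c("N2F") - c("E2N") - c("F2N"),
    }
    if average_length is not None:
        balance = {k: v / average_length for k, v in balance.items()}
    return balance
```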
Results – HIV (Full Balance) • Consistent with HLA polymorphism, all HIV proteins clearly tend to eliminate flanking regions. • For most proteins, the non-epitope balance is approximately 0, except for Nef and Vpu, which accumulate epitopes more than others, and Rev and Vpr, which remove epitopes. • In the epitope balance, most proteins (again, except for Rev and Vpr) create new epitopes instead of removing them. • An interesting point is the total balance within epitope and flanking regions, where there is a tendency to remove cleavage sites by adding epitopes.
Results – HIV (Transition Balance) • The results were normalized by the average length of the proteins.
Results – HIV (Full Balance) • All HIV proteins tend to remove flanking regions, either completely or by creating a new epitope. • Rev and Vpr prefer to eliminate existing epitopes without creating new epitopes.
Results – HBV (R/S Ratio) • HBV proteins with multiple copies undergo selection against epitope presentation. • Pol is expressed at low levels and does not go through the same selection. • Epitope-reducing mutations in other proteins come at the expense of replacement mutations in the overlapping regions of Pol.
Results – HBV (R/S Ratio) • R/S is the ratio between the number of replacement and silent mutations. • The R/S ratio is significantly higher in regions with two reading frames, since there are few mutations that are simultaneously silent in the two reading frames.
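The effect of a second reading frame on R/S can be illustrated with the sketch below. It is an illustration under assumptions, not the project's code: the function names are hypothetical, the father and son sequences are assumed to be ungapped, equal-length strings over A, C, G, T, each differing position is treated as an independent substitution, and a frame is given as the offset (0, 1 or 2) of its first codon. A mutation is counted as silent only if it is silent in every frame that covers it.

```python
from itertools import product

AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product("TCAG", repeat=3), AMINO)}


def codon_at(dna, pos, frame):
    """Codon covering position pos in the given frame, or None if incomplete."""
    start = pos - (pos - frame) % 3
    if start < 0 or start + 3 > len(dna):
        return None
    return dna[start:start + 3]


def is_silent(father, son, pos, frames=(0,)):
    """True if the change at pos alters no amino acid in any listed frame."""
    for frame in frames:
        codon_f, codon_s = codon_at(father, pos, frame), codon_at(son, pos, frame)
        if codon_f is None:
            continue  # position is not inside a complete codon in this frame
        if CODON_TABLE[codon_f] != CODON_TABLE[codon_s]:
            return False
    return True


def rs_counts(father, son, frames=(0,)):
    """Return (replacement, silent) counts over all differing positions."""
    replacement = silent = 0
    for pos, (f, s) in enumerate(zip(father, son)):
        if f == s:
            continue
        if is_silent(father, son, pos, frames):
            silent += 1
        else:
            replacement += 1
    return replacement, silent


def rs_ratio(replacement, silent):
    """R/S ratio; infinite when there are no silent mutations."""
    return replacement / silent if silent else float("inf")
```

For the hypothetical pair `CTGGCA` → `CTAGCA`, the G → A change is silent in frame 0 (CTG and CTA both encode Leu) but becomes a replacement once frame 1 is also considered (TGG, Trp, turns into the stop codon TAG).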
Results – HBV (Epitope turnover) • [Figure: epitope turnover in overlapping (2 RFs) and non-overlapping (1 RF) regions, measured as the no. of mutations affecting epitopes per 1,000 bp in each father-son pair.] • Turnover calculation: E2N + N2E + N2F + F2N
Results – HBV (Epitope turnover) • The epitope turnover is the number of mutations per 1,000 nucleotides that either add or remove an epitope between a father sequence and its son in the phylogenetic tree. • In the non-overlapping regions of proteins C and X (one reading frame), there is a higher turnover than in overlapping regions. • In their overlapping regions (two reading frames), most mutations are not allowed due to functional constraints. • Pol, which is expressed at low levels and does not tend to remove epitopes, has a lower turnover in its non-overlapping region. The higher turnover is seen in its overlapping region, due to mutations meant to affect the other genes.
Results – HBV (Net Decrease in the Number of Cleavage Sites) • The net decrease in cleavage sites was measured as F2N – N2F: the number of mutations removing a cleavage site minus the number creating one, per 1,000 nucleotides, in each father-son pair in the phylogenetic tree. • The difference is significantly positive in practically all regions.
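Both HBV measures, the epitope turnover (E2N + N2E + N2F + F2N) and the net decrease in cleavage sites (F2N – N2F), can be sketched from the same mutation-type counts. The function names are hypothetical, and the per-1,000-nucleotide scaling is assumed to be a division by the length of the region in nucleotides.

```python
def per_1000_nt(value, region_length_nt):
    """Scale a mutation count to a rate per 1,000 nucleotides."""
    return 1000 * value / region_length_nt


def epitope_turnover(counts, region_length_nt):
    """Mutations adding or removing an epitope, per 1,000 nucleotides."""
    total = sum(counts.get(key, 0) for key in ("E2N", "N2E", "N2F", "F2N"))
    return per_1000_nt(total, region_length_nt)


def net_cleavage_site_decrease(counts, region_length_nt):
    """F2N minus N2F per 1,000 nucleotides; positive means a net loss of sites."""
    net = counts.get("F2N", 0) - counts.get("N2F", 0)
    return per_1000_nt(net, region_length_nt)
```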
Conclusions • In order for a virus to survive in the presence of a CTL immune response, it must minimize the total number of exposed epitopes. • In HIV and HBV, there is a clear tendency to remove epitopes by eliminating cleavage sites. This may be the viral solution against the HLA polymorphism. • In HBV, there is a strong selection on Core, Surface and X to remove epitopes. Core and X have an easier time mutating their non-overlapping regions, since in the overlapping regions Pol is also affected. Pol, having a low copy number, doesn’t try to remove epitopes and is therefore mainly affected in overlapping regions.
Conclusions • HIV removes cleavage sites by creating new epitopes. A possible explanation: selection acts only on the patient’s HLA alleles. Alleles not present in the host do not go through the same selection. A mutation that eliminates a cleavage site to avoid epitope presentation in one specific HLA allele may create a new epitope in a different allele. • In HIV, Rev and Vpr remove epitopes while other proteins actually accumulate them.
Open Questions & Future Goals • Research further the phenomenon of cleavage site destruction producing new epitopes rather than non-epitope nucleotides. • Characterize the changes in the epitope density of a single HIV patient with known HLA serotyping. • …
Acknowledgements • Thank you to: • Prof. Yoram Louzoun, for the dedicated guidance… • Kobi Maman and the whole lab, for all the help… • Prof. Ron Unger • Dr. Rachel Levy Drummer • Ariel Azia Amitai
Bibliography • Yewdell, J. W., E. Reits, and J. Neefjes. 2003. Making sense of mass destruction: quantitating MHC class I antigen presentation. Nature Reviews Immunology 3:952-961. • Vider-Shalit, T., M. Almani, R. Sarid, and Y. Louzoun. 2009. The HIV hide and seek game: an immunogenomic analysis of the HIV epitope repertoire. AIDS 23:1311-8. • http://www.righto.com/theories/hiv_genes.html • http://www.avert.org/hiv-virus.htm • http://peptibase.cs.biu.ac.il/peptibase/ • http://www.hiv.lanl.gov/content/index