Influenza
David L. Suarez
Southeast Poultry Research Laboratory, Agricultural Research Service, U.S. Department of Agriculture, Athens, Georgia
Influenza
• Orthomyxovirus
• Segmented genome
• Pleomorphic RNA viruses - single stranded
• Three antigenic types: A, B, C
• Type A:
  • Human influenza: H1N1, H3N2, pandemic H1N1
  • Swine influenza: H1N1, H3N2
  • Equine influenza: H3N8, H7N7
  • Canine influenza: H3N8
  • Avian influenza (many bird species): H1-H16, N1-N9
• Vary in pathogenicity
Influenza A Virus
• 10 or 11 influenza proteins
• 9 proteins packaged in virion
  • Surface proteins: HA, NA, M2
  • Internal proteins: NP, PA, PB1, PB2, M1, and NS2
• NS1 not packaged in virion
• 16 HA subtypes
• 9 NA subtypes
Influenza: Infection and Disease
• Infection may cause a wide range of clinical signs, from no disease (asymptomatic) through respiratory disease to severe disease with high mortality
• Localized infection: mild to moderate disease
  • Intestinal: wild ducks and shorebirds, poultry
  • Respiratory: humans, swine, horses, poultry, domestic ducks, seal, mink
• Systemic infection: high mortality
  • Chickens, turkeys, other gallinaceous birds
Swayne, D.E. Epidemiology of Avian Influenza in Agricultural and Other Man-Made Systems. In: Avian Influenza. Wiley-Blackwell (www.blackwellpublishing.com), March 2008.
Main Existing Influenza Lineages
• Human influenza: H3N2, H1N1
• Avian influenza
• Equine/canine influenza: H3N8
• Swine influenza: H1N1, H3N2
Pandemics of influenza
[Timeline figure, 1895-2015: recorded human pandemic influenza (early sub-types inferred) and recorded new avian influenzas]
• 1889 Russian influenza H2N2
• 1900 Old Hong Kong influenza H3N8
• 1918 Spanish influenza H1N1
• 1957 Asian influenza H2N2
• 1968 Hong Kong influenza H3N2
• 2009 Pandemic influenza H1N1
• Recorded new avian influenzas: H9* (1999), H5 (1997, 2003), H7 (1980, 1996, 2002)
Reproduced and adapted (2009) with permission of Dr Masato Tashiro, Director, Center for Influenza Virus Research, National Institute of Infectious Diseases (NIID), Japan.
Genetic origins of the pandemic (H1N1) 2009 virus: viral reassortment
[Figure: the eight gene segments (PB2, PB1, PA, HA, NP, NA, MP, NS) of the two parental viruses and of the pandemic virus, coloured by lineage of origin]
• Parental viruses: N. American H1N1 (swine/avian/human) and an unknown lineage H1N1
• Result: Pandemic (H1N1) 2009, combining swine, avian and human viral components
• Segment origins in the legend:
  • Classical swine, N. American lineage
  • Avian, N. American lineage
  • Human seasonal H3N2
  • Unknown lineage (closest Eurasian swine)
Virus as a Parasite
• Viruses are very small and encode relatively few genes
• Require host genes to make viral RNA or DNA and to package the virus
• RNA virus genomes (3,000-30,000 bp) are generally smaller than DNA virus genomes
• Most viruses infect a cell, cause the host cell to make huge numbers of copies of the viral RNA, and eventually kill the host cell
Viral Genes and Host Genes
• Host proteins are needed to make viral proteins from viral mRNA
• Host proteins help to assemble the virus
• Viral proteins usually make the viral RNA: in flu, four proteins (NP, PA, PB2, and PB1) form the polymerase complex that performs this function
• Viral proteins are used to attach to host cells (hemagglutinin protein) and to exit host cells (neuraminidase protein)
• Viral proteins are used to evade the host immune response (non-structural proteins)
General flu facts
• Influenza makes viral mRNA that is translated into protein by the host cell
• Proteins start from the first ATG (methionine)
• Proteins end with any of the 3 stop codons
• The matrix and non-structural genes are each spliced to give 2 proteins (M1, M2 and NS1, NS2)
• Host machinery processes proteins, including removing leader sequences and glycosylation
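The start and stop rules above can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not a tool from the presentation: the function name `first_orf` and the toy sequence are hypothetical, the input is the mRNA-sense sequence written as DNA, and splicing is ignored (so it would find M1 or NS1 but not M2 or NS2).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # the 3 stop codons, written as DNA


def first_orf(cdna):
    """Return the reading frame from the first ATG to the first in-frame stop.

    Returns None when there is no ATG, or no in-frame stop codon after it.
    """
    seq = cdna.upper()
    start = seq.find("ATG")
    if start == -1:
        return None
    # Step through the sequence one codon at a time, in frame with the ATG.
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return seq[start:i + 3]
    return None


# Toy sequence (hypothetical): conserved 5' end, two spacer bases, a tiny ORF.
toy = "AGCAAAAGCAGG" + "TT" + "ATGGCGTCTTAA" + "GGACC"
print(first_orf(toy))
```

A premature stop codon caused by a sequencing insertion or deletion would show up here as an open reading frame much shorter than expected for that gene segment.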
Influenza Virus Production
• Influenza has 8 gene segments
• Each segment must be packaged into the virus for it to be infectious
• How do you get all gene segments into the virus?
• Each gene segment has conserved sequence on the 5’ and 3’ ends of the segment
  • 5’ end is 12 bp: AGCAAAAGCAGG
  • 3’ end is 13 bp: CCTTGTTTCTACT
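A quick way to use the conserved ends is to test whether a full-length sequence starts and stops exactly where it should. The sketch below assumes the sequence is written in the same orientation as on these slides; the function name `has_conserved_termini` is hypothetical. The second 5’ variant (AGCGAAAGCAGG) is the one shown for the PB2, PB1 and PA segments on the Influenza Genes slide.

```python
# Conserved 5' ends: the 12-base motif quoted on the slide, plus the variant
# that the Influenza Genes slide shows for PB2, PB1 and PA.
FIVE_PRIME = ("AGCAAAAGCAGG", "AGCGAAAGCAGG")
# Conserved 13-base 3' end.
THREE_PRIME = "CCTTGTTTCTACT"


def has_conserved_termini(cdna):
    """True when the sequence begins and ends with the conserved motifs."""
    seq = cdna.upper()
    return seq.startswith(FIVE_PRIME) and seq.endswith(THREE_PRIME)


# Toy examples (hypothetical sequences, far shorter than a real segment).
clean = "AGCAAAAGCAGG" + "ATGAAATAA" + "CCTTGTTTCTACT"
padded = "TGCAGCT" + clean  # extra non-flu bases in front of the 5' end
print(has_conserved_termini(clean), has_conserved_termini(padded))
```

A sequence that fails this test is not necessarily wrong; many GenBank entries are partial and simply lack the non-coding ends.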
Flu facts
• Six of the eight gene segments have strictly conserved lengths
  • PB2 2341 bp
  • PB1 2341 bp
  • PA 2233 bp
  • NP 1565 bp
  • MA 1027 bp
  • NS 890 bp
• No larger gene segment has ever been reported for these genes (smaller ones in rare cases)
• The hemagglutinin and neuraminidase genes are exceptions, with a lot of size variation
Influenza Genes
PB2 2341 bp: AGCGAAAGCAGG TCAAATAT ATTCAATATG TAGTGTC GAATTGTTTA AAAACGA CCTTGTTTCTACT
PB1 2341 bp: AGCGAAAGCAGG CAAACCAT TTGAATG TGAAAAAATG CCTTGTTTCTACT
PA 2233 bp: AGCGAAAGCAGG TACTGATT CAAAATG TAGTTGTGGCAATGCTACTATTTGCTATCCATACTGTCCAAAAAAGTA CCTTGTTTCTACT
HA 1779 bp: AGCAAAAGCAGG GGTTCAAT CTGTCAAAATG TAGTTAAAAACAC CCTTGTTTCTACT
NP 1565 bp: AGCAAAAGCAGG GTAGATAA TCACTCACCGAGTGACATCC ACATCATG TAAAGAAAAATAC CCTTGTTTCTACT
NA 1450 bp: AGCAAAAGCAGG AGTTCAAA ATG TAGAAAAAAANT CCTTGTTTCTACT
MA 1027 bp: AGCAAAAGCAGG TAGATATT GAAAGATG TAGAGCTGGAGTAAAAAACTA CCTTGTTTCTACT
NS 890 bp: AGCAAAAGCAGG GTGACAAA AACATAATG TGATAAAAAACAC CCTTGTTTCTACT
Nucleoprotein Coding Sequence
AGCAAAAGCAGG GTAGATAA TCACTCACCGAGTGACATCC ACATCATG TAAAGAAAAATAC CCTTGTTTCTACT
• Nucleoprotein
• 1565 base pairs in length
• Encodes a single protein of 498 amino acids
• Non-coding sequence is present before and after the coding sequence
• Non-coding sequence acts as a promoter and is thought to be important for virus assembly
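The numbers on this slide can be cross-checked with simple arithmetic. The sketch below assumes that the ATG ending the 5’ fragments is the start codon and that the TAA opening the 3’ fragments is the stop codon; that reading is an interpretation of the slide, not something it states outright.

```python
# Non-coding sequence before the start codon (slide fragments minus the ATG).
five_prime_noncoding = "AGCAAAAGCAGG" + "GTAGATAA" + "TCACTCACCGAGTGACATCC" + "ACATC"
# Non-coding sequence after the stop codon (slide fragments minus the TAA).
three_prime_noncoding = "AGAAAAATAC" + "CCTTGTTTCTACT"

amino_acids = 498
coding_length = (amino_acids + 1) * 3  # 498 codons plus one stop codon

total_length = len(five_prime_noncoding) + coding_length + len(three_prime_noncoding)
print(len(five_prime_noncoding), coding_length, len(three_prime_noncoding), total_length)
```

Under that assumption the pieces add up to the 1565 bases quoted for the segment, which is why any NP entry longer than 1565 is suspicious.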
Sequencing of Influenza Viruses
• Over 180,000 influenza gene sequences have been deposited in GenBank, representing over 50,000 isolates
• Many of these are only partial gene sequences that don’t include the non-coding sequence
• Understanding the contribution of non-coding sequences to the pathogenesis of flu is important
• Roughly 3% of flu sequences in GenBank are estimated to have serious errors
Errors in Flu sequence
• Gene segments are longer than they should be, likely because
  • Primer sequence was included as part of the submission
  • For cloned genes, plasmid sequence was included
• Taq polymerase induced errors
• Sequence was poorly aligned and includes extra sequence
• Sequence includes bad reads that cause insertions or deletions, which in turn produce premature stop codons
GenBank Data Mining
• The Influenza Research Database was searched for NP gene segments >1565 bp
• 266 isolates were longer than 1565 bp, which should be the maximum size
• Most if not all of these sequences have errors that are apparent on a multiple sequence alignment
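The same length screen could be run over downloaded records in a few lines. This is a sketch only: the record layout (identifier, segment name, sequence), the function name `find_overlong` and the toy identifiers are all hypothetical, and the maximum lengths are the six fixed segment sizes given earlier in the talk. HA and NA are skipped because their lengths vary.

```python
# Maximum lengths (bases) for the six size-constrained segments.
MAX_LENGTH = {"PB2": 2341, "PB1": 2341, "PA": 2233, "NP": 1565, "MA": 1027, "NS": 890}


def find_overlong(records):
    """Return (identifier, segment, excess_bases) for each over-length record.

    `records` is an iterable of (identifier, segment, sequence) tuples.
    Segments without a fixed maximum (HA, NA) are ignored.
    """
    flagged = []
    for identifier, segment, sequence in records:
        limit = MAX_LENGTH.get(segment)
        if limit is not None and len(sequence) > limit:
            flagged.append((identifier, segment, len(sequence) - limit))
    return flagged


# Toy records with placeholder sequences of the stated lengths.
records = [
    ("toy1", "NP", "A" * 1565),  # exactly the maximum: not flagged
    ("toy2", "NP", "A" * 1584),  # 19 bases too long
    ("toy3", "HA", "A" * 1800),  # variable-length segment: ignored
]
print(find_overlong(records))
```

The excess length is a useful first clue: an excess close to a typical primer extension or vector fragment points at the kind of error to look for in the alignment.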
Bioinformatics Class Assignment
• Identify obvious mistakes in influenza sequences
• Initially, identify sequences with non-influenza sequence on the 5’ or 3’ end of the gene segments
• Characterize the types of errors that are present and correlate them with the laboratories that produced the sequences
Results
• Analyze the data from all eight gene segments and publish the results in a peer-reviewed journal
• Contact the laboratories that have mistakes and give them an opportunity to correct the errors
• GenBank provides a relatively simple process to correct sequence data
• Track which labs correct the data
Errors not so obvious
• RT-PCR amplification and sequencing of the PCR product is commonly used
[Diagram of the RT-PCR steps]
• A DNA primer anneals to the viral RNA, and the viral RNA is converted to SS DNA by the reverse transcriptase enzyme
• SS DNA is copied to DS DNA from a second primer
• PCR with both primers is used to amplify the DS DNA, which can then be sequenced
PCR basics
• Denature DS DNA to SS DNA at 94 °C
    AGCGCTAGCTAGCTAGCGGCTAGCGTATCGAGCGTAGCGTAG
    TCGCGATCGATCGATCGCCGATCGCATAGCTCGCATCGCATC
• Anneal primers to SS DNA at 54 °C
    AGCGCTAGCTAGCTAGCGGCTAGCGTATCGAGCGTAGCGTAG
                               AGCTCGCATCGCATC   (primer)
    AGCGCTAGCTAGCTA                              (primer)
    TCGCGATCGATCGATCGCCGATCGCATAGCTCGCATCGCATC
• Repeat the denaturation, annealing, and extension for 30-40 cycles
Mismatches in Primer to Template Can Still Result in PCR Amplification
• Primers with one mismatch each still anneal to the template
    AGCGCTAGCTAGCTAGCGGCTAGCGTATCGAGCGTAGCGTAG
                               AGCTGGCATCGCATC   (primer)
    AGCGCGAGCTAGCTA                              (primer)
    TCGCGATCGATCGATCGCCGATCGCATAGCTCGCATCGCATC
• Mismatches become incorporated in the PCR product
    AGCGCGAGCTAGCTAGCGGCTAGCGTATCGACCGTAGCGTAG
    TCGCGCTCGATCGATCGCCGATCGCATAGCTGGCATCGCATC
• The sequenced PCR product will include these errors
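The slide's example can be reproduced in code to show where the changes end up. This sketch makes a deliberately crude assumption: each primer anneals at one end of the template and its sequence simply replaces the template bases it covers, which is what happens after a few PCR cycles. The function names are hypothetical. Both primers are written 5’ to 3’ here, so the right-hand primer is the reverse of how the slide displays it.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement each base, then reverse, giving the opposite strand 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def pcr_product(template_top, forward_primer, reverse_primer):
    """Top strand of the product once both primers have been incorporated.

    Assumes the primers anneal exactly at the two ends of the template.
    """
    middle = template_top[len(forward_primer):len(template_top) - len(reverse_primer)]
    return forward_primer + middle + reverse_complement(reverse_primer)


template = "AGCGCTAGCTAGCTAGCGGCTAGCGTATCGAGCGTAGCGTAG"  # top strand from the slide
forward = "AGCGCGAGCTAGCTA"  # one mismatch to the template
reverse = "CTACGCTACGGTCGA"  # slide shows this primer 3'->5' as AGCTGGCATCGCATC

product = pcr_product(template, forward, reverse)
# 1-based positions where the product differs from the true template.
changed = [i + 1 for i, (a, b) in enumerate(zip(template, product)) if a != b]
print(product)
print(changed)
```

Both differences (positions 6 and 32) fall inside the primer-binding regions, which is why sequence read from those regions reflects the primer rather than the virus.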
Conclusions
• Primers must be close, but are not always identical, to the template
• Primers may introduce errors into the PCR product that will show up in the sequence
• Primer sequence should be removed when data is submitted to GenBank
• Often it isn’t, and errors in sequence may be introduced into the GenBank database
• Errors in sequence make it harder to understand which sequence changes are important for viral infections
• GIGO: garbage in, garbage out
Influenza Sequencing
• Procedures are available to PCR amplify the complete gene segment for all eight genes
• Primers include conserved areas in the non-coding region, including the 12 and 13 bp found in all eight gene segments
• In addition to flu sequence, primers also contain 5’ extensions to improve PCR efficiency, because the flu-specific sequences are so short
• These primer sequences are commonly not removed before submission to GenBank
Error from Commonly Used Procedure
• Primer sequence with 5’ extension
    ACGTCGATCGCTTTCGTCC
           AGCGAAAGCAGGTACTGATTCAAAATGCCGATCGCT
• Primer extension incorporated in PCR product
    ACGTCGATCGCTTTCGTCCATGACTAAGTTTTACGGCTAGCGA
    TGCAGCTAGCGAAAGCAGGTACTGATTCAAAATGCCGATCGCT
• Sequence includes “extra” DNA that, if not edited out, can get submitted to GenBank
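One way to catch this before submission is to trim everything outside the conserved termini. The sketch below is a simple illustration, not an established pipeline: the function name `trim_to_termini` is hypothetical, and the toy sequence joins the 5’ end from this slide to an invented short coding region and 3’ tail. It searches for the first 5’ motif and the last 3’ motif, so a chance internal copy of a motif could mislead it.

```python
FIVE_PRIME = ("AGCAAAAGCAGG", "AGCGAAAGCAGG")  # conserved 5' end and polymerase-segment variant
THREE_PRIME = "CCTTGTTTCTACT"                  # conserved 3' end


def trim_to_termini(sequence):
    """Split a sequence into (flu_part, extra_5prime, extra_3prime).

    Returns None when either conserved terminus cannot be found.
    """
    seq = sequence.upper()
    starts = [pos for pos in (seq.find(motif) for motif in FIVE_PRIME) if pos != -1]
    last_three = seq.rfind(THREE_PRIME)
    if not starts or last_three == -1:
        return None
    start = min(starts)
    stop = last_three + len(THREE_PRIME)
    if start >= stop:
        return None
    return seq[start:stop], seq[:start], seq[stop:]


# Toy submission: primer extension + slide's 5' end + invented middle and tail.
submitted = ("TGCAGCT" + "AGCGAAAGCAGGTACTGATTCAAAATG"
             + "GCCTAG" + "CCTTGTTTCTACT" + "GGATCC")
flu_part, extra_5, extra_3 = trim_to_termini(submitted)
print(extra_5, extra_3, len(flu_part))
```

Trimming removes the extension but not errors inside the primer-binding region itself, so the terminal 12 and 13 bases still reflect the primer unless the ends were sequenced separately.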
How to identify primer induced errors
• May not be possible by looking at the sequence directly
• Read the manuscript and look at the experimental detail (if it doesn’t describe a procedure specifically for sequencing the ends, the sequence probably contains primer data)
• Generate your own non-coding sequence data and compare it with the GenBank sequence
Lethality and Molecular Characterization of an HPAI H5N1 Virus Isolated from Eagles Smuggled from Thailand into Europe
M. Steensels, S. Van Borm, M. Boschmans, and T. van den Berg
Reverse transcription (RT) was performed using an RT primer specific to a universal noncoding sequence present in all influenza segment RNAs (Table 1; Uni12) and AMV reverse transcriptase (Roche), according to the manufacturer’s instructions, using 4 µl of purified RNA in a 20-µl reaction volume. Overlapping gene fragments were polymerase chain reaction (PCR)–amplified using Taq DNA polymerase (Roche) and a 2-µM final concentration of gene-specific primers (Table 1) and 1 µl of cDNA in 50-µl reactions. PCR was performed using the following temperature profile: 4 min at 94 °C, followed by 45 times the cycle (1 min at 94 °C, then 1 min at 55 °C, and 1 min at 72 °C). At the end, a final elongation step of 10 min at 72 °C was used. The size of the amplicons was verified by agarose gel electrophoresis. Subsequently, amplicons of the correct size were cloned into a pCR2.1-TOPO vector (TOPO TA cloning kit; Invitrogen, Carlsbad, CA), according to manufacturer’s instructions. The plasmid DNA from positive colonies was further purified (Qiaprep miniprep kit; Qiagen, Valencia, CA), according to the manufacturer’s procedures, and was verified by EcoRI (Roche, according to manufacturer’s instructions) digestion and agarose gel electrophoresis. Finally, sequencing reactions were performed using the M13F and M13R primers (provided with the cloning kit) (BigDyeTerminator, version 3.1,
Conclusions
• Test sequence had “extra” sequence on the 5’ and 3’ ends
• 5’ sequence is non-flu sequence added to the primer to improve PCR efficiency
• Review of the published paper confirms this
• The original paper shows that they cloned the sequence before sequencing
• Part of the 3’ sequence appears to be plasmid sequence
• Origin of the remainder of the 3’ sequence is unclear
How can you sequence the ends?
• Convert SS linear RNA to circular SS RNA
  • T4 RNA ligase will connect the RNA ends together
• Do RT-PCR using primers that cover the non-coding sequence
• Purify and sequence the PCR product as normal
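Why circularising helps can be seen by writing out the sequence across the ligation point. The sketch below is only a string-level illustration: the function name `ligation_junction` is hypothetical, the sequence is written in the same orientation as the slides rather than as the viral RNA strand, and the coding region is replaced by a short placeholder of Ns around the NP termini shown earlier.

```python
def ligation_junction(segment, flank):
    """Sequence read across the join of a circularised segment.

    After ligation the last base of the segment sits next to the first,
    so the junction is the final `flank` bases followed by the first `flank`.
    """
    if flank > len(segment):
        raise ValueError("flank is longer than the segment")
    return segment[-flank:] + segment[:flank]


# NP termini from the slides, with a placeholder in place of the coding region.
np_like = ("AGCAAAAGCAGGGTAGATAATCACTCACCGAGTGACATCCACATC"  # 5' non-coding
           + "ATG" + "N" * 30 + "TAA"                        # placeholder coding region
           + "AGAAAAATACCCTTGTTTCTACT")                      # 3' non-coding
print(ligation_junction(np_like, 23))
```

Primers placed inside the segment and pointing outward amplify across this junction, so both true termini are read from viral sequence instead of being dictated by a terminal primer.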