Sequence Feature Variant Type (SFVT) Method:
HLA Associations with Systemic Sclerosis
Genetic Determinants of Influenza Virus Host Range Restriction
Y. Megan Kong, Nishanth Marthandan, Paula Guidry, Jyothi Noronha, R. Burke Squires, Elizabeth McClellan, Mengya Liu, Yu Qian, David Dougall, Jie Huang, Diane Xiang, Brett Pickett, Victoria Hunt, Young Kim, Jeff Wiser, Thomas Smith, Jonathan Dietrich, Edward Klem, Lindsay Cowell, Nancy Monson, David Karp, Richard H. Scheuermann
Laboratory of Molecular Pathology Retreat - 10 MAR 2011
Abstracts & Posters - Immunology
• HLA Research Data, Reference Data, Visualization Tools and Analysis Tools in ImmPort
  Paula A. Guidry, Nishanth Marthandan, Thomas Smith, Patrick Dunn, Steven J. Mack, Glenys Thomson, Jeffrey Wiser, David R. Karp, Richard H. Scheuermann
• Creating a Cell Detail Page for Hematopoietic Cells in ImmPort
  David S. Dougall, Shai Shen-Orr, John Campbell, Yue Liu, Patrick Dunn, Y. Megan Kong, Mark M. Davis, Richard H. Scheuermann
• Minimum Information about a Genotyping Experiment
  Jie Huang, Nishanth Marthandan, Alexander Pertsemlidis, LiangHao Ding, Julia Kozlitina, Joseph Maher, Nancy Olsen, Jonathan Rios, Michael Story, Chao Xing, Richard H. Scheuermann
• Translational Research in ImmPort
  Y. Megan Kong, Carl Dalke, Diane Xiang, Max Y. Qian, David Dougall, David Karp, Richard H. Scheuermann
• Potential of a Unique Antibody Gene Signature to Predict Conversion to Clinically Definite Multiple Sclerosis
  A.J. Ligocki, L. Lovato, D. Xiang, P. Guidry, R.H. Scheuermann, S.N. Willis, S. Almendinger, M.K. Racke, E.M. Frohman, D.A. Hafler, K.C. O'Connor, N.L. Monson
• Analysis of DRB1 Sequence Feature Variant Type Associations with Systemic Sclerosis Autoantibodies Types and Racial Groups
  Nishanth Marthandan, Paula Guidry, Glenys Thomson, Frank Arnett, David R. Karp, Richard H. Scheuermann
• An automated analysis and visualization pipeline for identification and comparison of cell populations in high-dimensional flow cytometry data
  Yu Qian, David Dougall, Megan Kong, Paula Guidry, and Richard H. Scheuermann
Abstracts & Posters - Infectious Diseases
• Influenza Research Database (IRD): A Web-based Resource for Influenza Virus Data & Analysis
  Victoria Hunt, R. Burke Squires, Jyothi Noronha, Ed Klem, Jon Dietrich, Chris Larsen, Richard H. Scheuermann
• Tool for Identifying Sequence Variations that Correlate with Virus Phenotypic Characteristics
  Brett Pickett, Prabakaran Ponraj, Victoria Hunt, Mengya Liu, Liwei Zhou, Sanjeev Kumar, Jonathan Dietrich, Sam Zaremba, Chris Larson, Edward B. Klem, Richard H. Scheuermann
• Conserved Epitope Regions (CER): Elucidation of Evolutionarily Stable, Immunologically Reactive Regions of Human H1N1 Influenza Viruses
  R. Burke Squires, Brett Pickett, Jyothi Noronha, Victoria Hunt, Richard H. Scheuermann
• Influenza NS1-dependent Host Range Restriction Demonstrated By Sequence Feature Variant Type Analysis
  Jyothi M. Noronha, R. Burke Squires, Mengya Liu, Victoria Hunt, Brett Pickett and Richard H. Scheuermann
HLA allele counts (IMGT HLA, March 2011)
Locus       Alleles   Unique proteins
HLA-A       1519      1119
HLA-B       2069      1601
HLA-C       1016      750
HLA-DRB     966       738
HLA-DQA1    35        26
HLA-DQB1    144       103
HLA-DPA1    28        16
HLA-DPB1    145       127
MICA        73        60
MICB        31        20
TAP         11        9
"Unique proteins" is the number of unique proteins encoded by the various alleles at each locus (shown in parentheses on the original slide). 1634 new alleles were described in 2010 alone.
HLA and autoimmune disease Robbins Pathologic Basis of Disease 6th Edition (1999)
HLA and infectious disease • Correlation between HLA genotype and HIV viral burden and progression to AIDS • M Dean, M Carrington and SJ O'Brien Annual Review of Genomics and Human Genetics Vol. 3: 263-292 (2002)
HLA and adverse drug reaction P. Parham
HLA Allele Nomenclature
Examples: HLA-A*24020101 and HLA-A*24020102L
• A: locus
• *: asterisk
• 24: allele family (serological where possible)
• 02: amino acid difference
• 01: non-coding (silent) polymorphism
• 01 / 02: intron, 3' or 5' polymorphism
• Optional suffix: N = null, L = low, S = secreted, A = aberrant, Q = questionable
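The field layout on this slide can be illustrated with a small parser. This is only a sketch: it assumes the older, undelimited naming style used in these slides (for example DRB1*1104), and the `HlaAllele` type and `parse_hla_allele` function are hypothetical names introduced here, not part of any HLA toolkit.

```python
import re
from typing import NamedTuple, Optional


class HlaAllele(NamedTuple):
    """Fields of an undelimited HLA allele name (illustrative type)."""
    locus: str
    allele_family: str            # digits 1-2
    protein: Optional[str]        # digits 3-4: amino acid difference
    silent: Optional[str]         # digits 5-6: silent polymorphism
    noncoding: Optional[str]      # digits 7-8: intron, 3' or 5' polymorphism
    expression: Optional[str]     # optional suffix: N, L, S, A or Q


# The "HLA-" prefix is optional because the slides write both
# "HLA-A*24020101" and "DRB1*1104".
_ALLELE_RE = re.compile(
    r"^(?:HLA-)?(?P<locus>[A-Z0-9]+)\*"
    r"(?P<family>\d{2})(?P<protein>\d{2})?"
    r"(?P<silent>\d{2})?(?P<noncoding>\d{2})?"
    r"(?P<expr>[NLSAQ])?$"
)


def parse_hla_allele(name: str) -> HlaAllele:
    """Split an allele name into its fields; raise ValueError if malformed."""
    match = _ALLELE_RE.match(name.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a recognised allele name: {name!r}")
    return HlaAllele(
        locus=match["locus"],
        allele_family=match["family"],
        protein=match["protein"],
        silent=match["silent"],
        noncoding=match["noncoding"],
        expression=match["expr"],
    )


print(parse_hla_allele("HLA-A*24020102L"))
print(parse_hla_allele("DRB1*1104"))
```

A 4-digit name such as DRB1*1104 simply leaves the later fields empty, which matches the 4-digit typing resolution used later in the deck.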
DRB1 phylogeny DRB1*15 DRB1*16 DRB1*04 DRB1*10 DRB1*09 DRB1*07
DRB1 phylogeny DRB1*13 DRB1*13 DRB1*13 DRB1*13 DRB1*13 DRB1*13
DRB1 alignment 07/15 07/09 09/15
HLA-mediated disease predisposition • Hypothesis: • While the allelic/haplotypic structures reflect the evolutionary history of the locus, it is focused regions in the HLA genes/proteins, those that affect gene expression, protein structure and/or protein function, that are responsible for enhanced disease risk
Summary of SFVT approach • Define individual sequence features (SF) in HLA proteins (genes) • Determine the extent of polymorphism for each sequence feature by defining the observed variant types (VT) • Re-annotate HLA typing information with complete list of VT for each SF • Examine the association between every sequence feature variant type and disease or other phenotype
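The first three steps above can be sketched in a few lines. Everything in this example is made up for illustration: the allele names, the short sequences, the two sequence features and the rule of numbering variant types in order of first appearance are all assumptions, not the ImmPort definitions.

```python
from typing import Dict, List

# Hypothetical toy data: made-up allele names and protein fragments.
ALLELE_SEQS: Dict[str, str] = {
    "X*0101": "MKTAYLE",
    "X*0102": "MKTAFLE",
    "X*0201": "MRTAYLD",
}

# Hypothetical sequence features: name -> 1-based positions.
# Positions need not be contiguous.
SEQUENCE_FEATURES: Dict[str, List[int]] = {
    "SF1": [1, 2, 3],
    "SF2": [5, 7],
}


def extract_variant(seq: str, positions: List[int]) -> str:
    """Residues of one sequence at the positions of one sequence feature."""
    return "".join(seq[p - 1] for p in positions)


def define_variant_types(allele_seqs, sequence_features):
    """Map each SF to {variant residues: VT number}.

    Numbering follows order of first appearance over sorted allele names,
    which is an arbitrary choice made for this sketch.
    """
    vt_tables = {}
    for sf, positions in sequence_features.items():
        table: Dict[str, int] = {}
        for allele in sorted(allele_seqs):
            variant = extract_variant(allele_seqs[allele], positions)
            table.setdefault(variant, len(table) + 1)
        vt_tables[sf] = table
    return vt_tables


def annotate_allele(allele, allele_seqs, sequence_features, vt_tables):
    """Re-annotate one allele as a list of SFVT labels."""
    seq = allele_seqs[allele]
    return [
        f"{sf}_VT{vt_tables[sf][extract_variant(seq, positions)]}"
        for sf, positions in sequence_features.items()
    ]


vt_tables = define_variant_types(ALLELE_SEQS, SEQUENCE_FEATURES)
for allele in sorted(ALLELE_SEQS):
    print(allele, "=>", "; ".join(
        annotate_allele(allele, ALLELE_SEQS, SEQUENCE_FEATURES, vt_tables)))
```

Because two alleles can share a variant type for one feature while differing at another, each feature yields its own grouping of the same alleles.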
TCR Binding CD8 Binding A*0201 - ‘CD8 binding’ & ‘TCR binding’ SF
Summary of SFs defined 1775 total
Variant Types for Hsa_HLA-DRB1_beta-strand 2_peptide antigen binding
HLA SFVT Association with Systemic Sclerosis
• Summary of data set
  • Systemic sclerosis (SSc, scleroderma) is a chronic condition characterized by altered immune reactivity, thickened skin, endothelial dysfunction, interstitial fibrosis, gangrene, pulmonary hypertension, gastrointestinal tract dysmotility, and renal arteriolar dysfunction.
  • A large cohort of ~1300 SSc patients and ~1000 healthy controls has been assembled by Drs. Frank C. Arnett, John Reveille and colleagues at the University of Texas Health Science Center at Houston.
  • Information on autoantibody reactivity for over 15 nuclear antigens is available.
  • 4-digit typing has been done for DRB1, DQA1, and DQB1 in all individuals.
• Initial re-annotation of 4-digit DRB1 typing data
  • DRB1*1104 => SF1_VT43; SF2_VT4; SF3_VT12 ………
• Statistical analysis
  • Split data set into two pseudo-replicates
  • 2 x n contingency table for every SF (286), where n = number of VT
  • Chi-squared or Fisher's Exact Test analysis
  • Select SF with adjusted p-value <0.01 (83/286)
  • 2 x 2 contingency table (type vs non-type) for every VT (418 total)
  • Merge results of pseudo-replicates
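The 2 x 2 (type vs non-type) step can be illustrated with a self-contained Fisher's exact test. This is a sketch only: the counts are hypothetical, the function names are introduced here, and the slides do not say which p-value adjustment was used, so no adjustment is shown.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability of x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)   # tolerance for float ties
    )
    return min(1.0, p_value)


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Sample odds ratio (a*d)/(b*c); infinite when b*c is zero."""
    return float("inf") if b * c == 0 else (a * d) / (b * c)


# Hypothetical counts, not taken from the study.
# Rows: VT present / VT absent. Columns: cases / controls.
a, b, c, d = 3, 1, 1, 3
p = fisher_exact_two_sided(a, b, c, d)
print(f"odds ratio = {odds_ratio(a, b, c, d):.2f}, p = {p:.4f}")
```

An odds ratio above 1 would mark the variant type as risk-associated and one below 1 as protective, which is how the later slides label SFVTs.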
DRB1*0101 Visualization protective risk 67F 67I 70D 70D 71R 71R 28D 28E 26F 26F 30Y 30L 37Y 37F 86V 86G
Limitations to initial study • Did not take into account differences in allele frequency distributions among racial populations • Treated SSc as a single disease, although it has two main subtypes • Limited cutaneous involvement, associated with pulmonary hypertension; 60-70% are anti-centromere positive • Diffuse cutaneous involvement, associated with more interstitial lung disease and kidney involvement; 30% are anti-topo positive • The two antibodies tend to be mutually exclusive
Auto-antibody SFVT associations • Separated SSc participants based on presence of anti-topoisomerase or anti-centromere auto-antibody (cases only) • 231 anti-topoisomerase • 318 anti-centromere • 3 both • 752 neither • SSc with anti-topo vs SSc without anti-topo • SSc with anti-cent vs SSc without anti-cent
Overlap of top 100 SFVTs 75 28 72 Anti-centromere SFVTs Anti-topoisomerase SFVTs
28 common SFVTs Anti-topoisomerase SFVTs Anti-centromere SFVTs Protective Risky 10 18 0 18 0 10 Anti-centromere Anti-topoisomerase Anti-centromere Anti-topoisomerase
Risky vs Risky Risky vs Protective 39 0 40 21 18 22 Anti-centromere risky SFVTs Anti-topoisomerase risky SFVTs Anti-centromere risky SFVTs Anti-topoisomerase risky SFVTs Protective vs Risky Protective vs Protective 2 10 30 12 0 40 Anti-centromere protective SFVTs Anti-topoisomerase risky SFVTs Anti-centromere protective SFVTs Anti-topoisomerase protective SFVTs
Table 7. Some of the SFVTs significantly associated with the presence of anti-centromere autoantibody Anti-centr 9W_28E_30C_47Y_67L Anti-topo 9E_28D_30Y_47F_67F Table 8. Some of the SFVTs significantly associated with presence of anti-topoisomerase autoantibody Hsa_HLA-DRB1_SF137_VT25 (all SSc) 1.85 1.38 e-07
ImmPort HLA SFVT Workflow • Table of subject vs. HLA 4-digit typing data • Table of subject vs. SFVT feature vector • Table of p-values, adj. p-values, odds ratio, confidence intervals • [Figure labels: TCR Binding, CD8 Binding]
Summary
• SFVT Approach
  • Proposed a novel approach for HLA disease associations based on sequence feature variant type (SFVT) analysis
  • Defined structural and functional protein sequence features (SF) for all classical human MHC class I and II proteins
  • Determined variant types (VT) for all SF in known alleles
  • Available in ImmPort www.immport.org, IMGT-HLA and dbMHC
• Systemic Sclerosis Analysis
  • Based on the SFVT approach, identified a region of the HLA-DRB1 protein centered around peptide-binding pocket 7 that appears to be associated with disease risk
  • Sequences found in HLA-DRB1*1104 at positions 28, 30, 37, 67 and 86, especially with aromatic amino acids, were associated with increased disease risk
  • Sequences found in this region of HLA-DRB1*0302 appear to be protective
  • Different alleles are associated with altered risk in different racial/ethnic populations, but they share common SFVTs
  • SFVTs associated with risk of developing SSc are different in patients with anti-topo versus anti-cent antibodies, supporting the idea that these are distinct diseases
  • However, the risk-associated SFVTs are from the same SFs, suggesting a common mechanism of disease pathogenesis
Public Health Impact of Influenza • Seasonal flu epidemics occur yearly during the fall/winter months and result in 3-5 million cases of severe illness worldwide. • More than 200,000 people are hospitalized each year with seasonal flu-related complications in the U.S. • Approximately 36,000 deaths occur due to seasonal flu each year in the U.S. • Populations at highest risk are children under age 2, adults age 65 and older, and groups with other comorbidities. Source: World Health Organization - http://www.who.int/mediacentre/factsheets/fs211/en/index.html
Flu pandemics of the 20th and 21st centuries
• 1918 flu pandemic (Spanish flu)
  • H1N1 subtype
  • The most severe pandemic
  • Estimated to claim 2.5% - 5% of world's population (20 - 100 million deaths)
• Asian flu (1957 - 1958)
  • H2N2 subtype
  • 1 - 1.5 million deaths
• Hong Kong flu (1968 - 1969)
  • H3N2 subtype
  • 750,000 - 1 million deaths
• 2009 pandemic
  • H1N1
  • >16,000 deaths as of March 2010
Influenza Virus Orthomyxoviridae family Negative-strand RNA Segmented Enveloped 8 RNA segments encode 11 proteins Classified based on serology of HA and NA
SFVT approach
Example SFs: Influenza A_NS1_nuclear-export-signal_137(10); Influenza A_NS1_alpha-helix_171(17)
Variant types shown on the slide (10 residues each):
VT-1    I F D R L E T L I L
VT-2    I F N R L E T L I L
VT-3    I F D R L E T I V L
VT-4    L F D Q L E T L V S
VT-5    I F D R L E N L T L
VT-6    I F N R L E A L I L
VT-7    I Y D R L E T L I L
VT-8    I F D R L E T L V L
VT-9    I F D R L E N I V L
VT-10   I F E R L E T L I L
VT-11   L F D Q M E T L V S
• Identify regions of protein/gene with known structural or functional properties: Sequence Features (SF)
  • an alpha-helical region, the binding site for another protein, an enzyme active site, an immune epitope
• Determine the extent of sequence variation for each SF by defining each unique sequence as a Variant Type (VT)
• High-level, comprehensive grouping of all virus strains by VT membership for each SF independently
• Genotype-phenotype association statistical analysis (virulence, pathogenesis, host range, immune evasion, drug resistance)
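Grouping strains by VT membership and tabulating a phenotype such as host can be sketched as follows. The three VT sequences are taken from the slide; the strain names, host labels and function name are hypothetical placeholders, and the resulting counts carry no biological meaning.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# VT sequences as listed on the slide (first three only).
VT_BY_SEQUENCE: Dict[str, str] = {
    "IFDRLETLIL": "VT-1",
    "IFNRLETLIL": "VT-2",
    "IFDRLETIVL": "VT-3",
}

# Hypothetical strains: (name, residues in the sequence feature, host).
STRAINS: List[Tuple[str, str, str]] = [
    ("strain_a", "IFDRLETLIL", "avian"),
    ("strain_b", "IFDRLETLIL", "avian"),
    ("strain_c", "IFDRLETLIL", "human"),
    ("strain_d", "IFNRLETLIL", "human"),
    ("strain_e", "IFDRLETIVL", "swine"),
]


def host_counts_by_vt(strains, vt_by_sequence):
    """Cross-tabulate variant type against host for one sequence feature."""
    table = defaultdict(Counter)
    for _name, sf_residues, host in strains:
        # Sequences not in the lookup are kept, not silently dropped.
        vt = vt_by_sequence.get(sf_residues, "unassigned")
        table[vt][host] += 1
    return {vt: dict(counts) for vt, counts in table.items()}


counts = host_counts_by_vt(STRAINS, VT_BY_SEQUENCE)
for vt in sorted(counts):
    print(vt, counts[vt])
```

A table like this is the input to the genotype-phenotype association step; a VT seen in only one host would then need to be checked against sampling and founder effects before being read as a host range restriction.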
DO VARIATIONS IN NS1 SEQUENCE FEATURES INFLUENCE INFLUENZA VIRUS HOST RANGE?
Causes of apparent NS1 VT-associated host range restriction • Virus spread = capability + opportunity • Phenotypic property of the virus – limited capacity • Restricted founder effect – limited opportunity • Restricted spatial-temporal distribution • Sampling bias – assumption of random sampling • Oversampling – avian H5N1 in Asia; 2009 H1N1 • Undersampling – large and domestic cats • Linkage to causative variant