High throughput urine biomarker discovery and integrative analysis for translational medicine

Bruce Ling, Ph.D.

Presentation Transcript


  1. High throughput urine biomarker discovery and integrative analysis for translational medicine Bruce Ling, Ph.D.

  2. Biomarker A molecular indicator of a specific biological property; a biochemical feature or facet that can be used to measure the progress of disease or the effects of treatment (NIH, 2002)

  3. Biomarker examples • Small molecules • Glucose (diabetes) • Serum cholesterol (cardiovascular disease) • Proteins • PSA (prostate cancer) • HER2 (IHC) (breast cancer, Herceptin therapy) • hCG (pregnancy test) • RNA/DNA • HER2 (FISH) (breast cancer) • Oncotype DX (Genomic Health, breast cancer)

  4. Pediatric Diseases • Kidney transplant acute rejection • Kawasaki disease • Systemic juvenile idiopathic arthritis • Necrotizing enterocolitis • Inflammatory bowel disease • Glioblastoma multiforme • Preterm labor

  5. Where to look for biomarkers • Disease tissue • Proximal/distal fluids • Plasma/serum, urine, amniotic, synovial fluid, CSF, saliva, tears, etc.

  6. Why Urine? • Patient consenting • Non-invasive • Easy to collect for time course analysis • Abundant and stable

  7. Urine is a rich resource for biomarker discovery • Filtration of plasma • 900 liters daily • Urine proteome • > 1500 proteins, ~30 mg/day • 30% from circulation • 70% from urogenital tract • Urine peptidome • > 100,000 naturally occurring peptides, ~20 mg/day

  8. Urine Peptidome: a fertile ground for biomarker discovery • An equal mass of protein and peptide in urine translates into at least a ten-fold greater molar abundance of peptides than proteins • Urine peptide analysis is not hampered by highly abundant proteins • A one-hour, one-dimensional HPLC separation is sufficient for the analysis of more than 100,000 urine peptides, allowing high-throughput biomarker discovery
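
The mass-to-mole argument on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only: the average molecular masses (30 kDa for proteins, 2 kDa for peptides) are assumptions chosen for the example, not values from the presentation.

```python
# Assumed average molecular masses in Da (g/mol); illustrative, not from the slides.
AVG_PROTEIN_DA = 30_000.0
AVG_PEPTIDE_DA = 2_000.0


def micromoles(mass_mg: float, avg_mass_da: float) -> float:
    """Convert a mass in mg to micromoles for a given average molecular mass."""
    grams = mass_mg * 1e-3
    return grams / avg_mass_da * 1e6


def peptide_to_protein_molar_ratio(mass_mg: float) -> float:
    """Molar ratio of peptides to proteins when both are present at the same mass."""
    return micromoles(mass_mg, AVG_PEPTIDE_DA) / micromoles(mass_mg, AVG_PROTEIN_DA)
```

For equal masses the ratio reduces to the ratio of average molecular masses, so under these assumed values peptides are about 15 times more abundant in molar terms, in line with the "at least ten-fold" statement.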

  9. Challenges of Urine Analysis • Dilution causes concentration variations • Solution: content normalization • Creatinine; abundant housekeeping urine peptides; equal peptide mass • Peptide content can be confounded by • Diet, exercise, circadian rhythm, circulating hormone levels • Solution: careful experimental design to avoid these confounding factors, e.g., • Cohorts of patients with similar demographics • Multi-center sample collection and validation
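
Creatinine normalization, the first option listed on this slide, can be sketched as a simple division of each peptide signal by the creatinine concentration of the same sample. The function name, the dictionary layout, and the mg/dL unit below are assumptions made for illustration; the slides do not describe how the normalization was implemented.

```python
from typing import Dict


def normalize_to_creatinine(
    intensities: Dict[float, float], creatinine_mg_dl: float
) -> Dict[float, float]:
    """Divide each peptide signal (keyed by m/z) by the sample's creatinine level.

    A dilute sample has proportionally lower creatinine and lower peptide
    signal, so the ratio is comparable across samples.
    """
    if creatinine_mg_dl <= 0:
        raise ValueError("creatinine concentration must be positive")
    return {mz: signal / creatinine_mg_dl for mz, signal in intensities.items()}
```

Two samples from the same patient that differ only in dilution (for example, half the creatinine and half the peptide signal) would yield the same normalized values.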

  10. Urine Peptidome Profiling by Mass Spectrometry

  11. Biomarker HTS Flows: sample peptides (Class 1: 1, 2, 3…; Class 2: 1, 2, 3…; Class 3: 1, 2, 3…) → RP-HPLC → collect 120 fractions on MALDI plates → MALDI-TOF MS on each fraction → MASS-Conductor® machine learning feature discovery and classification → candidate biomarkers (987.62, 1027.51, 1098.55, etc.)

  12. Biomarker Confirmation/Validation [Flow diagram; stage labels: Exploration, Testing, Validation; step labels: identify differentiating markers, protein ID (MS/MS), quantitative MS, higher-throughput quantitative methods, immunoassay; sample labels: new sample sets, new center sample sets, new longitudinal sample sets]

  13. Data Challenges in Urine Peptide Biomarker Discovery • Data tracking and storage • Patient demographics • Peptide profiles in various fractions/samples • Dimension reduction and data reduction • Multi-dimensional data sets • Huge data sets and lots of noise • A project of 40 samples produced 241.5 GB of raw data in a MySQL database [Figure labels: peptide mass, patient ID, HPLC fraction, patient demographics, peptide signal]

  14. Decode the Urine Peptidome ???

  15. Decode the Urine Peptidome • Peak finding in each fraction for each sample • Align the peaks across the samples • Create common peak index
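
The three steps on this slide (peak finding, alignment across samples, common peak index) can be sketched as follows. This is a minimal illustration, not the MASS-Conductor implementation: the local-maximum peak picker, the greedy m/z clustering, the tolerance parameter `tol`, and all function names are assumptions.

```python
from typing import Dict, List, Tuple

Spectrum = List[Tuple[float, float]]  # (m/z, intensity) pairs sorted by m/z


def find_peaks(spectrum: Spectrum, min_intensity: float) -> List[float]:
    """Return the m/z of each local maximum at or above min_intensity."""
    peaks = []
    for i in range(1, len(spectrum) - 1):
        mz, inten = spectrum[i]
        if (inten >= min_intensity
                and inten > spectrum[i - 1][1]
                and inten >= spectrum[i + 1][1]):
            peaks.append(mz)
    return peaks


def build_peak_index(peaks_by_sample: Dict[str, List[float]], tol: float) -> List[float]:
    """Pool peaks from all samples and merge neighbours closer than tol.

    Returns one representative m/z (the cluster mean) per common peak.
    """
    pooled = sorted(mz for peaks in peaks_by_sample.values() for mz in peaks)
    clusters: List[List[float]] = []
    for mz in pooled:
        if clusters and mz - clusters[-1][-1] <= tol:
            clusters[-1].append(mz)
        else:
            clusters.append([mz])
    return [sum(c) / len(c) for c in clusters]


def align(peaks_by_sample: Dict[str, List[float]], index: List[float],
          tol: float) -> Dict[str, List[bool]]:
    """For each sample, flag which entries of the common index it contains."""
    return {
        sample: [any(abs(mz - ref) <= tol for mz in peaks) for ref in index]
        for sample, peaks in peaks_by_sample.items()
    }
```

The resulting presence/absence matrix (samples by indexed peaks) is the kind of table that downstream feature selection could operate on; a real pipeline would likely carry intensities rather than booleans and repeat this per HPLC fraction.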

  16. Data mining issues in Biomarker Discovery • Peak number >> sample number • False discovery in multiple hypothesis testing • Multi-class classification and validation • Discovery of biomarker signature
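
When the number of peaks far exceeds the number of samples, testing each peak separately produces many chance hits. One standard way to control this is the Benjamini-Hochberg step-up procedure, sketched below. It is shown as a generic illustration of false discovery control; the slides do not state that this particular procedure was used.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Return one reject flag per hypothesis, controlling the FDR at alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            reject[i] = True
    return reject
```

Each p-value would come from a per-peak test comparing the classes (for example, acute rejection versus stable graft); only peaks flagged `True` would move on as candidate features.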

  17. MASS-Conductor® Platform Supports Urine Peptide Biomarker Discovery • Robust loading and tracking of high-volume proteomic data • Robust reduction of raw data sets, enabling efficient and accurate peak finding, alignment and indexing • Robust, automatic high-throughput computing for expensive algorithms • Integration of FDR analysis and multi-class classification algorithms to obtain statistically differentiating feature panels • Automatic generation of data reports with graphics

  18. MASS-Conductor® Platform High Throughput Computing

  19. Urine Biomarker Discovery: Case Study

  20. Kidney Transplant Rejection • Most effective treatment for end stage renal disease • 16,000 per year in US • Grafts monitored by biopsy • Unmet needs: • Less invasive and more frequent monitoring • Acute rejection vs. stable graft • Acute rejection vs. BK virus

  21. Allograft Acute Rejection Urine Biomarker Discovery [Workflow diagram] • Step 1: LCMS data reduction (peak finding, peak alignment, peak indexing) • Step 2: Supervised data mining (feature selection, training, testing) • Step 3: Unsupervised data mining (2D clustering) • Step 4: Quantitative LCMS (validation)

  22. Biomarker Panel: Supervised Analysis

  23. Biomarker Panel: Unsupervised Analysis

  24. Urine THP Peptide Biomarkers Fall into a Tight Cluster in C-Terminus [Domain diagram, NH2 to COOH: EGF-like Domain I (28-64), EGF-like Domain II (65-107), EGF-like Domain III (108-149), ZP-domain (334-585)] 1. R.VLNLGPITR.K 2. G.SVIDQSRVLNLGPI.T 3. I.DQSRVLNLGPITR.K 4. R.SGSVIDQSRVLNLGPI.T 5. S.VIDQSRVLNLGPITR.K 6. R.SGSVIDQSRVLNLGPIT.R 7. G.SVIDQSRVLNLGPITR.K 8. R.SGSVIDQSRVLNLGPITR.K

  25. MRM: Multiplexed Quantitative Biomarker Validation

  26. ROC Analysis of THP Peptide Biomarkers Quantified by MRM [Two ROC panels, sensitivity vs. 1 - specificity; sample: urine peptides. Panels: AR versus BK, AR versus STA. Curves in each panel: THP 1680.98 VIDQSRVLNLGPITR and THP 1912.07 SGSVIDQSRVLNLGPITR. AUC values shown: 0.83, 0.92, 0.74, 0.83]
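
The AUC reported for each peptide summarizes how well its MRM signal separates two groups. A minimal sketch of that calculation, using the rank-based (Mann-Whitney) definition of AUC, is given below. The function name and the convention that higher scores indicate the positive class are assumptions; for a peptide that decreases in the positive class, the scores would be negated first.

```python
from typing import Sequence


def roc_auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Area under the ROC curve.

    Computed as the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one; ties count as half.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.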

  27. AR Urine Biomarkers are Collagen and THP Peptides [Panels A and B: collagen peptide biomarkers, THP peptide biomarkers] THP peptide biomarkers: • THP 982.59 VLNLGPITR • THP 1047.48 SGSVIDQSRV • THP 1211.66 DQSRVLNLGPI • THP 1225.69 SRVLNLGPITR • THP 1324.76 IDQSRVLNLGPI • THP 1423.83 VIDQSRVLNLGPI • THP 1468.82 DQSRVLNLGPITR • THP 1510.87 SVIDQSRVLNLGPI • THP 1567.91 GSVIDQSRVLNLGPI • THP 1581.91 IDQSRVLNLGPITR • THP 1654.91 SGSVIDQSRVLNLGPI • THP 1680.98 VIDQSRVLNLGPITR • THP 1755.96 SGSVIDQSRVLNLGPIT • THP 1768.01 SVIDQSRVLNLGPITR • THP 1912.07 SGSVIDQSRVLNLGPITR • THP 2040.16 SGSVIDQSRVLNLGPITRK Collagen peptide biomarkers: • COL1A1 1235.56 APGDRGEPGPPGP • COL1A1 1251.55 APGDRGEPGPPGP • COL1A1 1322.57 APGDRGEPGPPGPA • COL1A1 1316.59 DAGPVGPPGPPGPPG • COL1A1 1409.66 GPPGPPGPPGPPGPPS • COL1A1 2048.92 NGDDGEAGKPGRPGERGPPGP • COL1A1 2064.91 NGDDGEAGKPGRPGERGPPGP • COL1A1 2192.97 NGDDGEAGKPGRPGERGPPGPQ • COL1A1 2362.12 GKNGDDGEAGKPGRPGERGPPGPQ • COL1A1 2378.10 GKNGDDGEAGKPGRPGERGPPGPQ • COL1A1 2645.24 GPPGKNGDDGEAGKPGRPGERGPPGPQ • COL1A1 1709.79 PPGEAGKPGEQGVPGDLG • COL1A1 2031.95 PPGEAGKPGEQGVPGDLGAPGP • COL1A1 2221.97 ADGQPGAKGEPGDAGAKGDAGPPGP • COL1A1 2205.99 ADGQPGAKGEPGDAGAKGDAGPPGP • COL1A1 2277.01 ADGQPGAKGEPGDAGAKGDAGPPGPA • COL1A1 2293.01 ADGQPGAKGEPGDAGAKGDAGPPGPA • COL1A1 2617.15 GPPGADGQPGAKGEPGDAGAKGDAGPPGPA • COL1A1 2086.93 EGSPGRDGSPGAKGDRGETGPA • COL1A1 2157.96 AEGSPGRDGSPGAKGDRGETGPA • COL1A1 3014.41 ESGREGAPGAEGSPGRDGSPGAKGDRGETGPA • COL1A1 1266.58 SPGPDGKTGPPGPA • COL1A1 2129.99 DGKTGPPGPAGQDGRPGPPGPPG • COL1A1 2017.93 GRPGEVGPPGPPGPAGEKGSPG • COL1A2 2081.94 DGPPGRDGQPGHKGERGYPG • COL1A2 2195.99 NDGPPGRDGQPGHKGERGYPG • COL2A1 1861.85 SNGNPGPPGPPGPSGKDGPK • COL3A1 1738.76 NDGAPGKNGERGGPGGPGP • COL3A1 2008.93 DGESGRPGRPGERGLPGPPG • COL3A1 2079.92 DAGAPGAPGGKGDAGAPGERGPPG • COL3A1 2565.18 GAPGQNGEPGGKGERGAPGEKGEGGPPG • COL3A1 2743.24 KNGETGPQGPPGPTGPGGDKGDTGPPGPQG • COL4A1 1424.66 PGQQGNPGAQGLPGP • COL4A2 1126.51 GLPGLPGPKGFA • COL4A3 1161.52 GEPGPPGPPGNLG • COL4A4 1218.55 GLPGPPGPKGPRG • COL4A5 1144.52 GPPGPPGPLGPLG • COL4A5 1269.53 PGLDGMKGDPGLP • COL4A5 1733.76 GIKGEKGNPGQPGLPGLP • COL4A6 1158.52 GLPGPPGPPGPPS • COL5A1 1748.82 KGPQGKPGLAGMPGANGPP • COL7A1 1690.80 PGLPGQVGETGKPGAPGR • COL9A1 1732.84 KRPDSGATGLPGRPGPPG • COL11A1 1441.64 GPPGPPGLPGPQGPKG • COL11A1 1828.84 DGPPGPPGERGPQGPQGPV • COL17A1 1368.62 LPGPPGPPGSFLSN • COL18A1 1142.51 GPPGPPGPPGPPS

  28. Hypotheses of Molecular Mechanisms for AR Urine Biomarkers • Hypothesis 1: Gene expression alteration in AR • Hypothesis 2: Protease expression alteration in AR • Hypothesis 3: Protease inhibitor expression alteration in AR

  29. Transcriptome Analysis of Allograft Biopsies [Workflow diagram] • Step 1, exploration analysis: exploration data set6 (TGCG), Affymetrix HG-U95Av2 (AR: PBL, n=6; BX, n=7) (STA: PBL, n=9; BX, n=10) (NR: PBL, n=8; BX, n=5) (HC: PBL, n=8; BX, n=9) • Step 2, confirmation analysis: confirmation data set (Stanford), Affymetrix HU-133 (AR: BX, n=37) (HC: BX, n=23) • Step 3, validation analysis: validation data set (Stanford), quantitative RT-PCR (AR: BX, n=14) (STA: BX, n=10) (HC: BX, n=10) • Analyses: expression analysis of peptide biomarkers' corresponding precursor genes; expression analysis of metzincin superfamily genes; expression analysis of protease inhibitor genes • Remaining diagram labels: Discovery, Confirmation, Validation, mechanism biomarkers

  30. Parental Protein Expression Analysis of Allograft Biopsies Contrasting Urine Peptide Biomarker Changes

  31. Genome-wide Protease and Protease Inhibitor Expression Analysis of Allograft Biopsies Revealed Up Regulation of MMP7, SERPING1, TIMP1

  32. Allograft Biopsies Expression Biomarkers Effectively Classified AR [Left panel: signal intensity of COL1A2, COL3A1, MMP7, SERPING1, TIMP1 and UMOD in AR, HC and STA. Right panel: ROC curve, sensitivity vs. 1 - specificity; mean AUC: 0.98]

  33. Proposed Underlying Mechanisms for the AR Urine Peptide Biomarkers

  34. Hypothesis: Collagen Breakdown and Deposition in AR [Integrated analysis diagram; data sources: urine peptidomics (urine peptide analysis by MS), biopsy gene expression (GSE 14328), renal biopsy] • Increased TIMP1 (collagenase inhibitor) in AR • Decreased collagenase activity in AR tissue • Increased MMP7 in AR • Urine: decreased collagen peptides in AR • Decreased collagen breakdown in AR • Increased collagen deposition in AR • Increased collagen expression in AR • More graft fibrosis after an AR episode?

  35. Urine Biomarker Discovery: Case Study

  36. Unmet Medical Needs in Necrotizing Enterocolitis Necrotizing enterocolitis (NEC) is a medical condition primarily seen in premature infants, where portions of the bowel undergo necrosis (tissue death). Despite decades of research, the pathogenesis of NEC remains obscure, the diagnostic parameters unclear, and both treatment and prevention strategies inadequate and dated. There is a real need for better molecular identification of NEC in order to assist in altering its onset and progression.

  37. Clinical parameters do not adequately predict outcome in Necrotizing Enterocolitis

  38. Clinical Parameters Based Model Stratifies Necrotizing Enterocolitis Patients [Plot: rate of NEC-S occurrence (% patients) vs. NEC score; legend: NEC M, S. Low risk group: M n = 26, S n = 0. Intermediate risk group: M n = 16, S n = 10. High risk group: M n = 2, S n = 15]

  39. NEC Urine Naturally Occurring Peptide Biomarker Discovery [Workflow diagram] • Step 1: LCMS data reduction (peak finding, peak alignment, peak indexing) • Step 2: Supervised data mining (feature selection, training, testing) • Step 3: Unsupervised data mining (2D clustering)

  40. Biomarker Panel: Supervised Analysis (Training and Testing)

  41. Biomarker Panel: Unsupervised Analysis

  42. Biomarker Panel: Combined data set and ROC analysis

  43. Permutation based FDR analysis of the biomarker signature
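
A permutation-based FDR estimate can be sketched as follows: count how many features pass a threshold with the true class labels, then compare that with the average count obtained after shuffling the labels. The test statistic (absolute difference in class means), the threshold parameter, and all names below are assumptions for illustration; the slide does not give the details of the analysis that was actually run.

```python
import random
from typing import List, Sequence


def mean_diff(values: Sequence[float], labels: Sequence[int]) -> float:
    """Absolute difference between the class-1 and class-0 means."""
    a = [v for v, lab in zip(values, labels) if lab == 1]
    b = [v for v, lab in zip(values, labels) if lab == 0]
    return abs(sum(a) / len(a) - sum(b) / len(b))


def permutation_fdr(features: List[Sequence[float]], labels: Sequence[int],
                    threshold: float, n_perm: int = 200, seed: int = 0) -> float:
    """Estimate the FDR among features whose statistic reaches the threshold.

    FDR is approximated by (mean number of features passing under shuffled
    labels) / (number passing with the true labels), capped at 1.
    """
    rng = random.Random(seed)
    observed = sum(mean_diff(f, labels) >= threshold for f in features)
    if observed == 0:
        return 0.0
    shuffled = list(labels)
    null_total = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # keeps class sizes, breaks the label-signal link
        null_total += sum(mean_diff(f, shuffled) >= threshold for f in features)
    expected_false = null_total / n_perm
    return min(1.0, expected_false / observed)
```

Here each entry of `features` holds one peptide's signal across all samples, and `labels` must contain both classes (coded 0 and 1) so that each class mean is defined.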

  44. Proposed Ensemble Approach to Diagnose Necrotizing Enterocolitis Patients. Discovery set: n = 34 NEC patients; clinical diagnosis: M = 17, S = 17. Clinical model (medical NEC scoring) risk groups: Low n=7, Intermediate n=15, High n=9, N/A n=3.

     | | Low: M | Low: S | Intermediate: M | Intermediate: S | High: M | High: S |
     |---|---|---|---|---|---|---|
     | Clinical diagnosis | 7 | 0 | 9 | 6 | 0 | 9 |
     | NEC risk model: diagnosed as M | 7 | 0 | 4 | 3 | 0 | 1 |
     | NEC risk model: diagnosed as S | 0 | 0 | 5 | 3 | 0 | 8 |
     | Urine biomarkers: classified as M | 7 | 0 | 8 | 1 | 0 | 0 |
     | Urine biomarkers: classified as S | 0 | 0 | 1 | 5 | 0 | 9 |
     | Percent agreement with clinical diagnosis | 100 % | 100 % | 88.9 % | 83.3 % | 100 % | 100 % |

     NEC diagnosis agreement per risk group (low, intermediate, high): 100 %, 86.1 %, 100 %; P = 0.01
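
The percent-agreement figures on this slide can be reproduced from the counts: for each clinical class, divide the number of patients the urine classifier assigned to that class by the number clinically diagnosed with it. The sketch below uses the intermediate-risk counts as read from the slide; the function name and the choice to return `None` for empty groups are assumptions (the slide displays 100 % for groups with no patients).

```python
from typing import Optional


def percent_agreement(correct: int, total: int) -> Optional[float]:
    """Percent of clinically diagnosed cases that the classifier matched.

    Returns None when the group is empty, since the ratio is undefined.
    """
    if total == 0:
        return None
    return round(100.0 * correct / total, 1)


# Intermediate-risk group, urine peptide classifier:
# (classified in agreement with the clinic, clinically diagnosed)
INTERMEDIATE = {"M": (8, 9), "S": (5, 6)}

agreement_m = percent_agreement(*INTERMEDIATE["M"])
agreement_s = percent_agreement(*INTERMEDIATE["S"])
group_mean = round((agreement_m + agreement_s) / 2, 1)
```

The 86.1 % shown for the intermediate group is consistent with the mean of the M and S agreements (88.9 % and 83.3 %) rather than the pooled 13 of 15; that reading is an inference from the numbers, not something the slide states.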

  45. Overlapping Urine Peptide Biomarkers for NEC TABLE 2

  46. Proposed Underlying Mechanisms of Urine Naturally Occurring Peptide Biomarkers

  47. Prediction of drug response in SJIA [Figure panels A, B, C; treatment labels: Enbrel, Anakinra; response labels: CR, PR]

  48. Urine peptide biomarkers: the discovery process: sample peptides (Class 1: 1, 2, 3…; Class 2: 1, 2, 3…; Class 3: 1, 2, 3…) → SCX/RP-HPLC → collect 100 fractions on MALDI plates → MALDI-TOF MS for each sample (LC fraction, m/z, abundance) → MASS-Conductor® machine learning feature discovery and classification → biomarker panels → MS/MS protein ID → prospective validation with quantitative mass spec (MRM)