
High dimensional genomic data, identifiability, and query-response

High dimensional genomic data, identifiability, and query-response. Haixu Tang, School of Informatics and Computing, Indiana University, Bloomington. “Big Data” in Personal Genomics. Genomics is a key component of personalized medicine Massive


Presentation Transcript


  1. High dimensional genomic data, identifiability, and query-response
     Haixu Tang
     School of Informatics and Computing
     Indiana University, Bloomington

  2. “Big Data” in Personal Genomics
     • Genomics is a key component of personalized medicine
     • Massive
       • Large research-oriented projects: from 1000 genomes to 10^6
       • Genome sequencing for all newborns?
       • Open data projects, e.g., the Personal Genome Project (PGP)
     • Heterogeneous
       • Genomic sequence (variations)
       • Constant, dynamic monitoring
       • Transcriptomics, proteomics, metabolomics, microbial communities, etc. (as demonstrated by iPOP)

  3. Challenges in Personal Genomics
     • Challenges: speed, storage, scalability, security
     • Solution: cloud, hybrid cloud, bring computing to the data!

  4. Privacy Enhancing Technologies
     • Database security approaches: access control, query auditing, differential privacy
     • Cryptographic protocols: secure multiparty computation (SMC), homomorphic computation, functional encryption
     • Ethics studies, informed consent, policy
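
As an illustration of one item on this slide, differential privacy, here is a minimal sketch of the standard Laplace mechanism applied to a carrier-count query. It is not taken from the presentation: the function names `laplace_noise` and `noisy_count`, the 0/1 carrier encoding, and the choice of a count query are assumptions made for the example.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(carriers, epsilon, rng):
    """Release the number of allele carriers with Laplace noise.

    carriers: list of 0/1 flags, one per participant (hypothetical encoding).
    epsilon:  privacy parameter; smaller values add more noise.
    Adding or removing one participant changes the count by at most 1,
    so the sensitivity of the query is 1 and the noise scale is 1/epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(carriers)
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random()
    cohort = [1, 0, 1, 1, 0]  # toy data, not real genotypes
    print(noisy_count(cohort, epsilon=0.5, rng=rng))
```

The released value is random on every call; only its average over many releases approaches the true count, which is the security/utility trade-off in miniature.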

  5. What is specific for genomic data?
     • Challenges
       • Genome technologies evolve very fast!
       • Genomic data are extremely high dimensional
         • Millions of SNPs, easily identifiable
       • Balance between data security and utility
       • Not only the data, but also analysis results need to be protected
         • Allele frequencies or test statistics (e.g., Homer’s attack)
     • Special properties
       • Different dimensions are NOT independent
         • Genetic structures (e.g., linkage disequilibrium)
       • Specific genomic research focuses on a small number of dimensions (e.g., disease-associated SNPs)
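
To make the point about published allele frequencies concrete, the following is a hedged sketch of the kind of distance statistic used in Homer's attack. The slide only names the attack; the per-SNP form below, D_j = |y_j - ref_j| - |y_j - pool_j|, follows the commonly cited description of it. The function name `homer_statistic` and the genotype encoding (0, 0.5, or 1 per SNP) are assumptions for illustration.

```python
import math


def homer_statistic(y, pool_freq, ref_freq):
    """Compare one individual's genotypes with two sets of allele frequencies.

    y:         the individual's allele dosage per SNP, encoded as 0, 0.5 or 1
    pool_freq: published allele frequencies of the study pool
    ref_freq:  allele frequencies of a reference population
    Returns (mean of D_j, one-sample t statistic of D_j against zero).
    Large positive values mean the individual is closer to the pool
    than to the reference, which hints at membership in the pool.
    """
    if not (len(y) == len(pool_freq) == len(ref_freq)):
        raise ValueError("all inputs must cover the same SNPs")
    n = len(y)
    if n < 2:
        raise ValueError("need at least two SNPs")
    d = [abs(yj - r) - abs(yj - m) for yj, m, r in zip(y, pool_freq, ref_freq)]
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)
    if var == 0:
        # Degenerate case: every SNP gives the same D_j.
        t = 0.0 if mean == 0 else math.copysign(math.inf, mean)
    else:
        t = mean / math.sqrt(var / n)
    return mean, t
```

Each SNP contributes only a tiny signal, but with many SNPs the contributions accumulate, which is why high dimensionality makes summary statistics identifying.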
