Determining the Number of Non-Spurious Arcs in a Learned DAG Model: Investigation of a Bayesian and a Frequentist Approach. Listgarten & Heckerman.
Purpose • Design a vaccine for HIV • By considering many patients and observing which HLA molecules cause the killer T cells of the immune system to react
Definitions • HLA = Human leukocyte antigen • Each person usually has [3;6] • Epitopes = bits of protein • Results of T-cell attacking HIV-peptide • Peptide = “small digestible” • Short chain of linked amino acids
How? • Find out which HIV peptides interact with which HLA molecules by using a graphical model.
Solution • A directed acyclic graph representing HLA and peptides • HLA nodes h1, h2, h3, h4, ..., hN (top layer) point to peptide nodes y1, y2, y3, ..., yM (bottom layer) • Model for one patient • Designing a vaccine means identifying a set of peptide-HLA pairs that are epitopes for a large part of the population
Properties • Bipartite model (2 levels) • An HLA node can have zero or more outgoing arcs • A peptide node can have zero or more incoming arcs • Each patient will have [3;6] HLA nodes that are “on” • Answers: which HLA molecule(s) is (are) responsible for a given immune system reaction
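The bipartite structure described above can be sketched in Python as a set of (HLA, peptide) arcs. The node names, the example arcs, and the helper `candidate_hlas` are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
# Hypothetical bipartite DAG: arcs go only from HLA nodes to peptide nodes.
arcs = {
    ("h1", "y1"),
    ("h2", "y1"),
    ("h2", "y3"),
    ("h4", "y2"),
}

def candidate_hlas(arcs, patient_hlas, peptide):
    """Return the patient's "on" HLA nodes that have an arc into `peptide`."""
    return sorted(h for (h, y) in arcs if y == peptide and h in patient_hlas)

# A patient carries a small set of HLA molecules that are "on".
patient = {"h1", "h2", "h3"}
print(candidate_hlas(arcs, patient, "y1"))  # ['h1', 'h2']
print(candidate_hlas(arcs, patient, "y2"))  # []
```

Storing arcs as pairs keeps the two levels explicit: an HLA node never appears as a child and a peptide node never appears as a parent.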
Two approaches • Bayesian • Frequentist
Bayesian Approach cont. 2(2) • Exponential complexity…! • Can be improved by limiting |Parent set| • Limit = 5 gives identical results
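To see why limiting the parent set helps, the following sketch counts the candidate parent sets for a single peptide node with and without a size limit. The figure of 20 HLA nodes and the function name `n_parent_sets` are assumptions made for illustration only.

```python
from math import comb

def n_parent_sets(n_hla, max_parents=None):
    """Count candidate parent sets for one peptide node."""
    if max_parents is None:
        # Every subset of the HLA nodes is a candidate: exponential in n_hla.
        return 2 ** n_hla
    # Only subsets with at most max_parents members: polynomial in n_hla.
    return sum(comb(n_hla, k) for k in range(min(max_parents, n_hla) + 1))

print(n_parent_sets(20))     # 1048576
print(n_parent_sets(20, 5))  # 21700
```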
Frequentist Approach • FDR = False Discovery Rate • Given a set of hypotheses • Hypothesis i has a test score s_i • s_i: assumed to be independent across the given hypotheses
FDR cont. 2(4) Rewrite Where is a structure search algorithm
FDR cont. 3(4) – multiple data sets • Q – number of arcs found by applying the structure search algorithm to real data, D
FDR cont. 4(4) • Standard FDR: • The average over multiple datasets • +1 – smooths the estimate
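A minimal sketch of an FDR estimate averaged over multiple datasets, assuming the expected number of spurious arcs is approximated by running the same structure search on several null datasets (for example, permuted copies of D). The function name, the null-data scheme, and the example counts are assumptions; the "+1" smoothing term is left out because its exact placement in the formula is not given in the text.

```python
def estimate_fdr(n_arcs_real, n_arcs_null):
    """Estimate FDR as (mean arcs found on null datasets) / (arcs found on real data)."""
    if n_arcs_real == 0:
        return 0.0  # nothing was reported, so nothing can be a false discovery
    expected_false = sum(n_arcs_null) / len(n_arcs_null)
    return min(1.0, expected_false / n_arcs_real)

# Hypothetical counts: 40 arcs on the real data, a handful on each null dataset.
print(estimate_fdr(40, [3, 5, 4, 4]))  # 0.1
```

Averaging over several null datasets reduces the variance of the numerator compared with using a single null run.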
Results • PPV – positive predictive value • Frequentist method: • Bayesian method:
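PPV is the fraction of reported arcs that are true arcs, so it can be computed directly whenever the true structure is known, as on synthetic data. The sketch below assumes arcs are stored as (HLA, peptide) pairs; the arc sets are made-up examples, not results from the study.

```python
def ppv(learned_arcs, true_arcs):
    """PPV = true positives / all arcs reported."""
    learned = set(learned_arcs)
    if not learned:
        return float("nan")  # undefined when no arcs are reported
    true_pos = len(learned & set(true_arcs))
    return true_pos / len(learned)

# Hypothetical ground truth and learned structure.
true_arcs = {("h1", "y1"), ("h2", "y1"), ("h4", "y2")}
learned = {("h1", "y1"), ("h2", "y1"), ("h3", "y2"), ("h4", "y2")}
print(ppv(learned, true_arcs))  # 0.75
```

Since one of the four reported arcs is spurious, the false discovery proportion in this example is 1 - PPV = 0.25.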
Results on real HIV data • 8 results… all matches