
HMM structure:


jerome-paul


  1. Genotype: 021200210
     Two haplotypes per individual: 011000110, 001100010
     … ataggtccCtatttcgcgcCgtatacacgggActata …
     … ataggtccGtatttcgcgcCgtatacacgggTctata …
     … ataggtccCtatttcgcgcCgtatacacgggTctata …

     • HMM structure:
       • Left-to-right HMM similar to models proposed by [Schwartz 04, Rastas et al. 05, Kimmel & Shamir 05]
       • Determined by the number n of SNP loci and a user-specified number K of “founder” states at each SNP (set to 7 in our experiments)
       • Each state is allowed to emit both alleles, but training usually introduces a strong bias towards one of them
       • Paths with high transition probability correspond to “founder” haplotypes; transition probabilities capture observed (founder-specific) recombination rates
     • Efficient Likelihood Computations:
       • A trained HMM M emits haplotypes along left-to-right paths π
       • P(H|M) = sum over all possible HMM paths π of the joint probability that M follows π and emits H; efficiently computed in O(nK) time using the forward algorithm
       • P(G|M) = probability with which M emits any two haplotypes that explain G along any pair of paths; efficiently computed in O(nK^3) time by a 2-path extension of the forward algorithm combined with the speed-up idea of [Rastas07]
       • A similar speed-up can be used to compute the likelihood of genotype trios in O(nK^5) time
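
The two likelihoods on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the names `combine`, `haplotype_likelihood`, `genotype_likelihood` and the layout of `init`, `trans`, `emit` are hypothetical, the toy parameters are made up, and the two haplotypes of an individual are assumed to be emitted independently by M. The genotype routine is the plain 2-path forward recursion over pairs of states, which costs O(nK^4); the [Rastas07] speed-up mentioned above is not reproduced here. The haplotype routine sums over all K predecessor states for each of the K states at every SNP. The genotype encoding follows the example on the slide: 0 or 1 where both haplotypes agree, 2 where they differ.

```python
from itertools import product


def combine(h1, h2):
    """Genotype string for two haplotype strings: shared allele, or '2' if they differ."""
    return "".join(a if a == b else "2" for a, b in zip(h1, h2))


def haplotype_likelihood(h, init, trans, emit):
    """P(H|M) via the forward algorithm.

    Assumed (hypothetical) parameter layout:
      init[k]        probability of starting in founder state k at SNP 0
      trans[i][j][k] probability of moving from state j at SNP i to state k at SNP i+1
      emit[i][k][a]  probability that state k at SNP i emits allele a (0 or 1)
    """
    K = len(init)
    f = [init[k] * emit[0][k][int(h[0])] for k in range(K)]
    for i in range(1, len(h)):
        a = int(h[i])
        f = [emit[i][k][a] * sum(f[j] * trans[i - 1][j][k] for j in range(K))
             for k in range(K)]
    return sum(f)


def pair_emission(e1, e2, g):
    """Probability that two states with allele distributions e1, e2 explain genotype symbol g."""
    if g == 2:  # heterozygous: either state may carry allele 0
        return e1[0] * e2[1] + e1[1] * e2[0]
    return e1[g] * e2[g]  # homozygous for allele g


def genotype_likelihood(g, init, trans, emit):
    """P(G|M) via a naive 2-path forward recursion over ordered pairs of states (O(n K^4))."""
    K = len(init)
    pairs = list(product(range(K), repeat=2))
    f = {(a, b): init[a] * init[b] * pair_emission(emit[0][a], emit[0][b], int(g[0]))
         for a, b in pairs}
    for i in range(1, len(g)):
        s = int(g[i])
        f = {(a, b): pair_emission(emit[i][a], emit[i][b], s)
                     * sum(f[c, d] * trans[i - 1][c][a] * trans[i - 1][d][b]
                           for c, d in pairs)
             for a, b in pairs}
    return sum(f.values())


# Toy model (made-up numbers): n = 3 SNPs, K = 2 founder states.
init = [0.6, 0.4]
trans = [[[0.9, 0.1], [0.2, 0.8]],
         [[0.7, 0.3], [0.4, 0.6]]]
emit = [[[0.95, 0.05], [0.1, 0.9]] for _ in range(3)]

print(combine("011000110", "001100010"))  # the slide's haplotype pair
print(haplotype_likelihood("011", init, trans, emit))
print(genotype_likelihood(combine("011", "001"), init, trans, emit))
```

Under the independence assumption, `genotype_likelihood` agrees with the brute-force sum of P(H1|M)·P(H2|M) over all ordered haplotype pairs that explain G, which is a convenient sanity check on small models.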
