Comp. Genomics Recitation 6 14/11/06 ML and EM
Outline • Maximum likelihood estimation • HMM Example • EM • Baum-Welch algorithm
Maximum likelihood • One of the methods for parameter estimation • Likelihood: L = P(Data | Parameters) • Simple example: • A simple coin with P(head) = p • 10 coin tosses • 6 heads, 4 tails • L = P(Data | Params) = C(10,6) · p^6 · (1-p)^4
Maximum likelihood • We want to find the p that maximizes L = C(10,6) · p^6 · (1-p)^4 • Infi 1, remember? • Log is a monotonically increasing function, so we can instead optimize log L = log[C(10,6) · p^6 · (1-p)^4] = log C(10,6) + 6·log p + 4·log(1-p) • Differentiating with respect to p and equating to zero: 6/p - 4/(1-p) = 0 • Estimate for p: 0.6 (makes sense?)
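As a sanity check, here is a small Python sketch (not from the recitation; the helper name neg_log_likelihood is mine) that numerically maximizes this log-likelihood and recovers the same value as the closed-form estimate 6/10:

```python
import numpy as np
from scipy.optimize import minimize_scalar

heads, tails = 6, 4

def neg_log_likelihood(p):
    # log L(p) up to the constant log C(10,6), which does not affect the argmax
    return -(heads * np.log(p) + tails * np.log(1 - p))

# numerical maximization over (0, 1)
result = minimize_scalar(neg_log_likelihood, bounds=(1e-9, 1 - 1e-9), method="bounded")
print(result.x)                   # ~0.6
print(heads / (heads + tails))    # closed-form ML estimate, also 0.6
```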
ML in Profile HMMs • Emission probabilities • Mi → a • Ii → a • Transition probabilities • Mi → Mi+1 • Mi → Di+1 • Mi → Ii • Ii → Mi+1 • Ii → Ii • Di → Di+1 • Di → Mi+1 • Di → Ii http://www.cs.huji.ac.il/~cbio/handouts/Class6.ppt
Parameter Estimation for HMMs Input: X1,…,Xn independent training sequences Goal: estimation of θ = (A, E) (model parameters) Note: P(X1,…,Xn | θ) = ∏i=1…n P(Xi | θ) (indep.) l(X1,…,Xn | θ) = log P(X1,…,Xn | θ) = Σi=1…n log P(Xi | θ) Case 1 - Estimation When State Sequence is Known: Akl = #(k→l transitions that occurred) Ek(b) = #(emissions of symbol b that occurred in state k) Max. Likelihood Estimators: • akl = Akl / Σl’ Akl’ • ek(b) = Ek(b) / Σb’ Ek(b’) Small-sample or prior-knowledge correction (pseudocounts): A’kl = Akl + rkl E’k(b) = Ek(b) + rk(b)
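A minimal sketch of Case 1, assuming a toy Python representation of labeled sequences (the function name ml_estimate and the single shared pseudocount value are my own illustration): count transitions and emissions along the known state paths, add the pseudocounts r, then normalize.

```python
from collections import defaultdict

def ml_estimate(labeled_seqs, states, alphabet, pseudocount=1.0):
    """labeled_seqs: list of (symbols, state_path) pairs with a known state path."""
    A = defaultdict(lambda: defaultdict(lambda: pseudocount))  # A'_kl = A_kl + r_kl
    E = defaultdict(lambda: defaultdict(lambda: pseudocount))  # E'_k(b) = E_k(b) + r_k(b)

    for symbols, path in labeled_seqs:
        for i, (b, k) in enumerate(zip(symbols, path)):
            E[k][b] += 1                  # emission of symbol b in state k
            if i + 1 < len(path):
                A[k][path[i + 1]] += 1    # transition k -> next state on the path

    # a_kl = A_kl / sum_l' A_kl'   and   e_k(b) = E_k(b) / sum_b' E_k(b')
    a = {k: {l: A[k][l] / sum(A[k][l2] for l2 in states) for l in states} for k in states}
    e = {k: {b: E[k][b] / sum(E[k][b2] for b2 in alphabet) for b in alphabet} for k in states}
    return a, e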
Example • Suppose we are given the aligned sequences:
**---*
AG---C
A-AT-C
AG-AA-
--AAAC
AG---C
• Suppose also that the “match” positions are marked (the starred columns) http://www.cs.huji.ac.il/~cbio/handouts/Class6.ppt
Calculating A, E • Count the transitions and emissions over the alignment above [the transition and emission count tables appear in the original slides] http://www.cs.huji.ac.il/~cbio/handouts/Class6.ppt
Estimating Maximum Likelihood probabilities using fractions • [the emission probability table appears in the original slides] http://www.cs.huji.ac.il/~cbio/handouts/Class6.ppt
Estimating ML probabilities (contd.) • [the transition probability table appears in the original slides] http://www.cs.huji.ac.il/~cbio/handouts/Class6.ppt
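Since the count tables did not survive the conversion of the slides, the following small Python sketch (alignment and match columns taken from the example above; the state labels M1..M3 and the rest of the code are my reconstruction) shows how the match- and insert-state emission counts are tallied:

```python
from collections import Counter

# Toy counts for the profile-HMM example above; state labels (M1..M3, insert)
# and column assignments are my reconstruction of the slides' tables.
alignment = ["AG---C", "A-AT-C", "AG-AA-", "--AAAC", "AG---C"]
match_cols = [0, 1, 5]            # columns marked '*' in "**---*"

match_emissions = {f"M{j+1}": Counter() for j in range(len(match_cols))}
insert_emissions = Counter()

for seq in alignment:
    for col, ch in enumerate(seq):
        if ch == "-":
            continue
        if col in match_cols:
            match_emissions[f"M{match_cols.index(col)+1}"][ch] += 1
        else:
            insert_emissions[ch] += 1   # non-match columns feed the insert state

print(match_emissions)   # M1 emits A 4 times, M2 emits G 3 times, M3 emits C 4 times
print(insert_emissions)  # A:6, T:1
```

Dividing each count by its row total (optionally after adding pseudocounts) gives exactly the ML fractions of the previous two slides.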
EM - Mixture example • Assume we are given the heights of 100 individuals (men/women): y1,…,y100 • We know that: • The men’s heights are normally distributed with (μm,σm) • The women’s heights are normally distributed with (μw,σw) • If we knew the genders – estimation is “easy” (how?) • But we don’t know the genders in our data! • The gender labels X1,…,X100 are unknown • P(w), P(m) are unknown
Mixture example • Our goal: estimate the parameters (μm,σm), (μw,σw), P(m) • A classic “estimation with missing data” problem • (In an HMM: we know the emissions, but not the states!) • Expectation-Maximization (EM): • Compute the “expected” gender for every sample height • Estimate the parameters using ML • Iterate
EM • Widely used in machine learning • Using ML for parameter estimation at every iteration guarantees that the likelihood never decreases • Eventually we reach a local maximum of the likelihood • A good starting point is important
Mixture example • Suppose we have a mixture of M Gaussians, each with mixture weight αi and density parameters θi = (μi,σi) • Likelihood of an observation x: p(x | Θ) = Σi=1…M αi pi(x | θi) • The “incomplete-data” log-likelihood of the sample x1,…,xN: log L(Θ | X) = Σj=1…N log(Σi=1…M αi pi(xj | θi)) • Difficult to maximize directly (a log of a sum)…
Mixture example • Now we introduce y1,…,yN: hidden variables telling us which Gaussian each sample came from • If we knew the ys, the complete-data log-likelihood would be: log L(Θ | X, Y) = Σj=1…N log(αyj pyj(xj | θyj)) • Of course, we do not know the ys… • We’ll run EM, starting from an initial guess θg = (α1g,…,αMg, μ1g,…,μMg, σ1g,…,σMg)
Estimation • Given θg, we can estimate the ys! • We want to find the expected complete-data log-likelihood: Q(θ, θg) = E[log L(θ | X, Y) | X, θg] • The expectation is over the states of y • Bayes’ rule, P(X|Y) = P(Y|X)P(X)/P(Y), gives the posterior of each hidden label: p(yj = i | xj, θg) = αig pi(xj | θig) / Σl=1…M αlg pl(xj | θlg)
Estimation • We write down Q as a sum over all possible assignments of the hidden labels y • Daunting?
Estimation • Simplifying: the sum over whole label vectors collapses to a sum over samples and components • Now Q becomes: Q(θ, θg) = Σi=1…M Σj=1…N log(αi pi(xj | θi)) · p(i | xj, θg)
Maximization • Now we want to find parameter estimates θg+1 that maximize Q(θ, θg) • Infi 2, remember? • To impose the constraint Σi αi = 1, we introduce a Lagrange multiplier λ • Setting the derivative with respect to αi to zero and summing both sides over the components gives: αig+1 = (1/N) Σj=1…N p(i | xj, θg)
Maximization • Estimating μig+1, σig+1 is more difficult (differentiating Q with respect to μi and σi) • The derivation is out of scope here • What comes out is actually quite straightforward: μig+1 = Σj xj p(i | xj, θg) / Σj p(i | xj, θg), (σig+1)^2 = Σj (xj - μig+1)^2 p(i | xj, θg) / Σj p(i | xj, θg)
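Putting the E-step posterior and the M-step updates together, here is a compact Python sketch of EM for the two-Gaussian heights example (my own illustration; the synthetic data and all variable names are assumptions, not part of the recitation):

```python
import numpy as np

def em_gaussian_mixture(x, n_iter=100, seed=0):
    """EM for a 2-component 1-D Gaussian mixture (e.g. men/women heights)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    # initial guess theta^g: mixing weights, means, standard deviations
    alpha = np.array([0.5, 0.5])
    mu = rng.choice(x, size=2, replace=False).astype(float)
    sigma = np.array([x.std(), x.std()])

    for _ in range(n_iter):
        # E-step: responsibilities p(y_j = i | x_j, theta^g)
        dens = np.stack([
            alpha[i] * np.exp(-(x - mu[i])**2 / (2 * sigma[i]**2))
                     / (np.sqrt(2 * np.pi) * sigma[i])
            for i in range(2)
        ])                                  # shape (2, n)
        resp = dens / dens.sum(axis=0)      # normalize over the two components

        # M-step: closed-form updates derived above
        Nk = resp.sum(axis=1)
        alpha = Nk / n
        mu = (resp @ x) / Nk
        sigma = np.sqrt((resp * (x - mu[:, None])**2).sum(axis=1) / Nk)
    return alpha, mu, sigma

# e.g. heights: 50 samples around 165 cm and 50 around 180 cm
heights = np.concatenate([np.random.normal(165, 6, 50), np.random.normal(180, 7, 50)])
print(em_gaussian_mixture(heights))
```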
What you need to know about EM: • When: we want to estimate model parameters, and some of the data is “missing” (hidden) • Why: maximizing the likelihood directly is very difficult • How: • Make an initial guess of the parameters • Write down a proper expression for Q(θ, θg) • Differentiate and find the ML estimators • Iterate
EM estimation in HMMs Input: X1,…,Xn independent training sequences Baum-Welch alg. (1972): • Expectation: • compute the expected # of k→l state transitions: P(πi = k, πi+1 = l | X, θ) = [1/P(X)] · fk(i) · akl · el(xi+1) · bl(i+1) Akl = Σj [1/P(Xj)] · Σi fkj(i) · akl · el(xji+1) · blj(i+1) • compute the expected # of appearances of symbol b in state k: Ek(b) = Σj [1/P(Xj)] · Σ{i : xji = b} fkj(i) · bkj(i) (ex.) • Maximization: • re-compute the new parameters from A, E using maximum likelihood • Repeat (1)+(2) until convergence (no further improvement)
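To make the update rules concrete, here is a minimal Python sketch of one Baum-Welch iteration for a discrete-emission HMM (my own illustration, not the recitation's code; it works in plain probabilities rather than log space, so it only suits short sequences, and the initial-state distribution pi is kept fixed for brevity):

```python
import numpy as np

def forward(x, a, e, pi):
    """f[k, i] = P(x_1..x_i, state_i = k)."""
    n, K = len(x), len(pi)
    f = np.zeros((K, n))
    f[:, 0] = pi * e[:, x[0]]
    for i in range(1, n):
        f[:, i] = e[:, x[i]] * (f[:, i - 1] @ a)
    return f

def backward(x, a, e):
    """b[k, i] = P(x_{i+1}..x_n | state_i = k)."""
    n, K = len(x), a.shape[0]
    b = np.ones((K, n))
    for i in range(n - 2, -1, -1):
        b[:, i] = a @ (e[:, x[i + 1]] * b[:, i + 1])
    return b

def baum_welch_step(seqs, a, e, pi):
    """One E+M iteration; seqs are lists of integer-encoded symbols."""
    K, S = e.shape
    A = np.zeros((K, K))          # expected transition counts
    E = np.zeros((K, S))          # expected emission counts
    for x in seqs:
        f, b = forward(x, a, e, pi), backward(x, a, e)
        px = f[:, -1].sum()       # P(X_j | theta), no explicit end state
        # expected k->l transitions: sum_i f_k(i) a_kl e_l(x_{i+1}) b_l(i+1) / P(X)
        for i in range(len(x) - 1):
            A += np.outer(f[:, i], e[:, x[i + 1]] * b[:, i + 1]) * a / px
        # expected emissions of symbol b in state k: sum_{i: x_i=b} f_k(i) b_k(i) / P(X)
        post = f * b / px         # posterior state probabilities, shape (K, n)
        for i, sym in enumerate(x):
            E[:, sym] += post[:, i]
    # M-step: maximum-likelihood re-estimation from the expected counts
    return A / A.sum(axis=1, keepdims=True), E / E.sum(axis=1, keepdims=True)
```

Iterating baum_welch_step until the parameters (or the likelihood) stop changing gives the full training loop described above.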