Similar Techniques For Molecular Sequencing and Network Security
Doug Madory
27 APR 05
• Big Picture
• Protein Structure
• Sequencing using Profile HMM

Big Picture
• PQS for Network Security (Us)
  • Design HMM for network event
  • Find event within linear stream of observed network events
  Q: Did an event happen?
• Sequencing using Profile HMM (Bioinformatics)
  • Train HMM using known information about subsequence
  • Find subsequence within linear protein / genome sequence
  Q: If it exists, where is sequence?

Profile HMM - Simple Case
• Train HMM
• Viterbi Scoring
• Backtrace Viterbi
• Query: A-, AA, TA
• DB: ATA

HMM Training
Build HMM with 2 M states because there are 2 columns in query
[Diagram: profile HMM with states Begin, M1, M2, I0, I1, I2, D1, D2, End; M1 and M2 each emit A, C, G, T]

HMM Training
Step 1 – add pseudocount to each transition and emission
M1 emissions: A 1, C 1, G 1, T 1
M2 emissions: A 1, C 1, G 1, T 1
Transitions: every count is 1

HMM Training
Step 2 – train with A-
M1 emissions: A 2, C 1, G 1, T 1
M2 emissions: A 1, C 1, G 1, T 1
Transitions: three counts raised to 2 (Begin→M1 among them), all others 1

HMM Training
Step 3 – train with AA
M1 emissions: A 3, C 1, G 1, T 1
M2 emissions: A 2, C 1, G 1, T 1
Transitions: Begin→M1 3, M1→M2 2, M2→End 2

HMM Training
Step 4 – train with TA
M1 emissions: A 3, C 1, G 1, T 2
M2 emissions: A 3, C 1, G 1, T 1
Transitions: Begin→M1 4, M1→M2 3, M2→End 3

HMM Training
Fully trained HMM
M1 emissions: A 3, C 1, G 1, T 2
M2 emissions: A 3, C 1, G 1, T 1
Transitions: Begin→M1 4, M1→M2 3, M2→End 3; two further counts at 2, all others 1

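The four training steps can be sketched in Python. This is a minimal sketch, not the demo from the talk: the names `train_counts` and `normalize` are hypothetical, and the transition layout is assumed to be the standard profile HMM topology (each of Begin/M, I, D at position k can move to I at k, or to M or D at k+1), which the flattened diagram does not fully confirm.

```python
ALPHABET = "ACGT"
ALIGNMENT = ["A-", "AA", "TA"]  # query rows from the slides; '-' marks a gap


def train_counts(alignment, pseudocount=1):
    """Count emissions and transitions, starting every count at the pseudocount."""
    n_cols = len(alignment[0])
    emissions = {
        f"M{k}": {c: pseudocount for c in ALPHABET} for k in range(1, n_cols + 1)
    }
    # Assumed topology: standard profile HMM transitions.
    transitions = {}
    for k in range(n_cols + 1):
        sources = ["Begin" if k == 0 else f"M{k}", f"I{k}"]
        if k > 0:
            sources.append(f"D{k}")
        if k < n_cols:
            targets = [f"I{k}", f"M{k + 1}", f"D{k + 1}"]
        else:
            targets = [f"I{k}", "End"]
        for source in sources:
            transitions[source] = {t: pseudocount for t in targets}
    # Walk each aligned row: a residue visits a match state, a gap a delete state.
    for row in alignment:
        prev = "Begin"
        for k, symbol in enumerate(row, start=1):
            if symbol == "-":
                state = f"D{k}"
            else:
                state = f"M{k}"
                emissions[state][symbol] += 1
            transitions[prev][state] += 1
            prev = state
        transitions[prev]["End"] += 1
    return emissions, transitions


def normalize(counts):
    """Turn a row of counts into probabilities."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}


emissions, transitions = train_counts(ALIGNMENT)
print(emissions["M1"], emissions["M2"])
print(transitions["Begin"], transitions["M1"], transitions["M2"])
print(normalize(emissions["M1"])["A"], normalize(transitions["Begin"])["M1"])
```

Under these assumptions the counts match the fully trained slide (M1: A 3, T 2; M2: A 3; Begin→M1 4, M1→M2 3, M2→End 3), and normalizing gives the 3/7 emission and 4/6 transition used in the Viterbi slides.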
Viterbi Scoring
[Figure: scoring matrix of States against Observations; legend: Insert, Match, Delete moves and Illegal Moves]
V_I0(1) = log a(B→I0)
V_M1(0) = 0
V_I1(0) = 0
V_D1(0) = log a(B→D1)

Viterbi Scoring
[Figure: scoring matrix of States against Observations; legend: Insert, Match, Delete moves]
V_I0(2) = V_I0(1) + log a(I0→I0)
V_I0(3) = V_I0(2) + log a(I0→I0)

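The I0 cells above can be evaluated numerically. A small sketch, assuming base-10 logs (consistent with the -0.17 and -0.47 values on the later slides), a Begin row of counts totalling 6 with 1 on Begin-to-I0, and an I0 row of three pseudocounts; the variable names are illustrative only.

```python
import math

# Assumed probabilities from the trained counts.
a_begin_i0 = 1 / 6  # Begin row: M1=4, I0=1, D1=1
a_i0_i0 = 1 / 3     # I0 row: I0=1, M1=1, D1=1 (all pseudocounts)

# v_i0[i] is the score of being in I0 after consuming i observations.
# Insert emissions are assumed equal to the background, so their log-odds is 0.
v_i0 = {1: math.log10(a_begin_i0)}
for i in range(2, 4):
    v_i0[i] = v_i0[i - 1] + math.log10(a_i0_i0)

for i in sorted(v_i0):
    print(i, round(v_i0[i], 3))
```

Each extra observation absorbed by I0 costs another log10(1/3), so staying in the insert state gets steadily more expensive.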
Viterbi Scoring
[Figure: scoring matrix of States against Observations; legend: Insert, Match, Delete moves]
V_M1(1) = log(e_M1(A) / q_A) + V_B + log a(B→M1)
V_M1(1) = log((3/7) / (1/4)) + 0 - 0.17
V_M1(1) = 0.23 - 0.17 = 0.06

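The match-cell arithmetic can be checked directly. A minimal sketch, assuming base-10 logs and a uniform background q = 1/4 as on the slide; the variable names are illustrative.

```python
import math

e_m1_a = 3 / 7      # trained emission probability of A in M1
q_a = 1 / 4         # background probability of A
a_begin_m1 = 4 / 6  # trained Begin -> M1 transition probability
v_begin = 0.0       # score of the Begin state

log_odds = math.log10(e_m1_a / q_a)
log_transition = math.log10(a_begin_m1)
v_m1_1 = log_odds + v_begin + log_transition

print(round(log_odds, 3), round(log_transition, 3), round(v_m1_1, 3))
```

The log-odds term is positive because A is more likely in M1 than in the background, and it slightly outweighs the cost of the Begin-to-M1 transition.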
Viterbi Scoring
[Figure: scoring matrix of States against Observations; legend: Insert, Match, Delete moves]
V_I1(1) = 0 + max { V_M1(0) + log a(M1→I1),
                    V_I1(0) + log a(I1→I1),
                    V_D1(0) + log a(D1→I1) }
V_I1(1) = 0 + max { -0.78 + -0.47 }
V_I1(1) = -0.47
V_D1(1) = V_I0(1) + log a(I0→D1)
V_D1(1) = -0.78 - 0.47 = -1.25

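One reading of this slide can be sketched in Python. It is an interpretation, not the talk's demo: it assumes base-10 logs, the column-0 values V_M1(0) = 0, V_I1(0) = 0 and V_D1(0) = log(1/6) from the earlier slide, transition probabilities taken from the trained counts, and V_I0(1) = log(1/6) as the predecessor of D1, which is what the -0.78 - 0.47 arithmetic implies.

```python
import math

# Assumed transition log-probabilities from the trained counts.
log_a = {
    ("M1", "I1"): math.log10(1 / 6),  # M1 row: M2=3, I1=1, D2=2
    ("I1", "I1"): math.log10(1 / 3),  # I1 row: all pseudocounts
    ("D1", "I1"): math.log10(1 / 3),  # D1 row: all pseudocounts
    ("I0", "D1"): math.log10(1 / 3),  # I0 row: all pseudocounts
}

# Previously filled cells, keyed by (state, observations consumed).
v = {
    ("M1", 0): 0.0,
    ("I1", 0): 0.0,
    ("D1", 0): math.log10(1 / 6),
    ("I0", 1): math.log10(1 / 6),
}

insert_log_odds = 0.0  # assumes insert emissions equal the background

# Insert cell: best of the three predecessors in the previous column.
candidates = {
    prev: v[(prev, 0)] + log_a[(prev, "I1")] for prev in ("M1", "I1", "D1")
}
best_prev = max(candidates, key=candidates.get)
v_i1_1 = insert_log_odds + candidates[best_prev]

# Delete cell: consumes no observation, so it stays in the same column.
v_d1_1 = v[("I0", 1)] + log_a[("I0", "D1")]

print(best_prev, round(v_i1_1, 3), round(v_d1_1, 3))
```

Under these assumptions the results agree with the slide's -0.47 and -1.25 to within rounding. Recording `best_prev` for every cell is what makes the later backtrace step possible.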
Viterbi Scoring
[Figure: scoring matrix of States against Observations; legend: Insert, Match, Delete moves]

Profile HMM - Simple Case
• Demo in Python

Big Picture Revisited
• PQS for Network Security (Us)
  • Design HMM for network event
  • Find event within linear stream of observed network events
  Q: Did an event happen?
• Sequencing using Profile HMM (Bioinformatics)
  • Train HMM using known information about subsequence
  • Find subsequence within linear protein / genome sequence
  Q: If it exists, where is sequence?