HANU IT R&D Center Seminar Series
On Optimal Training Time Algorithms for Positive and Negative Selection on strings with r-chunk matching rule
Presenter: Nguyen Van Truong
Advisor: Assoc. Prof. Nguyen Xuan Hoai, HANU IT R&D Center
Contents • Overview of Artificial Immune Systems • Positive/Negative Selection Algorithm • The best NSA with r-chunk matching rule • Proposed PSA and NSA with r-chunk matching rule
Role of the Immune System (IS) • Protect our bodies from infection • Primary immune response • Launch a response to invading pathogens • Secondary immune response • Remember past encounters • Faster response the second time around
Self/Non-Self Recognition • Immune system needs to be able to differentiate between self and non-self cells • Antigenic encounters may result in cell death, therefore • Some kind of positive selection • Some element of negative selection
IS to Security Systems (Immune System → Network Environment)
Static Data
• Self → Uncorrupted data
• Non-self → Any change to self
Active Processes on Single Host
• Cell → Active process in a computer
• Multicellular organism → Computer running multiple processes
• Population of organisms → Set of networked computers
• Skin and innate immunity → Security mechanisms, like passwords, groups, etc.
• Adaptive immunity → Lymphocyte process able to query other processes to look for abnormal behaviors
• Autoimmune response → False alarm
• Self → Normal behavior
• Non-self → Abnormal behavior
Network of Mutually Trusting Computers
• Organ in an animal → Each computer in a network environment
Virus Detection (Immune System → Computational System)
• Pathogens (antigens) → Computer viruses
• B-, T-cells and antibodies → Detectors
• Proteins → Strings
• Antibody/antigen binding → Pattern matching
Contents: • Overview of Artificial Immune Systems • Positive/Negative Selection Algorithm • The best NSA with r-chunk matching rule • Proposed PSA and NSA with r-chunk matching rule
Positive/Negative Selection Algorithm • Positive Selection Algorithm • Negative Selection Algorithm
PSA: Stage 1 - Detector generation [Flowchart with Yes/No decision branches]
Positive/Negative Selection Algorithm • Positive Selection Algorithm • Negative Selection Algorithm
NSA: Stage 1 - Detector generation [Flowchart with Yes/No decision branches]
NSA example: Detector generation • Negative Selection Algorithm • r-contiguous matching rule (r = 5) • Binary string representation. Self set: 00000110, 00000101, 00000100, … (first five bits are 0). T-cell detectors produced randomly: 00000111, 00000010, 11011001, … Each candidate is tested against the self set. If it matches a self string (for example, 00000110 and 00000111 agree in five contiguous positions), it is abandoned: 00000111, 00000010, … If it matches no self string, it becomes a matured T-cell detector: 11011001, …
Two kinds of detector: r-chunk and r-contiguous • Σ: an arbitrary alphabet • s ∈ Σ^ℓ: a string of length ℓ = |s| • s[i,…,j]: the substring of s that starts at position i and ends at position j.
Two kinds of detector: r-chunk and r-contiguous Definition 1. An r-chunk detector (d, i) is a tuple of a string d ∈ Σ^r and an integer i ∈ {1,…, ℓ - r + 1}. It matches another string s ∈ Σ^ℓ if s[i,…, i + r - 1] = d. Example: the 3-chunk detector (000, 2) matches the string s = 100010 ∈ {0,1}^6 (s[2,…,4] = 000).
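Definition 1 can be checked directly in a few lines. The following is a minimal sketch; the function name `r_chunk_matches` and the tuple layout are illustrative assumptions, not part of the slides. The position i is kept 1-based as in the definition.

```python
def r_chunk_matches(detector, s):
    """Return True if the r-chunk detector (d, i) matches the string s.

    i is 1-based as in Definition 1; Python slices are 0-based,
    so the compared window is s[i-1 : i-1+r].
    """
    d, i = detector
    r = len(d)
    if not 1 <= i <= len(s) - r + 1:
        raise ValueError("i must lie in {1, ..., l - r + 1}")
    return s[i - 1:i - 1 + r] == d


# Example from Definition 1: (000, 2) matches 100010.
print(r_chunk_matches(("000", 2), "100010"))  # True
# The same pattern at position 1 does not match (window is 100).
print(r_chunk_matches(("000", 1), "100010"))  # False
```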
Two kinds of detector: r-chunk and r-contiguous Definition 2. An r-contiguous detector is a string d ∈ Σ^ℓ. It matches another string s ∈ Σ^ℓ if there is an i ∈ {1,…, ℓ - r + 1} with d[i,…, i + r - 1] = s[i,…, i + r - 1]. Example: the 3-contiguous detector d = 111010 matches the string s = 100010 ∈ {0,1}^6 (d[4,…,6] = s[4,…,6]).
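Definition 2 and the generate-and-test idea from the detector generation example can be sketched together. The names `r_contiguous_matches` and `censor` are hypothetical helpers. The candidate detectors are the three listed on the NSA example slide rather than randomly drawn ones, so the run is reproducible.

```python
def r_contiguous_matches(d, s, r):
    """Return True if d and s agree in r contiguous positions (Definition 2)."""
    if len(d) != len(s):
        raise ValueError("detector and string must have the same length")
    return any(d[i:i + r] == s[i:i + r] for i in range(len(s) - r + 1))


def censor(candidates, self_set, r):
    """Negative selection: keep the candidates that match no self string."""
    return [d for d in candidates
            if not any(r_contiguous_matches(d, s, r) for s in self_set)]


# Example from Definition 2: positions 4..6 agree (010).
print(r_contiguous_matches("111010", "100010", 3))  # True

# Values from the NSA example slide, r = 5.
self_set = ["00000110", "00000101", "00000100"]
candidates = ["00000111", "00000010", "11011001"]
print(censor(candidates, self_set, r=5))  # ['11011001']
```

The two abandoned candidates share the prefix 00000 with every self string, which is why only 11011001 matures.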
More NSA example: self set S ⊆ {a,b}^5 with all 3-chunk detectors. All strings in {a,b}^5 are classified as nonself (gray background) or self (white background: bold strings are members of S and the others are called holes).
More NSA example: self set S ⊆ {a,b}^5 with all 3-contiguous detectors. All strings in {a,b}^5 are classified as nonself (gray background) or self (white background: bold strings are members of S and the others are called holes).
Important Property: r-contiguous detectors were studied first, by many authors; r-chunk detectors were introduced later to achieve better results (fewer holes) [3].
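The effect on holes can be explored by brute force on a small universe. The sketch below assumes the self set S that the later worked example uses (ℓ = 5, r = 3); whether the figures on the two preceding slides use the same S is not stated in the text, so treat the choice as an assumption. All helper names are illustrative.

```python
from itertools import product

ALPHABET = "ab"
LENGTH, R = 5, 3
# Assumption: self set taken from the worked example later in the deck.
S = ["abbbb", "aabbb", "baaaa", "baaab", "baaba", "babba", "bbbbb"]


def chunks(s, r):
    """All (substring, 0-based position) pairs of length r in s."""
    return [(s[i:i + r], i) for i in range(len(s) - r + 1)]


def contiguous_match(d, s, r):
    return any(d[i:i + r] == s[i:i + r] for i in range(len(s) - r + 1))


universe = ["".join(p) for p in product(ALPHABET, repeat=LENGTH)]

# r-chunk: a string is nonself if one of its chunks never occurs
# at the same position in any self string.
self_chunks = {c for s in S for c in chunks(s, R)}
chunk_nonself = {x for x in universe
                 if any(c not in self_chunks for c in chunks(x, R))}

# r-contiguous: detectors are all strings that match no self string.
detectors = [d for d in universe
             if not any(contiguous_match(d, s, R) for s in S)]
contig_nonself = {x for x in universe
                  if any(contiguous_match(d, x, R) for d in detectors)}

# Holes: classified as self although they are not members of S.
holes_chunk = set(universe) - chunk_nonself - set(S)
holes_contig = set(universe) - contig_nonself - set(S)
print(len(holes_chunk), len(holes_contig))
```

Every string detected by an r-contiguous detector is also detected by the r-chunk detector at the matching window, so `holes_chunk` is always a subset of `holes_contig`.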
Comparison of r-chunk detector-based algorithms on binary strings [Comparison table not preserved] [6] M. Elberfeld, J. Textor, Negative selection algorithms on strings with efficient training and linear-time classification, Theoretical Computer Science, 412, 534-542, 2011.
Contents: • Overview of Artificial Immune Systems • Positive/Negative Selection Algorithm • The best NSA with r-chunk matching rule • Proposed PSA and NSA with r-chunk matching rule
The best NSA with r-chunk matching rule [6] M. Elberfeld, J. Textor, Negative selection algorithms on strings with efficient training and linear-time classification, Theoretical Computer Science, 412, 534-542, 2011. • The algorithm has three steps: • Create tree Ti from S[i,…,i+r-1] • Construct prefix tree Ti’ by adjusting Ti • Connect Ti’ with Ti+1’
Step 1. Create tree Ti from S[i,…,i+r-1] Insert every s[1,…,3] into T1. Edge labels are omitted; implicitly, a left edge is labeled a and a right edge b. Given S ⊆ Σ^5 = {a, b}^5, S = { s1 = abbbb; s2 = aabbb; s3 = baaaa; s4 = baaab; s5 = baaba; s6 = babba; s7 = bbbbb }, ℓ = 5, r = 3
Step 1. Create tree Ti from S[i,…,i+r-1] Insert every s[1,…,3] into T1: (aab,1) (abb,1) (baa,1) (bab,1) (bbb,1). Given S ⊆ Σ^5 = {a, b}^5, S = { s1 = abbbb; s2 = aabbb; s3 = baaaa; s4 = baaab; s5 = baaba; s6 = babba; s7 = bbbbb }, ℓ = 5, r = 3
Step 1. Create tree Ti from S[i,…,i+r-1] Insert every s[1,…,3] into T1: (aab,1) (abb,1) (baa,1) (bab,1) (bbb,1). Insert every s[2,…,4] into T2: (aaa,2) (aab,2) (abb,2) (bbb,2). Insert every s[3,…,5] into T3: (aaa,3) (aab,3) (aba,3) (bba,3) (bbb,3). Given S ⊆ Σ^5 = {a, b}^5, S = { s1 = abbbb; s2 = aabbb; s3 = baaaa; s4 = baaab; s5 = baaba; s6 = babba; s7 = bbbbb }, ℓ = 5, r = 3
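Step 1 can be reproduced with one small trie per window position. This is a minimal sketch that represents each tree Ti as a nested dictionary; `build_trees` and `leaves` are illustrative names, and positions are 0-based in the code while the slides count from 1.

```python
S = ["abbbb", "aabbb", "baaaa", "baaab", "baaba", "babba", "bbbbb"]
R = 3


def build_trees(self_set, r):
    """Return [T1, T2, ...]; tree Ti holds every s[i..i+r-1] as a root path."""
    length = len(self_set[0])
    trees = []
    for i in range(length - r + 1):
        root = {}
        for s in self_set:
            node = root
            for symbol in s[i:i + r]:
                # Follow the edge for this symbol, creating it if needed.
                node = node.setdefault(symbol, {})
        trees.append(root)
    return trees


def leaves(node, prefix=""):
    """List the root-to-leaf strings of a trie in lexicographic order."""
    if not node:
        return [prefix]
    found = []
    for symbol in sorted(node):
        found.extend(leaves(node[symbol], prefix + symbol))
    return found


trees = build_trees(S, R)
for position, tree in enumerate(trees, start=1):
    print(position, leaves(tree))
# 1 ['aab', 'abb', 'baa', 'bab', 'bbb']
# 2 ['aaa', 'aab', 'abb', 'bbb']
# 3 ['aaa', 'aab', 'aba', 'bba', 'bbb']
```

Each of the ℓ - r + 1 trees receives |S| insertions of length r, which is consistent with the O(|S|ℓr) bound quoted for this approach.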
Step 2. Construct prefix tree Ti’ by adjusting Ti • Create new leaves from the non-leaf nodes • Delete every node from which none of the newly created leaves is reachable [Figure: tree T1 and prefix tree T1’, with annotations "Create new leaf" and "Delete node"]
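The outcome of Step 2 can be sketched without building pointer-based trees. The sketch assumes that a new leaf is created for every symbol missing below a non-leaf node, which is how this step is read here; the slide text itself does not spell that out. Each new leaf is then a shortest prefix that no self chunk at that position starts with. The function names are hypothetical.

```python
from itertools import product

ALPHABET = "ab"


def prefix_tree_leaves(chunks, alphabet):
    """Leaves of Ti': shortest strings that are not a prefix of any chunk."""
    r = len(next(iter(chunks)))
    # Every node of Ti corresponds to a prefix of some chunk.
    prefixes = {c[:k] for c in chunks for k in range(r + 1)}
    inner = {p for p in prefixes if len(p) < r}
    # A new leaf hangs below an inner node wherever a symbol is missing.
    return sorted(p + a for p in inner for a in alphabet
                  if p + a not in prefixes)


def expand(leaf_set, r, alphabet):
    """All length-r strings that start with one of the leaves."""
    return {w + "".join(t) for w in leaf_set
            for t in product(alphabet, repeat=r - len(w))}


T1 = {"aab", "abb", "baa", "bab", "bbb"}
T2 = {"aaa", "aab", "abb", "bbb"}
T3 = {"aaa", "aab", "aba", "bba", "bbb"}

print(prefix_tree_leaves(T1, ALPHABET))  # ['aaa', 'aba', 'bba']
print(prefix_tree_leaves(T2, ALPHABET))  # ['aba', 'ba', 'bba']
print(prefix_tree_leaves(T3, ALPHABET))  # ['abb', 'ba']
```

Expanding the leaves gives exactly the 3-letter strings absent from the corresponding Ti, that is, the patterns of the r-chunk detectors at that position. A short leaf such as `ba` stands for both `baa` and `bab`.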
Step 2. Construct prefix tree Ti’ for i = 1, 2, 3 Tree T1 Prefix tree T1’ Prefix tree T2’ Tree T2 Prefix tree T3’ Tree T3
Step 3. Connect Ti’ with Ti+1’: insert failure links that connect Ti’ with Ti+1’ [Figure: prefix tree T1’ (path aab), a failure link, prefix tree T2’ (path aba)]
Step 3. Connect Ti’ with Ti+1’ Prefix tree T1’ Prefix tree T2’ Connection of prefix tree T1’ and prefix tree T2’
Connection of three prefix trees T1’, T2’ and T3’ • Turn the graph into a finite automaton • Time complexity: O(|S|ℓr)
Contents: • Overview of Artificial Immune Systems • Positive/Negative Selection Algorithm • The best NSA • Proposed PSA and NSA with r-chunk matching rule
Proposed PSA with r-chunk matching rule • S ⊆ Σ^ℓ, n = |S|, si ∈ S (i = 1, 2, …, n), m = |Σ|, m ≥ 2 • Array Q: Q[s][c] is a pointer used for creating a new node in the tree, s ∈ Σ^(r-1), c ∈ Σ • Array P: P[i] is a struct with two fields, a pointer P[i].end and a string P[i].str ∈ Σ^(r-1), i = 1, 2, …, n.
Create tree T1 [Figure: P[1] with fields P[1].end and P[1].str = bb] Given S ⊆ Σ^5 = {a, b}^5, S = { s1 = abbbb; s2 = aabbb; s3 = baaaa; s4 = baaab; s5 = baaba; s6 = babba; s7 = bbbbb }, ℓ = 5, r = 3
Create tree T1 [Figure: P[1].str = bb, P[2].str = ab, P[3].str = aa, P[4].str = aa, P[5].str = aa, P[6].str = ab, P[7].str = bb] Given S ⊆ Σ^5 = {a, b}^5, S = { s1 = abbbb; s2 = aabbb; s3 = baaaa; s4 = baaab; s5 = baaba; s6 = babba; s7 = bbbbb }, ℓ = 5, r = 3
Create tree T1 [Figure: for s = aa, the entries Q[aa][a] and Q[aa][b]] Given S ⊆ Σ^5 = {a, b}^5, S = { s1 = abbbb; s2 = aabbb; s3 = baaaa; s4 = baaab; s5 = baaba; s6 = babba; s7 = bbbbb }, ℓ = 5, r = 3
Create tree T1 [Figure: Q[s][c] for s ∈ {aa, ab, ba, bb}] Given S ⊆ Σ^5 = {a, b}^5, S = { s1 = abbbb; s2 = aabbb; s3 = baaaa; s4 = baaab; s5 = baaba; s6 = babba; s7 = bbbbb }, ℓ = 5, r = 3
Create tree T1 [Figure: P[1].str = bb, P[2].str = ab, P[3].str = aa, P[4].str = aa, P[5].str = aa, P[6].str = ab, P[7].str = bb; Q indexed by aa, ab, ba, bb] S = { s1 = abbbb; s2 = aabbb; s3 = baaaa; s4 = baaab; s5 = baaba; s6 = babba; s7 = bbbbb }
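The bookkeeping with P and Q can be illustrated with a sliding window. This is only a sketch inferred from the slide labels, not the presenter's algorithm. P[i] here stands for P[i].str, Q is a dictionary keyed by (s, c) that records which chunk s + c has been created, and P[i].end is left out because the slide text does not say what it points to. The function name `train_positive` is hypothetical.

```python
S = ["abbbb", "aabbb", "baaaa", "baaab", "baaba", "babba", "bbbbb"]
R = 3


def train_positive(self_set, r):
    """Collect the self chunks of every window position.

    Returns (Q, history): Q[k] is the set of (s, c) keys created for
    tree T(k+1), and history[k] lists P[i].str after that tree is built.
    """
    length = len(self_set[0])
    # Tree T1: insert s[1..r] and remember its last r-1 symbols.
    first = set()
    P = []
    for s in self_set:
        chunk = s[:r]
        first.add((chunk[:-1], chunk[-1]))
        P.append(chunk[1:])
    Q = [first]
    history = [list(P)]
    # Later trees: the stored suffix plus one new symbol is the next chunk,
    # so each string contributes a single step per position.
    for pos in range(1, length - r + 1):
        created = set()
        for i, s in enumerate(self_set):
            c = s[pos + r - 1]
            created.add((P[i], c))
            P[i] = P[i][1:] + c
        Q.append(created)
        history.append(list(P))
    return Q, history


Q, history = train_positive(S, R)
print(history[0])  # ['bb', 'ab', 'aa', 'aa', 'aa', 'ab', 'bb']
for k, keys in enumerate(Q, start=1):
    print(k, sorted(s + c for s, c in keys))
```

Python string slicing costs O(r) per step, so this sketch shows the reuse of the stored suffix but does not by itself achieve a constant-time update; a pointer-based Q as described on the slide would be needed for that.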
Combination of T1 and T2 [Figure: P[1],…,P[7] with strings bb, ab, aa, aa, aa, ab, bb; Q indexed by aa, ab, ba, bb]
Combination of T1 and T2 [Figure: P[2], P[1], P[6], P[7] with strings ab, bb, ab, bb; Q indexed by aa, ab, ba, bb]
Combination of T1 and T2 [Figure: strings ab, aa, bb]