Version Space using DNA Computing 2001.10.26 임희웅
Version Space(1) • Version Space? • Concept learning • Classifying a given instance x • Maintain the set of hypotheses that are consistent with the training examples • Instance x • Described by a tuple of attribute values • Attributes • Dept, {ee, cs} • Status, {faculty, staff} • Floor, {four, five}
Version Space(2) • Hypotheses H • Each hypothesis is described by a conjunction of constraints on the attributes • Ex) <cs, faculty> or <cs> • Target concept • c : X → {0, 1} • Training examples D • <cs, faculty, four> + • <cs, faculty, five> + • <ee, faculty, four> - • <cs, staff, five> -
Version Space(3) • The hypotheses consistent with a training example are all combinations of that example's attribute values, i.e. its power set • Training example: <cs, faculty, four> • Consistent hypotheses: <cs, faculty, four>, <cs, faculty>, <cs, four>, <faculty, four>, <cs>, <faculty>, <four>, <>
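The power-set construction above can be written as a minimal Python sketch; the function name `consistent_hypotheses` is my own, not from the slides.

```python
from itertools import combinations

def consistent_hypotheses(example):
    """All conjunctions over a subset of the example's attribute
    values, i.e. the power set of its values. The empty frozenset
    plays the role of <>, which matches every instance."""
    return {frozenset(combo)
            for r in range(len(example) + 1)
            for combo in combinations(example, r)}

hyps = consistent_hypotheses(("cs", "faculty", "four"))
print(len(hyps))  # 8 = 2^3: the full power set
print(frozenset({"cs", "faculty"}) in hyps)  # True
```

Representing each hypothesis as a `frozenset` makes the later tube operations (∩ and -) direct set operations.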
Version Space(4)
[Diagram: lattice of all hypotheses, from the empty conjunction <> through single values (cs, ee, faculty, staff, four, five) and two-value conjunctions (cs ∧ faculty, cs ∧ staff, ee ∧ faculty, ee ∧ staff, faculty ∧ four, faculty ∧ five, …) down to three-value conjunctions (cs ∧ faculty ∧ four, cs ∧ faculty ∧ five, ee ∧ faculty ∧ four, cs ∧ staff ∧ five); the following slides mark the hypotheses eliminated by each training example on this lattice]
• After + <cs, faculty, four>: <cs, faculty, four>, <cs, faculty>, <cs, four>, <faculty, four>, <cs>, <faculty>, <four>, <>
• After – <cs, staff, five>: <cs, faculty, four>, <cs, faculty>, <cs, four>, <faculty, four>, <faculty>, <four>
• After + <cs, faculty, five>: <cs, faculty>, <faculty>
• After – <ee, faculty, four>: <cs, faculty>
Version Space using DNA Computing • Problem Definition • Attributes • Dept,{ee, cs} • Status,{faculty, staff} • Floor,{four, five} • Training example D • <cs, faculty, four> + • <cs, faculty, five> + • <ee, faculty, four> - • <cs, staff, five> -
Encoding.1 • When an ordering among the attributes is considered: represent each attribute value as one basic DNA sequence, and design sticky-end conditions so that sequences belonging to different attributes can be ligated to one another in that order. • In this case hypotheses such as <cs, faculty> or <faculty, four> are generated, but <cs, four> is not.
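Under this ordered encoding, a ligation product can only be a contiguous run of adjacent attributes, which is why <cs, four> (skipping Status) never forms. A minimal sketch of that consequence, with the function name `ordered_hypotheses` being my own:

```python
from itertools import product

def ordered_hypotheses(attributes):
    """With order-respecting sticky ends, a value can only ligate to
    a value of the neighbouring attribute, so every product is a
    contiguous run of attributes [i, j)."""
    hyps = set()
    n = len(attributes)
    for i in range(n):
        for j in range(i + 1, n + 1):
            for combo in product(*attributes[i:j]):
                hyps.add(combo)
    return hyps

attrs = [("ee", "cs"), ("faculty", "staff"), ("four", "five")]
hyps = ordered_hypotheses(attrs)
print(("cs", "faculty") in hyps)    # True  - Dept and Status are adjacent
print(("faculty", "four") in hyps)  # True  - Status and Floor are adjacent
print(("cs", "four") in hyps)       # False - Status is skipped
```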
Encoding.2(1) • When no ordering among the attributes is considered • Use the encoding of Adleman's experiment • Attribute value : vertex • Ligation of attribute values : edge → complete graph, overhead
Encoding.2(2) • Graph for all hypotheses • [Diagram: vertices cs, ee, faculty, staff, four, five, with edges joining values of different attributes] • Vertex + Edge • Dummy strand for blunt end?
Encoding.2(3) • Graph for the hypotheses consistent with <cs, faculty, four> • [Diagram: vertices cs, faculty, four, with edges joining them] • Vertex + Edge • Dummy strand for blunt end?
Encoding.2(4) • Do not use the double strands produced by the previous step directly; instead, extract and use only the single strands that encode the vertex part. • The single strands can be extracted using strands complementary to each attribute value's strand (beads?). • Why only single strands? • To perform the intersection operation via hybridization • Because an order was imposed on the attributes present in a hypothesis strand
Encoding.3(1) • Using beads • [Diagram: a bead with Dept, Status, and Floor strands attached + a dummy sequence for each attribute] • Requires far fewer sequences than the Adleman-style encoding above • Can also generate either all possible hypotheses at once, or all hypotheses consistent with a particular example
Encoding.3(2) • Problem!! • How can the hypotheses be amplified?
Detection(1) – Encoding.2 • 1. Tube1(0) • All hypotheses consistent with the first training example (assumed positive) • 2. For each following example, perform the following: • Tube2 • All hypotheses consistent with the example • If positive • Tube1(n+1) = Tube1(n) ∩ Tube2 • If negative • Tube1(n+1) = Tube1(n) - Tube2
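At the set level, the tube procedure above is candidate elimination over power sets. A minimal Python sketch, using the training data from the slides (variable names `tube1`/`tube2` mirror the slide's Tube1/Tube2):

```python
from itertools import combinations

def consistent(example):
    """Tube of all hypotheses consistent with one example:
    the power set of its attribute values."""
    return {frozenset(c)
            for r in range(len(example) + 1)
            for c in combinations(example, r)}

# Training examples D from the slides; True = positive, False = negative.
D = [(("cs", "faculty", "four"), True),
     (("cs", "faculty", "five"), True),
     (("ee", "faculty", "four"), False),
     (("cs", "staff", "five"), False)]

first, label = D[0]
assert label, "the slides assume the first example is positive"
tube1 = consistent(first)            # Tube1(0)
for example, positive in D[1:]:
    tube2 = consistent(example)      # Tube2
    if positive:
        tube1 &= tube2               # Tube1(n+1) = Tube1(n) ∩ Tube2
    else:
        tube1 -= tube2               # Tube1(n+1) = Tube1(n) - Tube2

print(tube1)  # the single surviving hypothesis <cs, faculty>
```

The result agrees with the lattice walkthrough earlier in the slides: only <cs, faculty> survives all four examples.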
Detection(2) – Encoding.2 • Implementation of the '∩' operation • Generate the initial Tube1(0) as proposed in Encoding.2 (single strands). • For every later example, generate Tube2 with an encoding complementary to Tube1(0) (single strands). • Pour Tube2 into Tube1, let them hybridize, and extract only the fully bonded double strands. • Denature the result and extract again only the single strands encoded in the same direction as the initial Tube1(0).
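The hybridization step can be mimicked in silico as exact reverse-complement matching: a strand survives only if its full-length complement is present in the other tube. The codewords in `CODE` are purely illustrative, not the paper's actual sequences.

```python
COMP = str.maketrans("ACGT", "TGCA")

def complement(strand):
    """Watson-Crick reverse complement (an involution)."""
    return strand.translate(COMP)[::-1]

# Hypothetical codewords for attribute values (illustrative only).
CODE = {"cs": "ACCT", "faculty": "GGAT", "four": "TTCG"}

def encode(hypothesis):
    return "".join(CODE[v] for v in hypothesis)

tube1 = {encode(("cs", "faculty")), encode(("cs", "faculty", "four"))}
tube2 = {complement(encode(("cs", "faculty")))}  # complementary encoding

# Hybridization + separation: only strands whose full-length
# complement is present in the other tube form a perfect duplex.
duplexes = {s for s in tube1 if complement(s) in tube2}
print(duplexes == {encode(("cs", "faculty"))})  # True
```

This is of course an idealisation: real hybridization also admits partial and mismatched duplexes, which is exactly why the slides insist on extracting only the fully bonded double strands.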
Detection(3) – Encoding.2 • Implementation of the '-' operation • How? • '∩' operation with the complement set? • Another encoding for negative examples?
Detection(4) – Encoding.2 • Problem • What if the strands need to be amplified?
Detection – Encoding.3 • For the bead-based encoding • How? • '∩', '-' • Amplification
Application • How should classification be performed when an actual input instance arrives? • Voting?
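One way to realise the "Voting?" idea: let every hypothesis remaining in the version space vote on the new instance and take the majority. A minimal sketch, with my own function names:

```python
def matches(hypothesis, instance):
    """A conjunction matches an instance iff every constrained
    attribute value appears in the instance."""
    return hypothesis <= set(instance)

def classify_by_vote(version_space, instance):
    """Majority vote over the version space; a tie (or an empty
    version space) is reported as 'unknown'."""
    pos = sum(1 for h in version_space if matches(h, instance))
    neg = len(version_space) - pos
    if pos > neg:
        return "+"
    if neg > pos:
        return "-"
    return "unknown"

vs = {frozenset({"cs", "faculty"})}  # final version space from the slides
print(classify_by_vote(vs, ("cs", "faculty", "five")))  # +
print(classify_by_vote(vs, ("ee", "staff", "four")))    # -
```

With several surviving hypotheses the vote can be split, so "unknown" answers are possible until more training examples narrow the version space.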
Reference on Version Space • Machine Learning, T. M. Mitchell, McGraw-Hill • Artificial Intelligence: Theory and Practice, Dean, Addison-Wesley