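DNA computing-based Implementation of Decision tree
Advanced AI
임 예니 (School of Computer Science and Engineering)
이 은석 (Interdisciplinary Program in Cognitive Science)
조 성범 (Interdisciplinary Program in Bioinformatics)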
Decision Tree using DNA computing
[Table: 0/1 values of Gene 1 and Gene 2, each paired with a class label, for Patients 1 to 4]
• Input strand organization: for each attribute, an instance's value and its class label are coupled into a single code. After hybridization, the length of a strand indicates the number of instances.
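As a rough illustration of this coupling, the sketch below pairs each instance's attribute value with its class label and counts how often each pair occurs, which is the quantity the strand length is meant to reflect. The patient data and the names `encode_pair` and `count_pairs` are made up for illustration and are not taken from the slides.

```python
from collections import Counter

def encode_pair(value, label):
    """Couple an attribute value and a class label into one code, e.g. '<01>'."""
    return f"<{value}{label}>"

def count_pairs(values, labels):
    """Count instances per (value, class) code for a single attribute."""
    return Counter(encode_pair(v, c) for v, c in zip(values, labels))

# Hypothetical data: one gene measured in four patients, with class labels.
gene_values = [0, 0, 1, 1]
class_labels = [0, 1, 0, 0]
counts = count_pairs(gene_values, class_labels)
```

Here `counts` plays the role of the strand lengths: a larger count for a code means more instances share that value and class.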
[Figure: strand layout, 5’ | sticky end | one of (0,0), (0,1), (1,0), (1,1) | sticky end | 3’, with a Cy5 label]
Calculation of Information Gain
• Information Gain(S,A) ≡ Entropy(S) - ∑(|Sv|/|S|)*Entropy(Sv)
• In gene expression data, all attribute values are binary-encoded, so v ∈ {0,1} and ∑(|Sv|/|S|)*Entropy(Sv) = (|S0|/|S|)*Entropy(S0) + (|S1|/|S|)*Entropy(S1)
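For reference, the exact information gain of a single binary attribute can be computed as in the following sketch. It uses the standard Shannon entropy in bits; the function names and the two small data sets are hypothetical and only illustrate the formula.

```python
import math

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    h = 0.0
    for c in set(labels):
        p = labels.count(c) / n
        h -= p * math.log2(p)
    return h

def information_gain(values, labels):
    """Gain(S, A) for one binary attribute A, given per-instance values."""
    n = len(labels)
    weighted = 0.0
    for v in (0, 1):
        subset = [c for x, c in zip(values, labels) if x == v]
        weighted += len(subset) / n * entropy(subset)
    return entropy(labels) - weighted

# Hypothetical data: an attribute that separates the classes, and one that does not.
gain_perfect = information_gain([0, 0, 1, 1], [0, 0, 1, 1])
gain_useless = information_gain([0, 1, 0, 1], [0, 0, 1, 1])
```

The first attribute splits the classes cleanly and gets the maximum gain of 1 bit; the second leaves both subsets evenly mixed and gets a gain of 0.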
(|S0|/|S|)*Entropy(S0) ≈ (|S0|/|S|)*(n1/|S0|) = n1/|S|
(|S1|/|S|)*Entropy(S1) ≈ (|S1|/|S|)*(n2/|S1|) = n2/|S|
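∑(|Sv|/|S|)*Entropy(Sv) = (|S0|/|S|)*Entropy(S0) + (|S1|/|S|)*Entropy(S1)
≈ (|S0|/|S|)*(n1/|S0|) + (|S1|/|S|)*(n2/|S1|)
= n1/|S| + n2/|S| = (n1+n2)/|S| ∝ n1+n2
Since |S| is the same for every attribute, attributes can be compared by n1+n2 alone.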
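Under this approximation, choosing the attribute with the highest gain reduces to choosing the one with the smallest n1+n2. The sketch below assumes, because the slides do not define them, that n1 and n2 are the numbers of minority-class instances in S0 and S1. The data, gene names, and function names are hypothetical.

```python
def minority_count(labels):
    """Number of 0/1 labels not in the majority class (0 for an empty list)."""
    ones = sum(labels)
    return min(ones, len(labels) - ones)

def approx_score(values, labels):
    """n1 + n2 under the ASSUMED definition: minority counts in S0 and S1."""
    s0 = [c for x, c in zip(values, labels) if x == 0]
    s1 = [c for x, c in zip(values, labels) if x == 1]
    return minority_count(s0) + minority_count(s1)

def best_attribute(table, labels):
    """Pick the attribute with the smallest n1 + n2 (largest approximate gain)."""
    return min(table, key=lambda name: approx_score(table[name], labels))

# Hypothetical data: two genes measured in four patients.
labels = [0, 0, 1, 1]
table = {"gene_a": [0, 0, 1, 1], "gene_b": [0, 1, 0, 1]}
root = best_attribute(table, labels)
```

With this toy data `gene_a` splits the classes cleanly (n1+n2 = 0) and is selected as the root, while `gene_b` leaves one minority instance on each side (n1+n2 = 2).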
DNA computing vs. Digital computing
• Rules from DNA computing:
36822=0
 -1894=0
  -1915=0:0
  -1915=1:1
 -1894=1:1
• Identical to the rules from a conventional decision tree algorithm
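Read as a nested tree, the listed rules can be written as a small classifier. This sketch assumes that the 1915 tests sit under the 1894=0 branch, which is an interpretation of the dashes; the function name `classify` is hypothetical. The slides list no rule for 36822=1, so the sketch returns `None` there rather than guessing.

```python
def classify(sample):
    """Apply the listed rules; `sample` maps gene ID -> 0 or 1.

    Returns the predicted class, or None when no listed rule applies
    (the slides give no rule for 36822 = 1).
    """
    if sample[36822] == 0:
        if sample[1894] == 0:
            # Assumed nesting: the 1915 test applies only when 1894 = 0.
            return 0 if sample[1915] == 0 else 1
        return 1
    return None
```

For example, `classify({36822: 0, 1894: 0, 1915: 1})` follows the path 36822=0, 1894=0, 1915=1 and returns class 1.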
Input Sequence <00>/<01>/<10>/<11>
Code   5’ Sticky end   Sequence     Complementary sequence   Sticky end 3’
<00>   GCATAG          GAAATGAGTT   CTTTACTCAA               CGTATC
<01>   ATAGGC          TGATGCTACA   ACTACGATGT               TATCCG
<10>   AGGCAT          GGTTGTGGCG   CCAACACCGC               TCCGTA
<11>   ATAGGA          CAGTTATTTC   GTCAATAAAG               TATCCT
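The listed sequences can be checked for base-pair consistency: in each row the third block is the base-wise complement of the second, and the trailing sticky end is the complement of the leading one. The sketch below verifies this; the grouping of each row into four blocks and the helper names are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(seq):
    """Base-wise Watson-Crick complement (no reversal)."""
    return seq.translate(COMPLEMENT)

# (sticky end, sequence, paired sequence, sticky end) as listed on the slide
STRANDS = {
    "<00>": ("GCATAG", "GAAATGAGTT", "CTTTACTCAA", "CGTATC"),
    "<01>": ("ATAGGC", "TGATGCTACA", "ACTACGATGT", "TATCCG"),
    "<10>": ("AGGCAT", "GGTTGTGGCG", "CCAACACCGC", "TCCGTA"),
    "<11>": ("ATAGGA", "CAGTTATTTC", "GTCAATAAAG", "TATCCT"),
}

def is_consistent(parts):
    """Check that the paired blocks of one row are complementary."""
    left, top, bottom, right = parts
    return complement(top) == bottom and complement(left) == right

all_ok = all(is_consistent(p) for p in STRANDS.values())
```

All four rows pass the check, so `all_ok` is `True`.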
Implementation steps
1. Rule-representing sequences
2. Hybridization
3. Construction of random paths
4. Fluorescence detection: check whether a specific rule appears sequentially
5. Repeat steps 3-4
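One way to picture these steps in software is a toy random-sampling model: strands are drawn from a pool to form a path, and a rule counts as detected when its code occupies consecutive positions. Everything here is an assumption made for illustration, including the pool sizes, the path length, the detection criterion, and all function names. It is not the simulation used by the authors.

```python
import random

def build_random_path(pool, length, rng):
    """Draw `length` strands without replacement from the pool to form one path."""
    return rng.sample(pool, length)

def appears_sequentially(path, code, run):
    """True if `code` occupies at least `run` consecutive positions in the path."""
    streak = 0
    for c in path:
        streak = streak + 1 if c == code else 0
        if streak >= run:
            return True
    return False

def detection_rate(copies, code, run=2, path_len=10, trials=1000, seed=0):
    """Fraction of random paths in which `code` appears `run` times in a row."""
    rng = random.Random(seed)
    pool = [c for c, n in copies.items() for _ in range(n)]
    hits = sum(
        appears_sequentially(build_random_path(pool, path_len, rng), code, run)
        for _ in range(trials)
    )
    return hits / trials

# Hypothetical pool: number of copies of each code.
copies = {"<00>": 1000, "<01>": 900, "<10>": 800, "<11>": 700}
rate = detection_rate(copies, "<00>")
```

Fixing the seed keeps the sketch reproducible; repeating the call with different seeds corresponds to repeating the path construction and detection steps.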
Simulation Results
• 1st: number of each rule sequence: 1000, 900, 800, 700; number of hybridizations: 1000
Simulation Results • 2nd:
Simulation Results • 3rd:
Simulation Results • 4th:
Summary of Simulation Results
• (0,0): 0.85
• (0,1): 0.91
• (1,0): 0.62
• (1,1): 0.87
Calculation of Root Node with Simulation Results
Validation of decision tree resulting from DNA computing and digital computing
Discussion
• Due to nonspecific hybridization, the simulation results differed from the calculated values
• Lack of a pruning process
• Cost
• More specific hybridization process