Enhancing gene expression classification using Zinc-Finger proteins and magnetic bead fluorescence attributes. Machine learning with DNA to classify genes based on specific zinc-finger protein values.
Gene Expression Classification by Kernel-based PLM
School of Applied Chemistry, 2004-31012, 서주현
School of Electrical and Electronic Engineering, 2003-21710, 조율원
Department of Computer Engineering, 2004-21440, 강성구
Strategy in This Study
Making a molecular kernel-based PLM with high confidence
• 1. Tandem selection: programmable, no need for an index
• 2. Enhancing the specificity and confidence using “zinc-finger protein”
Zinc-Finger Protein
• DNA-binding protein
• ~30 amino acids
• Used as a transcriptional regulator domain in cells
• Codon specific (5’-NNN-3’)
• Able to expand to recognize 6 or 9 base pairs if connected in tandem
  - the number of attributes increases as 64^n for n connected fingers
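The 64^n growth can be checked with a short calculation. This is a minimal sketch, assuming each finger reads one 3-base codon over the alphabet A, C, G, T, so that n tandem fingers read 3n bases. The names `recognizable_sites` and `enumerate_codons` are hypothetical and do not come from the slides.

```python
from itertools import product

BASES = "ACGT"  # assumption: standard four-letter DNA alphabet


def recognizable_sites(n_fingers):
    """Count distinct DNA sites readable by n tandem fingers (3 bases each)."""
    return len(BASES) ** (3 * n_fingers)


def enumerate_codons():
    """List every 3-base codon a single finger could be specific to."""
    return ["".join(c) for c in product(BASES, repeat=3)]


if __name__ == "__main__":
    for n in (1, 2, 3):
        # n fingers cover 3n base pairs and distinguish 64**n sites
        print(n, "finger(s):", 3 * n, "bp,", recognizable_sites(n), "sites")
```

One finger gives 64 codons, two fingers (6 base pairs) give 4096 sites, and three fingers (9 base pairs) give 262144.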
Library Data and Attribute Data DNA Design
[Figure: design of the library DNA and the learning data DNA. Labels: magnetic bead, fluorescence, attribute, biotin, T*6, classification. Caption: DNA library with various DNA lengths]
Machine Learning with DNA (1)
• Zinc-finger protein specific to the value of Attribute 1 (magnetic bead, fluorescence): recover the Attribute 1 DNA using a magnet
• Zinc-finger protein specific to the value of Attribute 2 (magnetic bead, fluorescence): recover the Attribute 2 DNA using a magnet
• ....
Machine Learning with DNA (2)
• Zinc-finger protein specific to the value of Attribute n (magnetic bead, fluorescence): recover the Attribute n DNA using a magnet
• Zinc-finger protein specific to the value of Class (magnetic bead, fluorescence): recover the Class DNA using a magnet
Data Amplification by PCR
[Figure labels: T*6, classification, biotin, fluorescence, attribute, extension, class codon, extension, TTTTTT]
Classification Prediction by Kernel-Based PLM library
• Zinc-finger protein specific to the value of Attribute 1 (magnetic bead, fluorescence): recover the Attribute 1 DNA using a magnet
• Zinc-finger protein specific to the value of Attribute 2 (magnetic bead, fluorescence): recover the Attribute 2 DNA using a magnet
• ....
• Recover the library DNA with streptavidin
Classification Prediction by Kernel-Based PLM library
• Zinc-finger protein specific to the value of Attribute n (magnetic bead, fluorescence): recover the Attribute n DNA using a magnet
• Zinc-finger protein specific to the value of Class (magnetic bead, fluorescence): fluorescence
• Recover the library DNA with streptavidin
Library Design
(a) Encoding for zinc-finger protein

            attribute1   attribute2   attribute3   …   class value
Positive    AAA          AAG          ACA          …   TTA
Negative    AAC          AAT          ACC          …   TTC

(b) Previous Library Design, (c) New Library Design
[Figure: codon strands under the labels Positive, Positive, Negative: AAA AAA AAT AAC ACT ACA AAA TTA AAA AAT TTA AAC ACT TTA AAA TTC AAA AAT TTC AAC ACT TTC]
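To make encoding table (a) concrete, the sketch below turns one example into a codon strand. It assumes that each attribute takes one of two values (positive or negative) and that a strand is the attribute codons followed by the class codon. The dictionary `CODONS` and the function `encode_example` are hypothetical names, and only the three attributes and the class codons listed in the table are covered.

```python
# Codon per (attribute, value), copied from encoding table (a).
# True stands for the Positive row, False for the Negative row (assumption).
CODONS = {
    "attribute1": {True: "AAA", False: "AAC"},
    "attribute2": {True: "AAG", False: "AAT"},
    "attribute3": {True: "ACA", False: "ACC"},
    "class": {True: "TTA", False: "TTC"},
}


def encode_example(attrs, label):
    """Return the codon strand for one example: attribute codons, then class codon."""
    parts = [CODONS[name][value] for name, value in attrs.items()]
    parts.append(CODONS["class"][label])
    return " ".join(parts)


if __name__ == "__main__":
    example = {"attribute1": True, "attribute2": False, "attribute3": True}
    print(encode_example(example, True))  # AAA AAT ACA TTA
```

Spaces between codons are kept only for readability. A real strand design would also need the linker, biotin, and fluorescence elements shown in the figures, which this sketch leaves out.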
Learning Algorithm
[Flowchart (a) Learning Algorithm: new example e → is e positive? → yes: find superset that differs in 2 attributes → Positive; no: find superset that differs in 2 attributes → Negative]
• Why separation? [Tradeoff, negative pruning]
• Why 2 attributes? [noise of example]
Classification of New Data
• ratio = size of positive library / size of negative library
• a = # of positive data
• b = # of negative data
• If a > b * ratio, the new data takes the positive value; otherwise it takes the negative value
(a) Classification Algorithm
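The decision rule on this slide can be written as a few lines of code. This is a minimal sketch, assuming the counts a and b and the two library sizes are already known and that the negative library is not empty. The function name `classify` and its parameter names are hypothetical.

```python
def classify(a, b, positive_library_size, negative_library_size):
    """Apply the slide's rule: positive when a > b * ratio, otherwise negative.

    a: number of positive data, b: number of negative data (as on the slide).
    Assumes negative_library_size > 0 so the ratio is defined.
    """
    ratio = positive_library_size / negative_library_size
    return "positive" if a > b * ratio else "negative"


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the experiments.
    print(classify(30, 10, 200, 100))  # ratio 2.0, 30 > 20  -> positive
    print(classify(15, 10, 200, 100))  # ratio 2.0, 15 <= 20 -> negative
```

Scaling b by the ratio compensates for the two sub-libraries having different sizes, so a larger positive library does not win by size alone. A tie (a equal to b * ratio) falls to the negative value because the comparison is strict.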
Experimental Result (a) Variation of Library size
Experimental Result

                 1     2     3     4     Avg
Correct (120)    112   112   112   112   112
(a) Correctness of 120 example data

                 1     2     3     4     Avg
Correct (60)     59    59    59    59    59
(b) Correctness of 60 example data

                 1     2     3     4     Avg
Correct (120)    118   118   118   118   118
(a) Correctness of 60 example data
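The correct counts can be converted to accuracy rates with simple arithmetic. The sketch below assumes the number in parentheses in each row header is the total number of examples tested. The function name `accuracy` is hypothetical.

```python
def accuracy(correct_counts, total):
    """Mean number of correct classifications divided by the number of examples."""
    mean_correct = sum(correct_counts) / len(correct_counts)
    return mean_correct / total


if __name__ == "__main__":
    print(f"{accuracy([112, 112, 112, 112], 120):.1%}")  # 93.3%
    print(f"{accuracy([59, 59, 59, 59], 60):.1%}")       # 98.3%
    print(f"{accuracy([118, 118, 118, 118], 120):.1%}")  # 98.3%
```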
Conclusion
• Zinc-finger protein
• No indexing
• Reasonable classification
• 2 sub-libraries