Physical Mapping Problem D7526010@csie.ntu.edu.tw
Problem Definition • Definition of physical mapping
[Figure: DNA with markers A to Q, and a fragment of DNA containing markers H and J]
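Why We Need Physical Mapping • The map can be used to sequence the DNA completely • It shows how genes actually act on humans • Artificial proteins ... and so on can be used to improve hereditary constitution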
Human chromosome (approx. bp) • Physical map (approx. bp) • DNA sequencing (approx. bp): AGACTAGTCGTAACGATCGCTAATTTAAGGCTACT.....
Why We Need Physical Mapping (continued) • The approximate position of genes (or markers) can be determined • More information can be obtained about some hereditary diseases • It can help detect whether a hereditary disease is present
[Figure: DNA with markers A to Q, and a fragment of DNA containing markers H and J, labeled α]
Target DNA • Add the enzyme
Partial Digest Problem • Digestion by a single enzyme A • Restriction sites: $a_1 < a_2 < a_3 < \cdots < a_p$ • Multiset of fragment lengths: $\{a_j - a_i : i < j\}$
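To make the multiset on this slide concrete, here is a minimal Python sketch that computes all pairwise fragment lengths from a list of restriction sites. The function name `partial_digest_lengths` and the example sites are illustrative assumptions, not taken from the slides.

```python
from itertools import combinations


def partial_digest_lengths(sites):
    """Return the sorted multiset {a_j - a_i : i < j} for the given sites."""
    ordered = sorted(sites)
    return sorted(hi - lo for lo, hi in combinations(ordered, 2))


# Hypothetical restriction sites, chosen only for illustration.
example_sites = [0, 2, 4, 7, 10]
lengths = partial_digest_lengths(example_sites)
print(lengths)
```

The Partial Digest Problem itself is the inverse task: given only `lengths`, recover a set of sites that produces it.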
Double Digest Problem (DDP) • Clones are completely digested first by enzyme A, then by B, and finally by A and B together • Restriction sites: • by A: $a_1 < a_2 < a_3 < \cdots < a_p$ • by B: $b_1 < b_2 < b_3 < \cdots < b_q$ • by A+B: $c_1 < c_2 < c_3 < \cdots < c_{p+q}$ • Reconstruct the restriction sites from the three multisets of fragment lengths
Example: DDP • Enzyme A+B: 1 2 3 3 5 6 7 • Enzyme A: 3 6 8 10 • Enzyme B: 4 5 7 11
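The example above is small enough to solve by exhaustive search. The sketch below is a brute-force illustration, not a method described in the slides: it tries every ordering of the A fragments and the B fragments and keeps those whose merged cut sites reproduce the A+B fragment lengths. The function names are assumptions.

```python
from collections import Counter
from itertools import accumulate, permutations


def cut_sites(fragments):
    """Internal cut positions when fragments are laid end to end."""
    return list(accumulate(fragments))[:-1]


def double_digest(frags_a, frags_b, frags_ab):
    """Return every (order_a, order_b) consistent with the A+B digest."""
    target = Counter(frags_ab)
    total = sum(frags_a)
    solutions = []
    for order_a in sorted(set(permutations(frags_a))):
        sites_a = set(cut_sites(order_a))
        for order_b in sorted(set(permutations(frags_b))):
            sites = sorted(sites_a | set(cut_sites(order_b)))
            bounds = [0] + sites + [total]
            merged = Counter(hi - lo for lo, hi in zip(bounds, bounds[1:]))
            if merged == target:
                solutions.append((order_a, order_b))
    return solutions


solutions = double_digest([3, 6, 8, 10], [4, 5, 7, 11], [1, 2, 3, 3, 5, 6, 7])
for order_a, order_b in solutions:
    print("A:", order_a, "B:", order_b)
```

One consistent arrangement is A = 10, 8, 6, 3 with B = 7, 5, 11, 4, which cuts at 7, 10, 12, 18, 23 and 24. Its mirror image is also a solution, which illustrates that DDP solutions are generally not unique. The search is factorial in the number of fragments, so it only suits toy inputs.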
The Spirit of Hybridization • Target DNA: ATGCGCTAACTGGACTTCAAGCCTAAACTGCATCAGACTT • Complementary probe: TACGCGATTGACCTGAAGT
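The pairing on this slide can be checked with a few lines of Python. This is a simplified sketch: it complements the probe base by base, as the slide draws it, and ignores strand orientation (a fuller model would use the reverse complement). The function names are assumptions.

```python
# Watson-Crick base pairing: A pairs with T, C pairs with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Base-by-base complement of a DNA string."""
    return seq.translate(COMPLEMENT)


def hybridization_sites(target, probe):
    """Start positions in target where the probe pairs perfectly."""
    pattern = complement(probe)
    last_start = len(target) - len(pattern)
    return [i for i in range(last_start + 1) if target.startswith(pattern, i)]


target = "ATGCGCTAACTGGACTTCAAGCCTAAACTGCATCAGACTT"
probe = "TACGCGATTGACCTGAAGT"
print(hybridization_sites(target, probe))
```

For the sequences on the slide, the probe pairs with the first 19 bases of the target.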
[Figure: target DNA with probes A to J and clones 1 to 5]
[Figure: the probes in the order J D F I E G A C H B, with clones 1 to 5]
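False Negative • True clones: 1 = {A, B, C}; 2 = {C, D, E}; 3 = {E, F}; 4 = {F, G}; 5 = {G, H, I}; 6 = {I, J, K} • Observed clones: 1 = {A, C}; 2 = {C, D, E}; 3 = {E, F}; 4 = {A, F, G}; 5 = {G, H, I}; 6 = {E, F, I, J, K} • Observed clone 1 is missing probe B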
False Positive • Same true and observed clones as on the previous slide • Observed clone 4 = {A, F, G} contains probe A, which the true clone 4 = {F, G} does not
Chimeric Clones • Same true and observed clones as on the previous two slides • Observed clone 6 = {E, F, I, J, K} joins the probes of true clone 3 = {E, F} and true clone 6 = {I, J, K}
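The three error types can be read off by comparing the true and observed probe sets clone by clone. The sketch below does this with set differences, using the clone data from these slides; the function name `hybridization_errors` is an assumption. Set differences alone cannot tell a chimeric clone from a run of false positives, so clone 6 simply shows up with two extra probes.

```python
TRUE_CLONES = {
    1: {"A", "B", "C"}, 2: {"C", "D", "E"}, 3: {"E", "F"},
    4: {"F", "G"}, 5: {"G", "H", "I"}, 6: {"I", "J", "K"},
}
OBSERVED_CLONES = {
    1: {"A", "C"}, 2: {"C", "D", "E"}, 3: {"E", "F"},
    4: {"A", "F", "G"}, 5: {"G", "H", "I"}, 6: {"E", "F", "I", "J", "K"},
}


def hybridization_errors(true_clones, observed_clones):
    """Per clone: probes that were missed and probes that were wrongly added."""
    report = {}
    for clone, truth in true_clones.items():
        seen = observed_clones[clone]
        report[clone] = {
            "missing": sorted(truth - seen),  # false negatives
            "extra": sorted(seen - truth),    # false positives or chimerism
        }
    return report


errors = hybridization_errors(TRUE_CLONES, OBSERVED_CLONES)
for clone, entry in errors.items():
    if entry["missing"] or entry["extra"]:
        print(clone, entry)
```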
Clones: 5 = {A, B, C}; 6 = {C, D, E}; 3 = {E, F}; 4 = {F, G}; 2 = {G, H, I}; 1 = {I, J, K} • [Figure: clone-probe matrix with clones 1 to 6 as columns and probes A to K as rows]
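The clone-probe matrix in this figure can be rebuilt from the probe sets listed for each clone. The following sketch assumes rows are probes A to K and columns are clones 1 to 6, as in the figure labels; the function name `incidence_matrix` is an assumption.

```python
CLONES = {1: "IJK", 2: "GHI", 3: "EF", 4: "FG", 5: "ABC", 6: "CDE"}
PROBES = "ABCDEFGHIJK"


def incidence_matrix(clones, probes):
    """matrix[r][c] is 1 when probe r hybridizes to clone c, else 0."""
    clone_ids = sorted(clones)
    return [[int(probe in clones[cid]) for cid in clone_ids] for probe in probes]


matrix = incidence_matrix(CLONES, PROBES)
print("  " + " ".join(str(cid) for cid in sorted(CLONES)))
for probe, row in zip(PROBES, matrix):
    print(probe, " ".join(str(bit) for bit in row))
```

Because the clone labels are arbitrary, the 1s in each row are scattered; the mapping problem is to reorder the clones (or probes) so that the 1s become consecutive.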
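Clones with errors: 5 = {A, B, C}; 6 = {C, D, E}; 3 = {E, F, K}; 4 = {I, J, K, F, G}; 2 = {G, H, I}; 1 = {I, J, K} • [Figure: clone-probe matrix with clones 1 to 6 as columns and probes A to K as rows]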
How to Use the Traveling Salesman Problem to Solve the Physical Mapping Problem
How to Convert to TSP? • Hamming distance • Cycle weight = number of gap transitions + 2n • So minimizing the cycle weight minimizes the number of gaps
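The relation between cycle weight and gaps can be checked numerically. The sketch below rests on assumptions the slide does not state: n is taken to be the number of clones, each probe is a 0/1 column over the clones, a dummy all-zero column closes the cycle, and each gap contributes two transitions. The clone data are the error-free sets from the earlier slides, and the function names are illustrative.

```python
CLONES = ["ABC", "CDE", "EF", "FG", "GHI", "IJK"]


def column(probe):
    """0/1 vector over clones: which clones contain this probe."""
    return [int(probe in clone) for clone in CLONES]


def hamming(u, v):
    """Number of positions where two equal-length vectors differ."""
    return sum(x != y for x, y in zip(u, v))


def cycle_weight(order):
    """Sum of Hamming distances around the tour, through a dummy zero column."""
    zero = [0] * len(CLONES)
    tour = [zero] + [column(p) for p in order] + [zero]
    return sum(hamming(u, v) for u, v in zip(tour, tour[1:]))


def gap_count(order):
    """Extra blocks of 1s, summed over clones, under this probe order."""
    gaps = 0
    for clone in CLONES:
        row = [p in clone for p in order]
        blocks = sum(1 for i, hit in enumerate(row) if hit and (i == 0 or not row[i - 1]))
        gaps += blocks - 1
    return gaps


for order in ("ABCDEFGHIJK", "ABDCEFGHIJK", "JDFIEGACHBK"):
    print(order, cycle_weight(order), gap_count(order))
```

Under these assumptions every clone contributes two transitions for its first block of 1s and two more for each gap, so the cycle weight equals 2n + 2 × (number of gaps), and a minimum-weight tour is a minimum-gap probe order.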
Our approach • We also convert the problem to an optimization problem • We use a more complicated model • We use a Genetic Algorithm to solve it • $F(A) = X \cdot C(A) + Y \cdot P(A) + Z \cdot N(A) + T \cdot M(A) + P \cdot L(A)$
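The slides do not define the terms of F(A), so the sketch below does not implement the authors' objective. It is a generic genetic algorithm over probe orderings that substitutes a much simpler fitness, the gap count, to show the overall shape of such a search. The selection scheme, order crossover, swap mutation, and all parameter values are assumptions.

```python
import random

CLONES = ["ABC", "CDE", "EF", "FG", "GHI", "IJK"]
PROBES = "ABCDEFGHIJK"


def gap_count(order):
    """Stand-in fitness (lower is better): extra blocks of 1s over all clones."""
    gaps = 0
    for clone in CLONES:
        row = [p in clone for p in order]
        blocks = sum(1 for i, hit in enumerate(row) if hit and (i == 0 or not row[i - 1]))
        gaps += blocks - 1
    return gaps


def order_crossover(mum, dad, rng):
    """Keep a slice of one parent; fill the rest in the other parent's order."""
    i, j = sorted(rng.sample(range(len(mum)), 2))
    middle = mum[i:j]
    rest = [p for p in dad if p not in middle]
    return rest[:i] + middle + rest[i:]


def mutate(order, rng):
    """Swap two randomly chosen probes."""
    i, j = rng.sample(range(len(order)), 2)
    swapped = order[:]
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return swapped


def genetic_search(generations=200, pop_size=40, mutation_rate=0.3, seed=0):
    rng = random.Random(seed)
    population = [rng.sample(PROBES, len(PROBES)) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=gap_count)
        parents = population[: pop_size // 2]  # elitist truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            child = order_crossover(mum, dad, rng)
            if rng.random() < mutation_rate:
                child = mutate(child, rng)
            children.append(child)
        population = parents + children
    return min(population, key=gap_count)


best = genetic_search()
print("".join(best), gap_count(best))
```

Because the better half of each generation is carried over unchanged, the best gap count never gets worse from one generation to the next. A weighted multi-term objective such as F(A) would replace `gap_count` as the sort key.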
Results of our approach on simulated data • (a) False negative rate 0.1, false positive rate 0.05 • (b) False negative rate 0.1, false positive rate 0.01
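The slides do not say how the simulated data were generated. One plausible way, shown here purely as an assumption, is to flip each entry of an error-free clone-probe matrix independently: a 1 becomes 0 with the false negative rate, and a 0 becomes 1 with the false positive rate. The function names and the small example matrix are hypothetical.

```python
import random


def flip(bit, fn_rate, fp_rate, rng):
    """Apply one independent error draw to a single matrix entry."""
    if bit:
        return 0 if rng.random() < fn_rate else 1
    return 1 if rng.random() < fp_rate else 0


def add_noise(matrix, fn_rate, fp_rate, seed=0):
    """Return a noisy copy of a 0/1 clone-probe matrix."""
    rng = random.Random(seed)
    return [[flip(bit, fn_rate, fp_rate, rng) for bit in row] for row in matrix]


# Hypothetical error-free matrix: 3 clones (rows) by 5 probes (columns).
clean = [
    [1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1],
]
noisy = add_noise(clean, fn_rate=0.1, fp_rate=0.05)
for row in noisy:
    print(row)
```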
Experimental results of our GA on real data from chromosome 1 • (a) GA run on a contig with about 95 clones and about 120 probes • (b) GA run on a contig with about 172 clones and about 136 probes