Data Processing Technologies for DNA Microarray • Nini Rao • School of Life Science and Technology, UESTC • 14/11/2004
Introduction • The Applications of SVD Technology • The Applications of NMF Technology • Summarization
Introduction • 1. Genes and Genomes • Gene: the basic unit of genetic function. • Gene Expression: the process by which genetic information at the DNA level is converted into functional proteins.
Introduction • Genome Structure: each organism contains a unique genomic sequence with a unique structure.
Genome data whose biological meaning is unknown are growing exponentially, so there is a need to mine these data.
Analysis of these new data requires mathematical tools that are adaptable to the large quantities of data, while reducing the complexity of the data to make them comprehensible.
2. A Microarray • A small analytical device that allows genomic exploration with speed and precision unprecedented in the history of biology. • The technology was introduced in the 1990s.
3. Microarray Analysis • The process of using microarrays for scientific exploration. • Many techniques for microarray analysis have been adopted since the early 1990s.
5. The Roles of Microarray • To monitor gene expression levels on a genomic scale • To enhance fundamental understanding of life on the molecular level: regulation of gene expression, gene function, cellular mechanisms • Medical diagnosis, treatment, drug design
Applications of SVD • Mathematical definition of the SVD: $X = U S V^{T}$, where X is the m×n data matrix • U is an m×n matrix • S is an n×n diagonal matrix (the singular values) • $V^{T}$ is also an n×n matrix
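The definition above can be checked numerically. The following is a minimal sketch using NumPy on a small synthetic matrix; the sizes m = 6 and n = 4 and the random data are assumptions for illustration, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 6, 4                      # hypothetical sizes, chosen only for the demo
X = rng.normal(size=(m, n))      # synthetic stand-in for an expression matrix

# Thin SVD: U is m x n, s holds the n singular values, Vt is n x n.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
S = np.diag(s)                   # n x n diagonal matrix

# Multiplying the three factors back together recovers X.
X_rebuilt = U @ S @ Vt
```

`full_matrices=False` is what gives U the m×n shape stated on the slide; the default would return an m×m matrix instead.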
$X^{(l)}$ is the closest rank-l matrix to X. • "Closest" means that $X^{(l)}$ minimizes the sum of the squared differences between the elements of X and $X^{(l)}$: $\sum_{ij} |x_{ij} - x^{(l)}_{ij}|^{2} = \min$
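A standard way to obtain the closest rank-l matrix is to keep only the l largest singular values (the Eckart-Young result). The sketch below assumes that construction; the helper name `rank_l_approximation` and the random test matrix are hypothetical. The minimized sum of squares then equals the sum of the squared discarded singular values.

```python
import numpy as np

def rank_l_approximation(X, l):
    """Keep the l largest singular values of X and drop the rest."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :l] @ np.diag(s[:l]) @ Vt[:l, :]

rng = np.random.default_rng(1)
X = rng.normal(size=(8, 5))      # synthetic data, for illustration only
l = 2
X_l = rank_l_approximation(X, l)

# Sum of squared element-wise differences between X and its approximation.
squared_error = np.sum((X - X_l) ** 2)

# Squared singular values that were discarded.
s = np.linalg.svd(X, compute_uv=False)
discarded = np.sum(s[l:] ** 2)
```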
Result analysis for pattern inference • (a) Raster display of $V^{T}$, the expression of 14 eigengenes in 14 arrays. • (b) Bar chart of the fractions of eigenexpression. • (c) Line-joined graphs of the expression levels of r1 (red) and r2 (blue) in the 14 arrays fit dashed graphs of normalized sine (red) and cosine (blue) of period T = 390 min and phase 2π/13, respectively.
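The fraction of eigenexpression of the l-th eigengene is commonly defined as its squared singular value divided by the sum of all squared singular values. The sketch below computes it for synthetic periodic data; the 200 genes, the noise level and the random phases are assumptions, and only the 14 arrays and the 390 min period are taken from the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_arrays = 200, 14                 # 200 genes is a made-up size
T = 390.0                                   # period in minutes, from the slide
t = np.linspace(0.0, T, n_arrays)           # assumed sampling times

# Each synthetic gene is a cosine with its own phase plus a little noise.
gene_phase = rng.uniform(0.0, 2.0 * np.pi, size=n_genes)
X = np.cos(2.0 * np.pi * t[None, :] / T - gene_phase[:, None])
X += 0.05 * rng.normal(size=X.shape)

U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Fraction of eigenexpression carried by each eigengene (row of Vt).
fractions = s**2 / np.sum(s**2)

# The two dominant eigengenes play the role of r1 and r2 in panel (c).
r1, r2 = Vt[0], Vt[1]
```

For data of this kind the first two fractions dominate, which is what the bar chart in panel (b) is meant to show.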
Result analysis for data sorting • Fig. 3. Genes sorted by relative correlation with r1 and r2 of normalized elutriation. • (a) Normalized elutriation expression of the sorted 5,981 genes in the 14 arrays, showing a traveling wave of expression. • (b) Eigenarrays expression; the expression of a1 and a2, the eigenarrays corresponding to r1 and r2, displays the sorting. • (c) Expression levels of a1 (red) and a2 (green) fit normalized sine and cosine functions (blue) of period Z = N-1 = 5,980 and phase Q = 2π/13, respectively.
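One way to sort genes by their relative correlation with two eigengenes is to project each gene onto r1 and r2 and order the genes by the angle of that projection. The sketch below assumes this angle-based ordering and uses noise-free synthetic data with 50 genes; both choices are illustrative and are not confirmed by the slides.

```python
import numpy as np

rng = np.random.default_rng(3)
n_genes, n_arrays = 50, 14
T = 390.0
t = np.linspace(0.0, T, n_arrays)            # assumed sampling times

# Noise-free synthetic genes: cosines with random, known phases.
true_phase = rng.uniform(0.0, 2.0 * np.pi, size=n_genes)
X = np.cos(2.0 * np.pi * t[None, :] / T - true_phase[:, None])

U, s, Vt = np.linalg.svd(X, full_matrices=False)
r1, r2 = Vt[0], Vt[1]                        # two dominant eigengenes

# Projection of every gene onto r1 and r2, then the angle in that plane.
c1 = X @ r1
c2 = X @ r2
angle = np.arctan2(c2, c1)

# Sorting by angle arranges the genes along the traveling wave.
order = np.argsort(angle)
X_sorted = X[order]
```

Plotting `X_sorted` as a raster would show the traveling wave described in panel (a); the direction of the wave depends on the arbitrary signs of the singular vectors.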
Other Applications for SVD • Missing data • Comparison between two genomic sequences
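The Applications of NMF • Mathematical definition of the NMF: $V_{n \times m} \approx W_{n \times r} H_{r \times m}$, with all entries of V, W and H non-negative. • In general, r is chosen so that (n+m)r < nm. • NMF can be used to extract features that are hidden in a dataset.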
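The factorization can be sketched with the multiplicative update rules of Lee and Seung for the squared-error objective. This is one common NMF algorithm, chosen here as an assumption; the slides do not say which algorithm was used. The function name `nmf`, the sizes and the synthetic matrix are hypothetical.

```python
import numpy as np

def nmf(V, r, n_iter=500, eps=1e-9, seed=0):
    """Approximate non-negative V (n x m) by W (n x r) @ H (r x m)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r)) + eps
    H = rng.random((r, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry of W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

rng = np.random.default_rng(2)
n, m, r = 20, 12, 3                           # (n + m) * r = 96 < n * m = 240
V = rng.random((n, r)) @ rng.random((r, m))   # synthetic non-negative data

W, H = nmf(V, r)
relative_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

The rows of H can be read as r hidden features and the rows of W as the non-negative weights that mix them to rebuild each row of V.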
Summarization • 1. SVD: normalization; no data limitation. NMF: no normalization; positive data. • 2. SVD: missing data, clustering, pattern inference, weak pattern extraction, comparison. NMF: pattern inference, clustering, finding similarity. • 3. ICA is used for mining DNA microarray data.