Introduction to Biomedical Imaging Techniques
Overview • Image Processing • Medical Imaging • Bioinformatics • Microarray • Data Analysis and Mining • Validation
Image Processing (I) • Fundamental units • 2D – pixel • 3D – voxel • Orthogonal views: transverse (or axial), coronal, sagittal • Image processing: preprocessing, segmentation, object representation and recognition, quantitative analysis [Figure: transverse plane, coronal plane, sagittal plane]
Image Processing (II) – Slices vs. Projections • Maximum Intensity Projection (MIP) • Weighted-Sum Projection [Figure: volume with X, Y, and Z axes]
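The two projections named on this slide can be sketched in a few lines. This is a minimal illustration, not the presenter's implementation: it assumes the volume is stored as a nested list indexed `volume[z][y][x]` and that the projection runs along the Z axis; the function names and the weights are hypothetical.

```python
def mip_along_z(volume):
    """Maximum Intensity Projection: keep the brightest voxel along Z for each (y, x)."""
    depth, rows, cols = len(volume), len(volume[0]), len(volume[0][0])
    return [[max(volume[z][y][x] for z in range(depth)) for x in range(cols)]
            for y in range(rows)]


def weighted_sum_along_z(volume, weights):
    """Weighted-sum projection: combine the slices along Z using one weight per slice."""
    if len(weights) != len(volume):
        raise ValueError("need exactly one weight per slice")
    rows, cols = len(volume[0]), len(volume[0][0])
    return [[sum(w * volume[z][y][x] for z, w in enumerate(weights)) for x in range(cols)]
            for y in range(rows)]


# Two 2x2 slices stacked along Z (toy data).
volume = [[[1, 2], [3, 4]],
          [[5, 0], [1, 9]]]
mip = mip_along_z(volume)
avg = weighted_sum_along_z(volume, [0.5, 0.5])  # equal weights give the mean slice
```

A MIP highlights bright structures such as contrast-filled vessels, while a weighted sum retains contributions from every slice.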
Image Processing Procedure • Visualization Tools (CVGIP2001, Aug. 19-21, 2001)
Image Processing (III) • Preprocessing – enhancement of image, removal of noise, transformation • Segmentation – partition an image into disjoint regions: threshold-based, region-based, contour-based, hybrid • Object representation & recognition • Quantitative analysis
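As a rough illustration of the threshold-based approach listed above, the sketch below partitions a grayscale image into foreground and background with a single global threshold. The image layout (a list of rows) and the threshold value are assumptions chosen for the example; region-based, contour-based, and hybrid methods are not shown.

```python
def threshold_segment(image, threshold):
    """Label each pixel 1 (object) if its intensity is >= threshold, else 0 (background)."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]


# Toy 3x3 image with a bright region in the lower right.
image = [[10, 12, 11],
         [13, 200, 210],
         [12, 205, 198]]
mask = threshold_segment(image, threshold=100)
```

The two label values form the disjoint regions; real images usually need preprocessing (noise removal) first so that isolated noisy pixels do not end up in the wrong region.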
Medical Imaging • Objective: assist the physician in studying and identifying anomalies in the organism • Medical domain knowledge + image processing + visualization • Topological & geometrical analysis • Validation issues – how to obtain ground truth? • Need for human interaction • Scanning techniques – CT, MR, PET, Ultrasound • Resolution issue: delta z is usually significantly larger than delta x and delta y
3D CT Images and Analysis • Human liver • Rat heart • Rat liver (CVGIP2002, Aug. 25-27, 2002)
Results: 3D CT Image Analysis CVGIP2002, Aug. 25-27, 2002
Bioinformatics • Use information technology to solve biological problems • Reasons to exist • A large amount of biological data • Identification of known information • Discovery of hidden patterns • Quantification • DNA -> mRNA -> amino acid -> protein • A, G, C, T basics – base pair, pairing rule • Exon, intron, gene • PCR (Polymerase Chain Reaction)
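The pairing rule (A with T, G with C) and the DNA -> mRNA step can be illustrated with a short sketch. The function names are hypothetical, and the transcription helper assumes its input is the coding strand, so the mRNA is the same sequence with U in place of T.

```python
# Watson-Crick pairing rule for DNA bases.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIR[base] for base in strand)


def reverse_complement(strand):
    """Return the opposite strand read in its own 5'->3' direction."""
    return complement(strand)[::-1]


def transcribe(coding_strand):
    """Return the mRNA for a coding strand (assumption: T is simply replaced by U)."""
    return coding_strand.replace("T", "U")
```

For example, `reverse_complement("ATGC")` gives `"GCAT"`, the sequence that would hybridize with `ATGC`.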
Microarray • Arraying process • Ordered sets (in a 2D array) of DNA molecules (usually oligonucleotides or cDNA) attached to a solid surface (glass, silicon, or nylon) • The surface (matrix) is coated with material to make it reactive; known DNA sequence segments (of genes) are spotted on the surface and hybridized with the specimen’s RNA, which is labeled (usually with fluorescent nucleotides) • Spot intensities (amounts of fluorescence) correspond to the transcript levels of particular genes [Figure label: 35 to hundreds of bp]
Microarray Data • Organized as an m x n matrix M • m: no. of genes; n: no. of samples • M_ij: expression level of gene i in sample j • Row e_i: expression pattern of gene i • Column s_j: expression pattern of sample j • Ratioing: e_ij = log(M_ij / g_i) • g_i: expression level of gene i in a control • The logarithm (log) is useful for normalizing expression profiles: equal fold increases and decreases become symmetric around zero
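The ratioing step can be sketched as follows. The slide does not state the base of the logarithm; base 2 is assumed here because it is a common choice for expression ratios. The function name and the toy numbers are illustrative only.

```python
import math


def log_ratios(M, g):
    """Compute e_ij = log2(M_ij / g_i) for an m x n expression matrix M and control levels g."""
    return [[math.log2(m_ij / g[i]) for m_ij in row] for i, row in enumerate(M)]


# Two genes (rows), two samples (columns); control level 100 for both genes.
M = [[200, 50],
     [100, 100]]
g = [100, 100]
E = log_ratios(M, g)
```

With base 2, a two-fold increase maps to +1 and a two-fold decrease to -1, which shows why the log transform puts up- and down-regulation on the same scale.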
Discovering Microarray Data (I) • Genes w/ similar expression patterns over all samples • Compare e_i and e_j • Cluster analysis to group similar expression patterns • Define novel genes that are similar to known genes • Characterize cluster functions • Find genes which fit a prototypical pattern over samples
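One way to compare two expression patterns is sketched below. The slide does not name a similarity measure; Pearson correlation is assumed here as a common choice, and the function name is hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression patterns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)  # undefined (division by zero) for a constant pattern


r_same = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])      # same shape, different scale
r_opposite = pearson([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])  # mirrored pattern
```

Values near +1 indicate genes that rise and fall together across samples; values near -1 indicate opposite behavior.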
Discovering Microarray Data (II) • Genes w/ unusual expression levels in a sample • Examine the change of expression levels in a given s_j vector • Determine outlier genes of interest • Genes whose expression levels vary across samples • Monitor pre- and post-state of genes • Samples w/ similar expression patterns • Determine possible clusters of samples • Discover genetic clues that may help identify subtypes of a cancer • Tissues that might be cancerous • Classify or diagnose cancers by relating them to information from expression patterns of known cancerous and non-cancerous tissues
Ideal Microarray Images • Sub-grids are of the same size • Sub-grids are equally spaced • Spots are circular and aligned • Grid positions are fixed across images • No dust or contamination is present • Minimal undesirable noise in the image • Background intensity is uniform across the image
Microarray Image Processing • Preprocessing – noise removal • Spot localization & image segmentation • Feature extraction • Data quantification • Quality measurements • Data analysis & Visualization
Statistics Concerns (I) • Experiment design • Sufficient experiments (e.g., minimally 5-10 samples of cancerous tissue and 5-10 samples of non-cancerous tissue for identification of cancerous tissue sample) • Samples must be independent – giving sufficient dimensionality for discriminating genes • Replicates of each sample help improve robustness
Statistics Concerns (II) • Preprocessing of the data • Cy5/Cy3 values of spotted array data are usually log-transformed to normalize the expression profile (why?) • Chip-to-chip scaling is needed to account for variation in overall chip intensity • Subtracting the mean from each component of a gene expression pattern • Further, dividing the above result by its standard deviation, so that each pattern has unit variance • Dimensionality reduction, e.g., principal components analysis (PCA) or Multidimensional Scaling (MDS)
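The centering and scaling steps can be sketched as below. This version divides by the standard deviation, the usual convention for producing patterns with zero mean and unit variance; the function name and the handling of constant patterns are assumptions made for the example.

```python
import statistics


def standardize(pattern):
    """Center a gene expression pattern on zero and scale it to unit variance."""
    mean = statistics.fmean(pattern)
    sd = statistics.pstdev(pattern)  # population standard deviation
    if sd == 0:
        # Assumption: a constant pattern carries no shape information, so return zeros.
        return [0.0 for _ in pattern]
    return [(value - mean) / sd for value in pattern]


z = standardize([1.0, 2.0, 3.0, 4.0])
```

After this step, patterns can be compared by shape alone, independent of each gene's overall expression level.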
Statistics Concerns (III) • Statistical tools • Cluster analysis • Probability theory • Statistical inference • Classification
More Issues on Microarray Data (I) • Each dot on the microarray chip corresponds to a Southern blot experiment • Dyes: Cy3 & Cy5; dye swapping • Color scale used in gene expression profile • Observations from the gene expression profile • What if the microarray image is mistreated?
More Issues on Microarray Data (II) • Organizing Microarray Data – Hierarchical Clustering [Figure axes: time points; different genes]
Identify Spots (I) • Projection
Data Analysis & Mining • Analysis and reasoning tools • Clustering and classification: k-means, SOM, SVM, etc. • Bayes’ rule • Decision tree vs. hierarchical clustering • Sequence alignment – BLAST • Online tools – NCBI-bound • Association
Decision Trees • Choose between options by projecting likely outcomes • Draw a decision tree in terms of alternative decisions and all possible outcomes • Evaluate the decision tree
Decision Tree Example: Stick-Picking Game • Two players • Players take turns removing either 1 or 2 sticks at a time • The player taking the last stick(s) loses the game • Starting from 4 sticks, the second player to pick can always win! [Figure: game tree rooted at 4 sticks; each branch removes 1 or 2 sticks and each node shows the sticks remaining, down to 0]
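The game tree can also be evaluated by a short recursive search. This is a sketch under the rules stated on the slide (take 1 or 2 sticks, whoever takes the last stick loses); the function name is hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def mover_wins(sticks):
    """Return True if the player about to move can force a win with `sticks` (>= 1) left."""
    for take in (1, 2):
        if take > sticks:
            continue
        remaining = sticks - take
        if remaining == 0:
            continue  # taking the last stick(s) loses, so this move is never a winning one
        if not mover_wins(remaining):
            return True  # leaves the opponent in a losing position
    return False


first_player_wins_from_4 = mover_wins(4)
```

The search confirms the slide's claim: with 4 sticks the player who moves first has no winning move, so the second player can always win. The same holds for any starting count that leaves remainder 1 when divided by 3.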
K-Means Clustering Method • A k-means algorithm is implemented in 4 steps: • Step 1: Partition objects into k nonempty subsets • Step 2: Compute seed points as the centroids of the clusters of the current partition. The centroid is the center (mean point) of the cluster. • Step 3: Assign each object to the cluster with the nearest seed point. • Step 4: Go back to Step 2; stop when no assignment changes.
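The four steps can be sketched directly. This is a minimal illustration rather than a production implementation: the initial partition is a simple round-robin assignment, distance is squared Euclidean, and a cluster that becomes empty keeps its previous centroid; all three choices are assumptions not stated on the slide.

```python
def centroid(members):
    """Mean point of a non-empty list of equal-length tuples."""
    n = len(members)
    return tuple(sum(column) / n for column in zip(*members))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Step 1: partition the objects into k nonempty subsets (round-robin).
    labels = [i % k for i in range(len(points))]
    centroids = [None] * k
    for _ in range(max_iter):
        # Step 2: compute the centroid of each cluster in the current partition.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an emptied cluster keeps its previous centroid
                centroids[c] = centroid(members)
        # Step 3: assign each object to the cluster with the nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        # Step 4: stop when no assignment changes, otherwise repeat from Step 2.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(points, k=2)
```

On this toy data the two tight groups end up in separate clusters after two passes. The result of k-means generally depends on the initial partition, so repeated runs with different starts are common in practice.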
Model Learning – Training vs. Testing • Training – model development • Testing – performance evaluation • Concerns • Testing data must be distinct from training data • Make efficient use of limited data
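One common way to address both concerns at once is k-fold cross-validation, sketched below. The slide does not name this technique, so it is offered only as an example; the function name is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists; every sample is used for testing exactly once."""
    if not 1 < k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


splits = list(k_fold_indices(10, 5))
```

Each split keeps testing data disjoint from training data, and across the k splits every sample contributes to both training and testing, which matters when only a handful of tissue samples are available.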
Validation – Sensitivity vs. Specificity • Sensitivity and specificity • True positive fraction (TPF) = sensitivity: an abnormal case is correctly reported as abnormal • False positive fraction (FPF) = 1 - specificity: a normal case is falsely reported as abnormal
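These fractions follow directly from the four counts of a confusion matrix. The sketch below uses made-up counts purely for illustration; the function name is hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from true/false positive and negative counts."""
    sensitivity = tp / (tp + fn)  # TPF: abnormal cases correctly reported as abnormal
    specificity = tn / (tn + fp)  # normal cases correctly reported as normal
    return sensitivity, specificity


# Illustrative counts: 50 abnormal cases and 50 normal cases.
sens, spec = sensitivity_specificity(tp=40, fn=10, tn=45, fp=5)
fpf = 1 - spec  # FPF: normal cases falsely reported as abnormal
```

With these counts the sensitivity is 0.8, the specificity is 0.9, and the false positive fraction is about 0.1.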
Validation – ROC • Receiver operating characteristic (ROC) curve: plots TPF against FPF as the decision threshold varies • Observers are asked to rate images using a multi-point scale of certainty (e.g., definitely normal to definitely abnormal)
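Observer ratings can be turned into ROC points by treating each scale level as a decision threshold. The sketch below assumes a 5-point scale (1 = definitely normal, 5 = definitely abnormal) and invented ratings; both are illustrative assumptions, as is the function name.

```python
def roc_points(ratings, truth, levels=(1, 2, 3, 4, 5)):
    """Return (FPF, TPF) points, calling a case abnormal when its rating >= threshold."""
    n_pos = sum(1 for y in truth if y)
    n_neg = len(truth) - n_pos
    points = [(0.0, 0.0)]  # threshold above the top level: nothing is called abnormal
    for threshold in sorted(levels, reverse=True):
        tp = sum(1 for r, y in zip(ratings, truth) if r >= threshold and y)
        fp = sum(1 for r, y in zip(ratings, truth) if r >= threshold and not y)
        points.append((fp / n_neg, tp / n_pos))
    return points


ratings = [5, 4, 4, 2, 1, 3, 2, 1]  # observer certainty for each image
truth = [1, 1, 1, 1, 0, 0, 0, 0]    # ground truth: 1 = abnormal, 0 = normal
points = roc_points(ratings, truth)
```

Lowering the threshold moves along the curve from (0, 0) toward (1, 1); a curve that rises quickly toward the upper-left corner indicates an observer who separates normal from abnormal cases well.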