Introduction to Biomedical Imaging Techniques

Overview • Image Processing • Medical Imaging • Bioinformatics • Microarray • Data Analysis and Mining • Validation

Image Processing (I) • Fundamental units • 2D – pixel • 3D – voxel • Orthogonal views: transverse (or axial), coronal, sagittal • Image processing: preprocessing, segmentation, object representation and recognition, quantitative analysis [Figure labels: transverse plane, coronal plane, sagittal plane]

Image Processing (II) – Slices vs. Projections • Maximum Intensity Projection (MIP) • Weighted-Sum Projection [Figure axes: X, Y, Z]
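
The two projections named on this slide can be sketched as follows. This is a minimal illustration, assuming the volume is stored as nested lists indexed `volume[z][y][x]` and projected along Z; the function names, the toy volume, and the equal slice weights are hypothetical choices, not taken from the slides.

```python
def max_intensity_projection(volume):
    """Collapse volume[z][y][x] along z, keeping the brightest voxel on each ray."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    return [[max(volume[z][y][x] for z in range(depth)) for x in range(width)]
            for y in range(height)]


def weighted_sum_projection(volume, weights):
    """Collapse volume[z][y][x] along z as a weighted sum, one weight per slice."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    return [[sum(weights[z] * volume[z][y][x] for z in range(depth))
             for x in range(width)]
            for y in range(height)]


# Toy 2-slice volume (made-up intensities).
volume = [
    [[0, 1], [2, 3]],  # slice z = 0
    [[4, 0], [1, 5]],  # slice z = 1
]
mip = max_intensity_projection(volume)
wsp = weighted_sum_projection(volume, [0.5, 0.5])  # equal weights: an assumption
```

MIP keeps only the strongest response along each ray, whereas the weighted sum lets every slice contribute, so the choice of weights controls which depths dominate the projected image.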

Visualization Tools Image Processing Procedure CVGIP2001, Aug. 19-21, 2001

Image Processing (III) • Preprocessing – enhancement of image, removal of noise, transformation • Segmentation – partition an image into disjoint regions: threshold-based, region-based, contour-based, hybrid • Object representation & recognition • Quantitative analysis
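
Of the segmentation families listed above, the threshold-based one is the simplest to sketch. The example below assumes a grayscale image stored as nested lists and a fixed, hand-picked threshold; `threshold_segment`, the toy image, and the value 128 are illustrative assumptions.

```python
def threshold_segment(image, threshold):
    """Label each pixel 1 (object) if it is at least `threshold`, else 0 (background)."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]


# Toy grayscale image (made-up intensities in the range 0-255).
image = [
    [10, 200, 30],
    [220, 210, 20],
]
mask = threshold_segment(image, 128)
```

The resulting mask splits the image into two disjoint regions; region-based and contour-based methods would additionally use spatial connectivity, which this sketch ignores.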

Medical Imaging • Objective: assist the physician in studying and identifying anomalies in the organism • Medical domain knowledge + image processing + visualization • Topological & geometrical analysis • Validation issues – how to obtain ground truth? • Need for human interaction • Scanning techniques – CT, MR, PET, Ultrasound • Resolution issue: Δz (slice spacing) is usually significantly larger than Δx and Δy (in-plane pixel spacing)

3D CT Images and Analysis • Human liver • Rat heart • Rat liver (CVGIP2002, Aug. 25-27, 2002)

Results: 3D CT Image Analysis CVGIP2002, Aug. 25-27, 2002

Bioinformatics • Use information technology to solve biological problems • Reasons to exist • A large amount of biological data • Identification of known information • Discovery of hidden patterns • Quantification • DNA -> mRNA -> amino acid -> protein • A, G, C, T basics – base pair, pairing rule • Exon, intron, gene • PCR (Polymerase Chain Reaction)

Microarray Arraying Process • Ordered sets (in a 2D array) of DNA molecules (usually oligonucleotides or cDNA) attached to a solid surface (glass, silicon, or nylon) • The surface (matrix) is coated with material to make it reactive; known DNA sequence segments (of genes) are spotted on the surface and hybridized with the specimen’s RNA, which is labeled (usually with fluorescent nucleotides) • Spot intensities (the amount of fluorescence) correspond to transcript levels of a particular gene [Figure label: 35 to hundreds of bp]

Microarray Data • Organized as an $m \times n$ matrix $M$ • $m$: number of genes; $n$: number of samples • $M_{ij}$: expression level of gene $i$ in sample $j$ • Row $e_i$: expression pattern of gene $i$ • Column $s_j$: expression pattern of sample $j$ • Ratioing: $e_{ij} = \log(M_{ij}/g_i)$ • $g_i$: expression level of gene $i$ in a control • The logarithm is useful for normalizing the expression profile
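
The ratioing step can be sketched in a few lines. The slide does not state the base of the logarithm; base 2 is assumed here because it makes a two-fold increase and a two-fold decrease symmetric (+1 and -1). The function name and the toy numbers are hypothetical.

```python
import math


def log_ratios(M, g):
    """Return e with e[i][j] = log2(M[i][j] / g[i]) for an m-by-n expression matrix M."""
    return [[math.log2(M[i][j] / g[i]) for j in range(len(M[i]))]
            for i in range(len(M))]


# Two genes (rows) measured in two samples (columns); made-up intensities.
M = [[200.0, 50.0],
     [80.0, 80.0]]
g = [100.0, 80.0]  # control expression level of each gene
e = log_ratios(M, g)
```

With these numbers, the first gene is doubled in sample 1 and halved in sample 2, and the log ratio places both changes at the same distance from zero, which a raw ratio (2 versus 0.5) does not.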

Discovering Microarray Data (I) • Genes with similar expression patterns over all samples • Compare $e_i$ and $e_j$ • Cluster analysis to group similar expression patterns • Identify novel genes that are similar to known genes • Characterize cluster functions • Find genes that fit a prototypical pattern over samples

Discovering Microarray Data (II) • Genes with unusual expression levels in a sample • Examine the change of expression levels in a given sample vector $s_j$ • Determine outlier genes of interest • Genes whose expression levels vary across samples • Monitor the pre- and post-state of genes • Samples with similar expression patterns • Determine possible clusters of samples • Discover genetic clues that may help identify subtypes of a cancer • Tissues that might be cancerous • Classify or diagnose cancers by relating them to information from expression patterns of known cancerous and non-cancerous tissues

Ideal Microarray Images • Sub-grids are of the same size • Sub-grids are equally spaced • Spots are circular and their locations are aligned • Grid positions are fixed across images • No dust or contamination is present • Minimal undesirable noise in the image • Background intensity is uniform across the image

Microarray Image Processing • Preprocessing – noise removal • Spot localization & image segmentation • Feature extraction • Data quantification • Quality measurements • Data analysis & Visualization

Statistics Concerns (I) • Experiment design • Sufficient experiments (e.g., minimally 5-10 samples of cancerous tissue and 5-10 samples of non-cancerous tissue for identification of cancerous tissue sample) • Samples must be independent – giving sufficient dimensionality for discriminating genes • Replicates of each sample help improve robustness

Statistics Concerns (II) • Preprocessing of the data • Cy5/Cy3 values of spotted-array data are usually log-transformed to normalize the expression profile (why?) • Chip-to-chip scaling is needed to account for variation in overall chip intensity • Subtract the mean from each component of a gene expression pattern • Then divide the result by its standard deviation, so that each pattern has zero mean and unit variance • Dimensionality reduction, e.g., principal component analysis (PCA) or multidimensional scaling (MDS)
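
The per-gene preprocessing steps above (log transform, mean subtraction, scaling) can be sketched as below. The scaling step here divides by the standard deviation so that each pattern ends up with unit variance; that choice, the base-2 logarithm, the function name, and the toy ratios are assumptions for illustration. Chip-to-chip scaling and dimensionality reduction are not covered.

```python
import math
import statistics


def standardize(pattern):
    """Center a gene expression pattern on zero and scale it to unit variance."""
    mean = statistics.mean(pattern)
    sd = statistics.pstdev(pattern)  # population standard deviation
    centered = [value - mean for value in pattern]
    if sd == 0:
        return centered  # a flat pattern has no spread to rescale
    return [value / sd for value in centered]


# Made-up Cy5/Cy3 ratios of one gene across three samples.
ratios = [4.0, 1.0, 0.25]
log_pattern = [math.log2(r) for r in ratios]  # log transform first
z = standardize(log_pattern)
```

After this step, two genes whose patterns rise and fall together compare as similar even if their absolute expression levels differ widely.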

Statistics Concerns (III) • Statistical tools • Cluster analysis • Probability theory • Statistical inference • Classification

More Issues on Microarray Data (I) • Each dot on the microarray chip corresponds to a Southern blot experiment • Dyes: dye-3 & dye-5; dye swapping • Color scale used in gene expression profile • Observations from the gene expression profile • What if the microarray image is mistreated?

More Issues on Microarray Data (II) • Organizing Microarray Data – Hierarchical Clustering [Figure axes: time points, different genes]

Background and Significance

Pre-processing – Spot Smoothing

Identify Spots (I) • Projection

Identify Spots (II)

Identify Spots (III)

Data Analysis & Mining • Analysis and reasoning tools • Clustering and classification: k-means, SOM, SVM, etc. • Bayes’ rule • Decision tree vs. hierarchical clustering • Sequence alignment – BLAST • Online tools – NCBI-bound • Association

Bayes’ Rule

Bayes’ Rule (cont’d)
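
For reference, the standard statement of Bayes' rule is sketched below. The event symbols $A$ and $B$ are generic placeholders chosen here, not notation from the slides; in a diagnostic setting, $A$ could be "tissue is cancerous" and $B$ "test reports abnormal".

```latex
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)},
\qquad
P(B) = P(B \mid A)\,P(A) + P(B \mid \neg A)\,P(\neg A)
```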

Decision Trees • Choose between options by projecting likely outcomes • Draw a decision tree in terms of alternative decisions and all possible outcomes • Evaluate the decision tree

Decision Tree Example: Stick-Picking Game • Two players • Players take turns removing either 1 or 2 sticks at a time • The player who takes the last stick(s) loses the game • Starting from 4 sticks, the second player to pick can always win! [Figure: decision tree starting from 4 sticks]
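
The game tree can also be evaluated programmatically. The sketch below assumes the rules as stated on the slide (take 1 or 2 sticks, taking the last stick loses) and walks every branch recursively; `mover_wins` is a hypothetical helper name.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def mover_wins(sticks):
    """Return True if the player about to move can force a win with `sticks` left."""
    for take in (1, 2):
        if take > sticks:
            continue  # cannot take more sticks than remain
        remaining = sticks - take
        if remaining == 0:
            continue  # taking the last stick loses, so this is never a winning move
        if not mover_wins(remaining):
            return True  # this move leaves the opponent in a losing position
    return False


first_player_wins_from_4 = mover_wins(4)
```

With 4 sticks, every move by the first player leaves 2 or 3 sticks, and from either of those the opponent can leave exactly 1 stick. That is why the second player can always win from this starting position.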

K-Means Clustering Method • A k-means algorithm is implemented in 4 steps: • Step 1: Partition the objects into k nonempty subsets. • Step 2: Compute seed points as the centroids of the clusters of the current partition; the centroid is the center (mean point) of the cluster. • Step 3: Assign each object to the cluster with the nearest seed point. • Step 4: Go back to Step 2; stop when no assignment changes.
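
The four steps can be sketched in plain Python as follows. This is a minimal illustration, not a production implementation: the round-robin initial partition, Euclidean distance, the rule that an emptied cluster keeps its previous seed point, the `max_iter` cap, and the toy points are all assumptions made here.

```python
import math


def centroid(members):
    """Mean point of a non-empty list of equal-length tuples."""
    dim = len(members[0])
    return tuple(sum(p[d] for p in members) / len(members) for d in range(dim))


def kmeans(points, k, max_iter=100):
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Step 1: partition the objects into k nonempty subsets (round-robin).
    labels = [i % k for i in range(len(points))]
    centroids = [None] * k
    for _ in range(max_iter):
        # Step 2: compute seed points as centroids of the current partition.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an emptied cluster keeps its previous seed point
                centroids[c] = centroid(members)
        # Step 3: assign each object to the cluster with the nearest seed point.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        # Step 4: stop when no assignment changes, otherwise repeat from Step 2.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Two visually separate groups of 2D points (made-up data).
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
labels, centroids = kmeans(points, k=2)
```

The result depends on the initial partition; in practice the algorithm is often run several times from different starting partitions and the tightest clustering is kept.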

Example: K-Means Clustering

Model Learning – Training vs. Testing • Training – model development • Testing – performance evaluation • Concerns • Testing data must be distinct from training data • Make efficient use of limited data
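
One common way to satisfy both concerns is k-fold cross-validation, sketched below. The slides do not name this technique, so treating it as the intended approach is an assumption; `k_fold_splits` and the sample counts are hypothetical.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    folds = [indices[i::k] for i in range(k)]  # k interleaved, disjoint folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_splits(n_samples=10, k=5))
```

In each round the test fold is disjoint from the training data, and over all rounds every sample is used for testing once, so no data is wasted. Real data would normally be shuffled before splitting.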

Approaches

Validation – True/False Positive and Negative

Validation – Sensitivity vs. Specificity • Sensitivity and specificity • True positive fraction (TPF) = sensitivity: the fraction of abnormal cases correctly reported as abnormal • False positive fraction (FPF) = 1 − specificity: the fraction of normal cases falsely reported as abnormal
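
These fractions follow directly from the four counts of a confusion matrix. A minimal sketch, with made-up counts and a hypothetical helper name:

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from true/false positive/negative counts."""
    sensitivity = tp / (tp + fn)  # abnormal cases correctly reported as abnormal
    specificity = tn / (tn + fp)  # normal cases correctly reported as normal
    return sensitivity, specificity


# Hypothetical study: 50 abnormal cases and 100 normal cases.
tp, fn, tn, fp = 45, 5, 80, 20
sensitivity, specificity = sensitivity_specificity(tp, fn, tn, fp)
tpf = sensitivity      # true positive fraction
fpf = 1 - specificity  # false positive fraction
```

With these counts, 45 of the 50 abnormal cases are detected and 20 of the 100 normal cases are falsely flagged.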

Validation – ROC • Receiver operating characteristic (ROC) curve • Observers are asked to rate images using a multi-point scale of certainty (e.g., from definitely normal to definitely abnormal)
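
An ROC curve can be traced from such ratings by sweeping a decision threshold over the scale and recording one (FPF, TPF) pair per threshold. The sketch below assumes a 5-point scale where 5 means "definitely abnormal"; the ratings, the ground-truth labels, and the function name are made up for illustration.

```python
def roc_points(ratings, truths):
    """Return (FPF, TPF) pairs obtained by calling 'abnormal' at each rating threshold."""
    positives = sum(truths)               # truly abnormal cases
    negatives = len(truths) - positives   # truly normal cases
    points = [(0.0, 0.0)]                 # strictest threshold: nothing called abnormal
    for threshold in sorted(set(ratings), reverse=True):
        tp = sum(1 for r, t in zip(ratings, truths) if r >= threshold and t)
        fp = sum(1 for r, t in zip(ratings, truths) if r >= threshold and not t)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical observer ratings for 8 images and their ground truth.
ratings = [5, 4, 4, 3, 2, 1, 2, 1]
truths = [True, True, False, True, False, False, True, False]
curve = roc_points(ratings, truths)
```

Lowering the threshold moves along the curve from (0, 0) toward (1, 1): more abnormal cases are caught, at the cost of more normal cases being flagged.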
