imArray - An Automated High-Performance Microarray Scanner Software for Microarray Image Analysis, Data Management and Knowledge Mining
Wei-Bang Chen and Chengcui Zhang
Department of Computer and Information Sciences
University of Alabama at Birmingham
Microarray Introduction
• Microarrays allow biologists to monitor gene expression levels in parallel.
• Tumor tissue and normal tissue samples are labeled with different fluorescent dyes (Cy3 / Cy5), then mixed and poured onto the slide.
• Hybridization: the samples compete for the genes on the slide. If a gene in the sample is complementary to a gene on the slide, they bind together.
• Wash
[Figure: microarray slide]
Microarray Scanner
[Figure: a microarray slide is scanned at 532 nm / 635 nm to produce the microarray slide images]
Microarray slide layout
[Figure: slide divided into blocks numbered 1 to 48; figure labels: "This is a block", "30 × 30 spots"]
Gene expression level
• The red / green intensity ratio of a spot represents the gene expression level.
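The ratio can be illustrated with a minimal sketch. The slides only state that the red / green intensity ratio represents the expression level; the background subtraction, the `floor` guard against division by zero, the log2 transform, and the function name `log2_ratio` are all assumptions added here for illustration.

```python
import math


def log2_ratio(red, green, red_bg=0.0, green_bg=0.0, floor=1.0):
    """Return log2 of the background-corrected red/green intensity ratio.

    red, green       -- mean foreground intensities of one spot
    red_bg, green_bg -- local background estimates (assumed, not from the slides)
    floor            -- lower bound that keeps both channels positive
    """
    r = max(red - red_bg, floor)
    g = max(green - green_bg, floor)
    return math.log2(r / g)


if __name__ == "__main__":
    # Hypothetical intensities: red is four times brighter than green.
    print(log2_ratio(4000, 1000))
```

A positive value means the red channel dominates for that spot, a negative value means the green channel dominates, and zero means equal intensity.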
The challenges of microarray image analysis
• Spot addressing problems
  • Tilted slide
  • Block detection
  • Gridline detection
• Segmentation problems
  • Uneven background
  • Inner holes (donut, comet, or overlapping spots)
  • Scratches
  • Noise
• Data management problems
  • Abundant information from unstructured documents
Solutions
• imArray - Microarray Image Analysis system
• Fully automatic image analysis
  • Orientation
  • Gridding [1]
  • Segmentation [1]
• Robust and efficient data management
  • Unstructured Information Management Architecture (UIMA)
[1] W.-B. Chen, C. Zhang, and W.-L. Liu, "An Automated Gridding and Segmentation Method for cDNA Microarray Image Analysis," in Proc. of the 19th IEEE International Symposium on Computer-Based Medical Systems, pp. 893-898, 2006.
UIMA Introduction
[Figure: unstructured documents are turned into structured information]
• Developed at IBM
• Component-based framework
  • Analysis Engine (AE)
    • Primitive AE & Aggregate AE
  • Annotator
  • Component Descriptor
  • Common Analysis Structure (CAS)
System overview
• Slide Information Module
• Slide Blocking Module
• Slide Gridding Module
• Slide Segmentation Module
Slide Information Module
• Goal: retrieve information, such as probe set specifications, from documents
• Implementation
  • Primitive Analysis Engine
  • Analyzes, parses, and retrieves information from XML documents
  • Collaborates with an agent-based automatic information retrieval module to update the retrieved contents from online databases
Fully automatic spot addressing
[Figure: slide with blocks numbered 1 to 48, showing vertical and horizontal block boundaries and vertical and horizontal gridlines]
Slide Blocking Module
• Signal/Noise detector: distinguishes signal (foreground pixels) from noise (background pixels) by adopting a global-local thresholding technique [1]
• Tilt detector: identifies and corrects a tilted slide by first determining the tilt angle and then rotating the entire slide with an affine transformation
• Block boundary detector: discovers the repeated block patterns in a slide by detecting the gaps between blocks, and generates the horizontal and vertical block boundaries [1]
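The gap-based idea behind the block boundary detector can be sketched as follows. This is not the authors' algorithm from [1]; it is a simplified illustration that assumes a binary foreground mask (rows of 0/1 values) is already available, that gaps between blocks are completely empty, and that they are wider than the gaps between spots. The names `zero_runs`, `block_boundaries`, and `min_gap` are hypothetical.

```python
def zero_runs(profile):
    """Return (start, end) pairs for every run of zeros; end is exclusive."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value == 0 and start is None:
            start = i
        elif value != 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def block_boundaries(mask, min_gap):
    """Place one boundary in the middle of each wide, interior empty gap.

    mask    -- list of rows, each a list of 0 (background) or 1 (foreground)
    min_gap -- minimum gap width in pixels that counts as a block gap
    Returns (horizontal_boundaries, vertical_boundaries) as row and
    column indices.
    """
    row_profile = [sum(row) for row in mask]        # foreground count per row
    col_profile = [sum(col) for col in zip(*mask)]  # foreground count per column

    def cuts(profile):
        n = len(profile)
        return [(s + e) // 2
                for s, e in zero_runs(profile)
                if e - s >= min_gap and s > 0 and e < n]  # skip slide margins

    return cuts(row_profile), cuts(col_profile)
```

Real slides are noisy, so a practical version would likely threshold the projection profiles rather than require exact zeros.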
Tilt detector
• Goal: identify and correct a tilted slide
• Implementation
  • Primitive analysis engine
  • Detects the tilt angle with Principal Component Analysis (PCA)
  • Corrects the tilted slide with an affine transformation
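A minimal sketch of PCA-based tilt estimation, under stated assumptions: the foreground pixels are given as (x, y) coordinate pairs, the spot array is longer along the x axis so its principal axis is nominally horizontal, and the correction is a pure rotation about a chosen center. The function names are hypothetical, and the slides do not say how imArray implements either step. For 2-D data the first principal axis has a closed form, so no eigen-solver is needed.

```python
import math


def principal_angle(points):
    """Angle (radians) between the first principal axis and the x axis.

    Uses the closed form for a 2x2 covariance matrix:
    theta = 0.5 * atan2(2 * sxy, sxx - syy).
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)


def rotate(points, angle, center):
    """Rotate points by `angle` radians about `center` (an affine map)."""
    c, s = math.cos(angle), math.sin(angle)
    cx, cy = center
    return [(cx + c * (x - cx) - s * (y - cy),
             cy + s * (x - cx) + c * (y - cy))
            for x, y in points]


if __name__ == "__main__":
    # Hypothetical 10 x 3 grid of spot centers, tilted by 5 degrees.
    grid = [(float(x), float(y)) for x in range(10) for y in range(3)]
    tilted = rotate(grid, math.radians(5.0), (0.0, 0.0))
    angle = principal_angle(tilted)
    corrected = rotate(tilted, -angle, (0.0, 0.0))
    print(math.degrees(angle))
```

Rotating by the negative of the estimated angle aligns the principal axis with the x axis. On a real image the same rotation would be applied to the pixel grid with resampling, which this sketch leaves out.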
Slide Gridding Module
• Goal: generate a grid within each block to separate the spots, so that each cell in the grid contains only one spot [1]
• Implementation
  • Aggregate Analysis Engine composed of two primitive analysis engines:
    • Bounding box generation
    • Gridline detection
Spot addressing results
• Detecting blocks [1]
  • Recall value: 100%
  • Precision value: 100%
• Gridding [1]
  • Recall value: 99.97%
  • Precision value: 100%
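For reference, precision and recall can be computed as in the sketch below. How a detected gridline is matched to a ground-truth gridline in [1] is not described in these slides, so exact set membership is used here as a stand-in, and the example data are invented for illustration only.

```python
def precision_recall(detected, truth):
    """Return (precision, recall) for detected items against ground truth.

    precision = correct detections / all detections
    recall    = correct detections / all ground-truth items
    """
    detected, truth = set(detected), set(truth)
    true_positives = len(detected & truth)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical gridline positions: one true line is missed and
    # nothing spurious is reported.
    print(precision_recall({10, 20, 30}, {10, 20, 30, 40}))
```

A precision of 100% with a recall slightly below 100%, as reported for gridding, means every reported item was correct while a small fraction of true items went undetected.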
Slide Segmentation Module
• Goal: refine the class labels within each grid region
• Implementation
  • Primitive analysis engine
  • Determines the local threshold with Otsu's method [2]
    • Minimizes the intra-class variance
    • Equivalently, maximizes the between-class variance
[2] V. R. Iyer, et al., "The transcriptional program in the response of human fibroblasts to serum," Science, v283, pp. 83-7, 1999.
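Otsu's method can be sketched in a few lines. This is a generic textbook version, not imArray's code: it assumes the pixels of one grid cell are integer gray levels in the range 0 to `levels - 1`, and it searches for the threshold that maximizes the between-class variance, which is equivalent to minimizing the intra-class variance because the two always sum to the total variance.

```python
def otsu_threshold(pixels, levels=256):
    """Return threshold t; pixels <= t are background, pixels > t foreground."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    w0 = 0          # number of pixels in the background class
    sum0 = 0        # sum of gray levels in the background class
    best_t, best_score = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue    # background class still empty
        if w1 == 0:
            break       # foreground class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance up to a constant factor 1 / total**2.
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


if __name__ == "__main__":
    # Hypothetical grid cell: dark background plus a bright spot.
    cell = [10] * 50 + [12] * 50 + [200] * 20 + [210] * 20
    t = otsu_threshold(cell)
    print(t, sum(1 for p in cell if p > t))
```

Applying the function to each grid cell separately gives a local threshold per spot, which is one way to cope with an uneven background across the slide.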
Spot segmentation results
• Segmentation results for some sample spots [1]
• In each row, from left to right:
  • Original spot
  • Pre-labeled spot with the segment boundary
  • Spot segmentation results
  • GenePix
Spot segmentation results
[Figure: segmentation results, from [1]]
Conclusions
• Our proposed imArray system is fully automatic
  • Handles uneven background and severe noise
  • Detects a tilted slide and corrects its orientation
  • Detects block boundaries and generates grids
• The spot segmentation method is simple and effective
• The method is highly parallelizable
• Annotations are updated automatically