The following slides have been adapted from http://www.tm4.org/ to be presented at the Follow-up course on Microarray Data Analysis (Nov 20-24 2006, PICB Shanghai) by Peter Serocka. TIGR. THE INSTITUTE FOR GENOMIC RESEARCH. TIGR Spotfinder: a tool for microarray image processing.
TIGR (The Institute for Genomic Research). TIGR Spotfinder: a tool for microarray image processing. Developer: Vasily Sharov, The Institute for Genomic Research
Microarray Data Flow
• Printer
• Scanner
• Image File
• Image Analysis
• Raw Gene Expression Data
• Gene Annotation
• Normalization / Filtering
• Normalized Data with Gene Annotation
• Expression Analysis
• Interpretation of Analysis Results
Microarray Data Flow (with file types)
• Printer
• Scanner
• Image File: .tif
• Image Analysis
• Raw Gene Expression Data: .mev (.gpr)
• Gene Annotation: .ann (.gal)
• Normalization / Filtering
• Normalized Data with Gene Annotation: .mev (.gpr, .txt)
• Expression Analysis
• Interpretation of Analysis Results
Process Overview
• Sample1 mRNA → RT → Cy3-cDNA
• Sample2 mRNA → RT → Cy5-cDNA
• cDNA array → Cy3 intensity and Cy5 intensity
Basic Steps from Image to File
1.) Image File Loading
2.) Construct or Apply an Overlay Grid
3.) Computations
• Find Spot Boundary and Area
• Intensity Calculation
• Background Calculation and Correction
4.) Quality Control
5.) Text File Output
Basic Demonstration: Exploring the Interface (Using an Existing Grid File)
Microarray Image Parameters
• The microarray scanner generates two 16-bit grayscale TIFF images: one image for each labeling probe (Cy3 and Cy5).
• The 16-bit scheme provides a signal dynamic range of 2^16 = 65536 levels (pixel values 0 to 65535).
• Each image is 20 to 30 MB at a scanning resolution of 10 µm/pixel.
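The image sizes quoted above follow from simple arithmetic, sketched here. The scanned area of 22 mm x 60 mm is a hypothetical example, not a value from the slides; the sketch also assumes 1 MB = 10^6 bytes and ignores the TIFF header.

```python
MAX_PIXEL_VALUE = 2**16 - 1  # largest value a 16-bit pixel can hold (65535)


def scan_image_size_mb(width_mm, height_mm, um_per_pixel=10, bits_per_pixel=16):
    """Approximate uncompressed size of one scanned channel, in MB (10^6 bytes)."""
    width_px = round(width_mm * 1000 / um_per_pixel)
    height_px = round(height_mm * 1000 / um_per_pixel)
    n_bytes = width_px * height_px * bits_per_pixel // 8
    return n_bytes / 1e6


# Hypothetical scanned area of 22 mm x 60 mm at 10 µm/pixel:
print(scan_image_size_mb(22, 60))
```

Under these assumptions the result falls inside the 20 to 30 MB range given on the slide.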
Typical layout of microarray image. Image size 22 MB; image size 28 MB (images scanned at 10 µm/pixel resolution)
Processing Overview
• Apply the Grid
• Determine Spot Boundary
• Calculate Spot Intensity
• Determine Background and Correct Intensity
Applying an Overlay Grid • What does it accomplish? • The grid cells set a boundary for the spot finding algorithms. • The grid cells also define an area for background correction.
Gridding Dimension Parameters (figure labels: pin X, pin Y)
Spot Spacing Parameter (figure label: spot spacing)
Spot Finding
• Spot finding requires an estimated spot size.
• The spot can be drawn as an irregular contour, as an ellipse, or as unconnected pixels.
• The area inside the contour is used for the spot intensity calculation.
• The area outside the contour is used for the local background calculation.
Processing Overview
• Apply the Grid
• Determine Spot Boundary
• Calculate Spot Intensity
• Determine Background and Correct Intensity
Background Calculation. Background intensity is calculated as the median pixel intensity from the area within the square and outside the spot. A separate local background is calculated for each spot using the non-spot pixels from its square. (Figure label: local background area)
Spot Definition and Calculations
• Spot Area, A = number of pixels within the defined spot boundary
• BKG = median pixel value within the cell (excluding the spot pixels)
• Integral = sum of all spot pixels, excluding saturated pixels
• Reported “Intensity” = Integral - BKG*A
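The definitions above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not SpotFinder's actual code: the function name is hypothetical, a pixel is treated as saturated when it reaches 65535, and A counts only the non-saturated spot pixels so that it matches the pixels summed into Integral.

```python
from statistics import median

SATURATION_LEVEL = 65535  # assumed: maximum value of a 16-bit pixel


def spot_intensity(spot_pixels, background_pixels, saturation_level=SATURATION_LEVEL):
    """Return (A, BKG, Integral, Intensity) for one grid cell.

    spot_pixels: pixel values inside the spot boundary
    background_pixels: pixel values in the cell but outside the spot
    """
    good = [p for p in spot_pixels if p < saturation_level]  # drop saturated pixels
    area = len(good)                  # A (assumption: saturated pixels not counted)
    bkg = median(background_pixels)   # BKG: median of the non-spot pixels
    integral = sum(good)              # Integral: sum of non-saturated spot pixels
    intensity = integral - bkg * area
    return area, bkg, integral, intensity


# Toy values for illustration only
spot = [1200, 1500, 65535, 1300]
background = [100, 120, 110, 90, 105]
print(spot_intensity(spot, background))
```

Subtracting BKG*A rather than BKG alone removes the background contribution from every pixel that was summed into Integral.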
Quality Control Issues • Two measures of spot quality are reported by SpotFinder: • Saturation Factor • QC Score: reports shape and signal to noise ratio
Saturation Examples
• Partially saturated spots (image showing a saturated area and a non-saturated area)
• Completely saturated spots (image showing a fully saturated spot)
Saturation, Pixel Value Limit (plot: output pixel value vs. input fluorescence dye light signal; the output is capped at the 16-bit limit, 2^16 = 65536 levels)
Saturation Factor
Saturation Factor = (# good pixels in spot) / (total number of spot pixels)
- Partially saturated spots can be handled in SpotFinder by excluding the saturated pixels from spot area and intensity calculations.
- Fully saturated spots cannot be recovered in SpotFinder. In this case, rescanning with lower excitation power or PMT gain could be considered.*
*Faint spots may possibly be lost.
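A minimal sketch of the saturation factor as defined above. The function name is hypothetical, and "good" is assumed to mean a pixel value below the 16-bit maximum of 65535.

```python
def saturation_factor(spot_pixels, saturation_level=65535):
    """Fraction of spot pixels that are not saturated (1.0 = no saturation)."""
    total = len(spot_pixels)
    if total == 0:
        return 0.0  # assumption: an empty spot is reported as 0.0
    good = sum(1 for p in spot_pixels if p < saturation_level)
    return good / total


print(saturation_factor([1200, 1500, 65535, 1300]))  # partially saturated spot
print(saturation_factor([65535, 65535, 65535]))      # fully saturated spot
```

A fully saturated spot gives 0.0, which corresponds to the case the slide describes as not recoverable.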
Saturation, RI Plot. The RI plot, log(IB/IA) vs 1/2 log(IA*IB), clearly displays the saturation limits.
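The RI plot coordinates can be computed per spot as sketched below. The slide does not state the base of the logarithm; base 2 is an assumption here, and the function name is hypothetical.

```python
import math


def ri_coordinates(ia, ib):
    """Return (intensity, ratio) = (1/2 * log2(IA*IB), log2(IB/IA)) for one spot."""
    if ia <= 0 or ib <= 0:
        raise ValueError("both channel intensities must be positive")
    ratio = math.log2(ib / ia)             # y axis: log ratio
    intensity = 0.5 * math.log2(ia * ib)   # x axis: log of the geometric mean
    return intensity, ratio


# Toy values for illustration only: IB is four times IA
print(ri_coordinates(1024, 4096))
```

Spots with zero or negative background-corrected intensity have no logarithm, so they must be filtered out or handled before plotting.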
Quality Control, QC Score. A QC Score is generated for each spot and is based on the spot shape and a measure of signal to noise ratio. (Diagram: shape and signal/noise feed QCA and QCB, which feed the QC Score.)
Spot Shape Parameter. Shape Factor = (Spot Area/Perimeter). Spots with large perimeters relative to spot area will have a low shape factor.
Signal to Noise Ratio. S/N factor = fraction of spot pixels exceeding: *med(BKG) + *SD(BKG). (Figure: pixel value axis from 0 to 2^16, with med(BKG) and SD(BKG) marked.)
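The S/N factor can be sketched as follows. The two multipliers in the threshold are not legible on the slide, so they appear here as required parameters with the hypothetical names k_med and k_sd; the use of the population standard deviation is also an assumption.

```python
from statistics import median, pstdev


def sn_factor(spot_pixels, background_pixels, k_med, k_sd):
    """Fraction of spot pixels above k_med*med(BKG) + k_sd*SD(BKG).

    k_med and k_sd are placeholders for the multipliers missing from the slide.
    """
    threshold = k_med * median(background_pixels) + k_sd * pstdev(background_pixels)
    above = sum(1 for p in spot_pixels if p > threshold)
    return above / len(spot_pixels)


# Toy values for illustration only; the multipliers 1 and 2 are arbitrary
spot = [200, 150, 400, 90]
background = [100, 110, 90, 100]
print(sn_factor(spot, background, k_med=1, k_sd=2))
```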
Quality Control Calculation
• QC Score = (QCA + QCB)/2
• QCA = sqrt(QC shape * QC S/N) for channel A
• QCB = sqrt(QC shape * QC S/N) for channel B
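The QC formulas above combine as sketched below. The function names are hypothetical, and the shape and S/N factors are assumed to be already scaled to the range 0 to 1.

```python
import math


def channel_qc(qc_shape, qc_sn):
    """Per-channel score: geometric mean of the shape and S/N factors."""
    return math.sqrt(qc_shape * qc_sn)


def qc_score(qca, qcb):
    """Overall score: arithmetic mean of the two channel scores."""
    return (qca + qcb) / 2


# Toy values for illustration only
qca = channel_qc(0.64, 1.0)
qcb = channel_qc(0.25, 1.0)
print(qca, qcb, qc_score(qca, qcb))
```

Because each channel score is a geometric mean, a spot with a very poor shape or a very poor signal to noise ratio scores low even if the other factor is high.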
Quality Control, RI Plot. The RI plot, log(IB/IA) vs 1/2 log(IA*IB), plotted for means, clearly shows low-intensity distortion due to background overestimation. Data from the earlier slide, processed without the QC filter.
Quality Control (data provided by E. Snesrud)
SpotFinder Flag Descriptions
A - Spot area is larger than 50 pixels
B - Spot area is between 30 and 50 pixels
C - Spot area is smaller than 30 pixels
X - Spot rejected by QC based on spot shape and spot intensity relative to surrounding background
U - Spot rejected (“flagged”) by user
Y - Bad spot, background is higher than spot intensity
Z - Spot was not detected by the program
S - Warning: some spot pixels are saturated
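The area-based flags A, B and C can be sketched as a simple classification. The function name is hypothetical, and treating exactly 30 and exactly 50 pixels as flag B is an assumption, since the slide does not say how the boundaries are handled.

```python
def area_flag(area_pixels):
    """Classify a detected spot by area, in pixels, into flag A, B or C."""
    if area_pixels > 50:
        return "A"
    if area_pixels >= 30:   # assumption: 30 and 50 both count as B
        return "B"
    return "C"


for area in (60, 40, 10):
    print(area, area_flag(area))
```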
Output data (.mev) per spot:
• UID: Unique identifier for this spot
• IA: Intensity value in channel A
• IB: Intensity value in channel B
• R: Row (slide row)
• C: Column (slide column)
• MR: Meta-row (block row)
• MC: Meta-column (block column)
• SR: Sub-row
• SC: Sub-column
Output data (.mev) per spot:
• FlagA: TIGR Spotfinder flag value in channel A
• FlagB: TIGR Spotfinder flag value in channel B
• SA: Actual spot area (in pixels)
• SF: Saturation factor
• QC: Cumulative quality control score
• QCA: Quality control score in channel A
• QCB: Quality control score in channel B
Output data (.mev) per spot:
• BkgA: Background value in channel A
• BkgB: Background value in channel B
• SDA: Standard deviation for spot pixels in channel A
• SDB: Standard deviation for spot pixels in channel B
• SDBkgA: Standard deviation of the background in channel A
• SDBkgB: Standard deviation of the background in channel B
Output data (.mev) per spot:
• MedA: Median intensity value in channel A
• MedB: Median intensity value in channel B
• MNA: Mean intensity value in channel A
• MNB: Mean intensity value in channel B
• X, Y: X and Y coordinates of the spot cell
• PValueA: P-value in channel A
• PValueB: P-value in channel B
• DBID: Database ID (if UID is substituted)
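As a sketch of how such per-spot output might be consumed, the example below reads a tab-delimited table whose header uses the column names listed above and keeps only spots that carry none of the rejection flags X, U, Y or Z. The file layout (tab-delimited, one header row, comment lines starting with "#") is an assumption, the function names are hypothetical, and the sample rows are made up for illustration.

```python
import csv
import io

REJECT_FLAGS = {"X", "U", "Y", "Z"}  # rejected, user-flagged, bad, or undetected spots


def read_spots(handle):
    """Yield one dict per spot from a tab-delimited table with a header row."""
    # Assumption: lines starting with '#' are comments and can be skipped.
    lines = (line for line in handle if not line.startswith("#"))
    yield from csv.DictReader(lines, delimiter="\t")


def usable(spot):
    """True if neither channel carries a rejection flag."""
    return spot["FlagA"] not in REJECT_FLAGS and spot["FlagB"] not in REJECT_FLAGS


# Made-up sample with a subset of the columns
sample = (
    "# illustrative data only\n"
    "UID\tIA\tIB\tFlagA\tFlagB\tQC\n"
    "1\t3685\t4100\tB\tB\t0.65\n"
    "2\t0\t120\tZ\tC\t0.10\n"
)
spots = [s for s in read_spots(io.StringIO(sample)) if usable(s)]
print([s["UID"] for s in spots])
```

All values arrive as strings from the csv module, so numeric columns such as IA, IB and QC need an explicit conversion before any arithmetic.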