How to Work With Affymetrix .Cel Files in geWorkbench. Fan Lin, Ph.D., Molecular Analysis Tools Knowledge Center, Columbia University and The Broad Institute of MIT and Harvard.
Introduction • The Affymetrix .CEL file contains the intensity values of the individual probes, calculated from the pixel values of a DAT file. • The CEL file is a binary file in which values are stored in little-endian format. • This tutorial shows how to use a graphical user interface, affylmGUI, to prepare Affymetrix .CEL files so that they can be read by geWorkbench.
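To make the little-endian point concrete, here is a minimal Python sketch that decodes the first few 32-bit integers of a binary file in little-endian order. It is not part of the affylmGUI workflow. The function name `read_cel_header_ints` is hypothetical, and the layout of the CEL header (for example, which integers come first and what they mean) is an assumption to be checked against the Affymetrix file format documentation.

```python
import struct


def read_cel_header_ints(path, count=2):
    """Return the first `count` 32-bit integers of a file, decoded as little-endian.

    Hypothetical helper for illustration only; it does not parse a full CEL file.
    """
    with open(path, "rb") as handle:
        raw = handle.read(4 * count)
    if len(raw) < 4 * count:
        raise ValueError("file is too short to hold the requested integers")
    # "<" selects little-endian byte order; "i" is a signed 32-bit integer.
    return struct.unpack("<%di" % count, raw)
```

Reading the same bytes with ">" (big-endian) instead of "<" would give different numbers, which is why the byte order matters when a binary CEL file is read outside of dedicated tools.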
About the affylmGUI Package • affylmGUI is a graphical user interface designed for the analysis of Affymetrix microarray data. • affylmGUI is an R package. It requires R 1.9.0 or later. • R can be downloaded at http://www.r-project.org/ • affylmGUI can be downloaded at: http://bioinf.wehi.edu.au/affylmGUI/#download • Installation steps can be found at: http://bioinf.wehi.edu.au/affylmGUI/#windows • More detailed information on affylmGUI can be found at: http://bioinf.wehi.edu.au/affylmGUI/
Prerequisites for affylmGUI • The following packages are needed to run affylmGUI: • Default packages from Bioconductor installed using: • The tkrplot R package (on CRAN) • Additional packages might be required for using probe-level linear models or exporting HTML reports • Please follow this example to meet the requirements for running affylmGUI: http://bioinf.wehi.edu.au/affylmGUI/R/library/affylmGUI/doc/estrogen/estrogen.html
Starting affylmGUI affylmGUI can be started by opening RGui and typing: • > library(affylmGUI) • Click "Yes" in the message box to begin affylmGUI
Start a New Analysis in affylmGUI • From the File menu, click on "New" to begin a new analysis.
Choose Working Directory • First, specify a working directory for the analysis. This directory should contain: • The Targets file • The Affymetrix .CEL files • No array design files (CDF files) are needed. They will be downloaded from the Bioconductor site automatically.
What is the “RNA Targets” file? The file at the right is known as the “RNA Targets” file in affylmGUI. It describes the experimental conditions for each of the 12 arrays. The file should: • Be in tab-delimited text format. • Have 3 columns. The column headings must appear exactly as shown: • Name: the unique name for each chip • FileName: the Affymetrix .CEL file name for each chip • Target: used by affylmGUI to group the arrays into different classes (for downstream differential expression analysis). The file format requires this field to be populated. However, it is not used in the workflow described in the subsequent slides, so you can enter arbitrary labels.
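The Targets file can also be written by a short script rather than by hand. The following Python sketch is one possible way to do it, not part of affylmGUI. It assumes the three headings are spelled `Name`, `FileName`, and `Target`; check them against the example file before use. The helper names `write_targets` and `missing_cel_files` and any file names passed to them are hypothetical.

```python
import csv
import os

# Assumed exact column headings; verify against the affylmGUI example Targets file.
HEADINGS = ["Name", "FileName", "Target"]


def write_targets(path, rows):
    """Write a tab-delimited Targets file from (name, cel_filename, target) tuples."""
    rows = [tuple(row) for row in rows]
    if any(len(row) != 3 for row in rows):
        raise ValueError("each row needs exactly three fields")
    names = [row[0] for row in rows]
    if len(set(names)) != len(names):
        raise ValueError("each chip needs a unique Name")
    if any(not str(field).strip() for row in rows for field in row):
        raise ValueError("every field, including Target, must be populated")
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(HEADINGS)
        writer.writerows(rows)


def missing_cel_files(targets_path):
    """List CEL files named in the Targets file but absent from its directory."""
    folder = os.path.dirname(os.path.abspath(targets_path))
    with open(targets_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return [row["FileName"] for row in reader
                if not os.path.isfile(os.path.join(folder, row["FileName"]))]
```

Calling `missing_cel_files` on the finished Targets file before starting affylmGUI gives an early check that the working directory holds every .CEL file the Targets file refers to.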
Select Targets File • Click on the “Select Targets file” button to choose the Targets file. • Select the tab-delimited Targets file created previously for the carcinoma cell data.
Set Dataset Name • After the Targets file is loaded, a window prompts you to enter a name for the data set. • The name becomes the default filename when you save your analysis or export an HTML report. • The name is shown at the top of the left status panel in the main affylmGUI window.
View RNA Targets File • After the RNA Targets file has been loaded into affylmGUI, it can be viewed from the "RNA Targets" menu.
Visualize .CEL File Quality • You can use the Plot options in affylmGUI to visualize the .CEL data for quality control. • Select “Plot” from the menu to see the options.
Image Array Plot The image plot is useful for identifying spatial artifacts such as scratches, smudges, or boundary effects. • From the Plot menu, select "Image Array Plot". • Select the first array. • If desired, you can customize the plot title and axis labels when prompted.
Data Normalization There are three normalization methods available: Robust Multiarray Averaging (RMA), GCRMA Background Adjustment, and Probe-Level Linear Models (PLM). • Select “Normalize” from the Normalization menu • Select the normalization method • The R console will show the status of the normalization
Export the Normalized Expression The normalized expression estimates can be exported to a tab-delimited text file: • Select “Export Normalized Expression Values” from the Normalization menu • Save the tab-delimited text file in a convenient location.
Visualize the Normalized Expression • The table of normalized expression values should then be imported into Excel. • IMPORTANT: For geWorkbench to recognize the file, it must be edited so that the first data field is labeled ID (see arrow). • Finally, the file must be saved as a tab-delimited .txt file. It is now ready to be loaded in geWorkbench.
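The ID-labeling step can also be done without Excel. The following Python sketch is one way to do it, under an assumption about the exported file that should be checked against your own export: the header row either starts with an empty field or has one field fewer than the data rows. The function name `label_id_column` and the file paths passed to it are hypothetical.

```python
import csv


def label_id_column(src_path, dst_path, label="ID"):
    """Copy a tab-delimited expression table, labeling the first header field."""
    with open(src_path, newline="") as handle:
        rows = list(csv.reader(handle, delimiter="\t"))
    if not rows:
        raise ValueError("input file is empty")
    header = rows[0]
    data_width = len(rows[1]) if len(rows) > 1 else len(header)
    if len(header) == data_width - 1:
        # The header has no field above the probe-set column: add one.
        header = [label] + header
    elif header and header[0].strip() == "":
        # The header starts with an empty field: fill it in.
        header = [label] + header[1:]
    # Otherwise the first field already has a label and is left unchanged.
    with open(dst_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows[1:])
```

The output is written to a separate path so the original export stays untouched in case the header assumption does not hold for your file.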
Open Normalized File with geWorkbench • Follow the usual steps in geWorkbench to open the data file: • Choose the Tab-Delimited file type • Select the file and open it
View Data in geWorkbench • geWorkbench offers many options for visualizing and analyzing genomic data. • For more information on how to use geWorkbench, please visit the geWorkbench Knowledge Base: https://cabig-kc.nci.nih.gov/Molecular/KC/index.php/GeWorkbench#geWorkbench_Knowledge_Base
Need More Information? NCI is developing an extensive knowledge base to support various NCI molecular analysis tools. Visit us at NCI’s Molecular Analysis Tools Knowledge Center at: https://cabig-kc.nci.nih.gov/MediaWiki/index.php/Main_Page. • For more information on how to use caArray, please visit the caArray section of the NCI Knowledge Center at: https://cabig-kc.nci.nih.gov/Molecular/KC/index.php/ccaArray. • Have a caArray-related question? Find answers in the caArray FAQ section at: https://cabig-kc.nci.nih.gov/Molecular/KC/index.php/CaArray#caArray_Knowledge_Base • Need more help? Post your question in the caArray Forum at: https://cabig-kc.nci.nih.gov/Molecular/forums/viewforum.php?f=6.