120 likes | 235 Views
Tex Thompson Spring 2005. A Short Overview of Microarrays. Raw Data. Microarray data at its most raw consists of a spotted image, and information on what each spot represents (spot intensities and metadata). Genes may be spotted in replicate
E N D
Tex Thompson Spring 2005 A Short Overview of Microarrays
Raw Data • Microarray data at its most raw consists of a spotted image, and information on what each spot represents (spot intensities and metadata). • Genes may be spotted in replicate • Affymetrix chips use a match/mismatch technology to guard against non-specific hybridization.
Normalizing Data • Normalization of microarray data is the process of removing array-specific bias in order to make results between arrays comparable. • Intensity data relevant to a single gene needs to be combined and normalized in order to define “expression levels” for each gene. • The basic idea is that the expression level is proportional to the number of mRNA transcripts of that gene within the tissue of interest.
RMA Normalization • Each array is assumed to have a common amount of “background noise.” • Normalization is performed by quantile normalization, such that the intensities across each chip are adjusted to produce identical distributions. • A statistician (or Google) could tell you much more about this.
Diagram of Microarray Analysis mRNA ?????? Normalized Data Raw Data
What Sorts of Questions Can We Ask? • What are the most highly/lowly expressed genes in a sample of interest? • What are the differentially expressed genes across two (or more) samples of interest? • What sets of genes are always upregulated or downregulated as a set? • What do you think?
Clustering • Clustering is the process of assembling N objects into K “clusters” based on a set of measured characteristics. • For example, a common clustering application is clustering individual samples into clusters based on their gene expression. • Alternatively, clustering can be used to group together individual genes who similar expression patterns.
Prediction • Prediction is the process of creating an algorithm for taking an unknown sample and putting it in a known classification scheme. • For example, a predictor might measure the gene expression levels of an unknown tissue sample and match it to the most probable classification. • This protocol is very common in studies of different types of cancer.
Algorithms Of Interest • Principal Component Analysis (PCA) • Self-Organizing Maps (SOM) • Support Vector Machines (SVM) • Linear Discriminant Analysis (LDA) • K-Means Clustering • KNN Classifiers • Differential Expression Statistics • Assumptions of RMA Normalization
Looking At The Data • Each array falls into one of four types: • Young • Middle-aged • Old, Mild Presbycusis • Old, Severe Presbycusis
Looking At The Data X13_Frisina_S2_M430A.CEL X1_b_Frisina_S2_M430A.CEL 1415670_at 10.0073897626035 10.4616952671666 1415671_at 12.1960225217605 13.1951229785856 1415672_at 13.9737085433580 13.7746451795089 1415673_at 9.62027371983307 10.9092694066664
Go To Work! I'll be available for questions via until 9:30am and via e-mail (tex@bioinformatics.rit.edu). These slides will be made available on the course website.