A Short Overview of Microarrays

Tex Thompson, Spring 2005

Raw Data
• Microarray data at its most raw consists of a spotted image plus information on what each spot represents (spot intensities and metadata).
• Genes may be spotted in replicate.
• Affymetrix chips use a perfect-match/mismatch probe technology to guard against non-specific hybridization.

Normalizing Data
• Normalization of microarray data is the process of removing array-specific bias in order to make results between arrays comparable.
• Intensity data relevant to a single gene needs to be combined and normalized in order to define “expression levels” for each gene.
• The basic idea is that the expression level is proportional to the number of mRNA transcripts of that gene within the tissue of interest.

RMA Normalization
• Each array is assumed to have a common amount of “background noise.”
• Normalization is performed by quantile normalization, such that the intensities across each chip are adjusted to produce identical distributions.
• A statistician (or Google) could tell you much more about this.
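
The quantile normalization step can be illustrated with a small sketch. This is not the full RMA procedure and not necessarily how any particular package implements it; it only shows the idea of forcing every array to share one intensity distribution. The function name `quantile_normalize` and the toy matrix `raw` are made up for illustration, and ties are broken by position rather than averaged.

```python
import numpy as np

def quantile_normalize(intensities):
    """Quantile-normalize a (probes x arrays) matrix so all columns share one distribution."""
    x = np.asarray(intensities, dtype=float)
    # Row indices that would sort each array (column) from lowest to highest intensity.
    order = np.argsort(x, axis=0, kind="stable")
    # Reference distribution: the mean of the k-th smallest value across all arrays.
    reference = np.sort(x, axis=0).mean(axis=1)
    normalized = np.empty_like(x)
    for j in range(x.shape[1]):
        # The k-th smallest probe on array j receives the k-th reference value.
        normalized[order[:, j], j] = reference
    return normalized

# Hypothetical toy data: 4 probes (rows) measured on 3 arrays (columns).
raw = np.array([[5.0, 4.0, 3.0],
                [2.0, 1.0, 4.0],
                [3.0, 4.0, 6.0],
                [4.0, 2.0, 8.0]])
normalized = quantile_normalize(raw)
print(normalized)
```

After this step each column contains the same set of values, and only the ranking of probes within an array still differs from array to array.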

Diagram of Microarray Analysis
[Diagram with the labels: mRNA, ??????, Raw Data, Normalized Data]

What Sorts of Questions Can We Ask?
• What are the most highly/lowly expressed genes in a sample of interest?
• What are the differentially expressed genes across two (or more) samples of interest?
• What sets of genes are always upregulated or downregulated as a set?
• What do you think?

Clustering
• Clustering is the process of assembling N objects into K “clusters” based on a set of measured characteristics.
• For example, a common clustering application is grouping individual samples into clusters based on their gene expression.
• Alternatively, clustering can be used to group together individual genes that have similar expression patterns.
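
As a rough illustration of assembling N samples into K clusters, here is a minimal K-means sketch (Lloyd's algorithm) in NumPy. The function `k_means` and the expression values are hypothetical, chosen only so that two groups are easy to see; a real analysis would use many more genes and a tested library implementation.

```python
import numpy as np

def k_means(points, k, n_iter=100, seed=0):
    """Cluster the rows of `points` into k groups with plain Lloyd iterations."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct rows chosen at random as the initial centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Euclidean distance from every point to every centroid.
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Move each centroid to the mean of its members (keep it if it has none).
        new_centroids = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else centroids[c]
            for c in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical data: 4 samples (rows) described by 3 gene expression levels (columns).
samples = np.array([[10.0, 10.0, 5.0],
                    [10.2,  9.8, 5.1],
                    [ 6.0, 13.0, 9.0],
                    [ 6.1, 13.2, 8.9]])
labels, centroids = k_means(samples, k=2)
print(labels)
```

Transposing the matrix (genes as rows) turns the same routine into the second use case, grouping genes with similar expression patterns.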

Prediction
• Prediction is the process of creating an algorithm for taking an unknown sample and putting it in a known classification scheme.
• For example, a predictor might measure the gene expression levels of an unknown tissue sample and match it to the most probable classification.
• This protocol is very common in studies of different types of cancer.
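
One simple predictor of this kind is a k-nearest-neighbour (KNN) classifier. The sketch below assumes a handful of already-classified samples; the function `knn_predict`, the expression values, and the two class labels are invented for illustration only.

```python
from collections import Counter

import numpy as np

def knn_predict(train_x, train_labels, sample, k=3):
    """Classify `sample` by majority vote among its k nearest training samples."""
    train_x = np.asarray(train_x, dtype=float)
    # Euclidean distance from the unknown sample to every known sample.
    distances = np.linalg.norm(train_x - np.asarray(sample, dtype=float), axis=1)
    nearest = np.argsort(distances)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training set: expression of 2 genes in samples of known class.
train_x = [[10.0, 5.0], [10.3, 5.2], [9.8, 4.9],
           [6.0, 9.0], [6.2, 9.1], [5.9, 8.8]]
train_labels = ["young", "young", "young", "old", "old", "old"]

unknown = [6.1, 8.7]
prediction = knn_predict(train_x, train_labels, unknown, k=3)
print(prediction)
```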

Algorithms Of Interest
• Principal Component Analysis (PCA)
• Self-Organizing Maps (SOM)
• Support Vector Machines (SVM)
• Linear Discriminant Analysis (LDA)
• K-Means Clustering
• KNN Classifiers
• Differential Expression Statistics
• Assumptions of RMA Normalization

Looking At The Data
• Each array falls into one of four types:
  • Young
  • Middle-aged
  • Old, Mild Presbycusis
  • Old, Severe Presbycusis

Looking At The Data
Probe set     X13_Frisina_S2_M430A.CEL   X1_b_Frisina_S2_M430A.CEL
1415670_at    10.0073897626035           10.4616952671666
1415671_at    12.1960225217605           13.1951229785856
1415672_at    13.9737085433580           13.7746451795089
1415673_at    9.62027371983307           10.9092694066664
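
A first look at values like these might go as follows. The sketch assumes the numbers are log2-scale expression levels (which is what RMA typically produces, but the slides do not say so), so a difference between two arrays is read as a log2 fold change. The `probe_set` column name and the variable names are added for illustration.

```python
import numpy as np

# The four rows shown above, with an assumed header name for the first column.
table = """\
probe_set X13_Frisina_S2_M430A.CEL X1_b_Frisina_S2_M430A.CEL
1415670_at 10.0073897626035 10.4616952671666
1415671_at 12.1960225217605 13.1951229785856
1415672_at 13.9737085433580 13.7746451795089
1415673_at 9.62027371983307 10.9092694066664
"""

rows = [line.split() for line in table.strip().splitlines()]
arrays = rows[0][1:]
probes = [row[0] for row in rows[1:]]
values = np.array([[float(v) for v in row[1:]] for row in rows[1:]])

# Most highly expressed probe set on the first array.
most_expressed = probes[int(values[:, 0].argmax())]

# Assuming log2-scale data, a difference of values is a log2 fold change.
log2_fold_change = values[:, 1] - values[:, 0]
largest_change = probes[int(np.abs(log2_fold_change).argmax())]

print(most_expressed, largest_change)
```

With only one array per column there is no replication here, so this ranks differences but says nothing about statistical significance.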

Go To Work!
I'll be available for questions until 9:30am and via e-mail (tex@bioinformatics.rit.edu). These slides will be made available on the course website.
