
Introduction to Gene Chips and Microarray Expression Data

Introduction to Gene Chips and Microarray Expression Data. Dr. Travis Doom, Assistant Professor BIRG Lab Department of Computer Science and Engineering Wright State University. Outline. DNA Microarrays Fabrication Application Microarray Data Analysis Techniques New Technology &


Presentation Transcript


  1. Introduction to Gene Chips and Microarray Expression Data Dr. Travis Doom, Assistant Professor BIRG Lab Department of Computer Science and Engineering Wright State University

  2. Outline • DNA Microarrays • Fabrication • Application • Microarray Data • Analysis Techniques • New Technology & Open Commentary

  3. Fabrication • Fabrication via Printing • DNA sequence stuck to glass substrate • DNA solution pre-synthesized in the lab • Fabrication In Situ • Sequence “built” • Photolithographic techniques use light to release capping chemicals • 365 nm light allows 20-µm resolution

  4. GeneChip DNA Microarrays • Each probe consists of thousands of strands of identical oligonucleotides • The DNA sequences at each probe represent important genes (or parts of genes) • Printing Systems • Ex: HP, Corning Inc. • Printing systems can build lengths of DNA up to 60 nucleotides long • 1.28 x 1.28+ cm glass wafer • Each “print head” has a ~100 µm diameter, and heads are separated by ~100 µm (5,000 – 20,000 probes) • Photolithographic Chips • Ex: Affymetrix • 1.28 x 1.28 cm glass/silicon wafer • 24 x 24 µm probe site (500,000 probes) • Lengths of DNA up to 25 nucleotides long • Requires a new set of masks for each new array type

  5. Practical Application of DNA Microarrays • DNA Microarrays are used to study gene activity (expression) • What proteins are being actively produced by a group of cells? • “Which genes are being expressed?” • How? • When a cell is making a protein, it transcribes the genes (made of DNA) which code for the protein into RNA used in its production • The RNA present in a cell can be extracted • If a gene has been expressed in a cell • RNA will bind to “a copy of itself” on the array • RNA with no complementary site will wash off the array • The RNA can be “tagged” with a fluorescent dye to determine its presence • DNA microarrays provide a high-throughput technique for quantifying the presence of specific RNA sequences

  6. The Process (In-vitro Transcription) • [Figure: cells → poly-A RNA → cDNA → IVT with 10% biotin-labeled uracil → antisense cRNA → fragment (heat, Mg2+) → labeled fragments → hybridize → wash/stain → scan]

  7. Hybridization and Staining • [Figure: biotin-labeled cRNA + GeneChip → hybridized array; hybridized array + SAPE (streptavidin-phycoerythrin) stain]

  8. The Result • A light source scans the array, causing the dyes to fluoresce • The glow is picked up by a sensor and is used to determine the relative abundance of the RNA • This information must be processed to determine the level of activity for each expressed gene

  9. The Goals • Basic Understanding • Arrays can take a snapshot of which subset of genes in a cell is actively making proteins • Heat shock experiments • Medical diagnosis • Microarrays can indicate where mutations lie that might be linked to a disease. Still others are used to determine if a person’s genetic profile would make him or her more or less susceptible to drug side effects • 1999 – A GeneChip containing 6800 human genes was used to distinguish between myeloid leukemia and lymphoblastic leukemia using a set of 50 genes that have different activity levels • Drug design • Pharmaceutical firms are in a rush to translate the human genome results into new products • Potential profits are huge • First, though, they must figure out what the genes do, how they interact, and how they relate to diseases. • Evaluation, Specificity, Response

  10. The Gains • A decade of rapid advances in biology has swept an avalanche of genetic information into scientists’ laps. • Mass analysis of the vast set of biologic data is impractical without high-throughput techniques • DNA microarrays (aka gene chips, biochips) allow researchers to look for the presence, productivity, or sequence of thousands of genes simultaneously • Advantages: • Speed • Feasibility • Sensitivity • Reproducibility

  11. Outline • DNA Microarrays • Fabrication • Application • Microarray Data • Analysis Techniques • New Technology & Open Commentary

  12. Microarray Data • First, the Problems: • The fabrication process is not error free • Probes have a maximum length of 25-60 nucleotides • Biologic processes such as hybridization are stochastic • Background light may skew the fluorescence • How do we decide if/how strongly a particular gene is being expressed? • Solutions to these problems are still in their infancy

  13. Features • Problem #1: The fabrication process is not error free • Solution: Each probe does not represent a unique DNA sequence. • Probe set: A set of probes each containing the same DNA sequence (the Feature) • Remove outermost rows and columns to avoid fabrication-based error

  14. Feature Value • Remove outermost rows and columns • Find 75th percentile of remaining values • This value is taken as representative of this feature
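
The feature-value steps on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `feature_value`, the representation of a feature as a 2-D list of pixel intensities, and the choice of the "inclusive" percentile interpolation are assumptions, not details given in the presentation.

```python
import statistics

def feature_value(pixels):
    """Return a representative intensity for one feature.

    `pixels` is a 2-D list of pixel intensities (hypothetical layout).
    """
    # Drop the outermost rows and columns to avoid edge artifacts.
    inner = [row[1:-1] for row in pixels[1:-1]]
    values = [v for row in inner for v in row]
    if not values:
        raise ValueError("feature must be at least 3 x 3 pixels")
    if len(values) == 1:
        return float(values[0])
    # Third quartile = 75th percentile (interpolation method is an assumption).
    return statistics.quantiles(values, n=4, method="inclusive")[2]

grid = [
    [100, 100, 100, 100, 100],
    [100,   1,   2,   3, 100],
    [100,   4,   5,   6, 100],
    [100,   7,   8,   9, 100],
    [100, 100, 100, 100, 100],
]
print(feature_value(grid))
```

The bright border pixels in `grid` are discarded, so only the inner nine values contribute to the percentile.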

  15. How Features Are Chosen • Problem #2: Probes have a maximum length of 25-60 nucleotides • Solution: Use multiple features per gene • Affymetrix claims that this redundancy actually improves detection and quantification of the target gene • [Figure: gene sequence (5’ to 3’) covered by multiple 25-mer oligo probes (features)]

  16. Feature Mismatches • Problem #3: Biologic processes such as hybridization are stochastic • Solution: Include a “control” for each probe: a DNA sequence which differs only slightly from the feature • In a 25-mer, the mismatch sequence differs in the 13th position (A-T or G-C) • [Figure: gene sequence (5’ to 3’) with multiple 25-mer oligo probes, each shown as a perfect match/mismatch pair]
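
As a small illustration of the mismatch control, the sketch below derives a mismatch probe from a perfect-match 25-mer by swapping the base at the 13th position for its complement (A with T, G with C). The function name `mismatch_probe` and the string representation are assumptions made for this example.

```python
# Complement swaps used for the mismatch base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def mismatch_probe(perfect_match):
    """Return the mismatch probe for a perfect-match oligo.

    The middle base (the 13th position of a 25-mer) is replaced by its
    complement; every other base is left unchanged.
    """
    middle = len(perfect_match) // 2  # index 12 for a 25-mer
    swapped = COMPLEMENT[perfect_match[middle]]
    return perfect_match[:middle] + swapped + perfect_match[middle + 1:]

pm = "ACGTACGTACGTGCGTACGTACGTA"  # made-up 25-mer, not a real probe
mm = mismatch_probe(pm)
print(pm)
print(mm)
```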

  17. Background Noise Removal • Problem #4: Background light may skew the fluorescence • “Measure of non-specific fluorescence attributed to hybridization conditions and sample” = Noise • Solution: Estimate background noise and subtract intensity • The array is divided into equal sectors (16 is standard) • For each sector • Find the lowest feature intensities (2%) • Average these • Subtract this average from the intensity value of all features in the sector
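
The sector-based background estimate can be sketched as follows. The function name `subtract_background`, the 2-D list layout, and the decision to always use at least one feature when 2% of a sector rounds down to zero are assumptions for illustration; the slide does not say whether negative results are clamped, so this sketch leaves them as they are.

```python
def subtract_background(intensities, sectors_per_side=4, lowest_fraction=0.02):
    """Subtract a per-sector background estimate from feature intensities.

    `intensities` is a 2-D list of feature values. The array is split into
    sectors_per_side x sectors_per_side sectors (4 x 4 = 16 by default).
    """
    n_rows, n_cols = len(intensities), len(intensities[0])
    corrected = [list(row) for row in intensities]
    for si in range(sectors_per_side):
        r0 = si * n_rows // sectors_per_side
        r1 = (si + 1) * n_rows // sectors_per_side
        for sj in range(sectors_per_side):
            c0 = sj * n_cols // sectors_per_side
            c1 = (sj + 1) * n_cols // sectors_per_side
            cells = [(r, c) for r in range(r0, r1) for c in range(c0, c1)]
            if not cells:
                continue
            values = sorted(intensities[r][c] for r, c in cells)
            # Average the lowest 2% of features (at least one feature).
            k = max(1, int(len(values) * lowest_fraction))
            background = sum(values[:k]) / k
            for r, c in cells:
                corrected[r][c] = intensities[r][c] - background
    return corrected
```

For a tiny 4 x 4 array split into 2 x 2 sectors, each sector's background is simply its minimum value, which makes the behavior easy to check by hand.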

  18. Average Difference Intensity • Problem #5: How do we decide if / how strongly a particular gene is being expressed? • For a given gene • For each feature match/mismatch pair for the given gene • Calculate the difference PM-MM • Calculate μ, σ for this set • Remove outliers from set • Ex: abs( (PM – MM) - μ ) ≥ 3σ • The average (PM – MM) difference over the set (minus outliers) is the average difference intensity • This value can be used to compare expression levels for the gene which the features represent
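
A minimal sketch of the average difference calculation, assuming a pair is treated as an outlier when its PM – MM difference lies at least three standard deviations from the mean of the set. The function name `average_difference`, the use of the population standard deviation, and the handling of a zero standard deviation are assumptions, not details from the slides.

```python
import statistics

def average_difference(pairs, n_sigma=3.0):
    """Average PM - MM difference for one gene, ignoring outliers.

    `pairs` is a list of (PM, MM) intensity tuples for the gene's features.
    """
    diffs = [pm - mm for pm, mm in pairs]
    mu = statistics.mean(diffs)
    sigma = statistics.pstdev(diffs)  # assumption: population std. dev.
    if sigma == 0:
        return mu  # all differences identical, nothing to remove
    # Keep only pairs whose difference is within n_sigma of the mean.
    kept = [d for d in diffs if abs(d - mu) < n_sigma * sigma]
    return statistics.mean(kept) if kept else mu

# 19 well-behaved pairs plus one wildly bright pair (made-up numbers).
pairs = [(300, 200)] * 19 + [(2500, 400)]
print(average_difference(pairs))
```

With only a handful of probe pairs a single point can never sit three population standard deviations from the mean, so the rule only removes anything once the set has more than about ten pairs.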

  19. Positive & Negative Probe Pairs • Problem #5: How do we decide if / how strongly a particular gene is being expressed? • For each perfect match/mismatch probe pair in the feature, perform a standard difference and ratio test • Example SRT and SDT thresholds: • SRT ≈ 1.5 • SDT ≈ a multiple of intensity μ or σ • Positive test: PM/MM ≥ SRT and PM-MM ≥ SDT • If both true, mark probe pair as positive evidence • Negative test: MM/PM ≥ SRT and MM-PM ≥ SDT • If both true, mark probe pair as negative evidence • Otherwise, mark probe pair as inconclusive
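
The ratio and difference tests can be sketched as a small classifier. This assumes the comparisons are "greater than or equal to" the SRT and SDT thresholds; the function name `classify_probe_pair` and the default SDT of 50 are hypothetical values chosen only for the example (the slide gives 1.5 as an example SRT).

```python
def classify_probe_pair(pm, mm, srt=1.5, sdt=50.0):
    """Label one perfect-match/mismatch pair.

    Returns "positive", "negative", or "inconclusive".
    """
    # PM/MM >= SRT is written as PM >= SRT * MM to avoid dividing by zero;
    # the two forms agree for positive intensities.
    if pm >= srt * mm and pm - mm >= sdt:
        return "positive"
    if mm >= srt * pm and mm - pm >= sdt:
        return "negative"
    return "inconclusive"

for pm, mm in [(300, 100), (100, 300), (110, 100)]:
    print(pm, mm, classify_probe_pair(pm, mm))
```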

  20. Voting Methods for Absolute Call • Problem #5: How do we decide if / how strongly a particular gene is being expressed? • Solution: Use decision matrix to make absolute call • Positive/negative ratio PNR = # pos. calls / # neg. calls • Positive fraction PF = # pos. calls / # probe pairs • Log average ratio LA = 10 x avg. ( log (PM/MM) ) VOTE!
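
The three voting metrics are straightforward to compute once each probe pair has been labeled. The sketch below assumes base-10 logarithms for the log average ratio and returns infinity for PNR when there are no negative calls; both choices, and the function name `absolute_call_metrics`, are assumptions. The decision matrix itself is not given in the slides, so it is not reproduced here.

```python
import math

def absolute_call_metrics(pairs, calls):
    """Compute PNR, PF, and LA for one gene.

    `pairs` is a list of (PM, MM) tuples with positive intensities;
    `calls` holds the matching labels "positive", "negative",
    or "inconclusive".
    """
    pos = calls.count("positive")
    neg = calls.count("negative")
    pnr = pos / neg if neg else math.inf   # positive/negative ratio
    pf = pos / len(calls)                  # positive fraction
    logs = [math.log10(pm / mm) for pm, mm in pairs]
    la = 10 * sum(logs) / len(logs)        # log average ratio
    return {"PNR": pnr, "PF": pf, "LA": la}

pairs = [(1000, 100)] * 4  # made-up intensities
calls = ["positive", "positive", "negative", "inconclusive"]
print(absolute_call_metrics(pairs, calls))
```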

  21. Average Difference and Absolute Call • Problem #5: How do we decide if / how strongly a particular gene is being expressed? • Which of these do you base a decision on, for whether a gene is being expressed? • Use the absolute call to decide if a particular gene is being expressed • Use average difference to compare how strongly a gene which is present is expressed

  22. Comparison Analysis • Compare probe sets between two gene chips to determine whether gene expression increased, did not change or decreased • Comparison analysis has its own set of problems: • The signals must be adjusted (if necessary) to normalize average signal levels • For each perfect match/mismatch probe pair in the feature, perform a difference and ratio test • If both true, mark probe pair as evidence of increase from base • (PM/MM)experiment – (PM/MM)base ≥ Change Threshold • (PM-MM)experiment / (PM-MM)base ≥ Percentage Change Threshold • If both true, mark probe pair as evidence of decrease from base • (PM/MM)base – (PM/MM)experiment ≥ Change Threshold • (PM-MM)base / (PM-MM)experiment ≥ Percentage Change Threshold • Otherwise mark probe pair as unchanged
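
The per-pair comparison test might look like the sketch below, which assumes the signals have already been normalized and that both conditions use "greater than or equal to" the thresholds. The function name `classify_change`, the default threshold values, and the guard that skips the ratio of differences when a PM – MM difference is not positive are all assumptions made for illustration.

```python
def classify_change(exp, base, change_threshold=0.5, pct_threshold=1.5):
    """Compare one probe pair between an experiment chip and a base chip.

    `exp` and `base` are (PM, MM) tuples with positive intensities.
    Returns "increase", "decrease", or "unchanged".
    """
    ratio_exp = exp[0] / exp[1]
    ratio_base = base[0] / base[1]
    diff_exp = exp[0] - exp[1]
    diff_base = base[0] - base[1]
    # The ratio of differences is only evaluated when both differences
    # are positive (an assumption; the slides do not cover this case).
    if diff_exp > 0 and diff_base > 0:
        if (ratio_exp - ratio_base >= change_threshold
                and diff_exp / diff_base >= pct_threshold):
            return "increase"
        if (ratio_base - ratio_exp >= change_threshold
                and diff_base / diff_exp >= pct_threshold):
            return "decrease"
    return "unchanged"

print(classify_change((400, 100), (200, 100)))
print(classify_change((200, 100), (400, 100)))
print(classify_change((200, 100), (200, 100)))
```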

  23. Voting Methods for Comparison Call • Increase fraction IR = # increase calls / # probe pairs used • Increase ratio DR = # increase calls / # decrease calls • Log average ratio change LAC = LAexp – LAbase • If a change is called, use the average difference to measure percent change • Are there better ways to extract patterns from multivariate gene expression profiles?
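
For completeness, the comparison-call tallies can be computed in the same style. As before, base-10 logarithms and the use of infinity when there are no decrease calls are assumptions, and `comparison_metrics` is a hypothetical name; how the three values are combined into a final call is not described in the slides.

```python
import math

def log_average_ratio(pairs):
    """LA = 10 x average of log10(PM/MM) over the probe pairs."""
    return 10 * sum(math.log10(pm / mm) for pm, mm in pairs) / len(pairs)

def comparison_metrics(calls, pairs_exp, pairs_base):
    """Tally "increase"/"decrease"/"unchanged" calls for one gene."""
    inc = calls.count("increase")
    dec = calls.count("decrease")
    return {
        "increase_fraction": inc / len(calls),
        "increase_ratio": inc / dec if dec else math.inf,
        "LAC": log_average_ratio(pairs_exp) - log_average_ratio(pairs_base),
    }

calls = ["increase", "increase", "decrease", "unchanged"]
print(comparison_metrics(calls, [(1000, 100)] * 4, [(100, 100)] * 4))
```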

  24. Outline • DNA Microarrays • Fabrication • Application • Microarray Data • Analysis Techniques • New Technology & Open Commentary

  25. Does Moore’s Law apply to Gene Chips? • Ideally, we would like to fit all of an organism’s genes on one chip • Current estimates for Humans are between 30,000 – 40,000 genes

  26. Field-Programmable Microarrays? • Nanogen has produced a silicon chip embedded with 100 “programmable” probe pads • 80 µm platinum pads (each spaced about 200 µm apart) • A voltage (-1.3 to 2.0 V) can be applied to each pad • Since DNA carries a negative charge, applying a positive charge on a pad “corrals” DNA onto that spot • This is used to build custom arrays by washing the chip in a single-stranded DNA solution, biasing the desired spot on the chip, and then chemically fixing the DNA to that spot • The electric charge is also useful during the hybridization reaction • Pooling the DNA onto the charged pads increases the reaction rate by a factor of 1000 • Reversing the charge “shakes loose” imperfectly matched DNA, leading to more accurate results

  27. From the Rumor-Mill • Xeotron Corp: Maskless lithography • An array of micromirrors is used to direct/block light during fabrication • Motorola: 3D microarrays • Arrays with a coating of acrylamide gel to allow “certain enzymatic reactions” to occur that might be important to lab-on-a-chip applications • Motorola: Electrical intensity measures • Arrays contain embedded circuitry to detect hybridization through a change in conductance rather than fluorescence • Ciphergen Biosystems Inc. & Packard Instrument Co.: Protein chips • Creates microarrays of antibodies (rather than DNA) to bind and identify proteins

  28. Acknowledgements • David Paoletti, Ph.D. Student, BIRG Lab, Wright State University. • Berberich, S, and McGorry, M; GeneChip protocols, Wright State University. • Moore, S K; Making chips to probe genes, IEEE Spectrum, March 2001, 54-60. • GeneChip Gene Expression Algorithm Training, Affymetrix.

  29. Questions ? • DNA Microarrays • Fabrication • Application • Microarray Data • Analysis Techniques • New Technology & Open Commentary

  30. The End • DNA Microarrays • Fabrication • Application • Microarray Data • Analysis Techniques • New Technology & Open Commentary
