
Probe design for microarrays using OligoWiz


diantha

Presentation Transcript


  1. Probe design for microarrays using OligoWiz

  2. The DNA Array Analysis Pipeline • Question • Experimental Design • Array design • Probe design • Sample Preparation • Hybridization • Buy Chip/Array • Image analysis • Normalization • Expression Index Calculation • Comparable Gene Expression Data • Statistical Analysis • Fit to Model (time series) • Advanced Data Analysis • Clustering • PCA • Classification • Promoter Analysis • Meta analysis • Survival analysis • Regulatory Network

  3. Probe design for microarrays • What is a Probe • Different Probe Types • OligoWiz • Probe Design • Cross Hybridization and Complexity • Affinity • Position

  4. An Ideal Probe must • Discriminate well between its intended target and all other targets in the target pool • Detect concentration differences under the hybridization conditions

  5. Probe Type comparisons

  6. Custom Microarrays: When on virgin ground • Some technologies available for custom arrays • Spotted arrays • In situ synthesized • NimbleExpress™ Array Program

  7. OligoWiz: a tool for flexible probe design

  8. How does it work? Probe selection • The optimal melting temperature (Tm) for the DNA:DNA or RNA:DNA hybridization is determined for probes of the given length • The optimal probe length is determined for all possible probes along the input sequence • Five scores are calculated for each of these probes • The best probes are selected based on a weighted sum of these scores

  9. The five scores, in order of importance • Cross-hybridization • ∆Tm (deviation from optimal Tm) • Folding (probe self-annealing) • Position (3’ preference) • Low-complexity • All scores are normalized to a value between 0.0 (bad) and 1.0 (best).
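The weighted-sum selection described on the last two slides can be sketched as follows. This is a minimal illustration, not OligoWiz's actual code: the weight values and the names `WEIGHTS`, `total_score` and `best_probe` are assumptions chosen only to respect the stated order of importance.

```python
# Hypothetical weights (assumption): they only mirror the slide's order of importance.
WEIGHTS = {
    "cross_hyb": 5.0,
    "delta_tm": 4.0,
    "folding": 3.0,
    "position": 2.0,
    "low_complexity": 1.0,
}


def total_score(scores):
    """Weighted mean of the five normalized scores; the result stays in [0, 1]."""
    for name in WEIGHTS:
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"{name} score must lie between 0.0 and 1.0")
    weighted = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return weighted / sum(WEIGHTS.values())


def best_probe(candidates):
    """Return the (start_position, scores) pair with the highest total score."""
    return max(candidates, key=lambda candidate: total_score(candidate[1]))
```

Dividing by the sum of the weights keeps the combined value on the same 0.0 to 1.0 scale as the individual scores, so candidates from different transcripts remain comparable.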

  10. How to avoid cross-hybridization • From Kane et al. (2000) we learn that a 50’mer probe can pick up a significant false signal from a non-target sequence that has >75-80% homology to the 50’mer oligo, or that contains a continuous stretch of >15 complementary bases • If we have substantial sequence information on the given organism, we can try to avoid this by choosing oligos that are not similar to any other expressed sequences.
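The two Kane et al. criteria can be expressed as a simple check. The sketch below is an assumption-laden simplification: it compares the probe with an equal-length, ungapped stretch of the other transcript written on the same strand as the probe, and the function name and default thresholds (0.75 and 15) are illustrative choices taken from the figures quoted on the slide.

```python
def may_cross_hybridize(probe, other, min_identity=0.75, min_run=15):
    """Flag a probe/non-target pair that meets either Kane et al. criterion.

    Assumption: `other` is an ungapped, equal-length stretch of a non-target
    transcript, written on the same strand as the probe.
    """
    if len(probe) != len(other):
        raise ValueError("sequences must have equal length (ungapped comparison)")
    matches = 0       # total identical positions
    run = 0           # length of the current stretch of identical positions
    longest_run = 0   # longest stretch seen so far
    for a, b in zip(probe.upper(), other.upper()):
        if a == b:
            matches += 1
            run += 1
            longest_run = max(longest_run, run)
        else:
            run = 0
    identity = matches / len(probe)
    return identity > min_identity or longest_run > min_run
```

A real pipeline would take these numbers from BLAST alignments rather than from a position-by-position comparison.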

  11. Probe Specificity Hughes et al. 2001

  12. Mapping regions of 50 bp without similarity to other transcripts • [Figure: the sequence we want to design a probe for, drawn 5’ to 3’, with BLAST hits (>75% and longer than 15 bp) marked along it and the remaining stretches labeled as regions suitable for probes]
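The mapping step in the figure amounts to finding the stretches of the query that no qualifying BLAST hit covers. A minimal sketch, assuming hits are given as half-open `(start, end)` intervals in query coordinates; the function name and interval convention are assumptions, not part of OligoWiz.

```python
def probe_free_regions(seq_len, hits, probe_len=50):
    """Return (start, end) stretches of at least probe_len bases not covered by any hit.

    `hits` holds half-open (start, end) intervals on the query for BLAST hits
    that are >75% similar and longer than 15 bp.
    """
    regions = []
    cursor = 0  # first position not yet known to be covered
    for start, end in sorted(hits):
        if start - cursor >= probe_len:
            regions.append((cursor, start))
        cursor = max(cursor, end)
    if seq_len - cursor >= probe_len:
        regions.append((cursor, seq_len))
    return regions
```

For a 300 bp query with hits at 60-100, 90-130 and 170-200, only the stretches 0-60 and 200-300 are long enough to hold a 50 bp probe.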

  13. Filtering out self-detecting BLAST hits • [Figure: the sequence we want to design an oligo for, with BLAST hits (>75% and longer than 15 bp) marked; one hit covers sequence identical or very similar to the query sequence] • Such hits come from the query itself. Therefore, BLAST hits with homology > 97% and a ‘hit length vs. query length’ ratio > 0.8 are not considered.
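The filtering rule can be written as a small predicate. In this sketch the `Hit` record and both function names are hypothetical; only the thresholds (97%, 0.8, 75%, 15 bp) come from the slides.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    identity: float  # percent identity reported for the hit
    length: int      # alignment length in bases


def is_self_hit(hit, query_len):
    """True when the hit most likely comes from the query transcript itself."""
    return hit.identity > 97.0 and hit.length / query_len > 0.8


def hits_to_consider(hits, query_len):
    """Keep hits that pass the >75% / >15 bp cut-off and are not self hits."""
    return [
        hit
        for hit in hits
        if hit.identity > 75.0 and hit.length > 15 and not is_self_hit(hit, query_len)
    ]
```

Both conditions of the self-hit rule must hold, so a short but near-identical hit (for example a shared domain) is still counted against the probe.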

  14. Cross-hybridization expressed as a score • [Figure: an oligo with its BLAST hits drawn beneath it on a scale from 0 to 100%; the maximum hit in position i is used] • Only BLAST hits that passed filtering are considered • Let m be the number of BLAST hits considered in position i, and let h = (h1i, ..., hmi) be the BLAST hits in position i in the oligo • n is the length of the oligo
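The formula itself did not survive in this transcript. One plausible reading of the definitions above is sketched below: take the strongest hit at each position, average over the n positions, and subtract from 1. Treat this as an assumption, not as the published OligoWiz formula; the function name and the hit tuple layout are also illustrative.

```python
def cross_hyb_score(oligo_len, hits):
    """Score in [0, 1]; 1.0 means no filtered BLAST hit touches the oligo.

    `hits` holds (start, end, percent_identity) tuples in oligo coordinates,
    with half-open intervals. Assumed form: 1 - sum_i max(h_i) / (100 * n).
    """
    max_hit = [0.0] * oligo_len  # strongest hit (in percent) at each position
    for start, end, identity in hits:
        for i in range(max(0, start), min(oligo_len, end)):
            max_hit[i] = max(max_hit[i], identity)
    return 1.0 - sum(max_hit) / (100.0 * oligo_len)
```

Under this reading, a 50’mer whose first 25 bases are covered by an 80% hit scores 0.6.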

  15. Similar affinity for all oligos • Another way of ensuring optimal discrimination between target and non-target under hybridization is to design all the oligos on an array with similar affinity for their targets • This allows the experimentalist to optimize the hybridization conditions for all oligos by choosing the right hybridization temperature and salt concentration • Melting temperature (Tm) is commonly used as a measure of DNA:DNA or RNA:DNA hybrid affinity.

  16. Melting temperature difference • In the Tm calculation, ∆H (kcal/mol) is the sum of the nearest-neighbor enthalpy changes, A is a constant for helix initiation corrections, ∆S is the sum of the nearest-neighbor entropy changes, R is the gas constant (1.987 cal deg⁻¹ mol⁻¹) and Ct is the total molar concentration of strands • In the ∆Tm calculation, N is all oligos in all sequences.
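The equations on this slide were not preserved. The symbols listed match the widely used nearest-neighbor expression Tm = ∆H / (A + ∆S + R ln(Ct/4)) - 273.15, so the sketch below assumes that form. The default A = -10.8 cal/(K mol), the omission of any salt correction, and the use of the mean Tm over all N oligos as the optimal Tm are all assumptions made for illustration.

```python
import math

R = 1.987  # gas constant in cal / (K mol)


def melting_temperature(dh_kcal, ds_cal, ct, a=-10.8):
    """Nearest-neighbor Tm in degrees Celsius.

    dh_kcal: summed nearest-neighbor enthalpy change (kcal/mol)
    ds_cal:  summed nearest-neighbor entropy change (cal / (K mol))
    ct:      total molar strand concentration
    a:       helix initiation constant (assumed value)
    """
    kelvin = 1000.0 * dh_kcal / (a + ds_cal + R * math.log(ct / 4.0))
    return kelvin - 273.15


def delta_tm(tms):
    """Deviation of each oligo's Tm from the mean Tm over all oligos (assumed optimum)."""
    mean_tm = sum(tms) / len(tms)
    return [abs(tm - mean_tm) for tm in tms]
```

With made-up inputs of ∆H = -400 kcal/mol, ∆S = -1100 cal/(K mol) and Ct = 1 µM, this gives a Tm of roughly 77 °C.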

  17. Tm distributions for 30’mers and 50’mers

  18. ∆Tm distribution for probe length intervals

  19. Avoid self-annealing oligos • Sensitivity may be affected: probes that form strong hybrids with themselves, i.e. probes that fold, should be avoided • But accurate folding algorithms, like those employed by mFOLD or RNAfold, are too time-consuming for large-scale folding of oligos • Time consumption: mFOLD ~2 sec per 30’mer; per gene (500 bp, about 470 candidate 30’mers) ~16 min.

  20. Folding an oligonucleotide: an approximation • Dynamic programming: alignment to inverted self • The alignment is based on dinucleotides (AT, TG, CT, ..., CG, GT, TT) • The substitution matrix is based on binding energies • [Figure: alignment matrix of the oligo's dinucleotides against its inverted self, with the minimal loop size border marked]

  21. Folding a lot of oligos: a fast heuristic implementation • Full dynamic programming calculation for the first probe • Dynamic programming calculation for the second probe, and so on up to the last probe • [Figure: super-alignment matrix over the dinucleotides of the whole sequence (AT, TG, CT, ..., CG, GT, TT), with the minimal loop size border and the areas calculated for the first, second and last probes marked]
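To make the idea of aligning an oligo to its inverted self concrete, here is a deliberately simplified dynamic programming sketch. It is not the OligoWiz algorithm: it pairs single bases instead of dinucleotides, counts base pairs instead of using binding energies, allows only perfect stems, and does not reuse calculations between overlapping probes. The function name and the default minimal loop size of 3 are assumptions.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def longest_stem(oligo, min_loop=3):
    """Length of the longest perfect hairpin stem the oligo can form with itself."""
    seq = oligo.upper()
    n = len(seq)
    # stem[i][j]: stacked base pairs in a stem whose outermost pair is (i, j)
    stem = [[0] * n for _ in range(n)]
    best = 0
    # Grow the span so the inner pair (i + 1, j - 1) is always filled first;
    # spans shorter than min_loop + 1 would leave too few unpaired loop bases.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            if COMPLEMENT.get(seq[i]) == seq[j]:
                inner = stem[i + 1][j - 1] if j - 1 > i + 1 else 0
                stem[i][j] = inner + 1
                best = max(best, stem[i][j])
    return best
```

For example, `longest_stem("GGGGAAAACCCC")` finds the four G:C pairs closed by the AAAA loop, while a poly-A oligo has no stem at all.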

  22. Reasonable folding prediction compared to mFOLD

  23. Probes with very common subsequences may result in unspecific signal • If the sub-fractions of an oligo are very common we define it as ‘low-complex’ • Oligo with low-complexity: AAAAAAAGGAGTTTTTTTTCAAAAAACTTTTTAAAAAAGCTTTAGGTTTTTA (Human) • Oligo without low-complexity: CGTGACTGACAGCTGACTGCTAGCCATGCAACGTCATAGTACGATGACT (Human)

  24. Low-complexity expressed as a score • For a given transcriptome, a list of information content for all ‘words’ of length wl (8 bp) is calculated, where f(w) is the number of occurrences of a pattern and tf(w) is the total number of patterns of length wl • A low-complexity score for a given oligo is defined as: Low-complexity = 1 - norm, where norm is a function that normalizes to a value between 0 and 1, L is the length of the oligo and wi is the pattern in position i.
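Since the exact information-content formula is missing from this transcript, the sketch below substitutes a simpler stand-in: the relative frequency f(w)/tf(w) of each word, averaged over the oligo and normalized by the most common word in the background. The function names and this normalization are assumptions; only the word-counting idea and the "1 - norm" shape come from the slide.

```python
from collections import Counter


def word_frequencies(transcripts, wl=8):
    """Relative frequency f(w) / tf(w) of every word of length wl in the background."""
    counts = Counter()
    for seq in transcripts:
        for i in range(len(seq) - wl + 1):
            counts[seq[i:i + wl]] += 1
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def low_complexity_score(oligo, freqs, wl=8):
    """1.0 for an oligo made of rare words, near 0.0 for one made of the commonest word."""
    words = [oligo[i:i + wl] for i in range(len(oligo) - wl + 1)]
    if not words or not freqs:
        return 1.0
    max_freq = max(freqs.values())
    # Assumed normalization: mean word frequency relative to the commonest word.
    norm = sum(freqs.get(word, 0.0) for word in words) / (len(words) * max_freq)
    return 1.0 - norm
```

In practice the word table would be built once per species database and reused for every candidate oligo.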

  25. Location of oligo within transcript • Labeling includes reverse transcription of the mRNA and is sensitive to: • RNA degradation • Premature termination of cDNA synthesis • Premature termination of cRNA transcription (IVT) • Eukaryote position score: 3’ preference • Prokaryote position score: preference toward 3’, but avoid the ~50 most 3’ bases • Typically, eukaryote sample labeling is done by poly-T priming and bacterial sample labeling by random labeling
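The slides do not give the position score function, so the following is purely illustrative: a linear 3’ preference, with a hard cut-off for prokaryotes inside the last ~50 bases. The function name, the linear shape and the hard zero are all assumptions.

```python
def position_score(probe_end, transcript_len, prokaryote=False, avoid_3prime=50):
    """Illustrative position score in [0, 1]; higher means closer to the 3' end.

    probe_end is the index just past the probe's last base, counted from the 5' end.
    """
    distance_to_3prime = transcript_len - probe_end
    if prokaryote and distance_to_3prime < avoid_3prime:
        return 0.0  # assumed: probes inside the ~50 most 3' bases are rejected
    return 1.0 - distance_to_3prime / transcript_len
```

With this shape, a eukaryote probe ending at the very 3’ end scores 1.0, while the same probe in a prokaryote transcript scores 0.0.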

  26. Species databases • Databases for 398 species are currently available • The species databases are built from complete genomic sequences, or from UniGene collections in the case of vertebrates • The databases are used for: • Cross-hybridization • Low-complexity

  27. Sequence features • Intron/exon structure, UTR regions etc. • Special purpose arrays • Example: detecting differential splicing • [Figure: a transcript drawn as Exon, Intron, Exon, Exon, Exon]

  28. Annotation string: single letter code • Sequence: ATGTCTACATATGAAGGTATGTAA • Annotation: (EEEEEEEEEEEEEE)DIIIIIII • E: Exon • I: Intron • (: Start of exon • ): End of exon • D: Donor site • A: Acceptor site

  29. Probe placement using Regular Expressions search in annotation
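Because the annotation string has one character per base, a regular expression match in the annotation gives the coordinates of the matching stretch of sequence directly. The sketch below uses Python's `re` module on the example from the annotation slide; the pattern and the function name are illustrative and do not reproduce OligoWiz's own rule syntax.

```python
import re

SEQUENCE = "ATGTCTACATATGAAGGTATGTAA"
ANNOTATION = "(EEEEEEEEEEEEEE)DIIIIIII"


def find_probe_sites(sequence, annotation, pattern):
    """Return (start, subsequence) for every non-overlapping match in the annotation."""
    if len(sequence) != len(annotation):
        raise ValueError("annotation must have one character per base")
    return [
        (match.start(), sequence[match.start():match.end()])
        for match in re.finditer(pattern, annotation)
    ]


# Hypothetical rule: a probe spanning an exon end, the donor site and the intron start.
EXON_INTRON_JUNCTION = r"E{5}\)DI{3}"
print(find_probe_sites(SEQUENCE, ANNOTATION, EXON_INTRON_JUNCTION))
```

On the example this finds one site starting at position 10, covering ATGAAGGTAT. A probe placed this way would only hybridize to transcripts that retain the intron, which is the kind of rule a differential splicing array needs.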

  30. Extracting annotation from GenBank files • FeatureExtract server • www.cbs.dtu.dk/services/FeatureExtract

  31. Exercise • Running OligoWiz 2.0 • Java 1.4.1 or better is required • Input data • Sequence only (FASTA) • Sequence and annotation • Rule-based placement of multiple probes • Distance criteria • Annotation criteria • Please go to the exercise web-page linked from the course program
