Structural properties, scales and profiles

DNA structural properties in functional genomics. Pieter Meysman, Kathleen Marchal and Kristof Engelen

Structural properties, scales and profiles The structural properties of the DNA molecule can be roughly divided into two categories: Conformational: details of the static DNA structure. Physicochemical: dynamic potential of the DNA structure or the free energy.

DNA-binding proteins Nucleosomes Recent models that estimate the dinucleotide deformation energy are better at predicting nucleosome positioning. -- rigidity

DNA-binding proteins Transcription factor binding sites Structural preferences can be used to predict new binding sites.

Promoters Eukaryotes In general, promoters are more rigid than the remainder of the genome, which is important for excluding nucleosomes from the promoter region. The proximal promoter (where most TF binding sites are located) shows a decrease in rigidity, which allows TFs to bind. Extreme rigidity values are embedded in elements such as the TATA box (rigid regions are found even when the TATA box is absent). These properties can be used for promoter prediction. -- stability -- rigidity

Promoters Prokaryotes Promoters are less stable, more rigid and have more extreme curvature. -- stability -- rigidity -- curvature

Transposons Transposons preferentially insert into sites with a consensus sequence, a typical deformability and a high bendability.

Use of structural DNA properties for the prediction of transcription-factor binding sites in Escherichia coli. Meysman P, Dang TH, Laukens K, De Smet R, Wu Y, Marchal K, Engelen K. Nucleic Acids Res. 2011 Jan;39(2):e6. Epub 2010 Nov 4.

Structural scales

CRoSSeD methodology CRoSSeD uses structural properties to model and predict novel binding sites. Green: binding sites from RegulonDB
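To make the idea of a structural scale and profile concrete, here is a minimal sketch of how a dinucleotide scale can be turned into a profile along a sequence. It is not the CRoSSeD implementation: the scale values in `TOY_SCALE` are placeholders invented for illustration, and the function names `structural_profile` and `smooth` are hypothetical.

```python
from itertools import product
from typing import Dict, List

# Hypothetical dinucleotide scale. The values (0.0 for AA up to 15.0 for TT)
# are placeholders and are NOT taken from any published structural scale.
TOY_SCALE: Dict[str, float] = {
    a + b: float(i) for i, (a, b) in enumerate(product("ACGT", repeat=2))
}


def structural_profile(seq: str, scale: Dict[str, float]) -> List[float]:
    """Look up one scale value per dinucleotide step along the sequence."""
    seq = seq.upper()
    return [scale[seq[i:i + 2]] for i in range(len(seq) - 1)]


def smooth(profile: List[float], window: int) -> List[float]:
    """Sliding-window mean, which is commonly used to reduce local noise."""
    return [
        sum(profile[i:i + window]) / window
        for i in range(len(profile) - window + 1)
    ]


profile = structural_profile("AACGTTGCA", TOY_SCALE)
smoothed = smooth(profile, 3)
```

A sequence of length n yields n - 1 values for a dinucleotide scale. A real analysis would replace `TOY_SCALE` with a published scale (for example a rigidity, stability or curvature scale).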

Method evaluation Creating a positive synthetic dataset

Method evaluation 40 positive 1000 negative

Method evaluation Training set 36 positive 104 unknown 900 negative

Method evaluation

Method evaluation The synthetic dataset was used to compare the predictive power of CRoSSeD to: Position-weight matrix (PWM). CRFseq: di- and trinucleotide relationships. BioBayesNet: a structure-based methodology using Bayesian networks.
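For reference, the PWM baseline can be sketched as follows. This is a generic log-odds PWM under stated assumptions (uniform background of 0.25, pseudocount of 1), not the exact configuration used in the comparison; the example sites and the names `build_pwm`, `score` and `best_hit` are made up for illustration.

```python
import math
from typing import Dict, List, Tuple

Pwm = List[Dict[str, float]]


def build_pwm(sites: List[str], pseudocount: float = 1.0,
              background: float = 0.25) -> Pwm:
    """Build a log-odds PWM from aligned binding sites of equal length."""
    pwm: Pwm = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({
            base: math.log2((counts[base] / total) / background)
            for base in "ACGT"
        })
    return pwm


def score(pwm: Pwm, window: str) -> float:
    """Sum the per-position log-odds for one candidate site."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm: Pwm, seq: str) -> Tuple[int, float]:
    """Scan every window of the sequence and return (position, score)."""
    width = len(pwm)
    hits = ((i, score(pwm, seq[i:i + width]))
            for i in range(len(seq) - width + 1))
    return max(hits, key=lambda hit: hit[1])


# Hypothetical aligned sites, invented for this example.
toy_sites = ["TGTGA", "TGTGA", "TGTGA", "TGAGA"]
toy_pwm = build_pwm(toy_sites)
position, hit_score = best_hit(toy_pwm, "CCTGTGACC")
```

A PWM treats each position independently, which is the limitation that methods modelling dinucleotide relationships or structural properties try to address.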

Method evaluation

Method evaluation on a real dataset Real datasets were derived from experimentally confirmed binding sites of E. coli (obtained from RegulonDB). Positive: all known binding sites. Negative: 1000 random sequences. For 17 out of 27 TFs, the CRoSSeD model outperformed the other three methods.

Screening for novel binding sites To evaluate these novel targets, the authors used gene expression data and an extensive review of the literature.

Screening for novel binding sites 14 out of 23 gene sets were enriched for high-scoring predicted binding sites obtained from the structural model.
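The slides do not state which statistical test was used to call a gene set enriched. As one common choice, the sketch below computes a one-sided hypergeometric p-value; both the choice of test and the function name `hypergeom_tail` are assumptions, and the numbers in the usage line are invented.

```python
from math import comb


def hypergeom_tail(population: int, successes: int,
                   draws: int, observed: int) -> float:
    """P(X >= observed) for a hypergeometric draw without replacement.

    population: all genes considered
    successes:  genes with a high-scoring predicted binding site
    draws:      size of the gene set being tested
    observed:   genes in the set that have a high-scoring predicted site
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favourable / total


# Invented numbers: 4000 genes, 200 with a predicted site,
# a gene set of 30 genes of which 8 carry a predicted site.
p_value = hypergeom_tail(4000, 200, 30, 8)
```

A small p-value would indicate that the gene set contains more predicted sites than expected by chance; with 23 gene sets, a multiple-testing correction would normally be applied as well.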

CRP Binds as a dimer. Introduces two kinks. The values of this flexibility scale are derived from DNase I.

PurR Induces a single kink by intercalating a pair of leucines into the minor groove. The highest weight was assigned to stability (the disruption energy).

Using sequence-specific chemical and structural properties of DNA to predict transcription factor binding sites. Bauer AL, Hlavacek WS, Unkefer PJ, Mu F. PLoS Comput Biol. 2010 Nov 18;6(11):e1001007.

SiteSleuth Combines DNA structure prediction (MD), computational chemistry and machine learning to identify and predict new TF binding sites.

Sequence-specific DNA structure (A) Same base, different shape: GCTGGGC (left) is twisted −4.3 degrees; GCAGAGC (right) is twisted −20.4 degrees. (B) Different bases, similar shape: GCCAGGC (left) is twisted −9.5 degrees; GCCGGGC (right) is twisted −9.5 degrees. (MD simulations)

Mapping of DNA MD simulations Base-pair parameters (6): shear, buckle, stretch, propeller, stagger and opening. Base-pair step parameters: shift, tilt, slide, roll, rise and twist.

Chemical features Interaction energy between the DNA and 31 probes


Support Vector Machine (SVM) For each of the 54 TFs: Positive: binding sites from RegulonDB. Negative: 10,000 randomly selected non-coding sequences. Classifier: SVM.
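The slides give no details of the SVM itself (kernel, parameters or software). As a rough illustration of the positive/negative training setup, the sketch below fits a linear soft-margin SVM by full-batch subgradient descent on the hinge loss. The linear kernel, the hyperparameters, the two-dimensional toy features and the names `train_linear_svm` and `predict` are all assumptions, not the SiteSleuth implementation.

```python
from typing import List, Sequence, Tuple


def train_linear_svm(X: Sequence[Sequence[float]], y: Sequence[int],
                     lam: float = 0.01, lr: float = 0.1,
                     epochs: int = 200) -> Tuple[List[float], float]:
    """Minimise lam/2 * |w|^2 + mean hinge loss by subgradient descent.

    Labels must be +1 (binding site) or -1 (background sequence).
    """
    n, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute
                for j in range(n_features):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy feature vectors, invented for illustration. In practice each vector
# would hold the structural and chemical features of one candidate site.
X_train = [[2.0, 2.0], [3.0, 2.5], [2.5, 3.0],
           [-2.0, -2.0], [-3.0, -2.5], [-2.5, -3.0]]
y_train = [1, 1, 1, -1, -1, -1]
weights, bias = train_linear_svm(X_train, y_train)
```

With far more negatives than positives, as in the setup above, class weighting or subsampling of the negatives is usually worth considering; the slides do not say how this imbalance was handled.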

Support Vector Machine (SVM) ?

Comparison methods BvH, MATRIX SEARCH, Match, QPMEME

Cross-validation SiteSleuth outperforms all other methods in 28 of the 54 cases.

SiteSleuth vs. BvH BvH predicts more estimated false positives than SiteSleuth.

Validation against ChIP-chip data * SiteSleuth produced the fewest false positives. * SiteSleuth outperformed the other methods, with 41% correct predictions.

Conclusions * Adding shape information can help in predicting new binding sites. * Although SiteSleuth produces the highest fraction of correct predictions, that fraction is still small (about 40%).
