Modeling Sequence Specificity of Transcription Factors with DNA Structural Features
Tianyin Zhou
11/06/2013
Data set
• Data:
  • 3 human bHLH TFs: Mad, Max and Myc
  • 62 TFs from Dream5
• Composition of shape features:
  • Linear terms: MGW_i, Roll_i, ProT_i, HelT_i
  • Bilinear terms: MGW_i*MGW_{i+1}, Roll_i*Roll_{i+1}, ProT_i*ProT_{i+1}, HelT_i*HelT_{i+1}
• Feature combinations:
  • PWM (1mer)
  • 1mer + 2mer
  • 1mer + 2mer + 3mer
  • 1mer + shape features
  • shape features
• Algorithms:
  • Support vector regression
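The feature combinations listed on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the encoding actually used in the talk: the function names (`kmer_one_hot`, `shape_features`, `encode_1mer_plus_shape`) are hypothetical, the k-mer features are assumed to be position-specific one-hot vectors, and the per-position shape values are assumed to be supplied by the caller from some shape prediction step that the slides do not name.

```python
from itertools import product

BASES = "ACGT"


def kmer_one_hot(seq, k=1):
    """Position-specific one-hot encoding of every k-mer in seq (assumed encoding)."""
    kmers = ["".join(p) for p in product(BASES, repeat=k)]
    index = {kmer: j for j, kmer in enumerate(kmers)}
    features = []
    for i in range(len(seq) - k + 1):
        block = [0.0] * len(kmers)
        block[index[seq[i:i + k]]] = 1.0
        features.extend(block)
    return features


def shape_features(profile):
    """Linear terms x_i followed by bilinear terms x_i * x_{i+1} for one shape parameter."""
    linear = list(profile)
    bilinear = [a * b for a, b in zip(profile, profile[1:])]
    return linear + bilinear


def encode_1mer_plus_shape(seq, shape_profiles):
    """Concatenate 1mer features with MGW, Roll, ProT and HelT shape features.

    shape_profiles is a hypothetical dict mapping each shape parameter name
    to its list of per-position values for this sequence.
    """
    vec = kmer_one_hot(seq, k=1)
    for name in ("MGW", "Roll", "ProT", "HelT"):
        vec.extend(shape_features(shape_profiles[name]))
    return vec


if __name__ == "__main__":
    seq = "CACGTG"
    # Placeholder numbers for illustration only, not real shape predictions.
    profiles = {
        "MGW": [5.0, 4.8, 5.2, 5.1],
        "Roll": [1.0, -2.0, 3.0, -1.5],
        "ProT": [-6.0, -7.5, -5.0, -8.0],
        "HelT": [34.0, 33.5, 35.0, 34.2],
    }
    x = encode_1mer_plus_shape(seq, profiles)
    print(len(x))
```

Vectors built this way, paired with measured binding affinities, could then be passed to any support vector regression implementation. The 1mer + 2mer and 1mer + 2mer + 3mer combinations follow by concatenating `kmer_one_hot(seq, k)` for each k.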
• 1mer+shape outperforms other feature combinations.
• Shape alone performs as well as 1mer+2mer and 1mer+2mer+3mer.
• Performance of 1mer+shape is less sensitive to sample size than that of 1mer+2mer and 1mer+2mer+3mer.
Dream5 data
• 1mer+shape outperforms 1mer.
Results
• 1mer+4shape has comparable performance to 1mer+2mer.