Making the Most of Small Sample High Dimensional Micro-Array Data
Allan Tucker, Veronica Vinciotti, Xiaohui Liu (Brunel University)
Paul Kellam (Windeyer Institute)
Micro-Array Data
• High dimensional
• Small number of samples
• Need to identify predictive genes
• E.g. classification
• Rate confidence on genes based upon predictive ability / classification
Identifying Predictive Genes
• We use Naïve Bayes Classifier
• Well established
• Minimises parameters
• Feature selection using SA
• Repeated 10 times
• Apply cross validation
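The slide above can be illustrated with a minimal sketch. It assumes that SA stands for simulated annealing and that the naïve Bayes classifier uses Gaussian class-conditional densities; neither is stated on the slide. The function names (`fit_gnb`, `predict_gnb`, `cv_accuracy`, `sa_select`), the single-feature flip move, the geometric cooling schedule, and all default values are hypothetical choices made for illustration, not the authors' implementation.

```python
import math
import random

def fit_gnb(X, y, feats):
    """Fit a Gaussian naive Bayes model on the chosen feature indices."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        stats = []
        for f in feats:
            vals = [r[f] for r in rows]
            mean = sum(vals) / len(vals)
            # Small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in vals) / len(vals) + 1e-6
            stats.append((mean, var))
        model[c] = (math.log(len(rows) / len(y)), stats)
    return model

def predict_gnb(model, x, feats):
    """Return the class with the highest log posterior."""
    def score(c):
        log_prior, stats = model[c]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x[f] - mean) ** 2 / (2 * var)
            for f, (mean, var) in zip(feats, stats))
    return max(model, key=score)

def cv_accuracy(X, y, feats, k=5):
    """k-fold cross-validated accuracy for one feature subset."""
    if not feats:
        return 0.0
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        model = fit_gnb([X[i] for i in train], [y[i] for i in train], feats)
        correct += sum(predict_gnb(model, X[i], feats) == y[i] for i in test)
    return correct / len(X)

def sa_select(X, y, max_feats=3, iters=200, t0=0.1, cooling=0.98, seed=0):
    """Simulated annealing over feature subsets, scored by CV accuracy."""
    rng = random.Random(seed)
    n = len(X[0])
    current = {rng.randrange(n)}
    cur_score = cv_accuracy(X, y, sorted(current))
    best, best_score = set(current), cur_score
    temp = t0
    for _ in range(iters):
        cand = set(current)
        cand ^= {rng.randrange(n)}  # flip one feature in or out
        if 0 < len(cand) <= max_feats:
            score = cv_accuracy(X, y, sorted(cand))
            # Always accept non-worse moves; accept worse moves with
            # a probability that shrinks as the temperature falls.
            if score >= cur_score or rng.random() < math.exp((score - cur_score) / temp):
                current, cur_score = cand, score
                if score > best_score:
                    best, best_score = set(cand), score
        temp *= cooling
    return sorted(best), best_score
```

Calling `sa_select(X, y)` on a list of expression rows `X` and class labels `y` returns the best feature subset found and its cross-validated accuracy. Repeating the call with different `seed` values gives the repeated stochastic runs mentioned on the slide.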
Identifying Predictive Genes
• Identify genes robustly
• Data perturbed during CV
• Repeats of stochastic SA search
• Assign confidence based upon the frequencies of genes being selected
• Limit maximum number of links
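The frequency-based confidence described above can be sketched as follows. This assumes the confidence of a gene is simply the fraction of runs (cross-validation folds times search repeats) in which it was selected; the slide does not give the exact formula. The function name `selection_confidence` and the gene labels are hypothetical.

```python
from collections import Counter

def selection_confidence(selected_sets):
    """Fraction of runs in which each gene was selected."""
    # set(genes) ensures a gene counts at most once per run.
    counts = Counter(g for genes in selected_sets for g in set(genes))
    n_runs = len(selected_sets)
    return {g: c / n_runs for g, c in counts.items()}

# Hypothetical output of four repeated feature-selection runs.
runs = [["geneA", "geneB"], ["geneA"], ["geneA", "geneC"], ["geneB", "geneA"]]
conf = selection_confidence(runs)
for gene, value in sorted(conf.items(), key=lambda kv: -kv[1]):
    print(gene, value)
```

Genes that are selected in nearly every run score close to 1, while genes picked up by chance in a single run score close to 0.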
Classification Accuracy
• Generally RSN performs best
• SA global search better than local
• Anomaly with B-Cell?
• Synthetic data supports global over local
Confidence Scores
• Relatively small number of genes
• Identified with high confidence
• Consistency between runs
Conclusions
• When micro-array data only has small samples:
• Simple models with few parameters work best
• Global search for parameters works better than local search
• Proposed RSN successfully identifies genes of interest, paving the way for further biological analysis
• Need to explore different parameters