Can human experts predict solubility better than computers? This study compares the accuracy of human predictions and machine learning algorithms in determining the solubility of drug-like molecules, highlighting the value of computational models in pharmaceutical research.
Humanity v The Machines: An AI Challenge
Dr John Mitchell, St Andrews
Sam Boobier, Anne Osbourn & John Mitchell, Can human experts predict solubility better than computers? J Cheminformatics, 9:63 (2017)
Which would you prefer ... or ...? Solubility in water (and other biological fluids) is highly desirable for pharmaceuticals!
Solubility is an important issue in drug discovery and a major cause of failure of drug development projects. This is expensive for the pharma industry, and patients suffer from a lack of available treatments. A good computational model for predicting the solubility of druglike molecules would be very valuable.
Solubility You might think that “How much solid compound dissolves in 1 litre of water” is a simple question to answer. However, experiments are prone to large errors. Solution takes time to reach equilibrium, and results depend on pH, temperature, ionic strength, solid form, impurities etc.
Humankind vs The Machines The challenge is to predict the solubilities of 25 molecules, given 75 as training data.
Wisdom of Crowds Francis Galton (1907) described a competition at a country fair, where participants were asked to estimate the mass of a cow. Individual entries were not particularly reliable, but Galton realised that by combining these guesses a much more reliable estimate could be obtained.
Wisdom of Crowds Guess the mass of the cow: the median of the individual guesses is a good estimator. Francis Galton, Vox populi, Nature, 75, 450-451 (1907).
Wisdom of Crowds This is an ensemble predictor which works by aggregating individual independent estimates, and generates a result that is more reliable than the individual guesses and more accurate than the large majority of them.
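The median aggregation described above can be sketched in a few lines of Python. This is only an illustration: the guesses and the "true" mass are invented numbers, not Galton's data, and `crowd_estimate` is a hypothetical helper name.

```python
from statistics import median


def crowd_estimate(guesses):
    """Combine independent guesses into one ensemble estimate (the median)."""
    if not guesses:
        raise ValueError("need at least one guess")
    return median(guesses)


# Hypothetical guesses in kg, invented purely for illustration.
guesses = [480, 510, 530, 545, 560, 590, 700]
true_mass = 550  # hypothetical reference value

estimate = crowd_estimate(guesses)
crowd_error = abs(estimate - true_mass)

# Count how many individual guesses are further from the truth than the median.
beaten = sum(abs(g - true_mass) > crowd_error for g in guesses)
print(f"median estimate = {estimate}, beats {beaten} of {len(guesses)} guesses")
```

The median is used rather than the mean because it is less sensitive to a few wild guesses, such as the 700 in this made-up list.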
Humankind vs The Machines We sent 229 emailed invitations to subject experts and students and obtained 22 anonymous responses, of which 17 contained full sets of predictions.
Humankind vs The Machines 10 machine learning algorithms were given the same training & test sets as the human panel.
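Comparing humans and algorithms on the same test set requires scoring each predictor against the experimental values. The slides do not name the error metric, so the sketch below assumes root-mean-square error (RMSE) in logS0 units; the function name and all numbers are hypothetical.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between predicted and observed values."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical logS0 values for four test compounds (not the study's data).
observed = [-3.0, -4.5, -2.0, -5.0]
predicted = [-2.0, -4.5, -3.0, -5.0]

score = rmse(predicted, observed)
print(f"RMSE = {score:.3f} logS0 units")
```

The same function can score a human respondent or a machine learning model, which is what makes a like-for-like comparison possible.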
ML Methods for Solubility
• Experimental data: errors are unknown (0.5-0.7 logS0 units?) but limit the possible accuracy of models;
• Differences in dataset size and composition often hinder comparisons of methods;
• ML is numerically better than first principles (but FP is not widely validated), at the cost of less insight.
Figure (values shown: 0.99 and 0.94): difference not significant.
Another Layer of Wisdom of Crowds
• We don’t know in advance which predictors will be good and which will be poor.
• However, we can make an algorithm that will allow us to generate a good (consensus) prediction without prior knowledge of results.
Wisdom of Crowds: Human Consensus Predictor Guess for the solubility of the molecule: Median of all (between 17 & 21) individual human guesses of logS0 for a given compound.
Wisdom of Crowds: Machine Consensus Predictor Guess for the solubility of the molecule: Median of all 10 individual machine guesses of logS0 for a given compound.
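Both consensus predictors follow the same recipe: for each compound, take the median of whatever individual guesses are available. A minimal sketch follows, assuming predictions are stored as nested dictionaries; the predictor names, compound names, values and the `consensus_predictions` helper are all hypothetical.

```python
from statistics import median


def consensus_predictions(predictions_by_predictor):
    """Return the per-compound median across all predictors that gave a guess.

    predictions_by_predictor maps predictor name -> {compound: logS0 guess}.
    Predictors may skip compounds, so the number of guesses can vary.
    """
    compounds = sorted(
        {c for preds in predictions_by_predictor.values() for c in preds}
    )
    return {
        c: median(p[c] for p in predictions_by_predictor.values() if c in p)
        for c in compounds
    }


# Hypothetical logS0 guesses; predictor_3 did not answer for compound_B.
predictions = {
    "predictor_1": {"compound_A": -3.0, "compound_B": -4.0},
    "predictor_2": {"compound_A": -2.5, "compound_B": -5.0},
    "predictor_3": {"compound_A": -3.5},
}

consensus = consensus_predictions(predictions)
print(consensus)
```

Allowing missing entries mirrors the human panel, where the number of guesses per compound varied between 17 and 21; for the machine panel every compound would simply have 10 entries.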
Figure (values shown: 1.14 and 1.09): difference not significant.
Conclusions: Humans v ML
• Best humans and best algorithms perform almost equally;
• Consensus of humans and consensus of algorithms perform almost equally;
• Less effective individual human predictors are notably weaker;
• Both humans and ML are numerically clearly better than a physics-based first principles theory approach.*
* On a similar but non-identical dataset; David Palmer, James McDonagh, John Mitchell, Tanja van Mourik & Maxim Fedorov, First-Principles Calculation of the Intrinsic Aqueous Solubility of Crystalline Druglike Molecules, J Chem Theory Comput, 8, 3322-3337 (2012)
Thanks
• Tanja van Mourik (St Andrews), Neetika Nath, James McDonagh (now IBM), Rachael Skyner (now Diamond, Oxford), Sam Boobier (now Leeds), Will Kew (now Edinburgh)
• Maxim Fedorov, Dave Palmer (Strathclyde)
• Laura Hughes (now Stanford), Toni Llinas (AZ), Anne Osbourn (JIC, Norwich)