Humanity v The Machines

Humanity v The Machines: An AI Challenge. Dr John Mitchell, St Andrews. Sam Boobier, Anne Osbourn & John Mitchell, Can human experts predict solubility better than computers? J Cheminformatics, 9:63 (2017). Image: scmp.com

The Challenge: Computing Aqueous Solubility

Which would you prefer ... or ...? Solubility in water (and other biological fluids) is highly desirable for pharmaceuticals!

Solubility is an important issue in drug discovery and a major cause of failure of drug development projects. • Expensive for the pharma industry; • Patients suffer from a lack of available treatments. A good computational model for predicting the solubility of druglike molecules would be very valuable.

Solubility You might think that “How much solid compound dissolves in 1 litre of water” is a simple question to answer. However, experiments are prone to large errors. Solution takes time to reach equilibrium, and results depend on pH, temperature, ionic strength, solid form, impurities etc.

Humankind vs The Machines The challenge is to predict the solubilities of 25 molecules, given 75 as training data.

Video explanation of wisdom of crowds

Wisdom of Crowds Francis Galton (1907) described a competition at a country fair, where participants were asked to estimate the mass of a cow. Individual entries were not particularly reliable, but Galton realised that by combining these guesses a much more reliable estimate could be obtained.

Wisdom of Crowds Guess the mass of the cow: the median of the individual guesses is a good estimator. Francis Galton, Vox populi, Nature, 75, 450-451 (1907).

Wisdom of Crowds This is an ensemble predictor which works by aggregating individual independent estimates, and generates a result that is more reliable than the individual guesses and more accurate than the large majority of them.
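The median-based estimator described above can be illustrated with a minimal sketch. The guesses and the true value below are made-up numbers chosen for illustration, not Galton's data, and the function names are hypothetical.

```python
from statistics import median


def median_consensus(guesses):
    """Combine independent guesses into one estimate by taking their median."""
    return median(guesses)


def fraction_beaten(guesses, truth):
    """Fraction of individual guesses with a strictly larger error than the consensus."""
    consensus_error = abs(median_consensus(guesses) - truth)
    worse = sum(1 for g in guesses if abs(g - truth) > consensus_error)
    return worse / len(guesses)


# Hypothetical guesses (arbitrary mass units) and a hypothetical true value.
guesses = [480, 520, 610, 700, 450, 560, 530, 900, 575]
truth = 550

print("consensus:", median_consensus(guesses))
print("fraction of guessers beaten:", fraction_beaten(guesses, truth))
```

The median is used rather than the mean because a single wild guess (here 900) barely shifts it.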

Humankind vs The Machines We sent 229 email invitations to subject experts and students and obtained 22 anonymous responses, of which 17 contained full sets of predictions.

Humankind vs The Machines 10 machine learning algorithms were given the same training & test sets as the human panel.

ML Methods for Solubility • Experimental data: errors are unknown (perhaps 0.5-0.7 logS0 units?) but limit the achievable accuracy of models; • Differences in dataset size and composition often hinder comparisons between methods; • ML is numerically better than first-principles (FP) methods (though FP is not widely validated), at the cost of less insight.

[Figure: comparison of two predictors, values shown: 0.99 and 0.94.] Difference not significant.

Machine Learning Algorithms [Figure: ranking of the algorithms, with the 1st and 2nd ranked marked.]

Another Layer of Wisdom of Crowds • We don’t know in advance which predictors will be good and which will be poor. • However, we can make an algorithm that will allow us to generate a good (consensus) prediction without prior knowledge of results.

Wisdom of Crowds: Human Consensus Predictor Guess for the solubility of the molecule: Median of all (between 17 & 21) individual human guesses of logS0 for a given compound.

Wisdom of Crowds: Machine Consensus Predictor Guess for the solubility of the molecule: Median of all 10 individual machine guesses of logS for a given compound.
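The two consensus predictors above share one recipe: for each compound, take the median of whatever individual predictions are available. Below is a minimal sketch of that recipe. The prediction values are invented for illustration, `None` is an assumed way of marking a missing guess (the human panel supplied between 17 and 21 guesses per compound), and RMSE is included only as one common way to score the result.

```python
from math import sqrt
from statistics import median


def consensus_predictions(predictions):
    """Per-compound median over predictors; None marks a missing guess."""
    n_compounds = len(predictions[0])
    consensus = []
    for i in range(n_compounds):
        # Collect the guesses that were actually made for compound i.
        available = [p[i] for p in predictions if p[i] is not None]
        consensus.append(median(available))
    return consensus


def rmse(predicted, observed):
    """Root-mean-square error between predicted and observed values."""
    total = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return sqrt(total / len(observed))


# Hypothetical logS0 predictions: one row per predictor, one column per compound.
predictions = [
    [-2.0, -4.5, -3.0],
    [-1.0, None, -3.5],
    [-3.0, -3.5, -2.0],
    [-2.5, -4.0, -6.0],
]
observed = [-2.0, -4.0, -3.0]  # hypothetical experimental values

consensus = consensus_predictions(predictions)
print("consensus:", consensus)
print("RMSE:", round(rmse(consensus, observed), 3))
```

No predictor is weighted or excluded, so the consensus can be built without knowing in advance which predictors are good.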

[Figure: comparison of the consensus predictors, values shown: 1.14 and 1.09.] Difference not significant.

Conclusions: Humans v ML • The best humans and the best algorithms perform almost equally; • The consensus of humans and the consensus of algorithms perform almost equally; • Less effective individual human predictors are notably weaker; • Both humans and ML are numerically clearly better than a physics-based first-principles approach.* (* On a similar but non-identical dataset; David Palmer, James McDonagh, John Mitchell, Tanja van Mourik & Maxim Fedorov, First-Principles Calculation of the Intrinsic Aqueous Solubility of Crystalline Druglike Molecules, J Chem Theory Comput, 8, 3322-3337 (2012).)

Thanks • Tanja van Mourik (St Andrews), Neetika Nath, James McDonagh (now IBM), Rachael Skyner (now Diamond, Oxford), Sam Boobier (now Leeds), Will Kew (now Edinburgh) • Maxim Fedorov, Dave Palmer (Strathclyde) • Laura Hughes (now Stanford), Toni Llinas (AZ), Anne Osbourn (JIC, Norwich)
