Modeling Progress in Artificial Intelligence: Challenges, Frameworks, and Future Trends

Modeling Progress in AI Miles Brundage Consortium for Science, Policy, and Outcomes Arizona State University

Overview • Motivations • Conceptual challenges/approaches • Algorithmic vs. hardware/data/etc. progress • Different cognitive domains and/vs. general AI • Applications • Go • Atari • Future of work • Future directions Slide 2 of 25

Motivations • The future of AI matters… • Why not model it rigorously? As done with… Slide 3 of 25

Motivations (cont’d) • Concepts and existence proofs from other domains • Methodological clues • Model-based (Armstrong et al., 2014), • Quantitative (Mullins, 2012), • Short-term (Ibid) • Simple extrapolations often beat expert intuitions (Roper et al., 2011) Slide 4 of 25

Relevant Literatures • AI evaluation • Natural intelligence evaluation • Technology forecasting • Technology roadmapping Eryilmaz et al. 2015 Slide 5 of 25

Limitations • Not integrated • Focused on one agent (vs. overall AI community) • Static (vs. dynamic) Slide 6 of 25

Challenges • “Progress in AI is linear/exponential” • Y-axis? • Unit/level(s) of analysis? • Metric(s)? • Confounding variables Clark, 2015 Bowling et al., 2015 Ginesh, 2011 Slide 7 of 25

Proposed Framework • AI performance benefits from algorithmic progress, greater compute power, more/better data, and human input (including research, programming, feedback, etc.) • Vs… • Algorithmic AI progress is an increase in the ability to efficiently convert computing power to performance in a given cognitive domain or in a range of domains, controlling for human input and data Slide 8 of 25
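One way to make the definition above concrete is to ask how much less compute a newer system needs to reach the same performance as an older one, with data and human input held fixed. The sketch below is only an illustration under that assumption: the function names, the log-share decomposition, and all numbers are hypothetical and are not taken from the talk.

```python
import math


def algorithmic_gain(compute_old: float, compute_new: float) -> float:
    """Factor by which the compute needed for the SAME performance fell.

    Assumes both systems are evaluated on the same task with comparable
    data and human input (the "controlling for" part of the definition).
    """
    if compute_old <= 0 or compute_new <= 0:
        raise ValueError("compute must be positive")
    return compute_old / compute_new


def log_shares(hardware_growth: float, algo_gain: float) -> tuple[float, float]:
    """Split growth in 'effective compute' into algorithmic and hardware shares.

    Shares are taken in log terms, an assumption made here so that the two
    multiplicative factors add up; other decompositions are possible.
    Returns (algorithmic_share, hardware_share).
    """
    log_hw = math.log(hardware_growth)
    log_alg = math.log(algo_gain)
    total = log_hw + log_alg
    if total == 0:
        raise ValueError("no net change in effective compute")
    return log_alg / total, log_hw / total


# Hypothetical numbers: the old system needed 100 units of compute to hit a
# target score, the new one needs 25, while available hardware grew 4x.
gain = algorithmic_gain(100.0, 25.0)
alg_share, hw_share = log_shares(4.0, gain)
print(gain, alg_share, hw_share)
```

The log-based split is a design choice: it treats a doubling from hardware and a doubling from algorithms as equal contributions, which a plain ratio of raw factors would not.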

Proposed Framework (cont’d) • Rigorous definitions of intelligence, tasks, abilities, difficulty, etc. needed • Many foundations developed in Legg and Hutter, 2007; Goertzel, 2009; Hernandez-Orallo, 2016 • Need empirical study of different dimensions of AI performance increase Slide 9 of 25

Role of Hardware • Speed and/vs. level (Carroll, 1993) • $C_{\text{total}} = C_{\text{quality}} + C_{\text{speedup}}$ • Algorithmic progress and hardware progress enable Pareto optimal improvements • Multiple reference levels He et al., 2015 with annotations Slide 10 of 25

Role of Hardware (cont’d) • Research pace speedup • ~1/2 of historical performance improvement in six domains (Grace 2013) • Multiple reference levels (say, 2) which increase over time Slide 11 of 25

Abilities • Internally correlated, externally distinct task classes • General intelligence is superset • Within-ability proficiency: • “The ability of an individual subject to perform a specified kind of task is the difficulty E at which the probability is 1/2 that he will do that task” • (Thurstone, 1937) Slide 12 of 25
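Thurstone's definition can be read operationally: measure success rates at several difficulty levels and find the difficulty at which the success rate crosses 1/2. The sketch below does this by linear interpolation between observed points. The interpolation method, the function name, and the sample data are assumptions for illustration only; a fitted psychometric curve would be a natural alternative.

```python
def ability_at_half(points: list[tuple[float, float]]) -> float:
    """Estimate the difficulty at which success probability equals 1/2.

    points: (difficulty, observed success rate) pairs. Success rate is
    assumed to fall as difficulty rises. Uses linear interpolation
    between the two observations that bracket 0.5.
    """
    pts = sorted(points)
    # An observation exactly at 1/2 answers the question directly.
    for difficulty, rate in pts:
        if rate == 0.5:
            return difficulty
    # Otherwise look for adjacent observations on either side of 1/2.
    for (d0, p0), (d1, p1) in zip(pts, pts[1:]):
        if p0 > 0.5 > p1:
            return d0 + (p0 - 0.5) * (d1 - d0) / (p0 - p1)
    raise ValueError("success rate never crosses 1/2 in the observed range")


# Hypothetical success rates for one system on tasks of rising difficulty.
observations = [(1, 0.95), (2, 0.80), (3, 0.60), (4, 0.40), (5, 0.10)]
print(ability_at_half(observations))  # roughly 3.5 for this toy data
```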

Preliminary Synthesis • Perception • Manipulation • Language processing • Learning • Planning • Social cognition • Inference • Steps toward rigor • task difference measure (Hernandez-Orallo, 2016) • Empirical analysis Slide 13 of 25

(Deep) Learning Hegemony? • How to model interdependencies and synergies? • Perception • Manipulation • Language processing • Learning • Planning • Inference • Social cognition Slide 14 of 25

Go • “Large, sudden jump in strength” • Jon Diamond, President of the British Go Association, quoted by Jack Clark of Bloomberg • It was an improvement, but a smaller one than some made it out to be, after controlling for: • Hardware • Effort • Data Silver et al., 2016 Slide 15 of 25

Go (cont’d) • Expert judgment vs. simple extrapolations • Forecast by Yamashita in 2011: cross-over in 4 years • Compare to Facebook’s darkfmcts3 and Zen19X Silver et al., 2016 data, extrapolated to 5 mins/turn or more hardware Slide 16 of 25
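A "simple extrapolation" of the kind contrasted with expert judgment above can be as basic as fitting a straight line to past ratings and solving for when it reaches a human reference level. The sketch below shows that procedure with ordinary least squares. The data are made-up toy values in arbitrary units, not Yamashita's or Silver et al.'s figures, and the function names are hypothetical.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def crossover_time(times: list[float], ratings: list[float], target: float) -> float:
    """Time at which the fitted trend line reaches the target rating."""
    slope, intercept = fit_line(times, ratings)
    if slope <= 0:
        raise ValueError("trend is not improving, so no crossover is implied")
    return (target - intercept) / slope


# Toy data only: time in years since a first measurement, rating in
# arbitrary units, and an assumed human reference level of 26.
years = [0, 1, 2, 3, 4]
ratings = [10, 12, 14, 16, 18]
print(crossover_time(years, ratings, target=26))  # 8.0 for this toy data
```

A linear trend is itself an assumption; the same structure works with any monotone fit, and the choice of functional form is exactly where such forecasts can differ.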

Atari • Less human input for same performance or increased performance for same input = progress • DQN uses raw pixels • No hyperparameter adjustment across games • Exponentially improving performance, faster training Slide 17 of 25

Atari (cont’d) • Human-scaled performance in 6 DeepMind papers, 2015-2016 Mnih et al., 2016 data • 2015 = approximate crossover point for human level as well as classical planning performance Slide 18 of 25
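Assuming "human-scaled" here refers to the normalization used in Mnih et al., 2015, a raw game score is rescaled so that random play maps to 0% and the human reference maps to 100%. The sketch below applies that formula and summarizes across games with a median. The game names and scores are hypothetical placeholders, not published results.

```python
import statistics


def human_normalized(agent: float, random_play: float, human: float) -> float:
    """Return 100 * (agent - random) / (human - random).

    0 means random-play level, 100 means the human reference level,
    and values above 100 mean the agent outscored the human reference.
    """
    if human == random_play:
        raise ValueError("human and random scores must differ")
    return 100.0 * (agent - random_play) / (human - random_play)


# Hypothetical raw scores: game -> (agent, random play, human reference).
scores = {
    "game_a": (50.0, 0.0, 100.0),
    "game_b": (300.0, 100.0, 200.0),
    "game_c": (12.0, 2.0, 42.0),
}

normalized = {
    game: human_normalized(agent, rand, human)
    for game, (agent, rand, human) in scores.items()
}
print(normalized)
print(statistics.median(normalized.values()))
```

Reporting a median rather than a mean keeps one game with an extreme ratio from dominating the summary, which matters when per-game scales differ widely.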

Atari (cont’d) • Steady trend for >5 years • Some games still subhuman, but their number is decreasing Mnih et al., 2015, with annotations Data from several publications, details available upon request Slide 19 of 25

AI Progress and Jobs: Current Theories • Murnane/Levy 2004: • Routine vs. non-routine • Autor 2013: • Novelty of tasks • Brynjolfsson/McAfee 2014: • Creativity • Rus 2015: • Perception/manipulation (of some types) • Abstraction, creativity • Frey and Osborne 2013: • Social intelligence • Creative intelligence • Perception and manipulation Slide 20 of 25

Issues with Current Theories • Performance and algorithmic progress not disentangled • Insufficient attention to: • robustness considerations • Interrelationships between abilities • generality Slide 21 of 25

Example: Frey and Osborne, 2013 • Reasons for skepticism re: resilience of perception/manipulation, social intelligence, and creative intelligence in face of: • New datasets • Disproportionate effort • Hardware • Deep RL progress calls into question independence of abilities

Future Directions • Principled demarcations of abilities and possible interrelationships • More empirical analysis of progress rates • Monte Carlo simulations of future ability levels Slide 23 of 25
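As a rough picture of what a Monte Carlo simulation of future ability levels might look like, the sketch below draws a noisy yearly improvement many times and reports how often a threshold is reached. Everything in it is an assumption made for illustration: the additive Gaussian growth model, the parameter values, and the function names. It models a single ability and ignores the interdependencies between abilities raised earlier in the talk.

```python
import random
import statistics


def simulate_levels(start: float, mean_gain: float, sd_gain: float,
                    years: int, n_runs: int, seed: int = 0) -> list[float]:
    """Simulate final ability levels after `years` of noisy yearly gains.

    Each year's gain is drawn independently from a normal distribution
    (an assumed model). A fixed seed makes the runs reproducible.
    """
    rng = random.Random(seed)
    finals = []
    for _ in range(n_runs):
        level = start
        for _ in range(years):
            level += rng.gauss(mean_gain, sd_gain)
        finals.append(level)
    return finals


def prob_at_least(finals: list[float], threshold: float) -> float:
    """Fraction of simulated runs ending at or above the threshold."""
    return sum(level >= threshold for level in finals) / len(finals)


# Hypothetical setup: ability starts at 50 (on a scale where 100 is the
# human reference), with yearly gains of 10 +/- 5, simulated for 5 years.
finals = simulate_levels(start=50.0, mean_gain=10.0, sd_gain=5.0,
                         years=5, n_runs=2000, seed=1)
print(statistics.median(finals))
print(prob_at_least(finals, threshold=100.0))
```

In practice the growth parameters would come from empirical progress rates, and the per-year draws could be correlated across abilities rather than independent.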

Acknowledgments • Thanks to David Guston, Joanna Bryson, Erik Fisher, Mark Gubrud, Daniel Dewey, Katja Grace, Kaj Sotala, Brad Knox, Vincent Mueller, Beau Cronin, Adi Wiezel, David Dalrymple, Adam Elkus, Jose Hernandez-Orallo, an anonymous reviewer, and others for helpful comments on this and related work. Slide 24 of 25

Thanks! • www.milesbrundage.com • miles.brundage@asu.edu Slide 25 of 25
