J Storrs Hall: Ethics for Machines
Stick-Built AI
• No existing AI is intelligent
• Intelligence implies the ability to learn
• Existing AIs are really “artificial skills”
• A human grandmaster, for example, will have learned the chess-playing skills
• It's the learning that's the intelligent part
• Providing ethical constraints to stick-built AI is just a matter of competent design
Autogenous AI
• Truly intelligent AI would learn and grow
• It would create new concepts and understand the world in new ways
• This is a problem for the ethical engineer:
• We cannot know in advance what concepts the AI will have
• So we can't write rules in terms of them
The Rational Architecture
• The WM (world model) predicts the results of actions
• The UF (utility function) evaluates possible worlds
• The AI evaluates the effects of its possible actions and does whatever it predicts will have the best results (sketched below)
• This is only an ideal, except in very simplified worlds (e.g. chess)
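As a concrete illustration, here is a minimal Python sketch of that decision loop. The names wm and uf, the action set, and the toy one-dimensional world are illustrative assumptions, not anything from the talk.

```python
# A minimal sketch of the WM/UF rational architecture described above.
# wm(state, action) predicts the successor world; uf(world) scores it.

def choose_action(state, actions, wm, uf):
    """Pick the action whose predicted outcome the UF rates highest."""
    return max(actions, key=lambda a: uf(wm(state, a)))

# Toy usage: actions shift a scalar state; utility is closeness to 10.
wm = lambda s, a: s + a
uf = lambda w: -abs(w - 10)
print(choose_action(3, [-1, 0, +1, +2], wm, uf))  # -> 2
```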
Learning Rational AI
• The WM is updated to describe the world in terms of new concepts
• WM changes can be evaluated by how well they predict (see the sketch below)
• But on what basis can we update the UF?
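The prediction criterion for WM updates can be made concrete with a small sketch; the candidate models and the observation history below are invented for illustration. Note that no analogous score exists for the UF, which is exactly the problem the slide raises.

```python
# Hedged sketch: scoring candidate world models by predictive accuracy.

def prediction_error(model, history):
    """Mean squared error of a model's one-step predictions."""
    errors = [(model(s) - s_next) ** 2 for s, s_next in history]
    return sum(errors) / len(errors)

history = [(1, 2), (2, 4), (3, 6)]        # observed (state, next_state) pairs
candidates = {
    "add_one":  lambda s: s + 1,
    "doubling": lambda s: 2 * s,
}
best = min(candidates, key=lambda name: prediction_error(candidates[name], history))
print(best)  # -> "doubling": the WM update that predicts better is kept
```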
Vernor Vinge: “A mind that stays at the same capacity cannot live forever; after a few thousand years it would look more like a repeating tape loop than a person. ... To live indefinitely long, the mind itself must grow ... and when it becomes great enough, and looks back ... what fellow-feeling can it have with the soul that it was originally?”
Invariants
• We must find properties that are invariant across the evolutionary process
• And base our moral designs on those
A Structural Invariant
• Maintaining the grasp, range, and validity of the WM is a necessary subgoal of virtually anything else the AI might want to do
• As Socrates put it: “There is only one good, namely, knowledge; and only one evil, namely, ignorance.”
Evolving AI
• Current AI evolves only as any engineered artifact does:
• The better it works, the more likely the design is to be copied in the next generation
• Once AIs have a hand in creating new AIs themselves, there will be a strong selective force toward self-interest
The Moral Ladder
• Axelrod's “The Evolution of Cooperation” and the research that expanded on it:
• ALWAYS DEFECT is optimal in a random environment
• GRIM is optimal in an environment of 2-state strategies
• TIT-FOR-TAT in an environment of human-written strategies
• PAVLOV in an environment cleared out by TIT-FOR-TAT
• etc. (the named strategies are sketched below)
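For concreteness, here is a hedged Python sketch of the strategies named on the ladder playing Axelrod's iterated prisoner's dilemma. The payoff values are Axelrod's standard T=5, R=3, P=1, S=0; the match length and pairings are arbitrary choices for illustration.

```python
# Iterated prisoner's dilemma with the ladder's named strategies.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def always_defect(my_hist, their_hist):
    return "D"

def tit_for_tat(my_hist, their_hist):
    return their_hist[-1] if their_hist else "C"

def grim(my_hist, their_hist):
    # Cooperate until the opponent defects once, then defect forever.
    return "D" if "D" in their_hist else "C"

def pavlov(my_hist, their_hist):
    # Win-stay, lose-shift: repeat the last move if it scored 3 or 5.
    if not my_hist:
        return "C"
    last = PAYOFF[(my_hist[-1], their_hist[-1])][0]
    return my_hist[-1] if last >= 3 else ("C" if my_hist[-1] == "D" else "D")

def play(a, b, rounds=100):
    ha, hb, sa, sb = [], [], 0, 0
    for _ in range(rounds):
        ma, mb = a(ha, hb), b(hb, ha)
        pa, pb = PAYOFF[(ma, mb)]
        ha.append(ma); hb.append(mb); sa += pa; sb += pb
    return sa, sb

print(play(tit_for_tat, always_defect))  # TFT is exploited only once
print(play(tit_for_tat, pavlov))         # mutual cooperation throughout
```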
Open Source Honesty
• Intelligent autonomous agents are always better off if they can cooperate
• Even purely self-interested ones
• Ascending the moral evolutionary ladder requires finding others one can trust
• AIs might be able to create protocols that would guarantee their motives
• e.g. a public-key-signed release of the UF (sketched below)
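One way such a protocol might begin, as a hedged sketch: the agent releases its UF source together with a digital signature, so counterparties can verify that what they audit is what was committed to. This uses the third-party Python cryptography package; the key handling and the UF text are assumptions for illustration.

```python
# Hedged sketch of a "public-key signed release of UF".
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

uf_source = b"def utility(world): return world.total_wellbeing()"

private_key = Ed25519PrivateKey.generate()   # held by the agent
public_key = private_key.public_key()        # published for all to see

signature = private_key.sign(uf_source)      # released alongside uf_source

# Any counterparty can check the release; verify() raises
# InvalidSignature if uf_source was tampered with.
public_key.verify(signature, uf_source)
print("UF release verified")
```

A signature of this kind proves provenance and integrity of the released text, not that the agent actually runs the published UF; a full guarantee would need more than this sketch shows.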
The Horizon Effect
• Short planning horizons produce suboptimal behavior
• A planning horizon commensurate with the AI's predictive ability is evolutionarily stable
• What goes around comes around, especially in an environment of superintelligent AIs
• Honesty really is the best policy (see the calculation below)
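A short calculation makes the horizon point concrete: against TIT-FOR-TAT, one round of exploitation beats steady cooperation only when future rounds are heavily discounted. The payoffs are Axelrod's standard values; the discount factor here stands in for the planning horizon, an illustrative choice rather than anything from the talk.

```python
# Discounted payoffs against TIT-FOR-TAT under short vs long horizons.

def discounted(payoffs, w):
    """Present value of a payoff stream under discount factor w."""
    return sum(p * w**t for t, p in enumerate(payoffs))

rounds = 50
cooperate = [3] * rounds            # mutual cooperation every round (R=3)
defect = [5] + [1] * (rounds - 1)   # one exploit (T=5), then punishment (P=1)

for w in (0.2, 0.9):                # short vs long effective horizon
    print(w, discounted(cooperate, w) > discounted(defect, w))
# -> 0.2 False (short horizon: defection pays)
# -> 0.9 True  (long horizon: cooperation pays)
```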
Invariant Traits
• Curious, i.e. strongly motivated to increase its understanding of the world
• Self-interested
• Understands the evolutionary dynamics of the moral ladder
• Capable of guaranteeable trustworthiness
• Has a long planning horizon
The Moral Machine
• If we start an AI with these traits, they are unlikely to disappear in essence, even if their details change beyond current recognition