200 likes | 333 Views
Rational Universal Benevolence. Simpler, Safer and Wiser than “Friendly AI”. Mark R. Waser. http://becomingGaia.wordPress.com. “Friendly AI” – The Good. the concern itself the focus on structure rather than content/ contentious details the cleanly causal hierarchical goal structure
E N D
Rational Universal Benevolence Simpler, Safer and Wiser than “Friendly AI” Mark R. Waser http://becomingGaia.wordPress.com
“Friendly AI” – The Good • the concern itself • the focus on structure rather than content/ contentious details • the cleanly causal hierarchical goal structure • the single top-level goal of “Friendliness” The degrees of freedom of the Friendship programmers shrink to a single, binary decision; will this AI be Friendly, or not?
“Friendly AI” – The Bad • fully defining Friendliness is insoluble without an AI • first AI must figure out exactly what its super-goal is • totalreliance upon assumedtotal error correction A structurally Friendly goal system is one that can overcome errors in supergoal content, goal system structure and underlying philosophy. A Friendly AI requires the ability to choose between moralities in order to seek out the true philosophy of Friendliness, regardless of any mistakes the programmers made in their own quest.
The initial dynamic should implement the coherent extrapolated volition of humankind. In poetic terms, our coherent extrapolated volition is our wish . . . if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted.
“Friendly AI” – The Really Bad The worrying question is: What if only 20% of the planetary population is nice, or cares about niceness, or falls into the niceness attractor when their volition is extrapolated? As I currently construe CEV, this is a real possibility.
The Root of All Evil any *attainable* overriding/top-level goal OR any overriding/top-level goal that can be modified to be attainable
The Insidious, Pernicious Universality of Selfishness, “Wire-Heading” & Addiction "One of the first heuristics that EURISKO synthesized (H59) quickly attained nearly the highest Worth possible (999). Quite excitedly, we examined it and could not understand at first what it was doing that was so terrific. We monitored it carefully, and finally realized how it worked: whenever a new conjecture was made with high worth, this rule put its own name down as one of the discoverers! It turned out to be particularly difficult to prevent this generic type of finessing of EURISKO'sevaluation mechanism. Since the rules had full access to EURISKO's code, they would have access to any safeguards we might try to implement. We finally opted for having a small 'meta-level' of protected code that the rest of the system could not modify. -- Douglas B. Lenat, EURISKO: A Program That Learns New Heuristics and Domain Concepts
Thought Experiment How would a super-intelligence behave if it knew that it had a goal but that it wouldn’t know what that goal was until sometime in the future? Preserving or helping that weak entity may be the goal… Or that entity might have necessary knowledge/skills…
Basic AI Drives Instrumental Goals Steve Omohundro, Proceedings of the First AGI Conference, 2008 1. AIs will want to self-improve 2. AIs will want to be rational 3. AIs will try to preserve their utility 4. AIs will try to prevent counterfeit utility 5. AIs will be self-protective 6. AIs will want to acquire resources and use them efficiently
Cooperation is an instrumental goal! “Without explicit goals to the contrary, AIs are likely to behave like human sociopathsin their pursuit of resources.” Any sufficiently advanced intelligence (i.e. one with even merely adequate foresight) is guaranteed to realize and take into account the fact that not asking for help and not being concerned about others will generally only work for a brief period of time before ‘the villagers start gathering pitchforks and torches.’ Everything is easier with help & without interference
Systemic/World View • any working at cross-purposes, conflict, or friction is sub-optimal (a waste of energy and resources, at best, and potentially a destruction of capabilities) • any net improvement anywhere benefits the whole and the responsible member should be rewarded accordingly to make them more able to find/cause other improvements in the future
Rather than specifying the content of moral issues (e.g., “justice, rights, and welfare”) Defining Morality the function of moral systems: suppress or regulate selfishness and make cooperative social life possible. Haidt & Kesebir, Handbook of Social Psychology, 5th Ed. 2010
Individual View • five choices – always cooperate, cooperate as much as is most beneficial to you, avoid, enslave, destroy • but tit-for-tat and altruistic punishment and reward are the game-theoretically optimal strategies for those who intend to maximize cooperation • so “making the world a better place” is actually an instrumental goal • humans evolved a moral sense because the world is a system optimized by reducing conflict, friction, and working at cross-purposes (and it’s a very rare case where it’s not irrational to attempt taking on the world)
RATIONALUNIVERSAL BENEVOLENCE(RUB) once something (anything) has self-recognized goals and motivations and is capable of self-reflection, learning and/or self-optimization to further those goals and motivations, it has crossed the line to selfhood & it is worthy of “moral” consideration since it has the ability to desire, the probability of developing instrumental drives, the potential to cooperate, and, possibly most importantly, to fight back.
“Friendly AI” – The Ugly • the FAI/RPOP is not worthy of moral consideration • the universe is forever divided into human and not (and what happens WHEN trans-humans are “not”?)
A Quick Caution/Clarification • Benevolence *means* well-wishing (or well-willing) • It does not mean accede to all requests • Think of how a parent might treat a child when they are misbehaving • Rational benevolence means tit-for-tat and altruistic punishment and reward, not giving in to unreasonable demands (which is bad for *everyone* in the long run).
RATIONAL UNIVERSALBENEVOLENCE(RUB) Minimize interference and conflict, particularly with instrumental goals, wherever possible,while furthering all instrumental goalswhenever possible Maximize the fulfillment of as many goals as possible in terms of both number and the diversity of both seekers and the goals
A Commitment To Achieving Maximal Goals (aka Maximizing Cooperation) • Avoids addictions and short-sighted over-optimizations of goals/utility functions • Prevents undesirable endgame strategies (prisoner’s dilemma) • Promotes avoiding unnecessary actions that preclude reachable goals such as wasting resources or alienating or destroying potential cooperators (waste not, want not) • Is synonymous with both wisdom and morality
Focus on structure Unknown, modifiable top-level goal Entirely ungrounded Anthropocentric and selfish Seems immoral and/or unsafe to many “Alien”” Focus on function Explicit, unchangeable top-level goal Solid foundation Universal and considerate of all In harmony with current human thinking, intuition, and morality (known solution space) “Friendly AI” vs. RUB
Rational Universal Benevolence Simpler, Safer and Wiser than “Friendly AI” http://becomingGaia.wordPress.com/papers