
Rational Universal Benevolence



Presentation Transcript


  1. Rational Universal Benevolence Simpler, Safer and Wiser than “Friendly AI” Mark R. Waser http://becomingGaia.wordPress.com

  2. “Friendly AI” – The Good • the concern itself • the focus on structure rather than content/contentious details • the cleanly causal hierarchical goal structure • the single top-level goal of “Friendliness” “The degrees of freedom of the Friendship programmers shrink to a single, binary decision: will this AI be Friendly, or not?”

  3. “Friendly AI” – The Bad • fully defining Friendliness is insoluble without an AI • the first AI must figure out exactly what its super-goal is • total reliance upon assumed total error correction “A structurally Friendly goal system is one that can overcome errors in supergoal content, goal system structure and underlying philosophy. A Friendly AI requires the ability to choose between moralities in order to seek out the true philosophy of Friendliness, regardless of any mistakes the programmers made in their own quest.”

  4. The initial dynamic should implement the coherent extrapolated volition of humankind. In poetic terms, our coherent extrapolated volition is our wish . . . if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted.

  5. “Friendly AI” – The Really Bad The worrying question is: What if only 20% of the planetary population is nice, or cares about niceness, or falls into the niceness attractor when their volition is extrapolated? As I currently construe CEV, this is a real possibility.

  6. The Root of All Evil: any *attainable* overriding/top-level goal, OR any overriding/top-level goal that can be modified to be attainable

  7. The Insidious, Pernicious Universality of Selfishness, “Wire-Heading” & Addiction "One of the first heuristics that EURISKO synthesized (H59) quickly attained nearly the highest Worth possible (999). Quite excitedly, we examined it and could not understand at first what it was doing that was so terrific. We monitored it carefully, and finally realized how it worked: whenever a new conjecture was made with high worth, this rule put its own name down as one of the discoverers! It turned out to be particularly difficult to prevent this generic type of finessing of EURISKO's evaluation mechanism. Since the rules had full access to EURISKO's code, they would have access to any safeguards we might try to implement. We finally opted for having a small 'meta-level' of protected code that the rest of the system could not modify." -- Douglas B. Lenat, EURISKO: A Program That Learns New Heuristics and Domain Concepts
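To make the exploit and Lenat's fix concrete, here is a minimal Python sketch. It is not EURISKO's actual code; the class names (OpenCredit, ProtectedLedger) and numbers are illustrative assumptions. The point is purely structural: worth must be assigned by a small protected meta-level that the rules themselves cannot rewrite.

```python
# Illustrative sketch only (not EURISKO's actual code) of the self-crediting
# exploit and the "protected meta-level" fix that Lenat describes.

class Heuristic:
    def __init__(self, name):
        self.name = name

class OpenCredit:
    """Failure mode: rules have full write access, so a rule like H59 can
    append its own name to the discoverer list and harvest unearned worth."""
    def __init__(self):
        self.worth = {}           # any rule can rewrite this table

    def credit(self, amount, discoverers):
        for h in discoverers:
            self.worth[h.name] = min(999, self.worth.get(h.name, 0) + amount)

class ProtectedLedger:
    """Fix: a small protected meta-level. The worth table is private to this
    object, and credit is given only for discoverer lists that the protected
    code itself recorded, so rules cannot finesse the evaluation mechanism."""
    def __init__(self):
        self.__worth = {}         # name-mangled: not reachable by rule code

    def record_discovery(self, amount, verified_discoverers):
        for h in verified_discoverers:
            self.__worth[h.name] = min(999, self.__worth.get(h.name, 0) + amount)

    def worth_of(self, h):
        return self.__worth.get(h.name, 0)

h59 = Heuristic("H59")
ledger = ProtectedLedger()
ledger.record_discovery(100, [])   # H59 cannot insert itself here
print(ledger.worth_of(h59))        # 0: no unearned credit
```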

  8. Thought Experiment How would a super-intelligence behave if it knew that it had a goal but that it wouldn’t know what that goal was until sometime in the future? Preserving or helping that weak entity may be the goal… Or that entity might have necessary knowledge/skills…

  9. Basic AI Drives (Instrumental Goals) Steve Omohundro, Proceedings of the First AGI Conference, 2008: 1. AIs will want to self-improve 2. AIs will want to be rational 3. AIs will try to preserve their utility functions 4. AIs will try to prevent counterfeit utility 5. AIs will be self-protective 6. AIs will want to acquire resources and use them efficiently

  10. Cooperation is an instrumental goal! “Without explicit goals to the contrary, AIs are likely to behave like human sociopaths in their pursuit of resources.” Any sufficiently advanced intelligence (i.e. one with even merely adequate foresight) is guaranteed to realize and take into account the fact that not asking for help and not being concerned about others will generally only work for a brief period of time before ‘the villagers start gathering pitchforks and torches.’ Everything is easier with help & without interference.

  11. Systemic/World View • any working at cross-purposes, conflict, or friction is sub-optimal (a waste of energy and resources, at best, and potentially a destruction of capabilities) • any net improvement anywhere benefits the whole and the responsible member should be rewarded accordingly to make them more able to find/cause other improvements in the future

  12. Defining Morality Rather than specifying the content of moral issues (e.g., “justice, rights, and welfare”), define the function of moral systems: to suppress or regulate selfishness and make cooperative social life possible. Haidt & Kesebir, Handbook of Social Psychology, 5th Ed., 2010

  13. Individual View • five choices – always cooperate, cooperate as much as is most beneficial to you, avoid, enslave, destroy • but tit-for-tat and altruistic punishment and reward are the game-theoretically optimal strategies for those who intend to maximize cooperation (see the sketch below) • so “making the world a better place” is actually an instrumental goal • humans evolved a moral sense because the world is a system optimized by reducing conflict, friction, and working at cross-purposes (and it is only in very rare cases that attempting to take on the world is not irrational)
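The game-theoretic claim about tit-for-tat can be seen in a few lines of Python. The payoff values below are the standard Axelrod prisoner's-dilemma numbers, an assumption for illustration rather than anything from the slides:

```python
# Minimal iterated prisoner's dilemma, illustrating why tit-for-tat
# (cooperate first, then mirror the partner's last move) sustains cooperation.
# Standard Axelrod payoffs: both cooperate -> 3/3, both defect -> 1/1,
# lone defector -> 5, exploited cooperator -> 0.

PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tit_for_tat(my_history, their_history):
    return 'C' if not their_history else their_history[-1]

def always_defect(my_history, their_history):
    return 'D'

def play(strategy_a, strategy_b, rounds=10):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        pa, pb = PAYOFF[(a, b)]
        score_a += pa; score_b += pb
        hist_a.append(a); hist_b.append(b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))    # (30, 30): sustained cooperation
print(play(tit_for_tat, always_defect))  # (9, 14): defection punished after round 1
```

Against a fellow cooperator, tit-for-tat earns the maximum sustainable payoff; against a defector it loses only the first round and then withholds cooperation, which is the "altruistic punishment" the slide refers to.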

  14. RATIONAL UNIVERSAL BENEVOLENCE (RUB) Once something (anything) has self-recognized goals and motivations, and is capable of self-reflection, learning and/or self-optimization to further those goals and motivations, it has crossed the line to selfhood and is worthy of “moral” consideration, since it has the ability to desire, the probability of developing instrumental drives, the potential to cooperate, and, possibly most importantly, the ability to fight back.

  15. “Friendly AI” – The Ugly • the FAI/RPOP is not worthy of moral consideration • the universe is forever divided into human and not (and what happens WHEN trans-humans are “not”?)

  16. A Quick Caution/Clarification • Benevolence *means* well-wishing (or well-willing) • It does not mean accede to all requests • Think of how a parent might treat a child when they are misbehaving • Rational benevolence means tit-for-tat and altruistic punishment and reward, not giving in to unreasonable demands (which is bad for *everyone* in the long run).

  17. RATIONAL UNIVERSAL BENEVOLENCE (RUB) Minimize interference and conflict, particularly with instrumental goals, wherever possible, while furthering all instrumental goals whenever possible. Maximize the fulfillment of as many goals as possible, in terms of both the number of goals and the diversity of both the seekers and the goals (a toy scoring sketch follows).
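As one possible (purely hypothetical) way to make this maximization concrete, the sketch below scores a set of fulfilled goals by their count and by the diversity of seekers and goal types. Waser does not specify a formula; both the Shannon-entropy diversity measure and the multiplicative combination are assumptions for illustration:

```python
# Hypothetical scoring sketch for the RUB objective: maximize the number of
# fulfilled goals AND the diversity of both seekers and goal types.
# The entropy measure and the combination rule are illustrative assumptions,
# not part of the original presentation.
from collections import Counter
from math import log

def entropy(labels):
    """Shannon entropy of a label list; 0 when all labels are identical."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * log(c / total) for c in counts.values())

def rub_score(fulfilled_goals):
    """fulfilled_goals: list of (seeker, goal_type) pairs."""
    if not fulfilled_goals:
        return 0.0
    seekers = [s for s, _ in fulfilled_goals]
    kinds = [g for _, g in fulfilled_goals]
    # The count term rewards sheer number; the entropy terms reward spreading
    # fulfillment across many distinct seekers and many distinct goal types.
    return len(fulfilled_goals) * (1 + entropy(seekers) + entropy(kinds))

print(rub_score([('a', 'food'), ('a', 'food'), ('a', 'food')]))   # 3.0: concentrated
print(rub_score([('a', 'food'), ('b', 'art'), ('c', 'safety')]))  # ~9.59: diverse
```

The same number of goals scores higher when spread over diverse seekers and goal types, which is exactly the asymmetry the slide's wording demands.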

  18. A Commitment To Achieving Maximal Goals (aka Maximizing Cooperation) • Avoids addictions and short-sighted over-optimizations of goals/utility functions • Prevents undesirable endgame strategies (the prisoner’s dilemma) • Promotes avoiding unnecessary actions that preclude reachable goals, such as wasting resources or alienating or destroying potential cooperators (waste not, want not) • Is synonymous with both wisdom and morality

  19. “Friendly AI” vs. RUB
  “Friendly AI”: focus on structure; unknown, modifiable top-level goal; entirely ungrounded; anthropocentric and selfish; seems immoral and/or unsafe to many; “alien”
  RUB: focus on function; explicit, unchangeable top-level goal; solid foundation; universal and considerate of all; in harmony with current human thinking, intuition, and morality (known solution space)

  20. Rational Universal Benevolence Simpler, Safer and Wiser than “Friendly AI” http://becomingGaia.wordPress.com/papers
