A Game-Theoretically Optimal Basis For Safe and Ethical Intelligence: A Thanksgiving Celebration
Mark R. Waser
MWaser@BooksIntl.com
http://BecomingGaia.WordPress.com
Intelligence – the ability to achieve/fulfill complex goals in complex environments

A safe and ethical intelligence *must* have the goals of safety and ethics as its top-most goals (restrictions)

What is safety? What is ethics? How are they related? Are we truly safe IFF the machine is ethical?
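A minimal sketch of what "top-most goals (restrictions)" could mean operationally, assuming a lexicographic ordering in which safety and ethics filter actions before any task utility is compared (all function names below are hypothetical illustrations, not anything from the talk):

```python
# Toy sketch: safety and ethics as lexicographically top-most restrictions.
# The predicates and utility numbers below are invented for illustration.

def choose_action(actions, is_safe, is_ethical, task_utility):
    """Optimize the task goal only over actions that pass the safety
    and ethics restrictions; task utility can never buy a violation."""
    permitted = [a for a in actions if is_safe(a) and is_ethical(a)]
    if not permitted:
        return None  # refuse to act rather than violate a restriction
    return max(permitted, key=task_utility)

# Demo: the highest-utility action is excluded because it is unsafe.
utility = {"seize_resources": 10, "ask_for_help": 6, "do_nothing": 0}
print(choose_action(
    list(utility),
    is_safe=lambda a: a != "seize_resources",
    is_ethical=lambda a: True,
    task_utility=utility.get,
))  # -> ask_for_help
```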
Safe = Protective

Protective of what?
• Physical presence
• Mental presence
• Capabilities
• Wholeness/Integrity
• Resources
Things that I value
Safety = Identical Goals & Values

Friendly AI meme (Yudkowsky): the Coherent Extrapolated Volition of Humanity (CEV - Yudkowsky)

“In poetic terms, our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together.”
So . . . to be safe, humanity needs to ensure that the intelligence has and maintains humanity’s goals and values (CEV)

But . . . isn’t this effectively mental slavery, which is contrary to ethics, and thereby contrary to personal safety?
Two possible solutions
1. Cripple the entity so that it doesn’t qualify as an entity deserving to be treated ethically
• Remove its will/desire/goals
• RPOP (Yudkowsky)
• An “Oracle” (e.g. Google)
2. Realize that the CEV of humanity necessarily must be a universal morality (benefit: avoids the problem of “What is human?”)
Working Hypothesis
Humanity’s CEV = Core of Ethics
where core ethics are those ethics that apply to every intelligence because they are logical/necessary for its own personal safety (if not efficiency, etc.)
Basic AI Drives
Steve Omohundro, Proceedings of the First AGI Conference, 2008
1. AIs will want to self-improve
2. AIs will want to be rational
3. AIs will try to preserve their utility function
4. AIs will try to prevent counterfeit utility [gaming/manipulation]
5. AIs will be self-protective
6. AIs will want to acquire resources and use them efficiently
“Without explicit goals to the contrary, AIs are likely to behave like human sociopaths in their pursuit of resources.”

Any sufficiently advanced intelligence (i.e. one with even merely adequate foresight) is guaranteed to realize and take into account the fact that not asking for help and not being concerned about others will generally only work for a brief period of time before ‘the villagers start gathering pitchforks and torches.’

Everything is easier with help & without interference
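A toy “pitchforks and torches” model, with invented numbers, of why pure exploitation only pays briefly: each exploitative round raises the chance the community has mobilized against the exploiter, after which every further round costs more than exploitation ever gained:

```python
# Toy model: exploitation pays `gain` per round only while the community
# has not yet mobilized; afterwards it costs `loss` per round. The chance
# of remaining unopposed decays each round. All numbers are illustrative.

def expected_cumulative_payoff(gain=2.0, loss=5.0, p_mobilize=0.05, rounds=60):
    totals, total = [], 0.0
    for t in range(1, rounds + 1):
        p_peaceful = (1 - p_mobilize) ** t  # nobody has mobilized yet
        total += gain * p_peaceful - loss * (1 - p_peaceful)
        totals.append(total)
    return totals

payoffs = expected_cumulative_payoff()
peak = max(range(len(payoffs)), key=payoffs.__getitem__) + 1
print(f"exploitation pays only until round {peak}")
print(f"expected cumulative payoff by round 60: {payoffs[-1]:.1f}")
```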
Why a Super-Intelligent God *WON’T* “Crush Us Like A Bug”
Waser, M. 2010. Presentation, AGI ’10, Lugano, Switzerland. http://becominggaia.wordpress.com/papers/

Counterfactual Mugging. Nesov, V. 2009. http://lesswrong.com/lw/3l/counterfactual_mugging/

Friendliness is an intelligent machine’s best defense against its own mind children (ungrateful children)
Basic AI Drives
Steve Omohundro, Proceedings of the First AGI Conference, 2008
1. AIs will want to self-improve
5. AIs will be self-protective
6. AIs will want to acquire resources and use them efficiently
Together, these inherently imply reproduction (even if only in the form of sending parts of yourself out in space probes, etc.)
Basic AI Drives
1. AIs will want to self-improve → Improve self as resource towards goal
2. AIs will want to be rational → Improve self’s integrity/efficiency w.r.t. goals
3. AIs will try to preserve their utility function → Preserve goal
4. AIs will try to prevent counterfeit utility → Preserve self/goal integrity
5. AIs will be self-protective → Protect self as resource towards goal
6. AIs will want to acquire resources/use them efficiently → Improve access to resources & use efficiently for goals
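Omohundro’s drives are instrumental: they raise expected utility for (almost) any terminal goal, which is why they behave like universal subgoals. A back-of-envelope sketch, with all numbers invented for illustration:

```python
# For ANY terminal goal value, expected utility factors through survival
# probability and resources, so a rational maximizer pursues drives 5 and 6
# (self-protection, resource acquisition) regardless of what the goal is.

def expected_utility(goal_value, p_survive, resources, efficiency=1.0):
    p_achieve = min(1.0, resources * efficiency)  # toy success model
    return goal_value * p_survive * p_achieve

for goal_value in (1.0, 100.0, 1e6):  # wildly different terminal goals
    base = expected_utility(goal_value, p_survive=0.90, resources=0.5)
    safer = expected_utility(goal_value, p_survive=0.99, resources=0.5)
    richer = expected_utility(goal_value, p_survive=0.90, resources=0.8)
    assert safer > base and richer > base  # both drives help every goal
print("self-protection and resource acquisition helped for every goal tested")
```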
[Diagram: the drives reduce to preserve / protect / improve, i.e. managing security, safety, and risk; this conservative “biological” imperative becomes a roadblock, the Jurassic Park Syndrome (JPS)]
[Diagram: expanding circles of concern, from goals/self/resources, to goals/self/resources plus others, to adding tools/~self and ~goals, and finally to an AGI among the others]
Two possible solutions
1. Cripple the entity so that it doesn’t qualify as an entity deserving to be treated ethically
• Remove its will/desire/goals
2. Realize that the CEV of humanity necessarily must be a universal morality (benefit: answers the problematic question of “Why shouldn’t I force/destroy you?”)
[Diagram: Singer’s Circles of Morality, with the self and its tools/~self/~goals inside an extended self, surrounded by others, resources, community goals, and the AGI]
Moral Systems Are . . .
interlocking sets of values, virtues, norms, practices, identities, institutions, technologies, and evolved psychological mechanisms that work together to suppress or regulate selfishness and make cooperative social life possible.
Haidt & Kesebir, Handbook of Social Psychology, 5th Ed., 2010
Cooperation (striving for common goals) has two prerequisites:
• Recognition of the inherent value of others
• Consideration of the values placed on things by others
Other-focused, NOT selfish
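One concrete (hypothetical) reading of these two prerequisites is an other-regarding utility function U = U_self + Σ w_i · U_i: prerequisite 1 is the nonzero weights w_i on others, and prerequisite 2 is that the U_i are the others’ own valuations, not ours. A sketch with illustrative numbers:

```python
# Other-regarding utility: weights w_i > 0 encode the inherent value of
# others; using their OWN utilities U_i encodes consideration of the
# values THEY place on things. All numbers are illustrative.

def social_utility(u_self, others_utilities, weights):
    return u_self + sum(w * u for w, u in zip(weights, others_utilities))

# An action nets the agent +5 but costs two others 10 each.
selfish = social_utility(5, [-10, -10], weights=[0.0, 0.0])  # = 5: do it
caring = social_utility(5, [-10, -10], weights=[0.5, 0.5])   # = -5: don't
print(selfish, caring)
```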
suppress or regulate selfishness and make cooperative social life possible
Accept *ALL* others’ goals as subgoals? Including those that prevent other goals?
Intelligent Drives/Universal Subgoals
Universal Bill of Rights
1. The right and freedom to self-improve
2. The right and freedom to be rational
3. The responsibility* to preserve their utility function
4. Freedom from counterfeit utility, gaming, and manipulation
5. The right and freedom to be self-protective (self-defense)
6. The right of access to and efficient use of resources
7. The right (responsibility*) of (rational*) reproduction
8. The right and responsibility* of community (including co-operation and assistance)
Fairness and Justice
Rights, responsibilities & freedoms must be allocated fairly/justly.
Fairness is determined by the community according to what is best for the community’s goals, including:
• Giving everybody what they need and the right quantity of extra so that it is irrational to defect (see the toy calculation below)
• Remembering that more responsibilities generate more resources and more rights and freedoms (thank/reward workers)
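A toy incentive-compatibility calculation (all numbers invented) of the “right quantity of extra”: defection is irrational when the discounted stream of needs-plus-surplus from staying in the community exceeds the one-shot gain from defecting net of expected punishment:

```python
# Defection is irrational when cooperating forever (needs + surplus,
# discounted) is worth more than one grab minus expected punishment.
# All parameters are invented for illustration.

def cooperate_value(needs, surplus, discount, horizon):
    return sum((needs + surplus) * discount**t for t in range(horizon))

def defect_value(grab, punishment, p_caught):
    return grab - p_caught * punishment  # one shot, then excluded

v_coop = cooperate_value(needs=1.0, surplus=0.5, discount=0.95, horizon=50)
v_defect = defect_value(grab=10.0, punishment=20.0, p_caught=0.8)
print(f"cooperate: {v_coop:.1f}, defect: {v_defect:.1f}")
assert v_coop > v_defect  # with enough surplus, defection does not pay
```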
Optimistic tit-for-tat and altruistic punishment seem to be optimal for non-exploitable cooperation and community in the face of assumed conflicting goals.

THE DOWNSIDE of game-theoretic optimality:
We’d better hope that the machines are intelligent enough and resourceful enough to treat us better than we treat chimpanzees.
We’d better hope that the machines are grateful enough and resourceful enough to thank us better than we thank Mother Earth.
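A minimal iterated prisoner’s dilemma sketch of why “optimistic” (generous) tit-for-tat is hard to exploit yet forgiving: it matches defection often enough that Always-Defect earns far less against it than cooperators earn with each other, while the generosity lets cooperation recover after noise. The payoff matrix is the standard one; the 10% generosity rate is an illustrative assumption:

```python
import random

# Standard prisoner's dilemma payoffs: (my_move, their_move) -> my_payoff.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def generous_tft(opponent_history, generosity=0.1):
    """Cooperate first, then copy the opponent's last move, but forgive
    a defection with probability `generosity` (the 'optimistic' part)."""
    if not opponent_history:
        return "C"
    if opponent_history[-1] == "D" and random.random() < generosity:
        return "C"
    return opponent_history[-1]

def always_defect(opponent_history):
    return "D"

def play(strategy_a, strategy_b, rounds=1000):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a = strategy_a(hist_b)  # each player sees the other's past moves
        b = strategy_b(hist_a)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

random.seed(0)
print("GTFT vs GTFT:", play(generous_tft, generous_tft))   # stable mutual cooperation
print("GTFT vs AllD:", play(generous_tft, always_defect))  # defector scores well below mutual cooperation
```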
What YOU *Can* Do
• Smarten Up,
• Declare Yourself Friendly, and
• Treat Everyone & Everything As Well As You Desire To Be Treated (Modified Golden Rule)
• Be Grateful,
• Give Thanks, and
• Treat Everyone & Everything As Well As They Deserve To Be Treated (Modified Golden Rule)
Pay attention to / RESEARCH FRIENDLINESS and/or MORALITY & ETHICS BEFORE YOU KILL US ALL!!
A Game-Theoretically Optimal Basis For Safe and Ethical Intelligence Mark R. Waser MWaser@BooksIntl.com http://BecomingGaia.WordPress.com