Lecture 12: Agent Societies
Two perspectives • Coordination in MAS by norms and social laws (textbook, Ch. 9.6.4) • Using MAS for simulation of social phenomena
Coordination in MAS by norms and social laws • Conventions, norms: • an established, expected pattern of behavior • e.g. joining a queue at its end, using a certain language to communicate with others • Social laws: • norms associated with an authority that enforces them and punishes violations • e.g. laws protecting private property • Both provide template behaviors, which save reasoning power, and behavioral constraints, which ensure that agents have a chance to achieve their goals
Conceptualizations of norms within game theory • Norms as solutions to problems of co-ordination • Norms as solutions of conflict of utility • Norms as solutions to problems of inequality
Two origins of norms and social laws in MAS • Embedded in the design of each agent. Social laws and norms are designed offline and hardwired into each agent. Examples: Tennenholtz, Conte & Castelfranchi • Emerging from within the interactions of self-interested but cooperative agents. Examples: Shoham & Tennenholtz
Embedded norms and laws • Engineering multi-agent organizations • Usually hierarchical design: • who reports to whom; predefined authority structures • Modeled after human organizations • Defines the flow of information / control and the agent interactions • Much like Software Engineering or like Protocol design for negotiations
Design of a social law: example • Only one robot can occupy a grid-point at any time • Robots collect and transport items from one grid-point to another • Design a social law that prevents collisions
One possible social law • Each robot has to move constantly • Direction of motion is fixed on alternating rows • Move up when at the right-most column • Move down when either: • on the left-most column of even rows • on the 2nd right-most column of odd rows • [Figure: grid with axis labels 0 to 5] • Next move of each robot is uniquely defined • A robot can always get to the desired location • in at most O(n²) moves • But not efficient!
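The law above can be read as a rule that maps each grid-point to exactly one next grid-point. A minimal sketch in Python follows. The slide leaves several details open, so these are assumptions: the grid is n × n with n even, row 0 is the top row (so "up" decreases the row index), even rows move left and odd rows move right, and the two corner cases at the top-right and bottom-right grid-points are resolved as commented. The names `next_cell` and `tour` are hypothetical.

```python
def next_cell(row, col, n):
    """Return the grid-point a robot at (row, col) must move to next.

    Assumptions (not stated on the slide): n is even, row 0 is the top row,
    "up" means row - 1, even rows move left and odd rows move right.
    """
    if col == n - 1:                      # right-most column: move up
        if row > 0:
            return (row - 1, col)
        return (row, col - 1)             # top-right corner: join row 0 (assumed)
    if row % 2 == 0:                      # even row: move left
        if col == 0:
            return (row + 1, col)         # left-most column of an even row: move down
        return (row, col - 1)
    if col == n - 2 and row < n - 1:      # 2nd right-most column of an odd row: move down
        return (row + 1, col)
    return (row, col + 1)                 # odd row: move right (bottom row runs to the corner)


def tour(n, start=(0, 0)):
    """Follow the law for n * n moves; return the visited grid-points and the end point."""
    cells, cell = [], start
    for _ in range(n * n):
        cells.append(cell)
        cell = next_cell(*cell, n)
    return cells, cell
```

Under these assumptions `tour(6)` visits all 36 grid-points once and ends where it started, so robots that start on distinct grid-points and move in lockstep never collide, and any destination is reached within n² moves. That is also why the law is inefficient: a robot may have to travel almost the whole loop to reach a nearby point.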
Emergent norms • How can a norm or social law emerge in a society of agents? • How can agents reach a global agreement using only local information, when each agent decides which convention to follow based only on its own experience? • The T-shirt game (Shoham & Tennenholtz, 1992): can all agents agree on the same colour?
Emergent norms (2) • Strategy update functions for the T-shirt game: • map the history of observed colours to a choice of colour • Simple majority • Simple majority with agent types • agents of the same type can communicate and exchange histories • Simple majority with communication on success (after an agent has reached a success threshold) • Highest cumulative reward • Efficiency of convergence is measured by how many rounds it takes for all agents to converge to a particular strategy • Results: all strategies lead to the emergence of conventions • The strongest results concern the highest cumulative reward update function: it can be shown that agreement is reached with a certain probability in a finite number of rounds.
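The simple-majority update can be sketched as follows. This is a hedged illustration, not the experimental setup of Shoham & Tennenholtz: the two colours, the random pairwise meetings, the unbounded history, the tie rule, and the names `simple_majority` and `play` are all assumptions made for the example.

```python
import random
from collections import Counter


def simple_majority(history, current):
    """Pick the colour observed most often; keep `current` on a tie or an empty history."""
    ranked = Counter(history).most_common()
    if not ranked:
        return current
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return current                    # tie between the top colours (assumed rule)
    return ranked[0][0]


def play(n_agents=20, rounds=200, colours=("red", "blue"), seed=0):
    """Run random pairwise meetings; return (final colours, round of agreement or None)."""
    rng = random.Random(seed)
    colour = [rng.choice(colours) for _ in range(n_agents)]
    history = [[] for _ in range(n_agents)]
    for t in range(1, rounds + 1):
        order = list(range(n_agents))
        rng.shuffle(order)
        # Pair agents off; each one observes only its partner's colour (local information).
        for a, b in zip(order[::2], order[1::2]):
            history[a].append(colour[b])
            history[b].append(colour[a])
        colour = [simple_majority(history[i], colour[i]) for i in range(n_agents)]
        if len(set(colour)) == 1:
            return colour, t              # a convention has emerged
    return colour, None                   # no agreement within the round limit
```

The second return value of `play` corresponds to the efficiency measure on the slide: the number of rounds until every agent wears the same colour.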
Using MAS for simulation of social phenomena • Why adopt computer-based social simulation? • Social phenomena are not directly accessible or are difficult to observe. • Some social studies are time-consuming; a policy study spans a considerable time. • Some social studies cannot be replicated in labs and data are difficult to collect, e.g. crime studies on bribery and deterrence
Computational vs Sociological Study of Norms • Computational study of norms is a formal approach to theory building in the field of norms • Sociological study of norms aims at: • appropriate definition of norms • explanation of how norms affect social behaviour • explanation of how norms emerge
Sociological Study of Norms • Four different conceptualizations of norms: • The statistical conceptualization of norms originates in behaviourism. A behavioural pattern becomes a norm if the majority of actors behave according to this pattern. • According to Durkheim, norms are social facts which can be identified through the mere existence of certain sanctions. • According to ethnomethodologists: • there are several basic rules which have a pseudo-normative character • they are seen as obliging and deviations are sanctioned • deviations from these basic rules are judged in clinical, not moral, categories • Developmental psychologists have put forward an ethical conceptualization of norms
Sociological Study of Norms (cont’d) • Functions of Norms: • Norms have been analyzed as solutions to social problems • According to Marx, norms are dependent on the economic foundations of society. • Computational study of norms develops into sociological • Computational development of concepts advances from the statistical to the ‘sociological’ conceptualization of norms.
The Conte and Castelfranchi Model Background: • 50 agents are placed randomly into a 2-dimensional world that consists of a 10*10 grid with connected edges (a torus). • The initial strength of each agent is 40. • 25 food items of nutritional value 20 are distributed randomly on the grid. Each food item is replenished at a randomly selected location on the grid after it has been consumed. • At the beginning of a match, agents are randomly allocated to locations and are assigned those food items which happen to fall into their own territories (their von Neumann-neighborhood). • Food possessed is flagged and each agent knows to whom it belongs.
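The world described above can be set up in a few lines. This is a sketch under stated assumptions rather than the authors' implementation: agents and food items each occupy distinct cells, the territory is taken to be the four orthogonal neighbours (whether the agent's own cell belongs to it is not stated on the slide), and the names `von_neumann_neighbourhood` and `initial_world` are hypothetical.

```python
import random


def von_neumann_neighbourhood(x, y, size=10):
    """Four orthogonal neighbours of (x, y) on a size x size grid with connected edges."""
    return [
        ((x - 1) % size, y),
        ((x + 1) % size, y),
        (x, (y - 1) % size),
        (x, (y + 1) % size),
    ]


def initial_world(n_agents=50, n_food=25, size=10, seed=0):
    """Place agents (strength 40) and food items (value 20) at random cells."""
    rng = random.Random(seed)
    cells = [(x, y) for x in range(size) for y in range(size)]
    agents = {cell: 40 for cell in rng.sample(cells, n_agents)}  # one agent per cell (assumed)
    food = {cell: 20 for cell in rng.sample(cells, n_food)}      # one item per cell (assumed)
    return agents, food


def owned_food(agent_cell, food, size=10):
    """Food items that fall into an agent's territory at the start of a match."""
    return [c for c in von_neumann_neighbourhood(*agent_cell, size) if c in food]
```

The modulo arithmetic is what makes the grid a torus: the neighbours of (0, 0) include (9, 0) and (0, 9).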
Conte and Castelfranchi’s model • Actions are simultaneous. Depending on built-in strategies and knowledge, agents may decide to attack one another. • Three strategies: • blind aggression (‘attack an eater to get its food, unless free food is available at a lower cost’) • strategic aggression (‘attack an eater whenever you perceive it as no stronger than you, unless free food is available at a lower cost’) • normative aggression (‘attack an eater unless the food item being eaten is marked as “owned” by that agent’, i.e. the finder-keeper norm)
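The three strategies differ only in the condition under which an agent attacks an eater, which can be written as a single predicate. The sketch below is a literal reading of the slide: the function name `attacks` and its boolean inputs are hypothetical, and the normative rule is given no free-food exception because the slide does not state one.

```python
def attacks(strategy, own_strength, eater_strength, eater_owns_food, cheaper_free_food):
    """Decide whether an agent attacks an eater under one of the three strategies.

    cheaper_free_food: True if free food is available at a lower cost than attacking.
    eater_owns_food:   True if the food being eaten is flagged as owned by the eater.
    """
    if strategy == "blind":
        return not cheaper_free_food
    if strategy == "strategic":
        # "No stronger than you" is read as eater_strength <= own_strength.
        return eater_strength <= own_strength and not cheaper_free_food
    if strategy == "normative":
        # Finder-keeper norm: never attack an eater that owns its food.
        return not eater_owns_food
    raise ValueError(f"unknown strategy: {strategy!r}")
```

For example, `attacks("strategic", 10, 50, False, False)` is `False` because the eater is stronger, while a blind aggressor in the same situation attacks.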
Conte and Castelfranchi’s experiment • Experiment consists of 100 matches, each of which includes 2000 games. During each game each agent performs one action. • For each experiment, the number of attacks, the average strength, and the standard deviation of individual strength is recorded. • Results: • In homogeneous population the agents using the normative strategy do best at controlling aggression and keeping inequality low • In mixed population the normative strategy becomes the worst
Resimulating norms by Saam & Harrer • Difference between the original model and the re-implemented model: actions are not executed simultaneously but in sequence. This decreases the number of conflicts, so results depend less on the random resolution of conflicting actions. • The replication results are the same as the original results.
Saam and Harrer (Experiment 1) • First Experiment: Private Property and Heritage • Extends the Conte and Castelfranchi model: agents may reproduce, and the offspring inherit the sum of their parents’ strength. • An agent chooses another agent within its von Neumann neighbourhood; the two unite their strength, produce two children, and divide the total strength between the children. • The parents die immediately and the children take their places in the grid.
Saam & Harrer (1): results • Results: • For equal heritage, the normative strategy is found to perform best at increasing the average strength of the agents, reducing inequality among them and reducing aggression. • For unequal heritage, the normative strategy produces the worst inequality.
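The inheritance step of Experiment 1 can be sketched as a single function. The slides do not say how the total is split under unequal heritage, so the `share` parameter below is a hypothetical stand-in: 0.5 gives equal heritage, any other value an unequal split.

```python
def reproduce(strength_a, strength_b, share=0.5):
    """Unite two parents' strength and divide it between two children.

    share is the fraction of the total given to the first child
    (an assumption; the slides do not specify the unequal split).
    """
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must lie in [0, 1]")
    total = strength_a + strength_b
    first = total * share
    return first, total - first
```

In either case the children together hold exactly the parents' combined strength; only its distribution changes, which is the variable the equal and unequal heritage runs compare.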
Saam & Harrer: experiment 2 • Second Experiment: Unequal Renewal of Resources • In this model the nutritional value of food is no longer constant. When a food item is replenished and happens to fall at the same location as an agent, its nutritional value depends on the strength (s) of the agent. • The higher the strength of the agent during the previous time step, the larger the nutritional value of the replenished food item. [Formula for the nutritional value as a function of s: not recoverable from the slide]
Saam & Harrer (2): results • The normative MAS has the highest average strength and the lowest degree of aggression. • Compared to the original model, inequality is much more pronounced. • In sum, in homogeneous societies, the finder-keeper norm: • minimizes aggression in all experiments • maximizes the average strength of the agents in all experiments • However its function with respect to equality depends very much on the initial conditions and the redistribution of strength.
Resimulating Norms from Sociological Theory by Saam and Harrer • Haferkamp’s theory of action approach to deviant behaviour • Haferkamp combines the theory of action and system theory and integrates power and rule as conflict theoretical elements • He defines norms as conceptions which are internalized by the majority of members of a social situation. • The conception implies the correct (re)actions to defined situations and the certitude that deviance will be sanctioned.
Sociological Model • The agents live in a two-group society, in-group g1 and out-group g2. Members of the in-group are more resourceful and powerful than members of the out-group. • Those individuals who are most powerful rule and therefore decide about the institutionalization of norms in society as a whole. Members of the in-group transfer some of the resources to the redistribution agent in exchange for the institutionalization of norm n1 in situation s1. • The redistribution agent redistributes the resources uniformly to all agents. • The agents are able to identify and define social situations. • Deviant behaviour is sanctioned by the members of the in-group. Agent a1 will sanction agent a2 if it observes it reacting by behaviour b2 (instead of behaviour b1) to situation s1. • Whenever an agent of the in-group sanctions an agent of the out-group this will increase its power and decrease its resources.
Sociological Simulation Results • In the mixed population case, the normative strategy (finder-keeper) no longer generates the worst inequality (as in the Castelfranchi, Conte and Paolucci experiments). • The finder-keeper norm no longer controls aggression the most effectively. • Problem with the model: it is incomplete with respect to the power variable. The power of the in-group agents increases continuously, whereas the power of the out-group agents remains constant. Power is not consumed.
A Simulation of the Market for Offenses in Multiagent Systems: Is Zero Crime Rates Attainable? Pinata Winoto Presented at MABS’2002 Workshop with AAMAS’02, Bologna
Motivations • Many theories have been developed to understand criminal behavior at the micro level and the crime market at the macro level • In an open multiagent system, an optimal policy against malevolent actions is needed • It is hard to test a theory, especially at the macro level
Are Existing Theories Suitable for MAS? • Micro-level: • Agents are less complicated, more consistent, and more homogeneous than people (no drugs, alcohol, social dilemmas, etc.) • Most existing theories are developed under assumptions which fit MAS • Macro-level: • A MAS is a discrete world, can be initialized, and consists of a small number of agents • Existing theories assume continuous domains with a large number of agents existing for a long time
Objective • Fender’s equilibrium theory (Journal of Economic Behavior and Organization, 1999) • Multiple equilibria of crime rate exist • One of the stable equilibria is low crime rate (zero crime) • Specific conditions to attain zero crime rate equilibrium are unknown • Objective: Exploring zero crime rate equilibrium
Fender’s Equilibrium Theory • Potential offenders follow rational choice theory (von Neumann-Morgenstern Expected Utility): • commit crime if their expected return from crime is greater than that from legitimate work • Some agents will not commit crime under any circumstance (Honest Agents) • Government’s expenditure in controlling crime is financed by the tax collected from the workers
Fender’s Equilibrium Theory (cont’) • Agents’ expected return from crime: Uc = p·u2 + (1 − p)·u1 • p: probability of conviction (being punished) • u2: return from crime if convicted (fixed) • u1: return from crime if not convicted (fixed) • Agents’ return from legitimate work: Uw = w − L − T • w: return from work (uniform distribution) • L: expected loss from being victimized • T: average tax paid to government
Fender’s Equilibrium Theory (cont’) • Agents’ expected loss from being victimized: L = l·C / (n − C) • l: average loss from crime (fixed) • C: number of criminals • n: number of agents (fixed) • n − C: number of legitimate workers • Average tax to be paid by agents: T = E / (n − C) • E: total government expenditure (fixed)
Fender’s Equilibrium Theory (cont’) • The effectiveness of law enforcement in controlling crime depends on the punishment and the probability of conviction: p = G(E) / C • G(E): number of criminals convicted (productivity of government spending on law enforcement) • Multiple equilibria exist, two of which are stable: the high-crime-rate equilibrium and the low-crime-rate equilibrium
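The formulas on the preceding slides combine into one decision rule for a potential offender. The sketch below transcribes them directly. The function names are hypothetical, `convicted` stands in for G(E) because the slides do not give its functional form, and the handling of C = 0 and the cap of p at 1 are assumptions.

```python
def crime_return(p, u1, u2):
    """Expected return from crime: Uc = p*u2 + (1 - p)*u1."""
    return p * u2 + (1 - p) * u1


def work_return(w, l, E, C, n):
    """Return from legitimate work: Uw = w - L - T, with L = l*C/(n - C), T = E/(n - C)."""
    workers = n - C
    if workers <= 0:
        raise ValueError("need at least one legitimate worker")
    return w - l * C / workers - E / workers


def conviction_probability(convicted, C):
    """p = G(E)/C, where `convicted` stands in for G(E)."""
    if C == 0:
        return 1.0                        # assumption: with no criminals, any offender is caught
    return min(1.0, convicted / C)        # assumption: cap at 1 when G(E) exceeds C


def commits_crime(w, p, u1, u2, l, E, C, n):
    """Rational choice: offend only if crime pays more than legitimate work."""
    return crime_return(p, u1, u2) > work_return(w, l, E, C, n)
```

Because L, T and p all depend on C, each agent's choice feeds back into everyone else's payoffs: more criminals means higher losses and taxes per worker and a lower conviction probability, which is the mechanism behind the multiple equilibria.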
An Example of Fender’s Conjecture • A/D are stable low/high crime rate equilibria • B is an unstable equilibrium: higher/lower probability of conviction or lower/higher number of criminals will drag the system to A/D
Simulation • 10-generation overlapping model (every agent lives for 10 periods) • to maintain the diversity of agents • to simulate entry and exit • 50% of agents are honest • dichotomous property from Fender’s framework • Various initial probabilities of conviction and initial crime rates • Various sizes of society (number of agents)
Agents’ Interactions • Potential offenders: • generate new potential offenders with income {1000, …, 3000} • retrieve info: last punishment rate and number of criminals • evaluate the gain from crime and from work, then make decisions • either commit crime (succeed or fail) or work, get paid, pay tax • Honest agents: • generate new honest agents with income = $2000 • work, get paid, pay tax • Government: • record the number of criminals and arrest them; collect tax from each worker • [Flowchart labels whose attachment is not recoverable: $2000, $500]
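The entry and exit part of the 10-generation overlapping model can be sketched as below. It covers only ageing and replacement, not the crime decision. The incomes follow the slides; the uniform integer draw, the evenly spread starting ages, the rule that an exiting agent is replaced by one of the same type (which keeps the honest share at 50%), and all names are assumptions.

```python
import random

LIFESPAN = 10  # every agent lives for 10 periods


def new_agent(honest, rng):
    """Create an entrant: honest agents earn 2000, potential offenders draw 1000..3000."""
    income = 2000 if honest else rng.randint(1000, 3000)  # uniform draw is an assumption
    return {"age": 0, "honest": honest, "income": income}


def initial_population(n, rng):
    """Half honest, half potential offenders, with ages spread evenly (assumed)."""
    agents = [new_agent(i % 2 == 0, rng) for i in range(n)]
    for i, agent in enumerate(agents):
        agent["age"] = (i // 2) % LIFESPAN
    return agents


def step(agents, rng):
    """Advance one period: everyone ages, and agents reaching LIFESPAN are replaced."""
    nxt = []
    for agent in agents:
        if agent["age"] + 1 >= LIFESPAN:
            nxt.append(new_agent(agent["honest"], rng))   # exit and entry
        else:
            nxt.append({**agent, "age": agent["age"] + 1})
    return nxt
```

With this scheme the society keeps a constant size while a fresh cohort of potential offenders with newly drawn incomes enters every period, which is how the model maintains diversity.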
Preliminary Results • Initial crime rate = 0% • Various sizes of society (number of agents) • The smaller the society, the higher the chance of reaching the zero-crime-rate equilibrium • The larger the society, the better the theoretical analysis predicts the equilibrium outcome
Preliminary Results • Various initial crime rates • Fixed size of society (number of agents) • The higher the initial crime rate, the lower the chance of reaching the zero-crime-rate equilibrium • Once the initial crime rate is high enough, the chance of reaching the zero-crime equilibrium drops to zero!
Conclusions • Compared to a large society: • A small society has a greater advantage in reducing crime to the zero-crime-rate equilibrium • A small society benefits less from theoretical analysis • A low initial crime rate is essential for reaching the zero-crime-rate equilibrium • If a social disturbance causes the crime rate to increase to a high level, the zero-crime-rate equilibrium will never be reached.
Free Market Control for a Multi-Agent Based Peer Help Environment Kevin Kostiuk and Julita Vassileva
I-Help can be viewed as a special case of an electronic market • There are users who possess some goods or resources (knowledge in this case) and users who need these goods / resources (asking for help or advice); • A person interested in buying a good must find a seller who offers the good with acceptable quality and at an acceptable price. In I-Help a user with a specific help request needs to find a competent helper; • The buyer is willing to pay some amount of money in order to achieve the goal of gaining some knowledge, while the seller is willing to give away some knowledge in exchange for money. The goal of accumulating some resource, like money, which can be exchanged for some goods (promotion or a salary increase in a workplace environment, marks in a university environment) creates a motivation for knowledgeable users / students to participate. • The price of a certain good depends on the supply of and the demand for this good on the market. People with exceptional and highly demanded knowledge / expertise can charge higher prices for their advice. • There is some cost associated with supplying the buyer with the good; helping costs the helper some time, which could be used for achieving some other goal.
Designing an economy • An exercise in designing emergent control • Free-market model chosen: • It seeks equilibrium quickly and provides access to control parameters. • In the absence of externalities, competitive equilibria correspond to efficient resource allocations. • Practice demonstrates that fiscal, monetary, and trade policy can guide macroeconomic behaviour, while taxation and redistribution can support individual agent welfare. • Free-market mechanisms are extensible because they provide a foundation for other economic models.
Specifics of the problem • The "wellness" of the economy is measured not by the money turnover but by the accumulated knowledge of the users. • In conventional markets prices emerge and develop historically, and electronic markets usually start with real-world prices. In our case, due to the unusual nature of the good "knowledge", there is no price history. • We can’t expect purely rational behavior from the sellers and buyers. • There are two levels of interaction in the environment: the "real world", where the players are users / students, and an "agent world", where the players are the personal agents.
Requirements Educational (Meta-level): • Involve all students. • Create general enthusiasm, but not divert from main goal (learning). • Allow unclassified resources into the economy (e.g. tutorials, FAQ, tutors etc.). • Require little maintenance. • Obtain measurable results.
Requirements (cont.) Desired Macroeconomic Behaviours: • Establish and maintain trade. • Achieve a well-behaved price level. • Realize net gains from trade (overall knowledge of students). • Distribute benefits fairly.
Requirements (cont.) Agent Welfare Constraints: • Encourage win-win transactions. • Not make anyone worse off. • The coordination mechanism must seek Pareto preferred states; • Collectivist perspectives prefer the maximization of average or minimum welfare; • However seeking collective welfare within a free-market structure threatens the participation of individual users, particularly the most productive and competent ones; only coercion ensures the participation of students made worse off by the system. • Avoid unreasonable wealth accumulation.
Individual Agent • Utility function: U = α·ΔM + (1 − α)·(C·ΔG) • ΔM: change in money • ΔG: change in knowledge • α: greed for money, 0 ≤ α ≤ 1 • C: conversion factor between units of G and M • Strategies: • ΔM ~ Teaching − Usage • ΔG ~ Teaching* + Usage*
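The utility function above (a greed weight applied to the change in money, and its complement applied to the change in knowledge converted into money units) translates directly into code. The function and parameter names are hypothetical, and the numbers in the usage note are illustrative only.

```python
def utility(alpha, delta_money, delta_knowledge, conversion):
    """Weighted sum of the change in money and the change in knowledge.

    alpha:      greed for money, between 0 and 1
    conversion: factor C converting knowledge units into money units
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * delta_money + (1 - alpha) * conversion * delta_knowledge
```

With `delta_money=10`, `delta_knowledge=4` and `conversion=5`, an agent with `alpha=1.0` values the outcome at 10 (money only), one with `alpha=0.0` at 20 (knowledge only), and one with `alpha=0.5` at 15.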