Seminar on Coordinated Systems SYS 793, Fall 2004 Department of Systems and Information Engineering, University of Virginia
Motivation • We are increasingly reliant on large-scale, distributed engineering systems: the Internet, the national power grid, etc. • Coordination is often achieved in engineering systems through the specification of ad hoc protocols for relatively well-defined (constrained) interactions between distributed systems. • Existing protocols are not adequately tuned for new applications and/or unexpected situations. • There appears to be little in the way of underlying guiding principles and theory for designing and operating such systems. • Notions of game theory and decentralized control go only part way toward revealing the basic problems associated with distributed engineering systems.
SYS 793 • In this seminar: • we will explore the recent literature on the intersections between game theory, distributed planning in robotics and artificial intelligence, and distributed control. • Faculty and students are expected to present a paper (to be chosen from the list given below). • After each presentation we shall have a discussion session where the main contributions of each paper will be critically assessed.
Parameters • Time: • Wednesdays, 5-6:15 • Place: • Olsson 005 • Credit: • One hour, pass/fail • Registration: • Schedule number 95083 (SYS 793, Section 2) • Webpage: https://toolkit.itc.virginia.edu/cgi-local/tk/UVa_SEAS_2004_Fall_SYS793-2
Student Responsibilities • To earn a passing grade, each student will: • prepare a presentation on at least one approved paper • lead the subsequent discussion on salient features of the paper • provide slides for posting on the course website
Partial List of Approved Papers
R. Aumann. Subjectivity and correlation in randomized strategies. Journal of Mathematical Economics, 1:67–96, 1974.
H. Bui, S. Venkatesh, and D. Kieronska. A framework for coordination and learning among teams of agents. Lecture Notes in Computer Science, 1441:164–178, 1998.
J. Doran, S. Franklin, N. Jennings, and T. Norman. On cooperation in multi-agent systems. The Knowledge Engineering Review, 12(3):309–314, 1997.
G. Ellison. Learning, local interaction and coordination. Econometrica, 61:1047–1071, 1993.
D. Gauthier. Coordination. Dialogue, 14:195–221, 1975.
M. Gilbert. Rationality and salience. Philosophical Studies, 57:61–77, 1989.
V. Gervasi and G. Prencipe. Robotic cops: The intruder problem. In Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2003.
M. Kandori, G. Mailath, and R. Rob. Learning, mutation and long-run equilibria in games. Econometrica, 61:29–56, 1993.
G. Prencipe. Corda: Distributed coordination of a set of autonomous mobile robots. In Proceedings of the Fourth European Research Seminar on Advances in Distributed Systems, pages 185–190, 2001.
P. Vanderschraaf. Learning and Coordination: Inductive Deliberation, Equilibrium and Convention. Routledge, 2001.
X. Wang and T. Sandholm. Reinforcement learning to play an optimal Nash equilibrium in team Markov games. In Advances in Neural Information Processing Systems 15 (NIPS-2002), 2002.
G. Weiss. Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence. The MIT Press, Cambridge, 1999.
Agenda • Coordinated Systems Research Group (CSRG) • Introduction to “Coordinated Systems”
CSRG Mandate • The CSRG focuses on issues of coordination in large-scale decentralized engineering systems. • Engineering systems are increasingly reliant on coordination, more than on (centralized) optimization processes and/or control. • Typically, coordination is achieved through ad hoc protocols for relatively well-defined (constrained) interactions between distributed systems. • However, as information systems become more integrated into society, we find that existing protocols aren't adequately tuned for new applications and/or unexpected situations. • CSRG goals: • Develop theory for achieving coordination with limited communication and without the benefit of predefined protocols • Transform theory into practice by developing and evaluating prototype applications
CSRG Activities • Distributed decision-making without communication • fault tolerant and adaptive construction of overlay networks for group communication and collaboration. • remote sensing. • robotic coordination. • Auction mechanisms for “on-demand” IT services • market mechanisms for efficient allocation of distributed computation and communication services • Fictitious play in Internet traffic engineering • account for competing interests of end-to-end connections within and across Internet autonomous systems
Peter Beling • Associate Professor • Department of Systems and Information Engineering • Research: • Financial engineering • Optimization theory & computational complexity • Funded Projects: • Solution Concepts for Static Coordination Problems • NASA LaRC Grant NNL-04-AA66G
Alfredo Garcia • Assistant Professor • Department of Systems and Information Engineering • Research: • Modeling and control of communications networks • Stochastic Optimization and Optimal Control • Funded Projects: • Complex Networks Optimization • NSF Grant DMI-0217371 • Security of Supply & Strategic Learning in Restructured Power Markets • NSF Grant ECS-0224747
Stephen Patek • Assistant Professor • Department of Systems and Information Engineering • Research: • Modeling and control of communications networks • Stochastic Optimization and Optimal Control • Coordination Processes • Funded Projects: • Dynamic Coordination Processes for Distributed Planning with Limited Communication, NSF (DST-0414727)
Current Students • Himanshu Gupta • Kaushik Sinha • Yijia Zhao
Example: Dial / Wait • Two players, two actions each (Dial or Wait), with a common cost: the call goes through only if exactly one player Dials. Taking the cost of a failed stage to be 1 and of a successful one to be 0 (the convention consistent with the analysis on the following slides), the cost matrix is:

            Dial   Wait
    Dial     1      0
    Wait     0      1

• What makes sense? • There are two "coordinated minimum-cost solutions": (Dial, Wait) and (Wait, Dial). • Unfortunately, neither player knows which one to select.
Some Thoughts • Arbitrarily selecting an action is irrational! • If Player 1 arbitrarily chooses to Dial, Player 2 could also arbitrarily choose to Dial, resulting in a worst-case outcome. • Worst-case outcomes are easy to arrive at arbitrarily (without coordination). • Randomly selecting an action makes more sense! • If Player 1 chooses to Dial with probability p and Player 2 chooses to Dial with probability q, then the expected cost of the outcome is pq + (1 − p)(1 − q). • Only by introducing "mixed actions" (randomized decisions) is it meaningful to talk about expected cost.
Interesting Observations • Suppose Player 1 chooses p = .5; then the expected cost is .5q + .5(1 − q) = .5. In other words, regardless of Player 2's decision, Player 1 is able to "lock in" an expected cost of .5. • Suppose Player 2 chooses q = .5; then the expected cost is .5p + .5(1 − p) = .5. In other words, regardless of Player 1's decision, Player 2 is able to lock in an expected cost of .5.
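The lock-in observation is easy to check numerically. A minimal sketch in Python, assuming a cost of 1 when the players miscoordinate (both Dial or both Wait) and 0 otherwise, consistent with the .5 lock-in value above:

```python
# Expected cost in the Dial/Wait game under mixed actions, assuming cost 1
# when both players choose the same action and cost 0 otherwise.

def expected_cost(p, q):
    """Expected cost when Player 1 Dials with prob p and Player 2 with prob q."""
    return p * q + (1 - p) * (1 - q)

# With p = 0.5, Player 1 locks in an expected cost of 0.5 for every q:
locked = [expected_cost(0.5, q / 10) for q in range(11)]
```

The coordinated outcomes, e.g. expected_cost(1.0, 0.0), have cost 0; the worst-case outcome expected_cost(1.0, 1.0) has cost 1.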
A Strong Equilibrium • Each player has the ability to lock in an expected cost of .5, namely by choosing p = .5 and q = .5, respectively. • In fact, given p = .5 and q = .5, neither player has the ability to change (let alone improve) the expected cost of the solution. • So, p = .5 and q = .5 constitutes a strong equilibrium solution for the Dial / Wait problem.
Connection to Game Theory • The Dial / Wait problem is a finite, two-player, non-cooperative game in strategic form (with identical interests). • For the Dial / Wait game there are two pure strategy Nash equilibria: • (Dial, Wait) • (Wait, Dial) and exactly one mixed (non-pure) strategy Nash equilibrium: • p = .5, q = .5.
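These three equilibria can be verified with a grid-based best-response check (a sketch; the grid resolution and tolerance are arbitrary choices, and the cost convention of 1 on miscoordination, 0 otherwise, is the one consistent with the .5 lock-in computed earlier):

```python
# A profile (p, q) is a Nash equilibrium if neither player can lower the
# expected cost by deviating unilaterally.

def expected_cost(p, q):
    return p * q + (1 - p) * (1 - q)

def is_equilibrium(p, q, grid=101, tol=1e-9):
    """Check (p, q) against each player's best unilateral deviation on a grid."""
    probs = [i / (grid - 1) for i in range(grid)]
    cost = expected_cost(p, q)
    best1 = min(expected_cost(pp, q) for pp in probs)  # Player 1's best deviation
    best2 = min(expected_cost(p, qq) for qq in probs)  # Player 2's best deviation
    return cost <= best1 + tol and cost <= best2 + tol
```

For this game the check confirms (Dial, Wait), (Wait, Dial), and the mixed profile p = q = .5, and rejects the worst-case profile (Dial, Dial).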
Remarks • The pure strategy equilibria are completely uninteresting. • They can only be achieved if the players are allowed to coordinate their actions. • Minimax is correct here, but it is not a reasonable approach in general. • The mixed strategy equilibrium makes a lot of sense. • It achieves a decent value of expected cost and is not the result of an arbitrary decision. • In fact, all that's required in this example is that one of the two players play their equilibrium strategy. • But, in general, there can be multiple mixed strategy equilibria, and it is not obvious which one each player would select.
Uncoordinated Decision-Making We consider N-agent multistage decision situations, where • all N agents must select an available action at each stage, without coordinating their actions in advance (simultaneous decision making) • all N agents perceive the same cost (disutility) associated with the joint selection of actions at each stage.
Key Idea Since all agents share the same notion of cost, they would coordinate their actions (if they could) to pick out a minimum-cost joint solution. • This would be easy, even without coordination, if there were a single minimum-cost joint selection of actions. • Unfortunately, if more than one "coordinated minimum-cost solution" exists, then there may not be a clear course of action for all agents.
Dynamic Version of the Dial / Wait Problem • Both players decide to Dial or Wait in stages. • If both decide to Dial or if both decide to Wait, then they remain unconnected. • Both players are interested in connecting in as few stages as possible. [State diagram: the joint actions (Dial, Wait) and (Wait, Dial) move the system from "not connected" to "connected"; (Dial, Dial) and (Wait, Wait) leave it "not connected".]
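The dynamic version is easy to simulate. A sketch, assuming both players play the mixed equilibrium of the static game at every stage: each stage succeeds with probability .5 (exactly one player Dials), so the number of stages to connect is geometric with mean 2.

```python
import random

def stages_to_connect(p=0.5, q=0.5, rng=random):
    """Stages until the players connect, Dialing with probs p and q each stage."""
    stages = 0
    while True:
        stages += 1
        dial1 = rng.random() < p
        dial2 = rng.random() < q
        if dial1 != dial2:  # exactly one player Dials: connected
            return stages

random.seed(0)
mean = sum(stages_to_connect() for _ in range(20000)) / 20000
# The sample mean comes out close to the theoretical value of 2.
```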
Formalization • State space, X • Set of conditions associated with the operation of an underlying system. • Mixed action sets, Ai(x) • Actions available to player i when the system is in state x ∈ X. • Transition reward function, gx(a1, …, aN) • Expected reward (perceived equally by all players) associated with the profile of mixed actions (a1, …, aN) at state x ∈ X. • State transition probability, pxy(a1, …, aN) • Probability of transitioning from x ∈ X to y ∈ X under the profile of actions (a1, …, aN). • Time horizon, T • Number of stages of decision-making before the process ends.
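The ingredients above can be collected in a small data structure. A sketch (the names are illustrative, and mixed actions are omitted for brevity: players choose pure actions), instantiated with the dynamic Dial / Wait problem:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass
class MultistageGame:
    states: List[str]                                # state space X
    actions: Callable[[str, int], List[str]]         # A_i(x)
    reward: Callable[[str, Tuple[str, ...]], float]  # g_x(a_1, ..., a_N)
    transition: Callable[[str, Tuple[str, ...]], Dict[str, float]]  # p_xy(...)
    horizon: int                                     # T
    n_players: int                                   # N

# The dynamic Dial/Wait problem as an instance (pure actions only):
def dw_actions(x, i):
    return ["Dial", "Wait"]

def dw_reward(x, profile):
    # Reward 1 exactly when the players connect at this stage.
    return 1.0 if x == "not connected" and profile[0] != profile[1] else 0.0

def dw_transition(x, profile):
    if x == "connected" or profile[0] != profile[1]:
        return {"connected": 1.0}
    return {"not connected": 1.0}

game = MultistageGame(["not connected", "connected"], dw_actions,
                      dw_reward, dw_transition, horizon=10, n_players=2)
```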
Set Up • Consider a network in which n connections are served by a direct link between two nodes A and B. • Suppose two alternative links exist with capacities m and (n − m), respectively. • Note that all n connections can still be accommodated, but how should they re-route themselves? [Figure: nodes A and B joined by a direct link carrying the n connections, with two alternative links of capacity m and n − m.]
Dynamic Recovery • The initial state of this process is < m, n−m >. • Rules: • If k < m connections select the top link, then those k connections are satisfied, but the other link is overwhelmed. • The system transitions to < m−k, n−m >, with n−k connections left to be satisfied. • We are left with a network routing problem similar to the original one but involving the same or fewer unsatisfied connections. • If k = m connections select the top link, then all n connections are satisfied. • The system transitions to the terminal state, with no connections left to be satisfied. • If k > m connections select the top link, then the top link is overwhelmed, but the n−k other connections are satisfied. • The system transitions to < m, k−m >, with k connections left to be satisfied. • We are left with a network routing problem similar to the original one but involving the same or fewer unsatisfied connections. • All players randomize their selection of links. 
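These recovery dynamics can be simulated. A Monte Carlo sketch under one illustrative randomization (the slide does not fix one): each unsatisfied connection picks the top link with probability proportional to its remaining capacity.

```python
import random

def stages_until_recovered(cap1, cap2, rng):
    """Stages until all cap1 + cap2 outstanding connections are routed."""
    stages = 0
    while cap1 > 0 or cap2 > 0:
        stages += 1
        remaining = cap1 + cap2
        # Each connection picks the top link with prob cap1 / remaining.
        k = sum(rng.random() < cap1 / remaining for _ in range(remaining))
        if k < cap1:
            cap1 -= k               # state <cap1 - k, cap2>; bottom overwhelmed
        elif k == cap1:
            cap1, cap2 = 0, 0       # exact split: all connections satisfied
        else:
            cap2 = k - cap1         # state <cap1, k - cap1>; top overwhelmed
    return stages

rng = random.Random(1)
stages = [stages_until_recovered(2, 3, rng) for _ in range(2000)]
```

The proportional rule here is only a baseline; characterizing good randomizations is part of the research agenda.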
Problem Overview • N sensing platforms • Identical in capability • M targets • Autonomous planning • Strict collision avoidance protocol • Impossible for two or more platforms to investigate a single target simultaneously • Finite time horizon, T
Simple Example • Parameters • 3 UAVs • 5 points of interest • 2 sensing opportunities • Stage 1 • Green UAV successfully reaches its target • Blue and Red UAVs compete and fail to make the necessary observations
Simple Example • Stage 2 • Green and Blue UAVs compete for a new target • Red UAV is successful • Summary • We end up observing only two targets! • (With coordination, we could have arranged for all points of interest to be revealed.)
Research Question What decision rule should we implement within each sensing platform so that we gather as much information as possible within the available time, without requiring the platforms to coordinate their actions?
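A small simulation of the sensing scenario, under the illustrative assumption that each platform picks uniformly at random among the targets not yet observed (a baseline decision rule, not the one the research question seeks). Per the strict collision-avoidance protocol, a target chosen by more than one platform in a stage yields no observation.

```python
import random

def targets_observed(n_platforms, n_targets, horizon, rng):
    """Number of targets observed within the horizon, uncoordinated play."""
    unobserved = set(range(n_targets))
    for _ in range(horizon):
        if not unobserved:
            break
        picks = [rng.choice(sorted(unobserved)) for _ in range(n_platforms)]
        for t in set(picks):
            if picks.count(t) == 1:   # exactly one platform: observation succeeds
                unobserved.discard(t)
    return n_targets - len(unobserved)

# The slide's example: 3 UAVs, 5 points of interest, 2 sensing opportunities.
rng = random.Random(0)
runs = [targets_observed(3, 5, 2, rng) for _ in range(5000)]
avg = sum(runs) / len(runs)
```

On average, uncoordinated random choice leaves points of interest unobserved; a coordinated assignment could cover min(n_platforms × horizon, n_targets) of them.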
Topics of Interest • Game Theory • solution concepts for rational decision-making • alternatives to Nash and correlated equilibria for non-cooperative games • equilibrium selection • Learning algorithms for games • including fictitious play • reinforcement learning • Multi-Agent Frameworks/Applications • Robotic Cops, Robo-anything • Team theory • Philosophy of Coordination