Artistic Robots through Interactive Genetic Algorithm with ELO Rating System. Andy Goetz, Camille Huffman, Kevin Riedl, Mathias Sunardi and Marek Perkowski, Department of Electrical Engineering, Portland State University. Portland Cyber Theatre. Making science out of robot theater?
Interactive Genetic Algorithm — system diagram: human evaluators rate the robot's behavior expression; a probabilistic-automaton behavior generator and verifier produces the behavior automaton that drives the robot.
Main Idea of this paper • A new approach to creating the fitness function for an Interactive Genetic Algorithm, in which (possibly) many humans evaluate robot motions via an Internet page. • Based on the ELO rating system known from chess. • The robots use: • a genetic algorithm, • fuzzy logic, • probabilistic state machines, • a small set of functions for creating picture components, • and a user interface that allows Internet users to rate individual sequences.
Previous work on IEC systems • Human-based genetic algorithm. • Interactive evolution strategy. • Interactive genetic programming. • Interactive genetic algorithm. Mostly used for music composition and graphics; usually weighted fitness functions were used.
Ranking Systems in Sports • Rating systems for many sports award points in accordance with subjective evaluations of the 'greatness' of certain achievements. • For example, winning an important golf tournament might be worth, by an arbitrary choice, five times as many points as winning a lesser tournament. • A statistical approach, by contrast, uses a model that relates the game results to underlying variables representing the ability of each player.
Elo rating system • The Elo rating system is a method for calculating the relative skill levels of players in two-player games such as chess. • It is named after its creator Arpad Elo, a Hungarian-born American physics professor. • The Elo system was invented as an improved chess rating system, but today it is also used in many other games. • It is also used as a rating system for multiplayer competition in a number of video games. • It has been adapted to team sports including association football, American college football, basketball, and Major League Baseball.
Pairwise Comparison • Method: • Compare each pair of candidates (players) head-to-head. • Award each candidate one point for each head-to-head victory. • The candidate with the most points wins. • For N candidates, this takes N(N-1)/2 comparisons.
Pairwise Comparison - Example • Selection of best robot facial expression: • 4 candidates: {A,B,C,D} and 4 rankings of them • 37 voters • 5 trials (columns) • Table shows the rankings of the candidates (rows) and the number of voters (columns) that ranked the candidates that way
Pairwise Comparison - Example • Compare candidates A & B: • 14 voters ranked A higher than B • 10+8+4+1=23 voters ranked B higher than A • So, B wins against A
Pairwise Comparison - Example • Next, compare candidates A & C: • 14 voters ranked A higher than C • 10+8+4+1=23 voters ranked C higher than A • So, C wins against A • Continue for next pairs: A vs. D, B vs. C, B vs. D, C vs. D • Exclude: • permutations (e.g. C vs. A = A vs. C) • comparison with itself (e.g. A vs. A)
Pairwise Comparison - Example • Record points: • win = 1, loss = 0. Cell values: number of voters that ranked candidate (row) over candidate (column). C wins!
Pairwise Comparison - Example • Record points: • win = 1, loss = 0. Another way to determine the winner: use half of the table (one triangle), mark the winner of each pair, and count how many times each candidate appears. C wins!
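The pairwise-comparison procedure above can be sketched in Python. The ballot table from the example did not survive in this copy, so the five rankings below are hypothetical, chosen only to be consistent with the head-to-head counts quoted on the slides (14 vs. 23 for A against B and for A against C) and the final result that C wins:

```python
from itertools import combinations

def pairwise_points(ballots):
    """Condorcet-style pairwise comparison.

    ballots: list of (ranking, count) pairs, where ranking lists the
    candidates from most to least preferred and count is the number of
    voters who submitted that ranking.

    Returns a dict mapping each candidate to the number of head-to-head
    victories (win = 1 point, loss = 0; N(N-1)/2 comparisons in total).
    """
    candidates = sorted(ballots[0][0])
    points = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        a_over_b = sum(n for rank, n in ballots if rank.index(a) < rank.index(b))
        b_over_a = sum(n for rank, n in ballots if rank.index(b) < rank.index(a))
        if a_over_b > b_over_a:
            points[a] += 1
        elif b_over_a > a_over_b:
            points[b] += 1
    return points

# Hypothetical ballots (the slide's actual ranking table is not reproduced
# here): 5 distinct rankings of {A, B, C, D}, 14 + 10 + 8 + 4 + 1 = 37 voters.
ballots = [
    (["A", "D", "B", "C"], 14),
    (["C", "B", "D", "A"], 10),
    (["C", "D", "B", "A"], 8),
    (["C", "B", "A", "D"], 4),
    (["D", "C", "B", "A"], 1),
]
points = pairwise_points(ballots)
winner = max(points, key=points.get)
```

With these ballots, B beats A 23-14 and C beats A 23-14 as on the slides, and C wins every one of its head-to-head contests.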
Other possible scenario • A three-way tie: • Inconsistency (the Condorcet paradox): • A wins over B, B wins over C, C wins over A.
Overview of ELO • A player's skill is assumed to follow a normal distribution: • true skill is around the mean. • The Elo system gives two things: • a player's expected chance of winning, • a method to update a player's Elo rating.
Basic Ideas of ELO • One cannot look at a sequence of moves and say, "That performance is 2039." • Performance can only be inferred from wins, draws and losses. • Therefore, if a player wins a game, he is assumed to have performed at a higher level than his opponent for that game. • Conversely if he loses, he is assumed to have performed at a lower level. • If the game is a draw, the two players are assumed to have performed at nearly the same level.
Scores and ranking of players • A player's ranking is updated based on: • the expected value of winning (E), • which depends on the rating difference with the opponent, • and the outcome of the match (S for 'score'): • 1 = win • 0 = loss • 0.5 = draw
Expected scores in Elo rating • Expected score (E): • EA = 1 / (1 + 10^((RB − RA)/400)) • EB = 1 / (1 + 10^((RA − RB)/400)) = 1 − EA • Where: • EA, EB = expected score for player A and B, respectively • RA, RB = rating of player A and B, respectively • Remember: 1 = win, 0 = loss, 0.5 = draw http://en.chessbase.com/home/TabId/211/PostId/4007114
Characteristics of ELO • A player with a higher Elo rating than his opponent has a higher expected value (i.e. chance of winning), and vice versa. • When both players have similar Elo ratings, the chance of a draw is higher. • After the match, both players' ratings are updated by the same amount, but: • the winner gains rating, • the loser loses rating. • If a higher-rated ('stronger') player wins against a weaker player, the rating changes are smaller than when the weaker player wins against the higher-rated one. • The size of the adjustment is controlled by a subjectively chosen constant K.
Basic Assumptions of ELO • Elo's central assumption was that the chess performance of each player in each game is a normally distributed random variable. • Although a player might perform significantly better or worse from one game to the next, Elo assumed that the mean value of the performances of any given player changes only slowly over time. • A further assumption is necessary, because chess performance in the above sense is still not measurable. • Our question: "Is ELO good for human evaluation of robot art (motion, behavior)?"
How ELO Works • A player's expected score is his probability of winning plus half his probability of drawing. • Thus an expected score of 0.75 could represent a 75% chance of winning, 25% chance of losing, and 0% chance of drawing. • At the other extreme it could represent a 50% chance of winning, 0% chance of losing, and 50% chance of drawing. • The probability of drawing, as opposed to having a decisive result, is not specified in the Elo system. • Instead a draw is counted as half a win and half a loss.
How ELO Works • The relative difference in rating between two players determines an estimate for the expected score between them. • Both the average and the spread of ratings can be arbitrarily chosen. • Elo suggested scaling ratings so that a difference of 200 rating points in chess would mean that the stronger player has an expected score (which basically is an expected average score) of approximately 0.75. • The USCF initially aimed for an average club player to have a rating of 1500.
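The 200-point scaling claim can be checked numerically with the standard Elo expected-score formula (a minimal sketch; the function name is mine):

```python
def expected_score(r_a, r_b):
    """Expected score of player A against player B on the standard
    Elo curve with the 400-point scale factor."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

# Expected score of the stronger player at various rating gaps.
# A 200-point gap gives about 0.76, close to the 0.75 Elo aimed for.
for gap in (0, 100, 200, 400):
    print(gap, round(expected_score(1500 + gap, 1500), 2))
```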
Elo Rating - Example • Suppose a Robot Boxing league: • The league has tens, hundreds, or more robots • Each robot has a ranking (higher number = higher rank) • A robot’s ranking is updated after each match • But it can also be done after multiple matches • A match is a one-vs-one battle
Elo Rating Example: scores for robots • Expected score (E) • Suppose: • Robot A rank: 1500 • Robot B rank: 1320 • Then: • EA = 1 / (1 + 10^((1320 − 1500)/400)) = 0.738 • EB = 1 − 0.738 = 0.262 Robot A is expected to win.
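A minimal sketch reproducing the numbers on this slide:

```python
def expected_score(r_a, r_b):
    """Elo expected score of the first player against the second."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

e_a = expected_score(1500, 1320)  # Robot A (1500) vs. Robot B (1320)
e_b = 1.0 - e_a
print(round(e_a, 3), round(e_b, 3))  # 0.738 0.262, as on the slide
```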
Elo Rating Example: Adjusting ratings after a match • Next, the match is held. • After the match, the ratings of both robots are adjusted by: • R'A = RA + K × (SA − EA) • Where: • R'A = Robot A's new rating • RA = Robot A's old/current rating • K = some constant; for practical reasons we choose K = 24 in this example • SA = score/match result (1 = win, 0 = loss, 0.5 = draw) • EA = expected score • Similarly for robot B.
Elo Rating Example: Adjusting scores after one match • Suppose the outcome of the match: • Robot A wins! • Robot B wins! • It’s a draw! Remember before the match it was: • Robot A rank: 1500 • Robot B rank: 1320
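Under the update rule R' = R + K·(S − E) with K = 24, the three possible outcomes can be computed directly (a sketch; the slide's table of adjusted ratings is not reproduced in this copy, so the numbers here follow from the formula alone):

```python
K = 24
R_A, R_B = 1500, 1320
E_A = 1.0 / (1.0 + 10 ** ((R_B - R_A) / 400.0))  # ~0.738
E_B = 1.0 - E_A                                   # ~0.262

def update(rating, score, expected, k=K):
    """New rating after one match: R' = R + K * (S - E)."""
    return rating + k * (score - expected)

# Each row: outcome label, Robot A's score, Robot B's score.
for label, s_a, s_b in [("A wins", 1, 0), ("B wins", 0, 1), ("draw", 0.5, 0.5)]:
    print(label,
          round(update(R_A, s_a, E_A), 1),
          round(update(R_B, s_b, E_B), 1))
```

Note the asymmetry: when the favorite A wins, A gains only about 6 points, but when the underdog B wins, B gains about 18.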
Elo Rating Example: adjusting rankings after five matches • Suppose rank update is done after 5 matches: • Robot A current rank: 1500
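A single adjustment after several matches can accumulate the actual and expected scores before applying K once. The five opponent ratings and results below are hypothetical, for illustration only (the slide's own match table is not reproduced in this copy):

```python
K = 24

def expected_score(r_a, r_b):
    """Elo expected score of the first player against the second."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def batch_update(rating, matches, k=K):
    """One rating adjustment after several matches:
    R' = R + K * (sum of actual scores - sum of expected scores)."""
    total_s = sum(score for _, score in matches)
    total_e = sum(expected_score(rating, opp) for opp, _ in matches)
    return rating + k * (total_s - total_e)

# Hypothetical 5-match series for Robot A (current rank 1500):
# each entry is (opponent rating, result for A).
matches = [(1320, 1), (1550, 0), (1400, 1), (1500, 0.5), (1450, 1)]
new_rating = batch_update(1500, matches)
print(round(new_rating, 1))
```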
About K in chess • K is the rate of adjustment to one's rating. • Some Elo implementations adjust K based on certain criteria. For example: • FIDE (World Chess Federation): • K = 30 for a player new to the rating list, until s/he has completed events with a total of at least 30 games. • K = 15 as long as a player's rating remains under 2400. • K = 10 once a player's published rating has reached 2400 and s/he has also completed events with a total of at least 30 games; thereafter it remains permanently at 10. • USCF (United States Chess Federation): • Players below 2100: K-factor of 32. • Players between 2100 and 2400: K-factor of 24. • Players above 2400: K-factor of 16. How about robot art?
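The USCF tiers listed above map directly onto a small helper (a sketch; the treatment of the exact boundary values 2100 and 2400 is my assumption, since the slide does not specify which tier they fall in):

```python
def uscf_k_factor(rating):
    """K-factor tiers as listed on the slide for the USCF.
    Boundary handling (ratings of exactly 2100 or 2400) is assumed."""
    if rating < 2100:
        return 32
    if rating <= 2400:
        return 24
    return 16
```

A higher K makes ratings more volatile, which suits new or low-rated players whose true strength is still uncertain; for robot art, K would similarly control how quickly a behavior's rating reacts to each new human vote.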
ELO for art (motion) scoring — example robot drawings: one with a score of 194, another with a score of 0.
Physical Robot DERPY — Derpy with a Sharpie marker.
The fuzzy/probabilistic state machine operates differently in dark and light areas. Figure: an image with dark and light areas, and examples of fuzzy variables.
Fuzzy and Probabilistic Machines — the simple probabilistic machine of Derpy.