E Pluribus Unum: Matchmaking in Halo 3 • Chris Butcher, Bungie Studios • butcher@bungie.com • Game Developers Conference 2008
Overview • What Is Matchmaking? • Matchmaking Basics • Lessons from Halo 2 • Halo 3 Design Goals • Voice, Identity, Community Reinforcement • Skill Measurement and Reward Systems • Technical Design • TrueSkill • Matchmaking Algorithms • Recommendations • Results from Halo 3 Live Operation
Manual Game Browsing • User is presented with a list of possible games • Tries to find an open slot • Tries to find a fair game • Inconsistent experience • Not good for casual gamers • “I Just Want To Play!”
Terminology • Manual game browsing is a standard technique • Host Game / Join Game options in UI are common • Xbox LIVE refers to this as the “Matchmaking” API • Quick Match, Custom Match • In this presentation, “matchmaking” means an automated peer-to-peer system that organizes players into groups based on user preference • Game could still be client / server once the game starts • Could even use dedicated servers
Vision of Matchmaking • Provide an experience that is: Fast Reliable Consistent • Continuous stream of enjoyable games • Reward skilled play and also investment of time • Don’t give players a reason to stop!
Matchmaking Ecosystem • Continuous stream of groups entering matchmaking • Some groups decide to start gathering a game • The remainder search for games to join • Each group can be multiple machines and players
Xbox LIVE Matchmaking Service • Gatherers register with XBL service • Each group has unique matchmaking desires • Type of game, skill level, spoken language, etc. • Searchers query service with parameter filters • Service returns matching candidates
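The register-and-query flow on this slide can be sketched in a few lines. This is only an illustration, not the Xbox LIVE API: the `Session` fields, the `query_sessions` function, and the playlist names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """A gatherer's registered session. All fields are illustrative."""
    session_id: str
    playlist: str
    skill: int
    language: str


def query_sessions(sessions, playlist, skill, max_skill_gap, language=None):
    """Return registered sessions that pass every filter, closest skill first."""
    matches = [
        s for s in sessions
        if s.playlist == playlist
        and abs(s.skill - skill) <= max_skill_gap
        and (language is None or s.language == language)
    ]
    # sorted() is stable, so sessions with equal skill gaps keep registration order.
    return sorted(matches, key=lambda s: abs(s.skill - skill))


registered = [
    Session("a", "playlist_1", 20, "en"),
    Session("b", "playlist_1", 24, "fr"),
    Session("c", "playlist_2", 21, "en"),
    Session("d", "playlist_1", 35, "en"),
]
candidates = query_sessions(registered, "playlist_1", skill=22,
                            max_skill_gap=5, language="en")
any_language = query_sessions(registered, "playlist_1", skill=22, max_skill_gap=5)
```

Here `candidates` holds only session "a", while dropping the language filter also admits session "b".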
Candidate Evaluation • Searcher evaluates all candidates in parallel, best matches first • Ping network connectivity, get current group state • Measure quality of connection using Xbox QoS probes • Group-to-group XNetConnect • Group-to-group session join • Network layer handles as asynchronous processes
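A minimal sketch of evaluating candidates in parallel, assuming a hypothetical `probe` callable that returns a round-trip time in milliseconds (or `None` when the candidate is unreachable). The real system uses Xbox QoS probes and asynchronous network-layer processes; a thread pool stands in for those here, and the 150 ms cutoff is an assumed value.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_candidates(candidates, probe, max_ping_ms=150):
    """Probe every candidate concurrently; return usable ones, lowest ping first."""
    with ThreadPoolExecutor(max_workers=max(1, len(candidates))) as pool:
        pings = list(pool.map(probe, candidates))  # results keep input order
    usable = [
        (ping, candidate)
        for ping, candidate in zip(pings, candidates)
        if ping is not None and ping <= max_ping_ms
    ]
    usable.sort(key=lambda pair: pair[0])
    return [candidate for _, candidate in usable]


# Stand-in probe results: "b" is unreachable and "d" is too slow.
fake_pings = {"a": 80, "b": None, "c": 40, "d": 300}
ranked = evaluate_candidates(list(fake_pings), fake_pings.get)
```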
Matchmaking Life Cycle • Groups enter Matchmaking continuously • Each group chooses to gather or search • Gather: register session with XBL service • Search: query service for candidates • Search, evaluate candidates, try to join • If no suitable candidates, search again • Halo-specific game flow: • Gatherer waits until game is full • Determine game settings, host selection • Start game
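The searcher half of this life cycle is a retry loop. The sketch below assumes two hypothetical callables, `search` (returns candidates, best first) and `try_join` (returns whether the join succeeded). The attempt limit is an assumption, since the slide does not say when a searcher gives up.

```python
def run_searcher(search, try_join, max_attempts=5):
    """Search, try candidates best-first, and search again if none can be joined.

    Returns the joined candidate, or None when all attempts are used up.
    """
    for _ in range(max_attempts):
        for candidate in search():
            if try_join(candidate):
                return candidate
    return None


# Scripted example: the first search finds nothing, the second finds two
# sessions, and only "s2" still has an open slot.
scripted_results = iter([[], ["s1", "s2"]])
joined = run_searcher(lambda: next(scripted_results, []), lambda c: c == "s2")
gave_up = run_searcher(lambda: ["x"], lambda c: False, max_attempts=3)
```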
Halo 2 has had good longevity • Year-on-Year retention is > 80%
Game is well suited to Matchmaking • Small-group gameplay (2-5 per team) • Interact with friends in your group • Both coordinated effort and individual skill required • Opponents are anonymous and interchangeable • Long term goals are self-driven rather than peer-driven • I want to reach Level 30 • Not: I want to be the best on my server
Lessons Learned - Matchmaking • Received well by the majority of players • Always something to do, a mix of novelty and the familiar • Configurable experience allows longevity • Required several early updates to operate robustly • DLC maps locking people out was a problem • People don’t like feeling they have no control • International experience was poor
Lessons Learned - Skill System • Modified Elo rating system • Both a skill measurement and also reward for investment • Non-zero-sum for levels 1-20 to give a “hill-climbing” experience • Was abused through boosting • Zero-sum competition for advancement • Skill level achievement is always in jeopardy • Leads to anxiety, anger and frustration in players • “WTF I lost my level 30, my team sucks”
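To make "zero-sum" concrete, here is the textbook Elo update for a two-player game. It is not Halo 2's modified system, whose details the slides do not give. The K-factor of 32 is a commonly used value, chosen here as an assumption.

```python
def elo_expected(rating_a, rating_b):
    """Expected score of player A against player B under standard Elo."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return both new ratings; score_a is 1 for a win, 0.5 for a draw, 0 for a loss."""
    delta = k * (score_a - elo_expected(rating_a, rating_b))
    # Zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


new_a, new_b = elo_update(1500.0, 1500.0, score_a=1.0)
```

Because every gain is matched by an equal loss, a rating earned this way can always be taken back, which is the source of the loss anxiety the slide describes.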
Ranked Matchmaking in Halo 2 Hyper-Competition + Anonymity + Loss Anxiety = Negative Emotional Pressure
Overall Goals • Make the online experience approachable • Provide accountability and identity • Give players a reason to keep coming back • Tools: • Voice • Identity • Skill System • Reward System • New Player Experience
Voice Design • Can’t predict how players will use voice • Give listeners control over what they hear • Remove temptation to use voice negatively • Allow time for socialization that isn’t under pressure • Make it easy for players to opt out or mute • Positive: Chatting idly with friendly strangers • Negative: Being abused by hostile anonymous bigots
Identity Design • Every player has a public Service Record • Persistent individual identity reduces anonymity • Goal is to reduce anonymity and provide long-term identification • Publicly accessible in-game to everyone • Reduce sock-puppeting that was prevalent in Halo 2 • Rewards are individual • Success recognized directly, or via social comparison with friends • No global leaderboards! • Primarily competing with yourself
Skill System Design • Range 1-50; everyone starts at level 1 • Almost everyone gains levels quickly, providing positive feedback • After 50-100 games, skill level stabilizes • Needs to still feel dynamic and not stagnant. • But shouldn’t “lose a level” from one bad game. • Skill should be a statistic, not a reward
Reward System Design • Reward for playing • “Experience points” (XP) • Only for wins, to prevent boosting • Penalty for quitting games early • Experience rating hierarchy • Ratings require both skill and XP • Emphasized in UI over skill • Permanent; no loss anxiety
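One way to read "ratings require both skill and XP" is as a table of joint thresholds. The tier names and numbers below are hypothetical, not Halo 3's actual values. The sketch takes the highest skill level a player has reached, which is one assumed way of keeping ratings permanent, and it does not model the quit penalty.

```python
# (name, minimum XP, minimum skill level), in ascending order. All values hypothetical.
RATING_REQUIREMENTS = [
    ("Tier 1", 0, 1),
    ("Tier 2", 10, 1),
    ("Tier 3", 70, 10),
    ("Tier 4", 150, 20),
]


def xp_after_game(xp, won):
    """XP is awarded only for wins, as on the slide."""
    return xp + 1 if won else xp


def rating_for(xp, highest_skill):
    """Return the highest tier whose XP and skill requirements are both met."""
    current = RATING_REQUIREMENTS[0][0]
    for name, min_xp, min_skill in RATING_REQUIREMENTS:
        if xp >= min_xp and highest_skill >= min_skill:
            current = name
    return current
```

For example, `rating_for(200, 5)` returns "Tier 2": the player has enough XP for the top tier but not the skill level, so playing time alone is not sufficient.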
New Player Experience • Separate “Boot Camp” playlist for new players only • Limited set of maps and weapons to ease players in • Small groups for socialization • Reward early and often! • Move skilled players out quickly • 5 wins triggers ‘graduation’
Xbox LIVE Skill System – TrueSkill • A mathematical library implemented on XBL back end • Bayesian estimation techniques developed by Microsoft Research Cambridge • Models player skills as probability density functions [µ, σ] • µ is mean (current estimate), σ is standard deviation (uncertainty) • TrueSkill is stored and updated invisibly by XBL back end
Using TrueSkill in Halo 3 • Don’t show players the raw mathematics of [µ, σ] • Use skill lower bound: s = µ - kσ (we chose k=4) • Transform by remap function into range 1-50 for display in UI
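The lower-bound formula on this slide is easy to show in code. The value k = 4 comes from the slide. The remap function is not described, so the linear remap below and its `s_min` and `s_max` range are assumptions for illustration only.

```python
def skill_lower_bound(mu, sigma, k=4.0):
    """Conservative skill estimate s = mu - k * sigma."""
    return mu - k * sigma


def display_level(mu, sigma, k=4.0, s_min=0.0, s_max=50.0, levels=50):
    """Map the lower bound linearly onto 1..levels, clamping at both ends.

    The linear shape and the [s_min, s_max] range are hypothetical.
    """
    s = skill_lower_bound(mu, sigma, k)
    level = 1 + int((s - s_min) * levels / (s_max - s_min))
    return max(1, min(levels, level))


# TrueSkill's commonly cited starting values are mu = 25 and sigma = 25/3.
new_player_level = display_level(25.0, 25.0 / 3.0)
settled_player_level = display_level(40.0, 1.0)
```

With these assumptions a new player's lower bound is negative and clamps to level 1, which matches the design goal that everyone starts at level 1. The displayed level then rises as σ shrinks, even if µ barely moves.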
Customizing TrueSkill • Mathematical configuration variables • β (performance factor), τ (dynamics factor), draw probability • Left β alone: dangerous, affects final skill distribution • Increased τ so that players’ skill never fully converges • Draw probability must be accurate; if it is set too low, ties will be considered highly significant • Update Weight – modifies rate of change of [µ, σ] • We used this to give players a “hill-climbing” experience by initially decreasing their TrueSkill update weight • Weights start out small and return to normal over 50-100 games in a playlist • Even though we can identify good or bad players after 8 games, it is more satisfying for them to feel they earned their skill over time
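The update-weight idea can be sketched as a ramp that starts small and reaches 1.0 after a set number of games. The slides give neither the ramp shape nor how XBL applies the weight, so the linear ramp, the starting weight of 0.25, the 75-game length, and the interpolation in `apply_weighted_update` are all assumptions.

```python
def update_weight(games_played, ramp_games=75, start_weight=0.25):
    """Hypothetical linear ramp from start_weight up to 1.0 over ramp_games games."""
    if games_played >= ramp_games:
        return 1.0
    return start_weight + (1.0 - start_weight) * games_played / ramp_games


def apply_weighted_update(mu, sigma, full_mu, full_sigma, weight):
    """Move only part of the way toward the full TrueSkill update.

    (full_mu, full_sigma) is what an unweighted update would have produced.
    Linear interpolation is a simplification chosen for this sketch.
    """
    return (mu + weight * (full_mu - mu),
            sigma + weight * (full_sigma - sigma))


# First game in a playlist: only a quarter of the full update is applied.
damped = apply_weighted_update(25.0, 8.0, 29.0, 7.0, update_weight(0))
```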
TrueSkill Summary • Advantages • Already implemented for you by Xbox LIVE • Converges quickly • Provides good estimate of player skill for matchmaking • TrueSkill developers are very helpful and knowledgeable • Disadvantages • Complex mathematics, takes an expert to understand and tweak • Hard to predict overall convergence of system • Default behavior does not fit our ideals for a skill system • Vulnerable to exploitation, both real and perceived (ties increase rank) • This is a hard problem with no clear solution
Search Criteria • Use precise initial query parameters to find an ideal match • Skill, Experience, Network Connection Quality • Initial queries are less likely to find a match • Allows tight matches in large populations • Query parameters must include all selection criteria • Halo 2 had some criteria that were not stored in XBL service • Searcher spent time querying candidates that they would never want to join e.g. due to spoken language • Wastes bandwidth and also wastes precious search time
Search Expansion • ‘Fuzzy match’ in many dimensions • Analog parameters (skill, experience, network connection) • Binary parameters (language, country, DLC maps) • Treat binary parameters as “soft filters” • Expansion has multiple phases • Look for ideal match, expand analog filter a bit • Remove binary “soft filters” • Expand analog filter out to max, relax connection quality • Keep trying intermittently, switch to gathering
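The expansion phases can be written down as a sequence of progressively looser filters. The `SearchFilter` fields and every numeric threshold below are hypothetical. Only the ordering of the phases follows the slide, and the final "switch to gathering" step is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchFilter:
    max_skill_gap: int           # analog parameter
    max_ping_ms: int             # analog parameter
    require_same_language: bool  # binary "soft filter"
    require_same_dlc: bool       # binary "soft filter"


def expansion_phases(ideal_gap=2, max_gap=10, good_ping_ms=100, relaxed_ping_ms=250):
    """Return the filters to try, from tightest to loosest."""
    return [
        SearchFilter(ideal_gap, good_ping_ms, True, True),        # ideal match
        SearchFilter(ideal_gap * 2, good_ping_ms, True, True),    # widen analog a bit
        SearchFilter(ideal_gap * 2, good_ping_ms, False, False),  # drop soft filters
        SearchFilter(max_gap, relaxed_ping_ms, False, False),     # max out, relax ping
    ]


def filter_for_attempt(attempt, phases):
    """Pick the phase for a zero-based attempt count; stay on the last phase after that."""
    return phases[min(attempt, len(phases) - 1)]


phases = expansion_phases()
```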
Ecosystem Balance • Must have good balance of searchers and gatherers • Halo 2: Easy to model in theory, impossible in practice • Global Internet network properties • Latency in Live service updating • Network engine internals (time to discover, time to join) • Expire lists of candidates quickly • Searchers are in a race to join limited set of active games • Make the ecosystem adaptive • Gatherers can also search • If nobody is joining you, you have a chance to join someone • Ecosystem can adaptively balance for low-population scenario
Is Matchmaking Right For You? • Works with different genres • Works with different game models • Could match into games in progress • Could use dedicated servers • Scales to wide range of user populations • Halo 3 playlists range from < 1k to 100k concurrent users • Significant investment in client software • 2-3 developers for project lifecycle • Back end functionality optional but helps a lot • Payoff comes from building a lasting community
Design For Security • Users have no control so you must provide safe games for them • Every aspect of your game will be attacked • Hardware attacks, network attacks (bridging, standby, DoS), game attacks (modified content, LSP interception), exploits (skill de-leveling, out-of-map), many more • Halo 2 required five updates over three years for security • This is an entire talk in itself
Test Early, Test Often • MS-Internal Alpha and Beta (10k) – 11/06 and 4/07 • Rich text data mining as primary feedback • Searchable centralized logging system with event severities • Find hard bugs in client (edge cases, crashes, network protocol) • Transcontinental Matchmaking and network testing • Public Beta (900k) – 5/07 • PR boost, some gameplay feedback also • Tune TrueSkill distribution curves on real player skill mix • Load balancing of LSP servers to avoid Day 1 meltdown • High population games must involve XBL in testing • Easy to create a scalability problem on back end
Collect Data From Production • You will need retail environment instrumentation • Can’t use data mining – too much data to store and transmit • For Halo 2 launch we had no alternative • Halo 3 uses special-purpose binary uploads • End-of-game report for bungie.net analysis • Matchmaking status report for ecosystem diagnosis • Network QoS report for research • Volume of data is massive, we discard 90%+ • Slim fire-and-forget stateless HTTP-over-XLSP upload • Per-machine settings for deep investigation
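One way to keep only a fraction of reports while still allowing deep investigation of specific machines is deterministic hash-based sampling with a per-machine override. The slides do not say where or how the 90%+ is discarded, so this is purely an assumed approach, and `should_upload` and its parameters are hypothetical.

```python
import hashlib


def should_upload(machine_id, report_id, keep_fraction=0.1, always_upload=frozenset()):
    """Decide whether to keep a report.

    Machines listed in always_upload are never sampled out. For other machines
    a stable hash of the ids picks roughly keep_fraction of all reports.
    """
    if machine_id in always_upload:
        return True
    digest = hashlib.sha256(f"{machine_id}:{report_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big")    # uniform in [0, 2**64)
    return bucket < int(keep_fraction * 2 ** 64)  # integer compare avoids float rounding


kept = sum(should_upload("machine-1", n) for n in range(10_000))
```

Hashing rather than calling a random number generator makes the decision reproducible: the same report is always kept or always dropped.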
Results – Overall • Deployed successfully, no client update needed • Some normal LSP scalability balancing in first few days • Reviews mention MP as “streamlined, transparent” • High penetration of online multiplayer • 5.9M unique users observed on Xbox LIVE • 5.2M have played in Matchmaking (88%) • Approximately 3x Halo 2 peak concurrency • 560k peak concurrent users in 15 playlists • Longevity is an open question • Tracking steady at 1.2M unique users per week
Results – Player Community • Skill numbers that you can actually believe in! • Perception is that level 50 means skilled, not a cheater • Some experience boosting • We assumed it would be possible to circle boost XP • There were ways that let you do it many times faster than normal • No way to advance to higher ratings without ranked play • This was probably a mistake • Player identity features very well received • Social community seems to be better than Halo 2
Results – MP Game Selection Custom Games: 16% Matchmaking: 84%