Automated Testing: Better, Cheaper, Faster, For Everything

Larry Mellon, Steve Keller
Austin Game Conference, Sept. 2004

What Is An MMP Automated Testing System?
• Push-button ability to run large-scale, repeatable tests
• Cost
  • Hardware / Software
  • Human resources
  • Process changes
• Benefit
  • Accurate, repeatable, measurable tests during development and operations
  • Stable software, faster, measurable progress
  • Base key decisions on fact, not opinion

MMP Requires A Strong Commitment To Testing
• System complexity, non-determinism, scale
• Tests provide hard data in a confusing sea of possibilities
• Increase comfort and confidence of entire team
• Tools augment your team’s ability to do their jobs
• Find problems faster
• Measure / change / measure: repeat as necessary
• Production / exec teams: come to depend on this data to a high degree

How To Get There
• Plan for testing early
  • Non-trivial system
  • Architectural implications
• Make sure the entire team is on board
• Be willing to devote time and money

Automation: Architecture
[Diagram; recoverable components:]
• Test Manager: Test Selection/Setup; Control N Clients; RT probes
• Scripted Test Clients: Emulated User Play Sessions; Multi-client synchronization
• Report Managers: Raw Data Collection; Aggregation / Summarization; Alarm Triggers
• System Under Test (×3)
• Flow labels: Startup & Control; Repeatable, Sync’ed Test Inputs; Collection & Analysis

Outline
• Overview: Automated Testing
  • Definition, Value, High-Level Approach
• Applying Automated Testing
  • Mechanics, Applications
• Process Shifts: Stability, Scale & Metrics
• Implementation & Key Risks
• Summary & Questions

Scripted Test Clients
• Scripts are emulated play sessions: just like somebody plays the game
  • Command steps: what the player does to the game
  • Validation steps: what the game should do in response
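The split between command steps and validation steps can be sketched in a few lines. This is only an illustration: `FakeGame`, its methods, and the `SIT_SCRIPT` step list are hypothetical stand-ins invented here, not the API of any real game client or of the system described in the talk.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class FakeGame:
    """Hypothetical stand-in for the system under test."""
    logged_in: bool = False
    seated: bool = False

    def login(self) -> None:
        self.logged_in = True

    def sit_in_chair(self) -> None:
        # The avatar can only sit once the player is logged in.
        if self.logged_in:
            self.seated = True


# A step is (kind, description, callable).
# Command steps act on the game; validation steps return True
# when the game responded as expected.
Step = Tuple[str, str, Callable[[FakeGame], object]]

SIT_SCRIPT: List[Step] = [
    ("command", "log in", lambda g: g.login()),
    ("validate", "player is logged in", lambda g: g.logged_in),
    ("command", "sit in chair", lambda g: g.sit_in_chair()),
    ("validate", "avatar is seated", lambda g: g.seated),
]


def run_script(game: FakeGame, script: List[Step]) -> List[str]:
    """Run every step in order; return descriptions of failed validations."""
    failures = []
    for kind, description, action in script:
        result = action(game)
        if kind == "validate" and not result:
            failures.append(description)
    return failures


failures = run_script(FakeGame(), SIT_SCRIPT)
print("PASS" if not failures else f"FAIL: {failures}")
```

Running only the last two steps (`SIT_SCRIPT[2:]`) against a fresh `FakeGame` skips the login command, so the "avatar is seated" validation fails. That is the kind of response check a validation step exists to make.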

Scripts Tailored To Each Test Application
• Unit testing: 1 feature = 1 script
• Load testing: Representative play session
  • The average Joe, times thousands
• Shipping quality: corner cases, feature completeness
• Integration: test code changes for catastrophic failures

“Bread Crumbs”: Aggregated Instrumentation Flags Trouble Spots
[Chart annotation: Server Crash]

Quickly Find Trouble Spots
DB byte count oscillates out of control

Drill Down For Details
A single DB Request is clearly at fault
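A rough sketch of the aggregate-then-drill-down idea follows. It assumes each bread crumb is a simple `(subsystem, event, bytes)` record. The record layout, the sample values, and the helper `bytes_by` are all made up for illustration and are not taken from the talk.

```python
from collections import Counter

# Hypothetical bread-crumb records: (subsystem, event, bytes transferred).
CRUMBS = [
    ("db", "get_inventory", 1_200),
    ("db", "get_lot", 900),
    ("net", "send_update", 300),
    ("db", "get_inventory", 48_000),
    ("db", "get_inventory", 52_000),
    ("net", "send_update", 310),
]


def bytes_by(crumbs, key_index):
    """Sum the byte counts, grouped by the field at key_index."""
    totals = Counter()
    for crumb in crumbs:
        totals[crumb[key_index]] += crumb[2]
    return totals


# Step 1: the aggregated view flags the subsystem with the most traffic.
by_subsystem = bytes_by(CRUMBS, 0)
hot_subsystem, _ = by_subsystem.most_common(1)[0]

# Step 2: drill down inside that subsystem to the individual request type.
drill = bytes_by([c for c in CRUMBS if c[0] == hot_subsystem], 1)
hot_event, hot_bytes = drill.most_common(1)[0]

print(f"trouble spot: {hot_subsystem}.{hot_event} ({hot_bytes} bytes)")
```

The same two-step pattern works for any counter the instrumentation records, such as latency or call counts. Byte count is used here only because it is the example on the slides.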

Process Shift: Applying Automation to Development
Earlier Tools Investment Equals More Gain
[Chart: MMP Developer Efficiency. Axes: Time vs. Amount of work done. Curves: Strong test support; Weak test support. Labels: Project Start; Target; Launch; Not Good Enough.]

Process Shifts: Automated Testing Can Change The Shape Of The Development Progress Curve
• Stability: Keep Developers moving forward, not bailing water
• Scale: Focus Developers on key, measurable roadblocks

Process Shift: Measurable Targets, Projected Trend Lines
[Chart labels: First Passing Test; Now; Target; Complete; Core Functionality Tests, Any Feature (e.g. # clients); Time; Any Time (e.g. Alpha).]
Actionable progress metrics, early enough to react
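One way to turn test results into a projected trend line is an ordinary least-squares fit over a measured metric. The sketch below assumes weekly measurements of how many simultaneous clients a passing load test sustained. The numbers, the target, and the choice of a straight-line fit are all illustrative assumptions, not data or method from the talk.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical measurements: week number vs. clients sustained in load test.
weeks = [0, 1, 2, 3, 4]
clients = [100, 200, 300, 400, 500]

slope, intercept = fit_line(weeks, clients)

# Project the week at which the trend line reaches a hypothetical target.
target_clients = 2000
weeks_to_target = (target_clients - intercept) / slope

print(f"gaining {slope:.0f} clients/week; "
      f"target reached around week {weeks_to_target:.1f}")
```

If the projected week falls after a milestone such as Alpha, the team learns that while there is still time to react, which is the point of the slide.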

Stability Analysis: What Brings Down The Team?
Test Case: Can an Avatar Sit in a Chair?
[Diagram: Critical Path call stack, top to bottom: use_object(), buy_object(), enter_house(), buy_house(), create_avatar(), login()]
• Failures on the Critical Path block access to much of the game.
• Worse, unreliable failures…
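The blocking effect of a Critical Path failure can be made concrete with a small sketch. It assumes the six calls on the slide form a strict chain from `login` up to `use_object`, which is a reading of the diagram. The helper `blocked_by` is hypothetical.

```python
# Critical path as read from the slide, in the order a player hits each step.
CRITICAL_PATH = [
    "login",
    "create_avatar",
    "buy_house",
    "enter_house",
    "buy_object",
    "use_object",
]


def blocked_by(broken_step, path=CRITICAL_PATH):
    """Return the steps nobody can reach (or test) while broken_step fails."""
    index = path.index(broken_step)
    return path[index + 1:]


for step in CRITICAL_PATH:
    print(f"{step:14s} broken -> blocks {len(blocked_by(step))} later step(s)")
```

A break in `login` blocks all five later steps, while a break in `use_object` blocks none. That is why failures early in the chain cost the team the most.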

Impact On Others

Pre-Checkin Regression: don’t let broken code into the Critical Path.

Monkey Test: EnterLot

Non-Deterministic Failures

Stability Via Monkey Tests
Continual Repetition of Critical Path Unit Tests
[Diagram labels: Code Repository; Compilers; Reference Servers]
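Continual repetition is what exposes failures that a single run would usually miss. The sketch below simulates that. `flaky_enter_lot` is a hypothetical test that fails about one run in twenty, standing in for a race condition, and the seeded random generator only makes the demonstration reproducible.

```python
import random


def flaky_enter_lot(rng):
    """Hypothetical unit test: passes unless a simulated race is hit (~5%)."""
    return rng.random() >= 0.05


def monkey_test(test, runs, seed=0):
    """Repeat one test many times; return (failure count, failure rate)."""
    rng = random.Random(seed)
    failures = sum(1 for _ in range(runs) if not test(rng))
    return failures, failures / runs


failures, rate = monkey_test(flaky_enter_lot, runs=1000)
print(f"{failures} failures in 1000 runs ({rate:.1%})")
```

A developer running the test once would see it pass about 95% of the time. Only the repeated runs give a failure rate stable enough to track from build to build.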

Process Shift: Comb Filter Testing
• Cheap tests to catch gross errors early in the pipeline
• More expensive tests only run on known functional builds
• $: Sniff Test, Monkey Tests
  • Fast to run
  • Catch major errors
  • Keeps coders working
• $$: Smoke Test, Server Sniff
  • Is the game playable?
  • Are the servers stable under a light load?
  • Do all key features work?
• $$$: Full Feature Regression, Full Load Test
  • Do all test suites pass?
  • Are the servers stable under peak load conditions?
[Diagram labels for build states: New code ready for checkin; Full system build; Promotable to full testing; Promotable to paying customers]
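The comb filter can be sketched as an ordered list of stages that stops at the first failure, so expensive stages never run on a build that a cheap stage has already rejected. The stage checks and the `build` dictionary below are hypothetical placeholders for real test suites.

```python
def comb_filter(build, stages):
    """Run stages cheapest-first; stop at the first failing stage.

    Returns (names of stages passed, name of failing stage or None).
    """
    passed = []
    for name, check in stages:
        if not check(build):
            return passed, name
        passed.append(name)
    return passed, None


# Hypothetical build state; real checks would run actual test suites.
build = {"compiles": True, "playable": False, "survives_peak_load": False}

STAGES = [
    ("sniff test", lambda b: b["compiles"]),                         # $
    ("smoke test", lambda b: b["playable"]),                         # $$
    ("full regression and load", lambda b: b["survives_peak_load"]), # $$$
]

passed, failed_at = comb_filter(build, STAGES)
print(f"passed: {passed}; stopped at: {failed_at}")
```

In this example the build fails the smoke test, so the full regression and load stage is never started.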

Process Shift: Who Tests What?
• Automation: simple tasks (repetitive or large-scale)
  • Load @ scale
  • Workflow (information management)
  • Full weapon damage assessment, broad, shallow feature coverage
• Manual: judgment / innovative tasks
  • Visuals, playability, creative bug hunting
• Combined
  • Tier 1 / Tier 2: automation flags potential errors, manual investigates
  • Within a single test: automation snapshots key game states, manual evaluates results
  • Augmented / accelerated: complex build steps, …

Process Shift: Load Testing (Before Paying Customers Show Up)
• Expose issues that only occur at scale
• Establish hardware requirements
• Establish play is acceptable @ scale

[Diagram: load testing rig. Recoverable labels: Load Testing Team; Load Control Rig; Test Driver CPU (×3); Test Client (×9); Game Traffic; Server Cluster; Internal; System; Probes; Monitors; Metrics; Client Metrics; Resource; Debugging Data]

Client-Server Comparison

Highly Accurate Load Testing: “Monkey See / Monkey Do”
[Diagram labels: Live Beta Testers; AlphaVille Servers; Test Servers; Sim Actions (Player Controlled); Sim Actions (Script Controlled)]
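One plausible reading of "Monkey See / Monkey Do" is that scripted clients reproduce the mix of actions observed from live players. The sketch below works under that assumption only. The recorded action list and both helper functions are invented for illustration.

```python
from collections import Counter

# Hypothetical log of actions recorded from player-controlled Sims.
recorded = [
    "chat", "chat", "use_object", "route_avatar",
    "chat", "enter_lot", "use_object", "chat",
]


def action_mix(actions):
    """Fraction of the recorded session spent on each action type."""
    counts = Counter(actions)
    total = len(actions)
    return {action: count / total for action, count in counts.items()}


def scripted_session(mix, length):
    """Build a script whose action counts approximate the recorded mix."""
    script = []
    for action, share in sorted(mix.items()):
        script.extend([action] * round(share * length))
    return script


mix = action_mix(recorded)
script = scripted_session(mix, length=16)
print(Counter(script))
```

A real rig would also need to preserve ordering and timing between actions. This sketch only matches the proportions.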


Data Driven Test Client
[Diagram labels: Load; Regression; Reusable Scripts & Data; Single API (×2); Test Client; Key Game States; Pass/Fail; Responsiveness; Script-Specific Logs & Metrics]

Scripted Players: Implementation
[Diagram labels: Test Client; Game Client; Script Engine; Game GUI; State (×2); Commands; Presentation Layer; Client-Side Game Logic]

What Level To Test At?
[Diagram: Game Client layers: View; Presentation Layer; Logic. Test input: Mouse Clicks]
• Regression: Too Brittle (UI & pixel shift)
• Load: Too Bulky

What Level To Test At?
[Diagram: Game Client layers: View; Presentation Layer; Logic. Test input: Internal Events]
• Regression & Load: Too Brittle (Churn Rate vs Logic & Data)

Automation Scripts == QA Tester Scripts
Basic gameplay changes less frequently than UI or protocol implementations.
[Diagram: NullView Client layers: View; Presentation Layer; Logic. Script commands: Chat; Enter Lot; Use Object; Route Avatar; …]
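The NullView idea can be sketched as client logic that takes its view as a parameter, so a test client runs the same gameplay commands with a view that draws nothing. Every class and method name below is hypothetical. The sketch shows only the shape of the design, not the actual client code.

```python
class NullView:
    """View that draws nothing, so a test client can run without a GUI."""

    def show(self, message: str) -> None:
        pass


class RecordingView:
    """Stand-in for a real GUI: remembers what would have been drawn."""

    def __init__(self) -> None:
        self.drawn = []

    def show(self, message: str) -> None:
        self.drawn.append(message)


class ClientLogic:
    """Hypothetical client-side game logic shared by game and test clients."""

    def __init__(self, view) -> None:
        self.view = view
        self.current_lot = None

    def enter_lot(self, lot_id: int) -> None:
        self.current_lot = lot_id
        self.view.show(f"entered lot {lot_id}")

    def chat(self, text: str) -> str:
        self.view.show(f"you say: {text}")
        return text


# A script drives gameplay-level commands; the view is irrelevant to it.
test_client = ClientLogic(NullView())
test_client.enter_lot(7)
test_client.chat("hello")
print(test_client.current_lot)
```

Because the script calls `enter_lot` and `chat` rather than simulating mouse clicks or internal events, it keeps working when the UI layout or presentation layer changes.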

Common Gotchas
• Setting the Test bar too high, too early
  • Feature drift == expensive test maintenance
  • Code is built incrementally: reporting failures nobody is prepared to deal with wastes everybody’s time
• Non-determinism
  • Race conditions, dirty buffers/processState, …
  • Developers test with a single client against a single server: no chance to expose race conditions
• Not designing for testability
  • Testability is an end requirement
  • Retrofitting is expensive
• No senior engineering committed to the testing problem


Summary: Mechanics & Implications
• Scripted test clients and instrumented code rock!
• Collection, aggregation and display of test data is vital in making decisions on a day-to-day basis
• Lessen the panic
• Scale & Break is a very clarifying experience
• Stable code & servers in development greatly ease the pain of building an MMP game
• Hard data (not opinion) is both illuminating and calming
• Long-term operations: testing is a recurring cost

Summary: Process
• Integrate automated testing at all levels
  • Don’t just throw testing over the wall to QA monsters
• Use automation to speed & focus development
  • Stability: Sniff Test, Monkey Tests
  • Scale: Load Test

Tabula Rasa
• PreCheckin SniffTest
  • Keep Mainline Working
• Hourly Monkey Tests
  • Baseline for Developers
• Dedicated Tools Group
  • Easy to Use == Used
• Executive Support
  • Radical Shifts in Process
• Load Test: Early & Often
  • Break It Before Live
• Distribute Test Development & Ownership Across Full Team

Cautionary Tales
• Flexible Game Development Requires Flexible Tests
• Signal To Noise Ratio
• Defects & Variance In The Testing System

Questions (15 Minutes)
Slides online @ www.maggotranch.com/MMP
