Create multi-user virtual worlds with increasing complexity and interactions. Utilize advanced compute systems and hybrid techniques for rich object dynamics across large environments. Manage world state changes, city road schemes, and dynamic object computations effectively. Enhance scalability, performance, and cross-platform communication using cutting-edge technologies and libraries.
Multi-user Extensible Virtual Worlds
Increasing complexity of objects and interactions with increasing world size, users, numbers of objects and types of interactions.
Sheldon Brown, Site Director CHMPR, UCSD
Daniel Tracy, Programmer, Experimental Game Lab
Erik Hill, Programmer, Experimental Game Lab
Todd Margolis, Technical Director, CRCA
Kristen Kho, Programmer, Experimental Game Lab
Current schemes using compute clusters break virtual worlds into small “shards” which have a few dozen interacting objects. Compute systems with large amounts of coherent addressable memory alleviate cluster node jumping and can create worlds with data complexity several orders of magnitude higher: tens of thousands of entities vs. dozens per shard. Takes advantage of hybrid compute techniques for richer object dynamics.
Central server manages world state changes. Number of clients and amount of activity determine world size and shape.
City road schemes are computed for each player when they enter a new city, using Hybrid multicore compute accelerators
Each player has several views of the world:
• Partial view of one city
• Total view of one city
• Partial view of two cities
• View of entire globe
Within a city are several thousand objects. The dynamics of these objects are computed on the best available resource, balancing computability and coherency and alleviating world sharding.
Many classes of computing devices are used:
• z10 mainframe: transaction processing, state management
• Server-side compute accelerators: NVidia Tesla, Cell processor and x86
• Multi-core portable devices (e.g. Snapdragon-based cell phone)
• Varied desktop computation including hybrid multicore
• Computing cloud data storage
Increasing complexity of objects and interactions with increasing world size, users, numbers of objects and types of interactions. Multiple 10gb interfaces to compute accelerators, storage clusters and compute cloud. Cell Processor, x86 and GPU compute accelerators for asset transformation, physics and behaviors. Server services are distributed across cloud clusters, and redistributed across clients as performance or local work necessitates. Coherency with overall system is pursued, managed by centralized server. Virtual world components have dynamic tolerance levels for discoherency and latency.
Development Server Framework 5/2010
• 2 QS22 blades: 4 Cell processors
• 2 HS22 blades: 4 Xeons
• 3 10gb interfaces to compute accelerators
• 1 10gb interface to internet
• Many clients
• z10 mainframe computer at San Diego Supercomputer Center
• 2 IFLs with 128mb RAM, zVM virtual OS manager with Linux guests
• 6 tb fast local storage, 15K disks
• 4 SR and 2 LR 10gb ethernet interfaces
• 4 QS20 blades
• nVidia Tesla accelerator: 4 GPUs on Linux host, external dual PCI connection
Multi-user Extensible Virtual Worlds Producing a multi-user networked virtual world from a single-player environment
Goals
Feasibility
• Transformation from single-player program to client/server multi-player networking is non-trivial
• Structured methodology for transformation required
Scalability
• Support large environments, massively multi-player
• After working version, iteratively tackle bottlenecks
Multi-platform server
• Explore z10, x86, CellBE, Tesla accelerators
• Cross-platform communication required
Evaluate “drop in” solutions • Benefits and liabilities of client/server side schemes such as OpenSIM and Darkstar.
The (Original) Scalable City Technology Infrastructure
• OpenGL, Direct3D
• Ogre3D: real-time 3D rendering engine
• ERSATZ: custom virtual reality engine
• ODE, Newton: open source physics libraries
• fmod: sound library
• Intel OpenCV: real-time computer vision
• CGAL: Computational Geometry Library
• Loki, Xerces, Boost: utilities libraries
• Autodesk Maya, 3DMax: procedural assets creation through our own plug-ins
• Chromium, DMX, Sage: distributed rendering libraries
• NVIDIA FX Composer, ATI Render Monkey: IDEs for HLSL and GLSL, GPU programming
Serial pipeline. Increase performance by increasing CPU speed.
Moore’s law computational gains have not been achievable via faster clock speeds for the past 8 years.
• Multicore computing is the tactic
• New computing architectures
• New algorithmic methods
• New software engineering
• New systems designs
Example processors:
• nVidia Fermi GPGPU: 16 units with 32 cores each
• IBM System z processor: 4 cores, 1 service processor
• Sony/Toshiba/IBM Cell BE Processor: 1 PPU, 8 SPUs per chip
• Intel Larrabee Processor: 32 x86 cores per chip
The Scalable City Next Stage Technology Infrastructure
• Cell processors compute dynamic assets
• Intel OpenCV: real-time computer vision
• ERSATZ ENGINE
• Computational Geometry Library
• Abstract physics to use multiple physics libraries (ODE, Bullet, etc.). Replace computational bottlenecks in these libraries with data parallel operations.
• Data parallel stage: input data, n threads + SIMD, thread barrier, output data
• Fmod: sound library
• Convert assets to data parallel meshes after physics transformation; boosts rendering ~33%
• Ogre3D: scene graph
• Open source libraries need work to add data-level parallelism
The Scalable City Next Stage Technology Infrastructure, with DarkStar Server
Same components as the previous slide, with a DarkStar Server added. Maxes out at about 12 clients for a world as complex as Scalable City.
Real Xtend or Linden Client, Open Sim Server, ERSATZ ENGINE
Systems are not designed for interaction of 10,000’s of dynamic objects. Even a handful of complex objects overload dynamics computation. Extensive re-engineering is needed to provide this capability and to use hybrid multicore infrastructure, defeating the purpose of a general-purpose platform.
Challenges & Approach
Software Engineering Challenges:
• SC: Large, complex, with many behaviors.
• Code consisted of tightly coupled systems not conducive to separation into client and server.
• Multi-user support takes time, and features will be expanded by others simultaneously!
Basic Approach - Agile methodology:
• Incrementally evolve single-user code into a system that can be trivially made multi-user in the final step.
• Always have a running and testable program.
• Test for unwanted behavioral changes at each step.
• Allows others to expand features simultaneously.
Step by Step Conversion Data-structure focused: is it client or server? Some data structures may have to be split.
Landscape Manager BlackBoard (Singleton) Rendering Data Structures Clouds Physics Player Inverse Kinematics Camera House Piece Audio Road Animation User Input House Lots Visual Component MeshHandler
Abstracting Client & Server Object Representations Server: Visual Component Visual asset representation on the server side Consolidates task of updating clients Used for house pieces, cyclones, landscape, roads, fences, trees, signs (animated, static, dynamic). Dynamic, run-time properties control update behavior Client: Mesh Mesh properties communicated from Visual Component Used to select rendering algorithm Groups assets per city for quick de-allocation
Step by Step Conversion Data-structure focused: is it client or server? Some data structures may have to be split. All data access paths must be segmented into c/s Cross-boundary calls recast as buffered communication.
Data Access Paths Systems access world state via the Blackboard (singleton pattern) After separating into Client & Server Blackboard, Server systems must be weaned off of Client Blackboard and vice versa. Cross-boundary calls recast as buffered communication.
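A minimal sketch of what "cross-boundary calls recast as buffered communication" could look like. The `Blackboard` class, its `post`/`drain` methods, and the `deliver` helper are hypothetical names invented for illustration; the slides do not show the project's actual interfaces.

```python
from collections import deque


class Blackboard:
    """Holds one side's world state plus an outbox of messages for the other side."""

    def __init__(self):
        self.state = {}
        self.outbox = deque()  # messages waiting to cross the client/server boundary

    def post(self, kind, payload):
        # Instead of calling into the other side directly, queue a message.
        self.outbox.append((kind, payload))

    def drain(self):
        # Hand over every queued message in order and empty the outbox.
        messages = list(self.outbox)
        self.outbox.clear()
        return messages


def deliver(messages, target):
    # Apply messages to the receiving blackboard at one defined point in its loop.
    for kind, payload in messages:
        target.state[kind] = payload


server_bb = Blackboard()
client_bb = Blackboard()

# A server-side system records a player position; it never touches client_bb.
server_bb.post("player_position", (10.0, 0.0, 4.5))
deliver(server_bb.drain(), client_bb)
```

Because neither side reads the other's state directly, the `deliver` step is the only place that later needs to be replaced by network code.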
Step by Step Conversion Data-structure focused: is it client or server? Some data structures may have to be split. All data access paths must be segmented into c/s Cross-boundary calls recast as buffered communication. Initialization & run loop separation Dependencies on order must be resolved.
Initialization & Run-loop
Original order: Initialize Graphics, Initialize Physics, Init Loading Screen, Load Landscape Data, Initialize Clouds, Create Roads, Place Lots, Place House Pieces, Place Player, Get Camera Position
Reordered: Initialize Physics, Load Landscape Data, Create Roads, Place Lots, Place House Pieces, Place Player, then Initialize Graphics, Init Loading Screen, Initialize Clouds, Get Camera Position
Step by Step Conversion Data-structure focused: is it client or server? Some data structures may have to be split. All data access paths must be segmented into c/s Cross-boundary calls recast as buffered communication. Initialization & run loop separation Dependencies on order must be resolved. Unify cross-boundary comm. to one subsystem. This will interface with network code in the end.
Unify Communication
Diagram labels: ReadClient, ReadServer, MovePlayer, Transforms, Animations, Render, Physics/IK, UserInput, WriteClient, WriteServer
Single buffer, common format, ordered messages
Communicate in one stage: solve addiction to immediate answers
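One way to picture "single buffer, common format, ordered messages" is a byte buffer of framed messages that is written during the loop and read back in a single stage. This is a sketch under assumptions: the `MessageBuffer` class and the 6-byte header layout (type id plus payload length) are invented here, not taken from the project.

```python
import struct

# Hypothetical common header: message type id (uint16), payload length (uint32).
HEADER = struct.Struct("<HI")


class MessageBuffer:
    """A single buffer that preserves the order in which messages were written."""

    def __init__(self):
        self._data = bytearray()

    def write(self, msg_type, payload):
        # Every message shares the same framing, whatever subsystem wrote it.
        self._data += HEADER.pack(msg_type, len(payload)) + payload

    def read_all(self):
        # Consume the whole buffer in one stage, returning messages in write order.
        messages = []
        offset = 0
        while offset < len(self._data):
            msg_type, length = HEADER.unpack_from(self._data, offset)
            offset += HEADER.size
            messages.append((msg_type, bytes(self._data[offset:offset + length])))
            offset += length
        self._data.clear()
        return messages


buf = MessageBuffer()
buf.write(1, b"move")
buf.write(2, b"anim")
received = buf.read_all()
```

Reading everything at once means no subsystem can expect an immediate reply in the middle of a cycle, which is the habit the slide calls an "addiction to immediate answers".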
Step by Step Conversion Data-structure focused: is it client or server? Some data structures may have to be split. All data access paths must be segmented into c/s Cross-boundary calls recast as buffered communication. Initialization & run loop separation Dependencies on order must be resolved. Unify cross-boundary comm. to one subsystem. This will interface with network code in the end. Final separation of client & server into two programs Basic networking code allows communication
Separate
Two programs, plus basic synchronous networking code
Loops truly asynchronous (previously one called the other)
Step by Step Conversion • Data-structure focused: is it client or server? • Some data structures may have to be split. • All data access paths must be segmented into c/s • Cross-boundary calls recast as buffered communication. • Initialization & run loop separation • Dependencies on order must be resolved. • Unify cross-boundary comm. to one subsystem. • This will interface with network code in the end. • Final separation of client & server into two programs • Basic networking code allows communication • Optimize! • New configuration changes behavior even for single player
Experience
Positives
• Smooth transition to multi-user possible
• All features/behaviors retained or explicitly disabled
• Feature development continued successfully during transition (performance, feature, and behavioral enhancements on both client and server side, CAVE support, improved visuals, machinima engine, etc).
Negatives
• Resulting code structure not ideal for client/server application (no MVC framework, some legacy structure).
• Feature development and client/server work sometimes clash, requiring re-working in client/server fashion.
Initial Optimizations Basic issues addressed in converting to a massively multi-user networked model
Multi-User Load Challenges Communications Graphics Rendering Geometry Processing Shaders Rendering techniques Dynamics Computation Physics AI or other application specific behaviors Animation
Communication In a unified system, subsystems can share data and communicate quickly. In a Client/Server model, subsystems on different machines have to rely on messages sent over the network Data marshalling overhead Data unmarshalling overhead Bandwidth/latency limitations
New Client Knowledge Model Stand-Alone version had all cities in memory All clients received updates for activity in all cities Increased memory & bandwidth use as environment scales Now: Clients only given cities they can see City assets dynamically loaded onto client as needed Reduces the updates the clients need Further Challenge: Dynamically loading cities without server or client hiccups.
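The "clients only given cities they can see" rule amounts to filtering updates by visibility before sending. A minimal sketch, assuming updates are tagged with a city name; the function name and the tuple layout are hypothetical.

```python
def updates_for_client(visible_cities, pending_updates):
    """Keep only the updates whose city the client can currently see."""
    visible = set(visible_cities)
    return [update for update in pending_updates if update[0] in visible]


# Each update is (city, description); the contents are made up for illustration.
pending = [
    ("city_a", "house_piece_moved"),
    ("city_b", "road_animated"),
    ("city_c", "cyclone_moved"),
]
for_alice = updates_for_client(["city_a", "city_b"], pending)
```

Here the client that sees `city_a` and `city_b` never receives the `city_c` update, so its bandwidth and memory use depend on what it can see rather than on the size of the whole world.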
Communication Challenges More Clients leads to: More activity Physics object movements Road/Land Animations House Construction More communication Per client due to increase in activity More clients for server to keep up to date Server communication = activity x clients! Dynamically loading large data sets (cities in this case) without server or client hiccups
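A small worked example of "server communication = activity x clients". It assumes, for illustration only, that each client generates a fixed number of events per cycle and that every event is relayed to every client; the function name and the numbers are hypothetical.

```python
def server_messages_per_cycle(events_per_client, clients):
    """Messages the server sends per cycle if every event goes to every client."""
    activity = events_per_client * clients  # total activity grows with client count
    return activity * clients               # each event is relayed to every client


small = server_messages_per_cycle(events_per_client=5, clients=10)
large = server_messages_per_cycle(events_per_client=5, clients=20)
```

Under these assumptions, doubling the clients from 10 to 20 raises the server's outgoing messages from 500 to 2000 per cycle, a fourfold increase.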
Communication Subsystem Code-generation for data marshalling Fast data structure serialization Binary transforms for cross-platform Token or text-based too slow Endian issues resolved during serialization Tested on z10, Intel Asynchronous reading and writing Dedicated threads perform communication Catch up on all messages each game cycle
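To illustrate resolving endian issues during serialization, the sketch below fixes the byte order in the wire format so that a big-endian host (such as the z10) and a little-endian host (such as Intel x86) decode the same bytes identically. The record layout and function names are assumptions, and this hand-written version does not reproduce the project's code generation.

```python
import struct

# Hypothetical wire layout: object id (uint32), then x, y, z as float32.
# "!" forces big-endian (network) byte order regardless of the host CPU.
TRANSFORM = struct.Struct("!Ifff")


def pack_transform(object_id, x, y, z):
    # Serialize to a fixed 16-byte binary record.
    return TRANSFORM.pack(object_id, x, y, z)


def unpack_transform(data):
    # Deserialize; the byte order comes from the format, not the host.
    return TRANSFORM.unpack(data)


wire = pack_transform(7, 1.0, 2.0, 0.5)
decoded = unpack_transform(wire)
```

A fixed-size binary record like this avoids the parsing cost of token or text formats, which the slide reports as too slow.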
Reducing Data Marshalling Time Reduce use of per-player queues: Common messages sent to a queue associated with the event’s city Players receive buffers of each city they see, in addition to their player-specific queue. Perform buffer allocation, data marshalling, & copy once for many players. Significantly reduces communication overhead for server.
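A sketch of the per-city queue idea: an event is marshalled once into its city's buffer, and every player who sees that city receives the same bytes. `CityBroadcaster`, its methods, and the event layout are hypothetical; `marshal_count` exists only to make the saving visible.

```python
import struct

EVENT = struct.Struct("!If")  # hypothetical layout: object id (uint32), value (float32)


class CityBroadcaster:
    def __init__(self):
        self.city_buffers = {}  # city name -> list of already-marshalled messages
        self.marshal_count = 0  # how many times marshalling was performed

    def publish(self, city, object_id, value):
        # Marshal once, regardless of how many players can see the city.
        data = EVENT.pack(object_id, value)
        self.marshal_count += 1
        self.city_buffers.setdefault(city, []).append(data)

    def buffers_for(self, visible_cities):
        # Every player who sees a city gets references to the same bytes objects.
        return [msg for city in visible_cities
                for msg in self.city_buffers.get(city, [])]


broadcaster = CityBroadcaster()
broadcaster.publish("city_a", 42, 1.5)
alice = broadcaster.buffers_for(["city_a"])
bob = broadcaster.buffers_for(["city_a", "city_b"])
```

With per-player queues the same event would be marshalled once per player; here two players share one marshalled message.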
Preventing Stutters Send smaller chunks of data Break up large messages Incrementally load cities as a player approaches them Space out sending assets over many cycles Large geometry (landscape) subdivided If player arrives, finish all transfers Prevent disk access on client Pre-load resources
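Breaking up large messages and spacing them over many cycles could be sketched as follows. The `AssetSender` class, the one-chunk-per-cycle policy, and the chunk size are assumptions chosen for illustration.

```python
def chunk_payload(payload, chunk_size):
    """Split a large payload into pieces of at most chunk_size bytes."""
    return [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]


class AssetSender:
    """Sends one chunk per game cycle, or everything left once the player arrives."""

    def __init__(self, payload, chunk_size):
        self.pending = chunk_payload(payload, chunk_size)

    def next_cycle(self, player_arrived=False):
        if player_arrived:
            # The player has reached the city: finish all remaining transfers now.
            sent, self.pending = self.pending, []
        else:
            # Otherwise spread the load: send a single chunk this cycle.
            sent, self.pending = self.pending[:1], self.pending[1:]
        return sent


sender = AssetSender(b"x" * 10, chunk_size=4)
first = sender.next_cycle()
rest = sender.next_cycle(player_arrived=True)
```

The first cycle sends only one small chunk, so neither side stalls on a large transfer; the arrival flag forces the remainder through when the data can no longer wait.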