Learn how EVE, the massive MMORPG space game, relies on Stackless Python, boosting its concurrency and performance. Explore the unique features of Stackless Python and its tight integration with EVE's architecture.
Stackless Python in EVE Kristján Valur Jónsson kristjan@ccpgames.com CCP Games inc.
EVE • MMORPG Space game • Client / server • Single shard massive server • 120,000 active players, >24,000 concurrent users • World concurrency record on a shard • Relies on Stackless Python
The Tranquility cluster • 400 GHz CPU / 200 GB RAM • 2 Routers (CISCO Alteon) • 14 Proxy servers (IBM Blade) • 55 Sol servers (IBM x335) • 2 DB servers (clustered, IBM Brick x445) • FastT600 Fiber, 56 x FC 15k disks, DS4300 + 3*EXP700 • Windows 2000, MS SQL Server • Currently being upgraded • AMD x64
EVE Architecture • COM-like basic architecture • Python tightly integrated at an early stage • Home-grown wrapping of BLUE objects
Stackless Python • Tasklets • Threads of execution. Not OS threads • Lightweight • No pre-emption • Channels • Tasklet rendezvous point • Data passing • Scheduling • Synchronization
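The tasklet idea can be approximated in plain Python with generators. The sketch below is not the Stackless API: `run_round_robin` and `worker` are hypothetical names, and each `yield` stands in for a voluntary switch point. It only illustrates "lightweight threads of execution with no pre-emption".

```python
from collections import deque

def run_round_robin(tasklets):
    """Run generator-based 'tasklets' until all finish; return what they yielded."""
    log = []
    runnable = deque(tasklets)
    while runnable:
        t = runnable.popleft()
        try:
            log.append(next(t))   # runs until the tasklet's next explicit yield
        except StopIteration:
            continue              # tasklet finished: drop it
        runnable.append(t)        # still alive: back of the queue
    return log

def worker(name, steps):
    for i in range(steps):
        yield f"{name}:{i}"       # explicit switch point; nothing pre-empts us

log = run_round_robin([worker("a", 2), worker("b", 3)])
print(log)
```

Each worker runs undisturbed between its own yields, which is the property the later slides rely on.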
Stackless? • No C stack • Python stack in linked frame objects • Tasklet switching by swapping frame chain • Compromise • stackless where possible. • C stack whisked away if necessary
Channel semantics • Send on a channel with no receiver blocks the sending tasklet. • Send on a channel with a (blocked) receiver suspends the sender and runs the receiver immediately. Sender runs again in due course. • Symmetric wrt. Send and Receive. • “balance”, can have a queue of readers or writers. • Conceptually similar to Unix pipes
Channel semantics, cont. • Scheduling semantics are precise: • A blocked tasklet is run immediately when a channel operation unblocks it • Usable as a building block: • semaphores • mutex • critical section • condition variables
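The channel rules on these two slides can be sketched with a toy scheduler. This is a plain-Python emulation under stated assumptions, not `stackless.channel`: the `Channel` and `Scheduler` classes, and the convention that a tasklet yields `("send", ch, value)` or `("receive", ch, None)`, are invented for illustration. On the receive side the sketch lets the receiver continue and queues the unblocked sender, which is one possible reading of "symmetric".

```python
from collections import deque

class Channel:
    """Toy rendezvous point (hypothetical; not the real stackless.channel)."""
    def __init__(self):
        self.senders = deque()     # blocked (tasklet, value) pairs
        self.receivers = deque()   # blocked tasklets

    @property
    def balance(self):
        # Positive: senders are queued. Negative: receivers are queued.
        return len(self.senders) - len(self.receivers)

class Scheduler:
    def __init__(self):
        self.runnable = deque()    # (tasklet, value to resume it with)

    def spawn(self, gen):
        self.runnable.append((gen, None))

    def run(self):
        while self.runnable:
            gen, value = self.runnable.popleft()
            try:
                op, ch, payload = gen.send(value)
            except StopIteration:
                continue
            if op == "send":
                if ch.receivers:
                    receiver = ch.receivers.popleft()
                    self.runnable.append((gen, None))              # sender resumes later
                    self.runnable.appendleft((receiver, payload))  # receiver runs next
                else:
                    ch.senders.append((gen, payload))              # no receiver: block
            else:  # "receive"
                if ch.senders:
                    sender, data = ch.senders.popleft()
                    self.runnable.append((sender, None))           # sender resumes later
                    self.runnable.appendleft((gen, data))          # receiver continues
                else:
                    ch.receivers.append(gen)                       # no sender: block

log = []

def producer(ch):
    for i in range(2):
        log.append(f"send {i}")
        yield ("send", ch, i)

def consumer(ch):
    for _ in range(2):
        value = yield ("receive", ch, None)
        log.append(f"got {value}")

ch = Channel()
sched = Scheduler()
sched.spawn(consumer(ch))
sched.spawn(producer(ch))
sched.run()

# A send with nobody listening leaves the sender queued on the channel.
def lonely_sender(c):
    yield ("send", c, "unread")

lonely = Channel()
sched2 = Scheduler()
sched2.spawn(lonely_sender(lonely))
sched2.run()
print(log, lonely.balance)
```

Because the wake-up order is fully determined, primitives such as a mutex or semaphore can be layered on top by treating a blocked receive as "waiting for the lock".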
Stackless in EVE • BLUE foundation: robust, but cumbersome • RAD • Stackless Python: Python and so much more • EVE is inconceivable without Stackless • Everyone is a programmer
The main loop • Establish stackless context
    int WinMain(...) {
        PyObject *myApp = new EveApp();
        PyObject *r = PyStackless_CallMethod_Main(myApp, "WinMain", 0);
        return PyInt_AsLong(r);
    }
The main loop cont.
    PyObject* EveApp::WinMain(PyObject *self, PyObject *args) {
        PyOS->ExecFile("script:/sys/autoexec.py");
        MSG msg;
        while (PeekMessage(&msg, 0, 0, 0, PM_REMOVE)) {
            TranslateMessage(&msg);
            DispatchMessage(&msg);
        }
        for (TickIt i = mTickers.begin(); i != mTickers.end(); i++)
            i->mCb->OnTick(mTime, (void*)taskname);
    }
• Regular Windows message loop • Runs in Stackless context • The “Main Tasklet”
Autoexec.py
    import blue

    def Startup():
        import service
        srvMng = service.ServiceManager()
        run = ["dataconfig", "godma", "ui", …]
        srvMng.Run(run)

    #Start up the client in a tasklet!
    if CheckDXVersion():
        import blue
        blue.pyos.CreateTasklet(Startup, (), {})
Tickers • Tickers are BLUE modules: • Trinity (the renderer) • Netclient • DB (on the server) • Audio • PyOS (special python services) • …
The PyOS tick: • Runs fresh tasklets • (sleepers awoken elsewhere)
    Tick() {
        …
        mSynchro->Tick();
        PyObject *watchdogResult;
        do {
            watchdogResult = PyStackless_RunWatchdog(20000000);
            if (!watchdogResult)
                PyFlushError("PumpPython::Watchdog");
            Py_XDECREF(watchdogResult);
        } while (!watchdogResult);
blue.pyos.synchro • Synchro: • Provides Thread-like tasklet utilities: • Sleep(ms) • Yield() • BeNice()
blue.pyos.synchro cont. • Sleep: A Python script makes the call blue.pyos.synchro.Sleep(200) • C++ code runs: • Main tasklet check • sleeper = new Sleeper(); mSleepers.insert(sleeper); PyObject *r = PyChannel_Receive(sleeper->mChannel); • Another tasklet runs
blue.pyos.synchro, ctd. • Main tasklet in Windows loop enters PyOS::Tick() • mSleepers are examined; for each one that is due we do: mSleepers.remove(sleeper); PyChannel_Send(sleeper->mChannel, Py_None); • Main tasklet is suspended (but runnable), sleeper runs.
Points to note: • A tasklet goes to sleep by calling PyChannel_Receive() on a channel which has no pending sender. • It will sleep there (block) until someone sends • Typically the main tasklet does this, doing PyChannel_Send() on a channel with a reader • Ergo: The main tasklet may not block
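The sleep mechanism described on the last three slides can be sketched in plain Python. This is an emulation with a simulated clock, not the BLUE implementation: the `Synchro` class, its `tick` method, and the convention that a tasklet yields the number of milliseconds it wants to sleep are assumptions made for the example. The `tick` call plays the role of the main tasklet, which never sleeps itself.

```python
import heapq
from collections import deque

class Synchro:
    """Hypothetical stand-in for the sleeper bookkeeping described above."""
    def __init__(self):
        self.now = 0             # simulated clock, in ms
        self.sleepers = []       # heap of (due_time, sequence, tasklet)
        self.runnable = deque()
        self._seq = 0            # tie-breaker so tasklets are never compared

    def spawn(self, gen):
        self.runnable.append(gen)

    def tick(self, elapsed_ms):
        """One pass of the 'main tasklet': wake due sleepers, then run them."""
        self.now += elapsed_ms
        while self.sleepers and self.sleepers[0][0] <= self.now:
            _, _, tasklet = heapq.heappop(self.sleepers)
            self.runnable.append(tasklet)
        while self.runnable:
            tasklet = self.runnable.popleft()
            try:
                ms = next(tasklet)   # tasklet runs until it asks to sleep
            except StopIteration:
                continue
            self._seq += 1
            heapq.heappush(self.sleepers, (self.now + ms, self._seq, tasklet))

log = []

def blinker(name, ms, count, synchro):
    for _ in range(count):
        log.append((synchro.now, name))
        yield ms                     # stands in for Sleep(ms)

s = Synchro()
s.spawn(blinker("fast", 100, 3, s))
s.spawn(blinker("slow", 200, 2, s))
for _ in range(5):
    s.tick(100)                      # the main loop ticking every 100 ms
print(log)
```

Only `tick` ever wakes a sleeper, which mirrors the point that the main tasklet must stay unblocked.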
Socket Receive • Use Windows asynchronous file API • Provide a synchronous Python API. A Python script calls Read(). • Tasklet may be blocked for a long time (many frames); other tasklets continue running. • Do this using channels.
Receive, cont. • Python script runs: foo, bar = socket.Read() • C code executes the request: Request *r = new Request(this); WSARecv(mSocket, …); mServe->insert(r); PyChannel_Receive(r->mChannel); • Tasklet is suspended
Receive, cont. • Socket server is ticked from main loop • For all requests that are marked completed, it transfers the data to the sleeping tasklets: PyObject *r = PyString_FromStringAndSize(req->mData, req->mDataLen); PyChannel_Send(req->mChannel, r); Py_DECREF(r); delete req; • The sleeping tasklet wakes up, main tasklet is suspended (but runnable)
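The same pattern, a synchronous-looking read on top of asynchronous completion, can be sketched without any real sockets. Everything here is hypothetical: `Request`, `SocketServer`, and the convention that a tasklet yields the request it is waiting on. Completion is simulated by setting a flag where the OS would signal it.

```python
from collections import deque

class Request:
    """Hypothetical pending read; 'completed' would be set by the OS."""
    def __init__(self):
        self.completed = False
        self.data = None
        self.tasklet = None      # the tasklet blocked on this request

class SocketServer:
    def __init__(self):
        self.pending = []
        self.runnable = deque()  # (tasklet, value to resume it with)

    def spawn(self, gen):
        self.runnable.append((gen, None))

    def tick(self):
        """Called from the main loop: deliver completed reads, run tasklets."""
        still_pending = []
        for req in self.pending:
            if req.completed:
                self.runnable.append((req.tasklet, req.data))
            else:
                still_pending.append(req)
        self.pending = still_pending
        while self.runnable:
            gen, value = self.runnable.popleft()
            try:
                req = gen.send(value)    # tasklet runs until it waits on a read
            except StopIteration:
                continue
            req.tasklet = gen
            self.pending.append(req)

log = []

def reader(name, req):
    log.append(f"{name} waiting")
    data = yield req                     # reads like a blocking Read() call
    log.append(f"{name} got {data!r}")

server = SocketServer()
r1, r2 = Request(), Request()
server.spawn(reader("a", r1))
server.spawn(reader("b", r2))
server.tick()                            # both tasklets block on their reads
r2.completed, r2.data = True, b"pong"    # simulate the second read finishing first
server.tick()
r1.completed, r1.data = True, b"ping"
server.tick()
print(log)
```

The script-level code never sees a callback; it simply resumes after the `yield` with the data in hand.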
Main Tasklet • The one running the Windows loop • Can be suspended, allowing other tasklets to run • Can be blocked, as long as there is another tasklet to unblock it (dangerous) • Is responsible for waking up Sleepers, Yielders, IO tasklets, etc., and therefore cannot be one of them • Is flagged as non-blockable (stackless.getcurrent().block_trap = True)
Channel magic • Channels perform the stackless context switch. • If there is a C stack in the call chain, it will magically swap the stacks. • Your entire C stack (with C and python invocations) is whisked away and stored, to be replaced with a new one. • This allows stackless to simulate cooperative multi-threading
Co-operative multitasking • Context is switched only at known points. • In Stackless, this is channel.send() and channel.receive() • Also synchro.Yield(), synchro.Sleep(), BeNice(), socket and DB ops, etc. • No unexpected context switches • Almost no race conditions • Program like you are single-threaded • Very few exceptions. • This extends to C state too!
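A small plain-Python sketch of why known switch points remove most races. The names `deposit` and `run_all` are hypothetical, and each `yield` stands in for a Stackless switch point. A read-modify-write is safe as long as no switch point sits between the read and the write; the "few exceptions" appear only when one does.

```python
from collections import deque

def run_all(tasklets):
    """Round-robin over generator-based tasklets until all have finished."""
    runnable = deque(tasklets)
    while runnable:
        t = runnable.popleft()
        try:
            next(t)
        except StopIteration:
            continue
        runnable.append(t)

def deposit(account, amount, switch_mid_update):
    balance = account["balance"]             # read
    if switch_mid_update:
        yield                                # switch point inside the update
    account["balance"] = balance + amount    # write
    yield                                    # switch point after the update

# No switch between read and write: updates cannot interleave.
safe = {"balance": 0}
run_all([deposit(safe, 10, False), deposit(safe, 10, False)])

# A switch point in the middle reintroduces the classic lost update.
racy = {"balance": 0}
run_all([deposit(racy, 10, True), deposit(racy, 10, True)])

print(safe["balance"], racy["balance"])
```

With pre-emptive threads the first case would need a lock; here the absence of a switch point is the lock.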
Tasklets • Tasklets are cheap • Used liberally to reduce perceived lag • UI events forked out to tasklets • A click can have heavy consequences. • Heavy logic • DB access • Network access • Special rendering tasks forked out to tasklets. • Controlling an audio track • “tasklet it out” • Use blue.pyos.synchro.BeNice() in large loops
Example: UI Event: • Main tasklet receives window messages such as WM_CLICK • Trinity invokes handler on UI elements or global handler • Handler “tasklets out” any action to allow main thread to continue immediately.
    def OnGlobalUp(self, *args):
        if not self or self.destroyed:
            return
        mo = eve.triapp.uilib.mouseOver
        if mo in self.children:
            uthread.new(mo._OnClick)

    class Action(xtriui.QuickDeco):
        def _OnClick(self, *args):
            pass
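A plain-Python sketch of "tasklet it out", combined with the advice to be nice in large loops. Nothing here is the EVE API: `new_tasklet` is a hypothetical stand-in for `uthread.new`, a bare `yield` stands in for `BeNice()`, and `pump` stands in for the scheduler run from the main loop.

```python
from collections import deque

runnable = deque()
log = []

def new_tasklet(func, *args):
    """Hypothetical stand-in for uthread.new: queue the work, return at once."""
    runnable.append(func(*args))

def heavy_click_logic(button):
    total = 0
    for i in range(1, 2001):
        total += i
        if i % 1000 == 0:
            yield                    # stands in for BeNice() in a large loop
    log.append(f"{button} done: {total}")

def on_global_up(button):
    # The handler does no heavy work itself, so the caller continues immediately.
    new_tasklet(heavy_click_logic, button)
    log.append(f"{button} handler returned")

def pump():
    """Run queued tasklets round-robin until all have finished."""
    while runnable:
        t = runnable.popleft()
        try:
            next(t)
        except StopIteration:
            continue
        runnable.append(t)

on_global_up("ok")
on_global_up("cancel")
pump()
print(log)
```

Both handlers return before either piece of heavy work starts, which is what keeps the message loop responsive.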
That’s all • For more info: • http://www.ccpgames.com • http://www.eve-online.com • http://www.stackless.com • kristjan@ccpgames.com