Multithreading in CASTOR
How to use pthreads without seeing them (almost…)
Giuseppe Lo Presti
DM technical meeting – July 1st, 2008

Outline
• Overview
  • Architecture and requirements
• A C++ framework for multithreading
  • Design and implementation
  • Some user code samples
  • The internals
• The framework in action
• Conclusions
Giuseppe Lo Presti, Multithreading in CASTOR - 2

Castor Architecture Overview
• Database centric
  • Stateless, redundant software components
  • State stored in a central database for scalability and fault resilience
• Technology choices
  • A number of multithreaded daemons perform all the tasks needed to serve user requests
  • Each operation is reflected in the database, so tasks are inherently I/O bound or, better, “latency bound”
    • Dominated by db/network latency
  • Concurrency issues are resolved in the database by using locks

High Level Requirements
• Multithreading to achieve better overall throughput in terms of requests/sec
  • System inherently superlinear because of I/O bound tasks
• Need for supporting thread pools
  • Each one dedicated to a different task
• Lightweight multithreading infrastructure
  • Limit the memory footprint of the daemons
  • Seamless integration with C++
• Very limited issues with synchronization and data sharing across different threads
  • Context data is always in the db
  • Each thread deals with a different request: a typical case of embarrassing parallelism

A Framework for Multithreading
• Choices
  • Usage of Linux POSIX threads
  • C++ package to hide pthreads complexity and provide a Java-like interface
    • IThread abstract class (cf. Java Runnable interface)
    • Specialized thread pools to implement different functionalities (e.g. request handling)
• Very high reusability across all software components
• Ability to have thread-safe and thread-shared variables
• Daemon mode with embedded signal handling
  • Support for graceful stop and restart of daemons

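To make the "Java-like interface" idea concrete, here is a minimal sketch of how a Runnable-style abstract class can sit on top of pthread_create(). It is not the CASTOR code: Runnable, LaunchArgs, runnableEntry, runAndJoin and Doubler are all hypothetical names chosen for illustration, and the real IThread interface may differ.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical stand-in for an IThread-like interface (name is an assumption).
class Runnable {
public:
  virtual ~Runnable() {}
  virtual void run(void* param) = 0;  // user code goes here
};

// Arguments handed to the C entry point (hypothetical helper type).
struct LaunchArgs {
  Runnable* task;
  void* param;
};

// pthread_create() needs a plain function; this one forwards to run().
extern "C" void* runnableEntry(void* raw) {
  LaunchArgs* args = static_cast<LaunchArgs*>(raw);
  args->task->run(args->param);
  return 0;
}

// Runs `task` in a new thread and waits for it; returns 0 on success.
int runAndJoin(Runnable& task, void* param) {
  LaunchArgs args = { &task, param };
  pthread_t tid;
  int rc = pthread_create(&tid, 0, runnableEntry, &args);
  if (rc != 0) return rc;
  return pthread_join(tid, 0);
}

// Example user class: doubles the integer it is given.
class Doubler : public Runnable {
public:
  virtual void run(void* param) {
    int* value = static_cast<int*>(param);
    *value *= 2;
  }
};

// Doubles `v` inside a separate thread and returns the result.
int doubledInThread(int v) {
  Doubler doubler;
  int rc = runAndJoin(doubler, &v);
  assert(rc == 0);
  (void)rc;  // silence the unused warning when asserts are disabled
  return v;
}
```

The user class never touches pthread_t or pthread_create(); only the small launcher does, which is the kind of separation the slide describes.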
Framework Implementation
• Usage of an existing OS abstraction layer: the Cthread API
  • Replicates the whole pthread API, and additionally provides thread-safe global variables
  • One of the most mature (read: old…) parts of the Castor codebase, shared by different projects in IT
• C++ code
  • Clean interface for the user: generic methods to compose daemons out of user classes
  • Cthread / pthread / system calls are kept hidden from user code, but still accessible for special cases
    • E.g. mutexes
• Typical use cases
  • Listening to a port and accepting requests
  • Polling the database for the next operation to perform

Simplified Class Diagram
[Figure: simplified class diagram, with the part labelled “Programmer’s interface”]

Main Classes
• Thread pools
  • ListenerThreadPool: generic socket connection dispatcher à la Apache
    • Specialized classes for TCP, UDP, … sockets
  • SignalThreadPool: pool manager for backend activities that need to run periodically or upon external signalling
    • The signalling mechanism is based on condition variables
  • ForkedProcessPool: pool manager based on fork(), not on pthreads, supporting pipes for IPC
• Classes for servers
  • BaseServer: basic generic server providing daemon mode (detach from shell) and logging initialization
  • BaseDaemon: more sophisticated base class for daemons, supporting system signal handling and any combination of the implemented thread pools

Code Samples
• Excerpt from the Monitoring daemon’s main()
• Different thread pools are mixed together
• The start() method from BaseDaemon spawns all the requested threads

    RmMasterDaemon daemon;
    ...
    // db threadpool
    daemon.addThreadPool(new castor::server::SignalThreadPool(
        "DatabaseActuator",
        new DbActuatorThread(daemon.clusterStatus()),
        updateInterval));
    daemon.getThreadPool('D')->setNbThreads(1);
    // update threadpool
    daemon.addThreadPool(new castor::server::UDPListenerThreadPool(
        "Update",
        new UpdateThread(daemon.clusterStatus()),
        listenPort));
    ...
    // Start daemon
    daemon.parseCommandLine(argc, argv);
    daemon.start();

Code Samples
• User threads
  • As easy as inheriting from IThread:

    class UpdateThread : public castor::server::IThread {
    public:
      virtual void run(void *param) throw();
      virtual void stop();
    };

• Typical pitfall: code is shared among all threads in each given pool
  • Mutex sections and synchronization must be implemented explicitly: there are no synchronized methods as in Java
  • Consequence: class variables are thread-shared, only local variables are thread-safe
• But you may need thread-safe singletons…
  • Our solution (provided by Cthreads): for each thread-safe global variable, keep a hash map indexed by TID

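The "map indexed by TID" idea can be sketched as follows. This is an illustration only, not the Cthread implementation: PerThreadInt, g_slot, slotWorker and perThreadDemo are hypothetical names, the container is an ordered std::map rather than a hash map, and the code assumes pthread_t is an integer type, which holds on Linux but is not guaranteed by POSIX.

```cpp
#include <pthread.h>
#include <map>
#include <cassert>

// Hypothetical per-thread integer: one slot per thread ID, guarded by a mutex.
// Assumption: pthread_t can be used as an ordered map key (true on Linux).
class PerThreadInt {
public:
  PerThreadInt() { pthread_mutex_init(&m_lock, 0); }
  ~PerThreadInt() { pthread_mutex_destroy(&m_lock); }

  // Stores a value visible only to the calling thread.
  void set(int value) {
    pthread_mutex_lock(&m_lock);
    m_slots[pthread_self()] = value;
    pthread_mutex_unlock(&m_lock);
  }

  // Returns the calling thread's value, or 0 if it never called set().
  int get() {
    pthread_mutex_lock(&m_lock);
    std::map<pthread_t, int>::const_iterator it = m_slots.find(pthread_self());
    int value = (it == m_slots.end()) ? 0 : it->second;
    pthread_mutex_unlock(&m_lock);
    return value;
  }

private:
  pthread_mutex_t m_lock;
  std::map<pthread_t, int> m_slots;
};

static PerThreadInt g_slot;  // one global object, a separate value per thread

// Worker stores 7 in its own slot and reports what it reads back.
extern "C" void* slotWorker(void* out) {
  g_slot.set(7);
  *static_cast<int*>(out) = g_slot.get();
  return 0;
}

// Main thread stores 1, the worker stores 7.
// Returns (main's value * 10 + worker's value), or -1 on error.
int perThreadDemo() {
  g_slot.set(1);
  int seenByWorker = -1;
  pthread_t tid;
  if (pthread_create(&tid, 0, slotWorker, &seenByWorker) != 0) return -1;
  if (pthread_join(tid, 0) != 0) return -1;
  return g_slot.get() * 10 + seenByWorker;
}
```

The map itself is shared, so it still needs a lock; what becomes thread-private is the value each thread sees. A sketch like this never removes the slots of finished threads, which a long-running daemon would have to handle.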
The Internals
• …So, where are the (p)threads?
  • BaseThreadPool serves as basic infrastructure
  • A friend function _thread_run() is the thread entry point, which runs the user code
  • All specialized thread pools use this function when spawning threads

    void* castor::server::_thread_run(void* param)
    {
      struct threadArgs *args = (struct threadArgs*)param;
      castor::server::BaseThreadPool* pool =
        dynamic_cast<castor::server::BaseThreadPool*>(args->handler);
      // Executes the thread
      try {
        pool->m_thread->run(args->param);
      } catch(castor::exception::Exception& any) {  // catch by reference to avoid slicing
        // error handling
      }
      return 0;  // a void* function must return a value
    }

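The try/catch in the entry point matters because an exception that escapes a thread's start routine ends the whole process through std::terminate(). The sketch below shows that boundary in isolation. Task, GuardedCall, guardedEntry, FailingTask and runFailingTask are hypothetical names, and the error text is invented for the demo; none of this is taken from the CASTOR sources.

```cpp
#include <pthread.h>
#include <stdexcept>
#include <string>
#include <cassert>

// Hypothetical task interface (name is an assumption).
struct Task {
  virtual ~Task() {}
  virtual void run() = 0;
};

// What the entry point needs: the task plus a place to record failures.
struct GuardedCall {
  Task* task;
  bool failed;
  std::string error;
};

// Entry point: catches everything so that no exception leaves the thread.
extern "C" void* guardedEntry(void* raw) {
  GuardedCall* call = static_cast<GuardedCall*>(raw);
  try {
    call->task->run();
  } catch (const std::exception& e) {
    call->failed = true;
    call->error = e.what();
  } catch (...) {
    call->failed = true;
    call->error = "unknown exception";
  }
  return 0;
}

// A task that always fails, standing in for user code that throws.
struct FailingTask : Task {
  virtual void run() { throw std::runtime_error("db connection lost"); }
};

// Runs the failing task in a thread; returns the error text that was captured.
std::string runFailingTask() {
  FailingTask task;
  GuardedCall call = { &task, false, "" };
  pthread_t tid;
  if (pthread_create(&tid, 0, guardedEntry, &call) != 0) return "spawn failed";
  pthread_join(tid, 0);
  return call.failed ? call.error : "no error";
}
```

The calling thread survives and can decide what to do with the recorded error, for example log it and let the pool reuse the thread.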
The Internals
• SignalThreadPool encapsulates pthread_create() calls and condition variables
  • Threads wait until a condition variable gets notified, or until a timeout has passed
    • pthread_cond_wait() and pthread_cond_signal()
  • One (or more) thread in the pool is woken up and executes the user code
  • The pool keeps track of the current number of busy threads

    void castor::server::SignalThreadPool::run() throw (...)
    {
      ...
      // create pool of detached threads
      for (int i = 0; i < m_nbThreads; i++) {
        if (Cthread_create_detached(
              castor::server::_thread_run, args) >= 0) {
          ++n;  // for later error handling
        }
      }
      ...
    }

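The "wait until notified or until a timeout passes" behaviour can be sketched with pthread_cond_timedwait(). This is a minimal illustration under assumptions, not the SignalThreadPool code: Notifier, waitingWorker and workerWokenBySignal are hypothetical names, and the boolean flag is added here so that a notification sent before the wait starts is not lost.

```cpp
#include <pthread.h>
#include <time.h>
#include <stdint.h>
#include <cassert>

// Hypothetical notifier: a flag protected by a mutex, plus a condition variable.
class Notifier {
public:
  Notifier() : m_signalled(false) {
    pthread_mutex_init(&m_mutex, 0);
    pthread_cond_init(&m_cond, 0);
  }
  ~Notifier() {
    pthread_cond_destroy(&m_cond);
    pthread_mutex_destroy(&m_mutex);
  }

  // Records a notification and wakes up one waiting thread.
  void signal() {
    pthread_mutex_lock(&m_mutex);
    m_signalled = true;
    pthread_cond_signal(&m_cond);
    pthread_mutex_unlock(&m_mutex);
  }

  // Blocks until signal() has been called or timeoutSec seconds have passed.
  // Returns true if notified, false on timeout or error.
  bool waitFor(int timeoutSec) {
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);  // timedwait takes an absolute time
    deadline.tv_sec += timeoutSec;

    pthread_mutex_lock(&m_mutex);
    int rc = 0;
    // The loop re-checks the flag to cope with spurious wake-ups.
    while (!m_signalled && rc == 0) {
      rc = pthread_cond_timedwait(&m_cond, &m_mutex, &deadline);
    }
    bool notified = m_signalled;
    m_signalled = false;  // consume the notification
    pthread_mutex_unlock(&m_mutex);
    return notified;
  }

private:
  pthread_mutex_t m_mutex;
  pthread_cond_t m_cond;
  bool m_signalled;
};

// Worker: reports 1 if woken by signal(), 0 if the 5 s timeout expired first.
extern "C" void* waitingWorker(void* raw) {
  Notifier* notifier = static_cast<Notifier*>(raw);
  intptr_t woken = notifier->waitFor(5) ? 1 : 0;
  return reinterpret_cast<void*>(woken);
}

// Starts a waiting worker, notifies it, and returns true if it was woken.
bool workerWokenBySignal() {
  Notifier notifier;
  pthread_t tid;
  if (pthread_create(&tid, 0, waitingWorker, &notifier) != 0) return false;
  notifier.signal();
  void* result = 0;
  if (pthread_join(tid, &result) != 0) return false;
  return reinterpret_cast<intptr_t>(result) == 1;
}
```

In a periodic pool the timeout doubles as the polling interval: a thread that is never notified still runs its user code once the wait expires.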
The Internals
• ForkedProcessPool encapsulates fork() calls, with children dispatch handled via select()

    void castor::server::ForkedProcessPool::init() throw (...)
    {
      // create pool of forked processes
      // we do it here so it is done before daemonization
      m_childPid = new int[m_nbThreads];
      for (int i = 0; i < m_nbThreads; i++) {
        ...
        castor::io::PipeSocket* ps = new castor::io::PipeSocket();
        m_childPid[i] = 0;
        int pid = fork();
        if (pid < 0) {
          ...  // error
        }
        else if (pid == 0) {  // child
          ...
          childRun(ps);  // this is a blocking call to the user code
          exit(EXIT_SUCCESS);
        }
        else {  // parent: save pipe and pid
          m_childPid[i] = pid;
          m_childPipe.push_back(ps);
          ps->closeRead();
          int fd = ps->getFdWrite();
          FD_SET(fd, &m_writePipes);  // prepare mask for select()
        }
      }
    }

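The parent/child split above can be reduced to a few system calls. The sketch below forks one child and hands it a single byte of "work" through a plain pipe; the child reports back through its exit status. It leaves out select() and the PipeSocket object streaming, and roundTripThroughChild is a hypothetical name, so treat it as an illustration of the mechanism rather than of the pool itself.

```cpp
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <cassert>

// Forks one child, sends it `value` through a pipe, and returns the child's
// exit status, which the child sets to value + 1. Returns -1 on any error.
int roundTripThroughChild(unsigned char value) {
  int fds[2];  // fds[0] is the read end, fds[1] is the write end
  if (pipe(fds) != 0) return -1;

  pid_t pid = fork();
  if (pid < 0) {
    close(fds[0]);
    close(fds[1]);
    return -1;
  }
  if (pid == 0) {
    // Child: keep only the read end and block until the parent writes.
    close(fds[1]);
    unsigned char received = 0;
    ssize_t n = read(fds[0], &received, 1);
    close(fds[0]);
    // _exit() skips the atexit handlers inherited from the parent.
    _exit(n == 1 ? received + 1 : 255);
  }

  // Parent: keep only the write end, as in the slide's closeRead().
  close(fds[0]);
  ssize_t written = write(fds[1], &value, 1);
  close(fds[1]);

  int status = 0;
  if (waitpid(pid, &status, 0) != pid) return -1;
  if (written != 1 || !WIFEXITED(status)) return -1;
  return WEXITSTATUS(status);
}
```

With several children, the parent would keep one write descriptor per child and use select() on that set to find a child whose pipe can accept the next request.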
The Internals
• BaseDaemon manages all threads and encapsulates the system signal handling
  • To avoid unpredictable behaviour, all threads need to be protected from signals via
        pthread_sigmask(SIG_BLOCK, &signalSet, NULL)
    where signalSet includes all the usual system signals
  • Yet another pthread performs customized system signal handling by looping on sigwait()
  • After spawning all user threads, the main thread waits for a notification from the dedicated signal handling thread, and broadcasts an appropriate message to all running threads
    • E.g. on SIGTERM, all user threads’ stop() methods are called; once the number of busy threads drops to 0, exit() is called
  • Forked children are told to stop via SIGTERM too

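The block-then-sigwait pattern described above can be sketched in a few lines. The signal is blocked before any thread is created, so every thread inherits the mask and only the dedicated thread ever sees the signal. signalWaiter and catchWithSigwait are hypothetical names, and the demo sends the signal to itself instead of waiting for an operator.

```cpp
#include <pthread.h>
#include <signal.h>
#include <unistd.h>
#include <stdint.h>
#include <cassert>

// Dedicated signal thread: waits synchronously for a signal in the set
// and returns its number (or -1 on error).
extern "C" void* signalWaiter(void* rawSet) {
  sigset_t* set = static_cast<sigset_t*>(rawSet);
  int received = 0;
  if (sigwait(set, &received) != 0) received = -1;
  return reinterpret_cast<void*>(static_cast<intptr_t>(received));
}

// Blocks `signo` in the calling thread (threads created later inherit the mask),
// starts the waiter, sends the signal to the process, and returns the signal
// number seen by the waiter, or -1 on error.
int catchWithSigwait(int signo) {
  sigset_t set;
  sigemptyset(&set);
  sigaddset(&set, signo);

  sigset_t previous;
  if (pthread_sigmask(SIG_BLOCK, &set, &previous) != 0) return -1;

  pthread_t tid;
  if (pthread_create(&tid, 0, signalWaiter, &set) != 0) {
    pthread_sigmask(SIG_SETMASK, &previous, 0);
    return -1;
  }

  // Process-directed signal, like an external `kill <pid>`. It stays pending
  // until sigwait() in the waiter thread picks it up.
  kill(getpid(), signo);

  void* result = 0;
  int rc = pthread_join(tid, &result);
  pthread_sigmask(SIG_SETMASK, &previous, 0);  // restore the caller's mask
  if (rc != 0) return -1;
  return static_cast<int>(reinterpret_cast<intptr_t>(result));
}
```

Calling catchWithSigwait(SIGUSR1) returns SIGUSR1 when everything works. A daemon would loop on sigwait() instead of returning, and would translate each signal into a notification for the main thread, as the slide describes for SIGTERM.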
The Internals
• Additional facilities in the framework
  • BaseDbThread implements the IThread interface and provides a graceful termination of a thread-specific database connection upon stop()
  • Mutex wraps common pthread functions to handle mutexes on integer variables
    • wait() and signal() methods provided
    • Generic mutexes on variables of any type are left to the user code
  • PipeSocket wraps a Unix pipe and allows object streaming between different processes (e.g. parent and children)

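As an illustration of what "a mutex on an integer variable" can look like, here is a small sketch of an integer guarded by a pthread mutex, with a scoped lock to keep lock and unlock paired. It does not reproduce the framework's Mutex class (in particular it has no wait() and signal() methods); ScopedLock, GuardedInt, addThousand and countWithThreads are hypothetical names.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical scoped guard: locks on construction, unlocks on destruction.
class ScopedLock {
public:
  explicit ScopedLock(pthread_mutex_t& mutex) : m_mutex(mutex) {
    pthread_mutex_lock(&m_mutex);
  }
  ~ScopedLock() { pthread_mutex_unlock(&m_mutex); }
private:
  pthread_mutex_t& m_mutex;
  ScopedLock(const ScopedLock&);             // non-copyable
  ScopedLock& operator=(const ScopedLock&);  // non-assignable
};

// Hypothetical integer whose every access goes through the mutex.
class GuardedInt {
public:
  explicit GuardedInt(int initial) : m_value(initial) {
    pthread_mutex_init(&m_mutex, 0);
  }
  ~GuardedInt() { pthread_mutex_destroy(&m_mutex); }

  void add(int delta) {
    ScopedLock lock(m_mutex);
    m_value += delta;
  }
  int value() {
    ScopedLock lock(m_mutex);
    return m_value;
  }
private:
  pthread_mutex_t m_mutex;
  int m_value;
};

// Worker: adds 1 to the shared counter a thousand times.
extern "C" void* addThousand(void* raw) {
  GuardedInt* counter = static_cast<GuardedInt*>(raw);
  for (int i = 0; i < 1000; ++i) counter->add(1);
  return 0;
}

// Starts up to 16 threads that each add 1000; returns the final value.
int countWithThreads(int nbThreads) {
  GuardedInt counter(0);
  pthread_t tids[16];
  if (nbThreads > 16) nbThreads = 16;
  int started = 0;
  for (int i = 0; i < nbThreads; ++i) {
    if (pthread_create(&tids[started], 0, addThousand, &counter) == 0) ++started;
  }
  for (int i = 0; i < started; ++i) pthread_join(tids[i], 0);
  return counter.value();
}
```

Without the lock in add(), concurrent increments could be lost and the total would come out lower on some runs.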
The Internals: full Class Diagram
[Figure: full class diagram]

The Framework in Action
• Class Diagram from Castor doxygen documentation
  • Most Castor daemons inherit from BaseDaemon
• They all support graceful stop, e.g.:

    DATE=20080522175726.156834 HOST=lxb1952.cern.ch LVL=System FACILITY=Stager PID=11439 […] MESSAGE="GRACEFUL STOP [SIGTERM] - Shutting down the service"
    DATE=20080522175728.857292 HOST=lxb1952.cern.ch LVL=System FACILITY=Stager PID=11439 […] MESSAGE="GRACEFUL STOP [SIGTERM] - Shut down successfully completed"

The Framework in Action
• Typical load on a node
  • 8 cores run a total of ~90 threads, each owning a db connection, using only a fraction of the total available CPU and memory resources even during high load peaks
  • The stager daemon alone runs 53 threads
• This is the current deployment of a production Castor instance

    top - 16:17:53 up 115 days, 11:00,  4 users,  load average: 1.06, 0.78, 0.59
    Tasks: 173 total,   2 running, 171 sleeping,   0 stopped,   0 zombie
    Cpu(s):  6.5% us,  1.9% sy,  0.0% ni, 91.3% id,  0.0% wa,  0.0% hi,  0.3% si
    Mem:  16414780k total,  7548712k used,  8866068k free,   634696k buffers
    Swap:  4192880k total,      220k used,  4192660k free,  5285996k cached

      PID USER   PR  NI %CPU    TIME+  %MEM  VIRT   RES   SHR S COMMAND
    31110 stage  16   0   20   3:07.56  0.2  183m   32m   11m S migrator
     3107 root   16   0    4   4:48.46  0.5  237m   76m  5972 S dlfserver
    31107 stage  15   0    3   0:38.23  0.2  183m   32m   11m S migrator
     3309 root   15   0    2  22:11.80  0.7  741m  109m  9.8m S stagerDaemon
     3315 root   16   0    2  21:28.97  0.7  741m  109m  9.8m S stagerDaemon
     3314 root   16   0    2  21:58.28  0.7  741m  109m  9.8m S stagerDaemon
     3238 root   16   0    1  40:37.70  0.2  372m   29m  8380 S rhserver
    ...

Conclusions
• We have shown how the pthread API can be powerful enough to support many high level multithreaded tasks
  • But don’t forget that we started with an embarrassing parallelism scenario…
• The CASTOR service moved from 6 dual-CPU nodes to one 8-core node
  • No way out of multithreading
  • I know, that’s become pretty obvious by now…
• Comments, questions?
www.cern.ch/castor