Multithreaded Programming With the Win32 API. Andrew Tucker, Debugger Development Lead. March 13, 1998.
What We Will Cover • Intro to Multithreaded Concepts • Starting and Stopping Threads • Synchronization • Debugging and Testing Issues • Interprocess Communication • Advanced Topics and Additional Resources
Caveats • Multithreaded feature sets differ between NT, Win95 and CE, and between versions of the same OS
Intro to Multithreaded Concepts • What is a thread? A “path of execution in a process” • Owned by a single process • Every process has a main thread; some have more • Has full access to the process address space
[Figure: the operating system hosting Process1, Process2, ... ProcessN; each process has a Main thread, and some also run additional threads (T1, T2, T3)]
Intro to Multithreaded Concepts • Scheduling: cooperative vs. preemptive • Preemptive - the OS allows a thread to execute for a specified amount of time and then automatically performs a “context switch” to change to a new thread (e.g. NT, Win95, CE) • Cooperative - a context switch is performed only when the user specifies (“manually scheduled”) • Win16 is neither: multitasking, but not multithreaded
Starting and Stopping Threads
CreateThread API:
    HANDLE CreateThread(
        LPSECURITY_ATTRIBUTES lpsa,             // pointer to thread security attributes
        DWORD dwStackSize,                      // initial thread stack size, in bytes
        LPTHREAD_START_ROUTINE lpStartAddress,  // pointer to thread function
        LPVOID lpParameter,                     // argument for new thread
        DWORD dwCreationFlags,                  // creation flags
        LPDWORD lpThreadId                      // pointer to returned thread identifier
    );
_beginthreadex CRT function:
    unsigned long _beginthreadex(
        void *security,
        unsigned stack_size,
        unsigned ( __stdcall *start_address )( void * ),
        void *arglist,
        unsigned initflag,
        unsigned *thrdaddr
    );
So, what’s the difference?
Starting and Stopping Threads • The difference is the initialization of the CRT library • Linking with the multithreaded CRT is not enough:
    DWORD WINAPI ThreadFunc(PVOID pv)
    {
        // strtok keeps its position in static CRT data between calls
        char *psz = strtok((char*)pv, ";");
        while ( psz ) {
            …. process data ….
            psz = strtok(NULL, ";");
        }
        return 0;
    }

    int main()
    {
        // BUG - use _beginthreadex to ensure thread safe CRT
        HANDLE hthrd1 = CreateThread( … ThreadFunc … );
        HANDLE hthrd2 = CreateThread( … ThreadFunc … );
    }
• _beginthreadex creates a structure to ensure global and static CRT variables are thread-specific
Starting and Stopping Threads • Thread functions have the following prototype: DWORD WINAPI ThreadFunc(PVOID pv); • It is very useful to use pv as a pointer to a user-defined structure to pass extra data
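A minimal sketch of the "pv as a pointer to a user-defined structure" idea, written in portable C++ so it compiles without the Win32 headers. The assumptions: std::thread stands in for CreateThread/_beginthreadex, the thread function drops the WINAPI calling convention, and the names CountJob, CountCharThread and CountCharOnThread are hypothetical.

```cpp
#include <thread>

// Hypothetical user-defined structure carrying both the inputs and the
// output of one thread; a single void* is enough to pass all of it.
struct CountJob {
    const char *text;    // input: string to scan
    char        target;  // input: character to count
    int         count;   // output: filled in by the thread
};

// Same shape as DWORD WINAPI ThreadFunc(PVOID pv), minus the Win32 types.
unsigned long CountCharThread(void *pv)
{
    CountJob *job = static_cast<CountJob *>(pv);  // recover the real type
    job->count = 0;
    for (const char *p = job->text; *p != '\0'; ++p) {
        if (*p == job->target) {
            ++job->count;
        }
    }
    return 0;
}

// Runs the job on a separate thread and waits for it to finish.
int CountCharOnThread(const char *text, char target)
{
    CountJob job = { text, target, 0 };
    std::thread worker(CountCharThread, &job);  // &job plays the role of pv
    worker.join();  // job lives on this stack, so it must outlive the thread
    return job.count;
}
```

The structure has to stay alive for as long as the thread uses it; here that is guaranteed by joining before the function returns.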
Starting and Stopping Threads • A return will automatically call the respective _endthreadex or ExitThread API • A return does not close the handle from the creation routine (user must call CloseHandle to avoid a resource leak) • Threads should be self-terminating (avoid the TerminateThread API)
Starting and Stopping Threads Reasons to avoid the TerminateThread API: • If the target thread owns a critical section, it will not be released • If the target thread is executing certain kernel calls, the kernel state for the thread’s process could be inconsistent • If the target thread is manipulating the global state of a shared DLL, the state of the DLL could be destroyed, affecting other users of the DLL
Starting and Stopping Threads • Using a C++ member as a thread function (fixing the ‘this’ problem):
    class ThreadedClass {
    public:
        ThreadedClass() : m_hThread(NULL), m_bRunning(FALSE) {}
        BOOL Start(DWORD dwStart);
        void Stop();
    private:
        HANDLE m_hThread;
        volatile BOOL m_bRunning;   // written by Stop(), read by the thread
        static UINT WINAPI StaticThreadFunc(LPVOID lpv);
        DWORD MemberThreadFunc();
    };

    // A static member has no 'this', so the object pointer travels in lpv
    UINT WINAPI ThreadedClass::StaticThreadFunc(LPVOID lpv)
    {
        ThreadedClass *pThis = (ThreadedClass *)lpv;
        return pThis->MemberThreadFunc();
    }

    DWORD ThreadedClass::MemberThreadFunc()
    {
        while ( m_bRunning ) {
            … do processing...
        }
        return 0;
    }

    BOOL ThreadedClass::Start(DWORD dwStart)   // dwStart is not used in this excerpt
    {
        UINT nTID;
        m_bRunning = TRUE;   // set before the thread starts so its loop runs
        m_hThread = (HANDLE)_beginthreadex(NULL, 0, StaticThreadFunc, this, 0, &nTID);
        return m_hThread != NULL;
    }

    void ThreadedClass::Stop()
    {
        m_bRunning = FALSE;
        // wait for thread to finish (blocks instead of polling), then release the handle
        WaitForSingleObject(m_hThread, INFINITE);
        CloseHandle(m_hThread);
        m_hThread = 0;
    }

    int main()
    {
        ThreadedClass tc1, tc2;
        tc1.Start(5);
        tc2.Start(5000);
        Sleep(3000);
        tc1.Stop();
        tc2.Stop();
        return 0;
    }
Starting and Stopping Threads • SuspendThread and ResumeThread allow you to pause and restart any thread • Suspension state is a count, not a boolean - calls should be balanced • Example: hitting a breakpoint in a debugger causes all current threads to be suspended, and they are resumed on step or go
Starting and Stopping Threads • GetCurrentThread and GetCurrentThreadId are useful for identifying current thread • GetExitCodeThread is useful for determining if a thread is still alive • GetThreadTimes is useful for performance analysis and measurement
Synchronization • Used to coordinate the activities of concurrently running threads • For efficiency reasons, avoid coordinating with a polling loop whenever possible
Synchronization • Interlocked functions • Critical Sections • Wait functions • Mutexes • Semaphores • Events • Waitable Timers
Synchronization • Interlocked functions (the comment under each prototype shows its effect):
    PVOID InterlockedCompareExchange(PVOID *Destination, PVOID Exchange, PVOID Comperand)
        // if ( *Destination == Comperand ) *Destination = Exchange;
    LONG InterlockedExchange(LPLONG Target, LONG Value)
        // *Target = Value;
    LONG InterlockedExchangeAdd(LPLONG Addend, LONG Increment)
        // *Addend += Increment;
    LONG InterlockedDecrement(LPLONG Addend)
        // *Addend -= 1;
    LONG InterlockedIncrement(LPLONG Addend)
        // *Addend += 1;
• All operations are guaranteed to be “atomic” - the entire routine executes as one indivisible step, so no other thread can interleave with it
Synchronization Why must simple operations like incrementing an integer be “atomic”? Multiple CPU instructions are required for the actual implementation. If we retrieved a variable and were then preempted by a thread that changed that variable, we would be using the wrong value.
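The lost-update problem described above can be sketched in portable C++. This is not the Win32 call itself: std::atomic<long>::fetch_add is used as a stand-in for InterlockedIncrement so the example compiles anywhere, and the helper name CountWithAtomic and its parameters are hypothetical.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: starts `threads` workers that each add 1 to a shared
// counter `perThread` times, then returns the final count.
long CountWithAtomic(int threads, int perThread)
{
    std::atomic<long> counter(0);
    std::vector<std::thread> workers;

    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, perThread] {
            for (int i = 0; i < perThread; ++i) {
                // One indivisible read-modify-write. A plain ++ on an
                // ordinary long is a load, an add and a store, and a
                // preemption between them could lose an update.
                counter.fetch_add(1);
            }
        });
    }
    for (std::thread &w : workers) {
        w.join();
    }
    return counter.load();
}
```

With the atomic increment the result is always threads * perThread; with an unprotected long the total could come up short on some runs.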
Synchronization • A critical section is a tool for guaranteeing that only one thread is executing a section of code at any time
    void InitializeCriticalSection(LPCRITICAL_SECTION lpCritSec)
    void DeleteCriticalSection(LPCRITICAL_SECTION lpCritSec)
    void EnterCriticalSection(LPCRITICAL_SECTION lpCritSec)
    void LeaveCriticalSection(LPCRITICAL_SECTION lpCritSec)
    BOOL TryEnterCriticalSection(LPCRITICAL_SECTION lpCritSec)
Synchronization • EnterCriticalSection will not block on nested calls as long as the calls are in the same thread. Calls to LeaveCriticalSection must still be balanced
Synchronization • Critical section example: counting source lines in multiple files
    DWORD g_dwTotalLineCount = 0;
    CRITICAL_SECTION g_cs;   // global so the worker threads can reach it

    UINT WINAPI CountLinesThread(PVOID pv)
    {
        PSTR pszFileName = (PSTR)pv;
        DWORD dwCount;
        dwCount = CountSourceLines(pszFileName);
        // only the update of the shared total needs protection
        EnterCriticalSection(&g_cs);
        g_dwTotalLineCount += dwCount;
        LeaveCriticalSection(&g_cs);
        return 0;
    }

    void UpdateSourceLineCount()
    {
        FileNameList fnl;
        HANDLE *pHandleList;
        GetFileNameList(&fnl);
        InitializeCriticalSection(&g_cs);
        pHandleList = (HANDLE *)malloc(sizeof(HANDLE) * fnl.Size());
        for ( int i = 0; i < fnl.Size(); i++ )
            pHandleList[i] = (HANDLE)_beginthreadex(…CountLinesThread, fnl[i]…);
        // we’ll cover this shortly…
        WaitForMultipleObjects(fnl.Size(), pHandleList, TRUE, INFINITE);
        DeleteCriticalSection(&g_cs);
        …process g_dwTotalLineCount...
    }
Synchronization • Wait functions - allow you to pause until one or more objects become signaled • At all times, an object is in one of two states: signaled or nonsignaled • Picture signaled as a flag being raised and nonsignaled as a flag being lowered - the wait functions are watching for a flag to be raised
Synchronization Types of objects that can be “waited on“: • Processes • Threads • Console Input • File Change Notifications • Mutexes* • Semaphores* • Events* • Waitable Timers*
Synchronization • Processes and threads are non-signaled at creation and become signaled when they terminate
Synchronization
    DWORD WaitForSingleObject(HANDLE hHandle, DWORD dwMilliseconds)
• Returns WAIT_OBJECT_0 if hHandle has become signaled or WAIT_TIMEOUT if dwMilliseconds elapsed and the object is still non-signaled
    DWORD WaitForMultipleObjects(DWORD nCount, HANDLE *pHandles, BOOL bWaitAll, DWORD dwMilliseconds)
• If bWaitAll is FALSE and one of the object handles was signaled, the return value minus WAIT_OBJECT_0 is the array index of that handle. If bWaitAll is TRUE and all of the objects become signaled, the return value minus WAIT_OBJECT_0 is a valid index into the handle array. If dwMilliseconds elapsed and no object was signaled, WAIT_TIMEOUT is returned. nCount can be no more than MAXIMUM_WAIT_OBJECTS (currently defined as 64) • INFINITE can be used as a timeout value
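The return-value arithmetic above can be sketched as a small decoding helper. This is an illustration under stated assumptions: the constants are local copies of the Win32 values (so the code compiles without <windows.h>), and the helper name SignaledIndex is hypothetical, not part of the API.

```cpp
#include <cstdint>

// Local copies of the Win32 constants; the numeric values match the SDK
// definitions of WAIT_OBJECT_0, WAIT_TIMEOUT and WAIT_FAILED.
const std::uint32_t kWaitObject0 = 0x00000000;
const std::uint32_t kWaitTimeout = 0x00000102;
const std::uint32_t kWaitFailed  = 0xFFFFFFFF;

// Hypothetical helper: given the value returned by WaitForMultipleObjects
// and the nCount that was passed in, returns the index of the signaled
// handle, or -1 for a timeout, a failure, or any other result
// (for example an abandoned mutex).
int SignaledIndex(std::uint32_t waitResult, std::uint32_t nCount)
{
    if (waitResult == kWaitTimeout || waitResult == kWaitFailed) {
        return -1;
    }
    // "return value minus WAIT_OBJECT_0" is the array index
    std::uint32_t offset = waitResult - kWaitObject0;
    if (offset < nCount) {
        return static_cast<int>(offset);
    }
    return -1;
}
```

Checking the offset against nCount keeps an unexpected return value from being used as an out-of-range array index.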
Synchronization • Mutexes provide mutually exclusive access to an object (hence the name)
    HANDLE CreateMutex(LPSECURITY_ATTRIBUTES lpsa, BOOL bInitialOwner, LPCTSTR lpName)
• Ownership is equivalent to the nonsignaled state - if bInitialOwner is TRUE the creation state of the mutex is nonsignaled • lpName is optional • ReleaseMutex is used to end ownership
Synchronization What’s the difference between a critical section and a mutex? A mutex is an OS kernel object, and can thus be used across process boundaries. A critical section is limited to the process in which it was created
Synchronization Two methods to get a handle to a named mutex created by another process: • OpenMutex - returns handle to an existing mutex • CreateMutex - creates or returns handle to an existing mutex. GetLastError will return ERROR_ALREADY_EXISTS for the latter case
Synchronization • Comparing mutex and critical section performance
Synchronization • A mutex will not block on nested calls as long as they are in the same thread. ReleaseMutex calls must still be balanced • Examples of when to use a mutex: • Error logging system that can be used from any process • Detecting multiple instances of an application
Synchronization • Semaphores allow access to a resource to be limited to a fixed number of threads
    HANDLE CreateSemaphore(LPSECURITY_ATTRIBUTES lpsa, LONG cSemInitial, LONG cSemMax, LPCTSTR lpName)
• Semaphores are in the signaled state when their available count is greater than zero • ReleaseSemaphore is used to decrement usage (it increases the available count) • Conceptually, a mutex is a binary semaphore
Synchronization • Named semaphores can be used across process boundaries with OpenSemaphore and CreateSemaphore • Can be used to solve the classic “single writer / multiple readers” problem
Synchronization • Example: limiting number of entries in a queue
    const int QUEUE_SIZE = 5;
    HANDLE g_hSem = NULL;
    long g_iCurSize = 0;

    UINT WINAPI PrintJob(PVOID pv)
    {
        // blocks while QUEUE_SIZE jobs are already in the queue
        WaitForSingleObject(g_hSem, INFINITE);
        InterlockedIncrement(&g_iCurSize);
        printf("%08lX - entered queue: size = %ld\n", GetCurrentThreadId(), g_iCurSize);
        Sleep(500); // print job....
        InterlockedDecrement(&g_iCurSize);
        long lPrev;
        ReleaseSemaphore(g_hSem, 1, &lPrev);
        return 0;
    }

    int main()
    {
        const int MAX_THREADS = 64;
        HANDLE hThreads[MAX_THREADS];
        g_hSem = CreateSemaphore(NULL, QUEUE_SIZE, QUEUE_SIZE, NULL);
        UINT dwTID;
        for ( int i = 0; i < MAX_THREADS; i++ )
            hThreads[i] = (HANDLE)_beginthreadex(NULL, 0, PrintJob, NULL, 0, &dwTID);
        WaitForMultipleObjects(MAX_THREADS, hThreads, TRUE, INFINITE);
        return 0;
    }
Synchronization • Events provide notification when some condition has been met HANDLE CreateEvent(LPSECURITY_ATTRIBUTES lpsa, BOOL bManualReset, BOOL bInitialState, LPCTSTR lpName) • If bInitialState is TRUE, object is created in the signaled state • bManualReset specifies the type of event requested
Synchronization Two kinds of event objects: • Auto reset - when signaled it is automatically changed to a nonsignaled state after a single waiting thread has been released • Manual reset - when signaled it remains in the signaled state until it is manually changed to the nonsignaled state
Synchronization • Named event objects can be used across process boundaries with OpenEvent and CreateEvent • SetEvent sets the object state to signaled • ResetEvent sets the object state to nonsignaled • PulseEvent conceptually calls SetEvent/ResetEvent sequentially, but ...
Synchronization PulseEvent vs SetEvent
Synchronization • Example: displaying OutputDebugString text without a debugger
    int main()
    {
        HANDLE hAckEvent, hReadyEvent;
        PSTR pszBuffer;
        hAckEvent = CreateEvent(NULL, FALSE, FALSE, "DBWIN_BUFFER_READY");
        if ( GetLastError() == ERROR_ALREADY_EXISTS ) {
            // handle multiple instance case
        }
        hReadyEvent = CreateEvent(NULL, FALSE, FALSE, "DBWIN_DATA_READY");
        pszBuffer = /* get pointer to data in memory mapped file */;
        SetEvent(hAckEvent);   // tell writers the buffer is free
        while ( TRUE ) {
            DWORD ret = WaitForSingleObject(hReadyEvent, INFINITE);
            if ( ret != WAIT_OBJECT_0 ) {
                // handle error
            } else {
                printf("%s", pszBuffer);   // never pass external text as the format string
                SetEvent(hAckEvent);
            }
        }
    }
Synchronization • Waitable timers are kernel objects that provide a signal at a specified time interval HANDLE CreateWaitableTimer(LPSECURITY_ATTRIBUTES lpsa, BOOL bManualReset, LPCTSTR lpName) • Manual/auto reset behavior is identical to events • Time interval is specified with SetWaitableTimer
Synchronization
    BOOL SetWaitableTimer(HANDLE hTimer, LARGE_INTEGER *pDueTime, LONG lPeriod, PTIMERAPCROUTINE pfnCompletion, PVOID pArg, BOOL fResume)
• pDueTime specifies when the timer should go off for the first time (positive is absolute, negative is relative) • lPeriod specifies how frequently to go off after the initial time • fResume controls whether the system is awakened when the timer is signaled
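The relative due time is easy to get wrong because pDueTime is measured in 100-nanosecond units while lPeriod is in milliseconds. A minimal sketch of the arithmetic, in portable C++ with fixed-width integers standing in for LARGE_INTEGER; the names DueTimeParts, RelativeDueTime and SplitDueTime are hypothetical.

```cpp
#include <cstdint>

// One second expressed in the 100-nanosecond units used by pDueTime.
const std::int64_t kHundredNsPerSecond = 10000000;

// Hypothetical mirror of the LowPart/HighPart halves of a LARGE_INTEGER.
struct DueTimeParts {
    std::uint32_t lowPart;
    std::int32_t  highPart;
};

// A negative due time means "relative to now".
std::int64_t RelativeDueTime(std::int64_t seconds)
{
    return -(seconds * kHundredNsPerSecond);
}

// Splits the 64-bit value into the two 32-bit halves.
DueTimeParts SplitDueTime(std::int64_t dueTime)
{
    std::uint64_t bits = static_cast<std::uint64_t>(dueTime);
    DueTimeParts parts;
    parts.lowPart  = static_cast<std::uint32_t>(bits & 0xFFFFFFFFu);
    parts.highPart = static_cast<std::int32_t>(static_cast<std::uint32_t>(bits >> 32));
    return parts;
}
```

For a timer that first fires 10 seconds from now, RelativeDueTime(10) yields -100000000. In Win32 code the same value can be assigned directly to the QuadPart member of a LARGE_INTEGER, which avoids the manual split.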
Synchronization • Consecutive calls to SetWaitableTimer overwrite each other • CancelWaitableTimer stops the timer so that it will not go off again (unless SetWaitableTimer is called)
Synchronization • Example: firing an event every N seconds
    const int MAX_TIMES = 3;
    const int N = 10;

    DWORD WINAPI ThreadFunc(PVOID pv)
    {
        HANDLE hTimer = (HANDLE)pv;
        int iCount = 0;
        while (TRUE) {
            if ( WaitForSingleObject(hTimer, INFINITE) == WAIT_OBJECT_0 ) {
                … handle timer event...
                if ( ++iCount >= MAX_TIMES )
                    break;
            }
        }
        return 0;
    }

    int main()
    {
        HANDLE hTimer = CreateWaitableTimer(NULL, FALSE, NULL);
        LARGE_INTEGER li;
        // the due time is measured in 100-nanosecond units
        const int nUnitsPerSecond = 10000000;
        __int64 qwTimeFromNow = (__int64)N * nUnitsPerSecond;
        qwTimeFromNow = -qwTimeFromNow;   // negative means relative to now
        li.LowPart = (DWORD)(qwTimeFromNow & 0xFFFFFFFF);
        li.HighPart = (LONG)(qwTimeFromNow >> 32);
        // the period is in milliseconds
        SetWaitableTimer(hTimer, &li, N * 1000, NULL, NULL, FALSE);
        DWORD dwTID;
        HANDLE hThread = CreateThread(NULL, 0, ThreadFunc, hTimer, 0, &dwTID);
        WaitForSingleObject(hThread, INFINITE);
        return 0;
    }
Debugging and Testing Issues • Very difficult, if not impossible, to reproduce and test every possible deadlock and race condition • Stepping through code will not necessarily help • OutputDebugString can be very helpful
Debugging and Testing Issues • Don’t underestimate the value of peer review and code inspection • Hint: after every wait or release ask yourself “what if a context switch occurred here”
Interprocess Communication • Tools to provide the ability to pass data between processes or machines • Types of IPC: • Clipboard • DDE • OLE • Memory Mapped Files • Mailslots • Pipes • RPC • Sockets • WM_COPYDATA
Advanced Topics • Thread local storage • UI vs worker threads • CreateRemoteThread • Scheduling and Get/SetThreadPriority • Fibers (NT only) • Asynchronous procedure calls
Resources • Advanced Windows by Jeff Richter • Win32 System Programming by Johnson Hart • Win32 Multithreaded Programming by Aaron Cohen and Mike Woodring • Windows NT Programming in Practice by editors of WDJ (including Paula T)