Extending managed debugger with backstepping facilities
Evgeny Vigdorchik, Alexey Nikitin
St. Petersburg State University
Synopsis
• Debugging is usually among the most labor-intensive parts of development
• Stepping back eliminates repeated debugger runs
• The conventional approach creates a new process with the Unix fork call; the old process keeps the program state for backstepping
• More efficient checkpointing requires modifying the VM sources
Watchpoints
• Watchpoints are the atomic events over which the user can step back. Each watchpoint has a unique counter value associated with it
• No unified high-level language semantics for watchpoints is possible: e.g. associating a watchpoint with a source code line is absurd for the following ML code:
    List.fold (fun el sum -> el + sum) 0 li
• Use CIL basic blocks instead
• Calls to the watchpoint code are emitted by FJIT
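The counter-per-basic-block idea can be sketched in plain C++. This is only an illustrative model: `WatchpointLog`, `onBasicBlock` and `sumWithWatchpoints` are hypothetical names, and the calls are placed by hand where a JIT would be expected to emit them. It shows why a single source line (the fold above, written here as a loop) produces many watchpoints.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model, not Rotor code: every executed basic block reports to a
// helper, which hands out the next counter value.
struct WatchpointLog {
    std::uint64_t counter = 0;   // unique, monotonically increasing
    std::vector<int> blockIds;   // blockIds[i] is the block seen at counter i + 1

    std::uint64_t onBasicBlock(int blockId) {
        blockIds.push_back(blockId);
        return ++counter;
    }
};

// A summing loop instrumented by hand, one helper call per basic block.
inline int sumWithWatchpoints(const std::vector<int> &xs, WatchpointLog &log) {
    log.onBasicBlock(0);         // entry block
    int sum = 0;
    for (int x : xs) {
        log.onBasicBlock(1);     // loop body block, once per element
        sum += x;
    }
    log.onBasicBlock(2);         // exit block
    return sum;
}
```

For a three-element list the loop body block runs three times, so the user could step back to any of five distinct points inside what is one line of source.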
Checkpoints
• Registers are copied
• Memory is saved with a copy-on-write strategy implemented through write protection
• Memory regions that need to be saved are enumerated explicitly (e.g. heap, large heap, class loader heap, handle table…)
• The stack is not write-protected and is saved eagerly
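The copy-on-write bookkeeping can be illustrated with a small simulation. It is a sketch under loud assumptions: real write protection relies on OS page protection and a fault handler, whereas here every write goes through a hypothetical `CowMemory::write` method that plays the role of that handler, and the page size is deliberately tiny.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

constexpr std::size_t kPageSize = 16;  // tiny pages, for illustration only

// Hypothetical simulation of copy-on-write checkpointing (not Rotor code).
class CowMemory {
public:
    explicit CowMemory(std::size_t pages) : data_(pages * kPageSize, 0) {}

    // Start a new checkpoint: conceptually, write-protect every page again.
    void checkpoint() { saved_.clear(); }

    // Stand-in for the write-fault handler: save a page on its first write.
    void write(std::size_t addr, std::uint8_t value) {
        assert(addr < data_.size());
        const std::size_t page = addr / kPageSize;
        if (saved_.find(page) == saved_.end()) {
            auto first = data_.begin() + page * kPageSize;
            saved_[page] = std::vector<std::uint8_t>(first, first + kPageSize);
        }
        data_[addr] = value;
    }

    std::uint8_t read(std::size_t addr) const { return data_[addr]; }

    // Roll back to the last checkpoint by copying the saved pages back.
    void restore() {
        for (const auto &entry : saved_)
            std::copy(entry.second.begin(), entry.second.end(),
                      data_.begin() + entry.first * kPageSize);
        saved_.clear();
    }

    std::size_t savedPages() const { return saved_.size(); }

private:
    std::vector<std::uint8_t> data_;
    std::map<std::size_t, std::vector<std::uint8_t>> saved_;
};
```

Only pages that are actually written after a checkpoint are copied, which is the reason this strategy can use less memory than duplicating a whole process.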
Saving registers
• Watchpoints are implemented using the FCALL helper calling convention in Rotor
• When frames are placed, the registers are captured
• Execution may be simulated to obtain the register state after exit from the helper: LazyMachState::getState()
• This allows restoring to the point just after the call to the respective watchpoint
Saving stack
• The stack is saved to the common memory history, i.e. it is restored like any other memory region
• The whole thread stack is copied
• Stack bounds are found using a stack walk routine:

    static StackWalkAction StackBottomWalker(
        CrawlFrame *pCF,
        VOID *pData  // Caller's private data
    )
    {
        // The address of this local marks the current top of the stack.
        void *pTop = alloca(2);
        // For a frameless frame whose EBP was captured, record it as the bottom.
        if (pCF && pCF->IsFrameless() && pCF->GetRegisterSet() &&
            pCF->GetRegisterSet()->pEbp)
            PStackBounds(pData)->pBottom = (const BYTE *)*pCF->GetRegisterSet()->pEbp;
        PStackBounds(pData)->pTop = (const BYTE *)&pTop;
        …
    }
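Once the bounds are known, the eager copy itself is a plain byte copy of the range between them. The sketch below assumes a downward-growing stack (so the top is the lower address) and uses hypothetical names (`StackBoundsSketch`, `SaveStackSketch`, `RestoreStackSketch`); an ordinary buffer stands in for a real thread stack.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical mirror of the bounds record filled in by the stack walker.
struct StackBoundsSketch {
    const unsigned char *pTop;     // lowest address in use (stack grows downward)
    const unsigned char *pBottom;  // one past the highest address to save
};

// Eagerly copy every byte in [pTop, pBottom) into a snapshot buffer.
inline std::vector<unsigned char> SaveStackSketch(const StackBoundsSketch &b) {
    assert(b.pTop <= b.pBottom);
    return std::vector<unsigned char>(b.pTop, b.pBottom);
}

// Copy a snapshot back, starting at the address it was taken from.
inline void RestoreStackSketch(unsigned char *top,
                               const std::vector<unsigned char> &snapshot) {
    std::copy(snapshot.begin(), snapshot.end(), top);
}
```

Because the snapshot is an ordinary byte range, it can be stored in the same history as the other saved memory regions.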
Saving GC Heap
• Invoke a minor GC during each checkpoint:
  • No protection is needed for gen0, the most frequently used generation
  • Decreases overall memory usage, unlike fork-based debuggers
  • Survivors are promoted to gen1, where they live forever unless the user forces a major collection
Debugging similarity issues
• Because minor GCs run periodically and no major GC runs automatically, program behavior may change under the debugger (finalization, weak refs, …)
• The user may trigger a major GC explicitly from managed code, at the cost of higher memory usage
• An option to perform major GCs automatically may be given at startup
cordbg architecture
• The debugger resides outside the debuggee process
• The runtime controller (RTC) thread listens for debugger events
• Context restoration is done by the RTC
Restoring to checkpoint
• Decrement the watchpoint counter by the desired number of steps
• Restore the checkpointed context and run forward until the user break is issued:

    void Debugger::EventCounterChange(Thread *thread)
    {
        CheckpointInterface *ci = NULL;
        LONG ulTime = thread->IncrementEventCount();
        // Call outside _ASSERTE so the lookup still runs when assertions are compiled out.
        HRESULT hr = JanusGetCheckpointInterface(&ci);
        _ASSERTE(hr == S_OK);
        if (thread->IsTargetEventReached())
        {
            // Target watchpoint reached: break into the debugger.
            DebuggerExitFrame __def;
            SendUserBreakpoint(thread);
            __def.Pop();
            thread->ResetTargetEventCount();
        }
        else if (ulTime % ci->GetDensity() == 0)
        {
            /* checkpoint */
        }
    }
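The arithmetic behind a backstep can be sketched as follows. The listing above takes a checkpoint whenever the event count is a multiple of the density, so the nearest usable checkpoint is the largest such multiple not exceeding the target count. The names `RestorePlan` and `planBackstep` are hypothetical, and the sketch assumes a checkpoint also exists at count 0, which the slides do not state.

```cpp
#include <cassert>

// Hypothetical result type: which checkpoint to restore and how far to replay.
struct RestorePlan {
    long targetCount;      // watchpoint counter value the user wants to reach
    long checkpointCount;  // counter value of the checkpoint to restore
    long replaySteps;      // watchpoints to run forward after restoring
};

// Plan a backstep of `stepsBack` watchpoints from `currentCount`.
// Assumes checkpoints at every multiple of `density`, including 0 (assumption).
inline RestorePlan planBackstep(long currentCount, long stepsBack, long density) {
    assert(density > 0);
    assert(stepsBack >= 0 && stepsBack <= currentCount);
    const long target = currentCount - stepsBack;
    const long checkpoint = (target / density) * density;  // round down
    return RestorePlan{target, checkpoint, target - checkpoint};
}
```

With a density of 100, stepping back 5 watchpoints from count 1037 restores the checkpoint at 1000 and replays 32 watchpoints. A smaller density shortens the replay but takes checkpoints more often.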
Restoring to checkpoint (cont)
• Need to recover from the previous debugger event processing:
  • substitute the return address to SuspendThread and resume execution
  • Win32 SuspendThread places its return value on the stack after the thread is resumed
  • need to ensure the new stack won’t be corrupted
  • know the new stack bounds
  • grow the stack before suspension
Results
• The framework was implemented on top of the Rotor EE and the managed debugger API
• For computational programs with small basic blocks (computing digits of pi and accumulating the result in a StringBuilder), the slowdown is 4 times compared to cordbg; almost all of this time is spent in watchpoint helper calls
• With 225 checkpoints, memory usage increased only 10 times
Future Work
• Replay of external calls, e.g. file I/O, networking, etc.
• Correct and deterministic replay of multithreaded programs
That’s all! Q&A {ven,lex}@{tercom.ru, intellij.com}