
EECS 700 Project TimerQueue Implementation for the HybridThread Framework



Presentation Transcript


  1. EECS 700 Project: TimerQueue Implementation for the HybridThread Framework
     Jason Agron, jagron@ittc.ku.edu

  2. Overview
     • TimerQueue Design.
     • TimerQueue Architecture.
     • Results.
     • Conclusions.
     • Questions.

  3. TimerQueue Design
     • Provide ACTUAL nanosecond resolution for sleep times.
     • Precise and deterministic.
     • Low overhead and low jitter.
     • Implemented in FPGA.
       • Time spent on timer management does not reduce application time.

  4. TimerQueue Operations
     • User APIs.
       • WriteLO(threadID,lo_delta);
         • Stores the lower 32 bits of sleep_delta.
       • WriteHI(threadID,hi_delta);
         • Stores the upper 32 bits of sleep_delta.
       • Nanosleep(threadID);
         • Sorted insert of a thread into the event queue based on its sleep delta.
     • System (Non-callable) APIs.
       • Get_debug_reg();
         • Returns internal TQ debug information.
       • Dequeue(threadID);
         • Removes a thread from the event queue as a result of “wakeup”.
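
The split between WriteLO and WriteHI can be illustrated with a small behavioral sketch. It only shows how a 64-bit sleep delta might be divided into the two 32-bit words and recombined; the helper names split_delta and join_delta are hypothetical and are not part of the TimerQueue API.

```python
MASK32 = 0xFFFFFFFF  # selects the low 32 bits of a value


def split_delta(sleep_delta):
    """Split a 64-bit sleep delta into (lo, hi) 32-bit words.

    Hypothetical helper: lo is the word that would go to WriteLO,
    hi is the word that would go to WriteHI.
    """
    if not 0 <= sleep_delta < (1 << 64):
        raise ValueError("sleep delta must fit in 64 bits")
    return sleep_delta & MASK32, (sleep_delta >> 32) & MASK32


def join_delta(lo, hi):
    """Recombine the two 32-bit words into the 64-bit sleep delta."""
    return ((hi & MASK32) << 32) | (lo & MASK32)
```

With the values used on slide 7, split_delta(0x1FFFFFFFF) yields lo = 0xFFFFFFFF and hi = 0x00000001.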

  5. TimerQueue Architecture
     • Event-queue structure.
       • Doubly-linked sorted queue.
       • Implemented using inferred BRAMs.
         • Saves CLBs.
         • Improves portability.
     • Slave FSM
       • Handles all incoming TQ operations.
     • Master FSM
       • Handles event monitoring and “wakeup” operation.

  6. TimerQueue Block Diagram

  7. How threads “fall” asleep
     • SW Operations
       • WriteLO(tid,0xFFFFFFFF);
       • WriteHI(tid,0x00000001);
       • Nanosleep(tid);
       • Thread “falls” asleep.
     • TQ’s HW Operations
       • TQ[tid] = set_lo(0xFFFFFFFF).
       • TQ[tid] = set_hi(0x00000001).
       • Sorted insert of tid into event-queue.
         • If event-queue is empty, initialize pointers to be tid.
         • Otherwise:
           • Start at head_ptr and move down until an entry point is found.
           • If tail_ptr is reached, insert at the end.
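
The sorted insert on this slide can be sketched as a software model of a doubly-linked queue indexed by thread ID. This is an illustration only, not the FPGA implementation: the class name EventQueue, its fields, and the choice to place equal values after existing entries are all assumptions, and entries are compared directly on their stored 64-bit value.

```python
class EventQueue:
    """Behavioral model of a doubly-linked sorted queue keyed by thread ID."""

    def __init__(self):
        self.key = {}      # tid -> stored 64-bit sleep value
        self.next = {}     # tid -> next tid toward the tail, or None
        self.prev = {}     # tid -> previous tid toward the head, or None
        self.head = None   # models head_ptr
        self.tail = None   # models tail_ptr

    def insert(self, tid, value):
        self.key[tid] = value
        if self.head is None:
            # Empty queue: both pointers are initialized to tid.
            self.head = self.tail = tid
            self.prev[tid] = self.next[tid] = None
            return
        # Start at the head and move down until an entry point is found.
        cur = self.head
        while cur is not None and self.key[cur] <= value:
            cur = self.next[cur]
        if cur is None:
            # Walked past the tail: insert at the end.
            self.prev[tid] = self.tail
            self.next[tid] = None
            self.next[self.tail] = tid
            self.tail = tid
        else:
            # Insert immediately before cur.
            before = self.prev[cur]
            self.prev[tid] = before
            self.next[tid] = cur
            self.prev[cur] = tid
            if before is None:
                self.head = tid
            else:
                self.next[before] = tid

    def order(self):
        """Return thread IDs from head to tail."""
        out = []
        cur = self.head
        while cur is not None:
            out.append(cur)
            cur = self.next[cur]
        return out
```

The walk from the head visits each queued entry at most once, which is consistent with an insert cost that grows linearly with the number of threads already in the queue.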

  8. How threads wake back up
     • TQ’s HW Operations
       • Master FSM is monitoring the head_ptr of the event-queue.
       • Timer expires:
         • Master FSM sends add_thread() message to the Thread Manager.
         • Master FSM sends a dequeue() message to the TQ’s Slave FSM.
     • TM’s HW Operations
       • TM is idling, waiting for and then servicing requests.
       • Receives an add_thread() msg.
         • TM marks the thread as being queued.
         • TM sends enqueue() message to the Scheduler.
       • Scheduler decides if preemption interrupt will be thrown.
         • The CPU will only context switch on preemption interrupts.
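
A minimal software sketch of the wakeup path described on this slide follows. It assumes the queue is modeled as a list of (expiry, tid) pairs sorted by expiry and that every expired head entry is woken in one pass; the function name check_head and the add_thread callback are hypothetical stand-ins for the Master FSM and the message sent to the Thread Manager.

```python
def check_head(queue, now, add_thread):
    """Wake every thread at the head of the queue whose timer has expired.

    queue      -- list of (expiry, tid) pairs, sorted by ascending expiry
    now        -- current time in the same units as expiry
    add_thread -- callable standing in for the add_thread() message to the TM
    Returns the list of thread IDs that were woken.
    """
    woken = []
    # Only the head entry needs to be examined because the queue is sorted.
    while queue and queue[0][0] <= now:
        expiry, tid = queue[0]
        add_thread(tid)   # models the add_thread() message to the Thread Manager
        queue.pop(0)      # models the dequeue() message to the Slave FSM
        woken.append(tid)
    return woken
```

Because a thread is only released once its expiry is less than or equal to the current time, the model never wakes a thread early.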

  9. Timing Results
     • WriteLO()
       • 4 clock cycles.
     • WriteHI()
       • 4 clock cycles.
     • Nanosleep()
       • 9+3*X clock cycles.
         • X is the # of threads in event-queue.
     • Dequeue()
       • 8 clock cycles.
     • Wakeup – includes add_thread() to TM and Dequeue().
       • ~100 clock cycles.
       • Depends on bus traffic.
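
The Nanosleep() cost above is a simple linear function of queue occupancy, which the short sketch below evaluates. The function names are hypothetical, and the clock frequency is a parameter because the slides do not state it; 100 MHz is used in the usage note only as an assumption that matches the later figure of roughly 100 cycles per microsecond.

```python
def nanosleep_cycles(threads_in_queue):
    """Clock cycles for Nanosleep() with X threads already in the event queue."""
    return 9 + 3 * threads_in_queue


def cycles_to_ns(cycles, clock_hz):
    """Convert a cycle count to nanoseconds for a given clock frequency."""
    return cycles * 1_000_000_000 / clock_hz
```

For example, with 10 threads queued, nanosleep_cycles(10) gives 39 cycles, and cycles_to_ns(39, 100_000_000) gives 390.0 ns under the assumed 100 MHz clock.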

  10. Conclusions
     • Wakeup Process
       • Consistent.
       • ~100 clock cycles ≈ 1 microsecond.
       • Fudge-factor can be taken into account.
       • Actual nanosecond resolution.
     • Summary
       • Design and Implementation follow the POSIX spec.
         • Wakeup AFTER timer expires.
       • Actual nanosecond resolution of sleep times can be achieved.
       • Services available to HW and SW threads.

  11. Future Work
     • Synthesis and Integration.
       • Currently synthesizes.
         • ~1200 slices in a Virtex-II Pro 30 FPGA.
       • SW APIs have not been created.
         • 64-bit values are needed.
         • Wes’s expertise with the kernel is needed.
     • Further testing.
       • On-board tests with lots of contention for system resources.
         • OPB bus.
         • All other system cores in HybridThread framework.
