EECS 700: Project Proposal TimerQueue Implementation for the HybridThread Framework

Jason Agron jagron@ittc.ku.edu

Overview • HybridThread Architecture. • Background on “yielding” calls. • Timed vs. Untimed calls. • Background on nanosleep(). • Proposed architecture. • Conclusion. • Questions.

HybridThread Architecture • Hybrid architecture = CPU + FPGA. • Unified programming model. • Threaded programming model. • HW/SW co-design of OS services. • Precise and deterministic behavior of OS components. • Good for embedded and real-time systems.


“Yielding” System Calls • Consider yield() and nanosleep(). • Timed vs. untimed → affects how a thread is re-added to the R2R-Queue. • Timed yield calls, e.g. nanosleep(). • Use software-managed timer queues. • Periodic management. • Only managed at most once per system clock tick! • Non-deterministic execution times. • Cache misses, branch mispredictions. • Signal delivery, jitter due to “clock alignment”.
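The effect of managing a software timer queue only once per system clock tick can be sketched with a little arithmetic. The model below is an illustration, not the behavior of any particular kernel: the function names are hypothetical, and it simply assumes a sleeping thread can only be woken on a tick boundary at or after its deadline.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: the timer queue is inspected only on tick boundaries,
 * so a thread wakes at the first boundary at or after (now + delay). */
static uint64_t wakeup_time_ns(uint64_t now_ns, uint64_t delay_ns,
                               uint64_t tick_ns)
{
    assert(tick_ns > 0);
    uint64_t deadline = now_ns + delay_ns;
    uint64_t ticks = (deadline + tick_ns - 1) / tick_ns; /* ceiling division */
    return ticks * tick_ns;
}

/* Time the thread actually spends asleep under this model. */
static uint64_t actual_sleep_ns(uint64_t now_ns, uint64_t delay_ns,
                                uint64_t tick_ns)
{
    return wakeup_time_ns(now_ns, delay_ns, tick_ns) - now_ns;
}
```

With an assumed 10 ms tick, a 1 µs request made right at a tick boundary sleeps a full 10 ms, while the same request made 1 µs before the next boundary sleeps exactly 1 µs. The result depends on when the call happens relative to the tick, which is the “clock alignment” jitter.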

nanosleep() • POSIX specification: • Can sleep less than the time specified. • Due to asynchronous signal delivery. • Can sleep more than the time specified. • Specified time is rounded up to the resolution of the system clock. • Periodic queue management → missed timer events. • Why is it even called NANOsleep?
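Both behaviors can be observed from user code. The sketch below uses the standard POSIX calls nanosleep() and clock_gettime(); the wrapper names sleep_fully and timed_sleep are hypothetical. It restarts the sleep when a signal cuts it short and reports how long the call really took, so the overshoot can be compared against the request.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Sleep for at least delay_ns. If a signal interrupts the sleep (EINTR),
 * resume with the remaining time reported by nanosleep().
 * Returns 0 on success, -1 on any other error. */
static int sleep_fully(uint64_t delay_ns)
{
    struct timespec req = {
        .tv_sec  = (time_t)(delay_ns / 1000000000ULL),
        .tv_nsec = (long)(delay_ns % 1000000000ULL),
    };
    struct timespec rem;

    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        req = rem; /* sleep only for what is left */
    }
    return 0;
}

/* Sleep for delay_ns and store the measured elapsed time in *elapsed_ns.
 * Returns 0 on success, -1 on error. */
static int timed_sleep(uint64_t delay_ns, int64_t *elapsed_ns)
{
    struct timespec t0, t1;
    int rc = clock_gettime(CLOCK_MONOTONIC, &t0);
    assert(rc == 0);
    if (sleep_fully(delay_ns) != 0)
        return -1;
    rc = clock_gettime(CLOCK_MONOTONIC, &t1);
    assert(rc == 0);
    (void)rc;
    *elapsed_ns = (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL
                + (int64_t)(t1.tv_nsec - t0.tv_nsec);
    return 0;
}
```

Calling timed_sleep with a small request, such as a few microseconds, and subtracting the request from the elapsed time gives the overshoot on a given system. Its size depends on the kernel, the timer hardware, and the system load.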

Proposed Solution • Goal: REAL nanosecond resolution for thread sleep times without increased OS overhead. • Build a timer-queue IP core. • HW implementation. • Very fast, deterministic, constant-time queue management. • Timer events != CPU interrupts. • HybridThread architecture allows the TimerQueue (TQ) to “talk” directly with the thread manager (TM) and scheduler (SCHED). • Uniform interface for both SW and HW.

How it will work • nanosleep(threadID, delta) • Call interacts with the TimerQueue core. • Adds an entry to a sorted queue. • Calling thread “yields”. • Time goes on → timer events expire. • Upon expiration: • TQ sends an add_thread message to the TM. • Single bus transaction. • TQ removes the timer-event entry and waits for more to expire.
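The steps above can be modeled in software to make the intended behavior concrete. This is a minimal sketch, not the proposed hardware design: the type and function names, the fixed capacity of 8 entries, and the choice to measure delta in TQ clock cycles are all assumptions made for illustration. Each thread ID written to woken[] stands in for one add_thread message sent to the TM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TQ_CAPACITY 8 /* assumed queue depth, for illustration only */

typedef struct {
    uint64_t expiry;    /* absolute cycle at which the entry expires */
    uint32_t thread_id;
} tq_entry;

typedef struct {
    tq_entry entries[TQ_CAPACITY]; /* kept sorted, earliest expiry first */
    size_t   count;
    uint64_t now;                  /* free-running cycle counter */
} timer_queue;

static void tq_init(timer_queue *tq)
{
    tq->count = 0;
    tq->now = 0;
}

/* Models nanosleep(threadID, delta): insert a timer event in sorted order.
 * Returns 0 on success, -1 if the queue is full. */
static int tq_sleep(timer_queue *tq, uint32_t thread_id, uint64_t delta)
{
    assert(tq != NULL);
    if (tq->count == TQ_CAPACITY)
        return -1;
    uint64_t expiry = tq->now + delta;
    size_t i = tq->count;
    while (i > 0 && tq->entries[i - 1].expiry > expiry) {
        tq->entries[i] = tq->entries[i - 1]; /* shift later events back */
        i--;
    }
    tq->entries[i] = (tq_entry){ .expiry = expiry, .thread_id = thread_id };
    tq->count++;
    return 0;
}

/* Models one clock cycle: advance time, then look only at the head of the
 * queue. Expired thread IDs are written to woken[]; returns how many. */
static size_t tq_tick(timer_queue *tq, uint32_t woken[TQ_CAPACITY])
{
    assert(tq != NULL);
    tq->now++;
    size_t fired = 0;
    while (fired < tq->count && tq->entries[fired].expiry <= tq->now) {
        woken[fired] = tq->entries[fired].thread_id;
        fired++;
    }
    for (size_t i = fired; i < tq->count; i++) /* drop expired entries */
        tq->entries[i - fired] = tq->entries[i];
    tq->count -= fired;
    return fired;
}
```

Because the queue is sorted, each cycle only needs to compare the head entry against the current time, so the per-cycle work does not grow with the number of sleeping threads unless several events expire together.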

Design Decisions • Actual queue structure. • Space vs. Time. • Registers • Fast, parallel access. • Take up CLBs. • BRAMs • Fast, sequential access. • Don’t take up CLBs. • Sorted vs. Unsorted. • “Monitor” the top of the queue vs. constant queue traversal.
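One way to picture the unsorted, register-based end of this trade-off is a bank of independent slots, each with its own down-counter. The sketch below is a software model under assumed names and an assumed depth of 8 slots. The loop in utq_tick stands in for what registers could do in parallel in a single clock cycle, and it is the “constant queue traversal” cost a sequential BRAM-based design would have to pay on every tick.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SLOTS 8 /* assumed number of timer slots, for illustration only */

typedef struct {
    bool     valid;
    uint32_t thread_id;
    uint64_t remaining; /* cycles left until this event expires */
} slot;

typedef struct {
    slot slots[SLOTS]; /* no ordering is maintained between slots */
} unsorted_tq;

static void utq_init(unsorted_tq *q)
{
    for (size_t i = 0; i < SLOTS; i++)
        q->slots[i].valid = false;
}

/* Insert into any free slot; no sorting is needed.
 * Returns 0 on success, -1 if every slot is in use. */
static int utq_sleep(unsorted_tq *q, uint32_t thread_id, uint64_t delta)
{
    assert(delta > 0); /* this model requires at least one cycle of sleep */
    for (size_t i = 0; i < SLOTS; i++) {
        if (!q->slots[i].valid) {
            q->slots[i] = (slot){ true, thread_id, delta };
            return 0;
        }
    }
    return -1;
}

/* One clock cycle: every valid slot counts down; slots reaching zero fire.
 * Expired thread IDs are written to woken[]; returns how many. */
static size_t utq_tick(unsorted_tq *q, uint32_t woken[SLOTS])
{
    size_t n = 0;
    for (size_t i = 0; i < SLOTS; i++) {
        if (q->slots[i].valid && --q->slots[i].remaining == 0) {
            woken[n++] = q->slots[i].thread_id;
            q->slots[i].valid = false; /* slot becomes free for reuse */
        }
    }
    return n;
}
```

Insertion is trivial here because order does not matter, but every entry must be examined on every cycle. A sorted queue reverses that: insertion costs more, while expiry checking only has to monitor the top of the queue.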

Conclusion • FPGA implementation of TimerQueue • Allows the TQ to monitor events at the rate of the FPGA clock (~100 MHz, i.e. roughly one check every 10 ns). • Accurate nanosecond-scale sleep times!!! • Gives both SW and HW threads uniform access to TQ services! • Provides an abstraction of the FPGA resources for the everyday programmer!
