Parallel ClockDesigner
Andrew Menard
18.377 Final Project, Spring 2002

The Problem
• Distribute a clock signal from one source to a large number of latches using a tree.
• The delay must be the same on every path through the tree.
• Each branch must satisfy minimum and maximum size constraints.
• The resources used must be minimized.

Initial algorithm
• Add latches to a net until one more would violate a constraint, then create another net.
• This results in a solution that is legal but suboptimal.
• This legal solution can be handed to a refinement optimizer, which will improve it.

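The greedy construction above can be sketched in a few lines. This is only an illustration under stated assumptions: the slides do not say how "size" is measured, so the sketch models it as a per-latch load summed over the net, and the names `build_initial_nets`, `latch_loads`, and `max_load` are hypothetical. The minimum size constraint is not modeled.

```python
def build_initial_nets(latch_loads, max_load):
    """Greedily pack latches into nets.

    latch_loads: list of per-latch loads (an assumed stand-in for "size").
    max_load:    assumed maximum total load a single net may carry.
    Returns a list of nets, each a list of latch indices.
    """
    nets = []
    current_load = 0.0
    for idx, load in enumerate(latch_loads):
        if load > max_load:
            raise ValueError(f"latch {idx} alone exceeds max_load")
        # Start a new net when adding this latch would break the constraint.
        if not nets or current_load + load > max_load:
            nets.append([])
            current_load = 0.0
        nets[-1].append(idx)
        current_load += load
    return nets


if __name__ == "__main__":
    print(build_initial_nets([3, 3, 3, 3, 3], max_load=7))
    # [[0, 1], [2, 3], [4]]
```

The last net ends up under-filled here, which is the kind of slack a later refinement pass would try to recover.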
Serial refinement optimizer
• For each pair of nets, check whether any combination of latches can be swapped between them that makes the larger net smaller.
• Repeat several times over the set of nets, so that every pair of nets is refined several times.

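A minimal sketch of the pairwise refinement loop follows. It makes two simplifying assumptions that are not in the slides: net size is measured as the half-perimeter of the bounding box of the latch positions, and only one-for-one latch swaps are tried rather than arbitrary combinations. The function names `cost`, `refine_pair`, and `refine` are hypothetical.

```python
from itertools import combinations


def cost(net):
    """Assumed size metric: bounding-box half-perimeter of the latch positions."""
    xs = [p[0] for p in net]
    ys = [p[1] for p in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def refine_pair(a, b):
    """Try every one-for-one swap between nets a and b (lists of (x, y)).

    A swap is kept only if it makes the larger of the two nets smaller.
    Returns True if at least one swap was kept.
    """
    improved = False
    for i in range(len(a)):
        for j in range(len(b)):
            before = max(cost(a), cost(b))
            a[i], b[j] = b[j], a[i]          # tentative swap
            if max(cost(a), cost(b)) < before:
                improved = True              # keep it
            else:
                a[i], b[j] = b[j], a[i]      # undo
    return improved


def refine(nets, passes=3):
    """Sweep over all pairs of nets several times, stopping early if nothing changes."""
    for _ in range(passes):
        changed = False
        for a, b in combinations(nets, 2):
            if refine_pair(a, b):
                changed = True
        if not changed:
            break
    return nets
```

Because swaps are one-for-one, the number of latches in each net never changes, so a count-based size constraint that held before refinement still holds afterwards.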
Parallel refinement optimizer
• The mother process spawns one child process per processor.
• The mother process sends a net and a list of its adjacent nets to each child, then repeats.
• Each child process spins waiting for data, optimizes the net it receives against the list of adjacent nets, then waits again.

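The mother/child structure described above resembles a standard work-queue pattern, sketched below. This is not the project's code: it uses Python threads as stand-ins for the child processes, the children block on a queue instead of spinning, and `run_parallel`, `jobs`, and `optimize` are hypothetical names. In standard CPython, threads would not speed up CPU-bound work, so a real version would more likely use separate processes.

```python
import queue
import threading


def run_parallel(jobs, optimize, num_workers=4):
    """Dispatch (net_id, adjacent_ids) jobs to a pool of worker threads.

    jobs:     iterable of (net_id, adjacent_ids) pairs sent out by the mother.
    optimize: callable(net_id, adjacent_ids) doing the per-net refinement.
    Returns a list of (net_id, result) pairs in completion order.
    """
    todo = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def child():
        while True:
            job = todo.get()          # wait for the mother to send work
            if job is None:           # sentinel: no more work
                return
            net_id, adjacent = job
            out = optimize(net_id, adjacent)
            with results_lock:
                results.append((net_id, out))

    # Mother: spawn the children once, then feed them jobs.
    workers = [threading.Thread(target=child) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for job in jobs:
        todo.put(job)
    for _ in workers:
        todo.put(None)                # one sentinel per child
    for w in workers:
        w.join()
    return results
```

Starting the workers once and reusing them for every job keeps the startup cost out of the inner loop.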
Complications
• Child processes have to lock the nets they are working on, which is a significant time sink.
• Different pairs of nets can take wildly different amounts of time, which causes some serialization at the end of each job.
• Starting threads is expensive.

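The slides do not say how the nets are locked. One common approach, shown here purely as an assumption, is to give each net its own lock and always acquire the locks for a group of nets in ascending index order, so that two workers refining overlapping nets cannot deadlock. The class name `NetLocks` and its `hold` method are hypothetical.

```python
import threading
from contextlib import contextmanager


class NetLocks:
    """One lock per net; groups of nets are locked in a fixed global order."""

    def __init__(self, num_nets):
        self._locks = [threading.Lock() for _ in range(num_nets)]

    @contextmanager
    def hold(self, *net_ids):
        # Sorting gives every worker the same acquisition order,
        # which rules out circular waits between workers.
        ordered = sorted(set(net_ids))
        for n in ordered:
            self._locks[n].acquire()
        try:
            yield
        finally:
            for n in reversed(ordered):
                self._locks[n].release()
```

A worker refining net 5 against net 2 would wrap the swap search in `with locks.hold(5, 2):`. Every acquisition is a potential wait, which is consistent with the slide's remark that locking is a significant time sink.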
Results
• A 20% speedup in the 4-processor version over the serial version.
• Negligible change in memory use.
• The scheduler matters a lot.
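To put the speedup figure in context, the short calculation below assumes that "20% speedup" means the 4-processor run is 1.2 times as fast as the serial run; the slides do not define the term. It computes the parallel efficiency and, using Amdahl's law as a rough model, the fraction of the work that would have to be perfectly parallel to give that speedup. The function names are hypothetical, and Amdahl's law ignores the locking and startup overheads listed under Complications, so the second number is only indicative.

```python
def parallel_efficiency(speedup, processors):
    """Speedup achieved per processor (1.0 would be ideal scaling)."""
    return speedup / processors


def amdahl_parallel_fraction(speedup, processors):
    """Solve Amdahl's law S = 1 / ((1 - p) + p / N) for the parallel fraction p."""
    return (1 - 1 / speedup) / (1 - 1 / processors)


if __name__ == "__main__":
    s, n = 1.2, 4  # assumed reading of "20% speedup on 4 processors"
    print(f"efficiency: {parallel_efficiency(s, n):.2f}")
    print(f"implied parallel fraction: {amdahl_parallel_fraction(s, n):.2f}")
```

Under this reading the efficiency is about 0.30 and the implied parallel fraction about 0.22.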