Tcl/Tk Conference 2012
Pulling Out All The Stops – Part II
Phil Brooks
November 15, 2012
Purpose of the talk
• Experiences developing for maximum performance with Threads in Tcl
• Recap Pulling Out All The Stops – Part I
• Changing customer usage patterns
• Threads and Performance
• Observations
IIT Engr - August 2011 D2S Operations Meeting
Context
Application
• Calibre LVS Device Extraction
  • Uses geometric analysis to identify devices in a graphical database
• High Performance
  • Multi-threaded execution
• User programmability
  • SVRF calculator
  • Tcl
IIT Engr – February, 2012 D2S Operations Meeting
Pulling Out All The Stops – Part I
• Developed in 2005
• Device TVF: a Tcl-based extension to the then-existing SVRF Calculator
• Achieved excellent performance by:
  • Evaluating with TCL_EVAL_GLOBAL
  • Pre-run compilation using Tcl_EvalObjv
  • Pre-run data access setup via cached Tcl_Obj arguments and results
• End users strongly encouraged to write efficient Tcl code
  • (remember expr { ... })
• The interpreter was single threaded, and computation threads accessed it serially through a lock.
Initial Use
Originally, Tcl was added to avoid having to add looping constructs to our calculator.
The interface was simple: call with parameters and return a single result.
A few Tcl calls per device:
    p1 = TVF_NUMERIC_FUNCTION::libname::procname( parm1, parm2, parm3)
    p2 = TVF_NUMERIC_FUNCTION::libname::procname( parm4, parm5, parm6)
Early user code tended to look like this:
    proc do_calculation my_arr {
        set entry_count [ $my_arr entry_count ]
        # iterate using an index
        set val 0.0
        for { set i 0 } { $i < $entry_count } { incr i } {
            set val [ expr { .... } ]
        }
        return $val
    }
(where ... was a very long expression)
8 years later
• dozens of parameters
• complex multi-step calculations
• returning long string results instead of numbers
• dozens of calls per device
• So we went from:
  • 2005: good scaling across 4-8 processors using a single Tcl interpreter
• to:
  • 2012: poor scaling across 8-32 processors using a single Tcl interpreter
So let's turn on Multi-Threading
Look for information on Tcl threads:
"At the C programming level, Tcl's threading model requires that a Tcl interpreter be managed by only one thread."
p. 322, "Practical Programming in Tcl and Tk", Brent B. Welch, Ken Jones, with Jeffrey Hobbs, Prentice Hall, 2003
More about Calibre threading
    revision 1.1
    date: 1998/04/10 17:41:31; author: ****; state: Exp;
    routines related to flat drc thread are going be in this file. Currently there is not much in it.
Task Queue and thread pool pattern
[Figure: diagram from the Wikipedia article on the Thread Pool pattern]
Task Queue and Thread Pool API
• Basic work flow
  • Define a task (data and code)
  • Put it in the queue
  • Wait until it is done
• While working on a task:
  • Create thread-specific storage (Tcl_CreateInterp)
  • Do calculations (Tcl_EvalObjv)
  • Return results
• Where does Tcl_DeleteInterp go?
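The basic work flow above (define a task, put it in the queue, wait until it is done) can be sketched with plain POSIX threads. This is a generic illustration of the pattern, not Calibre's task queue API: Task, TaskQueue, run_batch, and the example task function square are all hypothetical, and the Tcl interpreter handling is deliberately left out.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 64

/* Hypothetical task: data (input/result) plus code (run). */
typedef struct Task {
    double (*run)(double);
    double input;
    double result;
    struct Task *next;
} Task;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t has_work;   /* signalled when a task is queued */
    pthread_cond_t all_done;   /* signalled when pending drops to 0 */
    Task *head, *tail;
    int pending;               /* queued plus running tasks */
    int shutting_down;
} TaskQueue;

/* Example task code used for illustration only. */
double square(double x) { return x * x; }

static void queue_put(TaskQueue *q, Task *t)
{
    t->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail) q->tail->next = t; else q->head = t;
    q->tail = t;
    q->pending++;
    pthread_cond_signal(&q->has_work);
    pthread_mutex_unlock(&q->lock);
}

static void *worker_main(void *arg)
{
    TaskQueue *q = arg;
    pthread_mutex_lock(&q->lock);
    for (;;) {
        while (!q->head && !q->shutting_down)
            pthread_cond_wait(&q->has_work, &q->lock);
        if (!q->head) break;               /* shutting down, queue drained */
        Task *t = q->head;
        q->head = t->next;
        if (!q->head) q->tail = NULL;
        pthread_mutex_unlock(&q->lock);

        t->result = t->run(t->input);      /* work happens outside the lock */

        pthread_mutex_lock(&q->lock);
        if (--q->pending == 0) pthread_cond_broadcast(&q->all_done);
    }
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

/* Run all tasks on a pool of nworkers threads and wait for completion.
 * Returns 0 on success, -1 if no worker thread could be started. */
int run_batch(Task *tasks, int ntasks, int nworkers)
{
    TaskQueue q = { .head = NULL, .tail = NULL, .pending = 0, .shutting_down = 0 };
    pthread_t workers[MAX_WORKERS];
    int started = 0;

    if (nworkers < 1) nworkers = 1;
    if (nworkers > MAX_WORKERS) nworkers = MAX_WORKERS;
    pthread_mutex_init(&q.lock, NULL);
    pthread_cond_init(&q.has_work, NULL);
    pthread_cond_init(&q.all_done, NULL);

    while (started < nworkers &&
           pthread_create(&workers[started], NULL, worker_main, &q) == 0)
        started++;

    if (started > 0) {
        for (int i = 0; i < ntasks; i++) queue_put(&q, &tasks[i]);
        pthread_mutex_lock(&q.lock);       /* wait until every task is done */
        while (q.pending > 0) pthread_cond_wait(&q.all_done, &q.lock);
        q.shutting_down = 1;               /* then release the workers */
        pthread_cond_broadcast(&q.has_work);
        pthread_mutex_unlock(&q.lock);
        for (int i = 0; i < started; i++) pthread_join(workers[i], NULL);
    }
    pthread_cond_destroy(&q.all_done);
    pthread_cond_destroy(&q.has_work);
    pthread_mutex_destroy(&q.lock);
    return started > 0 ? 0 : -1;
}
```

In this sketch the workers exit after each batch. A long-lived pool, where threads park between batches, is what raises the question of when per-thread state such as an interpreter gets cleaned up.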
How to clean up the interpreters?
• Loop through them and delete them from the main thread
• Right?
Threads would individually calculate using Tcl, and they ran nicely in parallel. At the end of the processing, once all of the tasks were done and the threads were all quietly parked in their Calibre threading model parking slots, I would loop through on the main thread and destroy each interpreter. This is where things would go terribly wrong: crashes, memory leaks, unclosed files, etc.
• I was able to reproduce the behavior in a small C/Tcl test case.
Gerald spotted the problem
Gerald W. Lester:
...
> I also found that by deleting the interpreter inside the same execution thread that it was created in,
Where else would you be deleting it from -- an interpreter is not supposed to be accessed by more than one thread. You may have many interpreters per thread (i.e. a thread can access/use many interpreters), but only one thread per interpreter (i.e. only a single thread should be accessing a given interpreter).
Review the documentation
"At the C programming level, Tcl's threading model requires that a Tcl interpreter be managed by only one thread." p. 322
"At the C programming level, Tcl's threading model requires that a Tcl interpreter be created, used and destroyed by only one thread. Interpreters cannot be used across multiple threads."
MT Performance
[Chart: series labeled Tcl 8.4, Tcl 8.5, C++, Tcl 8.6]
www.mentor.com