GMTK parallel tools
Arthur Kantor, 9/5/06
Overview
• Before diving in, consider reading:
  • Bowon’s Sun Grid Engine cheat sheet
    • http://www.ifp.uiuc.edu/~bowonlee/research/cluster/linux_cluster.htm
  • GMTK documentation
    • http://ssli.ee.washington.edu/~bilmes/gmtk/
• Parallel scripts for emtrain and viterbi
• Other useful scripts from JHU WS06
• JHU WS06 parallel scripts
  • More finicky but do more (for now)
Parallel scripts for emtrain and viterbi
ifp-32.ifp.uiuc.edu/cworkspace/ifp-32-1/hasegawa/programs/gmtk/parallelImproved
• distribute.pl
  • Reads a list of tasks from a file and runs them in parallel on the cluster
  • Example
• emtrainParallel.pl (viterbiParallel.pl)
  • Runs a single iteration of gmtkEmtrainNew (gmtkViterbiNew)
  • Determines all the parallelisation settings automatically
  • Can be safely restarted after a crash
  • Does a sanity check of your settings before launching jobs in parallel
  • Example
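The two ideas above (run a list of tasks in parallel, and restart safely after a crash) can be illustrated with a small Python sketch. This is not the code of distribute.pl or emtrainParallel.pl: it runs commands as local subprocesses rather than submitting them to Sun Grid Engine, and the task-file format (one shell command per line), the `.done` marker files, and all function names are assumptions made for illustration.

```python
import hashlib
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_tasks(task_file):
    """Return the task file's non-empty, non-comment lines (assumed: one shell command each)."""
    lines = Path(task_file).read_text().splitlines()
    return [ln.strip() for ln in lines if ln.strip() and not ln.lstrip().startswith("#")]


def marker_for(task, done_dir):
    """Hypothetical marker file recording that a task already finished successfully."""
    digest = hashlib.sha1(task.encode("utf-8")).hexdigest()
    return Path(done_dir) / (digest + ".done")


def run_task(task, done_dir):
    """Run one task unless its marker exists; write the marker only on success."""
    marker = marker_for(task, done_dir)
    if marker.exists():
        return task, "skipped"
    result = subprocess.run(task, shell=True)
    if result.returncode == 0:
        marker.touch()
        return task, "ok"
    return task, "failed"


def run_all(task_file, done_dir, workers=4):
    """Run all tasks with a pool of workers; results come back in task-file order."""
    Path(done_dir).mkdir(parents=True, exist_ok=True)
    tasks = read_tasks(task_file)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda t: run_task(t, done_dir), tasks))
```

Calling `run_all("tasks.txt", "done")` a second time after a failure re-runs only the tasks that have no marker, which is one simple way to get restart-after-crash behaviour.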
Other useful scripts from JHU WS06
ifp-32.ifp.uiuc.edu/cworkspace/ifp-32-1/hasegawa/programs/gmtk/scripts
• View model files: lg
• Create initial Gaussian mixtures: genGMParms.pl
• Create decision trees (DTs)
  • genDictionaryDT.pl creates a DT that determines the phone, given the word, pronunciation variant and phoneCounter
  • Other scripts create word transition DTs, word Counter to word, etc.
• All DTs have already been generated for Svitchboard
Data
ifp-32.ifp.uiuc.edu:/cworkspace/ifp-32-1/hasegawa/jhu06/export/ws06afsr/data/SVB
• Svitchboard data for tasks with vocabulary sizes of 10, 25, 50, 100, 250 and 500
• NN outputs for all of Svitchboard, for use in tandem models, are available
• Broken up into 5-fold cross-validation chunks
• File lists already generated
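For readers unfamiliar with 5-fold cross-validation, the following Python sketch shows how a file list can be divided into five chunks and how one chunk is held out for testing. The round-robin assignment and the function names are assumptions for illustration only; the slides do not say how the Svitchboard chunks were actually formed, and the pre-generated file lists should be used as they are.

```python
def make_folds(items, k=5):
    """Split items into k chunks by round-robin assignment (an assumed scheme)."""
    return [items[i::k] for i in range(k)]


def train_test_split(folds, held_out):
    """Use one chunk for testing and the concatenation of the others for training."""
    test = folds[held_out]
    train = [x for i, fold in enumerate(folds) if i != held_out for x in fold]
    return train, test
```

Each of the five chunks is held out once in turn, so every utterance is tested on exactly once.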
JHU WS06 parallel scripts
ifp-32.ifp.uiuc.edu/cworkspace/ifp-32-1/hasegawa/programs/gmtk/parallel
• Example config files