Building and Modifying Condor

Learn how to build and modify Condor, a high-throughput computing system. This talk covers space and UNIX requirements, GNU tools, downloading, and the building process.

Presentation Transcript


  1. Nick LeRoy Computer Sciences Department University of Wisconsin - Madison nleroy@cs.wisc.edu http://www.cs.wisc.edu/condor Building and Modifying Condor

  2. Before I start … • If you have any questions, stop me along the way • There should hopefully be time for discussions after the talk • Feel free to talk to me, or any of the Condor developers, any time during the conference • Todd will give the last part of the talk • Windows specifics

  3. Space Requirements • 5G is probably enough • Actual amount depends on the actual features built • Bare minimum 2G • Temporary space is required for building externals, automatically cleaned up

  4. UNIX Requirements • Most tools are standard on Linux development systems • In other cases, they can be downloaded as binaries • Or, downloaded as source and built by hand

  5. UNIX Requirements List • GNU tools: • GNU make • GNU autoconf and autoheader (2.59 or greater) • GNU tar (1.13 or higher) • GNU Compiler Collection (gcc >= 2.95.3) • gzip • Other tools: • perl (5.005_03 or greater) • patch (must support unified diffs, GNU patch is preferred) • strip (can be either GNU or the vendor's version) • lex • yacc (or GNU bison) • some other typically-found utilities (for example, cut, awk, etc.)

  6. Getting it • Download from the same place that you download the rest of Condor • In the form of a gzip-ed “tarball” • Unpack the tarball • If you don’t know how to do this, try: tar -xzf condor_src-7.1.0-all-all.tar.gz

  7. First Glance • BUILD-ID • NMI build ID, you can ignore this • config and imake • Yes, we still use imake • The rest of the world wisely abandoned it years ago … • You can probably ignore these • Adds requirement: GNU cpp <= 4.1.3 • LICENSE-2.0.txt • Copy of the Apache License, Version 2.0 • The license under which we’ve released Condor

  8. Interesting Pieces • README.building • Document describing building Condor • NTconfig • Files required for building under Windows • externals • Externally maintained packages • Some are “hard” requirements, others “soft” • src • The Condor source code 

  9. Simple Build • The basic Condor build is simple: $ cd src $ ./build_init $ ./configure $ make

  10. Didn’t work? • Most common problem is that you’re trying to build on a system that we haven’t ported the Standard Universe to • Solution: Disable the standard universe and try again $ ./configure --disable-full-port \ --disable-gcc-version-check $ make

  11. Externals • Always have your bags packed • Bags are getting pretty big these days • Globus, ClassAds, PCRE, zlib, Kerberos • Externals and their versions are selected by configure • To use system packages: $ ./configure --enable-proper • “All or nothing” • Some features (in particular Condor-G) will be disabled • We’re working on making this selective • Externals tree selected by: $ ./configure --with-externals=/path/tree

  12. First look at src • CODING_GUIDELINES • condor_* • Directories with most of the source code • In the future, we’ll rename them and get rid of the condor_ prefix • Also: h • We’ll look at more of these later

  13. Configuring the build • Uses GNU configure • Some options, like --prefix, don’t work • Make sure that the cpp you use isn’t >= 4.2 $ export CXXCPP=/usr/bin/cpp-4.1 $ ./configure • Default: $ ./configure

  14. Minimal configuration • To save disk & time, use the --without-xxx or --disable-xxx options for features you don’t care about • Use ./configure --help to get a list of them • Packages listed as “hard requirement” can’t be turned off • There are some interdependencies $ ./configure --without-globus --without-nordugridgahp --without-unicoregahp --without-gt4gahp --without-srb --without-oci --without-gcb --without-gsoap --without-drmaa --without-gahp --without-blahp --disable-full-port

  15. Some Problems & Solutions Unknown GCC version configure: error: Condor will not compile with gcc version 4.2.1 • Try: $ ./configure --disable-gcc-version-check • The build itself may fail due to compiler incompatibilities

  16. Some Problems & Solutions Unknown glibc version checking glibc... ERROR configure: error: Condor does NOT know what glibc external to use with glibc-2.6.1 • Edit (yeah, with vi or emacs) configure.ac • Around line 2500, add a block for your glibc version (cut & paste from nearby): "2.6.1" ) # OpenSUSE 10.3 uses glibc 2.6.1 including_glibc_ext=NO ;; • Rerun ./build_init for this to take effect

  17. Build it • From the src directory: $ make • Will build the externals as required • Go get a beverage – this could take quite a while

  18. Build Problems & Solutions Error in ClassAds external classads-1.0rc5: FAILED! (see /home/condor-7.1.0/externals/build/log.classads-1.0rc5) • Disable ClassAds in configure: $ ./configure --without-classads • condor_q -better-analyze will be broken

  19. Build Problems & Solutions Error building other externals xxxx-1.2.3: FAILED! (see /home/condor-7.1.0/externals/build/log.xxxx-1.2.3) • Disable xxxx in configure: $ ./configure --without-xxxx • If this is a “hard requirement” or you rely on this feature: • Look in the above log and correct the problem

  20. Build Problems & Solutions Standard Universe /tmp/IIf.0twp5X:114:6: error: #error Checkpoint library not compatible with compiler! ../../imake/imake: Exit code 1. Stop. • Standard Universe features haven’t been ported to this compiler / platform yet. $ ./configure --disable-full-port
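The "not compatible with compiler" message above is produced by a preprocessor `#error`. As a rough illustration of how a compiler-version guard of that kind can be expressed, here is a minimal sketch. It is not Condor's actual check: `version_at_most`, `compiler_is_supported`, and the 4.1 limit are assumptions chosen for the example.

```cpp
// Hypothetical sketch of a compiler-version guard; not taken from Condor.

// True when (major, minor) is less than or equal to (max_major, max_minor).
constexpr bool version_at_most(int major, int minor, int max_major, int max_minor) {
    return major < max_major || (major == max_major && minor <= max_minor);
}

// Reports whether the compiler building this file identifies itself as
// GCC no newer than 4.1. The 4.1 limit is an assumed example value.
bool compiler_is_supported() {
#if defined(__GNUC__)
    return version_at_most(__GNUC__, __GNUC_MINOR__, 4, 1);
#else
    return false;  // not GCC: treated as unported in this sketch
#endif
}
```

A real guard would put the same comparison in an `#if` and emit `#error`, so that the build stops at preprocessing time instead of reporting the result at run time.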

  21. It built! make[1]: Nothing to be done for `all'. make[1]: Leaving directory `/home/build/condor-7.1.0/src/condor_examples' $ make release …

  22. Build targets • Testing release • $ make release • Suitable for testing • Creates release_dir • Public release • What we actually release to the public • $ make public • Packaged tarballs wind up in ../public

  23. Test It • We’ll create a test installation of our Condor build • We built Condor in /home/condor-7.1.0 • We’ll make our test directory a subdirectory of that • /home/condor-7.1.0/install • Do a basic install of the Condor in release_dir, just like you would any other Condor install • Or …

  24. Test Installation (Step by step) $ CONDOR=/home/condor-7.1.0/install $ mkdir $CONDOR $ cd $CONDOR $ mkdir checkpoints cred_dir execute spool log test $ ln -s ../release_dir/* . $ cp etc/examples/condor_config.generic etc/condor_config $ export CONDOR_CONFIG=$CONDOR/etc/condor_config $ vi $CONDOR_CONFIG $ export PATH=$CONDOR/bin:$CONDOR/sbin:$PATH $ rehash $ condor_master

  25. Simple checks • Run ‘ps’, verify that the Condor processes are running • Run condor_status -any • Run condor_status to verify that the Startd’s machine is correct • Make sure that you wait a bit for the Startd to publish its ad(s) • Look through the logs • Submit a simple “hello world” test job, verify that it runs as expected

  26. More tests • We have a whole suite of tests $ cd condor_tests $ make $ ./batch_test.pl -b IsThisNightly passed <…/src/condor_tests> Workspace testing … submitting . tests lib_chirpio_van.run succeeded lib_procapi_pidtracking-snapshot.run succeeded … • Wait patiently (very patiently)

  27. Use the source, Luke • Libraries • Daemon Core • Client (command line) Tools • Daemons • Standard Universe • Other

  28. Source Directories • Most of the directory names are pretty clear • We’re in the process of cleaning up, moving things around, and renaming, so be prepared for changes over time • Git is finally giving us this freedom • Quite a few have version numbers in the name that make little or no sense to the outside world (condor_startd.V6, …) • This will get cleaned up, too

  29. Layering • Master, Quill, Startd, Shadow, Starter, Collector; Submit, Q, tools, etc. • ClassAds, I/O, Daemon Client, Daemon Core, ProcAPI, SysAPI • C++ Utilities, C Utilities • “h”, includes

  30. Condor Libraries • The layering is not perfect, there are interdependencies • General purpose: • condor_util_lib • condor_c++_util • I/O & Networking: • condor_io • condor_daemon_client • Process Tracking: • condor_procapi • System Information: • condor_sysapi • ClassAds: • condor_classad • Daemon Core • condor_daemon_core.V6

  31. C / C++ Utilities • In general, there’s a utility for everything • POSIX and stdio library wrappers • C++ Standard library replacements • Condor templates (CTL) • We don’t use STL for hysterical reasons • Designed to be portable • Look here before reinventing the wheel

  32. C: dprintf() • Works like printf() • Conditionally writes to the log dprintf(D_ALWAYS, "Two + two is %d\n", 2+2); • OR together for multiple levels, so dprintf(D_COMMAND|D_SECURITY, <…>); • Useful debug levels • D_ALWAYS • D_FULLDEBUG • Everything else is probably too esoteric (see condor_debug.h)
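To make the "conditionally writes" and "OR together" points concrete, here is a minimal stand-alone sketch of a logger gated by a level bitmask. It is a stand-in, not Condor's implementation: `toy_dprintf`, the `TOY_D_*` values, and the rule that a message is written when any requested level is enabled are all assumptions for illustration (the real flags live in condor_debug.h).

```cpp
// Hypothetical stand-in for a dprintf()-style logger; not Condor's implementation.
#include <cstdarg>
#include <cstdio>

// Assumed flag values for this sketch only.
enum : unsigned {
    TOY_D_ALWAYS    = 1u << 0,
    TOY_D_FULLDEBUG = 1u << 1,
    TOY_D_COMMAND   = 1u << 2,
    TOY_D_SECURITY  = 1u << 3,
};

// Levels currently enabled, as a bitmask. TOY_D_ALWAYS is treated as always on.
static unsigned toy_enabled_levels = TOY_D_ALWAYS;

// Writes the message to stderr if any requested level is enabled.
// Returns true when the message was written.
bool toy_dprintf(unsigned levels, const char *fmt, ...) {
    if ((levels & (toy_enabled_levels | TOY_D_ALWAYS)) == 0) {
        return false;  // none of the requested levels is enabled
    }
    va_list args;
    va_start(args, fmt);
    std::vfprintf(stderr, fmt, args);
    va_end(args);
    return true;
}
```

With the default mask, `toy_dprintf(TOY_D_ALWAYS, "Two + two is %d\n", 2+2)` is written, while `toy_dprintf(TOY_D_COMMAND | TOY_D_SECURITY, ...)` is dropped until one of those bits is added to `toy_enabled_levels`.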

  33. C++: MyString.h • Similar to STL’s string • Prefer MyString buffer to char buffer[1024] • automatically allocates and resizes memory • Notable methods / operators: • sprintf() and sprintf_cat() • Value() and GetCStr() – read-only access • += is overloaded to append a lot of types to the string • perl-like chomp() and trim() to get rid of whitespace • readLine() that can slurp in data from a FILE* and ostreams • replacement for strtok() • Other tricks • search for substrings • escape characters
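For readers who have not used perl, the following sketch shows what chomp- and trim-style operations do, written against `std::string` so that it compiles without the Condor tree. `chomp_copy` and `trim_copy` are hypothetical helper names, and the exact set of characters MyString strips is an assumption here.

```cpp
// Hypothetical helpers illustrating chomp/trim semantics; not MyString itself.
#include <string>

// Removes one trailing line ending ("\n" or "\r\n"), if present.
std::string chomp_copy(std::string s) {
    if (!s.empty() && s.back() == '\n') {
        s.pop_back();
        if (!s.empty() && s.back() == '\r') s.pop_back();
    }
    return s;
}

// Removes leading and trailing spaces, tabs, and line breaks.
std::string trim_copy(const std::string &s) {
    const char *whitespace = " \t\r\n";
    const std::string::size_type first = s.find_first_not_of(whitespace);
    if (first == std::string::npos) return "";  // nothing but whitespace
    const std::string::size_type last = s.find_last_not_of(whitespace);
    return s.substr(first, last - first + 1);
}
```

Both helpers return a new string and leave the input untouched, which keeps the sketch simple; MyString's own methods are described on the slide as operating on the object.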

  34. C++: Configuration • Lookup values from the configuration • NOT a ClassAd! • Basic: param(const char *name) • Returns a char * that you must decode manually • You MUST free() this buffer! • Others: param_<type>(<name>) • Decodes to the specified type, and free()’s the buffer • Does NOT handle expressions! • Integer: param_integer(<name>) • Double: param_double(<name>) • Boolean: param_boolean(<name>)
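The ownership rule above (the basic lookup hands back a buffer you must free(), the typed lookups free it for you) can be sketched as follows. This is a stand-alone illustration, not Condor's code: `toy_config`, `toy_param`, `toy_param_integer`, and the `default_value` argument are all hypothetical.

```cpp
// Hypothetical stand-ins for param() and param_integer(); not Condor's code.
#include <cstdlib>
#include <cstring>
#include <map>
#include <string>

// Toy configuration table standing in for the parsed configuration files.
static std::map<std::string, std::string> toy_config = {
    {"MAX_JOBS", "42"},
    {"LOG", "/tmp/toy_log"},
};

// Returns a malloc()ed copy of the raw value, or a null pointer if the name
// is unset. The caller owns the buffer and must free() it.
char *toy_param(const char *name) {
    auto it = toy_config.find(name);
    if (it == toy_config.end()) return nullptr;
    char *copy = static_cast<char *>(std::malloc(it->second.size() + 1));
    if (copy) std::strcpy(copy, it->second.c_str());
    return copy;
}

// Looks up the value, decodes it as a base-10 integer, and frees the buffer.
// Falls back to default_value when the name is unset or is not a plain number.
int toy_param_integer(const char *name, int default_value) {
    char *raw = toy_param(name);
    if (!raw) return default_value;
    char *end = nullptr;
    long value = std::strtol(raw, &end, 10);
    bool ok = (end != raw && *end == '\0');
    std::free(raw);  // the typed wrapper releases the buffer for the caller
    return ok ? static_cast<int>(value) : default_value;
}
```

Because the typed wrapper only decodes a literal number, a value such as `2 + 2` falls back to the default in this sketch, which mirrors the slide's warning that these lookups do not handle expressions.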

  35. C++: Boolean Configuration Expressions • Boolean Expression: param_boolean_expr(<name>) • This one DOES handle expressions • Configuration: WIZBANG = ( FUBAR > 10 || SUPERCALIFRAGILISIC ) • Source Code: bool wizbang = param_boolean_expr( "WIZBANG" );

  36. More C & C++ • Wrappers and similar: • safe_open_wrapper(), my_popen() • “CTL” • ExtArray, string_list, Queue, tree, stringSpace, counted_ptr • A lot of other classes & functions • File / Directory access classes: Directory, StatInfo • exponential_backoff • my_hostname(), my_username()

  37. Condor I/O & Networking • All Condor daemons have a “Command Socket” • Data is encoded with CEDAR • Condor External DAta Representation • CEDAR is all-singing, all-dancing • Data representation • socket abstraction • Security • bandwidth limiting • port ranges

  38. Stream, Sock, et al. • The layering of the Condor socket objects is not obvious • Stream (base class, in stream.{h,C} ) • CEDAR streaming • Integers, chars, strings, etc. • Sock (derived from Stream, in sock.{h,C} ) • Adds connection / session management • ReliSock (derived from Sock, in reli_sock.{h,C} ) • TCP-specific “Sock” • SafeSock (derived from Sock, in safe_sock.{h,C} ) • UDP-specific “Sock”
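The inheritance chain described above can be pictured with the skeleton below. Only the class names and their parent/child relationships come from the slide; every member (`transport()`, `is_connected()`, `describe()`) is invented for illustration and is not part of Condor's real interfaces.

```cpp
// Hypothetical skeleton of the hierarchy described above; members are invented.
#include <string>

class Stream {                      // base class: CEDAR streaming of ints, strings, ...
public:
    virtual ~Stream() {}
    virtual std::string transport() const = 0;
};

class Sock : public Stream {        // adds connection / session management
public:
    bool is_connected() const { return connected_; }
protected:
    bool connected_ = false;
};

class ReliSock : public Sock {      // TCP-specific Sock
public:
    std::string transport() const override { return "TCP"; }
};

class SafeSock : public Sock {      // UDP-specific Sock
public:
    std::string transport() const override { return "UDP"; }
};

// Code written against Stream works with either concrete socket type.
std::string describe(const Stream &s) {
    return "stream over " + s.transport();
}
```

The point of the layering is visible in `describe()`: code that only needs to encode and decode data can take a `Stream` and stay unaware of whether TCP or UDP is underneath.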

  39. Daemon Client • Series of classes with knowledge of how to communicate with specific daemons • Master, Collector, Startd, etc. • All derived from a common base

  40. ClassAds • C++ API to access the ClassAds that Condor uses internally • “Old” ClassAds • Subclassed from AttrList, so look there • Lookup() versus Eval() • Lookup() will return “7 + 2” • Eval() will return 9 • ClassAds are parsed to ExprTree(s) • Can generally avoid this and use Eval<Type> • Insert() and Assign() to update the ad • sPrint(), fPrint(), and dPrint() to serialize
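The Lookup() versus Eval() distinction is easier to see in miniature. The toy class below stores expression text per attribute; its `Lookup` returns the text as stored and its `EvalInteger` computes a value. `ToyAd` and its method signatures are hypothetical, and the evaluator only understands integers joined by `+`, which is far less than real ClassAd expressions.

```cpp
// Hypothetical toy ad; not the Condor ClassAd API.
#include <map>
#include <sstream>
#include <string>

class ToyAd {
public:
    // Stores the unevaluated expression text for an attribute.
    void Insert(const std::string &attr, const std::string &expr) {
        exprs_[attr] = expr;
    }

    // Copies the stored expression text unchanged; false if the attribute is missing.
    bool Lookup(const std::string &attr, std::string &expr) const {
        auto it = exprs_.find(attr);
        if (it == exprs_.end()) return false;
        expr = it->second;
        return true;
    }

    // Evaluates expressions of the form "int + int + ..."; false on anything else.
    bool EvalInteger(const std::string &attr, int &value) const {
        std::string expr;
        if (!Lookup(attr, expr)) return false;
        std::istringstream in(expr);
        int sum = 0;
        int term = 0;
        char op = '+';
        if (!(in >> sum)) return false;
        while (in >> op) {
            if (op != '+' || !(in >> term)) return false;
            sum += term;
        }
        value = sum;
        return true;
    }

private:
    std::map<std::string, std::string> exprs_;
};
```

After `Insert("Total", "7 + 2")`, `Lookup` yields the string `7 + 2` and `EvalInteger` yields the integer 9, matching the example on the slide.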

  41. Condor Daemons • The code for most Condor daemons is in directories named after the daemon: • Startd is in condor_startd.V6 … • Note: 2 sets of starters / shadows • condor_starter.V5 and condor_shadow.V6 • Standard Universe • condor_{starter,shadow}.V6.1 • All others

  42. Daemon Core • Heart and body of a Condor daemon • Usually a singleton object • Event-driven loop around select() • Single threaded! • Your code registers events for select() and callbacks • Timers, Pipes, Signals, Reaper, Socket, CEDAR “Commands”
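As a rough picture of "an event-driven loop around select()" with registered callbacks, here is a miniature single-threaded loop for POSIX systems. It handles only readable file descriptors, with none of the timers, signals, or reapers listed above. `ToyEventLoop`, `register_reader`, and `run_once` are hypothetical names and bear no relation to Daemon Core's real methods.

```cpp
// Hypothetical miniature of a select()-driven callback loop (POSIX only).
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>
#include <functional>
#include <map>

class ToyEventLoop {
public:
    using Callback = std::function<void(int fd)>;

    // Registers a callback to run when fd becomes readable.
    void register_reader(int fd, Callback cb) { readers_[fd] = cb; }

    // Waits up to timeout_ms for any registered fd, then runs the matching
    // callbacks one at a time on the calling thread.
    // Returns the number of callbacks that ran.
    int run_once(int timeout_ms) {
        fd_set read_fds;
        FD_ZERO(&read_fds);
        int max_fd = -1;
        for (const auto &entry : readers_) {
            FD_SET(entry.first, &read_fds);
            if (entry.first > max_fd) max_fd = entry.first;
        }
        timeval tv;
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        if (select(max_fd + 1, &read_fds, nullptr, nullptr, &tv) <= 0) {
            return 0;  // timeout or error: nothing to dispatch
        }
        int fired = 0;
        for (const auto &entry : readers_) {
            if (FD_ISSET(entry.first, &read_fds)) {
                entry.second(entry.first);
                ++fired;
            }
        }
        return fired;
    }

private:
    std::map<int, Callback> readers_;
};
```

Because everything runs on one thread, a callback that blocks stalls every other event, which is the practical consequence of the "Single threaded!" bullet. The sketch also assumes callbacks do not register or remove readers while `run_once` is iterating.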

  43. Registering a Callback • Use Daemon Core’s Register_Command() method: daemonCore->Register_Command(128, "SAY_HELLO", (CommandHandler)&say_hello, "say_hello", NULL, READ, D_FULLDEBUG ); • Parameters: • The command number (usually defined in condor_commands.h and condor_commands.C) • Text description of the command • "CommandHandler", which is really a function pointer • Text description of the handler • The service class to use -- since this is a C handler, we don't need one. • What Permission level we need to be to call this function (i.e. HOSTALLOW_READ, HOSTALLOW_ADMINISTRATOR, etc) • What dprintf() level to use
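Conceptually, registering a command adds an entry to a table keyed by command number, and an incoming command is dispatched by looking up that entry. The sketch below shows only that idea, leaving out the service object, the permission level, and the dprintf() level. `ToyCommandTable`, `ToyCommandHandler`, and `Dispatch` are hypothetical, and the handler signature is not Condor's.

```cpp
// Hypothetical command table; names echo the slide but the code is not Condor's.
#include <map>
#include <string>
#include <utility>

typedef int (*ToyCommandHandler)(int command);

struct ToyCommandEntry {
    std::string command_name;   // text description of the command
    ToyCommandHandler handler;  // function to call when the command arrives
    std::string handler_name;   // text description of the handler
};

class ToyCommandTable {
public:
    // Registers a handler; returns false if the command number is already taken.
    bool Register_Command(int command, const std::string &command_name,
                          ToyCommandHandler handler, const std::string &handler_name) {
        ToyCommandEntry entry = {command_name, handler, handler_name};
        return table_.insert(std::make_pair(command, entry)).second;
    }

    // Calls the handler registered for the command; returns -1 if there is none.
    int Dispatch(int command) const {
        std::map<int, ToyCommandEntry>::const_iterator it = table_.find(command);
        return it == table_.end() ? -1 : it->second.handler(command);
    }

private:
    std::map<int, ToyCommandEntry> table_;
};

// Example handler: returns the command number it was called with.
int say_hello(int command) { return command; }
```

Registering `say_hello` under 128 and then dispatching 128 calls the handler; dispatching an unregistered number returns -1 in this sketch.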

  44. Some guidelines • You must not • Throw an exception • Call printf() or exit() or assert() • You can: • call ASSERT() • call dprintf()

  45. Dependency Hell • Dependencies work on Windows • Our build system has no knowledge of dependencies • If you modify an include file, make sure that everything that depends on it gets rebuilt • $ make clean && make

  46. More on Dependencies • Objects from some directories need to get “repackaged” with the C++ library • condor_classad • condor_daemon_client • Thus, to rebuild these: • $ make && make -C ../condor_c++_util

  47. (Even) More on Dependencies • If you’re working on a daemon and make a library change • Example daemon: Startd in the condor_startd.V6 directory • Example library: condor_daemon_client $ make -C ../condor_daemon_client && make -C ../condor_c++_util && make release • If you modified dc_startd.h and want to be paranoid: $ (cd ../condor_daemon_client && make clean && make) $ (cd ../condor_c++_util && make clean && make) $ make clean && make release

  48. Adding a Source File • Add the file to the appropriate section of the Imakefile • No, I’m not going to explain our Imakefile syntax here $ ../condor_imake $ make

  49. Testing & Debugging • OK, you’ve built a modified Startd; how do you test / debug it? • Remove STARTD from DAEMON_LIST • Start the master • Run the startd by hand $ ./condor_startd -t -f • -t to log to stdout • -f to run it in the foreground • CTRL-C to kill it

  50. More debugging • Segfaults can sometimes be caused by object version mismatches • You added a field to a class in C++ Util, but didn’t rebuild the Startd that uses the class • With the use of the -t and -f flags, you can debug like any other program • Adding dprintf()’s • With gdb • Using strace
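The reason a stale object file can crash is that it still uses the old memory layout of the class. The sketch below models the two builds as two structs and compares them. `JobInfoOld`, `JobInfoNew`, and their fields are hypothetical and are not taken from the Condor source.

```cpp
// Hypothetical illustration of a layout mismatch between two builds of one class.
#include <cstddef>

// Layout the daemon was compiled against.
struct JobInfoOld {
    int cluster;
    int proc;
};

// Layout after a field was added in the library.
struct JobInfoNew {
    int cluster;
    int owner_id;  // newly added field
    int proc;
};

// True when code compiled against the old layout would read `proc`
// from the wrong offset in an object built with the new layout.
bool proc_offset_changed() {
    return offsetof(JobInfoOld, proc) != offsetof(JobInfoNew, proc);
}

// True when the two builds disagree on the size of the object.
bool size_changed() {
    return sizeof(JobInfoOld) != sizeof(JobInfoNew);
}
```

Both functions return true here: a daemon built against the old layout would read `owner_id` where it expects `proc`, and would allocate too little memory for the object. Rebuilding everything that includes the changed header removes the mismatch.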
