Configuring Quill. Condor Week 2007.
[Diagram: Typical Condor Pool. Central Manager: master, negotiator, collector. Submit-Only: master, schedd. Execute-Only (two machines): master, startd. Legend: process spawned; ClassAd communication pathway.]
What is Quill? A technology to store a read-only version of the job queue and job history data in a relational database.
Why Quill? Offloads query overhead from the schedd • Performance boost! • Easier to make a web portal • RDBMS access easier than SOAP/CLI
[Diagram: Job Queue Management. Without Quill: schedd, job queue. With Quill: schedd, job queue, quilld, database.]
Quill downsides • Additional latency • More complicated setup • Handful of attributes not in DBMS
Quill and Quill++ • Quill in Condor since 6.7.11 • Quill++ (quillpp) coming soon. • Support for all daemons • Multiple schedds in one database • Support for Oracle on some platforms • Replaces quill • We’ll talk about both
[Diagram: Typical Quill’d Condor Pool. Central Manager: master, negotiator, collector. Submit-Only: master, schedd, quill. Execute-Only (two machines): master, startd. Database: postgres, written by quill and queried by condor_q. Legend: process spawned; ClassAd communication pathway.]
[Diagram: Typical Quillpp’d Condor Pool. Central Manager: master, negotiator, collector, quillpp. Submit-Only: master, schedd, quillpp. Execute-Only (two machines): master, startd, quillpp. Database: postgres, written by quillpp and queried by condor_q. Legend: process spawned; ClassAd communication pathway.]
How to use the schema? • We’ll talk about this in another talk • Quill Front End and Schema BoF • Thursday 11am
Quill (not Quill++) Deployment • One Quill daemon per schedd • Quill daemons must be uniquely named • Each Quill daemon uses a unique DB name • Currently uses PostgreSQL • Recommend PostgreSQL 8.2 or later • Better disk management
Quill++ deployment • One condor_quillpp per machine • One condor_dbmsd per database • Manual installation of schema • One DB per pool • Uses Postgres or Oracle
Condor’s Interface to Quill • Modified two tools to utilize the DB • condor_q • condor_history
A User Perspective: condor_q • condor_q changes • When QUILL_ENABLED is TRUE, queries go to the RDBMS • -name takes a ScheddName or QuillName • -avgqueuetime reports the average time in queue for all jobs
condor_q -direct • -direct rdbms • (default when QUILL_ENABLED = TRUE) • -direct quilld • (useful for firewall traversal) • -direct schedd • (100% up-to-date view)
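The three -direct modes can be wrapped in a small helper that rejects anything else before calling condor_q. This is only a sketch: the function name q_direct and the CONDOR_Q override variable are made up for illustration and are not part of Condor.

```shell
# Hypothetical wrapper around "condor_q -direct <mode>".
# Set CONDOR_Q to another command for a dry run; the default is the real tool.
CONDOR_Q="${CONDOR_Q:-condor_q}"

q_direct() {
    mode="$1"
    case "$mode" in
        rdbms|quilld|schedd) ;;   # the three modes listed on the slide
        *)
            echo "usage: q_direct rdbms|quilld|schedd [condor_q args]" >&2
            return 2
            ;;
    esac
    shift
    # Pass any remaining arguments (for example -name) straight through.
    "$CONDOR_Q" -direct "$mode" "$@"
}
```

For example, `q_direct schedd` asks the schedd itself for the fully up-to-date queue, while `q_direct quilld` goes through the Quill daemon.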
A User Perspective: condor_history • condor_history changes • -name takes a Quill Name to retrieve job histories from a remote quill’s database
condor_history -direct • There isn’t any (yet) • condor_history -f \ • `condor_config_val HISTORY` • No -direct quilld equivalent
PostgreSQL Configuration • Add two special user accounts: quillreader and quillwriter • createuser quillreader --no-createdb --no-adduser --pwprompt • createuser quillwriter --createdb --no-adduser --pwprompt
PostgreSQL Configuration (cont) • Allow TCP/IP connections • Edit file postgresql.conf • Add listen_addresses = '*' • Allow connections from specific hosts • Edit file pg_hba.conf • host all quillreader 128.105.0.0 255.255.0.0 password • host all quillwriter 128.105.0.0 255.255.0.0 password • Note: only use ‘password’ authentication at this time.
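A minimal sketch of applying these two edits from a script. The function name allow_quill_hosts and the data directory you pass in are assumptions; the subnet and account names are the ones from the slide. Note that PostgreSQL spells the parameter listen_addresses.

```shell
# Hypothetical helper: append the Quill settings to a PostgreSQL data directory.
# $1 is the directory holding postgresql.conf and pg_hba.conf (site-specific).
allow_quill_hosts() {
    pgdata="$1"
    # Accept TCP/IP connections on all interfaces.
    echo "listen_addresses = '*'" >> "$pgdata/postgresql.conf"
    # Let both Quill accounts connect from the 128.105.0.0/16 network
    # using 'password' authentication.
    for user in quillreader quillwriter; do
        echo "host all $user 128.105.0.0 255.255.0.0 password" >> "$pgdata/pg_hba.conf"
    done
}
```

After calling it with your own data directory, PostgreSQL has to be restarted for the listen_addresses change to take effect.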
Quill Configuration • User quillwriter needs a password. • Store it in • $(SPOOL)/.quillwritepassword (quill) • $(SPOOL)/.pgpass (quill++) • .pgpass has host:port:db:user:pass • Ensure only the condor uid can read it if Condor is running as root
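One way to create the Quill++ password file so that only its owner can read it. This is a sketch: write_pgpass is a made-up helper name, and the host, port, database, and password in the usage line are placeholders, not recommended values.

```shell
# Hypothetical helper: write a one-line .pgpass (host:port:db:user:pass)
# into the given spool directory with owner-only permissions.
write_pgpass() {
    spool="$1"
    entry="$2"
    # Create the file under a restrictive umask so it is never world-readable,
    # then force mode 600 in case the file already existed.
    ( umask 077 && printf '%s\n' "$entry" > "$spool/.pgpass" )
    chmod 600 "$spool/.pgpass"
    # If Condor runs as root, also hand the file to the condor uid, e.g.:
    #   chown condor "$spool/.pgpass"
}
```

Usage might look like `write_pgpass "$(condor_config_val SPOOL)" "merlin.cs.wisc.edu:42999:psilord_db:quillwriter:secret"`.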
Quill Configuration (cont) • Condor system specific attributes in file condor_config.local • QUILL = $(SBIN)/condor_quill • QUILL_LOG = $(LOG)/QuillLog • QUILL_ADDRESS_FILE = $(LOG)/.quill_address • DAEMON_LIST = …, QUILL • VALID_SPOOL_FILES = …, .quillwritepassword • DC_DAEMON_LIST = …, QUILL
Quill Configuration (cont) • Quill specific attributes • QUILL_ENABLED = TRUE • # The quill name must be unique across all • # quill daemons AND schedds • QUILL_NAME = psilord_quilld@merlin.cs • QUILL_DB_NAME = psilord_db • QUILL_DB_IP_ADDR = merlin.cs.wisc.edu:42999 • QUILL_POLLING_PERIOD = 10 (seconds)
Quill Configuration (cont) • QUILL_HISTORY_CLEANING_INTERVAL = 24 (hours) • QUILL_HISTORY_DURATION = 30 (days) • QUILL_MANAGE_VACUUM = FALSE • QUILL_IS_REMOTELY_QUERYABLE = TRUE • QUILL_DB_QUERY_PASSWD = xxx
Schema management • Quill automatically loads schema • Upgrades itself automatically • Quill++ requires manual loading: • psql -U quillwriter < common_createddl.sql • psql -U quillwriter < pgsql_createddl.sql
Conversion to Quill++ • Conversion only matters for history • Conversion is one-way only! • Two steps: • Dump quill history tables to file with • condor_dump_history • Load quill++ history tables from file with • condor_load_history
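The manual Quill++ schema load can be scripted so that it stops at the first failing file. A sketch only: load_quillpp_schema and the PSQL override variable are illustrative names, the database name is whatever your pool uses, and the two DDL files are assumed to be in the current directory.

```shell
# Hypothetical helper: load the two Quill++ DDL files in order as quillwriter.
# Set PSQL to another command for a dry run; the default is the real psql.
PSQL="${PSQL:-psql}"

load_quillpp_schema() {
    db="$1"
    # Common tables first, then the PostgreSQL-specific ones.
    for ddl in common_createddl.sql pgsql_createddl.sql; do
        "$PSQL" -U quillwriter -d "$db" -f "$ddl" || return 1
    done
}
```

Here -f reads the file directly instead of redirecting it on standard input, and -d names the target database explicitly.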
Data Management • Constrain database size • History truncation • Quill++ other tables, too • Postgres Index management • Oracle cleans itself • Careful of long queries, esp with Quill
Data Management: Quill • QUILL_HISTORY_CLEANING_INTERVAL • In hours (24 hours) • QUILL_HISTORY_DURATION • How long in days (7 days) • QUILL_SHOULD_REINDEX • Boolean (false) • QUILL_MANAGE_VACUUM (false)
Data Management: Quill++ • condor_dbmsd does all the work • QUILL_DBSIZE_LIMIT (20 GB) • Emails a warning when 75% is hit • DATABASE_PURGE_INTERVAL (in seconds; 24 hours) • DATABASE_REINDEX_INTERVAL (in seconds; 24 hours) • QUILL_DB_TYPE (oracle, pgsql) • QUILL_RESOURCE_HISTORY_DURATION (7 days) • QUILL_JOB_HISTORY_DURATION (10 years!) • QUILL_RUN_HISTORY_DURATION (7 days)
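The units on this slide are easy to mix up, so here is the arithmetic as a sketch. Both function names are made up, and treating a gigabyte as 2^30 bytes is an assumption about how the size limit is measured.

```shell
# Hypothetical helpers for the Quill++ data management settings.

# The purge and reindex intervals are given in seconds; convert from hours.
hours_to_seconds() {
    echo $(( $1 * 3600 ))
}

# Size in bytes at which the 75% warning would fire for a limit in GB
# (assumes 1 GB = 1024 * 1024 * 1024 bytes).
warn_threshold_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 * 75 / 100 ))
}

hours_to_seconds 24        # prints 86400
warn_threshold_bytes 20    # prints 16106127360
```

So a 24-hour purge or reindex interval is written as 86400, and with the 20 GB limit the warning email corresponds to a database of roughly 15 GB.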
Thank you! • Want more information? • BOF “Databases in Condor”