Revision

  $Id: webjob-run-queued-jobs.base,v 1.13 2010/07/01 20:50:00 klm Exp $

Purpose

  This recipe demonstrates a simple script-based scheme that can be
  used within a WebJob framework to execute an arbitrary number of
  jobs in a job queue.

Motivation

  The motivation for this recipe was the need to quickly and easily
  assign many oneshot jobs to a given set of clients and to have those
  jobs be executed in a serial or parallel fashion.

Requirements

  Cooking with this recipe requires an operational WebJob server. If
  you do not have one of those, refer to the instructions provided in
  the README.INSTALL file that comes with the source distribution. The
  latest source distribution is available here:

    http://sourceforge.net/project/showfiles.php?group_id=40788

  Each client must be running UNIX and have basic system utilities and
  WebJob (1.8.0 or higher) installed.

  The server must be running UNIX and have basic system utilities, JQD
  utilities, MLDBM utilities, PaD utilities, and WebJob installed.

  The commands presented throughout this recipe were designed to be
  executed within a Bourne shell (i.e., sh or bash).

Time to Implement

  Assuming that you have satisfied all the requirements/prerequisites,
  this recipe should take less than 30 minutes to implement.

Solution

  The solution is to deploy and configure Job Queue Directories (JQD)
  on your server, define a group, queue up a job for that group, and
  have the clients in that group download and execute the
  queue-worker.sh script, which, in turn, will cause them to download
  and run their queued job.

  The following steps describe how to implement this solution.

  1. Set WEBJOB_CLIENT and WEBJOB_COMMANDS as appropriate for your
     server. Next, extract the queue-worker.sh script at the bottom of
     this recipe, and install it in the common commands directory.
     Once the file is in place, set its ownership and permissions to
     root:wheel and mode 644, respectively.

       # WEBJOB_CLIENT=common
       # WEBJOB_COMMANDS=/var/webjob/profiles/${WEBJOB_CLIENT}/commands
       # sed -e '1,/^--- queue-worker.sh ---$/d; /^--- queue-worker.sh ---$/,$d' webjob-run-queued-jobs.txt > queue-worker.sh
       # TARGET="queue-worker.sh"
       # cp ${TARGET} ${WEBJOB_COMMANDS}/
       # chmod 644 ${WEBJOB_COMMANDS}/${TARGET}
       # chown root:wheel ${WEBJOB_COMMANDS}/${TARGET}

     If you are using digitally signed jobs, be sure to review and
     sign the script before you install it.

       # WEBJOB_DSV_KEY=/var/webjob/config/dsv/key.pem
       # webjob-dsvtool -s -k ${WEBJOB_DSV_KEY} ${WEBJOB_COMMANDS}/${TARGET}

     When finished, you should have the following files in the
     commands directory:

       queue-worker.sh
       queue-worker.sh.sig (only for digitally signed jobs)

     Make sure that you also have the following:

       testenv
       testenv.sig (only for digitally signed jobs)

  2. Define a new group called 'test'. For example, the following
     command will create that group and insert three members:

       # webjob-jqd-create-group -g test client_1 client_2 client_3

     Note: If the test group already exists, you may want to use
     webjob-jqd-update-group to modify the group as necessary. See the
     webjob-jqd-update-group man page for more information on the
     usage and options.

     The result of that operation can be viewed with:

       # egrep '^test=' /var/webjob/config/jqd/groups
       test=client_1,client_2,client_3

  3. Create and configure JQD directories. Each client will need its
     own private queue directory. If your web server does not run as
     the user 'apache', adjust the '-o' argument as needed.

       # webjob-jqd-create-queue -o apache `webjob-jqd-list-members -i %test`
       Creating /var/webjob/spool/jqd/client_1
       Creating /var/webjob/spool/jqd/client_1/hold
       Creating /var/webjob/spool/jqd/client_1/todo
       Creating /var/webjob/spool/jqd/client_1/sent
       Creating /var/webjob/spool/jqd/client_1/open
       Creating /var/webjob/spool/jqd/client_1/done
       Creating /var/webjob/spool/jqd/client_1/pass
       Creating /var/webjob/spool/jqd/client_1/fail
       Creating /var/webjob/spool/jqd/client_1/foul
       Creating /var/webjob/spool/jqd/client_1/change.lock
       Creating /var/webjob/spool/jqd/client_2
       Creating /var/webjob/spool/jqd/client_2/hold
       Creating /var/webjob/spool/jqd/client_2/todo
       Creating /var/webjob/spool/jqd/client_2/sent
       Creating /var/webjob/spool/jqd/client_2/open
       Creating /var/webjob/spool/jqd/client_2/done
       Creating /var/webjob/spool/jqd/client_2/pass
       Creating /var/webjob/spool/jqd/client_2/fail
       Creating /var/webjob/spool/jqd/client_2/foul
       Creating /var/webjob/spool/jqd/client_2/change.lock
       Creating /var/webjob/spool/jqd/client_3
       Creating /var/webjob/spool/jqd/client_3/hold
       Creating /var/webjob/spool/jqd/client_3/todo
       Creating /var/webjob/spool/jqd/client_3/sent
       Creating /var/webjob/spool/jqd/client_3/open
       Creating /var/webjob/spool/jqd/client_3/done
       Creating /var/webjob/spool/jqd/client_3/pass
       Creating /var/webjob/spool/jqd/client_3/fail
       Creating /var/webjob/spool/jqd/client_3/foul
       Creating /var/webjob/spool/jqd/client_3/change.lock

  4. Create a job file.

       # cd ${WEBJOB_COMMANDS}
       # cat > testenv.job < [arguments]

     As stated above, the value for CommandAlias differs from Command
     only in the case where server-side GET hooks are enabled, which
     is beyond the scope of this recipe.

     The job file may optionally include the following key/value
     pairs:

       Comment

         A comment that describes the job or provides some other
         relevant information. Note that the value for this key must
         fit on a single line.

  5. Queue up the job to run on each client in the test group.
     If your web server does not run as the user 'apache', adjust the
     '-o' argument as needed.

       # webjob-jqd-create-job -o apache -f testenv.job -i %test -t serial
       pass|/var/webjob/spool/jqd/client_1/todo/s000_50_1236918342_147252_1415efde
       pass|/var/webjob/spool/jqd/client_2/todo/s000_50_1236918342_147252_1415efde
       pass|/var/webjob/spool/jqd/client_3/todo/s000_50_1236918342_147252_1415efde

     The resulting files should look similar to the following:

       # cat /var/webjob/spool/jqd/client_1/todo/s000_50_1236918342_147252_1415efde
       Command=testenv
       CommandAlias=testenv
       CommandLine=testenv
       Created=2009-03-12 23:25:42
       Creator=root

     Note: If you are using a POUND-aware revision of
     webjob-jqd-create-job (1.3 or higher), the resulting files should
     look like this:

       # cat /var/webjob/spool/jqd/client_1/todo/s000_50_1236918342_147252_1415efde
       Command=testenv
       CommandAlias=testenv
       CommandLine=testenv
       CommandMd5=784645c0bb0f321a65d02fa9ab1e9a42
       CommandSha1=88212c0cf8a21808d58bf99d07ad62ac39dceea0
       CommandSize=202
       Created=2009-03-12 23:25:42
       Creator=root

     Be sure to read the Closing Remarks section.

     You can use webjob-jqd-list-jobs to report the status of jobs.
     The command below will list all serial jobs for the test group
     including their current job state (i.e., todo, sent, etc.).

       # webjob-jqd-list-jobs -i %test -s all s000_50_1236918342_147252_1415efde
       /var/webjob/spool/jqd/client_1/todo/s000_50_1236918342_147252_1415efde
       /var/webjob/spool/jqd/client_2/sent/s000_50_1236918342_147252_1415efde
       /var/webjob/spool/jqd/client_3/done/s000_50_1236918342_147252_1415efde

     The last argument in the command is actually interpreted as a
     Perl regular expression.
     If you wanted to list all serial jobs with a priority of 50, you
     could run this:

       # webjob-jqd-list-jobs -i %test -s all s000_50

     Similarly, if you wanted to list all serial jobs in the 'done'
     state, you could run this:

       # webjob-jqd-list-jobs -i %test -s done s

     You may use any valid Perl regular expression (e.g., s.*), but
     make sure you use quotes, as necessary, to prevent unwanted shell
     expansions.

  6. Test out the queues by running the following job from one of your
     clients:

       # webjob -e -f upload.cfg queue-worker.sh -c client-id -t serial

     The queue-worker.sh script has the following usage:

       queue-worker.sh [-c client-id] [-H webjob-home] [-j job-count] [-q queue] -t {p|parallel|s|serial}

     where

       -c client-id

         Specifies the client ID to use when requesting jobs. This
         utility attempts to obtain the default client ID from the
         environment by way of the WEBJOB_CLIENTID variable.

       -H webjob-home

         Specifies the path to WEBJOB_HOME. The script's PATH is
         subsequently updated to include ${WEBJOB_HOME}/bin. The
         default value is /usr/local/webjob.

       -j job-count

         Specifies the number of jobs to request. The default value is
         zero, which means return all jobs in the queue. The number of
         jobs actually returned will depend on how many jobs were in
         the specified queue and on the server-side controls that
         govern how many jobs are let out and when.

       -q queue

         Specifies the name of the queue, if different from the client
         ID, from which jobs will be requested. This option is
         typically used to request jobs from a public queue rather
         than the client's own private queue.

       -t {p|parallel|s|serial}

         Specifies the job/queue type, which can either be parallel or
         serial. Serial jobs are executed sequentially in the
         foreground, and parallel jobs are executed concurrently in
         the background.

Closing Remarks

  This recipe is a work in progress, and the JQD system is in flux.
  If you are working from the 1.8.0 release, you may need to pull the
  current revisions of the following scripts from the CVS repository:

    nph-webjob.cgi
    webjob-jqd-change-state
    webjob-jqd-create-job

  These scripts are POUND-aware. Going forward, the target command for
  a queued job will be stored in a POUND database. Using this type of
  database does two important things: 1) it eliminates duplication by
  ensuring that only one copy of a given command file resides in the
  database and 2) it helps prevent inadvertent or accidental command
  file modification between the time that a job is queued and when it
  is actually executed. This feature is currently disabled by default
  in the CGI script. However, the command files for any queued jobs
  will automatically be loaded into the POUND database.

  The WebJob server has several controls to manage queues, which are
  explained below.

    JobQueueActive

      When active, JobQueueActive causes the script to pull jobs from
      the specified job queue.

    JobQueuePqActiveLimit

      JobQueuePqActiveLimit specifies the maximum number of parallel
      jobs that may be in an active state (i.e., either 'sent' or
      'open'). If JobQueueActive is disabled, this control is ignored.
      A value of zero means there is no limit.

    JobQueuePqAnswerLimit

      JobQueuePqAnswerLimit specifies the maximum number of parallel
      jobs that will be returned to the client. If JobQueueActive is
      disabled, this control is ignored. A value of zero means there
      is no limit.

    JobQueueSqActiveLimit

      JobQueueSqActiveLimit specifies the maximum number of serial
      jobs that may be in an active state (i.e., either 'sent' or
      'open'). If JobQueueActive is disabled, this control is ignored.
      A value of zero means there is no limit.

    JobQueueSqAnswerLimit

      JobQueueSqAnswerLimit specifies the maximum number of serial
      jobs that will be returned to the client. If JobQueueActive is
      disabled, this control is ignored. A value of zero means there
      is no limit.

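  The "zero means there is no limit" convention used by the answer
  limit controls can be pictured with a small shell sketch. This is an
  illustration only, not the logic used by nph-webjob.cgi: the helper
  name apply_answer_limit and the job tags fed to it are made up for
  the example.

```shell
#!/bin/sh
# Illustration only: mimic an "answer limit" applied to a list of queued
# job tags, one per line on stdin. apply_answer_limit is a hypothetical
# helper and is not part of WebJob.
apply_answer_limit()
{
  limit="$1"
  if [ "${limit}" -eq 0 ] ; then
    cat                 # Zero means no limit: pass every job through.
  else
    head -n "${limit}"  # Otherwise, return at most ${limit} jobs.
  fi
}

# Three made-up job tags, limited to two answers.
printf '%s\n' s000_50_a s000_50_b s000_50_c | apply_answer_limit 2
```

  With a limit of 2, only the first two tags are passed through; with
  a limit of 0, all three would be.
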
  The controls listed above are global in scope, but they can also be
  overridden on a per queue basis. The global config file is located
  here:

    /var/webjob/config/nph-webjob/nph-webjob.cfg

  The per queue config file for client_1 would be located here:

    /var/webjob/config/nph-webjob/queues/client_1/nph-webjob.cfg

  For example, if you wanted to limit the number (say no more than 5)
  of serial jobs that could be returned to client_1, you could do the
  following:

    # mkdir -p /var/webjob/config/nph-webjob/queues/client_1
    # cat > /var/webjob/config/nph-webjob/queues/client_1/nph-webjob.cfg <&2
  echo "Usage: ${PROGRAM} [-c client-id] [-H webjob-home] [-j job-count] [-q queue] -t {p|parallel|s|serial}" 1>&2
  echo 1>&2
  exit 1
}

######################################################################
#
# Main
#
######################################################################

JOB_COUNT=0 # This means return all jobs in the queue.
JOB_TYPE=
QUEUE_NAME=
WEBJOB_CLIENTID=${WEBJOB_CLIENTID}

while getopts "c:H:j:q:t:" OPTION ; do
  case "${OPTION}" in
  c)
    WEBJOB_CLIENTID="${OPTARG}"
    ;;
  H)
    WEBJOB_HOME="${OPTARG}"
    ;;
  j)
    JOB_COUNT="${OPTARG}"
    ;;
  q)
    QUEUE_NAME="${OPTARG}"
    ;;
  t)
    JOB_TYPE="${OPTARG}"
    ;;
  *)
    Usage
    ;;
  esac
done

if [ ${OPTIND} -le $# ] ; then
  Usage
fi

if [ -z "${JOB_TYPE}" ] ; then
  Usage
fi

PATH=${WEBJOB_HOME=/usr/local/webjob}/bin:${PATH}
export PATH

######################################################################
#
# Get and/or check the required configuration controls.
#
######################################################################

# Anchor the pattern so that the whole value must be numeric.
MY_JOB_COUNT_REGEXP='^[0-9]+$'

echo "${JOB_COUNT}" | egrep "${MY_JOB_COUNT_REGEXP}" > /dev/null 2>&1
if [ $? -ne 0 ] ; then # The value is not valid.
  echo "${PROGRAM}: Error='JOB_COUNT does not pass muster.'" 1>&2
  exit 2;
fi

if [ -z "${WEBJOB_CLIENTID}" ] ; then
  echo "${PROGRAM}: Error='Client ID is not defined. This value can be set in the environment or on the command line using the WEBJOB_CLIENTID variable or the \"-c\" option, respectively.'" 1>&2
  exit 2;
fi

WEBJOB_GETURL=`egrep -i UrlGetUrl ${WEBJOB_HOME}/etc/upload.cfg | awk -F= '{print $2}' | sed 's/\#.*$//; s/^ *//; s/ *$//;'`
if [ -z "${WEBJOB_GETURL}" ] ; then
  # Fall back to an imported config file (upload.cfg.a or upload.cfg.b).
  REAL_CONFIG=`egrep '^Import=.*/upload[.]cfg[.][ab]$' ${WEBJOB_HOME}/etc/upload.cfg | awk -F= '{print $2}'`
  if [ -n "${REAL_CONFIG}" ] ; then
    WEBJOB_GETURL=`egrep -i UrlGetUrl ${REAL_CONFIG} | awk -F= '{print $2}' | sed 's/\#.*$//; s/^ *//; s/ *$//;'`
    if [ -z "${WEBJOB_GETURL}" ] ; then
      echo "${PROGRAM}: Error='WEBJOB_GETURL is not defined.'" 1>&2
      exit 2;
    fi
  else
    echo "${PROGRAM}: Error='WEBJOB_GETURL is not defined.'" 1>&2
    exit 2;
  fi
fi

if [ -z "${QUEUE_NAME}" ] ; then
  QUEUE_NAME=${WEBJOB_CLIENTID}
fi

case `echo ${JOB_TYPE} | tr 'A-Z' 'a-z'` in
p|parallel)
  BG=1
  JOB_TYPE=parallel
  ;;
s|serial)
  BG=0
  JOB_TYPE=serial
  ;;
*)
  Usage
  ;;
esac

######################################################################
#
# Check the webjob version, and abort if it's too low.
#
######################################################################

WEBJOB_VERSION=`webjob -v | awk '{print $2}'`
# Accept any major version above 1, or 1.x where x is 8 or higher.
WEBJOB_VERSION_IS_GOOD=`echo "${WEBJOB_VERSION}" | awk -F. '{ if ($1 > 1 || ($1 == 1 && $2 >= 8)) { print "true"; } else { print "false"; } }'`
if [ X"${WEBJOB_VERSION_IS_GOOD}" != X"true" ] ; then
  echo "${PROGRAM}: Error='WebJob version (${WEBJOB_VERSION}) is too low. Version 1.8.0 or higher is required.'" 1>&2
  exit 2
fi

######################################################################
#
# Request some jobs and execute them in a parallel or serial loop.
#
######################################################################

MY_URL="${WEBJOB_GETURL}?ClientId=${WEBJOB_CLIENTID}&QueueName=${QUEUE_NAME}&JobType=${JOB_TYPE}&JobCount=${JOB_COUNT}"

QUEUETAG_REGEX="[ps][0-9][0-9][0-9]*_[0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9]_[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]"

webjob -g -f ${WEBJOB_HOME}/etc/upload.cfg ${MY_URL} | while read MY_QUEUETAG_KVP; do
  WEBJOB_QUEUETAG=`echo "${MY_QUEUETAG_KVP}" | awk -F= '{print $1}'` ; export WEBJOB_QUEUETAG
  WEBJOB_QUEUECMD=`echo "${MY_QUEUETAG_KVP}" | sed "s/^${QUEUETAG_REGEX}=//;"`
  if [ ${BG} -eq 1 ] ; then
    echo "${WEBJOB_QUEUECMD}" | xargs webjob -e -f ${WEBJOB_HOME}/etc/upload.cfg &
  else
    echo "${WEBJOB_QUEUECMD}" | xargs webjob -e -f ${WEBJOB_HOME}/etc/upload.cfg
  fi
done
--- queue-worker.sh ---
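
  The loop at the end of queue-worker.sh treats each line it reads as
  a queue tag, an equals sign, and a command line, and splits the two
  with awk and sed. As a rough sketch, and assuming that same
  tag=command layout, the split can also be written with POSIX
  parameter expansion. The function names queuetag_of and queuecmd_of
  are hypothetical and do not appear in the script; the sample line
  combines the queue tag and testenv command used earlier in this
  recipe.

```shell
#!/bin/sh
# Sketch only: split a "tag=command line" pair without awk or sed.
# queuetag_of and queuecmd_of are hypothetical helpers.

# Print everything before the first '='.
queuetag_of()
{
  printf '%s\n' "${1%%=*}"
}

# Print everything after the first '=' (later '=' characters are kept).
queuecmd_of()
{
  printf '%s\n' "${1#*=}"
}

SAMPLE_LINE="s000_50_1236918342_147252_1415efde=testenv"
echo "tag: `queuetag_of "${SAMPLE_LINE}"`"
echo "cmd: `queuecmd_of "${SAMPLE_LINE}"`"
```

  Unlike the sed expression in the script, this sketch does not check
  that the tag matches QUEUETAG_REGEX, so it is better suited to
  experimenting at a prompt than to replacing the validation that the
  script performs.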