This recipe demonstrates how to keep a website up-to-date using WebJob. Payload and Delivery (PaD) technology will be used to turn a gzipped tar file of a website into a self-extracting executable. WebJob will then be configured to download, unpack, and execute this PaD file.

The motivating factors behind this recipe are: 1) it's desirable to have a way to keep websites up-to-date, 2) having the ability to automatically detect and repair damaged or corrupted content makes websites more resilient, and 3) minimizing user activity on production systems helps to establish well-behaved profiles -- an important factor in identifying abnormal activity.

Cooking with this recipe requires that you already have an operational WebJob server. If this is not the case, refer to the instructions provided in the INSTALL file that comes with the source distribution.

While this recipe was written for UNIX environments, the concepts should be portable. You need to have at least the following programs available on your production system: basename, diff, gzip, mv, rm, sh, tar, and webjob (1.4.0 or higher).

This recipe assumes that for a given client, say client_0001, you wish to periodically inspect a website and repair or update it as necessary. Here, website refers to the set of files contained within a single directory tree. It is also assumed that you maintain a master copy of the website on a separate system -- i.e., not the production Web server. It is from this system (the master server) that PaD files will be created.

1. Log into the master server and create a gzipped tar file of the website as it will exist on the production server. To keep things simple, let's assume that the website tree is called htdocs and is located in /usr/local/www on both the master and production servers. Here, we'll construct a sample tree that contains three files:

    mkdir -m 755 -p /usr/local/www/htdocs
    cd /usr/local/www
    for I in "1" "2" "3" ; do F=file${I} ; echo "This is ${F}." > htdocs/${F} ; done

To create htdocs.tgz, do the following:

    tar -C /usr/local/www -zcf htdocs.tgz htdocs

2. Next, create htdocs.tgz.pad and copy it to client_0001's commands directory on the WebJob server.

    pad-make-script --create htdocs.tgz > htdocs.tgz.pad
    chmod 644 htdocs.tgz.pad
    scp -p htdocs.tgz.pad you@your.webjob.server.net:/integrity/profiles/client_0001/commands/

The server should now have an integrity tree similar to the one shown here:

    integrity
      |
      - incoming
      |
      - logfiles
      |
      - profiles
          |
          - client_0001
              |
              - baseline
              |
              - commands
                  |
                  - htdocs.tgz.pad
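If you expect to publish new content regularly, steps 1 and 2 can be rolled into a small helper script on the master server. The sketch below is only an illustration: it assumes the same paths and destination used above, and the script name (publish-htdocs.sh) is a hypothetical, not part of WebJob or PaD.

    #!/bin/sh
    # publish-htdocs.sh -- illustrative sketch that rebuilds and publishes the
    # PaD file. Assumes the master copy of the website lives in
    # /usr/local/www/htdocs and that you can scp to the WebJob server as in step 2.
    WWW_DIR=/usr/local/www
    PAD_DIR=/integrity/profiles/client_0001/commands
    WEBJOB_SERVER=you@your.webjob.server.net

    tar -C ${WWW_DIR} -zcf htdocs.tgz htdocs || exit 1
    pad-make-script --create htdocs.tgz > htdocs.tgz.pad || exit 1
    chmod 644 htdocs.tgz.pad
    scp -p htdocs.tgz.pad ${WEBJOB_SERVER}:${PAD_DIR}/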
3. At this point, the WebJob server has been configured. Now, we'll focus on the production server. Log into the production server, and create upload.cfg as shown below. Be sure to set URLGetURL, URLPutURL, URLUsername, and URLPassword as appropriate. Install this config file in /usr/local/integrity/etc. If necessary, create /usr/local/integrity/run with:

    mkdir -m 755 -p /usr/local/integrity/run

--- upload.cfg ---
ClientId=client_0001
URLGetURL=https://your.webjob.server.net/cgi-webjob/nph-webjob.cgi
URLPutURL=https://your.webjob.server.net/cgi-webjob/nph-webjob.cgi
URLUsername=client_0001
URLPassword=password
URLAuthType=basic
RunType=snapshot
OverwriteExecutable=Y
UnlinkOutput=Y
UnlinkExecutable=Y
GetTimeLimit=0
RunTimeLimit=0
PutTimeLimit=0
URLDownloadLimit=10000000
TempDirectory=/usr/local/integrity/run
--- upload.cfg ---

    sed -e '1,/^--- upload.cfg ---$/d; /^--- upload.cfg ---$/,$d' webjob-pad-update-website.txt > upload.cfg
    install -m 600 -o root -g wheel upload.cfg /usr/local/integrity/etc

4. Next, create update-website.sh as shown below, and install it in /usr/local/integrity/bin. This script is actually one big command -- look at the last line. The WebJob portion of this command downloads and executes htdocs.tgz.pad using upload.cfg as its config file. The PaD portion of this command unpacks the gzipped tar file, compares the resulting tree to the deployed tree, and, if they differ, updates the deployed tree by replacing it. Finally, the run status and any differences are posted back to the WebJob server. Before running this script, be sure to set PAD_FILE, JOB_HOME, and WEB_SITE as appropriate. The exit-status logic that drives the update decision is illustrated in the sketch following this step.

--- update-website.sh ---
#!/bin/sh

PAD_FILE=htdocs.tgz.pad

JOB_HOME=/usr/local/integrity
WEB_SITE=/usr/local/www/htdocs

PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin:${JOB_HOME}/bin

DONAME="PAD_BASE=\`basename %payload .tgz\`"

UNPACK="( echo \"Building workdir...\" 1>&2 && tar -C ${JOB_HOME}/run -zxmf %payload )"

DODIFF="( echo \"Checking website...\" 1>&2 && diff -urP ${WEB_SITE} ${JOB_HOME}/run/\${PAD_BASE} )"

UPDATE="( echo \"Updating website...\" 1>&2 && rm -rf ${WEB_SITE}.old && mv ${WEB_SITE} ${WEB_SITE}.old && mv ${JOB_HOME}/run/\${PAD_BASE} ${WEB_SITE} )"

FINISH="( echo \"Cleaning workdir...\" 1>&2 && rm -rf ${JOB_HOME}/run/\${PAD_BASE} )"

${JOB_HOME}/bin/webjob -e -f ${JOB_HOME}/etc/upload.cfg ${PAD_FILE} ${DONAME} \&\& \( ${UNPACK} \&\& ${DODIFF} \|\| ${UPDATE} \; ${FINISH} \)
--- update-website.sh ---

    sed -e '1,/^--- update-website.sh ---$/d; /^--- update-website.sh ---$/,$d' webjob-pad-update-website.txt > update-website.sh
    install -m 755 -o root -g wheel update-website.sh /usr/local/integrity/bin
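The update decision in the last line of update-website.sh hinges on standard shell exit-status grouping: "A && B || C ; D" reads as "((A && B) || C); D", so the UPDATE branch runs whenever DODIFF exits nonzero (diff exits nonzero when the trees differ) or UNPACK itself fails, and FINISH runs in every case. The stand-alone sketch below demonstrates that grouping with dummy functions; the function names are placeholders, not part of WebJob or PaD.

    #!/bin/sh
    # Illustration of the "UNPACK && DODIFF || UPDATE ; FINISH" grouping.
    # The && and || operators have equal precedence and associate left to
    # right, so the line below behaves as ((unpack && dodiff) || update); finish.
    unpack() { echo "unpack ok"; }
    dodiff() { echo "trees differ" 1>&2; return 1; }  # mimics diff finding differences
    update() { echo "update runs"; }
    finish() { echo "finish always runs"; }

    unpack && dodiff || update ; finish
    # Prints: unpack ok / trees differ / update runs / finish always runs.
    # If dodiff returned 0 (no differences), update would be skipped.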
5. At this point, you are ready to test everything out. Assuming that the deployed tree doesn't exist, go ahead and create it with:

    mkdir -m 755 -p /usr/local/www/htdocs

This ensures that there is something to diff against the first time you run the update script. Now run the update script as follows:

    /usr/local/integrity/bin/update-website.sh

If it completes successfully, you should find four files on the WebJob server in its incoming directory. The files listed here will give you an idea of what to look for:

    client_0001_20021108165853_update-website.sh.env
    client_0001_20021108165853_update-website.sh.err
    client_0001_20021108165853_update-website.sh.out
    client_0001_20021108165853_update-website.sh.rdy

Inspect these files. Their content should be similar to that shown below. The .rdy file is simply a lock release mechanism, so its content has not been listed.

--- client_0001_20021108165853_update-website.sh.env ---
CommandLine=htdocs.tgz.pad PAD_BASE=`basename %payload .tgz` && ( ( echo "Building workdir..." 1>&2 && tar -C /usr/local/integrity/run -zxmf %payload ) && ( echo "Checking website..." 1>&2 && diff -urP /usr/local/www/htdocs /usr/local/integrity/run/${PAD_BASE} ) || ( echo "Updating website..." 1>&2 && rm -rf /usr/local/www/htdocs.old && mv /usr/local/www/htdocs /usr/local/www/htdocs.old && mv /usr/local/integrity/run/${PAD_BASE} /usr/local/www/htdocs ) ; ( echo "Cleaning workdir..." 1>&2 && rm -rf /usr/local/integrity/run/${PAD_BASE} ) )
JobPid=15514
KidPid=6281
KidStatus=0
KidSignal=0
KidReason=The kid exited cleanly.
Hostname=your.production.server.net
SystemOS=i686 Linux 2.4.18-grsec-1.9.4
JobEpoch=2002-11-08 13:57:23 PST (1036792643.192068)
GetEpoch=2002-11-08 13:57:23 PST (1036792643.343611)
RunEpoch=2002-11-08 13:57:23 PST (1036792643.987115)
PutEpoch=2002-11-08 13:57:24 PST (1036792644.167154)
--- client_0001_20021108165853_update-website.sh.env ---

--- client_0001_20021108165853_update-website.sh.err ---
Extracting payload...
Delivering payload...
PAD_BASE=`basename htdocs.tgz .tgz` && ( ( echo "Building workdir..." 1>&2 && tar -C /usr/local/integrity/run -zxmf htdocs.tgz ) && ( echo "Checking website..." 1>&2 && diff -urP /usr/local/www/htdocs /usr/local/integrity/run/${PAD_BASE} ) || ( echo "Updating website..." 1>&2 && rm -rf /usr/local/www/htdocs.old && mv /usr/local/www/htdocs /usr/local/www/htdocs.old && mv /usr/local/integrity/run/${PAD_BASE} /usr/local/www/htdocs ) ; ( echo "Cleaning workdir..." 1>&2 && rm -rf /usr/local/integrity/run/${PAD_BASE} ) )
Building workdir...
Checking website...
Updating website...
Cleaning workdir...
DeliveryStatus='0'
--- client_0001_20021108165853_update-website.sh.err ---

--- client_0001_20021108165853_update-website.sh.out ---
diff -urP /usr/local/www/htdocs/file1 /usr/local/integrity/run/htdocs/file1
--- /usr/local/www/htdocs/file1 Wed Dec 31 16:00:00 1969
+++ /usr/local/integrity/run/htdocs/file1 Fri Nov 8 13:57:24 2002
@@ -0,0 +1 @@
+This is file1.
diff -urP /usr/local/www/htdocs/file2 /usr/local/integrity/run/htdocs/file2
--- /usr/local/www/htdocs/file2 Wed Dec 31 16:00:00 1969
+++ /usr/local/integrity/run/htdocs/file2 Fri Nov 8 13:57:24 2002
@@ -0,0 +1 @@
+This is file2.
diff -urP /usr/local/www/htdocs/file3 /usr/local/integrity/run/htdocs/file3
--- /usr/local/www/htdocs/file3 Wed Dec 31 16:00:00 1969
+++ /usr/local/integrity/run/htdocs/file3 Fri Nov 8 13:57:24 2002
@@ -0,0 +1 @@
+This is file3.
--- client_0001_20021108165853_update-website.sh.out ---

6. Once you are satisfied that all is well, add the following cron job to the production server's crontab -- it runs every day at midnight.

--- crontab.daily ---
0 0 * * * /usr/local/integrity/bin/update-website.sh > /dev/null 2>&1
--- crontab.daily ---

From this point forward, the website will automatically repair itself if the deployed content differs from the PaD content for any reason. When you want to update the website, simply put a new PaD file on the WebJob server.

As an added benefit, the diff output preserved in the WebJob upload can be used as a negative or reverse patch. Applying this patch to the tree extracted from the PaD file allows you to reconstruct the original website as deployed. Thus, there is an inherent quarantining property built into this scheme. Unfortunately, binary files must be quarantined separately, since diff only reports that binary files differ and does not capture their content.
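To illustrate the reverse-patch idea, the sketch below reconstructs the previously deployed tree from a preserved .out file and the matching htdocs.tgz. This is only a sketch: it assumes GNU patch, uses the example file names from step 5, treats /path/to/htdocs.tgz as a placeholder, and the -p strip count (here -p4, which reduces /usr/local/www/htdocs/... to htdocs/...) may need adjusting to match the paths recorded in your diff output.

    # Reverse-patch sketch (assumes GNU patch and the example paths above).
    mkdir /tmp/restore && cd /tmp/restore
    tar -zxf /path/to/htdocs.tgz       # yields ./htdocs as shipped in the PaD file
    patch -R -p4 < client_0001_20021108165853_update-website.sh.out
    # ./htdocs should now match the website as it was deployed when the job ran.

In the step 5 example the deployed tree was empty, so reverse-applying the patch simply removes file1 through file3, giving back the original empty tree.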