Problem This recipe demonstrates how to deploy and manage argus -- a real time flow monitor designed to perform comprehensive IP network traffic auditing. Requirements Cooking with this recipe requires an operational WebJob server. If you do not have one of those, refer to the instructions provided in the README.INSTALL file that comes with the source distribution. The latest source distribution is available here: http://sourceforge.net/project/showfiles.php?group_id=40788 Each client system must be running UNIX, have argus installed, and have at least the following utilities: awk, basename, cat, crontab, date, echo, eval, find, grep, hostname, ifconfig, mkdir, mv, ps, rm, test, webjob, and wc. Likewise, the server must have at least the following utilities: awk, basename, cat, egrep, find, head, mail, mv, rm, sh, test, and wc. The commands presented throughout this recipe were designed to be executed within a Bourne shell (i.e. sh or bash). Solution The solution is to use WebJob to manage argus. Clients execute harvest_argus to transfer argus data to a WebJob server, and the server, in turn, processes that data with process_argus. The following steps describe how to implement this solution. 1. Set WEBJOB_HOME as appropriate for your WebJob server. This variable should be set to the prefix directory that was used when WebJob was installed (e.g., /usr/local/integrity). $ export WEBJOB_HOME=/usr/local/integrity 2. Set WEBJOB_CLIENT and WEBJOB_COMMANDS as appropriate for your WebJob server: $ export WEBJOB_CLIENT=common $ export WEBJOB_COMMANDS=/integrity/profiles/${WEBJOB_CLIENT}/commands 3. Extract harvest_argus from this recipe, and move it to the appropriate commands directory. If you want the script to be shared by all clients, run the following commands as given below. Otherwise, specify the desired client ID when setting WEBJOB_CLIENT. Once harvest_argus is in place, set its ownership and permissions to 0:0 and mode 644 respectively. $ sed -e '1,/^--- harvest_argus ---$/d; /^--- harvest_argus ---$/,$d' webjob-monitor-argus.txt > harvest_argus $ mv -f harvest_argus ${WEBJOB_COMMANDS} $ chown 0:0 ${WEBJOB_COMMANDS}/harvest_argus $ chmod 644 ${WEBJOB_COMMANDS}/harvest_argus Modify the following control variables in harvest_argus as appropriate: arc_day_ The number of days of argus data to archive on , with zero (0) as indefinite. For example, the following will remove files older than 30 days on client1: arc_day_client1="30" arc_dir_ The directory where archived argus data will be place on the (e.g., "/usr/local/bin/argus/archive"). arc_zip_ The full path and arguments of the compression program used to compress archive data. If blank, no compression will be used. Typical examples include: "/bin/gzip -9", "/usr/local/bin/bzip2 -9", or "compress". bin_ The full path to and filename of the argus binary (e.g., "/usr/local/sbin/argus"). int_ Interface name that argus will attach to (e.g., hme0, hme1, dmfe0, dmfe1, etc.). out_ = output file The full path and filename of the Argus output file (e.g., "/tmp/argus.out"). 4. Extract process_argus from this recipe, and move it to the server's bin directory. Set this file's ownership and permissions to 0:0 and mode 755 respectively. $ sed -e '1,/^--- process_argus ---$/d; /^--- process_argus ---$/,$d' webjob-monitor-argus.txt > process_argus $ mv -f process_argus ${WEBJOB_HOME}/bin $ chown 0:0 ${WEBJOB_HOME}/bin/process_argus $ chmod 755 ${WEBJOB_HOME}/bin/process_argus Next, extract process_argus.cfg from this recipe, and move it to the server's etc directory. Set this file's ownership and permissions to 0:0 and mode 644 respectively. $ sed -e '1,/^--- process_argus.cfg ---$/d; /^--- process_argus.cfg ---$/,$d' webjob-monitor-argus.txt > process_argus.cfg $ mv -f process_argus.cfg ${WEBJOB_HOME}/etc $ chmod 644 ${WEBJOB_HOME}/etc/process_argus.cfg $ chown 0:0 ${WEBJOB_HOME}/etc/process_argus.cfg The config file, process_argus.cfg, contains various site-specific variables (see below). Set these variables as appropriate. arch_day The number of days to maintain archive data before deletion, where zero (0) maintains data indefinitely. arch_dir The directory where argus data will be archived (e.g., "/backup/argus") arch_zip The full path and arguments of the compression program used to compress archive data. If blank, no compression will be used. Typical examples include: "/bin/gzip -9", "/usr/local/bin/bzip2 -9", or "compress". clnt_lst A space delimited list of WebJob clients that will be processed by process_argus (e.g., "client1 client2 client3"). drop_dir The directory where WebJob places argus data (i.e. the dropzone). Typically, this is "/integrity/incoming". log_dir The full path where process_argus will log activities (e.g., "/usr/local/integrity/log"). oper_dir The directory where argus data files will be consolidated and analyzed. oper_zip The full path and arguments of the compression program used to compress operational data. If blank, no compression will be used. Typical examples include: "/bin/gzip -9", "/usr/local/bin/bzip2 -9", or "compress". 5. Create a crontab entry on each client that periodically executes harvest_argus. The following crontab entry shown below will execute harvest_argus hourly. Replace WEBJOB_HOME as appropriate for your system prior to committing the cron job. 0 * * * * WEBJOB_HOME/bin/webjob -e -f WEBJOB_HOME/etc/webjob.cfg harvest_argus 6. Create a crontab entry on the server that periodically executes process_argus. To ensure that each job is processed in a timely fashion and that no significant back-logging occurs, use a processing frequency that is slightly offset from and at least twice as fast as harvest_argus. Following the example above, we will execute process_argus every 30 minutes. It is configured to write its output to stdout and delete the incoming files after they have been processed. 10,40 * * * * WEBJOB_HOME/bin/process_argus Closing Remarks Rather than running process_argus from cron, it would be better to create a daemon that proactively manages the incoming directory (i.e. the dropzone). This daemon would be responsible for processing incoming files and invoking the proper follow-on analysis stages such as process_argus. Credit This recipe is brought to you by Andy Bair, July 2003. Appendix 1 --- harvest_argus --- #!/bin/sh program=`basename $0` ###################################################################### # # Variables # # arc_day = number of days to archive arc_day_client1="0" arc_day_client2="0" # arc_dir = location of archived data arc_dir_client1="/opt/argus" arc_dir_client2="/opt/argus" # arc_zip = location of archived data arc_zip_client1="/bin/gzip -9" arc_zip_client2="/usr/bin/bzip2 -9" # bin = full path to argus binary bin_client1="/usr/local/sbin/argus" bin_client2="/usr/local/sbin/argus" # int {hme0|hme1|dmfe0|dmfe1} = interface to run argus int_client1="eth0" int_client2="hme1" # out = output file out_client1="/tmp/argus.out" out_client2="/tmp/argus.out" ###################################################################### # # Functions # # archive arc_dir arc_zip date hostname tmp archive () { arc_dir="$1" arc_zip="$2" date="$3" hostname="$4" tmp="$5" echo "${program}: Archiving ..." 1>&2 if [ ! -e "${arc_dir}" ]; then echo "${program}: Warning='Archive directory does not exist, creating ${arc_dir}.'" 1>&2 mkdir -p "${arc_dir}" fi if [ ! -e "${tmp}" ]; then echo "${program}: Error='Archive file does not exist: ${tmp}.'" 1>&2 return 1 else filename="${arc_dir}/${hostname}_${date}.argus" echo "${program}: Info='Moving data to archive: ${tmp} to ${filename}.'" 1>&2 mv "${tmp}" "${filename}" if [ -n "${arc_zip}" ]; then echo "${program}: Info='Compressing: ${arc_zip} ${filename}.'" 1>&2 eval "${arc_zip}" "${filename}" else echo "${program}: Info='Not compressing: ${filename}.'" 1>&2 fi fi } # harvest arc_day arc_dir arc_zip date hostname out tmp harvest () { arc_day="$1" arc_dir="$2" arc_zip="$3" date="$4" hostname="$5" out="$6" tmp="$7" echo "${program}: Harvesting ..." 1>&2 move "${out}" "${tmp}" transfer "${tmp}" archive "${arc_dir}" "${arc_zip}" "${date}" "${hostname}" "${tmp}" purge "${arc_day}" "${arc_dir}" "${hostname}" } # move out tmp move () { out="$1" tmp="$2" echo "${program}: Moving ..." 1>&2 if [ ! -e "${out}" ]; then echo "${program}: Error='Output file does not exist: ${out}.'" 1>&2 return 1 else if [ -e "${tmp}" ]; then echo "${program}: Warning='Temporary output file exists, overwriting: ${tmp}.'" 1>&2 fi echo "${program}: Info='Moving ${out} to ${tmp}.'" 1>&2 mv "${out}" "${tmp}" fi } # purge arc_day arc_dir hostname purge () { arc_day="$1" arc_dir="$2" hostname="$3" echo "${program}: Purging ..." 1>&2 if [ ${arc_day} -lt 0 ]; then echo "${program}: Error='Number of archive days is less than 0: ${arc_day}.'" 1>&2 return 1 else if [ ! -e "${arc_dir}" ]; then echo "${program}: Warning='Archive directory does not exist, creating ${arc_dir}'" 1>&2 mkdir -p "${arc_dir}" fi if [ ${arc_day} -gt 0 ]; then echo "${program}: Info='Purging ${arc_day} days of archive data in ${arc_dir}'" 1>&2 files=`find "${arc_dir}" -mtime +${arc_day} -name "${hostname}*"` for file in ${files}; do echo "${program}: Info='Purging ${file}.'" 1>&2 rm -f "${file}" done else echo "${program}: Info='Not purging archive data in ${arc_dir}'" 1>&2 fi fi } # shoot pid_file shoot () { pid_file="$1" echo "${program}: Shooting ..." 1>&2 if [ -s "${pid_file}" ]; then pids=`cat "${pid_file}"` for pid in ${pids}; do echo "${program}: Info='Killing argus pid ${pid}.'" 1>&2 kill -SIGTERM "${pid}" done else echo "${program}: Info='PID file ${pid_file} does not exist or is zero size.'" 1>&2 fi } # start bin cnt int out start () { bin="$1" cnt="$2" int="$3" out="$4" out_dir=`dirname "${out}"` echo "${program}: Starting ..." 1>&2 if [ ${cnt} -gt 0 ]; then echo "${program}: Warning='Argus is already running.'" 1>&2 else if [ ! -e "${bin}" ]; then echo "${program}: Error='Argus binary not present or executable: ${bin}.'" 1>&2 return 1 fi ifconfig "${int}" > /dev/null 2>&1 if [ $? -eq 1 ]; then echo "${program}: Error='Invalid interface ${int}.'" 1>&2 return 1 fi if [ ! -d "${out_dir}" ]; then echo "${program}: Warning='Output directory does not exist, creating ${out_dir}.'" 1>&2 mkdir -p "${out_dir}" fi cmd=`echo "${bin}" -cd -i "${int}" -w "${out}"` echo "${program}: Info='Launching argus: ${cmd}.'" 1>&2 ${cmd} fi } # transfer tmp transfer () { tmp="$1" echo "${program}: Transferring ..." 1>&2 if [ ! -e "${tmp}" ]; then echo "${program}: Error='Transfer file does not exist: ${tmp}.'" 1>&2 return 1 else echo "${program}: Info='Transferring ${tmp}.'" 1>&2 cat "${tmp}" fi } hostname=`hostname` eval arc_day=\$arc_day_${hostname} eval arc_dir=\$arc_dir_${hostname} eval arc_zip=\$arc_zip_${hostname} eval bin=\$bin_${hostname} eval int=\$int_${hostname} eval out=\$out_${hostname} if [ "${arc_day}" = "" -o "${arc_dir}" = "" -o "${bin}" = "" -o "${int}" = "" -o "${out}" = "" ] ; then echo "${program}: Error='Missing one or more variables.'" 1>&2 echo " arc_day: ${arc_day}" 1>&2 echo " arc_dir: ${arc_dir}" 1>&2 echo " bin: ${bin}" 1>&2 echo " int: ${int}" 1>&2 echo " out: ${out}" 1>&2 shoot "/var/run/argus.pid" exit 1 fi cnt=`ps -ef | grep -v grep | grep "${bin}" | wc -l | awk '{print $1}'` date=`date +%Y%m%d%H%M%S` tmp="${out}.tmp" harvest "${arc_day}" "${arc_dir}" "${arc_zip}" "${date}" "${hostname}" "${out}" "${tmp}" if [ ${cnt} -eq 0 ]; then echo "${program}: Warning='Argus is not running.'" 1>&2 start "${bin}" "${cnt}" "${int}" "${out}" fi echo "${program}: Done with harvest." 1>&2 exit 0 --- harvest_argus --- Appendix 2 --- process_argus.cfg --- arch_day="1" arch_dir="/tmp/archive" arch_zip="/bin/gzip -9" clnt_lst="client1 client2" log_dir="/usr/local/integrity/log" drop_dir="/integrity/incoming" oper_dir="/tmp/operational" oper_zip="/bin/gzip -9" --- process_argus.cfg --- Appendix 3 --- process_argus --- #!/bin/sh IFS=' ' PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin PROGRAM=`basename $0` DATE=`date +%Y%m%d%H%M%S` WEBJOB_HOME=${WEBJOB_HOME-/usr/local/integrity} if [ -r ${WEBJOB_HOME}/etc/process_argus.cfg ] ; then . ${WEBJOB_HOME}/etc/process_argus.cfg else echo "${PROGRAM}: Error='${WEBJOB_HOME}/etc/process_argus.cfg does not exist or is unreadable.'" 1>&2 exit 1 fi if [ "${log_dir}" = "" ]; then echo "${PROGRAM}: Error='log_dir is not defined.'" 1>&2 exit 1 fi if [ ! -e "${log_dir}" ]; then mkdir -p "${log_dir}" if [ $? -ne 0 ]; then echo "${PROGRAM}: Error 'Could not create log_dir.'" 1>&2 exit 1 fi fi if [ ! -d "${log_dir}" ]; then echo "${DATE}: Error 'log_dir is not a directory.'" 1>&2 >> "${log_dir}" exit 1 fi log_file="${log_dir}/${PROGRAM}.log" if [ -z "${arch_day}" ]; then echo "${DATE}: Error 'arch_day is not defined.'" 1>&2 >> "${log_file}" exit 1 fi if [ -z "${arch_dir}" ]; then echo "${DATE}: Error 'arch_dir is not defined.'" 1>&2 >> "${log_file}" exit 1 fi if [ ! -d "${arch_dir}" ]; then echo "${DATE}: Warning 'arch_dir does not exist.'" 1>&2 >> "${log_file}" exit 1 fi if [ -z "${clnt_lst}" ]; then echo "${DATE}: Error 'clnt_lst is not defined.'" 1>&2 >> "${log_file}" exit 1 fi if [ -z "${drop_dir}" ]; then echo "${DATE}: Error 'drop_dir is not defined.'" 1>&2 >> "${log_file}" exit 1 fi if [ ! -d "${drop_dir}" ]; then echo "${DATE}: Warning 'drop_dir does not exist.'" 1>&2 >> "${log_file}" exit 1 fi if [ -z "${oper_dir}" ]; then echo "${DATE}: Error 'oper_dir is not defined.'" 1>&2 >> "${log_file}" exit 1 fi if [ ! -d "${oper_dir}" ]; then echo "${DATE}: Warning 'oper_dir does not exist.'" 1>&2 >> "${log_file}" exit 1 fi if [ ! -e "${arch_dir}" ]; then echo "${DATE}: Warning='${arch_dir} does not exist, creating.'" 1>&2 >> "${log_file}" mkdir -p "${arch_dir}" fi for client in ${clnt_lst}; do for RDY_FILE in `find ${drop_dir} -name "${client}*harvest_argus.rdy" 2> /dev/null` ; do ENV_FILE=${RDY_FILE%rdy}env ERR_FILE=${RDY_FILE%rdy}err OUT_FILE=${RDY_FILE%rdy}out for FILE in "${ENV_FILE}" "${ERR_FILE}" "${OUT_FILE}" ; do if [ ! -r "${FILE}" ] ; then echo "${DATE}: Error='${FILE} does not exist or is unreadable.'" 1>&2 >> "${log_file}" exit 1 fi done yyyymmdd=`echo "${OUT_FILE}" | cut -d\_ -f2 | cut -c 1-8` HOSTNAME=`egrep "Hostname=" ${ENV_FILE} | awk -F= '{print $2}'` ARCH="${oper_dir}/${HOSTNAME}_${yyyymmdd}" if [ -s "${OUT_FILE}" ]; then echo "${DATE}: Info='Appending ${OUT_FILE} to ${ARCH}.'" 1>&2 >> "${log_file}" cat "${OUT_FILE}" >> "${ARCH}" fi echo "${DATE}: Info='Moving ${OUT_FILE} to ${arch_dir}'" 1>&2 >> "${log_file}" mv "${OUT_FILE}" "${arch_dir}" filename=`basename "${OUT_FILE}"` if [ -n "${arch_zip}" ]; then echo "${DATE}: Info='Compressing: ${arch_zip} ${arch_dir}/${filename}.'" 1>&2 >> "${log_file}" eval "${arch_zip}" "${arch_dir}/${filename}" fi echo "${DATE}: Info='Removing: ${ENV_FILE}.'" 1>&2 >> "${log_file}" rm -f "${ENV_FILE}" "${ERR_FILE}" "${RDY_FILE}" done if [ ${arch_day} -ne 0 ]; then files=`find "${arch_dir}" -mtime +${arch_day} -name "${HOSTNAME}_*.out"` echo "${DATE}: Info='Purging files older than ${arch_day} days.'" 1>&2 >> "${log_file}" for file in ${files}; do echo "${DATE}: Info='Purging ${file}.'" 1>&2 >> "${log_file}" rm -f "${file}" done else echo "${DATE}: Info='arch_day is ${arch_day}, not purging.'" 1>&2 >> "${log_file}" fi if [ -n "${oper_zip}" ]; then files=`find "${oper_dir}" -mtime +1 -name "${HOSTNAME}_*" | grep -v \.gz` echo "${DATE}: Info='Compressing ...'" 1>&2 >> "${log_file}" for file in ${files}; do echo "${DATE}: Info='Compressing: ${oper_zip} ${file}.'" 1>&2 >> "${log_file}" eval "${oper_zip}" "${file}" done else echo "${DATE}: Info='oper_zip is ${oper_zip}, not compressing.'" 1>&2 >> "${log_file}" fi done --- process_argus ---