Problem

  This recipe demonstrates how to deploy and manage argus -- a real
  time flow monitor designed to perform comprehensive IP network
  traffic auditing.

Requirements

  Cooking with this recipe requires an operational WebJob server.
  If you do not have one of those, refer to the instructions provided
  in the README.INSTALL file that comes with the source distribution.
  The latest source distribution is available here:

    http://sourceforge.net/project/showfiles.php?group_id=40788

  Each client system must be running UNIX, have argus installed,
  and have at least the following utilities: awk, basename, cat,
  crontab, date, echo, eval, find, grep, hostname, ifconfig, mkdir,
  mv, ps, rm, test, webjob, and wc.

  Likewise, the server must have at least the following utilities:
  awk, basename, cat, egrep, find, head, mail, mv, rm, sh, test,
  and wc.

  The commands presented throughout this recipe were designed to
  be executed within a Bourne shell (i.e. sh or bash).

Solution

  The solution is to use WebJob to manage argus.  Clients execute
  harvest_argus to transfer argus data to a WebJob server, and the
  server, in turn, processes that data with process_argus.

  The following steps describe how to implement this solution.

  1. Set WEBJOB_HOME as appropriate for your WebJob server.  This
     variable should be set to the prefix directory that was used
     when WebJob was installed (e.g., /usr/local/integrity).

       $ export WEBJOB_HOME=/usr/local/integrity

  2. Set WEBJOB_CLIENT and WEBJOB_COMMANDS as appropriate for your
     WebJob server:

       $ export WEBJOB_CLIENT=common
       $ export WEBJOB_COMMANDS=/integrity/profiles/${WEBJOB_CLIENT}/commands

  3. Extract harvest_argus from this recipe, and move it to the
     appropriate commands directory.  If you want the script to be
     shared by all clients, run the following commands as given
     below.  Otherwise, specify the desired client ID when setting
     WEBJOB_CLIENT.  Once harvest_argus is in place, set its ownership
     and permissions to 0:0 and mode 644 respectively.

       $ sed -e '1,/^--- harvest_argus ---$/d; /^--- harvest_argus ---$/,$d' webjob-monitor-argus.txt > harvest_argus
       $ mv -f harvest_argus ${WEBJOB_COMMANDS}
       $ chown 0:0 ${WEBJOB_COMMANDS}/harvest_argus
       $ chmod 644 ${WEBJOB_COMMANDS}/harvest_argus

     Modify the following control variables in harvest_argus as
     appropriate:

       arc_day_<client>
	 The number of days of argus data to archive on <client>,
	 with zero (0) as indefinite.  For example, the following
	 will remove files older than 30 days on client1:

	   arc_day_client1="30"

       arc_dir_<client>
         The directory where archived argus data will be place on
         the <client> (e.g., "/usr/local/bin/argus/archive").

       arc_zip_<client>
	 The full path and arguments of the compression program
	 used to compress archive data.  If blank, no compression
	 will be used.	Typical examples include: "/bin/gzip -9",
	 "/usr/local/bin/bzip2 -9", or "compress".

       bin_<client>
	 The full path to and filename of the argus binary (e.g.,
	 "/usr/local/sbin/argus").

       int_<client>
	 Interface name that argus will attach to (e.g., hme0, hme1,
	 dmfe0, dmfe1, etc.).

       out_<client> = output file
         The full path and filename of the Argus output file (e.g.,
	 "/tmp/argus.out").

  4. Extract process_argus from this recipe, and move it to the
     server's bin directory.  Set this file's ownership and
     permissions to 0:0 and mode 755 respectively.

       $ sed -e '1,/^--- process_argus ---$/d; /^--- process_argus ---$/,$d' webjob-monitor-argus.txt > process_argus
       $ mv -f process_argus ${WEBJOB_HOME}/bin
       $ chown 0:0 ${WEBJOB_HOME}/bin/process_argus
       $ chmod 755 ${WEBJOB_HOME}/bin/process_argus

     Next, extract process_argus.cfg from this recipe, and move it
     to the server's etc directory.  Set this file's ownership and
     permissions to 0:0 and mode 644 respectively.

       $ sed -e '1,/^--- process_argus.cfg ---$/d; /^--- process_argus.cfg ---$/,$d' webjob-monitor-argus.txt > process_argus.cfg
       $ mv -f process_argus.cfg ${WEBJOB_HOME}/etc
       $ chmod 644 ${WEBJOB_HOME}/etc/process_argus.cfg
       $ chown 0:0 ${WEBJOB_HOME}/etc/process_argus.cfg

     The config file, process_argus.cfg, contains various site-specific
     variables (see below).  Set these variables as appropriate.

       arch_day
	 The number of days to maintain archive data before deletion,
	 where zero (0) maintains data indefinitely.

       arch_dir
	 The directory where argus data will be archived (e.g.,
	 "/backup/argus")

       arch_zip
	 The full path and arguments of the compression program
	 used to compress archive data.  If blank, no compression
	 will be used.  Typical examples include: "/bin/gzip -9",
	 "/usr/local/bin/bzip2 -9", or "compress".

       clnt_lst
	 A space delimited list of WebJob clients that will be
	 processed by process_argus (e.g., "client1 client2 client3").

       drop_dir
	 The directory where WebJob places argus data (i.e. the
	 dropzone).  Typically, this is "/integrity/incoming".

       log_dir
	 The full path where process_argus will log activities
	 (e.g., "/usr/local/integrity/log").

       oper_dir
	 The directory where argus data files will be consolidated
	 and analyzed.

       oper_zip
	 The full path and arguments of the compression program
	 used to compress operational data.  If blank, no compression
	 will be used.  Typical examples include: "/bin/gzip -9",
	 "/usr/local/bin/bzip2 -9", or "compress".

  5. Create a crontab entry on each client that periodically
     executes harvest_argus.  The following crontab entry shown
     below will execute harvest_argus hourly.  Replace WEBJOB_HOME
     as appropriate for your system prior to committing the cron
     job.

       0 * * * * WEBJOB_HOME/bin/webjob -e -f WEBJOB_HOME/etc/webjob.cfg harvest_argus

  6. Create a crontab entry on the server that periodically executes
     process_argus.  To ensure that each job is processed in a
     timely fashion and that no significant back-logging occurs,
     use a processing frequency that is slightly offset from and
     at least twice as fast as harvest_argus.  Following the example
     above, we will execute process_argus every 30 minutes.  It is
     configured to write its output to stdout and delete the incoming
     files after they have been processed.

       10,40 * * * * WEBJOB_HOME/bin/process_argus

Closing Remarks

  Rather than running process_argus from cron, it would be better
  to create a daemon that proactively manages the incoming directory
  (i.e. the dropzone).  This daemon would be responsible for
  processing incoming files and invoking the proper follow-on
  analysis stages such as process_argus.

Credit

  This recipe is brought to you by Andy Bair, July 2003.

Appendix 1

--- harvest_argus ---
#!/bin/sh

program=`basename $0`

######################################################################
#
# Variables
#

# arc_day = number of days to archive
arc_day_client1="0"
arc_day_client2="0"

# arc_dir = location of archived data
arc_dir_client1="/opt/argus"
arc_dir_client2="/opt/argus"

# arc_zip = location of archived data
arc_zip_client1="/bin/gzip -9"
arc_zip_client2="/usr/bin/bzip2 -9"

# bin = full path to argus binary
bin_client1="/usr/local/sbin/argus"
bin_client2="/usr/local/sbin/argus"

# int {hme0|hme1|dmfe0|dmfe1} = interface to run argus
int_client1="eth0"
int_client2="hme1"

# out = output file
out_client1="/tmp/argus.out"
out_client2="/tmp/argus.out"


######################################################################
#
# Functions
#

# archive arc_dir arc_zip date hostname tmp
archive () {
  arc_dir="$1"
  arc_zip="$2"
  date="$3"
  hostname="$4"
  tmp="$5"

  echo "${program}: Archiving ..." 1>&2
  if [ ! -e "${arc_dir}" ]; then
    echo "${program}: Warning='Archive directory does not exist, creating ${arc_dir}.'" 1>&2
    mkdir -p "${arc_dir}"
  fi

  if [ ! -e "${tmp}" ]; then
    echo "${program}: Error='Archive file does not exist: ${tmp}.'" 1>&2
    return 1
  else
    filename="${arc_dir}/${hostname}_${date}.argus"
    echo "${program}: Info='Moving data to archive: ${tmp} to ${filename}.'" 1>&2
    mv "${tmp}" "${filename}"
    if [ -n "${arc_zip}" ]; then
      echo "${program}: Info='Compressing: ${arc_zip} ${filename}.'" 1>&2
      eval "${arc_zip}" "${filename}"
    else
      echo "${program}: Info='Not compressing: ${filename}.'" 1>&2
    fi
  fi
  }

# harvest arc_day arc_dir arc_zip date hostname out tmp
harvest () {
  arc_day="$1"
  arc_dir="$2"
  arc_zip="$3"
  date="$4"
  hostname="$5"
  out="$6"
  tmp="$7"

  echo "${program}: Harvesting ..." 1>&2
  move "${out}" "${tmp}"
  transfer "${tmp}"
  archive "${arc_dir}" "${arc_zip}" "${date}" "${hostname}" "${tmp}"
  purge "${arc_day}" "${arc_dir}" "${hostname}"
  }

# move out tmp
move () {
  out="$1"
  tmp="$2"

  echo "${program}: Moving ..." 1>&2
  if [ ! -e "${out}" ]; then
    echo "${program}: Error='Output file does not exist: ${out}.'" 1>&2
    return 1
  else
    if [ -e "${tmp}" ]; then
      echo "${program}: Warning='Temporary output file exists, overwriting: ${tmp}.'" 1>&2
    fi
    echo "${program}: Info='Moving ${out} to ${tmp}.'" 1>&2
    mv "${out}" "${tmp}"
  fi
  }

# purge arc_day arc_dir hostname
purge () {
  arc_day="$1"
  arc_dir="$2"
  hostname="$3"

  echo "${program}: Purging ..." 1>&2
  if [ ${arc_day} -lt 0 ]; then
    echo "${program}: Error='Number of archive days is less than 0: ${arc_day}.'" 1>&2
    return 1
  else

    if [ ! -e "${arc_dir}" ]; then
      echo "${program}: Warning='Archive directory does not exist, creating ${arc_dir}'" 1>&2
      mkdir -p "${arc_dir}"
    fi

    if [ ${arc_day} -gt 0 ]; then
      echo "${program}: Info='Purging ${arc_day} days of archive data in ${arc_dir}'" 1>&2
      files=`find "${arc_dir}" -mtime +${arc_day} -name "${hostname}*"`
      for file in ${files}; do
        echo "${program}: Info='Purging ${file}.'" 1>&2
        rm -f "${file}"
      done
    else
      echo "${program}: Info='Not purging archive data in ${arc_dir}'" 1>&2
    fi
  fi
  }

# shoot pid_file
shoot () {
  pid_file="$1"

  echo "${program}: Shooting ..." 1>&2
  if [ -s "${pid_file}" ]; then
    pids=`cat "${pid_file}"`
    for pid in ${pids}; do
      echo "${program}: Info='Killing argus pid ${pid}.'" 1>&2
      kill -SIGTERM "${pid}"
    done
  else
    echo "${program}: Info='PID file ${pid_file} does not exist or is zero size.'" 1>&2
  fi
  }

# start bin cnt int out
start () {
  bin="$1"
  cnt="$2"
  int="$3"
  out="$4"
  out_dir=`dirname "${out}"`

  echo "${program}: Starting ..." 1>&2
  if [ ${cnt} -gt 0 ]; then
   echo "${program}: Warning='Argus is already running.'" 1>&2
  else

    if [ ! -e "${bin}" ]; then
      echo "${program}: Error='Argus binary not present or executable: ${bin}.'" 1>&2
      return 1
    fi

    ifconfig "${int}" > /dev/null 2>&1
    if [ $? -eq 1 ]; then
      echo "${program}: Error='Invalid interface ${int}.'" 1>&2
      return 1
    fi

    if [ ! -d "${out_dir}" ]; then
      echo "${program}: Warning='Output directory does not exist, creating ${out_dir}.'" 1>&2
      mkdir -p "${out_dir}"
    fi

    cmd=`echo "${bin}" -cd -i "${int}" -w "${out}"`
    echo "${program}: Info='Launching argus: ${cmd}.'" 1>&2
    ${cmd}
  fi
  }

# transfer tmp
transfer () {
  tmp="$1"

  echo "${program}: Transferring ..." 1>&2
  if [ ! -e "${tmp}" ]; then
    echo "${program}: Error='Transfer file does not exist: ${tmp}.'" 1>&2
    return 1
  else
    echo "${program}: Info='Transferring ${tmp}.'" 1>&2
    cat "${tmp}"
  fi
  }

hostname=`hostname`
eval arc_day=\$arc_day_${hostname}
eval arc_dir=\$arc_dir_${hostname}
eval arc_zip=\$arc_zip_${hostname}
eval bin=\$bin_${hostname}
eval int=\$int_${hostname}
eval out=\$out_${hostname}

if [ "${arc_day}" = "" -o "${arc_dir}" = "" -o "${bin}" = "" -o "${int}" = "" -o "${out}" = "" ] ; then
  echo "${program}: Error='Missing one or more variables.'" 1>&2
  echo "  arc_day: ${arc_day}" 1>&2
  echo "  arc_dir: ${arc_dir}" 1>&2
  echo "  bin: ${bin}" 1>&2
  echo "  int: ${int}" 1>&2
  echo "  out: ${out}" 1>&2
  shoot "/var/run/argus.pid"
  exit 1
fi

cnt=`ps -ef | grep -v grep | grep "${bin}" | wc -l | awk '{print $1}'`
date=`date +%Y%m%d%H%M%S`
tmp="${out}.tmp"

harvest "${arc_day}" "${arc_dir}" "${arc_zip}" "${date}" "${hostname}" "${out}" "${tmp}"
if [ ${cnt} -eq 0 ]; then
  echo "${program}: Warning='Argus is not running.'" 1>&2
  start "${bin}" "${cnt}" "${int}" "${out}"
fi

echo "${program}: Done with harvest." 1>&2
exit 0
--- harvest_argus ---

Appendix 2

--- process_argus.cfg ---
arch_day="1"
arch_dir="/tmp/archive"
arch_zip="/bin/gzip -9"
clnt_lst="client1 client2"
log_dir="/usr/local/integrity/log"
drop_dir="/integrity/incoming"
oper_dir="/tmp/operational"
oper_zip="/bin/gzip -9"
--- process_argus.cfg ---

Appendix 3

--- process_argus ---
#!/bin/sh

IFS=' 	
'
PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin
PROGRAM=`basename $0`
DATE=`date +%Y%m%d%H%M%S`

WEBJOB_HOME=${WEBJOB_HOME-/usr/local/integrity}

if [ -r ${WEBJOB_HOME}/etc/process_argus.cfg ] ; then
  . ${WEBJOB_HOME}/etc/process_argus.cfg
else
  echo "${PROGRAM}: Error='${WEBJOB_HOME}/etc/process_argus.cfg does not exist or is unreadable.'" 1>&2
  exit 1
fi

if [ "${log_dir}" = "" ]; then
  echo "${PROGRAM}: Error='log_dir is not defined.'" 1>&2
  exit 1
fi

if [ ! -e "${log_dir}" ]; then
  mkdir -p "${log_dir}"
  if [ $? -ne 0 ]; then
    echo "${PROGRAM}: Error 'Could not create log_dir.'" 1>&2
    exit 1
  fi
fi

if [ ! -d "${log_dir}" ]; then
  echo "${DATE}: Error 'log_dir is not a directory.'" 1>&2 >> "${log_dir}"
  exit 1
fi

log_file="${log_dir}/${PROGRAM}.log"


if [ -z "${arch_day}" ]; then
  echo "${DATE}: Error 'arch_day is not defined.'" 1>&2 >> "${log_file}"
  exit 1
fi

if [ -z "${arch_dir}" ]; then
  echo "${DATE}: Error 'arch_dir is not defined.'" 1>&2 >> "${log_file}"
  exit 1
fi

if [ ! -d "${arch_dir}" ]; then
  echo "${DATE}: Warning 'arch_dir does not exist.'" 1>&2 >> "${log_file}"
  exit 1
fi

if [ -z "${clnt_lst}" ]; then
  echo "${DATE}: Error 'clnt_lst is not defined.'" 1>&2 >> "${log_file}"
  exit 1
fi

if [ -z "${drop_dir}" ]; then
  echo "${DATE}: Error 'drop_dir is not defined.'" 1>&2 >> "${log_file}"
  exit 1
fi

if [ ! -d "${drop_dir}" ]; then
  echo "${DATE}: Warning 'drop_dir does not exist.'" 1>&2 >> "${log_file}"
  exit 1
fi

if [ -z "${oper_dir}" ]; then
  echo "${DATE}: Error 'oper_dir is not defined.'" 1>&2 >> "${log_file}"
  exit 1
fi

if [ ! -d "${oper_dir}" ]; then
  echo "${DATE}: Warning 'oper_dir does not exist.'" 1>&2 >> "${log_file}"
  exit 1
fi

if [ ! -e "${arch_dir}" ]; then
  echo "${DATE}: Warning='${arch_dir} does not exist, creating.'" 1>&2 >> "${log_file}"
  mkdir -p "${arch_dir}"
fi

for client in ${clnt_lst}; do

  for RDY_FILE in `find ${drop_dir} -name "${client}*harvest_argus.rdy" 2> /dev/null` ; do

    ENV_FILE=${RDY_FILE%rdy}env
    ERR_FILE=${RDY_FILE%rdy}err
    OUT_FILE=${RDY_FILE%rdy}out

    for FILE in "${ENV_FILE}" "${ERR_FILE}" "${OUT_FILE}" ; do

      if [ ! -r "${FILE}" ] ; then
        echo "${DATE}: Error='${FILE} does not exist or is unreadable.'" 1>&2 >> "${log_file}"
        exit 1
      fi

    done

    yyyymmdd=`echo "${OUT_FILE}" | cut -d\_ -f2 | cut -c 1-8`
    HOSTNAME=`egrep "Hostname=" ${ENV_FILE} | awk -F= '{print $2}'`
    ARCH="${oper_dir}/${HOSTNAME}_${yyyymmdd}"

    if [ -s "${OUT_FILE}" ]; then
      echo "${DATE}: Info='Appending ${OUT_FILE} to ${ARCH}.'" 1>&2 >> "${log_file}"
      cat "${OUT_FILE}" >> "${ARCH}"
    fi


    echo "${DATE}: Info='Moving ${OUT_FILE} to ${arch_dir}'" 1>&2 >> "${log_file}"
    mv "${OUT_FILE}" "${arch_dir}"

    filename=`basename "${OUT_FILE}"`
    if [ -n "${arch_zip}" ]; then
      echo "${DATE}: Info='Compressing: ${arch_zip} ${arch_dir}/${filename}.'" 1>&2 >> "${log_file}"
      eval "${arch_zip}" "${arch_dir}/${filename}"
    fi

    echo "${DATE}: Info='Removing: ${ENV_FILE}.'" 1>&2 >> "${log_file}"
    rm -f "${ENV_FILE}" "${ERR_FILE}" "${RDY_FILE}"

  done

  if [ ${arch_day} -ne 0 ]; then
    files=`find "${arch_dir}" -mtime +${arch_day} -name "${HOSTNAME}_*.out"`
    echo "${DATE}: Info='Purging files older than ${arch_day} days.'" 1>&2 >> "${log_file}"
    for file in ${files}; do
      echo "${DATE}: Info='Purging ${file}.'" 1>&2 >> "${log_file}"
      rm -f "${file}"
    done
  else
    echo "${DATE}: Info='arch_day is ${arch_day}, not purging.'" 1>&2 >> "${log_file}"
  fi

  if [ -n "${oper_zip}" ]; then
    files=`find "${oper_dir}" -mtime +1 -name "${HOSTNAME}_*"  | grep -v \.gz`
    echo "${DATE}: Info='Compressing ...'" 1>&2 >> "${log_file}"
    for file in ${files}; do
      echo "${DATE}: Info='Compressing: ${oper_zip} ${file}.'" 1>&2 >> "${log_file}"
      eval "${oper_zip}" "${file}"
    done
  else
    echo "${DATE}: Info='oper_zip is ${oper_zip}, not compressing.'" 1>&2 >> "${log_file}"
  fi

done
--- process_argus ---