Check logs of central syslog-ng log host on FreeBSD
This blog post continues where the blog post A central log host with syslog-ng on FreeBSD left off. Open source solutions to check syslog log messages exist, such as Logcheck or Logwatch. Although these are not too difficult to implement and maintain, I still found them too much for my needs. So I went for my own home-grown solution to check the syslog messages of the SoCruel.NU central log host, and the solution presented in this blog post works pretty well for me! Some events I detected with this solution are:
- failing hard disks
- daemons which are not configured correctly
- daemons behaving differently than expected
- disks or volumes filling up
- bad requests to web servers and mail servers
This post describes a simple method to check the logs of the central log host with a shell script. The script first checks the syslog messages against a list of known bad strings, such as ‘error’ and ‘false’ (the blacklist), using regular expressions. It then filters these matches against a list of strings which are known to be harmless and which we do not want to see in the output report (the whitelist). The script is run periodically on the central log host and sends an e-mail with the output of the checks.
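To make the two stages concrete before diving into the real script, here is a minimal sketch of the same idea. The sample log lines, the function names blacklist and whitelist, and the short pattern lists are made up for illustration only; the actual script further down uses much longer lists.

```shell
#!/bin/sh
# Hypothetical sample log lines; real input comes from the central log host.
sample_log() {
    cat <<'EOF'
Jan 10 10:00:01 web1 nginx: upstream error while reading response
Jan 10 10:00:02 db1 kernel: ada0: disk offline
Jan 10 10:00:03 mail1 postfix: connect from relay.example.org
Jan 10 10:00:04 web1 cron: job finished without error (expected noise)
EOF
}

# Stage 1 (blacklist): keep only lines that contain a known bad string.
blacklist() { grep -E -i 'error|fail|offline'; }

# Stage 2 (whitelist): drop lines that are known to be harmless.
whitelist() { grep -E -v 'expected noise'; }

# Only the nginx and kernel lines survive both stages.
sample_log | blacklist | whitelist
```

The postfix line never matches the blacklist, and the cron line matches it but is removed again by the whitelist.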
Requirements
The following requirements have to be in place to be able to implement what is described in this post:
- an up to date FreeBSD system version 11.x or 12.x
- a central syslog host implemented as described in A central log host with syslog-ng on FreeBSD
- a mail client set up on the log host which is able to send mail using the mail command (e.g. ssmtp or dma)
Install bash and logcheck
The shell script which is discussed below is a bash script, so the bash package is installed first:
# pkg install -y bash
The program logtail2 is used to ‘tail’ through the logs in the script. This program is part of the logcheck package, so this is installed as well:
# pkg install -y logcheck
The nice thing about logtail2 is that it keeps track of what it has already read from its input (see man logtail2).
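The idea of remembering a read position can be illustrated with a few lines of shell. This is not how logtail2 is implemented; it is only a rough sketch of the concept, and the function name read_new_lines and the offset file are assumptions made for this example.

```shell
# read_new_lines LOGFILE OFFSETFILE
# Prints what was appended to LOGFILE since the previous call and
# remembers the new position in OFFSETFILE (hypothetical helper).
read_new_lines() {
    logfile=$1
    offsetfile=$2
    last=0
    # Pick up the byte position stored by the previous run, if any.
    [ -f "$offsetfile" ] && last=$(cat "$offsetfile")
    size=$(wc -c < "$logfile" | tr -d ' ')
    # If the file shrank, assume it was rotated and start from the top.
    [ "$size" -lt "$last" ] && last=0
    # tail -c +N starts printing at byte N (counting from 1).
    tail -c "+$((last + 1))" "$logfile"
    # Store the position for the next run. This sketch does not guard
    # against the log growing between the wc and the tail call.
    echo "$size" > "$offsetfile"
}
```

Calling it twice in a row on an unchanged file prints the content the first time and nothing the second time, which is the behaviour the check script relies on.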
The syslog check shell script
The script to check the logs is called /root/syslog-check.sh, but you can name it any way you like. Use your favorite editor to create it. The script is discussed section by section so that it is easy to follow. The first line declares that this is a bash script:
#!/usr/local/bin/bash
In the next section the variables are defined:
#---------------------------------------------------------------------
#       Variables
#---------------------------------------------------------------------
# Binary programs
LOGTAIL="/usr/local/sbin/logtail2"
TOUCH="/usr/bin/touch"
CHMOD="/bin/chmod"
FIND="/usr/bin/find"
MAIL="/usr/bin/mail"
CAT="/bin/cat"
# Date and time items
WEEKDAY=`/bin/date "+%a"`
YEAR=`/bin/date "+%Y"`
MONTH=`/bin/date "+%m"`
DAY=`/bin/date "+%d"`
HOUR=`/bin/date "+%H"`
MINUTE=`/bin/date "+%M"`
# Regular expression variables
GREP="/usr/bin/grep -F -v"
GREP_FAILURES="/usr/bin/grep -E -i"
GREP_EXCEPTIONS="/usr/bin/grep -E -v"
# The logfile which is checked
LOGFILE="/loghost/dailylogs/${WEEKDAY}.log"
# The output variables
OUTPUT_DIR="/loghost/.out"
OUTPUT_FILE="/loghost/.out/syslog-check-${YEAR}${MONTH}${DAY}-${HOUR}${MINUTE}.txt"
# Mail variables
MAIL_RECIPIENT="syslogcheck@domain.tld"
MAIL_SUBJECT="syslog check report ${YEAR}${MONTH}${DAY}-${HOUR}${MINUTE}"
You can set the variables $OUTPUT_DIR, $OUTPUT_FILE, $MAIL_RECIPIENT and $MAIL_SUBJECT to your own liking and configuration. The variable $LOGFILE depends on what you have configured in the loghost.conf file (see A central log host with syslog-ng on FreeBSD).
Next, two functions are defined. The first is called FAILURES and defines the bad strings which we want to search for in our logs:
#---------------------------------------------------------------------
#       Functions
#---------------------------------------------------------------------
FAILURES()
   {
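      # One -e per pattern line: a backslash-newline inside the quotes
      # would pull the indentation of the next line into the pattern.
      ${GREP_FAILURES} -e "error|crit|invalid|fail|false|warn|restart|deny" \
                       -e "disable|ignored|miss|except|fault" \
                       -e "cannot|denied|broken|exceed|block|unable|offline" \
                       -e "unsolicited|unhandled|traps|corrupt|unsafe" \
                       -e "shutting down|shutdown|stopping|terminating"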
   }
The second function is called EXCEPTIONS and defines the strings which we know are harmless and do not want to see in our output. This is our example whitelist:
EXCEPTIONS ()
   {
      ${GREP_EXCEPTIONS} "socruel\.nu" \
      | ${GREP_EXCEPTIONS} "\(syslog\/info\) \[syslogd\] restart"
   }
Two exceptions are defined in this example:
- ${GREP_EXCEPTIONS} "socruel\.nu" defines that we do not want to see any syslog message with ‘socruel.nu’ in it
- ${GREP_EXCEPTIONS} "\(syslog\/info\) \[syslogd\] restart" defines that we do not want to see a syslog message with ‘(syslog/info) [syslogd] restart’ in it
Please be aware that in the EXCEPTIONS function all lines except the last one need to end with a ‘\’ (backslash), and all lines except the first need to start with a ‘|’ (pipe)! The number of exceptions and which exceptions you need depend on what you are running, how your environment behaves and how much you want to see in the report. The SoCruel.NU implementation currently uses around 30 exceptions.
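With a few dozen exceptions, a long chain of pipes can become tedious to maintain. One possible alternative, which is not what the script in this post does, is to keep the exceptions in a separate file with one extended regular expression per line and hand that file to grep with -f. The file name and the function name below are assumptions for this sketch.

```shell
# Hypothetical exceptions file: one extended regular expression per line,
# for example:
#   socruel\.nu
#   \(syslog/info\) \[syslogd\] restart
EXCEPTIONS_FILE="${EXCEPTIONS_FILE:-/root/syslog-check.exceptions}"

EXCEPTIONS_FROM_FILE() {
    # -E: extended regular expressions
    # -v: print only the lines that match none of the patterns
    # -f: read the patterns from the given file
    grep -E -v -f "${EXCEPTIONS_FILE}"
}
```

Be careful not to leave an empty line in such a file: an empty pattern matches every line, so grep -v would then suppress the whole report.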
Next, we make sure that only the root user account can run this script:
#---------------------------------------------------------------------
#       Check if root
#---------------------------------------------------------------------
if [ "$(id -u)" != "0" ]; then
     echo "This script must be run by root." >&2
     exit 1
fi
With the above, the groundwork is done and dusted. Now the real work starts: the actual script. First the output file is created and its permissions are set:
#---------------------------------------------------------------------
#       The script
#---------------------------------------------------------------------
${TOUCH} ${OUTPUT_FILE}
${CHMOD} 0600 ${OUTPUT_FILE}
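The touch above only works if the output directory already exists, and every run leaves one more report file behind. The following is a hedged sketch of some optional housekeeping that could run before the output file is created. The function name prepare_output_dir and the 30 day retention period are assumptions, not part of the original script.

```shell
# Assumed retention period in days; adjust to taste.
RETENTION_DAYS=30

# prepare_output_dir DIR
# Creates DIR if needed and removes old report files from it.
prepare_output_dir() {
    dir=$1
    # Create the directory if it is missing and keep it private.
    mkdir -p "$dir" && chmod 0700 "$dir"
    # Delete report files that are older than the retention period.
    find "$dir" -type f -name 'syslog-check-*.txt' \
        -mtime "+${RETENTION_DAYS}" -delete
}
```

In the script this could be called as prepare_output_dir "${OUTPUT_DIR}" just before the output file is touched.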
Then we add some initial information to the output file, followed by a blank line:
echo "Syslog check ${YEAR}${MONTH}${DAY}-${HOUR}${MINUTE}" >> ${OUTPUT_FILE}
echo "==========================" >> ${OUTPUT_FILE}
echo "" >> ${OUTPUT_FILE}
The following command is where all the magic happens (!):
${LOGTAIL} ${LOGFILE} \
   | ${GREP} "$0" \
   | FAILURES \
   | EXCEPTIONS \
   >> ${OUTPUT_FILE}
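logtail2 prints only the lines of the log file (${LOGFILE}) which it has not read before. The ${GREP} "$0" step then drops every line which contains the name of this script itself, for example the cron entries which are logged when it runs. What remains is checked against the FAILURES (the blacklist) first and then against the EXCEPTIONS (the whitelist). The result is appended to the output file (${OUTPUT_FILE}).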
We are almost at the finish line! We have one more objective to go and that is sending out the output file to our defined mail recipient. We do this using the following command:
${CAT} ${OUTPUT_FILE} | ${MAIL} -s "${MAIL_SUBJECT}" ${MAIL_RECIPIENT}
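As written, a mail is sent on every run, even when nothing was found. If you prefer to receive mail only when there is something to look at, a small check like the one sketched below could wrap the mail command. The function name has_findings is an assumption; it relies on the report starting with exactly the three header lines written earlier.

```shell
# has_findings REPORT
# Succeeds when REPORT holds more than the three header lines
# (title, separator and blank line) written at the top of the report.
has_findings() {
    lines=$(wc -l < "$1" | tr -d ' ')
    [ "$lines" -gt 3 ]
}

# Possible use in the script, with the variables defined earlier:
# if has_findings "${OUTPUT_FILE}"; then
#     ${CAT} "${OUTPUT_FILE}" | ${MAIL} -s "${MAIL_SUBJECT}" "${MAIL_RECIPIENT}"
# fi
```

The drawback is that you no longer get an hourly sign of life from the log host, so a missing mail can mean either "nothing found" or "script not running".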
Once the script is saved, please make sure it has the right permissions set:
# chmod 0700 /root/syslog-check.sh
Last but not least: schedule it
Now we have to do one final action and that is to schedule the script to run periodically (here every hour). The command below adds the new entry in front of the existing crontab of root:
# echo "$(echo '0 */1 * * * /root/syslog-check.sh' ; crontab -l)" | crontab -
Now we are done! We have a central logging host, clients which log to it and we check all these logs with a simple script! Nice, right?
Resources
Some (other) resources about this subject: