Just as Script #84, Exploring the Apache access_log, reveals the interesting and useful statistical information found in the regular access log of an Apache or Apache-compatible web server, this script extracts the critical information from the error_log.
For those web servers that don't automatically split their log file into separate access_log and error_log components, you can sometimes split a central log file into access and error components by filtering based on the return code (field 9) of each entry in the log:
awk '{if (substr($9,0,1) <= "3") { print $0 } }' apache.log > access_log awk '{if (substr($9,0,1) > "3") { print $0 } }' apache.log > error_log
A return code that begins with a 4 or a 5 is a failure (the 400s are client errors, the 500s are server errors), and a return code beginning with a 2 or a 3 is a success (the 200s are success messages, the 300s are redirects):
Other servers that produce a single central log file containing both successes and errors denote the error message entries with an [error] field value. In that case, the split can be done with a grep '[error]' to create the error_log and a grep -v '[error]' to create the access_log.
Whether your server automatically creates an error_log or you have to create your own error log by searching for entries with the '[error]' string, in the error log just about everything is different, including the way the date is specified:
$ head -1 error_log [Thu Jan 2 10:07:07 2003] [error] [client 208.180.31.244] File does not exist: /usr/local/etc/httpd/htdocs/intuitive/favicon.ico
In the access_log, dates are specified as a compact one-field value with no spaces, but the error_log takes five fields instead. Further, rather than a consistent scheme in which the word/string position in a space-delimited entry consistently identifies a particular field, entries in the error_log have a meaningful error description that varies in length. An examination of just those description values reveals surprising variation:
$ awk '{print $9" "$10" "$11" "$12 }' error_log | sort -u File does not exist: Invalid error redirection directive: Premature end of script execution failure for parameter premature EOF in parsed script not found or malformed header from script
Some of these errors should be examined by hand because they can be difficult to track backward to the offending web page once identified. Others are just transient problems:
[Thu Jan 16 20:03:12 2003] [error] [client 205.188.209.101] (35) Resource temporarily unavailable: couldn't spawn include command /usr/home/taylor/web/intuitive/library/header.cgi: Cannot fork: Resource temporarily unavailable
This script focuses on the most common problems — in particular, File does not exist errors — and then produces a dump of all other error_log entries that don't match well-known error situations.
#!/bin/sh # weberrors - Scans through an Apache error_log file and reports the # most important errors, then lists additional entries. temp="/tmp/$(basename $0).$$" # The following three lines will need to be customized for your own # installation for this script to work best. htdocs="/usr/local/etc/httpd/htdocs/" myhome="/usr/home/taylor/" cgibin="/usr/local/etc/httpd/cgi-bin/" sedstr="s/^/ /g;s|$htdocs|[htdocs] |;s|$myhome|[homedir] |;s|$cgibin|[cgi-bin] |" screen="(File does not exist|Invalid error redirect|premature EOF|Premature end of script|script not found)" length=5 # entries per category to display checkfor() { grep "${2}:" "$1" | awk '{print $NF}' |\ sort | uniq -c | sort -rn | head -$length | sed "$sedstr" > $temp if [ $(wc -l < $temp) -gt 0 ] ; then echo "" echo "$2 errors:" cat $temp fi } trap "/bin/rm -f $temp" 0 if [ "$1" = "-l" ] ; then length=$2; shift 2 fi if [ $# -ne 1 -o ! -r "$1" ] ; then echo "Usage: $(basename $0) [-l len] error_log" >&2 exit 1 fi echo Input file $1 has $(wc -l < "$1") entries. start="$(grep -E '\[.*:.*:.*\]' "$1" | head -1 | awk '{print $1" "$2" "$3" "$4" "$5 }')" end="$(grep -E '\[.*:.*:.*\]' "$1" | tail -1 | awk '{print $1" "$2" "$3" "$4" "$5 }')" echo -n "Entries from $start to $end" echo "" ### Check for various common and well-known errors: checkfor "$1" "File does not exist" checkfor "$1" "Invalid error redirection directive" checkfor "$1" "premature EOF" checkfor "$1" "script not found or unable to stat" checkfor "$1" "Premature end of script headers" grep -vE "$screen" "$1" | grep "\[error\]" | grep "\[client " | \ sed 's/\[error\]/\`/' | cut -d\` -f2 | cut -d\ -f4- | \ sort | uniq -c | sort -rn | sed 's/^/ /' | head -$length > $temp if [ $(wc -l < $temp) -gt 0 ] ; then echo "" echo "Additional error messages in log file:" cat $temp fi echo "" echo "And non-error messages occurring in the log file:" grep -vE "$screen" "$1" | grep -v "\[error\]" | \ sort | uniq -c | sort -rn | \ sed 's/^/ /' | head -$length exit 0
This script works by scanning the error_log for the five errors specified in the calls to the checkfor function, extracting the last field on each error line with an awk call for $NF (NF represents the number of fields in that particular input line). This output is then fed through the common sort | uniq -c | sort -rn sequence to allow the extraction of the most commonly occurring errors for that category of problem.
To ensure that only those error types with matches are shown, each specific error search is saved to the temporary file, which is then tested for contents before a message is output. This is all neatly done with the checkfor function that appears near the top of the script.
The last few lines of the script are perhaps the most complex. First they identify the most common errors not otherwise checked for by the script that are still in standard Apache error log format. The following grep invocations are part of a longer pipe:
grep -vE "$screen" "$1" | grep "\[error\]"
Then the script identifies the most common errors not otherwise checked for by the script that don't occur in standard Apache error log format. Again, the following grep invocations are part of a longer pipe:
grep -vE "$screen" "$1" | grep -v "\[error\]"
This script should be fed a standard Apache-format error log as its only argument. If invoked with an -l length argument, it'll display length number of matches per error type checked rather than the default of five entries per error type.
$ weberrors error_log Input file error_log has 1040 entries. Entries from [Sat Aug 23 18:10:21 2003] to [Sat Aug 30 17:23:38 2003] File does not exist errors: 24 [htdocs] intuitive/coolweb/Graphics/Graphics/off.gif 19 [htdocs] intuitive/taylor/Graphics/biohazard.gif 19 [homedir] public_html/tyu/tyu-toc.html 14 [htdocs] intuitive/Graphics/bottom-menu.gif 12 [htdocs] intuitive/tmp/rose-ceremony/spacer.gif Invalid error redirection directive errors: 23 index.html script not found or unable to stat errors: 55 [htdocs] intuitive/coolweb/apps/env.cgi 4 [htdocs] intuitive/cgi-local/apps/env.cgi 4 [cgi-bin] FormMail.pl 3 [htdocs] intuitive/origins/playgame.cgi Additional error messages in log file: 5 (35)Resource temporarily unavailable: couldn't spawn include command 4 unknown parameter "src" to tag include in /usr/local/etc/httpd/htdocs/intuitive/tmp/ECR0803b.shtml 4 execution failure for parameter "cmd" to tag exec in file /usr/local/etc/httpd/htdocs/intuitive/library/footer.shtml 1 execution failure for parameter "cmd" to tag exec in file /usr/local/etc/httpd/htdocs/intuitive/library/WindWillows.shtml And non-error messages occurring in the log file: 39 /usr/home/taylor/web/intuitive/library/header.cgi: Cannot fork: Resource temporarily unavailable 20 identify: Missing an image file name. 17 sort: -: write error: Broken pipe 16 /web/bin/lastmod: not found 16 /web/bin/counter: not found
This HTML Help has been published using the chm2web software. |