8.1. Apache Logging FacilitiesApache can produce many types of logs. The two essential types are the access log, where all requests are noted, and the error log, which is designed to log various informational and debug messages, plus every exceptional event that occurs. Additional information can be found in module-specific logs, as is the case with mod_ssl, mod_rewrite and mod_security. The access log is created and written to by the module mod_log_config, which is not a part of the core, but this module is so important that everyone treats it as if it is. 8.1.1. Request LoggingYou only need to be familiar with three configuration directives to manage request logging: In fact, you will need to use only two. The CustomLog directive is so flexible and easy to use that you will rarely need to use transferLog in your configuration. (It will become clear why later.) Other directives are available, but they are deprecated and should not be used because CustomLog can achieve all the necessary functionality. Some have been removed from Apache 2:
8.1.1.1 LogFormatBefore covering the process of logging to files, consider the format of our log files. One of the benefits of Apache is its flexibility when it comes to log formatting. All this is owed to the LogFormat directive, whose default is the following, referred to as the Common Log Format (CLF): LogFormat "%h %l %u %t \"%r\" %>s %b" common The first parameter is a format string indicating the information to be included in a log file and the format in which it should be written; the second parameter gives the format string a name. You can decipher the log format using the symbol table. The table is available from the Apache reference documentation (http://httpd.apache.org/docs-2.0/mod/mod_log_config.html). It is reproduced in Table 8-1.
You have a lot of fields to play with. Format strings support optional parameters, as represented by the " . . . " in each format string representation in the table. Optional parameters can be used for the following actions:
Apache modules can collaborate on logging if they create a named note (a text string) and attach it to the request. If the %{note}n format string is used, the contents of the note will be written to the log. A change in the Apache architecture in the second generation allows for modules to collaborate and provide custom format strings. These format strings are available if the module that provides them is included in the configuration. (See Table 8-2.)
With the inclusion of mod_logio, you can measure the number of bytes transferred for every request. This feature allows hosting providers to put accurate billing mechanisms in place. (With Apache 1, you can only record the size of the response body, leaving request headers, request body, and response headers unmeasured.) Now that you are familiar with format strings, look at commonly used log formats (see Table 8-3). (You will need to define these formats in httpd.conf if they are not already there.)
Though you can create your own log format, you will most likely use one of the formats above since that is what web server log analyzers support. Nevertheless, the ability to create logs with a custom format is convenient for advanced uses, as we shall see later in this chapter. 8.1.1.2 TransferLogtransferLog is the basic request logging directive, which creates an access log with the given filename: TransferLog /var/www/logs/access_log The filename can be given with an absolute path, as above; if a relative filename is supplied, Apache will create the full path by pre-pending the server home directory (e.g. /usr/local/apache). By default, the transferLog directive uses the Common Log Format (CLF), which logs every request on a single line with information formatted (as shown in Section 8.1.1.1). Here is an example of what such a line looks like: 81.137.203.242 - - [29/Jun/2004:14:36:04 +0100] "POST /upload.php HTTP/1.1" 200 3229 However, if a LogFormat directive has been used earlier in the configuration file, the transferLog directive will use the format it defined and not the CLF. This is unexpected and can lead to errors since changing the order in which formats are defined can lead to a different format being used for the log files. I prefer not to use transferLog, and instead use the CustomLog directive (which forces me to explicitly define the log format). 8.1.1.3 CustomLogThe real power comes from using the CustomLog directive. The equivalent to the transferLog usage described above looks like this: CustomLog /var/www/logs/access_log custom The explicit naming of the log format helps us avoid mistakes. I like this directive because of its conditional logging features. Have a look at the following configuration fragment: # determine which requests are static - you may need to # adjust the regular expression to exclude other files, such # as PDF documents, or archives SetEnvIfNoCase REQUEST_URI "\.(gif|png|jpg)$" static_request # only log dynamic requests CustomLog /var/www/logs/application_log combined env=!static_request The conditional logging opens the door to many interesting logging opportunities, which really helps in real life. Most commonly, you will use mod_setenvif or mod_rewrite (which can also set environment variables) to determine what gets logged. I mentioned that, by default, Apache uses the CLF, which does not record many request parameters. At the very least you should change the configuration to use the combined format, which includes the UserAgent and the Referer fields. Looking at the log format string table shown in the LogFormat section, you can see over twenty different format strings, so even the use of a combined format results in loss of information. Create your own log format based on your information requirements. A nice example can be found at:
In the article, Chris makes a case for a log format that allows for web serving troubleshooting and performance management. At the end, he names the resulting log format Blackbox. 8.1.2. Error LoggingThe Apache error log contains error messages and information about events unrelated to request serving. In short, the error log contains everything the access log doesn't:
The format of the error log is fixed. Each line essentially contains only three fields: the time, the error level, and the message. In some rare cases, you can get raw data in the error log (no time or error level). Apache 2 adds the Referer information to 404 responses noted in the error log. Error logs are created using the ErrorLog configuration directive. Standard file naming conventions apply here; a relative filename will be assumed to be located in the server main folder. ErrorLog /var/www/logs/error_log The directive can be configured globally or separately for each virtual host. The LogLevel directive configures log granularity and ensures more information is not in the log than necessary. Its single parameter is one of the levels in Table 8-4. Events that are on the specified level or higher will be written to the log file.
The default setting is warn. However, Apache always logs the messages of level notice when logging to text files. Some interesting messages are emitted on the informational level (e.g., that a client timed out on a connection, a potential sign of a DoS attack). Consider running the error log on the information level: LogLevel info Take some time to observe the error log to get a feeling as to what constitutes normal Apache behavior. Some messages seem dangerous but may not be. On server startup, you will get a message similar to this one: [Mon Jul 05 12:26:27 2004] [notice] Apache/2.0.50 (Unix) DAV/2 PHP/4.3.4 configured -- resuming normal operations You will see a message to log the shutdown of the server: [Mon Jul 05 12:27:22 2004] [notice] caught SIGTERM, shutting down Most other relevant events will find their way to the error log as well. The Apache error log is good at telling you that something bad has happened, but it may not contain enough information to describe it. For example, since it does not contain information about the host where the error occurred, it is difficult to share one error log between virtual hosts. There is a way to get more informational error messages using the mechanism of custom logging. Here is an example: LogFormat "%h %l %u %t \"%r\" %>s %b \"%{error-notes}n\"" commone CustomLog logs/super_error_log commone Most of the time, the error message that caused a request to fail is contained in the error-notes note. By adding the contents of that variable to the log line output to the access log, we can get any request detail we want and the error message at the same time. This trick does not remove a need for the error log but makes forensic log analysis much easier. 8.1.3. Special Logging ModulesApache processes should never crash, but when they do, a message such as the following will appear in the error log: [Mon Jul 5 08:33:08 2004] [notice] child pid 1618 exit signal Segmentation fault (11) A segmentation fault appears because of an error in Apache code or because a hacker is taking advantage of the web server through a buffer overflow attack. Either way, this is bad and you have to find out why it is happening. Having frequent unexplained segmentation faults is a reason for concern. Your first impulse after discovering a segmentation fault will probably be to find the request that caused it. Due to the inadequate format of the error log, this may be difficult. Segmentation fault messages appear only in the main error log and not in the virtual hosts. Finding the corresponding request log entry may prove difficult when hosting a server with more than a couple of virtual hosts since the information about which virtual host was being processed at the time is unavailable. To make the matter worse, the request usually is not logged to the access log. The logging phase is one of the last phases of request processing to take place, so nothing is logged when the server crashes during one of the earlier phases. The purpose of mod_forensics (available since Versions 1.3.31 and 2.0.50) is to reveal the requests that make the server crash. It does that by having a special log file where requests are logged twice: once at the beginning and once at the end. A special utility script is used to process the log file. If a request appears only once in the log file, we know the server crashed before it could log the request for the second time. To enable mod_forensics you also need to enable mod_unique_id. After you add the module to your configuration, decide where to put the new log file: ForensicLog /var/www/logs/forensic_log After restarting the server, the beginning of each request will be marked with a log of the request data (with headers but excluding the request body). Here is an example: +QOmjHtmgtpkAADFIBBw|GET /cgi-bin/modsec-test.pl HTTP/1.1|Accept:text/xml,application/xml,application/xhtml+xml,text/html %3bq=0.9,text/plain%3bq=0.8,image/png,image/jpeg,image/gif%3 bq=0.2,%2a/%2a%3bq=0.1|Accept-Charset:ISO-8859-1,utf-8%3bq=0.7,%2a%3bq=0.7| Accept-Encoding:gzip,deflate|Accept-Language:en-us,en%3bq=0.5| Cache-Control:max-age=0|Connection:keep-alive|Host:www.ivanristic.com:8080| Keep-Alive:300|User-Agent:Mozilla/5.0 %28Windows%3b U%3b Windows NT 5.1%3b en-US%3b rv:1.7%29 Gecko/20040616 For each request that was properly handled, the unique identifier will be written to the log, too: -QOmjHtmgtpkAADFIBBw As you can see, a lot of data is being logged, so implement frequent log rotation for the forensic log. I don't think it is a good idea to leave mod_forensics enabled on a production server because excessive logging decreases performance. The chances of catching the offending request with mod_forensics are good though in some rare instances this module will fail:
Once you figure out the request, you should determine which of the active modules causes it. Your goal here is to determine whether to contact the module author (for a third-party module) or the Apache developers at dev@apache.org (for standard modules). If you have to continue on your own, consider the following tips:
8.1.4. Audit LogOne major disadvantage of Apache's (and most other web servers') logging facilities is that there is no way to observe and log request and response bodies. While most web application attacks take place through GET requests, that is only because they are performed (or programmed) by less capable attackers. The dangerous types will take the extra two minutes to craft a POST request, knowing the chances of the attack being logged are very small. However, audit logging becomes a possibility with the help of mod_security (http://www.modsecurity.org). This module (described further in Chapter 12) adds audit logging configuration directives that can be placed almost anywhere in the configuration. It works with the main server, virtual servers, or in a directory context. To specify the audit log file and start audit logging, add the following to your configuration: SecAuditEngine On SecAuditLog /var/www/logs/audit_log After the installation and configuration, you will be able to log the contents of those POST payloads for the first time. Below is an example of an individual audit log entry, where mod_security denied the request because a pattern "333" was detected in the request body. ("333" is not a real attack but something I often use for testing to make sure my configuration works.) = = = = = = = = = = = = = = = = = = = = = = = UNIQUE_ID: QOFMpdmgtpkAAFM1ALQ Request: 127.0.0.1 - - [29/Jun/2004:12:04:05 +0100] "POST /cgi-bin/ modsec-test.pl HTTP/1.0" 500 539 Handler: cgi-script ---------------------------------------- POST /cgi-bin/modsec-test.pl HTTP/1.0 Connection: Close Content-Length: 5 Content-Type: application/x-www-form-urlencoded Host: 127.0.0.1:8080 User-Agent: mod_security regression test utility mod_security-message: Access denied with code 500. Pattern match "333" at POST_PAYLOAD. mod_security-action: 500 5 p=333 HTTP/1.0 500 Internal Server Error Connection: close Content-Type: text/html; charset=iso-8859-1 The entry begins with a few request identifiers followed by the request headers and the request body, followed by the response headers. The module will automatically detect and make use of the unique ID generated by mod_unique_id. This variable can help track a request over several log files. Currently, the module does not support response body logging, though the filter architecture of Apache 2 allows for it.
The audit engine of mod_security supports several logging levels (configured with the SecAuditEngine directive):
8.1.5. Performance MeasurementAn experimental feature in the Apache 2 version of mod_security adds performance measurement support. Measuring script performance can be difficult because the response is typically generated and transmitted back to the client concurrently. The only measure normally available is the total time it took to process a request. But that number does not mean much. For example, for a client accessing the server over a slow link (e.g., a modem connection), the processing time will be long but that does not indicate a fault. You can measure performance of individual processes but only if you separate them first. This can be done if the response is not sent to the client as it is being generated. Instead, the response is kept in a memory buffer until generation is complete: This is called buffering. mod_security already introduces buffering into the request processing but for different reasons (security). With buffering in place, performance measurement becomes trivial. mod_security records elapsed time at three points for each request:
These measurements are useful when used in a custom log together with information provided by the mod_logio module, because to make sense of the numbers you need to know the number of bytes sent to, (format string %I) and from, (format string %O) the server: CustomLog logs/timer_log "%t \"%r\" %>s - %I %O -\ %{mod_security-time1}n %{mod_security-time2}n \ %{mod_security-time3}n %D Each entry in the log will look something like this: [19/Nov/2004:22:30:08 +0000] "POST /upload.php HTTP/1.1" 200 - 21155 84123 - 673761 687806 5995926 7142031 All times are given in microseconds, relative to the beginning of request processing. The following conclusions can be made out of the line given in the previous example (with the figures rounded to the nearest millisecond so they are easier to read):
8.1.6. File Upload InterceptionA special case of audit logging occurs when files are uploaded to the server. Since mod_security supports the multipart/form-data encoding, you can choose to keep the uploaded files: SecUploadKeepFiles On SecUploadDir /var/www/logs/files The SecUploadKeepFiles directive can have one of three possible values:
8.1.7. Application LogsInclude the application logs on the list of logs you monitor. At the very least, you should integrate the logs of the application engine with the rest of the logs. For example, configuring PHP to send errors to the Apache error log (described in Chapter 3) removes one thing from the TODO list. For each application, you should do the following:
Consider the following five recommendations to increase the security of your application logs:
8.1.8. Logging as Much as PossibleThe default logging format is adequate to generate traffic statistics but inadequate for forensic analysis. We need to use the custom logging facility and design a log format that provides us with the information we need. By starting with the combined log format and adding more fields, we increase the information logged while retaining backward-compatibility with traffic analysis software. We add six fields to the log format:
The new log format will be shown soon after discussing how the information needed for the additional fields may be obtained. For example, integration with applications is required to achieve adequate logging levels. This comes in two forms: usage of HTTP status codes and integration with PHP. 8.1.8.1 Using HTTP status codesFirst, the application must make use of HTTP status codes other than 200 (which is used by default) where appropriate. These codes are very useful but not many applications utilize them. There are five code categories (see Table 8-5).
The 4XX category is particularly interesting and is the one we use the most (see Table 8-6).
With the status codes in mind, Table 8-7 presents the codes an application should return, given various events.
At first, I thought using the 401 status would be impossible since it would make the browser ask users to enter their credentials. Having done some tests, I determined that returning the status code alone (without the WWW-Authenticate header) is insufficient to trigger the authentication process. The 401 status can be used after all, and it appears in the access log. 8.1.8.2 Integration with PHPWhen installed as a module, PHP integrates with Apache and allows direct communication between modules to take place. Other application engines may provide similar support. We will take advantage of the POST request body being available to the PHP code. We can, therefore, take the body and return it to Apache, along with other parameters known to the application (the username and the session identifier). This is possible because Apache has a feature called notes, which was specifically designed for inter-module communication. The following code fragment sends some of the information from the PHP module to Apache, where the information is available for other modules to use. It creates four Apache notes: x_username, x_sessionid, x_request, and x_log. function inform_apache($username, $sessionid) { // reconstruct the first line of the request $request = $_SERVER["REQUEST_METHOD"]; $request .= " " . $_SERVER["REQUEST_URI"]; // add any available POST parameters if (count($_POST) != 0) { // some POST requests contain parameters in the URI if (strpos($request, "?") = = false) $request .= "?"; else $request .= "&"; $count = 0; foreach($_POST as $name => $value) { if ($count != 0) $request .= "&"; $request .= urlencode($name) . "=" . urlencode($value); $count++; } } $request .= $_SERVER["SERVER_PROTOCOL"]; // send the parameters to Apache through notes apache_note("x_username", $username); apache_note("x_sessionid", $sessionid); apache_note("x_request", $request); // set an environment variable to trigger // conditional logging apache_setenv("x_log", "true"); } Sending a message from the application to the logging module can be useful. This can be done through a warning note: function warn_apache($warning) { apache_note("x_warning", $warning); } 8.1.8.3 Recommended log formatFinally, we arrive at our new log format: LogFormat "%h %l %{x_username}n %t \"%{x_request}n\" %>s %b \"%{Referer}i\" \ \"%{User-Agent}i\" %{UNIQUE_ID}n %T %D %{x_sessionid}n %{x_warning}n \ %{error-notes}n" apptrack Note the following:
We use the new log format together with a conditional logging directive to avoid having bogus lines in the log file: # log only requests that have the extra PHP-supplied information CustomLog /var/www/logs/special_log apptrack env=x_log 8.1.8.4 Alternative integration methodIf you cannot take advantage of the Apache notes mechanism and the PHP integration (you may not be running PHP as a module, for example), the alternative is to use mod_security to recover the POST request body (it will create the x_request note when configured to do so) and to use response headers to transport the information out of the application. In the application code, send out the session identifier and the username, using headers x_sessionid and x_username. These headers can be logged with %{x_sessionid}o and %{x_username}o, respectively. header("x_sessionid: $sessionid"); header("x_username: $username"); You will not be able to send a warning from the application using response headers though. Outgoing headers will be visible to the client, too, and using them for a warning may result in revealing critical information to an attacker. |