12.2. Using mod_security

mod_security is a web application firewall module I developed for the Apache web server. It is available under the open source GPL license, with commercial support and commercial licensing as an option. I originally designed it as a means to obtain a proper audit log, but it grew to include other security features. There are two versions of the module, one for each major Apache branch, and they are almost identical in functionality. In the Apache 2 version, mod_security uses the advanced filtering API available in that version, making interception of the response body possible. The Apache 2 version is also more efficient in terms of memory consumption. In short, mod_security does the following:
In this section, I present a deployment guide for mod_security, but the principles behind it apply to any web application firewall. For a detailed reference manual, visit the project documentation area at http://www.modsecurity.org/documentation/.

12.2.1. Introduction

The basic ingredients of every mod_security configuration are:
The purpose of this section is to explain how these ingredients interact with each other, in enough detail to enable you to configure and use mod_security. The subsequent sections cover some advanced topics to give you the additional insight needed in specific cases.

12.2.1.1 Installation and basic configuration

To install mod_security, you need to compile it using the apxs tool, as you would any other module. Some contributors provide system-specific binaries for download, and I put links to their web sites at http://www.modsecurity.org/download/. If you have installed Apache from source, apxs will be with other Apache binaries in the /usr/local/apache/bin/ folder. If you cannot find the apxs tool on your system, examine the vendor-provided documentation to learn how to add it. For example, on Red Hat systems apxs is a part of the httpd-devel package.

Change to the correct source code directory (there's one directory for each Apache branch) and execute the following commands:

    # /usr/local/apache/bin/apxs -cia mod_security.c
    # /usr/local/apache/bin/apachectl stop
    # /usr/local/apache/bin/apachectl start

After Apache has been restarted, mod_security will be active but disabled. I recommend the following configuration to enable it with minimal chances of denying legitimate requests. You can enable mod_security with fewer configuration directives.
Most options have default settings that are the same as in the following configuration, but I prefer to configure things explicitly rather than wonder if I understand what the default settings are:

    # Enable mod_security
    SecFilterEngine On

    # Retrieve request payload
    SecFilterScanPOST On

    # Reasonable automatic validation defaults
    SecFilterCheckURLEncoding On
    SecFilterCheckCookieFormat Off
    SecFilterNormalizeCookies Off
    SecFilterCheckUnicodeEncoding Off

    # Accept almost all byte values
    SecFilterForceByteRange 1 255

    # Reject invalid requests with status 403
    SecFilterDefaultAction deny,log,status:403

    # Only record the relevant information
    SecAuditEngine RelevantOnly
    SecAuditLog /var/www/logs/audit_log

    # Where to store temporary and intercepted files
    SecUploadDir /var/www/logs/files/
    # Do not store intercepted files for the time being
    SecUploadKeepFiles Off

    # Use 0 for the debug level in production
    # and 4 for testing
    SecFilterDebugLog /var/www/logs/modsec_debug_log
    SecFilterDebugLevel 4

Starting from the top, this configuration data enables mod_security and tells it to intercept request bodies, configures settings for various encoding validation and anti-evasion features (explained below), configures the default action list to handle invalid requests, and configures the two log types.

After adding the configuration data to your httpd.conf file, make a couple of requests to the web server and examine the audit_log and modsec_debug_log files. Without any rules configured, there won't be much output in the debug log but at least you will be certain the module is active.

12.2.1.2 Processing order

You must understand what mod_security does, and in what order, for every request. Generally, processing consists of four phases:
12.2.1.3 Anti-evasion features

As mentioned in Chapter 10, evasion techniques can be used to sneak in a malicious payload undetected by web intrusion detection software. To counter that, mod_security performs the following anti-evasion techniques automatically:
12.2.1.4 Encoding validation features

In some ways, encoding validation can be treated as anti-evasion. As mentioned previously, web servers and applications are often very flexible and allow invalid requests to be processed anyway. Using one of the following encoding validation options, it is possible to restrict what is accepted:
12.2.1.5 Rules

The best part of mod_security is its flexible rule engine. In the simplest form, a rule requires only a single keyword. The SecFilter directive performs a broad search against the request parameters, as well as against the request body for POST requests:

    SecFilter KEYWORD

If the keyword is detected, the rule will be triggered and will cause the default action list to be executed.

The keyword is actually a regular expression pattern. Using a simple string, such as 500, will find its occurrence anywhere in the search content. To make full use of mod_security, learn about regular expressions. If you are unfamiliar with them, I suggest the link http://www.pcre.org/pcre.txt as a good starting point. If you prefer a book, check out Mastering Regular Expressions by Jeffrey E. F. Friedl (O'Reilly), which is practically a regular expression reference guide. Here are a couple of points I consider important:
I will demonstrate what can be done with regular expressions with a regular expression pattern you will find useful in the real world: ^[0-9]{1,9}$. This pattern matches only numbers and only ones that have at least one but up to nine digits.
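As a quick check of what this pattern accepts, here is a minimal sketch in Python. It uses Python's re module rather than the regular expression library mod_security is built with, so it shows only how the pattern itself behaves; the function name is chosen for this example.

```python
import re

# The pattern from the text: at least one and at most nine decimal digits.
NUMERIC_ID = re.compile(r"^[0-9]{1,9}$")


def is_numeric_id(value: str) -> bool:
    """Return True when the value matches the pattern."""
    return NUMERIC_ID.search(value) is not None


if __name__ == "__main__":
    for sample in ["7", "123456789", "1234567890", "", "12a", "-5"]:
        print(repr(sample), is_numeric_id(sample))
    # Python-specific detail: "$" also matches just before a single
    # trailing newline, so "123\n" is accepted by this sketch.
    print(repr("123\n"), is_numeric_id("123\n"))
```

The ten-digit value is rejected because of the upper bound in {1,9}, and the empty string is rejected because of the lower bound.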
Although broad rules are easy to write, they usually do not work well in real life. Their use significantly increases the chances of introducing false positives and reducing system availability to its legitimate users (not to mention the annoyance they cause). A much better approach to rule design is to consider the impact and only apply rules to certain parts of HTTP requests. This is what SecFilterSelective is for. For example, the following rule will look for the keyword only in the query string:

    SecFilterSelective QUERY_STRING KEYWORD

The QUERY_STRING variable is one of the supported variables. The complete list is given in Table 12-1 (standard variables available for use with mod_rewrite or CGI scripts) and Table 12-2 (extended variables specific to mod_security). In most cases, the variable names are the same as those used by mod_rewrite and the CGI specification.
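The difference between a broad and a selective rule can be modeled in a few lines of Python. This is only a sketch of the idea, not mod_security's implementation: the Request class and both function names are made up for the example, and a request is reduced to a query string and a POST body.

```python
import re
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical, heavily simplified view of an HTTP request."""
    query_string: str = ""
    post_body: str = ""


def broad_match(pattern: str, request: Request) -> bool:
    """Model of a broad rule: search the parameters and the POST body."""
    parts = (request.query_string, request.post_body)
    return any(re.search(pattern, part) for part in parts)


def selective_match(pattern: str, value: str) -> bool:
    """Model of a selective rule: search one named part only."""
    return re.search(pattern, value) is not None


# A legitimate order form that happens to contain the string 500.
order = Request(query_string="page=2", post_body="amount=500&currency=EUR")

print(broad_match("500", order))                   # True
print(selective_match("500", order.query_string))  # False
```

The broad search flags the legitimate request because the keyword appears in the body, while the search restricted to the query string does not.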
When using selective rules, you are not limited to examining one field at a time. You can separate multiple variable names with a pipe. The following rule demonstrates how to access named parts of the request, in this example, a parameter and a cookie:

    # Look for the keyword in the parameter "authorized"
    # and in the cookie "authorized". A match in either of
    # them will trigger the rule.
    SecFilterSelective ARG_authorized|COOKIE_authorized KEYWORD

If a variable is absent in the current request, the variable will be treated as empty. For example, to detect the presence of a variable, use the following format, which triggers execution of the default action list if the variable is not empty:

    SecFilterSelective ARG_authorized !^$

A special syntax allows you to create exceptions. The following applies the rule to all parameters except the parameter html:

    SecFilterSelective ARGS|!ARG_html KEYWORD

Finally, single rules can be combined to create more complex expressions. In my favorite example, I once had to deploy an application that had to be publicly available because our users were located anywhere on the Internet. The application had a powerful, potentially devastating administration account, and the login page for users and for the administrator was the same. It was impossible to use other access control methods to restrict administrative logins to an IP address range. Modifying the source code was not an option because we had no access to it. I came up with the following two rules:

    SecFilterSelective ARG_username ^admin$ chain
    SecFilterSelective REMOTE_ADDR !^192\.168\.254\.125$

The first rule triggers whenever someone tries to log in as an administrator (it looks for a parameter username with value admin). Without the optional action chain being specified, the default action list would be executed. Since chain is specified, processing continues with execution of the second rule.
The second rule allows the request to proceed if it is coming from a single predefined IP address (192.168.254.125). The second rule never executes unless the first rule is satisfied.

12.2.1.6 Actions

You can do many things when an invalid request is discovered. The SecFilterDefaultAction directive determines the default action list:

    # Reject invalid requests with status 403
    SecFilterDefaultAction deny,log,status:403

You can override the default action list by supplying a list of actions to individual rules as the last (optional) parameter:

    # Only log a warning message when the KEYWORD is found
    SecFilter KEYWORD log,pass

The full list of supported actions is given in Table 12-3.
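To make the chaining logic from the administrator login example concrete, the following Python sketch models how the two chained rules combine. It is an illustration under simplifying assumptions, not mod_security code: the function names are invented, the request is reduced to a parameter dictionary and a client address, and only the leading "!" negation and the chain behavior described in the text are modeled.

```python
import re


def rule_matches(value: str, pattern: str) -> bool:
    """Apply one pattern; a leading "!" inverts the result."""
    negate = pattern.startswith("!")
    regex = pattern[1:] if negate else pattern
    found = re.search(regex, value) is not None
    return found != negate


def admin_login_blocked(params: dict, remote_addr: str) -> bool:
    """Return True when both chained rules match (default actions run)."""
    # First rule: ARG_username ^admin$
    # An absent parameter is treated as empty, as described in the text.
    if not rule_matches(params.get("username", ""), r"^admin$"):
        return False  # the chain stops; the second rule is never evaluated
    # Second rule: REMOTE_ADDR !^192\.168\.254\.125$
    return rule_matches(remote_addr, r"!^192\.168\.254\.125$")


print(admin_login_blocked({"username": "admin"}, "10.0.0.7"))         # True
print(admin_login_blocked({"username": "admin"}, "192.168.254.125"))  # False
print(admin_login_blocked({"username": "alice"}, "10.0.0.7"))         # False
```

Only an administrator login from an address other than the trusted one satisfies both rules, which is the one case in which the action list is executed.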
12.2.1.7 Logging

There are three places where, depending on the configuration, you may find mod_security logging information:
Here is an example of an error message resulting from invalid content discovered in a cookie:

    [Tue Oct 26 17:44:36 2004] [error] [client 127.0.0.1] mod_security: Access denied with code 500. Pattern match "!(^$|^[a-zA-Z0-9]+$)" at COOKIES_VALUES(sessionid) [hostname "127.0.0.1"] [uri "/cgi-bin/modsec-test.pl"] [unique_id bKjdINmgtpkAADHNDC8AAAAB]

The message indicates that the request was rejected ("Access denied") with an HTTP 500 response because the content of the cookie sessionid matched the pattern !(^$|^[a-zA-Z0-9]+$). (The pattern allows a cookie to be empty, but if it is not, it must consist only of one or more letters and digits.)

12.2.2. More Configuration Advice

In addition to the basic information presented in the previous sections, some additional (important) aspects of mod_security operation are presented here.

12.2.2.1 Activation time

For each request, mod_security activities take place after Apache performs initial work on it but before the actual request processing starts. During the first part of the work, Apache sometimes decides the request can be fulfilled or rejected without going to the subsequent processing phases. In such cases, mod_security is never executed. These occurrences are not cause for concern, but you need to know about them before you start wondering why something you configured does not work. Here are some situations when Apache finishes early:
12.2.2.2 Performance impact

The performance of the rule database is directly related to how many rules are in the configuration. For all normal usage patterns, the number of rules is small, and thus, there is practically no impact on the request processing speed. The only serious impact comes from increased memory consumption in the case of file uploads and Apache 1, which is covered in the next section.

In some circumstances, requests that perform file upload will be slower. If you enable the feature to intercept uploaded files, there will be an additional overhead of writing the file to disk. The exact slowdown depends on the speed of the filesystem, but it should be small.

12.2.2.3 Memory consumption

The use of mod_security results in increased memory consumption by the Apache web server. The increase can be very small, but it can be very big in some rare circumstances. Understanding why it happens will help you avoid problems in those rare circumstances.

When mod_security is not active, Apache only sees the first part of the request: the request line (the first line of the request) and the subsequent headers. This is enough for Apache to do its work. When request processing begins, the module that does the processing feeds the request body to where it needs to be consumed. In the case of PHP, for example, the request body goes directly to PHP. Apache almost never sees it. With mod_security enabled, it becomes a requirement to have access to the complete request body before processing begins. That is the only approach that can protect the application. (Early versions of mod_security did look at the body bit by bit but that proved to be insufficient.) That is why mod_security reads the complete request into its own buffer and later feeds it from there to the processing module. Additional memory space is needed so that the anti-evasion processing can take place. A buffer twice the size of the request body is required by mod_security to complete processing.
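A back-of-the-envelope estimate shows why the doubling matters only for large bodies. The sketch below is plain arithmetic in Python; the function name, the sample sizes, and the idea of multiplying by the number of simultaneous requests are assumptions made for illustration and do not describe how mod_security accounts for memory internally.

```python
MB = 1024 * 1024


def buffer_bytes(body_bytes: int, concurrent_requests: int = 1) -> int:
    """Rough worst case: twice the body size for every in-flight request."""
    return 2 * body_bytes * concurrent_requests


# A typical small form submission (assumed 4 KB body)
print(buffer_bytes(4 * 1024) // 1024, "KB")   # 8 KB

# Five simultaneous uploads of an assumed 100 MB each
print(buffer_bytes(100 * MB, 5) // MB, "MB")  # 1000 MB
```

For ordinary form submissions the overhead is negligible; a handful of large simultaneous uploads is what pushes the estimate toward the memory available on a typical server.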
In most cases, this is not a problem since request bodies are small. The only case when it can be a problem is when file upload functionality is required. Files can be quite large (sizes of over 100 MB are not unheard of), and mod_security will want to put all of them into memory, twice. If you are running Apache 1, there is no way around this but to disable request body buffering (as described near the end of this chapter) for those parts of the application where file upload takes place. You can also (and probably should) limit the maximum size of the body by using the Apache configuration directive LimitRequestBody.

But there is good news for the users of Apache 2. Because of its powerful content filtering API, mod_security for Apache 2 is able to stream the request body to the disk if its size is larger than a predefined value (using the directive SecUploadInMemoryLimit, set to 64 KB by default), so increased memory consumption does not take place. However, mod_security will need to store the complete request to the disk and read it again when it sends it forward for processing.

A similar thing happens when you enable output monitoring (described later in this chapter). Again, the output cannot and will not be delivered to the client until all of it is available to mod_security and after the analysis takes place. This process introduces response buffering. At the moment, there is no way to limit the amount of memory spent doing output buffering, but it can be used in a controlled manner and only enabled for HTML or text files, while disabled for binary files, via output filtering, described later in this chapter.

12.2.2.4 Per-context configuration

It is possible to use mod_security in the main server, in virtual hosts, and in per-directory contexts. Practically all configuration directives support this. (The ones that do not, such as SecChrootDir, make no sense outside of the main server configuration.)
This allows a different policy to be implemented wherever necessary. Configuration and rule inheritance is also implemented. Rules added to the main server will be inherited by all virtual hosts, but there is an option to start from scratch (using the SecFilterInheritance directive). On the same note, you can use mod_security from within .htaccess files (if the AllowOverride option Options is specified), but be careful not to allow someone you do not trust to have access to this feature.

12.2.2.5 Tight Apache integration

Although mod_security supports the exec action, which allows a custom script to be executed upon detecting an invalid request, Apache offers two mechanisms that allow for tight integration and more flexibility.

One mechanism you should use is the ErrorDocument, which allows a script to be executed (among other things) whenever request processing returns with a particular response status code. This feature is frequently used to create a "Page not found" message. Depending on your security policy, the same feature can be used to explain that the security system you put in place believes something funny is going on and, therefore, decided to reject the request. At the same time, you can add code to the script to do something else, for example, to send a notification somewhere. An example script for Apache integration comes with the mod_security distribution.

The other thing you can do is add mod_unique_id (distributed with Apache and discussed in Chapter 8) to your configuration. After you do, this module will generate a unique ID (guaranteed to be unique within the server) for every request, storing it in the environment variable UNIQUE_ID (where it will be picked up by mod_security). This feature makes it much quicker to find the request you are looking for.
I frequently use it in the output of an ErrorDocument script, where the unique ID is presented to the user with the instructions to cite it as reference when she complains to the support group. This allows you to quickly and easily pinpoint and solve the problem.

12.2.2.6 Event monitoring

In principle, IDSs support various ways to notify you of the problems they discover. In the best-case scenario, you have some kind of monitoring system to plug the IDS into. If you do not, you will probably end up devising some way to send notifications to your email, which is a bad way to handle notifications. Everyone's natural reaction to endless email messages from an IDS is to start ignoring them or to filter them automatically into a separate mail folder.

A better approach (see Chapter 8) is to streamline IDS requests into the error log and to implement daily reporting at one location for everything that happens with the web server. That way, when you come to work in the morning, you only have one email message to examine. You may decide to keep email notifications for some dangerous attacks, e.g., SQL injections.

12.2.3. Deployment Guidelines

Deploying a web firewall for a known system requires planning and careful execution. It consists of the following steps:
Probably the best advice I can give is for you to learn about the system you want to protect. I am asked all the time to provide an example of a tight mod_security configuration, but I hesitate and almost never do. Intrusion detection (like many other security techniques) is not a simple, fire-and-forget solution in spite of what some commercial vendors say. Incorrect rules, when deployed, will result in false positives that waste analysts' time. When used in prevention mode, false positives result in reduced system availability, which translates to lost revenue (or increased operations expenses, depending on the way you look at it).

In step 2, you need to decide whether intrusion detection can bring a noticeable increase in security. This is not the same as what I previously discussed in this chapter, that is, whether intrusion detection is a valid tool at all. Here, the effort of introducing intrusion detection needs to be weighed against other ways to solve the problem. First, understand the time commitment intrusion detection requires. If you cannot afford to follow up on all alerts produced by the system and to work continuously to tweak and improve the configuration, then you might as well give up now. The other thing to consider is the nature and the size of the system you want to protect. For smaller applications for which you have the source code, invest in a code review and fix the problems in the source code.

Establishing a protection policy is arguably the most difficult part of the work. You start with the list of weaknesses you want to protect and, having in mind the capabilities of the protection software, work out a feasible protection plan. If it turns out the tool is not capable enough, you may look for a better tool. Work on the policy is similar to the process of threat modeling discussed in Chapter 1.

Installation and configuration is the easy part and already covered in detail here.
You need to work within the constraints of your selected tool to implement the previously designed policy. The key to performing this step is to work on a development server first and to test the configuration thoroughly to ensure the protection rules behave as you would expect them to. In the mod_security distribution is a tool (run_test.pl) that can be used for automated tests. As a low-level tool, run_test.pl takes a previously created HTTP request from a text file, sends it to the server, and examines the status code of the response to determine the operation's success. Run regression tests periodically to test your IDS.

Deploying in detection mode only is what you do to test the configuration in real life in an effort to avoid causing disturbance to normal system operation. For several weeks, the IDS should only send notifications without interrupting the requests. The configuration should then be fine-tuned to reduce the false positives rate, hopefully to zero.

Once you are confident the protection is well designed (do not hurry), the system operation mode can be changed to prevention mode. I prefer to use the prevention mode only for problems I know I have. In all other cases, run in the detection mode at least for some time and see if you really have the problems you think you may have.
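The regression-testing idea behind run_test.pl (send a stored request, compare the response status code with the expected one) can be sketched in Python as follows. This is not a port of run_test.pl and makes no claim about how that tool is implemented; the function names are invented for the example, and the stored request is assumed to include a "Connection: close" header so the server closes the connection after responding.

```python
import socket


def parse_status_code(response: bytes) -> int:
    """Extract the numeric status code from a raw HTTP response."""
    status_line = response.split(b"\r\n", 1)[0].decode("latin-1")
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError(f"not an HTTP status line: {status_line!r}")
    return int(parts[1])


def send_raw_request(host: str, port: int, raw_request: bytes,
                     timeout: float = 5.0) -> bytes:
    """Send the request bytes unchanged and read until the server closes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(raw_request)
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def request_has_status(host: str, port: int, request_file: str,
                       expected_status: int) -> bool:
    """Replay one stored request and check the response status code."""
    with open(request_file, "rb") as handle:
        raw_request = handle.read()
    response = send_raw_request(host, port, raw_request)
    return parse_status_code(response) == expected_status
```

A test suite built this way would pair each stored attack request with the status the policy should produce (for example, 403 for a request a rule is meant to reject) and each legitimate request with 200, so that both missed detections and false positives show up as failures.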
12.2.3.1 Reasonable configuration starting point

There is a set of rules I normally use as a starting point in addition to the basic configuration given earlier. These rules are not meant to protect from direct attacks but rather to enforce strict HTTP protocol usage and make it more difficult for attackers to perform manual attacks. As I warned, these rules may not be suitable for all situations. If you are running a public web site, there will be all sorts of visitors, including search engines, which may be somewhat eccentric in the way they send otherwise normal HTTP requests. Tight configurations usually work better in closed environments.

    # Accept only valid protocol versions, helps
    # fight HTTP fingerprinting.
    SecFilterSelective SERVER_PROTOCOL !^HTTP/(0\.9|1\.0|1\.1)$

    # Allow supported request methods only.
    SecFilterSelective REQUEST_METHOD !^(GET|HEAD|POST)$

    # Require the Host header field to be present.
    SecFilterSelective HTTP_Host ^$

    # Require explicit and known content encodings for methods
    # other than GET or HEAD. The multipart/form-data encoding
    # should not be allowed at all if the application does not
    # make use of file upload. There are many automated attacks
    # out there that are using wrong encoding names.
    SecFilterSelective REQUEST_METHOD !^(GET|HEAD)$ chain
    SecFilterSelective HTTP_Content-Type \
    !(^application/x-www-form-urlencoded$|^multipart/form-data;)

    # Require Content-Length to be provided with
    # every POST request. Length is a requirement for
    # request body filtering to work.
    SecFilterSelective REQUEST_METHOD ^POST$ chain
    SecFilterSelective HTTP_Content-Length ^$

    # Don't accept transfer encodings we know we don't handle
    # (you probably don't need them anyway).
    SecFilterSelective HTTP_Transfer-Encoding !^$

You may also choose to add some of the following rules to warn you of requests that do not seem to be from common browsers.
Rules such as these are suited for applications where the only interaction is expected to come from users using browsers. On a public web site, where many different types of user agents are active, they result in too many warnings.

    # Most requests performed manually (e.g., using telnet or nc)
    # will lack one of the following headers.
    # (Accept-Encoding and Accept-Language are also good
    # candidates for monitoring since popular browsers
    # always use them.)
    SecFilterSelective HTTP_User-Agent|HTTP_Connection|HTTP_Accept ^$ log,pass

    # Catch common nonbrowser user agents.
    SecFilterSelective HTTP_User-Agent \
    (libwhisker|paros|wget|libwww|perl|curl) log,pass

Ironically, your own monitoring tools are likely to generate error log warnings. If you have a dedicated IP address from which you perform monitoring, you can add a rule to skip the warning checks for all requests coming from it. Put the following rule just above the rules that produce warnings:

    # Allow requests coming from 192.168.254.125
    SecFilterSelective REMOTE_ADDR ^192\.168\.254\.125$ allow

Though you could place this rule at the top of the rule set, that is a bad idea; as one of the basic security principles says, only establish minimal trust.

12.2.4. Detecting Common Attacks

Web IDSs are good at enforcing strict protocol usage and defending against known application problems. Attempts to exploit common web application problems often have a recognizable footprint. Pattern matching can be used to detect some attacks but it is generally impossible to catch all of them without having too many false positives. Because of this, my advice is to use detection only when dealing with common web application attacks. There is another reason to adopt this approach: since it is not possible to have a foolproof defense against a determined attacker, having a tight protection scheme will force such an attacker to adopt and use evasion methods you have not prepared for. If that happens, the attacker will become invisible to you.
Let some attacks through so you are aware of what is happening.

The biggest obstacle to reliable detection is the ability for users to enter free-form text, and this is common in web applications. Consequently, content management systems are the most difficult ones to defend. (Users may even be discussing web application security in a forum!) When users are allowed to enter arbitrary text, they will sooner or later attempt to enter something that looks like an attack.

In this section, I will discuss potentially useful regular expression patterns without going into details as to how they are to be added to the mod_security configuration since the method of adding patterns to rules has been described. (If you are not familiar with common web application attacks, reread Chapter 10.) In addition to the patterns provided here, you can seek inspiration in rules others have created for nonweb IDSs. (For example, rules for Snort, a popular NIDS, can be found at http://www.snort.org and http://www.bleedingsnort.com.)

12.2.4.1 Database attacks

Database attacks are executed by sneaking an SQL query or a part of it into request parameters. Attack detection must, therefore, attempt to detect commonly used SQL keywords and metacharacters. Table 12-4 shows a set of patterns that can be used to detect database attacks.
So far, I have presented generic SQL patterns. Most databases have proprietary extensions of one kind or another, which require keywords that are often easier to detect. These patterns differ from one database to another, so creating a good set of detection rules requires expertise in the deployed database. Table 12-5 shows some interesting patterns for MSSQL and MySQL.
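As an illustration of keyword and metacharacter detection, here is a small Python sketch that reports which warning patterns a parameter value triggers. The three patterns are hypothetical stand-ins written for this example; they are not the contents of Table 12-4 or Table 12-5, and the function name is likewise invented.

```python
import re

# Hypothetical warning patterns, for illustration only.
SQL_WARNING_PATTERNS = {
    "union select": re.compile(r"union\s+select", re.IGNORECASE),
    "drop table": re.compile(r"drop\s+table", re.IGNORECASE),
    "quote or comment": re.compile(r"'|--"),
}


def sql_warnings(value: str) -> list:
    """Return the names of all patterns found in the value."""
    return [name for name, regex in SQL_WARNING_PATTERNS.items()
            if regex.search(value)]


print(sql_warnings("1 UNION  SELECT password FROM users"))  # ['union select']
print(sql_warnings("O'Brien"))                              # ['quote or comment']
print(sql_warnings("42"))                                   # []
```

The second call shows the false-positive problem in miniature: a perfectly ordinary surname trips the metacharacter pattern, which is why patterns like these are better suited to warnings than to automatic rejection.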
12.2.4.2 Cross-site scripting attacks

Cross-site scripting (XSS) attacks can be difficult to detect when launched by those who know how to evade detection systems. If the entry point is in the HTML, the attacker must find a way to change from HTML and into something more dangerous. Danger comes from JavaScript, ActiveX components, Flash programs, or other embedded objects. The following list of problematic HTML tags is by no means exhaustive, but it will prove the point:
Your best bet is to try to detect any HTML in the parameters and also the special JavaScript entity syntax that only works in Netscape. If a broad pattern such as <.+> is too broad for you, you may want to list all possible tag names and detect them. But if the attacker can sneak in a tag, then detection becomes increasingly difficult because of many evasion techniques that can be used. From the following two evasion examples, you can see it is easy to obfuscate a string to make detection practically impossible:
If the attacker can inject content directly into JavaScript, the list of evasion options is even longer. For example, he can use the eval() function to execute an arbitrary string or the document.write() function to output HTML into the document:
Now you understand why you should not stop attackers too early. Knowing you are being attacked, even successfully attacked, is sometimes better than not knowing at all.

A useful list of warning patterns for XSS attacks is given in Table 12-6. (I call them warning patterns because you probably do not want to automatically reject requests with such patterns.) They are not foolproof but cast a wide net to catch potential abuse. You may have to refine them over time to reduce false positives for your particular application.
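The warn-but-do-not-reject approach can be sketched with the broad <.+> pattern mentioned earlier in this section. The Python below is a simplified model, not mod_security code: the function name is invented, parameters are passed in as a dictionary, and no decoding or normalization is performed before matching.

```python
import re

# The broad "any HTML tag" pattern discussed in the text.
HTML_TAG = re.compile(r"<.+>")


def xss_warnings(params: dict) -> list:
    """Return the names of parameters whose values look like HTML."""
    return [name for name, value in params.items()
            if HTML_TAG.search(value)]


# An obvious attack is reported.
print(xss_warnings({"comment": "<script>alert(1)</script>", "name": "Ivan"}))
# ['comment']

# Harmless text with comparison signs is reported too (a false positive).
print(xss_warnings({"note": "a < b and c > d"}))
# ['note']

# A still URL-encoded payload is missed because nothing decodes it here.
print(xss_warnings({"q": "%3Cscript%3E"}))
# []
```

The second and third calls show both failure modes at once: the pattern is broad enough to flag innocent input, yet it sees nothing unless the input has been decoded first, which is why the results are best treated as warnings for a human to review.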
12.2.4.3 Command execution and file disclosure

Detecting command execution and file disclosure attacks in the input data can be difficult. The commands are often very short and can appear as normal words in many request parameters. The recommended course of action is to implement a set of patterns to detect but not reject requests. Table 12-7 shows patterns that can be of use. (I have combined many patterns into one to save space.) The patterns in the table are too broad and should never be used to reject requests automatically.
Command execution and file disclosure attacks are often easier to detect in the output. On my system, the first line of /etc/passwd contains "root:x:0:0:root:/root:/bin/bash", and this is the file any attacker is likely to examine. A pattern such as root:x:0:0:root is likely to work here. Similarly, the output of the id command looks like this:

    uid=506(ivanr) gid=506(ivanr) groups=506(ivanr)

A pattern such as uid=[[:digit:]]+\([[:alnum:]]+\) gid=[[:digit:]]+\([[:alnum:]]+\) will catch its use by looking at the output.

12.2.5. Advanced Topics

I conclude this chapter with a few advanced topics. These topics are regularly the subject of email messages I get about mod_security on the users' mailing list.

12.2.5.1 Complex configuration scenarios

The mod_security configuration data can be placed into any Apache context. This means you can configure it in the main server, virtual hosts, directories, locations, and file matches. It can even work in the .htaccess files context. Whenever a subcontext is created, it automatically inherits the configuration and all the rules from the parent context. Suppose you have the following:

    SecFilterSelective ARG_p KEYWORD
    <Location /moresecure/>
        SecFilterSelective ARG_q KEYWORD
    </Location>

Requests for the parent configuration will have only parameter p tested, while the requests that fall in the /moresecure/ location will have p and q tested (in that order). This makes it easy to add more protection. If you need less protection, you can choose not to inherit any of the rules from the parent context. You do this with the SecFilterInheritance directive. For example, suppose you have:

    SecFilterSelective ARG_p KEYWORD
    <Location /moresecure/>
        SecFilterInheritance Off
        SecFilterSelective ARG_q KEYWORD
    </Location>

Requests for the parent configuration will have only parameter p tested, while the requests that fall in the /moresecure/ location will have only parameter q tested. The SecFilterInheritance directive affects only rule inheritance.
The rest of the configuration is still inherited, but you can use the configuration directives to change configuration at will.

12.2.5.2 Byte-range restriction

Byte-range restriction is a special type of protection that restricts the range of byte values allowed in request parameters. Such protection can be effective against buffer overflow attacks against vulnerable binaries. The built-in protection, if used, will validate that every variable used in a rule conforms to the range specified with the SecFilterForceByteRange directive. Applications built for an English-speaking audience will probably use a part of the ASCII set. Restricting all bytes to have values from 32 to 126 will not prevent normal functionality:

    SecFilterForceByteRange 32 126

However, many applications do need to allow 0x0a and 0x0d bytes (line feed and carriage return, respectively) because these characters are used in free-form fields (ones with a <textarea> tag). Though you can relax the range slightly to allow byte values from 10 on up, I am often asked whether it is possible to have more than one range. The SecFilterForceByteRange directive does not yet support that, but you could perform such a check with a rule that sits at the beginning of the rule set:

    SecFilterSelective ARGS !^[\x0a\x0d\x20-\x7e]*$

The previous rule allows characters 0x0a, 0x0d, and a range from 0x20 (32) to 0x7e (126).

12.2.5.3 File upload interception and validation

Since mod_security understands the multipart/form-data encoding used for file uploads, it can extract the uploaded files from the request and store them for future reference. In a way, this is a form of audit logging (see Chapter 8). mod_security offers another exciting feature: validation of uploaded files in real time.
All you need to do is write a script that takes the full path to the file as its first and only parameter, and then enable file validation in mod_security:

    SecUploadApproveScript /usr/local/apache/bin/upload_verify.pl

The script will be invoked for every file upload attempt. If the script prints 1 as the first character of the first line of its output, the file will be accepted. If it prints anything else, the whole request will be rejected. It is useful to put the error message (if any) on the same line, after the first character, because that line will be written to the mod_security log. File upload validation can be used for several purposes:
If you have the excellent open source antivirus program Clam AntiVirus (http://www.clamav.net) installed, then you can use the following utility script as an interface:

    #!/usr/bin/perl
    # Interface between mod_security and clamscan. The first character
    # printed decides the outcome: 1 accepts the file, 0 rejects it.

    $CLAMSCAN = "/usr/bin/clamscan";

    if (@ARGV != 1) {
        print "Usage: modsec-clamscan.pl <filename>\n";
        exit;
    }

    my ($FILE) = @ARGV;

    # Run clamscan directly (no shell), so that unusual characters
    # in the filename cannot be interpreted as shell syntax
    $input = "";
    if (open(my $pipe, "-|", $CLAMSCAN, "--stdout", "--disable-summary", $FILE)) {
        $input = join("", <$pipe>);
        close($pipe);
    }

    # Only the first line of the clamscan output is examined
    $error_message = ($input =~ m/^(.+)/) ? $1 : "";

    $output = "0 Unable to parse clamscan output";

    if ($error_message =~ m/: Empty file\.$/) {
        $output = "1 empty file";
    }
    elsif ($error_message =~ m/: (.+) ERROR$/) {
        $output = "0 clamscan: $1";
    }
    elsif ($error_message =~ m/: (.+) FOUND$/) {
        $output = "0 clamscan: $1";
    }
    elsif ($error_message =~ m/: OK$/) {
        $output = "1 clamscan: OK";
    }

    print "$output\n";

12.2.5.4 Restricting mod_security to process dynamic requests only

When mod_security operates from within Apache (as opposed to working as a network gateway), it can obtain more information about requests. One useful bit of information is the choice of a module to handle the request (called a handler). In the early phases of request processing, Apache will look for candidate modules to handle the request, usually by looking at the extension of the targeted file. If a handler is not found, the request is probably for a static file (e.g., an image). Otherwise, the handler will probably process the file in some way (for example, executing the script in the case of PHP) and dynamically create a response. Since mod_security mostly serves the purpose of protecting dynamic resources, this information can be used to perform optimization. If you configure the SecFilterEngine directive with the DynamicOnly parameter, then mod_security will act only on those requests that have a handler attached to them.

    # Only process dynamic requests
    SecFilterEngine DynamicOnly

Unfortunately, it is possible to configure Apache to serve dynamic content and have the handler undefined, by misusing its AddType directive.
Even the official PHP installation guide recommends the AddType approach. If that happens, mod_security will not be able to determine which requests are truly dynamic and will not be able to protect them. The correct approach is to use the AddHandler directive, as in this example for PHP:

    AddHandler application/x-httpd-php .php

Relying on the existence of a request handler to decide whether to protect a resource can be rewarding, but it is dangerous if handlers are not configured correctly, so verify that it really works in your case. You can do this by adding a rule that rejects every request (in which case it will be obvious whether mod_security works) or by looking at what mod_security writes to the debug log (where it will state if it believes the incoming request is for a static resource).
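One way to look for the AddType misuse described above is to search the Apache configuration for AddType lines that map file extensions to a module-specific type. The following Python sketch does this for configuration text supplied as a string. It is an illustration only: the function name find_addtype_misuse is hypothetical, and the sketch assumes that the types of interest all begin with application/x-httpd-, which may not cover every module.

```python
import re

# Assumption: module-specific "magic" types start with application/x-httpd-
_ADDTYPE_RE = re.compile(
    r"^\s*AddType\s+(application/x-httpd-\S+)\s+(.+?)\s*$",
    re.IGNORECASE,
)


def find_addtype_misuse(config_text):
    """Return (line_number, media_type, extensions) for suspicious AddType lines."""
    findings = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # skip commented-out directives
        match = _ADDTYPE_RE.match(line)
        if match:
            findings.append((number, match.group(1), match.group(2).split()))
    return findings


if __name__ == "__main__":
    sample = """\
# PHP configured with AddType instead of AddHandler
AddType application/x-httpd-php .php
AddHandler application/x-httpd-php .phtml
AddType image/png .png
"""
    for number, media_type, extensions in find_addtype_misuse(sample):
        print(f"line {number}: AddType {media_type} {' '.join(extensions)}")
```

A line reported by this check is only a hint. The debug log test described above remains the reliable way to confirm how mod_security classifies a request.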
12.2.5.5 Request body monitoring

There are two ways to control request body buffering and monitoring. You have seen one in the default configuration where the SecFilterScanPOST directive was used. This works if you know in advance where you want and do not want buffering to take place. Using the Apache context directives, you can turn off buffering for some parts of the site, as in the following example:

    # Turn off POST buffering for
    # scripts in this location
    <Location /nobuffering/>
        SecFilterScanPOST Off
    </Location>

Sometimes you need to disable buffering on a per-request basis, based on some request attribute. This is possible. If mod_security detects that the MODSEC_NOPOSTBUFFERING environment variable is defined, it will not read in the request body. The environment variable can be defined with the help of the mod_setenvif module and its SetEnvIf directive:

    # Disable request body buffering for all file uploads
    SetEnvIfNoCase Content-Type ^multipart/form-data \
        "MODSEC_NOPOSTBUFFERING=Do not buffer file uploads"

The text you assign to the variable will appear in the debug log, to make it clear why the request body was not buffered. Turning off buffering like this can remove protection from your scripts. If the attacker finds out how to disable request body buffering, he may be able to do so for every script and then use the POST method for all attacks.

12.2.5.6 Response body monitoring

Response body monitoring is supported in the Apache 2 version of mod_security and can prevent information leaks or detect signs of intrusion. This type of filtering needs to be enabled first because it is off by default:

    # Enable output filtering
    SecFilterScanOutput On
    # Restrict output filtering to text-based pages
    SecFilterOutputMimeTypes "(null) text/plain text/html"

It is important to restrict filtering using MIME types to prevent binary resources, such as images, from being buffered and analyzed.
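The effect of the MIME type restriction can be pictured with a short Python sketch. It is not mod_security's implementation, only an illustration of the decision involved. The function names are hypothetical, and the sketch assumes that (null) stands for a response with no content type and that parameters such as charset are ignored in the comparison.

```python
def parse_mime_list(directive_value):
    """Split a SecFilterOutputMimeTypes-style value into a set of types."""
    return {item.lower() for item in directive_value.split()}


def should_scan_output(content_type, allowed):
    """Decide whether a response body would be buffered and scanned."""
    if content_type is None or content_type.strip() == "":
        # Assumption: "(null)" covers responses without a content type
        return "(null)" in allowed
    # Assumption: drop parameters such as "; charset=utf-8" before comparing
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in allowed


if __name__ == "__main__":
    allowed = parse_mime_list("(null) text/plain text/html")
    for value in (None, "text/html; charset=utf-8", "image/png"):
        print(value, should_scan_output(value, allowed))
```

With the list from the configuration above, an image response falls outside the allowed set and is passed through without being buffered.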
The SecFilterSelective directive is used against the OUTPUT variable to monitor response bodies. The following example watches pages for PHP errors:

    SecFilterSelective OUTPUT "Fatal Error:"

Using a trick conceived by Ryan C. Barnett (some of whose work is available at https://sourceforge.net/users/rcbarnett/), output monitoring can be used as a form of integrity monitoring to detect and protect against defacement attacks. Attackers performing defacement usually replace the complete home page with their content. To fight this, Ryan embeds a unique keyword into every page and creates an output filtering rule that only allows the page to be sent if it contains the keyword.

    SecFilterSelective OUTPUT !KEYWORD

This is not recommended for most applications due to its organizational overhead and potential for errors, but it can work well in a few high-profile cases.

12.2.5.7 Deploying positive security model protection

Though most of this chapter used negative security model protection for examples, you can deploy mod_security in a positive security model configuration. A positive security model relies on identifying requests that are safe instead of looking for dangerous content. In the following example, I will demonstrate how this approach can be used by showing the configuration for two application scripts. For each script, the standard Apache container directive <Location> is used to enclose mod_security rules that will only be applied to that script. The use of the SecFilterSelective directive to specify rules has previously been described.
    <Location /user_view.php>
        # This script only accepts GET
        SecFilterSelective REQUEST_METHOD !^GET$

        # Accept only one parameter: id
        SecFilterSelective ARGS_NAMES !^id$

        # Parameter id is mandatory, and it must be
        # a number, 4-14 digits long
        SecFilterSelective ARG_id !^[[:digit:]]{4,14}$
    </Location>

    <Location /user_add.php>
        # This script only accepts POST
        SecFilterSelective REQUEST_METHOD !^POST$

        # Accept three parameters: firstname, lastname, and email
        SecFilterSelective ARGS_NAMES !^(firstname|lastname|email)$

        # Parameter firstname is mandatory, and it must
        # contain text 1-64 characters long
        SecFilterSelective ARG_firstname !^[[:alnum:][:space:]]{1,64}$

        # Parameter lastname is mandatory, and it must
        # contain text 1-64 characters long
        SecFilterSelective ARG_lastname !^[[:alnum:][:space:]]{1,64}$

        # Parameter email is optional, but if it is present
        # it must consist only of characters that are
        # allowed in an email address
        SecFilterSelective ARG_email !(^$|^[[:alnum:].@]{1,64}$)
    </Location>

There is a small drawback to this configuration approach. To determine which <Location> block is applicable for a request, Apache has to look through all such directives present. For applications with a small number of scripts, this will not be a problem, but it may present a performance problem for applications with hundreds of scripts, each of which needs a <Location> block. A feature to allow user-defined types (predefined regular expressions), such as one present in mod_parmguard (see the sidebar), would significantly ease the task of writing configuration data.
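The rules for the two scripts amount to a per-script table of an allowed method and a pattern for each parameter. The following Python sketch restates them in that form, with named patterns standing in for the kind of user-defined types mentioned above. It is a minimal illustration, not a description of how mod_security evaluates rules: the names POLICY and is_request_allowed are hypothetical, the POSIX character classes are translated to explicit ASCII sets, and rejecting requests for scripts that have no policy is an added assumption.

```python
import re

# Named patterns, reused across scripts (ASCII translations of the POSIX classes)
TEXT_1_64 = re.compile(r"[A-Za-z0-9 \t\n\r\f\v]{1,64}")
NUMBER_4_14 = re.compile(r"[0-9]{4,14}")
EMAIL_CHARS = re.compile(r"[A-Za-z0-9.@]{1,64}")

# script path -> (allowed method, {parameter name: (pattern, mandatory)})
POLICY = {
    "/user_view.php": ("GET", {"id": (NUMBER_4_14, True)}),
    "/user_add.php": ("POST", {
        "firstname": (TEXT_1_64, True),
        "lastname": (TEXT_1_64, True),
        "email": (EMAIL_CHARS, False),
    }),
}


def is_request_allowed(path, method, params):
    """Return True only if the request matches the policy for its script."""
    if path not in POLICY:
        return False  # assumption: scripts without a policy are rejected
    allowed_method, rules = POLICY[path]
    if method != allowed_method:
        return False
    if any(name not in rules for name in params):
        return False  # unexpected parameter name
    for name, (pattern, mandatory) in rules.items():
        value = params.get(name, "")
        if value == "" and not mandatory:
            continue  # an optional parameter may be absent or empty
        if pattern.fullmatch(value) is None:
            return False
    return True


if __name__ == "__main__":
    print(is_request_allowed("/user_view.php", "GET", {"id": "12345"}))
    print(is_request_allowed("/user_view.php", "GET", {"id": "12345", "debug": "1"}))
```

A mandatory parameter that is missing is treated as an empty value, which fails its pattern, mirroring the way the mandatory rules above reject an absent parameter.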