10.8. Evasion Techniques

Intrusion detection systems (IDSs) are an integral part of web application security. In Chapter 9, I introduced web application firewalls (also covered in Chapter 12), whose purpose is to detect and reject malicious requests.

Most web application firewalls are signature-based. They monitor HTTP traffic looking for signature matches, where a "signature" is a pattern that suggests an attack. When a request matches a signature, an action is taken (as specified by the configuration). But if an attacker modifies the attack payload so that it keeps the same meaning for the target yet no longer resembles any signature the web application firewall is looking for, the request will go through. Techniques that modify an attack payload to avoid detection are called evasion techniques.

Evasion techniques are a well-known tool in the TCP/IP world, having been used against network-level IDS tools for years. In the web security world, evasion is somewhat new. Here are some papers on the subject:

"A look at whisker's anti-IDS tactics" by Rain Forest Puppy (http://www.apachesecurity.net/archive/whiskerids.html)
"IDS Evasion Techniques and Tactics" by Kevin Timm (http://www.securityfocus.com/printable/infocus/1577)

10.8.1. Simple Evasion Techniques

We start with the simple yet effective evasion techniques:

Using mixed case characters: This technique can be useful for attackers when attacking platforms (e.g., Windows) where filenames are not case sensitive; otherwise, it is useless. Its usefulness rises, however, if the target Apache includes mod_speling as one of its modules. This module tries to find a matching file on disk, ignoring case and allowing up to one spelling mistake.
Character escaping: Sometimes people do not realize you can escape any character by preceding the character with a backslash character (\), and if the character does not have a special meaning, the escaped character will convert into itself. Thus, \d converts to d. It is not much but it is enough to fool an IDS. For example, an IDS looking for the pattern id would not detect a string i\d, which has essentially the same meaning.
Using whitespace: Using excessive whitespace, especially the less frequently thought of characters such as TAB and new line, can be an evasion technique. For example, if an attacker creates an SQL injection attempt using DELETE FROM (with two spaces in between the words instead of one), the attack will be undetected by an IDS looking for DELETE FROM (with just one space in between).
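
A common countermeasure to the three techniques above is to normalize the input before comparing it against signatures. The following is a minimal sketch of that idea, not a description of how any particular IDS works. The function names normalize_simple and contains_signature, the fixed buffer size, and the exact normalization rules are assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes a normalized copy of `in` into `out`
 * (capacity `cap`, including the terminating NUL) and returns its length.
 *   - a backslash is dropped and the character it escapes is kept
 *   - any run of whitespace (space, TAB, CR, LF) becomes one space
 *   - letters are lowercased */
size_t normalize_simple(const char *in, char *out, size_t cap)
{
    size_t n = 0;
    int in_space = 0;

    assert(in != NULL && out != NULL && cap > 0);

    for (; *in != '\0' && n + 1 < cap; in++) {
        unsigned char c = (unsigned char)*in;

        if (c == '\\' && in[1] != '\0') {
            in++;
            c = (unsigned char)*in;      /* keep the escaped character */
        }
        if (isspace(c)) {
            if (!in_space) {
                out[n++] = ' ';          /* first whitespace of a run */
            }
            in_space = 1;
            continue;
        }
        in_space = 0;
        out[n++] = (char)tolower(c);
    }
    out[n] = '\0';
    return n;
}

/* Returns 1 if the lowercase signature `sig` occurs in the normalized
 * input. Inputs longer than the buffer are truncated in this sketch. */
int contains_signature(const char *input, const char *sig)
{
    char buf[256];

    normalize_simple(input, buf, sizeof buf);
    return strstr(buf, sig) != NULL;
}
```

With this sketch, contains_signature("DELETE  \t\nFROM users", "delete from") and contains_signature("/usr/bin/i\\d", "id") both report a match, even though neither input contains the signature literally.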

10.8.2. Path Obfuscation

Many evasion techniques are used in attacks against the filesystem. For example, many methods can obfuscate paths to make them less detectable:

Self-referencing directories: When a ./ combination is used in a path, it does not change the meaning but it breaks the sequence of characters in two. For example, /etc/passwd may be obfuscated to the equivalent /etc/./passwd.
Double slashes: Using double slashes is one of the oldest evasion techniques. For example, /etc/passwd may be written as /etc//passwd.
Path traversal: Path traversal occurs when a backreference (..) is used to back out of a folder, and a folder name is then used to advance again. For example, /etc/passwd may be written as /etc/dummy/../passwd, and both versions are legal. This evasion technique can be used against application code that performs a file download to make it disclose an arbitrary file on the filesystem. Another use of the attack is to evade an IDS looking for well-known patterns in the traffic (/etc/passwd is one example).
Windows folder separator: When the web server is running on Windows, the Windows-specific folder separator \ can be used. For example, ../../cmd.exe may be written as ..\..\cmd.exe.
IFS evasion: Internal Field Separator (IFS) is a feature of some UNIX shells (sh and bash, for example) that allows the user to change the field separator (normally, a whitespace character) to something else. After you execute an IFS=X command on the shell command line, you can type CMD=X/bin/catX/etc/passwd;eval$CMD to display the contents of the /etc/passwd file on screen.
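
The first four obfuscations in the list above can be undone by canonicalizing a path before it is inspected. What follows is a rough sketch under stated assumptions: normalize_path and append_segment are hypothetical names, backslashes are always treated as separators (which only makes sense when the protected server runs on Windows), and IFS evasion is not addressed at all because it depends on shell behavior rather than on path syntax.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Appends one path segment to out, adding a '/' separator when needed.
 * Returns 0 on success, -1 if the segment does not fit. */
static int append_segment(char *out, size_t *n, size_t cap,
                          const char *seg, size_t len)
{
    int need_sep = (*n > 0 && out[*n - 1] != '/');

    if (*n + (size_t)need_sep + len + 1 > cap) {
        return -1;
    }
    if (need_sep) {
        out[(*n)++] = '/';
    }
    memcpy(out + *n, seg, len);
    *n += len;
    return 0;
}

/* Hypothetical helper: writes a canonical form of `in` to `out`.
 * Returns 0 on success, -1 if `out` (capacity `cap`) is too small. */
int normalize_path(const char *in, char *out, size_t cap)
{
    size_t n = 0;
    int absolute;

    assert(in != NULL && out != NULL && cap >= 2);

    absolute = (*in == '/' || *in == '\\');
    if (absolute) {
        out[n++] = '/';
    }
    while (*in != '\0') {
        size_t len, start;

        in += strspn(in, "/\\");      /* skip one or more separators */
        len = strcspn(in, "/\\");     /* length of the next segment */
        if (len == 0) {
            break;
        }
        if (len == 2 && in[0] == '.' && in[1] == '.') {
            /* find where the last segment written so far begins */
            for (start = n; start > 0 && out[start - 1] != '/'; start--) {
            }
            if (n - start == 2 && out[start] == '.' && out[start + 1] == '.') {
                /* already climbing above the start of a relative path */
                if (append_segment(out, &n, cap, "..", 2) != 0) return -1;
            } else if (n - start > 0) {
                n = start;            /* drop the previous segment */
                if (n > 1) n--;       /* and its separator, keeping a root "/" */
            } else if (!absolute) {
                if (append_segment(out, &n, cap, "..", 2) != 0) return -1;
            }
            /* ".." at the root of an absolute path is ignored */
        } else if (!(len == 1 && in[0] == '.')) {
            if (append_segment(out, &n, cap, in, len) != 0) return -1;
        }
        in += len;
    }
    out[n] = '\0';
    return 0;
}
```

Under these rules /etc/./passwd, /etc//passwd, and /etc/dummy/../passwd all normalize to /etc/passwd, and ..\..\cmd.exe normalizes to ../../cmd.exe.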

10.8.3. URL Encoding

Some characters have a special meaning in URLs, and they have to be encoded if they are going to be sent to an application rather than interpreted according to their special meanings. This is what URL encoding is for. (See RFC 1738 at http://www.ietf.org/rfc/rfc1738.txt and RFC 2396 at http://www.ietf.org/rfc/rfc2396.txt.) I showed URL encoding several times in this chapter, and it is an essential technique for most web application attacks.

It can also be used as an evasion technique against some network-level IDS systems. URL encoding is mandatory only for some characters but can be used for any. As it turns out, sending a string of URL-encoded characters may help an attack slip under the radar of some IDS tools. In reality, most tools have improved to handle this situation.

Sometimes, rarely, you may encounter an application that performs URL decoding twice. This is not correct behavior according to standards, but it does happen. In this case, an attacker could perform URL encoding twice.

The URL:

http://www.example.com/paynow.php?p=attack

becomes:

http://www.example.com/paynow.php?p=%61%74%74%61%63%6B

when encoded once (since %61 is an encoded a character, %74 is an encoded t character, and so on), but:

http://www.example.com/paynow.php?p=%2561%2574%2574%2561%2563%256B

when encoded twice (where %25 represents a percent sign).

If you have an IDS watching for the word "attack", it will (rightly) decode the URL only once and fail to detect the word. But the word will reach the application that decodes the data twice.
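
The difference between one and two decoding passes can be checked with a small experiment. This is a sketch only: url_decode_once and hidden_by_double_encoding are hypothetical names, the decoder copies invalid sequences through unchanged (one possible policy among several), and the 256-byte buffer is an arbitrary limit.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of one hexadecimal digit, or -1 if `c` is not a hex digit. */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = (unsigned char)tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decodes %XX sequences in `s` in place, exactly once, and returns the
 * new length. Invalid sequences are copied through unchanged. A decoded
 * %00 ends up as a NUL byte inside the buffer. */
size_t url_decode_once(char *s)
{
    char *r = s, *w = s;

    assert(s != NULL);
    while (*r != '\0') {
        int hi = (r[0] == '%') ? hex_value((unsigned char)r[1]) : -1;
        int lo = (hi >= 0) ? hex_value((unsigned char)r[2]) : -1;

        if (lo >= 0) {
            *w++ = (char)(hi * 16 + lo);
            r += 3;
        } else {
            *w++ = *r++;
        }
    }
    *w = '\0';
    return (size_t)(w - s);
}

/* Returns 1 if `keyword` is invisible after one decoding pass (what a
 * standards-following IDS sees) but present after two passes (what a
 * double-decoding application sees). */
int hidden_by_double_encoding(const char *encoded, const char *keyword)
{
    char buf[256];
    int seen_by_ids;

    if (strlen(encoded) >= sizeof buf) {
        return 0;                     /* too long for this sketch */
    }
    strcpy(buf, encoded);

    url_decode_once(buf);
    seen_by_ids = (strstr(buf, keyword) != NULL);
    url_decode_once(buf);
    return !seen_by_ids && strstr(buf, keyword) != NULL;
}
```

For the twice-encoded parameter value shown above, the first pass yields %61%74%74%61%63%6B and only the second pass yields attack, so hidden_by_double_encoding reports 1.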

There is another way to exploit badly written decoding schemes. As you know, a character is URL-encoded when it is represented with a percent sign followed by two hexadecimal digits (0-9 and A-F, representing the values 0-15). However, some decoding functions never check to see if the two characters following the percent sign are valid hexadecimal digits. Here is what a C function for handling the two digits might look like:

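#include <ctype.h>   /* toupper() */

/* Converts the two characters that follow a percent sign into one byte.
 * Deliberately naive: it never checks that the characters are hex digits. */
unsigned char x2c(unsigned char *what) {
    unsigned char c0 = toupper(what[0]);
    unsigned char c1 = toupper(what[1]);
    unsigned char digit;

    /* High nibble: letters map to 10 and up, anything else is offset from '0'. */
    digit = ( c0 >= 'A' ? c0 - 'A' + 10 : c0 - '0' );
    digit = digit * 16;
    /* Low nibble, converted the same way. */
    digit = digit + ( c1 >= 'A' ? c1 - 'A' + 10 : c1 - '0' );

    return digit;
}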

This code does not do any validation. It will correctly decode valid URL-encoded characters, but what happens when an invalid combination is supplied? By using higher characters than normally allowed, we could smuggle a slash character, for example, without an IDS noticing. To do so, we would specify XV for the characters. The algorithm converts X to 33 and V to 31; 33 multiplied by 16 is 528, which wraps around to 16 when stored in an unsigned char, and 16 plus 31 is 47 (0x2F), the ASCII character code for a slash.

The URL:

http://www.example.com/paynow.php?p=/etc/passwd

would therefore be represented by:

http://www.example.com/paynow.php?p=%XVetc%XVpasswd
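
The claim is easy to verify, and the fix is small. The sketch below repeats the arithmetic of x2c (with const added to the parameter so that string literals can be passed) and adds x2c_checked, a hypothetical validating variant that is an assumption of this example and not part of any particular server's code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Same arithmetic as the x2c() function in the text; only `const` is
 * added so that string literals can be passed in. */
unsigned char x2c(const unsigned char *what)
{
    unsigned char c0 = (unsigned char)toupper(what[0]);
    unsigned char c1 = (unsigned char)toupper(what[1]);
    unsigned char digit;

    digit = (unsigned char)(c0 >= 'A' ? c0 - 'A' + 10 : c0 - '0');
    digit = (unsigned char)(digit * 16);   /* wraps modulo 256 */
    digit = (unsigned char)(digit + (c1 >= 'A' ? c1 - 'A' + 10 : c1 - '0'));
    return digit;
}

/* Hypothetical validating variant: returns the decoded byte (0-255),
 * or -1 if either character is not a hexadecimal digit. */
int x2c_checked(const unsigned char *what)
{
    assert(what != NULL);
    if (!isxdigit(what[0]) || !isxdigit(what[1])) {
        return -1;
    }
    return x2c(what);
}
```

Here x2c returns the slash character for both "2F" and "XV", while x2c_checked accepts "2F" and rejects "XV" with -1. A decoder that rejects such input can then refuse the request or leave the sequence undecoded, depending on policy.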

10.8.4. Unicode Encoding

Unicode attacks can be effective against applications that understand Unicode. Unicode is the international standard whose goal is to represent every character needed by every written human language as a single integer number (see http://en.wikipedia.org/wiki/Unicode). What is known as Unicode evasion should more correctly be referenced as UTF-8 evasion. Unicode characters were originally represented with two bytes each, but this is impractical in real life. First, there are large amounts of legacy documents that need to be handled. Second, in many cases only a small number of Unicode characters are needed in a document, so using two bytes per character would be wasteful.

Internet Information Server (IIS) supports a special (nonstandard) way of representing Unicode characters, designed to resemble URL encoding. If the letter "u" comes after the percent sign, then the four hexadecimal digits that follow are taken to represent a full Unicode character. This feature has been used in many attacks carried out against IIS servers. You will need to pay attention to this type of attack if you are maintaining an Apache-based reverse proxy to protect IIS servers.
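
A reverse proxy that wants to inspect such requests has to recognize the %u form itself. The following is a minimal sketch of the recognition step only; decode_percent_u is a hypothetical name, and what to do with the resulting code point (map it to a byte, reject the request, and so on) is left open.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: if `s` starts with %uXXXX (four hex digits),
 * stores the code point in *cp and returns 6, the number of characters
 * consumed. Returns 0 otherwise. */
size_t decode_percent_u(const char *s, unsigned long *cp)
{
    char hex[5];
    size_t i;

    assert(s != NULL && cp != NULL);
    if (s[0] != '%' || s[1] != 'u') {
        return 0;
    }
    for (i = 0; i < 4; i++) {
        if (!isxdigit((unsigned char)s[2 + i])) {
            return 0;                 /* also stops at the end of the string */
        }
        hex[i] = s[2 + i];
    }
    hex[4] = '\0';
    *cp = strtoul(hex, NULL, 16);
    return 6;
}
```

For example, decode_percent_u("%u002Fetc", &cp) consumes six characters and sets cp to 0x2F, the code for a slash, while an ordinary %2F sequence is left for the regular URL decoder.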

UTF-8, a transformation format of ISO 10646 (http://www.ietf.org/rfc/rfc2279.txt), allows most files to stay as they are and still be Unicode compatible. Until a special byte sequence is encountered, each byte represents a character from the US-ASCII character set. When a special byte sequence is used, two or more (up to six) bytes can be combined to form a single complex Unicode character.

One aspect of UTF-8 encoding causes problems: a character can be encoded with more bytes than it needs. What is worse, multiple such representations of each character can exist. These unnecessarily long encodings are known as overlong characters, and they may be signs of an attempted attack. Besides its normal single-byte form, an ASCII character can be represented in five overlong ways. The five encodings below all decode to a new line character (0x0A):

0xc0 0x8A
0xe0 0x80 0x8A
0xf0 0x80 0x80 0x8A
0xf8 0x80 0x80 0x80 0x8A
0xfc 0x80 0x80 0x80 0x80 0x8A

Invalid UTF-8 encoding byte combinations are also possible, with similar results to invalid URL encoding.
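
Overlong sequences can be detected while decoding by comparing the decoded value with the smallest value that legitimately needs a sequence of that length. The sketch below follows the one-to-six-byte layout of RFC 2279 so that it can handle all five examples above. The function name utf8_decode_lenient and its interface are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Decodes one RFC 2279-style UTF-8 sequence (1 to 6 bytes) starting at
 * `s`, with `avail` bytes available. On success stores the code point in
 * *cp, sets *overlong to 1 if a shorter encoding of the same value
 * exists, and returns the sequence length. Returns 0 if the sequence is
 * truncated or malformed. */
size_t utf8_decode_lenient(const unsigned char *s, size_t avail,
                           unsigned long *cp, int *overlong)
{
    /* Smallest code point that needs a sequence of the given length. */
    static const unsigned long min_cp[7] =
        { 0, 0x0, 0x80, 0x800, 0x10000UL, 0x200000UL, 0x4000000UL };
    size_t len, i;
    unsigned long v;

    assert(s != NULL && cp != NULL && overlong != NULL);
    if (avail == 0) {
        return 0;
    }
    if (s[0] < 0x80)                { len = 1; v = s[0]; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; v = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; v = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; v = s[0] & 0x07; }
    else if ((s[0] & 0xFC) == 0xF8) { len = 5; v = s[0] & 0x03; }
    else if ((s[0] & 0xFE) == 0xFC) { len = 6; v = s[0] & 0x01; }
    else return 0;                  /* stray continuation byte, 0xFE or 0xFF */

    if (len > avail) {
        return 0;
    }
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) {
            return 0;               /* not a continuation byte */
        }
        v = (v << 6) | (s[i] & 0x3F);
    }
    *cp = v;
    *overlong = (v < min_cp[len]);
    return len;
}
```

Each of the five sequences listed above decodes to 0x0A with the overlong flag set, whereas the single byte 0x0A decodes to the same value with the flag clear. An inspecting device can treat the flag as a reason to reject or log the request.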

10.8.5. Null-Byte Attacks

Using URL-encoded null bytes is an evasion technique and an attack at the same time. This attack is effective against applications developed using C-based programming languages. Even with scripted applications, the application engine they were developed to work with is likely to be developed in C and possibly vulnerable to this attack. Even Java programs eventually use native file manipulation functions, making them vulnerable, too.

Internally, all C-based programming languages use the null byte for string termination. When a URL-encoded null byte is planted into a request, it often fools the receiving application, which happily decodes the encoding and plants the null byte into the string. The planted null byte will be treated as the end of the string during the program's operation, and the part of the string that comes after it and before the real string terminator will practically vanish.

We looked at how a URL-encoded null byte can be used as an attack when we covered source code disclosure vulnerabilities in the "Source Code Disclosure" section. This vulnerability is rare in practice though Perl programs can be in danger of null-byte attacks, depending on how they are programmed.

A Real Compromise Example

This example will explain how several vulnerabilities can be chained together to escalate problems until a compromise is possible.

A web site I was asked to investigate used a Perl-based content management system. Here are the steps I took in my investigation:

After some preliminary analysis of the application structure, I probed the application for common problems in input validation. One of the probes proved successful, and I was able to manipulate one of the parameters and cause the application not to find a file it was including.

What enabled me to take matters further was an information disclosure vulnerability. The application displayed a detailed error message, which contained full file paths on the server. However, my first attempts at exploiting the problem did not yield results. I discovered I could use path traversal against it.

I decided to investigate the application further and discovered one of the previous versions was available for full source code download. Luckily for my investigation, this particular part of the code did not change much between versions.

After downloading the source code, I discovered why my file disclosure attempts failed. The application was appending a string ".html" to the parameter. I could see some hints of this happening earlier but now I was able to see exactly how it was done.

Realizing the application was developed in Perl, I appended a URL-encoded null byte at the end of the parameter. This move fooled the application. It did append the extension to the filename, but the extension was not recognized as it came only after the null byte.

I was now able to fetch any file from the server.

At this point, I lost interest and wrote a detailed report for the site owner. Interestingly, after checking for the same problems a couple of days later, I realized they had not corrected the root cause of the problem. They only removed the information disclosure vulnerability (the error message). With my notes still in hand, I was able to retrieve any file from the server again. This is a good example of why security through obscurity is frequently bashed as inadequate. A determined attacker would have been able to compromise the server using a process of trial and error.

I explained this in my second email to them, but they never responded. I did not check to see if they were vulnerable again.

Null-byte encoding is used as an evasion technique mainly against web application firewalls when they are in place. These systems are almost exclusively C-based (they have to be for performance reasons), making the null-byte evasion technique effective.

Web application firewalls trigger an error when a dangerous signature (pattern) is discovered. They may be configured not to forward the request to the web server, in which case the attack attempt will fail. However, if the signature is hidden after an encoded null byte, the firewall may not detect the signature, allowing the request through and making the attack possible.

To see how this is possible, we will look at a single POST request, representing an attempt to exploit a vulnerable form-to-email script and retrieve the passwd file:

POST /update.php HTTP/1.0
Host: www.example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 78

firstname=Ivan&lastname=Ristic%00&email=ivanr@webkreator.com;cat%20/etc/passwd

A web application firewall configured to watch for the /etc/passwd string will normally easily prevent such an attack. But notice how we have embedded a null byte at the end of the lastname parameter. If the firewall is vulnerable to this type of evasion, it may miss our command execution attack, enabling us to continue with compromise attempts.
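
The root of the problem is that C string functions stop at the first null byte. The sketch below contrasts a search based on strstr with a length-aware search over the same decoded request body. Both function names are hypothetical, and the body is written out already decoded, with the null byte embedded, to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length-aware search: looks for `needle` anywhere in the first
 * `hay_len` bytes of `hay`, including bytes after an embedded NUL. */
int contains_bytes(const char *hay, size_t hay_len, const char *needle)
{
    size_t n, i;

    assert(hay != NULL && needle != NULL);
    n = strlen(needle);
    if (n == 0 || n > hay_len) {
        return 0;
    }
    for (i = 0; i + n <= hay_len; i++) {
        if (memcmp(hay + i, needle, n) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Naive check of the kind the text warns about: strstr() treats the
 * first NUL as the end of the data. */
int contains_cstring(const char *hay, const char *needle)
{
    return strstr(hay, needle) != NULL;
}
```

Given the decoded body "firstname=Ivan&lastname=Ristic\0&email=ivanr@webkreator.com;cat /etc/passwd", contains_cstring fails to find /etc/passwd because it stops after Ristic, while contains_bytes, which is told the real length of the data, finds it.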

10.8.6. SQL Evasion

Many SQL injection attacks use unique combinations of characters. An SQL comment --%20 is a good example. Implementing an IDS protection based on this information may make you believe you are safe. Unfortunately, SQL is too versatile. There are many ways to subvert an SQL query, keep it valid, but sneak it past an IDS. The first of the papers listed below explains how to write signatures to detect SQL injection attacks, and the second explains how all that effort is useless against a determined attacker:

"Detection of SQL Injection and Cross-site Scripting Attacks" by K. K. Mookhey and Nilesh Burghate (http://www.securityfocus.com/infocus/1768)
"SQL Injection Signatures Evasion" by Ofer Maor and Amichai Shulman (http://www.imperva.com/application_defense_center/white_papers/sql_injection_signatures_evasion.html)

"Determined attacker" is a recurring theme in this book. We are using imperfect techniques to protect web applications on the system administration level. They will protect in most but not all cases. The only proper way to deal with security problems is to fix vulnerable applications.