The Honeyd framework supports several ways of logging network activity. It can create connection logs that report attempted and completed connections for all protocols. To get more detailed information, services can log arbitrary information to Honeyd via stderr. The framework also uses syslog for communicating warnings or system-level errors. In most situations, we expect that Honeyd runs in conjunction with a network intrusion detection system (NIDS) or a set of custom scripts to parse and analyze the log files.
Packet-level logs can be enabled via the -l command-line option, which takes a single filename as its argument. The directory in which the file is created needs to be writable by the user that Honeyd runs as, usually nobody. Analyzing packet logs is the easiest way to get an overview of what kind of traffic your honeypots receive. The log file records the source and destination IP addresses and which protocols and ports were used. If a connection is established, the log file also records when the connection started, when it ended, and how many bytes were transmitted. Figure 4.14 contains an example from a log file in table format.
The Date column contains a timestamp of when the packet was received by Honeyd. The next column contains the Internet protocol, usually TCP, UDP, or ICMP, although rare network probes may use any other Internet protocol. The third column, labeled T, contains the connection type: S stands for connection start, E stands for connection end, and - indicates that the packet does not belong to any connection. The next four columns show the source IP address, source port, destination IP address, and destination port. For protocols that do not use ports, like ICMP, the port columns are empty. The Info column contains information associated with a connection or packet. When a connection ends, it contains the number of bytes received and sent by Honeyd, respectively. For a probe packet, it contains additional protocol information:
TCP: The size of the packet and the flags set in the header. Honeyd knows about the following flags: F — Fin, S — Syn, R — Rst, P — Push, A — Ack, U — Urg, E — ECE, and C — CWR.
ICMP: The ICMP code and type and the size of the packet. Consult Stevens's book TCP/IP Illustrated for more information [38].
UDP: Size of the packet.
The Info column contains additional human-readable information. In many cases, it contains at least a guess at the remote operating system based on passive fingerprinting.
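The flag letters listed above are easy to expand when post-processing a log. The following is a minimal sketch, assuming the flags appear in the Info column as a contiguous string of letters such as SA; the function name decode_tcp_flags is hypothetical and not part of Honeyd.

```python
# Flag letters and names as listed in the text above.
TCP_FLAGS = {
    "F": "Fin", "S": "Syn", "R": "Rst", "P": "Push",
    "A": "Ack", "U": "Urg", "E": "ECE", "C": "CWR",
}


def decode_tcp_flags(letters):
    """Return the flag names for a string of flag letters such as 'SA'.

    Assumption: one character per flag. Unknown characters are passed
    through unchanged so that unexpected input stays visible.
    """
    return [TCP_FLAGS.get(ch, ch) for ch in letters]


print(decode_tcp_flags("SA"))  # ['Syn', 'Ack']
```

Passing unknown characters through rather than raising an error keeps a long analysis run from aborting on a single odd record.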
For protocols whose traffic Honeyd tracks as connections, such as TCP or UDP, Honeyd does not log every packet. Instead, it logs the start of a connection and the corresponding end, in a fashion similar to NetFlow. The main benefit is reduced clutter in the logs. For example, if somebody were to download a large file from a honeypot, there would be no benefit in logging each individual packet of the download. Honeyd uses the code S to mark the start of a connection and the code E to mark its end, at which point it summarizes the amount of data exchanged.
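To make the S, E, and - records concrete, the sketch below tallies the record types and adds up the byte counts reported when connections end. It assumes whitespace-separated fields in the column order described above, with the received and sent byte counts in the eighth and ninth fields of an E record. The sample lines are invented for illustration (they use documentation address ranges) and are not real Honeyd output, so check the field positions against your own logs.

```python
from collections import Counter


def summarize(lines):
    """Count records per connection type and total the bytes of E records."""
    types = Counter()
    received = sent = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # Skip blank or malformed lines
        types[fields[2]] += 1
        # Assumption: an E record ends with "<bytes received> <bytes sent>".
        if fields[2] == "E" and len(fields) >= 9:
            try:
                received += int(fields[7])
                sent += int(fields[8])
            except ValueError:
                pass  # Layout differs from the assumption; ignore byte counts
    return types, received, sent


# Hypothetical sample records, not taken from a real log file.
sample = [
    "2006-08-12-00:00:15.1234 tcp S 192.0.2.1 1025 198.51.100.2 80",
    "2006-08-12-00:00:17.4321 tcp E 192.0.2.1 1025 198.51.100.2 80: 120 512",
    "2006-08-12-00:00:20.0000 tcp - 192.0.2.3 4000 198.51.100.2 22: 48 S",
]
types, received, sent = summarize(sample)
print(dict(types), received, sent)
```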
The packet logs are very useful for data mining. A simple Python script can calculate the number of distinct IP addresses that probe our honeypots per day, a distribution of operating systems, or a list of the most popular ports. The example Python script in Figure 4.15 computes the number of unique IP addresses that contact our honeypots per day. Measuring the number of IP addresses gives you a good idea of the scanning activity your honeypots are exposed to. This measure is likely to grow over time.
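import sys

old_day = ''
ips = set()  # Unique source IPs seen on the current day

for line in sys.stdin:
    fields = line.split(' ', 4)
    if len(fields) < 4:
        continue  # Skip blank or malformed lines
    date, srcip = fields[0], fields[3]  # Extract date and source IP
    day = '-'.join(date.split('-')[0:3])
    if day != old_day:
        if old_day:
            print(old_day, len(ips))  # Report the day that just ended
        old_day = day
        ips = set()
    ips.add(srcip)

if old_day:
    print(old_day, len(ips))  # Report the last day in the log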
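The list of most popular ports mentioned above can be computed in much the same way. This is a sketch under stated assumptions: fields are separated by whitespace, the destination port is the seventh field of TCP and UDP records, the protocol field starts with the protocol name, and the port may carry a trailing colon. The function name top_ports and the sample lines are hypothetical.

```python
from collections import Counter


def top_ports(lines, n=10):
    """Return the n most frequently contacted destination ports."""
    ports = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # Skip blank or malformed lines
        proto, conn_type = fields[1].lower(), fields[2]
        # Only TCP and UDP records carry ports; other protocols leave
        # the port columns empty, which shifts the remaining fields.
        if not proto.startswith(("tcp", "udp")):
            continue
        if conn_type == "E":
            continue  # Already counted at connection start
        port = fields[6].rstrip(":")
        if port.isdigit():
            ports[int(port)] += 1
    return ports.most_common(n)


# Hypothetical sample records, not taken from a real log file.
sample = [
    "2006-08-12-00:00:15.1 tcp S 192.0.2.1 1025 198.51.100.2 80",
    "2006-08-12-00:00:16.2 tcp E 192.0.2.1 1025 198.51.100.2 80: 120 512",
    "2006-08-12-00:00:17.3 tcp - 192.0.2.3 4000 198.51.100.2 80: 48 S",
    "2006-08-12-00:00:18.4 udp - 192.0.2.4 5000 198.51.100.2 22: 60",
]
print(top_ports(sample))  # [(80, 2), (22, 1)]
```

Skipping E records avoids counting each established connection twice, once at its start and once at its end.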
These log files can grow very large over time, depending on how much traffic your honeypots receive. It is good practice to rotate them so that your filesystem does not fill up. Honeyd supports log rotation via the USR1 signal. When Honeyd receives this signal, it closes all current log files and opens new ones. To manually rotate the log files, use the script shown in Figure 4.16.
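# Move the current packet and service logs out of the way
mv logfile logfile.0
mv logfile.srv logfile.srv.0
# Tell Honeyd to close its log files and open new ones
kill -USR1 $(cat /var/run/honeyd.pid)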
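If you drive rotation from an analysis script instead of the shell, the same steps can be sketched in Python. The function name rotate_logs and its parameters are hypothetical; the PID file path is the one used in Figure 4.16, and the send_signal parameter exists only so that the function can be exercised without a running Honeyd.

```python
import os
import signal


def rotate_logs(log_paths, pid_file="/var/run/honeyd.pid", send_signal=os.kill):
    """Rename each existing log to <name>.0, then signal Honeyd with USR1."""
    for path in log_paths:
        if os.path.exists(path):
            # Like mv, os.replace overwrites an older <name>.0 file.
            os.replace(path, path + ".0")
    with open(pid_file) as fh:
        pid = int(fh.read().strip())
    # On USR1, Honeyd closes its current log files and opens new ones.
    send_signal(pid, signal.SIGUSR1)
    return pid


# Example call (needs permission to signal the Honeyd process):
# rotate_logs(["logfile", "logfile.srv"])
```

As in the shell version, only one old generation is kept; a cron job that also compresses or dates the rotated files would be a natural extension.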
Service-level logs can be enabled via the -s command-line option, which takes a single filename as its argument. The directory in which the file is created needs to be writable by the user that Honeyd runs as, usually nobody. While packet logs give us an overview of the overall traffic, service logs give us very detailed information about the ongoing traffic. Each service script can ask Honeyd to write information into this log file by printing it to stderr. As a consequence, the precise format of these log entries can differ from service to service. If you write your own service emulation scripts, it is up to you to choose a format that is easy to analyze.
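As an illustration of a format that is easy to analyze, a service emulation script written in Python could emit one line per event with key=value pairs. This is only a sketch of one possible convention: the helper log_event, the event name, and the keys are hypothetical, and the only behavior taken from the text is that whatever a service prints to stderr ends up in the service log.

```python
import sys
import time


def log_event(event, stream=sys.stderr, **details):
    """Write one 'timestamp event key=value ...' line to the given stream."""
    parts = [time.strftime("%Y-%m-%dT%H:%M:%S"), event]
    # Sort the keys so that every line lists them in the same order.
    parts += ["%s=%s" % (key, value) for key, value in sorted(details.items())]
    stream.write(" ".join(parts) + "\n")
    stream.flush()  # Do not leave entries sitting in a buffer


# Hypothetical event reported by a fake SMTP service.
log_event("login-attempt", proto="smtp", user="root")
```

Values that contain spaces would break this simple layout, so a real script should quote or escape them before logging.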
The example in Figure 4.17 shows a remote IP address falling for our fake proxy and SMTP servers. The IP address tried to anonymously send e-mail by connecting to a mail server via an open proxy. To the remote mail server, it seems that the e-mail originates from the IP address of the proxy. Other interesting examples gleaned from these logs show attempts to break into the secure web servers of oil companies in Russia or the login servers of instant messaging companies.