http.log

The Hypertext Transfer Protocol (HTTP) log, or http.log, is another core data source generated by Zeek. With the transition from clear-text HTTP to encrypted HTTPS traffic, the http.log is less active in many environments. In some cases, however, organizations implement technologies or practices to expose HTTPS as HTTP. Whether you’re looking at legacy HTTP on the wire, or HTTPS that has been exposed as HTTP, Zeek’s http.log offers utility for examining normal, suspicious, and malicious activity.

The Zeek scripting manual, derived from the Zeek source code, completely explains the meaning of each field in the http.log (and other logs). It would be duplicative to manually recreate that information in another format here. Therefore, this entry seeks to show how an analyst would make use of the information in the http.log. Those interested in getting details on every element of the http.log should refer to HTTP::Info.

Throughout the sections that follow, we will inspect Zeek logs in JSON format.

Inspecting the http.log

To inspect the http.log, we will use the same techniques we learned earlier in the manual. First, we have a JSON-formatted log file, produced by Zeek either watching a live interface or processing stored traffic. We use the jq utility to review the contents.

zeek@zeek:~/zeek-test/json$ jq . -c http.log
{"ts":1591367999.512593,"uid":"C5bLoe2Mvxqhawzqqd","id.orig_h":"192.168.4.76","id.orig_p":46378,"id.resp_h":"31.3.245.133","id.resp_p":80,"trans_depth":1,"method":"GET","host":"testmyids.com","uri":"/","version":"1.1","user_agent":"curl/7.47.0","request_body_len":0,"response_body_len":39,"status_code":200,"status_msg":"OK","tags":[],"resp_fuids":["FEEsZS1w0Z0VJIb5x4"],"resp_mime_types":["text/plain"]}

This is a very simple http.log: it contains only one entry. As before, we could see each field printed on its own line:

zeek@zeek:~/zeek-test/json$ jq . http.log
{
  "ts": 1591367999.512593,
  "uid": "C5bLoe2Mvxqhawzqqd",
  "id.orig_h": "192.168.4.76",
  "id.orig_p": 46378,
  "id.resp_h": "31.3.245.133",
  "id.resp_p": 80,
  "trans_depth": 1,
  "method": "GET",
  "host": "testmyids.com",
  "uri": "/",
  "version": "1.1",
  "user_agent": "curl/7.47.0",
  "request_body_len": 0,
  "response_body_len": 39,
  "status_code": 200,
  "status_msg": "OK",
  "tags": [],
  "resp_fuids": [
    "FEEsZS1w0Z0VJIb5x4"
  ],
  "resp_mime_types": [
    "text/plain"
  ]
}
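The same entry can also be processed programmatically. The following is a minimal Python sketch, not part of Zeek, that reads the JSON line shown above and reduces it to a one-line summary. The function name summarize is hypothetical, and the sketch assumes every field it reads is present, which is not guaranteed for arbitrary http.log entries.

```python
import json
from datetime import datetime, timezone

# The single http.log entry shown above (Zeek writes one JSON object per line).
LOG_LINE = '{"ts":1591367999.512593,"uid":"C5bLoe2Mvxqhawzqqd","id.orig_h":"192.168.4.76","id.orig_p":46378,"id.resp_h":"31.3.245.133","id.resp_p":80,"trans_depth":1,"method":"GET","host":"testmyids.com","uri":"/","version":"1.1","user_agent":"curl/7.47.0","request_body_len":0,"response_body_len":39,"status_code":200,"status_msg":"OK","tags":[],"resp_fuids":["FEEsZS1w0Z0VJIb5x4"],"resp_mime_types":["text/plain"]}'


def summarize(line: str) -> str:
    """Return a one-line summary of an http.log entry (hypothetical helper)."""
    rec = json.loads(line)
    # Zeek's ts field is seconds since the Unix epoch; render it in UTC.
    when = datetime.fromtimestamp(rec["ts"], tz=timezone.utc)
    return (
        f'{when:%Y-%m-%d %H:%M:%S} UTC '
        f'{rec["id.orig_h"]} -> {rec["id.resp_h"]}:{rec["id.resp_p"]} '
        f'{rec["method"]} {rec["host"]}{rec["uri"]} '
        f'{rec["status_code"]} {rec["status_msg"]}'
    )


print(summarize(LOG_LINE))
```

For a log with many entries, the same function could be applied to each line of the file in turn.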

HTTP is a protocol that was initially fairly simple. Over time it has become increasingly complicated. It’s not the purpose of this manual to describe how HTTP can be used and abused. Rather, we will take a brief look at the most important elements of this http.log entry, which is almost all of them.

Understanding the http.log Entry

Similar to the previous dns.log, the http.log is helpful because it combines elements from the conversation between the source and destination in one log entry. The most fundamental elements of the log answer questions concerning who made a request, who responded, and the nature of the request and response.

In this entry, we see that 192.168.4.76 made a request to 31.3.245.133. The originator made an HTTP version 1.1 GET request for the / or root of the site testmyids.com hosted by the responder, passing a user agent of curl/7.47.0.

The responder replied with a 200 OK message, with a MIME (Multipurpose Internet Mail Extensions) type of text/plain. Zeek determines this value by examining the response body itself, not by trusting the Content-Type header sent by the server. Zeek provides us a file ID (or fuid) of FEEsZS1w0Z0VJIb5x4. If we had configured Zeek to extract files of type text/plain, we could look at the content returned by the responder.

Finally, note the UID of C5bLoe2Mvxqhawzqqd. This is the same UID found in the conn.log for this TCP connection. This allows us to link the conn.log entry with this http.log entry.
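To illustrate the linkage, here is a minimal Python sketch that pairs http.log entries with conn.log entries sharing a UID. The http.log record is trimmed from the entry above. The conn.log record is HYPOTHETICAL: only its uid comes from this section, and the proto and service values are illustrative placeholders. The function names are also assumptions made for this example.

```python
import json

# http.log entry from this section, trimmed to the fields used here.
http_lines = [
    '{"uid":"C5bLoe2Mvxqhawzqqd","host":"testmyids.com","uri":"/","status_code":200}',
]

# HYPOTHETICAL conn.log entry: only the uid is taken from the text above;
# proto and service are placeholders for illustration.
conn_lines = [
    '{"uid":"C5bLoe2Mvxqhawzqqd","proto":"tcp","service":"http"}',
]


def index_by_uid(lines):
    """Map each connection UID to its parsed conn.log record."""
    return {rec["uid"]: rec for rec in map(json.loads, lines)}


def join_http_to_conn(http_lines, conn_lines):
    """Pair every http.log record with the conn.log record sharing its uid."""
    conns = index_by_uid(conn_lines)
    pairs = []
    for line in http_lines:
        http = json.loads(line)
        conn = conns.get(http["uid"])
        if conn is not None:
            pairs.append((http, conn))
    return pairs


for http, conn in join_http_to_conn(http_lines, conn_lines):
    print(http["uid"], conn["proto"], http["host"] + http["uri"])
```

The index is built on conn.log because one connection can carry several HTTP transactions, so a single conn.log entry may match more than one http.log entry.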

Reviewing the Original Traffic

To better understand the original traffic, and how it relates to the Zeek http.log, let’s look at the contents manually. HTTP is a clear-text protocol. Assuming the contents are also clear text, and not obfuscated or encrypted, we can read them directly. In the following example we use the venerable program tcpflow to create two files. One contains data from the originator to the responder, while the second contains data from the responder to the originator.

zeek@zeek:~/zeek-test$ tcpflow -r tm1t.pcap port 80

Let’s first look at the data from the originator to the responder.

zeek@zeek:~/zeek-test$ cat 192.168.004.076.46378-031.003.245.133.00080
GET / HTTP/1.1
Host: testmyids.com
User-Agent: curl/7.47.0
Accept: */*

Here is the data from the responder to the originator.

zeek@zeek:~/zeek-test$ cat 031.003.245.133.00080-192.168.004.076.46378
HTTP/1.1 200 OK
Server: nginx/1.16.1
Date: Fri, 05 Jun 2020 14:40:07 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 39
Connection: keep-alive
Last-Modified: Fri, 10 Jan 2020 21:36:02 GMT
ETag: "27-59bcfe9932c32"
Accept-Ranges: bytes

uid=0(root) gid=0(root) groups=0(root)

As you can see, there are elements, particularly in the response, that do not appear in the http.log. For example, the Server type of nginx/1.16.1 is not logged. Analysts or administrators who want that data in their http.log can use Zeek’s scripting language to extend the log with additional fields.
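The relationship between the raw response and the log entry can be checked with a short Python sketch. It embeds the responder-to-originator stream shown above as a byte string, assuming CRLF line endings in the header section and a single trailing newline in the body, which is consistent with the Content-Length of 39. The parse_response function is a hypothetical helper written for this example, not a full HTTP parser.

```python
# Responder-to-originator stream as shown above. The CRLF line endings and
# the single trailing newline in the body are assumptions.
RAW_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: nginx/1.16.1\r\n"
    b"Date: Fri, 05 Jun 2020 14:40:07 GMT\r\n"
    b"Content-Type: text/html; charset=UTF-8\r\n"
    b"Content-Length: 39\r\n"
    b"Connection: keep-alive\r\n"
    b"Last-Modified: Fri, 10 Jan 2020 21:36:02 GMT\r\n"
    b'ETag: "27-59bcfe9932c32"\r\n'
    b"Accept-Ranges: bytes\r\n"
    b"\r\n"
    b"uid=0(root) gid=0(root) groups=0(root)\n"
)


def parse_response(raw: bytes):
    """Split a simple HTTP/1.1 response into status code, reason, headers, body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        # Split on the first colon only; values such as dates contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


status_code, status_msg, headers, body = parse_response(RAW_RESPONSE)
print(status_code, status_msg)                # status_code, status_msg in http.log
print(len(body), headers["content-length"])  # compare with response_body_len
print(headers["server"])                      # present on the wire, absent from http.log
print(headers["content-type"])                # header value, not resp_mime_types
```

The status code, reason phrase, and body length line up with the status_code, status_msg, and response_body_len fields of the log entry. The Content-Type header reads text/html, while the log records text/plain in resp_mime_types.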

The data from the responder also shows the application payload it sent:

uid=0(root) gid=0(root) groups=0(root)

This is the output of the Unix id command when run as root. The server testmyids.com hosts it to trigger a “GPL ATTACK_RESPONSE id check returned root” alert found in open source intrusion detection rule sets, such as those used by Suricata. Analysts sometimes use this site to test whether their intrusion detection engines are functioning properly. A more modern option with many different tests can be found at https://github.com/0xtf/testmynids.org.
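As a rough illustration of why this payload is useful for testing, the following Python sketch looks for the string uid=0(root) in a response body. It is only a loose approximation of the kind of content match such an alert relies on, not the actual rule logic. The pattern and the function name are assumptions made for this example.

```python
import re

# Loose approximation of a content match on the payload above.
# This is NOT the actual IDS rule, only an illustration of the idea.
ROOT_ID_PATTERN = re.compile(rb"uid=0\(root\)")


def looks_like_root_id_output(payload: bytes) -> bool:
    """Return True if the payload contains the string uid=0(root)."""
    return ROOT_ID_PATTERN.search(payload) is not None


print(looks_like_root_id_output(b"uid=0(root) gid=0(root) groups=0(root)\n"))
print(looks_like_root_id_output(b"uid=1000(zeek) gid=1000(zeek) groups=1000(zeek)\n"))
```

The first call matches the payload served by testmyids.com. The second uses a made-up non-root identity and does not match.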

Conclusion

Zeek’s http.log is another important log that offers a great deal of information on how systems are interacting with the Internet and each other. In the example in this section we looked at a very simple interaction between an originator and a responder. We could see the benefit of summarizing an HTTP request and response in a single log entry. In the next section we will look at other core Internet protocols.