Connection Handling

Checksum Behavior

By default, Zeek will mostly ignore packets that have invalid checksums.

When the IPv4 header checksum is invalid, Zeek produces a bad_IP_checksum weird and discards the packet before proceeding with its connection lookup step.

As of Zeek 8.2, for L4 checksums (TCP, UDP, ICMP, …), Zeek will only determine if the L4 checksum is invalid after looking up or creating a connection using information from the potentially corrupt L4 header.

This connection lookup is implemented in IPBasedAnalyzer::AnalyzePacket(). Checksum validation then happens in the concrete protocol specific analyzers within TCPAnalyzer::DeliverPacket(), UDPAnalyzer::DeliverPacket(), etc.

Packets with invalid checksums are not counted towards orig_pkts or resp_pkts of a connection and also not passed to a connection’s analyzers. However, a c or C is added to the history field in logarithmic fashion.

This behavior can result in conn.log entries that show zero orig_pkts and resp_pkts, but do show c or C sequences in the history.

# zeek -D -b -r Traces/tcp/syn-bad.pcap base/protocols/conn LogAscii::use_json=T
# jq < conn.log
{
  "ts": 1362692526.869344,
  "uid": "CJKFoj4bpHEhTeaRoj",
  "id.orig_h": "141.142.228.5",
  "id.orig_p": 59856,
  "id.resp_h": "192.150.187.43",
  "id.resp_p": 80,
  "proto": "tcp",
  "conn_state": "OTH",
  "local_orig": false,
  "local_resp": false,
  "missed_bytes": 0,
  "history": "C",
  "orig_pkts": 0,
  "orig_ip_bytes": 0,
  "resp_pkts": 0,
  "resp_ip_bytes": 0,
  "ip_proto": 6
}

# zeek -b  -r Traces/dns/dns-corrupt.pcap base/protocols/conn LogAscii::use_json=T
# jq < conn.log
{
  "ts": 1777450586.006844,
  "uid": "CJKFoj4bpHEhTeaRoj",
  "id.orig_h": "192.168.0.109",
  "id.orig_p": 34357,
  "id.resp_h": "8.8.8.8",
  "id.resp_p": 53,
  "proto": "udp",
  "conn_state": "OTH",
  "local_orig": true,
  "local_resp": false,
  "missed_bytes": 0,
  "history": "Cc",
  "orig_pkts": 0,
  "orig_ip_bytes": 0,
  "resp_pkts": 0,
  "resp_ip_bytes": 0,
  "ip_proto": 17
}

The history fields are C and cC for the respective test captures, but orig_pkts and resp_pkts are zero.

See issue #5277 on GitHub for some discussion around this behavior.

Flipping Connections

Zeek works with a concept of originator and responder for a connection. This is visible in the Zeek scripting layer as the is_orig: bool event parameter, but also on much lower-level C++ APIs like the various Analyzer APIs or accessors on Connection instances (OrigAddr() and RespAddr(), or ``OrigPort() and RespPort()).

In certain scenarios, Zeek decides to flip the notion of originator and responder. Usually, the first packet of a connection determines which endpoint is the originator and which the responder. As a special case, when the first packet has a source port that is set in likely_server_ports, this notion is flipped and a ^ (caret) added to this connection’s history.

This connection flipping permeates various layers. For example, there is a connection_flipped event that allows Zeek scripts to react on it. Additionally, the Analyzer API offers a virtual FlipRoles() method that is executed recursively on the analyzer tree when endpoint flipping happens. All analyzers have to update their internal state upon such an event. Consider the ConnSize_Analyzer analyzer: It tracks packet and byte counts transferred by originator and responder endpoints. When the notion of these endpoints changes, a ConnSize_Analyzer instance needs to update its own internal state, as for any following DeliverPacket() calls, the meaning of is_orig is inverted.

Luckily, this flipping usually happens before the first packet of connection is processed. More recently, however, flipping on the second packet has been added. Technically, flipping can be triggered by any analyzer or logic at any time, but this results in the very unfortunate scenario that an in-flight ForwardStream() or ForwardPacket() invocation on a connection’s analyzer tree ends-up using a stale is_orig parameter. For example, this was observed with the ConnSize_Analyzer that is visited after a TCPSessionAdapter::Process() invocation. If Process() flipped the connection, the DeliverPacket() invocation on the ConnSize_Analyzer would use a stale is_orig stack variable resulting in miss-accounting single packets.

In the future, it might make sense to re-design Zeek’s lowest layers to be agnostic of the originator and responder notion. That is, always sort endpoints deterministically and name them, e.g., left and right. The notion of originator and responder shouldn’t vanish from Zeek, but instead implemented on a higher-level instead. For example, the IPBasedConnKey class currently holds a flipped member and has a FlipRoles() API. However, it seems unreasonable that the raw connection tracking layer should have knowledge of the originator and responder concept, as it introduces quite some complexity.