Cluster Architectures

Introduction

Within Zeek, packet processing and script execution happen serially within a single thread of execution. Concretely, after passing a single packet through session tracking and analyzers, Zeek drains its event queue, executing all queued events, before continuing with the next packet.

When monitoring live traffic of any significant amount, a single Zeek process will not have enough CPU time available to process all incoming packets in a timely fashion. Generally, this results in packet buffers on the network interface card (NIC) to fill and eventually results in packet drops, causing data loss.

Most network monitoring applications solve this by horizontally scaling the packet processing path. Zeek is no different in this respect. The idea is to spawn individual Zeek processes, each receiving a flow-balanced portion of the monitored traffic. Flow-balancing ensures that all packets belonging to an individual flow between two endpoints are forwarded to the same Zeek process. The details aren’t important at this point. Generally, flow-balancing has been solved by various open-source projects or in the firmware of specialized network cards by symmetric hashing of network packet headers. Examples in this area are Linux’s AF_PACKET, netmap, PF_RING, Napatech Network Streams, Intel (and others) NIC queues via AF_PACKET’s PACKET_FANOUT_QM fanout setting, etc. Head to the Cluster Setup section for more details around this.

In a cluster, Zeek processes that listen on a network interface are called Zeek workers. Conceptually, each Zeek worker process gets assigned an individual interface via the -i command-line argument. To a degree, this controls which chunk of traffic an individual Zeek worker will receive.

Note

For AF_PACKET, all workers have the same -i argument: The interface name. The Linux kernel will flow-balance according to the number of running Zeek workers dynamically. This also means that adding or removing workers at runtime redistributes connections to different workers as flow-balancing uses a simple modulo operation over the number of workers. This may negatively impact data quality, so a static number of Zeek workers should be preferred.

Besides Zeek workers, three more process roles exist in a cluster: Proxies, loggers and a central manager process. Proxy, logger and worker processes can all be scaled independently.

Note

These processes are also called nodes. And their role called node type. In this chapter we attempt to stick with the process terminology. Many other systems use the word node to describe a host running multiple services or processes and Zeek itself has a concept of single-node and multi-node Zeek clusters (see below), so this can be a bit confusing.

On the script-level, the Cluster::node variable contains manager, worker-1, proxy-1, logger-1, etc., allowing to identify individual processes. The Cluster::local_node_type function allows to determine the role of the process a script executes on.

Zeek processes normally create log files in their working directories. In a Zeek cluster, manager, worker and proxy processes forward their log writes (as created via a Log::write script-level call) in a round-robin fashion to the running Zeek logger processes instead. The exact details depend on the Zeek cluster backend in use. See the backend documentation for details.

A process supervisor starts and manages individual Zeek processes. Various supervisors and deployment approaches have been prototyped, explored and used in certain environments over the years. ZeekControl remains the de-facto standard to run and operate a Zeek cluster as of Zeek 8.1.

Processes forming a Zeek cluster are connected via a topic-based publish/subscribe layer. Zeek processes can subscribe to topics and receive remote events published by other Zeek processes. All Zeek processes have visibility into remote events published by all other process. Subscriptions use prefix matching on topic names.

Note

The publish/subscribe visibility aspect has changed in Zeek 8.1 with the ZeroMQ cluster backend. Global publish/subscribe visibility wasn’t implemented with the Broker cluster backend, putting the burden of routing remote events through the cluster onto the user. You may stumble over scripts where a worker process publishes to the manager’s individual topic just for the manager to re-publish the remote event to all workers. Backends running a centralized topology, such as ZeroMQ, remove the need for this detour. It still works, but is less efficient than sending remote events directly to the destination topic.

References

Establishing subscription at the script-level is done using the Cluster::subscribe function. Publishing events uses Cluster::publish. Events are published by providing a topic name, the event identifier and the event’s arguments.

# event found_root()
Cluster::publish("/my/topic", found_root);

# event found_secret(secret: string)
Cluster::publish("/my/topic", found_secret, "SeCrEt!!");

Receiving and execution of remote events is implicit: The existence of a local event handler implies execution of received remote events with the same name. Publishing a remote event is asynchronous. The incurred latency between different processes depends on the cluster backend and whether all Zeek processes are located on the same node or distributed across multiple nodes.

Single Node Examples

A Zeek cluster can be deployed on a single hardware/virtual system. This looks as follows when the system has a single monitoring NIC.

../../../_images/single-system-one-nic.svg

This example depicts load-balancing of all network traffic across 8 workers.

When monitoring two NICs on the same system, the cluster looks as follows, assuming four workers are sufficient per NIC.

../../../_images/single-system-two-nics.svg

Note that while these diagrams depict just a few workers per NIC, high-performance environments may require 100 or more Zeek workers and—depending on the loaded scripts—up to 10s of proxies and loggers. Such environments generally require experiments, monitoring and performance tuning to establish the appropriate number of workers, loggers and proxies for a given hardware configuration and system.

Note

Zeek loggers aggregate the logs produced by other processes and logging to a single file. When Zeek logs are forwarded to a messaging queue like Kafka or NATS instead, all log records continue to be funneled through logger processes. It likely would be more efficient to instead send logs to a message queue directly.

As of Zeek 8.0, there is no ready-to-use recipe for running a logger-less cluster.

Multi Node Examples

Zeek was originally developed during a time in which multi-core CPUs weren’t widely available and scaling and flow-balancing was actually done on the level of a full hardware system. This can still be useful today for deploying multiple smaller worker systems with 10G NICs compared to a single large system with hundreds of CPUs and a powerful 100G NIC. Such an architecture requires an external packet broker that ensures consistent flow-balancing across the individual Zeek systems, essentially introducing a separate flow-balancing layer.

Operational Metrics via Prometheus

Historically, Zeek has exposed many of its operational metrics via logs (stats.log specifically) and assumed users have infrastructure in place to load and analyze such data. Today, Zeek logs may only be accessible to certain users and analysts, while runtime and performance metrics are more interesting to Zeek operators. Additionally, Prometheus has become a very popular option in the operational metrics space. Starting with version 4.1, Zeek has moved into a direction where Prometheus exposition for operational metrics is preferred over exporting them as classic Zeek logs.

The model is that in a Zeek cluster, every Zeek process opens a HTTP Prometheus listener. An external Prometheus server scrapes metrics from these endpoints at regular intervals. ZeekControl currently allocates listener ports statically.

The manager process additionally provides a /services.json endpoint for HTTP-based service discovery of all processes in a Zeek cluster. This allows the Prometheus Server to discover all metrics endpoints via the Zeek manager’s /services.json endpoint.

Adding to the previous diagrams, this looks as follows:

../../../_images/single-system-prometheus.svg

This should seamlessly work with multi-node clusters.

Note

As of Zeek 7, the Zeek manager builds the /services.json response based on the static cluster-layout.zeek. It statically knows all all the metric endpoints of all other processes. In the future this might be more dynamic. There’s no actual reason for this static setup. Other Zeek processes could as well dynamically register their metrics endpoint with the manager process at runtime.

Cluster Architectures

Introduction

Single Node Examples

Multi Node Examples

Shared Publish/Subscribe Layer

Split Publish/Subscribe Layer

WebSocket API to the Publish/Subscribe Layer

Operational Metrics via Prometheus