Spawning a Cluster

Introduction

A Zeek cluster is a collection of worker, logger and proxy processes as well as a single manager process. See Cluster Architectures for a more detailed introduction. This section gives a short, general overview of how to run these processes by hand in order to spawn a Zeek cluster.

The main ingredients needed to run a Zeek cluster as of Zeek 8.1 are:

  • the cluster-layout.zeek file

  • a mechanism to spawn processes

For a production setup, you’d also include a monitoring component that restarts any crashed processes, monitors and reports on their health, etc.

Note

You are reading low-level background information for developers. As a Zeek user or operator, you usually use ZeekControl or other high-level tools to operate a Zeek cluster.

Cluster Layout

All Zeek processes (also called nodes) that are part of a Zeek cluster are given unique names. Conventionally, the name is the type of node (worker, proxy, logger or manager) suffixed with an incrementing number that starts at 1 (except for the manager node, which carries no suffix). It’s useful to always use a number suffix even if there’s only a single instance of the process in a cluster. Don’t zero-pad the numbers, as it complicates things.
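
The convention can be checked mechanically. The following is a minimal sketch and not part of Zeek: the function name is_conventional_node_name is made up for illustration, and the pattern only covers the single-host form described above.

```shell
# is_conventional_node_name <name>  (hypothetical helper, not a Zeek tool)
#
# Succeeds if <name> is "manager", or a worker, proxy or logger name
# with a numeric suffix that starts at 1 and is not zero-padded.
is_conventional_node_name() {
    printf '%s\n' "$1" |
        grep -Eq '^(manager|(worker|proxy|logger)-[1-9][0-9]*)$'
}

for name in manager worker-1 worker-01 logger-0 proxy-12; do
    if is_conventional_node_name "$name"; then
        echo "$name: ok"
    else
        echo "$name: unconventional"
    fi
done
```

Since any naming scheme is permitted, treat a check like this as a lint for your own tooling rather than a rule Zeek enforces.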

When a single Zeek cluster spans multiple hosts or two interfaces are monitored, the naming scheme is conventionally extended to include another incrementing number to produce worker-1-1, worker-1-2, worker-2-1 and worker-2-2, where the first number is the number of the host and the second is the number of the node. There’s some flexibility here, however, and any naming scheme can be chosen.
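
As an illustration of the extended scheme, the sketch below prints worker names for a given number of hosts and workers per host. The function name multi_host_worker_names is hypothetical, and the sketch assumes every host runs the same number of workers.

```shell
# multi_host_worker_names <hosts> <workers_per_host>  (hypothetical helper)
#
# Prints worker-<host>-<node> names, one per line, with both numbers
# starting at 1 and without zero padding.
multi_host_worker_names() {
    host=1
    while [ "$host" -le "$1" ]; do
        node=1
        while [ "$node" -le "$2" ]; do
            echo "worker-$host-$node"
            node=$((node + 1))
        done
        host=$((host + 1))
    done
}

# Prints worker-1-1, worker-1-2, worker-2-1 and worker-2-2, one per line.
multi_host_worker_names 2 2
```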

These node names are used as keys in the Cluster::nodes table. This table is conventionally populated via redef when Zeek loads the cluster-layout.zeek file that has to be available somewhere in the ZEEKPATH. Note that the Supervisor Framework implementation uses a custom IPC mechanism to pass this information to the processes instead. We exclude it from the following discussion.
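
To see which directory would supply the layout, you can walk ZEEKPATH by hand. This is a sketch under a few assumptions: ZEEKPATH is set in the environment as a colon-separated list, the first match wins, and nothing is known about the default search path built into Zeek. The helper find_in_zeekpath is hypothetical.

```shell
# find_in_zeekpath <file>  (hypothetical helper)
#
# Prints the first directory listed in $ZEEKPATH that contains <file>.
# Fails if no directory does. Runs in a subshell so that the IFS and
# globbing changes do not leak into the caller.
find_in_zeekpath() (
    IFS=:
    set -f
    for dir in ${ZEEKPATH:-}; do
        [ -n "$dir" ] || continue   # skip empty entries
        if [ -f "$dir/$1" ]; then
            echo "$dir"
            exit 0
        fi
    done
    exit 1
)

# Demonstration with a throwaway directory instead of a real installation.
demo_dir=$(mktemp -d)
touch "$demo_dir/cluster-layout.zeek"
ZEEKPATH="/nonexistent:$demo_dir"
export ZEEKPATH
find_in_zeekpath cluster-layout.zeek   # prints the temporary directory
rm -rf "$demo_dir"
```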

Since Zeek 8.1, there’s a small utility available called zeek-cluster-layout-generator that you may use to produce a basic cluster-layout.zeek file given the number of processes.

The following listing shows the output of running this tool for a Zeek cluster with 2 workers, 1 logger, 1 proxy and a manager. As mentioned, the redef of the Cluster::nodes variable is the crucial part here.

$ zeek-cluster-layout-generator -W 2
# Auto-generated by zeek-cluster-layout-generator

redef Cluster::manager_is_logger = F;

redef Cluster::nodes += {
    ["manager"] = [$node_type=Cluster::MANAGER, $ip=127.0.0.1, $p=27760/tcp, $metrics_port=9991/tcp],
    ["logger-1"] = [$node_type=Cluster::LOGGER, $ip=127.0.0.1, $p=27761/tcp, $manager="manager", $metrics_port=9992/tcp],
    ["proxy-1"] = [$node_type=Cluster::PROXY, $ip=127.0.0.1, $p=27762/tcp, $manager="manager", $metrics_port=9993/tcp],
    ["worker-1"] = [$node_type=Cluster::WORKER, $ip=127.0.0.1, $manager="manager", $metrics_port=9994/tcp],
    ["worker-2"] = [$node_type=Cluster::WORKER, $ip=127.0.0.1, $manager="manager", $metrics_port=9995/tcp],
};

@load base/frameworks/telemetry/options
redef Telemetry::metrics_address = "0.0.0.0";
redef Telemetry::metrics_port = Cluster::local_node_metrics_port();
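
If you drive process spawning from the generated file, the node names can be pulled out of it with standard tools. The sketch below assumes the one-entry-per-line format shown in the listing above; it is a text filter, not a Zeek script parser, and layout_node_names is a hypothetical helper.

```shell
# layout_node_names <cluster-layout.zeek>  (hypothetical helper)
#
# Prints the keys of the Cluster::nodes table, one per line, by matching
# lines of the form:   ["name"] = [...],
layout_node_names() {
    sed -n 's/^[[:space:]]*\["\([^"]*\)"][[:space:]]*=.*/\1/p' "$1"
}

# Demonstration on a shortened copy of the generated layout.
layout=$(mktemp)
cat > "$layout" <<'EOF'
redef Cluster::nodes += {
    ["manager"] = [$node_type=Cluster::MANAGER, $ip=127.0.0.1, $p=27760/tcp],
    ["logger-1"] = [$node_type=Cluster::LOGGER, $ip=127.0.0.1, $p=27761/tcp, $manager="manager"],
    ["worker-1"] = [$node_type=Cluster::WORKER, $ip=127.0.0.1, $manager="manager"],
};
EOF
layout_node_names "$layout"   # prints manager, logger-1 and worker-1
rm -f "$layout"
```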

Note

With the arrival of the ZeroMQ cluster backend, a number of fields in the cluster layout aren’t very important anymore. Indeed, there are ideas around removing the static cluster-layout.zeek file completely. For the time being, however, assume that you need to pre-render the full cluster-layout.zeek file.

Spawning Processes

Once the cluster-layout.zeek file has been generated, spawn individual cluster processes as follows:

  • Set and export the ZEEKPATH environment variable such that it contains the directory in which the cluster-layout.zeek file is located, or alternatively copy the generated cluster-layout.zeek file into the working directory of each node as created below (. is in the default ZEEKPATH).

  • Create working directories for all processes to be spawned. Conventionally these are named like the node itself, e.g., worker-1 or manager or logger-1.

  • Change into the working directory and set the CLUSTER_NODE environment variable to the name of the cluster process.

  • Execute the zeek process, passing arguments as needed. Record its PID. All processes should receive local by default to load the local.zeek file. Workers will generally also receive the -i <interface> argument, with the interface possibly prefixed by the packet source plugin to use. On Linux, using AF_PACKET and interface eth0, this would then end up as -i af_packet::eth0.

  • For pinning processes to CPUs, one common approach is to use the taskset utility and execute zeek using it instead.
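
The steps above, including PID recording and optional CPU pinning, can be sketched as a single function. Everything named here is an assumption made for illustration: the spawn_node function, the ZEEK_BIN override and the zeek.pid file are not Zeek conventions. The sketch also assumes cluster-layout.zeek is already reachable through ZEEKPATH.

```shell
# Hypothetical override so the sketch can be tried with a stand-in binary.
ZEEK_BIN=${ZEEK_BIN:-zeek}

# spawn_node <name> <cpu> <args...>  (hypothetical helper)
#
# Creates the working directory <name>, starts Zeek inside it with
# CLUSTER_NODE set, and records the PID in <name>/zeek.pid. If <cpu> is
# non-empty, the process is pinned to that CPU with taskset.
spawn_node() {
    name=$1
    cpu=$2
    shift 2

    mkdir -p "$name"
    (
        cd "$name" || exit 1
        export CLUSTER_NODE="$name"
        if [ -n "$cpu" ]; then
            exec taskset -c "$cpu" "$ZEEK_BIN" "$@"
        else
            exec "$ZEEK_BIN" "$@"
        fi
    ) &
    # The subshell execs into Zeek, so $! is the PID of the Zeek process.
    echo $! > "$name/zeek.pid"
}
```

For example, spawn_node worker-1 2 -i af_packet::eth0 local would start worker-1 pinned to CPU 2, while spawn_node manager "" local would start the manager without pinning.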

Minimal Shell-Based Supervisor

The following shell script implements the steps outlined above.

Warning

Do not use this script in production! It’s solely for documentation and demonstration purposes and contains only the bare minimum to get a Zeek cluster off the ground!

supervisor.sh
#!/bin/bash
#
# Minimal shell based supervisor.
#
# Relies on bash's SIGINT propagation for shutdown. SIGTERM does not
# work and will orphan the processes.
#
# shellcheck disable=SC2086
set -eu

INTERFACE=${ZEEK_INTERFACE:-lo}
WORKERS=${ZEEK_WORKERS:-4}
PROXIES=${ZEEK_PROXIES:-2}
LOGGERS=${ZEEK_LOGGERS:-1}
SPOOL_DIR=${ZEEK_SPOOL_DIR:-$(pwd)/spool}

# Disable log rotation by default, so the logs within the logger's
# working directory just grow and grow. Good for testing and avoids
# spilling archive-log messages...
ARGS=${ZEEK_ARGS:-local Log::default_rotation_interval=0sec}

# Ignore checksum errors by default here.
WORKER_ARGS=${ZEEK_WORKER_ARGS:--C}

# The cluster backend script appended to all Zeek invocations.
CLUSTER_BACKEND_ARGS=${ZEEK_CLUSTER_BACKEND_ARGS:-policy/frameworks/cluster/backend/zeromq}

# spawn_process <name> <args...>
#
# Creates the working directory and launches a Zeek process in
# the background with the given cluster node name, passing args to it.
function spawn_process {
    local name=$1
    shift # make "$@" in the sub shell work

    local wdir=$SPOOL_DIR/$name
    mkdir -p $wdir
    cp $SPOOL_DIR/cluster-layout.zeek $wdir/cluster-layout.zeek

    # Spawn a new shell and exec to zeek.
    (
        cd $wdir
        export CLUSTER_NODE=$name
        exec zeek "$@" $CLUSTER_BACKEND_ARGS
    ) &
}

# Make sure the spool directory exists before writing the layout into it.
mkdir -p $SPOOL_DIR

zeek-cluster-layout-generator \
    -L $LOGGERS \
    -P $PROXIES \
    -W $WORKERS -o $SPOOL_DIR/cluster-layout.zeek

# Spawn all the different processes, go go go!
spawn_process manager $ARGS
for i in $(seq 1 $LOGGERS); do spawn_process logger-$i $ARGS; done
for i in $(seq 1 $PROXIES); do spawn_process proxy-$i $ARGS; done
for i in $(seq 1 $WORKERS); do spawn_process worker-$i $WORKER_ARGS -i $INTERFACE $ARGS; done

wait
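
The header comment of the script notes that SIGTERM orphans the Zeek processes. One way to address this is to record the child PIDs and forward the signal from a trap. The following is a minimal sketch of that pattern, not a drop-in change to supervisor.sh: start_child and stop_children are hypothetical names, sleep stands in for zeek so the example runs anywhere, and it assumes the children shut down when they receive SIGTERM.

```shell
PIDS=""

# start_child <command...>: run a command in the background and
# remember its PID.
start_child() {
    "$@" &
    PIDS="$PIDS $!"
}

# stop_children: send SIGTERM to every recorded child, then wait until
# all of them have exited.
stop_children() {
    for pid in $PIDS; do
        kill -TERM "$pid" 2>/dev/null || true
    done
    wait
    PIDS=""
}

# On SIGTERM or SIGINT, take the children down before exiting.
trap 'stop_children; exit 0' TERM INT

# Stand-ins for Zeek processes.
start_child sleep 60
start_child sleep 60

# A real supervisor would block in "wait" here until a signal arrives.
# The demonstration stops the children right away instead.
stop_children
echo "all children stopped"
```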

Running this script and printing the resulting process tree shows the familiar shape of a Zeek cluster: one zeek process per node, all children of the supervisor script.

$ ZEEK_WORKERS=2 ZEEK_INTERFACE=af_packet::lo ./supervisor.sh

$ pstree -acT  44168
bash
  └─supervisor.sh ./supervisor.sh
      ├─zeek local policy/frameworks/cluster/backend/zeromq
      ├─zeek local policy/frameworks/cluster/backend/zeromq
      ├─zeek local policy/frameworks/cluster/backend/zeromq
      ├─zeek local policy/frameworks/cluster/backend/zeromq
      ├─zeek -C -i af_packet::lo local policy/frameworks/cluster/backend/zeromq
      └─zeek -C -i af_packet::lo local policy/frameworks/cluster/backend/zeromq

Hopefully this removes some of the magic around what a Zeek cluster is, how it is spawned, etc. If you’re now tempted to write systemd service units, take a look at the zeek-systemd-generator first!