5. WebSocket

Broker offers access to the publish/subscribe layer via WebSocket in order to make its data accessible to third parties.

WebSocket clients are treated as lightweight peers. Each Broker endpoint can be configured to act as a WebSocket server in one of three ways: (1) setting the environment variable BROKER_WEB_SOCKET_PORT; (2) setting broker.web-socket.port on the command line or in the configuration file; or (3) calling endpoint::web_socket_listen() from C++. When running inside Zeek, scripts may call Broker::listen_websocket() to have Zeek start listening for incoming WebSocket connections.

Note

Broker uses the same SSL parameters for native and WebSocket peers.

5.1. JSON API v1

To access the JSON API, clients may connect to wss://<host>:<port>/v1/messages/json (SSL enabled, default) or ws://<host>:<port>/v1/messages/json (SSL disabled). On this WebSocket endpoint, Broker allows JSON-formatted text messages only.
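As a small illustration of the two URL forms above, the following Python sketch assembles the endpoint URL. The helper name json_api_url and the port 9997 in the usage line are hypothetical and only serve the example; an IPv6 literal would additionally need square brackets around the host, which this sketch does not handle.

```python
def json_api_url(host, port, ssl=True):
    """Build the URL of the JSON API v1 endpoint (hypothetical helper)."""
    # wss:// when SSL is enabled (the default), ws:// otherwise.
    scheme = "wss" if ssl else "ws"
    return f"{scheme}://{host}:{port}/v1/messages/json"


# Hypothetical host and port, for illustration only.
print(json_api_url("localhost", 9997))
print(json_api_url("localhost", 9997, ssl=False))
```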

5.1.1. Handshake

The Broker endpoint expects a JSON array of strings as the first message. This array encodes the subscriptions as a list of topic prefixes that the client subscribes to. Clients that only publish data must send an empty JSON array.

After receiving the subscriptions, the Broker endpoint sends a single ACK message:

{
  "type": "ack",
  "endpoint": "<uuid>",
  "version": "<broker-version>"
}

In this message, <uuid> is the unique endpoint ID of the WebSocket server and <broker-version> is a string representation of the libbroker version, i.e., the result of broker::version::string(). For example:

{
  "type": "ack",
  "endpoint": "925c9110-5b87-57d9-9d80-b65568e87a44",
  "version": "2.2.0-22"
}
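The two handshake messages can be sketched in Python as follows. This is a minimal illustration that only produces and parses the JSON text; actually sending it requires a WebSocket client library, which is left out here. The function names subscription_message and parse_ack are hypothetical.

```python
import json


def subscription_message(topic_prefixes):
    """Serialize the first client message: a JSON array of topic prefixes."""
    prefixes = list(topic_prefixes)
    if not all(isinstance(p, str) for p in prefixes):
        raise TypeError("topic prefixes must be strings")
    return json.dumps(prefixes)


def parse_ack(text):
    """Return (endpoint, version) from the server's ACK message."""
    msg = json.loads(text)
    if not isinstance(msg, dict) or msg.get("type") != "ack":
        raise ValueError("expected an ACK message")
    return msg["endpoint"], msg["version"]


# A publish-only client sends an empty array.
print(subscription_message([]))
print(subscription_message(["/foo/bar"]))
```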

5.1.2. Protocol

After the handshake, the WebSocket client may only send Data Messages. The Broker endpoint converts every message to its native representation and publishes it.

The WebSocket server may send Data Messages (whenever a data message matches the subscriptions of the client) and Error Messages to the client.

5.1.3. Data Representation

Broker uses a recursive data type to represent its values (see Data Model). This data model does not map to JSON-native types without ambiguity, e.g., because Broker distinguishes between signed and unsigned number types.

In JSON, we represent each value as a JSON object with two keys: @data-type and data. The former identifies one of Broker’s data types (see below) and denotes how Broker parses the data field.
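The ambiguity can be illustrated with a short Python sketch: the number 123 serializes to the same JSON token whether it is a count or an integer, so only the @data-type key tells the two apart. The helper name broker_value is hypothetical.

```python
import json


def broker_value(data_type, data):
    """Wrap a payload in the two-key object described above (hypothetical helper)."""
    return {"@data-type": data_type, "data": data}


as_count = broker_value("count", 123)
as_integer = broker_value("integer", 123)

# Same "data" field, different "@data-type".
print(json.dumps(as_count))
print(json.dumps(as_integer))
```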

5.1.3.1. None

There is exactly one valid input for encoding a none:

{
  "@data-type": "none",
  "data": {}
}

5.1.3.2. Boolean

The type boolean can take on exactly two values and maps to the native JSON boolean type:

{
  "@data-type": "boolean",
  "data": true
}
{
  "@data-type": "boolean",
  "data": false
}

5.1.3.3. Count

A count is a 64-bit unsigned integer and maps to a (non-negative) JSON integer. For example, Broker encodes the count 123 as:

{
  "@data-type": "count",
  "data": 123
}

Note

Passing a number with a decimal point (e.g. ‘1.0’) is an error.

5.1.3.4. Integer

The type integer maps to JSON integers. For example, Broker encodes the integer -7 as:

{
  "@data-type": "integer",
  "data": -7
}

Note

Passing a number with a decimal point (e.g. ‘1.0’) is an error.
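A client may want to catch such errors before sending. The following Python sketch rejects non-integer input for both types and checks the 64-bit unsigned range for count. The helper names are hypothetical, and the sketch deliberately performs no range check for integer because the text above does not state its width.

```python
MAX_COUNT = 2**64 - 1  # largest 64-bit unsigned value


def _require_int(value):
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"expected a number without decimal point, got {value!r}")
    return value


def encode_count(value):
    """Encode a Python int as a Broker count."""
    if not 0 <= _require_int(value) <= MAX_COUNT:
        raise ValueError(f"{value} does not fit into a 64-bit unsigned integer")
    return {"@data-type": "count", "data": value}


def encode_integer(value):
    """Encode a Python int as a Broker integer."""
    return {"@data-type": "integer", "data": _require_int(value)}
```

With these helpers, encode_count(1.0) raises a TypeError instead of producing a message that the server would reject.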

5.1.3.5. Real

The type real maps to JSON numbers. For example, Broker encodes -7.5 as:

{
  "@data-type": "real",
  "data": -7.5
}

5.1.3.6. Timespan

A timespan has no equivalent in JSON, so Broker encodes it as a string. The format of the string is <value><suffix>, where value is an integer and suffix is one of:

ns

Nanoseconds.

ms

Milliseconds.

s

Seconds.

min

Minutes.

h

Hours.

d

Days.

For example, 1.5 seconds may be encoded as:

{
  "@data-type": "timespan",
  "data": "1500ms"
}
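On the receiving side, a client can convert such a string into a single unit. The Python sketch below parses the <value><suffix> format into nanoseconds. The function name timespan_to_ns is hypothetical, and accepting a leading minus sign is an assumption of this sketch, since the text above does not say whether negative timespans occur.

```python
import re

# Nanoseconds per unit, for the six suffixes listed above.
_NS_PER_UNIT = {
    "ns": 1,
    "ms": 10**6,
    "s": 10**9,
    "min": 60 * 10**9,
    "h": 3600 * 10**9,
    "d": 86400 * 10**9,
}

# Assumption: an optional leading minus sign is allowed.
_TIMESPAN = re.compile(r"(-?\d+)(ns|ms|s|min|h|d)")


def timespan_to_ns(text):
    """Convert a timespan string such as '1500ms' to nanoseconds."""
    match = _TIMESPAN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid timespan: {text!r}")
    value, suffix = match.groups()
    return int(value) * _NS_PER_UNIT[suffix]


print(timespan_to_ns("1500ms"))
```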

5.1.3.7. Timestamp

Like timespan, timestamp has no native JSON equivalent, so Broker represents it as a formatted string. Timestamps are encoded in ISO 8601 as YYYY-MM-DDThh:mm:ss.sss.

For example, Broker represents April 10, 2022 at precisely 7AM as:

{
  "@data-type": "timestamp",
  "data": "2022-04-10T07:00:00.000"
}
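In Python, the standard datetime module can produce and parse this format directly. The sketch below assumes naive datetime objects (no time zone attached), because the format above carries no UTC offset; the function names are hypothetical.

```python
from datetime import datetime


def encode_timestamp(dt):
    """Encode a naive datetime as a Broker timestamp with millisecond precision."""
    if dt.tzinfo is not None:
        # Assumption of this sketch: only naive datetimes are accepted.
        raise ValueError("expected a naive datetime without tzinfo")
    return {"@data-type": "timestamp", "data": dt.isoformat(timespec="milliseconds")}


def decode_timestamp(text):
    """Parse a YYYY-MM-DDThh:mm:ss.sss string into a naive datetime."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%f")


print(encode_timestamp(datetime(2022, 4, 10, 7)))
```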

5.1.3.8. String

Strings simply map to JSON strings, e.g.:

{
  "@data-type": "string",
  "data": "Hello World!"
}

5.1.3.9. Enum Value

Broker internally represents enumeration values as strings. Hence, this type also maps to JSON strings:

{
  "@data-type": "enum-value",
  "data": "foo"
}

5.1.3.10. Address

Network addresses are encoded as strings and use the IETF-recommended string format for IPv4 and IPv6 addresses, respectively. For example:

{
  "@data-type": "address",
  "data": "2001:db8::"
}

5.1.3.11. Subnet

Network subnets are encoded in strings with “slash notation”, i.e., <address>/<prefix-length>. For example:

{
  "@data-type": "subnet",
  "data": "255.255.255.0/24"
}
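Python's standard ipaddress module understands both notations, so a client can validate addresses and subnets before sending them. This is a sketch with hypothetical helper names. Rejecting subnets with non-zero host bits (strict=True) is a choice made here, not a rule stated above.

```python
import ipaddress


def encode_address(text):
    """Validate an IPv4 or IPv6 address and encode it as a Broker address."""
    addr = ipaddress.ip_address(text)  # raises ValueError on malformed input
    return {"@data-type": "address", "data": str(addr)}


def encode_subnet(text):
    """Validate <address>/<prefix-length> and encode it as a Broker subnet."""
    # strict=True rejects host bits set below the prefix (sketch assumption).
    net = ipaddress.ip_network(text, strict=True)
    return {"@data-type": "subnet", "data": str(net)}


print(encode_address("2001:db8::"))
print(encode_subnet("255.255.255.0/24"))
```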

5.1.3.12. Port

Ports are rendered as strings with the format <port-number>/<protocol>, where <port-number> is a 16-bit unsigned integer and <protocol> is one of tcp, udp, icmp, or ?. For example:

{
  "@data-type": "port",
  "data": "8080/tcp"
}
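A receiving client typically needs the two parts separately. The following Python sketch splits a port string and checks both parts against the constraints given above; the function name parse_port is hypothetical.

```python
_PROTOCOLS = {"tcp", "udp", "icmp", "?"}


def parse_port(text):
    """Split '<port-number>/<protocol>' into (number, protocol)."""
    number, sep, protocol = text.partition("/")
    if not sep or protocol not in _PROTOCOLS:
        raise ValueError(f"invalid protocol in port: {text!r}")
    if not (number.isascii() and number.isdigit()):
        raise ValueError(f"invalid port number in port: {text!r}")
    value = int(number)
    if value > 0xFFFF:  # must fit into a 16-bit unsigned integer
        raise ValueError(f"port number out of range: {text!r}")
    return value, protocol


print(parse_port("8080/tcp"))
```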

5.1.3.13. Vector

A vector is a sequence of data. This maps to a JSON array consisting of JSON objects (that in turn each have the @data-type and data keys again). For example:

{
  "@data-type": "vector",
  "data": [
    {
      "@data-type": "count",
      "data": 42
    },
    {
      "@data-type": "integer",
      "data": 23
    }
  ]
}

5.1.3.14. Set

Sets are similar to vectors, but each object in the list may appear only once. For example:

{
  "@data-type": "set",
  "data": [
    {
      "@data-type": "string",
      "data": "foo"
    },
    {
      "@data-type": "string",
      "data": "bar"
    }
  ]
}
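The difference between the two container types can be sketched in Python as follows. Both helpers take elements that are already encoded as @data-type/data objects. The helper names are hypothetical, and silently dropping duplicates (rather than raising an error) is a choice of this sketch.

```python
def encode_vector(items):
    """Encode already-encoded elements as a Broker vector; order and duplicates are kept."""
    return {"@data-type": "vector", "data": list(items)}


def encode_set(items):
    """Encode already-encoded elements as a Broker set, keeping each element once."""
    unique = []
    for item in items:
        # The elements are dicts and thus unhashable, so compare by equality.
        if item not in unique:
            unique.append(item)
    return {"@data-type": "set", "data": unique}


foo = {"@data-type": "string", "data": "foo"}
bar = {"@data-type": "string", "data": "bar"}
print(encode_vector([foo, bar, foo]))
print(encode_set([foo, bar, foo]))
```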

5.1.3.15. Table

Since Broker allows arbitrary types for the key (even a nested table), Broker cannot render tables as JSON objects. Hence, tables are mapped to JSON arrays of key-value pairs, i.e., JSON objects with key and value. For example:

{
  "@data-type": "table",
  "data": [
    {
      "key": {
        "@data-type": "string",
        "data": "first-name"
      },
      "value": {
        "@data-type": "string",
        "data": "John"
      }
    },
    {
      "key": {
        "@data-type": "string",
        "data": "last-name"
      },
      "value": {
        "@data-type": "string",
        "data": "Doe"
      }
    }
  ]
}
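The table above can be reproduced from a list of key-value pairs with a short Python sketch. The helper names string and encode_table are hypothetical; keys and values are passed in already encoded, so any Broker type (including a nested table) can serve as a key.

```python
def string(text):
    """Encode a Python str as a Broker string."""
    return {"@data-type": "string", "data": text}


def encode_table(entries):
    """Encode an iterable of (encoded_key, encoded_value) pairs as a Broker table."""
    return {
        "@data-type": "table",
        "data": [{"key": key, "value": value} for key, value in entries],
    }


table = encode_table([
    (string("first-name"), string("John")),
    (string("last-name"), string("Doe")),
])
print(table)
```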

5.1.4. Data Messages

Represents a user-defined message with topic and data.

A data message consists of these keys:

type

Always data-message.

topic

The Broker topic for the message. A client only receives messages whose topic matches one of its subscriptions.

@data-type

Meta field that encodes how to parse the data field (see Data Representation).

data

Contains the actual payload of the message.

Example:

{
  "type": "data-message",
  "topic": "/foo/bar",
  "@data-type": "count",
  "data": 1
}
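Since a data message is an encoded value with two extra keys, building one in Python amounts to merging dictionaries. The sketch below uses the hypothetical helper name data_message and produces the JSON text that a client would send over the WebSocket connection.

```python
import json


def data_message(topic, value):
    """Combine a topic with an encoded value ({"@data-type": ..., "data": ...})."""
    return {
        "type": "data-message",
        "topic": topic,
        "@data-type": value["@data-type"],
        "data": value["data"],
    }


message = data_message("/foo/bar", {"@data-type": "count", "data": 1})
print(json.dumps(message, indent=2))
```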

5.1.5. Error Messages

The error messages on the WebSocket connection give feedback to the client if the server discarded malformed input from the client or if there has been an error while processing the JSON text.

An error message consists of these keys:

type

Always error.

code

A string representation of one of Broker’s error codes. See Section 2.1.5.

context

A string that gives additional information as to what went wrong.

For example, sending the server How is it going? instead of a valid data message would cause it to send this error back to the client:

{
  "type": "error",
  "code": "deserialization_failed",
  "context": "input #1 contained malformed JSON -> caf::pec::unexpected_character(1, 1)"
}
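After the handshake, a client therefore has to tell two kinds of incoming messages apart by their type key. A minimal Python sketch of that dispatch, with the hypothetical function name classify, could look like this:

```python
import json


def classify(text):
    """Parse a server message and return a tuple tagged 'error' or 'data'."""
    msg = json.loads(text)
    kind = msg.get("type") if isinstance(msg, dict) else None
    if kind == "error":
        return ("error", msg["code"], msg["context"])
    if kind == "data-message":
        value = {"@data-type": msg["@data-type"], "data": msg["data"]}
        return ("data", msg["topic"], value)
    raise ValueError(f"unexpected message type: {kind!r}")


incoming = '{"type": "error", "code": "deserialization_failed", "context": "malformed JSON"}'
print(classify(incoming))
```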

5.1.6. Encoding of Zeek Events

Broker encodes Zeek events as nested vectors using the following structure: [<format-nr>, <type>, [<name>, <args>, <metadata (optional)>]]:

format-nr

A count denoting the format version. Currently, this is always 1.

type

A count denoting the encoded Zeek message type. For events, this is always 1. Other message types in Zeek are currently not safe for 3rd-party use.

name

Identifies the Zeek event.

args

Contains the arguments for the event in the form of another vector.

metadata

Contains a vector of key-value pairs (represented as further vectors of size 2) for which the first element is a count for identification purposes and the second element is any supported Broker data type. This vector can be used to attach arbitrary metadata to events.

Zeek version 6.0 and up always includes the network time of an event as metadata. The key for a network timestamp is 1 and the data type for the value is a timestamp.

Broker endpoints are free to use counts starting with 200 to identify and exchange metadata of their own choosing. Within a network of Broker nodes, individual endpoints need to agree on the meaning and type of metadata attached to events.

For example, an event called event_1 that has been published to topic /foo/bar with an integer argument 42 and a string argument test without attached metadata would be rendered as:

{
  "type": "data-message",
  "topic": "/foo/bar",
  "@data-type": "vector",
  "data": [
    {
      "@data-type": "count",
      "data": 1
    },
    {
      "@data-type": "count",
      "data": 1
    },
    {
      "@data-type": "vector",
      "data": [
        {
          "@data-type": "string",
          "data": "event_1"
        },
        {
          "@data-type": "vector",
          "data": [
            {
              "@data-type": "integer",
              "data": 42
            },
            {
              "@data-type": "string",
              "data": "test"
            }
          ]
        }
      ]
    }
  ]
}

An event that includes NetworkTimestamp metadata renders as follows, with the args vector followed by another vector containing the network timestamp of the event:

{
  "type": "data-message",
  "topic": "/foo/bar",
  "@data-type": "vector",
  "data": [
    {
      "@data-type": "count",
      "data": 1
    },
    {
      "@data-type": "count",
      "data": 1
    },
    {
      "@data-type": "vector",
      "data": [
        {
          "@data-type": "string",
          "data": "event_1"
        },
        {
          "@data-type": "vector",
          "data": [
            {
              "@data-type": "integer",
              "data": 42
            },
            {
              "@data-type": "string",
              "data": "test"
            }
          ]
        },
        {
          "@data-type": "vector",
          "data": [
            {
              "@data-type": "vector",
              "data": [
                {
                  "@data-type": "count",
                  "data": 1
                },
                {
                  "@data-type": "timestamp",
                  "data": "2023-04-18T14:13:14.000"
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}
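The nesting in both examples follows mechanically from the structure [<format-nr>, <type>, [<name>, <args>, <metadata (optional)>]]. The Python sketch below builds such a message. All helper names are hypothetical, and the only metadata it supports is the network timestamp under key 1.

```python
def count(n):
    return {"@data-type": "count", "data": n}


def string(text):
    return {"@data-type": "string", "data": text}


def vector(items):
    return {"@data-type": "vector", "data": list(items)}


def zeek_event(topic, name, args, network_time=None):
    """Build a data message that carries a Zeek event.

    args must already be encoded Broker values; network_time is an optional
    timestamp string in the YYYY-MM-DDThh:mm:ss.sss format.
    """
    event = [string(name), vector(args)]
    if network_time is not None:
        # Metadata is a vector of [key, value] pairs; key 1 is the network time.
        stamp = {"@data-type": "timestamp", "data": network_time}
        event.append(vector([vector([count(1), stamp])]))
    # Format number 1, message type 1 (event), then the event itself.
    payload = vector([count(1), count(1), vector(event)])
    return {"type": "data-message", "topic": topic, **payload}


args = [{"@data-type": "integer", "data": 42}, string("test")]
plain = zeek_event("/foo/bar", "event_1", args)
stamped = zeek_event("/foo/bar", "event_1", args, "2023-04-18T14:13:14.000")
```

Here, plain corresponds to the first example and stamped to the second one.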