3. Data Model

Broker offers a data model that is rich in types and closely modeled after Zeek. Both endpoints and data stores use the data abstraction as their basic building block: a type-erased variant structure that can hold values of many different types.

There exists a total ordering on data, induced first by the type discriminator and then by the value domain. For example, an integer will always be smaller than a count. While a meaningful ordering exists only when comparing two values of the same type, the total ordering makes it possible to use data as a key in associative containers.
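The same "discriminator first, value second" rule can be observed with std::variant from the standard library, whose relational operators compare the active index before the contained value. The following is only a sketch under that analogy: the value alias and the sorted helper are hypothetical names, and the order of alternatives is an assumption chosen to match the integer/count example above, not Broker's actual definition of data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for data. The order of alternatives is an assumption.
using value = std::variant<std::monostate, bool, std::int64_t, std::uint64_t,
                           double, std::string>;

// std::variant compares the alternative index first and the contained value
// second, which yields a total ordering across mixed types.
inline bool less(const value& lhs, const value& rhs) {
  return lhs < rhs;
}

// Sorting a mixed sequence groups values by type, then orders within a type.
inline std::vector<value> sorted(std::vector<value> xs) {
  std::sort(xs.begin(), xs.end());
  return xs;
}
```

With this arrangement, less(value{std::int64_t{1000}}, value{std::uint64_t{1}}) holds even though 1000 is numerically larger, because the signed alternative comes first in the list.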

3.1. Types

3.1.1. None

The none type has exactly one value: nil. A default-constructed instance of data is of type none. One can use this value to represent optional or invalid data.

3.1.2. Arithmetic

The following types have arithmetic behavior.

3.1.2.1. Boolean

The type boolean can take on exactly two values: true and false. A boolean is a type alias for bool.

3.1.2.2. Count

A count is a 64-bit unsigned integer and type alias for uint64_t.

3.1.2.3. Integer

An integer is a 64-bit signed integer and type alias for int64_t.

3.1.2.4. Real

A real is an IEEE 754 double-precision floating point value, also commonly known as double.

3.1.3. Time

Broker offers two data types for expressing time: timespan and timestamp.

Both types seamlessly interoperate with the C++ standard library time facilities. In fact, they are concrete specializations of the time types in std::chrono:

using clock = std::chrono::system_clock;
using timespan = std::chrono::duration<int64_t, std::nano>;
using timestamp = std::chrono::time_point<clock, timespan>;

3.1.3.1. Timespan

A timespan represents a relative time duration in nanoseconds. Because the internal representation is a 64-bit signed integer, a timespan can cover approximately 292 years in either direction.
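The 292-year figure can be checked with std::chrono alone. The sketch below repeats the timespan alias from above; max_timespan_years is a hypothetical helper, and the year length of 31556952 seconds (an average Gregorian year) is an assumption made for the calculation.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <ratio>

using timespan = std::chrono::duration<std::int64_t, std::nano>;

// Hypothetical helper: the largest representable timespan in whole years.
// Assumes an average Gregorian year of 31556952 seconds.
inline std::int64_t max_timespan_years() {
  using years = std::chrono::duration<std::int64_t, std::ratio<31556952>>;
  return std::chrono::duration_cast<years>(timespan::max()).count();
}

// Coarser std::chrono durations convert to timespan implicitly and losslessly.
inline timespan to_timespan(std::chrono::seconds s) {
  return s;
}
```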

3.1.3.2. Timestamp

A timestamp represents an absolute point in time. The frame of reference for a timestamp is the UNIX epoch, January 1, 1970. That is, a timestamp is simply an anchored timespan. The function now() returns the current wall-clock time as a timestamp.
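A minimal sketch of the "anchored timespan" idea, using only std::chrono and the aliases shown earlier. The namespace sketch, the helper from_epoch, and this version of now() are hypothetical stand-ins rather than Broker's own functions. The sketch also assumes that std::chrono::system_clock counts from the UNIX epoch, which C++20 guarantees and earlier implementations follow in practice.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

namespace sketch {

using clock = std::chrono::system_clock;
using timespan = std::chrono::duration<std::int64_t, std::nano>;
using timestamp = std::chrono::time_point<clock, timespan>;

// Hypothetical helper: anchors a timespan at the epoch to form a timestamp.
inline timestamp from_epoch(timespan since_epoch) {
  return timestamp{since_epoch};
}

// Hypothetical stand-in for now(): the current wall-clock time, converted to
// nanosecond resolution.
inline timestamp now() {
  return std::chrono::time_point_cast<timespan>(clock::now());
}

} // namespace sketch
```

Subtracting two timestamps yields a timespan again, so from_epoch(std::chrono::seconds{1}) minus from_epoch(timespan{0}) compares equal to one second.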

3.1.4. String

Broker directly supports std::string as one possible type of data.

3.1.4.1. Enum Value

An enum_value wraps enum types defined by Zeek by storing the enum value’s name as a std::string. The receiver is responsible for knowing how to map the name to the actual numeric value if it needs that information.
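The receiver-side mapping could look roughly like the following sketch. Everything in it is hypothetical: enum_value_sketch is a simplified stand-in that only stores the name, and the Color entries are invented for illustration rather than taken from Zeek.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for enum_value: only the name is stored.
struct enum_value_sketch {
  std::string name;
};

// Receiver-side lookup from a name to a numeric value. The table entries are
// made up; a real receiver would use its own knowledge of the enum type.
inline std::optional<std::int64_t> to_numeric(const enum_value_sketch& e) {
  static const std::map<std::string, std::int64_t> known = {
    {"Color::RED", 0}, {"Color::GREEN", 1}, {"Color::BLUE", 2}};
  auto it = known.find(e.name);
  if (it == known.end())
    return std::nullopt; // unknown name: the receiver cannot map it
  return it->second;
}
```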

3.1.5. Networking

Broker comes with a few custom types from the networking domain.

3.1.5.1. Address

The type address is an IP address, which holds either an IPv4 or IPv6 address. One can construct an address from a byte sequence, along with specifying the byte order and address family. An address can be masked by zeroing a given number of bottom bits.
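Masking by zeroing bottom bits can be illustrated on a plain 32-bit IPv4 address. This is a sketch only: mask_bottom_bits is a hypothetical free function operating on an integer in host byte order, not the interface of the address type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: zeroes the lowest `bottom_bits` bits of an IPv4
// address given as a 32-bit integer in host byte order.
inline std::uint32_t mask_bottom_bits(std::uint32_t addr, unsigned bottom_bits) {
  if (bottom_bits >= 32)
    return 0; // shifting by the full width would be undefined behavior
  const auto keep = static_cast<std::uint32_t>(~std::uint32_t{0} << bottom_bits);
  return addr & keep;
}
```

For instance, zeroing the bottom 8 bits of 192.168.1.77 (0xC0A8014D) gives 192.168.1.0 (0xC0A80100).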

3.1.5.2. Subnet

A subnet represents an IP prefix in CIDR notation. It consists of two components: a network address and a prefix length.
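To make the two components concrete, the sketch below models an IPv4-only prefix and a membership test. The names subnet_sketch, prefix_mask, and contains are hypothetical and chosen for this example; they do not describe the member functions of the subnet type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical IPv4-only stand-in for a subnet: network address plus prefix
// length, with the address stored in host byte order.
struct subnet_sketch {
  std::uint32_t network;
  unsigned length;
};

// Builds a mask with the top `length` bits set.
inline std::uint32_t prefix_mask(unsigned length) {
  if (length == 0)
    return 0;
  if (length >= 32)
    return ~std::uint32_t{0};
  return static_cast<std::uint32_t>(~std::uint32_t{0} << (32 - length));
}

// An address belongs to the subnet if it agrees with the network address on
// all prefix bits.
inline bool contains(const subnet_sketch& s, std::uint32_t addr) {
  const auto mask = prefix_mask(s.length);
  return (addr & mask) == (s.network & mask);
}
```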

3.1.5.3. Port

A port represents a transport-level port number. Besides TCP and UDP ports, there is a concept of an ICMP “port” where the source port is the ICMP message type and the destination port the ICMP message code.
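The ICMP convention can be sketched as follows. The types protocol, port_sketch, and icmp_ports as well as the helper make_icmp_ports are hypothetical names for this illustration and are not taken from Broker's port interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for a port and its transport protocol.
enum class protocol : std::uint8_t { unknown, tcp, udp, icmp };

struct port_sketch {
  std::uint16_t number;
  protocol proto;
};

struct icmp_ports {
  port_sketch source;      // carries the ICMP message type
  port_sketch destination; // carries the ICMP message code
};

// Packs an ICMP type/code pair into a source/destination "port" pair.
inline icmp_ports make_icmp_ports(std::uint8_t type, std::uint8_t code) {
  return {{type, protocol::icmp}, {code, protocol::icmp}};
}
```

An ICMP echo request (type 8, code 0) would then appear as source port 8 and destination port 0.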

3.1.6. Containers

Broker features the following container types: vector, set, and table.

3.1.6.1. Vector

A vector is a sequence of data.

It is a type alias for std::vector<data>.

3.1.6.2. Set

A set is a mathematical set with elements of type data. Any given data value can occur at most once in a set.

It is a type alias for std::set<data>.

3.1.6.3. Table

A table is an associative array with keys and values of type data. That is, it maps data to data.

It is a type alias for std::map<data, data>.
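The behavior of the three containers follows directly from their standard library counterparts. The sketch below uses a simplified, non-recursive element type in place of data, so containers cannot nest here; element and the *_sketch aliases are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <variant>
#include <vector>

// Hypothetical, non-recursive stand-in for data.
using element = std::variant<std::monostate, bool, std::int64_t, std::uint64_t,
                             double, std::string>;

using vector_sketch = std::vector<element>;
using set_sketch = std::set<element>;
using table_sketch = std::map<element, element>;

// A set keeps each value at most once, even across mixed types.
inline set_sketch make_set() {
  set_sketch s;
  s.insert(element{std::int64_t{1}});
  s.insert(element{std::int64_t{1}}); // duplicate, ignored
  s.insert(element{std::string{"one"}});
  return s;
}

// A table maps keys of any element type to values of any element type.
inline table_sketch make_table() {
  table_sketch t;
  t[element{std::string{"answer"}}] = element{std::int64_t{42}};
  t[element{std::uint64_t{80}}] = element{std::string{"http"}};
  return t;
}
```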

3.2. Interface

The data abstraction offers two ways of interacting with the contained type instance:

  1. Querying a specific type T. Similar to C++17’s std::variant, the function get_if<T> returns a T* if the contained type is T and nullptr otherwise:

    auto x = data{...};
    if (auto i = get_if<integer>(x))
      f(*i); // safe use of x
    

    Alternatively, the function get<T> returns a reference of type T& or const T&, based on whether the given data argument is const-qualified:

    auto x = data{...};
    auto& str = get<std::string>(x); // throws std::bad_cast on type clash
    f(str); // safe use of x
    
  2. Applying a visitor. Since data is a variant type, one can apply a visitor to it, i.e., dispatch a function call based on the type discriminator to the active type. A visitor is a polymorphic function object with overloaded operator() and a result_type type alias:

    struct visitor {
      using result_type = void;
    
      template <class T>
      result_type operator()(const T&) const {
        std::cout << ":-(" << std::endl;
      }
    
      result_type operator()(real r) const {
        std::cout << r << std::endl;
      }
    
      result_type operator()(integer i) const {
        std::cout << i << std::endl;
      }
    };
    
    auto x = data{42};
    visit(visitor{}, x); // prints 42
    x = 4.2;
    visit(visitor{}, x); // prints 4.2
    x = "42";
    visit(visitor{}, x); // prints :-(
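Since the interface mirrors std::variant, both access styles can be tried out with the standard library alone. The following sketch is an analogy, not Broker code: value, integer_or, describe, and describe_value are hypothetical names. Note also that std::get on a std::variant throws std::bad_variant_access on a type clash, which differs from the std::bad_cast mentioned above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <variant>

// Hypothetical stand-in for data with a reduced set of alternatives.
using value = std::variant<std::monostate, bool, std::int64_t, double,
                           std::string>;

// Querying a specific type: std::get_if yields a pointer to the contained
// value if the type matches and nullptr otherwise.
inline std::int64_t integer_or(const value& x, std::int64_t fallback) {
  if (auto i = std::get_if<std::int64_t>(&x))
    return *i;
  return fallback;
}

// Visitor with dedicated overloads for two types and a catch-all template.
struct describe {
  std::string operator()(std::int64_t i) const {
    return "integer " + std::to_string(i);
  }
  std::string operator()(double) const {
    return "real";
  }
  template <class T>
  std::string operator()(const T&) const {
    return ":-(";
  }
};

// Dispatches to the overload matching the active alternative.
inline std::string describe_value(const value& x) {
  return std::visit(describe{}, x);
}
```

The non-template overloads win for exact matches, so only types without a dedicated overload fall through to the catch-all.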