The Basics

Understanding Scripts

Zeek includes an event-driven scripting language that provides the primary means for an organization to extend and customize Zeek’s functionality. Virtually all of the output generated by Zeek is, in fact, generated by Zeek scripts. It’s almost easier to think of Zeek as an entity that processes connections and generates events behind the scenes, while Zeek’s scripting language is the medium through which we mere mortals can communicate with it. Zeek scripts effectively tell Zeek that, should an event of a type we define occur, we want the information about the connection so we can perform some function on it. For example, the ssl.log file is generated by a Zeek script that walks the entire certificate chain and issues notifications if any of the steps along the certificate chain are invalid. This entire process is set up by telling Zeek that, should it see a server or client issue an SSL HELLO message, we want the information about that connection.

It’s often easiest to understand Zeek’s scripting language by looking at a complete script and breaking it down into its identifiable components. In this example, we’ll take a look at how Zeek checks the SHA1 hash of various files extracted from network traffic against the Team Cymru Malware Hash Registry. The registry supports a host lookup on a domain with the format <MALWARE_HASH>.malware.hash.cymru.com where <MALWARE_HASH> is the SHA1 hash of a file. Team Cymru also populates the TXT record of their DNS responses with both a “first seen” timestamp and a numerical “detection rate”. The important aspect to understand is that Zeek already generates hashes for files via the Files framework, but it is the script policy/frameworks/files/detect-MHR.zeek that is responsible for generating the appropriate DNS lookup, parsing the response, and generating a notice if appropriate.

detect-MHR.zeek
##! Detect file downloads that have hash values matching files in Team
##! Cymru's Malware Hash Registry (http://www.team-cymru.org/Services/MHR/).

@load base/frameworks/files
@load base/frameworks/notice
@load frameworks/files/hash-all-files

module TeamCymruMalwareHashRegistry;

export {
    redef enum Notice::Type += {
        ## The hash value of a file transferred over HTTP matched in the
        ## malware hash registry.
        Match
    };

    ## File types to attempt matching against the Malware Hash Registry.
    option match_file_types = /application\/x-dosexec/ |
                             /application\/vnd.ms-cab-compressed/ |
                             /application\/pdf/ |
                             /application\/x-shockwave-flash/ |
                             /application\/x-java-applet/ |
                             /application\/jar/ |
                             /video\/mp4/;

    ## The Match notice has a sub message with a URL where you can get more
    ## information about the file. The %s will be replaced with the SHA-1
    ## hash of the file.
    option match_sub_url = "https://www.virustotal.com/en/search/?query=%s";

    ## The malware hash registry runs each malware sample through several
    ## A/V engines.  Team Cymru returns a percentage to indicate how
    ## many A/V engines flagged the sample as malicious. This threshold
    ## allows you to require a minimum detection rate.
    option notice_threshold = 10;
}

function do_mhr_lookup(hash: string, fi: Notice::FileInfo)
    {
    local hash_domain = fmt("%s.malware.hash.cymru.com", hash);

    when ( local MHR_result = lookup_hostname_txt(hash_domain) )
        {
        # Data is returned as "<dateFirstDetected> <detectionRate>"
        local MHR_answer = split_string1(MHR_result, / /);

        if ( |MHR_answer| == 2 )
            {
            local mhr_detect_rate = to_count(MHR_answer[1]);

            if ( mhr_detect_rate >= notice_threshold )
                {
                local mhr_first_detected = double_to_time(to_double(MHR_answer[0]));
                local readable_first_detected = strftime("%Y-%m-%d %H:%M:%S", mhr_first_detected);
                local message = fmt("Malware Hash Registry Detection rate: %d%%  Last seen: %s", mhr_detect_rate, readable_first_detected);
                local virustotal_url = fmt(match_sub_url, hash);
                # We don't have the full fa_file record here in order to
                # avoid the "when" statement cloning it (expensive!).
                local n: Notice::Info = Notice::Info($note=Match, $msg=message, $sub=virustotal_url);
                Notice::populate_file_info2(fi, n);
                NOTICE(n);
                }
            }
        }
    }

event file_hash(f: fa_file, kind: string, hash: string)
    {
    if ( kind == "sha1" && f?$info && f$info?$mime_type &&
         match_file_types in f$info$mime_type )
        do_mhr_lookup(hash, Notice::create_file_info(f));
    }

Visually, there are three distinct sections of the script. First, there is a base level with no indentation where libraries are included in the script through @load and a namespace is defined with module. This is followed by an indented and formatted section explaining the custom variables being provided (export) as part of the script’s namespace. Finally, there is a second indented and formatted section describing the instructions to take for a specific event (the do_mhr_lookup helper function and the file_hash event handler). Don’t get discouraged if you don’t understand every section of the script; we’ll cover the basics of the script and much more in following sections.

detect-MHR.zeek
@load base/frameworks/files
@load base/frameworks/notice
@load frameworks/files/hash-all-files

The first part of the script consists of @load directives. When a directive names a directory, Zeek processes the __load__.zeek script in that directory. Including @load directives is often considered good practice, or even just good manners, when writing Zeek scripts, because it makes sure the scripts can be used on their own. While it’s unlikely that these additional resources wouldn’t already be loaded in a full production deployment of Zeek, it’s not a bad habit to get into as you gain experience with Zeek scripting. If you’re just starting out, this level of granularity might not be entirely necessary. Here, the @load directives ensure that the Files framework, the Notice framework and the script to hash all files have been loaded by Zeek.

detect-MHR.zeek
export {
    redef enum Notice::Type += {
        ## The hash value of a file transferred over HTTP matched in the
        ## malware hash registry.
        Match
    };

    ## File types to attempt matching against the Malware Hash Registry.
    option match_file_types = /application\/x-dosexec/ |
                             /application\/vnd.ms-cab-compressed/ |
                             /application\/pdf/ |
                             /application\/x-shockwave-flash/ |
                             /application\/x-java-applet/ |
                             /application\/jar/ |
                             /video\/mp4/;

    ## The Match notice has a sub message with a URL where you can get more
    ## information about the file. The %s will be replaced with the SHA-1
    ## hash of the file.
    option match_sub_url = "https://www.virustotal.com/en/search/?query=%s";

    ## The malware hash registry runs each malware sample through several
    ## A/V engines.  Team Cymru returns a percentage to indicate how
    ## many A/V engines flagged the sample as malicious. This threshold
    ## allows you to require a minimum detection rate.
    option notice_threshold = 10;
}

The export section redefines an enumerable constant that describes the type of notice we will generate with the Notice framework. Zeek allows for re-definable constants, which at first might seem counter-intuitive. We’ll get more in-depth with constants in a later chapter; for now, think of them as variables that can only be altered before Zeek starts running. Extending Notice::Type as shown allows the NOTICE function to generate notices with a $note field set to TeamCymruMalwareHashRegistry::Match. Notices allow Zeek to generate some kind of extra notification beyond its default log types. Often, this extra notification comes in the form of an email generated and sent to a preconfigured address, but this can be altered depending on the needs of the deployment. The export section is finished off with the definition of a few configuration values, declared with the option keyword, that list the kinds of files we want to match against, the URL template used in the notice, and the minimum detection rate in which we are interested.
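As a minimal sketch of how one of these settings might be tuned, assuming the script is loaded from a site-specific file such as local.zeek, a redef can raise the detection threshold before Zeek starts. The value 25 is an arbitrary illustration, not a recommendation.

```zeek
# Load the detection script shown above, then override one of its options.
@load frameworks/files/detect-MHR

# Only raise a notice when the reported detection rate is at least 25
# (25 is an illustrative value chosen for this sketch).
redef TeamCymruMalwareHashRegistry::notice_threshold = 25;
```

Because the value is scoped with the module name, the same line works from any script that does not itself live in the TeamCymruMalwareHashRegistry module.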

Up until this point, the script has merely done some basic setup. With the next section, the script starts to define instructions to take in a given event.

detect-MHR.zeek
function do_mhr_lookup(hash: string, fi: Notice::FileInfo)
    {
    local hash_domain = fmt("%s.malware.hash.cymru.com", hash);

    when ( local MHR_result = lookup_hostname_txt(hash_domain) )
        {
        # Data is returned as "<dateFirstDetected> <detectionRate>"
        local MHR_answer = split_string1(MHR_result, / /);

        if ( |MHR_answer| == 2 )
            {
            local mhr_detect_rate = to_count(MHR_answer[1]);

            if ( mhr_detect_rate >= notice_threshold )
                {
                local mhr_first_detected = double_to_time(to_double(MHR_answer[0]));
                local readable_first_detected = strftime("%Y-%m-%d %H:%M:%S", mhr_first_detected);
                local message = fmt("Malware Hash Registry Detection rate: %d%%  Last seen: %s", mhr_detect_rate, readable_first_detected);
                local virustotal_url = fmt(match_sub_url, hash);
                # We don't have the full fa_file record here in order to
                # avoid the "when" statement cloning it (expensive!).
                local n: Notice::Info = Notice::Info($note=Match, $msg=message, $sub=virustotal_url);
                Notice::populate_file_info2(fi, n);
                NOTICE(n);
                }
            }
        }
    }

event file_hash(f: fa_file, kind: string, hash: string)
    {
    if ( kind == "sha1" && f?$info && f$info?$mime_type &&
         match_file_types in f$info$mime_type )
        do_mhr_lookup(hash, Notice::create_file_info(f));
    }

The workhorse of the script is contained in the event handler for file_hash. The file_hash event allows scripts to access the information associated with a file for which Zeek’s file analysis framework has generated a hash. The event handler is passed the file itself as f, the type of digest algorithm used as kind and the hash generated as hash.

In the file_hash event handler, there is an if statement that is used to check for the correct type of hash, in this case a SHA1 hash. It also checks for a mime type we’ve defined as being of interest as defined in the constant match_file_types. The comparison is made against the expression f$info$mime_type, which uses the $ dereference operator to check the value mime_type inside the variable f$info. If the entire expression evaluates to true, then a helper function is called to do the rest of the work. In that function, a local variable is defined to hold a string comprised of the SHA1 hash concatenated with .malware.hash.cymru.com; this value will be the domain queried in the malware hash registry.

The rest of the script is contained within a when block. In short, a when block is used when Zeek needs to perform asynchronous actions, such as a DNS lookup, to ensure that performance isn’t affected. The when block performs a DNS TXT lookup and stores the result in the local variable MHR_result. Effectively, processing for this event continues and, upon receipt of the values returned by lookup_hostname_txt, the when block is executed. The when block splits the returned string into the date on which the malware was first detected and the detection rate, by splitting the text on the first space and storing the resulting values in a local vector variable. In the do_mhr_lookup function, if the vector returned by split_string1 has two entries, indicating a successful split, we store the detection date in mhr_first_detected and the rate in mhr_detect_rate using the appropriate conversion functions. From this point on, Zeek knows it has seen a file transmitted whose hash is known to the Team Cymru Malware Hash Registry; the rest of the script is dedicated to producing a notice.
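The asynchronous pattern described above can be tried in isolation. The following is a minimal sketch, not part of detect-MHR.zeek: it resolves a placeholder hostname (example.com, chosen only for illustration) with lookup_hostname, adds a timeout branch, and uses exit_only_after_terminate so that Zeek keeps running without traffic until the lookup finishes.

```zeek
# Keep Zeek running until terminate() is called, so the lookup can finish
# even though no traffic is being read.
redef exit_only_after_terminate = T;

event zeek_init()
    {
    # example.com is a placeholder name chosen for this sketch.
    when ( local addrs = lookup_hostname("example.com") )
        {
        # This body runs later, once the DNS answer has arrived.
        for ( a in addrs )
            print fmt("resolved: %s", a);

        terminate();
        }
    timeout 5 sec
        {
        print "lookup timed out";
        terminate();
        }

    # This statement runs immediately; it does not wait for the lookup.
    print "lookup requested";
    }
```

The final print statement is there to make the ordering visible: the handler itself returns right away, and the when body only executes after the result is available or the timeout expires.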

The script then compares the detection rate against the notice_threshold that was defined earlier. If the detection rate is high enough, the detection time is converted into a string representation and stored in readable_first_detected, and the script creates a concise description of the notice and stores it in the message variable. It also creates a possible URL to check the sample against virustotal.com’s database, and makes the call to NOTICE to hand the relevant information off to the Notice framework.

In a few dozen lines of code, Zeek provides an amazing utility that would be incredibly difficult to implement and deploy with other products. In truth, claiming that Zeek does this in such a small number of lines is a misdirection; there is a truly massive number of things going on behind the scenes in Zeek, but it is the inclusion of the scripting language that gives analysts access to those underlying layers in a succinct and well-defined manner.

The Event Queue and Event Handlers

Zeek’s scripting language is event-driven, which is a gear change from the majority of scripting languages with which most users will have previous experience. Scripting in Zeek depends on handling the events generated by Zeek as it processes network traffic, altering the state of data structures through those events, and making decisions on the information provided. This approach to scripting can often cause confusion for users who come to Zeek from a procedural or functional language, but once the initial shock wears off it becomes clearer with each exposure.

Zeek’s core places events into an ordered “event queue”, allowing event handlers to process them on a first-come, first-served basis. In effect, this is Zeek’s core functionality: without the scripts written to perform discrete actions on events, there would be little to no usable output. As such, a basic understanding of the event queue, the events being generated, and the way in which event handlers process those events is a basis not only for learning to write scripts for Zeek but also for understanding Zeek itself.
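The queueing behavior can also be observed from a script. The sketch below is an illustration under stated assumptions: greeting is a hypothetical user-defined event declared only for this example, and the event statement is used to place it on the queue from within zeek_init.

```zeek
# A hypothetical user-defined event, declared only for this sketch.
global greeting: event(msg: string);

event greeting(msg: string)
    {
    print fmt("handled: %s", msg);
    }

event zeek_init()
    {
    # The event statement places the event on the queue; its handler
    # runs later, once the current handler has returned.
    event greeting("hello from the queue");
    print "zeek_init finished";
    }
```

Because the greeting event is only queued at the point of the event statement, the line printed by zeek_init should appear before the line printed by the greeting handler.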

Gaining familiarity with the specific events generated by Zeek is a big step towards building a mindset for working with Zeek scripts. The majority of events generated by Zeek are defined in the built-in-function (*.bif) files, which also act as the basis for the online event documentation: the in-line comments in these files are compiled into an online documentation system using Zeekygen. Whether starting a script from scratch or reading and maintaining someone else’s script, the built-in event definitions are an excellent resource to have on hand. For the 2.0 release the Zeek developers put significant effort into the organization and documentation of every event. This effort resulted in built-in-function files organized such that each entry contains a descriptive event name, the arguments passed to the event, and a concise explanation of the event’s use.

## Generated for DNS requests. For requests with multiple queries, this event
## is raised once for each.
##
## See `Wikipedia <http://en.wikipedia.org/wiki/Domain_Name_System>`__ for more
## information about the DNS protocol. Zeek analyzes both UDP and TCP DNS
## sessions.
##
## c: The connection, which may be UDP or TCP depending on the type of the
##    transport-layer session being analyzed.
##
## msg: The parsed DNS message header.
##
## query: The queried name.
##
## qtype: The queried resource record type.
##
## qclass: The queried resource record class.
##
## .. zeek:see:: dns_AAAA_reply dns_A_reply dns_CNAME_reply dns_EDNS_addl
##    dns_HINFO_reply dns_MX_reply dns_NS_reply dns_PTR_reply dns_SOA_reply
##    dns_SRV_reply dns_TSIG_addl dns_TXT_reply dns_WKS_reply dns_end
##    dns_full_request dns_mapping_altered dns_mapping_lost_name dns_mapping_new_name
##    dns_mapping_unverified dns_mapping_valid dns_message dns_query_reply
##    dns_rejected non_dns_request dns_max_queries dns_session_timeout dns_skip_addl
##    dns_skip_all_addl dns_skip_all_auth dns_skip_auth
event dns_request%(c: connection, msg: dns_msg, query: string, qtype: count, qclass: count%);

Above is a segment of the documentation for the event dns_request. It’s organized such that the documentation, commentary, and list of arguments precede the actual event definition used by Zeek. As Zeek detects DNS requests being issued by an originator, it issues this event, and any number of scripts then have access to the data Zeek passes along with the event. In this example, Zeek passes not only the message header, the query, the query type and the query class for the DNS request, but also a record for the connection itself.
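To connect the definition above to a script, here is a minimal sketch of a handler for dns_request. It assumes the five-argument signature shown in the excerpt and simply prints who asked for what; the output format is an arbitrary choice made for this illustration.

```zeek
# Handle each DNS request Zeek observes. The argument list mirrors the
# event definition shown above.
event dns_request(c: connection, msg: dns_msg, query: string, qtype: count, qclass: count)
    {
    # c$id$orig_h is the originator of the connection carrying the request.
    print fmt("%s queried %s (qtype=%d, qclass=%d)",
              c$id$orig_h, query, qtype, qclass);
    }
```

Run against a trace that contains DNS traffic, a handler like this would be invoked once per query, as the in-line documentation describes.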

The Connection Record Data Type

Of all the events defined by Zeek, an overwhelmingly large number of them are passed the connection record data type, in effect, making it the backbone of many scripting solutions. The connection record itself, as we will see in a moment, is a mass of nested data types used to track state on a connection through its lifetime. Let’s walk through the process of selecting an appropriate event, generating some output to standard out and dissecting the connection record so as to get an overview of it. We will cover data types in more detail later.

While Zeek is capable of packet-level processing, its strengths lie in the context of a connection between an originator and a responder.

Note

Zeek’s notions of originator and responder aim to capture the natural roles of connection endpoints given the protocol information observed. They differ from the packet-level concepts of source and destination, as well as from higher-level abstractions such as client and server.

Zeek’s protocol analyzers determine originator and responder when establishing connection state, with the sender of the initial packet usually becoming the originator and the recipient becoming the responder. However, analyzers may subsequently flip the roles if protocol semantics suggest it. For example, in the presence of packet loss the first observed packet in a DNS transaction may indicate that it is in fact the response to a missing query. Zeek’s DNS analyzer will flip the endpoint roles, making the sender of this packet the connection’s responder.

Zeek defines events for the primary parts of the connection life-cycle, such as new_connection, connection_timeout, and connection_state_remove.

Of the events listed, the event that will give us the best insight into the connection record data type is connection_state_remove. As detailed in the in-line documentation, Zeek generates this event just before it removes the connection from memory, effectively forgetting about it. Let’s take a look at a simple example script that will output the connection record for a single connection.

connection_record_01.zeek
@load base/protocols/conn

event connection_state_remove(c: connection)
    {
    print c;
    }

Again, we start with @load, this time importing the base/protocols/conn scripts which supply the tracking and logging of general information and state of connections. We handle the connection_state_remove event and simply print the contents of the argument passed to it. For this example we’re going to run Zeek in “bare mode” which loads only the minimum number of scripts to retain operability and leaves the burden of loading required scripts to the script being run. While bare mode is a low level functionality incorporated into Zeek, in this case, we’re going to use it to demonstrate how different features of Zeek add more and more layers of information about a connection. This will give us a chance to see the contents of the connection record without it being overly populated.

$ zeek -b -r http/get.trace connection_record_01.zeek
[id=[orig_h=141.142.228.5, orig_p=59856/tcp, resp_h=192.150.187.43, resp_p=80/tcp], orig=[size=136, state=5, num_pkts=7, num_bytes_ip=512, flow_label=0, l2_addr=c8:bc:c8:96:d2:a0], resp=[size=5007, state=5, num_pkts=7, num_bytes_ip=5379, flow_label=0, l2_addr=00:10:db:88:d2:ef], start_time=1362692526.869344, duration=0.211484, service={

}, history=ShADadFf, uid=CHhAvVGS1DHFjwGM9, tunnel=<uninitialized>, vlan=<uninitialized>, inner_vlan=<uninitialized>, conn=[ts=1362692526.869344, uid=CHhAvVGS1DHFjwGM9, id=[orig_h=141.142.228.5, orig_p=59856/tcp, resp_h=192.150.187.43, resp_p=80/tcp], proto=tcp, service=<uninitialized>, duration=0.211484, orig_bytes=136, resp_bytes=5007, conn_state=SF, local_orig=<uninitialized>, local_resp=<uninitialized>, missed_bytes=0, history=ShADadFf, orig_pkts=7, orig_ip_bytes=512, resp_pkts=7, resp_ip_bytes=5379, tunnel_parents=<uninitialized>], extract_orig=F, extract_resp=F, thresholds=<uninitialized>]

As you can see from the output, the connection record is something of a jumble when printed on its own. Regularly taking a peek at a populated connection record helps to understand the relationship between its fields as well as allowing an opportunity to build a frame of reference for accessing data in a script.

Zeek makes extensive use of nested data structures to store state and information gleaned from the analysis of a connection as a complete unit. To break down this collection of information, you will have to make use of Zeek’s field delimiter $. For example, the originating host is referenced by c$id$orig_h, which reads as follows: orig_h is a member of id, which is a member of the data structure c that was passed into the event handler. Given that the responder port c$id$resp_p is 80/tcp, it’s likely that Zeek’s base HTTP scripts can further populate the connection record. Let’s load the base/protocols/http scripts and check the output of our script.

Zeek uses the dollar sign as its field delimiter and a direct correlation exists between the output of the connection record and the proper format of a dereferenced variable in scripts. In the output of the script above, groups of information are collected between brackets, which would correspond to the $-delimiter in a Zeek script.

connection_record_02.zeek
@load base/protocols/conn
@load base/protocols/http

event connection_state_remove(c: connection)
    {
    print c;
    }

$ zeek -b -r http/get.trace connection_record_02.zeek
[id=[orig_h=141.142.228.5, orig_p=59856/tcp, resp_h=192.150.187.43, resp_p=80/tcp], orig=[size=136, state=5, num_pkts=7, num_bytes_ip=512, flow_label=0, l2_addr=c8:bc:c8:96:d2:a0], resp=[size=5007, state=5, num_pkts=7, num_bytes_ip=5379, flow_label=0, l2_addr=00:10:db:88:d2:ef], start_time=1362692526.869344, duration=0.211484, service={

}, history=ShADadFf, uid=CHhAvVGS1DHFjwGM9, tunnel=<uninitialized>, vlan=<uninitialized>, inner_vlan=<uninitialized>, conn=[ts=1362692526.869344, uid=CHhAvVGS1DHFjwGM9, id=[orig_h=141.142.228.5, orig_p=59856/tcp, resp_h=192.150.187.43, resp_p=80/tcp], proto=tcp, service=<uninitialized>, duration=0.211484, orig_bytes=136, resp_bytes=5007, conn_state=SF, local_orig=<uninitialized>, local_resp=<uninitialized>, missed_bytes=0, history=ShADadFf, orig_pkts=7, orig_ip_bytes=512, resp_pkts=7, resp_ip_bytes=5379, tunnel_parents=<uninitialized>], extract_orig=F, extract_resp=F, thresholds=<uninitialized>, http=[ts=1362692526.939527, uid=CHhAvVGS1DHFjwGM9, id=[orig_h=141.142.228.5, orig_p=59856/tcp, resp_h=192.150.187.43, resp_p=80/tcp], trans_depth=1, method=GET, host=bro.org, uri=/download/CHANGES.bro-aux.txt, referrer=<uninitialized>, version=1.1, user_agent=Wget/1.14 (darwin12.2.0), request_body_len=0, response_body_len=4705, status_code=200, status_msg=OK, info_code=<uninitialized>, info_msg=<uninitialized>, tags={

}, username=<uninitialized>, password=<uninitialized>, capture_password=F, proxied=<uninitialized>, range_request=F, orig_fuids=<uninitialized>, orig_filenames=<uninitialized>, orig_mime_types=<uninitialized>, resp_fuids=[FakNcS1Jfe01uljb3], resp_filenames=<uninitialized>, resp_mime_types=[text/plain], current_entity=<uninitialized>, orig_mime_depth=1, resp_mime_depth=1], http_state=[pending={

}, current_request=1, current_response=1, trans_depth=1]]

The addition of the base/protocols/http scripts populates the http=[] member of the connection record. While Zeek is doing a massive amount of work in the background, it is in what is commonly called “scriptland” that details are being refined and decisions being made. Were we to continue running in “bare mode” we could slowly keep adding infrastructure through @load statements. For example, were we to @load base/frameworks/logging, Zeek would generate a conn.log and http.log for us in the current working directory. As mentioned above, including the appropriate @load statements is not only good practice, but can also help to indicate which functionalities are being used in a script. Take a second to run the script without the -b flag and check the output when all of Zeek’s functionality is applied to the trace file.
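Rather than printing the whole record, a script will usually pick out individual fields with the $ delimiter. The following is a minimal sketch along those lines; the field names are taken from the output above, and the ?$ checks are there because optional fields such as c$http may not be set for every connection.

```zeek
@load base/protocols/conn
@load base/protocols/http

event connection_state_remove(c: connection)
    {
    # Fields of the nested id record are reached with repeated $.
    print fmt("originator: %s, responder port: %s", c$id$orig_h, c$id$resp_p);

    # Optional fields are tested with ?$ before they are dereferenced.
    if ( c?$http && c$http?$host && c$http?$uri )
        print fmt("last HTTP request: %s%s", c$http$host, c$http$uri);
    }
```

It can be run the same way as the previous examples, with -b and -r http/get.trace, to compare the selected fields against the full record dump.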

Data Types and Data Structures

Scope

Before embarking on an exploration of Zeek’s native data types and data structures, it’s important to have a good grasp of the different levels of scope available in Zeek and the appropriate times to use them within a script. The declarations of variables in Zeek come in two forms. Variables can be declared without or with a definition, in the form SCOPE name: TYPE or SCOPE name = EXPRESSION respectively; both produce the same result if EXPRESSION evaluates to the same type as TYPE. The decision as to which type of declaration to use is likely to be dictated by personal preference and readability.

data_type_declaration.zeek
event zeek_init()
    {
    local a: int;
    a = 10;
    local b = 10;

    if ( a == b )
        print fmt("A: %d, B: %d", a, b);
    }

Global Variables

A global variable is used when the state of a variable needs to be tracked, not surprisingly, globally. While there are some caveats, when a script declares a variable using the global scope, that script is granting access to that variable from other scripts. However, when a script uses the module keyword to give the script a namespace, more care must be given to the declaration of globals to ensure the intended result. When a global is declared in a script with a namespace, there are two possible outcomes. First, the variable is available only within the context of the namespace. In this scenario, other scripts within the same namespace will have access to the variable declared, while scripts using a different namespace or no namespace at all will not have access to the variable. Alternatively, if a global variable is declared within an export { ... } block, that variable is available to any other script through the naming convention of <module name>::<variable name>, i.e. the variable needs to be “scoped” by the name of the module in which it was declared.

When the module keyword is used in a script, the variables declared are said to be in that module’s “namespace”. Whereas a global variable can be accessed by its name alone when it is not declared within a module, a global variable declared within a module must be exported and then accessed via <module name>::<variable name> by scripts outside that module.
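The two outcomes can be sketched in a single file. In the illustration below, the module name Example and the variables seen and note_text are hypothetical names invented for this sketch; module GLOBAL is used to switch back to the default namespace so that access from outside the module can be shown.

```zeek
module Example;

export {
    ## Exported: other scripts can refer to this as Example::seen.
    global seen: count = 0;
}

# Not exported: usable only by code inside the Example module.
global note_text = "connections seen";

event new_connection(c: connection)
    {
    # Inside the module, both globals are reachable by their bare names.
    ++seen;
    }

event zeek_done()
    {
    print fmt("%s: %d", note_text, seen);
    }

# Switch back to the default namespace to show access from outside.
module GLOBAL;

event zeek_done()
    {
    # Outside the module, the exported global must be scoped by name.
    print fmt("total: %d", Example::seen);
    }
```

Referring to note_text from the second zeek_done handler would not work, since that variable was declared outside the export block.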

Constants

Zeek also makes use of constants, which are denoted by the const keyword. Unlike globals, constants can only be set or altered at parse time, and only if the &redef attribute has been used. Afterwards (at runtime) the constants are unalterable. In most cases, re-definable constants are used in Zeek scripts as containers for configuration options. For example, the configuration option to log passwords decrypted from HTTP streams is stored in HTTP::default_capture_password as shown in the stripped-down excerpt from base/protocols/http/main.zeek below.

http_main.zeek
module HTTP;

export {
    ## This setting changes if passwords used in Basic-Auth are captured or
    ## not.
    const default_capture_password = F &redef;
}

Because the constant was declared with the &redef attribute, if we needed to turn this option on globally, we could do so by adding the following line to our site/local.zeek file before firing up Zeek.

data_type_const_simple.zeek
@load base/protocols/http

redef HTTP::default_capture_password = T;

While the idea of a re-definable constant might be odd, the constraint that constants can only be altered at parse-time remains even with the &redef attribute. In the code snippet below, a table of strings indexed by ports is declared as a constant before two values are added to the table through redef statements. The table is then printed in a zeek_init event. Were we to try to alter the table in an event handler, Zeek would notify the user of an error and the script would fail.

data_type_const.zeek
const port_list: table[port] of string &redef;

redef port_list += { [6666/tcp] = "IRC"};
redef port_list += { [80/tcp] = "WWW" };

event zeek_init()
    {
    print port_list;
    }

$ zeek -b data_type_const.zeek
{
[80/tcp] = WWW,
[6666/tcp] = IRC
}

Local Variables

Whereas globals and constants are widely available in scriptland through various means, when a variable is defined with a local scope, its availability is restricted to the body of the event or function in which it was declared. Local variables tend to be used for values that are only needed within a specific scope; once processing passes beyond that scope and the variable is no longer used, it is deleted. Zeek maintains names of locals separately from globally visible ones, an example of which is illustrated below.

data_type_local.zeek
function add_two(i: count): count
    {
    local added_two = i+2;
    print fmt("i + 2 = %d", added_two);
    return added_two;
    }

event zeek_init()
    {
    local test = add_two(10);
    }

The script executes the event handler zeek_init, which in turn calls the function add_two(i: count) with an argument of 10. Once Zeek enters the add_two function, it provisions a locally scoped variable called added_two to hold the value of i+2, in this case, 12. The add_two function then prints the value of the added_two variable and returns its value to the zeek_init event handler. At this point, the variable added_two has fallen out of scope and no longer exists, while the value 12 is still in use and stored in the locally scoped variable test. When Zeek finishes processing the zeek_init event handler, the variable called test is no longer in scope and, since there exist no other references to the value 12, the value is also deleted.

Data Structures

It’s difficult to talk about Zeek’s data types in a practical manner without first covering the data structures available in Zeek. Some of the more interesting characteristics of data types are revealed when used inside of a data structure, but given that data structures are made up of data types, it devolves rather quickly into a “chicken-and-egg” problem. As such, we’ll introduce data types from a bird’s eye view before diving into data structures and from there a more complete exploration of data types.

The table below shows the atomic types used in Zeek, of which the first four should seem familiar if you have some scripting experience, while the remaining six are less common in other languages. It should come as no surprise that a scripting language for a Network Security Monitoring platform has a fairly robust set of network-centric data types and taking note of them here may well save you a late night of reinventing the wheel.

| Data Type | Description |
|-----------|-------------|
| int       | 64 bit signed integer |
| count     | 64 bit unsigned integer |
| double    | double precision floating point |
| bool      | boolean (T/F) |
| addr      | IP address, IPv4 and IPv6 |
| port      | transport layer port |
| subnet    | CIDR subnet mask |
| time      | absolute epoch time |
| interval  | a time interval |
| pattern   | regular expression |

Sets

Sets in Zeek are used to store unique elements of the same data type. In essence, you can think of them as “a unique set of integers” or “a unique set of IP addresses”. While the declaration of a set may differ based on the data type being collected, the set will always contain unique elements and the elements in the set will always be of the same data type. Such requirements make the set data type perfect for information that is already naturally unique such as ports or IP addresses. The code snippet below shows both an explicit and implicit declaration of a locally scoped set.

data_struct_set_declaration.zeek
event zeek_init()
    {
    local ssl_ports: set[port];
    local non_ssl_ports = set( 23/tcp, 80/tcp, 143/tcp, 25/tcp );
    }

As you can see, sets are declared using the format SCOPE var_name: set[TYPE]. Adding and removing elements in a set is achieved using the add and delete statements. Once you have elements inserted into the set, it’s likely that you’ll need to either iterate over that set or test for membership within the set, both of which are covered by the in operator. In the case of iterating over a set, combining the for statement and the in operator will allow you to sequentially process each element of the set as seen below.

data_struct_set_declaration.zeek
    for ( i in ssl_ports )
        print fmt("SSL Port: %s", i);

    for ( i in non_ssl_ports )
        print fmt("Non-SSL Port: %s", i);

Here, the for statement loops over the contents of the set storing each element in the temporary variable i. With each iteration of the for loop, the next element is chosen. Since sets are not an ordered data type, you cannot guarantee the order of the elements as the for loop processes.

To test for membership in a set, the in operator can be combined with an if statement. If the exact element in the condition is already in the set, the condition evaluates to true and the body executes. The in operator can also be negated by the ! operator to create the inverse of the condition. While we could rewrite the corresponding line below as if ( !( 587/tcp in ssl_ports )), try to avoid using this construct; instead, negate the in operator itself with !in. While the functionality is the same, using !in is more efficient as well as a more natural construct, which will aid in the readability of your script.

data_struct_set_declaration.zeek
    # Check for SMTPS
    if ( 587/tcp !in ssl_ports )
        add ssl_ports[587/tcp];

You can see the full script and its output below.

data_struct_set_declaration.zeek
event zeek_init()
    {
    local ssl_ports: set[port];
    local non_ssl_ports = set( 23/tcp, 80/tcp, 143/tcp, 25/tcp );

    # SSH
    add ssl_ports[22/tcp];
    # HTTPS
    add ssl_ports[443/tcp];
    # IMAPS
    add ssl_ports[993/tcp];

    # Check for SMTPS
    if ( 587/tcp !in ssl_ports )
        add ssl_ports[587/tcp];

    for ( i in ssl_ports )
        print fmt("SSL Port: %s", i);

    for ( i in non_ssl_ports )
        print fmt("Non-SSL Port: %s", i);
    }
$ zeek data_struct_set_declaration.zeek
SSL Port: 22/tcp
SSL Port: 443/tcp
SSL Port: 587/tcp
SSL Port: 993/tcp
Non-SSL Port: 80/tcp
Non-SSL Port: 25/tcp
Non-SSL Port: 143/tcp
Non-SSL Port: 23/tcp
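
The full script exercises add but not the delete statement mentioned earlier. The following is a minimal sketch of removing an element and checking the size of a set; the variable name watched_ports is hypothetical and used only for illustration.

```zeek
event zeek_init()
    {
    # Hypothetical set used only for this illustration.
    local watched_ports: set[port] = set(22/tcp, 80/tcp, 443/tcp);

    # Remove a single element from the set.
    delete watched_ports[80/tcp];

    # Placing the set between vertical pipes yields its element count.
    print fmt("%d ports remain", |watched_ports|);

    # Confirm the element is gone using the negated in operator.
    if ( 80/tcp !in watched_ports )
        print "80/tcp is no longer in the set";
    }
```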

Tables

A table in Zeek is a mapping of a key to a value or yield. While the values don’t have to be unique, each key in the table must be unique so that every key maps to exactly one value.

data_struct_table_declaration.zeek
event zeek_init()
    {
    # Declaration of the table.
    local ssl_services: table[string] of port;

    # Initialize the table.
    ssl_services = table(["SSH"] = 22/tcp, ["HTTPS"] = 443/tcp);

    # Insert one key-value pair into the table.
    ssl_services["IMAPS"] = 993/tcp;

    # Check if the key "SMTPS" is not in the table.
    if ( "SMTPS" !in ssl_services )
        ssl_services["SMTPS"] = 587/tcp;

    # Iterate over each key in the table.
    for ( k in ssl_services )
        print fmt("Service Name:  %s - Common Port: %s", k, ssl_services[k]);
    }
$ zeek data_struct_table_declaration.zeek
Service Name:  SSH - Common Port: 22/tcp
Service Name:  HTTPS - Common Port: 443/tcp
Service Name:  SMTPS - Common Port: 587/tcp
Service Name:  IMAPS - Common Port: 993/tcp

In this example, we’ve compiled a table of SSL-enabled services and their common ports. The explicit declaration and constructor for the table are on two different lines and lay out the data types of the keys (strings) and the data types of the values (ports) and then fill in some sample key and value pairs. You can also use a table accessor to insert one key-value pair into the table. When using the in operator on a table, you are effectively working with the keys of the table. In the case of an if statement, the in operator will check for membership among the set of keys and return a true or false value. The example shows how to check if SMTPS is not in the set of keys for the ssl_services table and if the condition holds true, we add the key-value pair to the table. Finally, the example shows how to use a for statement to iterate over each key currently in the table.
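
Beyond inserting and iterating, entries can also be overwritten and removed. The sketch below is a small illustration of those operations under the assumption of a simple string-to-port table; the table name services and its contents are made up for this example.

```zeek
event zeek_init()
    {
    # Hypothetical table used only for this illustration.
    local services: table[string] of port = table(["HTTPS"] = 443/tcp, ["IMAPS"] = 993/tcp);

    # Assigning to an existing key replaces the value stored under it.
    services["IMAPS"] = 143/tcp;

    # The delete statement removes a key together with its value.
    delete services["HTTPS"];

    print fmt("%d entries remain", |services|);

    # Guard the lookup with the in operator: indexing a key that is
    # not present (and has no default) is reported as a runtime error.
    if ( "IMAPS" in services )
        print services["IMAPS"];
    }
```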

Simple examples aside, tables can become extremely complex as the keys and values for the table become more intricate. Tables can have keys comprised of multiple data types and even a series of elements called a “tuple”. The flexibility gained with the use of complex tables in Zeek implies a cost in complexity for the person writing the scripts but pays off in effectiveness given the power of Zeek as a network security platform.

data_struct_table_complex.zeek
event zeek_init()
    {
    local samurai_flicks: table[string, string, count, string] of string;

    samurai_flicks["Kihachi Okamoto", "Toho", 1968, "Tatsuya Nakadai"] = "Kiru";
    samurai_flicks["Hideo Gosha", "Fuji", 1969, "Tatsuya Nakadai"] = "Goyokin";
    samurai_flicks["Masaki Kobayashi", "Shochiku Eiga", 1962, "Tatsuya Nakadai" ] = "Harakiri";
    samurai_flicks["Yoji Yamada", "Eisei Gekijo", 2002, "Hiroyuki Sanada" ] = "Tasogare Seibei";

    for ( [d, s, y, a] in samurai_flicks )
        print fmt("%s was released in %d by %s studios, directed by %s and starring %s", samurai_flicks[d, s, y, a], y, s, d, a);
    }
$ zeek -b data_struct_table_complex.zeek
Harakiri was released in 1962 by Shochiku Eiga studios, directed by Masaki Kobayashi and starring Tatsuya Nakadai
Goyokin was released in 1969 by Fuji studios, directed by Hideo Gosha and starring Tatsuya Nakadai
Tasogare Seibei was released in 2002 by Eisei Gekijo studios, directed by Yoji Yamada and starring Hiroyuki Sanada
Kiru was released in 1968 by Toho studios, directed by Kihachi Okamoto and starring Tatsuya Nakadai

This script shows a sample table of strings indexed by two strings, a count, and a final string. With a tuple acting as an aggregate key, the order is important as a change in order would result in a new key. Here, we’re using the table to track the director, studio, year of release, and lead actor in a series of samurai flicks.

In the case of the for statement above, iteration is done over all parts of the key. When not all parts of a key are needed within the for loop’s body, these can be ignored by using the blank identifier _ instead of a variable. It’s important to note, however, that the structure of the key needs to be reflected: all parts of the key need to be captured within the brackets by a variable or the blank identifier. As a special case, a single blank identifier allows ignoring the whole key. In the previous example, we need square brackets surrounding four temporary variables to act as a collection for our iteration. While this is a contrived example, we could easily have had keys containing IP addresses (addr), ports (port) and even a string calculated as the result of a reverse hostname lookup.

The example below continues with the samurai_flicks table and shows usage of the blank identifier in combination with key-value iteration. Key-value iteration short-cuts the table access needed to look up the value, as it provides the respective entry’s value directly in addition to the key.

First, iteration is done by capturing the directors and movie names and ignoring all other elements of the key. Second, the whole key is ignored and only movie names used.

data_struct_table_complex_blank_value.zeek
event zeek_init()
    {
    # local samurai_flicks: ...

    for ( [d, _, _, _], name in samurai_flicks )
        print fmt("%s was directed by %s", name, d);

    for ( _, name in samurai_flicks )
        print fmt("%s is a movie", name);
    }
$ zeek data_struct_table_complex_blank_value.zeek
Kiru was directed by Kihachi Okamoto
Harakiri was directed by Masaki Kobayashi
Tasogare Seibei was directed by Yoji Yamada
Goyokin was directed by Hideo Gosha
Kiru is a movie
Harakiri is a movie
Tasogare Seibei is a movie
Goyokin is a movie

Vectors

If you’re coming to Zeek with a programming background, you may or may not be familiar with a vector data type depending on your language of choice. On the surface, vectors perform much of the same functionality as associative arrays with unsigned integers as their indices. They are, however, more efficient than that and they allow for ordered access. As such, any time you need to sequentially store data of the same type, in Zeek you should reach for a vector. Vectors are a collection of objects, all of which are of the same data type, to which elements can be dynamically added or removed. Since vectors use contiguous storage for their elements, the contents of a vector can be accessed through a zero-indexed numerical offset.

The format for the declaration of a vector follows the pattern of other declarations, namely, SCOPE v: vector of T where v is the name of your vector, and T is the data type of its members. For example, the following snippet shows an explicit and implicit declaration of two locally scoped vectors. The script populates the first vector by appending values to its end with the += operator. It then prints the contents of both vectors along with their current lengths, which are obtained by placing the vector name between two vertical pipes.

data_struct_vector_declaration.zeek
event zeek_init()
    {
    local v1: vector of count;
    local v2 = vector(1, 2, 3, 4);

    v1 += 1;
    v1 += 2;
    v1 += 3;
    v1 += 4;

    print fmt("contents of v1: %s", v1);
    print fmt("length of v1: %d", |v1|);
    print fmt("contents of v2: %s", v2);
    print fmt("length of v2: %d", |v2|);
    }
$ zeek data_struct_vector_declaration.zeek
contents of v1: [1, 2, 3, 4]
length of v1: 4
contents of v2: [1, 2, 3, 4]
length of v2: 4
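
Because a vector is zero-indexed, its current length is also the index of the next free slot. As a small sketch of that idea, the script below appends by assigning to index |v| and then overwrites an existing slot; the vector name v and its string contents are invented for this illustration.

```zeek
event zeek_init()
    {
    local v: vector of string;

    # |v| is the current length, which is also the next free index,
    # so assigning to it appends an element.
    v[|v|] = "first";
    v[|v|] = "second";

    # Existing slots can be overwritten through their index.
    v[0] = "FIRST";

    print v;
    print |v|;
    }
```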

In a lot of cases, storing elements in a vector is simply a precursor to then iterating over them. Iterating over a vector is easy with the for keyword. The sample below iterates over a vector of IP addresses and for each IP address, masks that address with 18 bits. The for keyword is used to generate a locally scoped variable called i which will hold the index of the current element in the vector. Using i as an index to addr_vector we can access the current item in the vector with addr_vector[i].

data_struct_vector_iter.zeek
event zeek_init()
    {
    local addr_vector: vector of addr = vector(1.2.3.4, 2.3.4.5, 3.4.5.6);

    for ( i in addr_vector )
        print mask_addr(addr_vector[i], 18);
    }
$ zeek -b data_struct_vector_iter.zeek
1.2.0.0/18
2.3.0.0/18
3.4.0.0/18

Providing a value variable to the for loop allows skipping the extra index operation. As the index variable is now unused, the script below uses _, the blank identifier, to ignore it. This script is semantically equivalent to the previous one, but it iterates over the values directly and is therefore potentially more performant for very large vectors.

data_struct_vector_iter_value.zeek
event zeek_init()
    {
    local addr_vector: vector of addr = vector(1.2.3.4, 2.3.4.5, 3.4.5.6);

    for ( _, a in addr_vector )
        print mask_addr(a, 18);
    }

Data Types Revisited

addr

The addr, or address, data type manages to cover a surprisingly large amount of ground while remaining succinct. IPv4, IPv6 and even hostname constants are included in the addr data type. While IPv4 addresses use the default dotted quad formatting, IPv6 addresses use the RFC 2373 defined notation with the addition of square brackets wrapping the entire address. When you venture into hostname constants, Zeek performs a little sleight of hand for the benefit of the user; a hostname constant is, in fact, a set of addresses. Zeek will issue a DNS request when it sees a hostname constant in use and return a set whose elements are the answers to the DNS request. For example, if you were to use local google = www.google.com; you would end up with a locally scoped set[addr] with elements that represent the current set of round robin DNS entries for google. At first blush, this seems trivial, but it is yet another example of Zeek making the life of the common Zeek scripter a little easier through abstraction applied in a practical manner. (Note, however, that these IP addresses will never get updated during Zeek’s processing, so this mechanism is often most useful for addresses that are expected to remain static.)
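
To make the two literal notations concrete, here is a minimal sketch that declares one IPv4 and one IPv6 address and inspects them. The addresses come from documentation ranges and the variable names are arbitrary; is_v4_addr and is_v6_addr are used here on the assumption that they are available as built-in functions in your Zeek version. Hostname constants are left out on purpose, since they trigger a DNS lookup.

```zeek
event zeek_init()
    {
    # IPv4 literals use dotted quad notation.
    local v4: addr = 192.0.2.10;

    # IPv6 literals are wrapped in square brackets.
    local v6: addr = [2001:db8::10];

    # Ask Zeek which address family each value belongs to.
    print is_v4_addr(v4);
    print is_v6_addr(v6);

    # Addresses can be compared for equality.
    print v4 == 192.0.2.10;
    }
```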

port

Transport layer port numbers in Zeek are represented in the format of <unsigned integer>/<protocol name>, e.g., 22/tcp or 53/udp. Zeek supports TCP(/tcp), UDP(/udp), ICMP(/icmp) and UNKNOWN(/unknown) as protocol designations. While ICMP doesn’t have an actual port, Zeek supports the concept of ICMP “ports” by using the ICMP message type and ICMP message code as the source and destination port respectively. Ports can be compared for equality using the == or != operators and can even be compared for ordering. Zeek gives the protocol designations the following “order”: unknown < tcp < udp < icmp. For example 65535/tcp is smaller than 0/udp.
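
The ordering rule can be checked directly. The sketch below compares the two ports from the example above and then takes a port apart; get_port_transport_proto and port_to_count are assumed to be available as built-in functions, and the variable names are arbitrary.

```zeek
event zeek_init()
    {
    local p1 = 65535/tcp;
    local p2 = 0/udp;

    # The protocol ordering is considered before the numeric value,
    # so the tcp port sorts below the udp port.
    print p1 < p2;

    # Split a port into its protocol and its numeric part.
    print get_port_transport_proto(p1);
    print port_to_count(p1);
    }
```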

subnet

Zeek has full support for CIDR notation subnets as a base data type. There is no need to manage the IP and the subnet mask as two separate entities when you can provide the same information in CIDR notation in your scripts. The example below uses a Zeek script to determine if a series of IP addresses are within a set of subnets using a 20 bit subnet mask.

data_type_subnets.zeek
event zeek_init()
    {
    local subnets = vector(172.16.0.0/20, 172.16.16.0/20, 172.16.32.0/20, 172.16.48.0/20);
    local addresses = vector(172.16.4.56, 172.16.47.254, 172.16.22.45, 172.16.1.1);

    for ( a in addresses )
        {
        for ( s in subnets )
            {
            if ( addresses[a] in subnets[s] )
                print fmt("%s belongs to subnet %s", addresses[a], subnets[s]);
            }
        }

    }

Because this is a script that doesn’t use any kind of network analysis, we can handle the event zeek_init which is always generated by Zeek’s core upon startup. In the example script, two locally scoped vectors are created to hold our lists of subnets and IP addresses respectively. Then, using a set of nested for loops, we iterate over every subnet and every IP address and use an if statement to compare an IP address against a subnet using the in operator. The in operator returns true if the IP address falls within a given subnet based on the longest prefix match calculation. For example, 10.0.0.1 in 10.0.0.0/8 would return true while 192.168.2.1 in 192.168.1.0/24 would return false. When we run the script, we get the output listing the IP address and the subnet in which it belongs.

$ zeek data_type_subnets.zeek
172.16.4.56 belongs to subnet 172.16.0.0/20
172.16.47.254 belongs to subnet 172.16.32.0/20
172.16.22.45 belongs to subnet 172.16.16.0/20
172.16.1.1 belongs to subnet 172.16.0.0/20
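
The two membership checks quoted in the explanation can be run as they stand. The sketch below prints them and, as an alternative to the nested loops, tests an address against a set of subnets in one step; the set name internal and its contents are hypothetical.

```zeek
event zeek_init()
    {
    # The address lies inside the /8, so this prints T.
    print 10.0.0.1 in 10.0.0.0/8;

    # The address lies outside the /24, so this prints F.
    print 192.168.2.1 in 192.168.1.0/24;

    # An address can also be tested against a whole set of subnets.
    local internal: set[subnet] = set(10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16);
    print 172.16.4.56 in internal;
    }
```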

time

While there is currently no supported way to add a time constant in Zeek, two built-in functions exist to make use of the time data type. Both network_time and current_time return a time data type, but they each return a time based on different criteria. The current_time function returns what is called the wall-clock time as defined by the operating system. However, network_time returns the timestamp of the last packet processed, be it from a live data stream or a saved packet capture. Both functions return the time in epoch seconds, meaning strftime must be used to turn the output into human readable output. The script below makes use of the connection_established event handler to generate text every time a SYN/ACK packet is seen responding to a SYN packet as part of a TCP handshake. The text generated is in the format of a timestamp and an indication of who the originator and responder were. We use the strftime format string of %Y/%m/%d %H:%M:%S to produce a common date time formatted time stamp.

data_type_time.zeek
event connection_established(c: connection)
    {
    print fmt("%s:  New connection established from %s to %s\n", strftime("%Y/%m/%d %H:%M:%S", network_time()), c$id$orig_h, c$id$resp_h);
    }

When the script is executed we get an output showing the details of established connections.

$ zeek -r wikipedia.trace data_type_time.zeek
2011/06/18 19:03:08:  New connection established from 141.142.220.118 to 208.80.152.118\x0a
2011/06/18 19:03:08:  New connection established from 141.142.220.118 to 208.80.152.3\x0a
2011/06/18 19:03:08:  New connection established from 141.142.220.118 to 208.80.152.3\x0a
2011/06/18 19:03:08:  New connection established from 141.142.220.118 to 208.80.152.3\x0a
2011/06/18 19:03:08:  New connection established from 141.142.220.118 to 208.80.152.3\x0a
2011/06/18 19:03:08:  New connection established from 141.142.220.118 to 208.80.152.3\x0a
2011/06/18 19:03:08:  New connection established from 141.142.220.118 to 208.80.152.3\x0a
2011/06/18 19:03:08:  New connection established from 141.142.220.118 to 208.80.152.2\x0a
2011/06/18 19:03:09:  New connection established from 141.142.220.235 to 173.192.163.128\x0a
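
When no packets are being processed, a time value can still be produced for experimentation. The following sketch assumes the built-in conversion functions double_to_time and time_to_double are available; the epoch value is an arbitrary sample, and the formatted result depends on the time zone of the machine running the script.

```zeek
event zeek_init()
    {
    # Build a time value from epoch seconds (arbitrary sample value).
    local t: time = double_to_time(1308423788.0);

    # Format the constructed time and the wall-clock time the same way.
    print strftime("%Y/%m/%d %H:%M:%S", t);
    print strftime("%Y/%m/%d %H:%M:%S", current_time());

    # Convert back to epoch seconds.
    print time_to_double(t);
    }
```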

interval

The interval data type is another area in Zeek where rational application of abstraction makes perfect sense. As a data type, the interval represents a relative time as denoted by a numeric constant followed by a unit of time. For example, 2.2 seconds would be 2.2sec and thirty-one days would be represented by 31days. Zeek supports usec, msec, sec, min, hr, and day, which represent microseconds, milliseconds, seconds, minutes, hours, and days respectively. In fact, the interval data type allows for a surprising amount of variation in its definitions. There can be a space between the numeric constant and the unit, or they can be crammed together like a temporal portmanteau. The time unit can be either singular or plural. All of this adds up to the fact that both 42hrs and 42 hr are perfectly valid and logically equivalent in Zeek. The point, however, is to increase the readability and thus maintainability of a script. Intervals can even be negated, allowing for - 10mins to represent “ten minutes ago”.

Intervals in Zeek can have mathematical operations performed against them allowing the user to perform addition, subtraction, multiplication, division, and comparison operations. As well, Zeek returns an interval when differencing two time values using the - operator. The script below amends the script started in the section above to include a time delta value printed along with the connection establishment report.

data_type_interval.zeek
# Store the time the previous connection was established.
global last_connection_time: time;

# boolean value to indicate whether we have seen a previous connection.
global connection_seen: bool = F;

event connection_established(c: connection)
    {
    local net_time: time  = network_time();

    print fmt("%s:  New connection established from %s to %s", strftime("%Y/%m/%d %H:%M:%S", net_time), c$id$orig_h, c$id$resp_h);

    if ( connection_seen )
        print fmt("     Time since last connection: %s", net_time - last_connection_time);

    last_connection_time = net_time;
    connection_seen = T;
    }

When we re-execute the script we see an additional line in the output, displaying the time delta since the last fully established connection.

$ zeek -r wikipedia.trace data_type_interval.zeek
2011/06/18 19:03:08:  New connection established from 141.142.220.118 to 208.80.152.118
2011/06/18 19:03:08:  New connection established from 141.142.220.118 to 208.80.152.3
     Time since last connection: 132.0 msecs 97.0 usecs
2011/06/18 19:03:08:  New connection established from 141.142.220.118 to 208.80.152.3
     Time since last connection: 177.0 usecs
2011/06/18 19:03:08:  New connection established from 141.142.220.118 to 208.80.152.3
     Time since last connection: 2.0 msecs 177.0 usecs
2011/06/18 19:03:08:  New connection established from 141.142.220.118 to 208.80.152.3
     Time since last connection: 33.0 msecs 898.0 usecs
2011/06/18 19:03:08:  New connection established from 141.142.220.118 to 208.80.152.3
     Time since last connection: 35.0 usecs
2011/06/18 19:03:08:  New connection established from 141.142.220.118 to 208.80.152.3
     Time since last connection: 2.0 msecs 532.0 usecs
2011/06/18 19:03:08:  New connection established from 141.142.220.118 to 208.80.152.2
     Time since last connection: 7.0 msecs 866.0 usecs
2011/06/18 19:03:09:  New connection established from 141.142.220.235 to 173.192.163.128
     Time since last connection: 817.0 msecs 703.0 usecs
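
The arithmetic and comparison operations on intervals can also be tried without a packet capture. The sketch below is a minimal illustration with arbitrary values and variable names; it relies on multiplying an interval by a number yielding an interval and dividing two intervals yielding a double.

```zeek
event zeek_init()
    {
    local brief = 90sec;
    local lengthy = 2 min;

    # Intervals can be compared regardless of the unit they were written in.
    print brief < lengthy;

    # Subtracting two intervals yields another interval.
    print lengthy - brief;

    # Scaling an interval by a number yields an interval.
    print brief * 2;

    # Dividing one interval by another yields a plain number.
    print lengthy / brief;

    # Adding an interval to a time yields a new time.
    local later = current_time() + 10 mins;
    print later > current_time();
    }
```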

pattern

Zeek has support for fast text searching operations using regular expressions and even goes so far as to declare a native data type for the patterns used in regular expressions. A pattern constant is created by enclosing text within the forward slash characters. Zeek supports syntax very similar to the Flex lexical analyzer syntax. The most common use of patterns in Zeek you are likely to come across is embedded matching using the in operator. Embedded matching adheres to a strict format, requiring the regular expression or pattern constant to be on the left side of the in operator and the string against which it will be tested to be on the right.

data_type_pattern_01.zeek
event zeek_init()
    {
    local test_string = "The quick brown fox jumps over the lazy dog.";
    local test_pattern = /quick|lazy/;

    if ( test_pattern in test_string )
        {
        local results = split_string(test_string, test_pattern);
        print results[0];
        print results[1];
        print results[2];
        }
    }

In the sample above, two local variables are declared to hold our sample sentence and regular expression. Our regular expression in this case will return true if the string contains either the word quick or the word lazy. The if statement in the script uses embedded matching and the in operator to check for the existence of the pattern within the string. If the statement resolves to true, split_string is called to break the string into separate pieces. split_string takes a string and a pattern as its arguments and returns a vector of strings. Each element of the vector represents segments before and after any matches against the pattern but excluding the actual matches. In this case, our pattern matches twice resulting in a vector with three elements.

$ zeek data_type_pattern_01.zeek
The
 brown fox jumps over the
 dog.

Patterns can also be used to compare strings using equality and inequality operators through the == and != operators respectively. When used in this manner however, the string must match entirely to resolve to true. For example, the script below uses two ternary conditional statements to illustrate the use of the == operator with patterns. The output is altered based on the result of the comparison between the pattern and the string.

data_type_pattern_02.zeek
event zeek_init()
    {
    local test_string = "equality";

    local test_pattern = /equal/;
    print fmt("%s and %s %s equal", test_string, test_pattern, test_pattern == test_string ? "are" : "are not");

    test_pattern = /equality/;
    print fmt("%s and %s %s equal", test_string, test_pattern, test_pattern == test_string ? "are" : "are not");
    }
$ zeek data_type_pattern_02.zeek
equality and /^?(equal)$?/ are not equal
equality and /^?(equality)$?/ are equal

Record Data Type

With Zeek’s support for a wide array of data types and data structures, an obvious extension is to include the ability to create custom data types composed of atomic types and further data structures. To accomplish this, Zeek introduces the record type and the type keyword. Similar to how you would define a new data structure in C with the typedef and struct keywords, Zeek allows you to cobble together new data types to suit the needs of your situation.

When combined with the type keyword, record can generate a composite type. We have, in fact, already encountered a complex example of the record data type in the earlier sections, the connection record passed to many events. Another one, Conn::Info, which corresponds to the fields logged into conn.log, is shown by the excerpt below.

data_type_record.zeek
module Conn;

export {
    ## The record type which contains column fields of the connection log.
    type Info: record {
        ts:           time            &log;
        uid:          string          &log;
        id:           conn_id         &log;
        proto:        transport_proto &log;
        service:      string          &log &optional;
        duration:     interval        &log &optional;
        orig_bytes:   count           &log &optional;
        resp_bytes:   count           &log &optional;
        conn_state:   string          &log &optional;
        local_orig:   bool            &log &optional;
        local_resp:   bool            &log &optional;
        missed_bytes: count           &log &default=0;
        history:      string          &log &optional;
        orig_pkts:     count      &log &optional;
        orig_ip_bytes: count      &log &optional;
        resp_pkts:     count      &log &optional;
        resp_ip_bytes: count      &log &optional;
        tunnel_parents: set[string] &log;
    };
}

Looking at the structure of the definition, a new collection of data types is being defined as a type called Info. Since this type definition belongs to the Conn module and sits within the confines of an export block, what is defined is, in fact, Conn::Info, and it is visible to scripts outside the module.

The formatting for a declaration of a record type in Zeek includes the descriptive name of the type being defined and the separate fields that make up the record. The individual fields that make up the new record are not limited in type or number as long as the name for each field is unique.

data_struct_record_01.zeek
type Service: record {
    name: string;
    ports: set[port];
    rfc: count;
};

function print_service(serv: Service)
    {
    print fmt("Service: %s(RFC%d)",serv$name, serv$rfc);

    for ( p in serv$ports )
        print fmt("  port: %s", p);
    }

event zeek_init()
    {
    local dns: Service = [$name="dns", $ports=set(53/udp, 53/tcp), $rfc=1035];
    local http: Service = [$name="http", $ports=set(80/tcp, 8080/tcp), $rfc=2616];

    print_service(dns);
    print_service(http);
    }
$ zeek data_struct_record_01.zeek
Service: dns(RFC1035)
  port: 53/udp
  port: 53/tcp
Service: http(RFC2616)
  port: 8080/tcp
  port: 80/tcp

The sample above shows a simple type definition that includes a string, a set of ports, and a count to define a service type. Also included is a function to print each field of a record in a formatted fashion and a zeek_init event handler to show some functionality of working with records. The definitions of the DNS and HTTP services are both done in-line using square brackets before being passed to the print_service function. The print_service function makes use of the $ dereference operator to access the fields within the newly defined Service record type.

As you saw in the definition for the Conn::Info record, other records are even valid as fields within another record. We can extend the example above to include another record that contains a Service record.

data_struct_record_02.zeek
type Service: record {
    name: string;
    ports: set[port];
    rfc: count;
    };

type System: record {
    name: string;
    services: set[Service];
    };

function print_service(serv: Service)
    {
    print fmt("  Service: %s(RFC%d)",serv$name, serv$rfc);

    for ( p in serv$ports )
        print fmt("    port: %s", p);
    }

function print_system(sys: System)
    {
    print fmt("System: %s", sys$name);

    for ( s in sys$services )
        print_service(s);
    }

event zeek_init()
    {
    local server01: System;
    server01$name = "morlock";
    add server01$services[[ $name="dns", $ports=set(53/udp, 53/tcp), $rfc=1035]];
    add server01$services[[ $name="http", $ports=set(80/tcp, 8080/tcp), $rfc=2616]];
    print_system(server01);


    # local dns: Service = [ $name="dns", $ports=set(53/udp, 53/tcp), $rfc=1035];
    # local http: Service = [ $name="http", $ports=set(80/tcp, 8080/tcp), $rfc=2616];
    # print_service(dns);
    # print_service(http);
    }
$ zeek data_struct_record_02.zeek
System: morlock
  Service: http(RFC2616)
    port: 8080/tcp
    port: 80/tcp
  Service: dns(RFC1035)
    port: 53/udp
    port: 53/tcp

The example above includes a second record type, System, in which the Service record is used as the element type of a set field. Records can be repeatedly nested within other records, their fields reachable through repeated chains of the $ dereference operator.
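
Chained access is easiest to see with a record stored directly as a field of another record. The sketch below uses two hypothetical types, Placement and Machine, invented for this illustration. It also shows the ?$ operator, which tests whether a field marked &optional (as many fields of Conn::Info are) has been assigned a value.

```zeek
# Hypothetical record types used only for this illustration.
type Placement: record {
    city: string;
    rack: count &optional;
};

type Machine: record {
    name: string;
    loc: Placement;
};

event zeek_init()
    {
    local m = Machine($name="morlock", $loc=Placement($city="Berlin"));

    # Chain the $ operator to reach a field of the nested record.
    print m$loc$city;

    # ?$ checks whether an optional field has a value before it is read.
    if ( m$loc?$rack )
        print m$loc$rack;
    else
        print "no rack recorded";

    # Once assigned, the optional field can be read like any other.
    m$loc$rack = 12;
    print m$loc$rack;
    }
```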

It’s also common to see a type used to simply alias a data structure to a more descriptive name. The excerpt below shows this in Zeek’s own type definitions file.

init-bare.zeek
type string_array: table[count] of string;
type string_set: set[string];
type addr_set: set[addr];

The three lines above alias a type of data structure to a descriptive name. Functionally, the operations are the same; however, each of the types above is named such that its function is instantly identifiable. This is another place in Zeek scripting where consideration can lead to better readability of your code and thus easier maintainability in the future.

Custom Logging

Armed with a decent understanding of the data types and data structures in Zeek, exploring the various frameworks available is a much more rewarding effort. The framework with which most users are likely to have the most interaction is the Logging Framework. Designed to abstract much of the process of creating a file and appending ordered and organized data to it, the Logging Framework makes use of some potentially unfamiliar nomenclature. Specifically, Log Streams, Filters and Writers are simply abstractions of the processes required to manage a high rate of incoming logs while maintaining full operability. If you’ve seen Zeek employed in an environment with a large number of connections, you know that logs are produced incredibly quickly; the ability to process a large set of data and write it to disk is due to the design of the Logging Framework.

Data is written to a Log Stream based on decision making processes in Zeek’s scriptland. Log Streams correspond to a single log as defined by the set of name/value pairs that make up its fields. That data can then be filtered, modified, or redirected with Logging Filters which, by default, are set to log everything. Filters can be used to break log files into subsets or duplicate that information to another output. The final output of the data is defined by the writer. Zeek’s default writer is simple tab separated ASCII files but Zeek also includes support for DataSeries and Elasticsearch outputs as well as additional writers currently in development. While these new terms and ideas may give the impression that the Logging Framework is difficult to work with, the learning curve is not very steep at all. The abstraction built into the Logging Framework makes it such that the vast majority of scripts need not go past the basics. In effect, writing to a log file is as simple as defining the format of your data, letting Zeek know that you wish to create a new log, and then calling the Log::write function to output log records.

The Logging Framework is an area in Zeek where, the more you see it used and the more you use it yourself, the more second nature the boilerplate parts of the code will become. As such, let’s work through a contrived example of simply logging the digits 1 through 10 and their corresponding factorial to the default ASCII log writer. It’s always best to work through the problem once, simulating the desired output with print and fmt before attempting to dive into the Logging Framework.

framework_logging_factorial_01.zeek
module Factor;

function factorial(n: count): count
    {
    if ( n == 0 )
        return 1;
    else
        return ( n * factorial(n - 1) );
    }

event zeek_init()
    {
    local numbers: vector of count = vector(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

    for ( n in numbers )
        print fmt("%d", factorial(numbers[n]));
    }
$ zeek framework_logging_factorial_01.zeek
1
2
6
24
120
720
5040
40320
362880
3628800

This script defines a factorial function that recursively calculates the factorial of an unsigned integer passed as an argument to the function. Using print and fmt we can ensure that Zeek performs these calculations correctly, as well as get an idea of the answers ourselves.

The output of the script aligns with what we expect so now it’s time to integrate the Logging Framework.

framework_logging_factorial_02.zeek
module Factor;

export {
    # Append the value LOG to the Log::ID enumerable.
    redef enum Log::ID += { LOG };

    # Define a new type called Factor::Info.
    type Info: record {
        num:           count &log;
        factorial_num: count &log;
        };
    }

function factorial(n: count): count
    {
    if ( n == 0 )
        return 1;
    else
        return ( n * factorial(n - 1) );
    }

event zeek_init()
    {
    # Create the logging stream.
    Log::create_stream(LOG, [$columns=Info, $path="factor"]);
    }

event zeek_done()
    {
    local numbers: vector of count = vector(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
    for ( n in numbers )
        Log::write( Factor::LOG, [$num=numbers[n],
                                  $factorial_num=factorial(numbers[n])]);
    }

As mentioned above, we have to perform a few steps before we can call Log::write and produce a log file. As we are working within a namespace and informing an outside entity of workings and data internal to the namespace, we use an export block. First we need to inform Zeek that we are going to be adding another Log Stream by adding a value to the Log::ID enumerable. In this script, we append the value LOG to the Log::ID enumerable; however, because this happens inside the export block of the Factor module, the value appended to Log::ID is actually Factor::LOG. Next, we define the fields that make up the data of our logs and dictate its format. This script defines a new record datatype called Info (actually, Factor::Info) with two fields, both unsigned integers. Each of the fields in the Factor::Info record type includes the &log attribute, indicating that these fields should be passed to the Logging Framework when Log::write is called. Any record fields without the &log attribute are ignored by the Logging Framework. The next step is to create the logging stream with Log::create_stream, which takes a Log::ID and a record as its arguments. In this example, we call Log::create_stream and pass Factor::LOG and the Factor::Info record as arguments. From here on out, if we call Log::write with the correct Log::ID and a properly formatted Factor::Info record, a log entry will be generated.

Now, if we run this script, nothing is printed to stdout. Instead, the output is all in factor.log, properly formatted and organized.

$ zeek framework_logging_factorial_02.zeek
$ cat factor.log
#separator \x09
#set_separator    ,
#empty_field      (empty)
#unset_field      -
#path     factor
#open     2018-12-14-21-47-18
#fields   num     factorial_num
#types    count   count
1 1
2 2
3 6
4 24
5 120
6 720
7 5040
8 40320
9 362880
10        3628800
#close    2018-12-14-21-47-18

While the previous example is a simplistic one, it serves to demonstrate the small pieces of script code that need to be in place in order to generate logs. For example, it’s common to call Log::create_stream in zeek_init. In a live deployment, the decision of when to call Log::write would likely be made in an event handler; in this case we use zeek_done.
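To illustrate the earlier point that record fields without the &log attribute are ignored by the Logging Framework, here is a minimal sketch. The scratch field is hypothetical and exists only for this illustration; it is set on the record passed to Log::write but, lacking &log, should not appear as a column in factor.log.

```zeek
module Factor;

export {
    redef enum Log::ID += { LOG };

    type Info: record {
        num:           count &log;
        factorial_num: count &log;
        # Hypothetical field for illustration: no &log attribute, so the
        # Logging Framework does not write it.
        scratch:       string &optional;
        };
    }

function factorial(n: count): count
    {
    if ( n == 0 )
        return 1;
    else
        return ( n * factorial(n - 1) );
    }

event zeek_init()
    {
    Log::create_stream(LOG, [$columns=Info, $path="factor"]);

    # The scratch value travels with the record but is not logged.
    Log::write(Factor::LOG, [$num=5,
                             $factorial_num=factorial(5),
                             $scratch="internal bookkeeping only"]);
    }
```

Keeping such a field in the record can be useful when a script needs to carry state alongside a log entry without exposing it in the log itself.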

If you’ve already spent time with a deployment of Zeek, you’ve likely had the opportunity to view, search through, or manipulate the logs produced by the Logging Framework. The log output from a default installation of Zeek is substantial to say the least; however, there are times when the way the Logging Framework behaves by default isn’t ideal for the situation. This can range from needing to log more or less data with each call to Log::write to needing to split log files based on arbitrary logic. In the latter case, Filters come into play. Filters grant a level of customization to Zeek’s scriptland, allowing the script writer to include or exclude fields in the log and even make alterations to the path of the file in which the logs are being placed. Each stream, when created, is given a default filter called, not surprisingly, default. When using the default filter, every key value pair with the &log attribute is written to a single file. For the example we’ve been using, let’s extend it so as to write any factorial that is divisible by 5 to one file, while writing the remaining logs to another.

framework_logging_factorial_03.zeek
module Factor;

export {
    redef enum Log::ID += { LOG };

    type Info: record {
        num:           count &log;
        factorial_num: count &log;
        };
    }

function factorial(n: count): count
    {
    if ( n == 0 )
        return 1;
    else
        return (n * factorial(n - 1));
    }

event zeek_done()
    {
    local numbers: vector of count = vector(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
    for ( n in numbers )
        Log::write( Factor::LOG, [$num=numbers[n],
                                  $factorial_num=factorial(numbers[n])]);
    }

function mod5(id: Log::ID, path: string, rec: Factor::Info) : string
    {
    if ( rec$factorial_num % 5 == 0 )
        return "factor-mod5";
    else
        return "factor-non5";
    }

event zeek_init()
    {
    Log::create_stream(LOG, [$columns=Info, $path="factor"]);

    local filter: Log::Filter = [$name="split-mod5s", $path_func=mod5];
    Log::add_filter(Factor::LOG, filter);
    Log::remove_filter(Factor::LOG, "default");
    }

To dynamically alter the file in which a stream writes its logs, a filter can specify a function that returns a string to be used as the filename for the current call to Log::write. The definition for this function has to take as its parameters a Log::ID called id, a string called path and the appropriate record type for the logs called rec. You can see the definition of mod5 used in this example conforms to that requirement. The function simply returns factor-mod5 if the factorial is evenly divisible by 5; otherwise, it returns factor-non5. In the zeek_init event handler, we define a locally scoped Log::Filter and assign it a record that defines the name and path_func fields. We then call Log::add_filter to add the filter to the Factor::LOG Log::ID and call Log::remove_filter to remove the default filter for Factor::LOG. Had we not removed the default filter, we’d have ended up with three log files: factor-mod5.log with all the factorials that are divisible by 5, factor-non5.log with the factorials that are not divisible by 5, and factor.log which would have included all factorials.

$ zeek framework_logging_factorial_03.zeek
$ cat factor-mod5.log
#separator \x09
#set_separator    ,
#empty_field      (empty)
#unset_field      -
#path     factor-mod5
#open     2018-12-14-21-47-18
#fields   num     factorial_num
#types    count   count
5 120
6 720
7 5040
8 40320
9 362880
10        3628800
#close    2018-12-14-21-47-1
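Filters can also include or exclude fields, as mentioned earlier. The following is a minimal sketch, assuming the same Factor module as above: it keeps the default filter in place and adds a second filter that writes only the num column to a separate file. The filter name nums-only and the path factor-nums are hypothetical choices made for this illustration.

```zeek
module Factor;

export {
    redef enum Log::ID += { LOG };

    type Info: record {
        num:           count &log;
        factorial_num: count &log;
        };
    }

event zeek_init()
    {
    Log::create_stream(LOG, [$columns=Info, $path="factor"]);

    # Hypothetical second filter: restrict the output to the num field
    # and send it to factor-nums.log. The default filter is left in
    # place, so factor.log still receives both columns.
    local filter: Log::Filter = [$name="nums-only",
                                 $path="factor-nums",
                                 $include=set("num")];
    Log::add_filter(Factor::LOG, filter);

    Log::write(Factor::LOG, [$num=3, $factorial_num=6]);
    }
```

Using $exclude in place of $include inverts the selection, which tends to be more convenient when only one or two fields of a wide record should be dropped.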

The ability of Zeek to generate easily customizable and extensible logs which remain easily parsable is a big part of the reason Zeek has gained a large measure of respect. In fact, it’s difficult at times to think of something that Zeek doesn’t log and as such, it is often advantageous for analysts and systems architects to instead hook into the Logging Framework to be able to perform custom actions based upon the data being sent to it. To that end, every default log stream in Zeek generates a custom event that can be handled by anyone wishing to act upon the data being sent to the stream. By convention these events are usually in the format log_x where x is the name of the logging stream; as such the event raised for every log sent to the Logging Framework by the HTTP parser would be log_http. Instead of using an external script to parse the http.log file and do post-processing for each entry, this can be done in real time inside Zeek by defining an event handler for the log_http event.
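As a rough sketch of that idea, the handler below reacts to each record sent to the HTTP log stream. What it does with the record (printing the client address, host and URI) is purely illustrative; it only produces output when Zeek is processing traffic that contains HTTP requests.

```zeek
@load base/protocols/http

# Called once for every record written to the HTTP log stream.
event HTTP::log_http(rec: HTTP::Info)
    {
    # host and uri are optional fields, so check that they are set
    # before reading them.
    if ( rec?$host && rec?$uri )
        print fmt("%s requested %s%s", rec$id$orig_h, rec$host, rec$uri);
    }
```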

Telling Zeek to raise an event in your own logging stream is as simple as exporting that event name and then adding that event in the call to Log::create_stream. Going back to our simple example of logging the factorial of an integer, we add log_factor to the export block and define the value to be passed to it, in this case the Factor::Info record. We then list the log_factor event as the $ev field in the call to Log::create_stream.

framework_logging_factorial_04.zeek
module Factor;

export {
    redef enum Log::ID += { LOG };

    type Info: record {
        num:           count &log;
        factorial_num: count &log;
        };

    global log_factor: event(rec: Info);
    }

function factorial(n: count): count
    {
    if ( n == 0 )
        return 1;
    else
        return (n * factorial(n - 1));
    }

event zeek_init()
    {
    Log::create_stream(LOG, [$columns=Info, $ev=log_factor, $path="factor"]);
    }

event zeek_done()
    {
    local numbers: vector of count = vector(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
    for ( n in numbers )
        Log::write( Factor::LOG, [$num=numbers[n],
                                  $factorial_num=factorial(numbers[n])]);
    }

function mod5(id: Log::ID, path: string, rec: Factor::Info) : string
    {
    if ( rec$factorial_num % 5 == 0 )
        return "factor-mod5";
    else
        return "factor-non5";
    }

event zeek_init()
    {
    local filter: Log::Filter = [$name="split-mod5s", $path_func=mod5];
    Log::add_filter(Factor::LOG, filter);
    Log::remove_filter(Factor::LOG, "default");
    }
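The script above only declares and registers the event; nothing handles it yet. A minimal sketch of a consumer might look like the following. It is written as a standalone script, so it repeats the stream setup, and it writes its records in zeek_init rather than zeek_done so that the queued events are handled during normal startup processing. The threshold of 1000 is an arbitrary value picked for illustration.

```zeek
module Factor;

export {
    redef enum Log::ID += { LOG };

    type Info: record {
        num:           count &log;
        factorial_num: count &log;
        };

    global log_factor: event(rec: Info);
    }

function factorial(n: count): count
    {
    if ( n == 0 )
        return 1;
    else
        return (n * factorial(n - 1));
    }

# Runs for every record written to the Factor::LOG stream.
event Factor::log_factor(rec: Factor::Info)
    {
    # Arbitrary, illustrative threshold.
    if ( rec$factorial_num > 1000 )
        print fmt("factorial of %d is %d, which exceeds 1000",
                  rec$num, rec$factorial_num);
    }

event zeek_init()
    {
    Log::create_stream(LOG, [$columns=Info, $ev=log_factor, $path="factor"]);

    local numbers: vector of count = vector(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
    for ( n in numbers )
        Log::write(Factor::LOG, [$num=numbers[n],
                                 $factorial_num=factorial(numbers[n])]);
    }
```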

Raising Notices

While Zeek’s Logging Framework provides an easy and systematic way to generate logs, there still exists a need to indicate when a specific behavior has been detected and a method to allow that detection to come to someone’s attention. To that end, the Notice Framework is in place to allow script writers a codified means through which they can raise a notice, as well as a system through which an operator can opt-in to receive the notice. Zeek holds to the philosophy that it is up to the individual operator to indicate the behaviors in which they are interested and as such Zeek ships with a large number of policy scripts which detect behavior that may be of interest but it does not presume to guess as to which behaviors are “action-able”. In effect, Zeek works to separate the act of detection and the responsibility of reporting. With the Notice Framework it’s simple to raise a notice for any behavior that is detected.

To raise a notice in Zeek, you only need to indicate to Zeek that you are providing a specific Notice::Type by exporting it and then make a call to NOTICE, supplying it with an appropriate Notice::Info record. Often the call to NOTICE includes just the Notice::Type and a concise message. There are, however, significantly more options available when raising notices, as seen in the definition of Notice::Info. The only field in Notice::Info whose attributes make it a required field is the note field. Still, good manners are always important: a concise message in $msg and, where necessary, the contents of the connection record in $conn, along with the Notice::Type, tend to comprise the minimum of information required for a notice to be considered useful. If the $conn variable is supplied, the Notice Framework will auto-populate the $id and $src fields as well. Other commonly included fields, $identifier and $suppress_for, are built around the automated suppression feature of the Notice Framework, which we will cover shortly.
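The steps just described can be sketched as follows. The module name Example, the notice type Long_Lived_Connection and the one hour threshold are all hypothetical and chosen only to show the shape of the code: export a new Notice::Type, then call NOTICE with $note, $msg and $conn.

```zeek
@load base/frameworks/notice

module Example;

export {
    redef enum Notice::Type += {
        ## Hypothetical notice type used only for this illustration.
        Long_Lived_Connection,
    };
}

event connection_state_remove(c: connection)
    {
    # Illustrative threshold: connections that lasted more than an hour.
    if ( c$duration > 1hr )
        NOTICE([$note=Long_Lived_Connection,
                $msg=fmt("Connection lasted %s", c$duration),
                $conn=c]);
    }
```

Because $conn is supplied, there is no need to set $id or $src by hand in this sketch.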

One of the default policy scripts raises a notice when an SSH login has been heuristically detected and the originating hostname is one that would raise suspicion. Effectively, the script attempts to define a list of hosts from which you would never want to see SSH traffic originating, like DNS servers, mail servers, etc. To accomplish this, the script adheres to the separation of detection and reporting by detecting a behavior and raising a notice. Whether or not that notice is acted upon is decided by the local Notice Policy, but the script attempts to supply as much information as possible while staying concise.

scripts/policy/protocols/ssh/interesting-hostnames.zeek
##! This script will generate a notice if an apparent SSH login originates
##! or heads to a host with a reverse hostname that looks suspicious.  By
##! default, the regular expression to match "interesting" hostnames includes
##! names that are typically used for infrastructure hosts like nameservers,
##! mail servers, web servers and ftp servers.

@load base/frameworks/notice

module SSH;

export {
    redef enum Notice::Type += {
        ## Generated if a login originates or responds with a host where
        ## the reverse hostname lookup resolves to a name matched by the
        ## :zeek:id:`SSH::interesting_hostnames` regular expression.
        Interesting_Hostname_Login,
    };

    ## Strange/bad host names to see successful SSH logins from or to.
    option interesting_hostnames =
            /^d?ns[0-9]*\./ |
            /^smtp[0-9]*\./ |
            /^mail[0-9]*\./ |
            /^pop[0-9]*\./  |
            /^imap[0-9]*\./ |
            /^www[0-9]*\./  |
            /^ftp[0-9]*\./;
}

function check_ssh_hostname(id: conn_id, uid: string, host: addr)
    {
    when ( local hostname = lookup_addr(host) )
        {
        if ( interesting_hostnames in hostname )
            {
            NOTICE([$note=Interesting_Hostname_Login,
                    $msg=fmt("Possible SSH login involving a %s %s with an interesting hostname.",
                             Site::is_local_addr(host) ? "local" : "remote",
                             host == id$orig_h ? "client" : "server"),
                    $sub=hostname, $id=id, $uid=uid]);
            }
        }
    }

event ssh_auth_successful(c: connection, auth_method_none: bool)
    {
    for ( host in set(c$id$orig_h, c$id$resp_h) )
        {
        check_ssh_hostname(c$id, c$uid, host);
        }
    }

While much of the script relates to the actual detection, the parts specific to the Notice Framework are actually quite interesting in themselves. The script’s export block adds the value SSH::Interesting_Hostname_Login to the enumerable constant Notice::Type to indicate to the Zeek core that a new type of notice is being defined. The script then calls NOTICE and defines the $note, $msg, $sub, $id, and $uid fields of the Notice::Info record. (More commonly, one would set $conn instead; however, this script avoids using the connection record inside the when-statement for performance reasons.) There are two ternary expressions that modify the $msg text depending on whether the host is a local address and whether it is the client or the server. This use of fmt and ternary operators is a concise way to lend readability to the notices that are generated without the need for branching if statements that each raise a specific notice.

The opt-in system for notices is managed by writing Notice::policy hooks. A Notice::policy hook takes as its argument a Notice::Info record which will hold the same information your script provided in its call to NOTICE. With access to the Notice::Info record for a specific notice, you can include logic, such as membership tests with the in operator, in the body of your hook to alter the policy for handling notices on your system. In Zeek, hooks are akin to a mix of functions and event handlers: like functions, calls to them are synchronous (i.e., run to completion and return); but like events, they can have multiple bodies which will all execute. For defining a notice policy, you define a hook and Zeek will take care of passing in the Notice::Info record. The simplest kind of Notice::policy hook simply checks the value of $note in the Notice::Info record being passed into the hook and performs an action based on the answer. The hook below adds the Notice::ACTION_EMAIL action for the SSH::Interesting_Hostname_Login notice raised in the policy/protocols/ssh/interesting-hostnames.zeek script.

framework_notice_hook_01.zeek
@load policy/protocols/ssh/interesting-hostnames.zeek

hook Notice::policy(n: Notice::Info)
    {
    if ( n$note == SSH::Interesting_Hostname_Login )
        add n$actions[Notice::ACTION_EMAIL];
    }

In the example above we’ve added Notice::ACTION_EMAIL to the n$actions set. This set, defined in the Notice Framework scripts, can only have entries from the Notice::Action type, which is itself an enumerable that defines the values shown in the table below along with their corresponding meanings. The Notice::ACTION_LOG action writes the notice to the Notice::LOG logging stream which, in the default configuration, will write each notice to the notice.log file and take no further action. The Notice::ACTION_EMAIL action will send an email to the address or addresses defined in the Notice::mail_dest variable with the particulars of the notice as the body of the email. The last action, Notice::ACTION_ALARM sends the notice to the Notice::ALARM_LOG logging stream which is then rotated hourly and its contents emailed in readable ASCII to the addresses in Notice::mail_dest.

Action                  Description
Notice::ACTION_NONE     Take no action.
Notice::ACTION_LOG      Send the notice to the Notice::LOG logging stream.
Notice::ACTION_EMAIL    Send an email with the notice in the body.
Notice::ACTION_ALARM    Send the notice to the Notice::ALARM_LOG stream.
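A policy hook is not limited to comparing $note against a single value. As a small sketch, assuming both policy scripts referenced in this section are loaded, the hook below uses a membership test against a set of notice types and adds Notice::ACTION_ALARM to any match. The set name alarm_worthy is hypothetical.

```zeek
@load policy/protocols/ssh/interesting-hostnames
@load policy/protocols/ssl/expiring-certs

# Hypothetical set of notice types this site wants in the alarm log.
const alarm_worthy: set[Notice::Type] = {
    SSH::Interesting_Hostname_Login,
    SSL::Certificate_Expires_Soon
} &redef;

hook Notice::policy(n: Notice::Info)
    {
    # Membership test with the in operator, then add the action.
    if ( n$note in alarm_worthy )
        add n$actions[Notice::ACTION_ALARM];
    }
```

Since the set carries the &redef attribute, another script could extend it without touching the hook body.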

While actions like Notice::ACTION_EMAIL have appeal for quick alerts and response, a caveat of their use is that notices configured with this action should also have a suppression. A suppression is a means through which notices can be ignored after they are initially raised if the author of the script has set an identifier. An identifier is a unique string of information collected from the connection relative to the behavior that has been observed by Zeek.

scripts/policy/protocols/ssl/expiring-certs.zeek
NOTICE([$note=Certificate_Expires_Soon,
        $msg=fmt("Certificate %s is going to expire at %T", cert$subject, cert$not_valid_after),
        $conn=c, $suppress_for=1day,
        $identifier=cat(c$id$resp_h, c$id$resp_p, hash),
        $fuid=fuid]);

In the policy/protocols/ssl/expiring-certs.zeek script which identifies when SSL certificates are set to expire and raises notices when it crosses a predefined threshold, the call to NOTICE above also sets the $identifier entry by concatenating the responder IP, port, and the hash of the certificate. The selection of responder IP, port and certificate hash fits perfectly into an appropriate identifier as it creates a unique identifier with which the suppression can be matched. Were we to take out any of the entities used for the identifier, for example the certificate hash, we could be setting our suppression too broadly, causing an analyst to miss a notice that should have been raised. Depending on the available data for the identifier, it can be useful to set the $suppress_for variable as well. The expiring-certs.zeek script sets $suppress_for to 1day, telling the Notice Framework to suppress the notice for 24 hours after the first notice is raised. Once that time limit has passed, another notice can be raised which will again set the 1day suppression time. Suppressing for a specific amount of time has benefits beyond simply not filling up an analyst’s email inbox; keeping the notice alerts timely and succinct helps avoid a case where an analyst might see the notice and, due to over exposure, ignore it.

The $suppress_for variable can also be altered in a Notice::policy hook, allowing a deployment to better suit the environment in which it is run. Using the example of expiring-certs.zeek, we can write a Notice::policy hook for SSL::Certificate_Expires_Soon to configure the $suppress_for variable to a shorter time.

framework_notice_hook_suppression_01.zeek
@load policy/protocols/ssl/expiring-certs.zeek

hook Notice::policy(n: Notice::Info)
    {
    if ( n$note == SSL::Certificate_Expires_Soon )
        n$suppress_for = 12hrs;
    }

While Notice::policy hooks allow you to build custom predicate-based policies for a deployment, there are bound to be times where you don’t require the full expressiveness that a hook allows. In short, there will be notice policy considerations where a broad decision can be made based on the Notice::Type alone. To facilitate these types of decisions, the Notice Framework supports Notice Policy shortcuts. These shortcuts are implemented through the means of a group of data structures that map specific, predefined details and actions to the effective name of a notice. Primarily implemented as a set or table of enumerables of Notice::Type, Notice Policy shortcuts can be placed as a single directive in your local.zeek file as a concise readable configuration. As these variables are all constants, it bears mentioning that these variables are all set at parse-time before Zeek is fully up and running and not set dynamically.

Name                                  Description                                            Data Type
Notice::ignored_types                 Ignore the Notice::Type entirely                       set[Notice::Type]
Notice::emailed_types                 Set Notice::ACTION_EMAIL to this Notice::Type          set[Notice::Type]
Notice::alarmed_types                 Set Notice::ACTION_ALARM to this Notice::Type          set[Notice::Type]
Notice::not_suppressed_types          Remove suppression from this Notice::Type              set[Notice::Type]
Notice::type_suppression_intervals    Alter the $suppress_for value for this Notice::Type    table[Notice::Type] of interval

The table above details the five Notice Policy shortcuts, their meaning and the data type used to implement them. With the exception of Notice::type_suppression_intervals, a set data type is employed to hold the Notice::Type of the notice to which a shortcut should be applied. The first three shortcuts are fairly self-explanatory, applying an action to the Notice::Type elements in the set, while the latter two shortcuts alter details of the suppression being applied to the notice. The shortcut Notice::not_suppressed_types can be used to remove the configured suppression from a notice, while Notice::type_suppression_intervals can be used to alter the suppression interval defined by $suppress_for in the call to NOTICE.

framework_notice_shortcuts_01.zeek
@load policy/protocols/ssh/interesting-hostnames.zeek
@load base/protocols/ssh/

redef Notice::emailed_types += {
    SSH::Interesting_Hostname_Login
};

The Notice Policy shortcut above adds the Notice::Type of SSH::Interesting_Hostname_Login to the Notice::emailed_types set while the shortcut below alters the length of time for which those notices will be suppressed.

framework_notice_shortcuts_02.zeek
@load policy/protocols/ssh/interesting-hostnames.zeek
@load base/protocols/ssh/

redef Notice::type_suppression_intervals += {
    [SSH::Interesting_Hostname_Login] = 1day,
};
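The remaining set-based shortcuts follow the same pattern. As a brief sketch, the directives below could sit in local.zeek; which notice types to list is a site decision, and the two chosen here are only examples taken from the scripts discussed in this section.

```zeek
@load policy/protocols/ssh/interesting-hostnames
@load policy/protocols/ssl/expiring-certs

# Example choice: raise every SSH hostname notice, with no suppression.
redef Notice::not_suppressed_types += {
    SSH::Interesting_Hostname_Login
};

# Example choice: drop expiring-certificate notices entirely.
redef Notice::ignored_types += {
    SSL::Certificate_Expires_Soon
};
```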