Logging Framework
Zeek comes with a flexible key-value based logging interface that allows fine-grained control of what gets logged and how it is logged. This document describes how logging can be customized and extended.
Terminology
Zeek’s logging interface is built around three main abstractions:
- Streams
A log stream corresponds to a single log. It defines the set of fields that a log consists of with their names and types. Examples are the conn stream for recording connection summaries, and the http stream for recording HTTP activity.
- Filters
Each stream has a set of filters attached to it that determine what information gets written out, and how. By default, each stream has one default filter that just logs everything directly to disk. However, additional filters can be added to record only a subset of the log records, write to different outputs, or set a custom rotation interval. If all filters are removed from a stream, then output is disabled for that stream.
- Writers
Each filter has a writer. A writer defines the actual output format for the information being logged. The default writer is the ASCII writer, which produces tab-separated ASCII files. Other writers are available, like for binary output or direct logging into a database.
There are several different ways to customize Zeek’s logging: you can create a new log stream, you can extend an existing log with new fields, you can apply filters to an existing log stream, or you can customize the output format by setting log writer options. All of these approaches are described in this document.
Streams
In order to log data to a new log stream, all of the following needs to be done:
- A record type must be defined which consists of all the fields that will be logged (by convention, the name of this record type is usually "Info").
- A log stream ID (an enum with type name Log::ID) must be defined that uniquely identifies the new log stream.
- A log stream must be created using the Log::create_stream function.
- When the data to be logged becomes available, the Log::write function must be called.
In the following example, we create a new module, Foo, which creates a new log stream.
module Foo;

export {
    # Create an ID for our new stream. By convention, this is
    # called "LOG".
    redef enum Log::ID += { LOG };

    # Define the record type that will contain the data to log.
    type Info: record {
        ts: time &log;
        id: conn_id &log;
        service: string &log &optional;
        missed_bytes: count &log &default=0;
    };
}

# Optionally, we can add a new field to the connection record so that
# the data we are logging (our "Info" record) will be easily
# accessible in a variety of event handlers.
redef record connection += {
    # By convention, the name of this new field is the lowercase name
    # of the module.
    foo: Info &optional;
};

# This event is handled at a priority higher than zero so that if
# users modify this stream in another script, they can do so at the
# default priority of zero.
event zeek_init() &priority=5
    {
    # Create the stream. This adds a default filter automatically.
    Log::create_stream(Foo::LOG, [$columns=Info, $path="foo"]);
    }
In the definition of the Info record above, notice that each field has the &log attribute. Without this attribute, a field will not appear in the log output. Also notice that one field has the &optional attribute, which indicates that the field might not be assigned any value before the log record is written. Finally, a field with the &default attribute has a default value assigned to it automatically.
At this point, the only thing missing is a call to the Log::write function to send data to the logging framework. The actual event handler where this should take place will depend on where your data becomes available. In this example, the connection_established event provides our data, and we also store a copy of the data being logged into the connection record:
event connection_established(c: connection)
    {
    local rec: Foo::Info = [$ts=network_time(), $id=c$id];

    # Store a copy of the data in the connection record so other
    # event handlers can access it.
    c$foo = rec;

    Log::write(Foo::LOG, rec);
    }
If you run Zeek with this script, a new log file foo.log will be created. Although we only specified four fields in the Info record above, the log output will actually contain seven fields because one of the fields (the one named id) is itself a record type. Since a conn_id record has four fields, each of these fields is a separate column in the log output. Note that the way such fields are named in the log output differs slightly from the way we would refer to the same field in a Zeek script (each dollar sign is replaced with a period). For example, to access the first field of a conn_id in a Zeek script we would use the notation id$orig_h, but that field is named id.orig_h in the log output.
When you are developing scripts that add data to the connection record, take care with when and for how long data is stored. Data saved to the connection record normally remains there for the duration of the connection, and in practice it is not uncommon to need to delete that data before the connection ends.
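As a sketch of such cleanup, reusing the Foo module above: a lower-priority handler can drop the stored copy right after the write, once no other handler needs it.

event connection_established(c: connection) &priority=-5
    {
    # By now the record has been written via Log::write and all
    # higher-priority handlers have seen c$foo, so free it instead
    # of carrying it for the connection's entire lifetime.
    if ( c?$foo )
        delete c$foo;
    }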
Add Fields to a Log
You can add additional fields to a log by extending the record type that defines its content, and setting a value for the new fields before each log record is written.
Let’s say we want to add a boolean field is_private to Conn::Info that indicates whether the originator IP address is part of the RFC 1918 space:
# Add a field to the connection log record.
redef record Conn::Info += {
    ## Indicate if the originator of the connection is part of the
    ## "private" address space defined in RFC1918.
    is_private: bool &default=F &log;
};
As this example shows, when extending a log stream’s Info record, each new field must always be declared either with a &default value or as &optional. Furthermore, you need to add the &log attribute; otherwise the field won’t appear in the log file.
Now we need to set the field. Although the details vary depending on which log is being extended, in general it is important to choose a suitable event in which to set the additional fields, because we need to make sure that the fields are set before the log record is written. Sometimes the right choice is the same event that writes the log record, but at a higher priority (to ensure that the handler that sets the additional fields executes before the one that writes the record).
In this example, since a connection’s summary is generated at the time its state is removed from memory, we can add another handler at that time that sets our field correctly:
event connection_state_remove(c: connection)
    {
    if ( c$id$orig_h in Site::private_address_space )
        c$conn$is_private = T;
    }
Now conn.log will show a new field is_private of type bool. If you look at the Zeek script that defines the connection log stream, base/protocols/conn/main.zeek, you will see that Log::write gets called in a handler for the same event used in this example to set the additional field, but at a lower priority (i.e., the log record gets written after we assign the is_private field).
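If you prefer not to rely on the default priority, you can make the ordering explicit by raising your handler's priority; a minimal sketch of the same handler:

event connection_state_remove(c: connection) &priority=5
    {
    # Runs before the default-priority handlers and well before the
    # conn.log write, which happens at a negative priority.
    if ( c$id$orig_h in Site::private_address_space )
        c$conn$is_private = T;
    }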
Extending logs this way requires a bit of knowledge about how the script that creates the log stream organizes its state. Most of the standard Zeek scripts attach their log state to the connection record, where it can then be accessed, just like c$conn above. For example, the HTTP analysis adds a field http of type HTTP::Info to the connection record.
Define a Logging Event
Sometimes it is helpful to do additional analysis of the information being logged. For these cases, a stream can specify an event that will be generated every time a log record is written to it. To do this, we need to modify the example module shown above to look something like this:
module Foo;

export {
    redef enum Log::ID += { LOG };

    type Info: record {
        ts: time &log;
        id: conn_id &log;
        service: string &log &optional;
        missed_bytes: count &log &default=0;
    };

    # Define a logging event. By convention, this is called
    # "log_<stream>".
    global log_foo: event(rec: Info);
}

event zeek_init() &priority=5
    {
    # Specify the "log_foo" event here in order for Zeek to raise it.
    Log::create_stream(Foo::LOG, [$columns=Info, $ev=log_foo,
                                  $path="foo"]);
    }
All of Zeek’s default log streams define such an event. For example, the connection log stream raises the event Conn::log_conn. You could use that, for example, to flag when a connection to a specific destination exceeds a certain duration:
redef enum Notice::Type += {
    ## Indicates that a connection remained established longer
    ## than 5 minutes.
    Long_Conn_Found
};

event Conn::log_conn(rec: Conn::Info)
    {
    if ( rec?$duration && rec$duration > 5mins )
        NOTICE([$note=Long_Conn_Found,
                $msg=fmt("unusually long conn to %s", rec$id$resp_h),
                $id=rec$id]);
    }
Often, these events can be an alternative to post-processing Zeek logs externally with custom scripts. Much of what such an external script would do later offline can instead be done directly inside Zeek in real-time.
Disable a Stream
One way to “turn off” a log is to completely disable the stream. For example, the following prevents conn.log from being written:
event zeek_init()
    {
    Log::disable_stream(Conn::LOG);
    }
Note that this must run after the stream is created, so the priority of this event handler must be lower than the priority of the event handler where the stream was created.
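Since Zeek's own streams are created in zeek_init handlers that typically use &priority=5, a handler at the default priority of zero already runs late enough; to be explicit about the ordering, you can use a negative priority, as in this sketch:

event zeek_init() &priority=-5
    {
    # Runs after the stream-creation handlers, which typically use
    # &priority=5.
    Log::disable_stream(Conn::LOG);
    }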
Delaying Log Writes
New in version 6.2.
The logging framework allows delaying log writes using the Log::delay function. This functionality enables querying or waiting for additional data to attach to an in-flight log record for which a Log::write has happened. Common examples are executing DNS reverse lookups for the addresses of a connection or, more generally, issuing asynchronous queries to external systems. Another is waiting a short duration for more data about specific connections or events from an external process; for example, endpoint agents may provide detailed process information for specific connections logged by Zeek.
Conceptually, the delay of a log record is placed after the execution of the global Log::log_stream_policy hook and before the execution of policy hooks attached to filters. At this point, calling Log::delay is only valid for the currently active write during the execution of the global Log::log_stream_policy hook. Calling Log::delay in any other context or with the wrong arguments results in runtime errors.
Note
While this may appear very restrictive, it makes explicit that it is the action of a Log::write for a given stream and log record that is being delayed, and it provides a defined point where a delay starts. Earlier designs entertained an implicit and very lax interface, but that was deemed too loose, providing flexibility that would have been hard to restrict again or keep stable later. The current interface might be made more lax in the future if it turns out to be too rigid.
By default, log records are not delayed. That is, during the execution of the Log::write function, a serialized version of the given log record is handed off to a remote logger or a local logging thread. Modifications of the same log record after Log::write has returned have no effect.
In contrast, when a log write is delayed using the Log::delay function, the record is enqueued into a per-stream record queue and Log::write returns. Processing of the delayed write resumes once it is released using the Log::delay_finish function, or once a maximum delay duration, configurable per stream, expires.
When processing of a log write is resumed, first all post-delay callbacks given to Log::delay are executed. Thereafter, as for non-delayed writes, filter policy hooks are executed and the log record is serialized.
Policy hooks attached to filters and the serialization step observe any mutations done during the delay. Filter policy hooks may even use these modifications for deciding on the verdict of the given log record.
Note
Policy hooks attached to filters are often used to skip logging of uninteresting log records. When combined with log write delaying, users should consider lifting such filter logic up into the Log::log_stream_policy hook to avoid unnecessarily delaying records when it is known that they will be discarded later on.
The Log::delay and Log::delay_finish functions increment and decrement an internal reference count for a given write. To continue a delayed write, Log::delay_finish must be called as often as Log::delay.
Zeek delays a log record by a configurable interval defined for each log stream. It defaults to the global Log::default_max_delay_interval and can be adjusted by calling Log::set_max_delay_interval on the stream.
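For instance, a minimal sketch raising the conn stream's maximum delay, assuming the signature taking a stream ID and an interval:

event zeek_init()
    {
    # Allow delayed writes on the conn stream to linger up to half a
    # second before they are forcibly released.
    Log::set_max_delay_interval(Conn::LOG, 500msec);
    }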
It is possible to explicitly extend the delay duration by providing a post-delay callback to Log::delay. Calling Log::delay from within such a callback re-delays the record, essentially putting it at the end of the per-stream queue again.
Note
While this puts the burden of realizing longer, per-record delay intervals on the script writer, it allows for a simpler internal implementation. Additionally, the explicit re-delaying is meant to make users aware of the consequences of such long delays, whether used on purpose or by accident.
For delays of multiple seconds or longer, consider instead resumable, robust, and non-ephemeral external post-processing steps based on Zeek logs. In the face of worker crashes or uncontrolled restarts of a Zeek cluster, all delayed log records are inevitably lost.
The following example shows how to use the when statement to asynchronously look up the DNS names of the originator and responder addresses and enrich an in-flight Conn::Info record. By default, a stream’s maximum delay interval is 200 milliseconds; the timeout 150msec part ensures a delayed write resumes after just 150 milliseconds by explicitly calling Log::delay_finish.
@load base/protocols/conn

redef record Conn::Info += {
    orig_name: string &log &optional;
    resp_name: string &log &optional;
};

hook Log::log_stream_policy(rec: Conn::Info, id: Log::ID)
    {
    if ( id != Conn::LOG )
        return;

    local token1 = Log::delay(id, rec);
    local token2 = Log::delay(id, rec);

    when [id, rec, token1] ( local orig_name = lookup_addr(rec$id$orig_h) )
        {
        rec$orig_name = orig_name;
        Log::delay_finish(id, rec, token1);
        }
    timeout 150msec
        {
        Reporter::warning(fmt("lookup_addr timeout for %s", rec$id$orig_h));
        Log::delay_finish(id, rec, token1);
        }

    when [id, rec, token2] ( local resp_name = lookup_addr(rec$id$resp_h) )
        {
        rec$resp_name = resp_name;
        Log::delay_finish(id, rec, token2);
        }
    timeout 150msec
        {
        Reporter::warning(fmt("lookup_addr timeout for %s", rec$id$resp_h));
        Log::delay_finish(id, rec, token2);
        }
    }
Filters
A stream has one or more filters attached to it. A stream without any filters will not produce any log output. Filters govern two aspects of log production: they control which of the stream’s log entries get written out, and they define how to actually implement the log writes. They do the latter by specifying a log writer that implements the write operation, such as the ASCII writer (see below) for text file output. When a stream is created, it automatically gets a default filter attached to it. This default filter can be removed or replaced, or other filters can be added to the stream. This is accomplished by using either the Log::add_filter or Log::remove_filter function. This section shows how to use filters to do such tasks as rename a log file, split the output into multiple files, control which records are written, and set a custom rotation interval.
Each filter has a unique name, scoped to the stream it belongs to. That is, all filters attached to a given stream have different names. Calling Log::add_filter to add a filter with a name that already exists for the stream replaces the existing filter.
Rename a Log File
Normally, the log filename for a given log stream is determined when the stream is created, unless you explicitly specify a different one by adding a filter.
The easiest way to change a log filename is to simply replace the default log filter with a new filter that specifies a value for the path field. In this example, conn.log will be changed to myconn.log:
event zeek_init()
    {
    # Replace default filter for the Conn::LOG stream in order to
    # change the log filename.
    local f = Log::get_filter(Conn::LOG, "default");
    f$path = "myconn";
    Log::add_filter(Conn::LOG, f);
    }
Keep in mind that the path field of a log filter never contains the filename extension. The extension will be determined later by the log writer.
Change the Logging Directory
By default, Zeek log files are created in the current working directory.
To write logs into a different directory, set Log::default_logdir:

redef Log::default_logdir = "/path/to/output_log_directory";

The Log::default_logdir option is honored by all file-based writers included with Zeek (ASCII and SQLite).
Add an Additional Output File
Normally, a log stream writes to only one log file. However, you can add filters so that the stream writes to multiple files. This is useful if you want to restrict the set of fields being logged to the new file.
In this example, a new filter is added to the Conn::LOG stream that writes two fields to a new log file:
event zeek_init()
    {
    # Add a new filter to the Conn::LOG stream that logs only
    # timestamp and originator address.
    local filter: Log::Filter = [$name="orig-only", $path="origs",
                                 $include=set("ts", "id.orig_h")];
    Log::add_filter(Conn::LOG, filter);
    }
Note
When multiple filters added to a stream use the same path value, Zeek disambiguates the output file names by adding numeric suffixes. If we said $path="conn" in the above example, Zeek would warn us that it’ll write this filter’s log entries to a different file:
1071580905.346457 warning: Write using filter 'orig-only' on path 'conn' changed to use new path 'conn-2' to avoid conflict with filter 'default'
The same also happens when omitting a path value, in which case the filter inherits the value of the stream’s path member.
Notice how the include filter attribute specifies a set that limits the fields to the ones given. The names correspond to those in the Conn::Info record (however, because the id field is itself a record, we can specify an individual field of id by the dot notation shown in the example).
Using the code above, in addition to the regular conn.log, you will now also get a new log file origs.log that looks like the regular conn.log, but has only the fields specified in the include filter attribute.
If you want to skip only some fields but keep the rest, there is a corresponding exclude filter attribute that you can use instead of include to list only the ones you are not interested in.
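A minimal sketch of the exclude variant, dropping a few Conn::Info columns from the default filter while keeping everything else:

event zeek_init()
    {
    local f = Log::get_filter(Conn::LOG, "default");
    # Log all conn.log fields except these three.
    f$exclude = set("local_orig", "local_resp", "history");
    Log::add_filter(Conn::LOG, f);
    }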
If you want to make this the only log file for the stream, you can remove the default filter:
event zeek_init()
    {
    # Remove the filter called "default".
    Log::remove_filter(Conn::LOG, "default");
    }
Determine Log Path Dynamically
Instead of using the path filter attribute, a filter can determine output paths dynamically, based on the record being logged. That allows, e.g., recording local and remote connections into separate files. To do this, you define a function that returns the desired path and use the path_func filter attribute:
function myfunc(id: Log::ID, path: string, rec: Conn::Info) : string
    {
    # Return "conn-local" if originator is a local IP, otherwise
    # return "conn-remote".
    local r = Site::is_local_addr(rec$id$orig_h) ? "local" : "remote";
    return fmt("%s-%s", path, r);
    }

event zeek_init()
    {
    local filter: Log::Filter = [$name="conn-split",
                                 $path_func=myfunc, $include=set("ts", "id.orig_h")];
    Log::add_filter(Conn::LOG, filter);
    }
Running this will now produce two new files, conn-local.log and conn-remote.log, with the corresponding entries. For this example to work, Site::local_nets must specify your local network. It defaults to IANA’s standard private address space. One could extend this further, for example to log information by subnets or even by IP address. Be careful, however, as it is easy to create many files very quickly.
The myfunc function has one drawback: it can be used only with the Conn::LOG stream, as the record type is hardcoded into its argument list. However, Zeek allows a more generic variant:
function myfunc(id: Log::ID, path: string,
                rec: record { id: conn_id; } ) : string
    {
    local r = Site::is_local_addr(rec$id$orig_h) ? "local" : "remote";
    return fmt("%s-%s", path, r);
    }
This function can be used with all log streams whose records contain an id: conn_id field.
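For instance, since HTTP::Info carries such an id field, the same function could split HTTP activity by locality; a sketch reusing the generic myfunc above:

event zeek_init()
    {
    local f = Log::get_filter(HTTP::LOG, "default");
    f$path_func = myfunc;
    Log::add_filter(HTTP::LOG, f);
    }

This yields http-local.log and http-remote.log instead of a single http.log.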
Filtering Log Records
We just saw how to customize the logged columns. The logging framework also lets you control which records Zeek writes out. It relies on Zeek’s hook mechanism to do this, as follows. The framework provides two levels of “policy” hooks, a global one and a set of filter-level ones. The hook handlers can implement additional processing of a log record, including vetoing the writing of the record. Any handler that uses a break statement to leave the hook declares that a record shall not be written out. Anyone can attach handlers to these hooks, which look as follows:
type Log::StreamPolicyHook: hook(rec: any, id: ID);
type Log::PolicyHook: hook(rec: any, id: ID, filter: Filter);
For both hook types, the rec argument contains the entry to be logged and is an instance of the record type associated with the stream’s columns, and id identifies the log stream.
The logging framework defines one global policy hook: Log::log_stream_policy. For every log write, this hook gets invoked first. Any of its handlers may decide to veto the log entry. The framework then iterates over the log stream’s filters. Each filter has a filter$policy hook of type Log::PolicyHook. Its handlers receive the log record, the ID of the log stream, and the filter record itself. Each handler can veto the write. After the filter’s hook has run, any veto (by Log::log_stream_policy or the filter’s hook) aborts the write via that filter. If no veto has occurred, the filter now steers the log record to its output.
You can pass arbitrary state through these hook handlers. For example, you can extend streams or filters via a redef, or pass key-value pairs via the filter$config table, as sketched below.
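A minimal sketch of the filter$config approach, using the Foo stream defined earlier; the min_missed key is purely illustrative:

hook my_policy(rec: Foo::Info, id: Log::ID, filter: Log::Filter)
    {
    # Veto records below a threshold configured on the filter itself.
    if ( "min_missed" in filter$config &&
         rec$missed_bytes < to_count(filter$config["min_missed"]) )
        break;
    }

event zeek_init()
    {
    local f = Log::get_filter(Foo::LOG, "default");
    f$policy = my_policy;
    f$config = table(["min_missed"] = "1");
    Log::add_filter(Foo::LOG, f);
    }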
Since you’ll often want uniform handling for all writes on a given stream, log streams offer a default hook, provided when constructing the stream, that the stream’s filters use if they don’t provide their own. To support hooks on your log streams, you should always define a default hook when creating new streams, as follows:
module Foo;

export {
    ## The logging stream identifier.
    redef enum Log::ID += { LOG };

    ## A default logging policy hook for the stream.
    global log_policy: Log::PolicyHook;

    # Define the record type that will contain the data to log.
    type Info: record {
        ts: time &log;
        id: conn_id &log;
        service: string &log &optional;
        missed_bytes: count &log &default=0;
    };
}

event zeek_init() &priority=5
    {
    # Create the stream, adding the default policy hook:
    Log::create_stream(Foo::LOG, [$columns=Info, $path="foo", $policy=log_policy]);
    }
With this hook in place, it’s now easy to add a filtering predicate for the Foo log from anywhere:
hook Foo::log_policy(rec: Foo::Info, id: Log::ID, filter: Log::Filter)
    {
    # Let's only log complete information:
    if ( rec$missed_bytes > 0 )
        break;
    }
The Zeek distribution features default hooks for all of its streams. Here’s a more realistic example, using HTTP:
hook HTTP::log_policy(rec: HTTP::Info, id: Log::ID, filter: Log::Filter)
    {
    # Record only connections with successfully analyzed HTTP traffic
    if ( ! rec?$service || rec$service != "http" )
        break;
    }
To override a hook selectively in a new filter, set the hook when adding the filter to a stream:
hook my_policy(rec: Foo::Info, id: Log::ID, filter: Log::Filter)
    {
    # Let's only log incomplete flows:
    if ( rec$missed_bytes == 0 )
        break;
    }

event zeek_init()
    {
    local filter: Log::Filter = [$name="incomplete-only",
                                 $path="foo-incomplete",
                                 $policy=my_policy];
    Log::add_filter(Foo::LOG, filter);
    }
Note that this approach has subtle implications: the new filter does not use the Foo::log_policy hook, and that hook does not get invoked for writes to this filter. Any vetoes or additional processing implemented in Foo::log_policy handlers no longer happen for the new filter. Such hook replacement should rarely be necessary; you may find it preferable to narrow the stream’s default handler to the filter in question:
hook Foo::log_policy(rec: Foo::Info, id: Log::ID, filter: Log::Filter)
    {
    if ( filter$name != "incomplete-only" )
        return;

    # Let's only log incomplete flows:
    if ( rec$missed_bytes == 0 )
        break;
    }
For tasks that need to run once per write, not once per write and filter, use the Log::log_stream_policy hook instead:
hook Log::log_stream_policy(rec: Foo::Info, id: Log::ID)
    {
    # Called once per write.
    }

hook Foo::log_policy(rec: Foo::Info, id: Log::ID, filter: Log::Filter)
    {
    # Called once for each of Foo's filters.
    }
To change an existing filter, first retrieve it, then update it and re-establish it:
hook my_policy(rec: Foo::Info, id: Log::ID, filter: Log::Filter)
    {
    # Let's only log incomplete flows:
    if ( rec$missed_bytes == 0 )
        break;
    }

event zeek_init()
    {
    local f = Log::get_filter(Foo::LOG, "default");
    f$policy = my_policy;
    Log::add_filter(Foo::LOG, f);
    }
Note
Policy hooks can also modify log records, but with subtle implications. The logging framework applies all of a stream’s log filters sequentially to the same log record, so modifications made in a hook handler persist not only into subsequent handlers in the same hook, but also into any handlers in filters processed subsequently. In contrast to hook priorities, filters provide no control over their processing order.
Log Rotation and Post-Processing
The logging framework provides fine-grained control over when and how to rotate log files. Log rotation means that Zeek periodically renames an active log file, such as conn.log, in a manner configurable by the user (e.g., renaming to conn_21-01-03_14-05-00.log to timestamp it), and starts over on a fresh conn.log file. Post-processing means that Zeek can also apply optional additional processing to the rotated file, such as compression or file transfers. These mechanisms apply naturally to file-based log writers, but are available to other writers as well for more generalized forms of periodic additional processing of their outputs.
Rotation Timing
The log rotation interval is globally controllable for all filters by redefining the Log::default_rotation_interval constant, or specifically for certain Log::Filter instances by setting their interv field. The default value, 0secs, disables rotation.
Note
When using ZeekControl, this option is set automatically via the ZeekControl configuration.
Here’s an example of changing just the Conn::LOG stream’s default filter rotation:
event zeek_init()
    {
    local f = Log::get_filter(Conn::LOG, "default");
    f$interv = 1 min;
    Log::add_filter(Conn::LOG, f);
    }
Controlling File Naming
The redef’able Log::rotation_format_func determines the naming of the rotated-to file. The logging framework invokes the function with sufficient context (a Log::RotationFmtInfo record), from which it determines the output name in two parts: the output directory and the output file’s base name, meaning its name without a suffix. It returns these two components via a Log::RotationPath record. The output directory defaults to Log::default_rotation_dir (a config option), and the base name incorporates a timestamp, as specified by Log::default_rotation_date_format.
When Log::default_logdir is in use and Log::rotation_format_func does not set an output directory (e.g., when Log::default_rotation_dir is not set), Log::default_logdir is used as the default output directory.
For examples of customized log rotation, take a look at the relevant test cases.
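As a self-contained illustration, here is a minimal sketch of a custom naming function; it assumes the Log::RotationFmtInfo fields path and open and the Log::RotationPath members dir and file_basename:

function my_rotation_format(ri: Log::RotationFmtInfo): Log::RotationPath
    {
    # Rotate into a "rotated/" subdirectory, naming files like
    # "conn__21-01-03_14-05-00" (the writer appends the extension).
    local base = fmt("%s__%s", ri$path,
                     strftime("%y-%m-%d_%H-%M-%S", ri$open));
    return Log::RotationPath($dir="rotated", $file_basename=base);
    }

redef Log::rotation_format_func = my_rotation_format;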
Post-Processing of Rotated Logs
Post-processing can proceed via defaults configured across all log filters, or with per-filter customizations. Zeek provides helpful default infrastructure to simplify running shell commands on rotated logs, but you’re free to define your own post-processing infrastructure from scratch.
By default, the Log::default_rotation_postprocessor_cmd, if defined, runs on every rotated log. The wrapper function making the actual command invocation is Log::run_rotation_postprocessor_cmd. It passes six additional arguments to the configured shell command:
- The rotated-to file name (e.g. conn_21-01-03_14-05-00.log)
- The original base name (e.g. conn)
- The timestamp at which the original log file got created (e.g. 21-01-03_14.04.00)
- The timestamp at which the original log file got rotated (e.g. 21-01-03_15.05.00)
- 1 if Zeek is terminating, 0 otherwise
- The name of the writer (e.g. ascii for the ASCII writer)
Warning
Zeek ignores failures (non-zero exit codes) of this shell command: the default rotation postprocessor command returns T regardless. Be careful if you implement your own postprocessor function: returning F from it will cause the corresponding log writer instance to shut down, so do so only when the writer really won’t be able to continue.
Zeek ships with ready-to-use postprocessors for file transfer via SCP and SFTP. The Zeek project also provides an external tool, zeek-archiver, that performs log compression outside of the Zeek process for robustness.
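For per-filter customization, a filter's postprocessor member accepts a function receiving a Log::RotationInfo record. A minimal sketch that compresses rotated conn logs, assuming gzip is available on the path:

function my_postprocessor(info: Log::RotationInfo): bool
    {
    # Compress the rotated file. Returning F would shut down the
    # writer, so reserve that for fatal conditions.
    system(fmt("gzip %s", safe_shell_quote(info$fname)));
    return T;
    }

event zeek_init()
    {
    local f = Log::get_filter(Conn::LOG, "default");
    f$postprocessor = my_postprocessor;
    Log::add_filter(Conn::LOG, f);
    }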
Other Features
Log Extension Fields
The logging framework provides rudimentary support for adding additional columns to an already defined log format, globally for all logs or for individual log filters only. Records returned by the Log::default_ext_func function get added to every log, and the ext_func member of Log::Filter allows per-filter overrides.
You can configure a prefix string separately for either of these options; this string ensures that the resulting fields don’t collide with already existing log fields. The prefix defaults to an underscore, via Log::default_ext_prefix. The ext_prefix field in filter records overrides it as needed.
The following example, taken straight from a Zeek testcase, adds three extra columns to all logs:
type Extension: record {
    write_ts: time &log;
    stream: string &log;
    system_name: string &log;
};

function add_extension(path: string): Extension
    {
    return Extension($write_ts = network_time(),
                     $stream = path,
                     $system_name = peer_description);
    }

redef Log::default_ext_func = add_extension;
A resulting conn.log:
#fields _write_ts _stream _system_name ts uid …
#types time string string time string …
1071580905.346457 conn zeek 1071580904.891921 Cod6Wj3YeJFHgkaO8j …
Note
Extension fields remain separate from the original log record. They remain invisible to filters, policy hooks, and log events. After filter processing determines that an entry is to be logged, the framework simply tucks the extension’s members onto the list of fields to write out.
Field Name Mapping
On occasion it can be handy to rewrite column names as they appear in a Zeek log. A typical use case for this would be to ensure that column naming complies with the requirements of your log ingestion system. To achieve this, you can provide name translation maps, and here too you can do this either globally or per filter. The maps are simple string tables with the keys being Zeek’s field names and the values being the ones to actually write out. Field names not present in the maps remain unchanged. The global variant is the (normally empty) Log::default_field_name_map, and the corresponding filter-local equivalent is the filter’s field_name_map member.
For example, the following name map gets rid of the dots in the usual naming of connection IDs:
redef Log::default_field_name_map = {
    ["id.orig_h"] = "id_orig_h",
    ["id.orig_p"] = "id_orig_p",
    ["id.resp_h"] = "id_resp_h",
    ["id.resp_p"] = "id_resp_p"
};
With it, all logs rendering a connection identifier tuple now use …
#fields ts uid id_orig_h id_orig_p id_resp_h id_resp_p ...
… instead of the default names:
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p ...
If you’d prefer this change only for a given log filter, make the change to the filter record directly. The following changes the naming only for conn.log:
event zeek_init()
    {
    local f = Log::get_filter(Conn::LOG, "default");
    f$field_name_map = table(
        ["id.orig_h"] = "id_orig_h",
        ["id.orig_p"] = "id_orig_p",
        ["id.resp_h"] = "id_resp_h",
        ["id.resp_p"] = "id_resp_p");
    Log::add_filter(Conn::LOG, f);
    }
Printing to Log Messages
Zeek’s print statement normally writes to stdout or a specific output file. By adjusting the Log::print_to_log enum value you can redirect such statements to instead go directly into a Zeek log. Possible values include:
- Log::REDIRECT_NONE: the default, which doesn’t involve Zeek logs
- Log::REDIRECT_STDOUT: prints that would normally go to stdout go to a log
- Log::REDIRECT_ALL: any prints end up in a log instead of stdout or other files
The Log::print_log_path option defines the name of the log file, Log::PrintLogInfo its columns, and the Log::log_print event allows you to process logged messages via event handlers.
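For example, a one-line redef suffices to capture all prints in the print log (named by Log::print_log_path, "print" by default):

# Send all print statements to the print log instead of stdout.
redef Log::print_to_log = Log::REDIRECT_ALL;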
Local vs Remote Logging
In its log processing, Zeek considers whether log writes should happen locally on a Zeek node or remotely on another node, after forwarding log entries to it. Single-node Zeek setups default to local logging, whereas cluster setups enable local logging only on logger nodes and log remotely on all but the logger nodes. You normally don’t need to go near these settings, but you can do so by redef’ing the Log::enable_local_logging and Log::enable_remote_logging booleans, respectively.
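A sketch forcing a node to keep all of its logs local and not forward them, useful e.g. when debugging a single node:

redef Log::enable_local_logging = T;
redef Log::enable_remote_logging = F;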
Writers
Each filter has a writer. If you do not specify a writer when adding a filter to a stream, then the ASCII writer is the default.
There are two ways to specify a non-default writer. To change the default writer for all log filters, just redefine the Log::default_writer option. Alternatively, you can specify the writer to use on a per-filter basis by setting a value for the filter’s writer field. Consult the documentation of the writer to use to see if there are other options that are needed.
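For example, to make the SQLite writer (described below) the default for every filter that doesn't set its own:

redef Log::default_writer = Log::WRITER_SQLITE;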
ASCII Writer
By default, the ASCII writer outputs log files that begin with several lines of metadata, followed by the actual log output. The metadata describes the format of the log file, the path of the log (i.e., the log filename without file extension), and also specifies the time the log was created and the time when Zeek finished writing to it. The ASCII writer has a number of options for customizing the format of its output; see base/frameworks/logging/writers/ascii.zeek. If you change the output format options, be careful to check whether your post-processing scripts can still recognize your log files.
Some writer options are global (i.e., they affect all log filters using that log writer). For example, to change the output format of all ASCII logs to JSON format:
redef LogAscii::use_json = T;
Some writer options are filter-specific (i.e., they affect only the filters that explicitly specify the option). For example, to change the output format of conn.log only:
event zeek_init()
    {
    local f = Log::get_filter(Conn::LOG, "default");

    # Use tab-separated-value mode
    f$config = table(["tsv"] = "T");

    Log::add_filter(Conn::LOG, f);
    }
SQLite Writer
SQLite is a simple, file-based, widely used SQL database system. Using SQLite allows Zeek to write and access data in a format that is easy to exchange with other applications. Due to the transactional nature of SQLite, databases can be used by several applications simultaneously. Zeek’s input framework supports a SQLite reader.
Logging support for SQLite is available in all Zeek installations. There is no need to load any additional scripts or to enable any compile-time options. Sending data from existing logging streams to SQLite is rather straightforward. Most likely you’ll want SQLite output only for select log filters, so you have to configure one to use the SQLite writer. The following example code adds SQLite as a filter for the connection log:
event zeek_init()
    {
    local filter: Log::Filter =
        [
        $name="sqlite",
        $path="/var/db/conn",
        $config=table(["tablename"] = "conn"),
        $writer=Log::WRITER_SQLITE
        ];
    Log::add_filter(Conn::LOG, filter);
    }
Zeek will create the database file /var/db/conn.sqlite if it does not already exist. It will also create a table with the name conn (if it does not exist) and start appending connection information to the table.
Zeek does not currently support rotating SQLite databases as it does for ASCII logs. You have to take care to create them in adequate locations.
If you examine the resulting SQLite database, the schema will contain the same fields that are present in the ASCII log files:
sqlite3 /var/db/conn.sqlite
SQLite version 3.8.0.2 2013-09-03 17:11:13
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .schema
CREATE TABLE conn (
'ts' double precision,
'uid' text,
'id.orig_h' text,
'id.orig_p' integer,
...
Note that with the above code the ASCII conn.log will still be created, because the code adds an additional log filter alongside the default, ASCII-logging one. To prevent this you can remove the default filter:
Log::remove_filter(Conn::LOG, "default");
To create a custom SQLite log file, you have to create a new log stream that contains just the information you want to commit to the database. See the above documentation on how to create custom log streams.
None Writer
The None writer, selected via Log::WRITER_NONE, is largely a troubleshooting and development aid. It discards all log entries it receives, but behaves like a proper writer to the rest of the logging framework, including, for example, pretended log rotation. If you enable its debugging mode by setting LogNone::debug to T, Zeek reports operational details about the writer’s activity to stdout.
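A sketch routing conn.log writes into the None writer with debugging enabled:

redef LogNone::debug = T;

event zeek_init()
    {
    local f = Log::get_filter(Conn::LOG, "default");
    f$writer = Log::WRITER_NONE;
    Log::add_filter(Conn::LOG, f);
    }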