Popular Customizations
This page outlines customizations and additions that are popular among Zeek users.
Note
This page lists externally-maintained Zeek packages. The Zeek team does not provide support or maintenance for these packages. If you find bugs or have feature requests, please reach out to the respective package maintainers directly.
You may also post in the Zeek Slack #packages channel or forum to get help from the broader Zeek community.
Log Enrichment
Community ID
New in version 6.0.
Zeek includes native Community ID Flow Hashing support. This functionality was previously provided through the zeek-community-id package.
Note
At this point, the external zeek-community-id package is still available to support Zeek deployments running older versions. However, the scripts provided by the package conflict with those provided in Zeek 6.0, so do not load both.
Loading the policy/protocols/conn/community-id-logging.zeek and policy/frameworks/notice/community-id.zeek scripts adds a community_id field to the Conn::Info and Notice::Info records.
$ zeek -r ./traces/get.trace protocols/conn/community-id-logging LogAscii::use_json=T
$ jq < conn.log
{
  "ts": 1362692526.869344,
  "uid": "CoqLmg1Ds5TE61szq1",
  "id.orig_h": "141.142.228.5",
  "id.orig_p": 59856,
  "id.resp_h": "192.150.187.43",
  "id.resp_p": 80,
  "proto": "tcp",
  ...
  "community_id": "1:yvyB8h+3dnggTZW0UEITWCst97w="
}
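To enable the field permanently rather than per invocation, the two scripts can be loaded from a site configuration file. The following is a minimal sketch; placing it in local.zeek is an assumption about your deployment, not a requirement.

```zeek
# Sketch of a site configuration fragment (e.g. local.zeek; the file
# name and location depend on your installation).

# Adds community_id to Conn::Info, and therefore to conn.log.
@load policy/protocols/conn/community-id-logging

# Adds community_id to Notice::Info, and therefore to notice.log.
@load policy/frameworks/notice/community-id
```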
The Community ID Flow Hash of a conn_id instance can be computed with the community_id_v1 builtin function directly on the command-line or used in custom scripts.
$ zeek -e 'print community_id_v1([$orig_h=141.142.228.5, $orig_p=59856/tcp, $resp_h=192.150.187.43, $resp_p=80/tcp])'
1:yvyB8h+3dnggTZW0UEITWCst97w=
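In a custom script, the same function can be applied to a live connection's conn_id. The sketch below prints the hash for every new connection; the choice of the new_connection event and the output format are illustrative assumptions only.

```zeek
# Print the connection UID together with its Community ID hash.
event new_connection(c: connection)
    {
    # c$id is the connection's conn_id record (the endpoint 4-tuple).
    print fmt("%s %s", c$uid, community_id_v1(c$id));
    }
```

Saved to a file and run against a trace with zeek -r, this should emit one line per connection, which can be matched against the community_id values in conn.log.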
Address geolocation and AS lookups
Zeek supports IP address geolocation as well as AS (autonomous system) lookups. This requires two things:
Compilation of Zeek with the libmaxminddb library and development headers. If you’re using our Docker images or binary packages, there’s nothing to do: they ship with GeoIP support.
Installation of corresponding MaxMind database files on your system.
To check whether your Zeek supports geolocation, run zeek-config --have-geoip (available since Zeek 6.2) or simply try an address lookup. The following indicates that your Zeek lacks support:
$ zeek -e 'lookup_location(1.2.3.4)'
error in <command line>, line 1: Zeek was not configured for GeoIP support (lookup_location(1.2.3.4))
Read on for more details about building Zeek with GeoIP support, and how to configure access to the database files.
Building Zeek with libmaxminddb
If you build Zeek yourself, you need to install libmaxminddb prior to configuring your build.
RPM/RedHat-based Linux:
sudo yum install libmaxminddb-devel
DEB/Debian-based Linux:
sudo apt-get install libmaxminddb-dev
FreeBSD:
sudo pkg install libmaxminddb
macOS:
Install libmaxminddb from your preferred package management system (e.g. Homebrew, MacPorts, or Fink). For Homebrew, the package name is libmaxminddb.
The configure script’s output indicates whether it successfully located libmaxminddb. If your system’s MaxMind library resides in a non-standard path, you may need to specify it via ./configure --with-geoip=<path>.
Installing and configuring GeoIP databases
MaxMind’s databases ship as individual files that you can download from their website after signing up for an account. Some Linux distributions also offer free databases in their package managers.
There are three types of databases: city-level geolocation, country-level geolocation, and mapping of IP addresses to autonomous systems (AS number and organization). Download these and decide on a place to put them on your file system. If you use automated tooling or system packages for the installation, that path may be chosen for you, such as /usr/share/GeoIP.
Zeek provides three ways to configure access to the databases:
Specifying the path and filenames via script variables. Use the mmdb_dir variable, unset by default, to point to the directory containing the database(s). By default Zeek looks for databases called GeoLite2-City.mmdb, GeoLite2-Country.mmdb, and GeoLite2-ASN.mmdb. Starting with Zeek 6.2 you can adjust these names by redefining the mmdb_city_db, mmdb_country_db, and mmdb_asn_db variables.
Relying on Zeek’s pre-configured search paths and filenames. The mmdb_dir_fallbacks variable contains default search paths that Zeek will try in turn when mmdb_dir is not set. Prior to Zeek 6.2 these paths were hardcoded; they’re now redefinable. For geolocation, Zeek first attempts the city-level database due to its greater precision, and falls back to the country-level one. You can adjust the database filenames via mmdb_city_db and related variables, as covered above.
Opening databases explicitly via scripting. The mmdb_open_location_db and mmdb_open_asn_db functions take full paths to database files. Zeek only ever uses one geolocation and one ASN database, and these loads override any databases previously loaded. These loads can occur at any point.
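The first and third approaches above can be sketched as follows. All paths and the custom filename are hypothetical placeholders; the filename variables assume Zeek 6.2 or later, as noted above.

```zeek
# Approach 1: point Zeek at a directory holding the databases.
# The path is an example; use wherever your files actually live.
redef mmdb_dir = "/usr/share/GeoIP";

# Zeek 6.2 and later: override a filename if yours differs from the default.
# "my-city.mmdb" is a made-up name for illustration.
redef mmdb_city_db = "my-city.mmdb";

# Approach 3 (alternative): open a database explicitly at runtime.
# This overrides any location database loaded earlier.
event zeek_init()
    {
    if ( ! mmdb_open_location_db("/opt/geoip/my-city.mmdb") )
        print "could not open the location database";
    }
```

In practice you would normally pick one approach; they are combined here only to keep the sketch short.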
Querying the databases
Two built-in functions provide GeoIP functionality:
function lookup_location(a:addr): geo_location
function lookup_autonomous_system(a:addr): geo_autonomous_system
lookup_location returns a geo_location record with country/region/etc fields, while lookup_autonomous_system returns a geo_autonomous_system record indicating the AS number and organization. Depending on the queried IP address some fields may be uninitialized, so you should guard access with an a?$b existence test.
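As an illustration of the existence test, the following sketch looks up the autonomous system for one address and prints only the fields that are present. The address is an arbitrary example, and the output depends entirely on your database.

```zeek
event zeek_init()
    {
    local as_info = lookup_autonomous_system(8.8.8.8);

    # Both fields are optional, so check each before reading it.
    if ( as_info?$number )
        print fmt("AS number: %d", as_info$number);

    if ( as_info?$organization )
        print fmt("Organization: %s", as_info$organization);
    }
```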
Zeek tests the database files for staleness. If it detects that a database has been updated, it will automatically reload it.
Zeek does not automatically add GeoIP intelligence to its logs, but several add-on scripts and packages provide such functionality. These include:
The notice framework lets you configure notice types that you’d like to augment with location information. See Notice::lookup_location_types and Notice::ACTION_ADD_GEODATA for details.
The policy/protocols/smtp/detect-suspicious-orig.zeek and policy/protocols/ssh/geo-data.zeek policy scripts.
Several Zeek packages.
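A site configuration using the first two options might look like the sketch below. The choice of SSH::Password_Guessing as the notice type is only an example, and it assumes the SSH brute-force detection script that defines it is loaded.

```zeek
# Defines the SSH::Password_Guessing notice type used below.
@load protocols/ssh/detect-bruteforcing

# Adds geolocation data to SSH log entries.
@load protocols/ssh/geo-data

# Ask the notice framework to augment this notice type with
# location information.
redef Notice::lookup_location_types += { SSH::Password_Guessing };
```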
Testing
Before using the GeoIP functionality it is a good idea to verify that everything is set up correctly. You can quickly check whether it works by running a command like this:
zeek -e "print lookup_location(8.8.8.8);"
If you see an error message similar to “Failed to open GeoIP location database”, then your database configuration is broken. You may need to rename or move your GeoIP database files.
Example
The following shows every FTP connection from hosts in Ohio, US:
event ftp_reply(c: connection, code: count, msg: string, cont_resp: bool)
    {
    local client = c$id$orig_h;
    local loc = lookup_location(client);

    # All geo_location fields are optional, so test for them before use.
    if ( loc?$region && loc$region == "OH" &&
         loc?$country_code && loc$country_code == "US" )
        {
        local city = loc?$city ? loc$city : "<unknown>";

        print fmt("FTP Connection from:%s (%s,%s,%s)", client, city,
                  loc$region, loc$country_code);
        }
    }
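In the same spirit, location data can be written to a log rather than printed. The following is a minimal sketch that adds an originator country code to conn.log; the field name orig_cc is hypothetical and not something Zeek defines.

```zeek
# Extend the conn.log record with an optional, logged field.
# "orig_cc" is a made-up name for this illustration.
redef record Conn::Info += {
    orig_cc: string &optional &log;
};

event connection_state_remove(c: connection)
    {
    # c$conn should be populated by the base conn script at this point,
    # but guard the access anyway.
    if ( ! c?$conn )
        return;

    local loc = lookup_location(c$id$orig_h);

    if ( loc?$country_code )
        c$conn$orig_cc = loc$country_code;
    }
```

The handler runs at the default priority, which is intended to fall after the base script fills in c$conn and before it writes the log entry.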
Log Writers
Kafka
For exporting logs to Apache Kafka in a streaming fashion, the externally-maintained zeek-kafka package is a popular choice and easy to configure. It relies on librdkafka.
redef Log::default_writer = Log::WRITER_KAFKAWRITER;
redef Kafka::kafka_conf += {
    ["metadata.broker.list"] = "192.168.0.1:9092"
};
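If you would rather keep Zeek's default writer and send only selected streams to Kafka, Zeek's standard log filter mechanism can attach the Kafka writer to a single stream. This is a sketch under assumptions: it presumes the zeek-kafka package is installed and loaded so that Log::WRITER_KAFKAWRITER and Kafka::kafka_conf exist, and the filter name is arbitrary. How the package maps the filter's path to a Kafka topic is described in its own documentation.

```zeek
# Broker address as in the example above.
redef Kafka::kafka_conf += {
    ["metadata.broker.list"] = "192.168.0.1:9092"
};

event zeek_init()
    {
    # Add a second filter to the conn stream that uses the Kafka writer.
    # The default filter keeps writing conn.log locally.
    Log::add_filter(Conn::LOG, [
        $name = "kafka-conn",    # arbitrary filter name
        $writer = Log::WRITER_KAFKAWRITER,
        $path = "conn"
    ]);
    }
```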
Logging
JSON Streaming Logs
The externally-maintained json-streaming-logs package tailors Zeek for use with log shippers like Filebeat or fluentd. It configures additional log files prefixed with json_streaming_, adds _path and _write_ts fields to log records and configures log rotation appropriately.
If you do not use a logging archive and want to stream all logs away from the system where Zeek is running without leveraging Kafka, this package helps you with that.
Long Connections
Zeek logs connection entries into the conn.log only upon termination or due to expiration of inactivity timeouts. Depending on the protocol and chosen timeout values this can significantly delay the appearance of a log entry for a given connection. The delay may be up to an hour for lingering SSH connections or connections where the final FIN or RST packets were missed.
The zeek-long-connections package alleviates this by creating a conn_long.log log with the same format as conn.log, but containing entries for connections that have existed for configurable intervals.
By default, the first entry for a connection is logged after 10 minutes. Depending on the environment, this can be lowered, as even a 10 minute delay may be significant for detection purposes in a streaming setup.
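The inactivity timeouts mentioned at the start of this section are ordinary script-level settings in Zeek itself, independent of the package. The sketch below prints the protocol-wide defaults. The commented-out redef shows how one could be changed; the value is purely illustrative, and shortening a timeout affects connection tracking in general, not just logging.

```zeek
# Illustrative only: lower the TCP inactivity timeout.
# redef tcp_inactivity_timeout = 2 min;

event zeek_init()
    {
    # Show the timeouts currently in effect.
    print tcp_inactivity_timeout, udp_inactivity_timeout, icmp_inactivity_timeout;
    }
```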
Profiling and Debugging
jemalloc profiling
For investigating memory leaks or state-growth issues within Zeek, jemalloc’s profiling is invaluable. The zeek-jemalloc-profiling package provides some support for configuring jemalloc’s profiling facilities.
Some general information about memory profiling exists in the Troubleshooting section.