base/frameworks/files/main.zeek
- Files
An interface for driving the analysis of files, possibly independent of any network protocol over which they’re transported.
- Namespace
Files
- Imports
base/bif/file_analysis.bif.zeek, base/frameworks/analyzer, base/frameworks/logging, base/utils/site.zeek
Summary
Runtime Options
Files::enable_reassembler | The default setting for file reassembly.
Redefinable Options
Files::analyze_by_mime_type_automatically | Decide whether to automatically attach analyzers to files based on the detected MIME type of the file.
Files::disable | A table that can be used to disable file analysis completely for any files transferred over given network protocol analyzers.
Files::reassembly_buffer_size | The default per-file reassembly buffer size.
Types
Files::AnalyzerArgs | A structure which parameterizes a type of file analysis.
Files::Info | Contains all metadata related to the analysis of a given file.
Files::ProtoRegistration |
Redefinitions
|
|
|
Events
Files::log_files | Event that can be handled to access the Info record as it is sent on to the logging framework.
Hooks
Files::log_policy | A default logging policy hook for the stream.
Functions
Files::add_analyzer | Adds an analyzer to the analysis of a given file.
Files::all_registered_mime_types | Returns a table of all MIME-type-to-analyzer mappings currently registered.
Files::analyzer_enabled | Checks whether a file analyzer is generally enabled.
Files::analyzer_name | Translates a file analyzer enum value to a string with the analyzer’s name.
Files::describe | Provides a text description regarding metadata of the file.
Files::disable_analyzer | Disables a file analyzer.
Files::disable_reassembly | Disables the file reassembler on this file.
Files::enable_analyzer | Enables a file analyzer.
Files::enable_reassembly | Allows the file reassembler to be used if it’s necessary because the file is transferred out of order.
Files::file_exists | Lookup to see if a particular file id exists and is still valid.
Files::lookup_file | Lookup an fa_file record with the file id.
Files::register_analyzer_add_callback | Register a callback for file analyzers to use if they need to do some manipulation when they are being added to a file before the core code takes over.
Files::register_for_mime_type | Registers a MIME type for an analyzer.
Files::register_for_mime_types | Registers a set of MIME types for an analyzer.
Files::register_protocol | Register callbacks for protocols that work with the Files framework.
Files::registered_mime_types | Returns a set of all MIME types currently registered for a specific analyzer.
Files::remove_analyzer | Removes an analyzer from the analysis of a given file.
Files::set_reassembly_buffer_size | Set the maximum size the reassembly buffer is allowed to grow for the given file.
Files::set_timeout_interval | Sets the timeout_interval field of fa_file, which is used to determine the length of inactivity that is allowed for a file before internal state related to it is cleaned up.
Files::stop | Stops/ignores any further analysis of a given file.
Detailed Interface
Runtime Options
- Files::enable_reassembler
-
The default setting for file reassembly.
Redefinable Options
- Files::analyze_by_mime_type_automatically
-
Decide whether to automatically attach analyzers to files based on the detected MIME type of the file.
- Files::disable
- Type
table[Files::Tag] of bool
- Attributes
- Default
{}
A table that can be used to disable file analysis completely for any files transferred over given network protocol analyzers.
- Files::reassembly_buffer_size
-
The default per-file reassembly buffer size.
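As a rough illustration of how these options might be adjusted from a site policy script, the sketch below uses `redef` on three of them. The values are arbitrary examples, not recommendations, and the sketch assumes the option types are what their descriptions suggest (a byte count and two booleans).

```zeek
# Raise the default per-file reassembly buffer size (illustrative value).
redef Files::reassembly_buffer_size = 1048576;

# Do not attach analyzers automatically based on the detected MIME type.
redef Files::analyze_by_mime_type_automatically = F;

# Change the default setting for file reassembly. This one is listed as a
# runtime option, so a redef only sets its initial value.
redef Files::enable_reassembler = F;
```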
Types
- Files::AnalyzerArgs
- Type
-
- chunk_event: event(f: fa_file, data: string, off: count) &optional
An event which will be generated for all new file contents, chunk-wise. Used when tag (in the Files::add_analyzer function) is Files::ANALYZER_DATA_EVENT.
- stream_event: event(f: fa_file, data: string) &optional
An event which will be generated for all new file contents, stream-wise. Used when tag is Files::ANALYZER_DATA_EVENT.
- extract_filename: string &optional
(present if base/files/extract/main.zeek is loaded)
The local filename to which to write an extracted file. This field is used in the core by the extraction plugin to know where to write the file to. If not specified, then a filename in the format “extract-<source>-<id>” is automatically assigned (using the source and id fields of fa_file).
- extract_limit: count &default = FileExtract::default_limit &optional
(present if base/files/extract/main.zeek is loaded)
The maximum allowed file size in bytes of extract_filename. Once reached, a file_extraction_limit event is raised and the analyzer will be removed unless FileExtract::set_limit is called to increase the limit. A value of zero means “no limit”.
- extract_limit_includes_missing: bool &default = FileExtract::default_limit_includes_missing &optional
(present if base/files/extract/main.zeek is loaded)
By default, missing bytes in files count towards the extract file size. Missing bytes can occur, e.g., due to missed traffic or offsets used when downloading files. Setting this option to false changes this behavior so that holes in files no longer count towards these limits. Files with holes are created as sparse files on disk. Their apparent size can exceed this file size limit.
- Attributes
A structure which parameterizes a type of file analysis.
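The following is a minimal sketch of how the stream_event field might be used together with Files::ANALYZER_DATA_EVENT. The handler name my_file_stream is hypothetical, and attaching the analyzer from a file_new handler is just one possible choice.

```zeek
# Hypothetical handler; receives new file contents stream-wise.
event my_file_stream(f: fa_file, data: string)
	{
	print fmt("file %s: %d more bytes", f$id, |data|);
	}

event file_new(f: fa_file)
	{
	# stream_event is used because the tag is Files::ANALYZER_DATA_EVENT.
	Files::add_analyzer(f, Files::ANALYZER_DATA_EVENT,
	                    [$stream_event=my_file_stream]);
	}
```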
- Files::Info
- Type
-
- ts: time &log
The time when the file was first seen.
- fuid: string &log
An identifier associated with a single file.
- uid: string &log &optional
If this file, or parts of it, were transferred over a network connection, this is the uid for the connection.
- id: conn_id &log &optional
If this file, or parts of it, were transferred over a network connection, this shows the connection.
- source: string &log &optional
An identification of the source of the file data. E.g. it may be a network protocol over which it was transferred, or a local file path which was read, or some other input source.
- depth: count &default = 0 &optional &log
A value to represent the depth of this file in relation to its source. In SMTP, it is the depth of the MIME attachment on the message. In HTTP, it is the depth of the request within the TCP connection.
- analyzers: set[string] &default = { } &optional &log
A set of analysis types done during the file analysis.
- mime_type: string &log &optional
A mime type provided by the strongest file magic signature match against the bof_buffer field of fa_file, or in the cases where no buffering of the beginning of file occurs, an initial guess of the mime type based on the first data seen.
- filename: string &log &optional
A filename for the file if one is available from the source for the file. These will frequently come from “Content-Disposition” headers in network protocols.
- duration: interval &log &default = 0 secs &optional
The duration the file was analyzed for.
- local_orig: bool &log &optional
If the source of this file is a network connection, this field indicates if the data originated from the local network or not as determined by the configured Site::local_nets.
- is_orig: bool &log &optional
If the source of this file is a network connection, this field indicates if the file is being sent by the originator of the connection or the responder.
- seen_bytes: count &log &default = 0 &optional
Number of bytes provided to the file analysis engine for the file. The value refers to the total number of bytes processed for this file across all connections seen by the current Zeek instance.
- total_bytes: count &log &optional
Total number of bytes that are supposed to comprise the full file.
- missing_bytes: count &log &default = 0 &optional
The number of bytes in the file stream that were completely missed during the process of analysis, e.g. due to dropped packets. The value refers to the number of bytes missed for this file across all connections seen by the current Zeek instance.
- overflow_bytes: count &log &default = 0 &optional
The number of bytes in the file stream that were not delivered to stream file analyzers. This could be overlapping bytes or bytes that couldn’t be reassembled.
- timedout: bool &log &default = F &optional
Whether the file analysis timed out at least once for the file.
- parent_fuid: string &log &optional
Identifier associated with a container file from which this one was extracted as part of the file analysis.
- md5: string &log &optional
(present if base/files/hash/main.zeek is loaded)
An MD5 digest of the file contents.
- sha1: string &log &optional
(present if base/files/hash/main.zeek is loaded)
A SHA1 digest of the file contents.
- sha256: string &log &optional
(present if base/files/hash/main.zeek is loaded)
A SHA256 digest of the file contents.
- x509: X509::Info &optional
(present if base/files/x509/main.zeek is loaded)
Information about X509 certificates. This is used to keep certificate information until all events have been received.
- extracted: string &optional &log
(present if base/files/extract/main.zeek is loaded)
Local filename of extracted file.
- extracted_cutoff: bool &optional &log
(present if base/files/extract/main.zeek is loaded)
Set to true if the file being extracted was cut off so the whole file was not logged.
- extracted_size: count &optional &log
(present if base/files/extract/main.zeek is loaded)
The number of bytes extracted to disk.
- entropy: double &log &optional
(present if policy/frameworks/files/entropy-test-all-files.zeek is loaded)
The information density of the contents of the file, expressed as a number of bits per character.
- Attributes
Contains all metadata related to the analysis of a given file. For the most part, fields here are derived from ones of the same name in fa_file.
- Files::ProtoRegistration
- Type
-
- get_file_handle: function(c: connection, is_orig: bool) : string
A callback to generate a file handle on demand when one is needed by the core.
- describe: function(f: fa_file) : string &default = function &optional
A callback to “describe” a file. In the case of an HTTP transfer the most obvious description would be the URL. It’s like an extremely compressed version of the normal log.
Events
- Files::log_files
- Type
event(rec: Files::Info)
Event that can be handled to access the Info record as it is sent on to the logging framework.
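A handler for this event might look like the sketch below. The MIME type tested is only an example; fields marked &optional are checked for presence before use.

```zeek
event Files::log_files(rec: Files::Info)
	{
	# mime_type is &optional, so test for presence first.
	if ( rec?$mime_type && rec$mime_type == "application/x-dosexec" )
		print fmt("%s: executable, %d bytes seen", rec$fuid, rec$seen_bytes);
	}
```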
Hooks
- Files::log_policy
- Type
A default logging policy hook for the stream.
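As a sketch of how the hook could be used to filter log entries: it assumes the hook follows the logging framework's usual policy hook signature (record, stream id, filter), where a `break` vetoes the write. The filtering criterion is purely illustrative.

```zeek
hook Files::log_policy(rec: Files::Info, id: Log::ID, filter: Log::Filter)
	{
	# Veto the log write for files for which no content was ever seen.
	if ( rec$seen_bytes == 0 )
		break;
	}
```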
Functions
- Files::add_analyzer
- Type
function(f: fa_file, tag: Files::Tag, args: Files::AnalyzerArgs &default = [chunk_event=<uninitialized>, stream_event=<uninitialized>, extract_filename=<uninitialized>, extract_limit=104857600, extract_limit_includes_missing=T] &optional) : bool
Adds an analyzer to the analysis of a given file.
- Parameters
f – the file.
tag – the analyzer type.
args – any parameters the analyzer takes.
- Returns
true if the analyzer will be added, or false if analysis for the file isn’t currently active or the args were invalid for the analyzer type.
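A minimal sketch of attaching an analyzer once the MIME type is known, assuming the file_sniff event and the Files::ANALYZER_SHA256 tag are available in the running Zeek version. The MIME type chosen is an arbitrary example.

```zeek
event file_sniff(f: fa_file, meta: fa_metadata)
	{
	if ( ! meta?$mime_type )
		return;

	if ( meta$mime_type == "application/pdf" )
		{
		# args is omitted, so the documented default is used.
		if ( ! Files::add_analyzer(f, Files::ANALYZER_SHA256) )
			print fmt("could not attach SHA256 analyzer to %s", f$id);
		}
	}
```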
- Files::all_registered_mime_types
- Type
function() : table[Files::Tag] of set[string]
Returns a table of all MIME-type-to-analyzer mappings currently registered.
- Returns
A table mapping each analyzer to the set of MIME types registered for it.
- Files::analyzer_enabled
- Type
function(tag: Files::Tag) : bool
Checks whether a file analyzer is generally enabled.
- Parameters
tag – the analyzer type to check.
- Returns
true if the analyzer is generally enabled, else false.
- Files::analyzer_name
- Type
function(tag: Files::Tag) : string
Translates a file analyzer enum value to a string with the analyzer’s name.
- Parameters
tag – The analyzer tag.
- Returns
The analyzer name corresponding to the tag.
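For instance, the analyzer names can be combined with Files::all_registered_mime_types to print an overview of current registrations. This is only a sketch; the output depends on which scripts are loaded.

```zeek
event zeek_init()
	{
	local registered = Files::all_registered_mime_types();

	# Iterate over the analyzer tags that index the table.
	for ( tag in registered )
		print fmt("%s: %d MIME type(s)", Files::analyzer_name(tag),
		          |registered[tag]|);
	}
```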
- Files::describe
-
Provides a text description regarding metadata of the file. For example, with HTTP it would return a URL.
- Parameters
f – The file to be described.
- Returns
a text description regarding metadata of the file.
- Files::disable_analyzer
- Type
function(tag: Files::Tag) : bool
Disables a file analyzer.
- Parameters
tag – the analyzer type to disable.
- Returns
false if the analyzer tag could not be found, else true.
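A short sketch of disabling an analyzer globally at startup, assuming the Files::ANALYZER_PE tag is available. The choice of analyzer is arbitrary.

```zeek
event zeek_init()
	{
	if ( Files::analyzer_enabled(Files::ANALYZER_PE) )
		{
		# A false return value means the tag could not be found.
		if ( ! Files::disable_analyzer(Files::ANALYZER_PE) )
			print "PE analyzer tag not found";
		}
	}
```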
- Files::disable_reassembly
-
Disables the file reassembler on this file. If the file is not transferred out of order this will have no effect.
- Parameters
f – the file.
- Files::enable_analyzer
- Type
function(tag: Files::Tag) : bool
Enables a file analyzer.
- Parameters
tag – the analyzer type to enable.
- Returns
false if the analyzer tag could not be found, else true.
- Files::enable_reassembly
-
Allows the file reassembler to be used if it’s necessary because the file is transferred out of order.
- Parameters
f – the file.
- Files::file_exists
-
Lookup to see if a particular file id exists and is still valid.
- Parameters
fuid – the file id.
- Returns
T if the file id is known.
- Files::lookup_file
-
Lookup an fa_file record with the file id.
- Parameters
fuid – the file id.
- Returns
the associated fa_file record.
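Since a file id may no longer be valid by the time it is looked up, one cautious pattern is to check Files::file_exists first. The helper name describe_fuid below is hypothetical and not part of the framework.

```zeek
# Hypothetical helper, not part of the Files framework.
function describe_fuid(fuid: string): string
	{
	if ( ! Files::file_exists(fuid) )
		return "<unknown file>";

	return Files::describe(Files::lookup_file(fuid));
	}
```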
- Files::register_analyzer_add_callback
- Type
function(tag: Files::Tag, callback: function(f: fa_file, args: Files::AnalyzerArgs) : void) : void
Register a callback for file analyzers to use if they need to do some manipulation when they are being added to a file before the core code takes over. This is unlikely to be of interest to users; it is intended for file analyzer authors, who may call it but are not required to.
- Parameters
tag – Tag for the file analyzer.
callback – Function to execute when the given file analyzer is being added.
- Files::register_for_mime_type
- Type
function(tag: Files::Tag, mt: string) : bool
Registers a MIME type for an analyzer. If a future file with this type is seen, the analyzer will be automatically assigned to parsing it. The function adds to all MIME types already registered; it doesn’t replace them.
- Parameters
tag – The tag of the analyzer.
mt – The MIME type in the form “foo/bar” (case-insensitive).
- Returns
True if the MIME type was successfully registered.
- Files::register_for_mime_types
- Type
function(tag: Files::Tag, mime_types: set[string]) : bool
Registers a set of MIME types for an analyzer. If a future file with one of these types is seen, the analyzer will be automatically assigned to parsing it. The function adds to all MIME types already registered; it doesn’t replace them.
- Parameters
tag – The tag of the analyzer.
mime_types – The set of MIME types, each in the form “foo/bar” (case-insensitive).
- Returns
True if the MIME types were successfully registered.
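The two registration functions might be used as in the sketch below. The analyzer tags and MIME types are illustrative, and the final line uses Files::registered_mime_types to inspect the result.

```zeek
event zeek_init()
	{
	# Register a single MIME type.
	Files::register_for_mime_type(Files::ANALYZER_SHA1, "application/pdf");

	# Register several MIME types at once; existing registrations are kept.
	Files::register_for_mime_types(Files::ANALYZER_MD5,
	    set("application/zip", "application/x-dosexec"));

	print Files::registered_mime_types(Files::ANALYZER_MD5);
	}
```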
- Files::register_protocol
- Type
function(tag: Analyzer::Tag, reg: Files::ProtoRegistration) : bool
Register callbacks for protocols that work with the Files framework. The callbacks must uniquely identify a file and each protocol can only have a single callback registered for it.
- Parameters
tag – Tag for the protocol analyzer having a callback being registered.
reg – A Files::ProtoRegistration record.
- Returns
true if the protocol being registered was not previously registered.
- Files::registered_mime_types
- Type
function(tag: Files::Tag) : set[string]
Returns a set of all MIME types currently registered for a specific analyzer.
- Parameters
tag – The tag of the analyzer.
- Returns
The set of MIME types.
- Files::remove_analyzer
- Type
function(f: fa_file, tag: Files::Tag, args: Files::AnalyzerArgs &default = [chunk_event=<uninitialized>, stream_event=<uninitialized>, extract_filename=<uninitialized>, extract_limit=104857600, extract_limit_includes_missing=T] &optional) : bool
Removes an analyzer from the analysis of a given file.
- Parameters
f – the file.
tag – the analyzer type.
args – the analyzer (type and args) to remove.
- Returns
true if the analyzer will be removed, or false if analysis for the file isn’t currently active.
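As a sketch, an analyzer attached early can be removed again once more is known about the file. The policy below (no hashing of MP4 video) is an arbitrary example; the removal passes the same tag and the same default args that were used when adding.

```zeek
event file_new(f: fa_file)
	{
	Files::add_analyzer(f, Files::ANALYZER_MD5);
	}

event file_sniff(f: fa_file, meta: fa_metadata)
	{
	# Illustrative policy: do not hash MP4 video.
	if ( meta?$mime_type && meta$mime_type == "video/mp4" )
		Files::remove_analyzer(f, Files::ANALYZER_MD5);
	}
```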
- Files::set_reassembly_buffer_size
-
Set the maximum size the reassembly buffer is allowed to grow for the given file.
- Parameters
f – the file.
max – Maximum allowed size of the reassembly buffer.
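The per-file reassembly controls could be combined as in the following sketch. The source check and the buffer size are illustrative assumptions, not recommended settings.

```zeek
event file_new(f: fa_file)
	{
	if ( f$source == "SMTP" )
		# Illustrative policy: skip reassembly for mail attachments.
		Files::disable_reassembly(f);
	else
		# Illustrative value: allow up to 2 MiB of buffered data.
		Files::set_reassembly_buffer_size(f, 2 * 1024 * 1024);
	}
```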
- Files::set_timeout_interval
-
Sets the timeout_interval field of fa_file, which is used to determine the length of inactivity that is allowed for a file before internal state related to it is cleaned up. When used within a file_timeout handler, the analysis will delay timing out again for the period specified by t.
- Parameters
f – the file.
t – the amount of time the file can remain inactive before discarding.
- Returns
true if the timeout interval was set, or false if analysis for the file isn’t currently active.
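A possible use inside a file_timeout handler is sketched below. The condition and the 30-second extension are illustrative choices.

```zeek
event file_timeout(f: fa_file)
	{
	# Illustrative policy: keep waiting while the file is known to be incomplete.
	if ( f?$total_bytes && f$seen_bytes < f$total_bytes )
		Files::set_timeout_interval(f, 30 secs);
	}
```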
- Files::stop
-
Stops/ignores any further analysis of a given file.
- Parameters
f – the file.
- Returns
true if analysis for the given file will be ignored for the rest of its contents, or false if analysis for the file isn’t currently active.
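For example, analysis could be abandoned for files that announce a very large size. The constant max_interesting_size is a hypothetical name introduced for this sketch, and its value is arbitrary.

```zeek
# Hypothetical threshold, not part of the Files framework.
const max_interesting_size = 100 * 1024 * 1024 &redef;

event file_sniff(f: fa_file, meta: fa_metadata)
	{
	# total_bytes is only set when the expected size is known.
	if ( f?$total_bytes && f$total_bytes > max_interesting_size )
		Files::stop(f);
	}
```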