base/utils/urls.zeek
Functions for URL handling.
Summary
Redefinable Options
url_regex : pattern &redef | A regular expression for matching and extracting URLs.
Types
URI : record | A URI, as parsed by decompose_uri.
Functions
decompose_uri : function
find_all_urls : function | Extracts URLs discovered in arbitrary text.
find_all_urls_without_scheme : function | Extracts URLs discovered in arbitrary text without the URL scheme included.
Detailed Interface
Redefinable Options
-
url_regex
Type: pattern
Attributes: &redef
Default: /^?(^([a-zA-Z\-]{3,5}):\/\/(-\.)?([^[:blank:]\/?\.#-]+\.?)+(\/[^[:blank:]]*)?)$?/
A regular expression for matching and extracting URLs. This is the @imme_emosol regex from https://mathiasbynens.be/demo/url-regex, adapted for Zeek. It does not pass every test case listed there, but it is one of the shorter expressions that covers most of them.
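Because url_regex carries the &redef attribute, a site script can substitute its own pattern. The following is a minimal sketch, assuming a deployment that only cares about http and https URLs; the replacement pattern is illustrative and is not taken from the Zeek distribution.

```zeek
@load base/utils/urls

# Hypothetical replacement pattern: match only http:// and https:// URLs,
# running up to the next blank character.
redef url_regex = /https?:\/\/[^[:blank:]]+/;

event zeek_init()
    {
    # find_all_urls uses url_regex, so the redef above changes what it extracts.
    local urls = find_all_urls("mirror at ftp://example.org/pub and site at https://example.org/");

    for ( u in urls )
        print u;
    }
```

A narrower pattern like this one trades coverage for fewer unexpected matches; whether that is appropriate depends on the traffic being analyzed.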
Types
-
URI
Type: record
- scheme:
string
&optional
The URL’s scheme.
- netlocation:
string
The location, which could be a domain name or an IP address. Left empty if not specified.
- portnum:
count
&optional
Port number, if included in URI.
- path:
string
Full path, including the file name. Will be ‘/’ if no path is given.
- file_name:
string
&optional
Full file name, including extension, if there is a file name.
- file_base:
string
&optional
The base filename, without extension, if there is a file name.
- file_ext:
string
&optional
The filename’s extension, if there is a file name.
- params:
table[string] of string
&optional
A table of all query parameters, mapping their keys to values, if there’s a query.
A URI, as parsed by decompose_uri.
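As an illustration of how the record's fields might be read, here is a minimal sketch. It assumes decompose_uri takes a single string argument and returns a URI record, which this page implies but does not spell out; the example URL is made up. Fields marked &optional are checked with ?$ before access.

```zeek
@load base/utils/urls

event zeek_init()
    {
    # Assumed signature: decompose_uri(uri: string): URI
    local u = decompose_uri("https://example.com:8443/docs/index.html?lang=en&page=2");

    # Optional fields must be tested with ?$ before they are read.
    if ( u?$scheme )
        print fmt("scheme: %s", u$scheme);

    # netlocation and path are not optional, so they are always present.
    print fmt("netlocation: %s", u$netlocation);
    print fmt("path: %s", u$path);

    if ( u?$portnum )
        print fmt("portnum: %d", u$portnum);

    if ( u?$file_name )
        print fmt("file_name: %s", u$file_name);

    if ( u?$file_base )
        print fmt("file_base: %s", u$file_base);

    if ( u?$file_ext )
        print fmt("file_ext: %s", u$file_ext);

    # params maps each query key to its value.
    if ( u?$params )
        for ( k, v in u$params )
            print fmt("param %s = %s", k, v);
    }
```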
Functions
-
find_all_urls
Type: function(s: string): string_set
Extracts URLs discovered in arbitrary text.
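A short usage sketch follows. The input text is invented for illustration, and which substrings are returned depends on the current value of url_regex.

```zeek
@load base/utils/urls

event zeek_init()
    {
    local text = "Docs at https://docs.zeek.org/en/master/ and mirror at http://example.com/index.html";

    # The result is a set of strings, so iteration order is not guaranteed.
    local urls = find_all_urls(text);

    print fmt("found %d URL(s)", |urls|);

    for ( u in urls )
        print u;
    }
```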
-
find_all_urls_without_scheme
Type: function(s: string): string_set
Extracts URLs discovered in arbitrary text without the URL scheme included.
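The sketch below calls both extraction functions on the same made-up input so the two result sets can be compared side by side. It is only an illustration of the call pattern.

```zeek
@load base/utils/urls

event zeek_init()
    {
    local text = "Fetch http://example.com/a.txt or https://example.org/b/c.html";

    # Same extraction, with and without the leading scheme.
    local with_scheme = find_all_urls(text);
    local without_scheme = find_all_urls_without_scheme(text);

    for ( u in with_scheme )
        print fmt("with scheme:    %s", u);

    for ( u in without_scheme )
        print fmt("without scheme: %s", u);
    }
```

The scheme-less form can be convenient when the extracted strings are compared against values that carry no scheme, such as a host name joined with a request path.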