base/utils/urls.zeek
Functions for URL handling.
Summary
Redefinable Options
url_regex | A regular expression for matching and extracting URLs.
Types
URI | A URI, as parsed by decompose_uri.
Functions
find_all_urls | Extracts URLs discovered in arbitrary text.
decompose_uri |
find_all_urls_without_scheme | Extracts URLs discovered in arbitrary text without the URL scheme included.
Detailed Interface
Redefinable Options
- url_regex
- Type: pattern
- Attributes: &redef
- Default: /^?(^([a-zA-Z\-]{3,5}):\/\/(-\.)?([^[:blank:]\/?\.#-]+\.?)+(\/[^[:blank:]]*)?)$?/
A regular expression for matching and extracting URLs. This is the @imme_emosol regex from https://mathiasbynens.be/demo/url-regex, adapted for Zeek. It does not pass every test case listed on that page, but it is one of the shorter regexes there that handles most of them.
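Because url_regex is listed as a redefinable option, a site can substitute its own pattern. The following is a minimal sketch of that idea, not a recommended setting: the replacement pattern, which accepts only http and https URLs up to the next blank character, is a hypothetical choice made for illustration.

```zeek
@load base/utils/urls

# Hypothetical replacement pattern (an assumption for this example):
# match only http:// and https:// URLs, running up to the next blank.
redef url_regex = /https?:\/\/[^[:blank:]]+/;
```

A narrower pattern like this trades coverage of other schemes for fewer unexpected matches, so it is worth checking against representative traffic before relying on it.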
Types
- URI
- Type: record
- Fields:
A URI, as parsed by decompose_uri.
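As a rough illustration of how a URI record is obtained, the sketch below passes a made-up URL to decompose_uri and prints the resulting record as a whole. It assumes decompose_uri takes a single string argument. The URL is hypothetical, and no individual field names are relied on.

```zeek
@load base/utils/urls

event zeek_init()
    {
    # Example input invented for illustration only.
    local uri = decompose_uri("https://www.example.com:8080/docs/index.html?lang=en");

    # Printing the record as a whole shows every field that was populated.
    print uri;
    }
```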
Functions
- find_all_urls
- Type: function (s: string) : string_set
Extracts URLs discovered in arbitrary text.
- find_all_urls_without_scheme
- Type: function (s: string) : string_set
Extracts URLs discovered in arbitrary text without the URL scheme included.
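The two extraction functions can be compared side by side. The sketch below runs both over the same sample string. The text and the hostnames in it are invented for this example, and no particular output order should be assumed.

```zeek
@load base/utils/urls

event zeek_init()
    {
    # Sample text invented for this example.
    local text = "Mirrors: https://zeek.org/download and ftp://files.example.com/pub are up";

    # Both functions return a string_set, so iteration order is unspecified.
    for ( url in find_all_urls(text) )
        print fmt("with scheme:    %s", url);

    for ( bare in find_all_urls_without_scheme(text) )
        print fmt("without scheme: %s", bare);
    }
```

Because the results are sets, a URL that appears several times in the input is reported once.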