5.2.5. Types

5.2.5.1. Address

The address type stores both IPv4 and IPv6 addresses.

Type

addr

Constants

IPv4: 1.2.3.4
IPv6: [2001:db8:85a3:8d3:1319:8a2e:370:7348], [::1.2.3.4]

This type supports the pack/unpack operators.

Methods

family() → spicy::AddressFamily: Returns the protocol family of the address, which can be IPv4 or IPv6.

Operators

addr == addr → bool: Compares two address values.

addr != addr → bool: Compares two address values.
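As a brief illustration of the constants, method, and operators above, the following sketch compares an IPv4 and an IPv6 address. The module name Example and the variable names are placeholders chosen for this example only.

```spicy
module Example;

import spicy;

# Address constants, written as shown under "Constants".
global v4 = 1.2.3.4;
global v6 = [2001:db8:85a3:8d3:1319:8a2e:370:7348];

# family() reports the protocol family of each address.
assert v4.family() == spicy::AddressFamily::IPv4;
assert v6.family() == spicy::AddressFamily::IPv6;

# Equality and inequality compare address values.
assert v4 == 1.2.3.4;
assert v4 != v6;
```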

5.2.5.2. Bitfield

Bitfields provide access to individual bitranges inside an unsigned integer. They can’t be instantiated directly, but must be defined and parsed inside a unit.

Type

bitfield(N) { RANGE_1; ...; RANGE_N }
Each RANGE has one of the forms LABEL: A or LABEL: A..B where A and B are bit numbers.

Constants

bitfield(N) { RANGE_1 [= VALUE_1]; ...; RANGE_N [= VALUE_N] }

A bitfield constant represents expected values for all or some of the individual bitranges. Such constants can be used only for parsing inside a unit field, not as values to otherwise operate on. To define a constant with expected values, add = VALUE to the relevant bitranges inside the type definition (with VALUE representing the final value after applying any &bit-order attribute, if present). See Bitfield for more information.

Operators

<bitfield> ?. <name> → bool: Returns true if the bitfield’s element has a value.

<bitfield> . <name> → <field type>: Retrieves the value of a bitfield’s attribute. This is the value of the corresponding bits inside the underlying integer value, shifted to the very right.
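Because a bitfield must live inside a unit, a minimal sketch of its use looks like the following. The unit name Header and the bitrange labels version and urgent are hypothetical; the example only shows how the ranges are declared and then accessed with the . operator once the field has been parsed.

```spicy
module Example;

public type Header = unit {
    # One byte, split into a 4-bit range and a single bit.
    flags: bitfield(8) {
        version: 0..3;
        urgent: 4;
    };

    on %done {
        # Each range yields its bits shifted to the very right.
        print self.flags.version, self.flags.urgent;
    }
};
```

Feeding the unit a single byte of input (for example through spicy-driver) triggers the %done hook, which prints the two extracted values.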

5.2.5.3. Bool

Boolean values can be True or False.

Type

bool

Constants

True, False

Operators

bool & bool → bool: Computes the bit-wise ‘and’ of the two boolean values.

bool | bool → bool: Computes the bit-wise ‘or’ of the two boolean values.

bool ^ bool → bool: Computes the bit-wise ‘xor’ of the two boolean values.

bool == bool → bool: Compares two boolean values.

bool != bool → bool: Compares two boolean values.

5.2.5.4. Bytes

Bytes instances store raw, opaque data. They provide iterators to traverse their content.

Types

bytes
iterator<bytes>

Constants

b"Spicy", b""

Methods

at(i: uint<64>) → iterator<bytes>: Returns an iterator representing the offset i inside the bytes value.

decode([ charset: spicy::Charset = spicy::Charset::UTF8 ], [ errors: spicy::DecodeErrorStrategy = spicy::DecodeErrorStrategy::REPLACE ]) → string: Interprets the bytes as representing a binary string encoded with the given character set, and converts it into a UTF8 string. If data is encountered that charset or UTF8 cannot represent, it’s handled according to the errors strategy.

ends_with(suffix: bytes) → bool: Returns true if the bytes value ends with suffix.

find(needle: bytes) → tuple<bool, iterator<bytes>>: Searches needle in the value’s content. Returns a tuple of a boolean and an iterator. If needle was found, the boolean will be true and the iterator will point to its first occurrence. If needle was not found, the boolean will be false and the iterator will point to the last position so that everything before it is guaranteed to not contain even a partial match of needle. Note that for a simple yes/no result, you should use the in operator instead of this method, as it’s more efficient.

join(parts: vector) → bytes: Returns the concatenation of all elements in the parts list rendered as printable strings. The portions will be separated by the bytes value on which this method is invoked.

lower([ charset: spicy::Charset = spicy::Charset::UTF8 ], [ errors: spicy::DecodeErrorStrategy = spicy::DecodeErrorStrategy::REPLACE ]) → bytes: Returns a lower-case version of the bytes value, assuming it is encoded in character set charset. If data is encountered that charset cannot represent, it’s handled according to the errors strategy.

match(regex: regexp, [ group: uint<64> ]) → result<bytes>: Matches the bytes object against the regular expression regex. Returns the matching part or, if group is given, then the corresponding subgroup. The expression is considered anchored to the beginning of the data.

split([ sep: bytes ]) → vector<bytes>: Splits the bytes value at each occurrence of sep and returns a vector containing the individual pieces, with all separators removed. If the separator is not found, or if the separator is empty, the returned vector will have the whole bytes value as its single element. If the separator is not given, the split will occur at sequences of white spaces.

split1([ sep: bytes ]) → tuple<bytes, bytes>: Splits the bytes value at the first occurrence of sep and returns the two parts as a 2-tuple, with the separator removed. If the separator is not found, the returned tuple will have the whole bytes value as its first element and an empty value as its second element. If the separator is empty, the returned tuple will have an empty first element and the whole bytes value as its second element. If the separator is not provided, the split will occur at the first sequence of white spaces.

starts_with(prefix: bytes) → bool: Returns true if the bytes value starts with prefix.

strip([ side: spicy::Side ], [ set: bytes ]) → bytes: Removes leading and/or trailing sequences of all characters in set from the bytes value. If set is not given, removes all white spaces. If side is given, it indicates which side of the value should be stripped; Side::Both is the default if not given.

sub(begin: iterator<bytes>, end: iterator<bytes>) → bytes: Returns the subsequence from begin to (but not including) end.

sub(begin: uint<64>, end: uint<64>) → bytes: Returns the subsequence from offset begin to (but not including) offset end.

sub(end: iterator<bytes>) → bytes: Returns the subsequence from the value’s beginning to (but not including) end.

to_int([ base: uint<64> ]) → int<64>: Interprets the data as representing an ASCII-encoded number and converts that into a signed integer, using a base of base. base must be between 2 and 36. If base is not given, the default is 10. If the conversion fails, throws a RuntimeError exception; this includes calling to_int() on empty bytes.

to_int(byte_order: spicy::ByteOrder) → int<64>: Interprets the bytes as representing a binary number encoded with the given byte order, and converts it into a signed integer. If the conversion fails, throws a RuntimeError exception; this can happen when bytes is empty or its size is larger than 8 bytes.

to_real() → real: Interprets the bytes as representing an ASCII-encoded floating point number and converts that into a real. The data can be in either decimal or hexadecimal format, and the conversion assumes a C/POSIX locale (i.e., using . as the decimal separator). If the conversion fails, throws an InvalidValue exception.

to_time([ base: uint<64> ]) → time: Interprets the bytes as representing a number of seconds since the epoch in the form of an ASCII-encoded number, and converts it into a time value using a base of base. If base is not given, the default is 10.

to_time(byte_order: spicy::ByteOrder) → time: Interprets the bytes as representing a number of seconds since the epoch in the form of a binary number encoded with the given byte order, and converts it into a time value.

to_uint([ base: uint<64> ]) → uint<64>: Interprets the data as representing an ASCII-encoded number and converts that into an unsigned integer, using a base of base. base must be between 2 and 36. If base is not given, the default is 10. If the conversion fails, throws a RuntimeError exception; this includes calling to_uint() on empty bytes.

to_uint(byte_order: spicy::ByteOrder) → uint<64>: Interprets the bytes as representing a binary number encoded with the given byte order, and converts it into an unsigned integer. If the conversion fails, throws a RuntimeError exception; this can happen when bytes is empty or its size is larger than 8 bytes.

upper([ charset: spicy::Charset = spicy::Charset::UTF8 ], [ errors: spicy::DecodeErrorStrategy = spicy::DecodeErrorStrategy::REPLACE ]) → bytes: Returns an upper-case version of the bytes value, assuming it is encoded in character set charset. If data is encountered that charset cannot represent, it’s handled according to the errors strategy.
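The following sketch exercises a handful of the methods above on small literal values. The module name and variable names are placeholders; the request line is made-up sample data.

```spicy
module Example;

import spicy;

global req = b"GET /index.html HTTP/1.1";

# Prefix/suffix tests and offset-based subsequences.
assert req.starts_with(b"GET");
assert req.ends_with(b"1.1");
assert req.sub(0, 3) == b"GET";

# split() removes all separators; split1() splits only once.
global parts = req.split(b" ");
assert |parts| == 3;
assert parts[1] == b"/index.html";
assert req.split1(b" ")[0] == b"GET";

# Whitespace stripping, case conversion, and decoding.
assert b"  padded  ".strip() == b"padded";
assert b"Spicy".upper() == b"SPICY";
assert b"Spicy".decode() == "Spicy";

# ASCII-encoded versus binary-encoded numbers.
assert b"42".to_uint() == 42;
assert b"ff".to_uint(16) == 255;
assert b"\x01\x02".to_uint(spicy::ByteOrder::Big) == 258;
```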

Operators

begin(<container>) → <iterator>: Returns an iterator to the beginning of the container’s content.

end(<container>) → <iterator>: Returns an iterator to the end of the container’s content.

bytes == bytes → bool: Compares two bytes values lexicographically.

bytes > bytes → bool: Compares two bytes values lexicographically.

bytes >= bytes → bool: Compares two bytes values lexicographically.

bytes in bytes → bool: Returns true if the right-hand-side value contains the left-hand-side value as a subsequence.

bytes !in bytes → bool: Performs the inverse of the corresponding in operation.

bytes < bytes → bool: Compares two bytes values lexicographically.

bytes <= bytes → bool: Compares two bytes values lexicographically.

|bytes| → uint<64>: Returns the number of bytes the value contains.

bytes + bytes → bytes: Returns the concatenation of two bytes values.

bytes += bytes → bytes: Appends one bytes value to another.

bytes += uint<8> → bytes: Appends a single byte to the data.

bytes += view<stream> → bytes: Appends a view of stream data to a bytes instance.

bytes != bytes → bool: Compares two bytes values lexicographically.

Iterator Operators

*iterator<bytes> → uint<8>: Returns the character the iterator is pointing to.

iterator<bytes> - iterator<bytes> → int<64>: Returns the number of bytes between the two iterators. The result will be negative if the second iterator points to a location before the first. The result is undefined if the iterators do not refer to the same bytes instance.

iterator<bytes> == iterator<bytes> → bool: Compares the two positions. The result is undefined if they are not referring to the same bytes value.

iterator<bytes> > iterator<bytes> → bool: Compares the two positions. The result is undefined if they are not referring to the same bytes value.

iterator<bytes> >= iterator<bytes> → bool: Compares the two positions. The result is undefined if they are not referring to the same bytes value.

iterator<bytes>++ → iterator<bytes>: Advances the iterator by one byte, returning the previous position.

++iterator<bytes> → iterator<bytes>: Advances the iterator by one byte, returning the new position.

iterator<bytes> < iterator<bytes> → bool: Compares the two positions. The result is undefined if they are not referring to the same bytes value.

iterator<bytes> <= iterator<bytes> → bool: Compares the two positions. The result is undefined if they are not referring to the same bytes value.

iterator<bytes> + uint<64> → iterator<bytes> (commutative): Returns an iterator pointing the given number of bytes beyond the one passed in.

iterator<bytes> += uint<64> → iterator<bytes>: Advances the iterator by the given number of bytes.

iterator<bytes> != iterator<bytes> → bool: Compares the two positions. The result is undefined if they are not referring to the same bytes value.
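A short sketch of the bytes operators and iterator operators above, using a three-byte value. All names are placeholders for this example.

```spicy
module Example;

global data = b"abc";

# Size, containment, concatenation, and lexicographic comparison.
assert |data| == 3;
assert b"bc" in data;
assert b"xyz" !in data;
assert data + b"def" == b"abcdef";
assert b"abc" < b"abd";

# Walk the value with an iterator; dereferencing yields a uint<8>.
global it = begin(data);
assert *it == 97;       # ASCII 'a'
++it;
assert *it == 98;       # ASCII 'b'
it += 2;
assert it == end(data); # both iterators refer to the same bytes value
```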

5.2.5.5. Enum

Enum types associate labels with numerical values.

Type

enum { LABEL_1, ..., LABEL_N }
Each label has the form ID [= VALUE]. If VALUE is skipped, one will be assigned automatically.
Each enum type comes with an implicitly defined Undef label with a value distinct from all other ones. When coerced into a boolean, an enum will be true iff it’s not Undef.

Note

An instance of an enum can assume a numerical value that does not map to any of its defined labels. When printed, such a value renders as <unknown-N>, with N being the decimal representation of its numeric value.

Constants

The individual labels represent constants of the corresponding type (e.g., MyEnum::MyFirstLabel is a constant of type MyEnum).

Methods

has_label() → bool: Returns true if the enum value corresponds to a known enum label (other than Undef), as defined by its type.

Operators

enum(int) → enum value: Instantiates an enum instance initialized from a signed integer value. The value does not need to correspond to any of the type’s enumerator labels.

enum(uint) → enum value: Instantiates an enum instance initialized from an unsigned integer value. The value does not need to correspond to any of the type’s enumerator labels. It must not be larger than the maximum that a signed 64-bit integer value can represent.

cast<int>(enum) → int: Casts an enum value into a signed integer. If the enum value is Undef, this will return -1.

cast<uint>(enum) → uint: Casts an enum value into an unsigned integer. This will throw an exception if the enum value is Undef.

enum == enum → bool: Compares two enum values.

enum != enum → bool: Compares two enum values.
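The sketch below illustrates labels, casts, and values without a matching label. The type Color and its labels are hypothetical.

```spicy
module Example;

type Color = enum { Red = 1, Green = 2, Blue = 3 };

global c = Color::Green;
assert c == Color::Green;
assert c.has_label();
assert cast<uint64>(c) == 2;

# A value does not need to correspond to any label.
global other = cast<Color>(42);
assert ! other.has_label();
assert cast<uint64>(other) == 42;

# Undef casts to -1 as a signed integer.
assert cast<int64>(Color::Undef) == -1;
```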

5.2.5.6. spicy::Error

spicy::Error captures an error message. It’s primarily meant for use with the result<T> type; see there for more.

Type

spicy::Error (note that you need to import spicy to use this)

Constants

error"MSG" creates a value of type spicy::Error capturing the error message MSG.

Methods

description() → string: Retrieves the textual description associated with the error.

Operators

error == error → bool: Compares two error descriptions lexicographically.

error != error → bool: Compares two error descriptions lexicographically.

5.2.5.7. Exception

Todo

This isn’t available in Spicy yet (#89).

5.2.5.8. Integer

Spicy distinguishes between signed and unsigned integers, and always requires specifying the bitwidth of a type.

Type

intN for signed integers, where N can be one of 8, 16, 32, 64.
uintN for unsigned integers, where N can be one of 8, 16, 32, 64.

Constants

Unsigned integer: 1234, +1234, uint8(42), uint16(42), uint32(42), uint64(42)
Signed integer: -1234, int8(42), int8(-42), int16(42), int32(42), int64(42)

This type supports the pack/unpack operators.

Operators

uint & uint → uint: Computes the bit-wise ‘and’ of the two integers.

uint | uint → uint: Computes the bit-wise ‘or’ of the two integers.

uint ^ uint → uint: Computes the bit-wise ‘xor’ of the two integers.

int16(int) → int<16>: Creates a 16-bit signed integer value.

int16(uint) → int<16>: Creates a 16-bit signed integer value.

int32(int) → int<32>: Creates a 32-bit signed integer value.

int32(uint) → int<32>: Creates a 32-bit signed integer value.

int64(int) → int<64>: Creates a 64-bit signed integer value.

int64(uint) → int<64>: Creates a 64-bit signed integer value.

int8(int) → int<8>: Creates an 8-bit signed integer value.

int8(uint) → int<8>: Creates an 8-bit signed integer value.

uint16(int) → uint<16>: Creates a 16-bit unsigned integer value.

uint16(uint) → uint<16>: Creates a 16-bit unsigned integer value.

uint32(int) → uint<32>: Creates a 32-bit unsigned integer value.

uint32(uint) → uint<32>: Creates a 32-bit unsigned integer value.

uint64(int) → uint<64>: Creates a 64-bit unsigned integer value.

uint64(uint) → uint<64>: Creates a 64-bit unsigned integer value.

uint8(int) → uint<8>: Creates an 8-bit unsigned integer value.

uint8(uint) → uint<8>: Creates an 8-bit unsigned integer value.

cast<bool>(int) → bool: Converts the value to a boolean by comparing against zero.

cast<bool>(uint) → bool: Converts the value to a boolean by comparing against zero.

cast<enum>(int) → enum: Converts the value into an enum instance. The value does not need to correspond to any of the target type’s enumerator labels.

cast<enum>(uint) → enum: Converts the value into an enum instance. The value does not need to correspond to any of the target type’s enumerator labels.

cast<int>(int) → int: Converts the value into a different signed integer type, accepting any loss of information.

cast<int>(uint) → int: Converts the value into a signed integer type, accepting any loss of information.

cast<interval>(int) → interval: Interprets the value as number of seconds.

cast<interval>(uint) → interval: Interprets the value as number of seconds.

cast<real>(int) → real: Converts the value into a real, accepting any loss of information.

cast<real>(uint) → real: Converts the value into a real, accepting any loss of information.

cast<time>(uint) → time: Interprets the value as number of seconds.

cast<uint>(int) → uint: Converts the value into an unsigned integer type, accepting any loss of information.

cast<uint>(uint) → uint: Converts the value into a different unsigned integer type, accepting any loss of information.

int-- → int: Decrements the value, returning the old value.

uint-- → uint: Decrements the value, returning the old value.

--int → int: Decrements the value, returning the new value.

--uint → uint: Decrements the value, returning the new value.

int - int → int: Computes the difference between the two integers.

uint - uint → uint: Computes the difference between the two integers.

int -= int → int: Decrements the first value by the second, assigning the new value.

uint -= uint → uint: Decrements the first value by the second, assigning the new value.

int / int → int: Divides the first integer by the second.

uint / uint → uint: Divides the first integer by the second.

int /= int → int: Divides the first value by the second, assigning the new value.

uint /= uint → uint: Divides the first value by the second, assigning the new value.

int == int → bool: Compares the two integers.

uint == uint → bool: Compares the two integers.

int > int → bool: Compares the two integers.

uint > uint → bool: Compares the two integers.

int >= int → bool: Compares the two integers.

uint >= uint → bool: Compares the two integers.

int++ → int: Increments the value, returning the old value.

uint++ → uint: Increments the value, returning the old value.

++int → int: Increments the value, returning the new value.

++uint → uint: Increments the value, returning the new value.

int < int → bool: Compares the two integers.

uint < uint → bool: Compares the two integers.

int <= int → bool: Compares the two integers.

uint <= uint → bool: Compares the two integers.

int % int → int: Computes the modulus of the first integer divided by the second.

uint % uint → uint: Computes the modulus of the first integer divided by the second.

int * int → int: Multiplies the first integer by the second.

uint * uint → uint: Multiplies the first integer by the second.

int *= int → int: Multiplies the first value by the second, assigning the new value.

uint *= uint → uint: Multiplies the first value by the second, assigning the new value.

~uint → uint: Computes the bit-wise negation of the integer.

int ** int → int: Computes the first integer raised to the power of the second.

uint ** uint → uint: Computes the first integer raised to the power of the second.

uint << uint → uint: Shifts the integer to the left by the given number of bits.

uint >> uint → uint: Shifts the integer to the right by the given number of bits.

-int → int: Inverts the sign of the integer.

-uint → uint: Inverts the sign of the integer.

int + int → int: Computes the sum of the integers.

uint + uint → uint: Computes the sum of the integers.

int += int → int: Increments the first integer by the second.

uint += uint → uint: Increments the first integer by the second.

int != int → bool: Compares the two integers.

uint != uint → bool: Compares the two integers.
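A few of the integer operators above in action. This is a sketch with arbitrary values; the variable names are placeholders.

```spicy
module Example;

# Widen an 8-bit value explicitly before doing arithmetic on it.
global a: uint8 = 200;
global wide = cast<uint16>(a) + 100;
assert wide == 300;

# Integer division, modulus, and power.
assert 7 / 2 == 3;
assert 7 % 2 == 1;
assert 2 ** 10 == 1024;

# Bit-wise operators are defined for unsigned integers.
assert (0xf0 & 0x3c) == 0x30;
assert (1 << 4) == 16;

# Postfix returns the old value, prefix the new one.
global i: uint64 = 5;
assert i++ == 5;
assert i == 6;
assert ++i == 7;
```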

5.2.5.9. Interval

An interval value represents a period of time. Intervals are stored with nanosecond resolution, which is retained across all calculations.

Type

interval

Constants

interval(SECS) creates an interval from a signed integer or real value SECS specifying the period in seconds.
interval_ns(NSECS) creates an interval from a signed integer value NSECS specifying the period in nanoseconds.

Methods

nanoseconds() → int<64>: Returns the interval as an integer value representing nanoseconds.

seconds() → real: Returns the interval as a real value representing seconds.

Operators

interval(int) → interval: Creates an interval interpreting the argument as number of seconds.

interval(real) → interval: Creates an interval interpreting the argument as number of seconds.

interval(uint) → interval: Creates an interval interpreting the argument as number of seconds.

interval_ns(int) → interval: Creates an interval interpreting the argument as number of nanoseconds.

interval_ns(uint) → interval: Creates an interval interpreting the argument as number of nanoseconds.

interval - interval → interval: Returns the difference of the intervals.

interval == interval → bool: Compares two interval values.

interval > interval → bool: Compares the intervals.

interval >= interval → bool: Compares the intervals.

interval < interval → bool: Compares the intervals.

interval <= interval → bool: Compares the intervals.

interval + interval → interval: Returns the sum of the intervals.

interval != interval → bool: Compares two interval values.

time - time → interval: Returns the difference of the times.

time - interval → time: Subtracts the interval from the time.

time + interval → time (commutative): Adds the interval to the time.
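The sketch below shows how interval values are created from the constants above and how they combine with time values. The variable names are placeholders, and the values are chosen so that they are exactly representable.

```spicy
module Example;

# 2.5 seconds, expressed in seconds and in nanoseconds.
global i = interval(2.5);
assert i == interval_ns(2500000000);
assert i.seconds() == 2.5;

# Adding an interval to a time yields a time;
# subtracting two times yields an interval.
global t = time(10);
assert t + i == time(12.5);
assert time(12.5) - t == i;
```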

5.2.5.10. List

Spicy uses lists only in a limited form as temporary values, usually for initializing other containers. That means you can only create list constants, but you cannot declare variables or unit fields to have a list type (use vector instead).

Constants

[E_1, E_2, ..., E_N] creates a list of N elements. The values E_I must all have the same type. [] creates an empty list of unknown element type.
[EXPR for ID in ITERABLE] creates a list by evaluating EXPR for all elements in ITERABLE, assembling the individual results into the final list value. The extended form [EXPR for ID in ITERABLE if COND] includes in the result only those elements for which COND evaluates to True. Both EXPR and COND can use ID to refer to the current element.
list(E_1, E_2, ..., E_N) is the same as [E_1, E_2, ..., E_N], and list() is the same as [].
list<T>(E_1, E_2, ..., E_N) creates a list of type T, initializing it with the N elements E_I. list<T>() creates an empty list.

Operators

begin(<container>) → <iterator>: Returns an iterator to the beginning of the container’s content.

end(<container>) → <iterator>: Returns an iterator to the end of the container’s content.

list == list → bool: Compares two lists element-wise.

|list| → uint<64>: Returns the number of elements a list contains.

list != list → bool: Compares two lists element-wise.
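Since lists serve mainly as initializers, the following sketch uses list comprehensions to fill vectors. The variable names are placeholders, and the explicit vector<uint64> annotations are an assumption made here to keep the element type unambiguous.

```spicy
module Example;

# Plain comprehension: one result per input element.
global squares: vector<uint64> = [x * x for x in [1, 2, 3, 4]];

# Extended form: keep only elements for which the condition holds.
global evens: vector<uint64> = [x for x in [1, 2, 3, 4] if x % 2 == 0];

assert |squares| == 4;
assert squares[3] == 16;
assert |evens| == 2;
assert evens[0] == 2;
```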

5.2.5.11. Map

Maps are containers holding key/value pairs of elements, with fast lookup for keys to retrieve the corresponding value.

Maps provide iterators to traverse their content, with no particular ordering. A map iterator yields the corresponding key/value pair as a 2-tuple. The following is an example iterating over a map’s elements using the ‘for’ statement:

global m = map("a": 1, "b": 2, "c": 3);

for ( i in m )
  print i[0], i[1]; # key, value

Types

map<K, V> specifies a map with key type K and value type V.
iterator<map<K, V>>

Constants

map(K_1: V_1, K_2: V_2, ..., K_N: V_N) creates a map of N elements, initializing it with the given key/value pairs. The keys K_I must all have the same type, and the values V_I must likewise all have the same type. map() creates an empty map of unknown key/value types; this cannot be used directly but must be coerced into a fully-defined map type first.
map<K, V>(K_1: V_1, K_2: V_2, ..., K_N: V_N) creates a map of type map<K, V>, initializing it with the given key/value pairs. map<K, V>() creates an empty map.

Methods

clear() → void: Removes all elements from the map.

get(key: <any>, [ default: <any> ]) → <type of element>: Returns the map’s element for the given key. If the key does not exist, returns the default value if provided; otherwise throws a runtime error.

get_optional(key: <any>) → optional<type of element>: Returns an optional either containing the map’s element for the given key if that entry exists, or an unset optional if it does not.

Operators

begin(<container>) → <iterator>: Returns an iterator to the beginning of the container’s content.

delete map[<any>] → void: Removes an element from the map.

end(<container>) → <iterator>: Returns an iterator to the end of the container’s content.

map == map → bool: Compares two maps element-wise.

<any> in map → bool: Returns true if an element is part of the map.

<any> !in map → bool: Performs the inverse of the corresponding in operation.

map[<any>] → <type of element>: Returns the map’s element for the given key. The key must exist, otherwise the operation will throw a runtime error.

map[<any>]=<any> → void: Updates the map value for a given key. If the key does not exist a new element is inserted.

|map| → uint<64>: Returns the number of elements a map contains.

map != map → bool: Compares two maps element-wise.

Iterator Operators

*iterator<map> → <dereferenced type>: Returns the map element that the iterator refers to.

iterator<map> == iterator<map> → bool: Returns true if two map iterators refer to the same location.

iterator<map>++ → iterator<map>: Advances the iterator by one map element, returning the previous position.

++iterator<map> → iterator<map>: Advances the iterator by one map element, returning the new position.

iterator<map> != iterator<map> → bool: Returns true if two map iterators refer to different locations.
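A minimal sketch of the map methods and operators above. The variable name and the keys are placeholders.

```spicy
module Example;

global m: map<string, uint64> = map("a": 1, "b": 2);

# Index assignment inserts a new element if the key is missing.
m["c"] = 3;
assert |m| == 3;
assert "c" in m;
assert m["a"] == 1;

# get() with a default avoids the runtime error for missing keys.
assert m.get("missing", 0) == 0;

delete m["a"];
assert "a" !in m;

m.clear();
assert |m| == 0;
```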

5.2.5.12. Optional

An optional value may hold a value of another type, or can alternatively remain unset. A common use case for optional is the return value of a function that may fail.

Type

optional<TYPE>

Constants

optional(EXPR) creates an optional<T>, where T is the type of the expression EXPR and initializes it with the value of EXPR.

More commonly, however, optional values are initialized through assignment:

Assigning an instance of TYPE to an optional<TYPE> sets it to the instance’s value.
Assigning Null to an optional<TYPE> unsets it.

To check whether an optional value is set, it can implicitly or explicitly be converted to a bool.

global x: optional<uint64>;  # Unset.
global b1: bool = x;         # False.
global b2 = cast<bool>(x);   # False.

if ( x )
    print "'x' was set";     # Never runs.
if ( ! x )
    print "'x' was unset";   # Always runs.

Operators

*optional → <dereferenced type>: Returns the element stored, or throws an exception if none.
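To round this off, the following sketch sets an optional through assignment, dereferences it, and unsets it again with Null. The variable name is a placeholder.

```spicy
module Example;

global x: optional<uint64>;  # Unset.

# Assigning a value of the wrapped type sets the optional.
x = 42;
assert *x == 42;

# Assigning Null unsets it again.
x = Null;
if ( ! x )
    print "'x' is unset again";
```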

5.2.5.13. Port

Ports represent the combination of a numerical port number and an associated transport-layer protocol.

Type

port

Constants

443/tcp, 53/udp
port(PORT, PROTOCOL) creates a port where PORT is a port number and PROTOCOL a spicy::Protocol.

Methods

protocol() → spicy::Protocol: Returns the protocol the port is using (such as UDP or TCP).

Operators

port(uint<16>,spicy::Protocol) → port: Creates a port instance.

port == port → bool: Compares two port values.

port != port → bool: Compares two port values.
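A brief sketch of port constants, the protocol() method, and comparisons. The variable name is a placeholder.

```spicy
module Example;

import spicy;

global https = 443/tcp;

assert https.protocol() == spicy::Protocol::TCP;

# The constructor form is equivalent to the literal form.
assert https == port(443, spicy::Protocol::TCP);

# Same number, different protocol: not equal.
assert https != 443/udp;
```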

5.2.5.14. Real

“Real” values store floating-point numbers with double precision.

Type

real

Constants

3.14, 10e9, 0x1.921fb78121fb8p+1

This type supports the pack/unpack operators.

Operators

cast<int>(real) → int: Converts the value to a signed integer type, accepting any loss of information.

cast<interval>(real) → interval: Interprets the value as number of seconds.

cast<time>(real) → time: Interprets the value as number of seconds since the UNIX epoch.

cast<uint>(real) → uint: Converts the value to an unsigned integer type, accepting any loss of information.

real - real → real: Returns the difference between the two values.

real -= real → real: Subtracts the second value from the first, assigning the new value.

real / real → real: Divides the first value by the second.

real /= real → real: Divides the first value by the second, assigning the new value.

real == real → bool: Compares the two reals.

real > real → bool: Compares the two reals.

real >= real → bool: Compares the two reals.

real < real → bool: Compares the two reals.

real <= real → bool: Compares the two reals.

real % real → real: Computes the modulus of the first real divided by the second.

real * real → real: Multiplies the first real by the second.

real *= real → real: Multiplies the first value by the second, assigning the new value.

real ** real → real: Computes the first real raised to the power of the second.

-real → real: Inverts the sign of the real.

real + real → real: Returns the sum of the reals.

real += real → real: Adds the second real to the first, assigning the new value.

real != real → bool: Compares the two reals.
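The following sketch applies some of the operators above to values that double precision represents exactly, so the equality checks are safe here. With arbitrary values, comparing reals for exact equality is generally fragile. The variable name is a placeholder.

```spicy
module Example;

global r = 3.5;

assert r + 1.5 == 5.0;
assert r * 2.0 == 7.0;
assert 7.5 % 2.0 == 1.5;
assert 2.0 ** 3.0 == 8.0;

# Casting a whole-numbered real loses no information.
assert cast<uint64>(4.0) == 4;
```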

5.2.5.15. Result

A result<T> is a type facilitating error handling by holding either a value of type T or an error message. It’s most useful as the return value of a function that would normally produce a computed value of some kind, but may fail to do so. A typical example:

function compute_value() : result<int64> {

    local value: int64;

    [... Try to compute value ...]

    if ( everything_went_ok )
        return value;
    else
        return error"Something went wrong.";
}

if ( local x = compute_value() )
    print "result: %d " % *x;
else
    print "error: %s " % x.error();

As you can see, the result<int64> return value of compute_value() can be set from either a corresponding integer or an appropriate error message. In the latter case, error"MSG" instantiates an error value of type spicy::Error. As the if statement shows, result<T> coerces to a boolean value depending on whether it holds a value or an error.

Type

result<TYPE>

Methods

error() → error: Retrieves the error stored inside the result instance. Will throw a NoError exception if the result is not in an error state.

Operators

*result → <type of stored value>: Retrieves the value stored inside the result instance. Will throw a NoResult exception if the result is in an error state.

5.2.5.16. Reference

A reference T& wraps a value of another type T, allowing it to be passed around without creating a copy. Multiple references can wrap the same value, and the value will stay around for as long as there’s at least one reference to it.

You can create a reference through Spicy’s new T operator, which instantiates a value of type T, initializes it to the type’s default, and then returns a reference to it:

local x = new bytes; # x is now of type "bytes&"

To access or modify the value, one generally needs to dereference it first through the * operator:

assert *x == b"";   # x was initialized to empty
*x = b"Hello";      # x now holds "Hello"
*x += b" World";    # x now holds "Hello World"
assert *x == b"Hello World";

In some cases Spicy implicitly dereferences a reference, such as when printing it, or when passing it to a function that expects a value of the wrapped type:

print x;  # prints "Hello World"

function f(x: bytes) { assert x == b"Hello World"; }
f(x);     # passes the value of x to f

Automatic dereferencing even works for a value’s operators as long as that’s non-ambiguous. For example, above we could have just said x += b" World" instead of *x += b" World".

A reference’s wrapped value remains mutable, so you can leverage a reference for passing a value to a function for modification:

function m(x: bytes&) { *x += b" World"; }

local x = new b"Hello"; # creates pre-initialized bytes value
m(x);
print x; # prints "Hello World"

Note that this is subtly different from passing a value as an inout parameter. While both allow a function to modify the value, the reference continues to wrap a value that’s been created elsewhere, possibly with further references around that will see the same changes and keep the value alive—whereas with an inout parameter, life-time and visibility remain tied to normal scoping rules at the call site.

Note

Nothing prevents you from passing a reference as an inout parameter, like: function f(inout x: bytes&). However, doing so may not be that useful because the inout refers to the reference itself, meaning all you can do is rebind it to another reference. While you can modify the wrapped value through dereferencing, that’s independent of making the reference inout.

Type

T& represents a reference wrapping a value of type T.

Constants

Null is the only constant for references, representing an unset reference.

Operators

*(T&) → <dereferenced type>: Returns the referenced instance, or throws an exception if the reference is unset or has expired.

T& == T& → bool: Returns true if both operands reference the same instance.

T& != T& → bool: Returns true if the two operands reference different instances.

5.2.5.17. Regular Expression

Spicy provides POSIX-style regular expressions. Regular expressions are typically of the form /PATTERN/[FLAGS], where PATTERN is a regular expression to match, and FLAGS contains optional flags modifying the matching behavior. In addition, a regular expression constant can consist of multiple patterns, separated by |. This creates a single regular expression constant that matches any of the patterns.

Type

regexp

Constants

/Foo*bar?/, /X(..)(..)(..)Y/, /foo/i
/Foo/$(1) | /Bar/$(2)

Regular expression patterns use the extended POSIX syntax, with a few smaller differences and extensions:

Supported character classes are: [:lower:], [:upper:], [:digit:], [:blank:]. Note that [:lower:] and [:upper:] do not take locales into account; they only match standard ASCII characters.
\b asserts a word boundary, \B asserts no word boundary.
\xXX matches a byte with the binary hex value XX (e.g., \xff matches a byte of decimal value 255).

Patterns support the following optional flags:

i: Matches the pattern case-insensitively. Just like [:lower:] and [:upper:], this flag only affects standard ASCII characters; it does not consider any locale.
$(ID): Associates a numeric ID with the pattern. When a regular expression constant consists of multiple patterns, their IDs identify the one that matched.

Methods

find(data: bytes) → tuple<int<32>, bytes>: Searches the regular expression in data and returns the matching part. Different from match, this does not anchor the expression to the beginning of the data: it will find matches at arbitrary starting positions. Returns a 2-tuple with (1) an integer match indicator with the same semantics as that returned by match; and (2) if a match has been found, the data that matches the regular expression. (Note: Currently this function has a runtime that’s quadratic in the size of data; consider using match if performance is an issue.)

match(data: bytes) → int<32>: Matches the regular expression against data. If it matches, returns an integer that’s greater than zero. If multiple patterns have been compiled for parallel matching, that integer will be the ID of the matching pattern. Returns -1 if the regular expression does not match the data, but could still yield a match if more data were added. Returns 0 if the regular expression is not found and adding more data wouldn’t change anything. The expression is considered anchored, as though it starts with an implicit ^ regexp operator, to the beginning of the data.

match_groups(data: bytes) → vector<bytes>: Matches the regular expression against data. If it matches, returns a vector with one entry for each capture group defined by the regular expression, starting at index 1. Each of these entries is a view locating the matching bytes. In addition, index 0 always contains the data that matches the full regular expression. Returns an empty vector if the expression is not found. The expression is considered anchored, as though it starts with an implicit ^ regexp operator, to the beginning of the data. This method is not compatible with pattern sets and will throw a runtime exception if used with a regular expression compiled from a set.

token_matcher() → spicy::MatchState: Initializes state for matching regular expression incrementally against chunks of future input. The expression is considered anchored, as though it starts with an implicit ^ regexp operator, to the beginning of the data.
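
The following sketch exercises match, match_groups, and find as described above. The module name and all variable names are illustrative. The asserted values follow from the method descriptions in this section; the -1 result of match (more data needed) is left out here.

```spicy
module Example;

# Pattern set: the $(ID) flags identify which alternative matched.
global methods = /GET/$(1) | /POST/$(2);
assert methods.match(b"GET") == 1;
assert methods.match(b"POST") == 2;
assert methods.match(b"PUT") == 0;    # cannot match, even with more data

# Capture groups: index 0 is the full match, groups start at index 1.
global triple = /X(..)(..)(..)Y/;
global groups = triple.match_groups(b"XaabbccY");
assert |groups| == 4;
assert groups[0] == b"XaabbccY";
assert groups[2] == b"bb";

# find() is not anchored, so it locates the pattern inside the data.
global needle = /bb/;
global hit = needle.find(b"aabbcc");
assert hit[0] > 0;
assert hit[1] == b"bb";
```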

5.2.5.18. Set

Sets are containers for unique elements with fast lookup.

Sets provide iterators to traverse their content, with no particular ordering. The following is an example iterating over a set’s elements using the for statement:

global s = set("a", "b", "c");

for ( i in s )
    print i;

Types

set<T> specifies a set with unique elements of type T.
iterator<set<T>>

Constants

set(E_1, E_2, ..., E_N) creates a set of N elements. The values E_I must all have the same type. set() creates an empty set of unknown element type; this cannot be used directly but must be coerced into a fully-defined set type first.
set<T>(E_1, E_2, ..., E_N) creates a set with elements of type T, initializing it with the elements E_I. set<T>() creates an empty set.

Methods

clear() → void: Removes all elements from the set.

Operators

add set[element] → void: Adds an element to the set.

begin(<container>) → <iterator>: Returns an iterator to the beginning of the container’s content.

delete set[element] → void: Removes an element from the set.

end(<container>) → <iterator>: Returns an iterator to the end of the container’s content.

set == set → bool: Compares two sets element-wise.

<any> in set → bool: Returns true if an element is part of the set.

<any> !in set → bool: Performs the inverse of the corresponding in operation.

|set| → uint<64>: Returns the number of elements a set contains.

set != set → bool: Compares two sets element-wise.

Iterator Operators

*iterator<set> → <dereferenced type>: Returns the set element that the iterator refers to.

iterator<set> == iterator<set> → bool: Returns true if two set iterators refer to the same location.

iterator<set>++ → iterator<set>: Advances the iterator by one set element, returning the previous position.

++iterator<set> → iterator<set>: Advances the iterator by one set element, returning the new position.

iterator<set> != iterator<set> → bool: Returns true if two set iterators refer to different locations.
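
A short sketch of the set operators and methods listed above, using a made-up global named seen:

```spicy
module Example;

global seen: set<uint64>;

add seen[80];
add seen[443];
add seen[80];          # already present, so the set does not grow

assert |seen| == 2;
assert 443 in seen;

delete seen[80];
assert 80 !in seen;

seen.clear();
assert |seen| == 0;
```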

5.2.5.19. Sink

Sinks act as a connector between two units, facilitating feeding the output of one as input into the other. See Sinks for a full description.

Sinks are special in that they don’t represent a type that’s generally available for instantiation. Instead they need to be declared as a member of a unit using the special sink keyword. You can, however, maintain references to sinks by assigning the unit member to a variable of type sink&.

Methods

close() → void: Closes a sink by disconnecting all parsing units. Afterwards the sink’s state is as if it had just been created (so new units can be connected). Note that a sink is automatically closed when the unit it is part of is done parsing. Also note that a previously connected parsing unit can not be reconnected; trying to do so will still throw a UnitAlreadyConnected exception.

connect(u: unit&) → void: Connects a parsing unit to a sink. All subsequent write operations to the sink will pass their data on to this parsing unit. Each unit can only be connected to a single sink. If the unit is already connected, a UnitAlreadyConnected exception is thrown. However, a sink can have more than one unit connected to it.

connect_filter(filter: unit&) → void

Connects a filter unit to the sink that will transform its input transparently before forwarding it for parsing to other connected units.

Multiple filters can be added to a sink, in which case they will be chained into a pipeline and the data will be passed through them in the order they have been added. The parsing will then be carried out on the output of the last filter in the chain.

Filters must be added before the first data chunk is written into the sink. If data has already been written when a filter is added, an error is triggered.

connect_mime_type(mt: bytes) → void: Connects parsing units to a sink for all parsers that support a given MIME type. All subsequent write operations to the sink will pass their data on to these parsing units. The MIME type may have wildcards for type or subtype, and the method will then connect units for all matching parsers.

connect_mime_type(mt: string) → void: Connects parsing units to a sink for all parsers that support a given MIME type. All subsequent write operations to the sink will pass their data on to these parsing units. The MIME type may have wildcards for type or subtype, and the method will then connect units for all matching parsers.

gap(seq: uint<64>, len: uint<64>) → void: Reports a gap in the input stream. seq is the sequence number of the first byte missing, len is the length of the gap.

sequence_number() → uint<64>: Returns the current sequence number of the sink’s input stream, which is one beyond the index of the last byte that has been put in order and delivered so far.

set_auto_trim(enable: bool) → void: Enables or disables auto-trimming. If enabled (which is the default), sink input data is trimmed automatically once it is in order and has been processed. See trim() for more information about trimming.

set_initial_sequence_number(seq: uint<64>) → void: Sets the sink’s initial sequence number. All sequence numbers given to other methods are then assumed to be absolute numbers beyond that initial number. If the initial number is not set, the sink implicitly uses zero instead.

set_policy(policy: spicy::ReassemblerPolicy) → void: Sets a sink’s reassembly policy for ambiguous input. As long as data hasn’t been trimmed, a sink will detect overlapping chunks. This policy decides how to handle ambiguous overlaps. The default (and currently only) policy is ReassemblerPolicy::First, which resolves ambiguities by taking the data from the chunk that came first.

skip(seq: uint<64>) → void: Skips ahead in the input stream. seq is the sequence number where to continue parsing. If there’s still data buffered before that position it will be ignored; if auto-trimming is also active, it will be immediately deleted as well. If new data is passed in later that comes before seq, that will likewise be ignored. If the input stream is currently stuck inside a gap, and seq lies beyond that gap, the stream will resume processing at seq.

trim(seq: uint<64>) → void

Deletes all data that’s still buffered internally up to seq. If processing the input stream hasn’t reached seq yet, parsing will also skip ahead to seq.

Trimming the input stream releases the memory, but that means that the sink won’t be able to detect any further data mismatches.

Note that by default, auto-trimming is enabled, which means all data is trimmed automatically once in-order and processed.

write(data: bytes, [ seq: uint<64> ], [ len: uint<64> ]) → void

Passes data on to all connected parsing units. Multiple write calls act like passing input in incrementally: The units will parse the pieces as if they were a single stream of data. If no sequence number seq is provided, the data is assumed to represent a chunk to be appended to the current end of the input stream. If a sequence number is provided, out-of-order data will be buffered and reassembled before being passed on. If len is provided, the data is assumed to represent that many bytes inside the sequence space; if not provided, len defaults to the length of data.

If no units are connected, the call does not have any effect. If multiple units are connected and one parsing unit throws an exception, parsing of subsequent units does not proceed. Note that the order in which the data is passed to each unit is undefined.

Todo

The error semantics for multiple units aren’t great.

Operators

|sink| → uint<64>: Returns the number of bytes written into the sink so far. If the sink has filters attached, this returns the value after filtering.

Sinks provide a set of dedicated unit hooks as callbacks for the reassembly process. These must be implemented on the reader side, i.e., the unit that’s connected to a sink.

%on_gap(seq: uint64, len: uint64)

%on_overlap(seq: uint64, old: bytes, new: bytes)

Triggered when reassembly encounters a second version of data for sequence space already covered earlier. seq is the start of the overlap, and old/new are the previous and the new data, respectively. This hook is just for informational purposes; the policy set with set_policy() determines how the reassembler handles the overlap.

%on_skipped(seq: uint64)

Any time skip() moves ahead in the input stream, this hook reports the new sequence number seq.

%on_undelivered(seq: uint64, data: bytes)

If data still buffered is skipped over through skip(), it will be passed to this hook, before adjusting the current position. seq is the starting sequence number of the data, data is the data itself.
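
To illustrate how these pieces fit together, here is a minimal sketch of a unit that feeds out-of-order chunks into a sink, with the connected unit implementing two of the hooks. The wire format (a one-byte sequence number and a one-byte length per chunk) and all type and field names are invented for this example. The sketch also assumes that, inside a unit, the hooks are spelled on %gap and on %overlap, and that the third parameter needs a name other than the keyword new.

```spicy
module Example;

# Reader side: receives the reassembled data and implements the hooks.
public type Payload = unit {
    data: bytes &eod;

    on %done { print "payload:", self.data; }

    on %gap(seq: uint64, len: uint64) {
        print "gap at %d, length %d" % (seq, len);
    }

    on %overlap(seq: uint64, old: bytes, new_: bytes) {
        print "overlap at %d" % seq;
    }
};

# Hypothetical chunk format: sequence number, length, payload bytes.
type Chunk = unit {
    seq: uint8;
    len: uint8;
    data: bytes &size=self.len;
};

# Writer side: owns the sink and writes each chunk at its sequence number.
public type Segments = unit {
    on %init { self.out.connect(new Payload); }

    chunks: Chunk[] &eod;

    on %done {
        for ( c in self.chunks )
            self.out.write(c.data, c.seq);
    }

    sink out;
};
```

Because each write() passes a sequence number, the sink buffers chunks that arrive early and delivers the data to Payload in order.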

5.2.5.20. Stream

A stream is a data structure that efficiently represents a potentially large, incrementally provided input stream of raw data. You can think of it as a bytes type that’s optimized for (1) efficiently appending new chunks of data at the end, and (2) trimming data no longer needed at the beginning. Other than those two operations, stream data cannot be modified; there’s no way to change the actual content of a stream once it has been added to it. Streams provide iterators for traversal, and views for limiting visibility to smaller windows into the total stream.

Streams are key to Spicy’s parsing process, although most of that happens behind the scenes. You will most likely encounter them when using Random access. They may also be useful for buffering larger volumes of data during processing.

Types

stream
iterator<stream>
view<stream>

Methods

at(i: uint<64>) → iterator<stream>: Returns an iterator representing the offset i inside the stream value.

freeze() → void: Freezes the stream value. Once frozen, one cannot append any more data to a frozen stream value (unless it gets unfrozen first). If the value is already frozen, the operation does not change anything.

is_frozen() → bool: Returns true if the stream value has been frozen.

statistics() → spicy::StreamStatistics: Returns statistics about the stream input received so far. Note that during parsing, this reflects all input that has already been sent to the stream, which may include data that has not been processed yet.

trim(i: iterator<stream>) → void: Trims the stream value by removing all data from its beginning up to (but not including) the position i. The iterator i will remain valid afterwards and will still point to the same location, which will now be the beginning of the stream’s value. All existing iterators pointing to i or beyond will remain valid and keep their offsets as well. The effect of this operation is undefined if i does not actually refer to a location inside the stream value. Trimming is permitted even on frozen values.

unfreeze() → void: Unfreezes the stream value. A unfrozen stream value can be further modified. If the value is already unfrozen (which is the default), the operation does not change anything.

Operators

begin(<container>) → <iterator>: Returns an iterator to the beginning of the container’s content.

stream(bytes) → stream: Creates a stream instance pre-initialized with the given data.

end(<container>) → <iterator>: Returns an iterator to the end of the container’s content.

|stream| → uint<64>: Returns the number of bytes the stream value contains.

stream += bytes → stream: Concatenates data to the stream.

stream += view<stream> → stream: Concatenates another stream’s view to the target stream.

stream != stream → bool: Compares two stream values lexicographically.

Iterator Methods

is_frozen() → bool: Returns whether the stream value that the iterator refers to has been frozen.

offset() → uint<64>: Returns the offset of the byte that the iterator refers to relative to the beginning of the underlying stream value.

Iterator Operators

*iterator<stream> → uint<64>: Returns the byte the iterator is pointing to.

iterator<stream> - iterator<stream> → int<64>: Returns the number of bytes between the two iterators. The result will be negative if the second iterator points to a location before the first. The result is undefined if the iterators do not refer to the same stream instance.

iterator<stream> == iterator<stream> → bool: Compares the two positions. The result is undefined if they are not referring to the same stream value.

iterator<stream> > iterator<stream> → bool: Compares the two positions. The result is undefined if they are not referring to the same stream value.

iterator<stream> >= iterator<stream> → bool: Compares the two positions. The result is undefined if they are not referring to the same stream value.

iterator<stream>++ → iterator<stream>: Advances the iterator by one byte, returning the previous position.

++iterator<stream> → iterator<stream>: Advances the iterator by one byte, returning the new position.

iterator<stream> < iterator<stream> → bool: Compares the two positions. The result is undefined if they are not referring to the same stream value.

iterator<stream> <= iterator<stream> → bool: Compares the two positions. The result is undefined if they are not referring to the same stream value.

iterator<stream> + uint<64> → iterator<stream> (commutative): Advances the iterator by the given number of bytes.

iterator<stream> += uint<64> → iterator<stream>: Advances the iterator by the given number of bytes.

iterator<stream> != iterator<stream> → bool: Compares the two positions. The result is undefined if they are not referring to the same stream value.

View Methods

advance(i: iterator<stream>) → view<stream>: Advances the view’s starting position to a given iterator i, returning the new view. The iterator must refer to the same stream value as the view, and it must be equal to or ahead of the view’s starting position.

advance(i: uint<64>) → view<stream>: Advances the view’s starting position by i bytes, returning the new view.

advance_to_next_data() → view<stream>: Advances the view’s starting position to the next non-gap position. This always advances the input by at least one byte.

at(i: uint<64>) → iterator<stream>: Returns an iterator representing the offset i inside the view.

find(needle: bytes) → tuple<bool, iterator<stream>>: Searches needle inside the view’s content. Returns a tuple of a boolean and an iterator. If needle was found, the boolean will be true and the iterator will point to its first occurrence. If needle was not found, the boolean will be false and the iterator will point to the last position so that everything before that is guaranteed to not contain even a partial match of needle (in other words: one can trim until that position and then restart the search from there if more data gets appended to the underlying stream value). Note that for a simple yes/no result, you should use the in operator instead of this method, as it’s more efficient.

limit(i: uint<64>) → view<stream>: Returns a new view that keeps the current start but ends i bytes after that start. The returned view will not be able to expand any further.

offset() → uint<64>: Returns the offset of the view’s starting position within the associated stream value.

starts_with(b: bytes) → bool: Returns true if the view starts with b.

sub(begin: iterator<stream>, end: iterator<stream>) → view<stream>: Returns a new view of the subsequence from begin up to (but not including) end.

sub(begin: uint<64>, end: uint<64>) → view<stream>: Returns a new view of the subsequence from offset begin to (but not including) offset end. The offsets are relative to the beginning of the view.

sub(end: iterator<stream>) → view<stream>: Returns a new view of the subsequence from the beginning of the stream up to (but not including) end.

View Operators

view<stream> == bytes → bool (commutative): Compares a stream view and a bytes instance lexicographically.

view<stream> == view<stream> → bool: Compares the views lexicographically.

bytes in view<stream> → bool: Returns true if the right-hand-side view contains the left-hand-side bytes as a subsequence.

view<stream> in bytes → bool: Returns true if the right-hand-side bytes contains the left-hand-side view as a subsequence.

bytes !in view<stream> → bool: Performs the inverse of the corresponding in operation.

view<stream> !in bytes → bool: Performs the inverse of the corresponding in operation.

|view<stream>| → uint<64>: Returns the number of bytes the view contains.

view<stream> != bytes → bool (commutative): Compares a stream view and a bytes instance lexicographically.

view<stream> != view<stream> → bool: Compares two views lexicographically.
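
Streams are mostly used implicitly during parsing, but they can also be handled directly. The following sketch uses only operations listed above (stream(bytes), +=, the size operator, freeze/unfreeze, at, and offset); the variable names are made up.

```spicy
module Example;

global s = stream(b"Hello");
s += b", world";                 # appending is allowed while unfrozen
assert |s| == 12;
assert !s.is_frozen();

s.freeze();                      # no further appends until unfrozen
assert s.is_frozen();

global it = s.at(7);             # iterator at offset 7
assert it.offset() == 7;

s.unfreeze();
s += b"!";
assert |s| == 13;
```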

5.2.5.21. String

Strings store readable text that’s associated with a given character set. Internally, Spicy stores them as UTF-8.

Type

string

Constants

"Spicy", ""
When specifying string constants, Spicy assumes them to be in UTF-8.

Methods

encode([ charset: spicy::Charset = spicy::Charset::UTF8 ], [ errors: spicy::DecodeErrorStrategy = spicy::DecodeErrorStrategy::REPLACE ]) → bytes: Converts the string into a binary representation encoded with the given character set.

ends_with(suffix: string) → bool: Returns true if the string value ends with suffix.

lower() → string: Returns a lower-case version of the string value.

split([ sep: string ]) → vector<string>: Splits the string value at each occurrence of sep and returns a vector containing the individual pieces, with all separators removed. If the separator is not found, or if the separator is empty, the returned vector will have the whole string value as its single element. If the separator is not given, the split will occur at sequences of white spaces.

split1([ sep: string ]) → tuple<string, string>: Splits the string value at the first occurrence of sep and returns the two parts as a 2-tuple, with the separator removed. If the separator is not found, the returned tuple will have the whole string value as its first element and an empty value as its second element. If the separator is empty, the returned tuple will have an empty first element and the whole string value as its second element. If the separator is not provided, the split will occur at the first sequence of white spaces.

starts_with(prefix: string) → bool: Returns true if the string value starts with prefix.

upper() → string: Returns an upper-case version of the string value.

Operators

string == string → bool: Compares two strings lexicographically.

string % <any> → string: Renders a printf-style format string.

|string| → uint<64>: Returns the number of characters the string contains.

string + string → string: Returns the concatenation of two strings.

string += string → string: Appends the second string to the first.

string != string → bool: Compares two strings lexicographically.
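
A brief sketch of the string methods and operators above; the header line and variable names are illustrative only.

```spicy
module Example;

global line = "Host: example.org";

# split1() cuts at the first separator only.
global kv = line.split1(": ");
assert kv[0] == "Host";
assert kv[1] == "example.org";

# split() cuts at every occurrence of the separator.
global csv = "a,b,c";
global parts = csv.split(",");
assert |parts| == 3;
assert parts[1] == "b";

assert kv[0].upper() == "HOST";
assert line.starts_with("Host");
assert |line| == 17;

# printf-style formatting with a tuple of arguments.
global msg = "%s:%d" % ("example.org", 443);
assert msg == "example.org:443";

# encode() defaults to UTF-8.
assert line.encode() == b"Host: example.org";
```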

5.2.5.22. Struct

A struct is a heterogeneous container of an ordered set of named values similar to a Tuple. In contrast to tuple elements, struct fields are mutable.

Type

struct { IDENTIFIER_1: TYPE_1; ...; IDENTIFIER_N: TYPE_N; }

Constants

Structs can be initialized with a struct initializer, such as local my_struct: MyStruct = [$FIELD_1 = X_1, ..., $FIELD_N = X_N], where FIELD_I is the label of the corresponding field in MyStruct’s type.

Operators

<struct> ?. <field> → bool: Returns true if the struct’s field has a value assigned (not counting any &default).

<struct> . <field> → <field type>: Retrieves the value of a struct’s field. If the field does not have a value assigned, it returns its &default expression if that has been defined; otherwise it triggers an exception.

<struct> .? <field> → <field type>: Retrieves the value of a struct’s field. If the field does not have a value assigned, it returns its &default expression if that has been defined; otherwise it signals a special non-error exception to the host application (which will normally still lead to aborting execution, similar to the standard dereference operator, unless the host application specifically handles this exception differently).

unset <struct>.<field> → void: Clears an optional field.
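
The sketch below shows a struct initializer together with the field operators. The type Server and its fields are hypothetical, and the example assumes that struct fields accept the &default and &optional attributes.

```spicy
module Example;

type Server = struct {
    host: string;
    ttl: uint64 &default=60;
    note: string &optional;
};

global srv: Server = [$host = "example.org"];

assert srv.host == "example.org";
assert srv.ttl == 60;            # no value assigned, so &default applies
assert !(srv?.note);             # optional field is unset

srv.note = "primary";
assert srv?.note;
assert srv.note == "primary";

unset srv.note;                  # clears the optional field again
assert !(srv?.note);
```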

5.2.5.23. Time

A time value refers to a specific, absolute point of time, specified as the interval from January 1, 1970 UT (i.e., the Unix epoch). Times are stored with nanosecond resolution, which is retained across all calculations.

Type

time

Constants

time(SECS) creates a time from an unsigned integer or real value SECS specifying seconds since the epoch.
time_ns(NSECS) creates a time from an unsigned integer value NSECS specifying nanoseconds since the epoch.

Methods

nanoseconds() → uint<64>: Returns the time as an integer value representing nanoseconds since the UNIX epoch.

seconds() → real: Returns the time as a real value representing seconds since the UNIX epoch.

Operators

time(int) → time: Creates a time interpreting the argument as number of seconds.

time(real) → time: Creates a time interpreting the argument as number of seconds.

time(uint) → time: Creates a time interpreting the argument as number of seconds.

time_ns(int) → time: Creates a time interpreting the argument as number of nanoseconds.

time_ns(uint) → time: Creates a time interpreting the argument as number of nanoseconds.

time - time → interval: Returns the difference of the times.

time - interval → time: Subtracts the interval from the time.

time == time → bool: Compares two time values.

time > time → bool: Compares the times.

time >= time → bool: Compares the times.

time < time → bool: Compares the times.

time <= time → bool: Compares the times.

time + interval → time (commutative): Adds the interval to the time.

time != time → bool: Compares two time values.
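
A small sketch of time arithmetic, using the interval type documented elsewhere in this reference; the variable names are illustrative.

```spicy
module Example;

global start = time(10);                  # 10 seconds after the epoch
global later = start + interval(2.5);     # time + interval yields a time

assert later > start;
assert later - start == interval(2.5);    # time - time yields an interval

# Nanosecond resolution is retained.
assert start.nanoseconds() == 10000000000;
assert time_ns(10000000000) == start;
```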

5.2.5.24. Tuple

Tuples are heterogeneous containers of a fixed, ordered set of types. Tuple elements may optionally be declared and addressed with custom identifier names. Tuple elements are immutable.

Type

tuple<[IDENTIFIER_1: ]TYPE_1, ..., [IDENTIFIER_N: ]TYPE_N>

Constants

(1, "string", True), (1, ), ()
tuple(1, "string", True), tuple(1), tuple()

Operators

(x,...,y)=tuple → <tuple>: Assigns element-wise to the left-hand-side tuple.

tuple == tuple → bool: Compares two tuples element-wise.

tuple[uint<64>] → <type of element>: Extracts the tuple element at the given index. The index must be a constant unsigned integer.

tuple . <id> → <type of element>: Extracts the tuple element corresponding to the given ID.

tuple != tuple → bool: Compares two tuples element-wise.
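
The following sketch shows a tuple with named elements, both access forms, and element-wise assignment. The names endpoint, host, num, h, and n are made up for this example.

```spicy
module Example;

global endpoint: tuple<host: string, num: uint64> = ("example.org", 443);

assert endpoint[0] == "example.org";   # access by constant index
assert endpoint.num == 443;            # access by element name

global h: string;
global n: uint64;
(h, n) = endpoint;                     # element-wise assignment

assert h == "example.org";
assert n == 443;
```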

5.2.5.25. Type

type stores a reference to a given Spicy type. To initialize a type instance, either use the typeinfo operator or, as a shortcut, directly assign any global type ID (note that you cannot directly assign a non-named type).

Inside Spicy code, there’s not much more to do with such type references than printing them out (which will output a readable representation of the type). However, host applications can leverage type to facilitate configuration of types to operate on.

Type

type

Usage

type S = struct {
    a: int32;
    b: int32;
};

global s1: type = typeinfo(bool);
global s2: type = typeinfo(S);
global s3: type = S;
global s4: type = typeinfo(0.5);

print s1, s2, s3, s4; # will output "bool, X::S, X::S, real"

5.2.5.26. Unit

Type

unit { FIELD_1; ...; FIELD_N }
See Parsing for a full discussion of unit types.

Constants

Spicy doesn’t support unit constants, but you can initialize unit instances through coercion from a struct initializer; see Struct.

Todo

This initialization isn’t actually available in Spicy yet (#1036).

Methods

backtrack() → void: Aborts parsing at the current position and returns back to the most recent &try attribute. Turns into a parse error if there’s no &try in scope.

connect_filter(filter: unit&) → void

Connects a separate filter unit to transform the unit’s input transparently before parsing. The filter unit will see the original input, and this unit will receive everything the filter passes on through forward().

Filters can be connected only before a unit’s parsing begins. The latest possible point is from inside the target unit’s %init hook.

context() → <context type>&: Returns a reference to the %context instance associated with the unit.

find(needle: bytes, [ dir: spicy::Direction ], [ start: iterator<stream> ]) → optional<iterator<stream>>: Searches a needle pattern inside the input region defined by where the unit began parsing and its current parsing position. If executed from inside a field hook, the current parsing position will represent the first byte that the field has been parsed from. By default, the search will start at the beginning of that region and scan forward. If the direction is spicy::Direction::Backward, the search will start at the end of the region and scan backward. In either case, a starting position can also be explicitly given, but must lie inside the same region.

forward(data: bytes) → void: If the unit is connected as a filter to another one, this method forwards transformed input over to that other one to parse. If the unit is not connected, this method will silently discard the data.

forward_eod() → void: If the unit is connected as a filter to another one, this method signals that other one that end of its input has been reached. If the unit is not connected, this method will not do anything.

input() → iterator<stream>: Returns an iterator referring to the input location where the current unit has begun parsing. If this method is called before the unit’s parsing has begun, it will throw a runtime exception. Once available, the input position will remain accessible for the unit’s entire lifetime.

offset() → uint<64>: Returns the offset of the current location in the input stream relative to the unit’s start. If executed from inside a field hook, the offset will represent the beginning of the field, not the end.

position() → iterator<stream>: Returns an iterator to the current position in the unit’s input stream. If executed from inside a field hook, the position will represent the beginning of the field, not the end.

set_input(i: iterator<stream>) → void: Moves the current parsing position to i. The iterator i must refer to a position inside the input of the current unit, or the method will throw a runtime exception.

stream() → stream: Returns the current input stream. This will return a valid value only while parsing is in progress, otherwise it will throw an exception.

Operators

<unit> ?. <field> → bool: Returns true if the unit’s field has a value assigned (not counting any &default).

<unit> . <field> → <field type>: Retrieves the value of a unit’s field. If the field does not have a value assigned, it returns its &default expression if that has been defined; otherwise it triggers an exception.

<unit> .? <field> → <field type>: Retrieves the value of a unit’s field. If the field does not have a value assigned, it returns its &default expression if that has been defined; otherwise it signals a special non-error exception to the host application (which will normally still lead to aborting execution, similar to the standard dereference operator, unless the host application specifically handles this exception differently).

unset unit.<field> → void: Clears an optional field.
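
As a minimal sketch of the unit methods and operators above, the following unit prints offsets from a field hook and from %done. The unit Message and its layout are invented for this example.

```spicy
module Example;

public type Message = unit {
    magic: bytes &size=4;

    len: uint8 {
        # Inside a field hook, offset() refers to the beginning of the field.
        print "len starts at offset %d" % self.offset();
    }

    body: bytes &size=self.len;

    on %done {
        # ?. tests whether the field has a value assigned.
        print "offset at end: %d, body set: %s" % (self.offset(), self?.body);
    }
};
```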

5.2.5.27. Vector

Vectors are homogeneous containers, holding an ordered sequence of elements of a given element type. Indexes begin at 0.

Vectors provide iterators to traverse their content in order. The following is an example iterating over a vector’s elements using the for statement:

global v = vector("a", "b", "c");

for ( i in v )
    print i;

Types

vector<T> specifies a vector with elements of type T.
iterator<vector<T>>

Constants

vector(E_1, E_2, ..., E_N) creates a vector of N elements. The values E_I must all have the same type. vector() creates an empty vector of unknown element type; this cannot be used directly but must be coerced into a fully-defined vector type first.
vector<T>(E_1, E_2, ..., E_N) creates a vector of type T, initializing it with the N elements E_I. vector<T>() creates an empty vector.
Vectors can be initialized through coercion from a list value: global v: vector<string> = ["A", "B", "C"].

Methods

assign(i: uint<64>, x: <any>) → void: Assigns x to the i-th element of the vector. If the vector contains fewer than i elements, a sufficient number of default-initialized elements is added to carry out the assignment.

at(i: uint<64>) → <iterator>: Returns an iterator referring to the element at vector index i.

back() → <type of element>: Returns the last element of the vector. It throws an exception if the vector is empty.

front() → <type of element>: Returns the first element of the vector. It throws an exception if the vector is empty.

pop_back() → void: Removes the last element from the vector, which must be non-empty.

push_back(x: <any>) → void: Appends x to the end of the vector.

reserve(n: uint<64>) → void: Reserves space for at least n elements. This operation does not change the vector in any observable way but provides a hint about the size that will be needed.

resize(n: uint<64>) → void: Resizes the vector to hold exactly n elements. If n is larger than the current size, the new slots are filled with default values. If n is smaller than the current size, the excess elements are removed.

sub(begin: uint<64>, end: uint<64>) → vector: Extracts a subsequence of vector elements spanning from index begin to (but not including) index end.

sub(end: uint<64>) → vector: Extracts a subsequence of vector elements spanning from the beginning of the vector to (but not including) index end.
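The methods above can be combined as in the following sketch. The values are arbitrary, and the comments describe the vector’s content as implied by the method descriptions rather than the exact printed output:

```spicy
module Example;

global v = vector<uint64>(10, 20, 30);

v.push_back(40);            # v now holds 10, 20, 30, 40
print v.front(), v.back();  # first and last element

v.pop_back();               # back to 10, 20, 30

# Index 5 does not exist yet, so default-initialized elements
# are added to fill the gap before the assignment takes place.
v.assign(5, 99);
print |v|;

print v.sub(1, 3);          # elements at indexes 1 and 2
print v.sub(2);             # elements at indexes 0 and 1

v.resize(2);                # drops everything past the first two elements
print *v.at(1);             # at() returns an iterator, hence the dereference
```

Note that at() yields an iterator, whereas the index operator returns the element itself.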

Operators

begin(<container>) → <iterator>: Returns an iterator to the beginning of the container’s content.

end(<container>) → <iterator>: Returns an iterator to the end of the container’s content.

vector == vector → bool: Compares two vectors element-wise.

vector[uint<64>] → <type of element>: Returns the vector element at the given index.

|vector| → uint<64>: Returns the number of elements a vector contains.

vector + vector → vector: Returns the concatenation of two vectors.

vector += vector → vector: Appends the elements of another vector to the vector.

vector != vector → bool: Compares two vectors element-wise.
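A brief sketch of the vector operators in use, with placeholder names and values:

```spicy
module Example;

global a = vector<uint64>(1, 2);
global b = vector<uint64>(3);

# "+" returns a new vector and leaves both operands untouched.
global c = a + b;

# "+=" appends the elements of b to a in place.
a += b;

print a == c;       # element-wise comparison of the two vectors
print |c|, c[2];    # number of elements, and the element at index 2
```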

Iterator Operators

*iterator<vector> → <dereferenced type>: Returns the vector element that the iterator refers to.

iterator<vector> == iterator<vector> → bool: Returns true if two vector iterators refer to the same location.

iterator<vector>++ → iterator<vector>: Advances the iterator by one vector element, returning the previous position.

++iterator<vector> → iterator<vector>: Advances the iterator by one vector element, returning the new position.

iterator<vector> != iterator<vector> → bool: Returns true if two vector iterators refer to different locations.
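For illustration, the iterator operators allow a vector to be traversed manually, which is roughly what the for statement does on the user’s behalf. This sketch assumes the begin() and end() operators listed for vectors; the variable names are placeholders:

```spicy
module Example;

global v = vector("a", "b", "c");
global i = begin(v);

# Walk from the first element until the end position is reached.
while ( i != end(v) ) {
    print *i;   # dereference to get the current element
    ++i;        # advance to the next element
}
```

In most cases the for statement is the simpler choice; explicit iterators are mainly useful when a position needs to be kept or compared.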

5.2.5.28. Void

The void type is a placeholder for specifying “no type”, such as when a function doesn’t return anything.

Type

void