4. Tutorial: A Real Analyzer

In this chapter we will develop a simple protocol analyzer from scratch. Our analyzer will parse the Trivial File Transfer Protocol (TFTP) in its original incarnation, as described in RFC 1350. TFTP provides a small protocol for copying files from a server to a client system. It is most commonly used these days for providing boot images to devices during initialization. The protocol is sufficiently simple that we can walk through it end to end. See its Wikipedia page for more background.

4.1. Creating a Spicy Grammar 

We start by developing a Spicy grammar for TFTP. The protocol is packet-based, and our grammar will parse the content of one TFTP packet at a time. While TFTP runs on top of UDP, our Spicy grammar will parse just the actual UDP application-layer payload, as described in Section 5 of the protocol standard.

4.1.1. Parsing One Packet Type 

TFTP is a binary protocol that uses a set of standardized, numerical opcodes to distinguish between different types of packets—a common idiom with such protocols. Each packet contains the opcode inside the first two bytes of the UDP payload, followed by further fields that then differ by type. For example, the following is the format of a TFTP “Read Request” (RRQ) that initiates a download from a server:

 2 bytes     string    1 byte     string   1 byte    (from RFC 1350)
 ------------------------------------------------
| Opcode |  Filename  |   0  |    Mode    |   0  |
 ------------------------------------------------

A Read Request uses an opcode of 1. The filename is a sequence of ASCII bytes terminated by a null byte. The mode is another null-terminated byte sequence that usually is either netascii, octet, or mail, describing the desired encoding for data that will be received.
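To make the byte layout concrete, here is a small Python sketch that assembles a Read Request payload by hand. It is an illustration outside the Spicy toolchain: the helper name build_rrq and the default mode of octet are assumptions chosen for this example, not part of TFTP tooling or Spicy.

```python
import struct

RRQ_OPCODE = 1  # opcode of a Read Request per RFC 1350


def build_rrq(filename: bytes, mode: bytes = b"octet") -> bytes:
    """Assemble an RRQ payload (hypothetical helper, for illustration only)."""
    # "!H" packs an unsigned 16-bit integer in network (big-endian) byte order.
    opcode = struct.pack("!H", RRQ_OPCODE)
    # Filename and mode are each terminated by a single null byte.
    return opcode + filename + b"\x00" + mode + b"\x00"


payload = build_rrq(b"rfc1350.txt")
print(payload)       # b'\x00\x01rfc1350.txt\x00octet\x00'
print(len(payload))  # 20
```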

Let’s stay with the Read Request for a little bit and write a Spicy parser just for this one packet type. The following is a minimal Spicy unit to parse the three fields:

module TFTP;                          # [1]

public type ReadRequest = unit {      # [2]
    opcode:   uint16;                 # [3]
    filename: bytes &until=b"\x00";   # [4]
    mode:     bytes &until=b"\x00";   # [5]

    on %done { print self; }          # [6]
};

Let’s walk through:

[1] All Spicy source files must start with a module line defining a namespace for their content. By convention, the namespace should match what is being parsed, so we call ours TFTP. Naming our module TFTP also implies saving it under the name tftp.spicy, so that other modules can find it through import TFTP;. See Modules for more on all of this.

[2] In Spicy, one will typically create a unit type for each of the main data units that a protocol defines. We want to parse a Read Request, so we call our type accordingly. We declare it as public because we want to use this unit as the starting point for parsing data. The following lines then lay out the elements of such a request in the same order as the protocol defines them.

[3] Per the TFTP specification, the first field contains the opcode as an integer value encoded over two bytes. For multi-byte integer values, it is important to consider the byte order for parsing. TFTP uses network byte order, which matches Spicy’s default, so there is nothing else for us to do here. (If we had to specify the order, we would add the &byte-order attribute.)

[4] The filename is a null-terminated byte sequence, which we can express directly as such in Spicy: the filename field will accumulate bytes until a null byte is encountered. Note that even though the specification of a Read Request shows the 0 as a separate element inside the packet, we don’t create a field for it, but rather exploit it as a terminator for the file name (the terminator itself will not be included in the stored filename).

[5] The mode operates just the same as the filename.

[6] Once we are done parsing a Read Request, we print out the result for debugging.
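For readers who find it helpful to see the same steps spelled out imperatively, the following Python sketch mirrors what the unit above does: read a big-endian 16-bit opcode, then two null-terminated byte strings with the terminators dropped. This is only an analogy under the assumption that the whole packet is available in memory; the function names are hypothetical, and it is not how Spicy implements parsing internally.

```python
import struct
from typing import Tuple


def read_until_null(data: bytes, offset: int) -> Tuple[bytes, int]:
    """Return the bytes from offset up to (excluding) the next null byte,
    plus the offset just past that null byte."""
    end = data.index(b"\x00", offset)  # raises ValueError if no terminator
    return data[offset:end], end + 1


def parse_read_request(data: bytes) -> dict:
    # Two bytes in network byte order, like `opcode: uint16`.
    (opcode,) = struct.unpack_from("!H", data, 0)
    # Like `bytes &until=b"\x00"`: the terminator is consumed but not stored.
    filename, pos = read_until_null(data, 2)
    mode, pos = read_until_null(data, pos)
    return {"opcode": opcode, "filename": filename, "mode": mode}


print(parse_read_request(b"\x00\x01rfc1350.txt\x00octet\x00"))
# {'opcode': 1, 'filename': b'rfc1350.txt', 'mode': b'octet'}
```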

We should now be able to parse a Read Request. To try it, we need the actual payload of a corresponding packet. With TFTP, the format is simple enough that we can start by faking data with printf and pipe that into the Spicy tool spicy-driver:

# printf '\000\001rfc1350.txt\000octet\000' | spicy-driver -d tftp.spicy
[$opcode=1, $filename=b"rfc1350.txt", $mode=b"octet"]

Here, spicy-driver compiles our ReadRequest unit into an executable parser and then feeds it with the data it is receiving on standard input. The output of spicy-driver is the result of our print statement executing at the end.

What would we do with a more complex protocol where we cannot easily use printf to create some dummy payload? We would probably have access to some protocol traffic in pcap traces. However, we can’t just feed those into spicy-driver directly, as they also contain all the other network layers that our grammar does not handle (e.g., IP and UDP). One way to test with a trace would be to proceed with Zeek integration at this point, so that we could let Zeek strip off the lower layers and then feed our parser only the TFTP application payload. However, during development it is often easier to avoid Zeek’s additional complexity at first, and stay with spicy-driver until the protocol parsing is mostly in place.

To facilitate that, spicy-driver offers a batch mode, which allows feeding connection-based, bi-directional packet payloads into a parser, just as Zeek (or any other network application) would do after stripping off the lower layers. In this mode, spicy-driver reads input from a specially crafted batch file that retains the packet structure of the underlying network communication as well as (just) the payload data that we want to parse.

To create such a batch input file, we can leverage Zeek itself: it comes with a corresponding script that turns any PCAP trace into a spicy-driver batch file. Let’s use that script with a tiny TFTP trace, tftp_rrq.pcap, borrowed from Wireshark’s pcap archive. First, we confirm with tcpdump that the trace contains a single file download:

# tcpdump -ttnr tftp_rrq.pcap
1367411051.972852 IP 192.168.0.253.50618 > 192.168.0.10.69:  20 RRQ "rfc1350.txtoctet" [|tftp]
1367411052.077243 IP 192.168.0.10.3445 > 192.168.0.253.50618: UDP, length 516
1367411052.081790 IP 192.168.0.253.50618 > 192.168.0.10.3445: UDP, length 4
[...]

We now run Zeek on that trace to perform the batch conversion:

# zeek -r tftp_rrq.pcap policy/frameworks/spicy/record-spicy-batch SpicyBatch::filename=tftp_rrq.dat
tracking [orig_h=192.168.0.253, orig_p=50618/udp, resp_h=192.168.0.10, resp_p=69/udp]
tracking [orig_h=192.168.0.10, orig_p=3445/udp, resp_h=192.168.0.253, resp_p=50618/udp]
recorded 2 sessions total
output in tftp_rrq.dat

This leaves a new spicy-driver batch file in tftp_rrq.dat. (If we had left off the SpicyBatch::filename argument, the output would have gone to the default name, batch.dat.)

Now we can pass that batch file into spicy-driver:

# spicy-driver -d -F tftp_rrq.dat -P 69/udp%orig=TFTP::ReadRequest tftp.spicy
[$opcode=1, $filename=b"rfc1350.txt", $mode=b"octet"]

The one additional piece here is that we need to tell spicy-driver on which packets inside the batch file to deploy our parser (because, in principle, the batch could contain many different protocols distributed over independent connections). We achieve that through -P 69/udp%orig=TFTP::ReadRequest, which specifies that we want to use the TFTP::ReadRequest unit on all originator-side UDP packets for any connections on port 69/udp. See the spicy-driver documentation for more on that syntax.

Note

New in version 1.13: parser aliases

The -P option (aka --parser-alias) is a feature added to Spicy in version 1.13. An alternative that works with older Spicy versions as well is to provide a %port property inside the TFTP::ReadRequest unit; the two mechanisms have the same effect.

Altogether, this gives us an easy way to test our TFTP parser with actual packet data, without needing to switch to full Zeek integration yet.

The batch mode of spicy-driver is generally worth keeping in mind while developing a new analyzer: even if the eventual goal is to create a Zeek analyzer, it is usually easier to work with spicy-driver for as long as possible before transitioning to the Zeek-side glue layer later. The same observation applies to debugging: tracking down why a parser isn’t quite doing what you would expect is normally quicker with Zeek out of the picture. You can even craft input for spicy-driver manually if you need to test specific edge cases, for example by simply editing the payload data inside an existing batch file, tweaking it the way you need it.

4.1.2. Generalizing to More Packet Types 

So far we can parse a Read Request, but nothing else. In fact, we are not even examining the opcode yet at all to see if our input actually is a Read Request. To generalize our grammar to other TFTP packet types, we will need to parse the opcode on its own first, and then use the value to decide how to handle subsequent data. Let’s start over with a minimal version of our TFTP grammar that looks at just the opcode:

module TFTP;

public type Packet = unit {
    opcode: uint16;

    on %done { print self; }
};

# spicy-driver -d -F tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet tftp.spicy
[$opcode=1]
[$opcode=3]
[$opcode=4]
[$opcode=3]
[$opcode=4]
[⋯]

As you see, we now use -P 69/udp=TFTP::Packet because we no longer need to worry about the direction: from now on, the same Packet unit handles both originator and responder sides. However, because of the way TFTP works, we need an additional parser mapping for the data connection that’s part of the PCAP as well, since that happens on a different port: -P 50618/udp=TFTP::Packet. The handling of such dynamic, non-standard ports is something that the host application (e.g., Zeek) would normally take care of on its side. With spicy-driver, we need to do it manually.

With this in place we now, in fact, see output for all the packets that the original PCAP contains.

Next we create a separate type to parse the fields that are specific to a Read Request:

type ReadRequest = unit {
    filename: bytes &until=b"\x00";
    mode:     bytes &until=b"\x00";
};

We do not declare this type as public because we will use it only internally inside our grammar; it is not a top-level entry point for parsing (that’s Packet now).

Now we need to tie the two units together. We can do that by adding the ReadRequest as a field to the Packet, which will let Spicy parse it as a sub-unit:

module TFTP;

public type Packet = unit {
    opcode: uint16;
    rrq:    ReadRequest;

    on %done { print self; }
};

# spicy-driver -d -F tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet tftp.spicy
[$opcode=1, $rrq=[$filename=b"rfc1350.txt", $mode=b"octet"]]
error for ID 192.168.0.10-3445-192.168.0.253-50618-udp-orig: end-of-data reached before &until expression found (0 bytes available) (:14:28-14:34)
error for ID 192.168.0.10-3445-192.168.0.253-50618-udp-resp: end-of-data reached before &until expression found (0 bytes available) (:14:28-14:34)
error for ID 192.168.0.10-3445-192.168.0.253-50618-udp-orig: end-of-data reached before &until expression found (0 bytes available) (:14:28-14:34)
error for ID 192.168.0.10-3445-192.168.0.253-50618-udp-resp: end-of-data reached before &until expression found (0 bytes available) (:14:28-14:34)
[⋯]

However, this does not help us much yet: it still resembles our original version in that it continues to hardcode one specific packet type. Indeed, we are now getting error messages for packets of other opcodes because we told spicy-driver to use Packet for them as well, even though our current definition of Packet cannot actually parse them successfully.

But the direction of using sub-units remains promising: we only need to instruct the parser to use the opcode to decide which sub-unit to parse. Spicy provides a switch construct for such dispatching:

module TFTP;

public type Packet = unit {
    opcode: uint16;

    switch ( self.opcode ) {
        1 -> rrq: ReadRequest;
    };

    on %done { print self; }
};

# spicy-driver -d -F tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet tftp.spicy
[$opcode=1, $rrq=[$filename=b"rfc1350.txt", $mode=b"octet"]]
error for ID 192.168.0.10-3445-192.168.0.253-50618-udp-orig: no matching case in switch statement for value '3' (:7:5-9:6)
error for ID 192.168.0.10-3445-192.168.0.253-50618-udp-resp: no matching case in switch statement for value '4' (:7:5-9:6)
error for ID 192.168.0.10-3445-192.168.0.253-50618-udp-orig: no matching case in switch statement for value '3' (:7:5-9:6)
error for ID 192.168.0.10-3445-192.168.0.253-50618-udp-resp: no matching case in switch statement for value '4' (:7:5-9:6)
[⋯]

The self keyword always refers to the unit instance currently being parsed, and we use that to get to the opcode for switching on. If it is 1, we descend into a Read Request. We are still getting error messages for other opcodes, but now spicy-driver is no longer complaining that it can’t parse them as a Read Request. Instead, we’re rightfully being told that our switch statement doesn’t provide alternatives for the other opcodes yet.
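The dispatching idea can be sketched in Python as a lookup table from opcode to a body parser. This is an analogy only, assuming each packet arrives as one complete bytes object; the names PARSERS, parse_packet, and parse_read_request are hypothetical. An opcode without a table entry is reported as an error, similar in spirit to the "no matching case" messages above.

```python
import struct
from typing import Callable, Dict


def parse_read_request(body: bytes) -> dict:
    # Two null-terminated strings; anything after the second terminator is ignored.
    filename, mode, _rest = body.split(b"\x00", 2)
    return {"filename": filename, "mode": mode}


# Opcode -> body parser; only the Read Request (opcode 1) is known so far.
PARSERS: Dict[int, Callable[[bytes], dict]] = {1: parse_read_request}


def parse_packet(data: bytes) -> dict:
    (opcode,) = struct.unpack_from("!H", data)
    parser = PARSERS.get(opcode)
    if parser is None:
        # No alternative registered for this opcode yet.
        raise ValueError("no matching case for opcode %d" % opcode)
    return {"opcode": opcode, "rrq": parser(data[2:])}


print(parse_packet(b"\x00\x01rfc1350.txt\x00octet\x00"))
# {'opcode': 1, 'rrq': {'filename': b'rfc1350.txt', 'mode': b'octet'}}
```

Supporting another packet type then amounts to adding one more entry to the table, which is the same move as adding another case to the switch.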

Of course, it is now easy to add more unit types for handling other opcodes. Let’s start with acknowledgments:

public type Packet = unit {
    opcode: uint16;

    switch ( self.opcode ) {
        1 -> rrq: ReadRequest;
        4 -> ack: Acknowledgement;
    };

    on %done { print self; }
};

type Acknowledgement = unit {
    num: uint16; # block number being acknowledged
};

# spicy-driver -d -F tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet tftp.spicy
[$opcode=1, $rrq=[$filename=b"rfc1350.txt", $mode=b"octet"], $ack=(not set)]
error for ID 192.168.0.10-3445-192.168.0.253-50618-udp-orig: no matching case in switch statement for value '3' (:9:5-12:6)
[$opcode=4, $rrq=(not set), $ack=[$num=1]]
error for ID 192.168.0.10-3445-192.168.0.253-50618-udp-orig: no matching case in switch statement for value '3' (:9:5-12:6)
[$opcode=4, $rrq=(not set), $ack=[$num=2]]
[⋯]

As expected, the output shows that for opcode 4, our TFTP parser now descends into the ack field while leaving rrq unset. Now opcode 3 is the only one remaining in our input that is not handled yet, hence the remaining error messages.

In total, TFTP defines three more opcodes for other packet types: 2 is a Write Request, 3 is file data being sent, and 5 is an error. Let’s add these to our grammar as well, so that we get the whole protocol covered (please refer to the RFC for specifics of each opcode type):

module TFTP;

public type Packet = unit {
    opcode: uint16;

    switch ( self.opcode ) {
        1 -> rrq:   ReadRequest;
        2 -> wrq:   WriteRequest;
        3 -> data:  Data;
        4 -> ack:   Acknowledgement;
        5 -> error: Error;
    };

    on %done { print self; }
};

type ReadRequest = unit {
    filename: bytes &until=b"\x00";
    mode:     bytes &until=b"\x00";
};

type WriteRequest = unit {
    filename: bytes &until=b"\x00";
    mode:     bytes &until=b"\x00";
};

type Data = unit {
    num:  uint16;
    data: bytes &eod; # parse until end of data (i.e., packet) is reached
};

type Acknowledgement = unit {
    num: uint16;
};

type Error = unit {
    code: uint16;
    msg:  bytes &until=b"\x00";
};

# spicy-driver -d -F tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet tftp.spicy
[$opcode=1, $rrq=[$filename=b"rfc1350.txt", $mode=b"octet"], $wrq=(not set), $data=(not set), $ack=(not set), $error=(not set)]
[$opcode=3, $rrq=(not set), $wrq=(not set), $data=[$num=1, $data=b"\x0a\x0a\x0a\x0a\x0a\x0aNetwork Working Group                                         K. Sollins\x0aRequest For Comments: 1350                                           MIT\x0aSTD: 33                                                        July 1992\x0aObsoletes: RFC 783\x0a\x0a\x0a                     THE TFTP PROTOCOL (REVISION 2)\x0a\x0aStatus of this Memo\x0a\x0a   This RFC specifies an IAB standards track protocol for the Internet\x0a   community, and requests discussion and suggestions for improvements.\x0a   Please refer to the current edition of the \"IA"], $ack=(not set), $error=(not set)]
[$opcode=4, $rrq=(not set), $wrq=(not set), $data=(not set), $ack=[$num=1], $error=(not set)]
[$opcode=3, $rrq=(not set), $wrq=(not set), $data=[$num=2, $data=b"B Official Protocol\x0a   Standards\" for the standardization state and status of this protocol.\x0a   Distribution of this memo is unlimited.\x0a\x0aSummary\x0a\x0a   TFTP is a very simple protocol used to transfer files.  It is from\x0a   this that its name comes, Trivial File Transfer Protocol or TFTP.\x0a   Each nonterminal packet is acknowledged separately.  This document\x0a   describes the protocol and its types of packets.  The document also\x0a   explains the reasons behind some of the design decisions.\x0a\x0aAcknowlegements\x0a\x0a   The "], $ack=(not set), $error=(not set)]
[$opcode=4, $rrq=(not set), $wrq=(not set), $data=(not set), $ack=[$num=2], $error=(not set)]
[⋯]

Now we are finally error-free.
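To illustrate how the two field styles used by Data and Error differ, here is a Python sketch of both, operating on the payload that follows the opcode. It assumes one complete packet per call; the function names are hypothetical, and the sample error message is made up for the example. The data field takes everything up to the end of the packet (the analogue of &eod), while the error message stops at its null terminator (the analogue of &until).

```python
import struct


def parse_data(body: bytes) -> dict:
    # Block number, then all remaining bytes of the packet.
    (num,) = struct.unpack_from("!H", body)
    return {"num": num, "data": body[2:]}


def parse_error(body: bytes) -> dict:
    # Error code, then a null-terminated message (terminator not stored).
    (code,) = struct.unpack_from("!H", body)
    end = body.index(b"\x00", 2)  # raises ValueError if no terminator
    return {"code": code, "msg": body[2:end]}


print(parse_data(b"\x00\x01hello"))
# {'num': 1, 'data': b'hello'}
print(parse_error(b"\x00\x01File not found\x00"))
# {'code': 1, 'msg': b'File not found'}
```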

This grammar works well already, but we can improve it a bit more.

4.1.3. Using Enums 

The use of integer values inside the switch construct is not exactly pretty: they are hard to read and maintain. We can improve our grammar by using an enumerator type with descriptive labels instead. We first declare an enum type that provides one label for each possible opcode:

type Opcode = enum { RRQ = 1, WRQ = 2, DATA = 3, ACK = 4, ERROR = 5 };

Now we can change the switch to look like this:

    switch ( self.opcode ) {
        Opcode::RRQ   -> rrq:   ReadRequest;
        Opcode::WRQ   -> wrq:   WriteRequest;
        Opcode::DATA  -> data:  Data;
        Opcode::ACK   -> ack:   Acknowledgement;
        Opcode::ERROR -> error: Error;
    };

Much better, but there is a catch still: this will not compile because of a type mismatch. The switch cases’ expressions have type Opcode, but self.opcode remains of type uint16. That is because Spicy cannot know on its own that the integers we parse into opcode match the numerical values of the Opcode labels. But we can convert the former into the latter explicitly by adding a &convert attribute to the opcode field:

public type Packet = unit {
    opcode: uint16 &convert=Opcode($$);
    ...
};

This does two things:

Each time a uint16 gets parsed for this field, it is not directly stored in opcode, but instead first passed through the expression that &convert specifies. Spicy then stores the result of that expression, potentially adapting the field’s type accordingly. Inside the &convert expression, the parsed value is accessible through the special identifier $$.
Our &convert expression passes the parsed integer into the constructor for the Opcode enumerator type, which lets Spicy create an Opcode value with the label that corresponds to the integer value.

With this transformation, the opcode field now has type Opcode and hence can be used with our updated switch statement. You can see the new type for opcode in the output as well:

# spicy-driver -d -F tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet tftp.spicy
[$opcode=Opcode::RRQ, $rrq=[$filename=b"rfc1350.txt", $mode=b"octet"], $wrq=(not set), $data=(not set), $ack=(not set), $error=(not set)]
[$opcode=Opcode::DATA, $rrq=(not set), $wrq=(not set), $data=[$num=1, $data=b"\x0a\x0a\x0a\x0a\x0a\x0aNetwork Working Group                                         K. Sollins\x0aRequest For Comments: 1350                                           MIT\x0aSTD: 33                                                        July 1992\x0aObsoletes: RFC 783\x0a\x0a\x0a                     THE TFTP PROTOCOL (REVISION 2)\x0a\x0aStatus of this Memo\x0a\x0a   This RFC specifies an IAB standards track protocol for the Internet\x0a   community, and requests discussion and suggestions for improvements.\x0a   Please refer to the current edition of the \"IA"], $ack=(not set), $error=(not set)]
[$opcode=Opcode::ACK, $rrq=(not set), $wrq=(not set), $data=(not set), $ack=[$num=1], $error=(not set)]
[$opcode=Opcode::DATA, $rrq=(not set), $wrq=(not set), $data=[$num=2, $data=b"B Official Protocol\x0a   Standards\" for the standardization state and status of this protocol.\x0a   Distribution of this memo is unlimited.\x0a\x0aSummary\x0a\x0a   TFTP is a very simple protocol used to transfer files.  It is from\x0a   this that its name comes, Trivial File Transfer Protocol or TFTP.\x0a   Each nonterminal packet is acknowledged separately.  This document\x0a   describes the protocol and its types of packets.  The document also\x0a   explains the reasons behind some of the design decisions.\x0a\x0aAcknowlegements\x0a\x0a   The "], $ack=(not set), $error=(not set)]
[$opcode=Opcode::ACK, $rrq=(not set), $wrq=(not set), $data=(not set), $ack=[$num=2], $error=(not set)]
[⋯]

See On-the-fly Type Conversion with &convert for more on &convert, and Enum for more on the enum type.

Note

What happens when Opcode($$) receives an integer that does not correspond to any of the labels? Spicy permits that and will substitute an implicitly defined Opcode::Undef label. It will also retain the actual integer value, which can be recovered by converting the enum value back to an integer.
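As a loose analogy, the following Python sketch shows a conversion that falls back to an "Undef" label while keeping the original integer. Python's own IntEnum raises ValueError for unknown values, so the fallback is written out explicitly; the helper name convert_opcode and the tuple return shape are assumptions for this example and do not reflect how Spicy represents enum values internally.

```python
from enum import IntEnum
from typing import Tuple


class Opcode(IntEnum):
    RRQ = 1
    WRQ = 2
    DATA = 3
    ACK = 4
    ERROR = 5


def convert_opcode(value: int) -> Tuple[str, int]:
    """Map an integer to a label, keeping the raw value alongside it."""
    try:
        return Opcode(value).name, value
    except ValueError:
        # Unknown integer: use a placeholder label but retain the number.
        return "Undef", value


print(convert_opcode(3))  # ('DATA', 3)
print(convert_opcode(9))  # ('Undef', 9)
```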

4.1.4. Using Unit Parameters 

Looking at the two types ReadRequest and WriteRequest, we see that both use exactly the same fields. That means we do not really need two separate types here, and could instead define a single Request unit to cover both cases. Doing so is straightforward, except for one issue: when parsing such a Request, we would now lose the information whether we are seeing a read or a write operation. For a potential Zeek integration later it will be useful to retain that distinction, so let us leverage a Spicy capability that allows passing state into a sub-unit: unit parameters. Here’s the corresponding excerpt after that refactoring:

public type Packet = unit {
    opcode: uint16 &convert=Opcode($$);

    switch ( self.opcode ) {
        Opcode::RRQ   -> rrq:   Request(True);
        Opcode::WRQ   -> wrq:   Request(False);
        # ...
    };

    on %done { print self; }
};

type Request = unit(is_read: bool) {
    filename: bytes &until=b"\x00";
    mode:     bytes &until=b"\x00";

    on %done { print "We got a %s request." % (is_read ? "read" : "write"); }
};

We see that the switch now passes either True or False into the Request type, depending on whether it is a Read Request or Write Request. For demonstration, we added another print statement, so that we can see how that boolean becomes available through the is_read unit parameter:

# spicy-driver -d -F tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet tftp.spicy
We got a read request.
[$opcode=Opcode::RRQ, $rrq=[$filename=b"rfc1350.txt", $mode=b"octet"], $wrq=(not set), $data=(not set), $ack=(not set), $error=(not set)]
[$opcode=Opcode::DATA, $rrq=(not set), $wrq=(not set), $data=[$num=1, $data=b"\x0a\x0a\x0a\x0a\x0a\x0aNetwork Working Group                                         K. Sollins\x0aRequest For Comments: 1350                                           MIT\x0aSTD: 33                                                        July 1992\x0aObsoletes: RFC 783\x0a\x0a\x0a                     THE TFTP PROTOCOL (REVISION 2)\x0a\x0aStatus of this Memo\x0a\x0a   This RFC specifies an IAB standards track protocol for the Internet\x0a   community, and requests discussion and suggestions for improvements.\x0a   Please refer to the current edition of the \"IA"], $ack=(not set), $error=(not set)]
[$opcode=Opcode::ACK, $rrq=(not set), $wrq=(not set), $data=(not set), $ack=[$num=1], $error=(not set)]
[$opcode=Opcode::DATA, $rrq=(not set), $wrq=(not set), $data=[$num=2, $data=b"B Official Protocol\x0a   Standards\" for the standardization state and status of this protocol.\x0a   Distribution of this memo is unlimited.\x0a\x0aSummary\x0a\x0a   TFTP is a very simple protocol used to transfer files.  It is from\x0a   this that its name comes, Trivial File Transfer Protocol or TFTP.\x0a   Each nonterminal packet is acknowledged separately.  This document\x0a   describes the protocol and its types of packets.  The document also\x0a   explains the reasons behind some of the design decisions.\x0a\x0aAcknowlegements\x0a\x0a   The "], $ack=(not set), $error=(not set)]
[⋯]

Admittedly, the unit parameter is almost overkill in this example, but it proves very useful in more complex grammars where one needs access to state information, in particular also from higher-level units. For example, if the Packet type stored additional state that sub-units needed access to, they could receive the Packet itself as a parameter.
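The same refactoring can be sketched in Python, where the unit parameter corresponds to an ordinary function argument supplied by the caller. This is an analogy only: the function names are hypothetical, and storing is_read in the returned dictionary is a choice made for this example so the flag is visible in the result.

```python
import struct


def parse_request(body: bytes, is_read: bool) -> dict:
    # One parser serves both request types; the caller supplies the direction.
    filename, mode, _rest = body.split(b"\x00", 2)
    print("We got a %s request." % ("read" if is_read else "write"))
    return {"filename": filename, "mode": mode, "is_read": is_read}


def parse_packet(data: bytes) -> dict:
    (opcode,) = struct.unpack_from("!H", data)
    if opcode == 1:  # RRQ
        return parse_request(data[2:], True)
    if opcode == 2:  # WRQ
        return parse_request(data[2:], False)
    raise ValueError("opcode %d not handled in this sketch" % opcode)


parse_packet(b"\x00\x01rfc1350.txt\x00octet\x00")  # prints: We got a read request.
parse_packet(b"\x00\x02rfc1350.txt\x00octet\x00")  # prints: We got a write request.
```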

4.1.5. Complete Grammar 

Combining everything discussed so far, this leaves us with the following complete grammar for TFTP, including the packet formats in comments as well:

# Copyright (c) 2020-now by the Zeek Project. See LICENSE for details.
#
# Trivial File Transfer Protocol
#
# Specs from https://tools.ietf.org/html/rfc1350

module TFTP;

import spicy;

# Common header for all messages:
#
#      2 bytes
# ---------------
# |  TFTP Opcode  |
#  ---------------

public type Packet = unit {    # public top-level entry point for parsing
    op: uint16 &convert=Opcode($$);
    switch ( self.op ) {
        Opcode::RRQ   -> rrq:   Request(True);
        Opcode::WRQ   -> wrq:   Request(False);
        Opcode::DATA  -> data:  Data;
        Opcode::ACK   -> ack:   Acknowledgement;
        Opcode::ERROR -> error: Error;
    };
};

# TFTP supports five types of packets [...]:
#
# opcode  operation
#   1     Read request (RRQ)
#   2     Write request (WRQ)
#   3     Data (DATA)
#   4     Acknowledgment (ACK)
#   5     Error (ERROR)
type Opcode = enum {
    RRQ = 0x01,
    WRQ = 0x02,
    DATA = 0x03,
    ACK = 0x04,
    ERROR = 0x05
};

# Figure 5-1: RRQ/WRQ packet
#
#  2 bytes     string    1 byte     string   1 byte
#  ------------------------------------------------
# | Opcode |  Filename  |   0  |    Mode    |   0  |
#  ------------------------------------------------

type Request = unit(is_read: bool) {
    filename: bytes &until=b"\x00";
    mode:     bytes &until=b"\x00";

};

# Figure 5-2: DATA packet
#
#  2 bytes     2 bytes      n bytes
#   ----------------------------------
#  | Opcode |   Block #  |   Data     |
#   ----------------------------------

type Data = unit {
    num:  uint16;
    data: bytes &eod;
};

# Figure 5-3: ACK packet
#
#  2 bytes     2 bytes
#  ---------------------
# | Opcode |   Block #  |
#  ---------------------

type Acknowledgement = unit {
    num: uint16;
};

#  Figure 5-4: ERROR packet
#
#  2 bytes     2 bytes      string    1 byte
#  -----------------------------------------
# | Opcode |  ErrorCode |   ErrMsg   |   0  |
#  -----------------------------------------

type Error = unit {
    code: uint16;
    msg:  bytes &until=b"\x00";
};

4.2. Next Steps 

This tutorial provides an introduction to the Spicy language and toolchain. Spicy’s capabilities go much further than what we could show here. Some pointers for what to look at next:

Programming in Spicy provides an in-depth discussion of the Spicy language, including in particular all the constructs for parsing data and a reference of language elements. Note that most of Spicy’s types come with operators and methods for operating on values. The Debugging section helps in understanding Spicy’s operation if results do not match what you would expect.
Examples summarizes grammars coming with the Spicy distribution.
Zeek’s Spicy tutorial continues the TFTP example by turning the Spicy code developed here into a full Zeek analyzer.
Zeek Integration discusses Spicy’s integration into Zeek.
