5.6. Debugging

It can be challenging to track down the specifics of what a parser is doing (or not doing) because often there’s no directly observable effect. To make that easier, Spicy comes with debugging support that helps during parser development.

Generally, debugging support requires running spicyc or spicy-driver with option -d; that enables generating debug versions of the generated C++ code. In addition, the option -X <tag> may enable additional, more expensive debug instrumentation, as discussed below. Any use of -X implicitly turns on -d.

5.6.1. Debug Hooks

The simplest way to learn more about what’s going on is to add hooks with print statements to your grammar. That’s rather disruptive though, and hence there are also special %debug unit hooks which only get compiled into the resulting code if spicy-driver is run with debugging enabled (-d):

module Test;

public type test = unit {
    a: /1234/ %debug {
        print self.a;
    }

    b: /567890/;

    on b %debug { print self.b; }
};
# printf "1234567890" | spicy-driver -d debugging.spicy
1234
567890
# printf "1234567890" | spicy-driver debugging.spicy

5.6.2. Debug Streams

A second form of debugging support uses runtime debug streams that instrument the generated parsers to log activity as they are parsing their input. If you run spicy-driver with -d, you can set the environment variable HILTI_DEBUG to a set of debug stream names to select the desired information (see below for the list). Execution will then print debug information to standard error:

> echo "GET /index.html HTTP/1.0" | HILTI_DEBUG=spicy spicy-driver -d http-request.spicy
[spicy] Request::RequestLine
[spicy]   method = GET
[spicy]   anon_2 =
[spicy]   uri = /index.html
[spicy]   anon_3 =
[spicy]   Request::Version
[spicy]     anon = HTTP/
[spicy]     number = 1.0
[spicy]   version = [$anon=b"HTTP/", $number=b"1.0"]
[spicy]   anon_4 = \n
GET, /index.html, 1.0

The available debug streams include:

spicy

Logs unit fields and variables as they receive values. This is often the most helpful output as it shows rather concisely what the parser is doing, and in particular how far it gets in cases where it doesn’t parse something correctly.

spicy-verbose

Logs various internals about the parsing process, including the grammar rules currently being parsed, the current input, and lexer tokens.

This stream is primarily intended for debugging the Spicy compiler itself, but it can be helpful also in particular for understanding the data that remains to be parsed.

hilti-trace

This is a HILTI-level debug stream that records every HILTI instruction being executed. To use this, you need to run spicy-driver with -X trace.

This stream is primarily intended for debugging the Spicy compiler itself.

hilti-flow

This is a HILTI-level debug level recording flow information like function calls. To use this, you need to run spicy-driver with -X flow.

This stream is primarily intended for debugging the Spicy compiler itself, although it may also be helpful to understand the internal control flow when writing a grammar.

Multiple streams can be enabled by separating them with colons.

5.6.3. Exceptions

When encountering runtime errors, Spicy by default triggers C++ exceptions that bubble up back to the host application. If not handled there, execution will stop. For debugging, you can also let the Spicy runtime system abort() with a core dump, instead of throwing an exception, by running spicy-driver with --abort-on-exceptions. That especially helps inside a debugger.

If in addition you specify --show-backtraces as well, it will print a stack trace before aborting (assuming support for that is available on your platform).

5.6.4. Inspecting Generated Code

Using spicyc you can inspect the code that’s being generated for a given Spicy grammar:

  • spicyc -p outputs the intermediary HILTI code. The code tends to be pretty intuitively readable. Even if you don’t know all the specifics of HILTI, much of the code is rather close to Spicy itself. (Per above, you can trace the generated HILTI code as it executes by activating the hilti-trace debug stream).

  • spicyc -c outputs the final C++ code. If you add -L, the output will also include additional code generated by HILTI’s linker (which enables cross-module functionality).

  • When JITing a grammar with spicyc -j, running with -D dump-code will record all generated intermediary code (HILTI code, C++ code, object files) into files dbg.* inside the current directory.

5.6.5. Skipping Validation

When working on the Spicy code, it can be helpful to disable internal validation of generated HILTI code with -V. That way, one can often still see the HILTI code even if it’s malformed. Note, however, that Spicy may end up crashing if broken HILTI code gets passed into later stages.

5.6.6. Interpreting Stack Traces

When looking at stack traces taken during Spicy parsing (e.g., from --show-backtraces, or inside a debugger or profiler), it’s important to understand that the stack trace will show internal HILTI functions that are being executed, not the Spicy grammar rules. While HILTI functions are generated from the Spicy grammar, and bear some resemblance to its structure, the mapping is not always straightforward.

In addition, stack traces may be hard to interpret because the generated code runs inside fibers, which are a form of lightweight threads. Fibers are managed by HILTI’s runtime system and are used to implement the non-blocking parsing. As a result of the fibers, stack traces don’t show the full call chain up into the host application, but stop prematurely at the point where execution switched from the host application into a fiber.

Here’s an example stack trace taken during an execution of spicy-driver with a small Spicy grammar printing some parsed value, as shown by lldb:

frame #0:  void hilti::rt::print<hilti::rt::ValueReference<hlt_internal::Test::Foo>>
frame #1:  hlt_internal::Test::_hook_Foo___on_0x25_done_7fba
frame #2:  hlt_internal::Test::_hook_Foo___on_0x25_done
frame #3:  hlt_internal::Test::Foo::_on_0x25_done
frame #4:  hlt_internal::Test::Foo::_parse_Test__Foo_stage2
frame #5:  hlt_internal::Test::Foo::_parse_stage1
frame #6:  hlt_internal::Test::Foo::parse3
frame #7:  hlt::Test::Foo::parse3
frame #8:  hilti::rt::detail::Callback::Callback<hlt::Test::Foo::parse3>
frame #9:  hilti::rt::detail::Callback::Callback<hlt::Test::Foo::parse3>
frame #10: libhilti.dylib`hilti::rt::detail::Callback::operator
frame #11: libhilti.dylib`__fiber_run_trampoline
frame #12: libhilti.dylib`fiber_asm_invoke at fiber_asm_aarch64_apcs.S:83

__fiber_asm_invoke is the original entry point where control transferred into the fiber. Note that the stack trace doesn’t show anything further beyond that point: no function of spicy-driver shows up.

While this example is relatively clean, depending on the OS and toolchain, sometimes there’s additional noise in the stack trace just beyond that fiber_asm_invoke entry. In particular, profiling tools trying to aggregate multiple stack traces may get confused by the fibers, and can end up breaking up activity that belongs together. Sometimes this can be solved through some custom pre-processing of the stack traces, removing/ignoring anything beyond the fiber entry point (hilti::rt::detail::Callback::operator is a good cut-off point).