5.6. Debugging
It can be challenging to track down the specifics of what a parser is doing (or not doing) because often there’s no directly observable effect. To make that easier, Spicy comes with debugging support that helps during parser development.
Generally, debugging support requires running spicyc or
spicy-driver with option -d; that enables generating debug
versions of the generated C++ code. In addition, the option -X
<tag> may enable additional, more expensive debug instrumentation,
as discussed below. Any use of -X implicitly turns on -d.
5.6.1. Debug Hooks
The simplest way to learn more about what’s going on is to add hooks
with print statements to your grammar. That’s rather disruptive
though, and hence there are also special %debug unit hooks which
only get compiled into the resulting code if spicy-driver is run
with debugging enabled (-d):
module Test;
public type test = unit {
a: /1234/ %debug {
print self.a;
}
b: /567890/;
on b %debug { print self.b; }
};
# printf "1234567890" | spicy-driver -d debugging.spicy
1234
567890
# printf "1234567890" | spicy-driver debugging.spicy
5.6.2. Debug Streams
A second form of debugging support uses runtime debug streams that
instrument the generated parsers to log activity as they are parsing
their input. If you run spicy-driver with -d, you can set the
environment variable HILTI_DEBUG to a set of debug stream names to
select the desired information (see below for the list). Execution
will then print debug information to standard error:
> echo "GET /index.html HTTP/1.0" | HILTI_DEBUG=spicy spicy-driver -d http-request.spicy
[spicy] Request::RequestLine
[spicy] method = GET
[spicy] anon_2 =
[spicy] uri = /index.html
[spicy] anon_3 =
[spicy] Request::Version
[spicy] anon = HTTP/
[spicy] number = 1.0
[spicy] version = [$anon=b"HTTP/", $number=b"1.0"]
[spicy] anon_4 = \n
GET, /index.html, 1.0
The available debug streams include:
spicyLogs unit fields and variables as they receive values. This is often the most helpful output as it shows rather concisely what the parser is doing, and in particular how far it gets in cases where it doesn’t parse something correctly.
spicy-verboseLogs various internals about the parsing process, including the grammar rules currently being parsed, the current input, and lexer tokens.
This stream is primarily intended for debugging the Spicy compiler itself, but it can be helpful also in particular for understanding the data that remains to be parsed.
hilti-traceThis is a HILTI-level debug stream that records every HILTI instruction being executed. To use this, you need to run
spicy-driverwith-X trace.This stream is primarily intended for debugging the Spicy compiler itself.
hilti-flowThis is a HILTI-level debug level recording flow information like function calls. To use this, you need to run
spicy-driverwith-X flow.This stream is primarily intended for debugging the Spicy compiler itself, although it may also be helpful to understand the internal control flow when writing a grammar.
Multiple streams can be enabled by separating them with colons.
5.6.3. Exceptions
When encountering runtime errors, Spicy by default triggers C++
exceptions that bubble up back to the host application. If not handled
there, execution will stop. For debugging, you can also let the Spicy
runtime system abort() with a core dump, instead of throwing an
exception, by running spicy-driver with --abort-on-exceptions.
That especially helps inside a debugger.
If in addition you specify --show-backtraces as well, it will
print a stack trace before aborting (assuming support for that is
available on your platform).
5.6.4. Inspecting Generated Code
Using spicyc you can inspect the code that’s being generated for a
given Spicy grammar:
spicyc -poutputs the intermediary HILTI code. The code tends to be pretty intuitively readable. Even if you don’t know all the specifics of HILTI, much of the code is rather close to Spicy itself. (Per above, you can trace the generated HILTI code as it executes by activating thehilti-tracedebug stream).spicyc -coutputs the final C++ code. If you add-L, the output will also include additional code generated by HILTI’s linker (which enables cross-module functionality).When JITing a grammar with
spicyc -j, running with-D dump-codewill record all generated intermediary code (HILTI code, C++ code, object files) into filesdbg.*inside the current directory.
5.6.5. Skipping Validation
When working on the Spicy code, it can be helpful to disable internal
validation of generated HILTI code with -V. That way, one can
often still see the HILTI code even if it’s malformed. Note, however,
that Spicy may end up crashing if broken HILTI code gets passed into
later stages.
5.6.6. Interpreting Stack Traces
When looking at stack traces taken during Spicy parsing (e.g., from
--show-backtraces, or inside a debugger or profiler), it’s
important to understand that the stack trace will show internal HILTI
functions that are being executed, not the Spicy grammar rules. While
HILTI functions are generated from the Spicy grammar, and bear some
resemblance to its structure, the mapping is not always
straightforward.
In addition, stack traces may be hard to interpret because the generated code runs inside fibers, which are a form of lightweight threads. Fibers are managed by HILTI’s runtime system and are used to implement the non-blocking parsing. As a result of the fibers, stack traces don’t show the full call chain up into the host application, but stop prematurely at the point where execution switched from the host application into a fiber.
Here’s an example stack trace taken during an execution of
spicy-driver with a small Spicy grammar printing some parsed
value, as shown by lldb:
frame #0: void hilti::rt::print<hilti::rt::ValueReference<hlt_internal::Test::Foo>>
frame #1: hlt_internal::Test::_hook_Foo___on_0x25_done_7fba
frame #2: hlt_internal::Test::_hook_Foo___on_0x25_done
frame #3: hlt_internal::Test::Foo::_on_0x25_done
frame #4: hlt_internal::Test::Foo::_parse_Test__Foo_stage2
frame #5: hlt_internal::Test::Foo::_parse_stage1
frame #6: hlt_internal::Test::Foo::parse3
frame #7: hlt::Test::Foo::parse3
frame #8: hilti::rt::detail::Callback::Callback<hlt::Test::Foo::parse3>
frame #9: hilti::rt::detail::Callback::Callback<hlt::Test::Foo::parse3>
frame #10: libhilti.dylib`hilti::rt::detail::Callback::operator
frame #11: libhilti.dylib`__fiber_run_trampoline
frame #12: libhilti.dylib`fiber_asm_invoke at fiber_asm_aarch64_apcs.S:83
__fiber_asm_invoke is the original entry point where control
transferred into the fiber. Note that the stack trace doesn’t show
anything further beyond that point: no function of spicy-driver
shows up.
While this example is relatively clean, depending on the OS and
toolchain, sometimes there’s additional noise in the stack trace just
beyond that fiber_asm_invoke entry. In particular, profiling tools
trying to aggregate multiple stack traces may get confused by the
fibers, and can end up breaking up activity that belongs together.
Sometimes this can be solved through some custom pre-processing of the
stack traces, removing/ignoring anything beyond the fiber entry point
(hilti::rt::detail::Callback::operator is a good cut-off point).