Previously we tolerated EOF as an alias for newline, but a file without
a newline at the end is a edge case primarily limited to contrived
examples in unit tests, and requiring it simplifies tasks such as code
generation in zclwrite, since we can always assume that every block item
comes with a built-in line terminator.
When we're skipping comments but retaining newlines, we need to do some
slight-of-hand because single-line comment tokens contain the newline
that terminates them (for simpler handling of lead doc comments) but our
parsing can be newline-sensitive.
To allow for this, as a special case we transform single-line comment
tokens into newlines when in this situation, thus allowing parser code
to just worry about the newlines and skip over the comments.
The native parser's ranges don't include any surrounding comments, so we
need to do a little more work to pick them out of the surrounding token
sequences.
This just takes care of _lead_ comments, which are those that appear as
whole line comments above the item in question. Line comments, which
appear after the item on the same line, will follow in a later commit.
This can either be a traversal or a first-class node depending on whether
the given expression is a literal. This exception is made to allow
applications to conditionally populate only part of a potentially-large
collection if the config is only requesting one or two distinct indices.
In particular, it allows the following to be considered a single traversal
from the scope:
foo.bar[0].baz
stringer is more sensitive to certain errors than other generators, so
by running it last we give the other generators a chance to get things
straight before we ask stringer to run.
Traversal operators are the operators that can appear after a value
to traverse into the data structure that value represents. So far only
the attribute access operator is implemented.
This allows tooling to get a string describing the context of a particular
offset into the file. This is used, for example, to provide context
above the source code snippets in console-printed diagnostic messages.
There are certain tokens that are _never_ valid, so we might as well
catch them early in the Lex... functions rather than having to handle
them in many different contexts within the parser.
Unfortunately for now when such errors occur they tend to be echoed by
more confusing errors coming from the parser, but we'll accept that for
now.
This allows code that only deals with BodyContent (post-decoding) to
still be able to report on missing items within the associated body while
providing a suitable source location.
This applies both two the whole of bare expressions and to any nested
expressions within parentheses. The latter means that an attribute value
can span over multiple lines if it's wrapped in parens:
foo = (
1 + 2 + 3 + 4 + 5
)
The ~ character can be used at the start and end of interpolation
sequences to trim off whitespace in neighboring literals, with the goal
of allowing extra whitespace to be included for readability without
including it in the resulting string.
Previously we were failing to return back to template-scanning mode due
to decrementing "braces" too early, causing the remainder of the template
to be scanned as if it were an expression.
Conditional expression parsing is ostensibly implemented, but since it
depends on the rest of the expression parsers -- not yet implemented --
it cannot be tested in isolation.
Also includes an initial implementation of the conditional expression
node, but this is also not yet tested and so may need further revisions
once we're in a better position to test it.
This rewrite of decodeQuotedLit, now called decodeStringLit, is able to
handle both cases with a single function, and also now correctly handles
situations where double-$ and double-! are not followed immediately by
a { symbol, and must thus be treated literally.
The context where a string literal was found affects what sort of escaping
it can have, so we need to distinguish these cases so that we will only
look for and handle backslash escapes in quoted strings.
This recovery method attempts to place the peeker directly after the
newline indicating the end of the current body item. It does this by
counting open and close bracketing constructs and then returning when
a newline is encountered with no bracketing constructs open.
It's designed for use in the "header" part of a body item, with no
bracketing constructs open yet. It _might_ work in other situations, but
is likely to end up choosing the wrong end point if used in the middle
of a bracketed expression that itself contains newlines.
These are the top-level interface to parsing configuration files in the
native zcl syntax, similarly to the existing methods for parsing other
syntaxes.
The parser has some recovery heuristics but they will not always achieve
the best result. To prevent unsuccessful recovery from causing a cascade
of confusing follow-on errors, we'll track in the parser whether recovery
has been attempted and then the specific sub-parsers can use this to
skip certain "unexpected token type" errors in recovery situations,
assuming instead that an earlier error will cover the problem and that
we just want to bail out quickly.
The scanner produces complicated sequences for quoted strings due to the
template language, but sometimes we just want a simple string with no
interpolations.