This LexConfig, LexExpression and LexTemplate set of functions allow
outside callers to use the scanner in isolation, skipping the parser.
This may be useful for use-cases such as syntax highlighting, separate
parsers (such as the one in zclwrite), and so forth. Most callers should
use the parser (once implemented) though, to get a semantic AST.
This uses a separate, lower-level AST than the parser so that it can
retain the raw tokens and make surgical changes. As a consequence it
has much less semantic detail than the parser AST, and in particular
doesn't represent the structure of expressions except to retain variable
references to enable global rename operations.
This alternative scanning mode makes the scanner start in template
context rather than normal context. This will be later used by the parser
to allow parsing of standalone templates that aren't embedded inside a
zcl configuration file.
This is important because our syntax for objects uses newlines as the
separator between items, so this is the only signal we'll get that a
given item has ended and another is beginning.
A scanner "mode" decides which state it starts in, allowing us to start
in template mode for parsing top-level templates. However, currently the
only mode implemented is "normal" mode, which is the behavior we had
before.
This requires some extra state-keeping because we allow templates to be
nested inside templates. This takes us outside of the world of regular
languages, but we accept that here because it makes things easier to
deal with down the line in the parser.
The methodology is to keep track of how many braces are open at a given
time and then, when a nested template interpolation begins, record the
current brace level. Then, when a closing brace is encountered, if its
nesting level is at the top of the stack then we pop off the stack and
return to "main" parsing mode.
Ragel's existing idea of calling and returning from machines is important
here too. As this currently stands this is not actually needed, but once
heredocs are in play we will have two possible places to return to at
the end of an interpolation sequence, so the state return stack maintained
by Ragel will determine whether to return to string mode or heredoc mode.
Although this end symbol appears as just a close-brace in source, it's
worth differentiating it because the scanner must differentiate it anyway
(to recognize moving back into template-scanning mode) and it avoids the
parser from having to similarly re-recognize the difference.
On reflection, it seems easier to maintain the necessary state we need
by doing all of the scanning in a single pass, since we can then just
use local variables within the scanner function.
Using Ragel here because the scanner is going to be somewhat complex due
to the need to switch back and forth between normal and template states,
etc. This should be easier to maintain than a hand-written scanner, while
ragel gives us the extra features we need to implement things that would
normally be too complex for a "regular" scanner generator.
This means we can actually point at a column in the console without it
getting misaligned by multi-byte UTF-8 sequences and Unicode combining
characters.
This is the first non-trivial expression Value implementation. Lots of
code here, so hopefully while implementing other expressions some
opportunities emerge to factor out some of these details.
The implementation of Variables will be identical for every Expression
implementation since we just wrap our AST-walk-based "Variables" function
to do the work.
Rather than manually copy-pasting the declaration for each expression
type, instead we'll generate this programmatically using "go generate".
This will need to be re-run each time a new expression node type is
added, in order to make it actually implement the Expression interface.
This function is effectively the implementation of Variables for all
expressions, but unfortunately we still need to declare a wrapper around
it as a method on every single expression type.
This package will grow to contain all of the gory details of the native
zcl syntax, including it AST, parser, etc. Most callers should access
this via the simpler API in the top-level package, which then gives
automatic support for other syntaxes too.
Traversals are a generalized way to talk about paths taken from the scope
and from arbitrary values. These will be used for various analysis tasks,
such as determining what needs to be placed into a scope.
Eventually zcl will have its own native template format that we'll use
by default, but for applications migrating from HCL/HIL we can instead
parse strings as HIL templates, for compatibility with how JSON configs
would've been processed in a HCL/HIL app.
When this mode is not enabled, we still just treat strings as literals,
pending the implementation of the zcl template parser.
Currently only deals with the literal HCL structure. Later we will also
support HIL parsing and evaluation within strings, to achieve parity with
existing uses of HCL/HIL together. For now, this has parity with uses of
HCL alone, with the exception that float and int values are not
distinguished, because cty does not make this distinction.
This is intended as a backward-compatibility interface, allowing
applications that previously used HCL/HIL to adopt zcl while still being
able to parse their old HCL/HIL-based configuration file formats.
When producing diagnostics about missing attributes or bodies it's
necessary to have a range representing the place where the missing thing
might be inserted.
There's not always a single reasonable value for this, so some liberty
can be taken about what exactly is returned as long as it's somewhere
the user can relate back to the construct producing the error.