Commit Graph

8 Commits

Author SHA1 Message Date
Martin Atkins
598740b638 zclwrite: method for writing tokens to a writer
This now allows for round-tripping some input from bytes to tokens and
then back to bytes again. With no other changes, we expect this to produce
an identical result.
2017-06-07 07:06:23 -07:00
Martin Atkins
efbcfd19b2 zclwrite: TokenGen interface has EachToken, not AppendToTokens
Although building a flat array of tokens is _one_ use-case for TokenGen,
this new approach means we can also use the interface to write to an
io.Writer without needing to produce an intermediate buffer.
2017-06-07 06:45:01 -07:00
Martin Atkins
c233270a9b zclwrite: use a single, flat writer token buffer
Previously we were allocating a separate heap object for each token, which
creates a lot of small objects for the GC to manage. Since we know that
we're always converting from a flat array of native tokens, we can produce
a flat array of writer tokens first and _then_ take pointers into that
array to achieve our goal of making a slice of pointers.

For the use-case of formatting a sequence of tokens by tweaking the
"SpacesBefore" value, this means we can get all of our memory allocation
done in a single chunk and then just tweak the allocated, contiguous
tokens in-place, which should reduce memory pressure for a task which
will likely be done frequently by a text editor integration doing "format
on save".
2017-06-07 06:38:41 -07:00
Martin Atkins
3c0dde2ae5 zclwrite: foundations of the writer parser
The "writer parser" is a parser that produces a writer AST rather than
a zclsyntax AST. This can be used to produce a writer AST from existing
source in order to modify it before writing it out again.

It's implemented with the somewhat-unintuitive approach of running the
main zclsyntax parser and then mapping the source ranges it finds back
onto the token sequence to pull out the raw tokens for each object.
This allows us to avoid maintaining two parsers but also keeps all of
this raw-token-wrangling complexity out of the main parser.
2017-06-06 08:53:13 -07:00
Martin Atkins
e100bf4723 zclsyntax: generate lexer diagnostics
There are certain tokens that are _never_ valid, so we might as well
catch them early in the Lex... functions rather than having to handle
them in many different contexts within the parser.

Unfortunately for now when such errors occur they tend to be echoed by
more confusing errors coming from the parser, but we'll accept that for
now.
2017-06-04 07:34:26 -07:00
Martin Atkins
f94c89a6a5 zclsyntax: differentiate quoted and unquoted string literals
The context where a string literal was found affects what sort of escaping
it can have, so we need to distinguish these cases so that we will only
look for and handle backslash escapes in quoted strings.
2017-05-30 19:03:25 -07:00
Martin Atkins
476e2c127e zclwrite: convert zclsyntax tokens into zclwrite tokens
In zclwrite we throw away the absolute source position information and
instead just retain the number of spaces before each token. This different
model allows us to rewrite parts of the token sequence without needing
to re-adjust all of the positions, and it also allows us to do simple
indentation and spacing adjustments just by walking through the token
list and adjusting these numbers.
2017-05-29 16:59:20 -07:00
Martin Atkins
3a567abb51 zclwrite: stub of zcl code generation package
This uses a separate, lower-level AST than the parser so that it can
retain the raw tokens and make surgical changes. As a consequence it
has much less semantic detail than the parser AST, and in particular
doesn't represent the structure of expressions except to retain variable
references to enable global rename operations.
2017-05-29 16:05:34 -07:00