Commit Graph

12 Commits

Author SHA1 Message Date
Martin Atkins
cb768a591a zclwrite: parsing of blocks
There's something a little off here, as illustrated by the failing
round trip test. Will figure that out later.
2017-06-10 17:16:19 -07:00
Martin Atkins
948b2e0b7b zclwrite: populate EOLTokens when parsing attributes
2017-06-10 16:06:09 -07:00
Martin Atkins
1de72e146e zclwrite: consume trailing line comments and newlines in body items
We consume both of these things primarily so that when we manipulate
the AST they will all stay together. For example, removing an attribute
from a body should take its comments and newline too.
2017-06-09 08:00:38 -07:00
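As a rough illustration of why 1de72e146e attaches the trailing line comment and newline to each body item, here is a minimal Go sketch. The `token` type, the `kind` constants, and the `extendEnd` and `removeRange` helpers are all hypothetical names invented for this example; they are not taken from zclwrite itself.

```go
package main

import "fmt"

// kind and token are hypothetical, simplified stand-ins for writer tokens.
type kind int

const (
	other kind = iota
	comment
	newline
)

type token struct {
	Kind kind
	Text string
}

// extendEnd takes the index just past an item's interior tokens and advances
// it over an optional same-line comment and then an optional newline, so the
// item owns everything up to the end of its line.
func extendEnd(toks []token, end int) int {
	if end < len(toks) && toks[end].Kind == comment {
		end++
	}
	if end < len(toks) && toks[end].Kind == newline {
		end++
	}
	return end
}

// removeRange returns a copy of toks without the tokens in [start, end).
func removeRange(toks []token, start, end int) []token {
	out := make([]token, 0, len(toks)-(end-start))
	out = append(out, toks[:start]...)
	return append(out, toks[end:]...)
}

func main() {
	toks := []token{
		{other, "a"}, {other, "="}, {other, "1"}, {comment, "# note"}, {newline, "\n"},
		{other, "b"}, {other, "="}, {other, "2"}, {newline, "\n"},
	}
	// The interior of the first attribute is toks[0:3].
	end := extendEnd(toks, 3)
	for _, t := range removeRange(toks, 0, end) {
		fmt.Printf("%q ", t.Text)
	}
	fmt.Println()
}
```

Because the comment and newline belong to the item, removing the item leaves no orphaned comment or blank line behind.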
Martin Atkins
c88641b147 zclwrite: absorb lead comments into attributes
The native parser's ranges don't include any surrounding comments, so we
need to do a little more work to pick them out of the surrounding token
sequences.

This just takes care of _lead_ comments, which are those that appear as
whole line comments above the item in question. Line comments, which
appear after the item on the same line, will follow in a later commit.
2017-06-08 09:04:27 -07:00
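The lead-comment handling described in c88641b147 could be sketched roughly as a backward scan from the item's first token. This is only an illustration under simplifying assumptions: the `token` type and `leadStart` function are hypothetical, and the sketch assumes a comment token is always followed by a separate newline token, which may not match how the real scanner emits comments.

```go
package main

import "fmt"

// kind and token are hypothetical, simplified stand-ins for writer tokens.
type kind int

const (
	other kind = iota
	comment
	newline
)

type token struct {
	Kind kind
	Text string
}

// leadStart walks backward from itemStart and returns the index of the first
// token of any whole-line comments directly above the item. A comment counts
// as "whole-line" only if it is followed by a newline and preceded by either
// a newline or the start of the token sequence.
func leadStart(toks []token, itemStart int) int {
	i := itemStart
	for i >= 2 &&
		toks[i-1].Kind == newline &&
		toks[i-2].Kind == comment &&
		(i-2 == 0 || toks[i-3].Kind == newline) {
		i -= 2
	}
	return i
}

func main() {
	toks := []token{
		{other, "x"}, {newline, "\n"},
		{comment, "# one"}, {newline, "\n"},
		{comment, "# two"}, {newline, "\n"},
		{other, "a"}, {other, "="}, {other, "2"}, {newline, "\n"},
	}
	start := leadStart(toks, 6)
	fmt.Println("lead comments begin at token", start)
}
```

A comment that trails the previous item on the same line fails the "preceded by a newline" check, so it is left alone rather than absorbed as a lead comment.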
Martin Atkins
13c93e974f zclwrite: initial attribute and basic expression parsing
This is not yet complete, since it fails to capture the newline, line
comments, and variable references in expressions. However, it does
capture the broad structure of an attribute, along with gathering up
all of its _interior_ tokens.
2017-06-07 08:24:33 -07:00
Martin Atkins
69a87c73b4 zclwrite: start to partition body items
So far they are still just all unstructured, but each item is a separate
node.
2017-06-07 07:38:44 -07:00
Martin Atkins
363d08ed0d zclwrite: File-level AllTokens
This captures any leftover tokens that aren't considered part of the
file's main body, such as the trailing EOF token.
2017-06-07 07:37:56 -07:00
Martin Atkins
fa8a707c7f zclwrite: begin to flesh out public interface 2017-06-07 07:24:10 -07:00
Martin Atkins
c233270a9b zclwrite: use a single, flat writer token buffer
Previously we were allocating a separate heap object for each token, which
creates a lot of small objects for the GC to manage. Since we know that
we're always converting from a flat array of native tokens, we can produce
a flat array of writer tokens first and _then_ take pointers into that
array to achieve our goal of making a slice of pointers.

For the use-case of formatting a sequence of tokens by tweaking the
"SpacesBefore" value, this means we can get all of our memory allocation
done in a single chunk and then just tweak the allocated, contiguous
tokens in-place, which should reduce memory pressure for a task which
will likely be done frequently by a text editor integration doing "format
on save".
2017-06-07 06:38:41 -07:00
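The allocation strategy described in c233270a9b can be sketched as follows. The `Token` type and `buildTokens` function are hypothetical simplifications written for this example, not the actual zclwrite definitions; only the `SpacesBefore` field name comes from the commit message.

```go
package main

import "fmt"

// Token is a hypothetical, simplified writer token.
type Token struct {
	Bytes        []byte
	SpacesBefore int
}

// buildTokens makes one contiguous allocation for all tokens and a second
// for the slice of pointers into it, instead of one heap object per token.
func buildTokens(raw [][]byte) []*Token {
	flat := make([]Token, len(raw))  // all tokens in one backing array
	ptrs := make([]*Token, len(raw)) // pointers into that array
	for i, b := range raw {
		flat[i] = Token{Bytes: b}
		ptrs[i] = &flat[i]
	}
	return ptrs
}

func main() {
	toks := buildTokens([][]byte{[]byte("foo"), []byte("="), []byte("1")})
	// A formatter can adjust spacing in place without further allocation.
	toks[1].SpacesBefore = 1
	toks[2].SpacesBefore = 1
	for _, t := range toks {
		fmt.Printf("%d:%s ", t.SpacesBefore, t.Bytes)
	}
	fmt.Println()
}
```

The caller still receives a slice of pointers, so individual tokens can be shared and mutated, but the garbage collector only has two objects to track rather than one per token.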
Martin Atkins
3c0dde2ae5 zclwrite: foundations of the writer parser
The "writer parser" is a parser that produces a writer AST rather than
a zclsyntax AST. This can be used to produce a writer AST from existing
source in order to modify it before writing it out again.

It's implemented with the somewhat-unintuitive approach of running the
main zclsyntax parser and then mapping the source ranges it finds back
onto the token sequence to pull out the raw tokens for each object.
This allows us to avoid maintaining two parsers but also keeps all of
this raw-token-wrangling complexity out of the main parser.
2017-06-06 08:53:13 -07:00
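The range-mapping idea in 3c0dde2ae5 might look something like the sketch below: given the flat token sequence and the source range the main parser reported for a node, split the tokens into those before, within, and after that range. The `token` and `srcRange` types and the `partition` function are hypothetical names chosen for this example, and byte offsets are used in place of whatever position type the real parser reports.

```go
package main

import "fmt"

// token is a hypothetical token carrying its byte offsets in the source.
type token struct {
	Text       string
	Start, End int // End is exclusive
}

// srcRange is a hypothetical stand-in for a source range from the parser.
type srcRange struct {
	Start, End int // End is exclusive
}

// partition splits toks (assumed to be in source order) into the tokens
// that fall before rng, those entirely within it, and those after it.
func partition(toks []token, rng srcRange) (before, within, after []token) {
	start := 0
	for start < len(toks) && toks[start].Start < rng.Start {
		start++
	}
	end := start
	for end < len(toks) && toks[end].End <= rng.End {
		end++
	}
	return toks[:start], toks[start:end], toks[end:]
}

func main() {
	// Tokens for the source "# c\na = 1\n".
	toks := []token{
		{"# c", 0, 3}, {"\n", 3, 4},
		{"a", 4, 5}, {"=", 6, 7}, {"1", 8, 9}, {"\n", 9, 10},
	}
	// Suppose the parser reported the attribute "a = 1" at bytes 4..9.
	before, within, after := partition(toks, srcRange{4, 9})
	fmt.Println(len(before), len(within), len(after))
}
```

The three results are sub-slices of the original sequence, so no tokens are copied and the main parser needs no knowledge of raw tokens at all.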
Martin Atkins
e100bf4723 zclsyntax: generate lexer diagnostics
There are certain tokens that are _never_ valid, so we might as well
catch them early in the Lex... functions rather than having to handle
them in many different contexts within the parser.

Unfortunately, when such errors occur they currently tend to be echoed by
more confusing errors coming from the parser, but we'll accept that for
now.
2017-06-04 07:34:26 -07:00
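A minimal sketch of the approach in e100bf4723, catching never-valid tokens once after lexing instead of in every parser context, is shown below. The token types, the `diagnostic` struct, and `checkTokens` are all hypothetical; the actual zclsyntax token types and diagnostic messages are not shown in this log.

```go
package main

import "fmt"

// tokenType values here are hypothetical examples of token kinds.
type tokenType int

const (
	tokenIdent tokenType = iota
	tokenInvalid // a character the scanner could not classify
	tokenBadUTF8 // a byte sequence that is not valid UTF-8
)

type token struct {
	Type   tokenType
	Text   string
	Offset int
}

// diagnostic is a hypothetical, simplified error report.
type diagnostic struct {
	Summary string
	Offset  int
}

// checkTokens reports every token that can never be valid in any context,
// so the parser does not need to handle these cases itself.
func checkTokens(toks []token) []diagnostic {
	var diags []diagnostic
	for _, t := range toks {
		switch t.Type {
		case tokenInvalid:
			diags = append(diags, diagnostic{"Invalid character", t.Offset})
		case tokenBadUTF8:
			diags = append(diags, diagnostic{"Invalid character encoding", t.Offset})
		}
	}
	return diags
}

func main() {
	toks := []token{
		{tokenIdent, "a", 0},
		{tokenInvalid, "`", 2},
		{tokenIdent, "b", 4},
	}
	for _, d := range checkTokens(toks) {
		fmt.Printf("offset %d: %s\n", d.Offset, d.Summary)
	}
}
```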
Martin Atkins
476e2c127e zclwrite: convert zclsyntax tokens into zclwrite tokens
In zclwrite we throw away the absolute source position information and
instead just retain the number of spaces before each token. This different
model allows us to rewrite parts of the token sequence without needing
to re-adjust all of the positions, and it also allows us to do simple
indentation and spacing adjustments just by walking through the token
list and adjusting these numbers.
2017-05-29 16:59:20 -07:00
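The token model described in 476e2c127e can be illustrated with a small sketch. The `nativeToken` and `writerToken` types and both functions are hypothetical simplifications; in particular, this sketch assumes the gap between adjacent tokens contains only spaces, and it tracks byte offsets rather than full line and column positions.

```go
package main

import (
	"fmt"
	"strings"
)

// nativeToken is a hypothetical stand-in for a parser token that carries
// absolute byte offsets into the source.
type nativeToken struct {
	Bytes      []byte
	Start, End int // End is exclusive
}

// writerToken is a hypothetical stand-in for a writer token: no absolute
// position, only the number of spaces that precede it.
type writerToken struct {
	Bytes        []byte
	SpacesBefore int
}

// toWriterTokens discards absolute positions, keeping only the gap between
// each token and the one before it.
func toWriterTokens(nts []nativeToken) []writerToken {
	out := make([]writerToken, len(nts))
	prevEnd := 0
	for i, nt := range nts {
		out[i] = writerToken{Bytes: nt.Bytes, SpacesBefore: nt.Start - prevEnd}
		prevEnd = nt.End
	}
	return out
}

// render writes the tokens back out, re-creating the spacing.
func render(toks []writerToken) string {
	var b strings.Builder
	for _, t := range toks {
		b.WriteString(strings.Repeat(" ", t.SpacesBefore))
		b.Write(t.Bytes)
	}
	return b.String()
}

func main() {
	// Tokens for the source "foo  = 1" (two spaces before "=").
	toks := toWriterTokens([]nativeToken{
		{[]byte("foo"), 0, 3},
		{[]byte("="), 5, 6},
		{[]byte("1"), 7, 8},
	})
	toks[1].SpacesBefore = 1 // normalize spacing; no other token changes
	fmt.Println(render(toks))
}
```

Changing one token's `SpacesBefore` leaves every other token untouched, whereas with absolute positions every later token on the line would need its offsets shifted.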