The native parser's ranges don't include any surrounding comments, so we
need to do a little more work to pick them out of the surrounding token
sequences.
This just takes care of _lead_ comments, which are those that appear as
whole line comments above the item in question. Line comments, which
appear after the item on the same line, will follow in a later commit.
This is not yet complete, since it fails to capture the newline, line
comments, and variable references in expressions. However, it does
capture the broad structure of an attribute, along with gathering up
all of its _interior_ tokens.
We'll use TokenSeq for everything, including sequences we expect to
contain only one token, mainly for consistency but also so that our local
"parser" doesn't need to care about whether the main parser is returning
one token or many.
This now allows for round-tripping some input from bytes to tokens and
then back to bytes again. With no other changes, we expect this to produce
an identical result.
The "writer parser" is a parser that produces a writer AST rather than
a zclsyntax AST. This can be used to produce a writer AST from existing
source in order to modify it before writing it out again.
It's implemented with the somewhat-unintuitive approach of running the
main zclsyntax parser and then mapping the source ranges it finds back
onto the token sequence to pull out the raw tokens for each object.
This allows us to avoid maintaining two parsers but also keeps all of
this raw-token-wrangling complexity out of the main parser.
This uses a separate, lower-level AST than the parser so that it can
retain the raw tokens and make surgical changes. As a consequence it
has much less semantic detail than the parser AST, and in particular
doesn't represent the structure of expressions except to retain variable
references to enable global rename operations.