It seems to be somewhat common for someone to share HCL code via a forum
or a document and have the well-meaning word processor or CMS replace the
straight quotes with curly quotes, which then leads to confusing errors
when someone copies the result and tries to use it as valid HCL
configuration.
Here we add a special hint for that, giving a tailored error message
instead of the generic "This character is not used within the language"
error message.
HCL has always had some of these special hints implemented here, and they
were originally implemented with special token types to allow the parser
to handle them. However, we later refactored to do the check all at once
inside the Lex* family of functions, prior to parsing, so it's now
relatively straightforward to handle it as a special case of TokenInvalid
rather than an entirely new token type. Perhaps later we'll rework the
existing ones to also just use TokenInvalid, but that's a decision for
another day.
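As a rough illustration of the idea (not the actual scanner code), the hint amounts to choosing a more specific message when the invalid character is a curly quote. The function name and the tailored message wording below are hypothetical; only the generic message is quoted from above.

```go
package main

import "fmt"

// invalidCharMessage picks a diagnostic message for a character that
// cannot start any token. The name and the tailored wording are
// hypothetical; only the generic fallback text comes from the
// commit message.
func invalidCharMessage(r rune) string {
	switch r {
	case '“', '”':
		// Curly double quotes, typically introduced by a word processor.
		return "curly double quotes are not valid here; use straight double quotes (\") instead"
	case '‘', '’':
		// Curly single quotes get a similar hint.
		return "curly single quotes are not valid here; use straight double quotes (\") for strings"
	default:
		return "This character is not used within the language."
	}
}

func main() {
	for _, r := range []rune{'“', '’', '$'} {
		fmt.Printf("%q: %s\n", r, invalidCharMessage(r))
	}
}
```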
All of the other subdivisions of a block were already nodes, but we'd
represented the labels as an undifferentiated set of nodes belonging
directly to the block's child node list.
Now that we support replacing the labels in the public API, that's a good
excuse to refactor this slightly to make the labels their own node. As
well as being consistent with everything else in Block, this also makes
it easier to implement the Block.SetLabels operation because we can
just change the children of the labels node, rather than having to
carefully identify and extract the individual child nodes of the block
that happen to represent labels.
Internally this models the labels in a similar sort of way as the content
of a body, although we've kept the public API directly on the Block type
here because that's a more straightforward model for the use-cases we
currently know and matches better with the API of hcl.Block. This is just
an internal change for consistency.
I also added a few tests for having comments interspersed with labels
while I was here, because that helped to better exercise the new
parseBlockLabels function.
While implementing Block.SetLabels(), I found a new hclwrite parser bug.
NewBlock() records the positions of TokenOBrace / TokenCBrace.
However, blocks generated via hclwrite.ParseConfig() did not
record them.
The position of TokenOBrace is needed for Block.SetLabels(),
so I also fixed this existing bug.
Fixes #338
Add methods to update block type and labels to enable us to refactor HCL
configurations such as renaming Terraform resources.
- `*Block.SetType(typeName string)`
- `*Block.SetLabels(labels []string)`
Some additional notes about SetLabels:
Since we cannot assume that old and new labels are equal in length,
remove old labels and insert new ones before TokenOBrace.
To implement this, I also added the following methods.
- `*nodes.Insert(pos *node, c nodeContent) *node`
- `*nodes.InsertNode(pos *node, n *node) *node`
They are similar to the existing Append / AppendNode,
but insert a node before a given position.
This adds ValidateSpec, a new decoder Spec that allows one to add
custom validations to work with values at decode-time.
The validation is run on the value after the wrapped spec is applied to
the expression in question. Diagnostics are expected to be returned,
with the author having flexibility over whether or not they want to
specify a range; if one is not supplied, the range of the wrapped
expression is used.
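The range fallback described above can be sketched as follows. The `Range` and `Diagnostic` types and the `validate` function are simplified stand-ins defined here for illustration, not the hcl or hcldec types.

```go
package main

import "fmt"

// Range and Diagnostic are simplified, hypothetical stand-ins for the
// source range and diagnostic types.
type Range struct{ Start, End int }

type Diagnostic struct {
	Summary string
	Subject *Range // nil when the validation author gave no range
}

// validate runs a custom check on an already-decoded value and fills in
// the wrapped expression's range on any diagnostic that lacks one.
func validate(value int, exprRange Range, check func(int) []Diagnostic) []Diagnostic {
	diags := check(value)
	for i := range diags {
		if diags[i].Subject == nil {
			r := exprRange // copy so each diagnostic owns its range
			diags[i].Subject = &r
		}
	}
	return diags
}

func main() {
	mustBePositive := func(v int) []Diagnostic {
		if v <= 0 {
			return []Diagnostic{{Summary: "value must be positive"}}
		}
		return nil
	}
	for _, d := range validate(-1, Range{Start: 4, End: 6}, mustBePositive) {
		fmt.Printf("%s at %d-%d\n", d.Summary, d.Subject.Start, d.Subject.End)
	}
}
```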
Previously functions such as concat() would result in a panic if there
was a null element and a sequence, as in the included test. This PR adds
a check for whether the error index is outside the range of arguments
and, if so, crafts an error that references the entire function instead
of the null argument.
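A minimal sketch of that bounds check, assuming the function error carries an argument index. `Range` and `errorSubject` are hypothetical names used only for this illustration.

```go
package main

import "fmt"

// Range is a hypothetical stand-in for a source range.
type Range struct{ Start, End int }

// errorSubject chooses which source range an argument error should
// point at. If the reported index does not correspond to one of the
// call's arguments, it falls back to the range of the whole call
// rather than indexing out of bounds.
func errorSubject(argRanges []Range, callRange Range, errIndex int) Range {
	if errIndex < 0 || errIndex >= len(argRanges) {
		return callRange
	}
	return argRanges[errIndex]
}

func main() {
	args := []Range{{Start: 7, End: 11}, {Start: 13, End: 16}}
	call := Range{Start: 0, End: 17}
	fmt.Println(errorSubject(args, call, 1)) // in range: the second argument
	fmt.Println(errorSubject(args, call, 5)) // out of range: the whole call
}
```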
The following expression caused a panic in hclwrite:
    a = foo.*
This was due to the unusual dotted form of a full splat (where the splat
operator is at the end of the expression) being generated with an
invalid source range. In the full splat case, the end of the range was
uninitialized, which caused the token slice to be empty, and thus the
panic.
This commit fixes the bug, adds test coverage, and includes some bonus
tests for other splat expression cases.
The previous syntax for object and map values was a single line of
key-value pairs. For example:
    object = { bar = 5, baz = true, foo = "foo" }
This is very compact, but in practice, for many HCL values, it is less
readable and less common than a multi-line object format. This commit
changes the generated output from hclwrite to one line per attribute.
Examples of the new format:
    // Empty object/map is a single line
    a = {}

    // Single-value object/map has the attribute on a separate line
    b = {
      bar = 5
    }

    // Multi-value object/map has one line per attribute
    c = {
      bar = 5
      baz = true
    }
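The formatting rule can be sketched with a small helper. `formatObject` is a hypothetical function operating on pre-rendered value strings, not the hclwrite token generator, and it does not attempt to align the equals signs.

```go
package main

import (
	"fmt"
	"strings"
)

// formatObject renders an object value with one attribute per line.
// keys fixes the attribute order; vals maps each key to its rendered
// value. An empty object stays on a single line.
func formatObject(keys []string, vals map[string]string) string {
	if len(keys) == 0 {
		return "{}"
	}
	var b strings.Builder
	b.WriteString("{\n")
	for _, k := range keys {
		fmt.Fprintf(&b, "  %s = %s\n", k, vals[k])
	}
	b.WriteString("}")
	return b.String()
}

func main() {
	fmt.Println("a = " + formatObject(nil, nil))
	fmt.Println("c = " + formatObject(
		[]string{"bar", "baz"},
		map[string]string{"bar": "5", "baz": "true"},
	))
}
```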
When scanning JSON, upon encountering an invalid token, we immediately
return. Previously this return happened without inserting an EOF token.
Since other functions assume that a token sequence always ends in EOF,
this could cause a panic.
This commit adds a synthetic EOF token after the invalid token before
returning. While this does not match the real end-of-file of the source
JSON, it is marking the end of the scanned bytes, so it seems reasonable.
Fixes #339
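A toy scanner showing the invariant, under the assumption that only a handful of punctuation bytes are valid. The `Token` type and `scan` function are hypothetical and far simpler than the real JSON scanner.

```go
package main

import (
	"fmt"
	"strings"
)

// Token is a hypothetical, simplified token type.
type Token struct {
	Kind string
	Text string
	Pos  int
}

// scan tokenizes a tiny subset of JSON punctuation. On the first
// invalid byte it stops early, but still appends a synthetic EOF token
// so that callers can rely on every token sequence ending in EOF.
func scan(src string) []Token {
	var toks []Token
	for i := 0; i < len(src); i++ {
		c := src[i]
		switch {
		case c == ' ' || c == '\n' || c == '\t':
			continue
		case strings.IndexByte("{}[]:,", c) >= 0:
			toks = append(toks, Token{Kind: "punct", Text: string(c), Pos: i})
		default:
			toks = append(toks, Token{Kind: "invalid", Text: string(c), Pos: i})
			// Marks the end of the scanned bytes, not the end of the file.
			toks = append(toks, Token{Kind: "eof", Pos: i + 1})
			return toks
		}
	}
	return append(toks, Token{Kind: "eof", Pos: len(src)})
}

func main() {
	for _, t := range scan("{ ? }") {
		fmt.Printf("%s %q at %d\n", t.Kind, t.Text, t.Pos)
	}
}
```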
HCL uses grapheme cluster segmentation to produce accurate "column"
indications in diagnostic messages and other human-oriented source
location information. Each new major version of Unicode introduces new
codepoints, some of which are defined to combine with other codepoints to
produce a single visible character (grapheme cluster).
We were previously using the rules from Unicode 9.0.0. This change
switches to using the segmentation rules from Unicode 12.0.0, which is
the latest version at the time of this commit and is also the version of
Unicode used for other purposes by the Go 1.14 runtime.
HCL does not use text segmentation results for any purpose that would
affect the meaning of decoded data extracted from HCL files, so this
change will only affect the human-oriented source positions generated for
files containing characters that were newly introduced in Unicode 10, 11,
or 12. (Machine-oriented uses of source location information are based on
byte offsets and not affected by text segmentation.)
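To illustrate why columns and byte offsets differ, here is a deliberately crude approximation that treats nonspacing combining marks as part of the preceding character. This is an assumption made for brevity: real grapheme cluster segmentation follows the full Unicode text segmentation rules and handles many more cases. The `column` helper is hypothetical.

```go
package main

import (
	"fmt"
	"unicode"
)

// column returns the 1-based human-oriented column of the character
// that starts at byteOffset in line. Nonspacing marks (category Mn) do
// not advance the column, which only approximates grapheme clusters.
func column(line string, byteOffset int) int {
	col := 1
	for i, r := range line {
		if i >= byteOffset {
			break
		}
		if !unicode.Is(unicode.Mn, r) {
			col++
		}
	}
	return col
}

func main() {
	// "e" followed by U+0301 (combining acute accent), then "x".
	line := "e\u0301x"
	fmt.Println("byte offset of x:", len(line)-1)
	fmt.Println("column of x:", column(line, len(line)-1))
}
```

Here the byte offset of `x` is unaffected by how the preceding characters are grouped, while its column depends on the segmentation rules in use.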