diff --git a/hcl/hclsyntax/spec.md b/hcl/hclsyntax/spec.md index 16d12c5..091c1c2 100644 --- a/hcl/hclsyntax/spec.md +++ b/hcl/hclsyntax/spec.md @@ -9,13 +9,13 @@ generation of configuration. The language consists of three integrated sub-languages: -* The _structural_ language defines the overall hierarchical configuration +- The _structural_ language defines the overall hierarchical configuration structure, and is a serialization of HCL bodies, blocks and attributes. -* The _expression_ language is used to express attribute values, either as +- The _expression_ language is used to express attribute values, either as literals or as derivations of other values. -* The _template_ language is used to compose values together into strings, +- The _template_ language is used to compose values together into strings, as one of several types of expression in the expression language. In normal use these three sub-languages are used together within configuration @@ -30,19 +30,19 @@ Within this specification a semi-formal notation is used to illustrate the details of syntax. This notation is intended for human consumption rather than machine consumption, with the following conventions: -* A naked name starting with an uppercase letter is a global production, +- A naked name starting with an uppercase letter is a global production, common to all of the syntax specifications in this document. -* A naked name starting with a lowercase letter is a local production, +- A naked name starting with a lowercase letter is a local production, meaningful only within the specification where it is defined. -* Double and single quotes (`"` and `'`) are used to mark literal character +- Double and single quotes (`"` and `'`) are used to mark literal character sequences, which may be either punctuation markers or keywords. -* The default operator for combining items, which has no punctuation, +- The default operator for combining items, which has no punctuation, is concatenation. -* The symbol `|` indicates that any one of its left and right operands may +- The symbol `|` indicates that any one of its left and right operands may be present. -* The `*` symbol indicates zero or more repetitions of the item to its left. -* The `?` symbol indicates zero or one of the item to its left. -* Parentheses (`(` and `)`) are used to group items together to apply +- The `*` symbol indicates zero or more repetitions of the item to its left. +- The `?` symbol indicates zero or one of the item to its left. +- Parentheses (`(` and `)`) are used to group items together to apply the `|`, `*` and `?` operators to them collectively. The grammar notation does not fully describe the language. The prose may @@ -77,11 +77,11 @@ are not valid within HCL native syntax. Comments serve as program documentation and come in two forms: -* _Line comments_ start with either the `//` or `#` sequences and end with +- _Line comments_ start with either the `//` or `#` sequences and end with the next newline sequence. A line comments is considered equivalent to a newline sequence. -* _Inline comments_ start with the `/*` sequence and end with the `*/` +- _Inline comments_ start with the `/*` sequence and end with the `*/` sequence, and may have any characters within except the ending sequence. An inline comments is considered equivalent to a whitespace sequence. @@ -91,7 +91,7 @@ template literals except inside an interpolation sequence or template directive. ### Identifiers Identifiers name entities such as blocks, attributes and expression variables. -Identifiers are interpreted as per [UAX #31][UAX31] Section 2. Specifically, +Identifiers are interpreted as per [UAX #31][uax31] Section 2. Specifically, their syntax is defined in terms of the `ID_Start` and `ID_Continue` character properties as follows: @@ -109,7 +109,7 @@ that is not part of the unicode `ID_Continue` definition. This is to allow attribute names and block type names to contain dashes, although underscores as word separators are considered the idiomatic usage. -[UAX31]: http://unicode.org/reports/tr31/ "Unicode Identifier and Pattern Syntax" +[uax31]: http://unicode.org/reports/tr31/ "Unicode Identifier and Pattern Syntax" ### Keywords @@ -150,9 +150,9 @@ expmark = ('e' | 'E') ("+" | "-")?; The structural language consists of syntax representing the following constructs: -* _Attributes_, which assign a value to a specified name. -* _Blocks_, which create a child body annotated by a type and optional labels. -* _Body Content_, which consists of a collection of attributes and blocks. +- _Attributes_, which assign a value to a specified name. +- _Blocks_, which create a child body annotated by a type and optional labels. +- _Body Content_, which consists of a collection of attributes and blocks. These constructs correspond to the similarly-named concepts in the language-agnostic HCL information model. @@ -253,9 +253,9 @@ LiteralValue = ( ); ``` -* Numeric literals represent values of type _number_. -* The `true` and `false` keywords represent values of type _bool_. -* The `null` keyword represents a null value of the dynamic pseudo-type. +- Numeric literals represent values of type _number_. +- The `true` and `false` keywords represent values of type _bool_. +- The `null` keyword represents a null value of the dynamic pseudo-type. String literals are not directly available in the expression sub-language, but are available via the template sub-language, which can in turn be incorporated @@ -286,8 +286,8 @@ When specifying an object element, an identifier is interpreted as a literal attribute name as opposed to a variable reference. To populate an item key from a variable, use parentheses to disambiguate: -* `{foo = "baz"}` is interpreted as an attribute literally named `foo`. -* `{(foo) = "baz"}` is interpreted as an attribute whose name is taken +- `{foo = "baz"}` is interpreted as an attribute literally named `foo`. +- `{(foo) = "baz"}` is interpreted as an attribute whose name is taken from the variable named `foo`. Between the open and closing delimiters of these sequences, newline sequences @@ -299,12 +299,12 @@ _for expression_ interpretation has priority, so to produce a tuple whose first element is the value of a variable named `for`, or an object with a key named `for`, use parentheses to disambiguate: -* `[for, foo, baz]` is a syntax error. -* `[(for), foo, baz]` is a tuple whose first element is the value of variable +- `[for, foo, baz]` is a syntax error. +- `[(for), foo, baz]` is a tuple whose first element is the value of variable `for`. -* `{for: 1, baz: 2}` is a syntax error. -* `{(for): 1, baz: 2}` is an object with an attribute literally named `for`. -* `{baz: 2, for: 1}` is equivalent to the previous example, and resolves the +- `{for: 1, baz: 2}` is a syntax error. +- `{(for): 1, baz: 2}` is an object with an attribute literally named `for`. +- `{baz: 2, for: 1}` is equivalent to the previous example, and resolves the ambiguity by reordering. ### Template Expressions @@ -312,9 +312,9 @@ key named `for`, use parentheses to disambiguate: A _template expression_ embeds a program written in the template sub-language as an expression. Template expressions come in two forms: -* A _quoted_ template expression is delimited by quote characters (`"`) and +- A _quoted_ template expression is delimited by quote characters (`"`) and defines a template as a single-line expression with escape characters. -* A _heredoc_ template expression is introduced by a `<<` sequence and +- A _heredoc_ template expression is introduced by a `<<` sequence and defines a template via a multi-line sequence terminated by a user-chosen delimiter. @@ -322,7 +322,7 @@ In both cases the template interpolation and directive syntax is available for use within the delimiters, and any text outside of these special sequences is interpreted as a literal string. -In _quoted_ template expressions any literal string sequences within the +In _quoted_ template expressions any literal string sequences within the template behave in a special way: literal newline sequences are not permitted and instead _escape sequences_ can be included, starting with the backslash `\`: @@ -458,14 +458,14 @@ are provided, the first is the key and the second is the value. Tuple, object, list, map, and set types are iterable. The type of collection used defines how the key and value variables are populated: -* For tuple and list types, the _key_ is the zero-based index into the +- For tuple and list types, the _key_ is the zero-based index into the sequence for each element, and the _value_ is the element value. The elements are visited in index order. -* For object and map types, the _key_ is the string attribute name or element +- For object and map types, the _key_ is the string attribute name or element key, and the _value_ is the attribute or element value. The elements are visited in the order defined by a lexicographic sort of the attribute names or keys. -* For set types, the _key_ and _value_ are both the element value. The elements +- For set types, the _key_ and _value_ are both the element value. The elements are visited in an undefined but consistent order. The expression after the colon and (in the case of object `for`) the expression @@ -487,12 +487,12 @@ immediately after the value expression, this activates the grouping mode in which each value in the resulting object is a _tuple_ of all of the values that were produced against each distinct key. -* `[for v in ["a", "b"]: v]` returns `["a", "b"]`. -* `[for i, v in ["a", "b"]: i]` returns `[0, 1]`. -* `{for i, v in ["a", "b"]: v => i}` returns `{a = 0, b = 1}`. -* `{for i, v in ["a", "a", "b"]: k => v}` produces an error, because attribute +- `[for v in ["a", "b"]: v]` returns `["a", "b"]`. +- `[for i, v in ["a", "b"]: i]` returns `[0, 1]`. +- `{for i, v in ["a", "b"]: v => i}` returns `{a = 0, b = 1}`. +- `{for i, v in ["a", "a", "b"]: k => v}` produces an error, because attribute `a` is defined twice. -* `{for i, v in ["a", "a", "b"]: v => i...}` returns `{a = [0, 1], b = [2]}`. +- `{for i, v in ["a", "a", "b"]: v => i...}` returns `{a = [0, 1], b = [2]}`. If the `if` keyword is used after the element expression(s), it applies an additional predicate that can be used to conditionally filter elements from @@ -502,7 +502,7 @@ element expression(s). It must evaluate to a boolean value; if `true`, the element will be evaluated as normal, while if `false` the element will be skipped. -* `[for i, v in ["a", "b", "c"]: v if i < 2]` returns `["a", "b"]`. +- `[for i, v in ["a", "b", "c"]: v if i < 2]` returns `["a", "b"]`. If the collection value, element expression(s) or condition expression return unknown values that are otherwise type-valid, the result is a value of the @@ -686,7 +686,7 @@ Arithmetic operations are considered to be performed in an arbitrary-precision number space. If either operand of an arithmetic operator is an unknown number or a value -of the dynamic pseudo-type, the result is an unknown number. +of the dynamic pseudo-type, the result is an unknown number. ### Logic Operators @@ -711,7 +711,7 @@ the outcome of a boolean expression. Conditional = Expression "?" Expression ":" Expression; ``` -The first expression is the _predicate_, which is evaluated and must produce +The first expression is the _predicate_, which is evaluated and must produce a boolean result. If the predicate value is `true`, the result of the second expression is the result of the conditional. If the predicate value is `false`, the result of the third expression is the result of the conditional. @@ -779,8 +779,8 @@ interpolations or directives that are adjacent to it. A strip marker is a tilde (`~`) placed immediately after the opening `{` or before the closing `}` of a template sequence: -* `hello ${~ "world" }` produces `"helloworld"`. -* `%{ if true ~} hello %{~ endif }` produces `"hello"`. +- `hello ${~ "world" }` produces `"helloworld"`. +- `%{ if true ~} hello %{~ endif }` produces `"hello"`. When a strip marker is present, any spaces adjacent to it in the corresponding string literal (if any) are removed before producing the final value. Space @@ -789,7 +789,7 @@ characters are interpreted as per Unicode's definition. Stripping is done at syntax level rather than value level. Values returned by interpolations or directives are not subject to stripping: -* `${"hello" ~}${" world"}` produces `"hello world"`, and not `"helloworld"`, +- `${"hello" ~}${" world"}` produces `"hello world"`, and not `"helloworld"`, because the space is not in a template literal directly adjacent to the strip marker. @@ -827,9 +827,9 @@ TemplateIf = ( The evaluation of the `if` directive is equivalent to the conditional expression, with the following exceptions: -* The two sub-templates always produce strings, and thus the result value is +- The two sub-templates always produce strings, and thus the result value is also always a string. -* The `else` clause may be omitted, in which case the conditional's third +- The `else` clause may be omitted, in which case the conditional's third expression result is implied to be the empty string. ### Template For Directive @@ -849,9 +849,9 @@ TemplateFor = ( The evaluation of the `for` directive is equivalent to the _for expression_ when producing a tuple, with the following exceptions: -* The sub-template always produces a string. -* There is no equivalent of the "if" clause on the for expression. -* The elements of the resulting tuple are all converted to strings and +- The sub-template always produces a string. +- There is no equivalent of the "if" clause on the for expression. +- The elements of the resulting tuple are all converted to strings and concatenated to produce a flat string result. ### Template Interpolation Unwrapping @@ -867,13 +867,13 @@ template or expression syntax. Unwrapping allows arbitrary expressions to be used to populate attributes when strings in such languages are interpreted as templates. -* `${true}` produces the boolean value `true` -* `${"${true}"}` produces the boolean value `true`, because both the inner +- `${true}` produces the boolean value `true` +- `${"${true}"}` produces the boolean value `true`, because both the inner and outer interpolations are subject to unwrapping. -* `hello ${true}` produces the string `"hello true"` -* `${""}${true}` produces the string `"true"` because there are two +- `hello ${true}` produces the string `"hello true"` +- `${""}${true}` produces the string `"true"` because there are two interpolation sequences, even though one produces an empty result. -* `%{ for v in [true] }${v}%{ endif }` produces the string `true` because +- `%{ for v in [true] }${v}%{ endif }` produces the string `true` because the presence of the `for` directive circumvents the unwrapping even though the final result is a single value. diff --git a/hcl/json/spec.md b/hcl/json/spec.md index da9ae53..dac5729 100644 --- a/hcl/json/spec.md +++ b/hcl/json/spec.md @@ -18,11 +18,11 @@ _Parsing_ such JSON has some additional constraints not beyond what is normally supported by JSON parsers, so a specialized parser may be required that is able to: -* Preserve the relative ordering of properties defined in an object. -* Preserve multiple definitions of the same property name. -* Preserve numeric values to the precision required by the number type +- Preserve the relative ordering of properties defined in an object. +- Preserve multiple definitions of the same property name. +- Preserve numeric values to the precision required by the number type in [the HCL syntax-agnostic information model](../spec.md). -* Retain source location information for parsed tokens/constructs in order +- Retain source location information for parsed tokens/constructs in order to produce good error messages. ## Structural Elements @@ -118,6 +118,7 @@ type: ] } ``` + ```json { "foo": [] @@ -147,7 +148,7 @@ the following examples: "boz": { "baz": { "child_attr": "baz" - }, + } } } } @@ -189,7 +190,7 @@ the following examples: "boz": { "child_attr": "baz" } - }, + } }, { "bar": { @@ -402,4 +403,3 @@ to that expression. If the original expression is not a string or its contents cannot be parsed as a native syntax expression then static call analysis is not supported. - diff --git a/hcl/spec.md b/hcl/spec.md index bab96c9..8bbaff8 100644 --- a/hcl/spec.md +++ b/hcl/spec.md @@ -57,10 +57,10 @@ access to the specific attributes and blocks requested. A _body schema_ consists of a list of _attribute schemata_ and _block header schemata_: -* An _attribute schema_ provides the name of an attribute and whether its +- An _attribute schema_ provides the name of an attribute and whether its presence is required. -* A _block header schema_ provides a block type name and the semantic names +- A _block header schema_ provides a block type name and the semantic names assigned to each of the labels of that block type, if any. Within a schema, it is an error to request the same attribute name twice or @@ -72,11 +72,11 @@ a block whose type name is identical to the attribute name. The result of applying a body schema to a body is _body content_, which consists of an _attribute map_ and a _block sequence_: -* The _attribute map_ is a map data structure whose keys are attribute names +- The _attribute map_ is a map data structure whose keys are attribute names and whose values are _expressions_ that represent the corresponding attribute values. -* The _block sequence_ is an ordered sequence of blocks, with each specifying +- The _block sequence_ is an ordered sequence of blocks, with each specifying a block _type name_, the sequence of _labels_ specified for the block, and the body object (not body _content_) representing the block's own body. @@ -132,13 +132,13 @@ the schema has been processed. Specifically: -* Any attribute whose name is specified in the schema is returned in body +- Any attribute whose name is specified in the schema is returned in body content and elided from the new body. -* Any block whose type is specified in the schema is returned in body content +- Any block whose type is specified in the schema is returned in body content and elided from the new body. -* Any attribute or block _not_ meeting the above conditions is placed into +- Any attribute or block _not_ meeting the above conditions is placed into the new body, unmodified. The new body can then be recursively processed using any of the body @@ -168,20 +168,20 @@ In order to obtain a concrete value, each expression must be _evaluated_. Evaluation is performed in terms of an evaluation context, which consists of the following: -* An _evaluation mode_, which is defined below. -* A _variable scope_, which provides a set of named variables for use in +- An _evaluation mode_, which is defined below. +- A _variable scope_, which provides a set of named variables for use in expressions. -* A _function table_, which provides a set of named functions for use in +- A _function table_, which provides a set of named functions for use in expressions. The _evaluation mode_ allows for two different interpretations of an expression: -* In _literal-only mode_, variables and functions are not available and it +- In _literal-only mode_, variables and functions are not available and it is assumed that the calling application's intent is to treat the attribute value as a literal. -* In _full expression mode_, variables and functions are defined and it is +- In _full expression mode_, variables and functions are defined and it is assumed that the calling application wishes to provide a full expression language for definition of the attribute value. @@ -235,15 +235,15 @@ for interpretation into any suitable number representation. An implementation may in practice implement numbers with limited precision so long as the following constraints are met: -* Integers are represented with at least 256 bits. -* Non-integer numbers are represented as floating point values with a +- Integers are represented with at least 256 bits. +- Non-integer numbers are represented as floating point values with a mantissa of at least 256 bits and a signed binary exponent of at least 16 bits. -* An error is produced if an integer value given in source cannot be +- An error is produced if an integer value given in source cannot be represented precisely. -* An error is produced if a non-integer value cannot be represented due to +- An error is produced if a non-integer value cannot be represented due to overflow. -* A non-integer number is rounded to the nearest possible value when a +- A non-integer number is rounded to the nearest possible value when a value is of too high a precision to be represented. The _number_ type also requires representation of both positive and negative @@ -265,11 +265,11 @@ _Structural types_ are types that are constructed by combining other types. Each distinct combination of other types is itself a distinct type. There are two structural type _kinds_: -* _Object types_ are constructed of a set of named attributes, each of which +- _Object types_ are constructed of a set of named attributes, each of which has a type. Attribute names are always strings. (_Object_ attributes are a distinct idea from _body_ attributes, though calling applications may choose to blur the distinction by use of common naming schemes.) -* _Tuple types_ are constructed of a sequence of elements, each of which +- _Tuple types_ are constructed of a sequence of elements, each of which has a type. Values of structural types are compared for equality in terms of their @@ -284,9 +284,9 @@ have attributes or elements with identical types. _Collection types_ are types that combine together an arbitrary number of values of some other single type. There are three collection type _kinds_: -* _List types_ represent ordered sequences of values of their element type. -* _Map types_ represent values of their element type accessed via string keys. -* _Set types_ represent unordered sets of distinct values of their element type. +- _List types_ represent ordered sequences of values of their element type. +- _Map types_ represent values of their element type accessed via string keys. +- _Set types_ represent unordered sets of distinct values of their element type. For each of these kinds and each distinct element type there is a distinct collection type. For example, "list of string" is a distinct type from @@ -376,9 +376,9 @@ a type has a non-commutative _matches_ relationship with a _type specification_. A type specification is, in practice, just a different interpretation of a type such that: -* Any type _matches_ any type that it is identical to. +- Any type _matches_ any type that it is identical to. -* Any type _matches_ the dynamic pseudo-type. +- Any type _matches_ the dynamic pseudo-type. For example, given a type specification "list of dynamic pseudo-type", the concrete types "list of string" and "list of map" match, but the @@ -397,51 +397,51 @@ applications to provide functions that are interoperable with all syntaxes. A _function_ is defined from the following elements: -* Zero or more _positional parameters_, each with a name used for documentation, +- Zero or more _positional parameters_, each with a name used for documentation, a type specification for expected argument values, and a flag for whether each of null values, unknown values, and values of the dynamic pseudo-type are accepted. -* Zero or one _variadic parameters_, with the same structure as the _positional_ +- Zero or one _variadic parameters_, with the same structure as the _positional_ parameters, which if present collects any additional arguments provided at the function call site. -* A _result type definition_, which specifies the value type returned for each +- A _result type definition_, which specifies the value type returned for each valid sequence of argument values. -* A _result value definition_, which specifies the value returned for each +- A _result value definition_, which specifies the value returned for each valid sequence of argument values. A _function call_, regardless of source syntax, consists of a sequence of argument values. The argument values are each mapped to a corresponding parameter as follows: -* For each of the function's positional parameters in sequence, take the next +- For each of the function's positional parameters in sequence, take the next argument. If there are no more arguments, the call is erroneous. -* If the function has a variadic parameter, take all remaining arguments that +- If the function has a variadic parameter, take all remaining arguments that where not yet assigned to a positional parameter and collect them into a sequence of variadic arguments that each correspond to the variadic parameter. -* If the function has _no_ variadic parameter, it is an error if any arguments +- If the function has _no_ variadic parameter, it is an error if any arguments remain after taking one argument for each positional parameter. After mapping each argument to a parameter, semantic checking proceeds for each argument: -* If the argument value corresponding to a parameter does not match the +- If the argument value corresponding to a parameter does not match the parameter's type specification, the call is erroneous. -* If the argument value corresponding to a parameter is null and the parameter +- If the argument value corresponding to a parameter is null and the parameter is not specified as accepting nulls, the call is erroneous. -* If the argument value corresponding to a parameter is the dynamic value +- If the argument value corresponding to a parameter is the dynamic value and the parameter is not specified as accepting values of the dynamic pseudo-type, the call is valid but its _result type_ is forced to be the dynamic pseudo type. -* If neither of the above conditions holds for any argument, the call is +- If neither of the above conditions holds for any argument, the call is valid and the function's value type definition is used to determine the call's _result type_. A function _may_ vary its result type depending on the argument _values_ as well as the argument _types_; for example, a @@ -450,11 +450,11 @@ for each argument: If semantic checking succeeds without error, the call is _executed_: -* For each argument, if its value is unknown and its corresponding parameter +- For each argument, if its value is unknown and its corresponding parameter is not specified as accepting unknowns, the _result value_ is forced to be an unknown value of the result type. -* If the previous condition does not apply, the function's result value +- If the previous condition does not apply, the function's result value definition is used to determine the call's _result value_. The result of a function call expression is either an error, if one of the @@ -631,20 +631,20 @@ diagnostics if they are applied to inappropriate expressions. The following are the required static analysis functions: -* **Static List**: Require list/tuple construction syntax to be used and +- **Static List**: Require list/tuple construction syntax to be used and return a list of expressions for each of the elements given. -* **Static Map**: Require map/object construction syntax to be used and +- **Static Map**: Require map/object construction syntax to be used and return a list of key/value pairs -- both expressions -- for each of the elements given. The usual constraint that a map key must be a string must not apply to this analysis, thus allowing applications to interpret arbitrary keys as they see fit. -* **Static Call**: Require function call syntax to be used and return an +- **Static Call**: Require function call syntax to be used and return an object describing the called function name and a list of expressions representing each of the call arguments. -* **Static Traversal**: Require a reference to a symbol in the variable +- **Static Traversal**: Require a reference to a symbol in the variable scope and return a description of the path from the root scope to the accessed attribute or index. @@ -670,18 +670,18 @@ with the goals of this specification. The language-agnosticism of this specification assumes that certain behaviors are implemented separately for each syntax: -* Matching of a body schema with the physical elements of a body in the +- Matching of a body schema with the physical elements of a body in the source language, to determine correspondence between physical constructs and schema elements. -* Implementing the _dynamic attributes_ body processing mode by either +- Implementing the _dynamic attributes_ body processing mode by either interpreting all physical constructs as attributes or producing an error if non-attribute constructs are present. -* Providing an evaluation function for all possible expressions that produces +- Providing an evaluation function for all possible expressions that produces a value given an evaluation context. -* Providing the static analysis functionality described above in a manner that +- Providing the static analysis functionality described above in a manner that makes sense within the convention of the syntax. The suggested implementation strategy is to use an implementation language's