guide: The "Configuration Language Design" section
parent 280771fe8a
commit 57c9a676d7
@ -75,6 +75,8 @@ complex structures:
source_file = "${path.module}/foo.txt"

.. _go-expression-funcs:

Defining Functions
------------------

@ -8,6 +8,8 @@ some more complex situations that can benefit from some additional techniques.
This section lists a few of these situations and ways to use the HCL API to
accommodate them.

.. _go-interdep-blocks:

Interdependent Blocks
---------------------

@ -1,3 +1,285 @@
Configuration Language Design
=============================

In this section we will cover some conventions for HCL-based configuration
languages that can help make them feel consistent with other HCL-based
languages, and make the best use of HCL's building blocks.

HCL's native and JSON syntaxes both define a mapping from input bytes to a
higher-level information model. In designing a configuration language based on
HCL, your building blocks are the components in that information model:
blocks, arguments, and expressions.

Each calling application of HCL, then, effectively defines its own language.
Just as Atom and RSS are higher-level languages built on XML, HashiCorp
Terraform has a higher-level language built on HCL, while HashiCorp Nomad has
its own distinct language that is *also* built on HCL.

From an end-user perspective, these are distinct languages but have a common
underlying texture. Users of both are therefore likely to bring some
expectations from one to the other, and so this section is an attempt to
codify some of these shared expectations to reduce user surprise.

These are subjective guidelines, however, and so applications may choose to
ignore them entirely or ignore them in certain specialized cases. An
application providing a configuration language for a pre-existing system, for
example, may choose to eschew the identifier naming conventions in this section
in order to exactly match the existing names in that underlying system.

Language Keywords and Identifiers
---------------------------------

Much of the work in defining an HCL-based language is in selecting good names
for arguments, block types, variables, and functions.

The standard for naming in HCL is to use all-lowercase identifiers with
underscores separating words, like ``service`` or ``io_mode``. HCL identifiers
do allow uppercase letters and dashes, but this is primarily for natural
interfacing with external systems that may have other identifier conventions,
and so these should generally be avoided for the identifiers native to your
own language.

The distinction between "keywords" and other identifiers is really just a
convention. In your own language documentation, you may use the word "keyword"
to refer to names that are presented as an intrinsic part of your language,
such as important top-level block type names.

Block type names are usually singular, since each block defines a single
object. Use a plural block name only if the block is serving only as a
namespacing container for a number of other objects. A block with a plural
type name will generally contain only nested blocks, and no arguments of its
own.

Argument names are also singular unless they expect a collection value, in
which case they should be plural. For example, ``name = "foo"`` but
``subnet_ids = ["abc", "123"]``.

Function names will generally *not* use underscores and will instead just run
words together, as is common in the C standard library. This is a result of
the fact that several of the standard library functions offered in ``cty``
(covered in a later section) have names that follow C library function names
like ``substr``. This is not a strong rule, and applications that use longer
names may choose to use underscores for them to improve readability.

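As a rough illustration of these naming conventions, consider the following
sketch for a hypothetical language. The ``listeners`` and ``listener`` block
types, the ``"public"`` label, and the ``"async"`` value are all invented for
this example and do not belong to any real application:

.. code-block:: hcl

  # Plural block type: a namespacing container with no arguments of its own.
  listeners {
    # Singular block type: each block defines a single object.
    listener "public" {
      io_mode    = "async"         # singular name, single value
      subnet_ids = ["abc", "123"]  # plural name, collection value
    }
  }
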
Blocks vs. Object Values
------------------------

HCL blocks and argument values of object type have quite a similar appearance
in the native syntax, and are identical in JSON syntax:

.. code-block:: hcl

  block {
    foo = bar
  }

  # argument with object constructor expression
  argument = {
    foo = bar
  }

In spite of this superficial similarity, there are some important differences
between these two forms.

The most significant difference is that a child block can contain nested blocks
of its own, while an object constructor expression can define only attributes
of the object it is creating.

The user-facing model for blocks is that they generally form the more "rigid"
structure of the language itself, while argument values can be more free-form.
An application will generally define in its schema and documentation all of
the arguments that are valid for a particular block type, while arguments
accepting object constructors are more appropriate for situations where the
arguments themselves are freely selected by the user, such as when the
expression will be converted by the application to a map type.

As a less contrived example, consider the ``resource`` block type in Terraform
and its use with a particular resource type ``aws_instance``:

.. code-block:: hcl

  resource "aws_instance" "example" {
    ami           = "ami-abc123"
    instance_type = "t2.micro"

    tags = {
      Name = "example instance"
    }

    ebs_block_device {
      device_name = "hda1"
      volume_size = 8
      volume_type = "standard"
    }
  }

The top-level block type ``resource`` is fundamental to Terraform itself and
so an obvious candidate for block syntax: it maps directly onto an object in
Terraform's own domain model.

Within this block we see a mixture of arguments and nested blocks, all defined
as part of the schema of the ``aws_instance`` resource type. The ``tags``
map here is specified as an argument because its keys are free-form, chosen
by the user and mapped directly onto a map in the underlying system.
``ebs_block_device`` is specified as a nested block, because it is a separate
domain object within the remote system and has a rigid schema of its own.

As a special case, block syntax may sometimes be used with free-form keys if
those keys each serve as a separate declaration of some first-class object
in the language. For example, Terraform has a top-level block type ``locals``
which behaves in this way:

.. code-block:: hcl

  locals {
    instance_type = "t2.micro"
    instance_id   = aws_instance.example.id
  }

Although the argument names in this block are arbitrarily selected by the
user, each one defines a distinct top-level object. In other words, this
approach is used to create a more ergonomic syntax for defining these simple
single-expression objects, as a pragmatic alternative to more verbose and
redundant declarations using blocks:

.. code-block:: hcl

  local "instance_type" {
    value = "t2.micro"
  }
  local "instance_id" {
    value = aws_instance.example.id
  }

The distinction between domain objects, language constructs and user data will
always be subjective, so the final decision is up to you as the language
designer.

Standard Functions
------------------

HCL itself does not define a common set of functions available in all HCL-based
languages; the built-in language operators give a baseline of functionality
that is always available, but applications are free to define functions as they
see fit.

With that said, there are a number of generally-useful functions that don't
belong to the domain of any one application: string manipulation, sequence
manipulation, date formatting, JSON serialization and parsing, etc.

Given the general need such functions serve, it's helpful if a similar set of
functions is available with compatible behavior across multiple HCL-based
languages, assuming the language is for an application where function calls
make sense at all.

The Go implementation of HCL is built on an underlying type and function
system, :go:pkg:`cty`, whose usage was introduced in :ref:`go-expression-funcs`.
That library also has a package of "standard library" functions which we
encourage applications to offer with consistent names and compatible behavior,
either by using the standard implementations directly or offering compatible
implementations under the same name.

The "standard" functions that new configuration formats should consider
offering are:

* ``abs(number)`` - returns the absolute (positive) value of the given number.
* ``coalesce(vals...)`` - returns the value of the first argument that isn't null. Useful only in formats where null values may appear.
* ``compact(vals...)`` - returns a new tuple with the non-null values given as arguments, preserving order.
* ``concat(seqs...)`` - builds a tuple value by concatenating together all of the given sequence (list or tuple) arguments.
* ``format(fmt, args...)`` - performs simple string formatting similar to the C library function ``printf``.
* ``hasindex(coll, idx)`` - returns true if the given collection has the given index. ``coll`` may be of list, tuple, map, or object type.
* ``int(number)`` - returns the integer component of the given number, rounding towards zero.
* ``jsondecode(str)`` - interprets the given string as JSON format and returns the corresponding decoded value.
* ``jsonencode(val)`` - encodes the given value as a JSON string.
* ``length(coll)`` - returns the length of the given collection.
* ``lower(str)`` - converts the letters in the given string to lowercase, using Unicode case folding rules.
* ``max(numbers...)`` - returns the highest of the given number values.
* ``min(numbers...)`` - returns the lowest of the given number values.
* ``sethas(set, val)`` - returns true only if the given set has the given value as an element.
* ``setintersection(sets...)`` - returns the intersection of the given sets.
* ``setsubtract(set1, set2)`` - returns a set with the elements from ``set1`` that are not also in ``set2``.
* ``setsymdiff(sets...)`` - returns the symmetric difference of the given sets.
* ``setunion(sets...)`` - returns the union of the given sets.
* ``strlen(str)`` - returns the length of the given string in Unicode grapheme clusters.
* ``substr(str, offset, length)`` - returns a substring from the given string by splitting it between Unicode grapheme clusters.
* ``timeadd(time, duration)`` - takes a timestamp in RFC3339 format and a possibly-negative duration given as a string like ``"1h"`` (for "one hour") and returns a new RFC3339 timestamp after adding the duration to the given timestamp.
* ``upper(str)`` - converts the letters in the given string to uppercase, using Unicode case folding rules.

Not all of these functions will make sense in all applications. For example, an
application that doesn't use set types at all would have no reason to provide
the set-manipulation functions here.

Some languages will not provide functions at all, since they are primarily for
assigning values to arguments and thus do not need nor want any custom
computations of those values.

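As a sketch of how a few of the functions listed above might appear in a
configuration file, assume a hypothetical application that offers them under
exactly those names. The argument names on the left and the ``env.REGION``
reference are invented for this example:

.. code-block:: hcl

  display_name = upper("web")                            # "WEB"
  name_length  = strlen("web")                           # 3
  all_ports    = concat([80, 443], [8080])               # 80, 443, 8080
  expires_at   = timeadd("2019-01-01T00:00:00Z", "1h")   # one hour later

  # Falls back to the literal if the (hypothetical) variable is null.
  region = coalesce(env.REGION, "us-west-2")
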
Block Results as Expression Variables
-------------------------------------

In some applications, top-level blocks serve also as declarations of variables
(or of attributes of object variables) available during expression evaluation,
as discussed in :ref:`go-interdep-blocks`.

In this case, it's most intuitive for the variables map in the evaluation
context to contain a value named after each valid top-level block
type and for these values to be object-typed or map-typed and reflect the
structure implied by block type labels.

For example, an application may have a top-level ``service`` block type
used like this:

.. code-block:: hcl

  service "http" "web_proxy" {
    listen_addr = "127.0.0.1:8080"

    process "main" {
      command = ["/usr/local/bin/awesome-app", "server"]
    }

    process "mgmt" {
      command = ["/usr/local/bin/awesome-app", "mgmt"]
    }
  }

If the result of decoding this block were available for use in expressions
elsewhere in configuration, the above convention would call for it to be
available to expressions as an object at ``service.http.web_proxy``.

If it is the contents of the block itself that are offered to evaluation -- or
a superset object *derived* from the block contents -- then the block arguments
can map directly to object attributes, but it is up to the application to
decide which value type is most appropriate for each block type, since this
depends on how multiple blocks of the same type relate to one another, or if
multiple blocks of that type are even allowed.

In the above example, an application would probably expose the ``listen_addr``
argument value as ``service.http.web_proxy.listen_addr``, and may choose to
expose the ``process`` blocks as a map of objects using the labels as keys,
which would allow an expression like
``service.http.web_proxy.process["main"].command``.

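Under that convention, and assuming the ``process`` blocks are exposed as a
map keyed by label, another part of the configuration could refer to the
service as sketched here. The ``health_check`` block type and its arguments
are hypothetical, invented only to give the references somewhere to live:

.. code-block:: hcl

  health_check "web_proxy" {
    # Reads an argument of the service block directly.
    target = service.http.web_proxy.listen_addr

    # Reads an argument of one nested process block, selected by its label.
    command = service.http.web_proxy.process["main"].command
  }
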
If multiple blocks of a given type do not have a significant order relative to
one another, as seems to be the case with these ``process`` blocks,
representation as a map is often the most intuitive. If the ordering of the
blocks *is* significant then a list may be more appropriate, allowing the use
of HCL's "splat operators" for convenient access to child arguments. However,
there is no one-size-fits-all solution here and language designers must
instead consider the likely usage patterns of each value and select the
value representation that best accommodates those patterns.

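To illustrate the list alternative, suppose an application instead exposed
the ``process`` blocks from the earlier example as a list. A splat expression
could then collect one argument from every block. This is only a sketch, and
the ``all_commands`` argument name is hypothetical:

.. code-block:: hcl

  # Collects the "command" value of each process block, in list order.
  all_commands = service.http.web_proxy.process[*].command
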
Some applications may choose to offer variables with slightly different names
than the top-level blocks in order to allow for more concise references, such
as abbreviating ``service`` to ``svc`` in the above examples. This should be
done with care since it may make the relationship between the two less obvious,
but this may be a good tradeoff for names that are accessed frequently that
might otherwise hurt the readability of expressions they are embedded in.
Familiarity permits brevity.

Many applications will not make block results available for use in other
expressions at all, in which case they are free to select whichever variable
names make sense for what is being exposed. For example, a format may make
environment variable values available for use in expressions, and may do so
either as top-level variables (if no other variables are needed) or as an
object named ``env``, which can be used as in ``env.HOME``.

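For instance, assuming an application exposes environment variables through
an object named ``env`` as described above, a configuration could interpolate
one of them into a string template. The ``cache_dir`` argument name and the
path suffix are hypothetical:

.. code-block:: hcl

  cache_dir = "${env.HOME}/.cache/awesome-app"
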