Configuration Language Design
=============================

In this section we will cover some conventions for HCL-based configuration
languages that can help make them feel consistent with other HCL-based
languages, and make the best use of HCL's building blocks.

HCL's native and JSON syntaxes both define a mapping from input bytes to a
higher-level information model. In designing a configuration language based on
HCL, your building blocks are the components in that information model:
blocks, arguments, and expressions.

Each calling application of HCL, then, effectively defines its own language.
Just as Atom and RSS are higher-level languages built on XML, HashiCorp
Terraform has a higher-level language built on HCL, while HashiCorp Nomad has
its own distinct language that is *also* built on HCL.

From an end-user perspective, these are distinct languages but have a common
underlying texture. Users of both are therefore likely to bring some
expectations from one to the other, and so this section is an attempt to
codify some of these shared expectations to reduce user surprise.

These are subjective guidelines, however, and so applications may choose to
ignore them entirely or ignore them in certain specialized cases. An
application providing a configuration language for a pre-existing system, for
example, may choose to eschew the identifier naming conventions in this section
in order to exactly match the existing names in that underlying system.

Language Keywords and Identifiers
---------------------------------

Much of the work in defining an HCL-based language is in selecting good names
for arguments, block types, variables, and functions.

The standard for naming in HCL is to use all-lowercase identifiers with
underscores separating words, like ``service`` or ``io_mode``. HCL identifiers
do allow uppercase letters and dashes, but this is primarily for natural
interfacing with external systems that may have other identifier conventions,
and so these should generally be avoided for the identifiers native to your
own language.

The distinction between "keywords" and other identifiers is really just a
convention. In your own language documentation, you may use the word "keyword"
to refer to names that are presented as an intrinsic part of your language,
such as important top-level block type names.

Block type names are usually singular, since each block defines a single
object. Use a plural block name only if the block is serving only as a
namespacing container for a number of other objects. A block with a plural
type name will generally contain only nested blocks, and no arguments of its
own.

Argument names are also singular unless they expect a collection value, in
which case they should be plural. For example, ``name = "foo"`` but
``subnet_ids = ["abc", "123"]``.

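As a brief illustration of these naming conventions, the following sketch
combines them in one place. The ``listeners`` and ``listener`` block types and
the ``port`` argument are hypothetical and do not belong to any real
application:

.. code-block:: hcl

  # Singular block type: each block defines a single object.
  service "web" {
    name       = "foo"          # singular: a single value
    io_mode    = "async"        # lowercase words separated by underscores
    subnet_ids = ["abc", "123"] # plural: a collection value

    # Plural block type (hypothetical): only a namespacing container,
    # with nested blocks and no arguments of its own.
    listeners {
      listener "http" {
        port = 8080
      }

      listener "https" {
        port = 8443
      }
    }
  }
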
Function names will generally *not* use underscores and will instead just run
words together, as is common in the C standard library. This is a result of
the fact that several of the standard library functions offered in ``cty``
(covered in a later section) have names that follow C library function names
like ``substr``. This is not a strong rule, and applications that use longer
names may choose to use underscores for them to improve readability.

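For example, assuming an application that offers ``upper``, ``substr``, and
``jsonencode`` under those names, alongside a hypothetical longer function of
its own called ``parse_duration``, calls might look like this. The ``name``
variable and the argument names are also invented for illustration:

.. code-block:: hcl

  # Short names run words together, in the style of the C standard library.
  short_name = upper(substr(name, 0, 8))
  body       = jsonencode({ name = name })

  # Hypothetical longer name that uses underscores for readability.
  timeout = parse_duration("1h30m")
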
Blocks vs. Object Values
------------------------

HCL blocks and argument values of object type have quite a similar appearance
in the native syntax, and are identical in JSON syntax:

.. code-block:: hcl

  block {
    foo = bar
  }

  # argument with object constructor expression
  argument = {
    foo = bar
  }

In spite of this superficial similarity, there are some important differences
between these two forms.

The most significant difference is that a child block can contain nested blocks
of its own, while an object constructor expression can define only attributes
of the object it is creating.

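To illustrate that difference, the following sketch extends the example above
with a hypothetical ``nested`` name. The block can hold a nested block, while
the closest an object constructor can come is an attribute whose value is
another object:

.. code-block:: hcl

  # A block may contain nested blocks of its own.
  block {
    foo = bar

    nested {
      baz = true
    }
  }

  # An object constructor can define only attributes. A nested object is
  # still just an attribute value, so "nested = { ... }" is accepted here
  # while block syntax such as "nested { ... }" is not.
  argument = {
    foo = bar
    nested = {
      baz = true
    }
  }
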
The user-facing model for blocks is that they generally form the more "rigid"
structure of the language itself, while argument values can be more free-form.
An application will generally define in its schema and documentation all of
the arguments that are valid for a particular block type, while arguments
accepting object constructors are more appropriate for situations where the
arguments themselves are freely selected by the user, such as when the
expression will be converted by the application to a map type.

As a less contrived example, consider the ``resource`` block type in Terraform
and its use with a particular resource type ``aws_instance``:

.. code-block:: hcl

  resource "aws_instance" "example" {
    ami           = "ami-abc123"
    instance_type = "t2.micro"

    tags = {
      Name = "example instance"
    }

    ebs_block_device {
      device_name = "hda1"
      volume_size = 8
      volume_type = "standard"
    }
  }

The top-level block type ``resource`` is fundamental to Terraform itself and
so an obvious candidate for block syntax: it maps directly onto an object in
Terraform's own domain model.

Within this block we see a mixture of arguments and nested blocks, all defined
as part of the schema of the ``aws_instance`` resource type. The ``tags``
map here is specified as an argument because its keys are free-form, chosen
by the user and mapped directly onto a map in the underlying system.
``ebs_block_device`` is specified as a nested block, because it is a separate
domain object within the remote system and has a rigid schema of its own.

As a special case, block syntax may sometimes be used with free-form keys if
those keys each serve as a separate declaration of some first-class object
in the language. For example, Terraform has a top-level block type ``locals``
which behaves in this way:

.. code-block:: hcl

  locals {
    instance_type = "t2.micro"
    instance_id   = aws_instance.example.id
  }

Although the argument names in this block are arbitrarily selected by the
user, each one defines a distinct top-level object. In other words, this
approach is used to create a more ergonomic syntax for defining these simple
single-expression objects, as a pragmatic alternative to more verbose and
redundant declarations using blocks:

.. code-block:: hcl

  local "instance_type" {
    value = "t2.micro"
  }
  local "instance_id" {
    value = aws_instance.example.id
  }

The distinction between domain objects, language constructs and user data will
always be subjective, so the final decision is up to you as the language
designer.

Standard Functions
------------------

HCL itself does not define a common set of functions available in all HCL-based
languages; the built-in language operators give a baseline of functionality
that is always available, but applications are free to define functions as they
see fit.

With that said, there are a number of generally-useful functions that don't
belong to the domain of any one application: string manipulation, sequence
manipulation, date formatting, JSON serialization and parsing, etc.

Given the general need such functions serve, it's helpful if a similar set of
functions is available with compatible behavior across multiple HCL-based
languages, assuming the language is for an application where function calls
make sense at all.

The Go implementation of HCL is built on an underlying type and function system
:go:pkg:`cty`, whose usage was introduced in :ref:`go-expression-funcs`. That
library also has a package of "standard library" functions which we encourage
applications to offer with consistent names and compatible behavior, either by
using the standard implementations directly or offering compatible
implementations under the same name.

The "standard" functions that new configuration formats should consider
offering are:

* ``abs(number)`` - returns the absolute (positive) value of the given number.
* ``coalesce(vals...)`` - returns the value of the first argument that isn't null. Useful only in formats where null values may appear.
* ``compact(vals...)`` - returns a new tuple with the non-null values given as arguments, preserving order.
* ``concat(seqs...)`` - builds a tuple value by concatenating together all of the given sequence (list or tuple) arguments.
* ``format(fmt, args...)`` - performs simple string formatting similar to the C library function ``printf``.
* ``hasindex(coll, idx)`` - returns true if the given collection has the given index. ``coll`` may be of list, tuple, map, or object type.
* ``int(number)`` - returns the integer component of the given number, rounding towards zero.
* ``jsondecode(str)`` - interprets the given string as JSON format and returns the corresponding decoded value.
* ``jsonencode(val)`` - encodes the given value as a JSON string.
* ``length(coll)`` - returns the length of the given collection.
* ``lower(str)`` - converts the letters in the given string to lowercase, using Unicode case folding rules.
* ``max(numbers...)`` - returns the highest of the given number values.
* ``min(numbers...)`` - returns the lowest of the given number values.
* ``sethas(set, val)`` - returns true only if the given set has the given value as an element.
* ``setintersection(sets...)`` - returns the intersection of the given sets.
* ``setsubtract(set1, set2)`` - returns a set with the elements from ``set1`` that are not also in ``set2``.
* ``setsymdiff(sets...)`` - returns the symmetric difference of the given sets.
* ``setunion(sets...)`` - returns the union of the given sets.
* ``strlen(str)`` - returns the length of the given string in Unicode grapheme clusters.
* ``substr(str, offset, length)`` - returns a substring from the given string by splitting it between Unicode grapheme clusters.
* ``timeadd(time, duration)`` - takes a timestamp in RFC3339 format and a possibly-negative duration given as a string like ``"1h"`` (for "one hour") and returns a new RFC3339 timestamp after adding the duration to the given timestamp.
* ``upper(str)`` - converts the letters in the given string to uppercase, using Unicode case folding rules.

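As a rough illustration, the following arguments call a few of these functions.
This sketch assumes an application that offers them under the names listed
above; the argument names are hypothetical, and the results shown in the
comments follow from the descriptions in the list:

.. code-block:: hcl

  absolute   = abs(-3)                    # 3
  fallback   = coalesce(null, "default")  # "default"
  combined   = concat(["a", "b"], ["c"])  # ["a", "b", "c"]
  label      = format("%s-%d", "web", 2)  # "web-2"
  whole      = int(2.7)                   # 2
  item_count = length(["a", "b", "c"])    # 3
  largest    = max(1, 5, 3)               # 5
  prefix     = substr("hello", 0, 2)      # "he"
  shouting   = upper("hello")             # "HELLO"
  has_second = hasindex(["a", "b"], 1)    # true
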
Not all of these functions will make sense in all applications. For example, an
application that doesn't use set types at all would have no reason to provide
the set-manipulation functions here.

Some languages will not provide functions at all, since they are primarily for
assigning values to arguments and thus do not need nor want any custom
computations of those values.

Block Results as Expression Variables
-------------------------------------

In some applications, top-level blocks serve also as declarations of variables
(or of attributes of object variables) available during expression evaluation,
as discussed in :ref:`go-interdep-blocks`.

In this case, it's most intuitive for the variables map in the evaluation
context to contain a value named after each valid top-level block
type and for these values to be object-typed or map-typed and reflect the
structure implied by block type labels.

For example, an application may have a top-level ``service`` block type
used like this:

.. code-block:: hcl

  service "http" "web_proxy" {
    listen_addr = "127.0.0.1:8080"

    process "main" {
      command = ["/usr/local/bin/awesome-app", "server"]
    }

    process "mgmt" {
      command = ["/usr/local/bin/awesome-app", "mgmt"]
    }
  }

If the result of decoding this block were available for use in expressions
elsewhere in configuration, the above convention would call for it to be
available to expressions as an object at ``service.http.web_proxy``.

If it is the contents of the block itself that are offered to evaluation -- or
a superset object *derived* from the block contents -- then the block arguments
can map directly to object attributes, but it is up to the application to
decide which value type is most appropriate for each block type, since this
depends on how multiple blocks of the same type relate to one another, or if
multiple blocks of that type are even allowed.

In the above example, an application would probably expose the ``listen_addr``
argument value as ``service.http.web_proxy.listen_addr``, and may choose to
expose the ``process`` blocks as a map of objects using the labels as keys,
which would allow an expression like
``service.http.web_proxy.process["main"].command``.

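Under that convention, expressions elsewhere in the configuration might refer
to the example block as sketched below. The ``monitor`` block type and its
arguments are hypothetical, invented here only to give the expressions
somewhere to live:

.. code-block:: hcl

  monitor "web_proxy" {
    # Block argument exposed as an object attribute.
    target = service.http.web_proxy.listen_addr

    # Nested "process" blocks exposed as a map keyed by block label.
    main_command = service.http.web_proxy.process["main"].command
    mgmt_command = service.http.web_proxy.process["mgmt"].command
  }
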
If multiple blocks of a given type do not have a significant order relative to
one another, as seems to be the case with these ``process`` blocks,
representation as a map is often the most intuitive. If the ordering of the
blocks *is* significant then a list may be more appropriate, allowing the use
of HCL's "splat operators" for convenient access to child arguments. However,
there is no one-size-fits-all solution here and language designers must
instead consider the likely usage patterns of each value and select the
value representation that best accommodates those patterns.

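For instance, if an application instead chose to expose the ``process`` blocks
as a list, which is an assumption made only for this sketch, a splat expression
could collect one argument from every block. The ``all_commands`` and
``first_command`` argument names are hypothetical:

.. code-block:: hcl

  # Each element of the result is the "command" value from one process
  # block, in declaration order.
  all_commands = service.http.web_proxy.process[*].command

  # Individual blocks are then addressed by position rather than by label.
  first_command = service.http.web_proxy.process[0].command
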
Some applications may choose to offer variables with slightly different names
than the top-level blocks in order to allow for more concise references, such
as abbreviating ``service`` to ``svc`` in the above examples. This should be
done with care since it may make the relationship between the two less obvious,
but this may be a good tradeoff for names that are accessed frequently that
might otherwise hurt the readability of expressions they are embedded in.
Familiarity permits brevity.

Many applications will not make block results available for use in other
expressions at all, in which case they are free to select whichever variable
names make sense for what is being exposed. For example, a format may make
environment variable values available for use in expressions, and may do so
either as top-level variables (if no other variables are needed) or as an
object named ``env``, which can be used as in ``env.HOME``.

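A minimal sketch of how such an ``env`` object might be used, with hypothetical
argument names and assuming the corresponding environment variables are set:

.. code-block:: hcl

  # Environment variables read as attributes of the "env" object.
  data_dir = "${env.HOME}/.myapp/data"
  editor   = env.EDITOR
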
Text Editor and IDE Integrations
--------------------------------

Since HCL defines only low-level syntax, a text editor or IDE integration for
HCL itself can only really provide basic syntax highlighting.

For non-trivial HCL-based languages, a more specialized editor integration may
be warranted. For example, users writing configuration for HashiCorp Terraform
must recall the argument names for numerous different provider plugins, and so
auto-completion and documentation hovertips can be a great help, and
configurations are commonly spread over multiple files making "Go to Definition"
functionality useful. None of this functionality can be implemented generically
for all HCL-based languages since it relies on knowledge of the structure of
Terraform's own language.

Writing such text editor integrations is out of the scope of this guide. The
Go implementation of HCL does have some building blocks to help with this, but
it will always be an application-specific effort.

However, in order to *enable* such integrations, it is best to establish a
conventional file extension *other than* ``.hcl`` for each non-trivial HCL-based
language, thus allowing text editors to recognize it and enable the suitable
integration. For example, Terraform requires ``.tf`` and ``.tf.json`` filenames
for its main configuration, and the ``hcldec`` utility in the HCL repository
accepts spec files that should conventionally be named with an ``.hcldec``
extension.

For simple languages that are unlikely to benefit from specific editor
integrations, using the ``.hcl`` extension is fine and may cause an editor to
enable basic syntax highlighting, absent any other deeper features. An editor
extension for a specific HCL-based language should *not* match generically the
``.hcl`` extension, since this can cause confusing results for users
attempting to write configuration files targeting other applications.