guide: Design Patterns for Complex Systems section

2018-08-31 22:01:43 -07:00 · 2018-08-31 22:01:43 -07:00 · 280771fe8a
commit 280771fe8a
parent 495dfc9487
1 changed files with 311 additions and 0 deletions
--- a/guide/go_patterns.rst
+++ b/guide/go_patterns.rst
@@ -1,2 +1,313 @@
 Design Patterns for Complex Systems
 ===================================
 In previous sections we've seen an overview of the different ways an
 application can decode a language it has defined in terms of the HCL grammar.
 For many applications, those mechanisms are sufficient. However, there are
 some more complex situations that can benefit from some additional techniques.
 This section lists a few of these situations and ways to use the HCL API to
 accommodate them.
 Interdependent Blocks
 ---------------------
 In some configuration languages, the variables available for use in one
 configuration block depend on values defined in other blocks.
 For example, in Terraform many of the top-level constructs are also implicitly
 definitions of values that are available for use in expressions elsewhere:
 .. code-block:: hcl
   variable "network_numbers" {
     type = list(number)
   }
   variable "base_network_addr" {
     type    = string
     default = "10.0.0.0/8"
   }
   locals {
     network_blocks = {
       for x in var.network_numbers:
       x => cidrsubnet(var.base_network_addr, 8, x)
     }
   }
   resource "cloud_subnet" "example" {
     for_each = local.network_blocks
     cidr_block = each.value
   }
   output "subnet_ids" {
     value = cloud_subnet.example[*].id
   }
 In this example, the ``variable "network_numbers"`` block makes
 ``var.network_numbers`` available to expressions, the
 ``resource "cloud_subnet" "example"`` block makes ``cloud_subnet.example``
 available, etc.
 Terraform achieves this by decoding the top-level structure in isolation to
 start. You can do this either using the low-level API or using :go:pkg:`gohcl`
 with :go:type:`hcl.Body` fields tagged as "remain".
 Once you have a separate body for each top-level block, you can inspect each
 of the attribute expressions inside using the ``Variables`` method on
 :go:type:`hcl.Expression`, or the ``Variables`` function from package
 :go:pkg:`hcldec` if you will eventually use its higher-level API to decode as
 Terraform does.
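
 As a minimal sketch of that inspection step, the following program parses a
 single expression and lists the names it refers to. It assumes the current
 module path ``github.com/hashicorp/hcl/v2`` (older releases used a different
 import path), and ``referencedNames`` is a hypothetical helper rather than
 part of the HCL API.

 .. code-block:: go

   package main

   import (
       "fmt"

       "github.com/hashicorp/hcl/v2"
       "github.com/hashicorp/hcl/v2/hclsyntax"
   )

   // referencedNames is a hypothetical helper. It returns the first two
   // names of each traversal found in src, such as "var.base_network_addr".
   func referencedNames(src string) ([]string, hcl.Diagnostics) {
       expr, diags := hclsyntax.ParseExpression(
           []byte(src), "example.hcl", hcl.Pos{Line: 1, Column: 1},
       )
       if diags.HasErrors() {
           return nil, diags
       }
       var names []string
       for _, trav := range expr.Variables() {
           name := trav.RootName()
           // Include the first attribute step, if any, so that var.a and
           // var.b are treated as different objects.
           if len(trav) > 1 {
               if attr, ok := trav[1].(hcl.TraverseAttr); ok {
                   name += "." + attr.Name
               }
           }
           names = append(names, name)
       }
       return names, diags
   }

   func main() {
       names, diags := referencedNames(
           `cidrsubnet(var.base_network_addr, 8, local.index)`,
       )
       if diags.HasErrors() {
           fmt.Println(diags.Error())
           return
       }
       fmt.Println(names)
   }

 The function name ``cidrsubnet`` is not reported, because ``Variables``
 returns only variable traversals.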
 You can then use the detected variable references to construct a dependency
 graph between the blocks, and perform a
 `topological sort <https://en.wikipedia.org/wiki/Topological_sorting>`_ on it
 to determine the correct order to evaluate each block's contents so that
 values will always be available before they are needed.
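
 The ordering step needs nothing from HCL itself. The sketch below uses only
 the Go standard library and one common topological sorting technique (Kahn's
 algorithm); it is not necessarily how Terraform implements this. The
 ``evalOrder`` function and the dotted block names are illustrative
 assumptions.

 .. code-block:: go

   package main

   import (
       "fmt"
       "sort"
   )

   // evalOrder returns block names ordered so that every block appears
   // after all of the blocks it refers to. deps maps each block name to the
   // names it references. An error is returned for undeclared references
   // and for dependency cycles.
   func evalOrder(deps map[string][]string) ([]string, error) {
       pending := make(map[string]int, len(deps)) // unresolved references
       dependents := make(map[string][]string)    // reverse edges
       for name, refs := range deps {
           pending[name] = len(refs)
           for _, ref := range refs {
               if _, ok := deps[ref]; !ok {
                   return nil, fmt.Errorf("%s refers to undeclared %s", name, ref)
               }
               dependents[ref] = append(dependents[ref], name)
           }
       }

       var ready []string
       for name, n := range pending {
           if n == 0 {
               ready = append(ready, name)
           }
       }
       sort.Strings(ready) // make the result deterministic

       var order []string
       for len(ready) > 0 {
           name := ready[0]
           ready = ready[1:]
           order = append(order, name)
           next := dependents[name]
           sort.Strings(next)
           for _, d := range next {
               pending[d]--
               if pending[d] == 0 {
                   ready = append(ready, d)
               }
           }
       }
       if len(order) != len(deps) {
           return nil, fmt.Errorf("dependency cycle detected")
       }
       return order, nil
   }

   func main() {
       order, err := evalOrder(map[string][]string{
           "var.network_numbers":   nil,
           "var.base_network_addr": nil,
           "local.network_blocks":  {"var.network_numbers", "var.base_network_addr"},
           "cloud_subnet.example":  {"local.network_blocks"},
           "output.subnet_ids":     {"cloud_subnet.example"},
       })
       if err != nil {
           fmt.Println(err)
           return
       }
       for _, name := range order {
           fmt.Println(name)
       }
   }

 For the graph in ``main``, both variables come first, followed by
 ``local.network_blocks``, ``cloud_subnet.example`` and ``output.subnet_ids``.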
 Since :go:pkg:`cty` values are immutable, it is not convenient to directly
 change values in a :go:type:`hcl.EvalContext` during this gradual evaluation,
 so instead construct a specialized data structure that has a separate value
 per object and construct an evaluation context from that each time a new
 value becomes available.
 Using :go:pkg:`hcldec` to evaluate block bodies is particularly convenient in
 this scenario because it produces :go:type:`cty.Value` results which can then
 just be directly incorporated into the evaluation context.
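
 One possible shape for that specialized data structure is sketched below.
 The ``scope`` type and its ``evalContext`` method are hypothetical names
 chosen for illustration, and the import paths assume the current
 ``github.com/hashicorp/hcl/v2`` module. The application records each new
 value in an ordinary Go map and derives a fresh evaluation context from it
 whenever one is needed.

 .. code-block:: go

   package main

   import (
       "fmt"

       "github.com/hashicorp/hcl/v2"
       "github.com/zclconf/go-cty/cty"
   )

   // scope is a hypothetical application-level structure holding one value
   // per named object, grouped by namespace ("var", "local", and so on).
   type scope map[string]map[string]cty.Value

   // evalContext builds a new hcl.EvalContext from the values recorded so
   // far. Each namespace becomes one object-typed variable.
   func (s scope) evalContext() *hcl.EvalContext {
       vars := make(map[string]cty.Value, len(s))
       for namespace, members := range s {
           vars[namespace] = cty.ObjectVal(members)
       }
       return &hcl.EvalContext{Variables: vars}
   }

   func main() {
       s := scope{
           "var": {"base_network_addr": cty.StringVal("10.0.0.0/8")},
       }
       ctx := s.evalContext()
       fmt.Println(ctx.Variables["var"].GetAttr("base_network_addr").AsString())

       // A later evaluation step produced a new value: record it, then
       // build another context rather than modifying the old one.
       s["local"] = map[string]cty.Value{"count": cty.NumberIntVal(3)}
       ctx = s.evalContext()
       fmt.Println(ctx.Variables["local"].Type().HasAttribute("count"))
   }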
 Distributed Systems
 -------------------
 Distributed systems cause a number of extra challenges, and configuration
 management is rarely the worst of these. However, there are some specific
 considerations for using HCL-based configuration in distributed systems.
 For the sake of this section, we are concerned with distributed systems where
 at least two separate components both depend on the content of HCL-based
 configuration files. Real-world examples include the following:
 * **HashiCorp Nomad** loads configuration (job specifications) in its servers
  but also needs these results in its clients and in its various driver plugins.
 * **HashiCorp Terraform** parses configuration in Terraform Core but can write
  a partially-evaluated execution plan to disk and continue evaluation in a
  separate process later. It must also pass configuration values into provider
  plugins.
 Broadly speaking, there are two approaches to allowing configuration to be
 accessed in multiple subsystems, which the following subsections will discuss
 separately.
 Ahead-of-time Evaluation
 ^^^^^^^^^^^^^^^^^^^^^^^^
 Ahead-of-time evaluation is the simplest path, with the configuration files
 being entirely evaluated on entry to the system, and then only the resulting
 *constant values* being passed between subsystems.
 This approach is relatively straightforward because the
 :go:type:`cty.Value` results can be losslessly serialized as either JSON or
 msgpack as long as all system components agree on the expected value types.
 Aside from passing these values around "on the wire", parsing and decoding of
 configuration proceeds as normal.
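
 A minimal sketch of the JSON case follows, using the ``json`` package that
 ships with :go:pkg:`cty`. The ``subnetType`` schema and the ``roundTrip``
 helper are assumptions made for this example; in a real system the sender and
 receiver would live in different processes and agree on the type through some
 application-specific mechanism.

 .. code-block:: go

   package main

   import (
       "fmt"

       "github.com/zclconf/go-cty/cty"
       ctyjson "github.com/zclconf/go-cty/cty/json"
   )

   // subnetType is a hypothetical type agreed on by both subsystems.
   var subnetType = cty.Object(map[string]cty.Type{
       "cidr_block": cty.String,
       "zone_count": cty.Number,
   })

   // roundTrip serializes val as JSON and decodes it again, standing in for
   // the sending and receiving sides of a wire protocol.
   func roundTrip(val cty.Value) (cty.Value, error) {
       buf, err := ctyjson.Marshal(val, subnetType)
       if err != nil {
           return cty.NilVal, err
       }
       return ctyjson.Unmarshal(buf, subnetType)
   }

   func main() {
       sent := cty.ObjectVal(map[string]cty.Value{
           "cidr_block": cty.StringVal("10.1.2.0/24"),
           "zone_count": cty.NumberIntVal(3),
       })
       received, err := roundTrip(sent)
       if err != nil {
           fmt.Println(err)
           return
       }
       fmt.Println(received.RawEquals(sent))
   }

 The ``msgpack`` package in the same module offers ``Marshal`` and
 ``Unmarshal`` functions of the same shape.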
 Both Nomad and Terraform use this approach for interacting with *plugins*,
 because the plugins themselves are written by various different teams that do
 not coordinate closely, and so doing all expression evaluation in the core
 subsystems ensures consistency between plugins and simplifies plugin development.
 In both applications, the plugin is expected to describe (using an
 application-specific protocol) the schema it expects for each element of
 configuration it is responsible for, allowing the core subsystems to perform
 decoding on the plugin's behalf and pass a value that is guaranteed to conform
 to the schema.
 Gradual Evaluation
 ^^^^^^^^^^^^^^^^^^
 Although ahead-of-time evaluation is relatively straightforward, it has the
 significant disadvantage that all data available for access via variables or
 functions must be known by whichever subsystem performs that initial
 evaluation.
 For example, in Terraform, the "plan" subcommand is responsible for evaluating
 the configuration and presenting to the user an execution plan for approval, but
 certain values in that plan cannot be determined until the plan is already
 being applied, since the specific values used depend on remote API decisions
 such as the allocation of opaque id strings for objects.
 In Terraform's case, the creation of the plan and the eventual apply
 of that plan *both* entail evaluating configuration, with the apply step
 having a more complete set of input values and thus producing a more complete
 result. However, this means that Terraform must somehow make the expressions
 from the original input configuration available to the separate process that
 applies the generated plan.
 Good usability requires error and warning messages that are able to refer back
 to specific sections of the input configuration as context for the reported
 problem, and the best way to achieve this in a distributed system doing
 gradual evaluation is to send the configuration *source code* between
 subsystems. This is generally the most compact representation that retains
 source location information, and will avoid any inconsistency caused by
 introducing another intermediate serialization.
 In Terraform's case, for example, the serialized plan incorporates both the
 data structure describing the partial evaluation results from the plan phase
 and the original configuration files that produced those results, which can
 then be re-evaluated during the apply step.
 In a gradual evaluation scenario, the application should verify correctness of
 the input configuration as completely as possible at each stage. To help with
 this, :go:pkg:`cty` has the concept of
 `unknown values <https://github.com/zclconf/go-cty/blob/master/docs/concepts.md#unknown-values-and-the-dynamic-pseudo-type>`_,
 which can stand in for values the application does not yet know while still
 retaining correct type information. HCL expression evaluation reacts to unknown
 values by performing type checking but then returning another unknown value,
 causing the unknowns to propagate through expressions automatically.
 .. code-block:: go
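   // Unknown values stand in for inputs that are not available yet, while
   // still carrying their types.
   ctx := &hcl.EvalContext{
       Variables: map[string]cty.Value{
           "name": cty.UnknownVal(cty.String),
           "age":  cty.UnknownVal(cty.Number),
       },
   }
   // expr is an hcl.Expression obtained from earlier decoding.
   val, moreDiags := expr.Value(ctx)
   diags = append(diags, moreDiags...)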
 Each time an expression is re-evaluated with additional information, fewer of
 the input values will be unknown and thus more of the result will be known.
 Eventually the application should evaluate the expressions with no unknown
 values at all, which then guarantees that the result will also be wholly-known.
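
 The progression from unknown to known can be seen in a small complete
 program. This is only a sketch: the expression ``age + 1`` and the
 ``evalAge`` helper are invented for illustration, and the import paths
 assume the current ``github.com/hashicorp/hcl/v2`` module.

 .. code-block:: go

   package main

   import (
       "fmt"

       "github.com/hashicorp/hcl/v2"
       "github.com/hashicorp/hcl/v2/hclsyntax"
       "github.com/zclconf/go-cty/cty"
   )

   // evalAge evaluates the expression "age + 1" using the given value for
   // the variable "age".
   func evalAge(age cty.Value) (cty.Value, hcl.Diagnostics) {
       expr, diags := hclsyntax.ParseExpression(
           []byte("age + 1"), "example.hcl", hcl.Pos{Line: 1, Column: 1},
       )
       if diags.HasErrors() {
           return cty.NilVal, diags
       }
       ctx := &hcl.EvalContext{
           Variables: map[string]cty.Value{"age": age},
       }
       val, moreDiags := expr.Value(ctx)
       return val, append(diags, moreDiags...)
   }

   func main() {
       // First pass: the input is unknown, so the result is unknown too,
       // but it still has a type.
       early, diags := evalAge(cty.UnknownVal(cty.Number))
       fmt.Println(early.IsKnown(), early.Type().FriendlyName(), diags.HasErrors())

       // Later pass: the input is now known, and so is the result.
       late, diags := evalAge(cty.NumberIntVal(41))
       fmt.Println(late.IsKnown(), late.AsBigFloat().String(), diags.HasErrors())
   }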
 Static References, Calls, Lists, and Maps
 -----------------------------------------
 In most cases, we care more about the final result value of an expression than
 how that value was obtained. A particular list argument, for example, might
 be defined by the user via a tuple constructor, by a ``for`` expression, or by
 assigning the value of a variable that has a suitable list type.
 In some special cases, the structure of the expression is more important than
 the result value, or an expression may not *have* a reasonable result value.
 For example, in Terraform there are a few arguments that call for the user
 to name another object by reference, rather than provide an object value:
 .. code-block:: hcl
   resource "cloud_network" "example" {
     # ...
   }
   resource "cloud_subnet" "example" {
     cidr_block = "10.1.2.0/24"
     depends_on = [
       cloud_network.example,
     ]
   }
 The ``depends_on`` argument in the second ``resource`` block *appears* as an
 expression that would construct a single-element tuple containing an object
 representation of the first resource block. However, Terraform uses this
 expression to construct its dependency graph, and so it needs to see
 specifically that this expression refers to ``cloud_network.example``, rather
 than determine a result value for it.
 HCL offers a number of "static analysis" functions to help with this sort of
 situation. These all live in the :go:pkg:`hcl` package, and each one imposes
 a particular requirement on the syntax tree of the expression it is given,
 and returns a result derived from that if the expression conforms to that
 requirement.
 .. go:currentpackage:: hcl
 .. go:function:: func ExprAsKeyword(expr Expression) string
   This function attempts to interpret the given expression as a single keyword,
   returning that keyword as a string if possible.
   A "keyword" for the purposes of this function is an expression that can be
   understood as a valid single identifier. For example, the simple variable
   reference ``foo`` can be interpreted as a keyword, while ``foo.bar``
   cannot.
   As a special case, the language-level keywords ``true``, ``false``, and
   ``null`` are also considered to be valid keywords, allowing the calling
   application to disregard their usual meaning.
   If the given expression cannot be reduced to a single keyword, the result
   is an empty string. Since an empty string is never a valid keyword, this
   result unambiguously signals failure.
 .. go:function:: func AbsTraversalForExpr(expr Expression) (Traversal, Diagnostics)
   This is a generalization of ``ExprAsKeyword`` that will accept anything that
   can be interpreted as a *traversal*, which is a variable name followed by
   zero or more attribute access or index operators with constant operands.
   For example, all of ``foo``, ``foo.bar`` and ``foo[0]`` are valid
   traversals, but ``foo[bar]`` is not, because the ``bar`` index is not
   constant.
   This is the function that Terraform uses to interpret the items within the
   ``depends_on`` sequence in our example above.
   As with ``ExprAsKeyword``, this function treats the keywords ``true``,
   ``false``, and ``null`` as if they were variable names, allowing
   ``null.foo`` to be interpreted as a traversal even though it would be
   invalid if evaluated.
   If error diagnostics are returned, the traversal result is invalid and
   should not be used.
 .. go:function:: func RelTraversalForExpr(expr Expression) (Traversal, Diagnostics)
   This is very similar to ``AbsTraversalForExpr``, but the result is a
   *relative* traversal, which is one whose first name is considered to be
   an attribute of some other (implied) object.
   The processing rules are identical to ``AbsTraversalForExpr``, with the
   only exception being that the first element of the returned traversal is
   marked as being an attribute, rather than as a root variable.
 .. go:function:: func ExprList(expr Expression) ([]Expression, Diagnostics)
   This function requires that the given expression be a tuple constructor,
   and if so returns a slice of the element expressions in that constructor.
   Applications can then perform further static analysis on these, or evaluate
   them as normal.
   If error diagnostics are returned, the result is invalid and should not be
   used.
   This is the function that Terraform uses to interpret the expression
   assigned to ``depends_on`` in our example above, then in turn using
   ``AbsTraversalForExpr`` on each enclosed expression.
 .. go:function:: func ExprMap(expr Expression) ([]KeyValuePair, Diagnostics)
   This function requires that the given expression be an object constructor,
   and if so returns a slice of the element key/value pairs in that constructor.
   Applications can then perform further static analysis on these, or evaluate
   them as normal.
   If error diagnostics are returned, the result is invalid and should not be
   used.
 .. go:function:: func ExprCall(expr Expression) (*StaticCall, Diagnostics)
   This function requires that the given expression be a function call, and
   if so returns an object describing the name of the called function and
   expression objects representing the call arguments.
   If error diagnostics are returned, the result is invalid and should not be
   used.
 The ``Variables`` method on :go:type:`hcl.Expression` is also considered to be
 a "static analysis" helper, but is built in as a fundamental feature because
 analysis of referenced variables is often important for static validation and
 for implementing interdependent blocks as we saw in the section above.
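
 To tie these helpers together, the following sketch interprets a
 ``depends_on``-style expression without evaluating it, combining
 ``ExprList`` with ``AbsTraversalForExpr`` as described above. The
 ``dependencyNames`` helper is hypothetical, the import paths assume the
 current ``github.com/hashicorp/hcl/v2`` module, and the formatting of each
 traversal is simplified to root and attribute steps only.

 .. code-block:: go

   package main

   import (
       "fmt"

       "github.com/hashicorp/hcl/v2"
       "github.com/hashicorp/hcl/v2/hclsyntax"
   )

   // dependencyNames is a hypothetical helper. It statically interprets src
   // as a list of references and returns each one in dotted form.
   func dependencyNames(src string) ([]string, hcl.Diagnostics) {
       expr, diags := hclsyntax.ParseExpression(
           []byte(src), "example.hcl", hcl.Pos{Line: 1, Column: 1},
       )
       if diags.HasErrors() {
           return nil, diags
       }

       // The expression must be written as a tuple constructor.
       items, moreDiags := hcl.ExprList(expr)
       diags = append(diags, moreDiags...)
       if moreDiags.HasErrors() {
           return nil, diags
       }

       var names []string
       for _, item := range items {
           // Each element must be a traversal such as cloud_network.example.
           trav, travDiags := hcl.AbsTraversalForExpr(item)
           diags = append(diags, travDiags...)
           if travDiags.HasErrors() {
               continue
           }
           name := trav.RootName()
           for _, step := range trav[1:] {
               // Index steps are ignored in this simplified formatting.
               if attr, ok := step.(hcl.TraverseAttr); ok {
                   name += "." + attr.Name
               }
           }
           names = append(names, name)
       }
       return names, diags
   }

   func main() {
       names, diags := dependencyNames(`[cloud_network.example]`)
       if diags.HasErrors() {
           fmt.Println(diags.Error())
           return
       }
       fmt.Println(names)
   }

 Passing an expression that is not a tuple constructor, such as a bare
 ``cloud_network.example``, produces error diagnostics from ``ExprList``
 rather than a result.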