Use Unicode 12.0.0 grapheme cluster segmentation rules

HCL uses grapheme cluster segmentation to produce accurate "column"
indications in diagnostic messages and other human-oriented source
location information. Each new major version of Unicode introduces new
codepoints, some of which are defined to combine with other codepoints to
produce a single visible character (grapheme cluster).

We were previously using the rules from Unicode 9.0.0. This change
switches to using the segmentation rules from Unicode 12.0.0, which is
the latest version at the time of this commit and is also the version of
Unicode used for other purposes by the Go 1.14 runtime.

HCL does not use text segmentation results for any purpose that would
affect the meaning of decoded data extracted from HCL files, so this
change will only affect the human-oriented source positions generated for
files containing characters that were newly-introduced in Unicode 10, 11,
or 12. (Machine-oriented uses of source location information are based on
byte offsets and not affected by text segmentation.)
This commit is contained in:
Martin Atkins 2020-03-06 17:17:12 -08:00
parent 2cf95f2b60
commit fee90926da
8 changed files with 11 additions and 7 deletions

4
go.mod
View File

@ -1,9 +1,11 @@
module github.com/hashicorp/hcl/v2
go 1.12
require (
github.com/agext/levenshtein v1.2.1
github.com/apparentlymart/go-dump v0.0.0-20180507223929-23540a00eaa3
github.com/apparentlymart/go-textseg v1.0.0
github.com/apparentlymart/go-textseg/v12 v12.0.0
github.com/davecgh/go-spew v1.1.1
github.com/go-test/deep v1.0.3
github.com/google/go-cmp v0.3.1

2
go.sum
View File

@ -4,6 +4,8 @@ github.com/apparentlymart/go-dump v0.0.0-20180507223929-23540a00eaa3 h1:ZSTrOEhi
github.com/apparentlymart/go-dump v0.0.0-20180507223929-23540a00eaa3/go.mod h1:oL81AME2rN47vu18xqj1S1jPIPuN7afo62yKTNn3XMM=
github.com/apparentlymart/go-textseg v1.0.0 h1:rRmlIsPEEhUTIKQb7T++Nz/A5Q6C9IuX2wFoYVvnCs0=
github.com/apparentlymart/go-textseg v1.0.0/go.mod h1:z96Txxhf3xSFMPmb5X/1W05FF/Nj9VFpLOpjS5yuumk=
github.com/apparentlymart/go-textseg/v12 v12.0.0 h1:bNEQyAGak9tojivJNkoqWErVCQbjdL7GzRt3F8NvfJ0=
github.com/apparentlymart/go-textseg/v12 v12.0.0/go.mod h1:S/4uRK2UtaQttw1GenVJEynmyUenKwP++x/+DdGV/Ec=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/go-test/deep v1.0.3 h1:ZrJSEWsXzPOxaZnFteGEfooLba+ju3FYIbOrS+rQd68=

View File

@ -6,7 +6,7 @@ import (
"strconv"
"unicode/utf8"
"github.com/apparentlymart/go-textseg/textseg"
"github.com/apparentlymart/go-textseg/v12/textseg"
"github.com/hashicorp/hcl/v2"
"github.com/zclconf/go-cty/cty"
)

View File

@ -5,7 +5,7 @@ import (
"strings"
"unicode"
"github.com/apparentlymart/go-textseg/textseg"
"github.com/apparentlymart/go-textseg/v12/textseg"
"github.com/hashicorp/hcl/v2"
"github.com/zclconf/go-cty/cty"
)

View File

@ -4,7 +4,7 @@ import (
"bytes"
"fmt"
"github.com/apparentlymart/go-textseg/textseg"
"github.com/apparentlymart/go-textseg/v12/textseg"
"github.com/hashicorp/hcl/v2"
)

View File

@ -4,7 +4,7 @@ import (
"bytes"
"io"
"github.com/apparentlymart/go-textseg/textseg"
"github.com/apparentlymart/go-textseg/v12/textseg"
"github.com/hashicorp/hcl/v2"
"github.com/hashicorp/hcl/v2/hclsyntax"
)

View File

@ -3,7 +3,7 @@ package json
import (
"fmt"
"github.com/apparentlymart/go-textseg/textseg"
"github.com/apparentlymart/go-textseg/v12/textseg"
"github.com/hashicorp/hcl/v2"
)

View File

@ -4,7 +4,7 @@ import (
"bufio"
"bytes"
"github.com/apparentlymart/go-textseg/textseg"
"github.com/apparentlymart/go-textseg/v12/textseg"
)
// RangeScanner is a helper that will scan over a buffer using a bufio.SplitFunc