These allow the inclusion of arbitrary unicode codepoints (always encoded
as UTF-8) using a hex representation.
\u expects four digits and can thus represent only characters in the basic
multilingual plane.
\U expects eight digits and can thus represent all unicode characters,
at the cost of being extra-verbose.
Since our parser properly accounts for unicode characters (including
combining sequences) it's recommended to include them literally (UTF-8
encoded) in source code, but these sequences are useful for explicitly
representing non-printable characters that could otherwise appear
invisible in source code, such as zero-width modifier characters.
This fixes#6.