

If the value represented by a single hexadecimal escape sequence does not fit the range of values represented by the character type used in this string literal (char, char8_t (since C++20), char16_t, char32_t (since C++11), or wchar_t), the result is unspecified.

A universal character name in a narrow string literal or a 16-bit string literal may map to more than one code unit.

Hexadecimal escape sequences have no length limit and terminate at the first character that is not a valid hexadecimal digit. Octal escape sequences have a limit of three octal digits, but terminate at the first character that is not a valid octal digit if encountered sooner.

The new-line character \n has special meaning when used in text mode I/O: it is converted to the OS-specific newline representation, usually a byte or byte sequence. Some systems mark their lines with length fields instead.

\0 is the most commonly used octal escape sequence, because it represents the terminating null character in null-terminated strings.

None of these names or aliases have leading or trailing spaces.
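The termination rules for octal and hexadecimal escape sequences are easy to misread, so a small sketch may help. It assumes a C++11 or later compiler; `literal_length` and `embedded_null_strlen` are hypothetical helpers introduced only for this illustration.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: number of characters in a narrow string literal,
// not counting the terminating null that the compiler appends.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) {
    return N - 1;
}

// Octal escapes consume at most three digits: "\1234" is '\123' then '4'.
constexpr std::size_t octal_three_digit_limit = literal_length("\1234");  // 2

// Octal escapes stop sooner at a non-octal digit: "\18" is '\1' then '8'.
constexpr std::size_t octal_early_stop = literal_length("\18");  // 2

// Hexadecimal escapes stop at the first non-hex character: '\x41' then 'g'.
constexpr std::size_t hex_stops_at_non_digit = literal_length("\x41g");  // 2

// Hexadecimal escapes have no length limit, so "\x41B" would be read as one
// escape sequence. Splitting the literal ends the escape before 'B'.
constexpr std::size_t hex_split_literal = literal_length("\x41" "B");  // 2

// An embedded \0 stays in the array, but C string functions stop at it.
constexpr std::size_t embedded_null_array_length = literal_length("ab\0cd");  // 5

inline std::size_t embedded_null_strlen() {
    return std::strlen("ab\0cd");  // 2
}
```

Splitting the literal, as in `"\x41" "B"`, is one common way to follow a hexadecimal escape with a character that happens to be a hexadecimal digit.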

These aliases are listed in the Unicode Character Database’s NameAliases.txt.

It designates the corresponding character in the Unicode Standard (chapter 4.8 Name) if the n-char-sequence is equal to its character name or to one of its character name aliases of type “control”, “correction”, or “alternate”; otherwise, the program is ill-formed.

If a universal character name does not correspond to a scalar value of a character in the translation character set, the program is ill-formed.

Each element of an n-char-sequence is a character from the translation character set, except the right curly bracket } or new-line character. A universal character name of the form \N{n-char-sequence} is a named universal character.

If a universal character name corresponding to a scalar value of a character in the basic character set or a control character appears outside a character or string literal, the program is ill-formed.

If a universal character name does not correspond to a code point in ISO/IEC 10646 (the range 0x0-0x10FFFF, inclusive) or corresponds to a surrogate code point (the range 0xD800-0xDFFF, inclusive), the program is ill-formed.

If a universal character name used in a UTF-16/32 string literal does not correspond to a code point in ISO/IEC 10646 (the range 0x0-0x10FFFF, inclusive), the program is ill-formed. If a universal character name corresponds to a surrogate code point (the range 0xD800-0xDFFF, inclusive), the program is ill-formed. If a universal character name corresponding to a code point of a member of the basic source character set or a control character appears outside a character or string literal, the program is ill-formed.
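The code point range rule above can be sketched as a predicate. This is only an illustration of the two numeric ranges quoted in the text, assuming C++11 or later; `is_valid_ucn_code_point` is a hypothetical name, and the function does not model the additional restrictions on basic character set members or control characters.

```cpp
#include <cstdint>

// Hypothetical helper: true if the value lies in the ISO/IEC 10646 range
// 0x0-0x10FFFF and is not a surrogate code point (0xD800-0xDFFF).
constexpr bool is_valid_ucn_code_point(std::uint32_t code_point) {
    return code_point <= 0x10FFFF &&
           !(code_point >= 0xD800 && code_point <= 0xDFFF);
}

// A universal character name that the compiler accepts designates a value
// that passes the same check.
static_assert(is_valid_ucn_code_point(U'\u00E9'), "U+00E9 is in range");
static_assert(!is_valid_ucn_code_point(0xD800), "surrogates are rejected");
static_assert(!is_valid_ucn_code_point(0x110000), "beyond U+10FFFF");
```

A compiler enforces these rules itself when it sees the escape sequence; the predicate is useful only as a way to read the ranges.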

Escape sequences are used to represent certain special characters within string literals and character literals.

The following escape sequences are available:

\o{n...}      byte n... (arbitrary number of octal digits)
\xn...        byte n... (arbitrary number of hexadecimal digits)
\unnnn        code point U+nnnn (4 hexadecimal digits)
\u{n...}      code point U+n... (arbitrary number of hexadecimal digits)
\Unnnnnnnn    code point U+nnnnnnnn (8 hexadecimal digits)

Conditional escape sequences are conditionally-supported. The character c in each conditional escape sequence is a member of the basic source character set (until C++23) or the basic character set (since C++23) that is not the character following the \ in any other escape sequence.

If a universal character name corresponds to a code point that is not 0x24 ($), 0x40 (@), or 0x60 (`) and is less than 0xA0, the program is ill-formed. In other words, members of the basic source character set and control characters (in ranges 0x0-0x1F and 0x7F-0x9F) cannot be expressed in universal character names.
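As a rough illustration of how the \u and \U forms designate code points, the sketch below counts the code units each literal occupies in UTF-8, UTF-16, and UTF-32 string literals. It assumes C++11 or later; `code_units` is a hypothetical helper, and U+00E9 and U+1F34C are example code points chosen for this sketch.

```cpp
#include <cstddef>

// Hypothetical helper: number of code units in a string literal of any
// character type, not counting the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) {
    return N - 1;
}

// \unnnn takes exactly 4 hexadecimal digits: U+00E9.
constexpr std::size_t e_acute_utf8  = code_units(u8"\u00E9");  // 2
constexpr std::size_t e_acute_utf16 = code_units(u"\u00E9");   // 1
constexpr std::size_t e_acute_utf32 = code_units(U"\u00E9");   // 1

// \Unnnnnnnn takes exactly 8 hexadecimal digits: U+1F34C.
constexpr std::size_t banana_utf8  = code_units(u8"\U0001F34C");  // 4
constexpr std::size_t banana_utf16 = code_units(u"\U0001F34C");   // 2
constexpr std::size_t banana_utf32 = code_units(U"\U0001F34C");   // 1

// In a char32_t character literal the value is the code point itself.
constexpr char32_t e_acute_value = U'\u00E9';  // 0xE9
```

The helper is templated on the character type because `u8` literals use `char` before C++20 and `char8_t` from C++20 on.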
