
Operations related to UTF-8 validation.

Constants

CONT_MASK 🔒
Mask of the value bits of a continuation byte.
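As a rough illustration of what "value bits" means here: a continuation byte has the layout `10xxxxxx`, so its payload is the low six bits. The sketch below re-declares the mask locally, because the real constant is private. The value `0b0011_1111` is inferred from that bit layout, and `cont_payload` is a hypothetical helper name, not part of this module.

```rust
/// Local stand-in for the private constant. The value is an assumption
/// derived from the `10xxxxxx` layout of a continuation byte.
const CONT_MASK: u8 = 0b0011_1111;

/// Hypothetical helper: keeps only the six payload bits of a continuation byte.
fn cont_payload(byte: u8) -> u8 {
    byte & CONT_MASK
}
```

For example, the second byte of "é" (`0xA9`, i.e. `1010_1001`) would yield the payload `0010_1001`.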

Functions

Returns true if any byte in the word x is non-ASCII (>= 128).
next_code_point Experimental
Reads the next code point out of a byte iterator (assuming a UTF-8-like encoding).
Reads the last code point out of a byte iterator (assuming a UTF-8-like encoding).
Walks through v, checking that it is a valid UTF-8 sequence. Returns Ok(()) if it is valid, or Err(err) if it is not.
Returns the value of ch updated with continuation byte byte.
utf8_char_width Experimental
Given a first byte, determines how many bytes are in this UTF-8 character.
Returns the initial code point accumulator for the first byte. The first byte is special: only the bottom 5 bits are wanted for width 2, 4 bits for width 3, and 3 bits for width 4.
Checks whether the byte is a UTF-8 continuation byte (i.e., starts with the bits 10).
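The descriptions above fit together into a small decoding routine. The following is a minimal sketch of that idea and does not reproduce this module's source. Every function name in it (`char_width`, `first_byte_acc`, `acc_cont_byte`, `is_cont_byte`, `word_contains_nonascii`, `decode_first`) is a hypothetical local name, and the width ranges in `char_width` are an assumption based on the standard UTF-8 lead-byte layout. The sketch does not reject overlong encodings or surrogates, so it is not a validator.

```rust
/// Assumed mask of the payload bits of a continuation byte (`10xxxxxx`).
const CONT_MASK: u8 = 0b0011_1111;

/// Width in bytes implied by a lead byte, or 0 if it cannot start a sequence.
fn char_width(first: u8) -> usize {
    match first {
        0x00..=0x7F => 1,
        0xC2..=0xDF => 2,
        0xE0..=0xEF => 3,
        0xF0..=0xF4 => 4,
        _ => 0,
    }
}

/// Initial accumulator: bottom 5 bits for width 2, 4 for width 3, 3 for width 4.
fn first_byte_acc(byte: u8, width: u32) -> u32 {
    (byte & (0x7F_u8 >> width)) as u32
}

/// Shifts the accumulator left by six bits and appends the continuation payload.
fn acc_cont_byte(ch: u32, byte: u8) -> u32 {
    (ch << 6) | (byte & CONT_MASK) as u32
}

/// True if the byte starts with the bits `10`.
fn is_cont_byte(byte: u8) -> bool {
    (byte & 0b1100_0000) == 0b1000_0000
}

/// True if any byte of the machine word has its high bit set (value >= 128).
fn word_contains_nonascii(x: usize) -> bool {
    const HIGH_BITS: usize = usize::from_ne_bytes([0x80; std::mem::size_of::<usize>()]);
    (x & HIGH_BITS) != 0
}

/// Decodes the first code point of `bytes`, or returns None if the lead byte
/// is invalid, the input is too short, or a continuation byte is malformed.
fn decode_first(bytes: &[u8]) -> Option<u32> {
    let first = *bytes.first()?;
    let width = char_width(first);
    if width == 0 || bytes.len() < width {
        return None;
    }
    if width == 1 {
        return Some(first as u32);
    }
    let mut ch = first_byte_acc(first, width as u32);
    for &b in &bytes[1..width] {
        if !is_cont_byte(b) {
            return None;
        }
        ch = acc_cont_byte(ch, b);
    }
    Some(ch)
}
```

With these definitions, `decode_first("€".as_bytes())` would yield `Some(0x20AC)`. For real validation, the public `std::str::from_utf8` is the supported entry point.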