pub struct Utf8Chunk<'a> { /* private fields */ }
A chunk of valid UTF-8, possibly followed by invalid UTF-8 bytes.
This is yielded by the Utf8Chunks iterator, which can be created via the ByteSlice::utf8_chunks method.
The 'a lifetime parameter corresponds to the lifetime of the bytes that are being iterated over.
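As a rough illustration of how chunks are consumed, the sketch below separates a byte slice into its valid text and its invalid byte runs. To stay dependency-free it uses the standard library's `<[u8]>::utf8_chunks` (stable since Rust 1.79) rather than ByteSlice::utf8_chunks, on the assumption that the two behave alike for this input. The helper name `split_chunks` is hypothetical.

```rust
/// Hypothetical helper: collects the non-empty valid and invalid parts of
/// every chunk. Uses std's `<[u8]>::utf8_chunks` as a stand-in (assumption)
/// for `ByteSlice::utf8_chunks`.
fn split_chunks(bytes: &[u8]) -> (Vec<&str>, Vec<&[u8]>) {
    let mut valid = Vec::new();
    let mut invalid = Vec::new();
    for chunk in bytes.utf8_chunks() {
        // `valid()` is empty when one invalid sequence directly follows another.
        if !chunk.valid().is_empty() {
            valid.push(chunk.valid());
        }
        // `invalid()` is empty only for a final chunk that ends in valid text.
        if !chunk.invalid().is_empty() {
            invalid.push(chunk.invalid());
        }
    }
    (valid, invalid)
}
```

For the input `b"foo\xFD\xFEbar\xFF"`, the chunk holding `\xFE` has an empty valid part, because it directly follows another invalid byte.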
Implementations
impl<'a> Utf8Chunk<'a>
pub fn valid(&self) -> &'a str
Returns the (possibly empty) valid UTF-8 bytes in this chunk.
This may be empty if there are consecutive sequences of invalid UTF-8 bytes.
pub fn invalid(&self) -> &'a [u8]
Returns the (possibly empty) invalid UTF-8 bytes in this chunk that immediately follow the valid UTF-8 bytes in this chunk.
This is only empty when this chunk corresponds to the last chunk in the original bytes.
The maximum length of this slice is 3. That is, an invalid UTF-8 byte sequence longer than 1 byte is always a valid prefix of a valid UTF-8 encoded codepoint. This corresponds to the “substitution of maximal subparts” strategy that is described in more detail in the docs for the ByteSlice::to_str_lossy method.
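The length bound above is what makes a simple lossy conversion possible: each invalid sequence of 1 to 3 bytes is replaced by a single U+FFFD. The following is a minimal sketch of that idea, not the implementation of ByteSlice::to_str_lossy. It assumes the standard library's `<[u8]>::utf8_chunks` (Rust 1.79+) splits input the same way, and the function name `to_lossy` is hypothetical.

```rust
/// Hypothetical helper: replaces each invalid sequence with one U+FFFD.
/// Built on std's `<[u8]>::utf8_chunks` (assumed to match the chunking
/// described here), not on bstr itself.
fn to_lossy(bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len());
    for chunk in bytes.utf8_chunks() {
        // The valid part is copied through unchanged.
        out.push_str(chunk.valid());
        if !chunk.invalid().is_empty() {
            // An invalid sequence is at most 3 bytes long.
            debug_assert!(chunk.invalid().len() <= 3);
            out.push('\u{FFFD}');
        }
    }
    out
}
```

With this sketch, the two-byte invalid sequence in `b"foo\xF1\x80bar"` is replaced by a single replacement character, since `\xF1\x80` is a prefix of a valid four-byte encoding.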
pub fn incomplete(&self) -> bool
Returns whether the invalid sequence might still become valid if more bytes are added.
Returns true if the end of the input was reached unexpectedly, without encountering an unexpected byte.
This can only be the case for the last chunk.
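The same distinction between "truncated" and "definitely invalid" can be sketched with the standard library alone, which may help when deciding whether to wait for more input in a streaming decoder. The example does not call incomplete itself. Instead it relies on `Utf8Error::error_len`, which returns `None` when the input ends in the middle of a sequence. The function name `ends_incomplete` is hypothetical.

```rust
/// Hypothetical helper: returns true if `bytes` ends with a truncated
/// sequence that could still become valid once more bytes arrive.
/// This is a std-only analogue (assumption), not bstr's `incomplete`.
fn ends_incomplete(bytes: &[u8]) -> bool {
    let mut rest = bytes;
    loop {
        match std::str::from_utf8(rest) {
            // Everything that remains is valid, so nothing is pending.
            Ok(_) => return false,
            Err(e) => match e.error_len() {
                // `None`: the end of the input was reached unexpectedly.
                None => return true,
                // `Some(n)`: an unexpected byte was found; skip the
                // invalid sequence and keep scanning.
                Some(n) => rest = &rest[e.valid_up_to() + n..],
            },
        }
    }
}
```

A caller reading from a socket or file could buffer the trailing bytes when this returns true and retry once the next read completes.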