unicode_normalization

Trait UnicodeNormalization

pub trait UnicodeNormalization<I: Iterator<Item = char>> {
    // Required methods
    fn nfd(self) -> Decompositions<I>;
    fn nfkd(self) -> Decompositions<I>;
    fn nfc(self) -> Recompositions<I>;
    fn nfkc(self) -> Recompositions<I>;
    fn cjk_compat_variants(self) -> Replacements<I>;
    fn stream_safe(self) -> StreamSafe<I>;
}

Methods for iterating over strings while applying Unicode normalizations as described in Unicode Standard Annex #15.

Required Methods

fn nfd(self) -> Decompositions<I>

Returns an iterator over the string in Unicode Normalization Form D (canonical decomposition).

fn nfkd(self) -> Decompositions<I>

Returns an iterator over the string in Unicode Normalization Form KD (compatibility decomposition).

fn nfc(self) -> Recompositions<I>

Returns an iterator over the string in Unicode Normalization Form C (canonical decomposition followed by canonical composition).

fn nfkc(self) -> Recompositions<I>

Returns an iterator over the string in Unicode Normalization Form KC (compatibility decomposition followed by canonical composition).

fn cjk_compat_variants(self) -> Replacements<I>

A transformation that replaces CJK Compatibility Ideograph code points with their Standardized Variation Sequences, that is, the corresponding unified ideograph followed by a variation selector. This is not part of the canonical or compatibility decomposition algorithms, but performing it before those algorithms produces normalized output that better preserves the intent of the original text.

Note that many systems today ignore variation selectors, so these may not immediately help text display as intended, but they at least preserve the information in a standardized form, giving implementations the option to recognize them.

fn stream_safe(self) -> StreamSafe<I>

Returns an iterator over the string with Combining Grapheme Joiner (U+034F) characters inserted according to the Stream-Safe Text Process (UAX15-D4).

Implementations on Foreign Types

impl<'a> UnicodeNormalization<Chars<'a>> for &'a str

Implementors

impl<I: Iterator<Item = char>> UnicodeNormalization<I> for I