class UnicodeString

Defined at line 295 of file ../../third_party/icu/default/source/common/unicode/unistr.h

C++ API: Replaceable String

Public Methods

template <typename S, typename = std::enable_if_t<ConvertibleToU16StringView<S>>>
bool operator== (const S & text)

Equality operator. Performs only bitwise comparison with `text`

which is, or which is implicitly convertible to,

a std::u16string_view or (if U_SIZEOF_WCHAR_T==2) std::wstring_view.

For performance, you can use UTF-16 string literals with compile-time

length determination:

Parameters

text The string view to compare to this string.

Returns

true if `text` contains the same characters as this one, false otherwise.

ICU 76

Code

                                        
    UnicodeString str = ...;
    if (str == u"literal") { ... }

Defined at line 347 of file ../../third_party/icu/default/source/common/unicode/unistr.h

template <typename S, typename = std::enable_if_t<ConvertibleToU16StringView<S>>>
bool operator!= (const S & text)

Inequality operator. Performs only bitwise comparison with `text`

which is, or which is implicitly convertible to,

a std::u16string_view or (if U_SIZEOF_WCHAR_T==2) std::wstring_view.

For performance, you can use std::u16string_view literals with compile-time

length determination:

Parameters

text The string view to compare to this string.

Returns

false if `text` contains the same characters as this one, true otherwise.

ICU 76

Code

                                        
    #include <string_view>
    using namespace std::string_view_literals;
    UnicodeString str = ...;
    if (str != u"literal"sv) { ... }

Defined at line 382 of file ../../third_party/icu/default/source/common/unicode/unistr.h

template <typename StringClass>
StringClass & toUTF8String (StringClass & result)

Convert the UnicodeString to UTF-8 and append the result

to a standard string.

Unpaired surrogates are replaced with U+FFFD.

Calls toUTF8().

Parameters

result A standard string (or a compatible object) to which the UTF-8 version of the string is appended.

Returns

The string object.

ICU 4.2
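
The append-and-replace behavior described above can be illustrated with a standalone sketch. It uses only the C++ standard library and is not ICU's implementation; the names `appendCodePoint` and `appendAsUtf8` are hypothetical. It assumes the input is a UTF-16 view, appends to the existing contents of `result` rather than overwriting them, and writes U+FFFD for any unpaired surrogate.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: append one code point to `out` as UTF-8.
inline void appendCodePoint(std::string& out, char32_t c) {
    if (c < 0x80) {
        out += static_cast<char>(c);
    } else if (c < 0x800) {
        out += static_cast<char>(0xC0 | (c >> 6));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else if (c < 0x10000) {
        out += static_cast<char>(0xE0 | (c >> 12));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (c >> 18));
        out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    }
}

// Hypothetical stand-in for the documented behavior: convert UTF-16 to UTF-8,
// append it to `result`, and return `result`.
inline std::string& appendAsUtf8(std::u16string_view src, std::string& result) {
    for (std::size_t i = 0; i < src.size(); ++i) {
        char32_t c = src[i];
        const bool isLead = c >= 0xD800 && c <= 0xDBFF;
        if (isLead && i + 1 < src.size() &&
            src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF) {
            // Well-formed surrogate pair: combine into one supplementary code point.
            c = 0x10000 + ((c - 0xD800) << 10) + (src[i + 1] - 0xDC00);
            ++i;
        } else if (c >= 0xD800 && c <= 0xDFFF) {
            // Unpaired surrogate: substitute U+FFFD.
            c = 0xFFFD;
        }
        appendCodePoint(result, c);
    }
    return result;
}
```

Because the function appends, a caller that wants only the converted text should pass an empty string.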

Defined at line 1777 of file ../../third_party/icu/default/source/common/unicode/unistr.h

template <typename S, typename = std::enable_if_t<ConvertibleToU16StringView<S>>>
UnicodeString & operator= (const S & src)

Assignment operator. Replaces the characters in this UnicodeString

with a copy of the characters from the `src`

which is, or which is implicitly convertible to,

a std::u16string_view or (if U_SIZEOF_WCHAR_T==2) std::wstring_view.

Parameters

src The string view containing the characters to copy.

Returns

a reference to this

ICU 76

Defined at line 1960 of file ../../third_party/icu/default/source/common/unicode/unistr.h

template <typename S, typename = std::enable_if_t<ConvertibleToU16StringView<S>>>
UnicodeString & operator+= (const S & src)

Append operator. Appends the characters in `src`

which is, or which is implicitly convertible to,

a std::u16string_view or (if U_SIZEOF_WCHAR_T==2) std::wstring_view,

to the UnicodeString object.

Parameters

src the source for the new characters

Returns

a reference to this

ICU 76

Defined at line 2227 of file ../../third_party/icu/default/source/common/unicode/unistr.h

template <typename S, typename = std::enable_if_t<ConvertibleToU16StringView<S>>>
UnicodeString & append (const S & src)

Appends the characters in `src`

which is, or which is implicitly convertible to,

a std::u16string_view or (if U_SIZEOF_WCHAR_T==2) std::wstring_view,

to the UnicodeString object.

Parameters

src the source for the new characters

Returns

a reference to this

ICU 76

Defined at line 2300 of file ../../third_party/icu/default/source/common/unicode/unistr.h

std::u16string_view operator basic_string_view ()

Converts to a std::u16string_view.

Returns

a string view of the contents of this string

ICU 76

Defined at line 3035 of file ../../third_party/icu/default/source/common/unicode/unistr.h

void UnicodeString (const uint16_t * text)

uint16_t * constructor.

Delegates to UnicodeString(const char16_t *).

It is recommended to mark this constructor "explicit" by

`-DUNISTR_FROM_STRING_EXPLICIT=explicit`

on the compiler command line or similar.

Note, for string literals:

Since C++17 and ICU 76, you can use UTF-16 string literals with compile-time

length determination:

Parameters

text NUL-terminated UTF-16 string

ICU 59

Code

                                        
    UnicodeString str(u"literal");
    if (str == u"other literal") { ... }

Defined at line 3148 of file ../../third_party/icu/default/source/common/unicode/unistr.h

void UnicodeString (const uint16_t * text, int32_t textLength)

uint16_t * constructor.

Delegates to UnicodeString(const char16_t *, int32_t).

Note, for string literals:

Since C++17 and ICU 76, you can use UTF-16 string literals with compile-time

length determination:

Parameters

text UTF-16 string
textLength string length

ICU 59

Code

                                        
    UnicodeString str(u"literal");
    if (str == u"other literal") { ... }

Defined at line 3225 of file ../../third_party/icu/default/source/common/unicode/unistr.h

template <typename S, typename = std::enable_if_t<ConvertibleToU16StringView<S>>>
void UnicodeString (const S & text)

Constructor from `text`

which is, or which is implicitly convertible to,

a std::u16string_view or (if U_SIZEOF_WCHAR_T==2) std::wstring_view.

The string is bogus if the string view is too long.

If you need a UnicodeString but need not copy the string view contents,

then you can call the UnicodeString::readOnlyAlias() function instead of this constructor.

Parameters

text UTF-16 string

ICU 76

Defined at line 3274 of file ../../third_party/icu/default/source/common/unicode/unistr.h

void UnicodeString (uint16_t * buffer, int32_t buffLength, int32_t buffCapacity)

Writable-aliasing uint16_t * constructor.

Delegates to UnicodeString(const char16_t *, int32_t, int32_t).

Parameters

buffer writable buffer of/for UTF-16 text
buffLength length of the current buffer contents
buffCapacity buffer capacity

ICU 59

Defined at line 3343 of file ../../third_party/icu/default/source/common/unicode/unistr.h

template <typename S, typename = std::enable_if_t<ConvertibleToU16StringView<S>>>
UnicodeString readOnlyAlias (const S & text)

Readonly-aliasing factory method.

Aliases the same buffer as the input `text`

which is, or which is implicitly convertible to,

a std::u16string_view or (if U_SIZEOF_WCHAR_T==2) std::wstring_view.

The string is bogus if the string view is too long.

The text will be used for the UnicodeString object, but

it will not be released when the UnicodeString is destroyed.

This has copy-on-write semantics:

When the string is modified, then the buffer is first copied into

newly allocated memory.

The aliased buffer is never modified.

In an assignment to another UnicodeString, when using the copy constructor

or the assignment operator, the text will be copied.

When using fastCopyFrom(), the text will be aliased again,

so that both strings then alias the same readonly-text.

Parameters

text The string view to alias for the UnicodeString.

ICU 76

Defined at line 3600 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString readOnlyAlias (const UnicodeString & text)

Readonly-aliasing factory method.

Aliases the same buffer as the input `text`.

The text will be used for the UnicodeString object, but

it will not be released when the UnicodeString is destroyed.

This has copy-on-write semantics:

When the string is modified, then the buffer is first copied into

newly allocated memory.

The aliased buffer is never modified.

In an assignment to another UnicodeString, when using the copy constructor

or the assignment operator, the text will be copied.

When using fastCopyFrom(), the text will be aliased again,

so that both strings then alias the same readonly-text.

Parameters

text The UnicodeString to alias.

ICU 76

Defined at line 3623 of file ../../third_party/icu/default/source/common/unicode/unistr.h

bool operator== (const UnicodeString & text)

Equality operator. Performs only bitwise comparison.

Parameters

text The UnicodeString to compare to this one.

Returns

true if `text` contains the same characters as this one,

false otherwise.

ICU 2.0

Defined at line 4288 of file ../../third_party/icu/default/source/common/unicode/unistr.h

bool operator!= (const UnicodeString & text)

Inequality operator. Performs only bitwise comparison.

Parameters

text The UnicodeString to compare to this one.

Returns

false if `text` contains the same characters as this one,

true otherwise.

ICU 2.0

Defined at line 4299 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UBool operator> (const UnicodeString & text)

Greater than operator. Performs only bitwise comparison.

Parameters

text The UnicodeString to compare to this one.

Returns

true if the characters in this are bitwise

greater than the characters in `text`, false otherwise

ICU 2.0

Defined at line 4303 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UBool operator< (const UnicodeString & text)

Less than operator. Performs only bitwise comparison.

Parameters

text The UnicodeString to compare to this one.

Returns

true if the characters in this are bitwise

less than the characters in `text`, false otherwise

ICU 2.0

Defined at line 4307 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UBool operator>= (const UnicodeString & text)

Greater than or equal operator. Performs only bitwise comparison.

Parameters

text The UnicodeString to compare to this one.

Returns

true if the characters in this are bitwise

greater than or equal to the characters in `text`, false otherwise

ICU 2.0

Defined at line 4311 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UBool operator<= (const UnicodeString & text)

Less than or equal operator. Performs only bitwise comparison.

Parameters

text The UnicodeString to compare to this one.

Returns

true if the characters in this are bitwise

less than or equal to the characters in `text`, false otherwise

ICU 2.0

Defined at line 4315 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t compare (const UnicodeString & text)

Compare the characters bitwise in this UnicodeString to

the characters in `text`.

Parameters

text The UnicodeString to compare to this one.

Returns

The result of bitwise character comparison: 0 if this

contains the same characters as `text`, -1 if the characters in

this are bitwise less than the characters in `text`, +1 if the

characters in this are bitwise greater than the characters

in `text`.

ICU 2.0

Defined at line 4319 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t compare (int32_t start, int32_t length, const UnicodeString & text)

Compare the characters bitwise in the range

[`start`, `start + length`) with the characters

in the **entire string** `text`.

(The parameters "start" and "length" are not applied to the other text "text".)

Parameters

start the offset at which the compare operation begins
length the number of characters of text to compare.
text the other text to be compared against this string.

Returns

The result of bitwise character comparison: 0 if this

contains the same characters as `text`, -1 if the characters in

this are bitwise less than the characters in `text`, +1 if the

characters in this are bitwise greater than the characters

in `text`.

ICU 2.0

Defined at line 4323 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t compare (int32_t start, int32_t length, const UnicodeString & srcText, int32_t srcStart, int32_t srcLength)

Compare the characters bitwise in the range

[`start`, `start + length`) with the characters

in `srcText` in the range

[`srcStart`, `srcStart + srcLength`).

Parameters

start the offset at which the compare operation begins
length the number of characters in this to compare.
srcText the text to be compared
srcStart the offset into `srcText` to start comparison
srcLength the number of characters in `srcText` to compare

Returns

The result of bitwise character comparison: 0 if this

contains the same characters as `srcText`, -1 if the characters in

this are bitwise less than the characters in `srcText`, +1 if the

characters in this are bitwise greater than the characters

in `srcText`.

ICU 2.0
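
The half-open range semantics of the ranged `compare` overloads can be modeled in a few lines. This is a standalone sketch over `std::u16string_view`, not ICU code; `slice` and `compareRanges` are hypothetical names, and clamping out-of-range offsets to the string bounds is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical helper: the code units in [start, start + length),
// with both values clamped to the bounds of `s` (an assumption of this sketch).
inline std::u16string_view slice(std::u16string_view s,
                                 std::int32_t start, std::int32_t length) {
    const std::int32_t size = static_cast<std::int32_t>(s.size());
    start = std::clamp(start, std::int32_t{0}, size);
    length = std::clamp(length, std::int32_t{0}, size - start);
    return s.substr(static_cast<std::size_t>(start),
                    static_cast<std::size_t>(length));
}

// Bitwise (code unit) comparison of two ranges, normalized to -1, 0, or +1.
inline std::int8_t compareRanges(std::u16string_view self,
                                 std::int32_t start, std::int32_t length,
                                 std::u16string_view src,
                                 std::int32_t srcStart, std::int32_t srcLength) {
    const int r = slice(self, start, length).compare(slice(src, srcStart, srcLength));
    return static_cast<std::int8_t>((r > 0) - (r < 0));
}
```

For example, `compareRanges(u"hello world", 6, 5, u"world", 0, 5)` compares the five code units starting at offset 6 with the whole of the second string.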

Defined at line 4334 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t compare (ConstChar16Ptr srcChars, int32_t srcLength)

Compare the characters bitwise in this UnicodeString with the first

`srcLength` characters in `srcChars`.

Parameters

srcChars The characters to compare to this UnicodeString.
srcLength the number of characters in `srcChars` to compare

Returns

The result of bitwise character comparison: 0 if this

contains the same characters as `srcChars`, -1 if the characters in

this are bitwise less than the characters in `srcChars`, +1 if the

characters in this are bitwise greater than the characters

in `srcChars`.

ICU 2.0

Defined at line 4329 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t compare (int32_t start, int32_t length, const char16_t * srcChars)

Compare the characters bitwise in the range

[`start`, `start + length`) with the first

`length` characters in `srcChars`

Parameters

start the offset at which the compare operation begins
length the number of characters to compare.
srcChars the characters to be compared

Returns

The result of bitwise character comparison: 0 if this

contains the same characters as `srcChars`, -1 if the characters in

this are bitwise less than the characters in `srcChars`, +1 if the

characters in this are bitwise greater than the characters

in `srcChars`.

ICU 2.0

Defined at line 4342 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t compare (int32_t start, int32_t length, const char16_t * srcChars, int32_t srcStart, int32_t srcLength)

Compare the characters bitwise in the range

[`start`, `start + length`) with the characters

in `srcChars` in the range

[`srcStart`, `srcStart + srcLength`).

Parameters

start the offset at which the compare operation begins
length the number of characters in this to compare
srcChars the characters to be compared
srcStart the offset into `srcChars` to start comparison
srcLength the number of characters in `srcChars` to compare

Returns

The result of bitwise character comparison: 0 if this

contains the same characters as `srcChars`, -1 if the characters in

this are bitwise less than the characters in `srcChars`, +1 if the

characters in this are bitwise greater than the characters

in `srcChars`.

ICU 2.0

Defined at line 4348 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t compareBetween (int32_t start, int32_t limit, const UnicodeString & srcText, int32_t srcStart, int32_t srcLimit)

Compare the characters bitwise in the range

[`start`, `limit`) with the characters

in `srcText` in the range

[`srcStart`, `srcLimit`).

Parameters

start the offset at which the compare operation begins
limit the offset immediately following the compare operation
srcText the text to be compared
srcStart the offset into `srcText` to start comparison
srcLimit the offset into `srcText` to limit comparison

Returns

The result of bitwise character comparison: 0 if this

contains the same characters as `srcText`, -1 if the characters in

this are bitwise less than the characters in `srcText`, +1 if the

characters in this are bitwise greater than the characters

in `srcText`.

ICU 2.0

Defined at line 4356 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t compareCodePointOrder (const UnicodeString & text)

Compare two Unicode strings in code point order.

The result may be different from the results of compare(), operator<, etc. if supplementary characters are present:

In UTF-16, supplementary characters (with code points U+10000 and above) are

stored with pairs of surrogate code units. These have values from 0xd800 to 0xdfff,

which means that they compare as less than some other BMP characters like U+feff.

This function compares Unicode strings in code point order.

If either of the UTF-16 strings is malformed (i.e., it contains unpaired surrogates), then the result is not defined.

Parameters

text Another string to compare this one to.

Returns

a negative/zero/positive integer corresponding to whether

this string is less than/equal to/greater than the second one

in code point order

ICU 2.0
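
The difference between code unit order and code point order can be reproduced with a small standalone sketch. It uses only the C++ standard library, not ICU; `nextCodePoint`, `compareCodeUnits`, and `compareCodePoints` are hypothetical names, and the decoder assumes well-formed UTF-16.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: decode the code point at s[i] and advance i.
// Assumes well-formed UTF-16 (every lead surrogate is followed by a trail).
inline char32_t nextCodePoint(std::u16string_view s, std::size_t& i) {
    const char16_t lead = s[i++];
    if (lead >= 0xD800 && lead <= 0xDBFF && i < s.size()) {
        const char16_t trail = s[i++];
        return 0x10000 + ((static_cast<char32_t>(lead) - 0xD800) << 10) +
               (trail - 0xDC00);
    }
    return lead;
}

// Code unit (bitwise) order: compares 16-bit units directly.
inline int compareCodeUnits(std::u16string_view a, std::u16string_view b) {
    const int r = a.compare(b);
    return (r > 0) - (r < 0);
}

// Code point order: decodes surrogate pairs before comparing.
inline int compareCodePoints(std::u16string_view a, std::u16string_view b) {
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        const char32_t ca = nextCodePoint(a, i);
        const char32_t cb = nextCodePoint(b, j);
        if (ca != cb) return ca < cb ? -1 : 1;
    }
    if (i < a.size()) return 1;   // a has code points left over
    if (j < b.size()) return -1;  // b has code points left over
    return 0;
}
```

With U+10000 (stored as the surrogate pair 0xD800 0xDC00) and U+FEFF, `compareCodeUnits` orders U+10000 first because 0xD800 is less than 0xFEFF, while `compareCodePoints` orders U+FEFF first.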

Defined at line 4380 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t compareCodePointOrder (int32_t start, int32_t length, const UnicodeString & srcText)

Compare two Unicode strings in code point order.

The result may be different from the results of compare(), operator<, etc. if supplementary characters are present:

In UTF-16, supplementary characters (with code points U+10000 and above) are

stored with pairs of surrogate code units. These have values from 0xd800 to 0xdfff,

which means that they compare as less than some other BMP characters like U+feff.

This function compares Unicode strings in code point order.

If either of the UTF-16 strings is malformed (i.e., it contains unpaired surrogates), then the result is not defined.

Parameters

start The start offset in this string at which the compare operation begins.
length The number of code units from this string to compare.
srcText Another string to compare this one to.

Returns

a negative/zero/positive integer corresponding to whether

this string is less than/equal to/greater than the second one

in code point order

ICU 2.0

Defined at line 4384 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t compareCodePointOrder (int32_t start, int32_t length, const UnicodeString & srcText, int32_t srcStart, int32_t srcLength)

Compare two Unicode strings in code point order.

The result may be different from the results of compare(), operator<, etc. if supplementary characters are present:

In UTF-16, supplementary characters (with code points U+10000 and above) are

stored with pairs of surrogate code units. These have values from 0xd800 to 0xdfff,

which means that they compare as less than some other BMP characters like U+feff.

This function compares Unicode strings in code point order.

If either of the UTF-16 strings is malformed (i.e., it contains unpaired surrogates), then the result is not defined.

Parameters

start The start offset in this string at which the compare operation begins.
length The number of code units from this string to compare.
srcText Another string to compare this one to.
srcStart The start offset in that string at which the compare operation begins.
srcLength The number of code units from that string to compare.

Returns

a negative/zero/positive integer corresponding to whether

this string is less than/equal to/greater than the second one

in code point order

ICU 2.0

Defined at line 4395 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t compareCodePointOrder (ConstChar16Ptr srcChars, int32_t srcLength)

Compare two Unicode strings in code point order.

The result may be different from the results of compare(), operator<, etc. if supplementary characters are present:

In UTF-16, supplementary characters (with code points U+10000 and above) are

stored with pairs of surrogate code units. These have values from 0xd800 to 0xdfff,

which means that they compare as less than some other BMP characters like U+feff.

This function compares Unicode strings in code point order.

If either of the UTF-16 strings is malformed (i.e., it contains unpaired surrogates), then the result is not defined.

Parameters

srcChars A pointer to another string to compare this one to.
srcLength The number of code units from that string to compare.

Returns

a negative/zero/positive integer corresponding to whether

this string is less than/equal to/greater than the second one

in code point order

ICU 2.0

Defined at line 4390 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t compareCodePointOrder (int32_t start, int32_t length, const char16_t * srcChars)

Compare two Unicode strings in code point order.

The result may be different from the results of compare(), operator<, etc. if supplementary characters are present:

In UTF-16, supplementary characters (with code points U+10000 and above) are

stored with pairs of surrogate code units. These have values from 0xd800 to 0xdfff,

which means that they compare as less than some other BMP characters like U+feff.

This function compares Unicode strings in code point order.

If either of the UTF-16 strings is malformed (i.e., it contains unpaired surrogates), then the result is not defined.

Parameters

start The start offset in this string at which the compare operation begins.
length The number of code units from this string to compare.
srcChars A pointer to another string to compare this one to.

Returns

a negative/zero/positive integer corresponding to whether

this string is less than/equal to/greater than the second one

in code point order

ICU 2.0

Defined at line 4403 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t compareCodePointOrder (int32_t start, int32_t length, const char16_t * srcChars, int32_t srcStart, int32_t srcLength)

Compare two Unicode strings in code point order.

The result may be different from the results of compare(), operator<, etc. if supplementary characters are present:

In UTF-16, supplementary characters (with code points U+10000 and above) are

stored with pairs of surrogate code units. These have values from 0xd800 to 0xdfff,

which means that they compare as less than some other BMP characters like U+feff.

This function compares Unicode strings in code point order.

If either of the UTF-16 strings is malformed (i.e., it contains unpaired surrogates), then the result is not defined.

Parameters

start The start offset in this string at which the compare operation begins.
length The number of code units from this string to compare.
srcChars A pointer to another string to compare this one to.
srcStart The start offset in that string at which the compare operation begins.
srcLength The number of code units from that string to compare.

Returns

a negative/zero/positive integer corresponding to whether

this string is less than/equal to/greater than the second one

in code point order

ICU 2.0

Defined at line 4409 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t compareCodePointOrderBetween (int32_t start, int32_t limit, const UnicodeString & srcText, int32_t srcStart, int32_t srcLimit)

Compare two Unicode strings in code point order.

The result may be different from the results of compare(), operator<, etc. if supplementary characters are present:

In UTF-16, supplementary characters (with code points U+10000 and above) are

stored with pairs of surrogate code units. These have values from 0xd800 to 0xdfff,

which means that they compare as less than some other BMP characters like U+feff.

This function compares Unicode strings in code point order.

If either of the UTF-16 strings is malformed (i.e., it contains unpaired surrogates), then the result is not defined.

Parameters

start The start offset in this string at which the compare operation begins.
limit The offset after the last code unit from this string to compare.
srcText Another string to compare this one to.
srcStart The start offset in that string at which the compare operation begins.
srcLimit The offset after the last code unit from that string to compare.

Returns

a negative/zero/positive integer corresponding to whether

this string is less than/equal to/greater than the second one

in code point order

ICU 2.0

Defined at line 4417 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t caseCompare (const UnicodeString & text, uint32_t options)

Compare two strings case-insensitively using full case folding.

This is equivalent to this->foldCase(options).compare(text.foldCase(options)).

Parameters

text Another string to compare this one to.
options A bit set of options:
- U_FOLD_CASE_DEFAULT or 0 is used for default options: Comparison in code unit order with default case folding.
- U_COMPARE_CODE_POINT_ORDER Set to choose code point order instead of code unit order (see u_strCompare for details).
- U_FOLD_CASE_EXCLUDE_SPECIAL_I

Returns

A negative, zero, or positive integer indicating the comparison result.

ICU 2.0
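
The "fold both sides, then compare" equivalence stated above can be sketched as follows. This is a deliberately simplified standalone model, not ICU code: `foldAscii` and `caseCompareAscii` are hypothetical names, and the folding covers only ASCII letters, whereas the documented function applies full Unicode case folding.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical, ASCII-only stand-in for case folding (A-Z mapped to a-z).
// Full Unicode case folding handles many more characters than this.
inline std::u16string foldAscii(std::u16string_view s) {
    std::u16string out(s);
    for (char16_t& c : out) {
        if (c >= u'A' && c <= u'Z') {
            c = static_cast<char16_t>(c + (u'a' - u'A'));
        }
    }
    return out;
}

// Fold both strings, then compare in code unit order.
inline int caseCompareAscii(std::u16string_view a, std::u16string_view b) {
    const int r = foldAscii(a).compare(foldAscii(b));
    return (r > 0) - (r < 0);
}
```

Folding both operands before comparing is what makes the result independent of the case of either input.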

Defined at line 4442 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t caseCompare (int32_t start, int32_t length, const UnicodeString & srcText, uint32_t options)

Compare two strings case-insensitively using full case folding.

This is equivalent to this->foldCase(options).compare(srcText.foldCase(options)).

Parameters

start The start offset in this string at which the compare operation begins.
length The number of code units from this string to compare.
srcText Another string to compare this one to.
options A bit set of options:
- U_FOLD_CASE_DEFAULT or 0 is used for default options: Comparison in code unit order with default case folding.
- U_COMPARE_CODE_POINT_ORDER Set to choose code point order instead of code unit order (see u_strCompare for details).
- U_FOLD_CASE_EXCLUDE_SPECIAL_I

Returns

A negative, zero, or positive integer indicating the comparison result.

ICU 2.0

Defined at line 4447 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t caseCompare (int32_t start, int32_t length, const UnicodeString & srcText, int32_t srcStart, int32_t srcLength, uint32_t options)

Compare two strings case-insensitively using full case folding.

This is equivalent to this->foldCase(options).compare(srcText.foldCase(options)).

Parameters

start The start offset in this string at which the compare operation begins.
length The number of code units from this string to compare.
srcText Another string to compare this one to.
srcStart The start offset in that string at which the compare operation begins.
srcLength The number of code units from that string to compare.
options A bit set of options:
- U_FOLD_CASE_DEFAULT or 0 is used for default options: Comparison in code unit order with default case folding.
- U_COMPARE_CODE_POINT_ORDER Set to choose code point order instead of code unit order (see u_strCompare for details).
- U_FOLD_CASE_EXCLUDE_SPECIAL_I

Returns

A negative, zero, or positive integer indicating the comparison result.

ICU 2.0

Defined at line 4462 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t caseCompare (ConstChar16Ptr srcChars, int32_t srcLength, uint32_t options)

Compare two strings case-insensitively using full case folding.

This is equivalent to this->foldCase(options).compare(srcChars.foldCase(options)).

Parameters

srcChars A pointer to another string to compare this one to.
srcLength The number of code units from that string to compare.
options A bit set of options:
- U_FOLD_CASE_DEFAULT or 0 is used for default options: Comparison in code unit order with default case folding.
- U_COMPARE_CODE_POINT_ORDER Set to choose code point order instead of code unit order (see u_strCompare for details).
- U_FOLD_CASE_EXCLUDE_SPECIAL_I

Returns

A negative, zero, or positive integer indicating the comparison result.

ICU 2.0

Defined at line 4455 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t caseCompare (int32_t start, int32_t length, const char16_t * srcChars, uint32_t options)

Compare two strings case-insensitively using full case folding.

This is equivalent to this->foldCase(options).compare(srcChars.foldCase(options)).

Parameters

start The start offset in this string at which the compare operation begins.
length The number of code units from this string to compare.
srcChars A pointer to another string to compare this one to.
options A bit set of options:
- U_FOLD_CASE_DEFAULT or 0 is used for default options: Comparison in code unit order with default case folding.
- U_COMPARE_CODE_POINT_ORDER Set to choose code point order instead of code unit order (see u_strCompare for details).
- U_FOLD_CASE_EXCLUDE_SPECIAL_I

Returns

A negative, zero, or positive integer indicating the comparison result.

ICU 2.0

Defined at line 4472 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t caseCompare (int32_t start, int32_t length, const char16_t * srcChars, int32_t srcStart, int32_t srcLength, uint32_t options)

Compare two strings case-insensitively using full case folding.

This is equivalent to this->foldCase(options).compare(srcChars.foldCase(options)).

Parameters

start The start offset in this string at which the compare operation begins.
length The number of code units from this string to compare.
srcChars A pointer to another string to compare this one to.
srcStart The start offset in that string at which the compare operation begins.
srcLength The number of code units from that string to compare.
options A bit set of options:
- U_FOLD_CASE_DEFAULT or 0 is used for default options: Comparison in code unit order with default case folding.
- U_COMPARE_CODE_POINT_ORDER Set to choose code point order instead of code unit order (see u_strCompare for details).
- U_FOLD_CASE_EXCLUDE_SPECIAL_I

Returns

A negative, zero, or positive integer indicating the comparison result.

ICU 2.0

Defined at line 4480 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int8_t caseCompareBetween (int32_t start, int32_t limit, const UnicodeString & srcText, int32_t srcStart, int32_t srcLimit, uint32_t options)

Compare two strings case-insensitively using full case folding.

This is equivalent to this->foldCase(options).compareBetween(srcText.foldCase(options)).

Parameters

start The start offset in this string at which the compare operation begins.
limit The offset after the last code unit from this string to compare.
srcText Another string to compare this one to.
srcStart The start offset in that string at which the compare operation begins.
srcLimit The offset after the last code unit from that string to compare.
options A bit set of options:
- U_FOLD_CASE_DEFAULT or 0 is used for default options: Comparison in code unit order with default case folding.
- U_COMPARE_CODE_POINT_ORDER Set to choose code point order instead of code unit order (see u_strCompare for details).
- U_FOLD_CASE_EXCLUDE_SPECIAL_I

Returns

A negative, zero, or positive integer indicating the comparison result.

ICU 2.0

Defined at line 4490 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UBool startsWith (const UnicodeString & text)

Determine if this starts with the characters in `text`

Parameters

text The text to match.

Returns

true if this starts with the characters in `text`,

false otherwise

ICU 2.0

Defined at line 4666 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UBool startsWith (const UnicodeString & srcText, int32_t srcStart, int32_t srcLength)

Determine if this starts with the characters in `srcText`

in the range [`srcStart`, `srcStart + srcLength`).

Parameters

srcText The text to match.
srcStart the offset into `srcText` to start matching
srcLength the number of characters in `srcText` to match

Returns

true if this starts with the characters in `srcText`,

false otherwise

ICU 2.0

Defined at line 4670 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UBool startsWith (ConstChar16Ptr srcChars, int32_t srcLength)

Determine if this starts with the characters in `srcChars`

Parameters

srcChars The characters to match.
srcLength the number of characters in `srcChars`

Returns

true if this starts with the characters in `srcChars`,

false otherwise

ICU 2.0

Defined at line 4676 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UBool startsWith (const char16_t * srcChars, int32_t srcStart, int32_t srcLength)

Determine if this starts with the characters in `srcChars`

in the range [`srcStart`, `srcStart + srcLength`).

Parameters

srcChars The characters to match.
srcStart the offset into `srcChars` to start matching
srcLength the number of characters in `srcChars` to match

Returns

true if this starts with the characters in `srcChars`, false otherwise

ICU 2.0

Defined at line 4684 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UBool endsWith (const UnicodeString & text)

Determine if this ends with the characters in `text`

Parameters

text The text to match.

Returns

true if this ends with the characters in `text`,

false otherwise

ICU 2.0

Defined at line 4692 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UBool endsWith (const UnicodeString & srcText, int32_t srcStart, int32_t srcLength)

Determine if this ends with the characters in `srcText`

in the range [`srcStart`, `srcStart + srcLength`).

Parameters

srcText The text to match.
srcStart the offset into `srcText` to start matching
srcLength the number of characters in `srcText` to match

Returns

true if this ends with the characters in `srcText`,

false otherwise

ICU 2.0

Defined at line 4697 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UBool endsWith (ConstChar16Ptr srcChars, int32_t srcLength)

Determine if this ends with the characters in `srcChars`

Parameters

srcChars The characters to match.
srcLength the number of characters in `srcChars`

Returns

true if this ends with the characters in `srcChars`,

false otherwise

ICU 2.0

Defined at line 4706 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UBool endsWith (const char16_t * srcChars, int32_t srcStart, int32_t srcLength)

Determine if this ends with the characters in `srcChars`

in the range [`srcStart`, `srcStart + srcLength`).

Parameters

srcChars The characters to match.
srcStart the offset into `srcChars` to start matching
srcLength the number of characters in `srcChars` to match

Returns

true if this ends with the characters in `srcChars`,

false otherwise

ICU 2.0

Defined at line 4715 of file ../../third_party/icu/default/source/common/unicode/unistr.h
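
The range-based startsWith and endsWith overloads can be pictured with a small standalone model. This is only a sketch: it uses std::u16string_view in place of UnicodeString, the names `pinned_slice`, `starts_with_range` and `ends_with_range` are hypothetical, and the clamping of out-of-range arguments is an assumption of the sketch, not a statement about ICU's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Return s[start, start + length), with both arguments pinned to the string.
// Assumption of this sketch: out-of-range arguments are clamped, not rejected.
inline std::u16string_view pinned_slice(std::u16string_view s,
                                        int32_t start, int32_t length) {
    const int32_t size = static_cast<int32_t>(s.size());
    start = std::clamp(start, int32_t{0}, size);
    length = std::clamp(length, int32_t{0}, size - start);
    assert(start + length <= size);
    return s.substr(static_cast<std::size_t>(start),
                    static_cast<std::size_t>(length));
}

// True if `self` begins with src[srcStart, srcStart + srcLength).
inline bool starts_with_range(std::u16string_view self, std::u16string_view src,
                              int32_t srcStart, int32_t srcLength) {
    const std::u16string_view needle = pinned_slice(src, srcStart, srcLength);
    return self.size() >= needle.size() &&
           self.compare(0, needle.size(), needle) == 0;
}

// True if `self` ends with src[srcStart, srcStart + srcLength).
inline bool ends_with_range(std::u16string_view self, std::u16string_view src,
                            int32_t srcStart, int32_t srcLength) {
    const std::u16string_view needle = pinned_slice(src, srcStart, srcLength);
    return self.size() >= needle.size() &&
           self.compare(self.size() - needle.size(), needle.size(), needle) == 0;
}
```

Both helpers compare code units only, so no case folding or normalization is involved.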

int32_t indexOf (const UnicodeString & text)

Locate in this the first occurrence of the characters in `text`,

using bitwise comparison.

Parameters

text The text to search for.

Returns

The offset into this of the start of `text`,

or -1 if not found.

ICU 2.0

Defined at line 4516 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t indexOf (const UnicodeString & text, int32_t start)

Locate in this the first occurrence of the characters in `text`

starting at offset `start`, using bitwise comparison.

Parameters

text The text to search for.
start The offset at which searching will start.

Returns

The offset into this of the start of `text`,

or -1 if not found.

ICU 2.0

Defined at line 4520 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t indexOf (const UnicodeString & text, int32_t start, int32_t length)

Locate in this the first occurrence in the range

[`start`, `start + length`) of the characters

in `text`, using bitwise comparison.

Parameters

text The text to search for.
start The offset at which searching will start.
length The number of characters to search

Returns

The offset into this of the start of `text`,

or -1 if not found.

ICU 2.0

Defined at line 4527 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t indexOf (const UnicodeString & srcText, int32_t srcStart, int32_t srcLength, int32_t start, int32_t length)

Locate in this the first occurrence in the range

[`start`, `start + length`) of the characters

in `srcText` in the range

[`srcStart`, `srcStart + srcLength`),

using bitwise comparison.

Parameters

srcText The text to search for.
srcStart the offset into `srcText` at which to start matching
srcLength the number of characters in `srcText` to match
start the offset into this at which to start matching
length the number of characters in this to search

Returns

The offset into this of the start of `srcText`,

or -1 if not found.

ICU 2.0

Defined at line 4500 of file ../../third_party/icu/default/source/common/unicode/unistr.h
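
The two ranges in this overload are easy to mix up: (`srcStart`, `srcLength`) selects the pattern, while (`start`, `length`) selects the part of this string that is searched. The following standalone sketch models that on std::u16string_view. The name `index_of_range` is hypothetical, and the pinning of arguments and the handling of an empty pattern are assumptions of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Model of indexOf(srcText, srcStart, srcLength, start, length).
// Returns an offset into `self` (not into the searched sub-range), or -1.
inline int32_t index_of_range(std::u16string_view self, std::u16string_view src,
                              int32_t srcStart, int32_t srcLength,
                              int32_t start, int32_t length) {
    const int32_t selfSize = static_cast<int32_t>(self.size());
    const int32_t srcSize = static_cast<int32_t>(src.size());

    // Assumption: both ranges are pinned to the bounds of their strings.
    srcStart = std::clamp(srcStart, int32_t{0}, srcSize);
    srcLength = std::clamp(srcLength, int32_t{0}, srcSize - srcStart);
    start = std::clamp(start, int32_t{0}, selfSize);
    length = std::clamp(length, int32_t{0}, selfSize - start);

    const std::u16string_view needle =
        src.substr(static_cast<std::size_t>(srcStart),
                   static_cast<std::size_t>(srcLength));
    if (needle.empty()) return -1;  // assumption: empty pattern is "not found"

    const std::u16string_view haystack =
        self.substr(static_cast<std::size_t>(start),
                    static_cast<std::size_t>(length));
    const std::size_t pos = haystack.find(needle);  // code unit comparison
    if (pos == std::u16string_view::npos) return -1;

    assert(pos + needle.size() <= haystack.size());
    return start + static_cast<int32_t>(pos);
}
```

Note that the result is translated back by `start`, so it is always an offset into the whole string.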

int32_t indexOf (const char16_t * srcChars, int32_t srcLength, int32_t start)

Locate in this the first occurrence of the characters in

`srcChars`

starting at offset `start`, using bitwise comparison.

Parameters

srcChars The text to search for.
srcLength the number of characters in `srcChars` to match
start the offset into this at which to start matching

Returns

The offset into this of the start of `srcChars`,

or -1 if not found.

ICU 2.0

Defined at line 4533 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t indexOf (ConstChar16Ptr srcChars, int32_t srcLength, int32_t start, int32_t length)

Locate in this the first occurrence in the range

[`start`, `start + length`) of the characters

in `srcChars`, using bitwise comparison.

Parameters

srcChars The text to search for.
srcLength the number of characters in `srcChars`
start The offset at which searching will start.
length The number of characters to search

Returns

The offset into this of the start of `srcChars`,

or -1 if not found.

ICU 2.0

Defined at line 4541 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t indexOf (const char16_t * srcChars, int32_t srcStart, int32_t srcLength, int32_t start, int32_t length)

Locate in this the first occurrence in the range

[`start`, `start + length`) of the characters

in `srcChars` in the range

[`srcStart`, `srcStart + srcLength`),

using bitwise comparison.

Parameters

srcChars The text to search for.
srcStart the offset into `srcChars` at which to start matching
srcLength the number of characters in `srcChars` to match
start the offset into this at which to start matching
length the number of characters in this to search

Returns

The offset into this of the start of `srcChars`,

or -1 if not found.

ICU 2.0

int32_t indexOf (char16_t c)

Locate in this the first occurrence of the BMP code point `c`,

using bitwise comparison.

Parameters

c The code unit to search for.

Returns

The offset into this of `c`, or -1 if not found.

ICU 2.0

Defined at line 4560 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t indexOf (UChar32 c)

Locate in this the first occurrence of the code point `c`,

using bitwise comparison.

Parameters

c The code point to search for.

Returns

The offset into this of `c`, or -1 if not found.

ICU 2.0

Defined at line 4564 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t indexOf (char16_t c, int32_t start)

Locate in this the first occurrence of the BMP code point `c`,

starting at offset `start`, using bitwise comparison.

Parameters

c The code unit to search for.
start The offset at which searching will start.

Returns

The offset into this of `c`, or -1 if not found.

ICU 2.0

Defined at line 4568 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t indexOf (UChar32 c, int32_t start)

Locate in this the first occurrence of the code point `c`

starting at offset `start`, using bitwise comparison.

Parameters

c The code point to search for.
start The offset at which searching will start.

Returns

The offset into this of `c`, or -1 if not found.

ICU 2.0

Defined at line 4575 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t indexOf (char16_t c, int32_t start, int32_t length)

Locate in this the first occurrence of the BMP code point `c`

in the range [`start`, `start + length`),

using bitwise comparison.

Parameters

c The code unit to search for.
start the offset into this at which to start matching
length the number of characters in this to search

Returns

The offset into this of `c`, or -1 if not found.

ICU 2.0

Defined at line 4548 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t indexOf (UChar32 c, int32_t start, int32_t length)

Locate in this the first occurrence of the code point `c`

in the range [`start`, `start + length`),

using bitwise comparison.

Parameters

c The code point to search for.
start the offset into this at which to start matching
length the number of characters in this to search

Returns

The offset into this of `c`, or -1 if not found.

ICU 2.0

Defined at line 4554 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t lastIndexOf (const UnicodeString & text)

Locate in this the last occurrence of the characters in `text`,

using bitwise comparison.

Parameters

text The text to search for.

Returns

The offset into this of the start of `text`,

or -1 if not found.

ICU 2.0

Defined at line 4626 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t lastIndexOf (const UnicodeString & text, int32_t start)

Locate in this the last occurrence of the characters in `text`

starting at offset `start`, using bitwise comparison.

Parameters

text The text to search for.
start The offset at which searching will start.

Returns

The offset into this of the start of `text`,

or -1 if not found.

ICU 2.0

Defined at line 4619 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t lastIndexOf (const UnicodeString & text, int32_t start, int32_t length)

Locate in this the last occurrence in the range

[`start`, `start + length`) of the characters

in `text`, using bitwise comparison.

Parameters

text The text to search for.
start The offset at which searching will start.
length The number of characters to search

Returns

The offset into this of the start of `text`,

or -1 if not found.

ICU 2.0

Defined at line 4613 of file ../../third_party/icu/default/source/common/unicode/unistr.h
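
For the ranged lastIndexOf overloads, a standalone model on std::u16string_view may help. It is a sketch only: the name `last_index_of` is hypothetical, and it assumes that a match must lie entirely inside [`start`, `start + length`) and that an empty pattern is never found.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Model of lastIndexOf(text, start, length): last match of `text` that lies
// entirely inside self[start, start + length), or -1.
inline int32_t last_index_of(std::u16string_view self, std::u16string_view text,
                             int32_t start, int32_t length) {
    const int32_t size = static_cast<int32_t>(self.size());
    start = std::clamp(start, int32_t{0}, size);            // assumption: pinned
    length = std::clamp(length, int32_t{0}, size - start);  // assumption: pinned
    if (text.empty()) return -1;  // assumption: empty pattern is "not found"

    const std::u16string_view haystack =
        self.substr(static_cast<std::size_t>(start),
                    static_cast<std::size_t>(length));
    const std::size_t pos = haystack.rfind(text);  // code unit comparison
    if (pos == std::u16string_view::npos) return -1;

    assert(pos + text.size() <= haystack.size());
    return start + static_cast<int32_t>(pos);
}
```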

int32_t lastIndexOf (const UnicodeString & srcText, int32_t srcStart, int32_t srcLength, int32_t start, int32_t length)

Locate in this the last occurrence in the range

[`start`, `start + length`) of the characters

in `srcText` in the range

[`srcStart`, `srcStart + srcLength`),

using bitwise comparison.

Parameters

srcText The text to search for.
srcStart the offset into `srcText` at which to start matching
srcLength the number of characters in `srcText` to match
start the offset into this at which to start matching
length the number of characters in this to search

Returns

The offset into this of the start of `srcText`,

or -1 if not found.

ICU 2.0

Defined at line 4597 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t lastIndexOf (const char16_t * srcChars, int32_t srcLength, int32_t start)

Locate in this the last occurrence of the characters in `srcChars`

starting at offset `start`, using bitwise comparison.

Parameters

srcChars The text to search for.
srcLength the number of characters in `srcChars` to match
start the offset into this at which to start matching

Returns

The offset into this of the start of `srcChars`,

or -1 if not found.

ICU 2.0

Defined at line 4589 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t lastIndexOf (ConstChar16Ptr srcChars, int32_t srcLength, int32_t start, int32_t length)

Locate in this the last occurrence in the range

[`start`, `start + length`) of the characters

in `srcChars`, using bitwise comparison.

Parameters

srcChars The text to search for.
srcLength the number of characters in `srcChars`
start The offset at which searching will start.
length The number of characters to search

Returns

The offset into this of the start of `srcChars`,

or -1 if not found.

ICU 2.0

Defined at line 4582 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t lastIndexOf (const char16_t * srcChars, int32_t srcStart, int32_t srcLength, int32_t start, int32_t length)

Locate in this the last occurrence in the range

[`start`, `start + length`) of the characters

in `srcChars` in the range

[`srcStart`, `srcStart + srcLength`),

using bitwise comparison.

Parameters

srcChars The text to search for.
srcStart the offset into `srcChars` at which to start matching
srcLength the number of characters in `srcChars` to match
start the offset into this at which to start matching
length the number of characters in this to search

Returns

The offset into this of the start of `srcChars`,

or -1 if not found.

ICU 2.0

int32_t lastIndexOf (char16_t c)

Locate in this the last occurrence of the BMP code point `c`,

using bitwise comparison.

Parameters

c The code unit to search for.

Returns

The offset into this of `c`, or -1 if not found.

ICU 2.0

Defined at line 4643 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t lastIndexOf (UChar32 c)

Locate in this the last occurrence of the code point `c`,

using bitwise comparison.

Parameters

c The code point to search for.

Returns

The offset into this of `c`, or -1 if not found.

ICU 2.0

Defined at line 4647 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t lastIndexOf (char16_t c, int32_t start)

Locate in this the last occurrence of the BMP code point `c`

starting at offset `start`, using bitwise comparison.

Parameters

c The code unit to search for.
start The offset at which searching will start.

Returns

The offset into this of `c`, or -1 if not found.

ICU 2.0

Defined at line 4652 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t lastIndexOf (UChar32 c, int32_t start)

Locate in this the last occurrence of the code point `c`

starting at offset `start`, using bitwise comparison.

Parameters

c The code point to search for.
start The offset at which searching will start.

Returns

The offset into this of `c`, or -1 if not found.

ICU 2.0

Defined at line 4659 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t lastIndexOf (char16_t c, int32_t start, int32_t length)

Locate in this the last occurrence of the BMP code point `c`

in the range [`start`, `start + length`),

using bitwise comparison.

Parameters

c The code unit to search for.
start the offset into this at which to start matching
length the number of characters in this to search

Returns

The offset into this of `c`, or -1 if not found.

ICU 2.0

Defined at line 4630 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t lastIndexOf (UChar32 c, int32_t start, int32_t length)

Locate in this the last occurrence of the code point `c`

in the range [`start`, `start + length`),

using bitwise comparison.

Parameters

c The code point to search for.
start the offset into this at which to start matching
length the number of characters in this to search

Returns

The offset into this of `c`, or -1 if not found.

ICU 2.0

Defined at line 4636 of file ../../third_party/icu/default/source/common/unicode/unistr.h

char16_t charAt (int32_t offset)

Return the code unit at offset `offset`.

If the offset is not valid (0..length()-1) then U+ffff is returned.

Parameters

offset a valid offset into the text

Returns

the code unit at offset `offset`

or 0xffff if the offset is not valid for this string

ICU 2.0

Defined at line 4854 of file ../../third_party/icu/default/source/common/unicode/unistr.h

char16_t operator[] (int32_t offset)

Return the code unit at offset `offset`.

If the offset is not valid (0..length()-1) then U+ffff is returned.

Parameters

offset a valid offset into the text

Returns

the code unit at offset `offset`

ICU 2.0

Defined at line 4858 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UChar32 char32At (int32_t offset)

Return the code point that contains the code unit

at offset `offset`.

If the offset is not valid (0..length()-1) then U+ffff is returned.

Parameters

offset a valid offset into the text that indicates the text offset of any of the code units that will be assembled into a code point (21-bit value) and returned

Returns

the code point of text at `offset`

or 0xffff if the offset is not valid for this string

ICU 2.0
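
The phrase "the code point that contains the code unit at `offset`" means that either half of a surrogate pair yields the same supplementary code point. A minimal standalone model on std::u16string_view, with hypothetical helper names, could look like this (a sketch, not ICU's implementation):

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

inline bool is_lead(char16_t c)  { return (c & 0xFC00) == 0xD800; }
inline bool is_trail(char16_t c) { return (c & 0xFC00) == 0xDC00; }

// Combine a lead and a trail surrogate into a supplementary code point.
inline char32_t combine(char16_t lead, char16_t trail) {
    assert(is_lead(lead) && is_trail(trail));
    return static_cast<char32_t>(0x10000 + ((lead - 0xD800) << 10) +
                                 (trail - 0xDC00));
}

// Model of char32At: the code point containing the code unit at `offset`,
// or 0xFFFF when `offset` is outside 0..length()-1.
inline char32_t code_point_at(std::u16string_view s, int32_t offset) {
    const int32_t size = static_cast<int32_t>(s.size());
    if (offset < 0 || offset >= size) return 0xFFFF;
    const char16_t c = s[offset];
    if (is_lead(c) && offset + 1 < size && is_trail(s[offset + 1])) {
        return combine(c, s[offset + 1]);   // offset is on the first half
    }
    if (is_trail(c) && offset > 0 && is_lead(s[offset - 1])) {
        return combine(s[offset - 1], c);   // offset is on the second half
    }
    return c;  // BMP code unit, or an unpaired surrogate returned as-is
}
```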

int32_t getChar32Start (int32_t offset)

Adjust a random-access offset so that

it points to the beginning of a Unicode character.

The offset that is passed in points to

any code unit of a code point,

while the returned offset will point to the first code unit

of the same code point.

In UTF-16, if the input offset points to a second surrogate

of a surrogate pair, then the returned offset will point

to the first surrogate.

Parameters

offset a valid offset into one code point of the text

Returns

offset of the first code unit of the same code point

int32_t getChar32Limit (int32_t offset)

Adjust a random-access offset so that

it points behind a Unicode character.

The offset that is passed in points behind

any code unit of a code point,

while the returned offset will point behind the last code unit

of the same code point.

In UTF-16, if the input offset points behind the first surrogate

(i.e., to the second surrogate)

of a surrogate pair, then the returned offset will point

behind the second surrogate (i.e., to the first code unit after the pair).

Parameters

offset a valid offset after any code unit of a code point of the text

Returns

offset of the first code unit after the same code point
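
The adjustments made by getChar32Start and getChar32Limit only matter when the offset falls between the two halves of a surrogate pair. The standalone sketch below models both on std::u16string_view. The function names are hypothetical, and the results for out-of-range offsets (0 and the string length, respectively) are assumptions of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

inline bool is_lead(char16_t c)  { return (c & 0xFC00) == 0xD800; }
inline bool is_trail(char16_t c) { return (c & 0xFC00) == 0xDC00; }

// Model of getChar32Start: if `offset` is on a trail surrogate that follows
// a lead surrogate, back up to the lead; otherwise leave it unchanged.
inline int32_t char32_start(std::u16string_view s, int32_t offset) {
    const int32_t size = static_cast<int32_t>(s.size());
    if (offset < 0 || offset >= size) return 0;  // assumption for invalid input
    if (offset > 0 && is_trail(s[offset]) && is_lead(s[offset - 1])) --offset;
    return offset;
}

// Model of getChar32Limit: if `offset` splits a surrogate pair, move it
// behind the trail surrogate; otherwise leave it unchanged.
inline int32_t char32_limit(std::u16string_view s, int32_t offset) {
    const int32_t size = static_cast<int32_t>(s.size());
    if (offset < 0 || offset >= size) return size;  // assumption for invalid input
    if (offset > 0 && is_lead(s[offset - 1]) && is_trail(s[offset])) ++offset;
    assert(offset >= 0 && offset <= size);
    return offset;
}
```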

int32_t moveIndex32 (int32_t index, int32_t delta)

Move the code unit index along the string by delta code points.

Interpret the input index as a code unit-based offset into the string,

move the index forward or backward by delta code points, and

return the resulting index.

The input index should point to the first code unit of a code point,

if there is more than one.

Both input and output indexes are code unit-based as for all

string indexes/offsets in ICU (and other libraries, like MBCS char*).

If delta<0 then the index is moved backward (toward the start of the string).

If delta>0 then the index is moved forward (toward the end of the string).

This behaves like CharacterIterator::move32(delta, kCurrent).

Behavior for out-of-bounds indexes:

`moveIndex32` pins the input index to 0..length(), i.e.,

if the input index<0 then it is pinned to 0;

if index>length() then it is pinned to length().

Afterwards, the index is moved by `delta` code points

forward or backward,

but no further backward than to 0 and no further forward than to length().

The returned index is always between 0 and length(), inclusive.

Examples:

Parameters

index input code unit index
delta (signed) code point count to move the index forward or backward in the string

Returns

the resulting code unit index

ICU 2.0
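
The pinning and stepping rules described above can be modeled in a few lines. This is a standalone sketch on std::u16string_view with hypothetical names, written from the description in this entry; it is not ICU's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string_view>

inline bool is_lead(char16_t c)  { return (c & 0xFC00) == 0xD800; }
inline bool is_trail(char16_t c) { return (c & 0xFC00) == 0xDC00; }

// Model of moveIndex32: move a code unit index by `delta` code points.
inline int32_t move_index32(std::u16string_view s, int32_t index, int32_t delta) {
    const int32_t len = static_cast<int32_t>(s.size());
    index = std::clamp(index, int32_t{0}, len);  // pin the input to 0..length()

    // Forward: a well-formed surrogate pair counts as one step of two units.
    while (delta > 0 && index < len) {
        const bool pair = is_lead(s[index]) && index + 1 < len &&
                          is_trail(s[index + 1]);
        index += pair ? 2 : 1;
        --delta;
    }
    // Backward: step over a trail surrogate together with its lead.
    while (delta < 0 && index > 0) {
        --index;
        if (is_trail(s[index]) && index > 0 && is_lead(s[index - 1])) --index;
        ++delta;
    }

    assert(index >= 0 && index <= len);  // never leaves 0..length()
    return index;
}
```

Applied to a string with the code points 'a' U+10000 'b' U+10ffff U+2029, the calls `move_index32(s, 1, 2)`, `move_index32(s, 0, 3)` and `move_index32(s, 7, -2)` all yield 4.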

Code

                                        
                                                 // s has code points 'a' U+10000 'b' U+10ffff U+2029
                                                 UnicodeString s(u"a\U00010000b\U0010ffff\u2029");
                                            
                                                 // initial index: position of U+10000
                                                 int32_t index=1;
                                            
                                                 // the following examples will all result in index==4, position of U+10ffff
                                            
                                                 // skip 2 code points from some position in the string
                                                 index=s.moveIndex32(index, 2); // skips U+10000 and 'b'
                                            
                                                 // go to the 3rd code point from the start of s (0-based)
                                                 index=s.moveIndex32(0, 3); // skips 'a', U+10000, and 'b'
                                            
                                                 // go to the next-to-last code point of s
                                                 index=s.moveIndex32(s.length(), -2); // backward-skips U+2029 and U+10ffff
                                        
                                    
void extract (int32_t start, int32_t length, Char16Ptr dst, int32_t dstStart)

Copy the characters in the range

[`start`, `start + length`) into the array `dst`,

beginning at `dstStart`.

If the string aliases to `dst` itself as an external buffer,

then extract() will not copy the contents.

Parameters

start offset of first character which will be copied into the array
length the number of characters to extract
dst array in which to copy characters. The length of `dst` must be at least (`dstStart + length`).
dstStart the offset in `dst` where the first character will be extracted

ICU 2.0

Defined at line 4801 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t extract (Char16Ptr dest, int32_t destCapacity, UErrorCode & errorCode)

Copy the contents of the string into dest.

This is a convenience function that

checks if there is enough space in dest,

extracts the entire string if possible,

and NUL-terminates dest if possible.

If the string fits into dest but cannot be NUL-terminated

(length()==destCapacity) then the error code is set to U_STRING_NOT_TERMINATED_WARNING.

If the string itself does not fit into dest

(length()>destCapacity) then the error code is set to U_BUFFER_OVERFLOW_ERROR.

If the string aliases to `dest` itself as an external buffer,

then extract() will not copy the contents.

Parameters

dest Destination string buffer.
destCapacity Number of char16_ts available at dest.
errorCode ICU error code.

Returns

length()

ICU 2.0
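
The three outcomes of this overload depend only on how length() compares with `destCapacity`. The standalone sketch below models that decision. The enum `ExtractStatus` and the function `extract_all` are hypothetical stand-ins: the enum values correspond to success, U_STRING_NOT_TERMINATED_WARNING and U_BUFFER_OVERFLOW_ERROR, and the sketch ignores the aliasing case and the incoming error code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string_view>

// Hypothetical stand-in for the relevant UErrorCode values.
enum class ExtractStatus { Ok, NotTerminatedWarning, BufferOverflow };

// Model of extract(dest, destCapacity, errorCode): always returns the full
// length, copies only if the whole string fits, NUL-terminates if possible.
inline int32_t extract_all(std::u16string_view s, char16_t* dest,
                           int32_t destCapacity, ExtractStatus& status) {
    assert(destCapacity >= 0 && (dest != nullptr || destCapacity == 0));
    const int32_t len = static_cast<int32_t>(s.size());
    if (len > destCapacity) {
        status = ExtractStatus::BufferOverflow;  // nothing is copied
        return len;
    }
    std::copy(s.begin(), s.end(), dest);
    if (len < destCapacity) {
        dest[len] = u'\0';                       // room for the terminator
        status = ExtractStatus::Ok;
    } else {
        status = ExtractStatus::NotTerminatedWarning;  // len == destCapacity
    }
    return len;
}
```

Because the return value is always the full length, a caller can react to an overflow by allocating a buffer of that size plus one and calling again.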

void extract (int32_t start, int32_t length, UnicodeString & target)

Copy the characters in the range

[`start`, `start + length`) into the UnicodeString

`target`.

Parameters

start offset of first character which will be copied
length the number of characters to extract
target UnicodeString into which to copy characters.

ICU 2.0

Defined at line 4808 of file ../../third_party/icu/default/source/common/unicode/unistr.h

void extractBetween (int32_t start, int32_t limit, char16_t * dst, int32_t dstStart)

Copy the characters in the range [`start`, `limit`)

into the array `dst`, beginning at `dstStart`.

Parameters

start offset of first character which will be copied into the array
limit offset immediately following the last character to be copied
dst array in which to copy characters. The length of `dst` must be at least (`dstStart + (limit - start)`).
dstStart the offset in `dst` where the first character will be extracted

ICU 2.0

Defined at line 4829 of file ../../third_party/icu/default/source/common/unicode/unistr.h

void extractBetween (int32_t start, int32_t limit, UnicodeString & target)

Copy the characters in the range [`start`, `limit`)

into the UnicodeString `target`. Replaceable API.

Parameters

start offset of first character which will be copied
limit offset immediately following the last character to be copied
target UnicodeString into which to copy characters.

ICU 2.0

int32_t extract (int32_t start, int32_t startLength, char * target, int32_t targetCapacity, enum EInvariant inv)

Copy the characters in the range

[`start`, `start + startLength`) into an array of characters.

All characters must be invariant (see utypes.h).

Use US_INV as the last, signature-distinguishing parameter.

This function does not write any more than `targetCapacity`

characters but returns the length of the entire output string

so that one can allocate a larger buffer and call the function again

if necessary.

The output string is NUL-terminated if possible.

Parameters

start offset of first character which will be copied
startLength the number of characters to extract
target the target buffer for extraction, can be nullptr if targetCapacity is 0
targetCapacity the length of the target buffer
inv Signature-distinguishing parameter, use US_INV.

Returns

the output string length, not including the terminating NUL

ICU 3.2

int32_t extract (int32_t start, int32_t startLength, char * target, uint32_t targetLength)

Copy the characters in the range

[`start`, `start + startLength`) into an array of characters

in the platform's default codepage.

This function does not write any more than `targetLength`

characters but returns the length of the entire output string

so that one can allocate a larger buffer and call the function again

if necessary.

The output string is NUL-terminated if possible.

Parameters

start offset of first character which will be copied
startLength the number of characters to extract
target the target buffer for extraction
targetLength the length of the target buffer. If `target` is nullptr, then the number of bytes required for `target` is returned.

Returns

the output string length, not including the terminating NUL

ICU 2.0

int32_t extract (int32_t start, int32_t startLength, char * target, const char * codepage)

Copy the characters in the range

[`start`, `start + startLength`) into an array of characters

in a specified codepage.

The output string is NUL-terminated.

Recommendation: For invariant-character strings use

extract(int32_t start, int32_t length, char *target, int32_t targetCapacity, enum EInvariant inv) const

because it avoids object code dependencies of UnicodeString on

the conversion code.

Parameters

start offset of first character which will be copied
startLength the number of characters to extract
target the target buffer for extraction
codepage the desired codepage for the characters. 0 has the special meaning of the default codepage. If `codepage` is an empty string (`""`), then a simple conversion is performed on the codepage-invariant subset ("invariant characters") of the platform encoding. See utypes.h. If `target` is nullptr, then the number of bytes required for `target` is returned. It is assumed that the target is big enough to fit all of the characters.

Returns

the output string length, not including the terminating NUL

ICU 2.0

Defined at line 4816 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t extract (int32_t start, int32_t startLength, char * target, uint32_t targetLength, const char * codepage)

Copy the characters in the range

[`start`, `start + startLength`) into an array of characters

in a specified codepage.

This function does not write any more than `targetLength`

characters but returns the length of the entire output string

so that one can allocate a larger buffer and call the function again

if necessary.

The output string is NUL-terminated if possible.

Recommendation: For invariant-character strings use

extract(int32_t start, int32_t length, char *target, int32_t targetCapacity, enum EInvariant inv) const

because it avoids object code dependencies of UnicodeString on

the conversion code.

Parameters

start offset of first character which will be copied
startLength the number of characters to extract
target the target buffer for extraction
targetLength the length of the target buffer
codepage the desired codepage for the characters. 0 has the special meaning of the default codepage. If `codepage` is an empty string (`""`), then a simple conversion is performed on the codepage-invariant subset ("invariant characters") of the platform encoding. See utypes.h. If `target` is nullptr, then the number of bytes required for `target` is returned.

Returns

the output string length, not including the terminating NUL

ICU 2.0

int32_t extract (char * dest, int32_t destCapacity, UConverter * cnv, UErrorCode & errorCode)

Convert the UnicodeString into a codepage string using an existing UConverter.

The output string is NUL-terminated if possible.

This function avoids the overhead of opening and closing a converter if

multiple strings are extracted.

Parameters

dest destination string buffer, can be nullptr if destCapacity==0
destCapacity the number of chars available at dest
cnv the converter object to be used (ucnv_resetFromUnicode() will be called), or nullptr for the default converter
errorCode normal ICU error code

Returns

the length of the output string, not counting the terminating NUL;

if the length is greater than destCapacity, then the string will not fit

and a buffer of the indicated length would need to be passed in

ICU 2.0

UnicodeString tempSubString (int32_t start, int32_t length)

Create a temporary substring for the specified range.

Unlike the substring constructor and setTo() functions,

the object returned here will be a read-only alias (using getBuffer())

rather than copying the text.

As a result, this substring operation is much faster but requires

that the original string not be modified or deleted during the lifetime

of the returned substring object.

Parameters

start offset of the first character visible in the substring
length length of the substring

Returns

a read-only alias UnicodeString object for the substring

ICU 4.4

UnicodeString tempSubStringBetween (int32_t start, int32_t limit)

Create a temporary substring for the specified range.

Same as tempSubString(start, length) except that the substring range

is specified as a (start, limit) pair (with an exclusive limit index)

rather than a (start, length) pair.

Parameters

start offset of the first character visible in the substring
limit offset immediately following the last character visible in the substring

Returns

a read-only alias UnicodeString object for the substring

ICU 4.4

Defined at line 4839 of file ../../third_party/icu/default/source/common/unicode/unistr.h

void toUTF8 (ByteSink & sink)

Convert the UnicodeString to UTF-8 and write the result

to a ByteSink. This is called by toUTF8String().

Unpaired surrogates are replaced with U+FFFD.

Calls u_strToUTF8WithSub().

Parameters

sink A ByteSink to which the UTF-8 version of the string is written. sink.Flush() is called at the end.

ICU 4.2

int32_t toUTF32 (UChar32 * utf32, int32_t capacity, UErrorCode & errorCode)

Convert the UnicodeString to UTF-32.

Unpaired surrogates are replaced with U+FFFD.

Calls u_strToUTF32WithSub().

Parameters

utf32 destination string buffer, can be nullptr if capacity==0
capacity the number of UChar32s available at utf32
errorCode Standard ICU error code. Its input value must pass the U_SUCCESS() test, or else the function returns immediately. Check for U_FAILURE() on output or use with function chaining. (See User Guide for details.)

Returns

The length of the UTF-32 string.
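
The substitution rule ("unpaired surrogates are replaced with U+FFFD") can be illustrated with a standalone UTF-16 to UTF-32 conversion. The sketch below uses standard library types and hypothetical names. It returns a std::u32string rather than filling a caller-supplied buffer, so capacity and error handling are outside its scope.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

inline bool is_lead(char16_t c)  { return (c & 0xFC00) == 0xD800; }
inline bool is_trail(char16_t c) { return (c & 0xFC00) == 0xDC00; }

// Combine a lead and a trail surrogate into a supplementary code point.
inline char32_t combine(char16_t lead, char16_t trail) {
    assert(is_lead(lead) && is_trail(trail));
    return static_cast<char32_t>(0x10000 + ((lead - 0xD800) << 10) +
                                 (trail - 0xDC00));
}

// Convert UTF-16 to UTF-32, replacing each unpaired surrogate with U+FFFD.
inline std::u32string to_utf32(std::u16string_view s) {
    std::u32string out;
    out.reserve(s.size());  // upper bound: one code point per code unit
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t c = s[i];
        if (is_lead(c) && i + 1 < s.size() && is_trail(s[i + 1])) {
            out.push_back(combine(c, s[i + 1]));
            ++i;  // the trail surrogate has been consumed
        } else if (is_lead(c) || is_trail(c)) {
            out.push_back(char32_t{0xFFFD});  // unpaired surrogate
        } else {
            out.push_back(c);
        }
    }
    return out;
}
```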

int32_t length ()

Return the length of the UnicodeString object.

The length is the number of char16_t code units in the UnicodeString.

If you want the number of code points, please use countChar32().

Returns

the length of the UnicodeString object

Defined at line 4213 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t countChar32 (int32_t start, int32_t length)

Count the Unicode code points in the `length` char16_t code units of the string, starting at `start`.

A code point may occupy either one or two char16_t code units.

Counting code points involves reading all code units.

This function is basically the inverse of moveIndex32().

Parameters

start the index of the first code unit to check
length the number of char16_t code units to check

Returns

the number of code points in the specified code units
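
The difference between code units and code points can be made concrete with a short counting loop. This is a sketch in standard C++ over a std::u16string_view, not the ICU implementation; the name `count_code_points` is hypothetical.

```cpp
#include <cstddef>
#include <string_view>

// Count code points in UTF-16 text: a lead surrogate immediately followed by
// a trail surrogate counts once; every other code unit counts once on its own.
std::size_t count_code_points(std::u16string_view s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i, ++n) {
        const bool lead = s[i] >= 0xD800 && s[i] <= 0xDBFF;
        if (lead && i + 1 < s.size() && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            ++i;   // skip the trail surrogate of a well-formed pair
        }
    }
    return n;
}
```

For example, the text `u"a\U0001F600b"` holds four code units but only three code points, because U+1F600 is stored as a surrogate pair.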

UBool hasMoreChar32Than (int32_t start, int32_t length, int32_t number)

Check if the `length` char16_t code units of the string, starting at `start`,

contain more Unicode code points than a certain number.

This is more efficient than counting all code points in this part of the string

and comparing that number with a threshold.

This function may not need to scan the string at all if the length

falls within a certain range, and

never needs to count more than 'number+1' code points.

Logically equivalent to (countChar32(start, length)>number).

A Unicode code point may occupy either one or two char16_t code units.

Parameters

start the index of the first code unit to check (0 for the entire string)
length the number of char16_t code units to check (use INT32_MAX for the entire string; remember that start/length values are pinned)
number The number of code points in the (sub)string is compared against the 'number' parameter.

Returns

Boolean value for whether the string contains more Unicode code points

than 'number'. Same as (u_countChar32(s, length)>number).
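
The shortcut described above follows from the fact that n code units hold at least ceil(n/2) and at most n code points. The following standard C++ sketch shows one way to exploit that; it is a model of the idea, not the ICU source, and the name `has_more_code_points_than` is hypothetical.

```cpp
#include <cstddef>
#include <string_view>

// Returns true if s contains more than `number` code points.
bool has_more_code_points_than(std::u16string_view s, std::size_t number) {
    const std::size_t len = s.size();
    if ((len + 1) / 2 > number) return true;   // true even if every code point used two units
    if (len <= number) return false;           // false even if every code point used one unit
    std::size_t count = 0;
    for (std::size_t i = 0; i < len; ++i) {
        const bool lead = s[i] >= 0xD800 && s[i] <= 0xDBFF;
        if (lead && i + 1 < len && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            ++i;                               // a surrogate pair is one code point
        }
        if (++count > number) return true;     // stop after number + 1 code points
    }
    return false;
}
```

Only lengths strictly between `number` and `2 * number` reach the loop, and the loop itself never counts past `number + 1`.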

UBool isEmpty ()

Determine if this string is empty.

Returns

true if this string contains 0 characters, false otherwise.

ICU 2.0

Defined at line 4862 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t getCapacity ()

Return the capacity of the internal buffer of the UnicodeString object.

This is useful together with the getBuffer functions.

See there for details.

Returns

the number of char16_ts available in the internal buffer

Defined at line 4218 of file ../../third_party/icu/default/source/common/unicode/unistr.h

int32_t hashCode ()

Generate a hash code for this object.

Returns

The hash code of this UnicodeString.

ICU 2.0

Defined at line 4224 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UBool isBogus ()

Determine if this object contains a valid string.

A bogus string has no value. It is different from an empty string,

although in both cases isEmpty() returns true and length() returns 0.

setToBogus() and isBogus() can be used to indicate that no string value is available.

For a bogus string, getBuffer() and getTerminatedBuffer() return nullptr, and

length() returns 0.

Returns

true if the string is bogus/invalid, false otherwise

Defined at line 4228 of file ../../third_party/icu/default/source/common/unicode/unistr.h
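
The distinction between a bogus string and an empty one resembles the distinction between an absent and an empty optional value. The sketch below models it with std::optional; the type `MaybeString` and its members are hypothetical and are not part of ICU.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical model: an absent value plays the role of the bogus state.
struct MaybeString {
    std::optional<std::u16string> value;

    bool is_bogus() const { return !value.has_value(); }
    bool is_empty() const { return !value || value->empty(); }   // true for bogus and for empty
    std::size_t length() const { return value ? value->size() : 0; }
    void set_to_bogus() { value.reset(); }                       // "no string value available"
    void remove() { value = std::u16string(); }                  // empty, and no longer bogus
};
```

Both states report a length of 0, so a caller that needs to tell them apart has to ask is_bogus() explicitly.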

UnicodeString & operator= (const UnicodeString & srcText)

Assignment operator. Replace the characters in this UnicodeString

with the characters from `srcText`.

Starting with ICU 2.4, the assignment operator and the copy constructor

allocate a new buffer and copy the buffer contents even for readonly aliases.

By contrast, the fastCopyFrom() function implements the old,

more efficient but less safe behavior

of making this string also a readonly alias to the same buffer.

If the source object has an "open" buffer from getBuffer(minCapacity),

then the copy is an empty string.

Parameters

srcText The text containing the characters to replace

Returns

a reference to this

ICU 2.0

UnicodeString & fastCopyFrom (const UnicodeString & src)

Almost the same as the assignment operator.

Replace the characters in this UnicodeString

with the characters from `src`.

This function works the same as the assignment operator

for all strings except for ones that are readonly aliases.

Starting with ICU 2.4, the assignment operator and the copy constructor

allocate a new buffer and copy the buffer contents even for readonly aliases.

This function implements the old, more efficient but less safe behavior

of making this string also a readonly alias to the same buffer.

The fastCopyFrom function must be used only if it is known that the lifetime of

this UnicodeString does not exceed the lifetime of the aliased buffer

including its contents, for example for strings from resource bundles

or aliases to string constants.

If the source object has an "open" buffer from getBuffer(minCapacity),

then the copy is an empty string.

Parameters

src The text containing the characters to replace.

Returns

a reference to this

ICU 2.4

UnicodeString & operator= (UnicodeString && src)

Move assignment operator; might leave src in bogus state.

This string will have the same contents and state that the source string had.

The behavior is undefined if *this and src are the same object.

Parameters

src source string

Returns

*this

ICU 56

void swap (UnicodeString & other)

Swap strings.

Parameters

other other string

ICU 56

UnicodeString & operator= (char16_t ch)

Assignment operator. Replace the characters in this UnicodeString

with the code unit `ch`.

Parameters

ch the code unit to replace

Returns

a reference to this

ICU 2.0

Defined at line 4905 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & operator= (UChar32 ch)

Assignment operator. Replace the characters in this UnicodeString

with the code point `ch`.

Parameters

ch the code point to replace

Returns

a reference to this

ICU 2.0

Defined at line 4909 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & setTo (const UnicodeString & srcText, int32_t srcStart)

Set the text in the UnicodeString object to the characters

in `srcText` in the range

[`srcStart`, `srcText.length()`).

`srcText` is not modified.

Parameters

srcText the source for the new characters
srcStart the offset into `srcText` where new characters will be obtained

Returns

a reference to this

ICU 2.2

Defined at line 4922 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & setTo (const UnicodeString & srcText, int32_t srcStart, int32_t srcLength)

Set the text in the UnicodeString object to the characters

in `srcText` in the range

[`srcStart`, `srcStart + srcLength`).

`srcText` is not modified.

Parameters

srcText the source for the new characters
srcStart the offset into `srcText` where new characters will be obtained
srcLength the number of characters in `srcText` in the replace string.

Returns

a reference to this

ICU 2.0

Defined at line 4913 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & setTo (const UnicodeString & srcText)

Set the text in the UnicodeString object to the characters in

`srcText`.

`srcText` is not modified.

Parameters

srcText the source for the new characters

Returns

a reference to this

ICU 2.0

Defined at line 4931 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & setTo (const char16_t * srcChars, int32_t srcLength)

Set the characters in the UnicodeString object to the characters

in `srcChars`. `srcChars` is not modified.

Parameters

srcChars the source for the new characters
srcLength the number of Unicode characters in srcChars.

Returns

a reference to this

ICU 2.0

Defined at line 4937 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & setTo (char16_t srcChar)

Set the characters in the UnicodeString object to the code unit

`srcChar`.

Parameters

srcChar the code unit which becomes the UnicodeString's character content

Returns

a reference to this

ICU 2.0

Defined at line 4945 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & setTo (UChar32 srcChar)

Set the characters in the UnicodeString object to the code point

`srcChar`.

Parameters

srcChar the code point which becomes the UnicodeString's character content

Returns

a reference to this

ICU 2.0

Defined at line 4952 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & setTo (UBool isTerminated, ConstChar16Ptr text, int32_t textLength)

Aliasing setTo() function, analogous to the readonly-aliasing char16_t* constructor.

The text will be used for the UnicodeString object, but

it will not be released when the UnicodeString is destroyed.

This has copy-on-write semantics:

When the string is modified, then the buffer is first copied into

newly allocated memory.

The aliased buffer is never modified.

In an assignment to another UnicodeString, when using the copy constructor

or the assignment operator, the text will be copied.

When using fastCopyFrom(), the text will be aliased again,

so that both strings then alias the same readonly-text.

Parameters

isTerminated specifies if `text` is `NUL`-terminated. This must be true if `textLength==-1`.
text The characters to alias for the UnicodeString.
textLength The number of Unicode characters in `text` to alias. If -1, then this function will determine the length by calling `u_strlen()`.

Returns

a reference to this

ICU 2.0
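
The copy-on-write behaviour of a read-only alias can be modelled in a few lines of standard C++. This is an illustrative sketch only; the class `CowString` is hypothetical and much simpler than the real UnicodeString storage.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical model of a read-only alias with copy-on-write semantics.
class CowString {
public:
    explicit CowString(std::u16string_view aliased) : view_(aliased) {}
    CowString(const CowString&) = delete;            // keep the model simple
    CowString& operator=(const CowString&) = delete;

    void append(char16_t ch) {
        if (!owns_) {              // first modification: copy into owned storage
            owned_.assign(view_);
            owns_ = true;
        }
        owned_.push_back(ch);
        view_ = owned_;            // from now on, read from the private copy
    }
    std::u16string_view text() const { return view_; }
    bool is_alias() const { return !owns_; }

private:
    std::u16string_view view_;     // what readers see
    std::u16string owned_;         // used only after the first write
    bool owns_ = false;
};
```

Until the first write the object reads directly from the caller's buffer, which is why that buffer has to stay alive and unchanged for as long as the alias exists.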

UnicodeString & setTo (char16_t * buffer, int32_t buffLength, int32_t buffCapacity)

Aliasing setTo() function, analogous to the writable-aliasing char16_t* constructor.

The text will be used for the UnicodeString object, but

it will not be released when the UnicodeString is destroyed.

This has write-through semantics:

For as long as the capacity of the buffer is sufficient, write operations

will directly affect the buffer. When more capacity is necessary, then

a new buffer will be allocated and the contents copied as with regularly

constructed strings.

In an assignment to another UnicodeString, the buffer will be copied.

The extract(Char16Ptr dst) function detects whether the dst pointer is the same

as the string buffer itself and will in this case not copy the contents.

Parameters

buffer The characters to alias for the UnicodeString.
buffLength The number of Unicode characters in `buffer` to alias.
buffCapacity The size of `buffer` in char16_ts.

Returns

a reference to this

ICU 2.0

void setToBogus ()

Make this UnicodeString object invalid.

The string will test true with isBogus().

A bogus string has no value. It is different from an empty string.

It can be used to indicate that no string value is available.

getBuffer() and getTerminatedBuffer() return nullptr, and

length() returns 0.

This utility function is used throughout the UnicodeString

implementation to indicate that a UnicodeString operation failed,

and may be used in other functions,

especially but not exclusively when such functions do not

take a UErrorCode for simplicity.

The following methods, and no others, will clear a string object's bogus flag:

- remove()

- remove(0, INT32_MAX)

- truncate(0)

- operator=() (assignment operator)

- setTo(...)

The simplest way to turn a bogus string into an empty one

is to use the remove() function.

Examples for other functions that are equivalent to "set to empty string":

Code

                                        
                                             if(s.isBogus()) {
                                               s.remove();           // set to an empty string (remove all), or
                                               s.remove(0, INT32_MAX); // set to an empty string (remove all), or
                                               s.truncate(0);        // set to an empty string (complete truncation), or
                                               s=UnicodeString();    // assign an empty string, or
                                               s.setTo((UChar32)-1); // set to a pseudo code point that is out of range, or
                                               s.setTo(u"", 0);      // set to an empty C Unicode string
                                             }
                                        
                                    
UnicodeString & setCharAt (int32_t offset, char16_t ch)

Set the character at the specified offset to the specified character.

Parameters

offset A valid offset into the text of the character to set
ch The new character

Returns

A reference to this

ICU 2.0

UnicodeString & operator+= (char16_t ch)

Append operator. Append the code unit `ch` to the UnicodeString

object.

Parameters

ch the code unit to be appended

Returns

a reference to this

ICU 2.0

Defined at line 4984 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & operator+= (UChar32 ch)

Append operator. Append the code point `ch` to the UnicodeString

object.

Parameters

ch the code point to be appended

Returns

a reference to this

ICU 2.0

Defined at line 4988 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & operator+= (const UnicodeString & srcText)

Append operator. Append the characters in `srcText` to the

UnicodeString object. `srcText` is not modified.

Parameters

srcText the source for the new characters

Returns

a reference to this

ICU 2.0

Defined at line 4993 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & append (const UnicodeString & srcText, int32_t srcStart, int32_t srcLength)

Append the characters

in `srcText` in the range

[`srcStart`, `srcStart + srcLength`) to the

end of the UnicodeString object. `srcText`

is not modified.

Parameters

srcText the source for the new characters
srcStart the offset into `srcText` where new characters will be obtained
srcLength the number of characters in `srcText` in the append string

Returns

a reference to this

ICU 2.0

Defined at line 4959 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & append (const UnicodeString & srcText)

Append the characters in `srcText` to the UnicodeString object.

`srcText` is not modified.

Parameters

srcText the source for the new characters

Returns

a reference to this

ICU 2.0

Defined at line 4965 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & append (const char16_t * srcChars, int32_t srcStart, int32_t srcLength)

Append the characters in `srcChars` in the range

[`srcStart`, `srcStart + srcLength`) to the end of the UnicodeString

object. `srcChars` is not modified.

Parameters

srcChars the source for the new characters
srcStart the offset into `srcChars` where new characters will be obtained
srcLength the number of characters in `srcChars` in the append string; can be -1 if `srcChars` is NUL-terminated

Returns

a reference to this

ICU 2.0

Defined at line 4969 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & append (ConstChar16Ptr srcChars, int32_t srcLength)

Append the characters in `srcChars` to the UnicodeString object.

`srcChars` is not modified.

Parameters

srcChars the source for the new characters
srcLength the number of Unicode characters in `srcChars`; can be -1 if `srcChars` is NUL-terminated

Returns

a reference to this

ICU 2.0

Defined at line 4975 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & append (char16_t srcChar)

Append the code unit `srcChar` to the UnicodeString object.

Parameters

srcChar the code unit to append

Returns

a reference to this

ICU 2.0

Defined at line 4980 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & append (UChar32 srcChar)

Append the code point `srcChar` to the UnicodeString object.

Parameters

srcChar the code point to append

Returns

a reference to this

ICU 2.0
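
Appending a code point differs from appending a code unit because a code point above U+FFFF needs two char16_t units. A minimal sketch of that encoding step in standard C++ follows; the function `append_code_point` is hypothetical, and rejecting values above U+10FFFF is an assumption of the sketch rather than documented ICU behaviour.

```cpp
#include <string>

// Append one Unicode code point as UTF-16: one unit for U+0000..U+FFFF,
// a lead/trail surrogate pair for U+10000..U+10FFFF.
// Returns false and appends nothing for values above U+10FFFF (sketch assumption).
bool append_code_point(std::u16string& s, char32_t c) {
    if (c > 0x10FFFF) return false;
    if (c <= 0xFFFF) {
        s.push_back(static_cast<char16_t>(c));
    } else {
        c -= 0x10000;                                               // 20 significant bits remain
        s.push_back(static_cast<char16_t>(0xD800 + (c >> 10)));     // lead: high 10 bits
        s.push_back(static_cast<char16_t>(0xDC00 + (c & 0x3FF)));   // trail: low 10 bits
    }
    return true;
}
```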

UnicodeString & insert (int32_t start, const UnicodeString & srcText, int32_t srcStart, int32_t srcLength)

Insert the characters in `srcText` in the range

[`srcStart`, `srcStart + srcLength`) into the UnicodeString

object at offset `start`. `srcText` is not modified.

Parameters

start the offset where the insertion begins
srcText the source for the new characters
srcStart the offset into `srcText` where new characters will be obtained
srcLength the number of characters in `srcText` in the insert string

Returns

a reference to this

ICU 2.0

Defined at line 4997 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & insert (int32_t start, const UnicodeString & srcText)

Insert the characters in `srcText` into the UnicodeString object

at offset `start`. `srcText` is not modified.

Parameters

start the offset where the insertion begins
srcText the source for the new characters

Returns

a reference to this

ICU 2.0

Defined at line 5004 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & insert (int32_t start, const char16_t * srcChars, int32_t srcStart, int32_t srcLength)

Insert the characters in `srcChars` in the range

[`srcStart`, `srcStart + srcLength`) into the UnicodeString

object at offset `start`. `srcChars` is not modified.

Parameters

start the offset at which the insertion begins
srcChars the source for the new characters
srcStart the offset into `srcChars` where new characters will be obtained
srcLength the number of characters in `srcChars` in the insert string

Returns

a reference to this

ICU 2.0

Defined at line 5009 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & insert (int32_t start, ConstChar16Ptr srcChars, int32_t srcLength)

Insert the characters in `srcChars` into the UnicodeString object

at offset `start`. `srcChars` is not modified.

Parameters

start the offset where the insertion begins
srcChars the source for the new characters
srcLength the number of Unicode characters in srcChars.

Returns

a reference to this

ICU 2.0

Defined at line 5016 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & insert (int32_t start, char16_t srcChar)

Insert the code unit `srcChar` into the UnicodeString object at

offset `start`.

Parameters

start the offset at which the insertion occurs
srcChar the code unit to insert

Returns

a reference to this

ICU 2.0

Defined at line 5022 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & insert (int32_t start, UChar32 srcChar)

Insert the code point `srcChar` into the UnicodeString object at

offset `start`.

Parameters

start the offset at which the insertion occurs
srcChar the code point to insert

Returns

a reference to this

ICU 2.0

Defined at line 5027 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & replace (int32_t start, int32_t length, const UnicodeString & srcText, int32_t srcStart, int32_t srcLength)

Replace the characters in the range

[`start`, `start + length`) with the characters in

`srcText` in the range

[`srcStart`, `srcStart + srcLength`).

`srcText` is not modified.

Parameters

start the offset at which the replace operation begins
length the number of characters to replace. The character at `start + length` is not modified.
srcText the source for the new characters
srcStart the offset into `srcText` where new characters will be obtained
srcLength the number of characters in `srcText` in the replace string

Returns

a reference to this

ICU 2.0

Defined at line 4735 of file ../../third_party/icu/default/source/common/unicode/unistr.h
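
The (start, length) range semantics of replace() map closely onto std::u16string::replace. The sketch below is a standard C++ model, not ICU code; the helpers `pin_range` and `replace_range` are hypothetical, and pinning out-of-range values into the string is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Pin (start, length) so that [start, start + length) lies inside a string of `size` units.
static void pin_range(std::size_t size, std::size_t& start, std::size_t& length) {
    start = std::min(start, size);
    length = std::min(length, size - start);
}

// Replace [start, start + length) of s with [srcStart, srcStart + srcLength) of src.
std::u16string& replace_range(std::u16string& s, std::size_t start, std::size_t length,
                              const std::u16string& src,
                              std::size_t srcStart, std::size_t srcLength) {
    pin_range(s.size(), start, length);
    pin_range(src.size(), srcStart, srcLength);
    return s.replace(start, length, src, srcStart, srcLength);
}
```

The replaced range and the replacement may have different lengths, so the string can grow or shrink; the code unit at `start + length` and everything after it is kept.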

UnicodeString & replace (int32_t start, int32_t length, const UnicodeString & srcText)

Replace the characters in the range

[`start`, `start + length`)

with the characters in `srcText`. `srcText` is

not modified.

Parameters

start the offset at which the replace operation begins
length the number of characters to replace. The character at `start + length` is not modified.
srcText the source for the new characters

Returns

a reference to this

ICU 2.0

Defined at line 4729 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & replace (int32_t start, int32_t length, const char16_t * srcChars, int32_t srcStart, int32_t srcLength)

Replace the characters in the range

[`start`, `start + length`) with the characters in

`srcChars` in the range

[`srcStart`, `srcStart + srcLength`). `srcChars`

is not modified.

Parameters

start the offset at which the replace operation begins
length the number of characters to replace. The character at `start + length` is not modified.
srcChars the source for the new characters
srcStart the offset into `srcChars` where new characters will be obtained
srcLength the number of characters in `srcChars` in the replace string

Returns

a reference to this

ICU 2.0

Defined at line 4750 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & replace (int32_t start, int32_t length, ConstChar16Ptr srcChars, int32_t srcLength)

Replace the characters in the range

[`start`, `start + length`) with the characters in

`srcChars`. `srcChars` is not modified.

Parameters

start the offset at which the replace operation begins
length number of characters to replace. The character at `start + length` is not modified.
srcChars the source for the new characters
srcLength the number of Unicode characters in srcChars

Returns

a reference to this

ICU 2.0

Defined at line 4743 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & replace (int32_t start, int32_t length, char16_t srcChar)

Replace the characters in the range

[`start`, `start + length`) with the code unit

`srcChar`.

Parameters

start the offset at which the replace operation begins
length the number of characters to replace. The character at `start + length` is not modified.
srcChar the new code unit

Returns

a reference to this

ICU 2.0

Defined at line 4758 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & replace (int32_t start, int32_t length, UChar32 srcChar)

Replace the characters in the range

[`start`, `start + length`) with the code point

`srcChar`.

Parameters

start the offset at which the replace operation begins
length the number of characters to replace. The character at `start + length` is not modified.
srcChar the new code point

Returns

a reference to this

ICU 2.0

UnicodeString & replaceBetween (int32_t start, int32_t limit, const UnicodeString & srcText)

Replace the characters in the range [`start`, `limit`)

with the characters in `srcText`. `srcText` is not modified.

Parameters

start the offset at which the replace operation begins
limit the offset immediately following the replace range
srcText the source for the new characters

Returns

a reference to this

ICU 2.0

Defined at line 4764 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & replaceBetween (int32_t start, int32_t limit, const UnicodeString & srcText, int32_t srcStart, int32_t srcLimit)

Replace the characters in the range [`start`, `limit`)

with the characters in `srcText` in the range

[`srcStart`, `srcLimit`). `srcText` is not modified.

Parameters

start the offset at which the replace operation begins
limit the offset immediately following the replace range
srcText the source for the new characters
srcStart the offset into `srcText` where new characters will be obtained
srcLimit the offset immediately following the range to copy in `srcText`

Returns

a reference to this

ICU 2.0

Defined at line 4770 of file ../../third_party/icu/default/source/common/unicode/unistr.h

void handleReplaceBetween (int32_t start, int32_t limit, const UnicodeString & text)

Replace a substring of this object with the given text.

Parameters

start the beginning index, inclusive; `0 <= start <= limit`.
limit the ending index, exclusive; `start <= limit <= length()`.
text the text to replace characters `start` to `limit - 1`

ICU 2.0

UBool hasMetaData ()

Replaceable API

Returns

true if it has MetaData

ICU 2.4

void copy (int32_t start, int32_t limit, int32_t dest)

Copy a substring of this object, retaining attribute (out-of-band)

information. This method is used to duplicate or reorder substrings.

The destination index must not overlap the source range.

Parameters

start the beginning index, inclusive; `0 <= start <= limit`.
limit the ending index, exclusive; `start <= limit <= length()`.
dest the destination index. The characters from `start..limit-1` will be copied to `dest`. Implementations of this method may assume that `dest <= start || dest >= limit`.

ICU 2.0

UnicodeString & findAndReplace (const UnicodeString & oldText, const UnicodeString & newText)

Replace all occurrences of the text in oldText with the text

in newText.

Parameters

oldText the text containing the search text
newText the text containing the replacement text

Returns

a reference to this

ICU 2.0

Defined at line 4778 of file ../../third_party/icu/default/source/common/unicode/unistr.h
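
Replacing every occurrence of a search text can be sketched with std::u16string as below. This models the general technique, not the ICU implementation; the name `find_and_replace` is hypothetical, and leaving the string unchanged for an empty search text is an assumption of the sketch.

```cpp
#include <cstddef>
#include <string>

// Replace every non-overlapping occurrence of oldText in s with newText.
std::u16string& find_and_replace(std::u16string& s, const std::u16string& oldText,
                                 const std::u16string& newText) {
    if (oldText.empty()) return s;     // nothing to search for (sketch assumption)
    std::size_t pos = 0;
    while ((pos = s.find(oldText, pos)) != std::u16string::npos) {
        s.replace(pos, oldText.size(), newText);
        pos += newText.size();         // resume after the inserted text so it is not rescanned
    }
    return s;
}
```

Resuming the search after the inserted text keeps the loop finite even when newText itself contains oldText.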

UnicodeString & findAndReplace (int32_t start, int32_t length, const UnicodeString & oldText, const UnicodeString & newText)

Replace all occurrences of the text in oldText with the text

in newText

in the range [`start`, `start + length`).

Parameters

start the start of the range in which replace will be performed
length the length of the range in which replace will be performed
oldText the text containing the search text
newText the text containing the replacement text

Returns

a reference to this

ICU 2.0

Defined at line 4784 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & findAndReplace (int32_t start, int32_t length, const UnicodeString & oldText, int32_t oldStart, int32_t oldLength, const UnicodeString & newText, int32_t newStart, int32_t newLength)

Replace all occurrences of characters in oldText in the range

[`oldStart`, `oldStart + oldLength`) with the characters

in newText in the range

[`newStart`, `newStart + newLength`)

in the range [`start`, `start + length`).

Parameters

start the start of the range in which replace will be performed
length the length of the range in which replace will be performed
oldText the text containing the search text
oldStart the start of the search range in `oldText`
oldLength the length of the search range in `oldText`
newText the text containing the replacement text
newStart the start of the replacement range in `newText`
newLength the length of the replacement range in `newText`

Returns

a reference to this

ICU 2.0

UnicodeString & remove ()

Removes all characters from the UnicodeString object and clears the bogus flag.

This is the UnicodeString equivalent of std::string’s clear().

Returns

a reference to this

Defined at line 5033 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & remove (int32_t start, int32_t length)

Remove the characters in the range

[`start`, `start + length`) from the UnicodeString object.

Parameters

start the offset of the first character to remove
length the number of characters to remove

Returns

a reference to this

ICU 2.0

Defined at line 5045 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & removeBetween (int32_t start, int32_t limit)

Remove the characters in the range

[`start`, `limit`) from the UnicodeString object.

Parameters

start the offset of the first character to remove
limit the offset immediately following the range to remove

Returns

a reference to this

ICU 2.0

Defined at line 5056 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & retainBetween (int32_t start, int32_t limit)

Retain only the characters in the range

[`start`, `limit`) from the UnicodeString object.

Removes characters before `start` and at and after `limit`.

Parameters

start the offset of the first character to retain
limit the offset immediately following the range to retain

Returns

a reference to this

ICU 4.4

Defined at line 5061 of file ../../third_party/icu/default/source/common/unicode/unistr.h
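
Retaining a range amounts to two removals: everything at and after `limit`, then everything before `start`. A sketch of that in standard C++ follows; the function `retain_between` is hypothetical, and clamping the indices is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Keep only the code units in [start, limit).
std::u16string& retain_between(std::u16string& s, std::size_t start, std::size_t limit) {
    limit = std::min(limit, s.size());   // clamp limit into the string
    start = std::min(start, limit);      // an inverted range retains nothing
    s.erase(limit);                      // drop everything at and after limit
    s.erase(0, start);                   // then drop everything before start
    return s;
}
```

Erasing the tail first means the second erase can still use the original `start` offset.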

UBool padLeading (int32_t targetLength, char16_t padChar)

Pad the start of this UnicodeString with the character `padChar`.

If the length of this UnicodeString is less than targetLength,

targetLength - length() copies of padChar will be added to the

beginning of this UnicodeString.

Parameters

targetLength the desired length of the string
padChar the character to use for padding. Defaults to space (U+0020)

Returns

true if the text was padded, false otherwise.

ICU 2.0
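
The padding arithmetic can be checked with a small model in standard C++. The function `pad_leading` is hypothetical and operates on a std::u16string; the number of pad characters added is the shortfall, targetLength minus the current length.

```cpp
#include <cstddef>
#include <string>

// Pad the start of s with padChar up to targetLength code units.
// Returns true if any padding was added, false if s was already long enough.
bool pad_leading(std::u16string& s, std::size_t targetLength, char16_t padChar = u' ') {
    if (s.size() >= targetLength) return false;
    s.insert(std::size_t{0}, targetLength - s.size(), padChar);   // insert the shortfall at the front
    return true;
}
```

Padding "42" to a target length of 5 with u'0' adds three characters and yields "00042"; a string that already has five or more code units is left untouched.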

UBool padTrailing (int32_t targetLength, char16_t padChar)

Pad the end of this UnicodeString with the character `padChar`.

If the length of this UnicodeString is less than targetLength,

targetLength - length() copies of padChar will be added to the

end of this UnicodeString.

Parameters

targetLength the desired length of the string
padChar the character to use for padding. Defaults to space (U+0020)

Returns

true if the text was padded, false otherwise.

ICU 2.0

UBool truncate (int32_t targetLength)

Truncate this UnicodeString to the `targetLength`.

Parameters

targetLength the desired length of this UnicodeString.

Returns

true if the text was truncated, false otherwise

ICU 2.0

Defined at line 5067 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & trim ()

Trims leading and trailing whitespace from this UnicodeString.

Returns

a reference to this

ICU 2.0

UnicodeString & reverse ()

Reverse this UnicodeString in place.

Returns

a reference to this

ICU 2.0

Defined at line 5082 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & reverse (int32_t start, int32_t length)

Reverse the range [`start`, `start + length`) in

this UnicodeString.

Parameters

start the start of the range to reverse
length the number of characters to reverse

Returns

a reference to this

ICU 2.0

Defined at line 5086 of file ../../third_party/icu/default/source/common/unicode/unistr.h

UnicodeString & toUpper ()

Convert the characters in this to UPPER CASE following the conventions of

the default locale.

Returns

A reference to this.

ICU 2.0

UnicodeString & toUpper (const Locale & locale)

Convert the characters in this to UPPER CASE following the conventions of

a specific locale.

Parameters

locale The locale containing the conventions to use.

Returns

A reference to this.

ICU 2.0

UnicodeString & toLower ()

Convert the characters in this to lower case following the conventions of

the default locale.

Returns

A reference to this.

ICU 2.0

UnicodeString & toLower (const Locale & locale)

Convert the characters in this to lower case following the conventions of

a specific locale.

Parameters

locale The locale containing the conventions to use.

Returns

A reference to this.

ICU 2.0

UnicodeString & toTitle (BreakIterator * titleIter)

Titlecase this string, convenience function using the default locale.

Casing is locale-dependent and context-sensitive.

Titlecasing uses a break iterator to find the first characters of words

that are to be titlecased. It titlecases those characters and lowercases

all others.

The titlecase break iterator can be provided to customize for arbitrary

styles, using rules and dictionaries beyond the standard iterators.

It may be more efficient to always provide an iterator to avoid

opening and closing one for each string.

If the break iterator passed in is null, the default Unicode algorithm

will be used to determine the titlecase positions.

This function uses only the setText(), first() and next() methods of the

provided break iterator.

Parameters

titleIter A break iterator to find the first characters of words that are to be titlecased. If none is provided (0), then a standard titlecase break iterator is opened. Otherwise the provided iterator is set to the string's text.

Returns

A reference to this.

ICU 2.1

UnicodeString & toTitle (BreakIterator * titleIter, const Locale & locale)

Titlecase this string.

Casing is locale-dependent and context-sensitive.

Titlecasing uses a break iterator to find the first characters of words

that are to be titlecased. It titlecases those characters and lowercases

all others.

The titlecase break iterator can be provided to customize for arbitrary

styles, using rules and dictionaries beyond the standard iterators.

It may be more efficient to always provide an iterator to avoid

opening and closing one for each string.

If the break iterator passed in is null, the default Unicode algorithm

will be used to determine the titlecase positions.

This function uses only the setText(), first() and next() methods of the

provided break iterator.

Parameters

titleIter A break iterator to find the first characters of words that are to be titlecased. If none is provided (0), then a standard titlecase break iterator is opened. Otherwise the provided iterator is set to the string's text.
locale The locale to consider.

Returns

A reference to this.

ICU 2.1

UnicodeString & toTitle (BreakIterator * titleIter, const Locale & locale, uint32_t options)

Titlecase this string, with options.

Casing is locale-dependent and context-sensitive.

Titlecasing uses a break iterator to find the first characters of words

that are to be titlecased. It titlecases those characters and lowercases

all others. (This can be modified with options.)

The titlecase break iterator can be provided to customize for arbitrary

styles, using rules and dictionaries beyond the standard iterators.

It may be more efficient to always provide an iterator to avoid

opening and closing one for each string.

If the break iterator passed in is null, the default Unicode algorithm

will be used to determine the titlecase positions.

This function uses only the setText(), first() and next() methods of the

provided break iterator.

Parameters

titleIter A break iterator to find the first characters of words that are to be titlecased. If none is provided (0), then a standard titlecase break iterator is opened. Otherwise the provided iterator is set to the string's text.
locale The locale to consider.
options Options bit set, usually 0. See U_TITLECASE_NO_LOWERCASE, U_TITLECASE_NO_BREAK_ADJUSTMENT, U_TITLECASE_ADJUST_TO_CASED, U_TITLECASE_WHOLE_STRING, U_TITLECASE_SENTENCES.

Returns

A reference to this.

ICU 3.8
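The basic rule shared by the toTitle() overloads (titlecase the first character of each word, lowercase the rest unless an option says otherwise) can be sketched without ICU. The following is a deliberately simplified, ASCII-only model that assumes words are separated by whitespace. It is not ICU's implementation, which relies on a BreakIterator and locale-sensitive Unicode case mappings. The name `titlecaseAscii` and the `noLowercase` flag (loosely modeled on U_TITLECASE_NO_LOWERCASE) are hypothetical.

```cpp
#include <cctype>
#include <string>

// Simplified model: ASCII only, whitespace-delimited words.
// noLowercase == true leaves the non-initial characters of each word untouched.
std::string titlecaseAscii(const std::string& text, bool noLowercase = false) {
    std::string result;
    result.reserve(text.size());
    bool atWordStart = true;
    for (unsigned char ch : text) {
        if (std::isspace(ch)) {
            // Whitespace is copied as-is and starts a new word.
            result.push_back(static_cast<char>(ch));
            atWordStart = true;
        } else if (atWordStart) {
            // First character of a word: uppercase stands in for titlecase here.
            result.push_back(static_cast<char>(std::toupper(ch)));
            atWordStart = false;
        } else {
            // Remaining characters: lowercase unless the flag disables it.
            result.push_back(static_cast<char>(noLowercase ? ch : std::tolower(ch)));
        }
    }
    return result;
}
```

In this model, `titlecaseAscii("hELLO wORLD")` yields "Hello World", while passing `true` for `noLowercase` yields "HELLO WORLD".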

UnicodeString & foldCase (uint32_t options)

Case-folds the characters in this string.

Case-folding is locale-independent and not context-sensitive,

but there is an option for whether to include or exclude mappings for dotted I

and dotless i that are marked with 'T' in CaseFolding.txt.

The result may be longer or shorter than the original.

Parameters

options Either U_FOLD_CASE_DEFAULT or U_FOLD_CASE_EXCLUDE_SPECIAL_I

Returns

A reference to this.

ICU 2.0
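The effect of the two option values on dotted I and dotless i can be illustrated with a small stand-alone sketch. It assumes the full case-folding mappings published in CaseFolding.txt for U+0049 and U+0130 and models nothing else; the function name `foldSpecialI` and its boolean parameter (standing in for U_FOLD_CASE_EXCLUDE_SPECIAL_I) are hypothetical, and this is not how ICU implements foldCase().

```cpp
#include <string>

// Folds only the two code points affected by the Turkic ('T') mappings.
// Every other code unit is returned unchanged in this sketch.
std::u16string foldSpecialI(char16_t c, bool excludeSpecialI) {
    switch (c) {
    case u'I':  // U+0049 LATIN CAPITAL LETTER I
        // Default: i (U+0069). With the 'T' mappings: dotless i (U+0131).
        return excludeSpecialI ? std::u16string(1, u'\u0131')
                               : std::u16string(1, u'i');
    case u'\u0130':  // U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE
        // Default: i followed by U+0307 (two code units, so the result grows).
        // With the 'T' mappings: plain i (U+0069).
        return excludeSpecialI ? std::u16string(1, u'i')
                               : std::u16string{u'i', u'\u0307'};
    default:
        return std::u16string(1, c);
    }
}
```

The default folding of U+0130 shows why the result may be longer than the original: one code unit becomes two.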

char16_t * getBuffer (int32_t minCapacity)

Get a read/write pointer to the internal buffer.

The buffer is guaranteed to be large enough for at least minCapacity char16_ts,

writable, and is still owned by the UnicodeString object.

Calls to getBuffer(minCapacity) must not be nested, and

must be matched with calls to releaseBuffer(newLength).

If the string buffer was read-only or shared,

then it will be reallocated and copied.

An attempted nested call will return 0, and will not further modify the

state of the UnicodeString object.

It also returns 0 if the string is bogus.

The actual capacity of the string buffer may be larger than minCapacity.

getCapacity() returns the actual capacity.

For many operations, the full capacity should be used to avoid reallocations.

While the buffer is "open" between getBuffer(minCapacity)

and releaseBuffer(newLength), the following applies:

- The string length is set to 0.

- Any read API call on the UnicodeString object will behave like on a 0-length string.

- Any write API call on the UnicodeString object is disallowed and will have no effect.

- You can read from and write to the returned buffer.

- The previous string contents will still be in the buffer;

if you want to use it, then you need to call length() before getBuffer(minCapacity).

If the length() was greater than minCapacity, then any contents after minCapacity

may be lost.

The buffer contents is not NUL-terminated by getBuffer().

If length() < getCapacity() then you can terminate it by writing a NUL

at index length().

- You must call releaseBuffer(newLength) in order to

return to normal UnicodeString operation.

Parameters

minCapacity the minimum number of char16_ts that are to be available in the buffer, starting at the returned pointer; defaults to the current string capacity if minCapacity==-1

Returns

a writable pointer to the internal string buffer,

or nullptr if an error occurs (nested calls, out of memory)

void releaseBuffer (int32_t newLength)

Release a read/write buffer on a UnicodeString object with an

"open" getBuffer(minCapacity).

This function must be called in a matched pair with getBuffer(minCapacity).

releaseBuffer(newLength) must be called if and only if a getBuffer(minCapacity) is "open".

It will set the string length to newLength, at most to the current capacity.

If newLength==-1 then it will set the length according to the

first NUL in the buffer, or to the capacity if there is no NUL.

After calling releaseBuffer(newLength) the UnicodeString is back to normal operation.

Parameters

newLength the new length of the UnicodeString object; defaults to the current capacity if newLength is greater than that; if newLength==-1, it defaults to u_strlen(buffer) but not more than the current capacity of the string
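The length rule for releaseBuffer(newLength) can be written down as a small stand-alone function. This is a sketch of the rule as documented above, not ICU code: `lengthAfterRelease` is a hypothetical name, and `buffer` and `capacity` stand in for the pointer returned by getBuffer(minCapacity) and the value of getCapacity(). Negative values other than -1 are not modeled.

```cpp
#include <cstdint>

// Returns the string length that results from the documented rule.
// buffer must point to at least `capacity` readable char16_t units.
std::int32_t lengthAfterRelease(const char16_t* buffer,
                                std::int32_t capacity,
                                std::int32_t newLength) {
    if (newLength == -1) {
        // Use the position of the first NUL, but never scan past the capacity.
        std::int32_t n = 0;
        while (n < capacity && buffer[n] != 0) {
            ++n;
        }
        return n;
    }
    // Otherwise the requested length is clamped to the capacity.
    return newLength > capacity ? capacity : newLength;
}
```

For a four-unit buffer holding 'a', 'b', NUL, 'c', passing -1 gives a length of 2; for a buffer without any NUL, -1 gives the full capacity.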
const char16_t * getBuffer ()

Get a read-only pointer to the internal buffer.

This can be called at any time on a valid UnicodeString.

It returns 0 if the string is bogus, or

during an "open" getBuffer(minCapacity).

It can be called as many times as desired.

The pointer that it returns will remain valid until the UnicodeString object is modified,

at which time the pointer is semantically invalidated and must not be used any more.

The capacity of the buffer can be determined with getCapacity().

The part after length() may or may not be initialized and valid,

depending on the history of the UnicodeString object.

The buffer contents is (probably) not NUL-terminated.

You can check if it is with

`(s.length() < s.getCapacity() && buffer[s.length()]==0)`.

(See getTerminatedBuffer().)

The buffer may reside in read-only memory. Its contents must not

be modified.

Returns

a read-only pointer to the internal string buffer,

or nullptr if the string is empty or bogus

Defined at line 4244 of file ../../third_party/icu/default/source/common/unicode/unistr.h
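The NUL-termination check mentioned for the read-only getBuffer() can be wrapped in a helper. The sketch below is stand-alone and takes the three values as plain arguments; `isNulTerminated` is a hypothetical name, and `buffer`, `length` and `capacity` stand in for s.getBuffer(), s.length() and s.getCapacity(). The extra null-pointer test is an assumption added here because getBuffer() can return a null pointer.

```cpp
#include <cstdint>

// True if the unit just past the string contents exists and is NUL.
// The capacity comparison guards the read of buffer[length].
bool isNulTerminated(const char16_t* buffer,
                     std::int32_t length,
                     std::int32_t capacity) {
    return buffer != nullptr && length < capacity && buffer[length] == 0;
}
```

When the check fails and a terminated buffer is required, getTerminatedBuffer() is the documented alternative.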

const char16_t * getTerminatedBuffer ()

Get a read-only pointer to the internal buffer,

making sure that it is NUL-terminated.

This can be called at any time on a valid UnicodeString.

It returns 0 if the string is bogus, or

during an "open" getBuffer(minCapacity), or if the buffer cannot

be NUL-terminated (because memory allocation failed).

It can be called as many times as desired.

The pointer that it returns will remain valid until the UnicodeString object is modified,

at which time the pointer is semantically invalidated and must not be used any more.

The capacity of the buffer can be determined with getCapacity().

The part after length()+1 may or may not be initialized and valid,

depending on the history of the UnicodeString object.

The buffer contents is guaranteed to be NUL-terminated.

getTerminatedBuffer() may reallocate the buffer if a terminating NUL

is written.

For this reason, this function is not const, unlike getBuffer().

Note that a UnicodeString may also contain NUL characters as part of its contents.

The buffer may reside in read-only memory. Its contents must not

be modified.

Returns

a read-only pointer to the internal string buffer,

or 0 if the string is empty or bogus

void UnicodeString ()

Construct an empty UnicodeString.

ICU 2.0

Defined at line 4181 of file ../../third_party/icu/default/source/common/unicode/unistr.h

void UnicodeString (int32_t capacity, UChar32 c, int32_t count)

Construct a UnicodeString with capacity to hold `capacity` char16_ts

Parameters

capacity the number of char16_ts this UnicodeString should hold before a resize is necessary; if count is greater than 0 and count code points c take up more space than capacity, then capacity is adjusted accordingly.
c is used to initially fill the string
count specifies how many code points c are to be written in the string

ICU 2.0
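The capacity adjustment described for this constructor follows from UTF-16 encoding: a code point above U+FFFF needs two char16_ts. The stand-alone sketch below illustrates that arithmetic only. `unitsForFill` and `effectiveCapacity` are hypothetical names, `c` is assumed to be a valid code point, and "adjusted accordingly" is read here as "raised to at least the number of units needed", which is an assumption.

```cpp
#include <algorithm>
#include <cstdint>

// Number of char16_t code units needed for `count` copies of code point `c`.
// Code points above U+FFFF take a surrogate pair (two units) in UTF-16.
std::int32_t unitsForFill(char32_t c, std::int32_t count) {
    if (count <= 0) {
        return 0;  // nothing is written
    }
    const std::int32_t unitsPerCodePoint = (c > 0xFFFF) ? 2 : 1;
    return count * unitsPerCodePoint;
}

// Capacity after the adjustment: never smaller than what the fill requires.
std::int32_t effectiveCapacity(std::int32_t capacity, char32_t c, std::int32_t count) {
    return std::max(capacity, unitsForFill(c, count));
}
```

For example, three copies of U+1F600 need six code units, so a requested capacity of 4 would be raised to 6 in this model.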
void UnicodeString (char16_t ch)

Single char16_t (code unit) constructor.

It is recommended to mark this constructor "explicit" by

`-DUNISTR_FROM_CHAR_EXPLICIT=explicit`

on the compiler command line or similar.

Parameters

ch the character to place in the UnicodeString

ICU 2.0

void UnicodeString (UChar32 ch)

Single UChar32 (code point) constructor.

It is recommended to mark this constructor "explicit" by

`-DUNISTR_FROM_CHAR_EXPLICIT=explicit`

on the compiler command line or similar.

Parameters

ch the character to place in the UnicodeString

ICU 2.0

void UnicodeString (const std::nullptr_t text)

nullptr_t constructor.

Effectively the same as the default constructor, makes an empty string object.

It is recommended to mark this constructor "explicit" by

`-DUNISTR_FROM_STRING_EXPLICIT=explicit`

on the compiler command line or similar.

Parameters

text nullptr

ICU 59

Defined at line 4186 of file ../../third_party/icu/default/source/common/unicode/unistr.h

void UnicodeString (const char16_t * text, int32_t textLength)

char16_t* constructor.

Note, for string literals:

Since C++17 and ICU 76, you can use UTF-16 string literals with compile-time

length determination:

Parameters

text The characters to place in the UnicodeString.
textLength The number of Unicode characters in `text` to copy.

ICU 2.0

Code

                                        
                                             UnicodeString str(u"literal");
                                             if (str == u"other literal") { ... }
                                        
                                    
void UnicodeString (const std::nullptr_t text, int32_t textLength)

nullptr_t constructor.

Effectively the same as the default constructor, makes an empty string object.

Parameters

text nullptr
textLength ignored

ICU 59

Defined at line 4190 of file ../../third_party/icu/default/source/common/unicode/unistr.h

void UnicodeString (UBool isTerminated, ConstChar16Ptr text, int32_t textLength)

Readonly-aliasing char16_t* constructor.

The text will be used for the UnicodeString object, but

it will not be released when the UnicodeString is destroyed.

This has copy-on-write semantics:

When the string is modified, then the buffer is first copied into

newly allocated memory.

The aliased buffer is never modified.

In an assignment to another UnicodeString, when using the copy constructor

or the assignment operator, the text will be copied.

When using fastCopyFrom(), the text will be aliased again,

so that both strings then alias the same readonly-text.

Note, for string literals:

Since C++17 and ICU 76, you can use UTF-16 string literals with compile-time

length determination:

Parameters

isTerminated specifies if `text` is `NUL`-terminated. This must be true if `textLength==-1`.
text The characters to alias for the UnicodeString.
textLength The number of Unicode characters in `text` to alias. If -1, then this constructor will determine the length by calling `u_strlen()`.

ICU 2.0

Code

                                        
                                             UnicodeString alias = UnicodeString::readOnlyAlias(u"literal");
                                             if (alias == u"other literal") { ... }
                                        
                                    
void UnicodeString (char16_t * buffer, int32_t buffLength, int32_t buffCapacity)

Writable-aliasing char16_t* constructor.

The text will be used for the UnicodeString object, but

it will not be released when the UnicodeString is destroyed.

This has write-through semantics:

For as long as the capacity of the buffer is sufficient, write operations

will directly affect the buffer. When more capacity is necessary, then

a new buffer will be allocated and the contents copied as with regularly

constructed strings.

In an assignment to another UnicodeString, the buffer will be copied.

The extract(Char16Ptr dst) function detects whether the dst pointer is the same

as the string buffer itself and will in this case not copy the contents.

Parameters

buffer The characters to alias for the UnicodeString.
buffLength The number of Unicode characters in `buffer` to alias.
buffCapacity The size of `buffer` in char16_ts.

ICU 2.0

void UnicodeString (std::nullptr_t buffer, int32_t buffLength, int32_t buffCapacity)

Writable-aliasing nullptr_t constructor.

Effectively the same as the default constructor, makes an empty string object.

Parameters

buffer nullptr
buffLength ignored
buffCapacity ignored

ICU 59

Defined at line 4194 of file ../../third_party/icu/default/source/common/unicode/unistr.h

void UnicodeString (const char * codepageData)

char* constructor.

Uses the default converter (and thus depends on the ICU conversion code)

unless U_CHARSET_IS_UTF8 is set to 1.

For ASCII (really "invariant character") strings it is more efficient to use

the constructor that takes a US_INV (for its enum EInvariant).

Note, for string literals:

Since C++17 and ICU 76, you can use UTF-16 string literals with compile-time

length determination:

It is recommended to mark this constructor "explicit" by

`-DUNISTR_FROM_STRING_EXPLICIT=explicit`

on the compiler command line or similar.

Parameters

codepageData an array of bytes, null-terminated, in the platform's default codepage.

ICU 2.0

Code

                                        
                                             UnicodeString str(u"literal");
                                             if (str == u"other literal") { ... }
                                        
                                    
void UnicodeString (const char * codepageData, int32_t dataLength)

char* constructor.

Uses the default converter (and thus depends on the ICU conversion code)

unless U_CHARSET_IS_UTF8 is set to 1.

Parameters

codepageData an array of bytes in the platform's default codepage.
dataLength The number of bytes in `codepageData`.

ICU 2.0

void UnicodeString (const char * codepageData, const char * codepage)

char* constructor.

If `codepage` is an empty string (`""`),

then a simple conversion is performed on the codepage-invariant

subset ("invariant characters") of the platform encoding. See utypes.h.

Recommendation: For invariant-character strings use the constructor

UnicodeString(const char *src, int32_t length, enum EInvariant inv)

because it avoids object code dependencies of UnicodeString on

the conversion code.

ICU 2.0

Parameters

codepageData an array of bytes, null-terminated
codepage the encoding of `codepageData`. The special value 0 for `codepage` indicates that the text is in the platform's default codepage.
void UnicodeString (const char * codepageData, int32_t dataLength, const char * codepage)

char* constructor.

ICU 2.0

Parameters

codepageData an array of bytes.
dataLength The number of bytes in `codepageData`.
codepage the encoding of `codepageData`. The special value 0 for `codepage` indicates that the text is in the platform's default codepage. If `codepage` is an empty string (`""`), then a simple conversion is performed on the codepage-invariant subset ("invariant characters") of the platform encoding. See utypes.h. Recommendation: For invariant-character strings use the constructor UnicodeString(const char *src, int32_t length, enum EInvariant inv) because it avoids object code dependencies of UnicodeString on the conversion code.
void UnicodeString (const char * src, int32_t srcLength, UConverter * cnv, UErrorCode & errorCode)

char * / UConverter constructor.

This constructor uses an existing UConverter object to

convert the codepage string to Unicode and construct a UnicodeString

from that.

The converter is reset at first.

If the error code indicates a failure before this constructor is called,

or if an error occurs during conversion or construction,

then the string will be bogus.

This function avoids the overhead of opening and closing a converter if

multiple strings are constructed.

Parameters

src input codepage string
srcLength length of the input string, can be -1 for NUL-terminated strings
cnv converter object (ucnv_resetToUnicode() will be called), can be nullptr for the default converter
errorCode normal ICU error code

ICU 2.0

void UnicodeString (const char * src, int32_t textLength, enum EInvariant inv)

Constructs a Unicode string from an invariant-character char * string.

About invariant characters see utypes.h.

This constructor has no runtime dependency on conversion code and is

therefore recommended over ones taking a charset name string

(where the empty string "" indicates invariant-character conversion).

Use the macro US_INV as the third, signature-distinguishing parameter.

For example:

Note, for string literals:

Since C++17 and ICU 76, you can use UTF-16 string literals with compile-time

length determination:

Parameters

src String using only invariant characters.
textLength Length of src, or -1 if NUL-terminated.
inv Signature-distinguishing parameter, use US_INV.

Code

                                        
                                                 void fn(const char *s) {
                                                   UnicodeString ustr(s, -1, US_INV);
                                                   // use ustr ...
                                                 }
                                        
                                    
                                        
                                             UnicodeString str(u"literal");
                                             if (str == u"other literal") { ... }
                                        
                                    
void UnicodeString (const UnicodeString & that)

Copy constructor.

Starting with ICU 2.4, the assignment operator and the copy constructor

allocate a new buffer and copy the buffer contents even for readonly aliases.

By contrast, the fastCopyFrom() function implements the old,

more efficient but less safe behavior

of making this string also a readonly alias to the same buffer.

If the source object has an "open" buffer from getBuffer(minCapacity),

then the copy is an empty string.

Parameters

that The UnicodeString object to copy.

ICU 2.0

void UnicodeString (UnicodeString && src)

Move constructor; might leave src in bogus state.

This string will have the same contents and state that the source string had.

Parameters

src source string

ICU 56

void UnicodeString (const UnicodeString & src, int32_t srcStart)

'Substring' constructor from tail of source string.

Parameters

src The UnicodeString object to copy.
srcStart The offset into `src` at which to start copying.

ICU 2.2

void UnicodeString (const UnicodeString & src, int32_t srcStart, int32_t srcLength)

'Substring' constructor from subrange of source string.

Parameters

src The UnicodeString object to copy.
srcStart The offset into `src` at which to start copying.
srcLength The number of characters from `src` to copy.

ICU 2.2

UnicodeString * clone ()

Clone this object, an instance of a subclass of Replaceable.

Clones can be used concurrently in multiple threads.

If a subclass does not implement clone(), or if an error occurs,

then nullptr is returned.

The caller must delete the clone.

Returns

a clone of this object

void ~UnicodeString ()

Destructor.

ICU 2.0

UnicodeString fromUTF8 (StringPiece utf8)

Create a UnicodeString from a UTF-8 string.

Illegal input is replaced with U+FFFD. Otherwise, errors result in a bogus string.

Calls u_strFromUTF8WithSub().

Parameters

utf8 UTF-8 input string. Note that a StringPiece can be implicitly constructed from a std::string or a NUL-terminated const char * string.

Returns

A UnicodeString with equivalent UTF-16 contents.

UnicodeString fromUTF32 (const UChar32 * utf32, int32_t length)

Create a UnicodeString from a UTF-32 string.

Illegal input is replaced with U+FFFD. Otherwise, errors result in a bogus string.

Calls u_strFromUTF32WithSub().

Parameters

utf32 UTF-32 input string. Must not be nullptr.
length Length of the input string, or -1 if NUL-terminated.

Returns

A UnicodeString with equivalent UTF-16 contents.
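The substitution behavior described for fromUTF32() can be illustrated with a stand-alone conversion routine. This is a sketch under stated assumptions, not ICU's u_strFromUTF32WithSub(): "illegal input" is taken here to mean surrogate code points and values above U+10FFFF, and `utf32ToUtf16WithSub` is a hypothetical name.

```cpp
#include <string>
#include <vector>

// Converts UTF-32 to UTF-16, substituting U+FFFD for surrogate code points
// and for values above U+10FFFF.
std::u16string utf32ToUtf16WithSub(const std::vector<char32_t>& utf32) {
    std::u16string out;
    for (char32_t cp : utf32) {
        const bool isSurrogate = cp >= 0xD800 && cp <= 0xDFFF;
        if (cp > 0x10FFFF || isSurrogate) {
            out.push_back(u'\uFFFD');  // replacement character
        } else if (cp <= 0xFFFF) {
            out.push_back(static_cast<char16_t>(cp));  // one code unit
        } else {
            // Supplementary code point: encode as a surrogate pair.
            const char32_t v = cp - 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));
        }
    }
    return out;
}
```

A supplementary code point such as U+1F600 becomes the pair D83D DE00, which is why the UTF-16 result can contain more code units than the input has code points.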

UnicodeString unescape ()

Unescape a string of characters and return a string containing

the result. The following escape sequences are recognized:

\uhhhh 4 hex digits; h in [0-9A-Fa-f]

\Uhhhhhhhh 8 hex digits

\xhh 1-2 hex digits

\ooo 1-3 octal digits; o in [0-7]

\cX control-X; X is masked with 0x1F

as well as the standard ANSI C escapes:

\a => U+0007, \b => U+0008, \t => U+0009, \n => U+000A,

\v => U+000B, \f => U+000C, \r => U+000D, \e => U+001B,

\" => U+0022, \' => U+0027, \? => U+003F, \\ => U+005C

Anything else following a backslash is generically escaped. For

example, "[a\\-z]" returns "[a-z]".

If an escape sequence is ill-formed, this method returns an empty

string. An example of an ill-formed sequence is "\\u" followed by

fewer than 4 hex digits.

This function is similar to u_unescape() but not identical to it.

The latter takes a source char*, so it does escape recognition

and also invariant conversion.

Returns

a string with backslash escapes interpreted, or an

empty string on error.
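A few of the rules above (four-digit \u escapes, the 0x1F mask for \cX, generic escapes, and the empty result for ill-formed input) can be seen in a reduced stand-alone sketch. It handles only \uhhhh, \cX, \n and \t plus the generic case, and it treats a trailing lone backslash as ill-formed, which is an assumption. The names `unescapeSubset` and `hexValue` are hypothetical, and this is not ICU's implementation.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

namespace {
// Returns the value of a hex digit, or -1 if `c` is not one.
int hexValue(char16_t c) {
    if (c >= u'0' && c <= u'9') return c - u'0';
    if (c >= u'a' && c <= u'f') return c - u'a' + 10;
    if (c >= u'A' && c <= u'F') return c - u'A' + 10;
    return -1;
}
}  // namespace

// Interprets a subset of the backslash escapes.
// Returns an empty string if an escape is ill-formed.
std::u16string unescapeSubset(std::u16string_view in) {
    std::u16string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != u'\\') {
            out.push_back(in[i]);
            continue;
        }
        if (++i == in.size()) return {};  // lone trailing backslash
        const char16_t c = in[i];
        if (c == u'u') {
            if (in.size() - i - 1 < 4) return {};  // fewer than 4 digits left
            char16_t value = 0;
            for (int k = 1; k <= 4; ++k) {
                const int digit = hexValue(in[i + k]);
                if (digit < 0) return {};  // non-hex character in \uhhhh
                value = static_cast<char16_t>(value * 16 + digit);
            }
            out.push_back(value);
            i += 4;  // skip the four digits just consumed
        } else if (c == u'c') {
            if (i + 1 == in.size()) return {};  // \c with nothing after it
            out.push_back(static_cast<char16_t>(in[++i] & 0x1F));
        } else if (c == u'n') {
            out.push_back(u'\n');
        } else if (c == u't') {
            out.push_back(u'\t');
        } else {
            out.push_back(c);  // generic escape: keep the character itself
        }
    }
    return out;
}
```

With this sketch, the C++ literal `u"[a\\-z]"` unescapes to "[a-z]", and `u"\\u41"` yields an empty string because fewer than four hex digits follow the \u.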

UChar32 unescapeAt (int32_t & offset)

Unescape a single escape sequence and return the represented

character. See unescape() for a listing of the recognized escape

sequences. The character at offset-1 is assumed (without

checking) to be a backslash. If the escape sequence is

ill-formed, or the offset is out of range, U_SENTINEL=-1 is

returned.

Parameters

offset an input output parameter. On input, it is the offset into this string where the escape sequence is located, after the initial backslash. On output, it is advanced after the last character parsed. On error, it is not advanced at all.

Returns

the character represented by the escape sequence at

offset, or U_SENTINEL=-1 on error.

UClassID getStaticClassID ()

ICU "poor man's RTTI", returns a UClassID for this class.

ICU 2.2

UClassID getDynamicClassID ()

ICU "poor man's RTTI", returns a UClassID for the actual class.

ICU 2.2

Protected Methods

int32_t getLength ()

Implement Replaceable::getLength() (see jitterbug 1027).

ICU 2.4

char16_t getCharAt (int32_t offset)

The change in Replaceable to use virtual getCharAt() allows

UnicodeString::charAt() to be inline again (see jitterbug 709).

ICU 2.4

UChar32 getChar32At (int32_t offset)

The change in Replaceable to use virtual getChar32At() allows

UnicodeString::char32At() to be inline again (see jitterbug 709).

ICU 2.4

Enumerations

enum EInvariant
Name Value
kInvariant 0

Constant to be used in the UnicodeString(char *, int32_t, EInvariant) constructor

which constructs a Unicode string from an invariant-character char * string.

Use the macro US_INV instead of the full qualification for this value.

Defined at line 307 of file ../../third_party/icu/default/source/common/unicode/unistr.h

Friends

class StackBufferOrFields
class UnicodeStringAppendable
void swap (UnicodeString & s1, UnicodeString & s2)