bstr

Trait ByteVec

pub trait ByteVec: Sealed {
    // Provided methods
    fn from_slice<B: AsRef<[u8]>>(bytes: B) -> Vec<u8> { ... }
    fn from_os_string(os_str: OsString) -> Result<Vec<u8>, OsString> { ... }
    fn from_os_str_lossy<'a>(os_str: &'a OsStr) -> Cow<'a, [u8]> { ... }
    fn from_path_buf(path: PathBuf) -> Result<Vec<u8>, PathBuf> { ... }
    fn from_path_lossy<'a>(path: &'a Path) -> Cow<'a, [u8]> { ... }
    fn unescape_bytes<S: AsRef<str>>(escaped: S) -> Vec<u8> { ... }
    fn push_byte(&mut self, byte: u8) { ... }
    fn push_char(&mut self, ch: char) { ... }
    fn push_str<B: AsRef<[u8]>>(&mut self, bytes: B) { ... }
    fn into_string(self) -> Result<String, FromUtf8Error>
       where Self: Sized { ... }
    fn into_string_lossy(self) -> String
       where Self: Sized { ... }
    unsafe fn into_string_unchecked(self) -> String
       where Self: Sized { ... }
    fn into_os_string(self) -> Result<OsString, FromUtf8Error>
       where Self: Sized { ... }
    fn into_os_string_lossy(self) -> OsString
       where Self: Sized { ... }
    fn into_path_buf(self) -> Result<PathBuf, FromUtf8Error>
       where Self: Sized { ... }
    fn into_path_buf_lossy(self) -> PathBuf
       where Self: Sized { ... }
    fn pop_byte(&mut self) -> Option<u8> { ... }
    fn pop_char(&mut self) -> Option<char> { ... }
    fn remove_char(&mut self, at: usize) -> char { ... }
    fn insert_char(&mut self, at: usize, ch: char) { ... }
    fn insert_str<B: AsRef<[u8]>>(&mut self, at: usize, bytes: B) { ... }
    fn replace_range<R, B>(&mut self, range: R, replace_with: B)
       where R: RangeBounds<usize>,
             B: AsRef<[u8]> { ... }
    fn drain_bytes<R>(&mut self, range: R) -> DrainBytes<'_>
       where R: RangeBounds<usize> { ... }
}

A trait that extends Vec<u8> with string oriented methods.

Note that when using the constructor methods, such as ByteVec::from_slice, one should actually call them using the concrete type. For example:

use bstr::{B, ByteVec};

let s = Vec::from_slice(b"abc"); // NOT ByteVec::from_slice("...")
assert_eq!(s, B("abc"));

This trait is sealed and cannot be implemented outside of bstr.
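
The call convention and the sealing described above can be illustrated without bstr. What follows is a minimal std-only sketch of the usual sealed-trait pattern; `BytesExt`, the `private` module and `abc` are hypothetical names chosen for illustration, not bstr's actual definitions.

```rust
// Hypothetical stand-in for a sealed extension trait; not bstr's real code.
mod private {
    // A public trait inside a private module: other crates cannot name it,
    // so they cannot satisfy the supertrait bound below.
    pub trait Sealed {}
    impl Sealed for Vec<u8> {}
}

pub trait BytesExt: private::Sealed {
    /// Constructor-style method: it has no `self` receiver.
    fn from_slice<B: AsRef<[u8]>>(bytes: B) -> Vec<u8> {
        bytes.as_ref().to_vec()
    }
}

impl BytesExt for Vec<u8> {}

/// The constructor is reached through the concrete type, not the trait.
pub fn abc() -> Vec<u8> {
    <Vec<u8>>::from_slice(b"abc")
}
```

Writing `BytesExt::from_slice(b"abc")` instead gives the compiler no implementing type to select, which is why the concrete type is spelled out.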

Provided Methods§


fn from_slice<B: AsRef<[u8]>>(bytes: B) -> Vec<u8>

Create a new owned byte string from the given byte slice.

§Examples

Basic usage:

use bstr::{B, ByteVec};

let s = Vec::from_slice(b"abc");
assert_eq!(s, B("abc"));

fn from_os_string(os_str: OsString) -> Result<Vec<u8>, OsString>

Create a new byte string from an owned OS string.

When the underlying bytes of OS strings are accessible, then this always succeeds and is zero cost. Otherwise, this returns the given OsString if it is not valid UTF-8.

§Examples

Basic usage:

use std::ffi::OsString;

use bstr::{B, ByteVec};

let os_str = OsString::from("foo");
let bs = Vec::from_os_string(os_str).expect("valid UTF-8");
assert_eq!(bs, B("foo"));
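
The two cases described above (bytes accessible versus not) can be sketched with the standard library alone. This assumes Unix as the platform where the raw bytes are exposed; `os_string_to_bytes` is a hypothetical helper and not bstr's implementation.

```rust
use std::ffi::OsString;

/// Hypothetical helper; a std-only sketch of the "bytes accessible" case.
#[cfg(unix)]
fn os_string_to_bytes(os: OsString) -> Result<Vec<u8>, OsString> {
    use std::os::unix::ffi::OsStringExt;
    // On Unix an OsString is a byte vector underneath, so this only
    // moves the existing buffer and cannot fail.
    Ok(os.into_vec())
}

/// Hypothetical helper; a std-only sketch of the fallback case.
#[cfg(not(unix))]
fn os_string_to_bytes(os: OsString) -> Result<Vec<u8>, OsString> {
    // Where the raw bytes are not exposed, go through String. On failure
    // the original OsString is handed back unchanged.
    os.into_string().map(String::into_bytes)
}
```

Returning the original `OsString` in the error position means the caller loses nothing when the conversion is rejected.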

fn from_os_str_lossy<'a>(os_str: &'a OsStr) -> Cow<'a, [u8]>

Lossily create a new byte string from an OS string slice.

When the underlying bytes of OS strings are accessible, then this is zero cost and always returns a slice. Otherwise, a UTF-8 check is performed and if the given OS string is not valid UTF-8, then it is lossily decoded into valid UTF-8 (with invalid bytes replaced by the Unicode replacement codepoint).

§Examples

Basic usage:

use std::ffi::OsStr;

use bstr::{B, ByteVec};

let os_str = OsStr::new("foo");
let bs = Vec::from_os_str_lossy(os_str);
assert_eq!(bs, B("foo"));

fn from_path_buf(path: PathBuf) -> Result<Vec<u8>, PathBuf>

Create a new byte string from an owned file path.

When the underlying bytes of paths are accessible, then this always succeeds and is zero cost. Otherwise, this returns the given PathBuf if it is not valid UTF-8.

§Examples

Basic usage:

use std::path::PathBuf;

use bstr::{B, ByteVec};

let path = PathBuf::from("foo");
let bs = Vec::from_path_buf(path).expect("must be valid UTF-8");
assert_eq!(bs, B("foo"));

fn from_path_lossy<'a>(path: &'a Path) -> Cow<'a, [u8]>

Lossily create a new byte string from a file path.

When the underlying bytes of paths are accessible, then this is zero cost and always returns a slice. Otherwise, a UTF-8 check is performed and if the given path is not valid UTF-8, then it is lossily decoded into valid UTF-8 (with invalid bytes replaced by the Unicode replacement codepoint).

§Examples

Basic usage:

use std::path::Path;

use bstr::{B, ByteVec};

let path = Path::new("foo");
let bs = Vec::from_path_lossy(path);
assert_eq!(bs, B("foo"));

fn unescape_bytes<S: AsRef<str>>(escaped: S) -> Vec<u8>

Unescapes the given string into its raw bytes.

This looks for the escape sequences \xNN, \0, \r, \n, \t and \\ and translates them into their corresponding unescaped form.

Incomplete escape sequences or things that look like escape sequences but are not (for example, \i or \xYZ) are passed through literally.

This is the dual of ByteSlice::escape_bytes.

Note that the zero or NUL byte may be represented as either \0 or \x00. Both will be unescaped into the zero byte.

§Examples

This shows basic usage:

use bstr::{B, BString, ByteVec};

assert_eq!(
    BString::from(b"foo\xFFbar"),
    Vec::unescape_bytes(r"foo\xFFbar"),
);
assert_eq!(
    BString::from(b"foo\nbar"),
    Vec::unescape_bytes(r"foo\nbar"),
);
assert_eq!(
    BString::from(b"foo\tbar"),
    Vec::unescape_bytes(r"foo\tbar"),
);
assert_eq!(
    BString::from(b"foo\\bar"),
    Vec::unescape_bytes(r"foo\\bar"),
);
assert_eq!(
    BString::from("foo☃bar"),
    Vec::unescape_bytes(r"foo☃bar"),
);

This shows some examples of how incomplete or “incorrect” escape sequences get passed through literally.

use bstr::{B, BString, ByteVec};

// Show some incomplete escape sequences.
assert_eq!(
    BString::from(br"\"),
    Vec::unescape_bytes(r"\"),
);
assert_eq!(
    BString::from(br"\"),
    Vec::unescape_bytes(r"\\"),
);
assert_eq!(
    BString::from(br"\x"),
    Vec::unescape_bytes(r"\x"),
);
assert_eq!(
    BString::from(br"\xA"),
    Vec::unescape_bytes(r"\xA"),
);
// And now some that kind of look like escape
// sequences, but aren't.
assert_eq!(
    BString::from(br"\xZ"),
    Vec::unescape_bytes(r"\xZ"),
);
assert_eq!(
    BString::from(br"\xZZ"),
    Vec::unescape_bytes(r"\xZZ"),
);
assert_eq!(
    BString::from(br"\i"),
    Vec::unescape_bytes(r"\i"),
);
assert_eq!(
    BString::from(br"\u"),
    Vec::unescape_bytes(r"\u"),
);
assert_eq!(
    BString::from(br"\u{2603}"),
    Vec::unescape_bytes(r"\u{2603}"),
);
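
The rules above fit in a short loop. The following is a std-only sketch of those rules, written for illustration; `unescape` and `hex_val` are hypothetical names, accepting both upper- and lowercase hex digits is an assumption, and this is not bstr's actual code.

```rust
/// Hypothetical re-implementation of the documented unescaping rules.
fn unescape(escaped: &str) -> Vec<u8> {
    let src = escaped.as_bytes();
    let mut out = Vec::with_capacity(src.len());
    let mut i = 0;
    while i < src.len() {
        if src[i] != b'\\' {
            // Ordinary byte (including UTF-8 continuation bytes): copy it.
            out.push(src[i]);
            i += 1;
            continue;
        }
        // Inspect the byte after the backslash, if there is one.
        match src.get(i + 1).copied() {
            Some(b'0') => { out.push(0); i += 2; }
            Some(b'r') => { out.push(b'\r'); i += 2; }
            Some(b'n') => { out.push(b'\n'); i += 2; }
            Some(b't') => { out.push(b'\t'); i += 2; }
            Some(b'\\') => { out.push(b'\\'); i += 2; }
            Some(b'x') => {
                let hi = src.get(i + 2).copied().and_then(hex_val);
                let lo = src.get(i + 3).copied().and_then(hex_val);
                if let (Some(hi), Some(lo)) = (hi, lo) {
                    out.push(hi * 16 + lo);
                    i += 4;
                } else {
                    // Incomplete or non-hex: keep the backslash literally
                    // and let the following bytes be copied as-is.
                    out.push(b'\\');
                    i += 1;
                }
            }
            // Trailing backslash or unknown escape: pass it through.
            _ => { out.push(b'\\'); i += 1; }
        }
    }
    out
}

/// Value of one ASCII hex digit, or None for anything else.
fn hex_val(b: u8) -> Option<u8> {
    (b as char).to_digit(16).map(|d| d as u8)
}
```

Because a failed escape only emits the backslash and advances by one byte, inputs such as `\xA` or `\i` come out unchanged without any special casing.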

fn push_byte(&mut self, byte: u8)

Appends the given byte to the end of this byte string.

Note that this is equivalent to the generic Vec::push method. This method is provided to permit callers to explicitly differentiate between pushing bytes, codepoints and strings.

§Examples

Basic usage:

use bstr::ByteVec;

let mut s = <Vec<u8>>::from("abc");
s.push_byte(b'\xE2');
s.push_byte(b'\x98');
s.push_byte(b'\x83');
assert_eq!(s, "abc☃".as_bytes());

fn push_char(&mut self, ch: char)

Appends the given char to the end of this byte string.

§Examples

Basic usage:

use bstr::ByteVec;

let mut s = <Vec<u8>>::from("abc");
s.push_char('1');
s.push_char('2');
s.push_char('3');
assert_eq!(s, "abc123".as_bytes());

fn push_str<B: AsRef<[u8]>>(&mut self, bytes: B)

Appends the given slice to the end of this byte string. This accepts any type that can be converted to a &[u8]. This includes, but is not limited to, &str, &BStr, and of course, &[u8] itself.

§Examples

Basic usage:

use bstr::ByteVec;

let mut s = <Vec<u8>>::from("abc");
s.push_str(b"123");
assert_eq!(s, "abc123".as_bytes());
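
For comparison, the three kinds of append (byte, codepoint, slice) look like this on a plain `Vec<u8>` using only the standard library. This is a sketch for illustration; `append_char` and `build` are hypothetical names, not part of bstr.

```rust
/// Hypothetical std-only helper: append one codepoint as UTF-8.
fn append_char(buf: &mut Vec<u8>, ch: char) {
    let mut tmp = [0u8; 4]; // a char encodes to at most 4 bytes
    buf.extend_from_slice(ch.encode_utf8(&mut tmp).as_bytes());
}

/// Appends one raw byte, one codepoint and one slice to "abc".
fn build() -> Vec<u8> {
    let mut buf = b"abc".to_vec();
    buf.push(b'1');               // a single byte
    append_char(&mut buf, '☃');   // a single codepoint, three bytes here
    buf.extend_from_slice(b"23"); // a whole slice
    buf
}
```

The distinct method names on `ByteVec` make the same three intentions visible at the call site.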

fn into_string(self) -> Result<String, FromUtf8Error>
where Self: Sized,

Converts a Vec<u8> into a String if and only if this byte string is valid UTF-8.

If it is not valid UTF-8, then a FromUtf8Error is returned. (This error can be used to examine why UTF-8 validation failed, or to regain the original byte string.)

§Examples

Basic usage:

use bstr::ByteVec;

let bytes = Vec::from("hello");
let string = bytes.into_string().unwrap();

assert_eq!("hello", string);

If this byte string is not valid UTF-8, then an error will be returned. That error can then be used to inspect the location at which invalid UTF-8 was found, or to regain the original byte string:

use bstr::{B, ByteVec};

let bytes = Vec::from_slice(b"foo\xFFbar");
let err = bytes.into_string().unwrap_err();

assert_eq!(err.utf8_error().valid_up_to(), 3);
assert_eq!(err.utf8_error().error_len(), Some(1));

// Recovering the original bytes from the error does not allocate.
let bytes = Vec::from(err.into_vec());
assert_eq!(bytes, B(b"foo\xFFbar"));
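
The same check-then-recover flow exists in the standard library, which may help when reading the error handling above. This is a sketch using std's own `String::from_utf8` and its `FromUtf8Error`, not bstr's types; `decode_or_recover` is a hypothetical name.

```rust
/// Hypothetical helper: on failure, return the offset of the first invalid
/// byte together with the original, untouched buffer.
fn decode_or_recover(bytes: Vec<u8>) -> Result<String, (usize, Vec<u8>)> {
    String::from_utf8(bytes).map_err(|err| {
        let valid = err.utf8_error().valid_up_to();
        // `into_bytes` hands back the buffer that was passed in.
        (valid, err.into_bytes())
    })
}
```

Carrying the buffer inside the error is what lets a caller fall back to byte-oriented handling without copying.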

fn into_string_lossy(self) -> String
where Self: Sized,

Lossily converts a Vec<u8> into a String. If this byte string contains invalid UTF-8, then the invalid bytes are replaced with the Unicode replacement codepoint.

§Examples

Basic usage:

use bstr::ByteVec;

let bytes = Vec::from_slice(b"foo\xFFbar");
let string = bytes.into_string_lossy();
assert_eq!(string, "foo\u{FFFD}bar");
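
A std-only sketch of a lossy conversion that reuses the buffer when the input is already valid UTF-8 and only substitutes on failure. `to_string_lossy` is a hypothetical helper, and std's substitution of invalid sequences is assumed here rather than bstr's own policy.

```rust
/// Hypothetical helper: lossy Vec<u8> -> String conversion using std only.
fn to_string_lossy(bytes: Vec<u8>) -> String {
    match String::from_utf8(bytes) {
        // Valid UTF-8: the buffer is reused as the String's storage.
        Ok(s) => s,
        // Invalid UTF-8: build a new String with U+FFFD substitutions.
        Err(err) => String::from_utf8_lossy(err.as_bytes()).into_owned(),
    }
}
```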

unsafe fn into_string_unchecked(self) -> String
where Self: Sized,

Unsafely convert this byte string into a String, without checking for valid UTF-8.

§Safety

Callers must ensure that this byte string is valid UTF-8 before calling this method. Converting a byte string into a String that is not valid UTF-8 is considered undefined behavior.

This routine is useful in performance sensitive contexts where the UTF-8 validity of the byte string is already known and it is undesirable to pay the cost of an additional UTF-8 validation check that into_string performs.

§Examples

Basic usage:

use bstr::ByteVec;

// SAFETY: This is safe because string literals are guaranteed to be
// valid UTF-8 by the Rust compiler.
let s = unsafe { Vec::from("☃βツ").into_string_unchecked() };
assert_eq!("☃βツ", s);
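
One situation where validity is known in advance is a buffer built only from ASCII bytes. The sketch below uses std's `String::from_utf8_unchecked` rather than the bstr method, and `ascii_only` is a hypothetical helper.

```rust
/// Hypothetical helper: keep only the ASCII bytes of `input`.
fn ascii_only(input: &[u8]) -> String {
    let kept: Vec<u8> = input.iter().copied().filter(|b| b.is_ascii()).collect();
    // SAFETY: every byte in `kept` is ASCII, and any sequence of ASCII
    // bytes is valid UTF-8.
    unsafe { String::from_utf8_unchecked(kept) }
}
```

The safety argument lives next to the `unsafe` block and depends only on the line above it, which keeps the invariant easy to audit.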

fn into_os_string(self) -> Result<OsString, FromUtf8Error>
where Self: Sized,

Converts this byte string into an OS string, in place.

When OS strings can be constructed from arbitrary byte sequences, this always succeeds and is zero cost. Otherwise, if this byte string is not valid UTF-8, then an error (with the original byte string) is returned.

§Examples

Basic usage:

use std::ffi::OsStr;

use bstr::ByteVec;

let bs = Vec::from("foo");
let os_str = bs.into_os_string().expect("should be valid UTF-8");
assert_eq!(os_str, OsStr::new("foo"));

fn into_os_string_lossy(self) -> OsString
where Self: Sized,

Lossily converts this byte string into an OS string, in place.

When OS strings can be constructed from arbitrary byte sequences, this is zero cost. Otherwise, this will perform a UTF-8 check and lossily convert this byte string into valid UTF-8 using the Unicode replacement codepoint.

Note that this can prevent the correct roundtripping of file paths when the representation of OsString is opaque.

§Examples

Basic usage:

use bstr::ByteVec;

let bs = Vec::from_slice(b"foo\xFFbar");
let os_str = bs.into_os_string_lossy();
assert_eq!(os_str.to_string_lossy(), "foo\u{FFFD}bar");

fn into_path_buf(self) -> Result<PathBuf, FromUtf8Error>
where Self: Sized,

Converts this byte string into an owned file path, in place.

When paths can be constructed from arbitrary byte sequences, this always succeeds and is zero cost. Otherwise, if this byte string is not valid UTF-8, then an error (with the original byte string) is returned.

§Examples

Basic usage:

use bstr::ByteVec;

let bs = Vec::from("foo");
let path = bs.into_path_buf().expect("should be valid UTF-8");
assert_eq!(path.as_os_str(), "foo");

fn into_path_buf_lossy(self) -> PathBuf
where Self: Sized,

Lossily converts this byte string into an owned file path, in place.

When paths can be constructed from arbitrary byte sequences, this is zero cost. Otherwise, this will perform a UTF-8 check and lossily convert this byte string into valid UTF-8 using the Unicode replacement codepoint.

Note that this can prevent the correct roundtripping of file paths when the representation of PathBuf is opaque.

§Examples

Basic usage:

use bstr::ByteVec;

let bs = Vec::from_slice(b"foo\xFFbar");
let path = bs.into_path_buf_lossy();
assert_eq!(path.to_string_lossy(), "foo\u{FFFD}bar");

fn pop_byte(&mut self) -> Option<u8>

Removes the last byte from this Vec<u8> and returns it.

If this byte string is empty, then None is returned.

If the last codepoint in this byte string is not ASCII, then removing the last byte could make this byte string contain invalid UTF-8.

Note that this is equivalent to the generic Vec::pop method. This method is provided to permit callers to explicitly differentiate between popping bytes and codepoints.

§Examples

Basic usage:

use bstr::ByteVec;

let mut s = Vec::from("foo");
assert_eq!(s.pop_byte(), Some(b'o'));
assert_eq!(s.pop_byte(), Some(b'o'));
assert_eq!(s.pop_byte(), Some(b'f'));
assert_eq!(s.pop_byte(), None);
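
The remark about non-ASCII codepoints can be checked with the standard library. In this sketch `pop_and_check` is a hypothetical helper that uses `Vec::pop`, which the text above describes as equivalent to `pop_byte`.

```rust
/// Hypothetical helper: pop one byte, then report whether the remaining
/// bytes are still valid UTF-8.
fn pop_and_check(buf: &mut Vec<u8>) -> (Option<u8>, bool) {
    let popped = buf.pop();
    (popped, std::str::from_utf8(buf).is_ok())
}
```

Popping from `"a☃"` removes only the final byte of the three-byte snowman, so the remainder ends in a truncated sequence and the check reports `false`; popping from `"ab"` leaves valid UTF-8.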

fn pop_char(&mut self) -> Option<char>

Removes the last codepoint from this Vec<u8> and returns it.

If this byte string is empty, then None is returned. If the last bytes of this byte string do not correspond to a valid UTF-8 code unit sequence, then the Unicode replacement codepoint is yielded instead in accordance with the replacement codepoint substitution policy.

§Examples

Basic usage:

use bstr::ByteVec;

let mut s = Vec::from("foo");
assert_eq!(s.pop_char(), Some('o'));
assert_eq!(s.pop_char(), Some('o'));
assert_eq!(s.pop_char(), Some('f'));
assert_eq!(s.pop_char(), None);

This shows the replacement codepoint substitution policy. Note that the first pop yields a replacement codepoint but actually removes two bytes. This is in contrast with subsequent pops when encountering \xFF since \xFF is never a valid prefix for any valid UTF-8 code unit sequence.

use bstr::ByteVec;

let mut s = Vec::from_slice(b"f\xFF\xFF\xFFoo\xE2\x98");
assert_eq!(s.pop_char(), Some('\u{FFFD}'));
assert_eq!(s.pop_char(), Some('o'));
assert_eq!(s.pop_char(), Some('o'));
assert_eq!(s.pop_char(), Some('\u{FFFD}'));
assert_eq!(s.pop_char(), Some('\u{FFFD}'));
assert_eq!(s.pop_char(), Some('\u{FFFD}'));
assert_eq!(s.pop_char(), Some('f'));
assert_eq!(s.pop_char(), None);

fn remove_char(&mut self, at: usize) -> char

Removes a char from this Vec<u8> at the given byte position and returns it.

If the bytes at the given position do not lead to a valid UTF-8 code unit sequence, then a replacement codepoint is returned instead.

§Panics

Panics if at is larger than or equal to this byte string’s length.

§Examples

Basic usage:

use bstr::ByteVec;

let mut s = Vec::from("foo☃bar");
assert_eq!(s.remove_char(3), '☃');
assert_eq!(s, b"foobar");

This example shows how the Unicode replacement codepoint policy is used:

use bstr::ByteVec;

let mut s = Vec::from_slice(b"foo\xFFbar");
assert_eq!(s.remove_char(3), '\u{FFFD}');
assert_eq!(s, b"foobar");

fn insert_char(&mut self, at: usize, ch: char)

Inserts the given codepoint into this Vec<u8> at a particular byte position.

This is an O(n) operation as it may copy a number of elements in this byte string proportional to its length.

§Panics

Panics if at is larger than the byte string’s length.

§Examples

Basic usage:

use bstr::ByteVec;

let mut s = Vec::from("foobar");
s.insert_char(3, '☃');
assert_eq!(s, "foo☃bar".as_bytes());

fn insert_str<B: AsRef<[u8]>>(&mut self, at: usize, bytes: B)

Inserts the given byte string into this byte string at a particular byte position.

This is an O(n) operation as it may copy a number of elements in this byte string proportional to its length.

The given byte string may be any type that can be cheaply converted into a &[u8]. This includes, but is not limited to, &str and &[u8].

§Panics

Panics if at is larger than the byte string’s length.

§Examples

Basic usage:

use bstr::ByteVec;

let mut s = Vec::from("foobar");
s.insert_str(3, "☃☃☃");
assert_eq!(s, "foo☃☃☃bar".as_bytes());
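
To make the O(n) cost of insertion concrete, here is a deliberately naive std-only sketch that moves the tail out and back explicitly. `insert_bytes_at` and `insert_char_at` are hypothetical helpers; the temporary tail buffer is a simplification for illustration and is not how bstr is implemented.

```rust
/// Hypothetical helper: insert `bytes` at byte offset `at`.
/// Panics if `at > buf.len()`, because `Vec::split_off` does.
fn insert_bytes_at(buf: &mut Vec<u8>, at: usize, bytes: &[u8]) {
    // Everything from `at` onward has to move: this is the O(n) part.
    let tail = buf.split_off(at);
    buf.extend_from_slice(bytes);
    buf.extend_from_slice(&tail);
}

/// Hypothetical helper: insert one codepoint, encoded as UTF-8.
fn insert_char_at(buf: &mut Vec<u8>, at: usize, ch: char) {
    let mut tmp = [0u8; 4]; // a char encodes to at most 4 bytes
    insert_bytes_at(buf, at, ch.encode_utf8(&mut tmp).as_bytes());
}
```

Inserting at `buf.len()` is allowed and degenerates to an append, which matches the panic condition stated above (strictly larger than the length).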

fn replace_range<R, B>(&mut self, range: R, replace_with: B)
where R: RangeBounds<usize>, B: AsRef<[u8]>,

Removes the specified range in this byte string and replaces it with the given bytes. The given bytes do not need to have the same length as the range provided.

§Panics

Panics if the start of the given range is greater than its end, or if its end is greater than this byte string’s length.

§Examples

Basic usage:

use bstr::ByteVec;

let mut s = Vec::from("foobar");
s.replace_range(2..4, "xxxxx");
assert_eq!(s, "foxxxxxar".as_bytes());
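
On a plain `Vec<u8>`, the same effect is available through `Vec::splice`. The sketch below uses only the standard library; `replace_bytes` is a hypothetical helper name.

```rust
use std::ops::RangeBounds;

/// Hypothetical helper: replace `range` in `buf` with `with`.
fn replace_bytes<R: RangeBounds<usize>>(buf: &mut Vec<u8>, range: R, with: &[u8]) {
    // The replacement happens when the Splice iterator is dropped, so it
    // is dropped right away.
    drop(buf.splice(range, with.iter().copied()));
}
```

Two special cases fall out of this shape: an empty range such as `3..3` is a pure insertion, and an empty replacement is a pure deletion.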

fn drain_bytes<R>(&mut self, range: R) -> DrainBytes<'_>
where R: RangeBounds<usize>,

Creates a draining iterator that removes the specified range in this Vec<u8> and yields each of the removed bytes.

Note that the elements specified by the given range are removed regardless of whether the returned iterator is fully exhausted.

Also note that it is unspecified how many bytes are removed from the Vec<u8> if the DrainBytes iterator is leaked.

§Panics

Panics if the start of the given range is greater than its end, or if its end is greater than this byte string’s length.

§Examples

Basic usage:

use bstr::ByteVec;

let mut s = Vec::from("foobar");
{
    let mut drainer = s.drain_bytes(2..4);
    assert_eq!(drainer.next(), Some(b'o'));
    assert_eq!(drainer.next(), Some(b'b'));
    assert_eq!(drainer.next(), None);
}
assert_eq!(s, "foar".as_bytes());
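
The note about partially consumed iterators can be seen with std's `Vec::drain`, which has the same documented behavior. `drain_first` is a hypothetical helper used only for this sketch.

```rust
/// Hypothetical helper: take just the first byte of `start..end`, then
/// drop the iterator without exhausting it.
fn drain_first(buf: &mut Vec<u8>, start: usize, end: usize) -> Option<u8> {
    let mut drain = buf.drain(start..end);
    let first = drain.next();
    // Dropping the iterator removes the rest of the range as well.
    drop(drain);
    first
}
```

Calling `drain_first` on `"foobar"` with the range `2..4` yields only `b'o'`, yet the buffer is left as `"foar"`: both bytes of the range are gone.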

Dyn Compatibility§

This trait is not dyn compatible.

In older versions of Rust, dyn compatibility was called "object safety", so this trait is not object safe.

Implementations on Foreign Types§


impl ByteVec for Vec<u8>
