pub struct BStr { /* private fields */ }
A byte string slice that is conventionally UTF-8.
A byte string slice is the core string type in this library, and is usually
seen in its borrowed form, &BStr. The principal difference between a &BStr
and a &str (Rust’s standard Unicode string slice) is that a &BStr is only
conventionally UTF-8, whereas a &str is guaranteed to always be valid UTF-8.
If you need ownership or a growable byte string buffer, then use BString.
§Literals
A byte string literal has type &'static BStr. The most convenient way to
write a byte string literal is by using the short-hand B constructor function:
use bstr::{B, BStr};
// A byte string literal can be constructed from a normal Unicode string.
let s = B("a byte string literal");
// A byte string literal can also be constructed from a Rust byte string.
let s = B(b"another byte string literal");
// BStr::new can also be used:
let s = BStr::new("a byte string literal");
let s = BStr::new(b"another byte string literal");
§Representation
A &BStr has the same representation as a &str. That is, a &BStr is a fat
pointer which consists of a pointer to some bytes and a length.
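As a rough illustration of the fat-pointer claim, the sketch below uses only the standard library, with &[u8] standing in for &BStr. That substitution is an assumption made for illustration, and the helper names are hypothetical.

```rust
use std::mem::size_of;

// Size of `&[u8]` and `&str`, measured in machine words.
// On current Rust targets both occupy two words: a data pointer and a length.
fn fat_pointer_words() -> (usize, usize) {
    let word = size_of::<usize>();
    (size_of::<&[u8]>() / word, size_of::<&str>() / word)
}

// Split a byte slice reference into its two components.
fn raw_parts(bytes: &[u8]) -> (*const u8, usize) {
    (bytes.as_ptr(), bytes.len())
}
```

Calling raw_parts on a three-byte slice returns its data pointer together with the length 3, which are the same two values the unsafe from_raw_parts constructor accepts.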
§Trait implementations
The BStr type has a number of trait implementations, and in particular,
defines equality and ordinal comparisons between &BStr, &str and &[u8] for
convenience.
The Debug implementation for BStr shows its bytes as a normal string.
For invalid UTF-8, hex escape sequences are used.
The Display implementation behaves as if BStr were first lossily converted
to a str. Invalid UTF-8 bytes are substituted with the Unicode replacement
codepoint, which looks like this: �.
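The Debug behavior described above can be approximated over a plain &[u8] with the standard library alone. This is a sketch, not the library's implementation: the function name is hypothetical, the exact escape format (two uppercase hex digits) is an assumption, and no escaping of quotes or control characters is attempted.

```rust
// Render valid UTF-8 as-is and every invalid byte as a \xNN escape.
fn debug_like(mut bytes: &[u8]) -> String {
    let mut out = String::new();
    loop {
        match std::str::from_utf8(bytes) {
            Ok(valid) => {
                out.push_str(valid);
                return out;
            }
            Err(err) => {
                // Everything before `valid_up_to` is known to be valid UTF-8.
                let (valid, rest) = bytes.split_at(err.valid_up_to());
                out.push_str(std::str::from_utf8(valid).expect("validated prefix"));
                // `error_len` is None when the input ends in the middle of a
                // sequence; in that case all remaining bytes are escaped.
                let bad = err.error_len().unwrap_or(rest.len());
                for b in &rest[..bad] {
                    out.push_str(&format!("\\x{:02X}", b));
                }
                bytes = &rest[bad..];
            }
        }
    }
}
```

With this sketch, the bytes of "foo" followed by 0xFF and "bar" render as foo\xFFbar, while a valid snowman is passed through untouched.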
§Indexing and slicing
A BStr implements indexing and slicing using [..] notation. Unlike the
standard str type, the BStr type permits callers to index individual bytes.
For example:
use bstr::B;
let s = B("foo☃bar");
assert_eq!(&s[0..3], "foo");
assert_eq!(s[2], b'o');
assert_eq!(&s[3..6], "☃");
// Nothing stops you from indexing or slicing invalid UTF-8.
assert_eq!(s[3], b'\xE2');
assert_eq!(&s[3..5], B(b"\xE2\x98"));
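For contrast, here is how the same ranges behave on the standard types. The sketch assumes &[u8] as a stand-in for &BStr and uses a hypothetical helper name; it only shows that str refuses a range that cuts through a codepoint, while a byte slice does not.

```rust
// Slice the same range out of a `&str` and out of its underlying bytes.
// `str::get` returns None when the range does not fall on char boundaries;
// byte slicing has no such restriction.
fn slice_both(s: &str, start: usize, end: usize) -> (Option<&str>, &[u8]) {
    (s.get(start..end), &s.as_bytes()[start..end])
}
```

For "foo☃bar", the range 3..6 covers the whole snowman and succeeds on both types, whereas 3..5 yields None for the str and the two bytes E2 98 for the byte slice.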
Implementations§
impl BStr
pub fn new<B: ?Sized + AsRef<[u8]>>(bytes: &B) -> &BStr
Create a byte string slice from anything that can be borrowed as a
sequence of bytes. This includes, but is not limited to, &Vec<u8>, &[u8],
&String and &str.
§Examples
Basic usage:
use bstr::BStr;
assert_eq!("abc", BStr::new("abc"));
assert_eq!("abc", BStr::new(b"abc"));
pub fn new_mut<B: ?Sized + AsMut<[u8]>>(bytes: &mut B) -> &mut BStr
Create a mutable byte string slice from anything that can be borrowed
as a sequence of bytes. This includes, but is not limited to, &mut Vec<u8>
and &mut [u8].
§Examples
Basic usage:
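use bstr::BStr;
// Borrow a Vec<u8> mutably as a byte string and modify it in place.
let mut bytes = vec![b'a', b'b', b'c'];
BStr::new_mut(&mut bytes).make_ascii_uppercase();
assert_eq!("ABC", BStr::new(&bytes));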
pub fn from_bytes(slice: &[u8]) -> &BStr
Create an immutable byte string slice from an immutable byte slice.
§Examples
Basic usage:
use bstr::BStr;
let bytes = &[b'a'];
let bs = BStr::from_bytes(bytes);
assert_eq!("a", bs);
pub fn from_bytes_mut(slice: &mut [u8]) -> &mut BStr
Create a mutable byte string slice from a mutable byte slice.
§Examples
Basic usage:
use bstr::BStr;
let bytes = &mut [b'a'];
{
let bs = BStr::from_bytes_mut(bytes);
bs[0] = b'b';
}
assert_eq!(b"b", bytes);
pub unsafe fn from_raw_parts<'a>(data: *const u8, len: usize) -> &'a BStr
Create a byte string from its constituent pointer and length, where the length is the number of bytes in the byte string.
§Safety
This function is unsafe as there is no guarantee that the given pointer
is valid for len elements, nor whether the lifetime inferred is a
suitable lifetime for the returned slice.
data must be a non-null pointer, even for a zero length slice. A pointer
that is usable for zero-length slices can be obtained from the standard
library’s NonNull::dangling() constructor.
The total size of the given slice must be no larger than isize::MAX
bytes in memory.
§Caveat
The lifetime for the returned slice is inferred from its usage. To prevent accidental misuse, it’s suggested to tie the lifetime to whichever source lifetime is safe in the context, such as by providing a helper function taking the lifetime of a host value for the slice, or by explicit annotation.
§Examples
Basic usage:
use bstr::BStr;
// manifest a byte string from a single byte
let x = b'Z';
let ptr = &x as *const u8;
let s = unsafe { BStr::from_raw_parts(ptr, 1) };
assert_eq!(s, "Z");
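The caveat about inferred lifetimes suggests a helper that borrows from a host value. One possible shape is sketched below with the standard library's std::slice::from_raw_parts standing in for the constructor; the helper name and its signature are hypothetical, not part of this crate.

```rust
use std::slice;

// Hypothetical helper: rebuild a byte slice from raw parts, but tie its
// lifetime to a borrow of `_host` instead of letting it be inferred freely.
//
// Safety contract for callers: `data` must be non-null and point to `len`
// initialized bytes that are kept alive by `_host` and are not mutated
// while the returned slice is in use.
unsafe fn bytes_tied_to<'a, T: ?Sized>(
    _host: &'a T,
    data: *const u8,
    len: usize,
) -> &'a [u8] {
    unsafe { slice::from_raw_parts(data, len) }
}
```

Because the returned slice carries the lifetime of the host borrow, the compiler rejects code that drops or mutably borrows the host while the slice is still alive. The pointer validity requirements themselves remain the caller's responsibility.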
pub unsafe fn from_raw_parts_mut<'a>(data: *mut u8, len: usize) -> &'a mut BStr
Create a mutable byte string from its constituent pointer and length, where the length is the number of bytes in the byte string.
§Safety
This function is unsafe as there is no guarantee that the given pointer
is valid for len elements, nor whether the lifetime inferred is a
suitable lifetime for the returned slice.
data must be a non-null pointer, even for a zero length slice. A pointer
that is usable for zero-length slices can be obtained from the standard
library’s NonNull::dangling() constructor.
The total size of the given slice must be no larger than isize::MAX
bytes in memory.
The above reasons are the same as for from_raw_parts. In addition, for
this constructor, callers must guarantee that the mutable slice returned
is not aliased with any other reference.
§Caveat
The lifetime for the returned slice is inferred from its usage. To prevent accidental misuse, it’s suggested to tie the lifetime to whichever source lifetime is safe in the context, such as by providing a helper function taking the lifetime of a host value for the slice, or by explicit annotation.
§Examples
Basic usage:
use std::mem;
use bstr::{BStr, BString};
// For demonstration purposes, get a mutable pointer to a byte string.
let mut buf = BString::from("bar");
let ptr = buf.as_mut_ptr();
// Drop buf without deallocating, to avoid &mut aliasing.
mem::forget(buf);
// Now convert it to a mutable byte string from the raw pointer.
let mut s = unsafe { BStr::from_raw_parts_mut(ptr, 3) };
s.make_ascii_uppercase();
assert_eq!(s, "BAR");
pub fn from_os_str(os_str: &OsStr) -> Option<&BStr>
Create an immutable byte string from an OS string slice.
On Unix, this always succeeds and is zero cost. On non-Unix systems,
this returns None if the given OS string is not valid UTF-8. (For
example, on Windows, file paths are allowed to be a sequence of
arbitrary 16-bit integers. Not all such sequences can be transcoded to
valid UTF-8.)
§Examples
Basic usage:
use std::ffi::OsStr;
use bstr::BStr;
let os_str = OsStr::new("foo");
let bs = BStr::from_os_str(os_str).expect("should be valid UTF-8");
assert_eq!(bs, "foo");
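The platform split described for this conversion can be sketched with the standard library alone. The function name is hypothetical and &[u8] stands in for &BStr; this shows one plausible way to get the documented behavior, not necessarily how the crate does it.

```rust
use std::ffi::OsStr;

// On Unix an OS string is already a byte sequence, so the conversion is a
// free reinterpretation and cannot fail.
#[cfg(unix)]
fn os_str_bytes(s: &OsStr) -> Option<&[u8]> {
    use std::os::unix::ffi::OsStrExt;
    Some(s.as_bytes())
}

// Elsewhere the only portable view is via UTF-8, which fails (None) when
// the OS string cannot be represented as valid UTF-8.
#[cfg(not(unix))]
fn os_str_bytes(s: &OsStr) -> Option<&[u8]> {
    s.to_str().map(str::as_bytes)
}
```

On every platform, an OS string built from valid UTF-8 such as "foo" converts successfully; only the handling of non-UTF-8 input differs.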
pub fn from_path(path: &Path) -> Option<&BStr>
Create an immutable byte string from a file path.
On Unix, this always succeeds and is zero cost. On non-Unix systems,
this returns None if the given path is not valid UTF-8. (For example,
on Windows, file paths are allowed to be a sequence of arbitrary 16-bit
integers. Not all such sequences can be transcoded to valid UTF-8.)
§Examples
Basic usage:
use std::path::Path;
use bstr::BStr;
let path = Path::new("foo");
let bs = BStr::from_path(path).expect("should be valid UTF-8");
assert_eq!(bs, "foo");
pub fn len(&self) -> usize
Returns the length, in bytes, of this byte string.
§Examples
Basic usage:
use bstr::BStr;
assert_eq!(0, BStr::new("").len());
assert_eq!(3, BStr::new("abc").len());
assert_eq!(8, BStr::new("☃βツ").len());
pub fn is_empty(&self) -> bool
Returns true if and only if the length of this byte string is zero.
§Examples
Basic usage:
use bstr::BStr;
assert!(BStr::new("").is_empty());
assert!(!BStr::new("abc").is_empty());
pub fn as_bytes(&self) -> &[u8]
Returns an immutable byte slice of this BStr’s contents.
§Examples
Basic usage:
use bstr::B;
let s = B("hello");
assert_eq!(&[104, 101, 108, 108, 111], s.as_bytes());
pub fn as_bytes_mut(&mut self) -> &mut [u8]
Returns a mutable byte slice of this BStr’s contents.
§Examples
Basic usage:
use bstr::BString;
let mut s = BString::from("hello");
s.as_bytes_mut()[1] = b'a';
assert_eq!(&[104, 97, 108, 108, 111], s.as_bytes());
pub fn to_bstring(&self) -> BString
Create a new owned byte string from this byte string slice.
§Examples
Basic usage:
use bstr::BStr;
let s = BStr::new("abc");
let mut owned = s.to_bstring();
owned.push_char('d');
assert_eq!("abcd", owned);
pub fn to_str(&self) -> Result<&str, Utf8Error>
Safely convert this byte string into a &str if it’s valid UTF-8.
If this byte string is not valid UTF-8, then an error is returned. The error returned indicates the first invalid byte found and the length of the error.
In cases where a lossy conversion to &str is acceptable, then use one
of the to_str_lossy or to_str_lossy_into methods.
§Examples
Basic usage:
use bstr::{B, BString};
let s = B("☃βツ").to_str()?;
assert_eq!("☃βツ", s);
let mut bstring = BString::from("☃βツ");
bstring.push_byte(b'\xFF');
let err = bstring.to_str().unwrap_err();
assert_eq!(8, err.valid_up_to());
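A common use of the error's valid_up_to() value is to salvage the valid prefix. The sketch below does this with std::str::from_utf8 on a plain byte slice, which is assumed here to behave analogously to to_str; the function name is hypothetical.

```rust
// Return the longest prefix of `bytes` that is valid UTF-8.
fn valid_prefix(bytes: &[u8]) -> &str {
    match std::str::from_utf8(bytes) {
        Ok(s) => s,
        Err(err) => {
            // `valid_up_to` is the byte offset of the first invalid byte, so
            // everything before it is guaranteed to validate.
            std::str::from_utf8(&bytes[..err.valid_up_to()])
                .expect("prefix was already validated")
        }
    }
}
```

The second validation pass keeps the sketch free of unsafe code. Where that cost matters, an unchecked conversion of the prefix is the usual alternative, subject to the safety requirements of the unchecked API.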
pub unsafe fn to_str_unchecked(&self) -> &str
Unsafely convert this byte string into a &str, without checking for
valid UTF-8.
§Safety
Callers must ensure that this byte string is valid UTF-8 before
calling this method. Converting a byte string into a &str that is
not valid UTF-8 is considered undefined behavior.
This routine is useful in performance sensitive contexts where the
UTF-8 validity of the byte string is already known and it is
undesirable to pay the cost of an additional UTF-8 validation check
that to_str performs.
§Examples
Basic usage:
use bstr::B;
// SAFETY: This is safe because string literals are guaranteed to be
// valid UTF-8 by the Rust compiler.
let s = unsafe { B("☃βツ").to_str_unchecked() };
assert_eq!("☃βツ", s);
pub fn to_str_lossy(&self) -> Cow<'_, str>
Convert this byte string to a valid UTF-8 string by replacing invalid
UTF-8 bytes with the Unicode replacement codepoint (U+FFFD).
If the byte string is already valid UTF-8, then no copying or allocation is performed and a borrowed string slice is returned. If the byte string is not valid UTF-8, then an owned string buffer is returned with invalid bytes replaced by the replacement codepoint.
This method uses the “substitution of maximal subparts” (Unicode Standard, Chapter 3, Section 9) strategy for inserting the replacement codepoint. Specifically, a replacement codepoint is inserted whenever a byte is found that cannot possibly lead to a valid code unit sequence. If there were previous bytes that represented a prefix of a well-formed code unit sequence, then all of those bytes are substituted with a single replacement codepoint. The “substitution of maximal subparts” strategy is the same strategy used by W3C’s Encoding standard. For a more precise description of the maximal subpart strategy, see the Unicode Standard, Chapter 3, Section 9. See also Public Review Issue #121.
N.B. Rust’s standard library also appears to use the same strategy, but it does not appear to be an API guarantee.
§Examples
Basic usage:
use std::borrow::Cow;
use bstr::BString;
let mut bstring = BString::from("☃βツ");
assert_eq!(Cow::Borrowed("☃βツ"), bstring.to_str_lossy());
// Add a byte that makes the sequence invalid.
bstring.push_byte(b'\xFF');
assert_eq!(Cow::Borrowed("☃βツ\u{FFFD}"), bstring.to_str_lossy());
This demonstrates the “maximal subpart” substitution logic.
use bstr::B;
// \x61 is the ASCII codepoint for 'a'.
// \xF1\x80\x80 is a valid 3-byte code unit prefix.
// \xE1\x80 is a valid 2-byte code unit prefix.
// \xC2 is a valid 1-byte code unit prefix.
// \x62 is the ASCII codepoint for 'b'.
//
// In sum, each of the prefixes is replaced by a single replacement
// codepoint since none of the prefixes are properly completed. This
// is in contrast to other strategies that might insert a replacement
// codepoint for every single byte.
let bs = B(b"\x61\xF1\x80\x80\xE1\x80\xC2\x62");
assert_eq!("a\u{FFFD}\u{FFFD}\u{FFFD}b", bs.to_str_lossy());
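The borrowed-versus-owned distinction can be observed directly with the standard library's String::from_utf8_lossy, which is used here as a stand-in. As noted above, its substitution strategy matches in practice but is not a documented guarantee, and the helper name is hypothetical.

```rust
use std::borrow::Cow;

// Lossily decode `bytes` and report whether the result could be borrowed,
// that is, whether the input was already valid UTF-8 and nothing was copied.
fn decode_lossy(bytes: &[u8]) -> (String, bool) {
    let text = String::from_utf8_lossy(bytes);
    let borrowed = matches!(text, Cow::Borrowed(_));
    (text.into_owned(), borrowed)
}
```

Valid input reports true (no allocation was needed for the decode itself), while the truncated-prefix input from the example above reports false and produces one replacement codepoint per incomplete prefix.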
pub fn to_str_lossy_into(&self, dest: &mut String)
Copy the contents of this byte string into the given owned string
buffer, while replacing invalid UTF-8 code unit sequences with the
Unicode replacement codepoint (U+FFFD).
This method uses the same “substitution of maximal subparts” strategy
for inserting the replacement codepoint as the to_str_lossy method.
This routine is useful for amortizing allocation. However, unlike
to_str_lossy, this routine will always copy the contents of this
byte string into the destination buffer, even if this byte string is
valid UTF-8.
§Examples
Basic usage:
use bstr::BString;
let mut bstring = BString::from("☃βツ");
// Add a byte that makes the sequence invalid.
bstring.push_byte(b'\xFF');
let mut dest = String::new();
bstring.to_str_lossy_into(&mut dest);
assert_eq!("☃βツ\u{FFFD}", dest);
pub fn to_os_str(&self) -> Result<&OsStr, Utf8Error>
Create an OS string slice from this byte string.
On Unix, this always succeeds and is zero cost. On non-Unix systems, this returns a UTF-8 decoding error if this byte string is not valid UTF-8. (For example, on Windows, file paths are allowed to be a sequence of arbitrary 16-bit integers. There is no obvious mapping from an arbitrary sequence of 8-bit integers to an arbitrary sequence of 16-bit integers.)
§Examples
Basic usage:
use bstr::B;
let bs = B("foo");
let os_str = bs.to_os_str().expect("should be valid UTF-8");
assert_eq!(os_str, "foo");
pub fn to_os_str_lossy(&self) -> Cow<'_, OsStr>
Lossily create an OS string slice from this byte string.
On Unix, this always succeeds and is zero cost. On non-Unix systems, this will perform a UTF-8 check and lossily convert this byte string into valid UTF-8 using the Unicode replacement codepoint.
Note that this can prevent the correct roundtripping of file paths on non-Unix systems such as Windows, where file paths are an arbitrary sequence of 16-bit integers.
§Examples
Basic usage:
use bstr::B;
let bs = B(b"foo\xFFbar");
let os_str = bs.to_os_str_lossy();
assert_eq!(os_str.to_string_lossy(), "foo\u{FFFD}bar");
pub fn to_path(&self) -> Result<&Path, Utf8Error>
Create a path slice from this byte string.
On Unix, this always succeeds and is zero cost. On non-Unix systems, this returns a UTF-8 decoding error if this byte string is not valid UTF-8. (For example, on Windows, file paths are allowed to be a sequence of arbitrary 16-bit integers. There is no obvious mapping from an arbitrary sequence of 8-bit integers to an arbitrary sequence of 16-bit integers.)
§Examples
Basic usage:
use bstr::B;
let bs = B("foo");
let path = bs.to_path().expect("should be valid UTF-8");
assert_eq!(path.as_os_str(), "foo");
pub fn to_path_lossy(&self) -> Cow<'_, Path>
Lossily create a path slice from this byte string.
On Unix, this always succeeds and is zero cost. On non-Unix systems, this will perform a UTF-8 check and lossily convert this byte string into valid UTF-8 using the Unicode replacement codepoint.
Note that this can prevent the correct roundtripping of file paths on non-Unix systems such as Windows, where file paths are an arbitrary sequence of 16-bit integers.
§Examples
Basic usage:
use bstr::B;
let bs = B(b"foo\xFFbar");
let path = bs.to_path_lossy();
assert_eq!(path.to_string_lossy(), "foo\u{FFFD}bar");
pub fn contains<B: AsRef<[u8]>>(&self, needle: B) -> bool
Returns true if and only if this byte string contains the given needle.
§Examples
Basic usage:
use bstr::B;
assert!(B("foo bar").contains("foo"));
assert!(B("foo bar").contains("bar"));
assert!(!B("foo").contains("foobar"));
pub fn starts_with<B: AsRef<[u8]>>(&self, prefix: B) -> bool
Returns true if and only if this byte string has the given prefix.
§Examples
Basic usage:
use bstr::B;
assert!(B("foo bar").starts_with("foo"));
assert!(!B("foo bar").starts_with("bar"));
assert!(!B("foo").starts_with("foobar"));
pub fn ends_with<B: AsRef<[u8]>>(&self, suffix: B) -> bool
Returns true if and only if this byte string has the given suffix.
§Examples
Basic usage:
use bstr::B;
assert!(B("foo bar").ends_with("bar"));
assert!(!B("foo bar").ends_with("foo"));
assert!(!B("bar").ends_with("foobar"));
pub fn find<B: AsRef<[u8]>>(&self, needle: B) -> Option<usize>
Returns the index of the first occurrence of the given needle.
The needle may be any type that can be cheaply converted into a &[u8].
This includes, but is not limited to, &str, &BStr, and of course, &[u8]
itself.
Note that if you’re searching for the same needle in many different
small haystacks, it may be faster to initialize a Finder once, and reuse
it for each search.
§Complexity
This routine is guaranteed to have worst case linear time complexity
with respect to both the needle and the haystack. That is, this runs
in O(needle.len() + haystack.len()) time.
This routine is also guaranteed to have worst case constant space complexity.
§Examples
Basic usage:
use bstr::B;
let s = B("foo bar baz");
assert_eq!(Some(0), s.find("foo"));
assert_eq!(Some(4), s.find("bar"));
assert_eq!(None, s.find("quux"));
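To make the return values concrete, here is a deliberately naive forward and reverse substring search over plain byte slices. It is a sketch with hypothetical names: it returns the same byte offsets as the examples in this section, but it runs in O(needle.len() * haystack.len()) time in the worst case, so it does not meet the linear-time guarantee documented above. The handling of an empty needle is an assumption.

```rust
// Byte offset of the first occurrence of `needle`, if any.
fn find_naive(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() {
        // Assumption: an empty needle matches at the very start.
        return Some(0);
    }
    haystack.windows(needle.len()).position(|w| w == needle)
}

// Byte offset of the last occurrence of `needle`, if any.
fn rfind_naive(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() {
        // Assumption: an empty needle matches just past the last byte.
        return Some(haystack.len());
    }
    haystack.windows(needle.len()).rposition(|w| w == needle)
}
```

The offsets are positions in bytes, not in codepoints, which matters as soon as the haystack contains non-ASCII text.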
pub fn rfind<B: AsRef<[u8]>>(&self, needle: B) -> Option<usize>
Returns the index of the last occurrence of the given needle.
The needle may be any type that can be cheaply converted into a &[u8].
This includes, but is not limited to, &str, &BStr, and of course, &[u8]
itself.
Note that if you’re searching for the same needle in many different
small haystacks, it may be faster to initialize a FinderReverse once,
and reuse it for each search.
§Complexity
This routine is guaranteed to have worst case linear time complexity
with respect to both the needle and the haystack. That is, this runs
in O(needle.len() + haystack.len()) time.
This routine is also guaranteed to have worst case constant space complexity.
§Examples
Basic usage:
use bstr::B;
let s = B("foo bar baz");
assert_eq!(Some(0), s.rfind("foo"));
assert_eq!(Some(4), s.rfind("bar"));
assert_eq!(Some(8), s.rfind("ba"));
assert_eq!(None, s.rfind("quux"));
pub fn find_iter<'a, B: ?Sized + AsRef<[u8]>>(&'a self, needle: &'a B) -> Find<'a>
Returns an iterator of the non-overlapping occurrences of the given needle. The iterator yields byte offset positions indicating the start of each match.
§Complexity
This routine is guaranteed to have worst case linear time complexity
with respect to both the needle and the haystack. That is, this runs
in O(needle.len() + haystack.len()) time.
This routine is also guaranteed to have worst case constant space complexity.
§Examples
Basic usage:
use bstr::B;
let s = B("foo bar foo foo quux foo");
let matches: Vec<usize> = s.find_iter("foo").collect();
assert_eq!(matches, vec![0, 8, 12, 21]);
An empty string matches at every position, including the position immediately following the last byte:
use bstr::B;
let matches: Vec<usize> = B("foo").find_iter("").collect();
assert_eq!(matches, vec![0, 1, 2, 3]);
let matches: Vec<usize> = B("").find_iter("").collect();
assert_eq!(matches, vec![0]);
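The non-overlapping and empty-needle rules shown above can be reproduced with a small eager search over plain byte slices. This is a sketch under stated assumptions: the function name is hypothetical, it collects into a Vec rather than returning a lazy iterator, and it uses a naive comparison that does not have the linear-time guarantee.

```rust
// Collect the start offsets of all non-overlapping matches of `needle`.
fn find_all(haystack: &[u8], needle: &[u8]) -> Vec<usize> {
    let mut matches = Vec::new();
    let mut pos = 0;
    while pos + needle.len() <= haystack.len() {
        if &haystack[pos..pos + needle.len()] == needle {
            matches.push(pos);
            // Skip past the match so the next one cannot overlap it. An
            // empty needle still advances by one byte, which is why it
            // matches at every position including the end.
            pos += needle.len().max(1);
        } else {
            pos += 1;
        }
    }
    matches
}
```

Searching "aaaa" for "aa" with this sketch reports offsets 0 and 2 only, since the candidate at offset 1 overlaps the first match.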
pub fn rfind_iter<'a, B: ?Sized + AsRef<[u8]>>(&'a self, needle: &'a B) -> FindReverse<'a>
Returns an iterator of the non-overlapping occurrences of the given needle in reverse. The iterator yields byte offset positions indicating the start of each match.
§Complexity
This routine is guaranteed to have worst case linear time complexity
with respect to both the needle and the haystack. That is, this runs
in O(needle.len() + haystack.len()) time.
This routine is also guaranteed to have worst case constant space complexity.
§Examples
Basic usage:
use bstr::B;
let s = B("foo bar foo foo quux foo");
let matches: Vec<usize> = s.rfind_iter("foo").collect();
assert_eq!(matches, vec![21, 12, 8, 0]);
An empty string matches at every position, including the position immediately following the last byte:
use bstr::B;
let matches: Vec<usize> = B("foo").rfind_iter("").collect();
assert_eq!(matches, vec![3, 2, 1, 0]);
let matches: Vec<usize> = B("").rfind_iter("").collect();
assert_eq!(matches, vec![0]);
pub fn find_byte(&self, byte: u8) -> Option<usize>
Returns the index of the first occurrence of the given byte. If the
byte does not occur in this byte string, then None is returned.
§Examples
Basic usage:
use bstr::B;
assert_eq!(Some(10), B("foo bar baz").find_byte(b'z'));
assert_eq!(None, B("foo bar baz").find_byte(b'y'));
pub fn rfind_byte(&self, byte: u8) -> Option<usize>
Returns the index of the last occurrence of the given byte. If the
byte does not occur in this byte string, then None is returned.
§Examples
Basic usage:
use bstr::B;
assert_eq!(Some(10), B("foo bar baz").rfind_byte(b'z'));
assert_eq!(None, B("foo bar baz").rfind_byte(b'y'));
pub fn find_char(&self, ch: char) -> Option<usize>
Returns the index of the first occurrence of the given codepoint.
If the codepoint does not occur in this byte string, then None is
returned.
Note that if one searches for the replacement codepoint, \u{FFFD},
then only explicit occurrences of that encoding will be found. Invalid
UTF-8 sequences will not be matched.
§Examples
Basic usage:
use bstr::B;
assert_eq!(Some(10), B("foo bar baz").find_char('z'));
assert_eq!(Some(4), B("αβγγδ").find_char('γ'));
assert_eq!(None, B("foo bar baz").find_char('y'));
pub fn rfind_char(&self, ch: char) -> Option<usize>
Returns the index of the last occurrence of the given codepoint.
If the codepoint does not occur in this byte string, then None is
returned.
Note that if one searches for the replacement codepoint, \u{FFFD},
then only explicit occurrences of that encoding will be found. Invalid
UTF-8 sequences will not be matched.
§Examples
Basic usage:
use bstr::B;
assert_eq!(Some(10), B("foo bar baz").rfind_char('z'));
assert_eq!(Some(6), B("αβγγδ").rfind_char('γ'));
assert_eq!(None, B("foo bar baz").rfind_char('y'));
pub fn fields(&self) -> Fields<'_>
Returns an iterator over the fields in a byte string, separated by contiguous whitespace.
§Example
Basic usage:
use bstr::{B, BStr};
let s = B(" foo\tbar\t\u{2003}\nquux \n");
let fields: Vec<&BStr> = s.fields().collect();
assert_eq!(fields, vec!["foo", "bar", "quux"]);
A byte string consisting of just whitespace yields no elements:
use bstr::B;
assert_eq!(0, B(" \n\t\u{2003}\n \t").fields().count());
pub fn fields_with<F: FnMut(char) -> bool>(&self, f: F) -> FieldsWith<'_, F>
Returns an iterator over the fields in a byte string, separated by contiguous codepoints satisfying the given predicate.
If this byte
§Example
Basic usage:
use bstr::{B, BStr};
let s = B("123foo999999bar1quux123456");
let fields: Vec<&BStr> = s.fields_with(|c| c.is_numeric()).collect();
assert_eq!(fields, vec!["foo", "bar", "quux"]);
A byte string consisting of all codepoints satisfying the predicate yields no elements:
use bstr::B;
assert_eq!(0, B("1911354563").fields_with(|c| c.is_numeric()).count());
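For text that is already known to be valid UTF-8, the standard library offers close analogues of both field iterators. The sketch below uses hypothetical wrapper names and assumes &str input, so it says nothing about how invalid UTF-8 is treated by the byte string versions.

```rust
// Fields separated by runs of Unicode whitespace.
fn fields(s: &str) -> Vec<&str> {
    s.split_whitespace().collect()
}

// Fields separated by runs of codepoints that satisfy `pred`. Plain `split`
// yields an empty string between adjacent separators, so those are dropped.
fn fields_with<F: FnMut(char) -> bool>(s: &str, pred: F) -> Vec<&str> {
    s.split(pred).filter(|field| !field.is_empty()).collect()
}
```

The filter step is what distinguishes field splitting from ordinary splitting: runs of separators collapse, and leading or trailing separators produce no empty elements.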
pub fn split<'a, B: ?Sized + AsRef<[u8]>>(&'a self, splitter: &'a B) -> Split<'a>
Returns an iterator over substrings of this byte string, separated by the given byte string. Each element yielded is guaranteed not to include the splitter substring.
The splitter may be any type that can be cheaply converted into a &[u8].
This includes, but is not limited to, &str, &BStr, and of course, &[u8]
itself.
§Examples
Basic usage:
use bstr::{B, BStr};
let x: Vec<&BStr> = B("Mary had a little lamb").split(" ").collect();
assert_eq!(x, vec!["Mary", "had", "a", "little", "lamb"]);
let x: Vec<&BStr> = B("").split("X").collect();
assert_eq!(x, vec![""]);
let x: Vec<&BStr> = B("lionXXtigerXleopard").split("X").collect();
assert_eq!(x, vec!["lion", "", "tiger", "leopard"]);
let x: Vec<&BStr> = B("lion::tiger::leopard").split("::").collect();
assert_eq!(x, vec!["lion", "tiger", "leopard"]);
If a string contains multiple contiguous separators, you will end up with empty strings yielded by the iterator:
use bstr::{B, BStr};
let x: Vec<&BStr> = B("||||a||b|c").split("|").collect();
assert_eq!(x, vec!["", "", "", "", "a", "", "b", "c"]);
let x: Vec<&BStr> = B("(///)").split("/").collect();
assert_eq!(x, vec!["(", "", "", ")"]);
Separators at the start or end of a string are neighbored by empty strings.
use bstr::{B, BStr};
let x: Vec<&BStr> = B("010").split("0").collect();
assert_eq!(x, vec!["", "1", ""]);
When the empty string is used as a separator, it splits every byte in the byte string, along with the beginning and end of the byte string.
use bstr::{B, BStr};
let x: Vec<&BStr> = B("rust").split("").collect();
assert_eq!(x, vec!["", "r", "u", "s", "t", ""]);
// Splitting by an empty string is not UTF-8 aware. Elements yielded
// may not be valid UTF-8!
let x: Vec<&BStr> = B("☃").split("").collect();
assert_eq!(x, vec![B(""), B(b"\xE2"), B(b"\x98"), B(b"\x83"), B("")]);
Contiguous separators, especially whitespace, can lead to possibly surprising behavior. For example, this code is correct:
use bstr::{B, BStr};
let x: Vec<&BStr> = B(" a b c").split(" ").collect();
assert_eq!(x, vec!["", "", "", "", "a", "", "b", "c"]);
It does not give you ["a", "b", "c"]. For that behavior, use fields instead.
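The rules above (empty strings between adjacent separators and at either end) fall out naturally from a simple scan. Here is a minimal eager sketch over plain byte slices; the function name is hypothetical, the empty-separator case is deliberately left out, and the naive matching carries no complexity guarantee.

```rust
// Split `haystack` on every non-overlapping occurrence of `sep`.
fn split_bytes<'a>(haystack: &'a [u8], sep: &[u8]) -> Vec<&'a [u8]> {
    assert!(!sep.is_empty(), "this sketch does not handle an empty separator");
    let mut parts = Vec::new();
    let mut start = 0; // start of the piece currently being accumulated
    let mut pos = 0;
    while pos + sep.len() <= haystack.len() {
        if &haystack[pos..pos + sep.len()] == sep {
            // Emit everything since the previous separator, possibly empty.
            parts.push(&haystack[start..pos]);
            pos += sep.len();
            start = pos;
        } else {
            pos += 1;
        }
    }
    // The tail after the last separator is always emitted, even if empty.
    parts.push(&haystack[start..]);
    parts
}
```

Because the tail is pushed unconditionally, an empty input yields a single empty piece, and an input with n separators always yields n + 1 pieces.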
pub fn rsplit<'a, B: ?Sized + AsRef<[u8]>>(&'a self, splitter: &'a B) -> SplitReverse<'a>
Returns an iterator over substrings of this byte string, separated by the given byte string, in reverse. Each element yielded is guaranteed not to include the splitter substring.
The splitter may be any type that can be cheaply converted into a &[u8].
This includes, but is not limited to, &str, &BStr, and of course, &[u8]
itself.
§Examples
Basic usage:
use bstr::{B, BStr};
let x: Vec<&BStr> = B("Mary had a little lamb").rsplit(" ").collect();
assert_eq!(x, vec!["lamb", "little", "a", "had", "Mary"]);
let x: Vec<&BStr> = B("").rsplit("X").collect();
assert_eq!(x, vec![""]);
let x: Vec<&BStr> = B("lionXXtigerXleopard").rsplit("X").collect();
assert_eq!(x, vec!["leopard", "tiger", "", "lion"]);
let x: Vec<&BStr> = B("lion::tiger::leopard").rsplit("::").collect();
assert_eq!(x, vec!["leopard", "tiger", "lion"]);
If a string contains multiple contiguous separators, you will end up with empty strings yielded by the iterator:
use bstr::{B, BStr};
let x: Vec<&BStr> = B("||||a||b|c").rsplit("|").collect();
assert_eq!(x, vec!["c", "b", "", "a", "", "", "", ""]);
let x: Vec<&BStr> = B("(///)").rsplit("/").collect();
assert_eq!(x, vec![")", "", "", "("]);
Separators at the start or end of a string are neighbored by empty strings.
use bstr::{B, BStr};
let x: Vec<&BStr> = B("010").rsplit("0").collect();
assert_eq!(x, vec!["", "1", ""]);
When the empty string is used as a separator, it splits every byte in the byte string, along with the beginning and end of the byte string.
use bstr::{B, BStr};
let x: Vec<&BStr> = B("rust").rsplit("").collect();
assert_eq!(x, vec!["", "t", "s", "u", "r", ""]);
// Splitting by an empty string is not UTF-8 aware. Elements yielded
// may not be valid UTF-8!
let x: Vec<&BStr> = B("☃").rsplit("").collect();
assert_eq!(x, vec![B(""), B(b"\x83"), B(b"\x98"), B(b"\xE2"), B("")]);
Contiguous separators, especially whitespace, can lead to possibly surprising behavior. For example, this code is correct:
use bstr::{B, BStr};
let x: Vec<&BStr> = B(" a b c").rsplit(" ").collect();
assert_eq!(x, vec!["c", "b", "", "a", "", "", "", ""]);
It does not give you ["a", "b", "c"].
pub fn splitn<'a, B: ?Sized + AsRef<[u8]>>(&'a self, limit: usize, splitter: &'a B) -> SplitN<'a>
Returns an iterator of at most limit substrings of this byte string,
separated by the given byte string. If limit substrings are yielded,
then the last substring will contain the remainder of this byte string.
The splitter may be any type that can be cheaply converted into a &[u8].
This includes, but is not limited to, &str, &BStr, and of course, &[u8]
itself.
§Examples
Basic usage:
use bstr::{B, BStr};
let x: Vec<_> = B("Mary had a little lamb").splitn(3, " ").collect();
assert_eq!(x, vec!["Mary", "had", "a little lamb"]);
let x: Vec<_> = B("").splitn(3, "X").collect();
assert_eq!(x, vec![""]);
let x: Vec<_> = B("lionXXtigerXleopard").splitn(3, "X").collect();
assert_eq!(x, vec!["lion", "", "tigerXleopard"]);
let x: Vec<_> = B("lion::tiger::leopard").splitn(2, "::").collect();
assert_eq!(x, vec!["lion", "tiger::leopard"]);
let x: Vec<_> = B("abcXdef").splitn(1, "X").collect();
assert_eq!(x, vec!["abcXdef"]);
let x: Vec<_> = B("abcXdef").splitn(0, "X").collect();
assert!(x.is_empty());
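The "last substring contains the remainder" rule is what makes a limit of 2 convenient for key/value parsing. The sketch below shows the idea with the standard library's str::splitn, which follows the same rule; the function name and the `=` delimiter are assumptions chosen for illustration.

```rust
// Split a line at the first '=' only, so the value may itself contain '='.
fn split_key_value(line: &str) -> Option<(&str, &str)> {
    let mut parts = line.splitn(2, '=');
    let key = parts.next()?;
    // With a limit of 2, the second item is everything after the first '='.
    let value = parts.next()?;
    Some((key, value))
}
```

A line without any '=' yields only one item from splitn, so the second next() returns None and the function reports that no pair was found.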
pub fn rsplitn<'a, B: ?Sized + AsRef<[u8]>>(&'a self, limit: usize, splitter: &'a B) -> SplitNReverse<'a>
Returns an iterator of at most limit substrings of this byte string,
separated by the given byte string, in reverse. If limit substrings
are yielded, then the last substring will contain the remainder of this
byte string.
The splitter may be any type that can be cheaply converted into a &[u8].
This includes, but is not limited to, &str, &BStr, and of course, &[u8]
itself.
§Examples
Basic usage:
use bstr::{B, BStr};
let x: Vec<_> = B("Mary had a little lamb").rsplitn(3, " ").collect();
assert_eq!(x, vec!["lamb", "little", "Mary had a"]);
let x: Vec<_> = B("").rsplitn(3, "X").collect();
assert_eq!(x, vec![""]);
let x: Vec<_> = B("lionXXtigerXleopard").rsplitn(3, "X").collect();
assert_eq!(x, vec!["leopard", "tiger", "lionX"]);
let x: Vec<_> = B("lion::tiger::leopard").rsplitn(2, "::").collect();
assert_eq!(x, vec!["leopard", "lion::tiger"]);
let x: Vec<_> = B("abcXdef").rsplitn(1, "X").collect();
assert_eq!(x, vec!["abcXdef"]);
let x: Vec<_> = B("abcXdef").rsplitn(0, "X").collect();
assert!(x.is_empty());
pub fn replace<N: AsRef<[u8]>, R: AsRef<[u8]>>(&self, needle: N, replacement: R) -> BString
Replace all matches of the given needle with the given replacement, and
the result as a new BString
.
This routine is useful as a convenience. If you need to reuse an
allocation, use replace_into
instead.
§Examples
Basic usage:
use bstr::B;
let s = B("this is old").replace("old", "new");
assert_eq!(s, "this is new");
When the pattern doesn’t match:
use bstr::B;
let s = B("this is old").replace("nada nada", "limonada");
assert_eq!(s, "this is old");
When the needle is an empty string:
use bstr::B;
let s = B("foo").replace("", "Z");
assert_eq!(s, "ZfZoZoZ");
pub fn replacen<N: AsRef<[u8]>, R: AsRef<[u8]>>(&self, needle: N, replacement: R, limit: usize) -> BString
Replace up to limit matches of the given needle with the given replacement, and return the result as a new BString.
This routine is useful as a convenience. If you need to reuse an allocation, use replacen_into instead.
§Examples
Basic usage:
use bstr::B;
let s = B("foofoo").replacen("o", "z", 2);
assert_eq!(s, "fzzfoo");
When the pattern doesn’t match:
use bstr::B;
let s = B("foofoo").replacen("a", "z", 2);
assert_eq!(s, "foofoo");
When the needle is an empty string:
use bstr::B;
let s = B("foo").replacen("", "Z", 2);
assert_eq!(s, "ZfZoo");
pub fn replace_into<N: AsRef<[u8]>, R: AsRef<[u8]>>(&self, needle: N, replacement: R, dest: &mut BString)
Replace all matches of the given needle with the given replacement, and write the result into the provided BString.
This does not clear dest before writing to it.
This routine is useful for reusing an allocation. For a more convenient API, use replace instead.
§Examples
Basic usage:
use bstr::{B, BString};
let s = B("this is old");
let mut dest = BString::new();
s.replace_into("old", "new", &mut dest);
assert_eq!(dest, "this is new");
When the pattern doesn’t match:
use bstr::{B, BString};
let s = B("this is old");
let mut dest = BString::new();
s.replace_into("nada nada", "limonada", &mut dest);
assert_eq!(dest, "this is old");
When the needle is an empty string:
use bstr::{B, BString};
let s = B("foo");
let mut dest = BString::new();
s.replace_into("", "Z", &mut dest);
assert_eq!(dest, "ZfZoZoZ");
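Because dest is not cleared, anything already in it stays in front of the newly written bytes. The sketch below models that behavior on a plain Vec<u8> with only the standard library. replace_into_sketch is a hypothetical helper rather than the crate's implementation, and it assumes a non-empty needle.

```rust
/// Hypothetical helper (not part of bstr): appends `haystack` to `dest`,
/// with every occurrence of `needle` swapped for `replacement`.
fn replace_into_sketch(haystack: &[u8], needle: &[u8], replacement: &[u8], dest: &mut Vec<u8>) {
    // Assumption of this sketch: the needle is non-empty.
    assert!(!needle.is_empty());
    let mut i = 0;
    while i < haystack.len() {
        if haystack[i..].starts_with(needle) {
            // A match: write the replacement and skip past the needle.
            dest.extend_from_slice(replacement);
            i += needle.len();
        } else {
            // No match here: copy one byte through unchanged.
            dest.push(haystack[i]);
            i += 1;
        }
    }
}

fn main() {
    // Existing contents survive, since the destination is only appended to.
    let mut dest = b"log: ".to_vec();
    replace_into_sketch(b"this is old", b"old", b"new", &mut dest);
    assert_eq!(dest, b"log: this is new".to_vec());

    // To reuse the allocation for an unrelated result, clear it first.
    dest.clear();
    replace_into_sketch(b"foofoo", b"o", b"z", &mut dest);
    assert_eq!(dest, b"fzzfzz".to_vec());
}
```

Calling clear keeps the buffer's capacity, so the second call can reuse the earlier allocation when the result fits.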
pub fn replacen_into<N: AsRef<[u8]>, R: AsRef<[u8]>>(&self, needle: N, replacement: R, limit: usize, dest: &mut BString)
Replace up to limit matches of the given needle with the given replacement, and write the result into the provided BString.
This does not clear dest before writing to it.
This routine is useful for reusing an allocation. For a more convenient API, use replacen instead.
§Examples
Basic usage:
use bstr::{B, BString};
let s = B("foofoo");
let mut dest = BString::new();
s.replacen_into("o", "z", 2, &mut dest);
assert_eq!(dest, "fzzfoo");
When the pattern doesn’t match:
use bstr::{B, BString};
let s = B("foofoo");
let mut dest = BString::new();
s.replacen_into("a", "z", 2, &mut dest);
assert_eq!(dest, "foofoo");
When the needle is an empty string:
use bstr::{B, BString};
let s = B("foo");
let mut dest = BString::new();
s.replacen_into("", "Z", 2, &mut dest);
assert_eq!(dest, "ZfZoo");
pub fn bytes(&self) -> Bytes<'_>
Returns an iterator over the bytes in this byte string.
§Examples
Basic usage:
use bstr::B;
let bs = B("foobar");
let bytes: Vec<u8> = bs.bytes().collect();
assert_eq!(bytes, bs);
pub fn chars(&self) -> Chars<'_>
Returns an iterator over the Unicode scalar values in this byte string. If invalid UTF-8 is encountered, then the Unicode replacement codepoint is yielded instead.
§Examples
Basic usage:
use bstr::B;
let bs = B(b"\xE2\x98\x83\xFF\xF0\x9D\x9E\x83\xE2\x98\x61");
let chars: Vec<char> = bs.chars().collect();
assert_eq!(vec!['☃', '\u{FFFD}', '𝞃', '\u{FFFD}', 'a'], chars);
Codepoints can also be iterated over in reverse:
use bstr::B;
let bs = B(b"\xE2\x98\x83\xFF\xF0\x9D\x9E\x83\xE2\x98\x61");
let chars: Vec<char> = bs.chars().rev().collect();
assert_eq!(vec!['a', '\u{FFFD}', '𝞃', '\u{FFFD}', '☃'], chars);
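For comparison, the standard library's lossy decoder appears to apply the same substitution policy to these bytes: one replacement codepoint per maximal invalid sequence. The sketch below uses only std; lossy_chars is a hypothetical helper name.

```rust
/// Hypothetical helper (not part of bstr): decodes bytes lossily with the
/// standard library and collects the resulting codepoints.
fn lossy_chars(bytes: &[u8]) -> Vec<char> {
    String::from_utf8_lossy(bytes).chars().collect()
}

fn main() {
    // Same input as the example above: a snowman, a stray 0xFF, a
    // four-byte codepoint, a truncated three-byte sequence, then 'a'.
    let bytes = b"\xE2\x98\x83\xFF\xF0\x9D\x9E\x83\xE2\x98\x61";
    assert_eq!(
        lossy_chars(bytes),
        vec!['\u{2603}', '\u{FFFD}', '\u{1D783}', '\u{FFFD}', 'a']
    );
}
```

Unlike chars, this route allocates a new String whenever the input contains invalid UTF-8.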
pub fn char_indices(&self) -> CharIndices<'_>
Returns an iterator over the Unicode scalar values in this byte string along with their starting and ending byte index positions. If invalid UTF-8 is encountered, then the Unicode replacement codepoint is yielded instead.
Note that this is slightly different from the CharIndices iterator provided by the standard library. Aside from working on possibly invalid UTF-8, this iterator provides both the corresponding starting and ending byte indices of each codepoint yielded. The ending position is necessary to slice the original byte string when invalid UTF-8 bytes are converted into a Unicode replacement codepoint, since a single replacement codepoint can substitute anywhere from 1 to 3 invalid bytes (inclusive).
§Examples
Basic usage:
use bstr::B;
let bs = B(b"\xE2\x98\x83\xFF\xF0\x9D\x9E\x83\xE2\x98\x61");
let chars: Vec<(usize, usize, char)> = bs.char_indices().collect();
assert_eq!(chars, vec![
(0, 3, '☃'),
(3, 4, '\u{FFFD}'),
(4, 8, '𝞃'),
(8, 10, '\u{FFFD}'),
(10, 11, 'a'),
]);
Codepoints can also be iterated over in reverse:
use bstr::B;
let bs = B(b"\xE2\x98\x83\xFF\xF0\x9D\x9E\x83\xE2\x98\x61");
let chars: Vec<(usize, usize, char)> = bs
.char_indices()
.rev()
.collect();
assert_eq!(chars, vec![
(10, 11, 'a'),
(8, 10, '\u{FFFD}'),
(4, 8, '𝞃'),
(3, 4, '\u{FFFD}'),
(0, 3, '☃'),
]);
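To illustrate why the end offset is needed, here is a minimal sketch that computes the same (start, end, char) triples with only the standard library, using the error information returned by std::str::from_utf8. char_spans is a hypothetical helper and is not how the crate implements its iterator.

```rust
/// Hypothetical helper (not part of bstr): yields (start, end, char) for
/// each codepoint, substituting U+FFFD for each invalid byte sequence.
fn char_spans(mut bytes: &[u8]) -> Vec<(usize, usize, char)> {
    let mut out = Vec::new();
    let mut offset = 0;
    while !bytes.is_empty() {
        // Split the input into a valid prefix and the length of the
        // invalid sequence (if any) that follows it.
        let (valid, bad_len) = match std::str::from_utf8(bytes) {
            Ok(s) => (s, 0),
            Err(e) => {
                let valid = std::str::from_utf8(&bytes[..e.valid_up_to()]).unwrap();
                // error_len() is None when the input ends mid-sequence;
                // in that case the rest of the input is the invalid part.
                let bad = e.error_len().unwrap_or(bytes.len() - e.valid_up_to());
                (valid, bad)
            }
        };
        for (i, c) in valid.char_indices() {
            out.push((offset + i, offset + i + c.len_utf8(), c));
        }
        offset += valid.len();
        if bad_len > 0 {
            // One replacement codepoint covers the whole invalid sequence,
            // so its byte span cannot be derived from the char alone.
            out.push((offset, offset + bad_len, '\u{FFFD}'));
            offset += bad_len;
        }
        bytes = &bytes[valid.len() + bad_len..];
    }
    out
}

fn main() {
    let spans = char_spans(b"\xE2\x98\x83\xFF\xF0\x9D\x9E\x83\xE2\x98\x61");
    // The second replacement codepoint spans two bytes (8..10), even
    // though U+FFFD itself would encode to three bytes.
    assert_eq!(spans[3], (8, 10, '\u{FFFD}'));
    assert_eq!(spans.len(), 5);
}
```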
pub fn lines(&self) -> Lines<'_>
An iterator over all lines in a byte string, without their terminators.
For this iterator, the only line terminators recognized are \r\n and \n.
§Examples
Basic usage:
use bstr::{B, BStr};
let s = B("\
foo

bar\r
baz


quux");
let lines: Vec<&BStr> = s.lines().collect();
assert_eq!(lines, vec![
"foo", "", "bar", "baz", "", "", "quux",
]);
pub fn lines_with_terminator(&self) -> LinesWithTerminator<'_>
An iterator over all lines in a byte string, including their terminators.
For this iterator, the only line terminator recognized is \n. (Since line terminators are included, this also handles \r\n line endings.)
Line terminators are only included if they are present in the original byte string. For example, the last line in a byte string may not end with a line terminator.
Concatenating all elements yielded by this iterator is guaranteed to yield the original byte string.
§Examples
Basic usage:
use bstr::{B, BStr};
let s = B("\
foo

bar\r
baz


quux");
let lines: Vec<&BStr> = s.lines_with_terminator().collect();
assert_eq!(lines, vec![
"foo\n", "\n", "bar\r\n", "baz\n", "\n", "\n", "quux",
]);
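The round-trip guarantee can be sketched with the standard library's split_inclusive on plain byte slices, which also keeps each \n attached to its line. Both function names below are hypothetical. The rule of stripping \r only when it directly precedes \n is an assumption about how lines treats terminators, chosen to be consistent with the examples above.

```rust
/// Hypothetical helper (not part of bstr): splits after every b'\n',
/// keeping the terminator attached to its line.
fn lines_with_terminator_sketch(bytes: &[u8]) -> Vec<&[u8]> {
    bytes.split_inclusive(|&b| b == b'\n').collect()
}

/// Hypothetical helper: drops a trailing "\r\n" or "\n" if present.
fn strip_terminator(line: &[u8]) -> &[u8] {
    match line {
        [rest @ .., b'\r', b'\n'] => rest,
        [rest @ .., b'\n'] => rest,
        _ => line,
    }
}

fn main() {
    let text = b"foo\n\nbar\r\nbaz\n\n\nquux";
    let with_terminators = lines_with_terminator_sketch(text);

    // Concatenating the pieces reproduces the input exactly.
    assert_eq!(with_terminators.concat(), text.to_vec());

    // Stripping terminators afterwards gives the terminator-free view.
    let stripped: Vec<&[u8]> = with_terminators
        .iter()
        .map(|line| strip_terminator(line))
        .collect();
    assert_eq!(stripped[2], &b"bar"[..]);
    assert_eq!(stripped.len(), 7);
}
```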
pub fn trim_start(&self) -> &BStr
Return a byte string slice with leading whitespace removed.
Whitespace is defined according to the terms of the White_Space Unicode property.
§Examples
Basic usage:
use bstr::B;
let s = B(" foo\tbar\t\u{2003}\n");
assert_eq!(s.trim_start(), "foo\tbar\t\u{2003}\n");
pub fn trim_end(&self) -> &BStr
Return a byte string slice with trailing whitespace removed.
Whitespace is defined according to the terms of the White_Space Unicode property.
§Examples
Basic usage:
use bstr::B;
let s = B(" foo\tbar\t\u{2003}\n");
assert_eq!(s.trim_end(), " foo\tbar");
pub fn trim_with<F: FnMut(char) -> bool>(&self, trim: F) -> &BStr
Return a byte string slice with leading and trailing characters satisfying the given predicate removed.
§Examples
Basic usage:
use bstr::B;
let s = B("123foo5bar789");
assert_eq!(s.trim_with(|c| c.is_numeric()), "foo5bar");
pub fn trim_start_with<F: FnMut(char) -> bool>(&self, trim: F) -> &BStr
Return a byte string slice with leading characters satisfying the given predicate removed.
§Examples
Basic usage:
use bstr::B;
let s = B("123foo5bar789");
assert_eq!(s.trim_start_with(|c| c.is_numeric()), "foo5bar789");
pub fn trim_end_with<F: FnMut(char) -> bool>(&self, trim: F) -> &BStr
Return a byte string slice with trailing characters satisfying the given predicate removed.
§Examples
Basic usage:
use bstr::B;
let s = B("123foo5bar789");
assert_eq!(s.trim_end_with(|c| c.is_numeric()), "123foo5bar");
pub fn to_ascii_lowercase(&self) -> BString
Returns a new BString containing the ASCII lowercase equivalent of this byte string.
In this case, lowercase is only defined in ASCII letters. Namely, the letters A-Z are converted to a-z. All other bytes remain unchanged. In particular, the length of the byte string returned is always equivalent to the length of this byte string.
If you’d like to reuse an allocation for performance reasons, then use make_ascii_lowercase to perform the conversion in place.
§Examples
Basic usage:
use bstr::{B, BString};
let s = B("HELLO Β");
assert_eq!("hello Β", s.to_ascii_lowercase());
Invalid UTF-8 remains as is:
use bstr::{B, BString};
let s = B(b"FOO\xFFBAR\xE2\x98BAZ");
assert_eq!(B(b"foo\xFFbar\xE2\x98baz"), s.to_ascii_lowercase());
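The byte-wise rule can be spelled out in a few lines. The sketch below uses only the standard library and compares a hand-written mapping against std's own slice method of the same name; ascii_lower_sketch is a hypothetical helper.

```rust
/// Hypothetical helper (not part of bstr): maps b'A'..=b'Z' to
/// b'a'..=b'z' and passes every other byte through untouched.
fn ascii_lower_sketch(bytes: &[u8]) -> Vec<u8> {
    bytes
        .iter()
        .map(|&b| if b.is_ascii_uppercase() { b + 32 } else { b })
        .collect()
}

fn main() {
    let original: &[u8] = b"FOO\xFFBAR\xE2\x98BAZ";
    let lowered = ascii_lower_sketch(original);

    // Bytes outside A-Z, including invalid UTF-8, are left as they are.
    assert_eq!(lowered, b"foo\xFFbar\xE2\x98baz".to_vec());
    // One output byte per input byte, so the length never changes.
    assert_eq!(lowered.len(), original.len());
    // The standard library's slice method agrees with the sketch.
    assert_eq!(lowered, original.to_ascii_lowercase());
}
```

Because the mapping is strictly one byte to one byte, the in-place variant never needs to grow or shrink the buffer.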
pub fn make_ascii_lowercase(&mut self)
Convert this byte string to its lowercase ASCII equivalent in place.
In this case, lowercase is only defined in ASCII letters. Namely, the letters A-Z are converted to a-z. All other bytes remain unchanged.
If you don’t need to do the conversion in place and instead prefer convenience, then use to_ascii_lowercase instead.
§Examples
Basic usage:
use bstr::BString;
let mut s = BString::from("HELLO Β");
s.make_ascii_lowercase();
assert_eq!("hello Β", s);
Invalid UTF-8 remains as is:
use bstr::{B, BString};
let mut s = BString::from_slice(b"FOO\xFFBAR\xE2\x98BAZ");
s.make_ascii_lowercase();
assert_eq!(B(b"foo\xFFbar\xE2\x98baz"), s);
pub fn to_ascii_uppercase(&self) -> BString
Returns a new BString containing the ASCII uppercase equivalent of this byte string.
In this case, uppercase is only defined in ASCII letters. Namely, the letters a-z are converted to A-Z. All other bytes remain unchanged. In particular, the length of the byte string returned is always equivalent to the length of this byte string.
If you’d like to reuse an allocation for performance reasons, then use make_ascii_uppercase to perform the conversion in place.
§Examples
Basic usage:
use bstr::{B, BString};
let s = B("hello β");
assert_eq!("HELLO β", s.to_ascii_uppercase());
Invalid UTF-8 remains as is:
use bstr::{B, BString};
let s = B(b"foo\xFFbar\xE2\x98baz");
assert_eq!(B(b"FOO\xFFBAR\xE2\x98BAZ"), s.to_ascii_uppercase());
pub fn make_ascii_uppercase(&mut self)
Convert this byte string to its uppercase ASCII equivalent in place.
In this case, uppercase is only defined in ASCII letters. Namely, the letters a-z are converted to A-Z. All other bytes remain unchanged.
If you don’t need to do the conversion in place and instead prefer convenience, then use to_ascii_uppercase instead.
§Examples
Basic usage:
use bstr::BString;
let mut s = BString::from("hello β");
s.make_ascii_uppercase();
assert_eq!("HELLO β", s);
Invalid UTF-8 remains as is:
use bstr::{B, BString};
let mut s = BString::from_slice(b"foo\xFFbar\xE2\x98baz");
s.make_ascii_uppercase();
assert_eq!(B(b"FOO\xFFBAR\xE2\x98BAZ"), s);
pub fn reverse_bytes(&mut self)
Reverse the bytes in this string, in place.
Note that this is not necessarily a well-formed operation. For example, if this byte string contains valid UTF-8 that isn’t ASCII, then reversing the string will likely result in invalid UTF-8 and otherwise nonsensical content.
§Examples
Basic usage:
use bstr::BString;
let mut s = BString::from("hello");
s.reverse_bytes();
assert_eq!(s, "olleh");
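The warning about non-ASCII input can be checked with the standard library alone. In this sketch, reversed_bytes is a hypothetical helper that reverses a copy of a string's bytes, standing in for the in-place operation.

```rust
/// Hypothetical helper (not part of bstr): returns the bytes of `s` in
/// reverse order.
fn reversed_bytes(s: &str) -> Vec<u8> {
    let mut bytes = s.as_bytes().to_vec();
    bytes.reverse();
    bytes
}

fn main() {
    // ASCII survives a byte-wise reversal.
    assert_eq!(reversed_bytes("hello"), b"olleh".to_vec());
    assert!(std::str::from_utf8(&reversed_bytes("hello")).is_ok());

    // U+2603 (snowman) is encoded as E2 98 83. Reversed, the sequence
    // starts with a continuation byte, which is not valid UTF-8.
    let snowman = reversed_bytes("\u{2603}");
    assert_eq!(snowman, vec![0x83, 0x98, 0xE2]);
    assert!(std::str::from_utf8(&snowman).is_err());
}
```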
pub fn reverse_chars(&mut self)
Reverse the codepoints in this string, in place.
If this byte string is valid UTF-8, then its reversal by codepoint is also guaranteed to be valid UTF-8.
This operation is equivalent to the following, but without allocating:
use bstr::BString;
let mut s = BString::from("foo☃bar");
let mut chars: Vec<char> = s.chars().collect();
chars.reverse();
let reversed: String = chars.into_iter().collect();
assert_eq!(reversed, "rab☃oof");
Note that this is not necessarily a well-formed operation. For example, if this byte string contains grapheme clusters with more than one codepoint, then those grapheme clusters will not necessarily be preserved. If you’d like to preserve grapheme clusters, then use reverse_graphemes instead.
§Examples
Basic usage:
use bstr::BString;
let mut s = BString::from("foo☃bar");
s.reverse_chars();
assert_eq!(s, "rab☃oof");
This example shows that not all reversals lead to a well formed string. For example, in this case, combining marks are used to put accents over some letters, and those accent marks must appear after the codepoints they modify.
use bstr::{B, BString};
// Decomposed form: each accent is a separate combining codepoint (U+0301).
let mut s = BString::from("re\u{301}sume\u{301}");
s.reverse_chars();
assert_eq!(s, B(b"\xCC\x81emus\xCC\x81er"));
A word of warning: the above example relies on the fact that résumé is in decomposed normal form, which means there are separate codepoints for the accents above e. If it is instead in composed normal form, then the example works:
use bstr::{B, BString};
// Composed form: é is the single codepoint U+00E9.
let mut s = BString::from("r\u{e9}sum\u{e9}");
s.reverse_chars();
assert_eq!(s, "\u{e9}mus\u{e9}r");
The point here is to be cautious and not assume that just because reverse_chars works in one case, that it therefore works in all cases.
pub fn is_ascii(&self) -> bool
Returns true if and only if every byte in this byte string is ASCII.
ASCII is an encoding that defines 128 codepoints. A byte corresponds to an ASCII codepoint if and only if it is in the inclusive range [0, 127].
§Examples
Basic usage:
use bstr::B;
assert!(B("abc").is_ascii());
assert!(!B("☃βツ").is_ascii());
assert!(!B(b"\xFF").is_ascii());
pub fn is_utf8(&self) -> bool
Returns true if and only if the entire byte string is valid UTF-8.
If you need location information about where a byte string’s first invalid UTF-8 byte is, then use the to_str method.
§Examples
Basic usage:
use bstr::B;
assert!(B("abc").is_utf8());
assert!(B("☃βツ").is_utf8());
// invalid bytes
assert!(!B(b"abc\xFF").is_utf8());
// surrogate encoding
assert!(!B(b"\xED\xA0\x80").is_utf8());
// incomplete sequence
assert!(!B(b"\xF0\x9D\x9Ca").is_utf8());
// overlong sequence
assert!(!B(b"\xF0\x82\x82\xAC").is_utf8());
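As a rough illustration of what "location information" means, the sketch below uses the standard library's std::str::from_utf8, whose error reports how many leading bytes were valid. first_invalid_offset is a hypothetical helper, and it is an assumption here that the error returned by to_str exposes comparable information.

```rust
/// Hypothetical helper (not part of bstr): returns the offset of the
/// first invalid byte, or None if the input is entirely valid UTF-8.
fn first_invalid_offset(bytes: &[u8]) -> Option<usize> {
    std::str::from_utf8(bytes).err().map(|e| e.valid_up_to())
}

fn main() {
    // Entirely valid input has no invalid offset.
    assert_eq!(first_invalid_offset(b"abc"), None);
    // The stray 0xFF sits right after three valid bytes.
    assert_eq!(first_invalid_offset(b"abc\xFF"), Some(3));
    // An incomplete sequence at the very start is invalid from offset 0.
    assert_eq!(first_invalid_offset(b"\xF0\x9D\x9Ca"), Some(0));
}
```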
pub fn split_at(&self, at: usize) -> (&BStr, &BStr)
Divides this byte string into two at an index.
The first byte string will contain all bytes at indices [0, at), and the second byte string will contain all bytes at indices [at, len).
§Panics
Panics if at > len.
§Examples
Basic usage:
use bstr::B;
assert_eq!(B("foobar").split_at(3), (B("foo"), B("bar")));
assert_eq!(B("foobar").split_at(0), (B(""), B("foobar")));
assert_eq!(B("foobar").split_at(6), (B("foobar"), B("")));
pub fn split_at_mut(&mut self, at: usize) -> (&mut BStr, &mut BStr)
Divides this mutable byte string into two at an index.
The first byte string will contain all bytes at indices [0, at), and the second byte string will contain all bytes at indices [at, len).
§Panics
Panics if at > len.
§Examples
Basic usage:
use bstr::{B, BString};
let mut b = BString::from("foobar");
{
let (left, right) = b.split_at_mut(3);
left[2] = b'z';
right[2] = b'z';
}
assert_eq!(b, B("fozbaz"));
pub fn get<I: SliceIndex>(&self, at: I) -> Option<&I::Output>
Retrieve a reference to a byte or a subslice, depending on the type of the index given.
If given a position, this returns a reference to the byte at that position, if it exists.
If given a range, this returns the slice of bytes corresponding to that range in this byte string.
In the case of invalid indices, this returns None.
§Examples
Basic usage:
use bstr::B;
let s = B("baz");
assert_eq!(s.get(1), Some(&b'a'));
assert_eq!(s.get(0..2), Some(B("ba")));
assert_eq!(s.get(2..), Some(B("z")));
assert_eq!(s.get(1..=2), Some(B("az")));
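The examples above only use valid indices. Plain byte slices in the standard library have a get method with the same shape, so the None case can be sketched with std alone. prefix_or_all is a hypothetical helper showing one use: truncating without risking a panic.

```rust
/// Hypothetical helper (not part of bstr): returns the first `n` bytes,
/// or the whole slice when it is shorter than `n`.
fn prefix_or_all(bytes: &[u8], n: usize) -> &[u8] {
    // `get` yields None for an out-of-bounds range instead of panicking.
    bytes.get(..n).unwrap_or(bytes)
}

fn main() {
    let s: &[u8] = b"baz";
    // An index or range past the end yields None.
    assert_eq!(s.get(3), None);
    assert_eq!(s.get(2..5), None);
    // A range that starts exactly at the end is valid and empty.
    assert_eq!(s.get(3..), Some(&b""[..]));

    assert_eq!(prefix_or_all(s, 2), &b"ba"[..]);
    assert_eq!(prefix_or_all(s, 10), &b"baz"[..]);
}
```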
pub fn get_mut<I: SliceIndex>(&mut self, at: I) -> Option<&mut I::Output>
Retrieve a mutable reference to a byte or a subslice, depending on the type of the index given.
If given a position, this returns a mutable reference to the byte at that position, if it exists.
If given a range, this returns the mutable slice of bytes corresponding to that range in this byte string.
In the case of invalid indices, this returns None.
§Examples
Basic usage:
use bstr::BString;
let mut s = BString::from("baz");
if let Some(mut slice) = s.get_mut(1..) {
slice[0] = b'o';
slice[1] = b'p';
}
assert_eq!(s, "bop");
pub unsafe fn get_unchecked<I: SliceIndex>(&self, at: I) -> &I::Output
Retrieve a reference to a byte or a subslice, depending on the type of the index given, while explicitly eliding bounds checks.
If given a position, this returns a reference to the byte at that position.
If given a range, this returns the slice of bytes corresponding to that range in this byte string.
§Safety
Callers must ensure that the supplied bounds are correct. If they are out of bounds, then this results in undefined behavior. For a safe alternative, use get.
§Examples
Basic usage:
use bstr::B;
let s = B("baz");
unsafe {
assert_eq!(s.get_unchecked(1), &b'a');
assert_eq!(s.get_unchecked(0..2), "ba");
assert_eq!(s.get_unchecked(2..), "z");
assert_eq!(s.get_unchecked(1..=2), "az");
}
pub unsafe fn get_unchecked_mut<I: SliceIndex>(&mut self, at: I) -> &mut I::Output
Retrieve a mutable reference to a byte or a subslice, depending on the type of the index given, while explicitly eliding bounds checks.
If given a position, this returns a mutable reference to the byte at that position.
If given a range, this returns the mutable slice of bytes corresponding to that range in this byte string.
§Safety
Callers must ensure that the supplied bounds are correct. If they are out of bounds, then this results in undefined behavior. For a safe alternative, use get_mut.
§Examples
Basic usage:
use bstr::BString;
let mut s = BString::from("baz");
{
let mut slice = unsafe { s.get_unchecked_mut(1..) };
slice[0] = b'o';
slice[1] = b'p';
}
assert_eq!(s, "bop");
pub fn last(&self) -> Option<u8>
Returns the last byte in this byte string, if it’s non-empty. If this byte string is empty, this returns None.
§Examples
Basic usage:
use bstr::B;
assert_eq!(Some(b'z'), B("baz").last());
assert_eq!(None, B("").last());
pub fn copy_within<R>(&mut self, src: R, dest: usize) where R: RangeBounds<usize>
Copies elements from one part of the slice to another part of itself, where the parts may be overlapping.
src is the range within this byte string to copy from, while dest is the starting index of the range within this byte string to copy to. The length indicated by src must be less than or equal to the number of bytes from dest to the end of the byte string.
§Panics
Panics if either range is out of bounds, or if src is too big to fit into dest, or if the end of src is before the start.
§Examples
Copying four bytes within a byte string:
use bstr::BStr;
let mut buf = *b"Hello, World!";
let s = BStr::new_mut(&mut buf);
s.copy_within(1..5, 8);
assert_eq!(s, "Hello, Wello!");
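The example above copies between two regions that do not overlap. The standard library's slices have a copy_within method of the same name, so the overlapping case can be sketched with std alone. shift_right is a hypothetical helper, and it assumes by is no larger than the buffer length.

```rust
/// Hypothetical helper (not part of bstr): moves the contents of `buf`
/// right by `by` bytes, leaving the first `by` bytes as they were.
fn shift_right(buf: &mut [u8], by: usize) {
    // Assumption of this sketch: `by <= buf.len()`.
    let len = buf.len();
    // Source (..len - by) and destination (by..) overlap whenever
    // `by < len - by`; the copy still behaves as if it went through a
    // temporary buffer.
    buf.copy_within(..len - by, by);
}

fn main() {
    let mut buf = *b"abcdef";
    shift_right(&mut buf, 2);
    // Bytes 0..4 ("abcd") now occupy positions 2..6.
    assert_eq!(&buf, b"ababcd");
}
```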
pub fn as_ptr(&self) -> *const u8
Returns a raw pointer to this byte string’s underlying bytes.
§Safety
The caller must ensure that the byte string outlives the pointer this function returns, or else it will end up pointing to garbage.
Modifying the container (like a BString) referenced by this byte string may cause its buffer to be reallocated, which would also make any pointers to it invalid.
§Examples
Basic usage:
use bstr::B;
let s = B("hello");
let p = s.as_ptr();
unsafe {
assert_eq!(*p.add(2), b'l');
}
pub fn as_mut_ptr(&mut self) -> *mut u8
Returns a raw mutable pointer to this byte string’s underlying bytes.
§Safety
The caller must ensure that the byte string outlives the pointer this function returns, or else it will end up pointing to garbage.
Modifying the container (like a BString) referenced by this byte string may cause its buffer to be reallocated, which would also make any pointers to it invalid.
§Examples
Basic usage:
use bstr::BStr;
let mut buf = &mut [b'h', b'e', b'l', b'l', b'o'];
let mut s = BStr::new_mut(buf);
let p = s.as_mut_ptr();
unsafe {
*p.add(2) = b'Z';
}
assert_eq!("heZlo", s);