Trait ByteSlice
trait ByteSlice: private::Sealed
A trait that extends &[u8] with string oriented methods.
This trait is sealed and cannot be implemented outside of bstr.
Provided Methods
fn as_bstr(self: &Self) -> &BStr
Return this byte slice as a &BStr.
Using a &BStr is useful because of its fmt::Debug representation and various other trait implementations (such as PartialEq and PartialOrd). In particular, the Debug implementation for BStr shows its bytes as a normal string. For invalid UTF-8, hex escape sequences are used.
Examples
Basic usage:
    use bstr::ByteSlice;
    println!("{:?}", b"foo\xFFbar".as_bstr());
fn as_bstr_mut(self: &mut Self) -> &mut BStr
Return this byte slice as a
&mut BStr.
Using a &mut BStr is useful because of its fmt::Debug representation and various other trait implementations (such as PartialEq and PartialOrd). In particular, the Debug implementation for BStr shows its bytes as a normal string. For invalid UTF-8, hex escape sequences are used.
Examples
Basic usage:
    use bstr::ByteSlice;
    let mut bytes = *b"foo\xFFbar";
    println!("{:?}", &mut bytes.as_bstr_mut());
fn from_os_str(os_str: &OsStr) -> Option<&[u8]>
Create an immutable byte string from an OS string slice.
When the underlying bytes of OS strings are accessible, then this always succeeds and is zero cost. Otherwise, this returns None if the given OS string is not valid UTF-8. (For example, when the underlying bytes are inaccessible on Windows, file paths are allowed to be a sequence of arbitrary 16-bit integers. Not all such sequences can be transcoded to valid UTF-8.)
Examples
Basic usage:
    use std::ffi::OsStr;
    use bstr::{B, ByteSlice};
    let os_str = OsStr::new("foo");
    let bs = <[u8]>::from_os_str(os_str).expect("should be valid UTF-8");
    assert_eq!(bs, B("foo"));
fn from_path(path: &Path) -> Option<&[u8]>
Create an immutable byte string from a file path.
When the underlying bytes of paths are accessible, then this always succeeds and is zero cost. Otherwise, this returns None if the given path is not valid UTF-8. (For example, when the underlying bytes are inaccessible on Windows, file paths are allowed to be a sequence of arbitrary 16-bit integers. Not all such sequences can be transcoded to valid UTF-8.)
Examples
Basic usage:
    use std::path::Path;
    use bstr::{B, ByteSlice};
    let path = Path::new("foo");
    let bs = <[u8]>::from_path(path).expect("should be valid UTF-8");
    assert_eq!(bs, B("foo"));
fn to_str(self: &Self) -> Result<&str, Utf8Error>
Safely convert this byte string into a
&str if it's valid UTF-8.
If this byte string is not valid UTF-8, then an error is returned. The error returned indicates the first invalid byte found and the length of the error.
In cases where a lossy conversion to &str is acceptable, then use one of the to_str_lossy or to_str_lossy_into methods.
Examples
Basic usage:
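As a rough illustration of the error information described above, here is a std-only sketch. It uses std::str::from_utf8 rather than bstr's to_str, on the assumption that both report the offset of the first invalid byte (valid_up_to) and the length of the invalid sequence (error_len) in the same way. The helper name first_utf8_error is hypothetical.

```rust
/// Hypothetical helper (not part of bstr): returns `None` for valid UTF-8,
/// otherwise the offset of the first invalid byte and, if known, the length
/// of the invalid sequence starting there.
fn first_utf8_error(bytes: &[u8]) -> Option<(usize, Option<usize>)> {
    match std::str::from_utf8(bytes) {
        Ok(_) => None,
        // `error_len()` is `None` when the input ends in the middle of an
        // otherwise plausible multi-byte sequence.
        Err(err) => Some((err.valid_up_to(), err.error_len())),
    }
}
```

For the eight bytes of "☃βツ" followed by a stray 0xFF, this reports offset 8 and an error length of 1.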
unsafe fn to_str_unchecked(self: &Self) -> &str
Unsafely convert this byte string into a &str, without checking for valid UTF-8.
Safety
Callers must ensure that this byte string is valid UTF-8 before calling this method. Converting a byte string into a &str that is not valid UTF-8 is considered undefined behavior.
This routine is useful in performance sensitive contexts where the UTF-8 validity of the byte string is already known and it is undesirable to pay the cost of an additional UTF-8 validation check that to_str performs.
Examples
Basic usage:
    use bstr::{B, ByteSlice};
    // SAFETY: This is safe because string literals are guaranteed to be
    // valid UTF-8 by the Rust compiler.
    let s = unsafe { B("☃βツ").to_str_unchecked() };
    assert_eq!("☃βツ", s);
fn to_str_lossy(self: &Self) -> Cow<'_, str>
Convert this byte string to a valid UTF-8 string by replacing invalid UTF-8 bytes with the Unicode replacement codepoint (U+FFFD).
If the byte string is already valid UTF-8, then no copying or allocation is performed and a borrowed string slice is returned. If the byte string is not valid UTF-8, then an owned string buffer is returned with invalid bytes replaced by the replacement codepoint.
This method uses the "substitution of maximal subparts" (Unicode Standard, Chapter 3, Section 9) strategy for inserting the replacement codepoint. Specifically, a replacement codepoint is inserted whenever a byte is found that cannot possibly lead to a valid code unit sequence. If there were previous bytes that represented a prefix of a well-formed code unit sequence, then all of those bytes are substituted with a single replacement codepoint. The "substitution of maximal subparts" strategy is the same strategy used by W3C's Encoding standard. For a more precise description of the maximal subpart strategy, see the Unicode Standard, Chapter 3, Section 9. See also Public Review Issue #121.
N.B. Rust's standard library also appears to use the same strategy, but it does not appear to be an API guarantee.
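To make the strategy concrete without depending on bstr, the following std-only sketch leans on String::from_utf8_lossy, which (as noted above) appears to follow the same strategy without guaranteeing it. The helper names decode_lossy and replacement_count are hypothetical, and the behavior shown is what current standard library versions do, not a documented contract.

```rust
use std::borrow::Cow;

/// Lossily decode `bytes` with the standard library (not bstr).
/// Valid input is borrowed; invalid input is copied into a new `String`.
fn decode_lossy(bytes: &[u8]) -> Cow<'_, str> {
    String::from_utf8_lossy(bytes)
}

/// Count how many replacement codepoints the lossy decoding contains.
fn replacement_count(bytes: &[u8]) -> usize {
    decode_lossy(bytes)
        .chars()
        .filter(|&c| c == '\u{FFFD}')
        .count()
}
```

With the input b"\x61\xF1\x80\x80\xE1\x80\xC2\x62", the six invalid bytes form three truncated prefixes (of length 3, 2 and 1), so the result contains three replacement codepoints between 'a' and 'b' rather than six.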
Examples
Basic usage:
    use std::borrow::Cow;
    use bstr::ByteSlice;
    let mut bstring = <Vec<u8>>::from("☃βツ");
    assert_eq!(Cow::Borrowed("☃βツ"), bstring.to_str_lossy());
    // Add a byte that makes the sequence invalid.
    bstring.push(b'\xFF');
    assert_eq!(Cow::Borrowed("☃βツ\u{FFFD}"), bstring.to_str_lossy());
This demonstrates the "maximal subpart" substitution logic.
    use bstr::{B, ByteSlice};
    // \x61 is the ASCII codepoint for 'a'.
    // \xF1\x80\x80 is a valid 3-byte code unit prefix.
    // \xE1\x80 is a valid 2-byte code unit prefix.
    // \xC2 is a valid 1-byte code unit prefix.
    // \x62 is the ASCII codepoint for 'b'.
    //
    // In sum, each of the prefixes is replaced by a single replacement
    // codepoint since none of the prefixes are properly completed. This
    // is in contrast to other strategies that might insert a replacement
    // codepoint for every single byte.
    let bs = B(b"\x61\xF1\x80\x80\xE1\x80\xC2\x62");
    assert_eq!("a\u{FFFD}\u{FFFD}\u{FFFD}b", bs.to_str_lossy());
fn to_str_lossy_into(self: &Self, dest: &mut String)
Copy the contents of this byte string into the given owned string buffer, while replacing invalid UTF-8 code unit sequences with the Unicode replacement codepoint (U+FFFD).
This method uses the same "substitution of maximal subparts" strategy for inserting the replacement codepoint as the to_str_lossy method.
This routine is useful for amortizing allocation. However, unlike to_str_lossy, this routine will always copy the contents of this byte string into the destination buffer, even if this byte string is valid UTF-8.
Examples
Basic usage:
    use bstr::ByteSlice;
    let mut bstring = <Vec<u8>>::from("☃βツ");
    // Add a byte that makes the sequence invalid.
    bstring.push(b'\xFF');
    let mut dest = String::new();
    bstring.to_str_lossy_into(&mut dest);
    assert_eq!("☃βツ\u{FFFD}", dest);
fn to_os_str(self: &Self) -> Result<&OsStr, Utf8Error>
Create an OS string slice from this byte string.
When OS strings can be constructed from arbitrary byte sequences, this always succeeds and is zero cost. Otherwise, this returns a UTF-8 decoding error if this byte string is not valid UTF-8. (For example, assuming the representation of OsStr is opaque on Windows, file paths are allowed to be a sequence of arbitrary 16-bit integers. There is no obvious mapping from an arbitrary sequence of 8-bit integers to an arbitrary sequence of 16-bit integers. If the representation of OsStr is ever opened up, then this will convert any sequence of bytes to an OsStr without cost.)
Examples
Basic usage:
    use bstr::ByteSlice;
    let os_str = b"foo".to_os_str().expect("should be valid UTF-8");
    assert_eq!(os_str, "foo");
fn to_os_str_lossy(self: &Self) -> Cow<'_, OsStr>
Lossily create an OS string slice from this byte string.
When OS strings can be constructed from arbitrary byte sequences, this is zero cost and always returns a slice. Otherwise, this will perform a UTF-8 check and lossily convert this byte string into valid UTF-8 using the Unicode replacement codepoint.
Note that this can prevent the correct roundtripping of file paths when the representation of OsStr is opaque.
Examples
Basic usage:
    use bstr::ByteSlice;
    let os_str = b"foo\xFFbar".to_os_str_lossy();
    assert_eq!(os_str.to_string_lossy(), "foo\u{FFFD}bar");
fn to_path(self: &Self) -> Result<&Path, Utf8Error>
Create a path slice from this byte string.
When paths can be constructed from arbitrary byte sequences, this always succeeds and is zero cost. Otherwise, this returns a UTF-8 decoding error if this byte string is not valid UTF-8. (For example, assuming the representation of Path is opaque on Windows, file paths are allowed to be a sequence of arbitrary 16-bit integers. There is no obvious mapping from an arbitrary sequence of 8-bit integers to an arbitrary sequence of 16-bit integers. If the representation of Path is ever opened up, then this will convert any sequence of bytes to a Path without cost.)
Examples
Basic usage:
    use bstr::ByteSlice;
    let path = b"foo".to_path().expect("should be valid UTF-8");
    assert_eq!(path.as_os_str(), "foo");
fn to_path_lossy(self: &Self) -> Cow<'_, Path>
Lossily create a path slice from this byte string.
When paths can be constructed from arbitrary byte sequences, this is zero cost and always returns a slice. Otherwise, this will perform a UTF-8 check and lossily convert this byte string into valid UTF-8 using the Unicode replacement codepoint.
Note that this can prevent the correct roundtripping of file paths when the representation of Path is opaque.
Examples
Basic usage:
    use bstr::ByteSlice;
    let bs = b"foo\xFFbar";
    let path = bs.to_path_lossy();
    assert_eq!(path.to_string_lossy(), "foo\u{FFFD}bar");
fn repeatn(self: &Self, n: usize) -> Vec<u8>
Create a new byte string by repeating this byte string n times.
Panics
This function panics if the capacity of the new byte string would overflow.
Examples
Basic usage:
    use bstr::{B, ByteSlice};
    assert_eq!(b"foo".repeatn(4), B("foofoofoofoo"));
    assert_eq!(b"foo".repeatn(0), B(""));
fn contains_str<B: AsRef<[u8]>>(self: &Self, needle: B) -> bool
Returns true if and only if this byte string contains the given needle.
Examples
Basic usage:
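    use bstr::ByteSlice;
    assert!(b"foo bar".contains_str("foo"));
    assert!(b"foo bar".contains_str("bar"));
    assert!(!b"foo".contains_str("foobar"));
fn starts_with_str<B: AsRef<[u8]>>(self: &Self, prefix: B) -> bool
Returns true if and only if this byte string has the given prefix.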
Examples
Basic usage:
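    use bstr::ByteSlice;
    assert!(b"foo bar".starts_with_str("foo"));
    assert!(!b"foo bar".starts_with_str("bar"));
    assert!(!b"foo".starts_with_str("foobar"));
fn ends_with_str<B: AsRef<[u8]>>(self: &Self, suffix: B) -> bool
Returns true if and only if this byte string has the given suffix.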
Examples
Basic usage:
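    use bstr::ByteSlice;
    assert!(b"foo bar".ends_with_str("bar"));
    assert!(!b"foo bar".ends_with_str("foo"));
    assert!(!b"bar".ends_with_str("foobar"));
fn find<B: AsRef<[u8]>>(self: &Self, needle: B) -> Option<usize>
Returns the index of the first occurrence of the given needle.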
The needle may be any type that can be cheaply converted into a &[u8]. This includes, but is not limited to, &str and &[u8].
Note that if you're searching for the same needle in many different small haystacks, it may be faster to initialize a Finder once, and reuse it for each search.
Complexity
This routine is guaranteed to have worst case linear time complexity with respect to both the needle and the haystack. That is, this runs in O(needle.len() + haystack.len()) time.
This routine is also guaranteed to have worst case constant space complexity.
Examples
Basic usage:
    use bstr::ByteSlice;
    let s = b"foo bar baz";
    assert_eq!(Some(0), s.find("foo"));
    assert_eq!(Some(4), s.find("bar"));
    assert_eq!(None, s.find("quux"));
fn rfind<B: AsRef<[u8]>>(self: &Self, needle: B) -> Option<usize>
Returns the index of the last occurrence of the given needle.
The needle may be any type that can be cheaply converted into a &[u8]. This includes, but is not limited to, &str and &[u8].
Note that if you're searching for the same needle in many different small haystacks, it may be faster to initialize a FinderReverse once, and reuse it for each search.
Complexity
This routine is guaranteed to have worst case linear time complexity with respect to both the needle and the haystack. That is, this runs in O(needle.len() + haystack.len()) time.
This routine is also guaranteed to have worst case constant space complexity.
Examples
Basic usage:
    use bstr::ByteSlice;
    let s = b"foo bar baz";
    assert_eq!(Some(0), s.rfind("foo"));
    assert_eq!(Some(4), s.rfind("bar"));
    assert_eq!(Some(8), s.rfind("ba"));
    assert_eq!(None, s.rfind("quux"));
fn find_iter<'h, 'n, B: ?Sized + AsRef<[u8]>>(self: &'h Self, needle: &'n B) -> Find<'h, 'n>
Returns an iterator of the non-overlapping occurrences of the given needle. The iterator yields byte offset positions indicating the start of each match.
Complexity
This routine is guaranteed to have worst case linear time complexity with respect to both the needle and the haystack. That is, this runs in O(needle.len() + haystack.len()) time.
This routine is also guaranteed to have worst case constant space complexity.
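To pin down what "non-overlapping" means here, the following std-only sketch collects match offsets with a deliberately naive scan. It is an illustration of the semantics only: the function name naive_find_offsets is hypothetical, the scan is quadratic in the worst case, and it is not the algorithm bstr uses to meet the complexity guarantees above.

```rust
/// Naive, hypothetical illustration of non-overlapping match offsets.
/// After a match at `pos`, scanning resumes at `pos + needle.len()`, so
/// matches never overlap. An empty needle matches at every position,
/// including the one just past the last byte.
fn naive_find_offsets(haystack: &[u8], needle: &[u8]) -> Vec<usize> {
    let mut offsets = Vec::new();
    let mut pos = 0;
    while pos + needle.len() <= haystack.len() {
        if &haystack[pos..pos + needle.len()] == needle {
            offsets.push(pos);
            // Skip past the match; an empty needle still advances by one.
            pos += needle.len().max(1);
        } else {
            pos += 1;
        }
    }
    offsets
}
```

For instance, searching b"aaaa" for b"aa" reports offsets 0 and 2, not 0, 1 and 2.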
Examples
Basic usage:
    use bstr::ByteSlice;
    let s = b"foo bar foo foo quux foo";
    let matches: Vec<usize> = s.find_iter("foo").collect();
    assert_eq!(matches, vec![0, 8, 12, 21]);
An empty string matches at every position, including the position immediately following the last byte:
    use bstr::ByteSlice;
    let matches: Vec<usize> = b"foo".find_iter("").collect();
    assert_eq!(matches, vec![0, 1, 2, 3]);
    let matches: Vec<usize> = b"".find_iter("").collect();
    assert_eq!(matches, vec![0]);
fn rfind_iter<'h, 'n, B: ?Sized + AsRef<[u8]>>(self: &'h Self, needle: &'n B) -> FindReverse<'h, 'n>
Returns an iterator of the non-overlapping occurrences of the given needle in reverse. The iterator yields byte offset positions indicating the start of each match.
Complexity
This routine is guaranteed to have worst case linear time complexity with respect to both the needle and the haystack. That is, this runs in O(needle.len() + haystack.len()) time.
This routine is also guaranteed to have worst case constant space complexity.
Examples
Basic usage:
    use bstr::ByteSlice;
    let s = b"foo bar foo foo quux foo";
    let matches: Vec<usize> = s.rfind_iter("foo").collect();
    assert_eq!(matches, vec![21, 12, 8, 0]);
An empty string matches at every position, including the position immediately following the last byte:
    use bstr::ByteSlice;
    let matches: Vec<usize> = b"foo".rfind_iter("").collect();
    assert_eq!(matches, vec![3, 2, 1, 0]);
    let matches: Vec<usize> = b"".rfind_iter("").collect();
    assert_eq!(matches, vec![0]);
fn find_byte(self: &Self, byte: u8) -> Option<usize>
Returns the index of the first occurrence of the given byte. If the byte does not occur in this byte string, then None is returned.
Examples
Basic usage:
    use bstr::ByteSlice;
    assert_eq!(Some(10), b"foo bar baz".find_byte(b'z'));
    assert_eq!(None, b"foo bar baz".find_byte(b'y'));
fn rfind_byte(self: &Self, byte: u8) -> Option<usize>
Returns the index of the last occurrence of the given byte. If the byte does not occur in this byte string, then None is returned.
Examples
Basic usage:
    use bstr::ByteSlice;
    assert_eq!(Some(10), b"foo bar baz".rfind_byte(b'z'));
    assert_eq!(None, b"foo bar baz".rfind_byte(b'y'));
fn find_char(self: &Self, ch: char) -> Option<usize>
Returns the index of the first occurrence of the given codepoint. If the codepoint does not occur in this byte string, then None is returned.
Note that if one searches for the replacement codepoint, \u{FFFD}, then only explicit occurrences of that encoding will be found. Invalid UTF-8 sequences will not be matched.
Examples
Basic usage:
    use bstr::{B, ByteSlice};
    assert_eq!(Some(10), b"foo bar baz".find_char('z'));
    assert_eq!(Some(4), B("αβγγδ").find_char('γ'));
    assert_eq!(None, b"foo bar baz".find_char('y'));
fn rfind_char(self: &Self, ch: char) -> Option<usize>
Returns the index of the last occurrence of the given codepoint. If the codepoint does not occur in this byte string, then None is returned.
Note that if one searches for the replacement codepoint, \u{FFFD}, then only explicit occurrences of that encoding will be found. Invalid UTF-8 sequences will not be matched.
Examples
Basic usage:
    use bstr::{B, ByteSlice};
    assert_eq!(Some(10), b"foo bar baz".rfind_char('z'));
    assert_eq!(Some(6), B("αβγγδ").rfind_char('γ'));
    assert_eq!(None, b"foo bar baz".rfind_char('y'));
fn find_byteset<B: AsRef<[u8]>>(self: &Self, byteset: B) -> Option<usize>
Returns the index of the first occurrence of any of the bytes in the provided set.
The byteset may be any type that can be cheaply converted into a &[u8]. This includes, but is not limited to, &str and &[u8], but note that passing a &str which contains multibyte characters may not behave as you expect: each byte in the &str is treated as an individual member of the byte set.
Note that order is irrelevant for the byteset parameter, and duplicate bytes present in its body are ignored.
Complexity
This routine is guaranteed to have worst case linear time complexity with respect to both the set of bytes and the haystack. That is, this runs in O(byteset.len() + haystack.len()) time.
This routine is also guaranteed to have worst case constant space complexity.
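One way to see why order and duplicates in the byteset do not matter, and why a multibyte &str can surprise, is a 256-entry membership table. The sketch below is std-only and purely illustrative: the names membership_table, first_in_set and first_not_in_set are hypothetical, and nothing here is claimed to be how bstr implements the search.

```rust
/// Build a 256-entry lookup table: one pass over the byteset, constant space.
/// Order and duplicates cannot matter because each byte just sets a flag.
fn membership_table(byteset: &[u8]) -> [bool; 256] {
    let mut table = [false; 256];
    for &b in byteset {
        table[b as usize] = true;
    }
    table
}

/// Index of the first haystack byte that is a member of the set.
fn first_in_set(haystack: &[u8], byteset: &[u8]) -> Option<usize> {
    let table = membership_table(byteset);
    haystack.iter().position(|&b| table[b as usize])
}

/// Index of the first haystack byte that is NOT a member of the set.
fn first_not_in_set(haystack: &[u8], byteset: &[u8]) -> Option<usize> {
    let table = membership_table(byteset);
    haystack.iter().position(|&b| !table[b as usize])
}
```

Because members are individual bytes, a byteset built from "é" (bytes 0xC3 0xA9) matches the first byte of "ñ" (bytes 0xC3 0xB1), even though the two characters differ.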
Examples
Basic usage:
    use bstr::ByteSlice;
    assert_eq!(b"foo bar baz".find_byteset(b"zr"), Some(6));
    assert_eq!(b"foo baz bar".find_byteset(b"bzr"), Some(4));
    assert_eq!(None, b"foo baz bar".find_byteset(b"\t\n"));
    // The empty byteset never matches.
    assert_eq!(None, b"abc".find_byteset(b""));
    assert_eq!(None, b"".find_byteset(b""));
fn find_not_byteset<B: AsRef<[u8]>>(self: &Self, byteset: B) -> Option<usize>
Returns the index of the first occurrence of a byte that is not a member of the provided set.
The byteset may be any type that can be cheaply converted into a &[u8]. This includes, but is not limited to, &str and &[u8], but note that passing a &str which contains multibyte characters may not behave as you expect: each byte in the &str is treated as an individual member of the byte set.
Note that order is irrelevant for the byteset parameter, and duplicate bytes present in its body are ignored.
Complexity
This routine is guaranteed to have worst case linear time complexity with respect to both the set of bytes and the haystack. That is, this runs in O(byteset.len() + haystack.len()) time.
This routine is also guaranteed to have worst case constant space complexity.
Examples
Basic usage:
    use bstr::ByteSlice;
    assert_eq!(b"foo bar baz".find_not_byteset(b"fo "), Some(4));
    assert_eq!(b"\t\tbaz bar".find_not_byteset(b" \t\r\n"), Some(2));
    assert_eq!(b"foo\nbaz\tbar".find_not_byteset(b"\t\n"), Some(0));
    // The negation of the empty byteset matches everything.
    assert_eq!(Some(0), b"abc".find_not_byteset(b""));
    // But an empty string never contains anything.
    assert_eq!(None, b"".find_not_byteset(b""));
fn rfind_byteset<B: AsRef<[u8]>>(self: &Self, byteset: B) -> Option<usize>
Returns the index of the last occurrence of any of the bytes in the provided set.
The byteset may be any type that can be cheaply converted into a &[u8]. This includes, but is not limited to, &str and &[u8], but note that passing a &str which contains multibyte characters may not behave as you expect: each byte in the &str is treated as an individual member of the byte set.
Note that order is irrelevant for the byteset parameter, and duplicate bytes present in its body are ignored.
Complexity
This routine is guaranteed to have worst case linear time complexity with respect to both the set of bytes and the haystack. That is, this runs in O(byteset.len() + haystack.len()) time.
This routine is also guaranteed to have worst case constant space complexity.
Examples
Basic usage:
    use bstr::ByteSlice;
    assert_eq!(b"foo bar baz".rfind_byteset(b"agb"), Some(9));
    assert_eq!(b"foo baz bar".rfind_byteset(b"rabz "), Some(10));
    assert_eq!(b"foo baz bar".rfind_byteset(b"\n123"), None);
fn rfind_not_byteset<B: AsRef<[u8]>>(self: &Self, byteset: B) -> Option<usize>
Returns the index of the last occurrence of a byte that is not a member of the provided set.
The byteset may be any type that can be cheaply converted into a &[u8]. This includes, but is not limited to, &str and &[u8], but note that passing a &str which contains multibyte characters may not behave as you expect: each byte in the &str is treated as an individual member of the byte set.
Note that order is irrelevant for the byteset parameter, and duplicate bytes present in its body are ignored.
Complexity
This routine is guaranteed to have worst case linear time complexity with respect to both the set of bytes and the haystack. That is, this runs in O(byteset.len() + haystack.len()) time.
This routine is also guaranteed to have worst case constant space complexity.
Examples
Basic usage:
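    use bstr::ByteSlice;
    assert_eq!(b"foo bar baz,\t".rfind_not_byteset(b",\t"), Some(10));
    assert_eq!(b"foo baz bar".rfind_not_byteset(b"rabz "), Some(2));
    assert_eq!(None, b"foo baz bar".rfind_not_byteset(b"barfoz "));
fn fields_with<F: FnMut(char) -> bool>(self: &Self, f: F) -> FieldsWith<'_, F>
Returns an iterator over the fields in a byte string, separated by contiguous codepoints satisfying the given predicate.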
If this byte string is not valid UTF-8, then the given closure will be called with a Unicode replacement codepoint when invalid UTF-8 bytes are seen.
Example
Basic usage:
    use bstr::{B, ByteSlice};
    let s = b"123foo999999bar1quux123456";
    let fields: Vec<&[u8]> = s.fields_with(|c| c.is_numeric()).collect();
    assert_eq!(fields, vec![B("foo"), B("bar"), B("quux")]);
A byte string consisting of all codepoints satisfying the predicate yields no elements:
    use bstr::ByteSlice;
    assert_eq!(0, b"1911354563".fields_with(|c| c.is_numeric()).count());
fn split_str<'h, 's, B: ?Sized + AsRef<[u8]>>(self: &'h Self, splitter: &'s B) -> Split<'h, 's>
Returns an iterator over substrings of this byte string, separated by the given byte string. Each element yielded is guaranteed not to include the splitter substring.
The splitter may be any type that can be cheaply converted into a &[u8]. This includes, but is not limited to, &str and &[u8].
Examples
Basic usage:
    use bstr::{B, ByteSlice};
    let x: Vec<&[u8]> = b"Mary had a little lamb".split_str(" ").collect();
    assert_eq!(x, vec![B("Mary"), B("had"), B("a"), B("little"), B("lamb")]);
    let x: Vec<&[u8]> = b"".split_str("X").collect();
    assert_eq!(x, vec![b""]);
    let x: Vec<&[u8]> = b"lionXXtigerXleopard".split_str("X").collect();
    assert_eq!(x, vec![B("lion"), B(""), B("tiger"), B("leopard")]);
    let x: Vec<&[u8]> = b"lion::tiger::leopard".split_str("::").collect();
    assert_eq!(x, vec![B("lion"), B("tiger"), B("leopard")]);
If a string contains multiple contiguous separators, you will end up with empty strings yielded by the iterator:
    use bstr::{B, ByteSlice};
    let x: Vec<&[u8]> = b"||||a||b|c".split_str("|").collect();
    assert_eq!(x, vec![B(""), B(""), B(""), B(""), B("a"), B(""), B("b"), B("c")]);
    let x: Vec<&[u8]> = b"(///)".split_str("/").collect();
    assert_eq!(x, vec![B("("), B(""), B(""), B(")")]);
Separators at the start or end of a string are neighbored by empty strings.
    use bstr::{B, ByteSlice};
    let x: Vec<&[u8]> = b"010".split_str("0").collect();
    assert_eq!(x, vec![B(""), B("1"), B("")]);
When the empty string is used as a separator, it splits every byte in the byte string, along with the beginning and end of the byte string.
    use bstr::{B, ByteSlice};
    let x: Vec<&[u8]> = b"rust".split_str("").collect();
    assert_eq!(x, vec![B(""), B("r"), B("u"), B("s"), B("t"), B("")]);
    // Splitting by an empty string is not UTF-8 aware. Elements yielded
    // may not be valid UTF-8!
    let x: Vec<&[u8]> = B("☃").split_str("").collect();
    assert_eq!(x, vec![B(""), B(b"\xE2"), B(b"\x98"), B(b"\x83"), B("")]);
Contiguous separators, especially whitespace, can lead to possibly surprising behavior. For example, this code is correct:
    use bstr::{B, ByteSlice};
    let x: Vec<&[u8]> = b"    a  b c".split_str(" ").collect();
    assert_eq!(x, vec![B(""), B(""), B(""), B(""), B("a"), B(""), B("b"), B("c")]);
It does not give you ["a", "b", "c"]. For that behavior, use fields instead.
fn rsplit_str<'h, 's, B: ?Sized + AsRef<[u8]>>(self: &'h Self, splitter: &'s B) -> SplitReverse<'h, 's>
Returns an iterator over substrings of this byte string, separated by the given byte string, in reverse. Each element yielded is guaranteed not to include the splitter substring.
The splitter may be any type that can be cheaply converted into a &[u8]. This includes, but is not limited to, &str and &[u8].
Examples
Basic usage:
    use bstr::{B, ByteSlice};
    let x: Vec<&[u8]> = b"Mary had a little lamb".rsplit_str(" ").collect();
    assert_eq!(x, vec![B("lamb"), B("little"), B("a"), B("had"), B("Mary")]);
    let x: Vec<&[u8]> = b"".rsplit_str("X").collect();
    assert_eq!(x, vec![b""]);
    let x: Vec<&[u8]> = b"lionXXtigerXleopard".rsplit_str("X").collect();
    assert_eq!(x, vec![B("leopard"), B("tiger"), B(""), B("lion")]);
    let x: Vec<&[u8]> = b"lion::tiger::leopard".rsplit_str("::").collect();
    assert_eq!(x, vec![B("leopard"), B("tiger"), B("lion")]);
If a string contains multiple contiguous separators, you will end up with empty strings yielded by the iterator:
    use bstr::{B, ByteSlice};
    let x: Vec<&[u8]> = b"||||a||b|c".rsplit_str("|").collect();
    assert_eq!(x, vec![B("c"), B("b"), B(""), B("a"), B(""), B(""), B(""), B("")]);
    let x: Vec<&[u8]> = b"(///)".rsplit_str("/").collect();
    assert_eq!(x, vec![B(")"), B(""), B(""), B("(")]);
Separators at the start or end of a string are neighbored by empty strings.
    use bstr::{B, ByteSlice};
    let x: Vec<&[u8]> = b"010".rsplit_str("0").collect();
    assert_eq!(x, vec![B(""), B("1"), B("")]);
When the empty string is used as a separator, it splits every byte in the byte string, along with the beginning and end of the byte string.
    use bstr::{B, ByteSlice};
    let x: Vec<&[u8]> = b"rust".rsplit_str("").collect();
    assert_eq!(x, vec![B(""), B("t"), B("s"), B("u"), B("r"), B("")]);
    // Splitting by an empty string is not UTF-8 aware. Elements yielded
    // may not be valid UTF-8!
    let x: Vec<&[u8]> = B("☃").rsplit_str("").collect();
    assert_eq!(x, vec![B(""), B(b"\x83"), B(b"\x98"), B(b"\xE2"), B("")]);
Contiguous separators, especially whitespace, can lead to possibly surprising behavior. For example, this code is correct:
    use bstr::{B, ByteSlice};
    let x: Vec<&[u8]> = b"    a  b c".rsplit_str(" ").collect();
    assert_eq!(x, vec![B("c"), B("b"), B(""), B("a"), B(""), B(""), B(""), B("")]);
It does not give you ["a", "b", "c"].
fn split_once_str<'a, B: ?Sized + AsRef<[u8]>>(self: &'a Self, splitter: &B) -> Option<(&'a [u8], &'a [u8])>
Split this byte string at the first occurrence of splitter.
If the splitter is found in the byte string, returns a tuple containing the parts of the string before and after the first occurrence of splitter, respectively. Otherwise, if there are no occurrences of splitter in the byte string, returns None.
The splitter may be any type that can be cheaply converted into a &[u8]. This includes, but is not limited to, &str and &[u8].
If you need to split on the last instance of a delimiter instead, see the ByteSlice::rsplit_once_str method.
Examples
Basic usage:
    use bstr::{B, ByteSlice};
    assert_eq!(B("foo,bar").split_once_str(","), Some((B("foo"), B("bar"))));
    assert_eq!(B("foo,bar,baz").split_once_str(","), Some((B("foo"), B("bar,baz"))));
    assert_eq!(B("foo").split_once_str(","), None);
    assert_eq!(B("foo,").split_once_str(b","), Some((B("foo"), B(""))));
    assert_eq!(B(",foo").split_once_str(b","), Some((B(""), B("foo"))));
fn rsplit_once_str<'a, B: ?Sized + AsRef<[u8]>>(self: &'a Self, splitter: &B) -> Option<(&'a [u8], &'a [u8])>
Split this byte string at the last occurrence of splitter.
If the splitter is found in the byte string, returns a tuple containing the parts of the string before and after the last occurrence of splitter, respectively. Otherwise, if there are no occurrences of splitter in the byte string, returns None.
The splitter may be any type that can be cheaply converted into a &[u8]. This includes, but is not limited to, &str and &[u8].
If you need to split on the first instance of a delimiter instead, see the ByteSlice::split_once_str method.
Examples
Basic usage:
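    use bstr::{B, ByteSlice};
    assert_eq!(B("foo,bar").rsplit_once_str(","), Some((B("foo"), B("bar"))));
    assert_eq!(B("foo,bar,baz").rsplit_once_str(","), Some((B("foo,bar"), B("baz"))));
    assert_eq!(B("foo").rsplit_once_str(","), None);
    assert_eq!(B("foo,").rsplit_once_str(b","), Some((B("foo"), B(""))));
    assert_eq!(B(",foo").rsplit_once_str(b","), Some((B(""), B("foo"))));
fn splitn_str<'h, 's, B: ?Sized + AsRef<[u8]>>(self: &'h Self, limit: usize, splitter: &'s B) -> SplitN<'h, 's>
Returns an iterator of at most limit substrings of this byte string, separated by the given byte string. If limit substrings are yielded, then the last substring will contain the remainder of this byte string.
The splitter may be any type that can be cheaply converted into a &[u8]. This includes, but is not limited to, &str and &[u8].
Examples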
Basic usage:
    use bstr::{B, ByteSlice};
    let x: Vec<&[u8]> = b"Mary had a little lamb".splitn_str(3, " ").collect();
    assert_eq!(x, vec![B("Mary"), B("had"), B("a little lamb")]);
    let x: Vec<&[u8]> = b"".splitn_str(3, "X").collect();
    assert_eq!(x, vec![b""]);
    let x: Vec<&[u8]> = b"lionXXtigerXleopard".splitn_str(3, "X").collect();
    assert_eq!(x, vec![B("lion"), B(""), B("tigerXleopard")]);
    let x: Vec<&[u8]> = b"lion::tiger::leopard".splitn_str(2, "::").collect();
    assert_eq!(x, vec![B("lion"), B("tiger::leopard")]);
    let x: Vec<&[u8]> = b"abcXdef".splitn_str(1, "X").collect();
    assert_eq!(x, vec![B("abcXdef")]);
    let x: Vec<&[u8]> = b"abcdef".splitn_str(2, "X").collect();
    assert_eq!(x, vec![B("abcdef")]);
    let x: Vec<&[u8]> = b"abcXdef".splitn_str(0, "X").collect();
    assert!(x.is_empty());
fn rsplitn_str<'h, 's, B: ?Sized + AsRef<[u8]>>(self: &'h Self, limit: usize, splitter: &'s B) -> SplitNReverse<'h, 's>
Returns an iterator of at most limit substrings of this byte string, separated by the given byte string, in reverse. If limit substrings are yielded, then the last substring will contain the remainder of this byte string.
The splitter may be any type that can be cheaply converted into a &[u8]. This includes, but is not limited to, &str and &[u8].
Examples
Basic usage:
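    use bstr::{B, ByteSlice};
    let x: Vec<&[u8]> = b"Mary had a little lamb".rsplitn_str(3, " ").collect();
    assert_eq!(x, vec![B("lamb"), B("little"), B("Mary had a")]);
    let x: Vec<&[u8]> = b"".rsplitn_str(3, "X").collect();
    assert_eq!(x, vec![b""]);
    let x: Vec<&[u8]> = b"lionXXtigerXleopard".rsplitn_str(3, "X").collect();
    assert_eq!(x, vec![B("leopard"), B("tiger"), B("lionX")]);
    let x: Vec<&[u8]> = b"lion::tiger::leopard".rsplitn_str(2, "::").collect();
    assert_eq!(x, vec![B("leopard"), B("lion::tiger")]);
    let x: Vec<&[u8]> = b"abcXdef".rsplitn_str(1, "X").collect();
    assert_eq!(x, vec![B("abcXdef")]);
    let x: Vec<&[u8]> = b"abcdef".rsplitn_str(2, "X").collect();
    assert_eq!(x, vec![B("abcdef")]);
    let x: Vec<&[u8]> = b"abcXdef".rsplitn_str(0, "X").collect();
    assert!(x.is_empty());
fn replace<N: AsRef<[u8]>, R: AsRef<[u8]>>(self: &Self, needle: N, replacement: R) -> Vec<u8>
Replace all matches of the given needle with the given replacement, and return the result as a new Vec<u8>.
This routine is useful as a convenience. If you need to reuse an allocation, use replace_into instead.
Examples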
Basic usage:
    use bstr::ByteSlice;
    let s = b"this is old".replace("old", "new");
    assert_eq!(s, "this is new".as_bytes());
When the pattern doesn't match:
    use bstr::ByteSlice;
    let s = b"this is old".replace("nada nada", "limonada");
    assert_eq!(s, "this is old".as_bytes());
When the needle is an empty string:
    use bstr::ByteSlice;
    let s = b"foo".replace("", "Z");
    assert_eq!(s, "ZfZoZoZ".as_bytes());
fn replacen<N: AsRef<[u8]>, R: AsRef<[u8]>>(self: &Self, needle: N, replacement: R, limit: usize) -> Vec<u8>
Replace up to limit matches of the given needle with the given replacement, and return the result as a new Vec<u8>.
This routine is useful as a convenience. If you need to reuse an allocation, use replacen_into instead.
Examples
Basic usage:
    use bstr::ByteSlice;
    let s = b"foofoo".replacen("o", "z", 2);
    assert_eq!(s, "fzzfoo".as_bytes());
When the pattern doesn't match:
    use bstr::ByteSlice;
    let s = b"foofoo".replacen("a", "z", 2);
    assert_eq!(s, "foofoo".as_bytes());
When the needle is an empty string:
    use bstr::ByteSlice;
    let s = b"foo".replacen("", "Z", 2);
    assert_eq!(s, "ZfZoo".as_bytes());
fn replace_into<N: AsRef<[u8]>, R: AsRef<[u8]>>(self: &Self, needle: N, replacement: R, dest: &mut Vec<u8>)
Replace all matches of the given needle with the given replacement, and write the result into the provided Vec<u8>.
This does not clear dest before writing to it.
This routine is useful for reusing allocation. For a more convenient API, use replace instead.
Examples
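The append-only contract can be pictured with a naive std-only sketch. The function naive_replace_into below is hypothetical and quadratic in the worst case; it only mirrors the documented behavior (appending to dest, and treating an empty needle as matching at every position) and is not bstr's implementation.

```rust
/// Hypothetical, naive illustration of the `replace_into` contract:
/// output is appended to `dest`, which is never cleared here.
fn naive_replace_into(haystack: &[u8], needle: &[u8], replacement: &[u8], dest: &mut Vec<u8>) {
    let mut last = 0; // end of the previous match
    let mut pos = 0; // current scan position
    while pos + needle.len() <= haystack.len() {
        if &haystack[pos..pos + needle.len()] == needle {
            // Copy the unmatched gap, then the replacement.
            dest.extend_from_slice(&haystack[last..pos]);
            dest.extend_from_slice(replacement);
            last = pos + needle.len();
            // An empty needle still advances by one byte.
            pos += needle.len().max(1);
        } else {
            pos += 1;
        }
    }
    // Copy whatever follows the final match.
    dest.extend_from_slice(&haystack[last..]);
}
```

A caller that wants to reuse one buffer across many replacements would call dest.clear() between calls, which keeps the existing capacity; skipping the clear concatenates results instead.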
Basic usage:
    use bstr::ByteSlice;
    let s = b"this is old";
    let mut dest = vec![];
    s.replace_into("old", "new", &mut dest);
    assert_eq!(dest, "this is new".as_bytes());
When the pattern doesn't match:
    use bstr::ByteSlice;
    let s = b"this is old";
    let mut dest = vec![];
    s.replace_into("nada nada", "limonada", &mut dest);
    assert_eq!(dest, "this is old".as_bytes());
When the needle is an empty string:
    use bstr::ByteSlice;
    let s = b"foo";
    let mut dest = vec![];
    s.replace_into("", "Z", &mut dest);
    assert_eq!(dest, "ZfZoZoZ".as_bytes());
fn replacen_into<N: AsRef<[u8]>, R: AsRef<[u8]>>(self: &Self, needle: N, replacement: R, limit: usize, dest: &mut Vec<u8>)
Replace up to limit matches of the given needle with the given replacement, and write the result into the provided Vec<u8>.
This does not clear dest before writing to it.
This routine is useful for reusing allocation. For a more convenient API, use replacen instead.
Examples
Basic usage:
    use bstr::ByteSlice;
    let s = b"foofoo";
    let mut dest = vec![];
    s.replacen_into("o", "z", 2, &mut dest);
    assert_eq!(dest, "fzzfoo".as_bytes());
When the pattern doesn't match:
    use bstr::ByteSlice;
    let s = b"foofoo";
    let mut dest = vec![];
    s.replacen_into("a", "z", 2, &mut dest);
    assert_eq!(dest, "foofoo".as_bytes());
When the needle is an empty string:
    use bstr::ByteSlice;
    let s = b"foo";
    let mut dest = vec![];
    s.replacen_into("", "Z", 2, &mut dest);
    assert_eq!(dest, "ZfZoo".as_bytes());
fn bytes(self: &Self) -> Bytes<'_>
Returns an iterator over the bytes in this byte string.
Examples
Basic usage:
    use bstr::ByteSlice;
    let bs = b"foobar";
    let bytes: Vec<u8> = bs.bytes().collect();
    assert_eq!(bytes, bs);
fn chars(self: &Self) -> Chars<'_>
Returns an iterator over the Unicode scalar values in this byte string. If invalid UTF-8 is encountered, then the Unicode replacement codepoint is yielded instead.
Examples
Basic usage:
    use bstr::ByteSlice;
    let bs = b"\xE2\x98\x83\xFF\xF0\x9D\x9E\x83\xE2\x98\x61";
    let chars: Vec<char> = bs.chars().collect();
    assert_eq!(vec!['☃', '\u{FFFD}', '𝞃', '\u{FFFD}', 'a'], chars);
Codepoints can also be iterated over in reverse:
    use bstr::ByteSlice;
    let bs = b"\xE2\x98\x83\xFF\xF0\x9D\x9E\x83\xE2\x98\x61";
    let chars: Vec<char> = bs.chars().rev().collect();
    assert_eq!(vec!['a', '\u{FFFD}', '𝞃', '\u{FFFD}', '☃'], chars);
fn char_indices(self: &Self) -> CharIndices<'_>
Returns an iterator over the Unicode scalar values in this byte string along with their starting and ending byte index positions. If invalid UTF-8 is encountered, then the Unicode replacement codepoint is yielded instead.
Note that this is slightly different from the CharIndices iterator provided by the standard library. Aside from working on possibly invalid UTF-8, this iterator provides both the corresponding starting and ending byte indices of each codepoint yielded. The ending position is necessary to slice the original byte string when invalid UTF-8 bytes are converted into a Unicode replacement codepoint, since a single replacement codepoint can substitute anywhere from 1 to 3 invalid bytes (inclusive).
Examples
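To show why the ending index is needed, here is a std-only sketch that produces (start, end, char) triples by repeatedly calling std::str::from_utf8 and turning each reported invalid sequence into one replacement codepoint. The function lossy_char_spans is hypothetical, it re-validates the remaining input after every error (so it is not efficient), and it assumes the standard library's error_len matches the 1 to 3 byte sequences described above.

```rust
/// Hypothetical sketch: decode `bytes` into `(start, end, char)` triples,
/// yielding U+FFFD for each invalid sequence reported by the standard library.
fn lossy_char_spans(bytes: &[u8]) -> Vec<(usize, usize, char)> {
    let mut spans = Vec::new();
    let mut offset = 0;
    while offset < bytes.len() {
        let rest = &bytes[offset..];
        // Split `rest` into a valid prefix and the length of the invalid
        // sequence that follows it (0 if everything was valid).
        let (valid, bad_len) = match std::str::from_utf8(rest) {
            Ok(s) => (s, 0),
            Err(err) => {
                let good = err.valid_up_to();
                // This prefix was just validated, so the unwrap cannot fail.
                let valid = std::str::from_utf8(&rest[..good]).unwrap();
                // `None` means the input ended mid-sequence: treat the tail
                // as a single invalid sequence.
                let bad_len = err.error_len().unwrap_or(rest.len() - good);
                (valid, bad_len)
            }
        };
        for (i, ch) in valid.char_indices() {
            spans.push((offset + i, offset + i + ch.len_utf8(), ch));
        }
        offset += valid.len();
        if bad_len > 0 {
            // The replacement codepoint encodes to 3 bytes, but it may stand
            // in for 1 to 3 source bytes, so the end index must be tracked.
            spans.push((offset, offset + bad_len, '\u{FFFD}'));
            offset += bad_len;
        }
    }
    spans
}
```

Slicing the original bytes with a replacement's (start, end) recovers exactly the invalid bytes it stands for, which ch.len_utf8() alone could not do.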
Basic usage:
    use bstr::ByteSlice;
    let bs = b"\xE2\x98\x83\xFF\xF0\x9D\x9E\x83\xE2\x98\x61";
    let chars: Vec<(usize, usize, char)> = bs.char_indices().collect();
    assert_eq!(chars, vec![
        (0, 3, '☃'),
        (3, 4, '\u{FFFD}'),
        (4, 8, '𝞃'),
        (8, 10, '\u{FFFD}'),
        (10, 11, 'a'),
    ]);
Codepoints can also be iterated over in reverse:
    use bstr::ByteSlice;
    let bs = b"\xE2\x98\x83\xFF\xF0\x9D\x9E\x83\xE2\x98\x61";
    let chars: Vec<(usize, usize, char)> = bs
        .char_indices()
        .rev()
        .collect();
    assert_eq!(chars, vec![
        (10, 11, 'a'),
        (8, 10, '\u{FFFD}'),
        (4, 8, '𝞃'),
        (3, 4, '\u{FFFD}'),
        (0, 3, '☃'),
    ]);
fn utf8_chunks(self: &Self) -> Utf8Chunks<'_>
Iterate over chunks of valid UTF-8.
The iterator returned yields chunks of valid UTF-8 separated by invalid UTF-8 bytes, if they exist. Invalid UTF-8 bytes are always 1-3 bytes, which are determined via the "substitution of maximal subparts" strategy described in the docs for the ByteSlice::to_str_lossy method.
Examples
This example shows how to gather all valid and invalid chunks from a byte slice:
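    use bstr::{ByteSlice, Utf8Chunk};
    let bytes = b"foo\xFD\xFEbar\xFF";
    let (mut valid_chunks, mut invalid_chunks) = (vec![], vec![]);
    for chunk in bytes.utf8_chunks() {
        if !chunk.valid().is_empty() {
            valid_chunks.push(chunk.valid());
        }
        if !chunk.invalid().is_empty() {
            invalid_chunks.push(chunk.invalid());
        }
    }
    assert_eq!(valid_chunks, vec!["foo", "bar"]);
    assert_eq!(invalid_chunks, vec![b"\xFD", b"\xFE", b"\xFF"]);
fn lines(self: &Self) -> Lines<'_>
An iterator over all lines in a byte string, without their terminators.
For this iterator, the only line terminators recognized are \r\n and \n.
Examples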
Basic usage:
    use bstr::{B, ByteSlice};
    let s = b"\
    foo

    bar\r
    baz


    quux";
    let lines: Vec<&[u8]> = s.lines().collect();
    assert_eq!(lines, vec![
        B("foo"), B(""), B("bar"), B("baz"), B(""), B(""), B("quux"),
    ]);
fn lines_with_terminator(self: &Self) -> LinesWithTerminator<'_>
An iterator over all lines in a byte string, including their terminators.
For this iterator, the only line terminator recognized is \n. (Since line terminators are included, this also handles \r\n line endings.)
Line terminators are only included if they are present in the original byte string. For example, the last line in a byte string may not end with a line terminator.
Concatenating all elements yielded by this iterator is guaranteed to yield the original byte string.
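The round-trip property can be sketched with the standard library alone. The helpers below, split_keep_newlines and strip_terminator, are hypothetical stand-ins built on slice::split_inclusive and slice patterns; they are meant to mirror the described behavior, not to reproduce bstr's iterators.

```rust
/// Hypothetical std-only stand-in: split after every `\n`, keeping it.
/// A final line without a terminator is yielded as-is.
fn split_keep_newlines(bytes: &[u8]) -> Vec<&[u8]> {
    bytes.split_inclusive(|&b| b == b'\n').collect()
}

/// Remove a trailing `\r\n` or `\n`, if present. A lone trailing `\r`
/// is not a recognized terminator and is left in place.
fn strip_terminator(line: &[u8]) -> &[u8] {
    match line {
        [head @ .., b'\r', b'\n'] => head,
        [head @ .., b'\n'] => head,
        _ => line,
    }
}
```

Because every input byte lands in exactly one element, concatenating the pieces from split_keep_newlines gives back the input unchanged; applying strip_terminator to each piece gives the terminator-free view.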
Examples
Basic usage:
    use bstr::{B, ByteSlice};
    let s = b"\
    foo

    bar\r
    baz


    quux";
    let lines: Vec<&[u8]> = s.lines_with_terminator().collect();
    assert_eq!(lines, vec![
        B("foo\n"), B("\n"), B("bar\r\n"), B("baz\n"), B("\n"), B("\n"), B("quux"),
    ]);
fn trim_with<F: FnMut(char) -> bool>(self: &Self, trim: F) -> &[u8]
Return a byte string slice with leading and trailing characters satisfying the given predicate removed.
Examples
Basic usage:
    use bstr::{B, ByteSlice};
    let s = b"123foo5bar789";
    assert_eq!(s.trim_with(|c| c.is_numeric()), B("foo5bar"));
fn trim_start_with<F: FnMut(char) -> bool>(self: &Self, trim: F) -> &[u8]
Return a byte string slice with leading characters satisfying the given predicate removed.
Examples
Basic usage:
    use bstr::{B, ByteSlice};
    let s = b"123foo5bar789";
    assert_eq!(s.trim_start_with(|c| c.is_numeric()), B("foo5bar789"));
fn trim_end_with<F: FnMut(char) -> bool>(self: &Self, trim: F) -> &[u8]
Return a byte string slice with trailing characters satisfying the given predicate removed.
Examples
Basic usage:
    use bstr::{B, ByteSlice};
    let s = b"123foo5bar789";
    assert_eq!(s.trim_end_with(|c| c.is_numeric()), B("123foo5bar"));
fn to_ascii_lowercase(self: &Self) -> Vec<u8>
Returns a new Vec<u8> containing the ASCII lowercase equivalent of this byte string.
In this case, lowercase is only defined in ASCII letters. Namely, the letters A-Z are converted to a-z. All other bytes remain unchanged. In particular, the length of the byte string returned is always equivalent to the length of this byte string.
If you'd like to reuse an allocation for performance reasons, then use make_ascii_lowercase to perform the conversion in place.
Examples
Basic usage:
    use bstr::{B, ByteSlice};
    let s = B("HELLO Β");
    assert_eq!(s.to_ascii_lowercase(), B("hello Β"));
Invalid UTF-8 remains as is:
    use bstr::{B, ByteSlice};
    let s = B(b"FOO\xFFBAR\xE2\x98BAZ");
    assert_eq!(s.to_ascii_lowercase(), B(b"foo\xFFbar\xE2\x98baz"));
fn make_ascii_lowercase(self: &mut Self)
Convert this byte string to its lowercase ASCII equivalent in place.
In this case, lowercase is only defined in ASCII letters. Namely, the letters A-Z are converted to a-z. All other bytes remain unchanged.
If you don't need to do the conversion in place and instead prefer convenience, then use to_ascii_lowercase instead.
Examples
Basic usage:
use bstr::ByteSlice;
let mut s = <Vec<u8>>::from("HELLO Δ");
s.make_ascii_lowercase();
assert_eq!(s, "hello Δ".as_bytes());
Invalid UTF-8 remains as is.
fn to_ascii_uppercase(self: &Self) -> Vec<u8>
Returns a new Vec<u8> containing the ASCII uppercase equivalent of this byte string.
In this case, uppercase is only defined in ASCII letters. Namely, the letters a-z are converted to A-Z. All other bytes remain unchanged. In particular, the length of the byte string returned is always equivalent to the length of this byte string.
If you'd like to reuse an allocation for performance reasons, then use make_ascii_uppercase to perform the conversion in place.
Examples
Basic usage:
use bstr::{B, ByteSlice};
let s = B("hello δ");
assert_eq!(s.to_ascii_uppercase(), B("HELLO δ"));
Invalid UTF-8 remains as is:
use bstr::{B, ByteSlice};
let s = B(b"foo\xFFbar\xE2\x98baz");
assert_eq!(s.to_ascii_uppercase(), B(b"FOO\xFFBAR\xE2\x98BAZ"));
fn make_ascii_uppercase(self: &mut Self)
Convert this byte string to its uppercase ASCII equivalent in place.
In this case, uppercase is only defined in ASCII letters. Namely, the letters a-z are converted to A-Z. All other bytes remain unchanged.
If you don't need to do the conversion in place and instead prefer convenience, then use to_ascii_uppercase instead.
Examples
Basic usage:
use bstr::{B, ByteSlice};
let mut s = <Vec<u8>>::from("hello δ");
s.make_ascii_uppercase();
assert_eq!(s, B("HELLO δ"));
Invalid UTF-8 remains as is.
fn escape_bytes(self: &Self) -> EscapeBytes<'_>
Escapes this byte string into a sequence of char values.
When the sequence of char values is concatenated into a string, the result is always valid UTF-8. Any unprintable bytes or invalid UTF-8 in this byte string are escaped using \xNN notation. Moreover, the characters \0, \r, \n, \t and \ are escaped as well.
This is useful when one wants to get a human readable view of the raw bytes that is also valid UTF-8.
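To make the escaping rules concrete, here is a minimal standard-library sketch of the same idea. It is not bstr's implementation: the helper name `escape_lossless` is hypothetical, and the uppercase hex digits and the exact set of control characters escaped are assumptions made for illustration.

```rust
use std::fmt::Write;

/// Hypothetical helper (not part of bstr): valid UTF-8 passes through,
/// while control characters, backslashes and invalid bytes are escaped.
fn escape_lossless(mut bytes: &[u8]) -> String {
    let mut out = String::new();
    while !bytes.is_empty() {
        // Split the input into its longest valid UTF-8 prefix and the rest.
        let (valid, rest) = match std::str::from_utf8(bytes) {
            Ok(s) => (s, &bytes[bytes.len()..]),
            Err(e) => {
                let (v, r) = bytes.split_at(e.valid_up_to());
                (std::str::from_utf8(v).unwrap(), r)
            }
        };
        for c in valid.chars() {
            match c {
                '\0' => out.push_str("\\0"),
                '\r' => out.push_str("\\r"),
                '\n' => out.push_str("\\n"),
                '\t' => out.push_str("\\t"),
                '\\' => out.push_str("\\\\"),
                // Assumption: remaining ASCII control characters use \xNN.
                c if c.is_ascii_control() => write!(out, "\\x{:02X}", c as u32).unwrap(),
                c => out.push(c),
            }
        }
        // Escape the first invalid byte (if any) and continue after it.
        match rest.split_first() {
            Some((&b, tail)) => {
                write!(out, "\\x{:02X}", b).unwrap();
                bytes = tail;
            }
            None => bytes = rest,
        }
    }
    out
}

fn main() {
    assert_eq!(escape_lossless(b"foo\xFFbar"), "foo\\xFFbar");
    assert_eq!(escape_lossless(b"a\nb"), "a\\nb");
    // Codepoints above U+007F are passed through unchanged.
    assert_eq!(escape_lossless("☃".as_bytes()), "☃");
}
```

The result is always a valid `String`, and every escaped byte can be recovered from its `\xNN` form, which is what makes an inverse operation possible.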
The iterator returned implements the Display trait. So one can do b"foo\xFFbar".escape_bytes().to_string() to get a String with its bytes escaped.
The dual of this function is ByteVec::unescape_bytes.
Note that this is similar to, but not equivalent to, the Debug implementation on BStr and BString. The Debug implementations also use the debug representation for all Unicode codepoints. However, this escaping routine only escapes individual bytes. All Unicode codepoints above U+007F are passed through unchanged without any escaping.
fn reverse_bytes(self: &mut Self)
Reverse the bytes in this string, in place.
This is not necessarily a well formed operation! For example, if this byte string contains valid UTF-8 that isn't ASCII, then reversing the string will likely result in invalid UTF-8 and otherwise nonsensical content.
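A small standard-library sketch of why byte-wise reversal is risky. The helper `reversed_bytes` is hypothetical and only wraps `[u8]::reverse`; it is meant to show the effect on a multi-byte codepoint, not how bstr implements the method.

```rust
/// Hypothetical helper (not part of bstr): return a byte-wise reversed copy.
fn reversed_bytes(bytes: &[u8]) -> Vec<u8> {
    let mut v = bytes.to_vec();
    v.reverse();
    v
}

fn main() {
    // "☃" (U+2603) is encoded in UTF-8 as the three bytes E2 98 83.
    let snowman = "a☃".as_bytes();
    assert!(std::str::from_utf8(snowman).is_ok());

    // Reversing the bytes puts a continuation byte first: invalid UTF-8.
    let rev = reversed_bytes(snowman);
    assert_eq!(rev, [0x83, 0x98, 0xE2, b'a']);
    assert!(std::str::from_utf8(&rev).is_err());

    // Pure ASCII survives a byte-wise reversal.
    assert_eq!(reversed_bytes(b"hello"), b"olleh");
}
```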
Note that this is equivalent to the generic [u8]::reverse method. This method is provided to permit callers to explicitly differentiate between reversing bytes, codepoints and graphemes.
Examples
Basic usage:
use bstr::ByteSlice;
let mut s = <Vec<u8>>::from("hello");
s.reverse_bytes();
assert_eq!(s, "olleh".as_bytes());
fn reverse_chars(self: &mut Self)
Reverse the codepoints in this string, in place.
If this byte string is valid UTF-8, then its reversal by codepoint is also guaranteed to be valid UTF-8.
This operation is equivalent to the following, but without allocating:
use bstr::ByteSlice;
let mut s = <Vec<u8>>::from("foo☃bar");
let mut chars: Vec<char> = s.chars().collect();
chars.reverse();
let reversed: String = chars.into_iter().collect();
assert_eq!(reversed, "rab☃oof");
Note that this is not necessarily a well formed operation. For example, if this byte string contains grapheme clusters with more than one codepoint, then those grapheme clusters will not necessarily be preserved. If you'd like to preserve grapheme clusters, then use reverse_graphemes instead.
Examples
Basic usage:
use bstr::ByteSlice;
let mut s = <Vec<u8>>::from("foo☃bar");
s.reverse_chars();
assert_eq!(s, "rab☃oof".as_bytes());
This example shows that not all reversals lead to a well formed string. For example, in this case, combining marks are used to put accents over some letters, and those accent marks must appear after the codepoints they modify.
use bstr::{B, ByteSlice};
let mut s = <Vec<u8>>::from("re\u{301}sume\u{301}");
s.reverse_chars();
assert_eq!(s, B(b"\xCC\x81emus\xCC\x81er"));
A word of warning: the above example relies on the fact that résumé is in decomposed normal form, which means there are separate codepoints for the accents above e. If it is instead in composed normal form, then the example works:
use bstr::{B, ByteSlice};
let mut s = <Vec<u8>>::from("r\u{e9}sum\u{e9}");
s.reverse_chars();
assert_eq!(s, B("\u{e9}mus\u{e9}r"));
The point here is to be cautious and not assume that just because reverse_chars works in one case, that it therefore works in all cases.
fn is_ascii(self: &Self) -> bool
Returns true if and only if every byte in this byte string is ASCII.
ASCII is an encoding that defines 128 codepoints. A byte corresponds to an ASCII codepoint if and only if it is in the inclusive range [0, 127].
Examples
Basic usage:
use bstr::{B, ByteSlice};
assert!(B("abc").is_ascii());
assert!(!B("☃βツ").is_ascii());
assert!(!B(b"\xFF").is_ascii());
fn is_utf8(self: &Self) -> bool
Returns true if and only if the entire byte string is valid UTF-8.
If you need location information about where a byte string's first invalid UTF-8 byte is, then use the to_str method.
Examples
Basic usage:
use bstr::{B, ByteSlice};
assert!(B("abc").is_utf8());
assert!(B("☃βツ").is_utf8());
// invalid bytes
assert!(!B(b"abc\xFF").is_utf8());
// surrogate encoding
assert!(!B(b"\xED\xA0\x80").is_utf8());
// incomplete sequence
assert!(!B(b"\xF0\x9D\x9Ca").is_utf8());
// overlong sequence
assert!(!B(b"\xF0\x82\x82\xAC").is_utf8());
fn last_byte(self: &Self) -> Option<u8>
Returns the last byte in this byte string, if it's non-empty. If this byte string is empty, this returns None.
Note that this is like the generic [u8]::last, except this returns the byte by value instead of a reference to the byte.
Examples
Basic usage:
use bstr::ByteSlice;
assert_eq!(Some(b'z'), b"baz".last_byte());
assert_eq!(None, b"".last_byte());
fn find_non_ascii_byte(self: &Self) -> Option<usize>
Returns the index of the first non-ASCII byte in this byte string, if one exists. Specifically, it returns the index of the first byte with a value greater than or equal to 0x80.
Examples
Basic usage:
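use bstr::{B, ByteSlice};
assert_eq!(Some(3), b"abc\xff".find_non_ascii_byte());
assert_eq!(None, b"abcde".find_non_ascii_byte());
assert_eq!(Some(0), B("😀").find_non_ascii_byte());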
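The 0x80 threshold can also be expressed with the standard library alone. The sketch below uses a hypothetical helper, `first_non_ascii`, which is a straightforward linear scan and makes no claim about how bstr implements the search.

```rust
/// Hypothetical helper (not part of bstr): index of the first byte whose
/// value is greater than or equal to 0x80, if any.
fn first_non_ascii(bytes: &[u8]) -> Option<usize> {
    bytes.iter().position(|&b| b >= 0x80)
}

fn main() {
    assert_eq!(first_non_ascii(b"abc\xFF"), Some(3));
    assert_eq!(first_non_ascii(b"abcde"), None);

    // Every byte of a multi-byte UTF-8 sequence is >= 0x80, so a string
    // that starts with one reports index 0.
    assert_eq!(first_non_ascii("😀".as_bytes()), Some(0));

    // A byte string is ASCII exactly when no such index exists.
    assert_eq!(b"abcde".is_ascii(), first_non_ascii(b"abcde").is_none());
}
```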
Implementors
impl ByteSlice for [u8]
impl<const N: usize> ByteSlice for [u8; N]