Struct Utf8Error

struct Utf8Error { ... }

An error that occurs when UTF-8 decoding fails.

This error occurs when attempting to convert a non-UTF-8 byte string to a Rust string, which must be valid UTF-8. For example, to_str performs such a conversion and returns this error on failure.

Example

This example shows what happens when a given byte sequence is invalid, but ends with a sequence that is a possible prefix of valid UTF-8.

use bstr::{B, ByteSlice};

let s = B(b"foobar\xF1\x80\x80");
let err = s.to_str().unwrap_err();
assert_eq!(err.valid_up_to(), 6);
assert_eq!(err.error_len(), None);

This example shows what happens when a given byte sequence contains invalid UTF-8.

use bstr::ByteSlice;

let s = b"foobar\xF1\x80\x80quux";
let err = s.to_str().unwrap_err();
assert_eq!(err.valid_up_to(), 6);
// The error length reports the maximum number of bytes that correspond to
// a valid prefix of a UTF-8 encoded codepoint.
assert_eq!(err.error_len(), Some(3));

// In contrast to the above which contains a single invalid prefix,
// consider the case of multiple individual bytes that are never valid
// prefixes. Note how the value of error_len changes!
let s = b"foobar\xFF\xFFquux";
let err = s.to_str().unwrap_err();
assert_eq!(err.valid_up_to(), 6);
assert_eq!(err.error_len(), Some(1));

// Because \xFF can never begin a valid sequence, error_len is still
// Some(1) even when it is the final byte of the string.
let s = b"foobar\xFF";
let err = s.to_str().unwrap_err();
assert_eq!(err.valid_up_to(), 6);
assert_eq!(err.error_len(), Some(1));
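
The two accessors shown above are enough to write a lossy decoder by hand. The following is a minimal sketch, not this crate's implementation: so that it runs without extra dependencies it uses std::str::from_utf8, whose error type exposes valid_up_to and error_len with the same names and return types, and it assumes the same loop carries over when to_str is used instead. The function name decode_lossy is hypothetical.

```rust
use std::str;

/// Hypothetical helper: decodes `bytes`, substituting U+FFFD for every
/// invalid sequence reported by the error.
fn decode_lossy(mut bytes: &[u8]) -> String {
    let mut out = String::new();
    loop {
        match str::from_utf8(bytes) {
            Ok(s) => {
                // Everything that remains is valid, so we are done.
                out.push_str(s);
                return out;
            }
            Err(err) => {
                // Bytes before valid_up_to() are valid UTF-8.
                let (valid, rest) = bytes.split_at(err.valid_up_to());
                out.push_str(str::from_utf8(valid).expect("prefix is valid UTF-8"));
                out.push('\u{FFFD}');
                match err.error_len() {
                    // Skip the invalid bytes and keep decoding.
                    Some(n) => bytes = &rest[n..],
                    // The input ended in the middle of a codepoint.
                    None => return out,
                }
            }
        }
    }
}
```

Each invalid sequence produces one replacement character, so the number of replacements follows error_len: a three-byte invalid prefix yields one U+FFFD, while two stray \xFF bytes yield two.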

Implementations

impl Utf8Error

fn valid_up_to(self: &Self) -> usize

Returns the byte index of the position immediately following the last valid UTF-8 byte.

Example

This example shows how valid_up_to can be used to retrieve a possibly empty prefix that is guaranteed to be valid UTF-8:

use bstr::ByteSlice;

let s = b"foobar\xF1\x80\x80quux";
let err = s.to_str().unwrap_err();

// This is guaranteed to never panic.
let string = s[..err.valid_up_to()].to_str().unwrap();
assert_eq!(string, "foobar");

fn error_len(self: &Self) -> Option<usize>

Returns the total number of invalid UTF-8 bytes immediately following the position returned by valid_up_to. This value is always at least 1, but can be up to 3 if the bytes form a valid prefix of some UTF-8 encoded codepoint.

If the end of the original input was found before a valid UTF-8 encoded codepoint could be completed, then this returns None. This is useful when processing streams, where a None value signals that more input might be needed.
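
To illustrate the streaming case, here is a hedged sketch of how a caller might classify a buffer into "fully valid", "invalid bytes to skip", and "incomplete, read more input". The Step enum and the step function are hypothetical names invented for this example. To stay dependency-free it is written against std::str::from_utf8, whose error type offers valid_up_to and error_len with the same signatures; the assumption is that the same match applies to the error returned by to_str.

```rust
use std::str;

/// Hypothetical classification of a buffer for a streaming decoder.
#[derive(Debug, PartialEq)]
enum Step<'a> {
    /// The whole buffer is valid UTF-8.
    Valid(&'a str),
    /// A valid prefix followed by `skip` bytes that are definitely invalid.
    Invalid { valid: &'a str, skip: usize },
    /// A valid prefix followed by an unfinished sequence: keep `pending`
    /// and retry once more input has arrived.
    Incomplete { valid: &'a str, pending: &'a [u8] },
}

fn step(buf: &[u8]) -> Step<'_> {
    match str::from_utf8(buf) {
        Ok(s) => Step::Valid(s),
        Err(err) => {
            let (head, tail) = buf.split_at(err.valid_up_to());
            // Re-validating the prefix cannot fail.
            let valid = str::from_utf8(head).expect("prefix is valid UTF-8");
            match err.error_len() {
                Some(n) => Step::Invalid { valid, skip: n },
                // None: the input ended before the codepoint was complete.
                None => Step::Incomplete { valid, pending: tail },
            }
        }
    }
}
```

A reader loop would emit the valid part in every case, drop skip bytes on Invalid, and on Incomplete carry the pending bytes over to the front of the next chunk. If the stream ends while bytes are still pending, those bytes are invalid after all.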

impl Clone for Utf8Error

fn clone(self: &Self) -> Utf8Error

impl Debug for Utf8Error

fn fmt(self: &Self, f: &mut Formatter<'_>) -> Result

impl Display for Utf8Error

fn fmt(self: &Self, f: &mut Formatter<'_>) -> Result

impl Eq for Utf8Error

impl Error for Utf8Error

fn description(self: &Self) -> &str

impl Freeze for Utf8Error

impl PartialEq for Utf8Error

fn eq(self: &Self, other: &Utf8Error) -> bool

impl RefUnwindSafe for Utf8Error

impl Send for Utf8Error

impl StructuralPartialEq for Utf8Error

impl Sync for Utf8Error

impl Unpin for Utf8Error

impl UnsafeUnpin for Utf8Error

impl UnwindSafe for Utf8Error

impl<T> Any for Utf8Error

fn type_id(self: &Self) -> TypeId

impl<T> Borrow for Utf8Error

fn borrow(self: &Self) -> &T

impl<T> BorrowMut for Utf8Error

fn borrow_mut(self: &mut Self) -> &mut T

impl<T> CloneToUninit for Utf8Error

unsafe fn clone_to_uninit(self: &Self, dest: *mut u8)

impl<T> From for Utf8Error

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> ToOwned for Utf8Error

fn to_owned(self: &Self) -> T
fn clone_into(self: &Self, target: &mut T)

impl<T> ToString for Utf8Error

fn to_string(self: &Self) -> String

impl<T, U> Into for Utf8Error

fn into(self: Self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom for Utf8Error

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto for Utf8Error

fn try_into(self: Self) -> Result<U, <U as TryFrom<T>>::Error>