Struct Utf8Error

struct Utf8Error { ... }

Errors which can occur when attempting to interpret a sequence of u8 as a string.

For example, the from_utf8 family of functions and methods for both String and &str make use of this error.
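As a minimal sketch of how the two entry points surface this error: std::str::from_utf8 returns a Utf8Error directly, while String::from_utf8 returns a FromUtf8Error that wraps one and exposes it through utf8_error(). The helper names check_slice and check_vec are made up for this illustration.

```rust
use std::str::{self, Utf8Error};

/// Validate a borrowed byte slice; the error is a `Utf8Error` directly.
/// (Hypothetical helper, not part of the standard library.)
fn check_slice(bytes: &[u8]) -> Result<&str, Utf8Error> {
    str::from_utf8(bytes)
}

/// Validate an owned buffer. `String::from_utf8` wraps the `Utf8Error`
/// in a `FromUtf8Error`; `utf8_error()` extracts it again.
/// (Hypothetical helper, not part of the standard library.)
fn check_vec(bytes: Vec<u8>) -> Result<String, Utf8Error> {
    String::from_utf8(bytes).map_err(|e| e.utf8_error())
}

fn main() {
    let bad = [b'h', b'i', 0xFF];
    let slice_err = check_slice(&bad).unwrap_err();
    let vec_err = check_vec(bad.to_vec()).unwrap_err();
    // Both paths report the same position for the first invalid byte.
    assert_eq!(slice_err, vec_err);
    assert_eq!(slice_err.valid_up_to(), 2);
}
```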

Examples

This error type’s methods can be used to create functionality similar to String::from_utf8_lossy without allocating heap memory:

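fn from_utf8_lossy<F>(mut input: &[u8], mut push: F) where F: FnMut(&str) {
    loop {
        match std::str::from_utf8(input) {
            // The rest of the input is valid: emit it and stop.
            Ok(valid) => {
                push(valid);
                break
            }
            Err(error) => {
                // Emit the valid prefix, then one replacement character for the bad bytes.
                let (valid, after_valid) = input.split_at(error.valid_up_to());
                // SAFETY: `valid_up_to()` guarantees that `valid` is well-formed UTF-8.
                unsafe {
                    push(std::str::from_utf8_unchecked(valid))
                }
                push("\u{FFFD}");

                if let Some(invalid_sequence_length) = error.error_len() {
                    // Skip the invalid sequence and keep decoding.
                    input = &after_valid[invalid_sequence_length..]
                } else {
                    // The input ended in the middle of a character.
                    break
                }
            }
        }
    }
}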
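One possible way to call the function above is sketched here. The function is repeated so the block compiles on its own, and collect_lossy is a hypothetical helper that gathers the pushed pieces into a String purely to make the result easy to compare. That helper does allocate; a caller that wants to avoid the heap would instead write each piece to its own output sink.

```rust
// Same function as in the example above, repeated so this sketch is self-contained.
fn from_utf8_lossy<F>(mut input: &[u8], mut push: F) where F: FnMut(&str) {
    loop {
        match std::str::from_utf8(input) {
            Ok(valid) => {
                push(valid);
                break
            }
            Err(error) => {
                let (valid, after_valid) = input.split_at(error.valid_up_to());
                // SAFETY: `valid_up_to()` guarantees that `valid` is well-formed UTF-8.
                unsafe {
                    push(std::str::from_utf8_unchecked(valid))
                }
                push("\u{FFFD}");

                if let Some(invalid_sequence_length) = error.error_len() {
                    input = &after_valid[invalid_sequence_length..]
                } else {
                    break
                }
            }
        }
    }
}

/// Hypothetical helper: collect every pushed piece into one `String`.
fn collect_lossy(input: &[u8]) -> String {
    let mut out = String::new();
    from_utf8_lossy(input, |piece| out.push_str(piece));
    out
}

fn main() {
    // F0 90 80 is a four-byte sequence cut short by the following 'W'.
    let input = b"Hello \xF0\x90\x80World";
    let ours = collect_lossy(input);
    assert_eq!(ours, "Hello \u{FFFD}World");
    // For this input the result agrees with the allocating standard function.
    assert_eq!(ours, String::from_utf8_lossy(input).into_owned());
}
```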

Implementations

impl Utf8Error

const fn valid_up_to(self: &Self) -> usize

Returns the index in the given string up to which valid UTF-8 was verified.

It is the maximum index such that from_utf8(&input[..index]) would return Ok(_).
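The property can be checked directly, as in the following sketch. split_valid_prefix is a hypothetical helper introduced only for this illustration; it re-validates the prefix with the safe from_utf8 rather than using unsafe code.

```rust
use std::str;

/// Hypothetical helper: split `input` into its longest valid UTF-8 prefix
/// and the remaining, not yet validated bytes.
fn split_valid_prefix(input: &[u8]) -> (&str, &[u8]) {
    match str::from_utf8(input) {
        // Everything is valid, so the remainder is empty.
        Ok(all) => (all, &input[input.len()..]),
        Err(error) => {
            let (valid, rest) = input.split_at(error.valid_up_to());
            // The prefix was already verified, so this second check cannot fail.
            (str::from_utf8(valid).expect("prefix was verified"), rest)
        }
    }
}

fn main() {
    let input = b"abc\xFFdef";
    let index = str::from_utf8(input).unwrap_err().valid_up_to();
    assert_eq!(index, 3);

    // Slicing at the reported index yields valid UTF-8 ...
    assert!(str::from_utf8(&input[..index]).is_ok());
    // ... while one byte more already includes the offending byte.
    assert!(str::from_utf8(&input[..index + 1]).is_err());

    let (prefix, rest) = split_valid_prefix(input);
    assert_eq!(prefix, "abc");
    assert_eq!(rest, &b"\xFFdef"[..]);
}
```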

Examples

Basic usage:

use std::str;

// some invalid bytes, in a vector
let sparkle_heart = vec![0, 159, 146, 150];

// std::str::from_utf8 returns a Utf8Error
let error = str::from_utf8(&sparkle_heart).unwrap_err();

// the second byte is invalid here
assert_eq!(1, error.valid_up_to());

const fn error_len(self: &Self) -> Option<usize>

Provides more information about the failure:

  • None: the end of the input was reached unexpectedly. self.valid_up_to() is 1 to 3 bytes from the end of the input. If a byte stream (such as a file or a network socket) is being decoded incrementally, this could be a valid char whose UTF-8 byte sequence is spanning multiple chunks.

  • Some(len): an unexpected byte was encountered. The length provided is that of the invalid byte sequence that starts at the index given by valid_up_to(). Decoding should resume after that sequence (after inserting a U+FFFD REPLACEMENT CHARACTER) in case of lossy decoding.
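The two cases can be told apart as in the sketch below. decode_chunks is a hypothetical strict decoder written for this illustration: on None it keeps the incomplete tail and waits for the next chunk, and on Some(len) it gives up, because more input cannot repair an invalid sequence. A lossy decoder would instead insert U+FFFD and skip len bytes.

```rust
use std::str::{self, Utf8Error};

/// Hypothetical strict decoder over a sequence of chunks.
/// An incomplete trailing sequence is carried over to the next chunk.
fn decode_chunks(chunks: &[&[u8]]) -> Result<String, Utf8Error> {
    let mut out = String::new();
    let mut pending: Vec<u8> = Vec::new();
    for chunk in chunks {
        pending.extend_from_slice(chunk);
        match str::from_utf8(&pending) {
            Ok(text) => {
                out.push_str(text);
                pending.clear();
            }
            // `None`: the tail may be the start of a character that the
            // next chunk completes, so emit only the verified prefix.
            Err(error) if error.error_len().is_none() => {
                let valid = error.valid_up_to();
                out.push_str(str::from_utf8(&pending[..valid]).expect("prefix was verified"));
                pending.drain(..valid);
            }
            // `Some(len)`: the bytes at `valid_up_to()` can never become valid.
            Err(error) => return Err(error),
        }
    }
    // Bytes still pending here mean the stream ended in the middle of a character.
    if let Err(error) = str::from_utf8(&pending) {
        return Err(error);
    }
    Ok(out)
}

fn main() {
    // "€" is encoded as E2 82 AC; here it is split across two chunks.
    let chunks: [&[u8]; 2] = [b"a\xE2", b"\x82\xACb"];
    assert_eq!(decode_chunks(&chunks).unwrap(), "a€b");

    // Truncated input: `error_len()` is `None`.
    let cut = str::from_utf8(b"a\xE2\x82").unwrap_err();
    assert_eq!((cut.valid_up_to(), cut.error_len()), (1, None));

    // 0xFF never occurs in UTF-8: `error_len()` is `Some(1)`.
    let bad = str::from_utf8(b"a\xFFb").unwrap_err();
    assert_eq!((bad.valid_up_to(), bad.error_len()), (1, Some(1)));
}
```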

impl Clone for Utf8Error

fn clone(self: &Self) -> Utf8Error

impl Copy for Utf8Error

impl Debug for Utf8Error

fn fmt(self: &Self, f: &mut Formatter<'_>) -> Result

impl Display for Utf8Error

fn fmt(self: &Self, f: &mut Formatter<'_>) -> Result

impl Eq for Utf8Error

impl Error for Utf8Error
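Because Utf8Error implements Error, the ? operator can convert it into a Box<dyn Error> alongside other error types. The following is a minimal sketch; parse_number is a hypothetical helper, and the exact wording of the Display message is deliberately not asserted.

```rust
use std::error::Error;
use std::str::{self, Utf8Error};

/// Hypothetical helper: parse a UTF-8 encoded decimal number.
/// Both `Utf8Error` and `ParseIntError` implement `Error`, so `?` can box either.
fn parse_number(bytes: &[u8]) -> Result<i32, Box<dyn Error>> {
    let text = str::from_utf8(bytes)?;
    Ok(text.trim().parse::<i32>()?)
}

fn main() {
    assert_eq!(parse_number(b" 42 ").unwrap(), 42);

    let err = parse_number(b"4\xFF2").unwrap_err();
    // The boxed error can be downcast back to the concrete type.
    let utf8 = err.downcast_ref::<Utf8Error>().expect("should be a Utf8Error");
    assert_eq!(utf8.valid_up_to(), 1);
    // `Display` yields a human-readable description of the failure.
    println!("{err}");
}
```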

impl Freeze for Utf8Error

impl PartialEq for Utf8Error

fn eq(self: &Self, other: &Utf8Error) -> bool

impl RefUnwindSafe for Utf8Error

impl Send for Utf8Error

impl StructuralPartialEq for Utf8Error

impl Sync for Utf8Error

impl Unpin for Utf8Error

impl UnsafeUnpin for Utf8Error

impl UnwindSafe for Utf8Error

impl<T> Any for T where T: 'static + ?Sized

fn type_id(self: &Self) -> TypeId

impl<T> Borrow<T> for T where T: ?Sized

fn borrow(self: &Self) -> &T

impl<T> BorrowMut<T> for T where T: ?Sized

fn borrow_mut(self: &mut Self) -> &mut T

impl<T> CloneToUninit for T where T: Clone

unsafe fn clone_to_uninit(self: &Self, dest: *mut u8)

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T where U: From<T>

fn into(self: Self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T where U: Into<T>

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T where U: TryFrom<T>

fn try_into(self: Self) -> Result<U, <U as TryFrom<T>>::Error>