Struct ByteClasses
struct ByteClasses(_)
A representation of byte oriented equivalence classes.
This is used in a DFA to reduce the size of the transition table. Shrinking the alphabet can have a particularly large impact not only on the total size of a dense DFA, but also on its compile time.
The essential idea here is that the alphabet of a DFA is shrunk from the usual 256 distinct byte values down to a set of equivalence classes. The guarantee you get is that any byte belonging to the same equivalence class can be treated as if it were any other byte in the same class, and the result of a search wouldn't change.
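To make the idea concrete, here is a minimal standard-library sketch of one way such classes could be derived from the byte ranges a pattern distinguishes. The function name `classes_from_ranges` and the boundary-marking approach are illustrative assumptions, not part of this type's API.

```rust
/// Hypothetical helper: builds a 256-entry map from byte value to
/// equivalence class, given the inclusive byte ranges a pattern distinguishes.
fn classes_from_ranges(ranges: &[(u8, u8)]) -> [u8; 256] {
    // boundary[b] is true when byte b is the last byte of a class.
    let mut boundary = [false; 256];
    for &(start, end) in ranges {
        if start > 0 {
            // The byte just before the range ends the preceding class.
            boundary[usize::from(start) - 1] = true;
        }
        boundary[usize::from(end)] = true;
    }
    let mut classes = [0u8; 256];
    let mut class = 0u8;
    for b in 0..256usize {
        classes[b] = class;
        // Start a new class after each boundary, except after the last byte.
        if boundary[b] && b < 255 {
            class += 1;
        }
    }
    classes
}
```

For the single range `(b'a', b'z')`, this yields three classes: bytes below `a`, the bytes `a` through `z`, and bytes above `z`. Any two bytes in the same class are indistinguishable to a pattern built only from that range.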
Example
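This example shows how to get byte classes from an NFA and ask for the class of various bytes.
use regex_automata::nfa::thompson::NFA;
// The pattern is an assumption; any regex that treats a-z uniformly
// and distinguishes case behaves the same way here.
let nfa = NFA::new("[a-z]+")?;
let classes = nfa.byte_classes();
// 'a' and 'z' are in the same class for this regex.
assert_eq!(classes.get(b'a'), classes.get(b'z'));
// But 'a' and 'A' are not.
assert_ne!(classes.get(b'a'), classes.get(b'A'));
# Ok::<(), Box<dyn std::error::Error>>(())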
Implementations
impl ByteClasses
fn empty() -> ByteClasses
Creates a new set of equivalence classes where all bytes are mapped to the same class.
fn singletons() -> ByteClasses
Creates a new set of equivalence classes where each byte belongs to its own equivalence class.
fn set(self: &mut Self, byte: u8, class: u8)
Set the equivalence class for the given byte.
fn get(self: &Self, byte: u8) -> u8
Get the equivalence class for the given byte.
fn get_by_unit(self: &Self, unit: Unit) -> usize
Get the equivalence class for the given haystack unit and return the class as a usize.
fn eoi(self: &Self) -> Unit
Create a unit that represents the "end of input" sentinel based on the number of equivalence classes.
fn alphabet_len(self: &Self) -> usize
Return the total number of elements in the alphabet represented by these equivalence classes. Equivalently, this returns the total number of equivalence classes.
fn stride2(self: &Self) -> usize
Returns the stride, as a base-2 exponent, required for these equivalence classes.
The stride is always the smallest power of 2 that is greater than or equal to the alphabet length, and the stride2 returned here is the exponent applied to 2 to get the smallest power. This is done so that converting between premultiplied state IDs and indices can be done with shifts alone, which is much faster than integer division.
fn is_singleton(self: &Self) -> bool
Returns true if and only if every byte in this class maps to its own equivalence class. Equivalently, there are 257 equivalence classes and each class contains either exactly one byte or corresponds to the singleton class containing the "end of input" sentinel.
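The stride2 arithmetic described above can be sketched with the standard library alone. The helper names below (`stride2_for`, `to_premultiplied`, `to_index`) are hypothetical and only illustrate the relationship between alphabet length, stride, and shifts; they are not how this type is necessarily implemented.

```rust
/// Hypothetical helper: the base-2 exponent of the smallest power of 2
/// that is greater than or equal to `alphabet_len`.
fn stride2_for(alphabet_len: usize) -> u32 {
    alphabet_len.next_power_of_two().trailing_zeros()
}

/// Converts a state index to a premultiplied state ID using a shift
/// instead of a multiplication by the stride.
fn to_premultiplied(index: usize, stride2: u32) -> usize {
    index << stride2
}

/// Converts a premultiplied state ID back to a state index using a shift
/// instead of a division by the stride.
fn to_index(premultiplied: usize, stride2: u32) -> usize {
    premultiplied >> stride2
}
```

With an alphabet of 5 elements, the stride is 8 and `stride2_for(5)` is 3, so state index 7 becomes the premultiplied ID 56. With the full singleton alphabet of 257 elements, the stride is 512 and the exponent is 9.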
fn iter(self: &Self) -> ByteClassIter<'_>
Returns an iterator over all equivalence classes in this set.
fn representatives<R: core::ops::RangeBounds<u8>>(self: &Self, range: R) -> ByteClassRepresentatives<'_>
Returns an iterator over a sequence of representative bytes from each equivalence class within the range of bytes given.
When the given range is unbounded on both sides, the iterator yields exactly N items, where N is equivalent to the number of equivalence classes. Each item is an arbitrary byte drawn from each equivalence class.
This is useful when one is determinizing an NFA and the NFA's alphabet hasn't been converted to equivalence classes. Picking an arbitrary byte from each equivalence class then permits a full exploration of the NFA instead of using every possible byte value and thus potentially saves quite a lot of redundant work.
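As a rough illustration of why representatives save work, the sketch below picks one byte per class from a plain 256-entry class map. The function `pick_representatives` is hypothetical, and it assumes each class covers a contiguous run of bytes; a non-contiguous class would be reported more than once.

```rust
/// Hypothetical helper: returns the first byte of each equivalence class,
/// assuming every class is a contiguous run of byte values.
fn pick_representatives(classes: &[u8; 256]) -> Vec<u8> {
    let mut reps = Vec::new();
    let mut last_class = None;
    for b in 0..=255u8 {
        let class = classes[usize::from(b)];
        // A change in class number marks the start of a new class.
        if last_class != Some(class) {
            reps.push(b);
            last_class = Some(class);
        }
    }
    reps
}
```

For a map with three classes (bytes below `a`, `a` through `z`, bytes above `z`), a determinizer following this approach would explore 3 transitions per state rather than 256.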
Example
This shows an example of what a complete sequence of representatives might look like from a real example.
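use regex_automata::{nfa::thompson::NFA, util::alphabet::Unit};
// The pattern is an assumption; it matches the one used in the earlier example.
let nfa = NFA::new("[a-z]+")?;
let classes = nfa.byte_classes();
let reps: Vec<Unit> = classes.representatives(..).collect();
// Note that the specific byte values yielded are not guaranteed!
let expected = vec![
    Unit::u8(b'\x00'),
    Unit::u8(b'a'),
    Unit::u8(b'{'),
    Unit::eoi(3),
];
assert_eq!(expected, reps);
# Ok::<(), Box<dyn std::error::Error>>(())
Note though, that you can ask for an arbitrary range of bytes, and only representatives for that range will be returned:
use regex_automata::{nfa::thompson::NFA, util::alphabet::Unit};
// The pattern and the range are assumptions chosen to show a bounded query.
let nfa = NFA::new("[a-z]+")?;
let classes = nfa.byte_classes();
let reps: Vec<Unit> = classes.representatives(b'A'..=b'z').collect();
// Note that the specific byte values yielded are not guaranteed!
let expected = vec![
    Unit::u8(b'A'),
    Unit::u8(b'a'),
];
assert_eq!(expected, reps);
# Ok::<(), Box<dyn std::error::Error>>(())
fn elements(self: &Self, class: Unit) -> ByteClassElements<'_>
Returns an iterator of the bytes in the given equivalence class.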
This is useful when one needs to know the actual bytes that belong to an equivalence class. For example, conceptually speaking, accelerating a DFA state occurs when a state only has a few outgoing transitions. But in reality, what is required is that there are only a small number of distinct bytes that can lead to an outgoing transition. The difference is that any one transition can correspond to an equivalence class which may contain many bytes. Therefore, DFA state acceleration considers the actual elements in each equivalence class of each outgoing transition.
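The distinction between counting transitions and counting bytes can be sketched as follows. Both helpers are hypothetical and operate on a plain 256-entry class map rather than on this type; `distinct_byte_count` also assumes the outgoing class numbers passed to it are distinct.

```rust
/// Hypothetical helper: returns every byte that maps to `class`.
fn class_elements(classes: &[u8; 256], class: u8) -> Vec<u8> {
    (0..=255u8)
        .filter(|&b| classes[usize::from(b)] == class)
        .collect()
}

/// Hypothetical helper: counts the actual bytes behind a state's outgoing
/// transitions, where each transition is labeled with a distinct class.
fn distinct_byte_count(classes: &[u8; 256], outgoing: &[u8]) -> usize {
    outgoing
        .iter()
        .map(|&class| class_elements(classes, class).len())
        .sum()
}
```

A state with a single outgoing transition on the class for `a` through `z` looks cheap when counted by transitions, but it is backed by 26 distinct bytes, which is the number an acceleration heuristic would need to consider.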
Example
This shows an example of how to get all of the elements in an equivalence class.
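use regex_automata::{nfa::thompson::NFA, util::alphabet::Unit};
// The pattern is an assumption; it matches the one used in the earlier examples.
let nfa = NFA::new("[a-z]+")?;
let classes = nfa.byte_classes();
// Look up the class of 'a' rather than assuming a specific class number.
let elements: Vec<Unit> = classes.elements(Unit::u8(classes.get(b'a'))).collect();
let expected: Vec<Unit> = (b'a'..=b'z').map(Unit::u8).collect();
assert_eq!(expected, elements);
# Ok::<(), Box<dyn std::error::Error>>(())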
impl Clone for ByteClasses
fn clone(self: &Self) -> ByteClasses
impl Copy for ByteClasses
impl Debug for ByteClasses
fn fmt(self: &Self, f: &mut Formatter<'_>) -> Result
impl Default for ByteClasses
fn default() -> ByteClasses
impl Freeze for ByteClasses
impl RefUnwindSafe for ByteClasses
impl Send for ByteClasses
impl Sync for ByteClasses
impl Unpin for ByteClasses
impl UnsafeUnpin for ByteClasses
impl UnwindSafe for ByteClasses
impl<T> Any for ByteClasses
fn type_id(self: &Self) -> TypeId
impl<T> Borrow for ByteClasses
fn borrow(self: &Self) -> &T
impl<T> BorrowMut for ByteClasses
fn borrow_mut(self: &mut Self) -> &mut T
impl<T> CloneToUninit for ByteClasses
unsafe fn clone_to_uninit(self: &Self, dest: *mut u8)
impl<T> From for ByteClasses
fn from(t: T) -> T
Returns the argument unchanged.
impl<T> ToOwned for ByteClasses
fn to_owned(self: &Self) -> T
fn clone_into(self: &Self, target: &mut T)
impl<T, U> Into for ByteClasses
fn into(self: Self) -> U
Calls U::from(self).
That is, this conversion is whatever the implementation of From<T> for U chooses to do.
impl<T, U> TryFrom for ByteClasses
fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>
impl<T, U> TryInto for ByteClasses
fn try_into(self: Self) -> Result<U, <U as TryFrom<T>>::Error>