Struct Pair

struct Pair { ... }

A pair of byte offsets into a needle to use as a predicate.

This pair is used as a predicate to quickly filter out positions in a haystack in which a needle cannot match. In some cases, this pair can even be used in vector algorithms such that the vector algorithm only switches over to scalar code once this pair has been found.

A pair of offsets can be used in both substring search implementations and in prefilters. The former will report matches of a needle in a haystack, whereas the latter will only report possible matches of a needle.
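The difference between a prefilter and a full search can be sketched in scalar code. This is an illustration only, not the library's implementation: `OffsetPair`, `find_candidate` and `find` are hypothetical names, and the sketch assumes both offsets are valid for the needle.

```rust
/// Hypothetical stand-in for a pair of needle offsets (not the library's type).
#[derive(Clone, Copy, Debug)]
struct OffsetPair {
    index1: u8,
    index2: u8,
}

/// Prefilter: the first position at or after `start` where both predicate
/// bytes line up. This is only a *possible* match of the needle.
/// Assumes both offsets are in bounds for `needle`.
fn find_candidate(
    haystack: &[u8],
    needle: &[u8],
    pair: OffsetPair,
    start: usize,
) -> Option<usize> {
    let (i1, i2) = (usize::from(pair.index1), usize::from(pair.index2));
    let (b1, b2) = (needle[i1], needle[i2]);
    if needle.len() > haystack.len() {
        return None;
    }
    // Last position at which the whole needle still fits in the haystack.
    let last = haystack.len() - needle.len();
    (start..=last).find(|&pos| haystack[pos + i1] == b1 && haystack[pos + i2] == b2)
}

/// Substring search: confirms every candidate with a full comparison, so
/// only real matches are reported.
fn find(haystack: &[u8], needle: &[u8], pair: OffsetPair) -> Option<usize> {
    let mut start = 0;
    while let Some(pos) = find_candidate(haystack, needle, pair, start) {
        if &haystack[pos..pos + needle.len()] == needle {
            return Some(pos);
        }
        start = pos + 1;
    }
    None
}

fn main() {
    let needle = b"quux";
    // Use 'q' (offset 0) and 'x' (offset 3) as the predicate bytes.
    let pair = OffsetPair { index1: 0, index2: 3 };
    let haystack = b"qaax then quux";
    println!("candidate: {:?}", find_candidate(haystack, needle, pair, 0));
    println!("match:     {:?}", find(haystack, needle, pair));
}
```

In this sketch the prefilter reports position 0, a false positive that the full comparison rejects. The search then continues and finds the real match later in the haystack.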

The offsets are each limited to a maximum of 255 to keep memory usage low. Moreover, it's rarely advantageous to create a predicate using offsets greater than 255 anyway.
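The memory saving can be checked with a small comparison. `NarrowPair` and `WidePair` are hypothetical types made up for this illustration. The actual layout of Pair is not stated here.

```rust
use std::mem::size_of;

/// Hypothetical pair that stores each offset as a `u8`.
#[allow(dead_code)]
struct NarrowPair {
    index1: u8,
    index2: u8,
}

/// Hypothetical pair that stores each offset as a `usize`.
#[allow(dead_code)]
struct WidePair {
    index1: usize,
    index2: usize,
}

fn main() {
    // Two `u8` fields occupy 2 bytes; two `usize` fields occupy 16 bytes
    // on a 64-bit target.
    println!("u8 offsets:    {} bytes", size_of::<NarrowPair>());
    println!("usize offsets: {} bytes", size_of::<WidePair>());
}
```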

The only guarantee enforced on the pair of offsets is that they are not equal. It is not necessarily the case that index1 < index2, for example. By convention, index1 corresponds to the byte in the needle that is believed to be the most predictive. Note also that because the indices must both be valid for the needle used to build the pair and must not be equal, it follows that a pair can only be constructed for needles with length at least 2.
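The constraints above can be written as a small validation routine. This is a sketch of the stated rules only. `OffsetPair` and `checked_pair` are hypothetical names, and the library's own checks may differ in detail.

```rust
/// Hypothetical stand-in for a validated pair of offsets.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct OffsetPair {
    index1: u8,
    index2: u8,
}

/// Returns a pair only if the offsets differ and both are in bounds for
/// `needle`. No ordering between the two offsets is required.
fn checked_pair(needle: &[u8], index1: u8, index2: u8) -> Option<OffsetPair> {
    if index1 == index2 {
        return None;
    }
    if usize::from(index1) >= needle.len() || usize::from(index2) >= needle.len() {
        return None;
    }
    Some(OffsetPair { index1, index2 })
}

fn main() {
    // index1 > index2 is allowed.
    println!("{:?}", checked_pair(b"ab", 1, 0));
    // Equal offsets are rejected.
    println!("{:?}", checked_pair(b"ab", 0, 0));
    // A needle of length 1 has no two distinct valid offsets.
    println!("{:?}", checked_pair(b"a", 0, 1));
}
```

Two distinct in-bounds offsets require at least two bytes, so no needle shorter than 2 can pass both checks.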

Implementations

impl Pair

fn new(needle: &[u8]) -> Option<Pair>

Create a new pair of offsets from the given needle.

If a pair could not be created (for example, if the needle is too short), then None is returned.

This chooses the pair in the needle that is believed to be as predictive of an overall match of the needle as possible.

fn with_ranker<R: HeuristicFrequencyRank>(needle: &[u8], ranker: R) -> Option<Pair>

Create a new pair of offsets from the given needle and ranker.

This permits the caller to choose a background frequency distribution with which bytes are selected. The idea is to select a pair of bytes that is believed to strongly predict a match in the haystack. This usually means selecting bytes that occur rarely in a haystack.
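Rank-based selection can be sketched as follows. This is not the library's algorithm or trait. `ByteRank`, `AsciiTextRank` and `rarest_pair` are hypothetical, and the sketch assumes that a lower rank means a byte is expected to occur less often.

```rust
/// Hypothetical ranking trait: a lower rank means the byte is assumed rarer.
trait ByteRank {
    fn rank(&self, byte: u8) -> u8;
}

/// A crude, made-up background distribution for English-like text.
struct AsciiTextRank;

impl ByteRank for AsciiTextRank {
    fn rank(&self, byte: u8) -> u8 {
        match byte {
            b' ' | b'e' | b't' | b'a' | b'o' => 255, // assumed very common
            b'a'..=b'z' => 128,                      // assumed moderately common
            _ => 0,                                  // assumed rare
        }
    }
}

/// Picks the offsets of the two lowest-ranked bytes among the first 256
/// bytes of the needle. Ties are broken in favor of earlier offsets.
fn rarest_pair<R: ByteRank>(needle: &[u8], ranker: R) -> Option<(u8, u8)> {
    if needle.len() < 2 {
        return None;
    }
    let limit = needle.len().min(256);
    // Every value in 0..limit fits in a u8 because limit <= 256.
    let mut offsets: Vec<u8> = (0..limit).map(|i| i as u8).collect();
    // Stable sort: offsets with equal rank keep their original order.
    offsets.sort_by_key(|&i| ranker.rank(needle[usize::from(i)]));
    Some((offsets[0], offsets[1]))
}

fn main() {
    // '-' (offset 3) and '!' (offset 8) are the rarest bytes under this ranker.
    println!("{:?}", rarest_pair(b"tea-time!", AsciiTextRank));
}
```

A different ranker, for example one tuned for binary data, would select different offsets for the same needle.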

If a pair could not be created (for example, if the needle is too short), then None is returned.

fn with_indices(needle: &[u8], index1: u8, index2: u8) -> Option<Pair>

Create a new pair from the given needle and offsets.

This bypasses any sort of heuristic process for choosing the offsets and permits the caller to choose the offsets themselves.

Indices are limited to valid u8 values so that a Pair uses less memory. It is not possible to create a Pair with offsets bigger than u8::MAX. It's likely that such a thing is not needed, but if it is, it's suggested to build your own bespoke algorithm because you're likely working on a very niche case. (File an issue if this suggestion does not make sense to you.)

If a pair could not be created (for example, if the needle is too short), then None is returned.

fn index1(self: &Self) -> u8

Returns the first offset of the pair.

fn index2(self: &Self) -> u8

Returns the second offset of the pair.

impl Clone for Pair

fn clone(self: &Self) -> Pair

impl Copy for Pair

impl Debug for Pair

fn fmt(self: &Self, f: &mut $crate::fmt::Formatter<'_>) -> $crate::fmt::Result

impl Freeze for Pair

impl RefUnwindSafe for Pair

impl Send for Pair

impl Sync for Pair

impl Unpin for Pair

impl UnwindSafe for Pair

impl<T> Any for Pair

fn type_id(self: &Self) -> TypeId

impl<T> Borrow for Pair

fn borrow(self: &Self) -> &T

impl<T> BorrowMut for Pair

fn borrow_mut(self: &mut Self) -> &mut T

impl<T> CloneToUninit for Pair

unsafe fn clone_to_uninit(self: &Self, dest: *mut u8)

impl<T> From for Pair

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> ToOwned for Pair

fn to_owned(self: &Self) -> T
fn clone_into(self: &Self, target: &mut T)

impl<T, U> Into for Pair

fn into(self: Self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of [From]<T> for U chooses to do.

impl<T, U> TryFrom for Pair

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto for Pair

fn try_into(self: Self) -> Result<U, <U as TryFrom<T>>::Error>