Struct Config

struct Config { ... }

The configuration used for building a bounded backtracker.

A bounded backtracker configuration is a simple data object that is typically used with Builder::configure.

Implementations

impl Config

fn new() -> Config

Return a new default bounded backtracker configuration.

fn prefilter(self: Self, pre: Option<Prefilter>) -> Config

Set a prefilter to be used whenever a start state is entered.

A Prefilter in this context is meant to accelerate searches by looking for literal prefixes that every match for the corresponding pattern (or patterns) must start with. Once a prefilter produces a candidate, the underlying search routine continues on to try to confirm the match.

Be warned that setting a prefilter does not guarantee that the search will be faster. While it's usually a good bet, if the prefilter produces a lot of false positive candidates (i.e., positions matched by the prefilter but not by the regex), then the overall result can be slower than if you had just executed the regex engine without any prefilters.

By default no prefilter is set.
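The candidate-then-confirm idea described above can be sketched with the standard library alone. This is a toy model, not the crate's implementation: `next_candidate`, `confirm` and `find_with_prefilter` are hypothetical names, the "engine" is hard-coded to the pattern `(foo|bar)[a-z]+`, and the haystack is assumed to be ASCII.

```rust
/// Toy stand-in for a prefilter (hypothetical helper): the earliest
/// position at or after `at` where any of the literals begins.
fn next_candidate(haystack: &str, at: usize, literals: &[&str]) -> Option<usize> {
    literals
        .iter()
        .filter_map(|&lit| haystack[at..].find(lit).map(|i| at + i))
        .min()
}

/// Toy stand-in for the regex engine, hard-coded to `(foo|bar)[a-z]+`
/// anchored at `start`. Returns the end offset of the match, if any.
fn confirm(haystack: &str, start: usize) -> Option<usize> {
    let rest = &haystack[start..];
    if !(rest.starts_with("foo") || rest.starts_with("bar")) {
        return None;
    }
    let letters = rest[3..]
        .bytes()
        .take_while(|b| b.is_ascii_lowercase())
        .count();
    if letters == 0 {
        None
    } else {
        Some(start + 3 + letters)
    }
}

/// Jump to each candidate reported by the prefilter and let the engine
/// confirm it. Returns the match span and the number of false positives.
/// Assumes an ASCII haystack so that `start + 1` is a valid boundary.
fn find_with_prefilter(
    haystack: &str,
    literals: &[&str],
) -> (Option<(usize, usize)>, usize) {
    let (mut at, mut false_positives) = (0, 0);
    while let Some(start) = next_candidate(haystack, at, literals) {
        if let Some(end) = confirm(haystack, start) {
            return (Some((start, end)), false_positives);
        }
        false_positives += 1;
        at = start + 1;
    }
    (None, false_positives)
}

fn main() {
    let haystack = "foo1 barfox bar";
    // "foo" at offset 0 is a false positive; "barfox" is confirmed.
    assert_eq!(
        find_with_prefilter(haystack, &["foo", "bar"]),
        (Some((5, 11)), 1)
    );
    // Literals that miss a required prefix hide the real match.
    assert_eq!(find_with_prefilter(haystack, &["foo", "car"]), (None, 1));
}
```

In this model, every false positive costs one wasted confirmation attempt, and a literal set that omits a required prefix causes the search to skip a real match entirely.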

Example

use regex_automata::{
    nfa::thompson::backtrack::BoundedBacktracker,
    util::prefilter::Prefilter,
    Input, Match, MatchKind,
};

let pre = Prefilter::new(MatchKind::LeftmostFirst, &["foo", "bar"]);
let re = BoundedBacktracker::builder()
    .configure(BoundedBacktracker::config().prefilter(pre))
    .build(r"(foo|bar)[a-z]+")?;
let mut cache = re.create_cache();
let input = Input::new("foo1 barfox bar");
assert_eq!(
    Some(Match::must(0, 5..11)),
    re.try_find(&mut cache, input)?,
);

# Ok::<(), Box<dyn std::error::Error>>(())

Be warned though that an incorrect prefilter can lead to incorrect results!

use regex_automata::{
    nfa::thompson::backtrack::BoundedBacktracker,
    util::prefilter::Prefilter,
    Input, MatchKind,
};

let pre = Prefilter::new(MatchKind::LeftmostFirst, &["foo", "car"]);
let re = BoundedBacktracker::builder()
    .configure(BoundedBacktracker::config().prefilter(pre))
    .build(r"(foo|bar)[a-z]+")?;
let mut cache = re.create_cache();
let input = Input::new("foo1 barfox bar");
// No match reported even though there clearly is one!
assert_eq!(None, re.try_find(&mut cache, input)?);

# Ok::<(), Box<dyn std::error::Error>>(())

fn visited_capacity(self: Self, capacity: usize) -> Config

Set the visited capacity used to bound backtracking.

The visited capacity represents the amount of heap memory (in bytes) to allocate toward tracking which parts of the backtracking search have been done before. The heap memory needed for any particular search is proportional to haystack.len() * nfa.states().len(), which can be quite large. Therefore, the bounded backtracker is typically only able to run on shorter haystacks.

For a given regex, increasing the visited capacity means that the maximum haystack length that can be searched is increased. The BoundedBacktracker::max_haystack_len method returns that maximum.
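The relationship between capacity, NFA size and maximum haystack length can be approximated with a little arithmetic. The sketch below is an estimate under stated assumptions, not the crate's actual computation: it assumes one bit is tracked per (NFA state, haystack position) pair and that a haystack of length n has n + 1 positions. The function name `estimated_max_haystack_len` and the state counts are hypothetical.

```rust
/// Rough estimate of the longest haystack a visited capacity allows.
///
/// Assumptions (not an API guarantee): one bit per
/// (NFA state, haystack position) pair, and `n + 1` positions for a
/// haystack of length `n`.
fn estimated_max_haystack_len(capacity_bytes: usize, nfa_states: usize) -> usize {
    if nfa_states == 0 {
        return 0;
    }
    // Convert the byte budget into a bit budget.
    let bits = capacity_bytes.saturating_mul(8);
    // Each haystack position consumes one bit per NFA state.
    (bits / nfa_states).saturating_sub(1)
}

fn main() {
    // Hypothetical NFA sizes, chosen only for illustration.
    let small_nfa = 50;
    let large_nfa = 300;

    // With the same byte budget, a larger NFA shortens the haystack limit.
    let budget = 256 * 1024;
    assert_eq!(estimated_max_haystack_len(budget, small_nfa), 41_942);
    assert_eq!(estimated_max_haystack_len(budget, large_nfa), 6_989);

    // Quadrupling the budget roughly quadruples the limit.
    assert_eq!(estimated_max_haystack_len(1 << 20, large_nfa), 27_961);
}
```

For real regexes, call BoundedBacktracker::max_haystack_len rather than relying on an estimate like this one.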

The default capacity is a reasonable but empirically chosen size.

Example

As with other regex engines, Unicode is what tends to make the bounded backtracker less useful by making the maximum haystack length quite small. If necessary, increasing the visited capacity using this routine will increase the maximum haystack length at the cost of using more memory.

Note though that the specific maximum values here are not an API guarantee. The default visited capacity is subject to change and not covered by semver.

# if cfg!(miri) { return Ok(()); } // miri takes too long
use regex_automata::nfa::thompson::backtrack::BoundedBacktracker;

// Unicode inflates the size of the underlying NFA quite a bit, and
// thus means that the backtracker can only handle smaller haystacks,
// assuming that the visited capacity remains unchanged.
let re = BoundedBacktracker::new(r"\w+")?;
assert!(re.max_haystack_len() <= 7_000);
// But we can increase the visited capacity to handle bigger haystacks!
let re = BoundedBacktracker::builder()
    .configure(BoundedBacktracker::config().visited_capacity(1<<20))
    .build(r"\w+")?;
assert!(re.max_haystack_len() >= 25_000);
assert!(re.max_haystack_len() <= 28_000);
# Ok::<(), Box<dyn std::error::Error>>(())

fn get_prefilter(self: &Self) -> Option<&Prefilter>

Returns the prefilter set in this configuration, if any.

fn get_visited_capacity(self: &Self) -> usize

Returns the configured visited capacity.

Note that the actual capacity used may be slightly bigger than the configured capacity.

impl Clone for Config

fn clone(self: &Self) -> Config

impl Debug for Config

fn fmt(self: &Self, f: &mut Formatter<'_>) -> Result

impl Default for Config

fn default() -> Config

impl Freeze for Config

impl RefUnwindSafe for Config

impl Send for Config

impl Sync for Config

impl Unpin for Config

impl UnsafeUnpin for Config

impl UnwindSafe for Config

impl<T> Any for Config

fn type_id(self: &Self) -> TypeId

impl<T> Borrow for Config

fn borrow(self: &Self) -> &T

impl<T> BorrowMut for Config

fn borrow_mut(self: &mut Self) -> &mut T

impl<T> CloneToUninit for Config

unsafe fn clone_to_uninit(self: &Self, dest: *mut u8)

impl<T> From for Config

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> ToOwned for Config

fn to_owned(self: &Self) -> T
fn clone_into(self: &Self, target: &mut T)

impl<T, U> Into for Config

fn into(self: Self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom for Config

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto for Config

fn try_into(self: Self) -> Result<U, <U as TryFrom<T>>::Error>