Struct LookMatcher

struct LookMatcher { ... }

A matcher for look-around assertions.

This matcher permits configuring aspects of how look-around assertions are matched.

Example

A LookMatcher can change the line terminator used for matching multi-line anchors such as (?m:^) and (?m:$).

use regex_automata::{
    nfa::thompson::{self, pikevm::PikeVM},
    util::look::LookMatcher,
    Match, Input,
};

let mut lookm = LookMatcher::new();
lookm.set_line_terminator(b'\x00');

let re = PikeVM::builder()
    .thompson(thompson::Config::new().look_matcher(lookm))
    .build(r"(?m)^[a-z]+$")?;
let mut cache = re.create_cache();

// Multi-line assertions now use NUL as a terminator.
assert_eq!(
    Some(Match::must(0, 1..4)),
    re.find(&mut cache, b"\x00abc\x00"),
);
// ... and \n is no longer recognized as a terminator.
assert_eq!(
    None,
    re.find(&mut cache, b"\nabc\n"),
);

Ok::<(), Box<dyn std::error::Error>>(())

Implementations

impl LookMatcher

fn new() -> LookMatcher

Creates a new default matcher for look-around assertions.

fn set_line_terminator(self: &mut Self, byte: u8) -> &mut LookMatcher

Sets the line terminator for use with (?m:^) and (?m:$).

Namely, instead of ^ matching after \n and $ matching immediately before a \n, this causes ^ to match immediately after the given byte and $ to match immediately before it.

Setting the line terminator to the NUL byte can occasionally be useful when searching binary data.

Note that this does not apply to CRLF-aware line anchors such as (?Rm:^) and (?Rm:$). CRLF-aware line anchors are hard-coded to use \r and \n.
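As a rough illustration of the rule described above, the following sketch models (?m:^) and (?m:$) with a configurable terminator byte using only the standard library. The function names model_start_lf and model_end_lf are hypothetical and are not part of regex-automata; this is a model of the documented behavior under that assumption, not the crate's implementation.

```rust
// Illustrative model only; these helpers are hypothetical and are not
// part of regex-automata.

/// Models `(?m:^)`: true at the start of the haystack or immediately
/// after the configured terminator byte.
fn model_start_lf(haystack: &[u8], at: usize, terminator: u8) -> bool {
    at == 0 || haystack[at - 1] == terminator
}

/// Models `(?m:$)`: true at the end of the haystack or immediately
/// before the configured terminator byte.
fn model_end_lf(haystack: &[u8], at: usize, terminator: u8) -> bool {
    at == haystack.len() || haystack[at] == terminator
}
```

With NUL as the terminator, position 1 in b"\x00abc\x00" counts as a line start, while the same position in b"\nabc\n" does not, which is consistent with the example at the top of this page.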

fn get_line_terminator(self: &Self) -> u8

Returns the line terminator that was configured for this matcher.

If no line terminator was configured, then this returns \n.

Note that the line terminator should only be used for matching (?m:^) and (?m:$) assertions. It specifically should not be used for matching the CRLF-aware assertions (?Rm:^) and (?Rm:$).

fn matches(self: &Self, look: Look, haystack: &[u8], at: usize) -> bool

Returns true when the position at in haystack satisfies the given look-around assertion.

Panics

This panics when the given assertion is a Unicode word boundary assertion and the Unicode word data is not available. Specifically, this only occurs when the unicode-word-boundary feature is not enabled.

Since it's generally expected that this routine is called inside of a matching engine, callers should check the error condition when building the matching engine. If there is a Unicode word boundary in the matcher and the data isn't available, then the matcher should fail to build.

Callers can check the error condition with LookSet::available.

This also may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.

fn matches_set(self: &Self, set: LookSet, haystack: &[u8], at: usize) -> bool

Returns true when all of the assertions in the given set match at the given position in the haystack.

Panics

This panics when testing any Unicode word boundary assertion in this set and when the Unicode word data is not available. Specifically, this only occurs when the unicode-word-boundary feature is not enabled.

Since it's generally expected that this routine is called inside of a matching engine, callers should check the error condition when building the matching engine. If there is a Unicode word boundary in the matcher and the data isn't available, then the matcher should fail to build.

Callers can check the error condition with LookSet::available.

This also may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.

fn is_start(self: &Self, _haystack: &[u8], at: usize) -> bool

Returns true when Look::Start is satisfied at the given position in haystack.

Panics

This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.

fn is_end(self: &Self, haystack: &[u8], at: usize) -> bool

Returns true when Look::End is satisfied at the given position in haystack.

Panics

This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.

fn is_start_lf(self: &Self, haystack: &[u8], at: usize) -> bool

Returns true when Look::StartLF is satisfied at the given position in haystack.

Panics

This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.

fn is_end_lf(self: &Self, haystack: &[u8], at: usize) -> bool

Returns true when Look::EndLF is satisfied at the given position in haystack.

Panics

This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.

fn is_start_crlf(self: &Self, haystack: &[u8], at: usize) -> bool

Returns true when Look::StartCRLF is satisfied at the given position in haystack.

Panics

This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.

fn is_end_crlf(self: &Self, haystack: &[u8], at: usize) -> bool

Returns true when Look::EndCRLF is satisfied at the given position in haystack.

Panics

This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
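The CRLF-aware anchors treat both \r and \n as line terminators, with the extra rule that they never match between the \r and the \n of a \r\n pair. The sketch below models that behavior with the standard library only. The names model_start_crlf and model_end_crlf are hypothetical, and the code is an assumption-labeled model rather than the crate's implementation.

```rust
// Illustrative model only; these helpers are hypothetical and are not
// part of regex-automata.

/// Models `(?Rm:^)`: true at the start of the haystack, after `\n`, or
/// after a `\r` that is not immediately followed by `\n`.
fn model_start_crlf(haystack: &[u8], at: usize) -> bool {
    at == 0
        || haystack[at - 1] == b'\n'
        || (haystack[at - 1] == b'\r'
            && (at >= haystack.len() || haystack[at] != b'\n'))
}

/// Models `(?Rm:$)`: true at the end of the haystack, before `\r`, or
/// before a `\n` that is not immediately preceded by `\r`.
fn model_end_crlf(haystack: &[u8], at: usize) -> bool {
    at == haystack.len()
        || haystack[at] == b'\r'
        || (haystack[at] == b'\n' && (at == 0 || haystack[at - 1] != b'\r'))
}
```

For the haystack b"a\r\nb", position 2 sits between \r and \n, so neither model function returns true there.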

fn is_word_ascii(self: &Self, haystack: &[u8], at: usize) -> bool

Returns true when Look::WordAscii is satisfied at the given position in haystack.

Panics

This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.

fn is_word_ascii_negate(self: &Self, haystack: &[u8], at: usize) -> bool

Returns true when Look::WordAsciiNegate is satisfied at the given position in haystack.

Panics

This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
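Look::WordAscii corresponds to an ASCII-only \b and Look::WordAsciiNegate to an ASCII-only \B. A minimal standard-library sketch of that semantics follows, assuming the usual ASCII word class [0-9A-Za-z_]. The helpers is_word_byte, model_word_ascii and model_word_ascii_negate are hypothetical names used for illustration and are not the crate's API.

```rust
// Illustrative model only; these helpers are hypothetical and are not
// part of regex-automata.

/// ASCII word bytes: [0-9A-Za-z_].
fn is_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Models an ASCII `\b`: exactly one side of the position is a word byte.
/// The edges of the haystack count as non-word.
fn model_word_ascii(haystack: &[u8], at: usize) -> bool {
    let before = at > 0 && is_word_byte(haystack[at - 1]);
    let after = at < haystack.len() && is_word_byte(haystack[at]);
    before != after
}

/// Models an ASCII `\B`: the negation of the boundary test above.
fn model_word_ascii_negate(haystack: &[u8], at: usize) -> bool {
    !model_word_ascii(haystack, at)
}
```

Note that at == haystack.len() is handled without indexing past the end, mirroring the guarantee stated in the Panics sections.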

fn is_word_unicode(self: &Self, haystack: &[u8], at: usize) -> Result<bool, UnicodeWordBoundaryError>

Returns true when Look::WordUnicode is satisfied at the given position in haystack.

Panics

This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.

Errors

This returns an error when Unicode word boundary tables are not available. Specifically, this only occurs when the unicode-word-boundary feature is not enabled.

fn is_word_unicode_negate(self: &Self, haystack: &[u8], at: usize) -> Result<bool, UnicodeWordBoundaryError>

Returns true when Look::WordUnicodeNegate is satisfied at the given position in haystack.

Panics

This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.

Errors

This returns an error when Unicode word boundary tables are not available. Specifically, this only occurs when the unicode-word-boundary feature is not enabled.

fn is_word_start_ascii(self: &Self, haystack: &[u8], at: usize) -> bool

Returns true when Look::WordStartAscii is satisfied at the given position in haystack.

Panics

This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.

fn is_word_end_ascii(self: &Self, haystack: &[u8], at: usize) -> bool

Returns true when Look::WordEndAscii is satisfied at the given position in haystack.

Panics

This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
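The start and end variants refine the plain word boundary by fixing which side must be the word byte. The following sketch models that for ASCII, again assuming the word class [0-9A-Za-z_]. The names ascii_word_sides, model_word_start_ascii and model_word_end_ascii are hypothetical and only illustrate the semantics.

```rust
// Illustrative model only; these helpers are hypothetical and are not
// part of regex-automata.

/// Returns (word byte before `at`, word byte at `at`). The edges of the
/// haystack count as non-word.
fn ascii_word_sides(haystack: &[u8], at: usize) -> (bool, bool) {
    let is_word = |b: u8| b.is_ascii_alphanumeric() || b == b'_';
    let before = at > 0 && is_word(haystack[at - 1]);
    let after = at < haystack.len() && is_word(haystack[at]);
    (before, after)
}

/// Models the start-of-word assertion: non-word on the left, word on the right.
fn model_word_start_ascii(haystack: &[u8], at: usize) -> bool {
    let (before, after) = ascii_word_sides(haystack, at);
    !before && after
}

/// Models the end-of-word assertion: word on the left, non-word on the right.
fn model_word_end_ascii(haystack: &[u8], at: usize) -> bool {
    let (before, after) = ascii_word_sides(haystack, at);
    before && !after
}
```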

fn is_word_start_unicode(self: &Self, haystack: &[u8], at: usize) -> Result<bool, UnicodeWordBoundaryError>

Returns true when Look::WordStartUnicode is satisfied at the given position in haystack.

Panics

This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.

Errors

This returns an error when Unicode word boundary tables are not available. Specifically, this only occurs when the unicode-word-boundary feature is not enabled.

fn is_word_end_unicode(self: &Self, haystack: &[u8], at: usize) -> Result<bool, UnicodeWordBoundaryError>

Returns true when Look::WordEndUnicode is satisfied at the given position in haystack.

Panics

This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.

Errors

This returns an error when Unicode word boundary tables are not available. Specifically, this only occurs when the unicode-word-boundary feature is not enabled.

fn is_word_start_half_ascii(self: &Self, haystack: &[u8], at: usize) -> bool

Returns true when Look::WordStartHalfAscii is satisfied at the given position in haystack.

Panics

This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.

fn is_word_end_half_ascii(self: &Self, haystack: &[u8], at: usize) -> bool

Returns true when Look::WordEndHalfAscii is satisfied at the given position in haystack.

Panics

This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.
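The half variants inspect only one side of the position: the start half requires a non-word byte (or the haystack edge) on the left, and the end half requires the same on the right. A small sketch under the same ASCII word class assumption is shown here. The names is_ascii_word, model_word_start_half_ascii and model_word_end_half_ascii are hypothetical and not part of the crate.

```rust
// Illustrative model only; these helpers are hypothetical and are not
// part of regex-automata.

/// ASCII word bytes: [0-9A-Za-z_].
fn is_ascii_word(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Models the start-half assertion: only the left side is inspected, and
/// it must not be a word byte.
fn model_word_start_half_ascii(haystack: &[u8], at: usize) -> bool {
    !(at > 0 && is_ascii_word(haystack[at - 1]))
}

/// Models the end-half assertion: only the right side is inspected, and
/// it must not be a word byte.
fn model_word_end_half_ascii(haystack: &[u8], at: usize) -> bool {
    !(at < haystack.len() && is_ascii_word(haystack[at]))
}
```

Unlike the full start and end assertions, both half assertions can hold at a position surrounded by non-word bytes, such as position 2 in b"a  b".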

fn is_word_start_half_unicode(self: &Self, haystack: &[u8], at: usize) -> Result<bool, UnicodeWordBoundaryError>

Returns true when Look::WordStartHalfUnicode is satisfied at the given position in haystack.

Panics

This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.

Errors

This returns an error when Unicode word boundary tables are not available. Specifically, this only occurs when the unicode-word-boundary feature is not enabled.

fn is_word_end_half_unicode(self: &Self, haystack: &[u8], at: usize) -> Result<bool, UnicodeWordBoundaryError>

Returns true when Look::WordEndHalfUnicode is satisfied at the given position in haystack.

Panics

This may panic when at > haystack.len(). Note that at == haystack.len() is legal and guaranteed not to panic.

Errors

This returns an error when Unicode word boundary tables are not available. Specifically, this only occurs when the unicode-word-boundary feature is not enabled.

impl Clone for LookMatcher

fn clone(self: &Self) -> LookMatcher

impl Debug for LookMatcher

fn fmt(self: &Self, f: &mut Formatter<'_>) -> Result

impl Default for LookMatcher

fn default() -> LookMatcher

impl Freeze for LookMatcher

impl RefUnwindSafe for LookMatcher

impl Send for LookMatcher

impl Sync for LookMatcher

impl Unpin for LookMatcher

impl UnsafeUnpin for LookMatcher

impl UnwindSafe for LookMatcher

impl<T> Any for LookMatcher

fn type_id(self: &Self) -> TypeId

impl<T> Borrow for LookMatcher

fn borrow(self: &Self) -> &T

impl<T> BorrowMut for LookMatcher

fn borrow_mut(self: &mut Self) -> &mut T

impl<T> CloneToUninit for LookMatcher

unsafe fn clone_to_uninit(self: &Self, dest: *mut u8)

impl<T> From for LookMatcher

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> ToOwned for LookMatcher

fn to_owned(self: &Self) -> T
fn clone_into(self: &Self, target: &mut T)

impl<T, U> Into for LookMatcher

fn into(self: Self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom for LookMatcher

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto for LookMatcher

fn try_into(self: Self) -> Result<U, <U as TryFrom<T>>::Error>