Struct Regex
struct Regex { ... }
A regular expression that uses hybrid NFA/DFAs (also called "lazy DFAs") for searching.
A regular expression is composed of two lazy DFAs, a "forward" DFA and a "reverse" DFA. The forward DFA is responsible for detecting the end of a match while the reverse DFA is responsible for detecting the start of a match. Thus, in order to find the bounds of any given match, a forward search must first be run, followed by a reverse search. A match found by the forward DFA guarantees that the reverse DFA will also find a match.
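To make the two-pass idea concrete, here is a minimal sketch that uses only the standard library. It is a toy model and not how the crate is implemented: the pattern `[0-9]+` is hard-coded, and the helpers `forward_end`, `reverse_start` and `find_bounds` are hypothetical stand-ins for the forward and reverse automata.

```rust
/// Toy "forward" pass for the hard-coded pattern `[0-9]+`: scan left to
/// right and report only the END offset of the leftmost match.
fn forward_end(haystack: &[u8]) -> Option<usize> {
    let mut end = None;
    for (i, &b) in haystack.iter().enumerate() {
        if b.is_ascii_digit() {
            // Still inside the match; remember the latest end offset.
            end = Some(i + 1);
        } else if end.is_some() {
            // We left the match, so no later byte can extend it.
            break;
        }
    }
    end
}

/// Toy "reverse" pass: anchored at `end`, scan right to left and report
/// the START offset of the match that ends there.
fn reverse_start(haystack: &[u8], end: usize) -> usize {
    let mut start = end;
    while start > 0 && haystack[start - 1].is_ascii_digit() {
        start -= 1;
    }
    start
}

/// Combine both passes to recover the full bounds of the leftmost match.
fn find_bounds(haystack: &[u8]) -> Option<(usize, usize)> {
    let end = forward_end(haystack)?;
    Some((reverse_start(haystack, end), end))
}

fn main() {
    assert_eq!(find_bounds(b"ab123cd45"), Some((2, 5)));
    assert_eq!(find_bounds(b"no digits"), None);
}
```

The forward pass alone cannot say where the match began, which is why the second pass is needed. The second pass can never fail once the first has succeeded, mirroring the guarantee stated above.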
Fallibility
Most of the search routines defined on this type will panic when the
underlying search fails. This might be because the DFA quit after it
saw a quit byte, whether configured explicitly or via heuristic Unicode
word boundary support, although neither is enabled by default. It might
also fail if the underlying DFA determines it isn't making effective use of
the cache (which also never happens by default). Or it might fail because
an invalid Input configuration is given, for example, with an unsupported
Anchored mode.
If you need to handle these error cases instead of allowing them to trigger
a panic, then the lower-level Regex::try_search provides a fallible API
that never panics.
Example
This example shows how to cause a search to terminate if it sees a
\n byte, and handle the error returned. This could be useful if, for
example, you wanted to prevent a user supplied pattern from matching
across a line boundary.
use regex_automata::{
    hybrid::{dfa, regex::Regex},
    Input, MatchError,
};
let re = Regex::builder()
    .dfa(dfa::Config::new().quit(b'\n', true))
    .build(r"foo\p{any}+bar")?;
let mut cache = re.create_cache();
let input = Input::new("foo\nbar");
// Normally this would produce a match, since \p{any} contains '\n'.
// But since we instructed the automaton to enter a quit state if a
// '\n' is observed, this produces a match error instead.
let expected = MatchError::quit(b'\n', 3);
let got = re.try_search(&mut cache, &input).unwrap_err();
assert_eq!(expected, got);
Implementations
impl Regex
fn new(pattern: &str) -> Result<Regex, BuildError>
Parse the given regular expression using the default configuration and return the corresponding regex.
If you want a non-default configuration, then use the Builder to set your own configuration.
Example
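use regex_automata::{hybrid::regex::Regex, Match};
let re = Regex::new("foo[0-9]+bar")?;
let mut cache = re.create_cache();
assert_eq!(
    Some(Match::must(0, 3..14)),
    re.find(&mut cache, "zzzfoo12345barzzz"),
);
fn new_many<P: AsRef<str>>(patterns: &[P]) -> Result<Regex, BuildError>
Like new, but parses multiple patterns into a single "multi regex." This similarly uses the default regex configuration.
Example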
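use regex_automata::{hybrid::regex::Regex, Match};
let re = Regex::new_many(&["[a-z]+", "[0-9]+"])?;
let mut cache = re.create_cache();
let mut it = re.find_iter(&mut cache, "abc 1 foo 4567 0 quux");
assert_eq!(Some(Match::must(0, 0..3)), it.next());
assert_eq!(Some(Match::must(1, 4..5)), it.next());
assert_eq!(Some(Match::must(0, 6..9)), it.next());
assert_eq!(Some(Match::must(1, 10..14)), it.next());
assert_eq!(Some(Match::must(1, 15..16)), it.next());
assert_eq!(Some(Match::must(0, 17..21)), it.next());
assert_eq!(None, it.next());
fn builder() -> Builder
Return a builder for configuring the construction of a Regex.
This is a convenience routine to avoid needing to import the Builder type in common cases.
Example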
This example shows how to use the builder to disable UTF-8 mode everywhere.
use regex_automata::{
    hybrid::regex::Regex, nfa::thompson, util::syntax, Match,
};
let re = Regex::builder()
    .syntax(syntax::Config::new().utf8(false))
    .thompson(thompson::Config::new().utf8(false))
    .build(r"foo(?-u:[^b])ar.*")?;
let mut cache = re.create_cache();
let haystack = b"\xFEfoo\xFFarzz\xE2\x98\xFF\n";
let expected = Some(Match::must(0, 1..9));
let got = re.find(&mut cache, haystack);
assert_eq!(expected, got);
fn create_cache(self: &Self) -> Cache
Create a new cache for this Regex.
The cache returned should only be used for searches for this Regex. If you want to reuse the cache for another Regex, then you must call Cache::reset with that Regex (or, equivalently, Regex::reset_cache).
fn reset_cache(self: &Self, cache: &mut Cache)
Reset the given cache such that it can be used for searching with this Regex (and only this Regex).
A cache reset permits reusing memory already allocated in this cache with a different Regex.
Resetting a cache sets its "clear count" to 0. This is relevant if the Regex has been configured to "give up" after it has cleared the cache a certain number of times.
Example
This shows how to re-purpose a cache for use with a different Regex.
use regex_automata::{hybrid::regex::Regex, Match};
let re1 = Regex::new(r"\w")?;
let re2 = Regex::new(r"\W")?;
let mut cache = re1.create_cache();
assert_eq!(
    Some(Match::must(0, 0..2)),
    re1.find(&mut cache, "Δ"),
);
// Using 'cache' with re2 is not allowed. It may result in panics or
// incorrect results. In order to re-purpose the cache, we must reset
// it with the Regex we'd like to use it with.
//
// Similarly, after this reset, using the cache with 're1' is also not
// allowed.
re2.reset_cache(&mut cache);
assert_eq!(
    Some(Match::must(0, 0..3)),
    re2.find(&mut cache, "☃"),
);
impl Regex
fn try_search(self: &Self, cache: &mut Cache, input: &Input<'_>) -> Result<Option<Match>, MatchError>
Returns the start and end offset of the leftmost match. If no match exists, then None is returned.
This is like Regex::find but with two differences:
- It is not generic over Into<Input> and instead accepts a &Input. This permits reusing the same Input for multiple searches without needing to create a new one. This may help with latency.
- It returns an error if the search could not complete, whereas Regex::find will panic.
Errors
This routine errors if the search could not complete. This can occur in a number of circumstances:
- The configuration of the lazy DFA may permit it to "quit" the search. For example, setting quit bytes or enabling heuristic support for Unicode word boundaries. The default configuration does not enable any option that could result in the lazy DFA quitting.
- The configuration of the lazy DFA may also permit it to "give up" on a search if it makes ineffective use of its transition table cache. The default configuration does not enable this, although it is typically a good idea to do so.
- When the provided Input configuration is not supported. For example, by providing an unsupported anchor mode.
When a search returns an error, callers cannot know whether a match exists or not.
impl Regex
fn forward(self: &Self) -> &DFA
Return the underlying lazy DFA responsible for forward matching.
This is useful for accessing the underlying lazy DFA and using it directly if the situation calls for it.
fn reverse(self: &Self) -> &DFA
Return the underlying lazy DFA responsible for reverse matching.
This is useful for accessing the underlying lazy DFA and using it directly if the situation calls for it.
fn pattern_len(self: &Self) -> usize
Returns the total number of patterns matched by this regex.
Example
use regex_automata::hybrid::regex::Regex;
let re = Regex::new_many(&[r"[a-z]+", r"[0-9]+", r"\w+"])?;
assert_eq!(3, re.pattern_len());
impl Regex
fn is_match<'h, I: Into<Input<'h>>>(self: &Self, cache: &mut Cache, input: I) -> bool
Returns true if and only if this regex matches the given haystack.
This routine may short circuit if it knows that scanning future input will never lead to a different result. In particular, if the underlying DFA enters a match state or a dead state, then this routine will return true or false, respectively, without inspecting any future input.
Panics
This routine panics if the search could not complete. This can occur in a number of circumstances:
- The configuration of the lazy DFA may permit it to "quit" the search. For example, setting quit bytes or enabling heuristic support for Unicode word boundaries. The default configuration does not enable any option that could result in the lazy DFA quitting.
- The configuration of the lazy DFA may also permit it to "give up" on a search if it makes ineffective use of its transition table cache. The default configuration does not enable this, although it is typically a good idea to do so.
- When the provided Input configuration is not supported. For example, by providing an unsupported anchor mode.
When a search panics, callers cannot know whether a match exists or not.
Use Regex::try_search if you want to handle these error conditions.
Example
use regex_automata::hybrid::regex::Regex;
let re = Regex::new("foo[0-9]+bar")?;
let mut cache = re.create_cache();
assert!(re.is_match(&mut cache, "foo12345bar"));
assert!(!re.is_match(&mut cache, "foobar"));
fn find<'h, I: Into<Input<'h>>>(self: &Self, cache: &mut Cache, input: I) -> Option<Match>
Returns the start and end offset of the leftmost match. If no match exists, then None is returned.
Panics
This routine panics if the search could not complete. This can occur in a number of circumstances:
- The configuration of the lazy DFA may permit it to "quit" the search. For example, setting quit bytes or enabling heuristic support for Unicode word boundaries. The default configuration does not enable any option that could result in the lazy DFA quitting.
- The configuration of the lazy DFA may also permit it to "give up" on a search if it makes ineffective use of its transition table cache. The default configuration does not enable this, although it is typically a good idea to do so.
- When the provided Input configuration is not supported. For example, by providing an unsupported anchor mode.
When a search panics, callers cannot know whether a match exists or not.
Use Regex::try_search if you want to handle these error conditions.
Example
use regex_automata::{hybrid::regex::Regex, Match};
let re = Regex::new("foo[0-9]+")?;
let mut cache = re.create_cache();
assert_eq!(
    Some(Match::must(0, 3..11)),
    re.find(&mut cache, "zzzfoo12345zzz"),
);
// Even though a match is found after reading the first byte (`a`),
// the default leftmost-first match semantics demand that we find the
// earliest match that prefers earlier parts of the pattern over later
// parts.
let re = Regex::new("abc|a")?;
let mut cache = re.create_cache();
assert_eq!(
    Some(Match::must(0, 0..3)),
    re.find(&mut cache, "abcd"),
);
fn find_iter<'r, 'c, 'h, I: Into<Input<'h>>>(self: &'r Self, cache: &'c mut Cache, input: I) -> FindMatches<'r, 'c, 'h>
Returns an iterator over all non-overlapping leftmost matches in the given bytes. If no match exists, then the iterator yields no elements.
Panics
This routine panics if the search could not complete. This can occur in a number of circumstances:
- The configuration of the lazy DFA may permit it to "quit" the search. For example, setting quit bytes or enabling heuristic support for Unicode word boundaries. The default configuration does not enable any option that could result in the lazy DFA quitting.
- The configuration of the lazy DFA may also permit it to "give up" on a search if it makes ineffective use of its transition table cache. The default configuration does not enable this, although it is typically a good idea to do so.
- When the provided Input configuration is not supported. For example, by providing an unsupported anchor mode.
When a search panics, callers cannot know whether a match exists or not.
The above conditions also apply to the iterator returned. For example, if the lazy DFA gives up or quits during a search using this method, then a panic will occur during iteration.
Use Regex::try_search with util::iter::Searcher if you want to handle these error conditions.
Example
use regex_automata::{hybrid::regex::Regex, Match};
let re = Regex::new("foo[0-9]+")?;
let mut cache = re.create_cache();
let text = "foo1 foo12 foo123";
let matches: Vec<Match> = re.find_iter(&mut cache, text).collect();
assert_eq!(matches, vec![
    Match::must(0, 0..4),
    Match::must(0, 5..10),
    Match::must(0, 11..17),
]);
impl Debug for Regex
fn fmt(self: &Self, f: &mut Formatter<'_>) -> Result
impl Freeze for Regex
impl RefUnwindSafe for Regex
impl Send for Regex
impl Sync for Regex
impl Unpin for Regex
impl UnsafeUnpin for Regex
impl UnwindSafe for Regex
impl<T> Any for Regex
fn type_id(self: &Self) -> TypeId
impl<T> Borrow for Regex
fn borrow(self: &Self) -> &T
impl<T> BorrowMut for Regex
fn borrow_mut(self: &mut Self) -> &mut T
impl<T> From for Regex
fn from(t: T) -> T
Returns the argument unchanged.
impl<T, U> Into for Regex
fn into(self: Self) -> U
Calls U::from(self).
That is, this conversion is whatever the implementation of From<T> for U chooses to do.
impl<T, U> TryFrom for Regex
fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>
impl<T, U> TryInto for Regex
fn try_into(self: Self) -> Result<U, <U as TryFrom<T>>::Error>