Struct Properties
struct Properties(_)
A type that collects various properties of an HIR value.
Properties are always scalar values and represent metadata that is computed inductively on an HIR value. Properties are defined for all HIR values.
All methods on a Properties value take constant time and are meant to
be cheap to call.
Implementations
impl Properties
fn minimum_len(self: &Self) -> Option<usize>
Returns the length (in bytes) of the smallest string matched by this HIR.
A return value of 0 is possible and occurs when the HIR can match an empty string.
None is returned when there is no minimum length. This occurs in precisely the cases where the HIR matches nothing, i.e., the language the regex matches is empty. An example of such a regex is \P{any}.
fn maximum_len(self: &Self) -> Option<usize>
Returns the length (in bytes) of the longest string matched by this HIR.
A return value of 0 is possible and occurs when nothing longer than the empty string is in the language described by this HIR.
None is returned when there is no longest matching string. This occurs when the HIR matches nothing or when there is no upper bound on the length of matching strings. Examples of such regexes are \P{any} (matches nothing) and a+ (has no upper bound).
fn look_set(self: &Self) -> LookSet
Returns a set of all look-around assertions that appear at least once in this HIR value.
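The conventions for minimum_len and maximum_len described above (0 for the empty string, None for an empty language or a missing upper bound) can be illustrated with a small standalone model. This is only a sketch: MiniHir, len_bounds and demo are hypothetical names invented here, not part of the library, and the rules are a simplification that assumes nothing about how the library computes its values internally. The sketch also recomputes the bounds on every call, whereas the constant-time guarantee stated earlier suggests a real implementation stores them.

```rust
/// A hypothetical miniature HIR, defined only for this sketch.
enum MiniHir {
    /// Matches nothing, like \P{any}.
    Fail,
    /// Matches exactly these bytes.
    Literal(Vec<u8>),
    /// Matches every sub-expression in order; an empty list matches "".
    Concat(Vec<MiniHir>),
    /// Matches the sub-expression one or more times, like a+.
    Plus(Box<MiniHir>),
}

/// Returns (minimum length, maximum length) in bytes for a MiniHir.
fn len_bounds(hir: &MiniHir) -> (Option<usize>, Option<usize>) {
    match hir {
        // The language is empty, so neither bound exists.
        MiniHir::Fail => (None, None),
        // A literal matches exactly one string.
        MiniHir::Literal(bytes) => (Some(bytes.len()), Some(bytes.len())),
        // Bounds add up; a missing bound in any part removes it overall.
        MiniHir::Concat(parts) => {
            let mut min = Some(0usize);
            let mut max = Some(0usize);
            for part in parts {
                let (pmin, pmax) = len_bounds(part);
                min = min.zip(pmin).map(|(a, b)| a.saturating_add(b));
                max = max.zip(pmax).and_then(|(a, b)| a.checked_add(b));
            }
            (min, max)
        }
        // One repetition gives the minimum; the sketch reports no maximum.
        MiniHir::Plus(inner) => (len_bounds(inner).0, None),
    }
}

/// Exercises the sketch on the three situations named in the text.
fn demo() {
    // Empty expression: the empty string is the only match.
    assert_eq!(len_bounds(&MiniHir::Concat(vec![])), (Some(0), Some(0)));
    // a+ has a minimum length but no maximum length.
    let a_plus = MiniHir::Plus(Box::new(MiniHir::Literal(b"a".to_vec())));
    assert_eq!(len_bounds(&a_plus), (Some(1), None));
    // Nothing matches, so both bounds are None.
    assert_eq!(len_bounds(&MiniHir::Fail), (None, None));
}
```

Calling demo() runs the three checks. Each bound is derived only from the bounds of the sub-expressions, which is what makes the computation inductive.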
fn look_set_prefix(self: &Self) -> LookSet
Returns a set of all look-around assertions that appear as a prefix for this HIR value. That is, the set returned corresponds to the set of assertions that must be passed before matching any bytes in a haystack.
For example, hir.look_set_prefix().contains(Look::Start) returns true if and only if the HIR is fully anchored at the start.
fn look_set_prefix_any(self: &Self) -> LookSet
Returns a set of all look-around assertions that appear as a possible prefix for this HIR value. That is, the set returned corresponds to the set of assertions that may be passed before matching any bytes in a haystack.
For example, hir.look_set_prefix_any().contains(Look::Start) returns true if and only if it's possible for the regex to match through an anchored assertion before consuming any input.
fn look_set_suffix(self: &Self) -> LookSet
Returns a set of all look-around assertions that appear as a suffix for this HIR value. That is, the set returned corresponds to the set of assertions that must be passed in order to be considered a match after all other consuming HIR expressions.
For example, hir.look_set_suffix().contains(Look::End) returns true if and only if the HIR is fully anchored at the end.
fn look_set_suffix_any(self: &Self) -> LookSet
Returns a set of all look-around assertions that appear as a possible suffix for this HIR value. That is, the set returned corresponds to the set of assertions that may be passed in order to be considered a match after all other consuming HIR expressions.
For example, hir.look_set_suffix_any().contains(Look::End) returns true if and only if it's possible for the regex to match through an anchored assertion at the end of a match without consuming any input.
fn is_utf8(self: &Self) -> bool
Returns true if and only if the corresponding HIR will always match valid UTF-8.
When this returns false, then it is possible for this HIR expression to match invalid UTF-8, including by matching between the code units of a single UTF-8 encoded codepoint.
Note that this returns true even when the corresponding HIR can match the empty string. Since an empty string can technically appear between UTF-8 code units, it is possible for a match to be reported that splits a codepoint which could in turn be considered matching invalid UTF-8. However, it is generally assumed that such empty matches are handled specially by the search routine if it is absolutely required that matches not split a codepoint.
Example
This code example shows the UTF-8 property of a variety of patterns.
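    use regex_syntax::{ParserBuilder, parse};

    // Examples of 'is_utf8() == true'.
    assert!(parse(r"a")?.properties().is_utf8());
    assert!(parse(r"[^a]")?.properties().is_utf8());
    assert!(parse(r".")?.properties().is_utf8());
    assert!(parse(r"\W")?.properties().is_utf8());
    assert!(parse(r"\b")?.properties().is_utf8());
    assert!(parse(r"\B")?.properties().is_utf8());
    assert!(parse(r"(?-u)\b")?.properties().is_utf8());
    assert!(parse(r"(?-u)\B")?.properties().is_utf8());
    // Unicode mode is enabled by default, and in
    // that mode, all \x hex escapes are treated as
    // codepoints. So this actually matches the UTF-8
    // encoding of U+00FF.
    assert!(parse(r"\xFF")?.properties().is_utf8());

    // Now we show examples of 'is_utf8() == false'.
    // The only way to do this is to force the parser
    // to permit invalid UTF-8, otherwise all of these
    // would fail to parse!
    let parse = |pattern| {
        ParserBuilder::new().utf8(false).build().parse(pattern)
    };
    assert!(!parse(r"(?-u)[^a]")?.properties().is_utf8());
    assert!(!parse(r"(?-u).")?.properties().is_utf8());
    assert!(!parse(r"(?-u)\W")?.properties().is_utf8());
    // Conversely to the equivalent example above,
    // when Unicode mode is disabled, \x hex escapes
    // are treated as their raw byte values.
    assert!(!parse(r"(?-u)\xFF")?.properties().is_utf8());
    // Note that just because we disabled UTF-8 in the
    // parser doesn't mean we still can't use Unicode.
    // It is enabled by default, so \xFF is still
    // equivalent to matching the UTF-8 encoding of
    // U+00FF by default.
    assert!(parse(r"\xFF")?.properties().is_utf8());
    // Even though we use raw bytes that individually
    // are not valid UTF-8, when combined together, the
    // overall expression *does* match valid UTF-8!
    assert!(parse(r"(?-u)\xE2\x98\x83")?.properties().is_utf8());

    # Ok::<(), Box<dyn std::error::Error>>(())
fn explicit_captures_len(self: &Self) -> usize
Returns the total number of explicit capturing groups in the corresponding HIR.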
Note that this does not include the implicit capturing group corresponding to the entire match that is typically included by regex engines.
Example
This method will return 0 for a and 1 for (a):
    use regex_syntax::parse;

    assert_eq!(0, parse("a")?.properties().explicit_captures_len());
    assert_eq!(1, parse("(a)")?.properties().explicit_captures_len());

    # Ok::<(), Box<dyn std::error::Error>>(())
fn static_explicit_captures_len(self: &Self) -> Option<usize>
Returns the total number of explicit capturing groups that appear in every possible match.
If the number of capture groups can vary depending on the match, then this returns None. That is, a value is only returned when the number of matching groups is invariant or "static."
Note that this does not include the implicit capturing group corresponding to the entire match.
Example
This shows a few cases where a static number of capture groups is available and a few cases where it is not.
    use regex_syntax::parse;

    // Parse a pattern and report its static capture group count.
    let len = |pattern| {
        parse(pattern).map(|h| {
            h.properties().static_explicit_captures_len()
        })
    };

    assert_eq!(Some(0), len("a")?);
    assert_eq!(Some(1), len("(a)")?);
    assert_eq!(Some(1), len("(a)|(b)")?);
    assert_eq!(Some(2), len("(a)(b)|(c)(d)")?);
    assert_eq!(None, len("(a)|b")?);
    assert_eq!(None, len("a|(b)")?);
    assert_eq!(None, len("(b)*")?);
    assert_eq!(Some(1), len("(b)+")?);

    # Ok::<(), Box<dyn std::error::Error>>(())
fn is_literal(self: &Self) -> bool
Returns true if and only if this HIR is a simple literal. This is only true when this HIR expression is either itself a Literal or a concatenation of only Literals.
For example, f and foo are literals, but f+, (foo), foo() and the empty string are not (even though they contain sub-expressions that are literals).
fn is_alternation_literal(self: &Self) -> bool
Returns true if and only if this HIR is either a simple literal or an alternation of simple literals. This is only true when this HIR expression is either itself a Literal, a concatenation of only Literals, or an alternation of only Literals.
For example, f, foo, a|b|c, and foo|bar|baz are alternation literals, but f+, (foo), foo(), and the empty pattern are not (even though they contain sub-expressions that are literals).
fn memory_usage(self: &Self) -> usize
Returns the total amount of heap memory usage, in bytes, used by this Properties value.
fn union<I, P>(props: I) -> Properties where I: IntoIterator<Item = P>, P: Borrow<Properties>
Returns a new set of properties that corresponds to the union of the iterator of properties given.
This is useful when one has multiple Hir expressions and wants to combine them into a single alternation without constructing the corresponding Hir. This routine provides a way of combining the properties of each Hir expression into one set of properties representing the union of those expressions.
Example: union with HIRs that never match
This example shows that unioning properties together with one that represents a regex that never matches will "poison" certain attributes, like the minimum and maximum lengths.
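    use regex_syntax::{hir::Properties, parse};

    let hir1 = parse("ab?c?")?;
    assert_eq!(Some(1), hir1.properties().minimum_len());
    assert_eq!(Some(3), hir1.properties().maximum_len());

    // An empty character class never matches.
    let hir2 = parse(r"[a&&b]")?;
    assert_eq!(None, hir2.properties().minimum_len());
    assert_eq!(None, hir2.properties().maximum_len());

    let hir3 = parse(r"wxy?z?")?;
    assert_eq!(Some(2), hir3.properties().minimum_len());
    assert_eq!(Some(4), hir3.properties().maximum_len());

    let unioned = Properties::union([
        hir1.properties(),
        hir2.properties(),
        hir3.properties(),
    ]);
    assert_eq!(None, unioned.minimum_len());
    assert_eq!(None, unioned.maximum_len());

    # Ok::<(), Box<dyn std::error::Error>>(())
The maximum length can also be "poisoned" by a pattern that has no upper bound on the length of a match. The minimum length remains unaffected:
    use regex_syntax::{hir::Properties, parse};

    let hir1 = parse("ab?c?")?;
    assert_eq!(Some(1), hir1.properties().minimum_len());
    assert_eq!(Some(3), hir1.properties().maximum_len());

    // a+ has a minimum length but no maximum length.
    let hir2 = parse(r"a+")?;
    assert_eq!(Some(1), hir2.properties().minimum_len());
    assert_eq!(None, hir2.properties().maximum_len());

    let hir3 = parse(r"wxy?z?")?;
    assert_eq!(Some(2), hir3.properties().minimum_len());
    assert_eq!(Some(4), hir3.properties().maximum_len());

    let unioned = Properties::union([
        hir1.properties(),
        hir2.properties(),
        hir3.properties(),
    ]);
    assert_eq!(Some(1), unioned.minimum_len());
    assert_eq!(None, unioned.maximum_len());

    # Ok::<(), Box<dyn std::error::Error>>(())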
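The "poisoning" behaviour described for union can be modelled on the two length properties alone. The sketch below is an assumption-laden illustration, not the library's code: LenProps and union_lens are hypothetical names, the rule is inferred only from the behaviour documented above, and the result for an empty input is a choice made for this sketch.

```rust
/// Hypothetical stand-in for the two length properties of a Properties value.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct LenProps {
    minimum_len: Option<usize>,
    maximum_len: Option<usize>,
}

/// Combines length bounds so that a None on any input "poisons" that
/// bound for the whole union; otherwise the smallest minimum and the
/// largest maximum win.
fn union_lens(props: &[LenProps]) -> LenProps {
    let (first, rest) = match props.split_first() {
        Some(pair) => pair,
        // Assumption of this sketch: an empty union matches nothing.
        None => return LenProps { minimum_len: None, maximum_len: None },
    };
    rest.iter().fold(*first, |acc, p| LenProps {
        // zip yields None as soon as either side is None.
        minimum_len: acc.minimum_len.zip(p.minimum_len).map(|(a, b)| a.min(b)),
        maximum_len: acc.maximum_len.zip(p.maximum_len).map(|(a, b)| a.max(b)),
    })
}
```

With inputs whose bounds are (1, 3), (None, None) and (2, 4), both bounds of the result are None. Replacing the middle input with one that has a minimum of 1 and no maximum, as for a+, leaves the minimum at 1 and only the maximum becomes None.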
impl Clone for Properties
fn clone(self: &Self) -> Properties
impl Debug for Properties
fn fmt(self: &Self, f: &mut Formatter<'_>) -> Result
impl Eq for Properties
impl Freeze for Properties
impl PartialEq for Properties
fn eq(self: &Self, other: &Properties) -> bool
impl RefUnwindSafe for Properties
impl Send for Properties
impl StructuralPartialEq for Properties
impl Sync for Properties
impl Unpin for Properties
impl UnsafeUnpin for Properties
impl UnwindSafe for Properties
impl<T> Any for Properties
fn type_id(self: &Self) -> TypeId
impl<T> Borrow for Properties
fn borrow(self: &Self) -> &T
impl<T> BorrowMut for Properties
fn borrow_mut(self: &mut Self) -> &mut T
impl<T> CloneToUninit for Properties
unsafe fn clone_to_uninit(self: &Self, dest: *mut u8)
impl<T> From for Properties
fn from(t: T) -> T
Returns the argument unchanged.
impl<T> ToOwned for Properties
fn to_owned(self: &Self) -> T
fn clone_into(self: &Self, target: &mut T)
impl<T, U> Into for Properties
fn into(self: Self) -> U
Calls U::from(self).
That is, this conversion is whatever the implementation of From<T> for U chooses to do.
impl<T, U> TryFrom for Properties
fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>
impl<T, U> TryInto for Properties
fn try_into(self: Self) -> Result<U, <U as TryFrom<T>>::Error>