Struct CodePointSetData

struct CodePointSetData { ... }

A set of Unicode code points. Access its data via the borrowed version, CodePointSetDataBorrowed.

Example

use icu::properties::CodePointSetData;
use icu::properties::props::Alphabetic;

let alphabetic = CodePointSetData::new::<Alphabetic>();

assert!(!alphabetic.contains('3'));
assert!(!alphabetic.contains('੩'));  // U+0A69 GURMUKHI DIGIT THREE
assert!(alphabetic.contains('A'));
assert!(alphabetic.contains('Ä'));  // U+00C4 LATIN CAPITAL LETTER A WITH DIAERESIS

Implementations

impl CodePointSetData

fn new_for_ecma262(prop: &[u8]) -> Option<CodePointSetDataBorrowed<'static>>

Returns a type capable of looking up values for a property specified as a string, as long as it is a binary property listed in ECMA-262, using strict matching on the names in the spec.
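
To illustrate what strict name matching with an Option return looks like, here is a minimal standalone sketch. It does not call ICU4X: BinaryProp and lookup_strict are hypothetical names, and only four properties are covered. The long and short names are the ones ECMA-262 lists for these properties; whether new_for_ecma262 accepts each form should be checked against the crate.

```rust
/// Hypothetical stand-in for a handful of binary properties.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BinaryProp {
    Alphabetic,
    Lowercase,
    Uppercase,
    WhiteSpace,
}

/// Strict lookup: the name must match byte for byte (case-sensitive,
/// no whitespace or underscore folding). Unknown names yield `None`.
fn lookup_strict(name: &[u8]) -> Option<BinaryProp> {
    match name {
        b"Alphabetic" | b"Alpha" => Some(BinaryProp::Alphabetic),
        b"Lowercase" | b"Lower" => Some(BinaryProp::Lowercase),
        b"Uppercase" | b"Upper" => Some(BinaryProp::Uppercase),
        b"White_Space" | b"space" => Some(BinaryProp::WhiteSpace),
        _ => None,
    }
}
```

With this model, lookup_strict(b"alphabetic") returns None because the case differs, which is the behavior "strict matching" implies.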

This handles every property required by ECMA-262 /u regular expressions, except for:

  • Script and General_Category: handle these directly using property values parsed via PropertyParser<GeneralCategory> and PropertyParser<Script> if necessary.
  • Script_Extensions: handle this directly using APIs from crate::script::ScriptWithExtensions.
  • General_Category mask values: handle these alongside General_Category using GeneralCategoryGroup, using property values parsed via PropertyParser<GeneralCategory> if necessary.
  • Any, Assigned, and ASCII pseudoproperties: handle these using their equivalent sets:
    • Any can be expressed as the range [\u{0}-\u{10FFFF}].
    • Assigned can be expressed as the inverse of the set gc=Cn (i.e., \P{gc=Cn}).
    • ASCII can be expressed as the range [\u{0}-\u{7F}].
  • General_Category property values, which ECMA-262 allows to be written like properties via a shorthand: simply create the corresponding GeneralCategory set.
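
The pseudoproperty equivalences in the list above can be written down directly. The following is a standalone sketch over raw u32 code points, not ICU4X API: the function names are hypothetical, and is_cn stands for whatever General_Category=Unassigned predicate the caller already has.

```rust
/// `Any`: every code point, U+0000 through U+10FFFF.
fn is_any(cp: u32) -> bool {
    (0x0..=0x10FFFF).contains(&cp)
}

/// `ASCII`: U+0000 through U+007F.
fn is_ascii(cp: u32) -> bool {
    (0x0..=0x7F).contains(&cp)
}

/// `Assigned`: the complement of gc=Cn within the code point range.
/// `is_cn` is a caller-supplied (hypothetical) predicate for
/// General_Category=Unassigned.
fn is_assigned(cp: u32, is_cn: impl Fn(u32) -> bool) -> bool {
    is_any(cp) && !is_cn(cp)
}
```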

✨ Enabled with the compiled_data Cargo feature.

📚 Help choosing a constructor

use icu::properties::CodePointSetData;

let emoji = CodePointSetData::new_for_ecma262(b"Emoji")
    .expect("is an ECMA-262 property");

assert!(emoji.contains('🔥')); // U+1F525 FIRE
assert!(!emoji.contains('V'));

fn try_new_for_ecma262_unstable<P>(provider: &P, prop: &[u8]) -> Option<Result<Self, DataError>>
where
    P: ?Sized + DataProvider<PropertyBinaryAsciiHexDigitV1> + DataProvider<PropertyBinaryAlphabeticV1> + DataProvider<PropertyBinaryBidiControlV1> + DataProvider<PropertyBinaryBidiMirroredV1> + DataProvider<PropertyBinaryCaseIgnorableV1> + DataProvider<PropertyBinaryCasedV1> + DataProvider<PropertyBinaryChangesWhenCasefoldedV1> + DataProvider<PropertyBinaryChangesWhenCasemappedV1> + DataProvider<PropertyBinaryChangesWhenLowercasedV1> + DataProvider<PropertyBinaryChangesWhenNfkcCasefoldedV1> + DataProvider<PropertyBinaryChangesWhenTitlecasedV1> + DataProvider<PropertyBinaryChangesWhenUppercasedV1> + DataProvider<PropertyBinaryDashV1> + DataProvider<PropertyBinaryDefaultIgnorableCodePointV1> + DataProvider<PropertyBinaryDeprecatedV1> + DataProvider<PropertyBinaryDiacriticV1> + DataProvider<PropertyBinaryEmojiV1> + DataProvider<PropertyBinaryEmojiComponentV1> + DataProvider<PropertyBinaryEmojiModifierV1> + DataProvider<PropertyBinaryEmojiModifierBaseV1> + DataProvider<PropertyBinaryEmojiPresentationV1> + DataProvider<PropertyBinaryExtendedPictographicV1> + DataProvider<PropertyBinaryExtenderV1> + DataProvider<PropertyBinaryGraphemeBaseV1> + DataProvider<PropertyBinaryGraphemeExtendV1> + DataProvider<PropertyBinaryHexDigitV1> + DataProvider<PropertyBinaryIdsBinaryOperatorV1> + DataProvider<PropertyBinaryIdsTrinaryOperatorV1> + DataProvider<PropertyBinaryIdContinueV1> + DataProvider<PropertyBinaryIdStartV1> + DataProvider<PropertyBinaryIdeographicV1> + DataProvider<PropertyBinaryJoinControlV1> + DataProvider<PropertyBinaryLogicalOrderExceptionV1> + DataProvider<PropertyBinaryLowercaseV1> + DataProvider<PropertyBinaryMathV1> + DataProvider<PropertyBinaryNoncharacterCodePointV1> + DataProvider<PropertyBinaryPatternSyntaxV1> + DataProvider<PropertyBinaryPatternWhiteSpaceV1> + DataProvider<PropertyBinaryQuotationMarkV1> + DataProvider<PropertyBinaryRadicalV1> + DataProvider<PropertyBinaryRegionalIndicatorV1> + DataProvider<PropertyBinarySentenceTerminalV1> + 
DataProvider<PropertyBinarySoftDottedV1> + DataProvider<PropertyBinaryTerminalPunctuationV1> + DataProvider<PropertyBinaryUnifiedIdeographV1> + DataProvider<PropertyBinaryUppercaseV1> + DataProvider<PropertyBinaryVariationSelectorV1> + DataProvider<PropertyBinaryWhiteSpaceV1> + DataProvider<PropertyBinaryXidContinueV1> + DataProvider<PropertyBinaryXidStartV1>

A version of Self::new_for_ecma262 that uses custom data provided by a DataProvider.

📚 Help choosing a constructor

⚠️ The bounds on provider may change over time, including in SemVer minor releases.
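
The nested Option<Result<..>> return type separates two failure modes. A plausible reading, assumed here rather than stated in the text above, is that None means the name is not a supported property and Some(Err(_)) means the provider could not supply the data. The sketch below uses stand-in types (u32 for the loaded set, String for the error) and hypothetical names to show one way a caller might tell the three cases apart.

```rust
/// Hypothetical outcome categories for a lookup-then-load operation.
#[derive(Debug, PartialEq)]
enum Outcome {
    UnknownProperty,
    DataError(String),
    Loaded(u32),
}

/// Classify a nested result: the outer `Option` is the name lookup,
/// the inner `Result` is the data load.
fn classify(result: Option<Result<u32, String>>) -> Outcome {
    match result {
        None => Outcome::UnknownProperty,
        Some(Err(e)) => Outcome::DataError(e),
        Some(Ok(v)) => Outcome::Loaded(v),
    }
}
```

If propagating the error with ? is preferable, the standard Option::transpose method converts the value into a Result<Option<T>, E>.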

impl CodePointSetData

const fn new<P: BinaryProperty>() -> CodePointSetDataBorrowed<'static>

Creates a new CodePointSetDataBorrowed for a BinaryProperty.

✨ Enabled with the compiled_data Cargo feature.

📚 Help choosing a constructor

fn try_new_unstable<P: BinaryProperty>(provider: &(impl DataProvider<P::DataMarker> + ?Sized)) -> Result<CodePointSetData, DataError>

A version of Self::new that uses custom data provided by a DataProvider.

📚 Help choosing a constructor

⚠️ The bounds on provider may change over time, including in SemVer minor releases.

fn as_borrowed(self: &Self) -> CodePointSetDataBorrowed<'_>

Construct a borrowed version of this type that can be queried.

This owned version is returned by functions that use a runtime data provider.
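
The owned/borrowed split can be pictured with a small standalone model. This is not the ICU4X implementation: OwnedSet, BorrowedSet, and the boundary-list layout are assumptions chosen for illustration. The model stores a sorted list of range boundaries (the idea behind an inversion list) and answers contains by binary search on the borrowed view.

```rust
/// Hypothetical owned set: sorted boundaries describing half-open
/// ranges [b[0], b[1]), [b[2], b[3]), ...
struct OwnedSet {
    bounds: Vec<u32>,
}

/// Hypothetical borrowed view; cheap to copy, and the type that
/// actually answers queries.
#[derive(Clone, Copy)]
struct BorrowedSet<'a> {
    bounds: &'a [u32],
}

impl OwnedSet {
    /// Produce a queryable view without copying the data.
    fn as_borrowed(&self) -> BorrowedSet<'_> {
        BorrowedSet { bounds: &self.bounds }
    }
}

impl BorrowedSet<'_> {
    fn contains(self, c: char) -> bool {
        let cp = c as u32;
        // Count the boundaries at or below `cp`; an odd count means
        // `cp` falls inside one of the ranges.
        self.bounds.partition_point(|&b| b <= cp) % 2 == 1
    }
}
```

For example, an OwnedSet with bounds [0x41, 0x5B, 0x61, 0x7B] models the ASCII letters A-Z and a-z.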

fn from_code_point_inversion_list(set: CodePointInversionList<'static>) -> Self

Construct a new owned CodePointSetData from a CodePointInversionList.

fn as_code_point_inversion_list(self: &Self) -> Option<&CodePointInversionList<'_>>

Convert this type to a CodePointInversionList as a borrowed value.

The data backing this is extensible and supports multiple implementations. Currently it is always CodePointInversionList; however in the future more backends may be added, and users may select which at data generation time.

This method returns an Option in order to return None when the backing data provider cannot return a CodePointInversionList, or cannot do so within the expected constant time constraint.

fn to_code_point_inversion_list(self: &Self) -> CodePointInversionList<'_>

Convert this type to a CodePointInversionList, borrowing if possible, otherwise allocating a new CodePointInversionList.

The data backing this is extensible and supports multiple implementations. Currently it is always CodePointInversionList; however in the future more backends may be added, and users may select which at data generation time.

The performance of the conversion to this specific return type will vary depending on the data structure that is backing self.
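
The difference between the Option-returning borrow and the always-succeeding conversion can be sketched with std::borrow::Cow. The Backend enum and its Flags variant are hypothetical; the text above only says that other backends may be added in the future. The boundary-list layout (sorted boundaries of half-open ranges) is likewise an assumption for illustration.

```rust
use std::borrow::Cow;

/// Hypothetical storage backends for a set of code points.
enum Backend {
    /// Sorted boundaries of half-open ranges, already in list form.
    Boundaries(Vec<u32>),
    /// One flag per code point starting at U+0000 (illustrative only).
    Flags(Vec<bool>),
}

impl Backend {
    /// Borrow the boundary list only when it already exists.
    fn as_boundaries(&self) -> Option<&[u32]> {
        match self {
            Backend::Boundaries(b) => Some(b.as_slice()),
            Backend::Flags(_) => None,
        }
    }

    /// Borrow if possible; otherwise build a new list (allocates).
    fn to_boundaries(&self) -> Cow<'_, [u32]> {
        match self {
            Backend::Boundaries(b) => Cow::Borrowed(b.as_slice()),
            Backend::Flags(flags) => {
                let mut out = Vec::new();
                let mut prev = false;
                // Record every index where membership flips.
                for (i, &f) in flags.iter().enumerate() {
                    if f != prev {
                        out.push(i as u32);
                        prev = f;
                    }
                }
                // Close a range that runs to the end of the flags.
                if prev {
                    out.push(flags.len() as u32);
                }
                Cow::Owned(out)
            }
        }
    }
}
```

In this model the borrowing path is constant time, while the fallback path is linear in the size of the backing data, which mirrors the note that conversion cost depends on the backing structure.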

impl Debug for CodePointSetData

fn fmt(self: &Self, f: &mut Formatter<'_>) -> Result

impl Freeze for CodePointSetData

impl RefUnwindSafe for CodePointSetData

impl Send for CodePointSetData

impl Sync for CodePointSetData

impl Unpin for CodePointSetData

impl UnsafeUnpin for CodePointSetData

impl UnwindSafe for CodePointSetData

impl<T> Any for CodePointSetData

fn type_id(self: &Self) -> TypeId

impl<T> Borrow for CodePointSetData

fn borrow(self: &Self) -> &T

impl<T> BorrowMut for CodePointSetData

fn borrow_mut(self: &mut Self) -> &mut T

impl<T> ErasedDestructor for CodePointSetData

impl<T> From for CodePointSetData

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into for CodePointSetData

fn into(self: Self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom for CodePointSetData

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto for CodePointSetData

fn try_into(self: Self) -> Result<U, <U as TryFrom<T>>::Error>