Crate unicode_segmentation
Iterators which split strings on Grapheme Cluster, Word or Sentence boundaries, according to the Unicode Standard Annex #29 rules.
extern crate unicode_segmentation;
use unicode_segmentation::UnicodeSegmentation;
no_std
unicode-segmentation does not depend on libstd, so it can be used in crates
with the #![no_std] attribute.
crates.io
You can use this package in your project by adding the following
to your Cargo.toml:
[dependencies]
unicode-segmentation = "1.9.0"
Structs
- GraphemeCursor Cursor-based segmenter for grapheme clusters.
- GraphemeIndices External iterator for grapheme clusters and byte offsets.
- Graphemes External iterator for a string's grapheme clusters.
- USentenceBoundIndices External iterator for sentence boundaries and byte offsets.
- USentenceBounds External iterator for a string's sentence boundaries.
- UWordBoundIndices External iterator for word boundaries and byte offsets.
- UWordBounds External iterator for a string's word boundaries.
- UnicodeSentences An iterator over the substrings of a string which, after splitting the string on sentence boundaries, contain any characters with the Alphabetic property, or with General_Category=Number.
- UnicodeWordIndices An iterator over the substrings of a string which, after splitting the string on word boundaries, contain any characters with the Alphabetic property, or with General_Category=Number. This iterator also provides the byte offsets for each substring.
- UnicodeWords An iterator over the substrings of a string which, after splitting the string on word boundaries, contain any characters with the Alphabetic property, or with General_Category=Number.
Enums
- GraphemeIncomplete An error return indicating that not enough content was available in the provided chunk to satisfy the query, and that more content must be provided.
Traits
- UnicodeSegmentation Methods for segmenting strings according to Unicode Standard Annex #29.
Constants
- UNICODE_VERSION The version of Unicode that this version of unicode-segmentation is based on.