Struct Locale

struct Locale { ... }

A core struct representing a Unicode Locale Identifier.

A locale is made of two parts:

  - a Unicode Language Identifier (the id field)
  - a set of extensions (the extensions field)

Locale exposes all of the same fields and methods as LanguageIdentifier, and on top of that is able to parse, manipulate and serialize unicode extension fields.

Ordering

This type deliberately does not implement Ord or PartialOrd because there are multiple possible orderings. Depending on your use case, two orderings are available:

  1. A string ordering, suitable for stable serialization: Locale::strict_cmp
  2. A struct ordering, suitable for use with a BTreeSet: Locale::total_cmp

See issue: https://github.com/unicode-org/icu4x/issues/1215
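
To illustrate why more than one ordering is plausible, here is a minimal std-only sketch. ToyId is a hypothetical stand-in, not the ICU4X Locale, and its derived Ord is only assumed to resemble a struct ordering; Locale::total_cmp makes no promise about which field order it uses.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for a locale identifier (not the ICU4X type).
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct ToyId {
    language: &'static str,
    script: Option<&'static str>,
    region: Option<&'static str>,
}

impl ToyId {
    /// Serializes the fields in BCP-47 order, separated by hyphens.
    fn to_bcp47(&self) -> String {
        let mut out = String::from(self.language);
        for subtag in [self.script, self.region].into_iter().flatten() {
            out.push('-');
            out.push_str(subtag);
        }
        out
    }
}

/// String ordering: compare the serialized forms byte by byte.
fn string_order(a: &ToyId, b: &ToyId) -> Ordering {
    a.to_bcp47().as_bytes().cmp(b.to_bcp47().as_bytes())
}

/// Struct ordering: compare field by field using the derived `Ord`.
fn struct_order(a: &ToyId, b: &ToyId) -> Ordering {
    a.cmp(b)
}

fn main() {
    let ar_latn = ToyId { language: "ar", script: Some("Latn"), region: None };
    let ar_sa = ToyId { language: "ar", script: None, region: Some("SA") };

    // As strings, "ar-Latn" sorts before "ar-SA" because 'L' < 'S'.
    assert_eq!(string_order(&ar_latn, &ar_sa), Ordering::Less);
    // Field by field, the missing script (`None`) sorts before `Some("Latn")`.
    assert_eq!(struct_order(&ar_latn, &ar_sa), Ordering::Greater);
}
```

Both comparisons are valid total orders over the same two values, yet they disagree, which is the ambiguity that a single Ord implementation would hide.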

Parsing

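Unicode recognizes three levels of standard conformance for a locale:

  - well-formed: syntactically correct
  - valid: well-formed and using only registered subtags
  - canonical: valid, with no deprecated codes or structure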

Any syntactically invalid subtags will cause the parsing to fail with an error.

This operation normalizes syntax to be well-formed. No legacy subtag replacements are performed. For validation and canonicalization, see LocaleCanonicalizer.

ICU4X's Locale parsing does not accept the non-BCP-47-compatible locales that UTS 35 allows for backwards compatibility. Furthermore, it currently does not allow language subtags longer than three characters.
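
As a rough picture of what syntax normalization means for casing, the following std-only sketch applies the conventional BCP-47 casing (lowercase language, title-case script, uppercase region, lowercase everything else). The function normalize_case is hypothetical, performs no validation, and is not the ICU4X parser; it only models the casing step under simplified assumptions.

```rust
/// Simplified, hypothetical model of case normalization for a BCP-47 tag.
/// It performs no validation and is not ICU4X's parser.
fn normalize_case(tag: &str) -> String {
    let mut out = Vec::new();
    let mut in_extensions = false;
    for (i, subtag) in tag.split('-').enumerate() {
        let is_alpha = subtag.chars().all(|c| c.is_ascii_alphabetic());
        // A single-character subtag (such as `u`) starts an extension.
        if subtag.len() == 1 {
            in_extensions = true;
        }
        let normalized = if i > 0 && !in_extensions && is_alpha && subtag.len() == 4 {
            // Script: title case, e.g. `latn` -> `Latn`.
            let (head, tail) = subtag.split_at(1);
            format!("{}{}", head.to_ascii_uppercase(), tail.to_ascii_lowercase())
        } else if i > 0 && !in_extensions && is_alpha && subtag.len() == 2 {
            // Region: upper case, e.g. `Us` -> `US`.
            subtag.to_ascii_uppercase()
        } else {
            // Language, variants, and extension subtags: lower case.
            subtag.to_ascii_lowercase()
        };
        out.push(normalized);
    }
    out.join("-")
}

fn main() {
    assert_eq!(
        normalize_case("eN-latn-Us-Valencia-u-hC-H12"),
        "en-Latn-US-valencia-u-hc-h12"
    );
}
```

A real parser additionally rejects malformed subtags instead of passing them through, as described above.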

Examples

Simple example:

use icu::locale::{
    extensions::unicode::{key, value},
    locale,
    subtags::{language, region},
};

let loc = locale!("en-US-u-ca-buddhist");

assert_eq!(loc.id.language, language!("en"));
assert_eq!(loc.id.script, None);
assert_eq!(loc.id.region, Some(region!("US")));
assert_eq!(loc.id.variants.len(), 0);
assert_eq!(
    loc.extensions.unicode.keywords.get(&key!("ca")),
    Some(&value!("buddhist"))
);

More complex example:

use icu::locale::{subtags::*, Locale};

let loc: Locale = "eN-latn-Us-Valencia-u-hC-H12"
    .parse()
    .expect("Failed to parse.");

assert_eq!(loc.id.language, "en".parse::<Language>().unwrap());
assert_eq!(loc.id.script, "Latn".parse::<Script>().ok());
assert_eq!(loc.id.region, "US".parse::<Region>().ok());
assert_eq!(
    loc.id.variants.get(0),
    "valencia".parse::<Variant>().ok().as_ref()
);

Fields

id: LanguageIdentifier

The basic language/script/region components in the locale identifier along with any variants.

extensions: Extensions

Any extensions present in the locale identifier.

Implementations

impl Locale

fn strict_cmp(self: &Self, other: &[u8]) -> Ordering

Compare this Locale with BCP-47 bytes.

The return value is equivalent to what would happen if you first converted this Locale to a BCP-47 string and then performed a byte comparison.

This function is case-sensitive and results in a total order, so it is appropriate for binary search. The only argument producing Ordering::Equal is self.to_string().
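
Because the comparison is a total order over bytes, a slice of BCP-47 strings that is already sorted in byte order can be binary-searched without allocating. The std-only sketch below models that idea; strict_cmp_model and find_sorted are hypothetical names, and the `serialized` argument stands in for the string form of a Locale.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for `Locale::strict_cmp`: `serialized` plays the
/// role of the locale's BCP-47 string.
fn strict_cmp_model(serialized: &str, other: &[u8]) -> Ordering {
    serialized.as_bytes().cmp(other)
}

/// Looks up `serialized` in a slice of BCP-47 strings that is already
/// sorted in byte order.
fn find_sorted(sorted: &[&str], serialized: &str) -> Option<usize> {
    sorted
        // `binary_search_by` expects "probe compared to target", which is
        // the reverse of "target compared to probe".
        .binary_search_by(|probe| strict_cmp_model(serialized, probe.as_bytes()).reverse())
        .ok()
}

fn main() {
    let sorted = ["ar-Latn", "ar-SA", "und-fonipa", "zh-Hant", "zh-TW"];
    assert_eq!(find_sorted(&sorted, "und-fonipa"), Some(2));
    // The comparison is case-sensitive, so a differently cased tag is not found.
    assert_eq!(find_sorted(&sorted, "AR-SA"), None);
}
```

With the real type, the closure would presumably call strict_cmp on the Locale being looked up, passing each probe's bytes.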

Examples

Sorting a list of locales with this method requires converting one of them to a string:

use icu::locale::Locale;
use std::cmp::Ordering;
use writeable::Writeable;

// Random input order:
let bcp47_strings: &[&str] = &[
    "und-u-ca-hebrew",
    "ar-Latn",
    "zh-Hant-TW",
    "zh-TW",
    "und-fonipa",
    "zh-Hant",
    "ar-SA",
];

let mut locales = bcp47_strings
    .iter()
    .map(|s| s.parse().unwrap())
    .collect::<Vec<Locale>>();
locales.sort_by(|a, b| {
    let b = b.write_to_string();
    a.strict_cmp(b.as_bytes())
});
let strict_cmp_strings = locales
    .iter()
    .map(|l| l.to_string())
    .collect::<Vec<String>>();

// Output ordering, sorted alphabetically
let expected_ordering: &[&str] = &[
    "ar-Latn",
    "ar-SA",
    "und-fonipa",
    "und-u-ca-hebrew",
    "zh-Hant",
    "zh-Hant-TW",
    "zh-TW",
];

assert_eq!(expected_ordering, strict_cmp_strings);

fn total_cmp(self: &Self, other: &Self) -> Ordering

Returns an ordering suitable for use in BTreeSet.

Unlike Locale::strict_cmp, the ordering may or may not be equivalent to string ordering, and it may or may not be stable across ICU4X releases.

Examples

This method returns an arbitrary but consistent ordering derived from the fields of the struct:

use icu::locale::Locale;
use std::cmp::Ordering;

// Input strings, sorted alphabetically
let bcp47_strings: &[&str] = &[
    "ar-Latn",
    "ar-SA",
    "und-fonipa",
    "und-u-ca-hebrew",
    "zh-Hant",
    "zh-Hant-TW",
    "zh-TW",
];
assert!(bcp47_strings.windows(2).all(|w| w[0] < w[1]));

let mut locales = bcp47_strings
    .iter()
    .map(|s| s.parse().unwrap())
    .collect::<Vec<Locale>>();
locales.sort_by(Locale::total_cmp);
let total_cmp_strings = locales
    .iter()
    .map(|l| l.to_string())
    .collect::<Vec<String>>();

// Output ordering, sorted arbitrarily
let expected_ordering: &[&str] = &[
    "ar-SA",
    "ar-Latn",
    "und-u-ca-hebrew",
    "und-fonipa",
    "zh-TW",
    "zh-Hant",
    "zh-Hant-TW",
];

assert_eq!(expected_ordering, total_cmp_strings);

Use a wrapper to add a Locale to a BTreeSet:

use icu::locale::Locale;
use std::cmp::Ordering;
use std::collections::BTreeSet;

#[derive(PartialEq, Eq)]
struct LocaleTotalOrd(Locale);

impl Ord for LocaleTotalOrd {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0.total_cmp(&other.0)
    }
}

impl PartialOrd for LocaleTotalOrd {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

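// Build a set through the wrapper; insertion uses LocaleTotalOrd's Ord impl.
let mut set: BTreeSet<LocaleTotalOrd> = BTreeSet::new();
set.insert(LocaleTotalOrd("en-US".parse().unwrap()));
assert_eq!(set.len(), 1);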

impl Locale

fn to_string(self: &Self) -> String

Converts the given value to a String.

Under the hood, this uses an efficient Writeable implementation. However, in order to avoid allocating a string, it is more efficient to use Writeable directly.
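
The allocation concern can be pictured with a std-only sketch: rather than creating one String per value, each value is written into a shared buffer. The helper join_display is hypothetical and goes through Display (which Locale also implements) with plain strings as stand-in values; with Writeable, the analogous call would be write_to with the buffer as the sink.

```rust
use std::fmt::{Display, Write};

/// Appends each value to one shared buffer instead of allocating a
/// `String` per value.
fn join_display<T: Display>(values: &[T], separator: &str) -> String {
    let mut buf = String::new();
    for (i, value) in values.iter().enumerate() {
        if i > 0 {
            buf.push_str(separator);
        }
        // `String` as a sink never fails; an error could only come from
        // the value's own `Display` implementation.
        write!(buf, "{}", value).expect("Display implementation returned an error");
    }
    buf
}

fn main() {
    let joined = join_display(&["en-US", "fr", "zh-Hant"], ", ");
    assert_eq!(joined, "en-US, fr, zh-Hant");
}
```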

impl Clone for Locale

fn clone(self: &Self) -> Locale

impl Debug for Locale

fn fmt(self: &Self, f: &mut Formatter<'_>) -> Result

impl Display for Locale

fn fmt(self: &Self, f: &mut Formatter<'_>) -> Result

impl Eq for Locale

impl Freeze for Locale

impl From<(Language, Option<Script>, Option<Region>)> for Locale

fn from(lsr: (Language, Option<Script>, Option<Region>)) -> Self

impl From<Option<Region>> for Locale

fn from(region: Option<Region>) -> Self

impl From<LanguageIdentifier> for Locale

fn from(id: LanguageIdentifier) -> Self

impl From<Option<Script>> for Locale

fn from(script: Option<Script>) -> Self

impl From<Language> for Locale

fn from(language: Language) -> Self

impl Hash for Locale

fn hash<__H: core::hash::Hasher>(self: &Self, state: &mut __H)

impl PartialEq for Locale

fn eq(self: &Self, other: &Locale) -> bool

impl RefUnwindSafe for Locale

impl Send for Locale

impl StructuralPartialEq for Locale

impl Sync for Locale

impl Unpin for Locale

impl UnsafeUnpin for Locale

impl UnwindSafe for Locale

impl Writeable for Locale

fn write_to<W: core::fmt::Write + ?Sized>(self: &Self, sink: &mut W) -> Result
fn writeable_length_hint(self: &Self) -> LengthHint

impl<T> Any for Locale

fn type_id(self: &Self) -> TypeId

impl<T> Borrow for Locale

fn borrow(self: &Self) -> &T

impl<T> BorrowMut for Locale

fn borrow_mut(self: &mut Self) -> &mut T

impl<T> CloneToUninit for Locale

unsafe fn clone_to_uninit(self: &Self, dest: *mut u8)

impl<T> ErasedDestructor for Locale

impl<T> From for Locale

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> ToOwned for Locale

fn to_owned(self: &Self) -> T
fn clone_into(self: &Self, target: &mut T)

impl<T> ToString for Locale

fn to_string(self: &Self) -> String

impl<T, U> Into for Locale

fn into(self: Self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of [From]<T> for U chooses to do.

impl<T, U> TryFrom for Locale

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto for Locale

fn try_into(self: Self) -> Result<U, <U as TryFrom<T>>::Error>