Module zerovec
Documentation on zero-copy deserialization of locale types.
Locale and LanguageIdentifier are highly structured types that cannot be directly
stored in a zero-copy data structure, such as those provided by the zerovec module.
This page explains how to indirectly store these types in a zerovec.
There are two main use cases, which have different solutions:
- Lookup: You need to locate a locale in a zero-copy vector, such as when querying a map.
- Obtain: You have a locale stored in a zero-copy vector, and you need to obtain a proper
Locale or LanguageIdentifier for use elsewhere in your program.
Lookup
To perform lookup, store the stringified locale in a canonical BCP-47 form as a byte array,
and then use Locale::strict_cmp() to perform an efficient, zero-allocation lookup.
To produce more human-readable serialized output, you can use PotentialUtf8.
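use icu::locale::Locale;
use potential_utf::PotentialUtf8;
use zerovec::ZeroMap;

// ZeroMap from locales to integers (illustrative entries, sorted by key)
let data: &[(&PotentialUtf8, u32)] = &[
    ("de-DE-u-hc-h12".into(), 5),
    ("en-US-u-ca-buddhist".into(), 10),
    ("my-MM".into(), 15),
    ("sr-Cyrl-ME".into(), 20),
    ("zh-TW".into(), 25),
];
let zm: ZeroMap<PotentialUtf8, u32> = data.iter().copied().collect();

// Get the value associated with a locale
let loc: Locale = "en-US-u-ca-buddhist".parse().unwrap();
let value = zm.get_copied_by(|bytes| loc.strict_cmp(bytes).reverse());
assert_eq!(value, Some(10));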
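To see why this lookup needs no allocation, the following is a minimal std-only sketch of the same idea: a binary search over canonical byte strings sorted in bytewise order. The names `SAMPLE`, `cmp_key`, and `lookup` are hypothetical and are not part of the icu or zerovec APIs; the sketch also assumes the needle is already a canonical BCP-47 string, whereas Locale::strict_cmp() compares a structured Locale against the bytes.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for a zero-copy map: (key bytes, value) pairs
/// sorted by key in bytewise order.
const SAMPLE: &[(&[u8], u32)] = &[
    ("de-DE-u-hc-h12".as_bytes(), 5),
    ("en-US-u-ca-buddhist".as_bytes(), 10),
    ("my-MM".as_bytes(), 15),
    ("sr-Cyrl-ME".as_bytes(), 20),
    ("zh-TW".as_bytes(), 25),
];

/// Compares stored key bytes with a canonical locale string, byte by byte.
/// Nothing is parsed or allocated.
fn cmp_key(stored: &[u8], needle: &str) -> Ordering {
    stored.cmp(needle.as_bytes())
}

/// Binary search over the sorted entries; returns the value if the key exists.
fn lookup(entries: &[(&[u8], u32)], needle: &str) -> Option<u32> {
    entries
        .binary_search_by(|(key, _)| cmp_key(key, needle))
        .ok()
        .map(|idx| entries[idx].1)
}
```

The search only works if the stored keys are canonical and sorted consistently with the comparison function, which is why the text above asks for a canonical BCP-47 form.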
Obtain
Obtaining a Locale or LanguageIdentifier is not generally a zero-copy operation, since
both of these types may require memory allocation. If possible, architect your code such that
you do not need to obtain a structured type.
If you do need the structured type, for example to manipulate it in some way, there are two options: storing subtags, and storing a string for parsing.
Storing Subtags
If the data being stored only contains a limited number of subtags, you can store them as a
tuple, and then construct the LanguageIdentifier externally.
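use icu::locale::subtags::{language, region, script, Language, Region, Script};
use icu::locale::{langid, LanguageIdentifier};
use zerovec::ZeroMap;

// ZeroMap from integer to LSR (language-script-region); illustrative entries
let zm: ZeroMap<u32, (Language, Option<Script>, Option<Region>)> = [
    (5, (language!("de"), None, Some(region!("DE")))),
    (10, (language!("en"), None, Some(region!("US")))),
    (15, (language!("my"), None, Some(region!("MM")))),
    (20, (language!("sr"), Some(script!("Cyrl")), Some(region!("ME")))),
    (25, (language!("zh"), None, Some(region!("TW")))),
]
.into_iter()
.collect();

// Construct a LanguageIdentifier from a tuple entry
let lid: LanguageIdentifier =
    zm.get_copied(&25).expect("element is present").into();

assert_eq!(lid, langid!("zh-TW"));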
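The reason a tuple of subtags can live in a zero-copy structure is that each subtag has a small, fixed maximum width. The sketch below illustrates that idea with std only; the `Lsr` struct and the `pack` and `unpack` helpers are hypothetical, do not reflect how icu actually lays out its subtag types, and perform no real subtag validation.

```rust
/// Hypothetical fixed-width record: language (up to 3 ASCII bytes), script
/// (4 bytes), region (up to 3 bytes). Unused bytes are zero; an all-zero
/// field means "absent".
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Lsr {
    language: [u8; 3],
    script: [u8; 4],
    region: [u8; 3],
}

/// Copies an ASCII subtag into a zero-padded array of width N.
/// Returns None if the subtag is not ASCII or does not fit.
fn pack<const N: usize>(subtag: &str) -> Option<[u8; N]> {
    let bytes = subtag.as_bytes();
    if bytes.len() > N || !subtag.is_ascii() {
        return None;
    }
    let mut out = [0u8; N];
    out[..bytes.len()].copy_from_slice(bytes);
    Some(out)
}

/// Reads a zero-padded field back as a string slice, without copying.
fn unpack(field: &[u8]) -> &str {
    let len = field.iter().position(|&b| b == 0).unwrap_or(field.len());
    std::str::from_utf8(&field[..len]).unwrap_or("")
}

impl Lsr {
    /// Builds a record; pass "" for an absent script or region.
    fn new(language: &str, script: &str, region: &str) -> Option<Self> {
        Some(Lsr {
            language: pack::<3>(language)?,
            script: pack::<4>(script)?,
            region: pack::<3>(region)?,
        })
    }

    /// Joins the present subtags with hyphens. This step allocates, which
    /// mirrors the point that obtaining a structured value is not zero-copy.
    fn to_bcp47(&self) -> String {
        let mut out = String::from(unpack(&self.language));
        for part in [unpack(&self.script), unpack(&self.region)] {
            if !part.is_empty() {
                out.push('-');
                out.push_str(part);
            }
        }
        out
    }
}
```

Every record in this sketch occupies exactly 10 bytes with alignment 1, so a vector of them can be read directly from a byte buffer. Variants and Unicode extensions have no fixed width, which is why this approach only suits data with a limited set of subtags.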
Storing Strings
If it is necessary to store and obtain an arbitrary locale, it is currently recommended to store a BCP-47 string and parse it when needed.
Since the string is stored in an unparsed state, it is not safe to unwrap the result from
Locale::try_from_utf8(). See icu4x#831
for a discussion on potential data models that could ensure that the locale is valid during
deserialization.
As above, to produce more human-readable serialized output, you can use PotentialUtf8.
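use icu::locale::langid;
use icu::locale::Locale;
use potential_utf::PotentialUtf8;
use zerovec::ZeroMap;

// ZeroMap from integer to locale string (illustrative entries)
let data: &[(u32, &PotentialUtf8)] = &[
    (5, "de-DE-u-hc-h12".into()),
    (10, "en-US-u-ca-buddhist".into()),
    (15, "my-MM".into()),
    (20, "sr-Cyrl-ME".into()),
    (25, "zh-TW".into()),
    (30, "INVALID".into()),
];
let zm: ZeroMap<u32, PotentialUtf8> = data.iter().copied().collect();

// Construct a Locale by parsing the string.
let value = zm.get(&25).expect("element is present");
let loc = Locale::try_from_utf8(value);
assert_eq!(loc, Ok(langid!("zh-TW").into()));

// Invalid entries are fallible
let err_value = zm.get(&30).expect("element is present");
let err_loc = Locale::try_from_utf8(err_value);
assert!(err_loc.is_err());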
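Because stored strings are unvalidated, calling code should branch on the parse result rather than unwrap it. The following std-only sketch shows one way to do that with a fallback value. `parse_simple_tag`, `TagError`, and `language_or` are hypothetical names; the toy parser accepts only `language` or `language-region` and is far narrower than the BCP-47 grammar that Locale::try_from_utf8() implements.

```rust
#[derive(Debug, PartialEq, Eq)]
enum TagError {
    NotUtf8,
    BadLanguage,
    BadRegion,
    Unsupported,
}

/// Toy parser (an assumption for illustration only): accepts a 2-3 letter
/// lowercase language, optionally followed by a 2-letter uppercase region.
fn parse_simple_tag(bytes: &[u8]) -> Result<(&str, Option<&str>), TagError> {
    let text = std::str::from_utf8(bytes).map_err(|_| TagError::NotUtf8)?;
    let mut parts = text.split('-');

    let language = parts.next().unwrap_or("");
    let region = parts.next();
    if parts.next().is_some() {
        // Scripts, variants, and extensions are outside this sketch.
        return Err(TagError::Unsupported);
    }

    let language_ok = (2..=3).contains(&language.len())
        && language.bytes().all(|b| b.is_ascii_lowercase());
    if !language_ok {
        return Err(TagError::BadLanguage);
    }
    if let Some(r) = region {
        if r.len() != 2 || !r.bytes().all(|b| b.is_ascii_uppercase()) {
            return Err(TagError::BadRegion);
        }
    }
    Ok((language, region))
}

/// Returns the parsed language, or a caller-supplied default when the stored
/// bytes are invalid, instead of panicking on bad data.
fn language_or<'a>(bytes: &'a [u8], default: &'a str) -> &'a str {
    match parse_simple_tag(bytes) {
        Ok((language, _)) => language,
        Err(_) => default,
    }
}
```

Whether to fall back, skip the entry, or propagate the error depends on the application; the point is only that the invalid case is handled explicitly.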