Tizen Native API
4.0
The Unormalization module provides standard Unicode normalization.
Required Header
#include <utils_i18n.h>
Overview
The Unormalization module provides standard Unicode normalization. All instances of i18n_unormalizer_h are immutable. Instances returned by i18n_unormalization_get_instance() are singletons that must not be deleted by the caller.
Sample Code 1
Gets a normalizer instance and normalizes a Unicode string.
i18n_unormalizer_h normalizer = NULL;
i18n_uchar src = 0xAC00;
i18n_uchar dest[4] = {0,};
int dest_str_len = 0;
int i = 0;

// gets instance for normalizer
i18n_unormalization_get_instance(NULL, "nfc", I18N_UNORMALIZATION_DECOMPOSE, &normalizer);

// normalizes a unicode string
i18n_unormalization_normalize(normalizer, &src, 1, dest, 4, &dest_str_len);
dlog_print(DLOG_INFO, LOG_TAG, "src is 0x%x\n", src); // src is 0xAC00 (0xAC00: A Korean character combined with consonant and vowel)

for (i = 0; i < dest_str_len; i++) {
    dlog_print(DLOG_INFO, LOG_TAG, "dest[%d] is 0x%x\t", i + 1, dest[i]); // dest[1] is 0x1100 dest[2] is 0x1161 (0x1100: consonant, 0x1161: vowel)
}
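The decomposition in the sample output (0xAC00 becoming 0x1100 followed by 0x1161) matches the arithmetic Hangul syllable decomposition defined in the Unicode Standard. The following is a minimal, self-contained sketch of that arithmetic for a single precomposed syllable. It does not call the Tizen API; the helper name hangul_decompose and the constant names are illustrative assumptions and are not part of utils_i18n.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants of the Hangul syllable decomposition arithmetic
 * from the Unicode Standard (names are illustrative). */
enum {
    S_BASE  = 0xAC00,            /* first precomposed syllable */
    L_BASE  = 0x1100,            /* first leading consonant */
    V_BASE  = 0x1161,            /* first vowel */
    T_BASE  = 0x11A7,            /* one before the first trailing consonant */
    V_COUNT = 21,
    T_COUNT = 28,
    N_COUNT = V_COUNT * T_COUNT, /* 588 */
    S_COUNT = 19 * N_COUNT       /* 11172 */
};

/* Hypothetical helper, not a Tizen function.
 * Writes the canonical decomposition of one precomposed Hangul
 * syllable to out and returns the number of code units written
 * (2 or 3), or 0 if the input is not a precomposed syllable. */
int hangul_decompose(uint16_t syllable, uint16_t out[3])
{
    assert(out != NULL);

    int s_index = (int)syllable - S_BASE;
    if (s_index < 0 || s_index >= S_COUNT)
        return 0;

    int n = 0;
    out[n++] = (uint16_t)(L_BASE + s_index / N_COUNT);
    out[n++] = (uint16_t)(V_BASE + (s_index % N_COUNT) / T_COUNT);

    /* A trailing consonant is present only when the index is non-zero. */
    int t_index = s_index % T_COUNT;
    if (t_index != 0)
        out[n++] = (uint16_t)(T_BASE + t_index);

    return n;
}
```

In this sketch, hangul_decompose(0xAC00, out) writes 0x1100 and 0x1161 and returns 2, which agrees with the sample output. A syllable with a trailing consonant, such as 0xD55C, yields three code units, so the destination buffer should be sized for the decomposed length rather than the source length.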
Functions
int i18n_unormalization_get_instance (const char *package_name, const char *name, i18n_unormalization_mode_e mode, i18n_unormalizer_h *normalizer)
Gets an i18n_unormalizer_h which uses the specified data file and composes or decomposes text according to the specified mode.
int i18n_unormalization_normalize (i18n_unormalizer_h normalizer, const i18n_uchar *src, int32_t len, i18n_uchar *dest, int32_t capacity, int32_t *len_deststr)
Writes the normalized form of the source string to the destination string (replacing its contents).
Typedefs
typedef const void * i18n_unormalizer_h
i18n_unormalizer_h.
Typedef Documentation
typedef const void* i18n_unormalizer_h
i18n_unormalizer_h.
- Since :
- 2.3.1
Enumeration Type Documentation
Result values for normalization quick check functions.
- Since :
- 2.4
Enumeration of constants for normalization modes. For details about standard Unicode normalization forms and about the algorithms which are also used with custom mapping tables see http://www.unicode.org/unicode/reports/tr15/.
- Since :
- 2.3.1
- Enumerator:
I18N_UNORMALIZATION_COMPOSE Decomposition followed by composition. Same as standard NFC when using an "nfc" instance. Same as standard NFKC when using an "nfkc" instance. For details about standard Unicode normalization forms see http://www.unicode.org/unicode/reports/tr15/
I18N_UNORMALIZATION_DECOMPOSE Map and reorder canonically. Same as standard NFD when using an "nfc" instance. Same as standard NFKD when using an "nfkc" instance. For details about standard Unicode normalization forms see http://www.unicode.org/unicode/reports/tr15/
I18N_UNORMALIZATION_FCD "Fast C or D" form. If a string is in this form, then further decomposition without reordering would yield the same form as DECOMPOSE. Text in "Fast C or D" form can be processed efficiently with data tables that are "canonically closed", that is, that provide equivalent data for equivalent text, without having to be fully normalized. Not a standard Unicode normalization form. Not a unique form: Different FCD strings can be canonically equivalent. For details see http://www.unicode.org/notes/tn5/#FCD
I18N_UNORMALIZATION_COMPOSE_CONTIGUOUS Compose only contiguously. Also known as "FCC" or "Fast C Contiguous". The result will often but not always be in NFC. The result will conform to FCD which is useful for processing. Not a standard Unicode normalization form. For details see http://www.unicode.org/notes/tn5/#FCC
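The enumerator descriptions for COMPOSE and DECOMPOSE imply that the standard form is determined by the pair of instance name and mode. The sketch below restates that mapping as a small lookup. It is self-contained and does not include utils_i18n.h: sketch_mode_e is a stand-in for the two modes discussed here, and standard_form is a hypothetical helper, not a Tizen function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the compose/decompose modes described above.
 * This is NOT the real i18n_unormalization_mode_e. */
typedef enum {
    SKETCH_COMPOSE,
    SKETCH_DECOMPOSE
} sketch_mode_e;

/* Hypothetical helper: returns the standard normalization form for
 * an instance name and mode, or NULL when the pair is not one of
 * the four combinations listed in the enumerator descriptions. */
const char *standard_form(const char *name, sketch_mode_e mode)
{
    assert(mode == SKETCH_COMPOSE || mode == SKETCH_DECOMPOSE);

    if (name == NULL)
        return NULL;
    if (strcmp(name, "nfc") == 0)
        return mode == SKETCH_COMPOSE ? "NFC" : "NFD";
    if (strcmp(name, "nfkc") == 0)
        return mode == SKETCH_COMPOSE ? "NFKC" : "NFKD";
    return NULL;
}
```

Note that NFD is requested through the "nfc" instance and NFKD through the "nfkc" instance; the documentation lists no separate "nfd" or "nfkd" instance name.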
Function Documentation
int i18n_unormalization_get_instance (const char *package_name, const char *name, i18n_unormalization_mode_e mode, i18n_unormalizer_h *normalizer)
Gets an i18n_unormalizer_h which uses the specified data file and composes or decomposes text according to the specified mode.
- Since :
- 2.3.1
- Parameters:
- [in] package_name NULL for ICU built-in data, otherwise the application data package name.
- [in] name "nfc", "nfkc", "nfkc_cf", or the name of a custom data file.
- [in] mode The normalization mode (compose or decompose).
- [out] normalizer The requested normalizer on success.
- Return values:
- I18N_ERROR_NONE Successful
- I18N_ERROR_INVALID_PARAMETER Invalid function parameter
int i18n_unormalization_normalize (i18n_unormalizer_h normalizer, const i18n_uchar *src, int32_t len, i18n_uchar *dest, int32_t capacity, int32_t *len_deststr)
Writes the normalized form of the source string to the destination string (replacing its contents).
The source and destination strings must be different buffers.
- Since :
- 2.3.1
- Parameters:
- [in] normalizer The i18n normalizer handle.
- [in] src The source string.
- [in] len The length of the source string, or -1 if the string is NULL-terminated.
- [out] dest The destination string. Its contents are replaced with the normalized src.
- [in] capacity The number of i18n_uchar units that can be written to dest.
- [out] len_deststr The length of the destination string.
- Return values:
- I18N_ERROR_NONE Successful
- I18N_ERROR_INVALID_PARAMETER Invalid function parameter
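The len parameter accepts either an explicit length or -1 for a terminated string. As a rough illustration of that convention only, the sketch below resolves such a value to a concrete count of code units. It is self-contained and does not call the Tizen API: resolved_length is a hypothetical helper, and uint16_t is used as a stand-in for i18n_uchar on the assumption that it is a 16-bit code unit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a Tizen function.
 * A non-negative len is taken as the explicit number of code units.
 * A negative len (conventionally -1) means the string ends at the
 * first 0 code unit, so the length is found by scanning. */
int32_t resolved_length(const uint16_t *src, int32_t len)
{
    assert(src != NULL);

    if (len >= 0)
        return len;

    int32_t n = 0;
    while (src[n] != 0)
        n++;
    return n;
}
```

Passing an explicit length, as the sample code does with a single code unit, avoids the need for a terminator in the source buffer.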