Asterisk - The Open Source Telephony Project GIT-master-f36a736
UTF-8 information and validation functions.
Enumerations
enum | ast_utf8_replace_result { AST_UTF8_REPLACE_VALID, AST_UTF8_REPLACE_INVALID, AST_UTF8_REPLACE_OVERRUN }
enum | ast_utf8_validation_result { AST_UTF8_VALID, AST_UTF8_INVALID, AST_UTF8_UNKNOWN }
Functions
void | ast_utf8_copy_string (char *dst, const char *src, size_t size)
Copy a string safely ensuring valid UTF-8.
int | ast_utf8_init (void)
Register UTF-8 tests.
int | ast_utf8_is_valid (const char *str)
Check if a zero-terminated string is valid UTF-8.
int | ast_utf8_is_validn (const char *str, size_t size)
Check if the first size bytes of a string are valid UTF-8.
enum ast_utf8_replace_result | ast_utf8_replace_invalid_chars (char *dst, size_t *dst_size, const char *src, size_t src_len)
Copy a string safely replacing any invalid UTF-8 sequences.
void | ast_utf8_validator_destroy (struct ast_utf8_validator *validator)
Destroy a UTF-8 validator.
enum ast_utf8_validation_result | ast_utf8_validator_feed (struct ast_utf8_validator *validator, const char *data)
Feed a zero-terminated string into the UTF-8 validator.
enum ast_utf8_validation_result | ast_utf8_validator_feedn (struct ast_utf8_validator *validator, const char *data, size_t size)
Feed a string into the UTF-8 validator.
int | ast_utf8_validator_new (struct ast_utf8_validator **validator)
Create a new UTF-8 validator.
void | ast_utf8_validator_reset (struct ast_utf8_validator *validator)
Reset the state of a UTF-8 validator.
enum ast_utf8_validation_result | ast_utf8_validator_state (struct ast_utf8_validator *validator)
Get the current UTF-8 validator state.
Definition in file utf8.h.
enum ast_utf8_replace_result
Definition at line 70 of file utf8.h.
enum ast_utf8_validation_result
Definition at line 123 of file utf8.h.
void ast_utf8_copy_string(char *dst, const char *src, size_t size)
Copy a string safely ensuring valid UTF-8.
This is similar to ast_copy_string, but it will only copy valid UTF-8 sequences from the source string into the destination buffer. If an invalid UTF-8 sequence is encountered, or the available space in the destination buffer is exhausted in the middle of an otherwise valid UTF-8 sequence, the destination buffer will be truncated to ensure that it only contains valid UTF-8.
Parameters
dst | The destination buffer
src | The source string
size | The size of the destination buffer
Definition at line 133 of file utf8.c.
References ast_assert, decode(), UTF8_ACCEPT, and UTF8_REJECT.
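The truncation rule described for ast_utf8_copy_string can be illustrated with a small standalone sketch. This is not the Asterisk implementation, which according to the references above is built on decode() and the UTF8_ACCEPT and UTF8_REJECT states. The names sketch_utf8_copy and utf8_seq_len are hypothetical, and the sketch assumes size is at least 1 so that the terminator always fits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length (1-4) of the well-formed UTF-8 sequence at s,
 * or 0 if the bytes are malformed or fewer than needed remain in avail. */
static size_t utf8_seq_len(const unsigned char *s, size_t avail)
{
    size_t len, i;
    unsigned char lo = 0x80, hi = 0xBF; /* allowed range for the second byte */

    if (avail == 0) {
        return 0;
    }
    if (s[0] < 0x80) {
        return 1;
    } else if (s[0] >= 0xC2 && s[0] <= 0xDF) {
        len = 2;
    } else if (s[0] >= 0xE0 && s[0] <= 0xEF) {
        len = 3;
        if (s[0] == 0xE0) lo = 0xA0; /* reject overlong forms */
        if (s[0] == 0xED) hi = 0x9F; /* reject UTF-16 surrogates */
    } else if (s[0] >= 0xF0 && s[0] <= 0xF4) {
        len = 4;
        if (s[0] == 0xF0) lo = 0x90; /* reject overlong forms */
        if (s[0] == 0xF4) hi = 0x8F; /* reject code points above U+10FFFF */
    } else {
        return 0;
    }
    if (avail < len || s[1] < lo || s[1] > hi) {
        return 0;
    }
    for (i = 2; i < len; i++) {
        if (s[i] < 0x80 || s[i] > 0xBF) {
            return 0;
        }
    }
    return len;
}

/* Hypothetical sketch: copy whole UTF-8 sequences only, always terminate. */
static void sketch_utf8_copy(char *dst, const char *src, size_t size)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t remaining, out = 0;

    assert(dst != NULL && src != NULL && size > 0);
    remaining = strlen(src);

    while (remaining > 0) {
        size_t n = utf8_seq_len(s, remaining);
        /* Stop on an invalid sequence, or when the whole sequence plus
         * the terminating zero byte would not fit. */
        if (n == 0 || out + n >= size) {
            break;
        }
        memcpy(dst + out, s, n);
        out += n;
        s += n;
        remaining -= n;
    }
    dst[out] = '\0';
}
```

With a 4-byte destination and the source "a" followed by two 2-byte characters, the sketch keeps "a" and the first 2-byte character and drops the second one whole, rather than leaving half a sequence in the buffer.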
int ast_utf8_init(void)
Register UTF-8 tests.
Does nothing unless TEST_FRAMEWORK is defined.
Return values
0 | Always
Definition at line 919 of file utf8.c.
Referenced by asterisk_daemon().
int ast_utf8_is_valid(const char *str)
Check if a zero-terminated string is valid UTF-8.
Parameters
str | The zero-terminated string to check
Return values
0 | if the string is not valid UTF-8
Non-zero | if the string is valid UTF-8
Definition at line 110 of file utf8.c.
References decode(), and UTF8_ACCEPT.
int ast_utf8_is_validn(const char *str, size_t size)
Check if the first size bytes of a string are valid UTF-8.
Similar to ast_utf8_is_valid(), but checks only the first size bytes, or stops at the first zero byte, whichever comes first.
Parameters
str | The string to check
size | The number of bytes to evaluate
Return values
0 | if the string is not valid UTF-8
Non-zero | if the string is valid UTF-8
Definition at line 121 of file utf8.c.
References decode(), and UTF8_ACCEPT.
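The difference between the two checks can be sketched in standalone C. The names sketch_utf8_is_valid, sketch_utf8_is_validn, and utf8_seq_len are hypothetical, and the code is an illustration rather than the Asterisk implementation. One assumption is labeled in the code: a multi-byte sequence that is cut off by the size limit is treated as not valid.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: length (1-4) of the well-formed UTF-8 sequence at s,
 * or 0 if the bytes are malformed or fewer than needed remain in avail. */
static size_t utf8_seq_len(const unsigned char *s, size_t avail)
{
    size_t len, i;
    unsigned char lo = 0x80, hi = 0xBF; /* allowed range for the second byte */

    if (avail == 0) {
        return 0;
    }
    if (s[0] < 0x80) {
        return 1;
    } else if (s[0] >= 0xC2 && s[0] <= 0xDF) {
        len = 2;
    } else if (s[0] >= 0xE0 && s[0] <= 0xEF) {
        len = 3;
        if (s[0] == 0xE0) lo = 0xA0; /* reject overlong forms */
        if (s[0] == 0xED) hi = 0x9F; /* reject UTF-16 surrogates */
    } else if (s[0] >= 0xF0 && s[0] <= 0xF4) {
        len = 4;
        if (s[0] == 0xF0) lo = 0x90; /* reject overlong forms */
        if (s[0] == 0xF4) hi = 0x8F; /* reject code points above U+10FFFF */
    } else {
        return 0;
    }
    if (avail < len || s[1] < lo || s[1] > hi) {
        return 0;
    }
    for (i = 2; i < len; i++) {
        if (s[i] < 0x80 || s[i] > 0xBF) {
            return 0;
        }
    }
    return len;
}

/* Hypothetical sketch: validate at most size bytes, stopping at a zero byte. */
static int sketch_utf8_is_validn(const char *str, size_t size)
{
    const unsigned char *s = (const unsigned char *)str;
    size_t len = 0, i = 0;

    assert(str != NULL);

    /* Stop at the first zero byte or after size bytes, whichever comes first. */
    while (len < size && s[len] != '\0') {
        len++;
    }
    while (i < len) {
        /* Assumption: a sequence cut off by the limit counts as not valid. */
        size_t n = utf8_seq_len(s + i, len - i);
        if (n == 0) {
            return 0;
        }
        i += n;
    }
    return 1;
}

/* Hypothetical sketch: the zero-terminated variant is the bounded check
 * with the largest possible limit, so only the zero byte ends the scan. */
static int sketch_utf8_is_valid(const char *str)
{
    return sketch_utf8_is_validn(str, (size_t)-1);
}
```

In this sketch, a string ending in a 2-byte character passes when the limit covers both bytes and fails when the limit falls between them.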
enum ast_utf8_replace_result ast_utf8_replace_invalid_chars(char *dst, size_t *dst_size, const char *src, size_t src_len)
Copy a string safely replacing any invalid UTF-8 sequences.
This is similar to ast_copy_string, but it copies only valid UTF-8 sequences from the source string into the destination buffer. An invalid sequence is replaced with \uFFFD (U+FFFD), the valid UTF-8 sequence that represents an unknown, unrecognized, or unrepresentable character. Because \uFFFD encodes to 3 bytes in UTF-8 (0xEF 0xBF 0xBD), the destination buffer needs to be larger than the source string when the source contains invalid sequences. To find the required size, pass NULL as the destination buffer pointer, then call the function again with a buffer of that size.
Parameters
dst | Pointer to the destination buffer. If NULL, dst_size will be set to the size of the buffer required to fully process the source string.
dst_size | A pointer to the size of the dst buffer
src | The source string
src_len | The number of bytes to copy
Definition at line 173 of file utf8.c.
References AST_UTF8_INVALID, AST_UTF8_REPLACE_INVALID, AST_UTF8_REPLACE_OVERRUN, AST_UTF8_REPLACE_VALID, decode(), REPL_SEQ, REPL_SEQ_LEN, UTF8_ACCEPT, and UTF8_REJECT.
Referenced by ast_channel_publish_varset(), and set_id_from_hdr().
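The "call once with NULL to size the buffer, then call again" pattern described for ast_utf8_replace_invalid_chars can be sketched as standalone C. Everything here is hypothetical: sketch_utf8_replace mirrors only the parameter list, returns 0 or -1 instead of the real enum, assumes the reported size includes the terminating zero byte, and replaces each offending byte individually, which may differ from how Asterisk groups invalid bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: length (1-4) of the well-formed UTF-8 sequence at s,
 * or 0 if the bytes are malformed or fewer than needed remain in avail. */
static size_t utf8_seq_len(const unsigned char *s, size_t avail)
{
    size_t len, i;
    unsigned char lo = 0x80, hi = 0xBF; /* allowed range for the second byte */

    if (avail == 0) return 0;
    if (s[0] < 0x80) return 1;
    if (s[0] >= 0xC2 && s[0] <= 0xDF) {
        len = 2;
    } else if (s[0] >= 0xE0 && s[0] <= 0xEF) {
        len = 3;
        if (s[0] == 0xE0) lo = 0xA0; /* reject overlong forms */
        if (s[0] == 0xED) hi = 0x9F; /* reject UTF-16 surrogates */
    } else if (s[0] >= 0xF0 && s[0] <= 0xF4) {
        len = 4;
        if (s[0] == 0xF0) lo = 0x90; /* reject overlong forms */
        if (s[0] == 0xF4) hi = 0x8F; /* reject code points above U+10FFFF */
    } else {
        return 0;
    }
    if (avail < len || s[1] < lo || s[1] > hi) return 0;
    for (i = 2; i < len; i++) {
        if (s[i] < 0x80 || s[i] > 0xBF) return 0;
    }
    return len;
}

/* Hypothetical sketch. Returns 0 on success, -1 if dst was too small.
 * On return *dst_size holds the bytes required, terminator included
 * (an assumption of this sketch). */
static int sketch_utf8_replace(char *dst, size_t *dst_size,
                               const char *src, size_t src_len)
{
    static const char repl[] = "\xEF\xBF\xBD"; /* U+FFFD in UTF-8 */
    const unsigned char *s = (const unsigned char *)src;
    size_t cap, needed = 0, written = 0, i = 0;
    int overrun;

    assert(dst_size != NULL && src != NULL);
    cap = dst ? *dst_size : 0;
    overrun = (dst != NULL && cap == 0);

    while (i < src_len && s[i] != '\0') {
        size_t n = utf8_seq_len(s + i, src_len - i);
        const char *chunk = n ? (const char *)(s + i) : repl;
        size_t chunk_len = n ? n : 3;

        if (dst && !overrun) {
            if (written + chunk_len < cap) {
                memcpy(dst + written, chunk, chunk_len);
                written += chunk_len;
            } else {
                overrun = 1; /* stop writing, keep counting */
            }
        }
        needed += chunk_len;
        i += n ? n : 1; /* skip one byte per replacement */
    }
    if (dst && cap > 0) {
        dst[written] = '\0';
    }
    *dst_size = needed + 1;
    return overrun ? -1 : 0;
}

/* Two-call usage: measure with dst == NULL, allocate, then convert. */
static char *sketch_sanitize(const char *src)
{
    size_t need = 0;
    char *out;

    sketch_utf8_replace(NULL, &need, src, strlen(src));
    out = malloc(need);
    if (out != NULL) {
        sketch_utf8_replace(out, &need, src, strlen(src));
    }
    return out;
}
```

The sizing pass costs a second scan of the source, but it avoids guessing a worst-case buffer of three times the source length.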
void ast_utf8_validator_destroy(struct ast_utf8_validator *validator)
Destroy a UTF-8 validator.
enum ast_utf8_validation_result ast_utf8_validator_feed(struct ast_utf8_validator *validator, const char *data)
Feed a zero-terminated string into the UTF-8 validator.
Parameters
validator | The validator instance
data | The zero-terminated string to feed into the validator
Definition at line 337 of file utf8.c.
References ast_utf8_validator_state(), decode(), and ast_utf8_validator::state.
enum ast_utf8_validation_result ast_utf8_validator_feedn(struct ast_utf8_validator *validator, const char *data, size_t size)
Feed a string into the UTF-8 validator.
Similar to ast_utf8_validator_feed(), but stops feeding data when a zero byte is encountered or size bytes have been read, whichever comes first.
Parameters
validator | The validator instance
data | The string to feed into the validator
size | The number of bytes to feed into the validator
Definition at line 347 of file utf8.c.
References ast_utf8_validator_state(), decode(), and ast_utf8_validator::state.
int ast_utf8_validator_new(struct ast_utf8_validator **validator)
Create a new UTF-8 validator.
Parameters
[out] validator | The validator instance
Return values
0 | on success
-1 | on failure
Definition at line 311 of file utf8.c.
References ast_malloc, tmp(), and UTF8_ACCEPT.
void ast_utf8_validator_reset(struct ast_utf8_validator *validator)
Reset the state of a UTF-8 validator.
Resets the provided UTF-8 validator to its initial state so that it can be reused.
Parameters
validator | The validator instance to reset
Definition at line 358 of file utf8.c.
References ast_utf8_validator::state, and UTF8_ACCEPT.
enum ast_utf8_validation_result ast_utf8_validator_state(struct ast_utf8_validator *validator)
Get the current UTF-8 validator state.
Parameters
validator | The validator instance
Definition at line 324 of file utf8.c.
References AST_UTF8_INVALID, AST_UTF8_UNKNOWN, AST_UTF8_VALID, ast_utf8_validator::state, UTF8_ACCEPT, and UTF8_REJECT.
Referenced by ast_utf8_validator_feed(), and ast_utf8_validator_feedn().
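The point of a validator object with feed, state, and reset operations is that a multi-byte character may be split across two chunks of input. The standalone sketch below illustrates that idea only. All names (sketch_validator, sketch_feed, and so on) are hypothetical, and it rests on two assumptions that this page does not spell out: that the "unknown" result means the input so far ends in the middle of a sequence, and that an invalid result persists until the validator is reset.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_result { SKETCH_VALID, SKETCH_INVALID, SKETCH_UNKNOWN };

/* Hypothetical validator state carried between feed calls. */
struct sketch_validator {
    unsigned remaining;   /* continuation bytes still expected */
    unsigned char lo, hi; /* allowed range for the next continuation byte */
    int rejected;         /* set once invalid input has been seen */
};

/* Return the validator to its initial state so it can be reused. */
static void sketch_reset(struct sketch_validator *v)
{
    v->remaining = 0;
    v->lo = 0x80;
    v->hi = 0xBF;
    v->rejected = 0;
}

/* Report the state without consuming any input. */
static enum sketch_result sketch_state(const struct sketch_validator *v)
{
    if (v->rejected) {
        return SKETCH_INVALID;
    }
    /* Assumption: mid-sequence is reported as "unknown". */
    return v->remaining ? SKETCH_UNKNOWN : SKETCH_VALID;
}

/* Feed a zero-terminated chunk; state carries over to the next call. */
static enum sketch_result sketch_feed(struct sketch_validator *v,
                                      const char *data)
{
    const unsigned char *p = (const unsigned char *)data;

    assert(v != NULL && data != NULL);

    for (; *p != '\0' && !v->rejected; p++) {
        unsigned char b = *p;

        if (v->remaining > 0) {
            if (b < v->lo || b > v->hi) {
                v->rejected = 1;
                break;
            }
            v->remaining--;
            v->lo = 0x80; /* later continuation bytes use the full range */
            v->hi = 0xBF;
        } else if (b < 0x80) {
            /* ASCII: nothing to track */
        } else if (b >= 0xC2 && b <= 0xDF) {
            v->remaining = 1;
        } else if (b >= 0xE0 && b <= 0xEF) {
            v->remaining = 2;
            if (b == 0xE0) v->lo = 0xA0; /* reject overlong forms */
            if (b == 0xED) v->hi = 0x9F; /* reject UTF-16 surrogates */
        } else if (b >= 0xF0 && b <= 0xF4) {
            v->remaining = 3;
            if (b == 0xF0) v->lo = 0x90; /* reject overlong forms */
            if (b == 0xF4) v->hi = 0x8F; /* reject code points above U+10FFFF */
        } else {
            v->rejected = 1;
        }
    }
    return sketch_state(v);
}
```

Feeding "caf" plus the first byte of a 2-byte character leaves this sketch in the unknown state, and feeding the second byte in a later call brings it back to valid. A one-shot check of each chunk on its own would reject both halves.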