.Dd January 18, 2024
.Dt U8SET 3
.Os
.Sh NAME
.Nm rtou8 ,
.Nm u8set
.Nd encode a rune to UTF-8
.Sh LIBRARY
.Lb librune
.Sh SYNOPSIS
.In utf8.h
.Ft int
.Fn rtou8 "char8_t *s" "rune ch" "size_t n"
.Ft size_t
.Fn u8set "char8_t *s" "rune ch" "size_t n"
.Sh DESCRIPTION
The
.Fn rtou8
function writes the rune
.Fa ch
to the UTF-8 encoded buffer
.Fa s
of length
.Fa n ,
returning the number of bytes required to UTF-8 encode
.Fa ch .
.Pp
The
.Fn u8set
function fills the buffer
.Fa s
of length
.Fa n
with the constant rune
.Fa ch .
It is similar to the
.Fn rtou8
function,
but writes more than one copy of the rune if the given buffer has the capacity.
Unlike
.Fn rtou8 ,
this function returns the number of bytes that were successfully written
to
.Fa s .
If
.Fa n
is a multiple of
.Fn u8wdth ch ,
the return value is equal to
.Fa n .
Otherwise
.Fa s
is filled with as many complete encodings of
.Fa ch
as will fit,
and a count smaller than
.Fa n
is returned.
.Pp
Both of these functions assume that
.Fa ch
is a valid Unicode code point.
.Sh RETURN VALUES
The
.Fn rtou8
function returns the number of bytes required to write
.Fa ch
to the buffer
.Fa s .
.Pp
The
.Fn u8set
function returns the number of bytes written to the buffer
.Fa s .
.Sh EXAMPLES
The following calls to
.Fn rtou8
and
.Fn u8set
fill a buffer with box-drawing characters to create the top row of a box.
.Bd -literal -offset indent
#define SCREEN_WIDTH 80
/* All three box-drawing runes used here have the same encoded width. */
int bdr_wdth = u8wdth(U'─');
size_t bufsiz = SCREEN_WIDTH * bdr_wdth;
char8_t *s = malloc(bufsiz);
if (s == NULL)
	err(1, "malloc");
/* Left corner, horizontal bar, right corner. */
rtou8(s, U'┌', bdr_wdth);
u8set(s + bdr_wdth, U'─', bufsiz - bdr_wdth * 2);
rtou8(s + bufsiz - bdr_wdth, U'┐', bdr_wdth);
.Ed
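.Pp
The following self-contained sketch illustrates the return values described in
.Sx DESCRIPTION .
It is not the library's implementation:
the names
.Fn sketch_u8wdth ,
.Fn sketch_rtou8 ,
and
.Fn sketch_u8set
are hypothetical,
.Vt "unsigned char"
and
.Vt uint32_t
stand in for
.Vt char8_t
and
.Vt rune ,
and it is an assumption that nothing is written when the buffer is too small.
.Bd -literal -offset indent
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the librune code. */

/* Bytes needed to UTF-8 encode ch, per the RFC 3629 ranges. */
int
sketch_u8wdth(uint32_t ch)
{
	return ch < 0x80 ? 1 : ch < 0x800 ? 2 : ch < 0x10000 ? 3 : 4;
}

/*
 * Encode ch into s when it fits in n bytes (assumption: s is left
 * untouched otherwise).  The required width is returned either way.
 */
int
sketch_rtou8(unsigned char *s, uint32_t ch, size_t n)
{
	int w = sketch_u8wdth(ch);

	if ((size_t)w > n)
		return w;
	switch (w) {
	case 1:
		s[0] = ch;
		break;
	case 2:
		s[0] = 0xC0 | (ch >> 6);
		s[1] = 0x80 | (ch & 0x3F);
		break;
	case 3:
		s[0] = 0xE0 | (ch >> 12);
		s[1] = 0x80 | ((ch >> 6) & 0x3F);
		s[2] = 0x80 | (ch & 0x3F);
		break;
	default:
		s[0] = 0xF0 | (ch >> 18);
		s[1] = 0x80 | ((ch >> 12) & 0x3F);
		s[2] = 0x80 | ((ch >> 6) & 0x3F);
		s[3] = 0x80 | (ch & 0x3F);
	}
	return w;
}

/* Fill s with whole copies of ch and return the bytes written. */
size_t
sketch_u8set(unsigned char *s, uint32_t ch, size_t n)
{
	unsigned char buf[4];
	size_t w = (size_t)sketch_rtou8(buf, ch, sizeof(buf));
	size_t i;

	/* Stop as soon as another complete copy would overrun n. */
	for (i = 0; i + w <= n; i += w)
		memcpy(s + i, buf, w);
	return i;
}
.Ed
.Pp
With these stand-ins,
filling an 8-byte buffer with U+2500 (3 bytes wide) writes two copies and
returns 6,
leaving the final 2 bytes untouched.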
.Sh SEE ALSO
.Xr u8tor 3 ,
.Xr u8tor_uc 3 ,
.Xr unicode 7 ,
.Xr utf\-8 7
.Sh STANDARDS
.Rs
.%A F. Yergeau
.%D November 2003
.%R RFC 3629
.%T UTF-8, a transformation format of ISO 10646
.Re
.Sh AUTHORS
.An Thomas Voss Aq Mt mail@thomasvoss.com