.Dd March 10, 2024
.Dt U8LEN 3
.Os
.Sh NAME
.Nm u8len
.Nd count Unicode codepoints
.Sh LIBRARY
.Lb mlib
.Sh SYNOPSIS
.In mbstring.h
.Ft size_t
.Fn u8len "const char8_t *s" "size_t n"
.Sh DESCRIPTION
The
.Fn u8len
function returns the number of UTF-8 encoded Unicode codepoints in the
buffer
.Fa s
of length
.Fa n
bytes.
.Pp
Invalid bytes are interpreted as having a length of 1 byte.
.Sh RETURN VALUES
The
.Fn u8len
function returns the number of codepoints in the buffer
.Fa s .
.Sh EXAMPLES
The following call to
.Fn u8len
will return 17 while the call to
.Fn strlen
will return 22, because
.Va sv
contains multibyte characters.
.Bd -literal -offset indent
struct u8view sv = U8V(u8\(dq„Der Große Duden“\(dq);
size_t blen = strlen((char *)sv.p);
size_t cplen = u8len(U8_ARGS(sv));
.Ed
.Sh SEE ALSO
.Xr u8glen 3 ,
.Xr U8V 3 ,
.Xr unicode 7 ,
.Xr utf\-8 7
.Sh STANDARDS
.Rs
.%A F. Yergeau
.%D November 2003
.%R RFC 3629
.%T UTF-8, a transformation format of ISO 10646
.Re
.Sh AUTHORS
.An Thomas Voss Aq Mt mail@thomasvoss.com
.Sh CAVEATS
The return value of
.Fn u8len
does not necessarily represent the number of human-perceived characters
in the given buffer;
multiple codepoints may combine to form one human-perceived character
that spans a single column.
To count user-perceived characters
.Pq also known as graphemes ,
you may want to use the
.Xr u8glen 3
function.
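.Pp
As a rough illustration of the counting rule given in DESCRIPTION, the
following C sketch counts codepoints by stepping over UTF-8 sequences.
It is not the mlib implementation.
The names seq_len and count_codepoints_sketch are hypothetical, unsigned
char stands in for char8_t, and the sketch assumes that each invalid or
truncated byte is counted as one codepoint of length 1.
It does not reject overlong encodings, surrogates, or values above
U+10FFFF.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of mlib): sequence length implied by a
 * lead byte, or 1 for a byte that cannot start a sequence.
 */
static size_t
seq_len(unsigned char b)
{
	if (b < 0x80)
		return 1;               /* ASCII */
	if ((b & 0xE0) == 0xC0)
		return 2;               /* 110xxxxx */
	if ((b & 0xF0) == 0xE0)
		return 3;               /* 1110xxxx */
	if ((b & 0xF8) == 0xF0)
		return 4;               /* 11110xxx */
	return 1;                       /* stray continuation or invalid byte */
}

/*
 * Hypothetical sketch: count codepoints in the n-byte buffer s.
 * Assumption: a malformed or truncated sequence advances by one byte
 * and is counted as one codepoint.
 */
size_t
count_codepoints_sketch(const unsigned char *s, size_t n)
{
	size_t count = 0;
	size_t i = 0;

	while (i < n) {
		size_t len = seq_len(s[i]);

		if (len > n - i) {
			len = 1;        /* sequence runs past the buffer */
		} else {
			/* Every trailing byte must look like 10xxxxxx. */
			for (size_t k = 1; k < len; k++) {
				if ((s[i + k] & 0xC0) != 0x80) {
					len = 1;
					break;
				}
			}
		}

		i += len;
		count++;
	}
	return count;
}
```

Passing the length explicitly, as
.Fn u8len
does, lets the sketch handle buffers that are not NUL-terminated or that
contain embedded NUL bytes.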