.Dd 4 May 2024 .Dt U8LEN 3 .Os .Sh NAME .Nm u8len .Nd count Unicode codepoints .Sh LIBRARY .Lb mlib .Sh SYNOPSIS .In mbstring.h .Ft size_t .Fn u8len "u8view_t sv" .Sh DESCRIPTION The .Fn u8len function returns the number of UTF-8 encoded Unicode codepoints in the string view .Fa sv . .Pp Invalid bytes are interpreted as having a length of 1 byte. .Sh RETURN VALUES The .Fn u8len function returns the number of codepoints in the string view .Fa sv . .Sh EXAMPLES The following call to .Fn u8len will return 17 while the call to .Fn strlen will return 22 as a result of use of multibyte-characters in .Fa sv . .Bd -literal -offset indent size_t n; u8view_t sv = U8(\(dq„Der Große Duden“\(dq); n = u8len(sv); /* 17 */ n = strlen((char *)sv.p); /* 22 */ .Ed .Sh SEE ALSO .Xr U8 3 , .Xr u8gcnt 3 , .Xr u8view 3 , .Xr unicode 7 , .Xr utf\-8 7 .Sh STANDARDS .Rs .%A F. Yergeau .%D November 2003 .%R RFC 3629 .%T UTF-8, a transformation format of ISO 10646 .Re .Sh AUTHORS .An Thomas Voss Aq Mt mail@thomasvoss.com .Sh CAVEATS The return value of .Fn u8len does not necessarily represent the number of human-preceived characters in the given string view; multiple codepoints may combine to form one human-preceived character. These human-preceived characters may even take up multiple columns in a monospaced-environment such as in a terminal emulator. To count user-preceived characters .Pq also known as graphemes , you may want to use the .Xr u8gcnt 3 function.