From 4f93f935dc7a981ca073a322425c3f5929ffb644 Mon Sep 17 00:00:00 2001 From: Thomas Voss Date: Sun, 21 Jan 2024 03:03:58 +0100 Subject: Support line- & column-based match locations --- vendor/librune/man/u8len.3 | 70 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 70 insertions(+) create mode 100644 vendor/librune/man/u8len.3 (limited to 'vendor/librune/man/u8len.3') diff --git a/vendor/librune/man/u8len.3 b/vendor/librune/man/u8len.3 new file mode 100644 index 0000000..4e58e14 --- /dev/null +++ b/vendor/librune/man/u8len.3 @@ -0,0 +1,70 @@ +.Dd January 15 2024 +.Dt U8LEN 3 +.Os +.Sh NAME +.Nm u8len +.Nd count Unicode codepoints +.Sh LIBRARY +.Lb librune +.Sh SYNOPSIS +.In utf8.h +.Ft size_t +.Fn u8len "const char8_t *s" "size_t n" +.Sh DESCRIPTION +The +.Fn u8len +function returns the number of UTF-8 encoded Unicode codepoints in the +buffer +.Fa s +of length +.Fa n +bytes. +.Pp +This function assumes that +.Fa s +contains only valid UTF-8. +.Sh RETURN VALUES +The +.Fn u8len +function returns the number of codepoints in the buffer +.Fa s . +.Sh EXAMPLES +The following call to +.Fn u8len +will return 17 while the call to +.Fn strlen +will return 22 as a result of use of multibyte-characters in +.Fa s . +.Bd -literal -offset indent +char8_t s[] = u8\(dq„Der Große Duden“\(dq; +size_t blen, cplen; + +blen = strlen((char *)s); +cplen = u8len(s, sizeof(s) - 1); +.Ed +.Sh SEE ALSO +.Xr u8glen 3 , +.Xr u8wdth 3 , +.Xr unicode 7 , +.Xr utf\-8 7 +.Sh STANDARDS +.Rs +.%A F. Yergeau +.%D November 2003 +.%R RFC 3629 +.%T UTF-8, a transformation format of ISO 10646 +.Re +.Sh AUTHORS +.An Thomas Voss Aq Mt mail@thomasvoss.com +.Sh CAVEATS +The return value of +.Fn u8len +does not necessarily represent the number of human-preceived characters +in the given buffer; +multiple codepoints may combine to form one human-preceived character +that spans a single column. +To count user-preceived codepoints +.Pq also known as graphemes , +you may want to use the +.Xr u8glen 3 +function. -- cgit v1.2.3