From c6853cefe4b0de30d3fb04d7be8a0a78a23d51d3 Mon Sep 17 00:00:00 2001 From: Thomas Voss Date: Thu, 3 Oct 2024 01:24:50 +0200 Subject: Update for Unicode 16 --- data/SentenceBreakProperty | 104 ++++++++++++++++++++++++++++++++++++--------- 1 file changed, 84 insertions(+), 20 deletions(-) (limited to 'data/SentenceBreakProperty') diff --git a/data/SentenceBreakProperty b/data/SentenceBreakProperty index f848af1..ca3689e 100644 --- a/data/SentenceBreakProperty +++ b/data/SentenceBreakProperty @@ -1,8 +1,8 @@ -# SentenceBreakProperty-15.1.0.txt -# Date: 2023-07-28, 23:34:37 GMT -# © 2023 Unicode®, Inc. +# SentenceBreakProperty-16.0.0.txt +# Date: 2024-07-29, 16:27:32 GMT +# © 2024 Unicode®, Inc. # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries. -# For terms of use, see https://www.unicode.org/terms_of_use.html +# For terms of use and license, see https://www.unicode.org/terms_of_use.html # # Unicode Character Database # For documentation, see https://www.unicode.org/reports/tr44/ @@ -55,7 +55,7 @@ 0825..0827 ; Extend # Mn [3] SAMARITAN VOWEL SIGN SHORT A..SAMARITAN VOWEL SIGN U 0829..082D ; Extend # Mn [5] SAMARITAN VOWEL SIGN LONG I..SAMARITAN MARK NEQUDAA 0859..085B ; Extend # Mn [3] MANDAIC AFFRICATION MARK..MANDAIC GEMINATION MARK -0898..089F ; Extend # Mn [8] ARABIC SMALL HIGH WORD AL-JUZ..ARABIC HALF MADDA OVER MADDA +0897..089F ; Extend # Mn [9] ARABIC PEPET..ARABIC HALF MADDA OVER MADDA 08CA..08E1 ; Extend # Mn [24] ARABIC SMALL HIGH FARSI YEH..ARABIC SMALL HIGH SIGN SAFHA 08E3..0902 ; Extend # Mn [32] ARABIC TURNED DAMMA BELOW..DEVANAGARI SIGN ANUSVARA 0903 ; Extend # Mc DEVANAGARI SIGN VISARGA @@ -371,8 +371,9 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT 10A3F ; Extend # Mn KHAROSHTHI VIRAMA 10AE5..10AE6 ; Extend # Mn [2] MANICHAEAN ABBREVIATION MARK ABOVE..MANICHAEAN ABBREVIATION MARK BELOW 10D24..10D27 ; Extend # Mn [4] HANIFI ROHINGYA SIGN HARBAHAY..HANIFI ROHINGYA SIGN TASSI +10D69..10D6D ; Extend # Mn [5] GARAY VOWEL SIGN E..GARAY CONSONANT NASALIZATION MARK 10EAB..10EAC ; Extend # Mn [2] YEZIDI COMBINING HAMZA MARK..YEZIDI COMBINING MADDA MARK -10EFD..10EFF ; Extend # Mn [3] ARABIC SMALL LOW WORD SAKTA..ARABIC SMALL LOW WORD MADDA +10EFC..10EFF ; Extend # Mn [4] ARABIC COMBINING ALEF OVERLAY..ARABIC SMALL LOW WORD MADDA 10F46..10F50 ; Extend # Mn [11] SOGDIAN COMBINING DOT BELOW..SOGDIAN COMBINING STROKE BELOW 10F82..10F85 ; Extend # Mn [4] OLD UYGHUR COMBINING DOT ABOVE..OLD UYGHUR COMBINING TWO DOTS BELOW 11000 ; Extend # Mc BRAHMI SIGN CANDRABINDU @@ -425,6 +426,17 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT 11362..11363 ; Extend # Mc [2] GRANTHA VOWEL SIGN VOCALIC L..GRANTHA VOWEL SIGN VOCALIC LL 11366..1136C ; Extend # Mn [7] COMBINING GRANTHA DIGIT ZERO..COMBINING GRANTHA DIGIT SIX 11370..11374 ; Extend # Mn [5] COMBINING GRANTHA LETTER A..COMBINING GRANTHA LETTER PA +113B8..113BA ; Extend # Mc [3] TULU-TIGALARI VOWEL SIGN AA..TULU-TIGALARI VOWEL SIGN II +113BB..113C0 ; Extend # Mn [6] TULU-TIGALARI VOWEL SIGN U..TULU-TIGALARI VOWEL SIGN VOCALIC LL +113C2 ; Extend # Mc TULU-TIGALARI VOWEL SIGN EE +113C5 ; Extend # Mc TULU-TIGALARI VOWEL SIGN AI +113C7..113CA ; Extend # Mc [4] TULU-TIGALARI VOWEL SIGN OO..TULU-TIGALARI SIGN CANDRA ANUNASIKA +113CC..113CD ; Extend # Mc [2] TULU-TIGALARI SIGN ANUSVARA..TULU-TIGALARI SIGN VISARGA +113CE ; Extend # Mn TULU-TIGALARI SIGN VIRAMA +113CF ; Extend # Mc TULU-TIGALARI SIGN LOOPED VIRAMA +113D0 ; Extend # Mn TULU-TIGALARI CONJOINER +113D2 ; Extend # Mn TULU-TIGALARI GEMINATION MARK +113E1..113E2 ; Extend # Mn [2] TULU-TIGALARI VEDIC TONE SVARITA..TULU-TIGALARI VEDIC TONE ANUDATTA 11435..11437 ; Extend # Mc [3] NEWA VOWEL SIGN AA..NEWA VOWEL SIGN II 11438..1143F ; Extend # Mn [8] NEWA VOWEL SIGN U..NEWA VOWEL SIGN AI 11440..11441 ; Extend # Mc [2] NEWA VOWEL SIGN O..NEWA VOWEL SIGN AU @@ -460,7 +472,9 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT 116B0..116B5 ; Extend # Mn [6] TAKRI VOWEL SIGN U..TAKRI VOWEL SIGN AU 116B6 ; Extend # Mc TAKRI SIGN VIRAMA 116B7 ; Extend # Mn TAKRI SIGN NUKTA -1171D..1171F ; Extend # Mn [3] AHOM CONSONANT SIGN MEDIAL LA..AHOM CONSONANT SIGN MEDIAL LIGATING RA +1171D ; Extend # Mn AHOM CONSONANT SIGN MEDIAL LA +1171E ; Extend # Mc AHOM CONSONANT SIGN MEDIAL RA +1171F ; Extend # Mn AHOM CONSONANT SIGN MEDIAL LIGATING RA 11720..11721 ; Extend # Mc [2] AHOM VOWEL SIGN A..AHOM VOWEL SIGN AA 11722..11725 ; Extend # Mn [4] AHOM VOWEL SIGN I..AHOM VOWEL SIGN UU 11726 ; Extend # Mc AHOM VOWEL SIGN E @@ -527,8 +541,12 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT 11F40 ; Extend # Mn KAWI VOWEL SIGN EU 11F41 ; Extend # Mc KAWI SIGN KILLER 11F42 ; Extend # Mn KAWI CONJOINER +11F5A ; Extend # Mn KAWI SIGN NUKTA 13440 ; Extend # Mn EGYPTIAN HIEROGLYPH MIRROR HORIZONTALLY 13447..13455 ; Extend # Mn [15] EGYPTIAN HIEROGLYPH MODIFIER DAMAGED AT TOP START..EGYPTIAN HIEROGLYPH MODIFIER DAMAGED +1611E..16129 ; Extend # Mn [12] GURUNG KHEMA VOWEL SIGN AA..GURUNG KHEMA VOWEL LENGTH MARK +1612A..1612C ; Extend # Mc [3] GURUNG KHEMA CONSONANT SIGN MEDIAL YA..GURUNG KHEMA CONSONANT SIGN MEDIAL HA +1612D..1612F ; Extend # Mn [3] GURUNG KHEMA SIGN ANUSVARA..GURUNG KHEMA SIGN THOLHOMA 16AF0..16AF4 ; Extend # Mn [5] BASSA VAH COMBINING HIGH TONE..BASSA VAH COMBINING HIGH-LOW TONE 16B30..16B36 ; Extend # Mn [7] PAHAWH HMONG MARK CIM TUB..PAHAWH HMONG MARK CIM TAUM 16F4F ; Extend # Mn MIAO SIGN CONSONANT MODIFIER BAR @@ -562,12 +580,13 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT 1E2AE ; Extend # Mn TOTO SIGN RISING TONE 1E2EC..1E2EF ; Extend # Mn [4] WANCHO TONE TUP..WANCHO TONE KOINI 1E4EC..1E4EF ; Extend # Mn [4] NAG MUNDARI SIGN MUHOR..NAG MUNDARI SIGN SUTUH +1E5EE..1E5EF ; Extend # Mn [2] OL ONAL SIGN MU..OL ONAL SIGN IKIR 1E8D0..1E8D6 ; Extend # Mn [7] MENDE KIKAKUI COMBINING NUMBER TEENS..MENDE KIKAKUI COMBINING NUMBER MILLIONS 1E944..1E94A ; Extend # Mn [7] ADLAM ALIF LENGTHENER..ADLAM NUKTA E0020..E007F ; Extend # Cf [96] TAG SPACE..CANCEL TAG E0100..E01EF ; Extend # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256 -# Total code points: 2550 +# Total code points: 2601 # ================================================ @@ -892,6 +911,7 @@ E0001 ; Format # Cf LANGUAGE TAG 10FC ; Lower # Lm MODIFIER LETTER GEORGIAN NAR 13F8..13FD ; Lower # L& [6] CHEROKEE SMALL LETTER YE..CHEROKEE SMALL LETTER MV 1C80..1C88 ; Lower # L& [9] CYRILLIC SMALL LETTER ROUNDED VE..CYRILLIC SMALL LETTER UNBLENDED UK +1C8A ; Lower # L& CYRILLIC SMALL LETTER TJE 1D00..1D2B ; Lower # L& [44] LATIN LETTER SMALL CAPITAL A..CYRILLIC LETTER SMALL CAPITAL EL 1D2C..1D6A ; Lower # Lm [63] MODIFIER LETTER CAPITAL A..GREEK SUBSCRIPT SMALL LETTER CHI 1D6B..1D77 ; Lower # L& [13] LATIN SMALL LETTER UE..LATIN SMALL LETTER TURNED G @@ -1233,11 +1253,13 @@ A7C1 ; Lower # L& LATIN SMALL LETTER OLD POLISH O A7C3 ; Lower # L& LATIN SMALL LETTER ANGLICANA W A7C8 ; Lower # L& LATIN SMALL LETTER D WITH SHORT STROKE OVERLAY A7CA ; Lower # L& LATIN SMALL LETTER S WITH SHORT STROKE OVERLAY +A7CD ; Lower # L& LATIN SMALL LETTER S WITH DIAGONAL STROKE A7D1 ; Lower # L& LATIN SMALL LETTER CLOSED INSULAR G A7D3 ; Lower # L& LATIN SMALL LETTER DOUBLE THORN A7D5 ; Lower # L& LATIN SMALL LETTER DOUBLE WYNN A7D7 ; Lower # L& LATIN SMALL LETTER MIDDLE SCOTS S A7D9 ; Lower # L& LATIN SMALL LETTER SIGMOID S +A7DB ; Lower # L& LATIN SMALL LETTER LAMBDA A7F2..A7F4 ; Lower # Lm [3] MODIFIER LETTER CAPITAL C..MODIFIER LETTER CAPITAL Q A7F6 ; Lower # L& LATIN SMALL LETTER REVERSED HALF H A7F8..A7F9 ; Lower # Lm [2] MODIFIER LETTER CAPITAL H WITH STROKE..MODIFIER LETTER SMALL LIGATURE OE @@ -1261,6 +1283,7 @@ FF41..FF5A ; Lower # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN 10787..107B0 ; Lower # Lm [42] MODIFIER LETTER SMALL DZ DIGRAPH..MODIFIER LETTER SMALL V WITH RIGHT HOOK 107B2..107BA ; Lower # Lm [9] MODIFIER LETTER SMALL CAPITAL Y..MODIFIER LETTER SMALL S WITH CURL 10CC0..10CF2 ; Lower # L& [51] OLD HUNGARIAN SMALL LETTER A..OLD HUNGARIAN SMALL LETTER US +10D70..10D85 ; Lower # L& [22] GARAY SMALL LETTER A..GARAY SMALL LETTER OLD NA 118C0..118DF ; Lower # L& [32] WARANG CITI SMALL LETTER NGAA..WARANG CITI SMALL LETTER VIYO 16E60..16E7F ; Lower # L& [32] MEDEFAIDRIN SMALL LETTER M..MEDEFAIDRIN SMALL LETTER Y 1D41A..1D433 ; Lower # L& [26] MATHEMATICAL BOLD SMALL A..MATHEMATICAL BOLD SMALL Z @@ -1297,7 +1320,7 @@ FF41..FF5A ; Lower # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN 1E030..1E06D ; Lower # Lm [62] MODIFIER LETTER CYRILLIC SMALL A..MODIFIER LETTER CYRILLIC SMALL STRAIGHT U WITH STROKE 1E922..1E943 ; Lower # L& [34] ADLAM SMALL LETTER ALIF..ADLAM SMALL LETTER SHA -# Total code points: 2497 +# Total code points: 2522 # ================================================ @@ -1577,6 +1600,7 @@ FF41..FF5A ; Lower # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN 10C7 ; Upper # L& GEORGIAN CAPITAL LETTER YN 10CD ; Upper # L& GEORGIAN CAPITAL LETTER AEN 13A0..13F5 ; Upper # L& [86] CHEROKEE LETTER A..CHEROKEE LETTER MV +1C89 ; Upper # L& CYRILLIC CAPITAL LETTER TJE 1E00 ; Upper # L& LATIN CAPITAL LETTER A WITH RING BELOW 1E02 ; Upper # L& LATIN CAPITAL LETTER B WITH DOT ABOVE 1E04 ; Upper # L& LATIN CAPITAL LETTER B WITH DOT BELOW @@ -1904,9 +1928,12 @@ A7C0 ; Upper # L& LATIN CAPITAL LETTER OLD POLISH O A7C2 ; Upper # L& LATIN CAPITAL LETTER ANGLICANA W A7C4..A7C7 ; Upper # L& [4] LATIN CAPITAL LETTER C WITH PALATAL HOOK..LATIN CAPITAL LETTER D WITH SHORT STROKE OVERLAY A7C9 ; Upper # L& LATIN CAPITAL LETTER S WITH SHORT STROKE OVERLAY +A7CB..A7CC ; Upper # L& [2] LATIN CAPITAL LETTER RAMS HORN..LATIN CAPITAL LETTER S WITH DIAGONAL STROKE A7D0 ; Upper # L& LATIN CAPITAL LETTER CLOSED INSULAR G A7D6 ; Upper # L& LATIN CAPITAL LETTER MIDDLE SCOTS S A7D8 ; Upper # L& LATIN CAPITAL LETTER SIGMOID S +A7DA ; Upper # L& LATIN CAPITAL LETTER LAMBDA +A7DC ; Upper # L& LATIN CAPITAL LETTER LAMBDA WITH STROKE A7F5 ; Upper # L& LATIN CAPITAL LETTER REVERSED HALF H FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LATIN CAPITAL LETTER Z 10400..10427 ; Upper # L& [40] DESERET CAPITAL LETTER LONG I..DESERET CAPITAL LETTER EW @@ -1916,6 +1943,7 @@ FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LAT 1058C..10592 ; Upper # L& [7] VITHKUQI CAPITAL LETTER SE..VITHKUQI CAPITAL LETTER XE 10594..10595 ; Upper # L& [2] VITHKUQI CAPITAL LETTER Y..VITHKUQI CAPITAL LETTER ZE 10C80..10CB2 ; Upper # L& [51] OLD HUNGARIAN CAPITAL LETTER A..OLD HUNGARIAN CAPITAL LETTER US +10D50..10D65 ; Upper # L& [22] GARAY CAPITAL LETTER A..GARAY CAPITAL LETTER OLD NA 118A0..118BF ; Upper # L& [32] WARANG CITI CAPITAL LETTER NGAA..WARANG CITI CAPITAL LETTER VIYO 16E40..16E5F ; Upper # L& [32] MEDEFAIDRIN CAPITAL LETTER M..MEDEFAIDRIN CAPITAL LETTER Y 1D400..1D419 ; Upper # L& [26] MATHEMATICAL BOLD CAPITAL A..MATHEMATICAL BOLD CAPITAL Z @@ -1954,7 +1982,7 @@ FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LAT 1F150..1F169 ; Upper # So [26] NEGATIVE CIRCLED LATIN CAPITAL LETTER A..NEGATIVE CIRCLED LATIN CAPITAL LETTER Z 1F170..1F189 ; Upper # So [26] NEGATIVE SQUARED LATIN CAPITAL LETTER A..NEGATIVE SQUARED LATIN CAPITAL LETTER Z -# Total code points: 1936 +# Total code points: 1963 # ================================================ @@ -2321,6 +2349,7 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL 10450..1049D ; OLetter # Lo [78] SHAVIAN LETTER PEEP..OSMANYA LETTER OO 10500..10527 ; OLetter # Lo [40] ELBASAN LETTER A..ELBASAN LETTER KHE 10530..10563 ; OLetter # Lo [52] CAUCASIAN ALBANIAN LETTER ALT..CAUCASIAN ALBANIAN LETTER KIW +105C0..105F3 ; OLetter # Lo [52] TODHRI LETTER A..TODHRI LETTER OO 10600..10736 ; OLetter # Lo [311] LINEAR A SIGN AB001..LINEAR A SIGN A664 10740..10755 ; OLetter # Lo [22] LINEAR A SIGN A701 A..LINEAR A SIGN A732 JE 10760..10767 ; OLetter # Lo [8] LINEAR A SIGN A800..LINEAR A SIGN A807 @@ -2353,8 +2382,13 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL 10B80..10B91 ; OLetter # Lo [18] PSALTER PAHLAVI LETTER ALEPH..PSALTER PAHLAVI LETTER TAW 10C00..10C48 ; OLetter # Lo [73] OLD TURKIC LETTER ORKHON A..OLD TURKIC LETTER ORKHON BASH 10D00..10D23 ; OLetter # Lo [36] HANIFI ROHINGYA LETTER A..HANIFI ROHINGYA MARK NA KHONNA +10D4A..10D4D ; OLetter # Lo [4] GARAY VOWEL SIGN A..GARAY VOWEL SIGN EE +10D4E ; OLetter # Lm GARAY VOWEL LENGTH MARK +10D4F ; OLetter # Lo GARAY SUKUN +10D6F ; OLetter # Lm GARAY REDUPLICATION MARK 10E80..10EA9 ; OLetter # Lo [42] YEZIDI LETTER ELIF..YEZIDI LETTER ET 10EB0..10EB1 ; OLetter # Lo [2] YEZIDI LETTER LAM WITH DOT ABOVE..YEZIDI LETTER YOT WITH CIRCUMFLEX ABOVE +10EC2..10EC4 ; OLetter # Lo [3] ARABIC LETTER DAL WITH TWO DOTS VERTICALLY BELOW..ARABIC LETTER KAF WITH TWO DOTS VERTICALLY BELOW 10F00..10F1C ; OLetter # Lo [29] OLD SOGDIAN LETTER ALEPH..OLD SOGDIAN LETTER FINAL TAW WITH VERTICAL TAIL 10F27 ; OLetter # Lo OLD SOGDIAN LIGATURE AYIN-DALETH 10F30..10F45 ; OLetter # Lo [22] SOGDIAN LETTER ALEPH..SOGDIAN INDEPENDENT SHIN @@ -2393,6 +2427,13 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL 1133D ; OLetter # Lo GRANTHA SIGN AVAGRAHA 11350 ; OLetter # Lo GRANTHA OM 1135D..11361 ; OLetter # Lo [5] GRANTHA SIGN PLUTA..GRANTHA LETTER VOCALIC LL +11380..11389 ; OLetter # Lo [10] TULU-TIGALARI LETTER A..TULU-TIGALARI LETTER VOCALIC LL +1138B ; OLetter # Lo TULU-TIGALARI LETTER EE +1138E ; OLetter # Lo TULU-TIGALARI LETTER AI +11390..113B5 ; OLetter # Lo [38] TULU-TIGALARI LETTER OO..TULU-TIGALARI LETTER LLLA +113B7 ; OLetter # Lo TULU-TIGALARI SIGN AVAGRAHA +113D1 ; OLetter # Lo TULU-TIGALARI REPHA +113D3 ; OLetter # Lo TULU-TIGALARI SIGN PLUTA 11400..11434 ; OLetter # Lo [53] NEWA LETTER A..NEWA LETTER HA 11447..1144A ; OLetter # Lo [4] NEWA SIGN AVAGRAHA..NEWA SIDDHI 1145F..11461 ; OLetter # Lo [3] NEWA LETTER VEDIC ANUSVARA..NEWA SIGN UPADHMANIYA @@ -2426,6 +2467,7 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL 11A5C..11A89 ; OLetter # Lo [46] SOYOMBO LETTER KA..SOYOMBO CLUSTER-INITIAL LETTER SA 11A9D ; OLetter # Lo SOYOMBO MARK PLUTA 11AB0..11AF8 ; OLetter # Lo [73] CANADIAN SYLLABICS NATTILIK HI..PAU CIN HAU GLOTTAL STOP FINAL +11BC0..11BE0 ; OLetter # Lo [33] SUNUWAR LETTER DEVI..SUNUWAR LETTER KLOKO 11C00..11C08 ; OLetter # Lo [9] BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC L 11C0A..11C2E ; OLetter # Lo [37] BHAIKSUKI LETTER E..BHAIKSUKI LETTER HA 11C40 ; OLetter # Lo BHAIKSUKI SIGN AVAGRAHA @@ -2449,7 +2491,9 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL 12F90..12FF0 ; OLetter # Lo [97] CYPRO-MINOAN SIGN CM001..CYPRO-MINOAN SIGN CM114 13000..1342F ; OLetter # Lo [1072] EGYPTIAN HIEROGLYPH A001..EGYPTIAN HIEROGLYPH V011D 13441..13446 ; OLetter # Lo [6] EGYPTIAN HIEROGLYPH FULL BLANK..EGYPTIAN HIEROGLYPH WIDE LOST SIGN +13460..143FA ; OLetter # Lo [3995] EGYPTIAN HIEROGLYPH-13460..EGYPTIAN HIEROGLYPH-143FA 14400..14646 ; OLetter # Lo [583] ANATOLIAN HIEROGLYPH A001..ANATOLIAN HIEROGLYPH A530 +16100..1611D ; OLetter # Lo [30] GURUNG KHEMA LETTER A..GURUNG KHEMA LETTER SA 16800..16A38 ; OLetter # Lo [569] BAMUM LETTER PHASE-A NGKUE MFON..BAMUM LETTER PHASE-F VUEQ 16A40..16A5E ; OLetter # Lo [31] MRO LETTER TA..MRO LETTER TEK 16A70..16ABE ; OLetter # Lo [79] TANGSA LETTER OZ..TANGSA LETTER ZA @@ -2458,6 +2502,9 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL 16B40..16B43 ; OLetter # Lm [4] PAHAWH HMONG SIGN VOS SEEV..PAHAWH HMONG SIGN IB YAM 16B63..16B77 ; OLetter # Lo [21] PAHAWH HMONG SIGN VOS LUB..PAHAWH HMONG SIGN CIM NRES TOS 16B7D..16B8F ; OLetter # Lo [19] PAHAWH HMONG CLAN SIGN TSHEEJ..PAHAWH HMONG CLAN SIGN VWJ +16D40..16D42 ; OLetter # Lm [3] KIRAT RAI SIGN ANUSVARA..KIRAT RAI SIGN VISARGA +16D43..16D6A ; OLetter # Lo [40] KIRAT RAI LETTER A..KIRAT RAI VOWEL SIGN AU +16D6B..16D6C ; OLetter # Lm [2] KIRAT RAI SIGN VIRAMA..KIRAT RAI SIGN SAAT 16F00..16F4A ; OLetter # Lo [75] MIAO LETTER PA..MIAO LETTER RTE 16F50 ; OLetter # Lo MIAO LETTER NASALIZATION 16F93..16F9F ; OLetter # Lm [13] MIAO LETTER TONE-2..MIAO LETTER REFORMED TONE-8 @@ -2465,7 +2512,7 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL 16FE3 ; OLetter # Lm OLD CHINESE ITERATION MARK 17000..187F7 ; OLetter # Lo [6136] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187F7 18800..18CD5 ; OLetter # Lo [1238] TANGUT COMPONENT-001..KHITAN SMALL SCRIPT CHARACTER-18CD5 -18D00..18D08 ; OLetter # Lo [9] TANGUT IDEOGRAPH-18D00..TANGUT IDEOGRAPH-18D08 +18CFF..18D08 ; OLetter # Lo [10] KHITAN SMALL SCRIPT CHARACTER-18CFF..TANGUT IDEOGRAPH-18D08 1AFF0..1AFF3 ; OLetter # Lm [4] KATAKANA LETTER MINNAN TONE-2..KATAKANA LETTER MINNAN TONE-5 1AFF5..1AFFB ; OLetter # Lm [7] KATAKANA LETTER MINNAN TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5 1AFFD..1AFFE ; OLetter # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8 @@ -2487,6 +2534,8 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL 1E2C0..1E2EB ; OLetter # Lo [44] WANCHO LETTER AA..WANCHO LETTER YIH 1E4D0..1E4EA ; OLetter # Lo [27] NAG MUNDARI LETTER O..NAG MUNDARI LETTER ELL 1E4EB ; OLetter # Lm NAG MUNDARI SIGN OJOD +1E5D0..1E5ED ; OLetter # Lo [30] OL ONAL LETTER O..OL ONAL LETTER EG +1E5F0 ; OLetter # Lo OL ONAL SIGN HODDOND 1E7E0..1E7E6 ; OLetter # Lo [7] ETHIOPIC SYLLABLE HHYA..ETHIOPIC SYLLABLE HHYO 1E7E8..1E7EB ; OLetter # Lo [4] ETHIOPIC SYLLABLE GURAGE HHWA..ETHIOPIC SYLLABLE HHWE 1E7ED..1E7EE ; OLetter # Lo [2] ETHIOPIC SYLLABLE GURAGE MWI..ETHIOPIC SYLLABLE GURAGE MWEE @@ -2536,7 +2585,7 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL 30000..3134A ; OLetter # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A 31350..323AF ; OLetter # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF -# Total code points: 132658 +# Total code points: 136908 # ================================================ @@ -2568,6 +2617,7 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL 1810..1819 ; Numeric # Nd [10] MONGOLIAN DIGIT ZERO..MONGOLIAN DIGIT NINE 1946..194F ; Numeric # Nd [10] LIMBU DIGIT ZERO..LIMBU DIGIT NINE 19D0..19D9 ; Numeric # Nd [10] NEW TAI LUE DIGIT ZERO..NEW TAI LUE DIGIT NINE +19DA ; Numeric # No NEW TAI LUE THAM DIGIT ONE 1A80..1A89 ; Numeric # Nd [10] TAI THAM HORA DIGIT ZERO..TAI THAM HORA DIGIT NINE 1A90..1A99 ; Numeric # Nd [10] TAI THAM THAM DIGIT ZERO..TAI THAM THAM DIGIT NINE 1B50..1B59 ; Numeric # Nd [10] BALINESE DIGIT ZERO..BALINESE DIGIT NINE @@ -2584,6 +2634,7 @@ ABF0..ABF9 ; Numeric # Nd [10] MEETEI MAYEK DIGIT ZERO..MEETEI MAYEK DIGIT N FF10..FF19 ; Numeric # Nd [10] FULLWIDTH DIGIT ZERO..FULLWIDTH DIGIT NINE 104A0..104A9 ; Numeric # Nd [10] OSMANYA DIGIT ZERO..OSMANYA DIGIT NINE 10D30..10D39 ; Numeric # Nd [10] HANIFI ROHINGYA DIGIT ZERO..HANIFI ROHINGYA DIGIT NINE +10D40..10D49 ; Numeric # Nd [10] GARAY DIGIT ZERO..GARAY DIGIT NINE 11066..1106F ; Numeric # Nd [10] BRAHMI DIGIT ZERO..BRAHMI DIGIT NINE 110BD ; Numeric # Cf KAITHI NUMBER SIGN 110CD ; Numeric # Cf KAITHI NUMBER SIGN ABOVE @@ -2595,24 +2646,30 @@ FF10..FF19 ; Numeric # Nd [10] FULLWIDTH DIGIT ZERO..FULLWIDTH DIGIT NINE 114D0..114D9 ; Numeric # Nd [10] TIRHUTA DIGIT ZERO..TIRHUTA DIGIT NINE 11650..11659 ; Numeric # Nd [10] MODI DIGIT ZERO..MODI DIGIT NINE 116C0..116C9 ; Numeric # Nd [10] TAKRI DIGIT ZERO..TAKRI DIGIT NINE +116D0..116E3 ; Numeric # Nd [20] MYANMAR PAO DIGIT ZERO..MYANMAR EASTERN PWO KAREN DIGIT NINE 11730..11739 ; Numeric # Nd [10] AHOM DIGIT ZERO..AHOM DIGIT NINE 118E0..118E9 ; Numeric # Nd [10] WARANG CITI DIGIT ZERO..WARANG CITI DIGIT NINE 11950..11959 ; Numeric # Nd [10] DIVES AKURU DIGIT ZERO..DIVES AKURU DIGIT NINE +11BF0..11BF9 ; Numeric # Nd [10] SUNUWAR DIGIT ZERO..SUNUWAR DIGIT NINE 11C50..11C59 ; Numeric # Nd [10] BHAIKSUKI DIGIT ZERO..BHAIKSUKI DIGIT NINE 11D50..11D59 ; Numeric # Nd [10] MASARAM GONDI DIGIT ZERO..MASARAM GONDI DIGIT NINE 11DA0..11DA9 ; Numeric # Nd [10] GUNJALA GONDI DIGIT ZERO..GUNJALA GONDI DIGIT NINE 11F50..11F59 ; Numeric # Nd [10] KAWI DIGIT ZERO..KAWI DIGIT NINE +16130..16139 ; Numeric # Nd [10] GURUNG KHEMA DIGIT ZERO..GURUNG KHEMA DIGIT NINE 16A60..16A69 ; Numeric # Nd [10] MRO DIGIT ZERO..MRO DIGIT NINE 16AC0..16AC9 ; Numeric # Nd [10] TANGSA DIGIT ZERO..TANGSA DIGIT NINE 16B50..16B59 ; Numeric # Nd [10] PAHAWH HMONG DIGIT ZERO..PAHAWH HMONG DIGIT NINE +16D70..16D79 ; Numeric # Nd [10] KIRAT RAI DIGIT ZERO..KIRAT RAI DIGIT NINE +1CCF0..1CCF9 ; Numeric # Nd [10] OUTLINED DIGIT ZERO..OUTLINED DIGIT NINE 1D7CE..1D7FF ; Numeric # Nd [50] MATHEMATICAL BOLD DIGIT ZERO..MATHEMATICAL MONOSPACE DIGIT NINE 1E140..1E149 ; Numeric # Nd [10] NYIAKENG PUACHUE HMONG DIGIT ZERO..NYIAKENG PUACHUE HMONG DIGIT NINE 1E2F0..1E2F9 ; Numeric # Nd [10] WANCHO DIGIT ZERO..WANCHO DIGIT NINE 1E4F0..1E4F9 ; Numeric # Nd [10] NAG MUNDARI DIGIT ZERO..NAG MUNDARI DIGIT NINE +1E5F1..1E5FA ; Numeric # Nd [10] OL ONAL DIGIT ZERO..OL ONAL DIGIT NINE 1E950..1E959 ; Numeric # Nd [10] ADLAM DIGIT ZERO..ADLAM DIGIT NINE 1FBF0..1FBF9 ; Numeric # Nd [10] SEGMENTED DIGIT ZERO..SEGMENTED DIGIT NINE -# Total code points: 694 +# Total code points: 775 # ================================================ @@ -2646,13 +2703,15 @@ FF0E ; ATerm # Po FULLWIDTH FULL STOP 1809 ; STerm # Po MONGOLIAN MANCHU FULL STOP 1944..1945 ; STerm # Po [2] LIMBU EXCLAMATION MARK..LIMBU QUESTION MARK 1AA8..1AAB ; STerm # Po [4] TAI THAM SIGN KAAN..TAI THAM SIGN SATKAANKUU +1B4E..1B4F ; STerm # Po [2] BALINESE INVERTED CARIK SIKI..BALINESE INVERTED CARIK PAREREN 1B5A..1B5B ; STerm # Po [2] BALINESE PANTI..BALINESE PAMADA 1B5E..1B5F ; STerm # Po [2] BALINESE CARIK SIKI..BALINESE CARIK PAREREN -1B7D..1B7E ; STerm # Po [2] BALINESE PANTI LANTANG..BALINESE PAMADA LANTANG +1B7D..1B7F ; STerm # Po [3] BALINESE PANTI LANTANG..BALINESE PANTI BAWAK 1C3B..1C3C ; STerm # Po [2] LEPCHA PUNCTUATION TA-ROL..LEPCHA PUNCTUATION NYET THYOOM TA-ROL 1C7E..1C7F ; STerm # Po [2] OL CHIKI PUNCTUATION MUCAAD..OL CHIKI PUNCTUATION DOUBLE MUCAAD 203C..203D ; STerm # Po [2] DOUBLE EXCLAMATION MARK..INTERROBANG 2047..2049 ; STerm # Po [3] DOUBLE QUESTION MARK..EXCLAMATION QUESTION MARK +2CF9..2CFB ; STerm # Po [3] COPTIC OLD NUBIAN FULL STOP..COPTIC OLD NUBIAN INDIRECT QUESTION MARK 2E2E ; STerm # Po REVERSED QUESTION MARK 2E3C ; STerm # Po STENOGRAPHIC FULL STOP 2E53..2E54 ; STerm # Po [2] MEDIEVAL EXCLAMATION MARK..MEDIEVAL QUESTION MARK @@ -2668,6 +2727,8 @@ A9C8..A9C9 ; STerm # Po [2] JAVANESE PADA LINGSA..JAVANESE PADA LUNGSI AA5D..AA5F ; STerm # Po [3] CHAM PUNCTUATION DANDA..CHAM PUNCTUATION TRIPLE DANDA AAF0..AAF1 ; STerm # Po [2] MEETEI MAYEK CHEIKHAN..MEETEI MAYEK AHANG KHUDAM ABEB ; STerm # Po MEETEI MAYEK CHEIKHEI +FE12 ; STerm # Po PRESENTATION FORM FOR VERTICAL IDEOGRAPHIC FULL STOP +FE15..FE16 ; STerm # Po [2] PRESENTATION FORM FOR VERTICAL EXCLAMATION MARK..PRESENTATION FORM FOR VERTICAL QUESTION MARK FE56..FE57 ; STerm # Po [2] SMALL QUESTION MARK..SMALL EXCLAMATION MARK FF01 ; STerm # Po FULLWIDTH EXCLAMATION MARK FF1F ; STerm # Po FULLWIDTH QUESTION MARK @@ -2684,6 +2745,7 @@ FF61 ; STerm # Po HALFWIDTH IDEOGRAPHIC FULL STOP 11238..11239 ; STerm # Po [2] KHOJKI DANDA..KHOJKI DOUBLE DANDA 1123B..1123C ; STerm # Po [2] KHOJKI SECTION MARK..KHOJKI DOUBLE SECTION MARK 112A9 ; STerm # Po MULTANI SECTION MARK +113D4..113D5 ; STerm # Po [2] TULU-TIGALARI DANDA..TULU-TIGALARI DOUBLE DANDA 1144B..1144C ; STerm # Po [2] NEWA DANDA..NEWA DOUBLE DANDA 115C2..115C3 ; STerm # Po [2] SIDDHAM DANDA..SIDDHAM DOUBLE DANDA 115C9..115D7 ; STerm # Po [15] SIDDHAM END OF TEXT MARK..SIDDHAM SECTION MARK WITH CIRCLES AND FOUR ENCLOSURES @@ -2700,11 +2762,12 @@ FF61 ; STerm # Po HALFWIDTH IDEOGRAPHIC FULL STOP 16AF5 ; STerm # Po BASSA VAH FULL STOP 16B37..16B38 ; STerm # Po [2] PAHAWH HMONG SIGN VOS THOM..PAHAWH HMONG SIGN VOS TSHAB CEEB 16B44 ; STerm # Po PAHAWH HMONG SIGN XAUS +16D6E..16D6F ; STerm # Po [2] KIRAT RAI DANDA..KIRAT RAI DOUBLE DANDA 16E98 ; STerm # Po MEDEFAIDRIN FULL STOP 1BC9F ; STerm # Po DUPLOYAN PUNCTUATION CHINOOK FULL STOP 1DA88 ; STerm # Po SIGNWRITING FULL STOP -# Total code points: 153 +# Total code points: 166 # ================================================ @@ -2898,7 +2961,8 @@ FF63 ; Close # Pe HALFWIDTH RIGHT CORNER BRACKET 002C ; SContinue # Po COMMA 002D ; SContinue # Pd HYPHEN-MINUS -003A ; SContinue # Po COLON +003A..003B ; SContinue # Po [2] COLON..SEMICOLON +037E ; SContinue # Po GREEK QUESTION MARK 055D ; SContinue # Po ARMENIAN COMMA 060C..060D ; SContinue # Po [2] ARABIC COMMA..ARABIC DATE SEPARATOR 07F8 ; SContinue # Po NKO COMMA @@ -2907,17 +2971,17 @@ FF63 ; Close # Pe HALFWIDTH RIGHT CORNER BRACKET 2013..2014 ; SContinue # Pd [2] EN DASH..EM DASH 3001 ; SContinue # Po IDEOGRAPHIC COMMA FE10..FE11 ; SContinue # Po [2] PRESENTATION FORM FOR VERTICAL COMMA..PRESENTATION FORM FOR VERTICAL IDEOGRAPHIC COMMA -FE13 ; SContinue # Po PRESENTATION FORM FOR VERTICAL COLON +FE13..FE14 ; SContinue # Po [2] PRESENTATION FORM FOR VERTICAL COLON..PRESENTATION FORM FOR VERTICAL SEMICOLON FE31..FE32 ; SContinue # Pd [2] PRESENTATION FORM FOR VERTICAL EM DASH..PRESENTATION FORM FOR VERTICAL EN DASH FE50..FE51 ; SContinue # Po [2] SMALL COMMA..SMALL IDEOGRAPHIC COMMA -FE55 ; SContinue # Po SMALL COLON +FE54..FE55 ; SContinue # Po [2] SMALL SEMICOLON..SMALL COLON FE58 ; SContinue # Pd SMALL EM DASH FE63 ; SContinue # Pd SMALL HYPHEN-MINUS FF0C ; SContinue # Po FULLWIDTH COMMA FF0D ; SContinue # Pd FULLWIDTH HYPHEN-MINUS -FF1A ; SContinue # Po FULLWIDTH COLON +FF1A..FF1B ; SContinue # Po [2] FULLWIDTH COLON..FULLWIDTH SEMICOLON FF64 ; SContinue # Po HALFWIDTH IDEOGRAPHIC COMMA -# Total code points: 26 +# Total code points: 31 # EOF -- cgit v1.2.3