author Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
committer Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
commit 4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
tree e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc3718.txt
parent ea76e11061bda059ae9f9ad130a9895cc85607db (diff)
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc3718.txt')
-rw-r--r-- doc/rfc/rfc3718.txt 619
1 files changed, 619 insertions, 0 deletions
diff --git a/doc/rfc/rfc3718.txt b/doc/rfc/rfc3718.txt
new file mode 100644
index 0000000..b07a69a
--- /dev/null
+++ b/doc/rfc/rfc3718.txt
@@ -0,0 +1,619 @@
+
+
+
+
+
+
+Network Working Group R. McGowan
+Request for Comments: 3718 Unicode
+Category: Informational February 2004
+
+
+ A Summary of Unicode Consortium Procedures, Policies, Stability,
+ and Public Access
+
+Status of this Memo
+
+ This memo provides information for the Internet community. It does
+ not specify an Internet standard of any kind. Distribution of this
+ memo is unlimited.
+
+Copyright Notice
+
+ Copyright (C) The Internet Society (2004). All Rights Reserved.
+
+Abstract
+
+ This memo describes various internal workings of the Unicode
+ Consortium for the benefit of participants in the IETF. It is
+ intended solely for informational purposes. Included are discussions
+ of how the decision-making bodies of the Consortium work and their
+ procedures, as well as information on public access to the character
+ encoding and standardization processes.
+
+1. Introduction
+
+ This memo describes various internal workings of the Unicode
+ Consortium for the benefit of participants in the IETF. It is
+ intended solely for informational purposes. Included are discussions
+ of how the decision-making bodies of the Consortium work and their
+ procedures, as well as information on public access to the character
+ encoding and standardization processes.
+
+2. About The Unicode Consortium
+
+ The Unicode Consortium is a corporation. Legally speaking, it is a
+ "California Nonprofit Mutual Benefit Corporation", organized under
+ section 501(c)(6) of the Internal Revenue Code of the United
+ States. As such, it is a "business league" not focused on profiting
+ by sales or production of goods and services, but neither is it
+ formally a "charitable" organization. It is an alliance of member
+ companies whose purpose is to "extend, maintain, and promote the
+ Unicode Standard". To this end, the Consortium keeps a small office,
+ a few editorial and technical staff, a World Wide Web presence, and
+ a mail list presence.
+
+
+
+McGowan Informational [Page 1]
+
+RFC 3718 Internal Workings of the Unicode Consortium February 2004
+
+
+ The corporation is presided over by a Board of Directors who meet
+ annually. The Board is composed of individuals who are elected
+ annually by the full members for three-year terms. The Board
+ appoints Officers of the corporation to run the daily operations.
+
+ Membership in the Consortium is open to "all corporations, other
+ business entities, governmental agencies, not-for-profit
+ organizations and academic institutions" who support the Consortium's
+ purpose. Formally, one class of voting membership is recognized, and
+ dues-paying members are typically for-profit corporations, research
+ and educational institutions, or national governments. Each such
+ full member sends representatives to meetings of the Unicode
+ Technical Committee (see below), as well as to a brief annual
+ Membership meeting.
+
+3. The Unicode Technical Committee
+
+ The Unicode Technical Committee (UTC) is the technical
+ decision-making body of the Consortium. The UTC inherited the work
+ and prior decisions of the Unicode Working Group (UWG) that was
+ active prior to formation of the Consortium in January 1991.
+
+ Formally, the UTC is a technical body instituted by resolution of the
+ board of directors. Each member appoints one principal and one or
+ two alternate representatives to the UTC. UTC representatives
+ frequently do, but need not, act as the ordinary member
+ representatives for the purposes of the annual meeting.
+
+ The UTC is presided over by a Chair and Vice-Chair, appointed by the
+ Board of Directors for an unspecified term of service.
+
+ The UTC meets 4 to 5 times a year to discuss proposals, additions,
+ and various other technical topics. Each meeting lasts 3 to 4 full
+ days. Meetings are held in locations decided upon by the membership,
+ frequently in the San Francisco Bay Area. There is no fee for
+ participation in UTC meetings. Agendas for meetings are not
+ generally posted to any public forum, but meeting dates, locations,
+ and logistics are posted well in advance on the "Unicode Calendar of
+ Events" web page.
+
+ At the discretion of the UTC chair, meetings are open to
+ participation of member and liaison organizations, and to observation
+ by others. The minutes of meetings are also posted publicly on the
+ "UTC Minutes" page of the Unicode Web site.
+
+ All UTC meetings are held jointly with the INCITS Technical Committee
+ L2, the body responsible for Character Code standards in the United
+ States. They constitute "ad hoc" meetings of the L2 body and are
+ usually followed by a full meeting of the L2 committee. Further
+ information on L2 is available on the official INCITS web page.
+
+4. Unicode Technical Committee Procedures
+
+ The formal procedures of the UTC are publicly available in a document
+ entitled "UTC Procedures", available from the Consortium, and on the
+ Unicode web site.
+
+ Although the UTC Procedures invoke Robert's Rules of Order, UTC
+ meetings are conducted with relative informality in view of the
+ highly technical nature of most discussions. Meetings focus on items
+ from a technical agenda organized and published by the UTC Chair
+ prior to the meeting.
+ Technical items are usually proposals in one of the following
+ categories:
+
+ 1. Addition of new characters (whole scripts, additions to
+ existing scripts, or other characters)
+
+ 2. Preparation and Editing of Technical Reports and Standards
+
+ 3. Changes in the semantics of specific characters
+
+ 4. Extensions to the encoding architecture and forms of use
+
+ Note: There may also be changes to the architecture, character
+ properties, or semantics. Such changes are rare, and are always
+ constrained by the "Unicode Stability Policies" posted on the Unicode
+ web site. Significant changes are undertaken in consultation with
+ liaison organizations, such as W3C and IETF, which have standards
+ that may be affected by such changes. See sections 5 and 6 below.
+
+ Typical outputs of the UTC are:
+
+ 1. The Unicode Standard, major and minor versions (including the
+ Unicode Character Database)
+
+ 2. Unicode Technical Reports
+
+ 3. Stand-alone Unicode Technical Standards
+
+ 4. Formal resolutions
+
+ 5. Liaison statements and instructions to the Unicode liaisons to
+ other organizations.
+
+ For each technical item on the meeting agenda, the general process is
+ as follows:
+
+ 1. Introduction by the topic sponsor
+
+ 2. Proposals and discussion
+
+ 3. Consensus statements or formal motions
+
+ 4. Assignment of formal actions to implement decisions
+
+5. Unicode Technical Committee Motions
+
+ Technical topics of any complexity never proceed from initial
+ proposal to final ratification or adoption into the standard in the
+ course of one UTC meeting. The UTC members and presiding officers
+ are aware that technical changes to the standard have broad
+ consequences for other standards, implementers, and end-users of the
+ standard. Input from other organizations and experts is often vital
+ to the understanding of various proposals and for successful adoption
+ into the standard.
+
+ Technical topics are decided in UTC through the use of formal
+ motions, either taken in meetings, or by means of thirty-day letter
+ ballots. Formal UTC motions are of two types:
+
+ 1. Simple motions
+
+ 2. Precedents
+
+ Simple motions may pass with a simple majority constituting more than
+ 50 percent of the qualified voting members; or by a special majority
+ constituting two-thirds or more of the qualified voting members.
+
+ Precedents are defined, according to the UTC Procedures, as either
+
+ (A) an existing Unicode Policy, or
+
+ (B) an explicit precedent.
+
+ Precedents must be passed or overturned by a special majority.
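+
+ As a rough illustration of the two thresholds, the checks can be
+ sketched in Python. The function names below are hypothetical, and
+ the sketch assumes that votes are counted against the number of
+ qualified voting members; the UTC Procedures remain the
+ authoritative description.
+
```python
from fractions import Fraction


def simple_majority(votes_for: int, qualified_members: int) -> bool:
    """True when votes_for is strictly more than 50 percent of members."""
    if qualified_members <= 0:
        raise ValueError("qualified_members must be positive")
    return Fraction(votes_for, qualified_members) > Fraction(1, 2)


def special_majority(votes_for: int, qualified_members: int) -> bool:
    """True when votes_for is two-thirds or more of members."""
    if qualified_members <= 0:
        raise ValueError("qualified_members must be positive")
    # Exact rational comparison avoids floating-point rounding at 2/3.
    return Fraction(votes_for, qualified_members) >= Fraction(2, 3)
```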
+
+ Examples of implicit precedents include:
+
+ 1. Publication of a character in the standard
+
+ 2. Published normative character properties
+
+ 3. Algorithms required for formal conformance
+
+ An Explicit Precedent is a policy, procedure, encoding, algorithm, or
+ other item that is established by a separate motion saying (in
+ effect) that a particular prior motion establishes a precedent.
+
+ A proposal may be passed either by a formal motion and vote, or by
+ consensus. If there is broad agreement as to the proposal, and no
+ member wishes to force a vote, then the proposal passes by consensus
+ and is recorded as such in the minutes.
+
+6. Unicode Consortium Policies
+
+ Because the Unicode Standard is continually evolving in an attempt to
+ reach the ideal of encoding "all the world's scripts", new characters
+ will constantly be added. In this sense, the standard is unstable:
+ in the standard's useful lifetime, there may never be a final point
+ at which no more characters are added. Realizing this, the
+ Consortium has adopted certain policies to promote and maintain
+ stability of the characters that are already encoded, as well as
+ laying out a Roadmap to future encodings.
+
+ The overall policies of the Consortium with regard to encoding
+ stability, as well as other issues such as privacy, are published on
+ a "Unicode Consortium Policies" web page. Deliberations and encoding
+ proposals in the UTC are bound by these policies.
+
+ The general effect of the stability policies may be stated in this
+ way: once a character is encoded, it will not be moved or removed and
+ its name will not be changed. Any of those actions has the potential
+ for causing obsolescence of data, and they are not permitted. The
+ canonical combining class and decompositions of characters will not
+ be changed in any way that affects normalization. In this sense,
+ normalization, such as that used for Internationalized Domain Names and
+ "early normalization" for use on the World Wide Web, is fixed and
+ stable for every character at the time that character is encoded.
+ (Any changes that are undertaken because of outright errors in
+ properties or decompositions are dealt with by means of an adjunct
+ data file so that normalization stability can still be maintained by
+ those who need it.)
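+
+ The properties named above can be inspected with Python's standard
+ unicodedata module. This is only a sketch: the module reflects
+ whichever UCD version ships with the interpreter, and U+00E9 is an
+ arbitrary character chosen for illustration.
+
```python
import unicodedata

composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# The character name is one of the items covered by the stability policy.
name = unicodedata.name(composed)

# Canonical decomposition and canonical combining class drive normalization.
decomposition = unicodedata.decomposition(composed)
combining_class = unicodedata.combining("\u0301")

# Normalizing in either direction yields the other representation.
nfd = unicodedata.normalize("NFD", composed)
nfc = unicodedata.normalize("NFC", decomposed)
```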
+
+ Once published, each version of the Unicode Standard is absolutely
+ stable and will never be changed retroactively. Implementations or
+ specifications that refer to a specific version of the Unicode
+ Standard can rely upon this stability. If future versions of such
+ implementations or specifications upgrade to a future version of the
+ Unicode Standard, then some changes may be necessary.
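+
+ A specification pinned to one version can be contrasted with the
+ current data using Python's standard unicodedata module. This is a
+ sketch, not part of the Consortium's procedures: it relies on the
+ frozen 3.2.0 database that Python exposes as ucd_3_2_0, and the
+ choice of U+20AC is only illustrative.
+
```python
import unicodedata

# UCD version bundled with the running interpreter (varies by release).
current_version = unicodedata.unidata_version

# Frozen Unicode 3.2.0 database kept for protocols pinned to that version.
pinned = unicodedata.ucd_3_2_0
pinned_version = pinned.unidata_version

# U+20AC EURO SIGN was encoded before 3.2.0, so both databases know it
# and report the same name.
euro = "\u20ac"
same_name = pinned.name(euro) == unicodedata.name(euro)
```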
+
+ Property values of characters, such as directionality for the Unicode
+ Bidi algorithm, may be changed between versions of the standard in
+ some circumstances. As less well-documented characters and scripts
+ are encoded, the exact character properties and behavior may not be
+ well known at the time the characters are first encoded. As more
+ experience is gathered in implementing the newly encoded characters,
+ adjustments in the properties may become necessary. This re-working
+ is kept to a minimum. New and old versions of the relevant property
+ tables are made available on the Consortium's web site.
+
+ Normative and some informative data about characters is kept in the
+ Unicode Character Database (UCD). The structure of many of these
+ property values will not be changed. Instead, when new properties
+ are defined, the Consortium adds new files for these properties, so
+ as not to affect the stability of existing implementations that use
+ the values and properties defined in the existing formats and files.
+ The latest version of the UCD is available on the Consortium web site
+ via the "Unicode Data" heading.
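+
+ A derived data structure can be built from the semicolon-separated
+ records of the UCD file UnicodeData.txt. The sketch below is an
+ illustration under assumptions: the UcdRecord type and its field
+ selection are hypothetical, and the sample record was entered by
+ hand in the file's layout and should be checked against the
+ published file.
+
```python
from typing import NamedTuple


class UcdRecord(NamedTuple):
    """Hypothetical subset of the fields in one UnicodeData.txt record."""
    code_point: int
    name: str
    general_category: str
    combining_class: int
    bidi_class: str
    decomposition: str


def parse_record(line: str) -> UcdRecord:
    """Split one semicolon-separated record and keep the first six fields."""
    fields = line.rstrip("\n").split(";")
    if len(fields) < 6:
        raise ValueError("expected at least 6 fields, got %d" % len(fields))
    return UcdRecord(
        code_point=int(fields[0], 16),
        name=fields[1],
        general_category=fields[2],
        combining_class=int(fields[3]),
        bidi_class=fields[4],
        decomposition=fields[5],
    )


# Sample record entered by hand for illustration; verify against the file.
SAMPLE = (
    "00E9;LATIN SMALL LETTER E WITH ACUTE;Ll;0;L;0065 0301;;;;N;"
    "LATIN SMALL LETTER E ACUTE;;00C9;;00C9"
)
record = parse_record(SAMPLE)
```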
+
+ Note on data redistribution: Unlike the situation with IETF
+ documents, some parts of the Unicode Character Database may have
+ restrictions on their verbatim redistribution with source-code
+ products. Users should read the notices in files they intend to use
+ in such products. The information contained in the UCD may be freely
+ used to create derivative works (such as programs, compressed data
+ files, subroutines, data structures, etc.) that may be redistributed
+ freely, but some files may not be redistributable verbatim. Such
+ restrictions on Unicode data files are never meant to prohibit or
+ control the use of the data in products, but only to help ensure that
+ users retrieve the latest official releases of data files when using
+ the data in products.
+
+7. UTC and ISO (WG2)
+
+ The character repertoire, names, and general architecture of the
+ Unicode Standard are identical to the parallel international standard
+ ISO/IEC 10646. ISO/IEC 10646 contains only a small fraction of the
+ semantics, properties and implementation guidelines supplied by the
+ Unicode Standard and associated technical standards and reports.
+ Implementations conformant to Unicode are conformant to ISO/IEC
+ 10646.
+
+ ISO/IEC 10646 is maintained by the committee ISO/IEC JTC1/SC2/WG2.
+ The WG2 committee is composed of national body representatives to
+ ISO. Details on the ISO organization may be found on the official
+ web site of the International Organization for Standardization (ISO).
+
+ Details and history of the relationship between ISO/IEC JTC1/SC2/WG2
+ and Unicode, Inc. may be found in Appendix C of The Unicode Standard.
+ (A PDF rendition of the most recent printed edition of the Unicode
+ Standard can be found on the Unicode web site.)
+
+ WG2 shares with UTC the policies regarding stability: WG2 neither
+ removes characters nor changes their names once published. Changes
+ in both standards are closely tracked by the respective committees,
+ and a very close working relationship is fostered to maintain
+ synchronization between the standards.
+
+ The Unicode Collation Algorithm (UCA) is one of a small set of other
+ independent standards defined and maintained by UTC. It is not,
+ properly speaking, part of the Unicode Standard itself, but is
+ separately defined in Unicode Technical Standard #10 (UTS #10).
+ There is no conformance relationship between the two standards,
+ except that conformance to a specific base version of the Unicode
+ Standard (e.g., 4.0) is specified in a particular version of a UTS.
+ The collation algorithm specified in UTS #10 is conformant to ISO/IEC
+ 14651, maintained by ISO/IEC JTC1/SC2, and the two organizations
+ maintain a close relationship. Beyond what is specified in ISO/IEC
+ 14651, the UCA contains additional constraints on collation,
+ specifies additional options, and provides many more implementation
+ guidelines.
+
+8. Process of Technical Changes to the Unicode Standard
+
+ Changes to The Unicode Standard are of two types: architectural
+ changes, and character additions.
+
+ Most architectural changes do not affect ISO/IEC 10646, for example,
+ the addition of various character properties to Unicode. Those
+ architectural changes that do affect both standards, such as
+ additional UTF formats or allocation of planes, are very carefully
+ coordinated by the committees. As always, on the UTC side,
+ architectural changes that establish precedents are carefully
+ monitored and the above-described rules and procedures are followed.
+
+ Additional characters for inclusion in The Unicode Standard must
+ be approved both by the UTC and by WG2. Proposals for additional
+ characters enter the standards process in one of several ways:
+ through...
+
+ 1. a national body member of WG2
+
+ 2. a member company or associate of UTC
+
+ 3. directly from an individual "expert" contributor
+
+ The two committees have jointly produced a "Proposal Summary Form"
+ that is required to accompany all additional character proposals.
+ This form may be found online at the WG2 web site, and on the Unicode
+ web site along with information about "Submitting New Characters or
+ Scripts". Instructions for submitting proposals to UTC may likewise
+ be found online.
+
+ Often, submission of proposals to both committees (UTC and WG2) is
+ simultaneous. Members of UTC also frequently forward to WG2
+ proposals that have been initially reviewed by UTC.
+
+ In general, a proposal that is submitted to UTC before being
+ submitted to WG2 passes through several stages:
+
+ 1. Initial presentation to UTC
+
+ 2. Review and re-drafting
+
+ 3. Forwarding to WG2 for consideration
+
+ 4. Re-drafting for technical changes
+
+ 5. Balloting for approval in UTC
+
+ 6. Re-forwarding and recommendation to WG2
+
+ 7. At least two rounds of international balloting in ISO
+
+ About two years are required to complete this process. Initial
+ proposals most often do not include sufficient information or
+ justification to be approved. These are returned to the submitters
+ with comments on how the proposal needs to be amended or extended.
+ Repertoire addition proposals that are submitted to WG2 before being
+ submitted to UTC are generally forwarded immediately to UTC through
+ committee liaisons. The crucial parts of the process (steps 5
+ through 7 above) are never short-circuited. A two-thirds majority in
+ UTC is required for approval at step 5.
+
+ Proposals for additional scripts are required to be coordinated with
+ relevant user communities. Often there are ad hoc subcommittees of
+ UTC or expert mail list participants who are responsible for actually
+ drafting proposals, garnering community support, or representing user
+ communities.
+
+ The rounds of international balloting in step 7 have participation
+ both by UTC and WG2, though UTC does not directly vote in the ISO
+ process.
+
+ Occasionally a proposal approved by one body is considered too
+ immature for approval by the other body, and may be blocked de facto
+ by either of the two. Only after both bodies have approved the
+ additional characters do they proceed to the rounds of international
+ balloting. (The first round is a draft international standard during
+ which some changes may occur; the second round is final approval
+ during which only editorial changes are made.)
+
+ This process ensures that proposals for additional characters are
+ mature and stable by the time they appear in a final international
+ ballot.
+
+9. Public Access to the Character Encoding Process
+
+ While Unicode, Inc. is a membership organization, and the final say
+ in technical matters rests with UTC, the process is quite open to
+ public input and scrutiny of processes and proposals. There are many
+ influential individual experts and industry groups who are not
+ formally members, but whose input to the process is taken seriously
+ by UTC.
+
+ Internally, UTC maintains a mail list called the "Unicore" list,
+ which carries traffic related to meetings, technical content of the
+ standard, and so forth. Members of the list are UTC representatives;
+ employees and staff of member organizations (such as the Research
+ Libraries Group); individual liaisons to and from other standards
+ bodies (such as WG2 and IETF); and invited experts from institutions
+ such as the Library of Congress and some universities. Subscription
+ to the list for external individuals is subject to "sponsorship" by
+ the corporate officers.
+
+ Unicode, Inc. also maintains a public discussion list called the
+ "Unicode" list. Subscription is open to anyone, and proceedings of
+ the "Unicode" mail list are publicly archived. Details are on the
+ Consortium web site under the "Mail Lists" heading.
+
+ Technical proposals for changes to the standard are posted to both of
+ these mail lists on a regular basis. Discussion on the public list
+ may result in a written proposal being generated for a later UTC
+ meeting. Technical issues and other standardization "events" of any
+ significance, such as beta releases and availability of draft
+ documents, are announced and then discussed in this public forum,
+ well before standardization is finalized. From time to time, the UTC
+ also publishes on the Consortium web site "Public Review Issues" to
+ gather feedback and generate discussion of specific proposals whose
+ impact may be unclear, or for which sufficiently broad review may not
+ yet have been brought to the UTC deliberations.
+
+ Anyone may make a character encoding or architectural proposal to
+ UTC. Membership in the organization is not required to submit a
+ proposal. To be taken seriously, the proposal must be framed in a
+ substantial way, and be accompanied by sufficient documentation to
+ warrant discussion. Examples of proposals are easily available by
+ following links from the "Proposed Characters" and "Roadmaps"
+ headings on the Unicode web site. Guidelines for proposals are also
+ available under the heading "Submitting Proposals".
+
+ In general, proposals are publicly aired on the "Unicode" mail list,
+ sometimes for a long period, prior to formal submission. Generally
+ this is of benefit to the proposer as it tends to reduce the number
+ of times the proposal is sent back for clarification or with requests
+ for additional information. Once a proposal reaches the stage of
+ being ready for discussion by UTC, the proposer will have received
+ contact through the public mail list with one or more UTC members
+ willing to explain or defend it in a UTC meeting.
+
+10. Acknowledgements
+
+ Thanks to Mark Davis, Simon Josefsson, and Ken Whistler for their
+ extensive review and feedback on previous versions of this document.
+
+11. Security Considerations
+
+ This memo describes the operational procedures of an organization;
+ the procedures themselves have no consequences for Internet Security.
+
+12. Author's Address
+
+ Rick McGowan
+ c/o The Unicode Consortium
+ P.O. Box 391476
+ Mountain View, CA 94039-1476
+ U.S.A.
+
+ Phone: +1-650-693-3921
+ Web: http://www.unicode.org/
+
+13. Full Copyright Statement
+
+ Copyright (C) The Internet Society (2004). This document is subject
+ to the rights, licenses and restrictions contained in BCP 78 and
+ except as set forth therein, the authors retain all their rights.
+
+ This document and the information contained herein are provided on an
+ "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
+ OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
+ ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
+ INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
+ INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
+ WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+
+Intellectual Property
+
+ The IETF takes no position regarding the validity or scope of any
+ Intellectual Property Rights or other rights that might be claimed to
+ pertain to the implementation or use of the technology described in
+ this document or the extent to which any license under such rights
+ might or might not be available; nor does it represent that it has
+ made any independent effort to identify any such rights. Information
+ on the procedures with respect to rights in RFC documents can be
+ found in BCP 78 and BCP 79.
+
+ Copies of IPR disclosures made to the IETF Secretariat and any
+ assurances of licenses to be made available, or the result of an
+ attempt made to obtain a general license or permission for the use of
+ such proprietary rights by implementers or users of this
+ specification can be obtained from the IETF on-line IPR repository at
+ http://www.ietf.org/ipr.
+
+ The IETF invites any interested party to bring to its attention any
+ copyrights, patents or patent applications, or other proprietary
+ rights that may cover technology that may be required to implement
+ this standard. Please address the information to the IETF at
+ ietf-ipr@ietf.org.
+
+Acknowledgement
+
+ Funding for the RFC Editor function is currently provided by the
+ Internet Society.
+