summaryrefslogtreecommitdiff
path: root/doc/rfc/rfc2192.txt
diff options
context:
space:
mode:
authorThomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
committerThomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
commit4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
treee3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc2192.txt
parentea76e11061bda059ae9f9ad130a9895cc85607db (diff)
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc2192.txt')
-rw-r--r--doc/rfc/rfc2192.txt899
1 files changed, 899 insertions, 0 deletions
diff --git a/doc/rfc/rfc2192.txt b/doc/rfc/rfc2192.txt
new file mode 100644
index 0000000..1b5a1d4
--- /dev/null
+++ b/doc/rfc/rfc2192.txt
@@ -0,0 +1,899 @@
+
+
+
+
+
+
+Network Working Group C. Newman
+Request for Comments: 2192 Innosoft
+Category: Standards Track September 1997
+
+
+ IMAP URL Scheme
+
+
+Status of this memo
+
+ This document specifies an Internet standards track protocol for
+ the Internet community, and requests discussion and suggestions for
+ improvements. Please refer to the current edition of the "Internet
+ Official Protocol Standards" (STD 1) for the standardization state
+ and status of this protocol. Distribution of this memo is
+ unlimited.
+
+
+Abstract
+
+ IMAP [IMAP4] is a rich protocol for accessing remote message
+ stores. It provides an ideal mechanism for accessing public
+ mailing list archives as well as private and shared message stores.
+ This document defines a URL scheme for referencing objects on an
+ IMAP server.
+
+
+1. Conventions used in this document
+
+ The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY"
+ in this document are to be interpreted as defined in "Key words for
+ use in RFCs to Indicate Requirement Levels" [KEYWORDS].
+
+
+2. IMAP scheme
+
+ The IMAP URL scheme is used to designate IMAP servers, mailboxes,
+ messages, MIME bodies [MIME], and search programs on Internet hosts
+ accessible using the IMAP protocol.
+
+ The IMAP URL follows the common Internet scheme syntax as defined
+ in RFC 1738 [BASIC-URL] except that clear text passwords are not
+ permitted. If :<port> is omitted, the port defaults to 143.
+
+
+
+
+
+
+
+
+Newman Standards Track [Page 1]
+
+RFC 2192 IMAP URL Scheme September 1997
+
+
+ An IMAP URL takes one of the following forms:
+
+ imap://<iserver>/
+ imap://<iserver>/<enc_list_mailbox>;TYPE=<list_type>
+ imap://<iserver>/<enc_mailbox>[uidvalidity][?<enc_search>]
+ imap://<iserver>/<enc_mailbox>[uidvalidity]<iuid>[isection]
+
+ The first form is used to refer to an IMAP server, the second form
+ refers to a list of mailboxes, the third form refers to the
+ contents of a mailbox or a set of messages resulting from a search,
+ and the final form refers to a specific message or message part.
+ Note that the syntax here is informal. The authoritative formal
+ syntax for IMAP URLs is defined in section 11.
+
+
+3. IMAP User Name and Authentication Mechanism
+
+ A user name and/or authentication mechanism may be supplied. They
+ are used in the "LOGIN" or "AUTHENTICATE" commands after making the
+ connection to the IMAP server. If no user name or authentication
+ mechanism is supplied, the user name "anonymous" is used with the
+ "LOGIN" command and the password is supplied as the Internet e-mail
+ address of the end user accessing the resource. If the URL doesn't
+ supply a user name, the program interpreting the IMAP URL SHOULD
+ request one from the user if necessary.
+
+ An authentication mechanism can be expressed by adding
+ ";AUTH=<enc_auth_type>" to the end of the user name. When such an
+ <enc_auth_type> is indicated, the client SHOULD request appropriate
+ credentials from that mechanism and use the "AUTHENTICATE" command
+ instead of the "LOGIN" command. If no user name is specified, one
+ SHOULD be obtained from the mechanism or requested from the user as
+ appropriate.
+
+ The string ";AUTH=*" indicates that the client SHOULD select an
+ appropriate authentication mechanism. It MAY use any mechanism
+ listed in the CAPABILITY command or use an out of band security
+ service resulting in a PREAUTH connection. If no user name is
+ specified and no appropriate authentication mechanisms are
+ available, the client SHOULD fall back to anonymous login as
+ described above. This allows a URL which grants read-write access
+ to authorized users, and read-only anonymous access to other users.
+
+ If a user name is included with no authentication mechanism, then
+ ";AUTH=*" is assumed.
+
+
+
+
+
+
+Newman Standards Track [Page 2]
+
+RFC 2192 IMAP URL Scheme September 1997
+
+
+ Since URLs can easily come from untrusted sources, care must be
+ taken when resolving a URL which requires or requests any sort of
+ authentication. If authentication credentials are supplied to the
+ wrong server, it may compromise the security of the user's account.
+ The program resolving the URL should make sure it meets at least
+ one of the following criteria in this case:
+
+ (1) The URL comes from a trusted source, such as a referral server
+ which the client has validated and trusts according to site policy.
+ Note that user entry of the URL may or may not count as a trusted
+ source, depending on the experience level of the user and site
+ policy.
+ (2) Explicit local site policy permits the client to connect to the
+ server in the URL. For example, if the client knows the site
+ domain name, site policy may dictate that any hostname ending in
+ that domain is trusted.
+ (3) The user confirms that connecting to that domain name with the
+ specified credentials and/or mechanism is permitted.
+ (4) A mechanism is used which validates the server before passing
+ potentially compromising client credentials.
+ (5) An authentication mechanism is used which will not reveal
+ information to the server which could be used to compromise future
+ connections.
+
+ URLs which do not include a user name must be treated with extra
+ care, since they are more likely to compromise the user's primary
+ account. A URL containing ";AUTH=*" must also be treated with
+ extra care since it might fall back on a weaker security mechanism.
+ Finally, clients are discouraged from using a plain text password
+ as a fallback with ";AUTH=*" unless the connection has strong
+ encryption (e.g. a key length of greater than 56 bits).
+
+ A program interpreting IMAP URLs MAY cache open connections to an
+ IMAP server for later re-use. If a URL contains a user name, only
+ connections authenticated as that user may be re-used. If a URL
+ does not contain a user name or authentication mechanism, then only
+ an anonymous connection may be re-used. If a URL contains an
+ authentication mechanism without a user name, then any non-
+ anonymous connection may be re-used.
+
+ Note that if unsafe or reserved characters such as " " or ";" are
+ present in the user name or authentication mechanism, they MUST be
+ encoded as described in RFC 1738 [BASIC-URL].
+
+
+
+
+
+
+
+
+Newman Standards Track [Page 3]
+
+RFC 2192 IMAP URL Scheme September 1997
+
+
+4. IMAP server
+
+ An IMAP URL referring to an IMAP server has the following form:
+
+ imap://<iserver>/
+
+ A program interpreting this URL would issue the standard set of
+ commands it uses to present a view of the contents of an IMAP
+ server. This is likely to be semanticly equivalent to one of the
+ following URLs:
+
+ imap://<iserver>/;TYPE=LIST
+ imap://<iserver>/;TYPE=LSUB
+
+ The program interpreting this URL SHOULD use the LSUB form if it
+ supports mailbox subscriptions.
+
+
+5. Lists of mailboxes
+
+ An IMAP URL referring to a list of mailboxes has the following
+ form:
+
+ imap://<iserver>/<enc_list_mailbox>;TYPE=<list_type>
+
+ The <list_type> may be either "LIST" or "LSUB", and is case
+ insensitive. The field ";TYPE=<list_type>" MUST be included.
+
+ The <enc_list_mailbox> is any argument suitable for the
+ list_mailbox field of the IMAP [IMAP4] LIST or LSUB commands. The
+ field <enc_list_mailbox> may be omitted, in which case the program
+ interpreting the IMAP URL may use "*" or "%" as the
+ <enc_list_mailbox>. The program SHOULD use "%" if it supports a
+ hierarchical view, otherwise it SHOULD use "*".
+
+ Note that if unsafe or reserved characters such as " " or "%" are
+ present in <enc_list_mailbox> they MUST be encoded as described in
+ RFC 1738 [BASIC-URL]. If the character "/" is present in
+ enc_list_mailbox, it SHOULD NOT be encoded.
+
+
+6. Lists of messages
+
+ An IMAP URL referring to a list of messages has the following form:
+
+ imap://<iserver>/<enc_mailbox>[uidvalidity][?<enc_search>]
+
+
+
+
+
+Newman Standards Track [Page 4]
+
+RFC 2192 IMAP URL Scheme September 1997
+
+
+ The <enc_mailbox> field is used as the argument to the IMAP4
+ "SELECT" command. Note that if unsafe or reserved characters such
+ as " ", ";", or "?" are present in <enc_mailbox> they MUST be
+ encoded as described in RFC 1738 [BASIC-URL]. If the character "/"
+ is present in enc_mailbox, it SHOULD NOT be encoded.
+
+ The [uidvalidity] field is optional. If it is present, it MUST be
+ the argument to the IMAP4 UIDVALIDITY status response at the time
+ the URL was created. This SHOULD be used by the program
+ interpreting the IMAP URL to determine if the URL is stale.
+
+ The [?<enc_search>] field is optional. If it is not present, the
+ contents of the mailbox SHOULD be presented by the program
+ interpreting the URL. If it is present, it SHOULD be used as the
+ arguments following an IMAP4 SEARCH command with unsafe characters
+ such as " " (which are likely to be present in the <enc_search>)
+ encoded as described in RFC 1738 [BASIC-URL].
+
+
+7. A specific message or message part
+
+ An IMAP URL referring to a specific message or message part has the
+ following form:
+
+ imap://<iserver>/<enc_mailbox>[uidvalidity]<iuid>[isection]
+
+ The <enc_mailbox> and [uidvalidity] are as defined above.
+
+ If [uidvalidity] is present in this form, it SHOULD be used by the
+ program interpreting the URL to determine if the URL is stale.
+
+ The <iuid> refers to an IMAP4 message UID, and SHOULD be used as
+ the <set> argument to the IMAP4 "UID FETCH" command.
+
+ The [isection] field is optional. If not present, the URL refers
+ to the entire Internet message as returned by the IMAP command "UID
+ FETCH <uid> BODY.PEEK[]". If present, the URL refers to the object
+ returned by a "UID FETCH <uid> BODY.PEEK[<section>]" command. The
+ type of the object may be determined with a "UID FETCH <uid>
+ BODYSTRUCTURE" command and locating the appropriate part in the
+ resulting BODYSTRUCTURE. Note that unsafe characters in [isection]
+ MUST be encoded as described in [BASIC-URL].
+
+
+
+
+
+
+
+
+
+Newman Standards Track [Page 5]
+
+RFC 2192 IMAP URL Scheme September 1997
+
+
+8. Relative IMAP URLs
+
+ Relative IMAP URLs are permitted and are resolved according to the
+ rules defined in RFC 1808 [REL-URL] with one exception. In IMAP
+ URLs, parameters are treated as part of the normal path with
+ respect to relative URL resolution. This is believed to be the
+ behavior of the installed base and is likely to be documented in a
+ future revision of the relative URL specification.
+
+ The following observations are also important:
+
+ The <iauth> grammar element is considered part of the user name for
+ purposes of resolving relative IMAP URLs. This means that unless a
+ new login/server specification is included in the relative URL, the
+ authentication mechanism is inherited from a base IMAP URL.
+
+ URLs always use "/" as the hierarchy delimiter for the purpose of
+ resolving paths in relative URLs. IMAP4 permits the use of any
+ hierarchy delimiter in mailbox names. For this reason, relative
+ mailbox paths will only work if the mailbox uses "/" as the
+ hierarchy delimiter. Relative URLs may be used on mailboxes which
+ use other delimiters, but in that case, the entire mailbox name
+ MUST be specified in the relative URL or inherited as a whole from
+ the base URL.
+
+ The base URL for a list of mailboxes or messages which was referred
+ to by an IMAP URL is always the referring IMAP URL itself. The
+ base URL for a message or message part which was referred to by an
+ IMAP URL may be more complicated to determine. The program
+ interpreting the relative URL will have to check the headers of the
+ MIME entity and any enclosing MIME entities in order to locate the
+ "Content-Base" and "Content-Location" headers. These headers are
+ used to determine the base URL as defined in [HTTP]. For example,
+ if the referring IMAP URL contains a "/;SECTION=1.2" parameter,
+ then the MIME headers for section 1.2, for section 1, and for the
+ enclosing message itself SHOULD be checked in that order for
+ "Content-Base" or "Content-Location" headers.
+
+
+9. Multinational Considerations
+
+ IMAP4 [IMAP4] section 5.1.3 includes a convention for encoding
+ non-US-ASCII characters in IMAP mailbox names. Because this
+ convention is private to IMAP, it is necessary to convert IMAP's
+ encoding to one that can be more easily interpreted by a URL
+ display program. For this reason, IMAP's modified UTF-7 encoding
+ for mailboxes MUST be converted to UTF-8 [UTF8]. Since 8-bit
+ characters are not permitted in URLs, the UTF-8 characters are
+
+
+
+Newman Standards Track [Page 6]
+
+RFC 2192 IMAP URL Scheme September 1997
+
+
+ encoded as required by the URL specification [BASIC-URL]. Sample
+ code is included in Appendix A to demonstrate this conversion.
+
+
+10. Examples
+
+ The following examples demonstrate how an IMAP4 client program
+ might translate various IMAP4 URLs into a series of IMAP4 commands.
+ Commands sent from the client to the server are prefixed with "C:",
+ and responses sent from the server to the client are prefixed with
+ "S:".
+
+ The URL:
+
+ <imap://minbari.org/gray-council;UIDVALIDITY=385759045/;UID=20>
+
+ Results in the following client commands:
+
+ <connect to minbari.org, port 143>
+ C: A001 LOGIN ANONYMOUS sheridan@babylon5.org
+ C: A002 SELECT gray-council
+ <client verifies the UIDVALIDITY matches>
+ C: A003 UID FETCH 20 BODY.PEEK[]
+
+ The URL:
+
+ <imap://michael@minbari.org/users.*;type=list>
+
+ Results in the following client commands:
+
+ <client requests password from user>
+ <connect to minbari.org imap server, activate strong encryption>
+ C: A001 LOGIN MICHAEL zipper
+ C: A002 LIST "" users.*
+
+ The URL:
+
+ <imap://psicorp.org/~peter/%E6%97%A5%E6%9C%AC%E8%AA%9E/
+ %E5%8F%B0%E5%8C%97>
+
+ Results in the following client commands:
+
+ <connect to psicorp.org, port 143>
+ C: A001 LOGIN ANONYMOUS bester@psycop.psicorp.org
+ C: A002 SELECT ~peter/&ZeVnLIqe-/&U,BTFw-
+ <commands the client uses for viewing the contents of a mailbox>
+
+
+
+
+
+Newman Standards Track [Page 7]
+
+RFC 2192 IMAP URL Scheme September 1997
+
+
+ The URL:
+
+ <imap://;AUTH=KERBEROS_V4@minbari.org/gray-council/;uid=20/
+ ;section=1.2>
+
+ Results in the following client commands:
+
+ <connect to minbari.org, port 143>
+ C: A001 AUTHENTICATE KERBEROS_V4
+ <authentication exchange>
+ C: A002 SELECT gray-council
+ C: A003 UID FETCH 20 BODY.PEEK[1.2]
+
+ If the following relative URL is located in that body part:
+
+ <;section=1.4>
+
+ This could result in the following client commands:
+
+ C: A004 UID FETCH 20 (BODY.PEEK[1.2.MIME]
+ BODY.PEEK[1.MIME]
+ BODY.PEEK[HEADER.FIELDS (Content-Base Content-Location)])
+ <Client looks for Content-Base or Content-Location headers in
+ result. If no such headers, then it does the following>
+ C: A005 UID FETCH 20 BODY.PEEK[1.4]
+
+ The URL:
+
+ <imap://;AUTH=*@minbari.org/gray%20council?SUBJECT%20shadows>
+
+ Could result in the following:
+
+ <connect to minbari.org, port 143>
+ C: A001 CAPABILITY
+ S: * CAPABILITY IMAP4rev1 AUTH=GSSAPI
+ S: A001 OK
+ C: A002 AUTHENTICATE GSSAPI
+ <authentication exchange>
+ S: A002 OK user lennier authenticated
+ C: A003 SELECT "gray council"
+ ...
+ C: A004 SEARCH SUBJECT shadows
+ S: * SEARCH 8 10 13 14 15 16
+ S: A004 OK SEARCH completed
+ C: A005 FETCH 8,10,13:16 ALL
+ ...
+
+
+
+
+
+Newman Standards Track [Page 8]
+
+RFC 2192 IMAP URL Scheme September 1997
+
+
+ NOTE: In this final example, the client has implementation
+ dependent choices. The authentication mechanism could be anything,
+ including PREAUTH. And the final FETCH command could fetch more or
+ less information about the messages, depending on what it wishes to
+ display to the user.
+
+
+11. Security Considerations
+
+ Security considerations discussed in the IMAP specification [IMAP4]
+ and the URL specification [BASIC-URL] are relevant. Security
+ considerations related to authenticated URLs are discussed in
+ section 3 of this document.
+
+ Many email clients store the plain text password for later use
+ after logging into an IMAP server. Such clients MUST NOT use a
+ stored password in response to an IMAP URL without explicit
+ permission from the user to supply that password to the specified
+ host name.
+
+
+12. ABNF for IMAP URL scheme
+
+ This uses ABNF as defined in RFC 822 [IMAIL]. Terminals from the
+ BNF for IMAP [IMAP4] and URLs [BASIC-URL] are also used. Strings
+ are not case sensitive and free insertion of linear-white-space is
+ not permitted.
+
+ achar = uchar / "&" / "=" / "~"
+ ; see [BASIC-URL] for "uchar" definition
+
+ bchar = achar / ":" / "@" / "/"
+
+ enc_auth_type = 1*achar
+ ; encoded version of [IMAP-AUTH] "auth_type"
+
+ enc_list_mailbox = 1*bchar
+ ; encoded version of [IMAP4] "list_mailbox"
+
+ enc_mailbox = 1*bchar
+ ; encoded version of [IMAP4] "mailbox"
+
+ enc_search = 1*bchar
+ ; encoded version of search_program below
+
+ enc_section = 1*bchar
+ ; encoded version of section below
+
+
+
+
+Newman Standards Track [Page 9]
+
+RFC 2192 IMAP URL Scheme September 1997
+
+
+ enc_user = 1*achar
+ ; encoded version of [IMAP4] "userid"
+
+ imapurl = "imap://" iserver "/" [ icommand ]
+
+ iauth = ";AUTH=" ( "*" / enc_auth_type )
+
+ icommand = imailboxlist / imessagelist / imessagepart
+
+ imailboxlist = [enc_list_mailbox] ";TYPE=" list_type
+
+ imessagelist = enc_mailbox [ "?" enc_search ] [uidvalidity]
+
+ imessagepart = enc_mailbox [uidvalidity] iuid [isection]
+
+ isection = "/;SECTION=" enc_section
+
+ iserver = [iuserauth "@"] hostport
+ ; See [BASIC-URL] for "hostport" definition
+
+ iuid = "/;UID=" nz_number
+ ; See [IMAP4] for "nz_number" definition
+
+ iuserauth = enc_user [iauth] / [enc_user] iauth
+
+ list_type = "LIST" / "LSUB"
+
+ search_program = ["CHARSET" SPACE astring SPACE]
+ search_key *(SPACE search_key)
+ ; IMAP4 literals may not be used
+ ; See [IMAP4] for "astring" and "search_key"
+
+ section = section_text / (nz_number *["." nz_number]
+ ["." (section_text / "MIME")])
+ ; See [IMAP4] for "section_text" and "nz_number"
+
+ uidvalidity = ";UIDVALIDITY=" nz_number
+ ; See [IMAP4] for "nz_number" definition
+
+13. References
+
+ [BASIC-URL] Berners-Lee, Masinter, McCahill, "Uniform Resource
+ Locators (URL)", RFC 1738, CERN, Xerox Corporation, University of
+ Minnesota, December 1994.
+
+ <ftp://ds.internic.net/rfc/rfc1738.txt>
+
+
+
+
+
+Newman Standards Track [Page 10]
+
+RFC 2192 IMAP URL Scheme September 1997
+
+
+ [IMAP4] Crispin, M., "Internet Message Access Protocol - Version
+ 4rev1", RFC 2060, University of Washington, December 1996.
+
+ <ftp://ds.internic.net/rfc/rfc2060.txt>
+
+ [IMAP-AUTH] Myers, J., "IMAP4 Authentication Mechanism", RFC 1731,
+ Carnegie-Mellon University, December 1994.
+
+ <ftp://ds.internic.net/rfc/rfc1731.txt>
+
+ [HTTP] Fielding, Gettys, Mogul, Frystyk, Berners-Lee, "Hypertext
+ Transfer Protocol -- HTTP/1.1", RFC 2068, UC Irvine, DEC, MIT/LCS,
+ January 1997.
+
+ <ftp://ds.internic.net/rfc/rfc2068.txt>
+
+ [IMAIL] Crocker, "Standard for the Format of ARPA Internet Text
+ Messages", STD 11, RFC 822, University of Delaware, August 1982.
+
+ <ftp://ds.internic.net/rfc/rfc822.txt>
+
+ [KEYWORDS] Bradner, "Key words for use in RFCs to Indicate
+ Requirement Levels", RFC 2119, Harvard University, March 1997.
+
+ <ftp://ds.internic.net/rfc/rfc2119.txt>
+
+ [MIME] Freed, N., Borenstein, N., "Multipurpose Internet Mail
+ Extensions", RFC 2045, Innosoft, First Virtual, November 1996.
+
+ <ftp://ds.internic.net/rfc/rfc2045.txt>
+
+ [REL-URL] Fielding, "Relative Uniform Resource Locators", RFC 1808,
+ UC Irvine, June 1995.
+
+ <ftp://ds.internic.net/rfc/rfc1808.txt>
+
+ [UTF8] Yergeau, F. "UTF-8, a transformation format of Unicode and
+ ISO 10646", RFC 2044, Alis Technologies, October 1996.
+
+ <ftp://ds.internic.net/rfc/rfc2044.txt>
+
+14. Author's Address
+
+ Chris Newman
+ Innosoft International, Inc.
+ 1050 Lakes Drive
+ West Covina, CA 91790 USA
+ EMail: chris.newman@innosoft.com
+
+
+
+Newman Standards Track [Page 11]
+
+RFC 2192 IMAP URL Scheme September 1997
+
+
+Appendix A. Sample code
+
+Here is sample C source code to convert between URL paths and IMAP
+mailbox names, taking into account mapping between IMAP's modified UTF-7
+[IMAP4] and hex-encoded UTF-8 which is more appropriate for URLs. This
+code has not been rigorously tested nor does it necessarily behave
+reasonably with invalid input, but it should serve as a useful example.
+This code just converts the mailbox portion of the URL and does not deal
+with parameters, query or server components of the URL.
+
+#include <stdio.h>
+#include <string.h>
+
+/* hexadecimal lookup table */
+static char hex[] = "0123456789ABCDEF";
+
+/* URL unsafe printable characters */
+static char urlunsafe[] = " \"#%&+:;<=>?@[\\]^`{|}";
+
+/* UTF7 modified base64 alphabet */
+static char base64chars[] =
+ "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+,";
+#define UNDEFINED 64
+
+/* UTF16 definitions */
+#define UTF16MASK 0x03FFUL
+#define UTF16SHIFT 10
+#define UTF16BASE 0x10000UL
+#define UTF16HIGHSTART 0xD800UL
+#define UTF16HIGHEND 0xDBFFUL
+#define UTF16LOSTART 0xDC00UL
+#define UTF16LOEND 0xDFFFUL
+
+/* Convert an IMAP mailbox to a URL path
+ * dst needs to have roughly 4 times the storage space of src
+ * Hex encoding can triple the size of the input
+ * UTF-7 can be slightly denser than UTF-8
+ * (worst case: 8 octets UTF-7 becomes 9 octets UTF-8)
+ */
+void MailboxToURL(char *dst, char *src)
+{
+ unsigned char c, i, bitcount;
+ unsigned long ucs4, utf16, bitbuf;
+ unsigned char base64[256], utf8[6];
+
+
+
+
+
+
+
+Newman Standards Track [Page 12]
+
+RFC 2192 IMAP URL Scheme September 1997
+
+
+ /* initialize modified base64 decoding table */
+ memset(base64, UNDEFINED, sizeof (base64));
+ for (i = 0; i < sizeof (base64chars); ++i) {
+ base64[base64chars[i]] = i;
+ }
+
+ /* loop until end of string */
+ while (*src != '\0') {
+ c = *src++;
+ /* deal with literal characters and &- */
+ if (c != '&' || *src == '-') {
+ if (c < ' ' || c > '~' || strchr(urlunsafe, c) != NULL) {
+ /* hex encode if necessary */
+ dst[0] = '%';
+ dst[1] = hex[c >> 4];
+ dst[2] = hex[c & 0x0f];
+ dst += 3;
+ } else {
+ /* encode literally */
+ *dst++ = c;
+ }
+ /* skip over the '-' if this is an &- sequence */
+ if (c == '&') ++src;
+ } else {
+ /* convert modified UTF-7 -> UTF-16 -> UCS-4 -> UTF-8 -> HEX */
+ bitbuf = 0;
+ bitcount = 0;
+ ucs4 = 0;
+ while ((c = base64[(unsigned char) *src]) != UNDEFINED) {
+ ++src;
+ bitbuf = (bitbuf << 6) | c;
+ bitcount += 6;
+ /* enough bits for a UTF-16 character? */
+ if (bitcount >= 16) {
+ bitcount -= 16;
+ utf16 = (bitcount ? bitbuf >> bitcount
+ : bitbuf) & 0xffff;
+ /* convert UTF16 to UCS4 */
+ if
+ (utf16 >= UTF16HIGHSTART && utf16 <= UTF16HIGHEND) {
+ ucs4 = (utf16 - UTF16HIGHSTART) << UTF16SHIFT;
+ continue;
+ } else if
+ (utf16 >= UTF16LOSTART && utf16 <= UTF16LOEND) {
+ ucs4 += utf16 - UTF16LOSTART + UTF16BASE;
+ } else {
+ ucs4 = utf16;
+ }
+
+
+
+Newman Standards Track [Page 13]
+
+RFC 2192 IMAP URL Scheme September 1997
+
+
+ /* convert UTF-16 range of UCS4 to UTF-8 */
+ if (ucs4 <= 0x7fUL) {
+ utf8[0] = ucs4;
+ i = 1;
+ } else if (ucs4 <= 0x7ffUL) {
+ utf8[0] = 0xc0 | (ucs4 >> 6);
+ utf8[1] = 0x80 | (ucs4 & 0x3f);
+ i = 2;
+ } else if (ucs4 <= 0xffffUL) {
+ utf8[0] = 0xe0 | (ucs4 >> 12);
+ utf8[1] = 0x80 | ((ucs4 >> 6) & 0x3f);
+ utf8[2] = 0x80 | (ucs4 & 0x3f);
+ i = 3;
+ } else {
+ utf8[0] = 0xf0 | (ucs4 >> 18);
+ utf8[1] = 0x80 | ((ucs4 >> 12) & 0x3f);
+ utf8[2] = 0x80 | ((ucs4 >> 6) & 0x3f);
+ utf8[3] = 0x80 | (ucs4 & 0x3f);
+ i = 4;
+ }
+ /* convert utf8 to hex */
+ for (c = 0; c < i; ++c) {
+ dst[0] = '%';
+ dst[1] = hex[utf8[c] >> 4];
+ dst[2] = hex[utf8[c] & 0x0f];
+ dst += 3;
+ }
+ }
+ }
+ /* skip over trailing '-' in modified UTF-7 encoding */
+ if (*src == '-') ++src;
+ }
+ }
+ /* terminate destination string */
+ *dst = '\0';
+}
+
+/* Convert hex coded UTF-8 URL path to modified UTF-7 IMAP mailbox
+ * dst should be about twice the length of src to deal with non-hex
+ * coded URLs
+ */
+void URLtoMailbox(char *dst, char *src)
+{
+ unsigned int utf8pos, utf8total, i, c, utf7mode, bitstogo, utf16flag;
+ unsigned long ucs4, bitbuf;
+ unsigned char hextab[256];
+
+ /* initialize hex lookup table */
+
+
+
+Newman Standards Track [Page 14]
+
+RFC 2192 IMAP URL Scheme September 1997
+
+
+ memset(hextab, 0, sizeof (hextab));
+ for (i = 0; i < sizeof (hex); ++i) {
+ hextab[hex[i]] = i;
+ if (isupper(hex[i])) hextab[tolower(hex[i])] = i;
+ }
+
+ utf7mode = 0;
+ utf8total = 0;
+ bitstogo = 0;
+ while ((c = *src) != '\0') {
+ ++src;
+ /* undo hex-encoding */
+ if (c == '%' && src[0] != '\0' && src[1] != '\0') {
+ c = (hextab[src[0]] << 4) | hextab[src[1]];
+ src += 2;
+ }
+ /* normal character? */
+ if (c >= ' ' && c <= '~') {
+ /* switch out of UTF-7 mode */
+ if (utf7mode) {
+ if (bitstogo) {
+ *dst++ = base64chars[(bitbuf << (6 - bitstogo)) & 0x3F];
+ }
+ *dst++ = '-';
+ utf7mode = 0;
+ }
+ *dst++ = c;
+ /* encode '&' as '&-' */
+ if (c == '&') {
+ *dst++ = '-';
+ }
+ continue;
+ }
+ /* switch to UTF-7 mode */
+ if (!utf7mode) {
+ *dst++ = '&';
+ utf7mode = 1;
+ }
+ /* Encode US-ASCII characters as themselves */
+ if (c < 0x80) {
+ ucs4 = c;
+ utf8total = 1;
+ } else if (utf8total) {
+ /* save UTF8 bits into UCS4 */
+ ucs4 = (ucs4 << 6) | (c & 0x3FUL);
+ if (++utf8pos < utf8total) {
+ continue;
+ }
+
+
+
+Newman Standards Track [Page 15]
+
+RFC 2192 IMAP URL Scheme September 1997
+
+
+ } else {
+ utf8pos = 1;
+ if (c < 0xE0) {
+ utf8total = 2;
+ ucs4 = c & 0x1F;
+ } else if (c < 0xF0) {
+ utf8total = 3;
+ ucs4 = c & 0x0F;
+ } else {
+ /* NOTE: can't convert UTF8 sequences longer than 4 */
+ utf8total = 4;
+ ucs4 = c & 0x03;
+ }
+ continue;
+ }
+ /* loop to split ucs4 into two utf16 chars if necessary */
+ utf8total = 0;
+ do {
+ if (ucs4 >= UTF16BASE) {
+ ucs4 -= UTF16BASE;
+ bitbuf = (bitbuf << 16) | ((ucs4 >> UTF16SHIFT)
+ + UTF16HIGHSTART);
+ ucs4 = (ucs4 & UTF16MASK) + UTF16LOSTART;
+ utf16flag = 1;
+ } else {
+ bitbuf = (bitbuf << 16) | ucs4;
+ utf16flag = 0;
+ }
+ bitstogo += 16;
+ /* spew out base64 */
+ while (bitstogo >= 6) {
+ bitstogo -= 6;
+ *dst++ = base64chars[(bitstogo ? (bitbuf >> bitstogo)
+ : bitbuf)
+ & 0x3F];
+ }
+ } while (utf16flag);
+ }
+ /* if in UTF-7 mode, finish in ASCII */
+ if (utf7mode) {
+ if (bitstogo) {
+ *dst++ = base64chars[(bitbuf << (6 - bitstogo)) & 0x3F];
+ }
+ *dst++ = '-';
+ }
+ /* tie off string */
+ *dst = '\0';
+}
+
+
+
+Newman Standards Track [Page 16]
+