Diffstat (limited to 'doc/rfc/rfc4952.txt')
-rw-r--r-- doc/rfc/rfc4952.txt 1123
1 files changed, 1123 insertions, 0 deletions
diff --git a/doc/rfc/rfc4952.txt b/doc/rfc/rfc4952.txt
new file mode 100644
index 0000000..d5368f4
--- /dev/null
+++ b/doc/rfc/rfc4952.txt
@@ -0,0 +1,1123 @@
+
+
+
+
+
+
+Network Working Group                                         J. Klensin
+Request for Comments: 4952
+Category: Informational                                            Y. Ko
+                                                                     ICU
+                                                               July 2007
+
+
+ Overview and Framework for Internationalized Email
+
+Status of This Memo
+
+ This memo provides information for the Internet community. It does
+ not specify an Internet standard of any kind. Distribution of this
+ memo is unlimited.
+
+Copyright Notice
+
+ Copyright (C) The IETF Trust (2007).
+
+Abstract
+
+ Full use of electronic mail throughout the world requires that people
+ be able to use their own names, written correctly in their own
+ languages and scripts, as mailbox names in email addresses. This
+ document introduces a series of specifications that define mechanisms
+ and protocol extensions needed to fully support internationalized
+ email addresses. These changes include an SMTP extension and
+ extension of email header syntax to accommodate UTF-8 data. The
+ document set also includes discussion of key assumptions and issues
+ in deploying fully internationalized email.
+
+
+Klensin & Ko Informational [Page 1]
+
+RFC 4952 EAI Framework July 2007
+
+
+Table of Contents
+
+ 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
+ 1.1. Role of This Specification . . . . . . . . . . . . . . . . 3
+ 1.2. Problem Statement . . . . . . . . . . . . . . . . . . . . 3
+ 1.3. Terminology . . . . . . . . . . . . . . . . . . . . . . . 4
+ 2. Overview of the Approach . . . . . . . . . . . . . . . . . . . 6
+ 3. Document Plan . . . . . . . . . . . . . . . . . . . . . . . . 6
+ 4. Overview of Protocol Extensions and Changes . . . . . . . . . 7
+ 4.1. SMTP Extension for Internationalized Email Address . . . . 7
+ 4.2. Transmission of Email Header Fields in UTF-8 Encoding . . 8
+ 4.3. Downgrading Mechanism for Backward Compatibility . . . . . 9
+ 5. Downgrading before and after SMTP Transactions . . . . . . . . 10
+ 5.1. Downgrading before or during Message Submission . . . . . 10
+ 5.2. Downgrading or Other Processing After Final SMTP
+ Delivery . . . . . . . . . . . . . . . . . . . . . . . . . 11
+ 6. Additional Issues . . . . . . . . . . . . . . . . . . . . . . 11
+ 6.1. Impact on URIs and IRIs . . . . . . . . . . . . . . . . . 11
+ 6.2. Interaction with Delivery Notifications . . . . . . . . . 12
+ 6.3. Use of Email Addresses as Identifiers . . . . . . . . . . 12
+ 6.4. Encoded Words, Signed Messages, and Downgrading . . . . . 12
+ 6.5. Other Uses of Local Parts . . . . . . . . . . . . . . . . 13
+ 6.6. Non-Standard Encapsulation Formats . . . . . . . . . . . . 13
+ 7. Experimental Targets . . . . . . . . . . . . . . . . . . . . . 13
+ 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13
+ 9. Security Considerations . . . . . . . . . . . . . . . . . . . 14
+ 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 15
+ 11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 16
+ 11.1. Normative References . . . . . . . . . . . . . . . . . . . 16
+ 11.2. Informative References . . . . . . . . . . . . . . . . . . 16
+
+
+1. Introduction
+
+ In order to use internationalized email addresses, we need to
+ internationalize both the domain part and the local part of email
+ addresses. The domain part of email addresses is already
+ internationalized [RFC3490], while the local part is not. Without
+ the extensions specified in this document, the mailbox name is
+ restricted to a subset of 7-bit ASCII [RFC2821]. Though MIME
+ [RFC2045] enables the transport of non-ASCII data, it does not
+ provide a mechanism for internationalized email addresses. In RFC
+ 2047 [RFC2047], MIME defines an encoding mechanism for some specific
+ message header fields to accommodate non-ASCII data. However, it
+ does not permit the use of email addresses that include non-ASCII
+ characters. Without the extensions defined here, or some equivalent
+ set, the only way to incorporate non-ASCII characters in any part of
+ email addresses is to use RFC 2047 coding to embed them in what RFC
+ 2822 [RFC2822] calls the "display name" (known as a "name phrase" or
+ by other terms elsewhere) of the relevant headers. Information coded
+ into the display name is invisible in the message envelope and, for
+ many purposes, is not part of the address at all.
+
+1.1. Role of This Specification
+
+ This document presents the overview and framework for an approach to
+ the next stage of email internationalization. This new stage
+ requires not only internationalization of addresses and headers, but
+ also associated transport and delivery models.
+
+ This document provides the framework for a series of experimental
+ specifications that, together, provide the details for a way to
+ implement and support internationalized email. The document itself
+ describes how the various elements of email internationalization fit
+ together and how the various documents relate to one another.
+
+1.2. Problem Statement
+
+ Internationalizing Domain Names in Applications (IDNA) [RFC3490]
+ permits internationalized domain names, but deployment has not yet
+ reached most users. One of the reasons for this is that we do not
+ yet have fully internationalized naming schemes. Domain names are
+ just one of the various names and identifiers that are required to be
+ internationalized. In many contexts, until more of those identifiers
+ are internationalized, internationalized domain names alone have
+ little value.
+
+ Email addresses are prime examples of why it is not good enough to
+ just internationalize the domain name. As most of us have learned
+ from experience, users strongly prefer email addresses that resemble
+ names or initials to those involving seemingly meaningless strings of
+ letters or numbers. Unless the entire email address can use familiar
+ characters and formats, users will perceive email as being culturally
+ unfriendly. If the names and initials used in email addresses can be
+ expressed in the native languages and writing systems of the users,
+ the Internet will be perceived as more natural, especially by those
+ whose native language is not written in a subset of a Roman-derived
+ script.
+
+ Internationalization of email addresses is not merely a matter of
+ changing the SMTP envelope; or of modifying the From, To, and Cc
+ headers; or of permitting upgraded Mail User Agents (MUAs) to decode
+ a special coding and respond by displaying local characters. To be
+ perceived as usable, the addresses must be internationalized and
+ handled consistently in all of the contexts in which they occur.
+ This requirement has far-reaching implications: collections of
+ patches and workarounds are not adequate. Even if they were
+ adequate, a workaround-based approach would likely result in an
+ assortment of implementations, each with a different set of patches
+ and workarounds applied, and in consequent user confusion about what
+ is actually usable and supported. Instead, we need to build a fully
+ internationalized email environment, focusing on permitting efficient
+ communication among those who share a language or other community.
+ That, in turn, implies changes to the mail header environment to
+ permit the full range of Unicode characters where that makes sense,
+ an SMTP Extension to permit UTF-8 [RFC3629] mail addressing and
+ delivery of those extended headers, and (finally) a requirement for
+ support of the 8BITMIME SMTP extension [RFC1652] so that all of these
+ can be transported through the mail system without having to overcome
+ the limitation that headers do not have content-transfer-encodings.
+
+1.3. Terminology
+
+ This document assumes a reasonable understanding of the protocols and
+ terminology of the core email standards as documented in [RFC2821]
+ and [RFC2822].
+
+ Much of the description in this document depends on the abstractions
+ of "Mail Transfer Agent" ("MTA") and "Mail User Agent" ("MUA").
+ However, it is important to understand that those terms and the
+ underlying concepts postdate the design of the Internet's email
+ architecture and the application of the "protocols on the wire"
+ principle to it. That email architecture, as it has evolved, and the
+ "wire" principle have prevented any strong and standardized
+ distinctions about how MTAs and MUAs interact on a given origin or
+ destination host (or even whether they are separate).
+
+ However, the term "final delivery MTA" is used in this document in a
+ fashion equivalent to the term "delivery system" or "final delivery
+ system" of RFC 2821. This is the SMTP server that controls the
+ format of the local parts of addresses and is permitted to inspect
+ and interpret them. It receives messages from the network for
+ delivery to mailboxes or for other local processing, including any
+ forwarding or aliasing that changes envelope addresses, rather than
+ relaying. From the perspective of the network, any local delivery
+ arrangements such as saving to a message store, handoff to specific
+ message delivery programs or agents, and mechanisms for retrieving
+ messages are all "behind" the final delivery MTA and hence are not
+ part of the SMTP transport or delivery process.
+
+ In this document, an address is "all-ASCII", or just an "ASCII
+ address", if every character in the address is in the ASCII character
+ repertoire [ASCII]; an address is "non-ASCII", or an "i18n-address",
+ if any character is not in the ASCII character repertoire. Such
+ addresses may be restricted in other ways, but those restrictions are
+ not relevant to this definition. The term "all-ASCII" is also
+ applied to other protocol elements when the distinction is important,
+ with "non-ASCII" or "internationalized" as its opposite.
+
+ The umbrella term to describe the email address internationalization
+ specified by this document and its companion documents is "UTF8SMTP".
+ For example, an address permitted by this specification is referred
+ to as a "UTF8SMTP (compliant) address".
+
+ Please note that, according to the definitions given here, the set of
+ all "all-ASCII" addresses and the set of all "non-ASCII" addresses
+ are mutually exclusive. The set of all UTF8SMTP addresses is the
+ union of these two sets.
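The distinction between "all-ASCII" and "non-ASCII" addresses defined
above can be sketched in a few lines of Python. This is only an
illustration of the definition, not part of the specification; the
function name classify_address is hypothetical, and the check ignores
every other syntactic restriction on addresses, as the definition does.

```python
def classify_address(address: str) -> str:
    """Classify an address by character repertoire only.

    An address is "all-ASCII" if every character is in the ASCII
    repertoire and "non-ASCII" (an "i18n-address") otherwise. The two
    classes are mutually exclusive by construction.
    """
    return "all-ASCII" if address.isascii() else "non-ASCII"


if __name__ == "__main__":
    for sample in ("user@domain.example", "us\u00e9r@domain.example"):
        print(sample, "->", classify_address(sample))
```

Every string falls into exactly one of the two classes, which mirrors
the statement that the set of UTF8SMTP addresses is their union.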
+
+ An "ASCII user" (i) exclusively uses email addresses that contain
+ ASCII characters only, and (ii) cannot generate recipient addresses
+ that contain non-ASCII characters.
+
+ An "i18mail user" has one or more non-ASCII email addresses. Such a
+ user may have ASCII addresses too; if the user has more than one
+ email account and a corresponding address, or more than one alias for
+ the same address, he or she has some method to choose which address
+ to use on outgoing email. Note that under this definition, it is not
+ possible to tell from an ASCII address if the owner of that address
+ is an i18mail user or not. (A non-ASCII address implies a belief
+ that the owner of that address is an i18mail user.) There is no such
+ thing as an "i18mail message"; the term applies only to users and
+ their agents and capabilities.
+
+ A "message" is sent from one user (sender) using a particular email
+ address to one or more other recipient email addresses (often
+ referred to just as "users" or "recipient users").
+
+ A "mailing list" is a mechanism whereby a message may be distributed
+ to multiple recipients by sending it to one recipient address. An
+ agent (typically not a human being) at that single address then
+ causes the message to be redistributed to the target recipients.
+ This agent sets the envelope return address of the redistributed
+ message to a different address from that of the original single
+ recipient message. Using a different envelope return address
+ (reverse-path) causes error (and other automatically generated)
+ messages to go to an error handling address.
+
+ As specified in RFC 2821, a message that is undeliverable for some
+ reason is expected to result in notification to the sender. This can
+ occur in either of two ways. One, typically called "Rejection",
+ occurs when an SMTP server returns a reply code indicating a fatal
+ error (a "5yz" code) or persistently returns a temporary failure
+ error (a "4yz" code). The other involves accepting the message
+ during SMTP processing and then generating a message to the sender,
+ typically known as a "Non-delivery Notification" or "NDN". Current
+ practice often favors rejection over NDNs because of the reduced
+ likelihood that the generation of NDNs will be used as a spamming
+ technique. The latter, NDN, case is unavoidable if an intermediate
+ MTA accepts a message that is then rejected by the next-hop server.
+
+ The pronouns "he" and "she" are used interchangeably to indicate a
+ human of indeterminate gender.
+
+ The key words "MUST", "SHALL", "REQUIRED", "SHOULD", "RECOMMENDED",
+ and "MAY" in this document are to be interpreted as described in RFC
+ 2119 [RFC2119].
+
+2. Overview of the Approach
+
+ This set of specifications changes both SMTP and the format of email
+ headers to permit non-ASCII characters to be represented directly.
+ Each important component of the work is described in a separate
+ document. The document set, whose members are described in the next
+ section, also contains informational documents whose purpose is to
+ provide implementation suggestions and guidance for the protocols.
+
+3. Document Plan
+
+ In addition to this document, the following documents make up this
+ specification and provide advice and context for it.
+
+ o SMTP extensions. This document [EAI-SMTPext] provides an SMTP
+ extension for internationalized addresses, as provided for in RFC
+ 2821.
+
+ o Email headers in UTF-8. This document [EAI-UTF8] essentially
+ updates RFC 2822 to permit some information in email headers to be
+ expressed directly by Unicode characters encoded in UTF-8 when the
+ SMTP extension described above is used. This document, possibly
+ with one or more supplemental ones, will also need to address the
+ interactions with MIME, including relationships between UTF8SMTP
+ and internal MIME headers and content types.
+
+ o In-transit downgrading from internationalized addressing with the
+ SMTP extension and UTF-8 headers to traditional email formats and
+ characters [EAI-downgrade]. Downgrading either at the point of
+ message origination or after the mail has successfully been
+ received by a final delivery SMTP server involves different
+ constraints and possibilities; see Section 4.3 and Section 5,
+ below. Processing that occurs after such final delivery,
+ particularly processing that is involved with the delivery to a
+ mailbox or message store, is sometimes called "Message Delivery"
+ processing.
+
+ o Extensions to the IMAP protocol to support internationalized
+ headers [EAI-imap].
+
+ o Parallel extensions to the POP protocol [EAI-pop].
+
+ o Description of internationalization changes for delivery
+ notifications (DSNs) [EAI-DSN].
+
+ o Scenarios for the use of these protocols [EAI-scenarios].
+
+4. Overview of Protocol Extensions and Changes
+
+4.1. SMTP Extension for Internationalized Email Address
+
+ An SMTP extension, "UTF8SMTP", is specified as follows:
+
+ o Permits the use of UTF-8 strings in email addresses, both local
+ parts and domain names.
+
+ o Permits the selective use of UTF-8 strings in email headers (see
+ Section 4.2).
+
+ o Requires that the server advertise the 8BITMIME extension
+ [RFC1652] and that the client support 8-bit transmission so that
+ header information can be transmitted without using special
+ content-transfer-encoding.
+
+ o Provides information to support downgrading mechanisms.
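A client could check the requirements listed above against the lines
of an EHLO reply roughly as follows. This is a hedged sketch: it
assumes the EHLO keyword is spelled the same as the extension name
("UTF8SMTP"), which [EAI-SMTPext] defines authoritatively, and the
helper names are hypothetical.

```python
def advertised_keywords(ehlo_reply_lines):
    """Return the set of extension keywords in an EHLO reply.

    Each element is one reply line such as "250-8BITMIME". The first
    line carries the server greeting, not a keyword, so it is skipped.
    """
    keywords = set()
    for line in ehlo_reply_lines[1:]:
        text = line[4:].strip()  # drop the "250-" or "250 " prefix
        if text:
            keywords.add(text.split()[0].upper())
    return keywords


def server_offers_utf8smtp(ehlo_reply_lines):
    """True only if both UTF8SMTP and 8BITMIME are advertised."""
    keywords = advertised_keywords(ehlo_reply_lines)
    return "UTF8SMTP" in keywords and "8BITMIME" in keywords


if __name__ == "__main__":
    reply = ["250-mx.domain.example", "250-8BITMIME", "250 UTF8SMTP"]
    print(server_offers_utf8smtp(reply))
```

Treating a server that omits 8BITMIME as not offering the extension
follows from the requirement that the server advertise both.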
+
+ Some general principles affect the development decisions underlying
+ this work.
+
+ 1. Email addresses enter subsystems (such as a user interface) that
+ may perform charset conversions or other encoding changes. When
+ the left hand side of the address includes characters outside the
+ US-ASCII character repertoire, use of punycode on the right hand
+ side is discouraged to promote consistent processing of
+ characters throughout the address.
+
+ 2. An SMTP relay must
+
+ * Either recognize the format explicitly, agreeing to do so via
+ an ESMTP option,
+
+ * Select and use an ASCII-only address, downgrading other
+ information as needed (see Section 4.3), or
+
+ * Reject the message or, if necessary, return a non-delivery
+ notification message, so that the sender can make another
+ plan.
+
+ If the message cannot be forwarded because the next-hop system
+ cannot accept the extension and insufficient information is
+ available to reliably downgrade it, it MUST be rejected or a non-
+ delivery message generated and sent.
+
+ 3. In the interest of interoperability, charsets other than UTF-8
+ are prohibited in mail addresses and headers. There is no
+ practical way to identify them properly with an extension similar
+ to this without introducing great complexity.
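The three choices open to an SMTP relay under principle 2 can be
sketched as a small decision function. The names RelayAction and
choose_relay_action are hypothetical, and the boolean inputs stand in
for checks that a real relay would make against the envelope, the
headers, and the next hop's EHLO reply.

```python
from enum import Enum


class RelayAction(Enum):
    FORWARD = "forward unchanged"
    DOWNGRADE = "downgrade to all-ASCII form"
    REJECT = "reject or send a non-delivery notification"


def choose_relay_action(message_needs_utf8smtp: bool,
                        next_hop_supports_utf8smtp: bool,
                        can_downgrade_reliably: bool) -> RelayAction:
    """Pick one of the options available to a relay.

    Rejection is always permitted; this sketch prefers forwarding or
    downgrading when they are possible.
    """
    if not message_needs_utf8smtp or next_hop_supports_utf8smtp:
        return RelayAction.FORWARD
    if can_downgrade_reliably:
        return RelayAction.DOWNGRADE
    return RelayAction.REJECT


if __name__ == "__main__":
    print(choose_relay_action(True, False, False))
```

A message that is entirely ASCII is forwarded regardless of what the
next hop advertises, since it does not depend on the extension.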
+
+ Conformance to the group of standards specified here for email
+ transport and delivery requires implementation of the SMTP Extension
+ specification, including recognition of the keywords associated with
+ alternate addresses, and the UTF-8 Header specification. Support for
+ downgrading is not required, but, if implemented, MUST be implemented
+ as specified. Similarly, if the system implements IMAP or POP, it
+ MUST conform to the i18n IMAP or POP specifications respectively.
+
+4.2. Transmission of Email Header Fields in UTF-8 Encoding
+
+ There are many places in MUAs or in a user presentation in which
+ email addresses or domain names appear. Examples include the
+ conventional From, To, or Cc header fields; Message-ID and
+ In-Reply-To header fields that normally contain domain names (but
+ that may be a special case); and in message bodies. Each of these
+ must be examined from an internationalization perspective. The user
+ will expect to see mailbox and domain names in local characters, and
+ to see them consistently. If non-obvious encodings, such as
+ protocol-specific ASCII-Compatible Encoding (ACE) variants, are used,
+ the user will inevitably, if only occasionally, see them rather than
+ "native" characters and will find that discomfiting or astonishing.
+ Similarly, if different codings are used for mail transport and
+ message bodies, the user is particularly likely to be surprised, if
+ only as a consequence of the long-established "things leak"
+ principle. The only practical way to avoid these sources of
+ discomfort, in both the medium and the longer term, is to have the
+ encodings used in transport be as similar to the encodings used in
+ message headers and message bodies as possible.
+
+ When email local parts are internationalized, it seems clear that
+ they should be accompanied by arrangements for the email headers to
+ be in the fully internationalized form. That form should presumably
+ use UTF-8 rather than ASCII as the base character set for the
+ contents of header fields (protocol elements such as the header field
+ names themselves will remain entirely in ASCII). For transition
+ purposes and compatibility with legacy systems, this can be done by
+ extending the encoding models of [RFC2045] and [RFC2231]. However,
+ our target should be fully internationalized headers, as discussed in
+ [EAI-UTF8].
+
+4.3. Downgrading Mechanism for Backward Compatibility
+
+ As with any use of the SMTP extension mechanism, there is always the
+ possibility of a client that requires the feature encountering a
+ server that does not support the required feature. In the case of
+ email address and header internationalization, the risk should be
+ minimized by the fact that the selection of submission servers is
+ presumably under the control of the sender's client and the selection
+ of potential intermediate relays is under the control of the
+ administration of the final delivery server.
+
+ For situations in which a client that needs to use UTF8SMTP
+ encounters a server that does not support the extension UTF8SMTP,
+ there are two possibilities:
+
+ o Reject the message or generate and send a non-delivery message,
+ requiring the sender to resubmit it with traditional-format
+ addresses and headers.
+
+ o Figure out a way to downgrade the envelope or message body in
+ transit. Especially when internationalized addresses are
+ involved, downgrading will require that all-ASCII addresses be
+ obtained from some source. An optional extension parameter is
+ provided as a way of transmitting an alternate address. Downgrade
+ issues and a specification are discussed in [EAI-downgrade].
+
+ (The client can also try an alternate next-hop host or requeue the
+ message and try later, on the assumption that the lack of UTF8SMTP is
+ a transient failure; since this ultimately resolves to success or
+ failure, it doesn't change the discussion here.)
+
+ The first of these two options, that of rejecting or returning the
+ message to the sender, MAY always be chosen.
+
+ If a UTF8SMTP-capable client is sending a message that does not
+ require the extended capabilities, that is, both the addresses in
+ the envelope and the entire set of headers of the message are
+ entirely in ASCII (perhaps including encoded words in the headers),
+ it SHOULD send the message whether or not the server announces
+ support for the extension.
+
+5. Downgrading before and after SMTP Transactions
+
+ In addition to the in-transit downgrades discussed above, downgrading
+ may also occur before or during the initial message submission or
+ after the delivery to the final delivery MTA. Because these cases
+ have a different set of available information from in-transit cases,
+ the constraints and opportunities may be somewhat different too.
+ These two cases are discussed in the subsections below.
+
+5.1. Downgrading before or during Message Submission
+
+ Perhaps obviously, the most convenient time to find an ASCII address
+ corresponding to an internationalized address is at the originating
+ MUA. This can occur either before the message is sent or after the
+ internationalized form of the message is rejected. It is also the
+ most convenient time to convert a message from the internationalized
+ form into conventional ASCII form or to generate a non-delivery
+ message to the sender if either is necessary. At that point, the
+ user has a full range of choices available, including contacting the
+ intended recipient out of band for an alternate address, consulting
+ appropriate directories, arranging for translation of both addresses
+ and message content into a different language, and so on. While it
+ is natural to think of message downgrading as optimally being a
+ fully-automated process, we should not underestimate the capabilities
+ of a user of at least moderate intelligence who wishes to communicate
+ with another such user.
+
+ In this context, one can easily imagine modifications to message
+ submission servers (as described in [RFC4409]) so that they would
+ perform downgrading, or perhaps even upgrading, operations, receiving
+ messages with one or more of the internationalization extensions
+ discussed here and adapting the outgoing message, as needed, to
+ respond to the delivery or next-hop environment it encounters.
+
+5.2. Downgrading or Other Processing After Final SMTP Delivery
+
+ When an email message is received by a final delivery SMTP server, it
+ is usually stored in some form. Then it is retrieved either by
+ software that reads the stored form directly or by client software
+ via some email retrieval mechanisms such as POP or IMAP.
+
+ The SMTP extension described in Section 4.1 provides protection only
+ in transport. It does not prevent MUAs and email retrieval
+ mechanisms that have not been upgraded to understand
+ internationalized addresses and UTF-8 headers from accessing stored
+ internationalized emails.
+
+ Since the final delivery SMTP server (or, to be more specific, its
+ corresponding mail storage agent) cannot safely assume that agents
+ accessing email storage will always be capable of handling the
+ extensions proposed here, it MAY either downgrade internationalized
+ emails or specially identify messages that utilize these extensions,
+ or both. If this is done, the final delivery SMTP server SHOULD
+ include a mechanism to preserve or recover the original
+ internationalized forms without information loss to support access by
+ UTF8SMTP-aware agents.
+
+6. Additional Issues
+
+ This section identifies issues that are not covered as part of this
+ set of specifications, but that will need to be considered as part of
+ deployment of email address and header internationalization.
+
+6.1. Impact on URIs and IRIs
+
+ The mailto: scheme defined in [RFC2368] and discussed in the
+ Internationalized Resource Identifier (IRI) specification [RFC3987]
+ may need to be modified when this work is completed and standardized.
+
+6.2. Interaction with Delivery Notifications
+
+ The advent of UTF8SMTP will require consideration of the
+ interaction with delivery notification mechanisms, including the SMTP
+ extension for requesting delivery notifications [RFC3461], and the
+ format of delivery notifications [RFC3464]. These issues are
+ discussed in a forthcoming document that will update those RFCs as
+ needed [EAI-DSN].
+
+6.3. Use of Email Addresses as Identifiers
+
+ There are a number of places in contemporary Internet usage in which
+ email addresses are used as identifiers for individuals, including as
+ identifiers to Web servers supporting some electronic commerce sites.
+ These documents do not address those uses, but it is reasonable to
+ expect that some difficulties will be encountered when
+ internationalized addresses are first used in those contexts, many of
+ which cannot even handle the full range of addresses permitted today.
+
+6.4. Encoded Words, Signed Messages, and Downgrading
+
+ One particular characteristic of the email format is its persistency:
+ MUAs are expected to handle messages that were originally sent
+ decades ago and not just those delivered seconds ago. As such, MUAs
+ and mail filtering software, such as that specified in Sieve
+ [RFC3028], will need to continue to accept and decode header fields
+ that use the "encoded word" mechanism [RFC2047] to accommodate
+ non-ASCII characters in some header fields. While extensions to both
+ POP3 and IMAP have been proposed to enable automatic EAI-upgrade --
+ including RFC 2047 decoding -- of messages by the POP3 or IMAP
+ server, there are message structures and MIME content-types for which
+ that cannot be done or where the change would have unacceptable side
+ effects.
+
+ For example, message parts that are cryptographically signed, using
+ e.g., S/MIME [RFC3851] or Pretty Good Privacy (PGP) [RFC3156], cannot
+ be upgraded from the RFC 2047 form to normal UTF-8 characters without
+ breaking the signature. Similarly, message parts that are encrypted
+ may contain, when decrypted, header fields that use the RFC 2047
+ encoding; such messages cannot be 'fully' upgraded without access to
+ cryptographic keys.
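The constraint above can be illustrated with Python's standard
email.header module, which decodes RFC 2047 encoded words. This is a
sketch only: the function upgrade_header_value and its part_is_signed
flag are hypothetical, and a real implementation would determine
signature coverage from the MIME structure of the message.

```python
from email.header import decode_header, make_header


def upgrade_header_value(raw_value: str, part_is_signed: bool) -> str:
    """Decode RFC 2047 encoded words unless a signature covers them.

    Rewriting a header inside a signed part would invalidate the
    signature, so such values are returned unchanged.
    """
    if part_is_signed:
        return raw_value
    return str(make_header(decode_header(raw_value)))


if __name__ == "__main__":
    encoded = "=?utf-8?q?J=C3=B6rg?="
    print(upgrade_header_value(encoded, part_is_signed=False))
    print(upgrade_header_value(encoded, part_is_signed=True))
```

Leaving signed values untouched keeps the encoded form available for
signature verification, at the cost of showing the user raw encoded
words unless the MUA decodes them for display only.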
+
+ Similar issues may arise if signed messages are downgraded in transit
+ [EAI-downgrade] and then an attempt is made to upgrade them to the
+ original form and then verify the signatures. Even the very subtle
+ changes that may result from algorithms to downgrade and then upgrade
+ again may be sufficient to invalidate the signatures if they impact
+ either the primary or MIME bodypart headers. When signatures are
+ present, downgrading must be performed with extreme care if at all.
+
+6.5. Other Uses of Local Parts
+
+ Local parts are sometimes used to construct domain labels, e.g., the
+ local part "user" in the address user@domain.example could be
+ converted into a vanity host user.domain.example with its Web space
+ at <http://user.domain.example> and the catchall addresses
+ any.thing.goes@user.domain.example.
+
+ Such schemes are obviously limited by, among other things, the SMTP
+ rules for domain names, and will not work without further
+ restrictions for other local parts such as the <utf8-local-part>
+ specified in [EAI-UTF8]. Whether this issue is relevant to these
+ specifications is an open question. It may be simply another case of
+ the considerable flexibility accorded to delivery MTAs in determining
+ the mailbox names they will accept and how they are interpreted.
+
+6.6. Non-Standard Encapsulation Formats
+
+ Some applications use formats similar to the application/mbox format
+ defined in [RFC4155] instead of the message/digest RFC 2046, Section
+ 5.1.5 [RFC2046] form to transfer multiple messages as single units.
+ Insofar as such applications assume that all stored messages use the
+ message/rfc822 RFC 2046, Section 5.2.1 [RFC2046] format with US-ASCII
+ headers, they are not ready for the extensions specified in this
+ series of documents and special measures may be needed to properly
+ detect and process them.
+
+7. Experimental Targets
+
+ In addition to the simple question of whether the model outlined here
+ can be made to work in a satisfactory way for upgraded systems and
+ provide adequate protection for un-upgraded ones, we expect that
+ actually working with the systems will provide answers to two
+ additional questions: what restrictions, such as character lists or
+ normalization, should be placed, if any, on the characters that are
+ permitted to be used in address local-parts; and how useful, in
+ practice, downgrading will turn out to be given whatever restrictions
+ and constraints must be placed upon it.
+
+8. IANA Considerations
+
+ This overview description and framework document does not contemplate
+ any IANA registrations or other actions. Some of the documents in
+ the group have their own IANA considerations sections and
+ requirements.
+
+9. Security Considerations
+
+ Any expansion of permitted characters and encoding forms in email
+ addresses raises some risks. There have been discussions of
+ so-called "IDN-spoofing" or "IDN homograph attacks". These attacks
+ allow an attacker (or "phisher") to spoof the domain or URLs of
+ businesses. The same kind of attack is also possible on the local
+ part of internationalized email addresses. It should be noted that
+ the proposed fix involving forcing all displayed elements into
+ normalized lower-case works for domain names in URLs, but not email
+ local parts since those are case sensitive.
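
The asymmetry described above can be sketched in Python as follows. The helper name is hypothetical, and plain `str.lower()` stands in for whatever case mapping a real implementation would apply to domain names; the only point is that the local part is passed through untouched.

```python
def fold_domain_only(address: str) -> str:
    """Lower-case the domain of an address; leave the local part as-is."""
    # rpartition splits at the LAST "@", so a quoted local part that
    # itself contains "@" is kept in one piece.
    local_part, at_sign, domain = address.rpartition("@")
    if not at_sign:
        raise ValueError("address has no '@'")
    # Assumption: str.lower() is an adequate stand-in for domain case mapping.
    return local_part + "@" + domain.lower()


print(fold_domain_only("Jose.Smith@EXAMPLE.COM"))
```

Because the local part keeps its case, two addresses that differ only in local-part case remain distinct after folding, so a display-time lower-casing rule cannot be relied on to remove look-alikes there.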
+
+ Since email addresses are often transcribed from business cards and
+ notes on paper, they are subject to problems arising from confusable
+ characters (see [RFC4690]). These problems are somewhat reduced if
+ the domain associated with the mailbox is unambiguous and supports a
+ relatively small number of mailboxes whose names follow local system
+ conventions. They are increased with very large mail systems in
+ which users can freely select their own addresses.
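
One crude way to surface some confusable local parts is to check whether the letters come from more than one script. The Python sketch below is a heuristic under stated assumptions: it uses the first word of each character's Unicode name as an approximation of its script, which is not the Unicode Script property, and it cannot catch look-alikes drawn entirely from a single script. The function name is hypothetical.

```python
import unicodedata


def letter_scripts(local_part: str) -> set[str]:
    """Approximate the set of scripts used by the letters of a local part."""
    scripts = set()
    for ch in local_part:
        if not ch.isalpha():
            continue  # digits, dots, and other punctuation are ignored
        name = unicodedata.name(ch, "")
        if name:
            # e.g. "LATIN SMALL LETTER A" -> "LATIN"
            scripts.add(name.split(" ", 1)[0])
    return scripts


# The second string replaces the Latin "a" with U+0430 CYRILLIC SMALL LETTER A.
print(letter_scripts("paypal"))
print(letter_scripts("p\u0430ypal"))
```

A large mail system that lets users pick their own addresses might treat a result with more than one entry as a reason for closer review rather than as proof of abuse.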
+
+ The internationalization of email addresses and headers must not
+ leave the Internet less secure than it is without the required
+ extensions. The requirements and mechanisms documented in this set
+ of specifications do not, in general, raise any new security issues.
+ They do require a review of issues associated with confusable
+ characters -- a topic that is being explored thoroughly elsewhere
+ (see, e.g., [RFC4690]) -- and, potentially, some issues with UTF-8
+ normalization, discussed in [RFC3629], and other transformations.
+ Normalization and other issues associated with transformations and
+ standard forms are also part of the subject of ongoing work discussed
+ in [Net-Unicode], in [IDNAbis-BIDI] and elsewhere. Some issues
+ specifically related to internationalized addresses and headers are
+ discussed in more detail in the other documents in this set.
+ However, in particular, caution should be taken that any
+ "downgrading" mechanism, or use of downgraded addresses, does not
+ inappropriately assume authenticated bindings between the
+ internationalized and ASCII addresses.
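
To make the normalization concern concrete, the Python sketch below shows two strings that render identically but differ code point by code point, and that compare equal only after both are normalized. The choice of NFC here is an assumption for illustration; this document does not settle whether, where, or in which form normalization should be applied.

```python
import unicodedata

composed = "jos\u00e9"       # "é" as the single code point U+00E9
decomposed = "jose\u0301"    # "e" followed by U+0301 COMBINING ACUTE ACCENT


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(composed == decomposed)                                   # raw comparison
print(composed.encode("utf-8") == decomposed.encode("utf-8"))   # byte comparison
print(same_after_nfc(composed, decomposed))                     # normalized comparison
```

A system that compares local parts byte for byte would treat these as two different mailboxes, while one that normalizes first would treat them as the same.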
+
+ The new UTF-8 header and message formats might also raise, or
+ aggravate, another known issue. If the model creates new forms of an
+ 'invalid' or 'malformed' message, then a new email attack is created:
+ in an effort to be robust, some or most agents will accept such
+ messages and interpret them as if they were well-formed. If a filter
+ interprets such a message differently than the final MUA, then it may
+ be possible to create a message that appears acceptable under the
+ filter's interpretation but should be rejected under the
+ interpretation given to it by the final MUA. Such attacks already
+ exist for existing messages and encoding layers, e.g., invalid MIME
+ syntax, invalid HTML markup, and invalid coding of particular image
+ types.
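
The gap between a strict and a lenient interpretation can be sketched in Python as follows. The helper name is hypothetical, and the byte sequence is an illustrative malformed header, not one taken from this document set.

```python
from typing import Optional


def decode_header_strict(raw: bytes) -> Optional[str]:
    """Return the decoded text, or None if the bytes are not valid UTF-8."""
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


# 0xC0 0xAF is an overlong encoding of "/", which is not valid UTF-8.
malformed = b"Subject: a\xc0\xafb"

strict_view = decode_header_strict(malformed)            # rejected
lenient_view = malformed.decode("utf-8", errors="replace")  # silently repaired

print(strict_view)
print(lenient_view)
```

If a filter takes the lenient path and the final MUA takes the strict one (or the reverse), the two see different messages, which is the kind of disagreement the paragraph above warns about.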
+
+ Models for the "downgrading" of messages or addresses from UTF-8 form
+ to some ASCII form, including those described in [EAI-downgrade],
+ pose another special problem and risk. Any system that transforms
+ one address or set of mail header fields into another becomes a point
+ at which spoofing attacks can occur, and those who wish to spoof
+ messages might be able to do so by imitating a message downgraded
+ from one with a legitimate original address.
+
+ In addition, email addresses are used in many contexts other than
+ sending mail, such as for identifiers under various circumstances
+ (see Section 6.3). Each of those contexts will need to be evaluated,
+ in turn, to determine whether the use of non-ASCII forms is
+ appropriate and what particular issues they raise.
+
+ This work will clearly impact any systems or mechanisms that are
+ dependent on digital signatures or similar integrity protection for
+ mail headers (see also the discussion in Section 6.4). Many
+ conventional uses of PGP and S/MIME are not affected since they are
+ used to sign body parts but not headers. On the other hand, the
+ developing work on DomainKeys Identified Mail (DKIM [DKIM-Charter])
+ will eventually need to consider this work and vice versa: while this
+ experiment does not propose to address or solve the issues raised by
+ DKIM and other signed header mechanisms, the issues will have to be
+ coordinated and resolved eventually if the two sets of protocols are
+ to co-exist. In addition, to the degree to which email addresses
+ appear in PKI (Public Key Infrastructure) certificates, standards
+ addressing such certificates will need to be upgraded to address
+ these internationalized addresses. Those upgrades will need to
+ address questions of spoofing by look-alikes of the addresses
+ themselves.
+
+10. Acknowledgements
+
+ This document, and the related ones, were originally derived from
+ documents by John Klensin and the JET group [Klensin-emailaddr],
+ [JET-IMA]. The work drew inspiration from discussions on the "IMAA"
+ mailing list, sponsored by the Internet Mail Consortium and
+ especially from an early document by Paul Hoffman and Adam Costello
+ [Hoffman-IMAA] that attempted to define an MUA-only solution to the
+ address internationalization problem.
+
+ More recent documents have benefited from considerable discussion
+ within the IETF EAI Working Group and especially from suggestions and
+ text provided by Martin Duerst, Frank Ellermann, Philip Guenther,
+ Kari Hurtta, and Alexey Melnikov, and from extended discussions among
+ the editors and authors of the core documents cited in Section 3:
+ Harald Alvestrand, Kazunori Fujiwara, Chris Newman, Pete Resnick,
+ Jiankang Yao, Jeff Yeh, and Yoshiro Yoneya.
+
+ Additional comments received during IETF Last Call, including those
+ from Paul Hoffman and Robert Sparks, were helpful in making the
+ document clearer and more comprehensive.
+
+11. References
+
+11.1. Normative References
+
+ [ASCII] American National Standards Institute (formerly
+ United States of America Standards Institute),
+ "USA Code for Information Interchange",
+ ANSI X3.4-1968, 1968.
+
+ ANSI X3.4-1968 has been replaced by newer
+ versions with slight modifications, but the 1968
+ version remains definitive for the Internet.
+
+ [RFC1652] Klensin, J., Freed, N., Rose, M., Stefferud, E.,
+ and D. Crocker, "SMTP Service Extension for
+ 8bit-MIMEtransport", RFC 1652, July 1994.
+
+ [RFC2119] Bradner, S., "Key words for use in RFCs to
+ Indicate Requirement Levels", BCP 14, RFC 2119,
+ March 1997.
+
+ [RFC2821] Klensin, J., "Simple Mail Transfer Protocol",
+ RFC 2821, April 2001.
+
+ [RFC3490] Faltstrom, P., Hoffman, P., and A. Costello,
+ "Internationalizing Domain Names in Applications
+ (IDNA)", RFC 3490, March 2003.
+
+ [RFC3629] Yergeau, F., "UTF-8, a transformation format of
+ ISO 10646", STD 63, RFC 3629, November 2003.
+
+11.2. Informative References
+
+ [DKIM-Charter] IETF, "Domain Keys Identified Mail (dkim)",
+ October 2006, <http://www.ietf.org/
+ html.charters/dkim-charter.html>.
+
+ [EAI-DSN] Newman, C., "UTF-8 Delivery and Disposition
+ Notification", Work in Progress, January 2007.
+
+ [EAI-SMTPext] Yao, J., Ed. and W. Mao, Ed., "SMTP extension
+ for internationalized email address", Work
+ in Progress, June 2007.
+
+ [EAI-UTF8] Yeh, J., "Internationalized Email Headers", Work
+ in Progress, April 2007.
+
+ [EAI-downgrade] Yoneya, Y., Ed. and K. Fujiwara, Ed.,
+ "Downgrading mechanism for Internationalized
+ eMail Address (IMA)", Work in Progress,
+ March 2007.
+
+ [EAI-imap] Resnick, P. and C. Newman, "IMAP Support for
+ UTF-8", Work in Progress, March 2007.
+
+ [EAI-pop] Newman, C., "POP3 Support for UTF-8", Work
+ in Progress, January 2007.
+
+ [EAI-scenarios] Alvestrand, H., "UTF-8 Mail: Scenarios", Work
+ in Progress, February 2007.
+
+ [Hoffman-IMAA] Hoffman, P. and A. Costello, "Internationalizing
+ Mail Addresses in Applications (IMAA)", Work
+ in Progress, October 2003.
+
+ [IDNAbis-BIDI] Alvestrand, H. and C. Karp, "An IDNA problem in
+ right-to-left scripts", Work in Progress,
+ October 2006.
+
+ [JET-IMA] Yao, J. and J. Yeh, "Internationalized eMail
+ Address (IMA)", Work in Progress, June 2005.
+
+ [Klensin-emailaddr] Klensin, J., "Internationalization of Email
+ Addresses", Work in Progress, July 2005.
+
+ [Net-Unicode] Klensin, J. and M. Padlipsky, "Unicode Format
+ for Network Interchange", Work in Progress,
+ March 2007.
+
+ [RFC2045] Freed, N. and N. Borenstein, "Multipurpose
+ Internet Mail Extensions (MIME) Part One: Format
+ of Internet Message Bodies", RFC 2045,
+ November 1996.
+
+ [RFC2046] Freed, N. and N. Borenstein, "Multipurpose
+ Internet Mail Extensions (MIME) Part Two: Media
+ Types", RFC 2046, November 1996.
+
+ [RFC2047] Moore, K., "MIME (Multipurpose Internet Mail
+ Extensions) Part Three: Message Header
+ Extensions for Non-ASCII Text", RFC 2047,
+ November 1996.
+
+ [RFC2231] Freed, N. and K. Moore, "MIME Parameter Value
+ and Encoded Word Extensions:
+ Character Sets, Languages, and Continuations",
+ RFC 2231, November 1997.
+
+ [RFC2368] Hoffman, P., Masinter, L., and J. Zawinski, "The
+ mailto URL scheme", RFC 2368, July 1998.
+
+ [RFC2822] Resnick, P., "Internet Message Format",
+ RFC 2822, April 2001.
+
+ [RFC3028] Showalter, T., "Sieve: A Mail Filtering
+ Language", RFC 3028, January 2001.
+
+ [RFC3156] Elkins, M., Del Torto, D., Levien, R., and T.
+ Roessler, "MIME Security with OpenPGP",
+ RFC 3156, August 2001.
+
+ [RFC3461] Moore, K., "Simple Mail Transfer Protocol (SMTP)
+ Service Extension for Delivery Status
+ Notifications (DSNs)", RFC 3461, January 2003.
+
+ [RFC3464] Moore, K. and G. Vaudreuil, "An Extensible
+ Message Format for Delivery Status
+ Notifications", RFC 3464, January 2003.
+
+ [RFC3851] Ramsdell, B., "Secure/Multipurpose Internet Mail
+ Extensions (S/MIME) Version 3.1 Message
+ Specification", RFC 3851, July 2004.
+
+ [RFC3987] Duerst, M. and M. Suignard, "Internationalized
+ Resource Identifiers (IRIs)", RFC 3987,
+ January 2005.
+
+ [RFC4155] Hall, E., "The application/mbox Media Type",
+ RFC 4155, September 2005.
+
+ [RFC4409] Gellens, R. and J. Klensin, "Message Submission
+ for Mail", RFC 4409, April 2006.
+
+ [RFC4690] Klensin, J., Faltstrom, P., Karp, C., and IAB,
+ "Review and Recommendations for
+ Internationalized Domain Names (IDNs)",
+ RFC 4690, September 2006.
+
+Authors' Addresses
+
+ John C Klensin
+ 1770 Massachusetts Ave, #322
+ Cambridge, MA 02140
+ USA
+
+ Phone: +1 617 491 5735
+ EMail: john-ietf@jck.com
+
+
+ YangWoo Ko
+ ICU
+ 119 Munjiro
+ Yuseong-gu, Daejeon 305-732
+ Republic of Korea
+
+ EMail: yw@mrko.pe.kr
+
+Full Copyright Statement
+
+ Copyright (C) The IETF Trust (2007).
+
+ This document is subject to the rights, licenses and restrictions
+ contained in BCP 78, and except as set forth therein, the authors
+ retain all their rights.
+
+ This document and the information contained herein are provided on an
+ "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
+ OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
+ THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
+ OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
+ THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
+ WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+
+Intellectual Property
+
+ The IETF takes no position regarding the validity or scope of any
+ Intellectual Property Rights or other rights that might be claimed to
+ pertain to the implementation or use of the technology described in
+ this document or the extent to which any license under such rights
+ might or might not be available; nor does it represent that it has
+ made any independent effort to identify any such rights. Information
+ on the procedures with respect to rights in RFC documents can be
+ found in BCP 78 and BCP 79.
+
+ Copies of IPR disclosures made to the IETF Secretariat and any
+ assurances of licenses to be made available, or the result of an
+ attempt made to obtain a general license or permission for the use of
+ such proprietary rights by implementers or users of this
+ specification can be obtained from the IETF on-line IPR repository at
+ http://www.ietf.org/ipr.
+
+ The IETF invites any interested party to bring to its attention any
+ copyrights, patents or patent applications, or other proprietary
+ rights that may cover technology that may be required to implement
+ this standard. Please address the information to the IETF at
+ ietf-ipr@ietf.org.
+
+Acknowledgement
+
+ Funding for the RFC Editor function is currently provided by the
+ Internet Society.
+