summaryrefslogtreecommitdiff
path: root/doc/rfc/rfc6590.txt
diff options
context:
space:
mode:
authorThomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
committerThomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
commit4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
treee3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc6590.txt
parentea76e11061bda059ae9f9ad130a9895cc85607db (diff)
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc6590.txt')
-rw-r--r--doc/rfc/rfc6590.txt451
1 files changed, 451 insertions, 0 deletions
diff --git a/doc/rfc/rfc6590.txt b/doc/rfc/rfc6590.txt
new file mode 100644
index 0000000..2cb5d3c
--- /dev/null
+++ b/doc/rfc/rfc6590.txt
@@ -0,0 +1,451 @@
+
+
+
+
+
+
+Internet Engineering Task Force (IETF) J. Falk, Ed.
+Request for Comments: 6590 Return Path
+Category: Standards Track M. Kucherawy, Ed.
+ISSN: 2070-1721 Cloudmark
+ April 2012
+
+
+ Redaction of Potentially Sensitive Data from Mail Abuse Reports
+
+Abstract
+
+ Email messages often contain information that might be considered
+ private or sensitive, per either regulation or social norms. When
+ such a message becomes the subject of a report intended to be shared
+ with other entities, the report generator may wish to redact or elide
+ the sensitive portions of the message. This memo suggests one method
+ for doing so effectively.
+
+Status of This Memo
+
+ This is an Internet Standards Track document.
+
+ This document is a product of the Internet Engineering Task Force
+ (IETF). It represents the consensus of the IETF community. It has
+ received public review and has been approved for publication by the
+ Internet Engineering Steering Group (IESG). Further information on
+ Internet Standards is available in Section 2 of RFC 5741.
+
+ Information about the current status of this document, any errata,
+ and how to provide feedback on it may be obtained at
+ http://www.rfc-editor.org/info/rfc6590.
+
+Copyright Notice
+
+ Copyright (c) 2012 IETF Trust and the persons identified as the
+ document authors. All rights reserved.
+
+ This document is subject to BCP 78 and the IETF Trust's Legal
+ Provisions Relating to IETF Documents
+ (http://trustee.ietf.org/license-info) in effect on the date of
+ publication of this document. Please review these documents
+ carefully, as they describe your rights and restrictions with respect
+ to this document. Code Components extracted from this document must
+ include Simplified BSD License text as described in Section 4.e of
+ the Trust Legal Provisions and are provided without warranty as
+ described in the Simplified BSD License.
+
+
+
+
+
+Falk & Kucherawy Standards Track [Page 1]
+
+RFC 6590 Redaction April 2012
+
+
+Table of Contents
+
+ 1. Introduction ....................................................2
+ 2. Key Words .......................................................3
+ 3. Recommended Practice ............................................3
+ 4. Transformation Mechanisms .......................................4
+ 5. Security Considerations .........................................5
+ 5.1. General ....................................................5
+ 5.2. Digest Collisions ..........................................5
+ 5.3. Information Not Redacted ...................................5
+ 6. Privacy Considerations ..........................................6
+ 7. References ......................................................6
+ 7.1. Normative References .......................................6
+ 7.2. Informative References .....................................6
+ Appendix A. Example ................................................7
+ Appendix B. Acknowledgements .......................................8
+
+1. Introduction
+
+ The Abuse Reporting Format [ARF] defines a message format for sending
+ reports of abuse in the messaging infrastructure, with an eye toward
+ automating both the generation and consumption of those reports.
+
+ For privacy considerations, it might be the policy of a report
+ generator to anonymize, or obscure, portions of the report that might
+ identify an end user who caused the report to be generated. This has
+ come to be known in feedback loop parlance as "redaction". Precisely
+ how this is done is unspecified in [ARF], as it will generally be a
+ matter of local policy. That specification does admonish generators
+ against being too overzealous with this practice, as obscuring too
+ much data makes the report non-actionable.
+
+ Previous redaction practices, such as replacing local-parts of
+ addresses with a uniform string like "xxxxxxxx", frustrated any kind
+ of prioritizing or grouping of reports. This memo presents a
+ practice for conducting redaction in a manner that allows a report
+ receiver to detect that two reports were caused by the same end user,
+ without revealing the identity of that user. That is, the report
+ receiver can use the redacted string, such as an obscured email
+ address, to determine that two such unredacted strings were
+ identical; the reports originally contained the same address.
+
+
+
+
+
+
+
+
+
+
+Falk & Kucherawy Standards Track [Page 2]
+
+RFC 6590 Redaction April 2012
+
+
+ Generally, it is assumed that the recipient-identifying fields of a
+ message, when copied into a report, are to be obscured to protect the
+ identity of the end user who submitted the complaint about the
+ message. However, it is also presumed that other data will be left
+ intact, and those data could be correlated against log files or other
+ resources to determine the intended recipient of the original
+ message.
+
+2. Key Words
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+ document are to be interpreted as described in [KEYWORDS].
+
+3. Recommended Practice
+
+ When redacting of reports is desired, in order to enable a report
+ receiver to correlate reports that might refer to a common but
+ anonymous source, the report generator SHOULD use the following
+ practice:
+
+ 1. Select a transformation mechanism (see Section 4) that is
+ consistent (i.e., the same input string produces the same output
+ each time) and reasonably collision-resistant (i.e., two
+ different inputs are unlikely to produce the same output).
+
+ 2. Identify string(s) (such as local-parts of email addresses) in a
+ message that need to be redacted. Call these strings the
+ "private data".
+
+ 3. For each piece of private data, apply the selected transformation
+ mechanism.
+
+ 4. If the output of the transformation can contain bytes that are
+ not printable ASCII, or if the output can include characters not
+ appropriate to replace the private data directly, encode the
+ output with the base64 algorithm as defined in Section 4 of
+ [BASE64], or some similar translation, to form a valid
+ replacement in the original context. For example, replacing a
+ local-part in an email address with transformation output
+ containing an "@" character (ASCII 0x40) or a space character
+ (ASCII 0x20) is not permitted by the specification for local-part
+ [SMTP], so the transformation output needs to be encoded as
+ described.
+
+
+
+
+
+
+
+Falk & Kucherawy Standards Track [Page 3]
+
+RFC 6590 Redaction April 2012
+
+
+ 5. Replace each instance of private data with the corresponding
+ (possibly encoded) transformation when generating the report.
+ Note that the replaced text could also be in a context that has
+ constraints, such as length limits that need to be observed.
+
+ This has the effect of obscuring the data (in a potentially
+ irreversible way) while still allowing the report recipient to
+ observe that numerous reports are about one particular end user.
+ Such detection enables the receiver to prioritize its reactions based
+ on problems that appear to be focused on specific end users that may
+ be under attack.
+
+4. Transformation Mechanisms
+
+ This memo does not specify a particular transformation mechanism as a
+ requirement. The interoperability that this memo seeks to provide is
+ enabled by the consistency of the transformation.
+
+ Dealing with the issue of the security of the transformation (i.e.,
+ frustrating attempts to reverse the transformation) is a matter of
+ local policy. A continuum of possible transformations exists, from
+ trivial ones such as rot13, CRC32, and base64, through strong
+ cryptographic encodings such as the Hashed Message Authentication
+ Code [HMAC] and even full encryption, or private transformations such
+ as mapping an email address to an internal customer number. An
+ operator wishing to perform report redaction needs to select a
+ consistent transformation that obscures the private data and is
+ resilient to attempts to extract the original data to the extent
+ required by local policy, keeping in mind that the environment in
+ which the transformation is operating is not a highly secure one.
+ See Section 5.3 for further details of this issue.
+
+ An implementation MAY choose any transformation that has a reasonably
+ low likelihood of collision.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Falk & Kucherawy Standards Track [Page 4]
+
+RFC 6590 Redaction April 2012
+
+
+5. Security Considerations
+
+5.1. General
+
+ General security issues with respect to these reports are found
+ in [ARF].
+
+5.2. Digest Collisions
+
+ Message digest collisions are a well-understood issue. Their
+ application here involves a report receiver improperly concluding
+ that two pieces of redacted information were originally the same when
+ in fact they are not. This can lead to a denial of service, where
+ the inadvertently improper application of complaint data causes
+ unjustified corrective action. Such cases are sufficiently unlikely
+ as to be of little concern.
+
+5.3. Information Not Redacted
+
+ Although the identity of the user causing a report to be generated
+ can be obscured using this mechanism, other properties of a message
+ (such as the Message-ID field) that are not redacted could be used to
+ recover the original data by locating them in the message logs of the
+ originating system or via other data correlation techniques. It is
+ incumbent on the report generator to anticipate and redact or
+ otherwise obscure such data, or accept that such recovery is possible
+ even from the very simplest kinds of feedback.
+
+ It is for this reason that the normative portions of this memo do not
+ include stronger assertions about cryptography used in the
+ transformation. Given the ultimate recoverability of the redacted
+ information, the cryptographic strength of the transformation is not
+ a critical security measure.
+
+ The process of redacting a feedback report satisfies a privacy
+ requirement established by local policy, and is not meant to provide
+ strong security properties.
+
+ [FBL-BCP] and Section 8 of [ARF] discuss topics related to
+ establishment of bilateral agreements between report producers and
+ consumers. The issues raised here are also things to be considered
+ when establishing such agreements.
+
+
+
+
+
+
+
+
+
+Falk & Kucherawy Standards Track [Page 5]
+
+RFC 6590 Redaction April 2012
+
+
+6. Privacy Considerations
+
+ While the method of redaction described in this document may reduce
+ the likelihood of some types of private data from leaking between
+ ADministrative Management Domains (ADMDs), it is extremely unlikely
+ that report generation software could ever be created to recognize
+ all of the different ways that private information could be expressed
+ through human written language. If further protections are required,
+ implementers may wish to consider establishing some sort of out-of-
+ band arrangements between the relevant entities, to contain private
+ data as much as possible.
+
+7. References
+
+7.1. Normative References
+
+ [ARF] Shafranovich, Y., Levine, J., and M. Kucherawy, "An
+ Extensible Format for Email Feedback Reports", RFC 5965,
+ August 2010.
+
+ [BASE64] Josefsson, S., "The Base16, Base32, and Base64 Data
+ Encodings", RFC 4648, October 2006.
+
+ [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119, March 1997.
+
+7.2. Informative References
+
+ [FBL-BCP] Falk, J., Ed., "Complaint Feedback Loop Operational
+ Recommendations", RFC 6449, November 2011.
+
+ [HMAC] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed-
+ Hashing for Message Authentication", RFC 2104,
+ February 1997.
+
+ [SMTP] Klensin, J., "Simple Mail Transfer Protocol", RFC 5321,
+ October 2008.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Falk & Kucherawy Standards Track [Page 6]
+
+RFC 6590 Redaction April 2012
+
+
+Appendix A. Example
+
+ Assume the following input message:
+
+ From: alice@example.com
+ To: bob@example.net
+ Subject: Make money fast!
+ Message-ID: <123456789@mailer.example.com>
+ Date: Thu, 17 Nov 2011 22:19:40 -0500
+
+ Want to make a lot of money really fast? Check it out!
+ http://www.example.com/scam/0xd0d0cafe
+
+ On receipt, bob@example.net reports this message as abusive through
+ whatever mechanism his mailbox provider has established. This causes
+ an [ARF] message to be generated. However, example.net wishes to
+ obscure Bob's email address lest it be relayed to the offending
+ agent, which could lead to more trouble for Bob.
+
+ Thus, example.net plans to redact the local-part of the recipient
+ address in the To: field. Local policy and security requirements
+ suggest that the algorithm known as "H" (a hash of a key concatenated
+ with the data to be obscured) using SHA1 is adequate. It has thus
+ selected a redaction key of "potatoes", and the private data in this
+ case is the string "bob". The concatenation of "potatoesbob" is
+ digested with SHA1 and then base64-encoded to the string
+ "rZ8cqXWGiKHzhz1MsFRGTysHia4=".
+
+ Therefore, when constructing the ARF message in response to Bob's
+ complaint, the following form of the received message is used in the
+ third part of the ARF report:
+
+ From: alice@example.com
+ To: rZ8cqXWGiKHzhz1MsFRGTysHia4=@example.net
+ Subject: Make money fast!
+ Message-ID: <123456789@mailer.example.com>
+ Date: Thu, 17 Nov 2011 22:19:40 -0500
+
+ Want to make a lot of money really fast? Check it out!
+ http://www.example.com/scam/0xd0d0cafe
+
+ Note, however, that it is possible that the redacted information can
+ be recovered by agents at example.com searching their logs for the
+ original envelope associated with the message, by correlating with
+ the Message-ID contents, which were not redacted here. It is
+ expected that feedback loops generating such reports involve senders
+ that have been vetted against such information leakage.
+
+
+
+
+Falk & Kucherawy Standards Track [Page 7]
+
+RFC 6590 Redaction April 2012
+
+
+Appendix B. Acknowledgements
+
+ Much of the text in this document was initially moved from other MARF
+ working group documents, with contributions from Monica Chew, Tim
+ Draegen, Michael Adkins, and other members of the Messaging Anti-
+ Abuse Working Group. Additional feedback was provided by John
+ Levine, S. Moonesamy, Alessandro Vesely, and Mykyta Yevstifeyev.
+
+Authors' Addresses
+
+ J.D. Falk (editor)
+ Return Path
+ 100 Mathilda Place, Suite 100
+ Sunnyvale, CA 94086
+ US
+
+ EMail: ietf@cybernothing.org
+ URI: http://www.returnpath.net/
+
+
+ M. Kucherawy (editor)
+ Cloudmark
+ 128 King St., 2nd Floor
+ San Francisco, CA 94107
+ US
+
+ EMail: msk@cloudmark.com
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Falk & Kucherawy Standards Track [Page 8]
+