diff options
author | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
---|---|---|
committer | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
commit | 4bfd864f10b68b71482b35c818559068ef8d5797 (patch) | |
tree | e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc6562.txt | |
parent | ea76e11061bda059ae9f9ad130a9895cc85607db (diff) |
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc6562.txt')
-rw-r--r-- | doc/rfc/rfc6562.txt | 339 |
1 files changed, 339 insertions, 0 deletions
diff --git a/doc/rfc/rfc6562.txt b/doc/rfc/rfc6562.txt new file mode 100644 index 0000000..f4d8b16 --- /dev/null +++ b/doc/rfc/rfc6562.txt @@ -0,0 +1,339 @@ + + + + + + +Internet Engineering Task Force (IETF) C. Perkins +Request for Comments: 6562 University of Glasgow +Category: Standards Track JM. Valin +ISSN: 2070-1721 Mozilla Corporation + March 2012 + + Guidelines for the Use of + Variable Bit Rate Audio with Secure RTP + +Abstract + + This memo discusses potential security issues that arise when using + variable bit rate (VBR) audio with the secure RTP profile. + Guidelines to mitigate these issues are suggested. + +Status of This Memo + + This is an Internet Standards Track document. + + This document is a product of the Internet Engineering Task Force + (IETF). It represents the consensus of the IETF community. It has + received public review and has been approved for publication by the + Internet Engineering Steering Group (IESG). Further information on + Internet Standards is available in Section 2 of RFC 5741. + + Information about the current status of this document, any errata, + and how to provide feedback on it may be obtained at + http://www.rfc-editor.org/info/rfc6562. + +Copyright Notice + + Copyright (c) 2012 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents + (http://trustee.ietf.org/license-info) in effect on the date of + publication of this document. Please review these documents + carefully, as they describe your rights and restrictions with respect + to this document. Code Components extracted from this document must + include Simplified BSD License text as described in Section 4.e of + the Trust Legal Provisions and are provided without warranty as + described in the Simplified BSD License. + + + + + + + + +Perkins & Valin Standards Track [Page 1] + +RFC 6562 VBR Audio with SRTP March 2012 + + +Table of Contents + + 1. Introduction ...................................................2 + 2. Scenario-Dependent Risk ........................................2 + 3. Guidelines for Use of VBR Audio with SRTP ......................3 + 4. Guidelines for Use of Voice Activity Detection with SRTP .......3 + 5. Padding the Output of VBR Codecs ...............................4 + 6. Security Considerations ........................................5 + 7. Acknowledgements ...............................................5 + 8. References .....................................................5 + 8.1. Normative References ......................................5 + 8.2. Informative References ....................................6 + +1. Introduction + + The Secure RTP (SRTP) framework [RFC3711] is a widely used framework + for securing RTP sessions [RFC3550]. SRTP provides the ability to + encrypt the payload of an RTP packet, and optionally add an + authentication tag, while leaving the RTP header and any header + extension in the clear. A range of encryption transforms can be used + with SRTP, but none of the predefined encryption transforms use any + padding; the RTP and SRTP payload sizes match exactly. + + When using SRTP with voice streams compressed using variable bit rate + (VBR) codecs, the length of the compressed packets will depend on the + characteristics of the speech signal. This variation in packet size + will leak a small amount of information about the contents of the + speech signal. This is potentially a security risk for some + applications. For example, [spot-me] shows that known phrases in an + encrypted call using the Speex codec in VBR mode can be recognized + with high accuracy in certain circumstances, and [fon-iks] shows that + approximate transcripts of encrypted VBR calls can be derived for + some codecs without breaking the encryption. How significant these + results are, and how they generalize to other codecs, is still an + open question. This memo discusses ways in which such traffic + analysis risks may be mitigated. + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in RFC 2119 [RFC2119]. + +2. Scenario-Dependent Risk + + Whether the information leaks and attacks discussed in [spot-me], + [fon-iks], and similar works are significant is highly dependent on + the application and use scenario. In the worst case, using the rate + information to recognize a prerecorded message knowing the set of all + possible messages would lead to near-perfect accuracy. Even when the + + + +Perkins & Valin Standards Track [Page 2] + +RFC 6562 VBR Audio with SRTP March 2012 + + + audio is not prerecorded, there is a real possibility of being able + to recognize contents from encrypted audio when the dialog is highly + structured (e.g., when the eavesdropper knows that only a handful of + possible sentences are possible), and thus contain only little + information. Recognizing unconstrained conversational speech from + the rate information alone is unreliable and computationally + expensive at present, but does appear possible in some circumstances. + These attacks are only likely to improve over time. + + In practical SRTP scenarios, how significant the information leak is + when compared to other SRTP-related information must be considered, + such as the fact that the source and destination IP addresses are + available. + +3. Guidelines for Use of VBR Audio with SRTP + + It is the responsibility of the application designer to determine the + appropriate trade-off between security and bandwidth overhead. As a + general rule, VBR codecs should be considered safe in the context of + low-value encrypted unstructured calls. However, applications that + make use of prerecorded messages where the contents of such + prerecorded messages may be of any value to an eavesdropper (i.e., + messages beyond standard greeting messages) SHOULD NOT use codecs in + VBR mode. Interactive voice response (IVR) applications would be + particularly vulnerable since an eavesdropper could easily use the + rate information to recognize the prompts being played out. + Applications conveying highly sensitive unstructured information + SHOULD NOT use codecs in VBR mode. + + It is safe to use variable rate coding to adapt the output of a voice + codec to match characteristics of a network channel, provided this + adaptation is done in a way that does not expose any information on + the speech signal. For example, VBR audio can be used for congestion + control purposes, where the variation is driven by the available + network bandwidth, not by the input speech (i.e., the packet sizes + and spacing are constant unless the network conditions change). VBR + speech codecs can safely be used in this fashion with SRTP while + avoiding leaking information on the contents of the speech signal + that might be useful for traffic analysis. + +4. Guidelines for Use of Voice Activity Detection with SRTP + + Many speech codecs employ some form of voice activity detection (VAD) + to either suppress output frames, or generate some form of lower-rate + comfort noise frames, during periods when the speaker is not active. + If VAD is used on an encrypted speech signal, then some information + + + + + +Perkins & Valin Standards Track [Page 3] + +RFC 6562 VBR Audio with SRTP March 2012 + + + about the characteristics of that speech signal can be determined by + watching the patterns of voice activity. This information leakage is + less than with VBR coding since there are only two rates possible. + + The information leakage due to VAD in SRTP audio sessions can be much + reduced if the sender adds an unpredictable "overhang" period to the + end of active speech intervals, obscuring their actual length. An + RTP sender using VAD with encrypted SRTP audio SHOULD insert such an + overhang period at the end of each talkspurt, delaying the start of + the silence/comfort noise by a random interval. The length of the + overhang applied to each talkspurt must be randomly chosen in such a + way that it is computationally infeasible for an attacker to reliably + estimate the length of that talkspurt. This may be more important + for short talkspurts, since it seems easier to distinguish between + different single word responses based on the exact word length, than + to glean meaning from the duration of a longer phrase. The audio + data comprising the overhang period must be packetized and + transmitted in RTP packets in a manner that is indistinguishable from + the other data in the talkspurt. + + The overhang period SHOULD have an exponentially decreasing + probability distribution function. This ensures a long tail, while + being easy to compute. It is RECOMMENDED to use an overhang with a + "half life" of a few hundred milliseconds (this should be sufficient + to obscure the presence of interword pauses and the lengths of single + words spoken in isolation, for example, the digits of a credit card + number clearly enunciated for an automated system, but not so long as + to significantly reduce the effectiveness of VAD for detecting + listening pauses). Despite the overhang (and no matter what the + duration is), there is still a small amount of information leaked + about the start time of the talkspurt due to the fact that we cannot + apply an overhang to the start of a talkspurt without unacceptably + affecting intelligibility. For that reason, VAD SHOULD NOT be used + in encrypted IVR applications where the content of prerecorded + messages may be of any value to an eavesdropper. + + The application of a random overhang period to each talkspurt will + reduce the effectiveness of VAD in SRTP sessions when compared to + non-SRTP sessions. However, it is still expected that the use of VAD + will provide significant bandwidth savings for many encrypted + sessions. + +5. Padding the Output of VBR Codecs + + For scenarios where VBR is considered unsafe, a constant bit rate + (CBR) codec SHOULD be negotiated and used instead, or the VBR codec + SHOULD be operated in a CBR mode. However, if the codec does not + support CBR, RTP padding SHOULD be used to reduce the information + + + +Perkins & Valin Standards Track [Page 4] + +RFC 6562 VBR Audio with SRTP March 2012 + + + leak to an insignificant level. Packets may be padded to a constant + size or to a small range of sizes ([spot-me] achieves good results by + padding to the next multiple of 16 octets, but the amount of padding + needed to hide the variation in packet size will depend on the codec + and the sophistication of the attacker) or may be padded to a size + that varies with time. The most secure and RECOMMENDED option is to + pad all packets throughout the call to the same size. + + In the case where the size of the padded packets varies in time, the + same concerns as for VAD apply. That is, the padding SHOULD NOT be + reduced without waiting for a certain (random) time. The RECOMMENDED + "hold time" is the same as the one for VAD. + + Note that SRTP encrypts the count of the number of octets of padding + added to a packet, but not the bit in the RTP header that indicates + that the packet has been padded. For this reason, it is RECOMMENDED + to add at least one octet of padding to all packets in a media + stream, so an attacker cannot tell which packets needed padding. + +6. Security Considerations + + This entire memo is about security. The security considerations of + [RFC3711] also apply. + +7. Acknowledgements + + ZRTP [RFC6189] contains similar recommendations; the purpose of this + memo is to highlight these issues to a wider audience, since they are + not specific to ZRTP. Thanks are due to Phil Zimmermann, Stefan + Doehla, Mats Naslund, Gregory Maxwell, David McGrew, Mark Baugher, + Koen Vos, Ingemar Johansson, and Stephen Farrell for their comments + and feedback on this memo. + +8. References + +8.1. Normative References + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. + Jacobson, "RTP: A Transport Protocol for Real-Time + Applications", STD 64, RFC 3550, July 2003. + + [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. + Norrman, "The Secure Real-time Transport Protocol (SRTP)", + RFC 3711, March 2004. + + + + +Perkins & Valin Standards Track [Page 5] + +RFC 6562 VBR Audio with SRTP March 2012 + + +8.2. Informative References + + [RFC6189] Zimmermann, P., Johnston, A., and J. Callas, "ZRTP: Media + Path Key Agreement for Unicast Secure RTP", RFC 6189, + April 2011. + + [fon-iks] White, A., Matthews, A., Snow, K., and F. Monrose, + "Phonotactic Reconstruction of Encrypted VoIP + Conversations: Hookt on fon-iks", Proceedings of the IEEE + Symposium on Security and Privacy 2011, May 2011. + + [spot-me] Wright, C., Ballard, L., Coull, S., Monrose, F., and G. + Masson, "Spot me if you can: Uncovering spoken phrases in + encrypted VoIP conversation", Proceedings of the IEEE + Symposium on Security and Privacy 2008, May 2008. + +Authors' Addresses + + Colin Perkins + University of Glasgow + School of Computing Science + Glasgow G12 8QQ + UK + + EMail: csp@csperkins.org + + + Jean-Marc Valin + Mozilla Corporation + 650 Castro Street + Mountain View, CA 94041 + USA + + Phone: +1 650 903-0800 + EMail: jmvalin@jmvalin.ca + + + + + + + + + + + + + + + + +Perkins & Valin Standards Track [Page 6] + |