Diffstat (limited to 'doc/rfc/rfc8817.txt')
-rw-r--r-- | doc/rfc/rfc8817.txt | 984 |
1 files changed, 984 insertions, 0 deletions
diff --git a/doc/rfc/rfc8817.txt b/doc/rfc/rfc8817.txt new file mode 100644 index 0000000..2fdbb81 --- /dev/null +++ b/doc/rfc/rfc8817.txt @@ -0,0 +1,984 @@ + + + + +Internet Engineering Task Force (IETF) V. Demjanenko +Request for Comments: 8817 J. Punaro +Category: Standards Track D. Satterlee +ISSN: 2070-1721 VOCAL Technologies, Ltd. + August 2020 + + + RTP Payload Format for Tactical Secure Voice Cryptographic + Interoperability Specification (TSVCIS) Codec + +Abstract + + This document describes the RTP payload format for the Tactical + Secure Voice Cryptographic Interoperability Specification (TSVCIS) + speech coder. TSVCIS is a scalable narrowband voice coder supporting + varying encoder data rates and fallbacks. It is implemented as an + augmentation to the Mixed Excitation Linear Prediction Enhanced + (MELPe) speech coder by conveying additional speech coder parameters + to enhance voice quality. TSVCIS augmented speech data is processed + in conjunction with its temporally matched Mixed Excitation Linear + Prediction (MELP) 2400 speech data. The RTP packetization of TSVCIS + and MELPe speech coder data is described in detail. + +Status of This Memo + + This is an Internet Standards Track document. + + This document is a product of the Internet Engineering Task Force + (IETF). It represents the consensus of the IETF community. It has + received public review and has been approved for publication by the + Internet Engineering Steering Group (IESG). Further information on + Internet Standards is available in Section 2 of RFC 7841. + + Information about the current status of this document, any errata, + and how to provide feedback on it may be obtained at + https://www.rfc-editor.org/info/rfc8817. + +Copyright Notice + + Copyright (c) 2020 IETF Trust and the persons identified as the + document authors. All rights reserved. 
+ + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents + (https://trustee.ietf.org/license-info) in effect on the date of + publication of this document. Please review these documents + carefully, as they describe your rights and restrictions with respect + to this document. Code Components extracted from this document must + include Simplified BSD License text as described in Section 4.e of + the Trust Legal Provisions and are provided without warranty as + described in the Simplified BSD License. + +Table of Contents + + 1. Introduction + 1.1. Conventions + 1.2. Abbreviations + 2. Background + 3. Payload Format + 3.1. MELPe Bitstream Definitions + 3.1.1. 2400 bps Bitstream Structure + 3.1.2. 1200 bps Bitstream Structure + 3.1.3. 600 bps Bitstream Structure + 3.1.4. Comfort Noise Bitstream Definition + 3.2. TSVCIS Bitstream Definition + 3.3. Multiple TSVCIS Frames in an RTP Packet + 3.4. Congestion Control Considerations + 4. Payload Format Parameters + 4.1. Media Type Definitions + 4.2. Mapping to SDP + 4.3. Declarative SDP Considerations + 4.4. Offer/Answer SDP Considerations + 5. Discontinuous Transmissions + 6. Packet Loss Concealment + 7. IANA Considerations + 8. Security Considerations + 9. References + 9.1. Normative References + 9.2. Informative References + Authors' Addresses + +1. Introduction + + This document describes how compressed Tactical Secure Voice + Cryptographic Interoperability Specification (TSVCIS) speech as + produced by the TSVCIS codec [TSVCIS] [NRLVDR] may be formatted for + use as an RTP payload. The TSVCIS speech coder (or TSVCIS speech- + aware communications equipment on any intervening transport link) may + adjust to restricted bandwidth conditions by reducing the amount of + augmented speech data and relying on the underlying MELPe speech + coder for the most constrained bandwidth links. 
+ + Details are provided for packetizing the TSVCIS augmented speech data + along with MELPe 2400 bps speech parameters in an RTP packet. The + sender may send one or more codec data frames per packet, depending + on the application scenario or based on transport network conditions, + bandwidth restrictions, delay requirements, and packet loss + tolerance. + +1.1. Conventions + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all + capitals, as shown here. + + Best current practices for writing an RTP payload format + specification were followed [RFC2736] [RFC8088]. + +1.2. Abbreviations + + The following abbreviations are used in this document. + + AVP: Audio/Video Profile + + AVPF: Audio/Video Profile Feedback + + CELP: Code-Excited Linear Prediction + + FEC: Forward Error Correction + + LPC: Linear-Predictive Coding + + LSB: Least Significant Bit + + MELP: Mixed Excitation Linear Prediction + + MELPe: Mixed Excitation Linear Prediction Enhanced + + MSB: Most Significant Bit + + MTC: Modified Count + + NATO: North Atlantic Treaty Organization + + NRL: Naval Research Lab + + PLC: Packet Loss Concealment + + SAVP: Secure Audio/Video Profile + + SAVPF: Secure Audio/Video Profile Feedback + + SDP: Session Description Protocol + + SSRC: Synchronization Source + + SRTP: Secure Real-Time Transport Protocol + + TSVCIS: Tactical Secure Voice Cryptographic Interoperability + Specification + + VAD: Voice Activity Detect + + VDR: Variable Data Rate + +2. Background + + The MELP speech coder was developed by the US military as an upgrade + from the LPC-based CELP standard vocoder for low-bitrate + communications [MELP]. ("LPC" stands for "Linear-Predictive Coding", + and "CELP" stands for "Code-Excited Linear Prediction".)
MELP was + further enhanced and subsequently adopted by NATO as "MELPe" for use + by its members and Partnership for Peace countries for military and + other governmental communications as international NATO Standard + STANAG 4591 [MELPE]. + + The Tactical Secure Voice Cryptographic Interoperability + Specification (TSVCIS) is a specification written by the Tactical + Secure Voice Working Group (TSVWG) to enable all modern tactical + secure voice devices to be interoperable across the US Department of + Defense [TSVCIS]. One of the most important aspects is that the + voice modes defined in TSVCIS are based on specific fixed rates of + the Naval Research Lab's (NRL's) Variable Data Rate (VDR) Vocoder, + which uses the MELPe standard as its base [NRLVDR]. A complete + TSVCIS speech frame consists of MELPe speech parameters and + corresponding TSVCIS augmented speech data. + + In addition to the augmented speech data, the TSVCIS specification + identifies which speech coder and framing bits are to be encrypted + and how they are protected by forward error correction (FEC) + techniques (using block codes). At the RTP transport layer, only the + speech coder-related bits need to be considered and are conveyed in + unencrypted form. In most IP-based network deployments, standard + link encryption methods (Secure Real-Time Transport Protocol (SRTP), + VPNs, FIPS 140 link encryptors, or Type 1 Ethernet encryptors) would + be used to secure the RTP speech contents. + + TSVCIS augmented speech data is derived from the signal processing + and data generated by the MELPe speech coder. For the purposes of + this specification, only the general parameter nature of TSVCIS will + be characterized. Depending on the bandwidth available (and FEC + requirements), a varying number of TSVCIS-specific speech coder + parameters need to be transported. These are first byte-packed and + then conveyed from encoder to decoder.
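The byte-packing step mentioned above, which the three-bit/five-bit example that follows illustrates, might be sketched as below. This is only an illustrative reading of that example, not code from the TSVCIS specification: the function name `pack_fields` and the `(value, width)` input shape are assumptions. Each field is emitted MSB first, and each output octet is filled starting from its least significant bit, so that bit A lands in the octet's LSB and bit H in its MSB.

```python
from typing import Iterable, Tuple


def pack_fields(fields: Iterable[Tuple[int, int]]) -> bytes:
    """Pack (value, width) fields into octets (hypothetical helper).

    Assumed reading of the packing example: bits of each field are
    taken MSB first and placed into the current octet starting at
    its least significant bit. A field that does not fit continues
    in the next octet, so its most significant bits end up in the
    earlier octet.
    """
    out = bytearray()
    current = 0   # octet being assembled
    filled = 0    # number of bit positions already used in `current`
    for value, width in fields:
        for shift in range(width - 1, -1, -1):  # field MSB first
            bit = (value >> shift) & 1
            current |= bit << filled
            filled += 1
            if filled == 8:
                out.append(current)
                current = 0
                filled = 0
    if filled:
        # Unfilled bits of the last octet are left as zero.
        out.append(current)
    return bytes(out)


# Three-bit field ABC = 101 followed by five-bit field DEFGH = 00011.
packed = pack_fields([(0b101, 3), (0b00011, 5)])
print(packed.hex())
```

With these inputs, A, C, G, and H are the bits set to one, which corresponds to octet positions 0, 2, 6, and 7 counted from the LSB.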
+ + Byte packing of TSVCIS speech data into packed parameters is + processed as per the following example, where + + Three-bit field: Bits A, B, and C (A is MSB; C is LSB) + + Five-bit field: Bits D, E, F, G, and H (D is MSB; H is LSB) + + MSB LSB + 0 1 2 3 4 5 6 7 + +------+------+------+------+------+------+------+------+ + | H | G | F | E | D | C | B | A | + +------+------+------+------+------+------+------+------+ + + This packing method places the three-bit field "first" in the lowest + bits followed by the next five-bit field. Parameters may be split + between octets with the most significant bits in the earlier octet. + Any unfilled bits in the last octet MUST be filled with zero. + + In order to accommodate a varying amount of TSVCIS augmented speech + data, an octet count specifies the number of octets representing the + TSVCIS packed parameters. The encoding to do so is presented in + Section 3.2. TSVCIS specifically uses the NRL VDR in two + configurations with a fixed set of 15 and 35 packed octet parameters + in a standardized order [TSVCIS]. + +3. Payload Format + + The TSVCIS codec augments the standard MELP 2400, 1200, and 600 + bitrates and hence uses 22.5, 67.5, or 90 ms frames with a sampling + rate clock of 8 kHz, so the RTP timestamp MUST be in units of 1/8000 + of a second. + + The RTP payload for TSVCIS has the format shown in Figure 1. No + additional header specific to this payload format is needed. This + format is intended for situations where the sender and the receiver + send one or more codec data frames per packet. 
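Because the RTP timestamp counts 1/8000-second units, the timestamp advance per coder frame follows directly from the frame durations given at the start of this section. The short sketch below works through that arithmetic; the names `CLOCK_RATE_HZ`, `FRAME_MS`, and `ticks_per_frame` are illustrative assumptions, not identifiers defined by this document.

```python
CLOCK_RATE_HZ = 8000  # RTP timestamp clock for TSVCIS/MELPe

# Frame duration in milliseconds for each MELPe coder bitrate (bps).
FRAME_MS = {2400: 22.5, 1200: 67.5, 600: 90.0}


def ticks_per_frame(bitrate: int) -> int:
    """Return the RTP timestamp increment for one frame at `bitrate`."""
    return round(FRAME_MS[bitrate] * CLOCK_RATE_HZ / 1000)


for rate in (2400, 1200, 600):
    print(rate, ticks_per_frame(rate))
```

A packet carrying N frames of one rate would therefore advance the timestamp by N times the per-frame value.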
+ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | RTP Header | + +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ + | | + + one or more frames of TSVCIS | + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 1: Packet Format Diagram + + The RTP header of the packetized encoded TSVCIS speech has the + expected values as described in [RFC3550]. The usage of the M bit + SHOULD be as specified in the applicable RTP profile -- for example, + [RFC3551] specifies that if the sender does not suppress silence + (i.e., sends a frame on every frame interval), the M bit will always + be zero. When more than one codec data frame is present in a single + RTP packet, the timestamp specified is that of the oldest data frame + represented in the RTP packet. + + The assignment of an RTP payload type for this new packet format is + outside the scope of this document and will not be specified here. + It is expected that the RTP profile for a particular class of + applications will assign a payload type for this encoding; if that is + not done, then a payload type in the dynamic range shall be chosen by + the sender. + +3.1. MELPe Bitstream Definitions + + The TSVCIS speech coder includes all three MELPe coder rates used as + base speech parameters or as speech coders for bandwidth-restricted + links. RTP packetization of MELPe follows [RFC8130] and is repeated + here for all three MELPe rates [RFC8130], with its recommendations + now regarded as requirements. The bits previously labeled as RSVA, + RSVB, and RSVC in [RFC8130] SHOULD be filled with rate code bits + CODA, CODB, and CODC, as shown in Table 1 (compatible with Table 7 in + Section 3.3 of [RFC8130]). 
+ + +===============+======+======+======+========+ + | Coder Bitrate | CODA | CODB | CODC | Length | + +===============+======+======+======+========+ + | 2400 bps | 0 | 0 | N/A | 7 | + +---------------+------+------+------+--------+ + | 1200 bps | 1 | 0 | 0 | 11 | + +---------------+------+------+------+--------+ + | 600 bps | 0 | 1 | N/A | 7 | + +---------------+------+------+------+--------+ + | Comfort Noise | 1 | 0 | 1 | 2 | + +---------------+------+------+------+--------+ + | TSVCIS Data | 1 | 1 | N/A | var. | + +---------------+------+------+------+--------+ + + Table 1: TSVCIS/MELPe Frame Bitrate + Indicators and Frame Length + + The total number of bits used to describe one MELPe frame of 2400 bps + speech is 54, which fits in 7 octets (with two rate code bits). For + MELPe 1200 bps speech, the total number of bits used is 81, which + fits in 11 octets (with three rate code bits and four unused bits). + For MELPe 600 bps speech, the total number of bits used is 54, which + fits in 7 octets (with two rate code bits). The comfort noise frame + consists of 13 bits, which fits in 2 octets (with three rate code + bits). TSVCIS packed parameters will use the last code combination + in a trailing byte as discussed in Section 3.2. + + It should be noted that CODB for MELPe 600 bps mode MAY deviate from + the value in Table 1 when bit 55 is used as an alternating 1/0 end- + to-end framing bit. Frame decoding would remain distinct as CODA + being zero on its own would indicate a 7-byte frame for either a 2400 + or 600 bps rate, and the use of 600 bps speech coding could be + deduced from the RTP timestamp (and anticipated by the Session + Description Protocol (SDP) negotiations). + +3.1.1. 2400 bps Bitstream Structure + + The 2400 bps MELPe RTP payload is constructed as per Figure 2. Note + that CODA MUST be filled with 0 and CODB SHOULD be filled with 0 as + per Section 3.1. CODB MAY contain an end-to-end framing bit if + required by the endpoints. 
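The rate code bits of Table 1 sit in the most significant bits of a frame's last octet, so a receiver can classify a frame from that octet alone. The following is a minimal sketch under that reading; `classify_last_octet` is a hypothetical helper, and it deliberately ignores the case noted above where CODB carries an end-to-end framing bit, which a real receiver would have to resolve from the negotiated bitrate.

```python
from typing import Optional, Tuple


def classify_last_octet(octet: int) -> Tuple[str, Optional[int]]:
    """Map a frame's last octet to (frame type, length in octets).

    Assumes CODA, CODB, and CODC are the three most significant bits
    of the octet, as drawn in the payload figures. The length is None
    for TSVCIS data, whose size is carried in the trailing octet(s).
    """
    coda = (octet >> 7) & 1
    codb = (octet >> 6) & 1
    codc = (octet >> 5) & 1
    if coda == 0:
        # CODC is not a rate code bit here; that position holds B_54.
        if codb == 0:
            return ("MELPe 2400 bps", 7)
        return ("MELPe 600 bps", 7)
    if codb == 1:
        return ("TSVCIS data", None)
    if codc == 0:
        return ("MELPe 1200 bps", 11)
    return ("MELPe comfort noise", 2)


print(classify_last_octet(0xA3))
```

Only the top two or three bits are inspected; the remaining bits of the octet are speech or count data and do not affect the result.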
+ + MSB LSB + 0 1 2 3 4 5 6 7 + +------+------+------+------+------+------+------+------+ + | B_08 | B_07 | B_06 | B_05 | B_04 | B_03 | B_02 | B_01 | + +------+------+------+------+------+------+------+------+ + | B_16 | B_15 | B_14 | B_13 | B_12 | B_11 | B_10 | B_09 | + +------+------+------+------+------+------+------+------+ + | B_24 | B_23 | B_22 | B_21 | B_20 | B_19 | B_18 | B_17 | + +------+------+------+------+------+------+------+------+ + | B_32 | B_31 | B_30 | B_29 | B_28 | B_27 | B_26 | B_25 | + +------+------+------+------+------+------+------+------+ + | B_40 | B_39 | B_38 | B_37 | B_36 | B_35 | B_34 | B_33 | + +------+------+------+------+------+------+------+------+ + | B_48 | B_47 | B_46 | B_45 | B_44 | B_43 | B_42 | B_41 | + +------+------+------+------+------+------+------+------+ + | CODA | CODB | B_54 | B_53 | B_52 | B_51 | B_50 | B_49 | + +------+------+------+------+------+------+------+------+ + + Figure 2: Packed MELPe 2400 bps Payload Octets + +3.1.2. 1200 bps Bitstream Structure + + The 1200 bps MELPe RTP payload is constructed as per Figure 3. Note + that CODA, CODB, and CODC MUST be filled with 1, 0, and 0, + respectively, as per Section 3.1. RSV0 MUST be coded as 0. 
+ + MSB LSB + 0 1 2 3 4 5 6 7 + +------+------+------+------+------+------+------+------+ + | B_08 | B_07 | B_06 | B_05 | B_04 | B_03 | B_02 | B_01 | + +------+------+------+------+------+------+------+------+ + | B_16 | B_15 | B_14 | B_13 | B_12 | B_11 | B_10 | B_09 | + +------+------+------+------+------+------+------+------+ + | B_24 | B_23 | B_22 | B_21 | B_20 | B_19 | B_18 | B_17 | + +------+------+------+------+------+------+------+------+ + | B_32 | B_31 | B_30 | B_29 | B_28 | B_27 | B_26 | B_25 | + +------+------+------+------+------+------+------+------+ + | B_40 | B_39 | B_38 | B_37 | B_36 | B_35 | B_34 | B_33 | + +------+------+------+------+------+------+------+------+ + | B_48 | B_47 | B_46 | B_45 | B_44 | B_43 | B_42 | B_41 | + +------+------+------+------+------+------+------+------+ + | B_56 | B_55 | B_54 | B_53 | B_52 | B_51 | B_50 | B_49 | + +------+------+------+------+------+------+------+------+ + | B_64 | B_63 | B_62 | B_61 | B_60 | B_59 | B_58 | B_57 | + +------+------+------+------+------+------+------+------+ + | B_72 | B_71 | B_70 | B_69 | B_68 | B_67 | B_66 | B_65 | + +------+------+------+------+------+------+------+------+ + | B_80 | B_79 | B_78 | B_77 | B_76 | B_75 | B_74 | B_73 | + +------+------+------+------+------+------+------+------+ + | CODA | CODB | CODC | RSV0 | RSV0 | RSV0 | RSV0 | B_81 | + +------+------+------+------+------+------+------+------+ + + Figure 3: Packed MELPe 1200 bps Payload Octets + +3.1.3. 600 bps Bitstream Structure + + The 600 bps MELPe RTP payload is constructed as per Figure 4. Note + CODA MUST be filled with 0 and CODB SHOULD be filled with 1 as per + Section 3.1. CODB MAY contain an end-to-end framing bit if required + by the endpoints. 
+ + MSB LSB + 0 1 2 3 4 5 6 7 + +------+------+------+------+------+------+------+------+ + | B_08 | B_07 | B_06 | B_05 | B_04 | B_03 | B_02 | B_01 | + +------+------+------+------+------+------+------+------+ + | B_16 | B_15 | B_14 | B_13 | B_12 | B_11 | B_10 | B_09 | + +------+------+------+------+------+------+------+------+ + | B_24 | B_23 | B_22 | B_21 | B_20 | B_19 | B_18 | B_17 | + +------+------+------+------+------+------+------+------+ + | B_32 | B_31 | B_30 | B_29 | B_28 | B_27 | B_26 | B_25 | + +------+------+------+------+------+------+------+------+ + | B_40 | B_39 | B_38 | B_37 | B_36 | B_35 | B_34 | B_33 | + +------+------+------+------+------+------+------+------+ + | B_48 | B_47 | B_46 | B_45 | B_44 | B_43 | B_42 | B_41 | + +------+------+------+------+------+------+------+------+ + | CODA | CODB | B_54 | B_53 | B_52 | B_51 | B_50 | B_49 | + +------+------+------+------+------+------+------+------+ + + Figure 4: Packed MELPe 600 bps Payload Octets + +3.1.4. Comfort Noise Bitstream Definition + + The comfort noise MELPe RTP payload is constructed as per Figure 5. + Note that CODA, CODB, and CODC MUST be filled with 1, 0, and 1, + respectively, as per Section 3.1. + + MSB LSB + 0 1 2 3 4 5 6 7 + +------+------+------+------+------+------+------+------+ + | B_08 | B_07 | B_06 | B_05 | B_04 | B_03 | B_02 | B_01 | + +------+------+------+------+------+------+------+------+ + | CODA | CODB | CODC | B_13 | B_12 | B_11 | B_10 | B_09 | + +------+------+------+------+------+------+------+------+ + + Figure 5: Packed MELPe Comfort Noise Payload Octets + +3.2. TSVCIS Bitstream Definition + + The TSVCIS augmented speech data as packed parameters MUST be placed + immediately after a corresponding MELPe 2400 bps payload in the same + RTP packet. The packed parameters are counted in octets (TC). The + preferred placement SHOULD be used for TSVCIS payloads with TC less + than or equal to 77 octets; this is shown in Figure 6. 
In the + preferred placement, a single trailing octet SHALL be appended to + include a two-bit rate code, CODA and CODB (both bits set to one), + and a six-bit modified count (MTC). The special modified count value + of all ones (representing an MTC value of 63) SHALL NOT be used for + this format as it is used as the indicator for the alternate packing + format shown next. In a standard implementation, the TSVCIS speech + coder uses a minimum of 15 octets for parameters in octet packed + form. The modified count (MTC) MUST be reduced by 15 from the full + octet count (TC). Computed MTC = TC-15. This accommodates a maximum + of 77 parameter octets (the maximum value of MTC is 62; 77 is the sum + of 62+15). + + MSB LSB + 0 1 2 3 4 5 6 7 + +------+------+------+------+------+------+------+------+ + 1 | T008 | T007 | T006 | T005 | T004 | T003 | T002 | T001 | + +------+------+------+------+------+------+------+------+ + 2 | T016 | T015 | T014 | T013 | T012 | T011 | T010 | T009 | + +------+------+------+------+------+------+------+------+ + 3 | T024 | T023 | T022 | T021 | T020 | T019 | T018 | T017 | + +------+------+------+------+------+------+------+------+ + 4 | T032 | T031 | T030 | T029 | T028 | T027 | T026 | T025 | + +------+------+------+------+------+------+------+------+ + 5 | T040 | T039 | T038 | T037 | T036 | T035 | T034 | T033 | + +------+------+------+------+------+------+------+------+ + 6 | T048 | T047 | T046 | T045 | T044 | T043 | T042 | T041 | + +------+------+------+------+------+------+------+------+ + 7 | T056 | T055 | T054 | T053 | T052 | T051 | T050 | T049 | + +------+------+------+------+------+------+------+------+ + 8 | T064 | T063 | T062 | T061 | T060 | T059 | T058 | T057 | + +------+------+------+------+------+------+------+------+ + 9 | T072 | T071 | T070 | T069 | T068 | T067 | T066 | T065 | + +------+------+------+------+------+------+------+------+ + 10 | T080 | T079 | T078 | T077 | T076 | T075 | T074 | T073 | + 
+------+------+------+------+------+------+------+------+ + 11 | T088 | T087 | T086 | T085 | T084 | T083 | T082 | T081 | + +------+------+------+------+------+------+------+------+ + 12 | T096 | T095 | T094 | T093 | T092 | T091 | T090 | T089 | + +------+------+------+------+------+------+------+------+ + 13 | T104 | T103 | T102 | T101 | T100 | T099 | T098 | T097 | + +------+------+------+------+------+------+------+------+ + 14 | T112 | T111 | T110 | T109 | T108 | T107 | T106 | T105 | + +------+------+------+------+------+------+------+------+ + 15 | T120 | T119 | T118 | T117 | T116 | T115 | T114 | T113 | + +------+------+------+------+------+------+------+------+ + | . . . . | + +------+------+------+------+------+------+------+------+ + TC+1 | CODA | CODB | modified octet count | + +------+------+------+------+------+------+------+------+ + + Figure 6: Preferred Packed TSVCIS Payload Octets + + In order to accommodate all other NRL VDR configurations, an + alternate parameter placement MUST use two trailing bytes as shown in + Figure 7. The last trailing byte MUST be filled with a two-bit rate + code, CODA and CODB (both bits set to one), and its six-bit count + field MUST be filled with ones. The second to last trailing byte + MUST contain the parameter count (TC) in octets (a value from 1 to + 255, inclusive). The value of zero SHALL be considered as reserved. + + MSB LSB + 0 1 2 3 4 5 6 7 + +------+------+------+------+------+------+------+------+ + 1 | T008 | T007 | T006 | T005 | T004 | T003 | T002 | T001 | + +------+------+------+------+------+------+------+------+ + 2 | T016 | T015 | T014 | T013 | T012 | T011 | T010 | T009 | + +------+------+------+------+------+------+------+------+ + | . . . . 
| + +------+------+------+------+------+------+------+------+ + TC+1 | octet count | + +------+------+------+------+------+------+------+------+ + TC+2 | CODA | CODB | 1 | 1 | 1 | 1 | 1 | 1 | + +------+------+------+------+------+------+------+------+ + + Figure 7: Length Unrestricted Packed TSVCIS Payload Octets + +3.3. Multiple TSVCIS Frames in an RTP Packet + + A TSVCIS RTP packet payload consists of zero or more consecutive + TSVCIS coder frames (each consisting of MELPe 2400 and TSVCIS coder + data), with the oldest frame first, followed by zero or one MELPe + comfort noise frame. The presence of a comfort noise frame can be + determined by its rate code bits in its last octet. + + The default packetization interval is one coder frame (22.5, 67.5, or + 90 ms) according to the coder bitrate (2400, 1200, or 600 bps). For + some applications, a longer packetization interval is used to reduce + the packet rate. + + A TSVCIS RTP packet without coder and comfort noise frames MAY be + used periodically by an endpoint to indicate connectivity by an + otherwise idle receiver. + + TSVCIS coder frames in a single RTP packet MAY have varying TSVCIS + parameter octet counts. Its packed parameter octet count (length) is + indicated in the trailing byte(s). All MELPe frames in a single RTP + packet MUST be of the same coder bitrate. For all MELPe coder + frames, the coder rate bits in the trailing byte identify the + contents and length as per Table 1. + + It is important to observe that senders have the following additional + restrictions: + + * Senders SHOULD NOT include more TSVCIS or MELPe frames in a single + RTP packet than will fit in the MTU of the RTP transport protocol. + + * Frames MUST NOT be split between RTP packets. + + It is RECOMMENDED that the number of frames contained within an RTP + packet be consistent with the application. 
For example, in telephony + and other real-time applications where delay is important, the fewer + frames per packet, the lower the delay. However, for bandwidth- + constrained links or delay-insensitive streaming messaging + applications, more than one frame per packet or many frames per + packet would be acceptable. + + Information describing the number of frames contained in an RTP + packet is not transmitted as part of the RTP payload. The way to + determine the number of TSVCIS/MELPe frames is to identify each frame + type and length, thereby counting the total number of octets within + the RTP packet. + +3.4. Congestion Control Considerations + + The target bitrate of TSVCIS can be adjusted at any point in time, + thus allowing congestion management. Furthermore, the amount of + encoded speech or audio data encoded in a single packet can be used + for congestion control, since the packet rate is inversely + proportional to the packet duration. A lower packet transmission + rate reduces the amount of header overhead but at the same time + increases latency and loss sensitivity, so it ought to be used with + care. + + Since UDP does not provide congestion control, applications that use + RTP over UDP SHOULD implement their own congestion control above the + UDP layer [RFC8085] and MAY also implement a transport circuit + breaker [RFC8083]. Work in the RMCAT Working Group [RMCAT] describes + the interactions and conceptual interfaces necessary between the + application components that relate to congestion control, including + the RTP layer, the higher-level media codec control layer, and the + lower-level transport interface, as well as components dedicated to + congestion control functions. + +4. Payload Format Parameters + + This RTP payload format is identified using the TSVCIS media subtype, + which is registered in accordance with [RFC4855] and per the media + type registration template from [RFC6838]. + +4.1. 
Media Type Definitions + + Type name: audio + + Subtype name: TSVCIS + + Required parameters: Clock Rate (Hz): 8000 + + Optional parameters: + ptime: + the recommended length of time (in milliseconds) represented by + the media in a packet. It SHALL use the nearest rounded-up ms + integer packet duration. For TSVCIS, this corresponds to the + following values: 23, 45, 68, 90, 112, 135, 156, and 180. + Larger values can be used as long as they are properly rounded. + See Section 6 of [RFC4566]. + + maxptime: + the maximum length of time (in milliseconds) that can be + encapsulated in a packet. It SHALL use the nearest rounded-up + ms integer packet duration. For TSVCIS, this corresponds to + the following values: 23, 45, 68, 90, 112, 135, 156, and 180. + Larger values can be used as long as they are properly rounded. + See Section 6 of [RFC4566]. + + bitrate: + specifies the MELPe coder bitrates supported. Possible values + are a comma-separated list of rates from the following set: + 2400, 1200, 600. The modes are listed in order of preference; + the first is preferred. If "bitrate" is not present, the fixed + coder bitrate of 2400 MUST be used. + + tcmax: + specifies the TSVCIS maximum value for the TC supported or + desired, ranging from 1 to 255. If "tcmax" is not present, a + default value of 35 is used. + + Channels: + 1 + + Encoding considerations: This media subtype is framed and binary; + see Section 4.8 of [RFC6838]. + + Security considerations: Please see Section 8 of RFC 8817. + + Interoperability considerations: N/A + + Published specification: [TSVCIS] + + Applications that use this media type: N/A + + Fragment identifier considerations: N/A + + Additional information: + + Deprecated alias names for this type: N/A + Magic number(s): N/A + File extension(s): N/A + Macintosh file type code(s): N/A + + Person & email address to contact for further information: + Victor Demjanenko, Ph.D. 
<victor.demjanenko@vocal.com> + + Intended usage: COMMON + + Restrictions on usage: The media subtype depends on RTP framing and + hence is only defined for transfer via RTP [RFC3550]. Transport + within other framing protocols is not defined at this time. + + Author: Victor Demjanenko, Ph.D. + + Change controller: IETF; contact <avt@ietf.org> + + Provisional registration? (standards tree only): No + +4.2. Mapping to SDP + + The mapping of the above-defined payload format media subtype and its + parameters SHALL be done according to Section 3 of [RFC4855]. + + The information carried in the media type specification has a + specific mapping to fields in the Session Description Protocol (SDP) + [RFC4566], which is commonly used to describe RTP sessions. When SDP + is used to specify sessions employing the TSVCIS codec, the mapping + is as follows: + + * The media type ("audio") goes in SDP "m=" as the media name. + + * The media subtype (payload format name) goes in SDP "a=rtpmap" as + the encoding name. + + * The parameter "bitrate" goes in the SDP "a=fmtp" attribute by + copying it as a "bitrate=<value>" string. + + * The parameter "tcmax" goes in the SDP "a=fmtp" attribute by + copying it as a "tcmax=<value>" string. + + * The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and + "a=maxptime" attributes, respectively. + + When conveying information via SDP, the encoding name SHALL be + "TSVCIS" (the same as the media subtype). + + An example of the media representation in SDP for describing TSVCIS + might be: + + m=audio 49120 RTP/AVP 96 + a=rtpmap:96 TSVCIS/8000 + + The optional media type parameter "bitrate", when present, MUST be + included in the "a=fmtp" attribute in the SDP, expressed as a media + type string in the form of a semicolon-separated list of + parameter=value pairs. The string "value" can be one or more of + 2400, 1200, and 600, separated by commas (where each bitrate value + indicates the corresponding MELPe coder). 
An example of the media + representation in SDP for describing TSVCIS when all three coder + bitrates are supported might be: + + m=audio 49120 RTP/AVP 96 + a=rtpmap:96 TSVCIS/8000 + a=fmtp:96 bitrate=2400,600,1200 + + The optional media type parameter "tcmax", when present, MUST be + included in the "a=fmtp" attribute in the SDP, expressed as a media + type string in the form of a semicolon-separated list of + parameter=value pairs. The string "value" is an integer number in + the range of 1 to 255 representing the maximum number of TSVCIS + parameter octets supported. An example of the media representation + in SDP for describing TSVCIS with a maximum of 101 octets supported + is as follows: + + m=audio 49120 RTP/AVP 96 + a=rtpmap:96 TSVCIS/8000 + a=fmtp:96 tcmax=101 + + The parameter "ptime" cannot be used for the purpose of specifying + the TSVCIS operating mode due to the fact that, for certain values, + it will be impossible to distinguish which mode is about to be used + (e.g., when ptime=68, it would be impossible to distinguish whether + the packet is carrying one frame of 67.5 ms or three frames of 22.5 + ms). + + Note that the payload format (encoding) names are commonly shown in + upper case. Media subtypes are commonly shown in lower case. These + names are case insensitive in both places. Similarly, parameter + names are case insensitive in both the media subtype name and the + default mapping to the SDP a=fmtp attribute. + +4.3. Declarative SDP Considerations + + For declarative media, the "bitrate" parameter specifies the possible + bitrates used by the sender. Multiple TSVCIS rtpmap values (such as + 97, 98, and 99, as used below) MAY be used to convey TSVCIS-coded + voice at different bitrates. The receiver can then select an + appropriate TSVCIS codec by using 97, 98, or 99. 
+ + m=audio 49120 RTP/AVP 97 98 99 + a=rtpmap:97 TSVCIS/8000 + a=fmtp:97 bitrate=2400 + a=rtpmap:98 TSVCIS/8000 + a=fmtp:98 bitrate=1200 + a=rtpmap:99 TSVCIS/8000 + a=fmtp:99 bitrate=600 + + For declarative media, the "tcmax" parameter specifies the maximum + number of octets of TSVCIS packed parameters used by the sender or + the sender's communications channel. + +4.4. Offer/Answer SDP Considerations + + In the Offer/Answer model [RFC3264], "bitrate" is a bidirectional + parameter. Both sides MUST use a common "bitrate" value or values. + The offer contains the bitrates supported by the offerer, listed in + its preferred order. The answerer MAY agree to any bitrate by + listing the bitrate first in the answerer response. Additionally, + the answerer MAY indicate any secondary bitrate or bitrates that it + supports. The initial bitrate used by both parties SHALL be the + first bitrate specified in the answerer response. + + For example, if offerer bitrates are "2400,600" and answerer bitrates + are "600,2400", the initial bitrate is 600. If other bitrates are + provided by the answerer, any common bitrate between the offer and + answer MAY be used at any time in the future. Activation of these + other common bitrates is beyond the scope of this document. + + The use of a lower bitrate is often important for a case such as when + one endpoint utilizes a bandwidth-constrained link (e.g., 1200 bps + radio link or slower), where only the lower coder bitrate will work. + + In the Offer/Answer model [RFC3264], "tcmax" is a bidirectional + parameter. Both sides SHOULD use a common "tcmax" value. The offer + contains the tcmax supported by the offerer. The answerer MAY agree + to any tcmax equal to or less than this value by stating the desired + tcmax in the answerer response. The answerer alternatively MAY + identify its own tcmax and rely on TSVCIS ignoring any augmented data + it cannot use. + +5. 
Discontinuous Transmissions + + A primary application of TSVCIS is for radio communications of voice + conversations, and discontinuous transmissions are normal. When + TSVCIS is used in an IP network, TSVCIS RTP packet transmissions may + cease and resume frequently. RTP synchronization source (SSRC) + sequence number gaps indicate lost packets to be filled by Packet + Loss Concealment (PLC), while abrupt loss of RTP packets indicates + intended discontinuous transmissions. Resumption of voice + transmission SHOULD be indicated by the RTP marker bit (M) set to 1. + + If a TSVCIS coder so desires, it may send a MELPe comfort noise frame + as per Appendix B of [SCIP210] prior to ceasing transmission. A + receiver may optionally use comfort noise during its silence periods. + No SDP negotiations are required. + +6. Packet Loss Concealment + + TSVCIS packet loss concealment (PLC) uses the special properties and + coding for the pitch/voicing parameter of the MELPe 2400 bps coder. + The PLC erasure indication utilizes any of the errored encodings of a + non-voiced frame as identified in Table 1 of [MELPE]. For the sake + of simplicity, it is preferred that a code value of 3 for the pitch/ + voicing parameter be used. Hence, set bits P0 and P1 to one and bits + P2, P3, P4, P5, and P6 to zero. + + When using PLC in 1200 bps or 600 bps mode, the MELPe 2400 bps + decoder is called three or four times, respectively, to cover the + loss of a low bitrate MELPe frame. + +7. IANA Considerations + + IANA has registered TSVCIS as specified in Section 4.1. The media + type has been added to the IANA registry for "RTP Payload Format + Media Types" (https://www.iana.org/assignments/rtp-parameters). + +8. 
Security Considerations + + RTP packets using the payload format defined in this specification + are subject to the security considerations discussed in the RTP + specification [RFC3550] and in any applicable RTP profile such as + RTP/AVP [RFC3551], RTP/AVPF [RFC4585], RTP/SAVP [RFC3711], or RTP/ + SAVPF [RFC5124]. However, as discussed in [RFC7202], it is not an + RTP payload format's responsibility to discuss or mandate what + solutions are used to meet such basic security goals as + confidentiality, integrity, and source authenticity for RTP in + general. This responsibility lies with anyone using RTP in an + application. They can find guidance on available security mechanisms + and important considerations in [RFC7201]. Applications SHOULD use + one or more appropriate strong security mechanisms. The rest of this + section discusses the security-impacting properties of the payload + format itself. + + This RTP payload format and the TSVCIS decoder, to the best of our + knowledge, do not exhibit any significant non-uniformity in the + receiver-side computational complexity for packet processing and thus + are unlikely to pose a denial-of-service threat due to the receipt of + pathological data. Additionally, the RTP payload format does not + contain any active content. + + Please see the security considerations discussed in [RFC6562] + regarding Voice Activity Detect (VAD) and its effect on bitrates. + +9. References + +9.1. Normative References + + [MELP] Department of Defense, "Analog-to-Digital Conversion of + Voice by 2,400 Bit/Second Mixed Excitation Linear + Prediction (MELP)", Department of Defense + Telecommunications Standard MIL-STD-3005, December 1999. + + [MELPE] North Atlantic Treaty Organization (NATO), "The 600 Bit/S, + 1200 Bit/S and 2400 Bit/S NATO Interoperable Narrow Band + Voice Coder", STANAG No. 4591, October 2008. + + [NRLVDR] Heide, D., Cohen, A., Lee, Y., and T. 
Moran, "Universal + Vocoder Using Variable Data Rate Vocoding", + DOI 10.21236/ada588068, Naval Research Lab NRL/FR/5555-- + 13-10, 239, June 2013, + <https://doi.org/10.21236/ada588068>. + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, + DOI 10.17487/RFC2119, March 1997, + <https://www.rfc-editor.org/info/rfc2119>. + + [RFC2736] Handley, M. and C. Perkins, "Guidelines for Writers of RTP + Payload Format Specifications", BCP 36, RFC 2736, + DOI 10.17487/RFC2736, December 1999, + <https://www.rfc-editor.org/info/rfc2736>. + + [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model + with Session Description Protocol (SDP)", RFC 3264, + DOI 10.17487/RFC3264, June 2002, + <https://www.rfc-editor.org/info/rfc3264>. + + [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. + Jacobson, "RTP: A Transport Protocol for Real-Time + Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550, + July 2003, <https://www.rfc-editor.org/info/rfc3550>. + + [RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and + Video Conferences with Minimal Control", STD 65, RFC 3551, + DOI 10.17487/RFC3551, July 2003, + <https://www.rfc-editor.org/info/rfc3551>. + + [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. + Norrman, "The Secure Real-time Transport Protocol (SRTP)", + RFC 3711, DOI 10.17487/RFC3711, March 2004, + <https://www.rfc-editor.org/info/rfc3711>. + + [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session + Description Protocol", RFC 4566, DOI 10.17487/RFC4566, + July 2006, <https://www.rfc-editor.org/info/rfc4566>. + + [RFC4855] Casner, S., "Media Type Registration of RTP Payload + Formats", RFC 4855, DOI 10.17487/RFC4855, February 2007, + <https://www.rfc-editor.org/info/rfc4855>. + + [RFC5124] Ott, J. and E. 
Carrara, "Extended Secure RTP Profile for + Real-time Transport Control Protocol (RTCP)-Based Feedback + (RTP/SAVPF)", RFC 5124, DOI 10.17487/RFC5124, February + 2008, <https://www.rfc-editor.org/info/rfc5124>. + + [RFC6562] Perkins, C. and JM. Valin, "Guidelines for the Use of + Variable Bit Rate Audio with Secure RTP", RFC 6562, + DOI 10.17487/RFC6562, March 2012, + <https://www.rfc-editor.org/info/rfc6562>. + + [RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type + Specifications and Registration Procedures", BCP 13, + RFC 6838, DOI 10.17487/RFC6838, January 2013, + <https://www.rfc-editor.org/info/rfc6838>. + + [RFC8083] Perkins, C. and V. Singh, "Multimedia Congestion Control: + Circuit Breakers for Unicast RTP Sessions", RFC 8083, + DOI 10.17487/RFC8083, March 2017, + <https://www.rfc-editor.org/info/rfc8083>. + + [RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage + Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, + March 2017, <https://www.rfc-editor.org/info/rfc8085>. + + [RFC8088] Westerlund, M., "How to Write an RTP Payload Format", + RFC 8088, DOI 10.17487/RFC8088, May 2017, + <https://www.rfc-editor.org/info/rfc8088>. + + [RFC8130] Demjanenko, V. and D. Satterlee, "RTP Payload Format for + the Mixed Excitation Linear Prediction Enhanced (MELPe) + Codec", RFC 8130, DOI 10.17487/RFC8130, March 2017, + <https://www.rfc-editor.org/info/rfc8130>. + + [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC + 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, + May 2017, <https://www.rfc-editor.org/info/rfc8174>. + + [SCIP210] National Security Agency, "SCIP Signaling Plan", SCIP-210, + January 2013. + +9.2. Informative References + + [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, + "Extended RTP Profile for Real-time Transport Control + Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, + DOI 10.17487/RFC4585, July 2006, + <https://www.rfc-editor.org/info/rfc4585>. + + [RFC7201] Westerlund, M. 
and C. Perkins, "Options for Securing RTP + Sessions", RFC 7201, DOI 10.17487/RFC7201, April 2014, + <https://www.rfc-editor.org/info/rfc7201>. + + [RFC7202] Perkins, C. and M. Westerlund, "Securing the RTP + Framework: Why RTP Does Not Mandate a Single Media + Security Solution", RFC 7202, DOI 10.17487/RFC7202, April + 2014, <https://www.rfc-editor.org/info/rfc7202>. + + [RMCAT] IETF, "RTP Media Congestion Avoidance Techniques (rmcat) + Working Group", + <https://datatracker.ietf.org/wg/rmcat/about/>. + + [TSVCIS] National Security Agency, "Tactical Secure Voice + Cryptographic Interoperability Specification (TSVCIS) + Version 3.1", NSA 09-01A, March 2019. + +Authors' Addresses + + Victor Demjanenko, Ph.D. + VOCAL Technologies, Ltd. + 520 Lee Entrance, Suite 202 + Buffalo, NY 14228 + United States of America + + Phone: +1 716 688 4675 + Email: victor.demjanenko@vocal.com + + + John Punaro + VOCAL Technologies, Ltd. + 520 Lee Entrance, Suite 202 + Buffalo, NY 14228 + United States of America + + Phone: +1 716 688 4675 + Email: john.punaro@vocal.com + + + David Satterlee + VOCAL Technologies, Ltd. + 520 Lee Entrance, Suite 202 + Buffalo, NY 14228 + United States of America + + Phone: +1 716 688 4675 + Email: david.satterlee@vocal.com