Diffstat (limited to 'doc/rfc/rfc8817.txt')
-rw-r--r-- doc/rfc/rfc8817.txt | 984
1 files changed, 984 insertions, 0 deletions
diff --git a/doc/rfc/rfc8817.txt b/doc/rfc/rfc8817.txt
new file mode 100644
index 0000000..2fdbb81
--- /dev/null
+++ b/doc/rfc/rfc8817.txt
@@ -0,0 +1,984 @@
+
+
+
+
+Internet Engineering Task Force (IETF) V. Demjanenko
+Request for Comments: 8817 J. Punaro
+Category: Standards Track D. Satterlee
+ISSN: 2070-1721 VOCAL Technologies, Ltd.
+ August 2020
+
+
+ RTP Payload Format for Tactical Secure Voice Cryptographic
+ Interoperability Specification (TSVCIS) Codec
+
+Abstract
+
+ This document describes the RTP payload format for the Tactical
+ Secure Voice Cryptographic Interoperability Specification (TSVCIS)
+ speech coder. TSVCIS is a scalable narrowband voice coder supporting
+ varying encoder data rates and fallbacks. It is implemented as an
+ augmentation to the Mixed Excitation Linear Prediction Enhanced
+ (MELPe) speech coder by conveying additional speech coder parameters
+ to enhance voice quality. TSVCIS augmented speech data is processed
+ in conjunction with its temporally matched Mixed Excitation Linear
+ Prediction (MELP) 2400 speech data. The RTP packetization of TSVCIS
+ and MELPe speech coder data is described in detail.
+
+Status of This Memo
+
+ This is an Internet Standards Track document.
+
+ This document is a product of the Internet Engineering Task Force
+ (IETF). It represents the consensus of the IETF community. It has
+ received public review and has been approved for publication by the
+ Internet Engineering Steering Group (IESG). Further information on
+ Internet Standards is available in Section 2 of RFC 7841.
+
+ Information about the current status of this document, any errata,
+ and how to provide feedback on it may be obtained at
+ https://www.rfc-editor.org/info/rfc8817.
+
+Copyright Notice
+
+ Copyright (c) 2020 IETF Trust and the persons identified as the
+ document authors. All rights reserved.
+
+ This document is subject to BCP 78 and the IETF Trust's Legal
+ Provisions Relating to IETF Documents
+ (https://trustee.ietf.org/license-info) in effect on the date of
+ publication of this document. Please review these documents
+ carefully, as they describe your rights and restrictions with respect
+ to this document. Code Components extracted from this document must
+ include Simplified BSD License text as described in Section 4.e of
+ the Trust Legal Provisions and are provided without warranty as
+ described in the Simplified BSD License.
+
+Table of Contents
+
+ 1. Introduction
+ 1.1. Conventions
+ 1.2. Abbreviations
+ 2. Background
+ 3. Payload Format
+ 3.1. MELPe Bitstream Definitions
+ 3.1.1. 2400 bps Bitstream Structure
+ 3.1.2. 1200 bps Bitstream Structure
+ 3.1.3. 600 bps Bitstream Structure
+ 3.1.4. Comfort Noise Bitstream Definition
+ 3.2. TSVCIS Bitstream Definition
+ 3.3. Multiple TSVCIS Frames in an RTP Packet
+ 3.4. Congestion Control Considerations
+ 4. Payload Format Parameters
+ 4.1. Media Type Definitions
+ 4.2. Mapping to SDP
+ 4.3. Declarative SDP Considerations
+ 4.4. Offer/Answer SDP Considerations
+ 5. Discontinuous Transmissions
+ 6. Packet Loss Concealment
+ 7. IANA Considerations
+ 8. Security Considerations
+ 9. References
+ 9.1. Normative References
+ 9.2. Informative References
+ Authors' Addresses
+
+1. Introduction
+
+ This document describes how compressed Tactical Secure Voice
+ Cryptographic Interoperability Specification (TSVCIS) speech as
+ produced by the TSVCIS codec [TSVCIS] [NRLVDR] may be formatted for
+ use as an RTP payload. The TSVCIS speech coder (or TSVCIS speech-
+ aware communications equipment on any intervening transport link) may
+ adjust to restricted bandwidth conditions by reducing the amount of
+ augmented speech data and relying on the underlying MELPe speech
+ coder for the most constrained bandwidth links.
+
+ Details are provided for packetizing the TSVCIS augmented speech data
+ along with MELPe 2400 bps speech parameters in an RTP packet. The
+ sender may send one or more codec data frames per packet, depending
+ on the application scenario or based on transport network conditions,
+ bandwidth restrictions, delay requirements, and packet loss
+ tolerance.
+
+1.1. Conventions
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
+ "OPTIONAL" in this document are to be interpreted as described in
+ BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
+ capitals, as shown here.
+
+ Best current practices for writing an RTP payload format
+ specification were followed [RFC2736] [RFC8088].
+
+1.2. Abbreviations
+
+ The following abbreviations are used in this document.
+
+ AVP: Audio/Video Profile
+
+ AVPF: Audio/Video Profile Feedback
+
+ CELP: Code-Excited Linear Prediction
+
+ FEC: Forward Error Correction
+
+ LPC: Linear-Predictive Coding
+
+ LSB: Least Significant Bit
+
+ MELP: Mixed Excitation Linear Prediction
+
+ MELPe: Mixed Excitation Linear Prediction Enhanced
+
+ MSB: Most Significant Bit
+
+ MTC: Modified Count
+
+ NATO: North Atlantic Treaty Organization
+
+ NRL: Naval Research Lab
+
+ PLC: Packet Loss Concealment
+
+ SAVP: Secure Audio/Video Profile
+
+ SAVPF: Secure Audio/Video Profile Feedback
+
+ SDP: Session Description Protocol
+
+ SSRC: Synchronization Source
+
+ SRTP: Secure Real-Time Transport Protocol
+
+ TSVCIS: Tactical Secure Voice Cryptographic Interoperability
+ Specification
+
+ VAD: Voice Activity Detect
+
+ VDR: Variable Data Rate
+
+2. Background
+
+ The MELP speech coder was developed by the US military as an upgrade
+ from the LPC-based CELP standard vocoder for low-bitrate
+ communications [MELP]. ("LPC" stands for "Linear-Predictive Coding",
+ and "CELP" stands for "Code-Excited Linear Prediction".) MELP was
+ further enhanced and subsequently adopted by NATO as "MELPe" for use
+ by its members and Partnership for Peace countries for military and
+ other governmental communications as international NATO Standard
+ STANAG 4591 [MELPE].
+
+ The Tactical Secure Voice Cryptographic Interoperability
+ Specification (TSVCIS) is a specification written by the Tactical
+ Secure Voice Working Group (TSVWG) to enable all modern tactical
+ secure voice devices to be interoperable across the US Department of
+ Defense [TSVCIS]. One of the most important aspects is that the
+ voice modes defined in TSVCIS are based on specific fixed rates of
+ the Naval Research Lab's (NRL's) Variable Data Rate (VDR) Vocoder,
+ which uses the MELPe standard as its base [NRLVDR]. A complete
+ TSVCIS speech frame consists of MELPe speech parameters and
+ corresponding TSVCIS augmented speech data.
+
+ In addition to the augmented speech data, the TSVCIS specification
+ identifies which speech coder and framing bits are to be encrypted
+ and how they are protected by forward error correction (FEC)
+ techniques (using block codes). At the RTP transport layer, only the
+ speech coder-related bits need to be considered and are conveyed in
+ unencrypted form. In most IP-based network deployments, standard
+ link encryption methods (Secure Real-Time Transport Protocol (SRTP),
+ VPNs, FIPS 140 link encryptors, or Type 1 Ethernet encryptors) would
+ be used to secure the RTP speech contents.
+
+ TSVCIS augmented speech data is derived from the signal processing
+ and data generated by the MELPe speech coder. For the purposes of
+ this specification, only the general parameter nature of TSVCIS will
+ be characterized. Depending on the bandwidth available (and FEC
+ requirements), a varying number of TSVCIS-specific speech coder
+ parameters need to be transported. These are first byte-packed and
+ then conveyed from encoder to decoder.
+
+ Byte packing of TSVCIS speech data into packed parameters is
+ processed as per the following example, where
+
+ Three-bit field: Bits A, B, and C (A is MSB; C is LSB)
+
+ Five-bit field: Bits D, E, F, G, and H (D is MSB; H is LSB)
+
+ MSB LSB
+ 0 1 2 3 4 5 6 7
+ +------+------+------+------+------+------+------+------+
+ | H | G | F | E | D | C | B | A |
+ +------+------+------+------+------+------+------+------+
+
+ This packing method places the three-bit field "first" in the lowest
+ bits followed by the next five-bit field. Parameters may be split
+ between octets with the most significant bits in the earlier octet.
+ Any unfilled bits in the last octet MUST be filled with zero.
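The packing rule above can be sketched as follows. This is a minimal illustration, assuming each parameter is supplied as a (value, width) pair; the name `pack_fields` is hypothetical and does not come from [TSVCIS]. Each field is emitted MSB first, and bits fill every octet from its least significant bit upward, which reproduces the layout in the figure.

```python
def pack_fields(fields):
    """Pack (value, width) pairs into octets, first field in the lowest bits.

    Hypothetical helper: bits of each field are taken MSB first and placed
    starting at the LSB of each octet. Unfilled bits of the last octet stay 0.
    """
    bits = []
    for value, width in fields:
        # Emit the field's bits from most significant to least significant.
        for shift in range(width - 1, -1, -1):
            bits.append((value >> shift) & 1)

    out = bytearray((len(bits) + 7) // 8)  # zero-initialised, so padding is 0
    for index, bit in enumerate(bits):
        out[index // 8] |= bit << (index % 8)
    return bytes(out)


# Three-bit field A,B,C = 1,1,0 and five-bit field D..H = 1,0,0,0,1
packed = pack_fields([(0b110, 3), (0b10001, 5)])
```

A field that does not fit in the current octet simply continues in the next one, so its most significant bits land in the earlier octet, as the text requires.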
+
+ In order to accommodate a varying amount of TSVCIS augmented speech
+ data, an octet count specifies the number of octets representing the
+ TSVCIS packed parameters. The encoding to do so is presented in
+ Section 3.2. TSVCIS specifically uses the NRL VDR in two
+ configurations with a fixed set of 15 and 35 packed octet parameters
+ in a standardized order [TSVCIS].
+
+3. Payload Format
+
+ The TSVCIS codec augments the standard MELP 2400, 1200, and 600
+ bitrates and hence uses 22.5, 67.5, or 90 ms frames with a sampling
+ rate clock of 8 kHz, so the RTP timestamp MUST be in units of 1/8000
+ of a second.
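As a worked illustration of the timestamp units, the sketch below converts the three frame durations into 8 kHz clock ticks. The names `FRAME_MS` and `timestamp_increment` are assumptions made for this example only.

```python
from fractions import Fraction

CLOCK_RATE_HZ = 8000

# Frame duration in milliseconds for each MELPe coder bitrate (bps).
FRAME_MS = {2400: Fraction(45, 2), 1200: Fraction(135, 2), 600: Fraction(90)}


def timestamp_increment(bitrate, frames_per_packet=1):
    """Return the RTP timestamp advance, in 1/8000 s ticks, for one packet."""
    ticks = FRAME_MS[bitrate] * CLOCK_RATE_HZ / 1000 * frames_per_packet
    if ticks.denominator != 1:
        raise ValueError("frame duration is not a whole number of clock ticks")
    return int(ticks)
```

Exact fractions are used so that the 22.5 ms and 67.5 ms durations never pick up floating-point error.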
+
+ The RTP payload for TSVCIS has the format shown in Figure 1. No
+ additional header specific to this payload format is needed. This
+ format is intended for situations where the sender and the receiver
+ send one or more codec data frames per packet.
+
+ 0 1 2 3
+ 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | RTP Header |
+ +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
+ | |
+ + one or more frames of TSVCIS |
+ | |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+ Figure 1: Packet Format Diagram
+
+ The RTP header of the packetized encoded TSVCIS speech has the
+ expected values as described in [RFC3550]. The usage of the M bit
+ SHOULD be as specified in the applicable RTP profile -- for example,
+ [RFC3551] specifies that if the sender does not suppress silence
+ (i.e., sends a frame on every frame interval), the M bit will always
+ be zero. When more than one codec data frame is present in a single
+ RTP packet, the timestamp specified is that of the oldest data frame
+ represented in the RTP packet.
+
+ The assignment of an RTP payload type for this new packet format is
+ outside the scope of this document and will not be specified here.
+ It is expected that the RTP profile for a particular class of
+ applications will assign a payload type for this encoding; if that is
+ not done, then a payload type in the dynamic range shall be chosen by
+ the sender.
+
+3.1. MELPe Bitstream Definitions
+
+ The TSVCIS speech coder includes all three MELPe coder rates used as
+ base speech parameters or as speech coders for bandwidth-restricted
+ links. RTP packetization of MELPe follows [RFC8130] and is repeated
+ here for all three MELPe rates, with the recommendations of [RFC8130]
+ now regarded as requirements. The bits previously labeled as RSVA,
+ RSVB, and RSVC in [RFC8130] SHOULD be filled with rate code bits
+ CODA, CODB, and CODC, as shown in Table 1 (compatible with Table 7 in
+ Section 3.3 of [RFC8130]).
+
+ +===============+======+======+======+========+
+ | Coder Bitrate | CODA | CODB | CODC | Length |
+ +===============+======+======+======+========+
+ | 2400 bps | 0 | 0 | N/A | 7 |
+ +---------------+------+------+------+--------+
+ | 1200 bps | 1 | 0 | 0 | 11 |
+ +---------------+------+------+------+--------+
+ | 600 bps | 0 | 1 | N/A | 7 |
+ +---------------+------+------+------+--------+
+ | Comfort Noise | 1 | 0 | 1 | 2 |
+ +---------------+------+------+------+--------+
+ | TSVCIS Data | 1 | 1 | N/A | var. |
+ +---------------+------+------+------+--------+
+
+ Table 1: TSVCIS/MELPe Frame Bitrate
+ Indicators and Frame Length
+
+ The total number of bits used to describe one MELPe frame of 2400 bps
+ speech is 54, which fits in 7 octets (with two rate code bits). For
+ MELPe 1200 bps speech, the total number of bits used is 81, which
+ fits in 11 octets (with three rate code bits and four unused bits).
+ For MELPe 600 bps speech, the total number of bits used is 54, which
+ fits in 7 octets (with two rate code bits). The comfort noise frame
+ consists of 13 bits, which fits in 2 octets (with three rate code
+ bits). TSVCIS packed parameters will use the last code combination
+ in a trailing byte as discussed in Section 3.2.
+
+ It should be noted that CODB for MELPe 600 bps mode MAY deviate from
+ the value in Table 1 when bit 55 is used as an alternating 1/0 end-
+ to-end framing bit. Frame decoding remains unambiguous because CODA
+ being zero on its own indicates a 7-byte frame for either the 2400
+ or 600 bps rate; the use of 600 bps speech coding can then be
+ deduced from the RTP timestamp (and anticipated from the Session
+ Description Protocol (SDP) negotiations).
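The rate code lookup in Table 1 can be sketched as below. This is an illustrative helper, not part of the specification: the function name and the returned labels are assumptions. It treats CODA as the most significant bit of the octet, as drawn in the figures, and does not rely on CODB when CODA is zero, in line with the framing-bit note above.

```python
def classify_last_octet(octet):
    """Map the last octet of a frame to (frame kind, total length in octets).

    Length is None for TSVCIS data, whose size is carried in the count field.
    """
    coda = (octet >> 7) & 1  # MSB of the octet
    codb = (octet >> 6) & 1
    codc = (octet >> 5) & 1

    if coda == 0:
        # 2400 or 600 bps: both are 7 octets, CODB may be a framing bit.
        return ("melpe-2400-or-600", 7)
    if codb == 1:
        return ("tsvcis-data", None)
    if codc == 0:
        return ("melpe-1200", 11)
    return ("comfort-noise", 2)
```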
+
+3.1.1. 2400 bps Bitstream Structure
+
+ The 2400 bps MELPe RTP payload is constructed as per Figure 2. Note
+ that CODA MUST be filled with 0 and CODB SHOULD be filled with 0 as
+ per Section 3.1. CODB MAY contain an end-to-end framing bit if
+ required by the endpoints.
+
+ MSB LSB
+ 0 1 2 3 4 5 6 7
+ +------+------+------+------+------+------+------+------+
+ | B_08 | B_07 | B_06 | B_05 | B_04 | B_03 | B_02 | B_01 |
+ +------+------+------+------+------+------+------+------+
+ | B_16 | B_15 | B_14 | B_13 | B_12 | B_11 | B_10 | B_09 |
+ +------+------+------+------+------+------+------+------+
+ | B_24 | B_23 | B_22 | B_21 | B_20 | B_19 | B_18 | B_17 |
+ +------+------+------+------+------+------+------+------+
+ | B_32 | B_31 | B_30 | B_29 | B_28 | B_27 | B_26 | B_25 |
+ +------+------+------+------+------+------+------+------+
+ | B_40 | B_39 | B_38 | B_37 | B_36 | B_35 | B_34 | B_33 |
+ +------+------+------+------+------+------+------+------+
+ | B_48 | B_47 | B_46 | B_45 | B_44 | B_43 | B_42 | B_41 |
+ +------+------+------+------+------+------+------+------+
+ | CODA | CODB | B_54 | B_53 | B_52 | B_51 | B_50 | B_49 |
+ +------+------+------+------+------+------+------+------+
+
+ Figure 2: Packed MELPe 2400 bps Payload Octets
+
+3.1.2. 1200 bps Bitstream Structure
+
+ The 1200 bps MELPe RTP payload is constructed as per Figure 3. Note
+ that CODA, CODB, and CODC MUST be filled with 1, 0, and 0,
+ respectively, as per Section 3.1. RSV0 MUST be coded as 0.
+
+ MSB LSB
+ 0 1 2 3 4 5 6 7
+ +------+------+------+------+------+------+------+------+
+ | B_08 | B_07 | B_06 | B_05 | B_04 | B_03 | B_02 | B_01 |
+ +------+------+------+------+------+------+------+------+
+ | B_16 | B_15 | B_14 | B_13 | B_12 | B_11 | B_10 | B_09 |
+ +------+------+------+------+------+------+------+------+
+ | B_24 | B_23 | B_22 | B_21 | B_20 | B_19 | B_18 | B_17 |
+ +------+------+------+------+------+------+------+------+
+ | B_32 | B_31 | B_30 | B_29 | B_28 | B_27 | B_26 | B_25 |
+ +------+------+------+------+------+------+------+------+
+ | B_40 | B_39 | B_38 | B_37 | B_36 | B_35 | B_34 | B_33 |
+ +------+------+------+------+------+------+------+------+
+ | B_48 | B_47 | B_46 | B_45 | B_44 | B_43 | B_42 | B_41 |
+ +------+------+------+------+------+------+------+------+
+ | B_56 | B_55 | B_54 | B_53 | B_52 | B_51 | B_50 | B_49 |
+ +------+------+------+------+------+------+------+------+
+ | B_64 | B_63 | B_62 | B_61 | B_60 | B_59 | B_58 | B_57 |
+ +------+------+------+------+------+------+------+------+
+ | B_72 | B_71 | B_70 | B_69 | B_68 | B_67 | B_66 | B_65 |
+ +------+------+------+------+------+------+------+------+
+ | B_80 | B_79 | B_78 | B_77 | B_76 | B_75 | B_74 | B_73 |
+ +------+------+------+------+------+------+------+------+
+ | CODA | CODB | CODC | RSV0 | RSV0 | RSV0 | RSV0 | B_81 |
+ +------+------+------+------+------+------+------+------+
+
+ Figure 3: Packed MELPe 1200 bps Payload Octets
+
+3.1.3. 600 bps Bitstream Structure
+
+ The 600 bps MELPe RTP payload is constructed as per Figure 4. Note
+ that CODA MUST be filled with 0 and CODB SHOULD be filled with 1 as
+ per Section 3.1. CODB MAY contain an end-to-end framing bit if
+ required by the endpoints.
+
+ MSB LSB
+ 0 1 2 3 4 5 6 7
+ +------+------+------+------+------+------+------+------+
+ | B_08 | B_07 | B_06 | B_05 | B_04 | B_03 | B_02 | B_01 |
+ +------+------+------+------+------+------+------+------+
+ | B_16 | B_15 | B_14 | B_13 | B_12 | B_11 | B_10 | B_09 |
+ +------+------+------+------+------+------+------+------+
+ | B_24 | B_23 | B_22 | B_21 | B_20 | B_19 | B_18 | B_17 |
+ +------+------+------+------+------+------+------+------+
+ | B_32 | B_31 | B_30 | B_29 | B_28 | B_27 | B_26 | B_25 |
+ +------+------+------+------+------+------+------+------+
+ | B_40 | B_39 | B_38 | B_37 | B_36 | B_35 | B_34 | B_33 |
+ +------+------+------+------+------+------+------+------+
+ | B_48 | B_47 | B_46 | B_45 | B_44 | B_43 | B_42 | B_41 |
+ +------+------+------+------+------+------+------+------+
+ | CODA | CODB | B_54 | B_53 | B_52 | B_51 | B_50 | B_49 |
+ +------+------+------+------+------+------+------+------+
+
+ Figure 4: Packed MELPe 600 bps Payload Octets
+
+3.1.4. Comfort Noise Bitstream Definition
+
+ The comfort noise MELPe RTP payload is constructed as per Figure 5.
+ Note that CODA, CODB, and CODC MUST be filled with 1, 0, and 1,
+ respectively, as per Section 3.1.
+
+ MSB LSB
+ 0 1 2 3 4 5 6 7
+ +------+------+------+------+------+------+------+------+
+ | B_08 | B_07 | B_06 | B_05 | B_04 | B_03 | B_02 | B_01 |
+ +------+------+------+------+------+------+------+------+
+ | CODA | CODB | CODC | B_13 | B_12 | B_11 | B_10 | B_09 |
+ +------+------+------+------+------+------+------+------+
+
+ Figure 5: Packed MELPe Comfort Noise Payload Octets
+
+3.2. TSVCIS Bitstream Definition
+
+ The TSVCIS augmented speech data as packed parameters MUST be placed
+ immediately after a corresponding MELPe 2400 bps payload in the same
+ RTP packet. The packed parameters are counted in octets (TC). The
+ preferred placement SHOULD be used for TSVCIS payloads with TC less
+ than or equal to 77 octets; this is shown in Figure 6. In the
+ preferred placement, a single trailing octet SHALL be appended to
+ include a two-bit rate code, CODA and CODB (both bits set to one),
+ and a six-bit modified count (MTC). The special modified count value
+ of all ones (representing an MTC value of 63) SHALL NOT be used for
+ this format as it is used as the indicator for the alternate packing
+ format shown next. In a standard implementation, the TSVCIS speech
+ coder uses a minimum of 15 octets for parameters in octet packed
+ form. The modified count (MTC) MUST be the full octet count (TC)
+ reduced by 15; that is, MTC = TC - 15. This accommodates a maximum
+ of 77 parameter octets (the maximum value of MTC is 62, and
+ 62 + 15 = 77).
+
+ MSB LSB
+ 0 1 2 3 4 5 6 7
+ +------+------+------+------+------+------+------+------+
+ 1 | T008 | T007 | T006 | T005 | T004 | T003 | T002 | T001 |
+ +------+------+------+------+------+------+------+------+
+ 2 | T016 | T015 | T014 | T013 | T012 | T011 | T010 | T009 |
+ +------+------+------+------+------+------+------+------+
+ 3 | T024 | T023 | T022 | T021 | T020 | T019 | T018 | T017 |
+ +------+------+------+------+------+------+------+------+
+ 4 | T032 | T031 | T030 | T029 | T028 | T027 | T026 | T025 |
+ +------+------+------+------+------+------+------+------+
+ 5 | T040 | T039 | T038 | T037 | T036 | T035 | T034 | T033 |
+ +------+------+------+------+------+------+------+------+
+ 6 | T048 | T047 | T046 | T045 | T044 | T043 | T042 | T041 |
+ +------+------+------+------+------+------+------+------+
+ 7 | T056 | T055 | T054 | T053 | T052 | T051 | T050 | T049 |
+ +------+------+------+------+------+------+------+------+
+ 8 | T064 | T063 | T062 | T061 | T060 | T059 | T058 | T057 |
+ +------+------+------+------+------+------+------+------+
+ 9 | T072 | T071 | T070 | T069 | T068 | T067 | T066 | T065 |
+ +------+------+------+------+------+------+------+------+
+ 10 | T080 | T079 | T078 | T077 | T076 | T075 | T074 | T073 |
+ +------+------+------+------+------+------+------+------+
+ 11 | T088 | T087 | T086 | T085 | T084 | T083 | T082 | T081 |
+ +------+------+------+------+------+------+------+------+
+ 12 | T096 | T095 | T094 | T093 | T092 | T091 | T090 | T089 |
+ +------+------+------+------+------+------+------+------+
+ 13 | T104 | T103 | T102 | T101 | T100 | T099 | T098 | T097 |
+ +------+------+------+------+------+------+------+------+
+ 14 | T112 | T111 | T110 | T109 | T108 | T107 | T106 | T105 |
+ +------+------+------+------+------+------+------+------+
+ 15 | T120 | T119 | T118 | T117 | T116 | T115 | T114 | T113 |
+ +------+------+------+------+------+------+------+------+
+ | . . . . |
+ +------+------+------+------+------+------+------+------+
+ TC+1 | CODA | CODB | modified octet count |
+ +------+------+------+------+------+------+------+------+
+
+ Figure 6: Preferred Packed TSVCIS Payload Octets
+
+ In order to accommodate all other NRL VDR configurations, an
+ alternate parameter placement MUST use two trailing bytes as shown in
+ Figure 7. The last trailing byte MUST be filled with a two-bit rate
+ code, CODA and CODB (both bits set to one), and its six-bit count
+ field MUST be filled with ones. The second to last trailing byte
+ MUST contain the parameter count (TC) in octets (a value from 1 to
+ 255, inclusive). The value of zero SHALL be considered as reserved.
+
+ MSB LSB
+ 0 1 2 3 4 5 6 7
+ +------+------+------+------+------+------+------+------+
+ 1 | T008 | T007 | T006 | T005 | T004 | T003 | T002 | T001 |
+ +------+------+------+------+------+------+------+------+
+ 2 | T016 | T015 | T014 | T013 | T012 | T011 | T010 | T009 |
+ +------+------+------+------+------+------+------+------+
+ | . . . . |
+ +------+------+------+------+------+------+------+------+
+ TC+1 | octet count |
+ +------+------+------+------+------+------+------+------+
+ TC+2 | CODA | CODB | 1 | 1 | 1 | 1 | 1 | 1 |
+ +------+------+------+------+------+------+------+------+
+
+ Figure 7: Length Unrestricted Packed TSVCIS Payload Octets
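The two trailer formats (Figures 6 and 7) can be sketched as an encoder and decoder pair. The function and constant names are hypothetical. The sketch assumes the preferred one-octet form is chosen whenever TC lies between 15 and 77, and the two-octet form otherwise.

```python
MTC_OFFSET = 15          # MTC = TC - 15 in the preferred format
MTC_ESCAPE = 0x3F        # all-ones count selects the alternate format
RATE_CODE_TSVCIS = 0xC0  # CODA = 1, CODB = 1 in the two top bits


def encode_trailer(tc):
    """Return the trailing octet(s) for TC packed parameter octets."""
    if not 1 <= tc <= 255:
        raise ValueError("TC must be in the range 1..255")
    mtc = tc - MTC_OFFSET
    if 0 <= mtc < MTC_ESCAPE:
        return bytes([RATE_CODE_TSVCIS | mtc])          # preferred format
    return bytes([tc, RATE_CODE_TSVCIS | MTC_ESCAPE])   # alternate format


def decode_trailer(frame):
    """Return (TC, trailer length in octets) read from the end of a frame."""
    last = frame[-1]
    if last & 0xC0 != RATE_CODE_TSVCIS:
        raise ValueError("last octet does not carry the TSVCIS rate code")
    mtc = last & 0x3F
    if mtc != MTC_ESCAPE:
        return mtc + MTC_OFFSET, 1
    if len(frame) < 2 or frame[-2] == 0:
        raise ValueError("missing or reserved octet count")
    return frame[-2], 2
```

For the two standard TSVCIS configurations, TC = 15 encodes as the single octet 0xC0 and TC = 35 as 0xD4.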
+
+3.3. Multiple TSVCIS Frames in an RTP Packet
+
+ A TSVCIS RTP packet payload consists of zero or more consecutive
+ TSVCIS coder frames (each consisting of MELPe 2400 and TSVCIS coder
+ data), with the oldest frame first, followed by zero or one MELPe
+ comfort noise frame. The presence of a comfort noise frame can be
+ determined by its rate code bits in its last octet.
+
+ The default packetization interval is one coder frame (22.5, 67.5, or
+ 90 ms) according to the coder bitrate (2400, 1200, or 600 bps). For
+ some applications, a longer packetization interval is used to reduce
+ the packet rate.
+
+ A TSVCIS RTP packet without coder and comfort noise frames MAY be
+ used periodically by an endpoint to indicate connectivity by an
+ otherwise idle receiver.
+
+ TSVCIS coder frames in a single RTP packet MAY have varying TSVCIS
+ parameter octet counts. The packed parameter octet count (length)
+ of each frame is indicated in that frame's trailing byte(s). All
+ MELPe frames in a single RTP packet MUST be of the same coder
+ bitrate. For all MELPe coder frames, the coder rate bits in the
+ trailing byte identify the contents and length as per Table 1.
+
+ It is important to observe that senders have the following additional
+ restrictions:
+
+ * Senders SHOULD NOT include more TSVCIS or MELPe frames in a single
+ RTP packet than will fit in the MTU of the RTP transport protocol.
+
+ * Frames MUST NOT be split between RTP packets.
+
+ It is RECOMMENDED that the number of frames contained within an RTP
+ packet be consistent with the application. For example, in telephony
+ and other real-time applications where delay is important, the fewer
+ frames per packet, the lower the delay. However, for bandwidth-
+ constrained links or delay-insensitive streaming messaging
+ applications, more than one frame per packet or many frames per
+ packet would be acceptable.
+
+ Information describing the number of frames contained in an RTP
+ packet is not transmitted as part of the RTP payload. The way to
+ determine the number of TSVCIS/MELPe frames is to identify each frame
+ type and length, thereby counting the total number of octets within
+ the RTP packet.
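One possible way to count frames is sketched below. It is an assumption of this example, not a procedure given in the text: because the type and length of every frame are found in its last octet, the sketch walks the payload backward from the final octet. The name `split_frames` and the frame labels are hypothetical.

```python
def split_frames(payload):
    """Split an RTP payload into (kind, frame_bytes) tuples, oldest first."""
    frames = []
    end = len(payload)
    while end > 0:
        last = payload[end - 1]
        coda, codb, codc = (last >> 7) & 1, (last >> 6) & 1, (last >> 5) & 1
        if coda == 0:
            kind, length = "melpe-2400-or-600", 7
        elif codb == 0:
            kind, length = ("melpe-1200", 11) if codc == 0 else ("comfort-noise", 2)
        else:
            mtc = last & 0x3F
            if mtc != 0x3F:
                tc, trailer = mtc + 15, 1            # preferred format
            elif end >= 2 and payload[end - 2] != 0:
                tc, trailer = payload[end - 2], 2    # alternate format
            else:
                raise ValueError("missing or reserved TSVCIS octet count")
            # A TSVCIS frame is MELPe 2400 (7 octets) + parameters + trailer.
            kind, length = "tsvcis", 7 + tc + trailer
        if length > end:
            raise ValueError("truncated frame")
        frames.append((kind, payload[end - length:end]))
        end -= length
    frames.reverse()
    return frames
```

An empty payload yields an empty list, matching the case of a packet that carries no coder or comfort noise frames. The sketch does not check that a comfort noise frame appears only at the end of the packet.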
+
+3.4. Congestion Control Considerations
+
+ The target bitrate of TSVCIS can be adjusted at any point in time,
+ thus allowing congestion management. Furthermore, the amount of
+ encoded speech or audio data encoded in a single packet can be used
+ for congestion control, since the packet rate is inversely
+ proportional to the packet duration. A lower packet transmission
+ rate reduces the amount of header overhead but at the same time
+ increases latency and loss sensitivity, so it ought to be used with
+ care.
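The trade-off between packet rate and header overhead can be illustrated numerically. The sketch assumes 40 octets of IPv4, UDP, and RTP headers per packet (20 + 8 + 12, with no options or extensions); that figure and the function name are assumptions of this example.

```python
IP_UDP_RTP_HEADER_OCTETS = 20 + 8 + 12  # assumed IPv4 + UDP + RTP overhead


def packet_stats(frame_octets, frame_ms, frames_per_packet):
    """Return (packets per second, on-the-wire bits per second)."""
    duration_s = frame_ms * frames_per_packet / 1000
    packet_octets = IP_UDP_RTP_HEADER_OCTETS + frame_octets * frames_per_packet
    return 1 / duration_s, packet_octets * 8 / duration_s


# MELPe 2400 bps frames are 7 octets and 22.5 ms long.
rate_1, bps_1 = packet_stats(7, 22.5, 1)
rate_4, bps_4 = packet_stats(7, 22.5, 4)
```

Under these assumptions, bundling four frames per packet cuts the packet rate to a quarter and lowers the total bitrate, at the cost of 90 ms of packetization delay and four frames lost per dropped packet.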
+
+ Since UDP does not provide congestion control, applications that use
+ RTP over UDP SHOULD implement their own congestion control above the
+ UDP layer [RFC8085] and MAY also implement a transport circuit
+ breaker [RFC8083]. Work in the RMCAT Working Group [RMCAT] describes
+ the interactions and conceptual interfaces necessary between the
+ application components that relate to congestion control, including
+ the RTP layer, the higher-level media codec control layer, and the
+ lower-level transport interface, as well as components dedicated to
+ congestion control functions.
+
+4. Payload Format Parameters
+
+ This RTP payload format is identified using the TSVCIS media subtype,
+ which is registered in accordance with [RFC4855] and per the media
+ type registration template from [RFC6838].
+
+4.1. Media Type Definitions
+
+ Type name: audio
+
+ Subtype name: TSVCIS
+
+ Required parameters: Clock Rate (Hz): 8000
+
+ Optional parameters:
+ ptime:
+ the recommended length of time (in milliseconds) represented by
+ the media in a packet. It SHALL use the nearest rounded-up ms
+ integer packet duration. For TSVCIS, this corresponds to the
+ following values: 23, 45, 68, 90, 113, 135, 158, and 180.
+ Larger values can be used as long as they are properly rounded.
+ See Section 6 of [RFC4566].
+
+ maxptime:
+ the maximum length of time (in milliseconds) that can be
+ encapsulated in a packet. It SHALL use the nearest rounded-up
+ ms integer packet duration. For TSVCIS, this corresponds to
+ the following values: 23, 45, 68, 90, 113, 135, 158, and 180.
+ Larger values can be used as long as they are properly rounded.
+ See Section 6 of [RFC4566].
+
+ bitrate:
+ specifies the MELPe coder bitrates supported. Possible values
+ are a comma-separated list of rates from the following set:
+ 2400, 1200, 600. The modes are listed in order of preference;
+ the first is preferred. If "bitrate" is not present, the fixed
+ coder bitrate of 2400 MUST be used.
+
+ tcmax:
+ specifies the TSVCIS maximum value for the TC supported or
+ desired, ranging from 1 to 255. If "tcmax" is not present, a
+ default value of 35 is used.
+
+ Channels:
+ 1
+
+ Encoding considerations: This media subtype is framed and binary;
+ see Section 4.8 of [RFC6838].
+
+ Security considerations: Please see Section 8 of RFC 8817.
+
+ Interoperability considerations: N/A
+
+ Published specification: [TSVCIS]
+
+ Applications that use this media type: N/A
+
+ Fragment identifier considerations: N/A
+
+ Additional information:
+
+ Deprecated alias names for this type: N/A
+ Magic number(s): N/A
+ File extension(s): N/A
+ Macintosh file type code(s): N/A
+
+ Person & email address to contact for further information:
+ Victor Demjanenko, Ph.D. <victor.demjanenko@vocal.com>
+
+ Intended usage: COMMON
+
+ Restrictions on usage: The media subtype depends on RTP framing and
+ hence is only defined for transfer via RTP [RFC3550]. Transport
+ within other framing protocols is not defined at this time.
+
+ Author: Victor Demjanenko, Ph.D.
+
+ Change controller: IETF; contact <avt@ietf.org>
+
+ Provisional registration? (standards tree only): No
+
+4.2. Mapping to SDP
+
+ The mapping of the above-defined payload format media subtype and its
+ parameters SHALL be done according to Section 3 of [RFC4855].
+
+ The information carried in the media type specification has a
+ specific mapping to fields in the Session Description Protocol (SDP)
+ [RFC4566], which is commonly used to describe RTP sessions. When SDP
+ is used to specify sessions employing the TSVCIS codec, the mapping
+ is as follows:
+
+ * The media type ("audio") goes in SDP "m=" as the media name.
+
+ * The media subtype (payload format name) goes in SDP "a=rtpmap" as
+ the encoding name.
+
+ * The parameter "bitrate" goes in the SDP "a=fmtp" attribute by
+ copying it as a "bitrate=<value>" string.
+
+ * The parameter "tcmax" goes in the SDP "a=fmtp" attribute by
+ copying it as a "tcmax=<value>" string.
+
+ * The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and
+ "a=maxptime" attributes, respectively.
+
+ When conveying information via SDP, the encoding name SHALL be
+ "TSVCIS" (the same as the media subtype).
+
+ An example of the media representation in SDP for describing TSVCIS
+ might be:
+
+ m=audio 49120 RTP/AVP 96
+ a=rtpmap:96 TSVCIS/8000
+
+ The optional media type parameter "bitrate", when present, MUST be
+ included in the "a=fmtp" attribute in the SDP, expressed as a media
+ type string in the form of a semicolon-separated list of
+ parameter=value pairs. The string "value" can be one or more of
+ 2400, 1200, and 600, separated by commas (where each bitrate value
+ indicates the corresponding MELPe coder). An example of the media
+ representation in SDP for describing TSVCIS when all three coder
+ bitrates are supported might be:
+
+ m=audio 49120 RTP/AVP 96
+ a=rtpmap:96 TSVCIS/8000
+ a=fmtp:96 bitrate=2400,600,1200
+
+ The optional media type parameter "tcmax", when present, MUST be
+ included in the "a=fmtp" attribute in the SDP, expressed as a media
+ type string in the form of a semicolon-separated list of
+ parameter=value pairs. The string "value" is an integer number in
+ the range of 1 to 255 representing the maximum number of TSVCIS
+ parameter octets supported. An example of the media representation
+ in SDP for describing TSVCIS with a maximum of 101 octets supported
+ is as follows:
+
+ m=audio 49120 RTP/AVP 96
+ a=rtpmap:96 TSVCIS/8000
+ a=fmtp:96 tcmax=101
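A receiver might read these parameters as sketched below. The function name is hypothetical, and the input is assumed to be the text that follows the payload type in the "a=fmtp" line. The defaults (bitrate 2400, tcmax 35) are those given in Section 4.1.

```python
VALID_BITRATES = (2400, 1200, 600)


def parse_tsvcis_fmtp(fmtp_value):
    """Parse 'bitrate=...;tcmax=...' into a dict, applying the defaults."""
    params = {"bitrate": [2400], "tcmax": 35}
    for item in fmtp_value.split(";"):
        item = item.strip()
        if not item:
            continue
        name, _, value = item.partition("=")
        name = name.strip().lower()  # parameter names are case insensitive
        if name == "bitrate":
            rates = [int(v) for v in value.split(",")]
            if any(rate not in VALID_BITRATES for rate in rates):
                raise ValueError("unsupported bitrate in fmtp")
            params["bitrate"] = rates  # order of preference is preserved
        elif name == "tcmax":
            tcmax = int(value)
            if not 1 <= tcmax <= 255:
                raise ValueError("tcmax must be in the range 1..255")
            params["tcmax"] = tcmax
    return params
```

Unknown parameters are ignored in this sketch; a stricter implementation could reject them instead.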
+
+ The parameter "ptime" cannot be used for the purpose of specifying
+ the TSVCIS operating mode due to the fact that, for certain values,
+ it will be impossible to distinguish which mode is about to be used
+ (e.g., when ptime=68, it would be impossible to distinguish whether
+ the packet is carrying one frame of 67.5 ms or three frames of 22.5
+ ms).
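The ambiguity described above can be reproduced with a short calculation. The helper name `ptime_ms` is an assumption of this example; it applies the round-up rule from Section 4.1.

```python
import math

# Frame duration in milliseconds for each MELPe coder bitrate (bps).
FRAME_MS = {2400: 22.5, 1200: 67.5, 600: 90.0}


def ptime_ms(bitrate, frames_per_packet):
    """Packet duration in whole milliseconds, rounded up."""
    return math.ceil(FRAME_MS[bitrate] * frames_per_packet)


one_frame_1200 = ptime_ms(1200, 1)     # one 67.5 ms frame
three_frames_2400 = ptime_ms(2400, 3)  # three 22.5 ms frames
```

Both calls yield the same ptime, so the value alone cannot identify the coder bitrate.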
+
+ Note that the payload format (encoding) names are commonly shown in
+ upper case. Media subtypes are commonly shown in lower case. These
+ names are case insensitive in both places. Similarly, parameter
+ names are case insensitive in both the media subtype name and the
+ default mapping to the SDP a=fmtp attribute.
+
+4.3. Declarative SDP Considerations
+
+ For declarative media, the "bitrate" parameter specifies the possible
+ bitrates used by the sender. Multiple TSVCIS rtpmap values (such as
+ 97, 98, and 99, as used below) MAY be used to convey TSVCIS-coded
+ voice at different bitrates. The receiver can then select an
+ appropriate TSVCIS codec by using 97, 98, or 99.
+
+ m=audio 49120 RTP/AVP 97 98 99
+ a=rtpmap:97 TSVCIS/8000
+ a=fmtp:97 bitrate=2400
+ a=rtpmap:98 TSVCIS/8000
+ a=fmtp:98 bitrate=1200
+ a=rtpmap:99 TSVCIS/8000
+ a=fmtp:99 bitrate=600
+
+ For declarative media, the "tcmax" parameter specifies the maximum
+ number of octets of TSVCIS packed parameters used by the sender or
+ the sender's communications channel.
+
+4.4. Offer/Answer SDP Considerations
+
+ In the Offer/Answer model [RFC3264], "bitrate" is a bidirectional
+ parameter. Both sides MUST use a common "bitrate" value or values.
+ The offer contains the bitrates supported by the offerer, listed in
+ its preferred order. The answerer MAY agree to any bitrate by
+ listing the bitrate first in the answerer response. Additionally,
+ the answerer MAY indicate any secondary bitrate or bitrates that it
+ supports. The initial bitrate used by both parties SHALL be the
+ first bitrate specified in the answerer response.
+
+ For example, if offerer bitrates are "2400,600" and answerer bitrates
+ are "600,2400", the initial bitrate is 600. If other bitrates are
+ provided by the answerer, any common bitrate between the offer and
+ answer MAY be used at any time in the future. Activation of these
+ other common bitrates is beyond the scope of this document.
+
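+ The bitrate rules above can be sketched as follows. The function
+ name negotiate_bitrates and the list representation are assumptions
+ made for illustration. The sketch returns the initial bitrate (the
+ first entry of the answer) together with the bitrates common to both
+ sides; it does not model how another common bitrate is later
+ activated.
+
```python
from typing import List, Tuple


def negotiate_bitrates(offer: List[int], answer: List[int]) -> Tuple[int, List[int]]:
    """Return (initial_bitrate, common_bitrates) for ordered bitrate lists."""
    if not answer or answer[0] not in offer:
        raise ValueError("answer must begin with a bitrate present in the offer")
    # Common bitrates, kept in the answerer's order.
    common = [rate for rate in answer if rate in offer]
    # The first bitrate in the answer is the initial bitrate for both parties.
    return answer[0], common
```
+
+ With the example above, negotiate_bitrates([2400, 600], [600, 2400])
+ gives an initial bitrate of 600, with both 600 and 2400 available as
+ common bitrates.
+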
+ A lower bitrate is often important when one endpoint uses a
+ bandwidth-constrained link (e.g., a 1200 bps or slower radio link),
+ where only the lower coder bitrate will work.
+
+ In the Offer/Answer model [RFC3264], "tcmax" is a bidirectional
+ parameter. Both sides SHOULD use a common "tcmax" value. The offer
+ contains the tcmax supported by the offerer. The answerer MAY agree
+ to any tcmax equal to or less than this value by stating the desired
+ tcmax in the answerer response. The answerer alternatively MAY
+ identify its own tcmax and rely on TSVCIS ignoring any augmented data
+ it cannot use.
+
+5. Discontinuous Transmissions
+
+ A primary application of TSVCIS is for radio communications of voice
+ conversations, and discontinuous transmissions are normal. When
+ TSVCIS is used in an IP network, TSVCIS RTP packet transmissions may
+ cease and resume frequently. RTP synchronization source (SSRC)
+ sequence number gaps indicate lost packets to be filled by Packet
+ Loss Concealment (PLC), while abrupt loss of RTP packets indicates
+ intended discontinuous transmissions. Resumption of voice
+ transmission SHOULD be indicated by the RTP marker bit (M) set to 1.
+
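+ One way a receiver might apply these rules on packet arrival is
+ sketched below. The function name classify_arrival and its return
+ labels are hypothetical. The sketch assumes in-order delivery and
+ does not handle reordered or duplicated packets.
+
```python
from typing import Tuple


def classify_arrival(prev_seq: int, seq: int, marker: bool) -> Tuple[str, int]:
    """Classify a packet relative to the previous one on the same SSRC."""
    # Packets missing between prev_seq and seq, modulo the 16-bit wrap.
    missing = (seq - prev_seq - 1) & 0xFFFF
    if missing:
        # A sequence number gap: conceal the missing packets with PLC.
        return ("loss", missing)
    if marker:
        # No gap and M=1: voice resumes after a discontinuous transmission.
        return ("talkspurt_start", 0)
    return ("in_order", 0)
```
+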
+ If a TSVCIS coder so desires, it may send a MELPe comfort noise frame
+ as per Appendix B of [SCIP210] prior to ceasing transmission. A
+ receiver may optionally use comfort noise during its silence periods.
+ No SDP negotiations are required.
+
+6. Packet Loss Concealment
+
+ TSVCIS packet loss concealment (PLC) uses the special properties and
+ coding for the pitch/voicing parameter of the MELPe 2400 bps coder.
+ The PLC erasure indication utilizes any of the errored encodings of a
+ non-voiced frame as identified in Table 1 of [MELPE]. For the sake
+ of simplicity, it is preferred that a code value of 3 for the pitch/
+ voicing parameter be used. Hence, set bits P0 and P1 to one and bits
+ P2, P3, P4, P5, and P6 to zero.
+
+ When using PLC in 1200 bps or 600 bps mode, the MELPe 2400 bps
+ decoder is called three or four times, respectively, to cover the
+ loss of a low bitrate MELPe frame.
+
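+ The erasure indication and the decoder call counts can be sketched
+ as follows. All names are hypothetical, and decode_erasure stands in
+ for whatever MELPe 2400 bps decoder entry point an implementation
+ provides. Taking P0 as the least significant bit follows from code
+ value 3 setting P0 and P1; the single call for 2400 bps is an
+ inference rather than a statement from the text.
+
```python
from typing import Callable, List, TypeVar

T = TypeVar("T")

ERASURE_PITCH_CODE = 3  # preferred pitch/voicing code value for an erasure

# Decoder calls per lost frame; the 2400 entry is an inference.
DECODER_CALLS = {2400: 1, 1200: 3, 600: 4}


def pitch_bits(code: int = ERASURE_PITCH_CODE) -> List[int]:
    """Return [P0, P1, ..., P6], taking P0 as the least significant bit."""
    return [(code >> i) & 1 for i in range(7)]


def conceal_lost_frame(bitrate: int, decode_erasure: Callable[[int], T]) -> List[T]:
    """Call the 2400 bps decoder once per 22.5 ms of lost speech."""
    return [decode_erasure(ERASURE_PITCH_CODE) for _ in range(DECODER_CALLS[bitrate])]
```
+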
+7. IANA Considerations
+
+ IANA has registered TSVCIS as specified in Section 4.1. The media
+ type has been added to the IANA registry for "RTP Payload Format
+ Media Types" (https://www.iana.org/assignments/rtp-parameters).
+
+8. Security Considerations
+
+ RTP packets using the payload format defined in this specification
+ are subject to the security considerations discussed in the RTP
+ specification [RFC3550] and in any applicable RTP profile such as
+ RTP/AVP [RFC3551], RTP/AVPF [RFC4585], RTP/SAVP [RFC3711], or RTP/
+ SAVPF [RFC5124]. However, as discussed in [RFC7202], it is not an
+ RTP payload format's responsibility to discuss or mandate what
+ solutions are used to meet such basic security goals as
+ confidentiality, integrity, and source authenticity for RTP in
+ general. This responsibility lies with anyone using RTP in an
+ application. They can find guidance on available security mechanisms
+ and important considerations in [RFC7201]. Applications SHOULD use
+ one or more appropriate strong security mechanisms. The rest of this
+ section discusses the security-impacting properties of the payload
+ format itself.
+
+ This RTP payload format and the TSVCIS decoder, to the best of our
+ knowledge, do not exhibit any significant non-uniformity in the
+ receiver-side computational complexity for packet processing and thus
+ are unlikely to pose a denial-of-service threat due to the receipt of
+ pathological data. Additionally, the RTP payload format does not
+ contain any active content.
+
+ Please see the security considerations discussed in [RFC6562]
+ regarding Voice Activity Detect (VAD) and its effect on bitrates.
+
+9. References
+
+9.1. Normative References
+
+ [MELP] Department of Defense, "Analog-to-Digital Conversion of
+ Voice by 2,400 Bit/Second Mixed Excitation Linear
+ Prediction (MELP)", Department of Defense
+ Telecommunications Standard MIL-STD-3005, December 1999.
+
+ [MELPE] North Atlantic Treaty Organization (NATO), "The 600 Bit/S,
+ 1200 Bit/S and 2400 Bit/S NATO Interoperable Narrow Band
+ Voice Coder", STANAG No. 4591, October 2008.
+
+ [NRLVDR] Heide, D., Cohen, A., Lee, Y., and T. Moran, "Universal
+ Vocoder Using Variable Data Rate Vocoding",
+ DOI 10.21236/ada588068, Naval Research Lab NRL/FR/5555--
+ 13-10, 239, June 2013,
+ <https://doi.org/10.21236/ada588068>.
+
+ [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119,
+ DOI 10.17487/RFC2119, March 1997,
+ <https://www.rfc-editor.org/info/rfc2119>.
+
+ [RFC2736] Handley, M. and C. Perkins, "Guidelines for Writers of RTP
+ Payload Format Specifications", BCP 36, RFC 2736,
+ DOI 10.17487/RFC2736, December 1999,
+ <https://www.rfc-editor.org/info/rfc2736>.
+
+ [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
+ with Session Description Protocol (SDP)", RFC 3264,
+ DOI 10.17487/RFC3264, June 2002,
+ <https://www.rfc-editor.org/info/rfc3264>.
+
+ [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
+ Jacobson, "RTP: A Transport Protocol for Real-Time
+ Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
+ July 2003, <https://www.rfc-editor.org/info/rfc3550>.
+
+ [RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
+ Video Conferences with Minimal Control", STD 65, RFC 3551,
+ DOI 10.17487/RFC3551, July 2003,
+ <https://www.rfc-editor.org/info/rfc3551>.
+
+ [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
+ Norrman, "The Secure Real-time Transport Protocol (SRTP)",
+ RFC 3711, DOI 10.17487/RFC3711, March 2004,
+ <https://www.rfc-editor.org/info/rfc3711>.
+
+ [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
+ Description Protocol", RFC 4566, DOI 10.17487/RFC4566,
+ July 2006, <https://www.rfc-editor.org/info/rfc4566>.
+
+ [RFC4855] Casner, S., "Media Type Registration of RTP Payload
+ Formats", RFC 4855, DOI 10.17487/RFC4855, February 2007,
+ <https://www.rfc-editor.org/info/rfc4855>.
+
+ [RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for
+ Real-time Transport Control Protocol (RTCP)-Based Feedback
+ (RTP/SAVPF)", RFC 5124, DOI 10.17487/RFC5124, February
+ 2008, <https://www.rfc-editor.org/info/rfc5124>.
+
+ [RFC6562] Perkins, C. and JM. Valin, "Guidelines for the Use of
+ Variable Bit Rate Audio with Secure RTP", RFC 6562,
+ DOI 10.17487/RFC6562, March 2012,
+ <https://www.rfc-editor.org/info/rfc6562>.
+
+ [RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type
+ Specifications and Registration Procedures", BCP 13,
+ RFC 6838, DOI 10.17487/RFC6838, January 2013,
+ <https://www.rfc-editor.org/info/rfc6838>.
+
+ [RFC8083] Perkins, C. and V. Singh, "Multimedia Congestion Control:
+ Circuit Breakers for Unicast RTP Sessions", RFC 8083,
+ DOI 10.17487/RFC8083, March 2017,
+ <https://www.rfc-editor.org/info/rfc8083>.
+
+ [RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
+ Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
+ March 2017, <https://www.rfc-editor.org/info/rfc8085>.
+
+ [RFC8088] Westerlund, M., "How to Write an RTP Payload Format",
+ RFC 8088, DOI 10.17487/RFC8088, May 2017,
+ <https://www.rfc-editor.org/info/rfc8088>.
+
+ [RFC8130] Demjanenko, V. and D. Satterlee, "RTP Payload Format for
+ the Mixed Excitation Linear Prediction Enhanced (MELPe)
+ Codec", RFC 8130, DOI 10.17487/RFC8130, March 2017,
+ <https://www.rfc-editor.org/info/rfc8130>.
+
+ [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
+ 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
+ May 2017, <https://www.rfc-editor.org/info/rfc8174>.
+
+ [SCIP210] National Security Agency, "SCIP Signaling Plan", SCIP-210,
+ January 2013.
+
+9.2. Informative References
+
+ [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
+ "Extended RTP Profile for Real-time Transport Control
+ Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
+ DOI 10.17487/RFC4585, July 2006,
+ <https://www.rfc-editor.org/info/rfc4585>.
+
+ [RFC7201] Westerlund, M. and C. Perkins, "Options for Securing RTP
+ Sessions", RFC 7201, DOI 10.17487/RFC7201, April 2014,
+ <https://www.rfc-editor.org/info/rfc7201>.
+
+ [RFC7202] Perkins, C. and M. Westerlund, "Securing the RTP
+ Framework: Why RTP Does Not Mandate a Single Media
+ Security Solution", RFC 7202, DOI 10.17487/RFC7202, April
+ 2014, <https://www.rfc-editor.org/info/rfc7202>.
+
+ [RMCAT] IETF, "RTP Media Congestion Avoidance Techniques (rmcat)
+ Working Group",
+ <https://datatracker.ietf.org/wg/rmcat/about/>.
+
+ [TSVCIS] National Security Agency, "Tactical Secure Voice
+ Cryptographic Interoperability Specification (TSVCIS)
+ Version 3.1", NSA 09-01A, March 2019.
+
+Authors' Addresses
+
+ Victor Demjanenko, Ph.D.
+ VOCAL Technologies, Ltd.
+ 520 Lee Entrance, Suite 202
+ Buffalo, NY 14228
+ United States of America
+
+ Phone: +1 716 688 4675
+ Email: victor.demjanenko@vocal.com
+
+
+ John Punaro
+ VOCAL Technologies, Ltd.
+ 520 Lee Entrance, Suite 202
+ Buffalo, NY 14228
+ United States of America
+
+ Phone: +1 716 688 4675
+ Email: john.punaro@vocal.com
+
+
+ David Satterlee
+ VOCAL Technologies, Ltd.
+ 520 Lee Entrance, Suite 202
+ Buffalo, NY 14228
+ United States of America
+
+ Phone: +1 716 688 4675
+ Email: david.satterlee@vocal.com