doc: Add RFC documents

author: Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
committer: Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
commit: 4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
tree: e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc3558.txt
parent: ea76e11061bda059ae9f9ad130a9895cc85607db (diff)
1 files changed, 1291 insertions, 0 deletions
diff --git a/doc/rfc/rfc3558.txt b/doc/rfc/rfc3558.txt
new file mode 100644
index 0000000..9299c4e
--- /dev/null
+++ b/doc/rfc/rfc3558.txt
@@ -0,0 +1,1291 @@
+
+
+
+
+
+
+Network Working Group                                              A. Li
+Request for Comments: 3558                                          UCLA
+Category: Standards Track                                      July 2003
+
+
+      RTP Payload Format for Enhanced Variable Rate Codecs (EVRC)
+                   and Selectable Mode Vocoders (SMV)
+
+Status of this Memo
+
+   This document specifies an Internet standards track protocol for the
+   Internet community, and requests discussion and suggestions for
+   improvements.  Please refer to the current edition of the "Internet
+   Official Protocol Standards" (STD 1) for the standardization state
+   and status of this protocol.  Distribution of this memo is unlimited.
+
+Copyright Notice
+
+   Copyright (C) The Internet Society (2003).  All Rights Reserved.
+
+Abstract
+
+   This document describes the RTP payload format for Enhanced Variable
+   Rate Codec (EVRC) Speech and Selectable Mode Vocoder (SMV) Speech.
+   Two sub-formats are specified for different application scenarios.  A
+   bundled/interleaved format is included to reduce the effect of packet
+   loss on speech quality and amortize the overhead of the RTP header
+   over more than one speech frame.  A non-bundled format is also
+   supported for conversational applications.
+
+Table of Contents
+
+   1. Introduction ................................................... 2
+   2. Background ..................................................... 2
+   3. The Codecs Supported ........................................... 3
+      3.1. EVRC ...................................................... 3
+      3.2. SMV ....................................................... 3
+      3.3. Other Frame-Based Vocoders ................................ 4
+   4. RTP/Vocoder Packet Format ...................................... 4
+      4.1. Interleaved/Bundled Packet Format ......................... 5
+      4.2. Header-Free Packet Format ................................. 6
+      4.3. Determining the Format of Packets ......................... 7
+   5. Packet Table of Contents Entries and Codec Data Frame Format ... 7
+      5.1. Packet Table of Contents entries .......................... 7
+      5.2. Codec Data Frames ......................................... 8
+   6. Interleaving Codec Data Frames ................................. 9
+   7. Bundling Codec Data Frames .................................... 12
+   8. Handling Missing Codec Data Frames ............................ 12
+
+
+
+Li                          Standards Track                     [Page 1]
+
+RFC 3558          RTP Payload Format for EVRC and SMV          July 2003
+
+
+   9. Implementation Issues ......................................... 12
+      9.1. Interleaving Length .......................................12
+      9.2. Validation of Received Packets ............................13
+      9.3. Processing the Late Packets ...............................13
+   10. Mode Request ................................................. 13
+   11. Storage Format ............................................... 14
+   12. IANA Considerations .......................................... 15
+      12.1. Registration of Media Type EVRC ..........................15
+      12.2. Registration of Media Type EVRC0 .........................16
+      12.3. Registration of Media Type SMV ...........................17
+      12.4. Registration of Media Type SMV0 ..........................18
+   13. Mapping to SDP Parameters .................................... 19
+   14. Security Considerations ...................................... 20
+   15. Adding Support of Other Frame-Based Vocoders ................. 20
+   16. Acknowledgements ............................................. 21
+   17. References ................................................... 21
+      17.1 Normative ................................................ 21
+      17.2 Informative .............................................. 22
+   18. Author's Address ............................................. 22
+   19. Full Copyright Statement ..................................... 23
+
+1. Introduction
+
+   This document describes how speech compressed with EVRC [1] or SMV
+   [2] may be formatted for use as an RTP payload type.  The format is
+   also extensible to other codecs that generate a similar set of frame
+   types.  Two methods are provided to packetize the codec data frames
+   into RTP packets: an interleaved/bundled format and a zero-header
+   format.  The sender may choose the best format for each application
+   scenario, based on network conditions, bandwidth availability, delay
+   requirements, and packet-loss tolerance.
+
+   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+   document are to be interpreted as described in RFC 2119 [3].
+
+2. Background
+
+   The 3rd Generation Partnership Project 2 (3GPP2) has published two
+   standards which define speech compression algorithms for CDMA
+   applications: EVRC [1] and SMV [2].  EVRC is currently deployed in
+   millions of first and second generation CDMA handsets.  SMV is the
+   preferred speech codec standard for CDMA2000, and will be deployed in
+   third generation handsets in addition to EVRC.  Improvements and new
+   codecs will keep emerging as technology improves, and future handsets
+   will likely support multiple codecs.
+
+   The formats of the EVRC and SMV codec frames are very similar.  Many
+   other vocoders also share common characteristics, and have many
+   similar application scenarios.  This parallelism enables an RTP
+   payload format to be designed for EVRC and SMV that may also support
+   other, similar vocoders with minimal additional specification work.
+   This can simplify the protocol for transporting vocoder data frames
+   through RTP and reduce the complexity of implementations.
+
+3. The Codecs Supported
+
+3.1. EVRC
+
+   The Enhanced Variable Rate Codec (EVRC) [1] compresses each 20
+   milliseconds of 8000 Hz, 16-bit sampled speech input into output
+   frames of one of three sizes: Rate 1 (171 bits), Rate 1/2 (80
+   bits), or Rate 1/8 (16 bits).  In addition, there are two zero-bit
+   codec frame types: null frames and erasure frames.  Null frames are
+   produced when the vocoder runs at rate 0; they are zero bits long
+   and are normally not transmitted.  Erasure frames are frames that
+   the receiver passes to the codec in place of lost or damaged
+   frames; they are also zero bits long and are normally not
+   transmitted.
+
+   The codec chooses the output frame rate based on analysis of the
+   input speech and the current operating mode (either normal or one of
+   several reduced rate modes).  For typical speech patterns, this
+   results in an average output of 4.2 kilobits/second for normal mode
+   and a lower average output for reduced rate modes.
+
+3.2. SMV
+
+   The Selectable Mode Vocoder (SMV) [2] compresses each 20 milliseconds
+   of 8000 Hz, 16-bit sampled speech input into output frames of one of
+   four sizes: Rate 1 (171 bits), Rate 1/2 (80 bits), Rate 1/4 (40
+   bits), or Rate 1/8 (16 bits).  In addition, there are two zero-bit
+   codec frame types: null frames and erasure frames.  Null frames are
+   produced when the vocoder runs at rate 0; they are zero bits long
+   and are normally not transmitted.  Erasure frames are frames that
+   the receiver passes to the codec in place of lost or damaged
+   frames; they are also zero bits long and are normally not
+   transmitted.
+
+   The SMV codec can operate in six modes.  Each mode may produce frames
+   of any of the rates (full rate to 1/8 rate) for varying percentages
+   of time, based on the characteristics of the speech samples and the
+   selected mode.  The SMV mode can change on a
+   frame-by-frame basis.  The SMV codec does not need additional
+   information other than the codec data frames to correctly decode the
+   data of various modes; therefore, the mode of the encoder does not
+   need to be transmitted with the encoded frames.
+
+   The SMV codec chooses the output frame rate based on analysis of the
+   input speech and the current operating mode.  For typical speech
+   patterns, this results in an average output of 4.2 kilobits/second
+   for Mode 0 in two way conversation (approximately 50% active speech
+   time and 50% in eighth rate while listening) and lower for other
+   reduced rate modes.  SMV is more bandwidth efficient than EVRC.  EVRC
+   is equivalent in performance to SMV mode 1.
+
+3.3. Other Frame-Based Vocoders
+
+   Other frame-based vocoders can be carried in the packet format
+   defined in this document, as long as they possess the following
+   properties:
+
+      o The codec is frame-based;
+      o blank and erasure frames are supported;
+      o the total number of rates is less than 17;
+      o the maximum full rate frame can be transported in a single RTP
+        packet using this specific format.
+
+   Vocoders with the characteristics listed above can be transported
+   using the packet format specified in this document with some
+   additional specification work; the pieces that must be defined are
+   listed in Section 15.
+
+4. RTP/Vocoder Packet Format
+
+   The vocoder speech data may be transmitted in either of the two RTP
+   packet formats specified in the following two subsections, as
+   appropriate for the application scenario.  In the packet format
+   diagrams shown in this document, bit 0 is the most significant bit.
+
+4.1. Interleaved/Bundled Packet Format
+
+   This format is used to send one or more vocoder frames per packet.
+   Interleaving or bundling MAY be used.  The RTP packet for this format
+   is as follows:
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                      RTP Header [4]                           |
+   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
+   |R|R| LLL | NNN | MMM |  Count  |  TOC  |  ...  |  TOC  |padding|
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |        one or more codec data frames, one per TOC entry       |
+   |                             ....                              |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+   The RTP header has the expected values as described in the RTP
+   specification [4].  The RTP timestamp is in 1/8000 of a second units
+   for EVRC and SMV.  For any other vocoders that use this packet
+   format, the timestamp unit needs to be defined explicitly.  The M bit
+   should be set as specified in the applicable RTP profile, for
+   example, RFC 3551 [5].  Note that RFC 3551 [5] specifies that if the
+   sender does not suppress silence, the M bit will always be zero.
+   When multiple codec data frames are present in a single RTP packet,
+   the timestamp is that of the oldest data represented in the RTP
+   packet.  The assignment of an RTP payload type for this packet format
+   is outside the scope of this document; it is specified by the RTP
+   profile under which this payload format is used.
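The timestamp rule above can be illustrated with a small sketch. It assumes 160 timestamp units per frame (20 ms at the 8000 Hz clock used for EVRC and SMV). For interleaved packets, the spacing of (LLL + 1) frames between consecutive entries is inferred from the placement rule in Section 6 rather than stated in this section. The name `frame_timestamps` is hypothetical.

```python
from typing import List

# 20 ms of speech at an 8000 Hz RTP clock (EVRC and SMV).
SAMPLES_PER_FRAME = 160


def frame_timestamps(packet_ts: int, n_frames: int, lll: int = 0) -> List[int]:
    """Return the RTP timestamp of each codec data frame in one packet.

    packet_ts is the RTP timestamp of the packet, i.e. that of the oldest
    frame it carries.  With bundling (lll == 0) consecutive frames are
    20 ms apart; with interleaving they are (lll + 1) frame times apart.
    """
    if n_frames < 1 or not 0 <= lll <= 7:
        raise ValueError("n_frames must be >= 1 and lll must fit in 3 bits")
    step = (lll + 1) * SAMPLES_PER_FRAME
    # RTP timestamps are 32-bit and wrap around.
    return [(packet_ts + k * step) % 2**32 for k in range(n_frames)]
```

For a bundled packet with timestamp 1000 and three frames, this yields 1000, 1160 and 1320.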
+
+   The first octet of an Interleaved/Bundled format packet is the
+   Interleave Octet.  The second octet contains the Mode Request and
+   Frame Count fields.  The Table of Contents (ToC) field then follows.
+   The fields are specified as follows:
+
+   Reserved (RR): 2 bits
+      Reserved bits.  MUST be set to zero by sender, SHOULD be ignored
+      by receiver.
+
+   Interleave Length (LLL): 3 bits
+      Indicates the length of interleave; a value of 0 indicates
+      bundling, a special case of interleaving.  See Section 6 and
+      Section 7 for more detailed discussion.
+
+   Interleave Index (NNN): 3 bits
+      Indicates the index within an interleave group.  MUST have a value
+      less than or equal to the value of LLL.  Values of NNN greater
+      than the value of LLL are invalid.  Packets with invalid NNN
+      values SHOULD be ignored by the receiver.
+
+   Mode Request (MMM): 3 bits
+      The Mode Request field is used to signal Mode Request information.
+      See Section 10 for details.
+
+   Frame Count (Count): 5 bits
+      The number of ToC fields (and vocoder frames) present in the
+      packet is the value of the frame count field plus one.  A value of
+      zero indicates that the packet contains one ToC field, while a
+      value of 31 indicates that the packet contains 32 ToC fields.
+
+   Padding (padding): 0 or 4 bits
+      This padding ensures that codec data frames start on an octet
+      boundary.  Each ToC entry occupies 4 bits, so when the number of
+      ToC entries (the value of the Count field plus one) is odd, the
+      sender MUST add 4 bits of padding following the last ToC entry.
+      When the number of ToC entries is even, the sender MUST NOT add
+      padding bits.  If padding is present, the padding bits MUST be
+      set to zero by the sender, and SHOULD be ignored by the receiver.
+
+   The Table of Contents field (ToC) provides information on the codec
+   data frame(s) in the packet.  There is one ToC entry for each codec
+   data frame.  The detailed formats of the ToC field and codec data
+   frames are specified in Section 5.
+
+   Multiple data frames may be included within an Interleaved/Bundled
+   packet using interleaving or bundling as described in Section 6 and
+   Section 7.
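The field layout described in this section can be sketched as a small parser. This is a minimal illustration, not a reference implementation: the names `PayloadHeader` and `parse_payload_header` are hypothetical, and the input is assumed to be the RTP payload with the RTP header already stripped. Reserved bits and padding bits are ignored on receipt, as the text recommends.

```python
from typing import List, NamedTuple


class PayloadHeader(NamedTuple):
    lll: int                # Interleave Length
    nnn: int                # Interleave Index
    mmm: int                # Mode Request
    frame_types: List[int]  # one 4-bit ToC entry per codec data frame
    data_offset: int        # octet offset of the first codec data frame


def parse_payload_header(payload: bytes) -> PayloadHeader:
    """Parse the Interleaved/Bundled payload header and ToC entries."""
    if len(payload) < 3:
        raise ValueError("payload too short for header and one ToC entry")
    # Bit 0 is the most significant bit: |R|R| LLL | NNN |
    lll = (payload[0] >> 3) & 0x07
    nnn = payload[0] & 0x07
    if nnn > lll:
        raise ValueError("NNN greater than LLL is invalid")
    # Second octet: | MMM |  Count  |
    mmm = (payload[1] >> 5) & 0x07
    n_frames = (payload[1] & 0x1F) + 1
    # Two 4-bit ToC entries per octet; an odd number leaves 4 padding bits.
    toc_octets = (n_frames + 1) // 2
    if len(payload) < 2 + toc_octets:
        raise ValueError("payload too short for the ToC")
    frame_types = []
    for i in range(n_frames):
        octet = payload[2 + i // 2]
        frame_types.append(octet >> 4 if i % 2 == 0 else octet & 0x0F)
    return PayloadHeader(lll, nnn, mmm, frame_types, 2 + toc_octets)
```

A caller would typically treat a raised `ValueError` the same way as a lost packet.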
+
+4.2. Header-Free Packet Format
+
+   The Header-Free Packet Format is designed for maximum bandwidth
+   efficiency and low latency.  Only one codec data frame can be sent in
+   each Header-Free format packet.  None of the payload header fields
+   (LLL, NNN, MMM, Count) nor ToC entries are present.  The codec rate
+   for the data frame can be determined from the length of the codec
+   data frame, since there is only one codec data frame in each
+   Header-Free packet.
+
+   Use of the RTP header fields for Header-Free RTP/Vocoder Packet
+   Format is the same as described in Section 4.1 for
+   Interleaved/Bundled RTP/Vocoder Packet Format.  The detailed format
+   of the codec data frame is specified in Section 5.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                      RTP Header [4]                           |
+   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
+   |                                                               |
+   +          ONLY one codec data frame            +-+-+-+-+-+-+-+-+
+   |                                               |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
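Because a Header-Free packet carries exactly one codec data frame, the rate can be recovered from the payload length alone. The sketch below uses the frame sizes listed in the table in Section 5.1. The function name `rate_from_length` and the `codec` argument are hypothetical, and zero-length payloads are simply rejected here since the text does not say how they would be interpreted.

```python
# Payload length in octets -> rate, per the frame sizes in Section 5.1.
HEADER_FREE_RATES = {2: "1/8", 5: "1/4", 10: "1/2", 22: "1"}


def rate_from_length(payload: bytes, codec: str = "SMV") -> str:
    """Infer the codec rate of a Header-Free payload from its length."""
    rate = HEADER_FREE_RATES.get(len(payload))
    # EVRC has no Rate 1/4 frames, so a 5-octet payload is invalid for it.
    if rate is None or (codec == "EVRC" and rate == "1/4"):
        raise ValueError("unexpected payload length %d" % len(payload))
    return rate
```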
+
+4.3. Determining the Format of Packets
+
+   All receivers SHOULD be able to process both packet formats.  The
+   sender MAY choose to use one or both packet formats.
+
+   A receiver MUST have prior knowledge of the packet format to
+   correctly decode the RTP packets.  When packets of both formats are
+   used within the same session, different RTP payload type values MUST
+   be used for each format to distinguish the packet formats.  The
+   association of payload type number with the packet format is done
+   out-of-band, for example by SDP during the setup of a session.
+
+5. Packet Table of Contents Entries and Codec Data Frame Format
+
+5.1. Packet Table of Contents entries
+
+   Each codec data frame in an Interleaved/Bundled packet has a
+   corresponding Table of Contents (ToC) entry.  The ToC entry indicates
+   the rate of the codec frame.  (Header-Free packets MUST NOT have a
+   ToC field.)
+
+   Each ToC entry occupies four bits.  The format of the bits is
+   indicated below:
+
+       0 1 2 3
+      +-+-+-+-+
+      |fr type|
+      +-+-+-+-+
+
+   Frame Type: 4 bits
+      The frame type indicates the type of the corresponding codec data
+      frame in the RTP packet.
+
+   For EVRC and SMV codecs, the frame type values and size of the
+   associated codec data frame are described in the table below:
+
+   Value   Rate      Total codec data frame size (in octets)
+   ---------------------------------------------------------
+     0     Blank      0    (0 bits)
+     1     1/8        2    (16 bits)
+     2     1/4        5    (40 bits; not valid for EVRC)
+     3     1/2       10    (80 bits)
+     4     1         22    (171 bits; padded at end with 5 zero bits)
+     5     Erasure    0    (SHOULD NOT be transmitted by sender)
+
+   All values not listed in the above table MUST be considered reserved.
+   A ToC entry with a reserved Frame Type value SHOULD be considered
+   invalid.  Note that the EVRC codec does not have 1/4 rate frames,
+   thus frame type value 2 MUST be considered a reserved value when the
+   EVRC codec is in use.
+
+   Other vocoders that use this packet format need to specify their own
+   table of frame types and corresponding codec data frames.
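The table above is enough to cut the codec data area of an Interleaved/Bundled packet into individual frames. The sketch below assumes the caller has already extracted the list of ToC frame types and the octets that follow the ToC; `FRAME_SIZES` mirrors the table, while `split_frames` and its `codec` argument are hypothetical names. Treating a length mismatch as an error is a design choice of this sketch.

```python
from typing import List, Sequence

# Frame Type value -> codec data frame size in octets (Section 5.1 table).
FRAME_SIZES = {0: 0, 1: 2, 2: 5, 3: 10, 4: 22, 5: 0}


def split_frames(data: bytes, frame_types: Sequence[int],
                 codec: str = "SMV") -> List[bytes]:
    """Split the codec data area into one byte string per ToC entry."""
    frames = []
    pos = 0
    for ft in frame_types:
        # Unlisted values are reserved; value 2 is reserved for EVRC.
        if ft not in FRAME_SIZES or (codec == "EVRC" and ft == 2):
            raise ValueError("reserved frame type %d" % ft)
        size = FRAME_SIZES[ft]
        if pos + size > len(data):
            raise ValueError("codec data shorter than the ToC indicates")
        frames.append(data[pos:pos + size])
        pos += size
    if pos != len(data):
        raise ValueError("codec data longer than the ToC indicates")
    return frames
```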
+
+5.2. Codec Data Frames
+
+   The output of the vocoder MUST be converted into codec data frames
+   for inclusion in the RTP payload.  The conversions for EVRC and SMV
+   codecs are specified below.  (Note: Because the EVRC codec does not
+   have Rate 1/4 frames, the specification of Rate 1/4 frames does not
+   apply to EVRC codec data frames.)  Other vocoders that use this
+   packet format need to specify how to convert vocoder output data
+   into frames.
+
+   The codec output data bits as numbered in EVRC and SMV are packed
+   into octets.  The lowest numbered bit (bit 1 for Rate 1, Rate 1/2,
+   Rate 1/4 and Rate 1/8) is placed in the most significant bit
+   (internet bit 0) of octet 1 of the codec data frame, the second
+   lowest bit is placed in the second most significant bit of the first
+   octet, the third lowest in the third most significant bit of the
+   first octet, and so on.  This continues until all of the bits have
+   been placed in the codec data frame.
+
+   The remaining unused bits of the last octet of the codec data frame
+   MUST be set to zero.  Note that in EVRC and SMV this is only
+   applicable to Rate 1 frames (171 bits) as the Rate 1/2 (80 bits),
+   Rate 1/4 (40 bits, SMV only) and Rate 1/8 frames (16 bits) fit
+   exactly into a whole number of octets.
+
+   Following is a detailed listing showing a Rate 1 EVRC/SMV codec
+   output frame converted into a codec data frame:
+
+   The codec data frame for an EVRC/SMV codec Rate 1 frame is 22 octets
+   long.  Bits 1 through 171 from the EVRC/SMV codec Rate 1 frame are
+   placed as indicated, with bits marked with "Z" set to zero.  EVRC/SMV
+   codec Rate 1/8, Rate 1/4 and Rate 1/2 frames are converted similarly,
+   but do not require zero padding because they align on octet
+   boundaries.
+
+                        Rate 1 codec data frame
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
+   |0|0|0|0|0|0|0|0|0|1|1|1|1|1|1|1|1|1|1|2|2|2|2|2|2|2|2|2|2|3|3|3|
+   |1|2|3|4|5|6|7|8|9|0|1|2|3|4|5|6|7|8|9|0|1|2|3|4|5|6|7|8|9|0|1|2|
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   :                                                               :
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1| | | | | |
+   |4|4|4|4|4|5|5|5|5|5|5|5|5|5|5|6|6|6|6|6|6|6|6|6|6|7|7|Z|Z|Z|Z|Z|
+   |5|6|7|8|9|0|1|2|3|4|5|6|7|8|9|0|1|2|3|4|5|6|7|8|9|0|1| | | | | |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
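The packing rule of this section (lowest-numbered codec bit into the most significant bit of the first octet, unused trailing bits set to zero) can be sketched as follows. The function name `pack_codec_bits` is hypothetical, and the codec output is assumed to be available as a sequence of 0/1 integers in codec bit order.

```python
from typing import Sequence


def pack_codec_bits(bits: Sequence[int]) -> bytes:
    """Pack codec output bits MSB-first into a codec data frame."""
    # Round up to a whole number of octets; the bytearray starts zeroed,
    # so any unused bits of the last octet are already zero padding.
    frame = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            frame[i // 8] |= 0x80 >> (i % 8)
    return bytes(frame)
```

A 171-bit Rate 1 frame produces 22 octets, with the five least significant bits of the last octet left at zero.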
+
+6. Interleaving Codec Data Frames
+
+   As indicated in Section 4.1, more than one codec data frame MAY be
+   included in a single Interleaved/Bundled packet by a sender.  This is
+   accomplished by interleaving or bundling.
+
+   Bundling is used to spread the transmission overhead of the RTP and
+   payload header over multiple vocoder frames.  Interleaving
+   additionally reduces the listener's perception of data loss by
+   spreading such loss over non-consecutive vocoder frames.  EVRC, SMV,
+   and similar vocoders are able to compensate for an occasional lost
+   frame, but speech quality degrades exponentially with consecutive
+   frame loss.
+
+   Bundling is signaled by setting the LLL field to zero and the Count
+   field to greater than zero.  Interleaving is indicated by setting the
+   LLL field to a value greater than zero.
+
+   The general discussion of interleaving also applies to bundling,
+   which can be viewed as a reduced case of interleaving with lower
+   complexity.  The bundling case is discussed in detail in Section 7.
+
+   Senders MAY support interleaving and/or bundling.  All receivers that
+   support the Interleaved/Bundled packet format MUST support both
+   interleaving and bundling.
+
+   Given a time-ordered sequence of output frames from the codec
+   numbered 0..n, a bundling value B (the value in the Count field plus
+   one), and an interleave length L where n = B * (L+1) - 1, the output
+   frames are placed into RTP packets as follows (the values of the
+   fields LLL and NNN are indicated for each RTP packet):
+
+   First RTP Packet in Interleave group:
+      LLL=L, NNN=0
+      Frame 0, Frame L+1, Frame 2(L+1), Frame 3(L+1), ... for a total of
+      B frames
+
+   Second RTP Packet in Interleave group:
+      LLL=L, NNN=1
+      Frame 1, Frame 1+L+1, Frame 1+2(L+1), Frame 1+3(L+1), ... for a
+      total of B frames
+
+   This continues to the last RTP packet in the interleave group:
+
+   (L+1)th RTP Packet in Interleave group:
+      LLL=L, NNN=L
+      Frame L, Frame L+L+1, Frame L+2(L+1), Frame L+3(L+1), ... for a
+      total of B frames
+
+   Within each interleave group, the RTP packets making up the
+   interleave group MUST be transmitted in value-increasing order of the
+   NNN field.  While this does not guarantee reduced end-to-end delay on
+   the receiving end, when packets are delivered in order by the
+   underlying transport, delay will be reduced to the minimum possible.
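The placement rule above amounts to taking every (L+1)th frame, starting at offset NNN. A minimal sketch, assuming the B * (L+1) frames of one interleave group are held in a time-ordered list; the name `interleave` is hypothetical.

```python
from typing import List, Sequence


def interleave(frames: Sequence, L: int, B: int) -> List[list]:
    """Distribute one interleave group's frames over L+1 packets.

    The list at index n of the result holds the frames for the packet
    with NNN=n, so the result is already in transmission order.
    """
    if not 0 <= L <= 7:
        raise ValueError("L must fit in the 3-bit LLL field")
    if len(frames) != B * (L + 1):
        raise ValueError("an interleave group holds exactly B * (L+1) frames")
    # Packet n carries frames n, n+(L+1), n+2(L+1), ... (B frames in total).
    return [list(frames[n::L + 1]) for n in range(L + 1)]
```

With L=0 the result is a single packet holding all B frames in order, which is the bundling case.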
+
+   Receivers MAY signal the maximum number of codec data frames (i.e.,
+   the maximum acceptable bundling value B) they can handle in a single
+   RTP packet using the OPTIONAL maxptime RTP mode parameter identified
+   in Section 12.
+
+   Receivers MAY signal the maximum interleave length (i.e., the maximum
+   acceptable LLL value in the Interleave Octet) they will accept
+   using the OPTIONAL maxinterleave RTP mode parameter identified in
+   Section 12.
+
+   The parameters maxptime and maxinterleave are exchanged at the
+   initial setup of the session.  In one-to-one sessions, the sender
+   MUST respect these values set by the receiver, and MUST NOT
+   interleave/bundle more packets than what the receiver signals that it
+   can handle.  This ensures that the receiver can allocate a known
+   amount of buffer space that will be sufficient for all
+   interleaving/bundling used in that session.  During the session, the
+   sender may decrease the bundling value or interleaving length (so
+   that less buffer space is required at the receiver), but never exceed
+   the maximum value set by the receiver.  This prevents the situation
+   where a receiver needs to allocate more buffer space in the middle of
+   a session but is unable to do so.
+
+   Additionally, senders have the following restrictions:
+
+   o  MUST NOT bundle more codec data frames in a single RTP packet than
+      indicated by maxptime (see Section 12) if it is signaled.
+
+   o  SHOULD NOT bundle more codec data frames in a single RTP packet
+      than will fit in the MTU of the underlying network.
+
+   o  Once beginning a session with a given maximum interleaving value
+      set by maxinterleave in Section 12, MUST NOT increase the
+      interleaving value (LLL) to exceed the maximum interleaving value
+      that is signaled.
+
+   o  MAY change the interleaving value, but MUST do so only between
+      interleave groups.
+
+   o  Silence suppression MUST only be used between interleave groups.
+      A ToC with Frame Type 0 (Blank Frame, Section 5.1) MUST be used
+      within interleaving groups if the codec outputs a blank frame.
+      The M bit in the RTP header is not set for these blank frames, as
+      the stream is continuous in time.  Because there is only one time
+      stamp for each RTP packet, silence suppression used within an
+      interleave group would cause ambiguities when reconstructing the
+      speech at the receiver side, and thus is prohibited.
+
+   Given an RTP packet with sequence number S, interleave length (field
+   LLL) L, interleave index value (field NNN) N, and bundling value B,
+   the interleave group consists of this RTP packet and other RTP
+   packets with sequence numbers from S-N mod 65536 to S-N+L mod 65536
+   inclusive.  In other words, the interleave group always consists of
+   L+1 RTP packets with sequential sequence numbers.  The bundling value
+   for all RTP packets in an interleave group MUST be the same.
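The sequence-number range of an interleave group can be computed directly from any one of its packets. The sketch below follows the modulo-65536 arithmetic given above; the function name `interleave_group_seqs` is hypothetical.

```python
from typing import List


def interleave_group_seqs(S: int, N: int, L: int) -> List[int]:
    """Return the RTP sequence numbers of the interleave group, in NNN order.

    S is the sequence number of a received packet, N its NNN value and
    L its LLL value.
    """
    if not 0 <= N <= L <= 7:
        raise ValueError("require 0 <= NNN <= LLL <= 7")
    first = (S - N) % 65536  # sequence number of the packet with NNN=0
    return [(first + i) % 65536 for i in range(L + 1)]
```

The modulo keeps the result correct when the group straddles the 16-bit wrap-around, for example S=1, N=2, L=3.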
+
+   The receiver determines the expected bundling value for all RTP
+   packets in an interleave group by the number of codec data frames
+   bundled in the first RTP packet of the interleave group received.
+   Note that this may not be the first RTP packet of the interleave
+   group if packets are delivered out of order by the underlying
+   transport.
+
+7. Bundling Codec Data Frames
+
+   As discussed in Section 6, the bundling of codec data frames is a
+   special reduced case of interleaving with LLL value in the Interleave
+   Octet set to 0.
+
+   Bundling codec data frames indicates that multiple data frames are
+   included consecutively in a packet, because the interleaving length
+   (LLL) is 0.  The interleaving group is thus reduced to a single RTP
+   packet, and the reconstruction of the codec data frames from RTP
+   packets becomes a much simpler process.
+
+   Furthermore, the additional restrictions on senders are reduced to:
+
+   o  MUST NOT bundle more codec data frames in a single RTP packet than
+      indicated by maxptime (see Section 12) if it is signaled.
+
+   o  SHOULD NOT bundle more codec data frames in a single RTP packet
+      than will fit in the MTU of the underlying network.
+
+8. Handling Missing Codec Data Frames
+
+   The vocoders covered by this payload format support erasure frames as
+   an indication that frames are not available.  The erasure frames are
+   normally used internally by a receiver to advance the state of the
+   voice decoder by exactly one frame time for each missing frame.
+   Using the information from packet sequence number, time stamp, and
+   the M bit, the receiver can detect missing codec data frames from RTP
+   packet loss and/or silence suppression, and generate corresponding
+   erasure frames.  Erasure frames MUST also be used in storage format
+   to record missing frames.
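On the receiving side, erasure substitution for one interleave group might look like the sketch below. It assumes the receiver has collected the packets of the group in a mapping from NNN to that packet's list of frames, and it uses `None` as a stand-in for an erasure frame. The names `ERASURE` and `deinterleave` are hypothetical, and detecting loss across groups (via sequence number, timestamp and the M bit) is left out.

```python
from typing import Dict, List, Sequence

ERASURE = None  # placeholder for an erasure frame handed to the decoder


def deinterleave(packets: Dict[int, Sequence], L: int, B: int) -> List:
    """Rebuild the time-ordered frames of one interleave group.

    Frames belonging to packets that were never received stay as ERASURE.
    """
    out = [ERASURE] * (B * (L + 1))
    for n, frames in packets.items():
        if not 0 <= n <= L:
            raise ValueError("NNN out of range for this group")
        # Frame k of the packet with NNN=n sits at position n + k*(L+1).
        # Extra frames beyond the expected bundling value are dropped.
        for k, frame in enumerate(frames[:B]):
            out[n + k * (L + 1)] = frame
    return out
```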
+
+9. Implementation Issues
+
+9.1. Interleaving Length
+
+   The vocoder interpolates the missing speech content when given an
+   erasure frame.  However, the best quality is perceived by the
+   listener when erasure frames are not consecutive.  This makes
+   interleaving desirable as it increases speech quality when packet
+   loss occurs.
+
+   On the other hand, interleaving can greatly increase the end-to-end
+   delay.  Where an interactive session is desired, either
+   Interleaved/Bundled packet format with interleaving length (field
+   LLL) 0 or Header-Free packet format is RECOMMENDED.
+
+   When end-to-end delay is not a primary concern, an interleaving
+   length (field LLL) of 4 or 5 is RECOMMENDED as it offers a reasonable
+   compromise between robustness and latency.
+
+9.2. Validation of Received Packets
+
+   When receiving an RTP packet, the receiver SHOULD check the validity
+   of the ToC fields and match the length of the packet with what is
+   indicated by the ToC fields.  If any invalidity or mismatch is
+   detected, it is RECOMMENDED to discard the received packet to avoid
+   potential severe degradation of the speech quality.  The discarded
+   packet is treated following the same procedure as a lost packet, and
+   the discarded data will be replaced with erasure frames.
+
+   On receipt of an RTP packet with an invalid value of the LLL or NNN
+   fields, the RTP packet SHOULD be treated as lost by the receiver for
+   the purpose of generating erasure frames as described in Section 8.
+
+   On receipt of an RTP packet in an interleave group with other than
+   the expected frame count value, the receiver MAY discard codec data
+   frames off the end of the RTP packet or add erasure codec data frames
+   to the end of the packet in order to manufacture a substitute packet
+   with the expected bundling value.  The receiver MAY instead choose to
+   discard the whole interleave group.
+
+9.3. Processing the Late Packets
+
+   Assume that the receiver has begun playing frames from an interleave
+   group.  The time has come to play frame x from packet n of the
+   interleave group.  Further assume that packet n of the interleave
+   group has not been received.  As described in Section 8, an erasure
+   frame will be sent to the receiving vocoder.
+
+   Now, assume that packet n of the interleave group arrives before
+   frame x+1 of that packet is needed.  Receivers should use frame x+1
+   of the newly received packet n rather than substituting an erasure
+   frame.  In other words, just because packet n was not available the
+   first time it was needed to reconstruct the interleaved speech, the
+   receiver should not assume it is not available when it is
+   subsequently needed for interleaved speech reconstruction.
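+
+   The behavior described above amounts to looking a packet up at the
+   moment each of its frames is needed, rather than remembering that
+   it was once missing.  A minimal sketch in Python follows.  The
+   class InterleaveGroup, its methods, and the ERASURE placeholder are
+   hypothetical names used only for illustration.
+
```python
ERASURE = None  # placeholder for an erasure frame handed to the vocoder


class InterleaveGroup:
    """Holds the packets of one interleave group as they arrive."""

    def __init__(self):
        self.packets = {}  # packet index n -> list of codec data frames

    def on_packet(self, n, frames):
        # A late packet is still stored: its later frames may be usable.
        self.packets[n] = frames

    def frame_for_playout(self, n, x):
        # Check availability when the frame is needed, not before.
        frames = self.packets.get(n)
        if frames is None or x >= len(frames):
            return ERASURE
        return frames[x]
```
+
+   With this structure, frame x of a missing packet n yields an
+   erasure, while frame x+1 is played normally if packet n has arrived
+   in the meantime.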
+
+10.  Mode Request
+
+   The Mode Request signal requests a particular encoding mode for the
+   speech encoding in the reverse direction.  All implementations are
+   RECOMMENDED to honor the Mode Request signal.  The Mode Request
+   signal SHOULD only be used in one-to-one sessions.  In multi-party
+   sessions, any received Mode Request signals SHOULD be ignored.
+
+   In addition, the Mode Request signal MAY also be sent through non-RTP
+   means, which is out of the scope of this specification.
+
+   The three-bit Mode Request field is used to signal the receiver to
+   set a particular encoding mode for its audio encoder.  If the Mode
+   Request field is set to a valid value in RTP packets from node A to
+   node B, it is a request for node B to change its audio encoder to
+   the requested encoding mode, and thereby to change the bit rate of
+   the RTP stream from node B to node A.  Once a node sets this field to
+   a value, it SHOULD continue to set the field to the same value in
+   subsequent packets until it requests a different mode.  This design
+   helps to avoid leaving the codec stuck in an unintended state if one
+   of the packets that carries the Mode Request is lost.  An otherwise
+   silent node MAY send an RTP packet containing a blank frame in order
+   to send a Mode Request.
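+
+   The repeat-until-changed behavior can be sketched as a small piece
+   of sender state.  This Python sketch is illustrative only.  It
+   assumes that the Mode Request bits (MMM) occupy the three most
+   significant bits of the octet that also carries the five-bit Count
+   field, as described in Section 6, and that 0 is an acceptable
+   initial value.  ModeRequester and its methods are hypothetical
+   names.
+
```python
class ModeRequester:
    """Remembers the mode most recently requested from the peer."""

    def __init__(self, mode: int = 0):
        self.set_mode(mode)

    def set_mode(self, mode: int) -> None:
        if not 0 <= mode <= 7:
            raise ValueError("Mode Request is a three-bit field")
        self.mode = mode

    def second_octet(self, frame_count: int) -> int:
        """Build the MMM/Count octet for the next outgoing packet."""
        if not 1 <= frame_count <= 32:
            raise ValueError("Count holds frame_count - 1 in five bits")
        # The stored mode is repeated in every packet until set_mode()
        # is called again, so a single lost packet cannot drop it.
        return (self.mode << 5) | (frame_count - 1)
```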
+
+   Each codec type using this format SHOULD define its own
+   interpretation of the Mode Request field.  Codecs SHOULD follow the
+   convention that higher values of the three-bit field correspond to an
+   equal or lower average output bit rate.
+
+   For the EVRC codec, the Mode Request field MUST be interpreted
+   according to Tables 2.2.1.2-1 and 2.2.1.2-2 of the EVRC codec
+   specifications [1].
+
+   For the SMV codec, the Mode Request field MUST be interpreted
+   according to Table 2.2-2 of the SMV codec specifications [2].
+
+11.  Storage Format
+
+   The storage format is used for storing speech frames, e.g., as a file
+   or e-mail attachment.
+
+   The file begins with a magic number to identify the vocoder that is
+   used.  The magic number for EVRC corresponds to the ASCII character
+   string "#!EVRC\n", i.e., "0x23 0x21 0x45 0x56 0x52 0x43 0x0A".  The
+   magic number for SMV corresponds to the ASCII character string
+   "#!SMV\n", i.e., "0x23 0x21 0x53 0x4D 0x56 0x0A".
+
+   The codec data frames are stored in consecutive order, with a single
+   ToC entry field, extended to one octet, prefixing each codec data
+   frame.  The ToC field is extended to one octet by setting the four
+   most significant bits of the octet to zero.  For example, a ToC value
+   of 4 (a full-rate frame) is stored as 0x04.
+
+   Speech frames lost in transmission and non-received frames MUST be
+   stored as erasure frames (frame type 5, see definition in Section
+   5.1) to maintain synchronization with the original media.
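+
+   A minimal sketch of a writer and reader for this storage format is
+   given below in Python.  It is illustrative rather than normative.
+   The frame sizes per ToC value are assumed from Section 5.1, which
+   is not reproduced in this section, and the names write_storage and
+   read_storage are hypothetical.  A lost frame is represented by None
+   and is stored as an erasure frame (frame type 5).
+
```python
import io

MAGIC = {"EVRC": b"#!EVRC\n", "SMV": b"#!SMV\n"}
# Frame sizes in octets per ToC value, assumed from Section 5.1.
FRAME_OCTETS = {0: 0, 1: 2, 2: 5, 3: 10, 4: 22, 5: 0}
ERASURE_TOC = 5


def write_storage(out, codec, frames):
    """Write (toc, data) pairs; a lost frame is passed as None."""
    out.write(MAGIC[codec])
    for frame in frames:
        toc, data = frame if frame is not None else (ERASURE_TOC, b"")
        if len(data) != FRAME_OCTETS[toc]:
            raise ValueError("frame length does not match ToC value")
        # The four most significant bits of the ToC octet are zero.
        out.write(bytes([toc & 0x0F]) + data)


def read_storage(inp, codec):
    """Return the list of (toc, data) pairs stored in the file."""
    if inp.read(len(MAGIC[codec])) != MAGIC[codec]:
        raise ValueError("bad magic number")
    frames = []
    while (head := inp.read(1)):
        toc = head[0]
        if toc not in FRAME_OCTETS:
            raise ValueError("invalid ToC octet")
        data = inp.read(FRAME_OCTETS[toc])
        if len(data) != FRAME_OCTETS[toc]:
            raise ValueError("truncated frame")
        frames.append((toc, data))
    return frames
```
+
+   For example, writing a full-rate frame, a lost frame, and an
+   eighth-rate frame to an io.BytesIO object produces the EVRC magic
+   number followed by the octets 0x04, 22 data octets, 0x05, 0x01, and
+   2 data octets.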
+
+12.  IANA Considerations
+
+   Four new MIME sub-types as described in this section have been
+   registered by the IANA.
+
+   The MIME names for the EVRC and SMV codecs are allocated from the
+   IETF tree since all the vocoders covered are expected to be widely
+   used for Voice-over-IP applications.
+
+12.1.  Registration of Media Type EVRC
+
+   Media Type Name:           audio
+
+   Media Subtype Name:        EVRC
+
+   Required Parameters:       none
+
+   Optional parameters:
+      The following parameters apply to RTP transfer only.
+
+      ptime:    Defined as usual for RTP audio (see RFC 2327).
+
+      maxptime: The maximum amount of media which can be encapsulated in
+         each packet, expressed as time in milliseconds.  The time SHALL
+         be calculated as the sum of the time the media present in the
+         packet represents.  The time SHOULD be a multiple of the
+         duration of a single codec data frame (20 msec).  If not
+         signaled, the default maxptime value SHALL be 200 milliseconds.
+
+      maxinterleave: Maximum number for interleaving length (field LLL
+         in the Interleaving Octet).  The interleaving lengths used in
+         the entire session MUST NOT exceed this maximum value.  If not
+         signaled, the maxinterleave length SHALL be 5.
+
+   Encoding considerations:
+      This type is defined for transfer of EVRC-encoded data via RTP
+      using the Interleaved/Bundled packet format specified in Sections
+      4.1, 6, and 7 of RFC 3558.  It is also defined for other transfer
+      methods using the storage format specified in Section 11 of RFC
+      3558.
+
+   Security considerations:
+      See Section 14 "Security Considerations" of RFC 3558.
+
+   Public specification:
+      The EVRC vocoder is specified in 3GPP2 C.S0014.  Transfer methods
+      are specified in RFC 3558.
+
+   Additional information:
+      The following information applies to storage format only.
+
+      Magic number: #!EVRC\n (see Section 11 of RFC 3558)
+      File extensions: evc, EVC
+      Macintosh file type code: none
+      Object identifier or OID: none
+
+   Intended usage:
+      COMMON.  It is expected that many VoIP applications (as well as
+      mobile applications) will use this type.
+
+   Person & email address to contact for further information:
+      Adam Li
+      adamli@icsl.ucla.edu
+
+   Author/Change controller:
+      Adam Li
+      adamli@icsl.ucla.edu
+      IETF Audio/Video Transport Working Group
+
+12.2. Registration of Media Type EVRC0
+
+   Media Type Name:           audio
+
+   Media Subtype Name:        EVRC0
+
+   Required Parameters:       none
+
+   Optional parameters:       none
+
+   Encoding considerations:
+      This type is only defined for transfer of EVRC-encoded data via
+      RTP using the Header-Free packet format specified in Section 4.2
+      of RFC 3558.
+
+   Security considerations:
+      See Section 14 "Security Considerations" of RFC 3558.
+
+   Public specification:
+      The EVRC vocoder is specified in 3GPP2 C.S0014.  Transfer methods
+      are specified in RFC 3558.
+
+   Additional information:    none
+
+   Intended usage:
+      COMMON.  It is expected that many VoIP applications (as well as
+      mobile applications) will use this type.
+
+   Person & email address to contact for further information:
+      Adam Li
+      adamli@icsl.ucla.edu
+
+   Author/Change controller:
+      Adam Li
+      adamli@icsl.ucla.edu
+      IETF Audio/Video Transport Working Group
+
+12.3. Registration of Media Type SMV
+
+   Media Type Name:           audio
+
+   Media Subtype Name:        SMV
+
+   Required Parameters:       none
+
+   Optional parameters:
+      The following parameters apply to RTP transfer only.
+
+      ptime:    Defined as usual for RTP audio (see RFC 2327).
+
+      maxptime: The maximum amount of media which can be encapsulated
+         in each packet, expressed as time in milliseconds.  The time
+         SHALL be calculated as the sum of the time the media present
+         in the packet represents.  The time SHOULD be a multiple of the
+         duration of a single codec data frame (20 msec).  If not
+         signaled, the default maxptime value SHALL be 200
+         milliseconds.
+
+      maxinterleave: Maximum number for interleaving length (field LLL
+         in the Interleaving Octet).  The interleaving lengths used in
+         the entire session MUST NOT exceed this maximum value.  If not
+         signaled, the maxinterleave length SHALL be 5.
+
+   Encoding considerations:
+      This type is defined for transfer of SMV-encoded data via RTP
+      using the Interleaved/Bundled packet format specified in Sections
+      4.1, 6, and 7 of RFC 3558.  It is also defined for other transfer
+      methods using the storage format specified in Section 11 of RFC
+      3558.
+
+   Security considerations:
+      See Section 14 "Security Considerations" of RFC 3558.
+
+   Public specification:
+      The SMV vocoder is specified in 3GPP2 C.S0030-0 v2.0.
+      Transfer methods are specified in RFC 3558.
+
+   Additional information:
+      The following information applies to storage format only.
+
+      Magic number: #!SMV\n (see Section 11 of RFC 3558)
+      File extensions: smv, SMV
+      Macintosh file type code: none
+      Object identifier or OID: none
+
+   Intended usage:
+      COMMON.  It is expected that many VoIP applications (as well as
+      mobile applications) will use this type.
+
+   Person & email address to contact for further information:
+      Adam Li
+      adamli@icsl.ucla.edu
+
+   Author/Change controller:
+      Adam Li
+      adamli@icsl.ucla.edu
+      IETF Audio/Video Transport Working Group
+
+12.4. Registration of Media Type SMV0
+
+   Media Type Name:           audio
+
+   Media Subtype Name:        SMV0
+
+   Required Parameters:       none
+
+   Optional parameters:       none
+
+   Encoding considerations:
+      This type is only defined for transfer of SMV-encoded data via RTP
+      using the Header-Free packet format specified in Section 4.2 of
+      RFC 3558.
+
+   Security considerations:
+      See Section 14 "Security Considerations" of RFC 3558.
+
+   Public specification:
+      The SMV vocoder is specified in 3GPP2 C.S0030-0 v2.0.  Transfer
+      methods are specified in RFC 3558.
+
+   Additional information:    none
+
+   Intended usage:
+      COMMON.  It is expected that many VoIP applications (as well as
+      mobile applications) will use this type.
+
+   Person & email address to contact for further information:
+      Adam Li
+      adamli@icsl.ucla.edu
+
+   Author/Change controller:
+      Adam Li
+      adamli@icsl.ucla.edu
+      IETF Audio/Video Transport Working Group
+
+13.  Mapping to SDP Parameters
+
+   Please note that this section applies to the RTP transfer only.
+
+   The information carried in the MIME media type specification has a
+   specific mapping to fields in the Session Description Protocol (SDP)
+   [6], which is commonly used to describe RTP sessions.  When SDP is
+   used to specify sessions employing the EVRC or SMV codec, the mapping
+   is as follows:
+
+      o  The MIME type ("audio") goes in SDP "m=" as the media name.
+
+      o  The MIME subtype (payload format name) goes in SDP "a=rtpmap"
+         as the encoding name.
+
+      o  The parameters "ptime" and "maxptime" go in the SDP "a=ptime"
+         and "a=maxptime" attributes, respectively.
+
+      o  The parameter "maxinterleave" goes in the SDP "a=fmtp"
+         attribute by copying it directly from the MIME media type
+         string as "maxinterleave=value".
+
+   Some examples of SDP session descriptions for EVRC and SMV encodings
+   follow below.
+
+   Example of usage of EVRC:
+
+      m=audio 49120 RTP/AVP 97
+      a=rtpmap:97 EVRC/8000
+      a=fmtp:97 maxinterleave=2
+      a=maxptime:80
+
+   Example of usage of SMV0:
+
+      m=audio 49122 RTP/AVP 99
+      a=rtpmap:99 SMV0/8000
+      a=fmtp:99
+
+   Note that the payload format (encoding) names are commonly shown in
+   upper case.  MIME subtypes are commonly shown in lower case.  These
+   names are case-insensitive in both places.  Similarly, parameter
+   names are case-insensitive both in MIME types and in the default
+   mapping to the SDP a=fmtp attribute.
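+
+   As an illustration of the mapping, the following Python sketch
+   reads the "a=maxptime" and "a=fmtp" attributes for one payload type
+   and applies the defaults given in Section 12 when a parameter is
+   not signaled (maxptime 200 milliseconds, maxinterleave 5).  It is a
+   simplified sketch, not a complete SDP parser: it assumes a single
+   media section and semicolon-separated fmtp parameters.  The
+   function name session_limits is hypothetical.
+
```python
FRAME_MS = 20  # duration of a single codec data frame, in milliseconds


def session_limits(sdp_lines, payload_type):
    """Return (maxptime, maxinterleave, max frames per packet)."""
    maxptime, maxinterleave = 200, 5  # defaults when not signaled
    for line in sdp_lines:
        line = line.strip()
        lowered = line.lower()
        if lowered.startswith("a=maxptime:"):
            maxptime = int(line.split(":", 1)[1])
        elif lowered.startswith("a=fmtp:"):
            # Expected shape: "a=fmtp:<payload type> <parameters>".
            fields = line.split(":", 1)[1].split(None, 1)
            if len(fields) < 2 or fields[0] != str(payload_type):
                continue
            for param in fields[1].split(";"):
                name, _, value = param.strip().partition("=")
                # Parameter names are case-insensitive.
                if name.lower() == "maxinterleave":
                    maxinterleave = int(value)
    return maxptime, maxinterleave, maxptime // FRAME_MS
```
+
+   Applied to the EVRC example above with payload type 97, the sketch
+   yields a maxptime of 80, a maxinterleave of 2, and at most 4 frames
+   per packet.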
+
+14.  Security Considerations
+
+   RTP packets using the payload format defined in this specification
+   are subject to the security considerations discussed in the RTP
+   specification [4], and any appropriate profile (for example [5]).
+   This implies that confidentiality of the media streams is achieved by
+   encryption.  Because the data compression used with this payload
+   format is applied end-to-end, encryption may be performed after
+   compression so there is no conflict between the two operations.
+
+   A potential denial-of-service threat exists for data encoding using
+   compression techniques that have non-uniform receiver-end
+   computational load.  The attacker can inject pathological datagrams
+   into the stream which are complex to decode and cause the receiver to
+   become overloaded.  However, the encodings covered in this document
+   do not exhibit any significant non-uniformity.
+
+   As with any IP-based protocol, in some circumstances, a receiver may
+   be overloaded simply by the receipt of too many packets, either
+   desired or undesired.  Network-layer authentication may be used to
+   discard packets from undesired sources, but the processing cost of
+   the authentication itself may be too high.  In a multicast
+   environment, pruning of specific sources may be implemented in future
+   versions of IGMP [7] and in multicast routing protocols to allow a
+   receiver to select which sources are allowed to reach it.
+
+   Interleaving may affect encryption.  Depending on the encryption
+   scheme used, there may be restrictions on, for example, the time
+   when keys can be changed.  Specifically, the key change may need to
+   occur at the boundary between interleave groups.
+
+15.  Adding Support of Other Frame-Based Vocoders
+
+   As described above, the RTP packet format defined in this document is
+   very flexible and designed to be usable by other frame-based
+   vocoders.
+
+   Additional vocoders using this format MUST have properties as
+   described in Section 3.3.
+
+   For an eligible vocoder to use the payload format mechanisms defined
+   in this document, a new RTP payload format document needs to be
+   published as a standards track RFC.  That document can simply refer
+   to this document and then specify the following parameters:
+
+      o Define the unit used for RTP time stamp;
+      o Define the meaning of the Mode Request bits;
+      o Define corresponding codec data frame type values for ToC;
+      o Define the conversion procedure for the vocoder's output data
+        frames;
+      o Define a magic number for storage format, and complete the
+        corresponding MIME registration.
+
+16.  Acknowledgements
+
+   The following authors have made significant contributions to this
+   document: Adam H. Li, John D. Villasenor, Dong-Seek Park, Jeong-Hoon
+   Park, Keith Miller, S. Craig Greer, David Leon, Nikolai Leung,
+   Marcello Lioy, Kyle J. McKay, Magdalena L. Espelien, Randall Gellens,
+   Tom Hiller, Peter J. McCann, Stinson S. Mathai, Michael D. Turner,
+   Ajay Rajkumar, Dan Gal, Magnus Westerlund, Lars-Erik Jonsson, Greg
+   Sherwood, and Thomas Zeng.
+
+17.  References
+
+17.1 Normative
+
+   [1]  3GPP2 C.S0014, "Enhanced Variable Rate Codec, Speech Service
+        Option 3 for Wideband Spread Spectrum Digital Systems", January
+        1997.
+
+   [2]  3GPP2 C.S0030-0 v2.0, "Selectable Mode Vocoder, Service Option
+        for Wideband Spread Spectrum Communication Systems", May 2002.
+
+   [3]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
+        Levels", BCP 14, RFC 2119, March 1997.
+
+   [4]  Schulzrinne, H., Casner, S., Jacobson, V. and R. Frederick,
+        "RTP: A Transport Protocol for Real-Time Applications", RFC
+        3550, July 2003.
+
+   [5]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video
+        Conferences with Minimal Control", RFC 3551, July 2003.
+
+   [6]  Handley, M. and V. Jacobson, "SDP: Session Description
+        Protocol", RFC 2327, April 1998.
+
+17.2 Informative
+
+   [7]  Deering, S., "Host Extensions for IP Multicasting", STD 5, RFC
+        1112, August 1989.
+
+18.  Author's Address
+
+   Adam H. Li
+   Image Communication Lab
+   Electrical Engineering Department
+   University of California
+   Los Angeles, CA 90095
+   USA
+
+   Phone: +1 310 825 5178
+   EMail: adamli@icsl.ucla.edu
+
+19.  Full Copyright Statement
+
+   Copyright (C) The Internet Society (2003).  All Rights Reserved.
+
+   This document and translations of it may be copied and furnished to
+   others, and derivative works that comment on or otherwise explain it
+   or assist in its implementation may be prepared, copied, published
+   and distributed, in whole or in part, without restriction of any
+   kind, provided that the above copyright notice and this paragraph are
+   included on all such copies and derivative works.  However, this
+   document itself may not be modified in any way, such as by removing
+   the copyright notice or references to the Internet Society or other
+   Internet organizations, except as needed for the purpose of
+   developing Internet standards in which case the procedures for
+   copyrights defined in the Internet Standards process must be
+   followed, or as required to translate it into languages other than
+   English.
+
+   The limited permissions granted above are perpetual and will not be
+   revoked by the Internet Society or its successors or assigns.
+
+   This document and the information contained herein is provided on an
+   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
+   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
+   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
+   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
+   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+
+Acknowledgement
+
+   Funding for the RFC Editor function is currently provided by the
+   Internet Society.