1 files changed, 2467 insertions, 0 deletions
diff --git a/doc/rfc/rfc3551.txt b/doc/rfc/rfc3551.txt
new file mode 100644
index 0000000..c43ff34
--- /dev/null
+++ b/doc/rfc/rfc3551.txt
@@ -0,0 +1,2467 @@
+
+
+
+
+
+
+Network Working Group                                     H. Schulzrinne
+Request for Comments: 3551                           Columbia University
+Obsoletes: 1890                                                S. Casner
+Category: Standards Track                                  Packet Design
+                                                               July 2003
+
+
+              RTP Profile for Audio and Video Conferences
+                          with Minimal Control
+
+Status of this Memo
+
+   This document specifies an Internet standards track protocol for the
+   Internet community, and requests discussion and suggestions for
+   improvements.  Please refer to the current edition of the "Internet
+   Official Protocol Standards" (STD 1) for the standardization state
+   and status of this protocol.  Distribution of this memo is unlimited.
+
+Copyright Notice
+
+   Copyright (C) The Internet Society (2003).  All Rights Reserved.
+
+Abstract
+
+   This document describes a profile called "RTP/AVP" for the use of the
+   real-time transport protocol (RTP), version 2, and the associated
+   control protocol, RTCP, within audio and video multiparticipant
+   conferences with minimal control.  It provides interpretations of
+   generic fields within the RTP specification suitable for audio and
+   video conferences.  In particular, this document defines a set of
+   default mappings from payload type numbers to encodings.
+
+   This document also describes how audio and video data may be carried
+   within RTP.  It defines a set of standard encodings and their names
+   when used within RTP.  The descriptions provide pointers to reference
+   implementations and the detailed standards.  This document is meant
+   as an aid for implementors of audio, video and other real-time
+   multimedia applications.
+
+   This memorandum obsoletes RFC 1890.  It is mostly backwards-
+   compatible except for functions removed because two interoperable
+   implementations were not found.  The additions to RFC 1890 codify
+   existing practice in the use of payload formats under this profile
+   and include new payload formats defined since RFC 1890 was published.
+
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                     [Page 1]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+Table of Contents
+
+   1.  Introduction .................................................  3
+       1.1  Terminology .............................................  3
+   2.  RTP and RTCP Packet Forms and Protocol Behavior ..............  4
+   3.  Registering Additional Encodings .............................  6
+   4.  Audio ........................................................  8
+       4.1  Encoding-Independent Rules ..............................  8
+       4.2  Operating Recommendations ...............................  9
+       4.3  Guidelines for Sample-Based Audio Encodings ............. 10
+       4.4  Guidelines for Frame-Based Audio Encodings .............. 11
+       4.5  Audio Encodings ......................................... 12
+            4.5.1   DVI4 ............................................ 13
+            4.5.2   G722 ............................................ 14
+            4.5.3   G723 ............................................ 14
+            4.5.4   G726-40, G726-32, G726-24, and G726-16 .......... 18
+            4.5.5   G728 ............................................ 19
+            4.5.6   G729 ............................................ 20
+            4.5.7   G729D and G729E ................................. 22
+            4.5.8   GSM ............................................. 24
+            4.5.9   GSM-EFR ......................................... 27
+            4.5.10  L8 .............................................. 27
+            4.5.11  L16 ............................................. 27
+            4.5.12  LPC ............................................. 27
+            4.5.13  MPA ............................................. 28
+            4.5.14  PCMA and PCMU ................................... 28
+            4.5.15  QCELP ........................................... 28
+            4.5.16  RED ............................................. 29
+            4.5.17  VDVI ............................................ 29
+   5.  Video ........................................................ 30
+       5.1  CelB .................................................... 30
+       5.2  JPEG .................................................... 30
+       5.3  H261 .................................................... 30
+       5.4  H263 .................................................... 31
+       5.5  H263-1998 ............................................... 31
+       5.6  MPV ..................................................... 31
+       5.7  MP2T .................................................... 31
+       5.8  nv ...................................................... 32
+   6.  Payload Type Definitions ..................................... 32
+   7.  RTP over TCP and Similar Byte Stream Protocols ............... 34
+   8.  Port Assignment .............................................. 34
+   9.  Changes from RFC 1890 ........................................ 35
+   10. Security Considerations ...................................... 38
+   11. IANA Considerations .......................................... 39
+   12. References ................................................... 39
+       12.1 Normative References .................................... 39
+       12.2 Informative References .................................. 39
+   13. Current Locations of Related Resources ....................... 41
+
+
+
+Schulzrinne & Casner        Standards Track                     [Page 2]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+   14. Acknowledgments .............................................. 42
+   15. Intellectual Property Rights Statement ....................... 43
+   16. Authors' Addresses ........................................... 43
+   17. Full Copyright Statement ..................................... 44
+
+1. Introduction
+
+   This profile defines aspects of RTP left unspecified in the RTP
+   Version 2 protocol definition (RFC 3550) [1].  This profile is
+   intended for the use within audio and video conferences with minimal
+   session control.  In particular, no support for the negotiation of
+   parameters or membership control is provided.  The profile is
+   expected to be useful in sessions where no negotiation or membership
+   control are used (e.g., using the static payload types and the
+   membership indications provided by RTCP), but this profile may also
+   be useful in conjunction with a higher-level control protocol.
+
+   Use of this profile may be implicit in the use of the appropriate
+   applications; there may be no explicit indication by port number,
+   protocol identifier or the like.  Applications such as session
+   directories may use the name for this profile specified in Section
+   11.
+
+   Other profiles may make different choices for the items specified
+   here.
+
+   This document also defines a set of encodings and payload formats for
+   audio and video.  These payload format descriptions are included here
+   only as a matter of convenience since they are too small to warrant
+   separate documents.  Use of these payload formats is NOT REQUIRED to
+   use this profile.  Only the binding of some of the payload formats to
+   static payload type numbers in Tables 4 and 5 is normative.
+
+1.1 Terminology
+
+   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+   document are to be interpreted as described in RFC 2119 [2] and
+   indicate requirement levels for implementations compliant with this
+   RTP profile.
+
+   This document defines the term media type as dividing encodings of
+   audio and video content into three classes: audio, video and
+   audio/video (interleaved).
+
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                     [Page 3]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+2. RTP and RTCP Packet Forms and Protocol Behavior
+
+   The section "RTP Profiles and Payload Format Specifications" of RFC
+   3550 enumerates a number of items that can be specified or modified
+   in a profile.  This section addresses these items.  Generally, this
+   profile follows the default and/or recommended aspects of the RTP
+   specification.
+
+   RTP data header: The standard format of the fixed RTP data
+      header is used (one marker bit).
+
+   Payload types: Static payload types are defined in Section 6.
+
+   RTP data header additions: No additional fixed fields are
+      appended to the RTP data header.
+
+   RTP data header extensions: No RTP header extensions are
+      defined, but applications operating under this profile MAY use
+      such extensions.  Thus, applications SHOULD NOT assume that the
+      RTP header X bit is always zero and SHOULD be prepared to ignore
+      the header extension.  If a header extension is defined in the
+      future, that definition MUST specify the contents of the first 16
+      bits in such a way that multiple different extensions can be
+      identified.
+
+   RTCP packet types: No additional RTCP packet types are defined
+      by this profile specification.
+
+   RTCP report interval: The suggested constants are to be used for
+      the RTCP report interval calculation.  Sessions operating under
+      this profile MAY specify a separate parameter for the RTCP traffic
+      bandwidth rather than using the default fraction of the session
+      bandwidth.  The RTCP traffic bandwidth MAY be divided into two
+      separate session parameters for those participants which are
+      active data senders and those which are not.  Following the
+      recommendation in the RTP specification [1] that 1/4 of the RTCP
+      bandwidth be dedicated to data senders, the RECOMMENDED default
+      values for these two parameters would be 1.25% and 3.75%,
+      respectively.  For a particular session, the RTCP bandwidth for
+      non-data-senders MAY be set to zero when operating on
+      unidirectional links or for sessions that don't require feedback
+      on the quality of reception.  The RTCP bandwidth for data senders
+      SHOULD be kept non-zero so that sender reports can still be sent
+      for inter-media synchronization and to identify the source by
+      CNAME.  The means by which the one or two session parameters for
+      RTCP bandwidth are specified is beyond the scope of this memo.
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                     [Page 4]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+   SR/RR extension: No extension section is defined for the RTCP SR
+      or RR packet.
+
+   SDES use: Applications MAY use any of the SDES items described
+      in the RTP specification.  While CNAME information MUST be sent
+      every reporting interval, other items SHOULD only be sent every
+      third reporting interval, with NAME sent seven out of eight times
+      within that slot and the remaining SDES items cyclically taking up
+      the eighth slot, as defined in Section 6.2.2 of the RTP
+      specification.  In other words, NAME is sent in RTCP packets 1, 4,
+      7, 10, 13, 16, 19, while, say, EMAIL is used in RTCP packet 22.
+
+   Security: The RTP default security services are also the default
+      under this profile.
+
+   String-to-key mapping: No mapping is specified by this profile.
+
+   Congestion: RTP and this profile may be used in the context of
+      enhanced network service, for example, through Integrated Services
+      (RFC 1633) [4] or Differentiated Services (RFC 2475) [5], or they
+      may be used with best effort service.
+
+      If enhanced service is being used, RTP receivers SHOULD monitor
+      packet loss to ensure that the service that was requested is
+      actually being delivered.  If it is not, then they SHOULD assume
+      that they are receiving best-effort service and behave
+      accordingly.
+
+      If best-effort service is being used, RTP receivers SHOULD monitor
+      packet loss to ensure that the packet loss rate is within
+      acceptable parameters.  Packet loss is considered acceptable if a
+      TCP flow across the same network path and experiencing the same
+      network conditions would achieve an average throughput, measured
+      on a reasonable timescale, that is not less than the RTP flow is
+      achieving.  This condition can be satisfied by implementing
+      congestion control mechanisms to adapt the transmission rate (or
+      the number of layers subscribed for a layered multicast session),
+      or by arranging for a receiver to leave the session if the loss
+      rate is unacceptably high.
+
+      The comparison to TCP cannot be specified exactly, but is intended
+      as an "order-of-magnitude" comparison in timescale and throughput.
+      The timescale on which TCP throughput is measured is the round-
+      trip time of the connection.  In essence, this requirement states
+      that it is not acceptable to deploy an application (using RTP or
+      any other transport protocol) on the best-effort Internet which
+      consumes bandwidth arbitrarily and does not compete fairly with
+      TCP within an order of magnitude.
+
+
+
+Schulzrinne & Casner        Standards Track                     [Page 5]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+   Underlying protocol: The profile specifies the use of RTP over
+      unicast and multicast UDP as well as TCP.  (This does not preclude
+      the use of these definitions when RTP is carried by other lower-
+      layer protocols.)
+
+   Transport mapping: The standard mapping of RTP and RTCP to
+      transport-level addresses is used.
+
+   Encapsulation: This profile leaves to applications the
+      specification of RTP encapsulation in protocols other than UDP.
+
+3.  Registering Additional Encodings
+
+   This profile lists a set of encodings, each of which is comprised of
+   a particular media data compression or representation plus a payload
+   format for encapsulation within RTP.  Some of those payload formats
+   are specified here, while others are specified in separate RFCs.  It
+   is expected that additional encodings beyond the set listed here will
+   be created in the future and specified in additional payload format
+   RFCs.
+
+   This profile also assigns to each encoding a short name which MAY be
+   used by higher-level control protocols, such as the Session
+   Description Protocol (SDP), RFC 2327 [6], to identify encodings
+   selected for a particular RTP session.
+
+   In some contexts it may be useful to refer to these encodings in the
+   form of a MIME content-type.  To facilitate this, RFC 3555 [7]
+   provides registrations for all of the encodings names listed here as
+   MIME subtype names under the "audio" and "video" MIME types through
+   the MIME registration procedure as specified in RFC 2048 [8].
+
+   Any additional encodings specified for use under this profile (or
+   others) may also be assigned names registered as MIME subtypes with
+   the Internet Assigned Numbers Authority (IANA).  This registry
+   provides a means to insure that the names assigned to the additional
+   encodings are kept unique.  RFC 3555 specifies the information that
+   is required for the registration of RTP encodings.
+
+   In addition to assigning names to encodings, this profile also
+   assigns static RTP payload type numbers to some of them.  However,
+   the payload type number space is relatively small and cannot
+   accommodate assignments for all existing and future encodings.
+   During the early stages of RTP development, it was necessary to use
+   statically assigned payload types because no other mechanism had been
+   specified to bind encodings to payload types.  It was anticipated
+   that non-RTP means beyond the scope of this memo (such as directory
+   services or invitation protocols) would be specified to establish a
+
+
+
+Schulzrinne & Casner        Standards Track                     [Page 6]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+   dynamic mapping between a payload type and an encoding.  Now,
+   mechanisms for defining dynamic payload type bindings have been
+   specified in the Session Description Protocol (SDP) and in other
+   protocols such as ITU-T Recommendation H.323/H.245.  These mechanisms
+   associate the registered name of the encoding/payload format, along
+   with any additional required parameters, such as the RTP timestamp
+   clock rate and number of channels, with a payload type number.  This
+   association is effective only for the duration of the RTP session in
+   which the dynamic payload type binding is made.  This association
+   applies only to the RTP session for which it is made, thus the
+   numbers can be re-used for different encodings in different sessions
+   so the number space limitation is avoided.
+
+   This profile reserves payload type numbers in the range 96-127
+   exclusively for dynamic assignment.  Applications SHOULD first use
+   values in this range for dynamic payload types.  Those applications
+   which need to define more than 32 dynamic payload types MAY bind
+   codes below 96, in which case it is RECOMMENDED that unassigned
+   payload type numbers be used first.  However, the statically assigned
+   payload types are default bindings and MAY be dynamically bound to
+   new encodings if needed.  Redefining payload types below 96 may cause
+   incorrect operation if an attempt is made to join a session without
+   obtaining session description information that defines the dynamic
+   payload types.
+
+   Dynamic payload types SHOULD NOT be used without a well-defined
+   mechanism to indicate the mapping.  Systems that expect to
+   interoperate with others operating under this profile SHOULD NOT make
+   their own assignments of proprietary encodings to particular, fixed
+   payload types.
+
+   This specification establishes the policy that no additional static
+   payload types will be assigned beyond the ones defined in this
+   document.  Establishing this policy avoids the problem of trying to
+   create a set of criteria for accepting static assignments and
+   encourages the implementation and deployment of the dynamic payload
+   type mechanisms.
+
+   The final set of static payload type assignments is provided in
+   Tables 4 and 5.
+
+
+
+
+
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                     [Page 7]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+4.  Audio
+
+4.1  Encoding-Independent Rules
+
+   Since the ability to suppress silence is one of the primary
+   motivations for using packets to transmit voice, the RTP header
+   carries both a sequence number and a timestamp to allow a receiver to
+   distinguish between lost packets and periods of time when no data was
+   transmitted.  Discontiguous transmission (silence suppression) MAY be
+   used with any audio payload format.  Receivers MUST assume that
+   senders may suppress silence unless this is restricted by signaling
+   specified elsewhere.  (Even if the transmitter does not suppress
+   silence, the receiver should be prepared to handle periods when no
+   data is present since packets may be lost.)
+
+   Some payload formats (see Sections 4.5.3 and 4.5.6) define a "silence
+   insertion descriptor" or "comfort noise" frame to specify parameters
+   for artificial noise that may be generated during a period of silence
+   to approximate the background noise at the source.  For other payload
+   formats, a generic Comfort Noise (CN) payload format is specified in
+   RFC 3389 [9].  When the CN payload format is used with another
+   payload format, different values in the RTP payload type field
+   distinguish comfort-noise packets from those of the selected payload
+   format.
+
+   For applications which send either no packets or occasional comfort-
+   noise packets during silence, the first packet of a talkspurt, that
+   is, the first packet after a silence period during which packets have
+   not been transmitted contiguously, SHOULD be distinguished by setting
+   the marker bit in the RTP data header to one.  The marker bit in all
+   other packets is zero.  The beginning of a talkspurt MAY be used to
+   adjust the playout delay to reflect changing network delays.
+   Applications without silence suppression MUST set the marker bit to
+   zero.
+
+   The RTP clock rate used for generating the RTP timestamp is
+   independent of the number of channels and the encoding; it usually
+   equals the number of sampling periods per second.  For N-channel
+   encodings, each sampling period (say, 1/8,000 of a second) generates
+   N samples.  (This terminology is standard, but somewhat confusing, as
+   the total number of samples generated per second is then the sampling
+   rate times the channel count.)
+
+   If multiple audio channels are used, channels are numbered left-to-
+   right, starting at one.  In RTP audio packets, information from
+   lower-numbered channels precedes that from higher-numbered channels.
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                     [Page 8]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+   For more than two channels, the convention followed by the AIFF-C
+   audio interchange format SHOULD be followed [3], using the following
+   notation, unless some other convention is specified for a particular
+   encoding or payload format:
+
+      l  left
+      r  right
+      c  center
+      S  surround
+      F  front
+      R  rear
+
+      channels  description  channel
+                                1     2   3   4   5   6
+      _________________________________________________
+      2         stereo          l     r
+      3                         l     r   c
+      4                         l     c   r   S
+      5                        Fl     Fr  Fc  Sl  Sr
+      6                         l     lc  c   r   rc  S
+
+         Note: RFC 1890 defined two conventions for the ordering of four
+         audio channels.  Since the ordering is indicated implicitly by
+         the number of channels, this was ambiguous.  In this revision,
+         the order described as "quadrophonic" has been eliminated to
+         remove the ambiguity.  This choice was based on the observation
+         that quadrophonic consumer audio format did not become popular
+         whereas surround-sound subsequently has.
+
+   Samples for all channels belonging to a single sampling instant MUST
+   be within the same packet.  The interleaving of samples from
+   different channels depends on the encoding.  General guidelines are
+   given in Section 4.3 and 4.4.
+
+   The sampling frequency SHOULD be drawn from the set:  8,000, 11,025,
+   16,000, 22,050, 24,000, 32,000, 44,100 and 48,000 Hz.  (Older Apple
+   Macintosh computers had a native sample rate of 22,254.54 Hz, which
+   can be converted to 22,050 with acceptable quality by dropping 4
+   samples in a 20 ms frame.)  However, most audio encodings are defined
+   for a more restricted set of sampling frequencies.  Receivers SHOULD
+   be prepared to accept multi-channel audio, but MAY choose to only
+   play a single channel.
+
+4.2  Operating Recommendations
+
+   The following recommendations are default operating parameters.
+   Applications SHOULD be prepared to handle other values.  The ranges
+   given are meant to give guidance to application writers, allowing a
+
+
+
+Schulzrinne & Casner        Standards Track                     [Page 9]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+   set of applications conforming to these guidelines to interoperate
+   without additional negotiation.  These guidelines are not intended to
+   restrict operating parameters for applications that can negotiate a
+   set of interoperable parameters, e.g., through a conference control
+   protocol.
+
+   For packetized audio, the default packetization interval SHOULD have
+   a duration of 20 ms or one frame, whichever is longer, unless
+   otherwise noted in Table 1 (column "ms/packet").  The packetization
+   interval determines the minimum end-to-end delay; longer packets
+   introduce less header overhead but higher delay and make packet loss
+   more noticeable.  For non-interactive applications such as lectures
+   or for links with severe bandwidth constraints, a higher
+   packetization delay MAY be used.  A receiver SHOULD accept packets
+   representing between 0 and 200 ms of audio data.  (For framed audio
+   encodings, a receiver SHOULD accept packets with a number of frames
+   equal to 200 ms divided by the frame duration, rounded up.)  This
+   restriction allows reasonable buffer sizing for the receiver.
+
+4.3  Guidelines for Sample-Based Audio Encodings
+
+   In sample-based encodings, each audio sample is represented by a
+   fixed number of bits.  Within the compressed audio data, codes for
+   individual samples may span octet boundaries.  An RTP audio packet
+   may contain any number of audio samples, subject to the constraint
+   that the number of bits per sample times the number of samples per
+   packet yields an integral octet count.  Fractional encodings produce
+   less than one octet per sample.
+
+   The duration of an audio packet is determined by the number of
+   samples in the packet.
+
+   For sample-based encodings producing one or more octets per sample,
+   samples from different channels sampled at the same sampling instant
+   SHOULD be packed in consecutive octets.  For example, for a two-
+   channel encoding, the octet sequence is (left channel, first sample),
+   (right channel, first sample), (left channel, second sample), (right
+   channel, second sample), ....  For multi-octet encodings, octets
+   SHOULD be transmitted in network byte order (i.e., most significant
+   octet first).
+
+   The packing of sample-based encodings producing less than one octet
+   per sample is encoding-specific.
+
+   The RTP timestamp reflects the instant at which the first sample in
+   the packet was sampled, that is, the oldest information in the
+   packet.
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 10]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+4.4  Guidelines for Frame-Based Audio Encodings
+
+   Frame-based encodings encode a fixed-length block of audio into
+   another block of compressed data, typically also of fixed length.
+   For frame-based encodings, the sender MAY choose to combine several
+   such frames into a single RTP packet.  The receiver can tell the
+   number of frames contained in an RTP packet, if all the frames have
+   the same length, by dividing the RTP payload length by the audio
+   frame size which is defined as part of the encoding.  This does not
+   work when carrying frames of different sizes unless the frame sizes
+   are relatively prime.  If not, the frames MUST indicate their size.
+
+   For frame-based codecs, the channel order is defined for the whole
+   block.  That is, for two-channel audio, right and left samples SHOULD
+   be coded independently, with the encoded frame for the left channel
+   preceding that for the right channel.
+
+   All frame-oriented audio codecs SHOULD be able to encode and decode
+   several consecutive frames within a single packet.  Since the frame
+   size for the frame-oriented codecs is given, there is no need to use
+   a separate designation for the same encoding, but with different
+   number of frames per packet.
+
+   RTP packets SHALL contain a whole number of frames, with frames
+   inserted according to age within a packet, so that the oldest frame
+   (to be played first) occurs immediately after the RTP packet header.
+   The RTP timestamp reflects the instant at which the first sample in
+   the first frame was sampled, that is, the oldest information in the
+   packet.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 11]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+4.5 Audio Encodings
+
+   name of                              sampling              default
+   encoding  sample/frame  bits/sample      rate  ms/frame  ms/packet
+   __________________________________________________________________
+   DVI4      sample        4                var.                   20
+   G722      sample        8              16,000                   20
+   G723      frame         N/A             8,000        30         30
+   G726-40   sample        5               8,000                   20
+   G726-32   sample        4               8,000                   20
+   G726-24   sample        3               8,000                   20
+   G726-16   sample        2               8,000                   20
+   G728      frame         N/A             8,000       2.5         20
+   G729      frame         N/A             8,000        10         20
+   G729D     frame         N/A             8,000        10         20
+   G729E     frame         N/A             8,000        10         20
+   GSM       frame         N/A             8,000        20         20
+   GSM-EFR   frame         N/A             8,000        20         20
+   L8        sample        8                var.                   20
+   L16       sample        16               var.                   20
+   LPC       frame         N/A             8,000        20         20
+   MPA       frame         N/A              var.      var.
+   PCMA      sample        8                var.                   20
+   PCMU      sample        8                var.                   20
+   QCELP     frame         N/A             8,000        20         20
+   VDVI      sample        var.             var.                   20
+
+   Table 1: Properties of Audio Encodings (N/A: not applicable; var.:
+            variable)
+
+   The characteristics of the audio encodings described in this document
+   are shown in Table 1; they are listed in order of their payload type
+   in Table 4.  While most audio codecs are only specified for a fixed
+   sampling rate, some sample-based algorithms (indicated by an entry of
+   "var." in the sampling rate column of Table 1) may be used with
+   different sampling rates, resulting in different coded bit rates.
+   When used with a sampling rate other than that for which a static
+   payload type is defined, non-RTP means beyond the scope of this memo
+   MUST be used to define a dynamic payload type and MUST indicate the
+   selected RTP timestamp clock rate, which is usually the same as the
+   sampling rate for audio.
+
+
+
+
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 12]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+4.5.1 DVI4
+
+   DVI4 uses an adaptive delta pulse code modulation (ADPCM) encoding
+   scheme that was specified by the Interactive Multimedia Association
+   (IMA) as the "IMA ADPCM wave type".  However, the encoding defined
+   here as DVI4 differs in three respects from the IMA specification:
+
+   o  The RTP DVI4 header contains the predicted value rather than the
+      first sample value contained the IMA ADPCM block header.
+
+   o  IMA ADPCM blocks contain an odd number of samples, since the first
+      sample of a block is contained just in the header (uncompressed),
+      followed by an even number of compressed samples.  DVI4 has an
+      even number of compressed samples only, using the `predict' word
+      from the header to decode the first sample.
+
+   o  For DVI4, the 4-bit samples are packed with the first sample in
+      the four most significant bits and the second sample in the four
+      least significant bits.  In the IMA ADPCM codec, the samples are
+      packed in the opposite order.
+
+   Each packet contains a single DVI block.  This profile only defines
+   the 4-bit-per-sample version, while IMA also specified a 3-bit-per-
+   sample encoding.
+
+   The "header" word for each channel has the following structure:
+
+      int16  predict;  /* predicted value of first sample
+                          from the previous block (L16 format) */
+      u_int8 index;    /* current index into stepsize table */
+      u_int8 reserved; /* set to zero by sender, ignored by receiver */
+
+   Each octet following the header contains two 4-bit samples, thus the
+   number of samples per packet MUST be even because there is no means
+   to indicate a partially filled last octet.
+
+   Packing of samples for multiple channels is for further study.
+
+   The IMA ADPCM algorithm was described in the document IMA Recommended
+   Practices for Enhancing Digital Audio Compatibility in Multimedia
+   Systems (version 3.0).  However, the Interactive Multimedia
+   Association ceased operations in 1997.  Resources for an archived
+   copy of that document and a software implementation of the RTP DVI4
+   encoding are listed in Section 13.
+
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 13]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+4.5.2 G722
+
+   G722 is specified in ITU-T Recommendation G.722, "7 kHz audio-coding
+   within 64 kbit/s".  The G.722 encoder produces a stream of octets,
+   each of which SHALL be octet-aligned in an RTP packet.  The first bit
+   transmitted in the G.722 octet, which is the most significant bit of
+   the higher sub-band sample, SHALL correspond to the most significant
+   bit of the octet in the RTP packet.
+
+   Even though the actual sampling rate for G.722 audio is 16,000 Hz,
+   the RTP clock rate for the G722 payload format is 8,000 Hz because
+   that value was erroneously assigned in RFC 1890 and must remain
+   unchanged for backward compatibility.  The octet rate or sample-pair
+   rate is 8,000 Hz.
+
+4.5.3 G723
+
+   G723 is specified in ITU Recommendation G.723.1, "Dual-rate speech
+   coder for multimedia communications transmitting at 5.3 and 6.3
+   kbit/s".  The G.723.1 5.3/6.3 kbit/s codec was defined by the ITU-T
+   as a mandatory codec for ITU-T H.324 GSTN videophone terminal
+   applications.  The algorithm has a floating point specification in
+   Annex B to G.723.1, a silence compression algorithm in Annex A to
+   G.723.1 and a scalable channel coding scheme for wireless
+   applications in G.723.1 Annex C.
+
+   This Recommendation specifies a coded representation that can be used
+   for compressing the speech signal component of multi-media services
+   at a very low bit rate.  Audio is encoded in 30 ms frames, with an
+   additional delay of 7.5 ms due to look-ahead.  A G.723.1 frame can be
+   one of three sizes:  24 octets (6.3 kb/s frame), 20 octets (5.3 kb/s
+   frame), or 4 octets.  These 4-octet frames are called SID frames
+   (Silence Insertion Descriptor) and are used to specify comfort noise
+   parameters.  There is no restriction on how 4, 20, and 24 octet
+   frames are intermixed.  The least significant two bits of the first
+   octet in the frame determine the frame size and codec type:
+
+         bits  content                      octets/frame
+         00    high-rate speech (6.3 kb/s)            24
+         01    low-rate speech  (5.3 kb/s)            20
+         10    SID frame                               4
+         11    reserved
+
+
+
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 14]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+   It is possible to switch between the two rates at any 30 ms frame
+   boundary.  Both (5.3 kb/s and 6.3 kb/s) rates are a mandatory part of
+   the encoder and decoder.  Receivers MUST accept both data rates and
+   MUST accept SID frames unless restriction of these capabilities has
+   been signaled.  The MIME registration for G723 in RFC 3555 [7]
+   specifies parameters that MAY be used with MIME or SDP to restrict to
+   a single data rate or to restrict the use of SID frames.  This coder
+   was optimized to represent speech with near-toll quality at the above
+   rates using a limited amount of complexity.
+
+   The packing of the encoded bit stream into octets and the
+   transmission order of the octets is specified in Rec. G.723.1 and is
+   the same as that produced by the G.723 C code reference
+   implementation.  For the 6.3 kb/s data rate, this packing is
+   illustrated as follows, where the header (HDR) bits are always "0 0"
+   as shown in Fig. 1 to indicate operation at 6.3 kb/s, and the Z bit
+   is always set to zero.  The diagrams show the bit packing in "network
+   byte order", also known as big-endian order.  The bits of each 32-bit
+   word are numbered 0 to 31, with the most significant bit on the left
+   and numbered 0.  The octets (bytes) of each word are transmitted most
+   significant octet first.  The bits of each data field are numbered in
+   the order of the bit stream representation of the encoding (least
+   significant bit first).  The vertical bars indicate the boundaries
+   between field fragments.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 15]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |    LPC    |HDR|      LPC      |      LPC      |    ACL0   |LPC|
+   |           |   |               |               |           |   |
+   |0 0 0 0 0 0|0 0|1 1 1 1 0 0 0 0|2 2 1 1 1 1 1 1|0 0 0 0 0 0|2 2|
+   |5 4 3 2 1 0|   |3 2 1 0 9 8 7 6|1 0 9 8 7 6 5 4|5 4 3 2 1 0|3 2|
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |  ACL2   |ACL|A| GAIN0 |ACL|ACL|    GAIN0      |    GAIN1      |
+   |         | 1 |C|       | 3 | 2 |               |               |
+   |0 0 0 0 0|0 0|0|0 0 0 0|0 0|0 0|1 1 0 0 0 0 0 0|0 0 0 0 0 0 0 0|
+   |4 3 2 1 0|1 0|6|3 2 1 0|1 0|6 5|1 0 9 8 7 6 5 4|7 6 5 4 3 2 1 0|
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   | GAIN2 | GAIN1 |     GAIN2     |     GAIN3     | GRID  | GAIN3 |
+   |       |       |               |               |       |       |
+   |0 0 0 0|1 1 0 0|1 1 0 0 0 0 0 0|0 0 0 0 0 0 0 0|0 0 0 0|1 1 0 0|
+   |3 2 1 0|1 0 9 8|1 0 9 8 7 6 5 4|7 6 5 4 3 2 1 0|3 2 1 0|1 0 9 8|
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |   MSBPOS    |Z|POS|  MSBPOS   |     POS0      |POS|   POS0    |
+   |             | | 0 |           |               | 1 |           |
+   |0 0 0 0 0 0 0|0|0 0|1 1 1 0 0 0|0 0 0 0 0 0 0 0|0 0|1 1 1 1 1 1|
+   |6 5 4 3 2 1 0| |1 0|2 1 0 9 8 7|9 8 7 6 5 4 3 2|1 0|5 4 3 2 1 0|
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |     POS1      | POS2  | POS1  |     POS2      | POS3  | POS2  |
+   |               |       |       |               |       |       |
+   |0 0 0 0 0 0 0 0|0 0 0 0|1 1 1 1|1 1 0 0 0 0 0 0|0 0 0 0|1 1 1 1|
+   |9 8 7 6 5 4 3 2|3 2 1 0|3 2 1 0|1 0 9 8 7 6 5 4|3 2 1 0|5 4 3 2|
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |     POS3      |   PSIG0   |POS|PSIG2|  PSIG1  |  PSIG3  |PSIG2|
+   |               |           | 3 |     |         |         |     |
+   |1 1 0 0 0 0 0 0|0 0 0 0 0 0|1 1|0 0 0|0 0 0 0 0|0 0 0 0 0|0 0 0|
+   |1 0 9 8 7 6 5 4|5 4 3 2 1 0|3 2|2 1 0|4 3 2 1 0|4 3 2 1 0|5 4 3|
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+                  Figure 1: G.723 (6.3 kb/s) bit packing
+
+   For the 5.3 kb/s data rate, the header (HDR) bits are always "0 1",
+   as shown in Fig. 2, to indicate operation at 5.3 kb/s.
+
+
+
+
+
+
+
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 16]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |    LPC    |HDR|      LPC      |      LPC      |   ACL0    |LPC|
+   |           |   |               |               |           |   |
+   |0 0 0 0 0 0|0 1|1 1 1 1 0 0 0 0|2 2 1 1 1 1 1 1|0 0 0 0 0 0|2 2|
+   |5 4 3 2 1 0|   |3 2 1 0 9 8 7 6|1 0 9 8 7 6 5 4|5 4 3 2 1 0|3 2|
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |  ACL2   |ACL|A| GAIN0 |ACL|ACL|     GAIN0     |     GAIN1     |
+   |         | 1 |C|       | 3 | 2 |               |               |
+   |0 0 0 0 0|0 0|0|0 0 0 0|0 0|0 0|1 1 0 0 0 0 0 0|0 0 0 0 0 0 0 0|
+   |4 3 2 1 0|1 0|6|3 2 1 0|1 0|6 5|1 0 9 8 7 6 5 4|7 6 5 4 3 2 1 0|
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   | GAIN2 | GAIN1 |     GAIN2     |    GAIN3      | GRID  | GAIN3 |
+   |       |       |               |               |       |       |
+   |0 0 0 0|1 1 0 0|1 1 0 0 0 0 0 0|0 0 0 0 0 0 0 0|0 0 0 0|1 1 0 0|
+   |3 2 1 0|1 0 9 8|1 0 9 8 7 6 5 4|7 6 5 4 3 2 1 0|4 3 2 1|1 0 9 8|
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |     POS0      | POS1  | POS0  |     POS1      |     POS2      |
+   |               |       |       |               |               |
+   |0 0 0 0 0 0 0 0|0 0 0 0|1 1 0 0|1 1 0 0 0 0 0 0|0 0 0 0 0 0 0 0|
+   |7 6 5 4 3 2 1 0|3 2 1 0|1 0 9 8|1 0 9 8 7 6 5 4|7 6 5 4 3 2 1 0|
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   | POS3  | POS2  |     POS3      | PSIG1 | PSIG0 | PSIG3 | PSIG2 |
+   |       |       |               |       |       |       |       |
+   |0 0 0 0|1 1 0 0|1 1 0 0 0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|
+   |3 2 1 0|1 0 9 8|1 0 9 8 7 6 5 4|3 2 1 0|3 2 1 0|3 2 1 0|3 2 1 0|
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+                  Figure 2: G.723 (5.3 kb/s) bit packing
+
+   The packing of G.723.1 SID (silence) frames, which are indicated by
+   the header (HDR) bits having the pattern "1 0", is depicted in Fig.
+   3.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |    LPC    |HDR|      LPC      |      LPC      |   GAIN    |LPC|
+   |           |   |               |               |           |   |
+   |0 0 0 0 0 0|1 0|1 1 1 1 0 0 0 0|2 2 1 1 1 1 1 1|0 0 0 0 0 0|2 2|
+   |5 4 3 2 1 0|   |3 2 1 0 9 8 7 6|1 0 9 8 7 6 5 4|5 4 3 2 1 0|3 2|
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+                   Figure 3: G.723 SID mode bit packing
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 17]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+4.5.4  G726-40, G726-32, G726-24, and G726-16
+
+   ITU-T Recommendation G.726 describes, among others, the algorithm
+   recommended for conversion of a single 64 kbit/s A-law or mu-law PCM
+   channel encoded at 8,000 samples/sec to and from a 40, 32, 24, or 16
+   kbit/s channel.  The conversion is applied to the PCM stream using an
+   Adaptive Differential Pulse Code Modulation (ADPCM) transcoding
+   technique.  The ADPCM representation consists of a series of
+   codewords with a one-to-one correspondence to the samples in the PCM
+   stream.  The G726 data rates of 40, 32, 24, and 16 kbit/s have
+   codewords of 5, 4, 3, and 2 bits, respectively.
+
+   The 16 and 24 kbit/s encodings do not provide toll quality speech.
+   They are designed for used in overloaded Digital Circuit
+   Multiplication Equipment (DCME).  ITU-T G.726 recommends that the 16
+   and 24 kbit/s encodings should be alternated with higher data rate
+   encodings to provide an average sample size of between 3.5 and 3.7
+   bits per sample.
+
+   The encodings of G.726 are here denoted as G726-40, G726-32, G726-24,
+   and G726-16.  Prior to 1990, G721 described the 32 kbit/s ADPCM
+   encoding, and G723 described the 40, 32, and 16 kbit/s encodings.
+   Thus, G726-32 designates the same algorithm as G721 in RFC 1890.
+
+   A stream of G726 codewords contains no information on the encoding
+   being used, therefore transitions between G726 encoding types are not
+   permitted within a sequence of packed codewords.  Applications MUST
+   determine the encoding type of packed codewords from the RTP payload
+   identifier.
+
+   No payload-specific header information SHALL be included as part of
+   the audio data.  A stream of G726 codewords MUST be packed into
+   octets as follows:  the first codeword is placed into the first octet
+   such that the least significant bit of the codeword aligns with the
+   least significant bit in the octet, the second codeword is then
+   packed so that its least significant bit coincides with the least
+   significant unoccupied bit in the octet.  When a complete codeword
+   cannot be placed into an octet, the bits overlapping the octet
+   boundary are placed into the least significant bits of the next
+   octet.  Packing MUST end with a completely packed final octet.  The
+   number of codewords packed will therefore be a multiple of 8, 2, 8,
+   and 4 for G726-40, G726-32, G726-24, and G726-16, respectively.  An
+   example of the packing scheme for G726-32 codewords is as shown,
+   where bit 7 is the least significant bit of the first octet, and bit
+   A3 is the least significant bit of the first codeword:
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 18]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+          0                   1
+          0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
+         |B B B B|A A A A|D D D D|C C C C| ...
+         |0 1 2 3|0 1 2 3|0 1 2 3|0 1 2 3|
+         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
+
+   An example of the packing scheme for G726-24 codewords follows, where
+   again bit 7 is the least significant bit of the first octet, and bit
+   A2 is the least significant bit of the first codeword:
+
+          0                   1                   2
+          0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
+         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
+         |C C|B B B|A A A|F|E E E|D D D|C|H H H|G G G|F F| ...
+         |1 2|0 1 2|0 1 2|2|0 1 2|0 1 2|0|0 1 2|0 1 2|0 1|
+         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
+
+   Note that the "little-endian" direction in which samples are packed
+   into octets in the G726-16, -24, -32 and -40 payload formats
+   specified here is consistent with ITU-T Recommendation X.420, but is
+   the opposite of what is specified in ITU-T Recommendation I.366.2
+   Annex E for ATM AAL2 transport.  A second set of RTP payload formats
+   matching the packetization of I.366.2 Annex E and identified by MIME
+   subtypes AAL2-G726-16, -24, -32 and -40 will be specified in a
+   separate document.
+
+4.5.5 G728
+
+   G728 is specified in ITU-T Recommendation G.728, "Coding of speech at
+   16 kbit/s using low-delay code excited linear prediction".
+
+   A G.278 encoder translates 5 consecutive audio samples into a 10-bit
+   codebook index, resulting in a bit rate of 16 kb/s for audio sampled
+   at 8,000 samples per second.  The group of five consecutive samples
+   is called a vector.  Four consecutive vectors, labeled V1 to V4
+   (where V1 is to be played first by the receiver), build one G.728
+   frame.  The four vectors of 40 bits are packed into 5 octets, labeled
+   B1 through B5.  B1 SHALL be placed first in the RTP packet.
+
+   Referring to the figure below, the principle for bit order is
+   "maintenance of bit significance".  Bits from an older vector are
+   more significant than bits from newer vectors.  The MSB of the frame
+   goes to the MSB of B1 and the LSB of the frame goes to LSB of B5.
+
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 19]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+                   1         2         3        3
+         0         0         0         0        9
+         ++++++++++++++++++++++++++++++++++++++++
+         <---V1---><---V2---><---V3---><---V4---> vectors
+         <--B1--><--B2--><--B3--><--B4--><--B5--> octets
+         <------------- frame 1 ---------------->
+
+   In particular, B1 contains the eight most significant bits of V1,
+   with the MSB of V1 being the MSB of B1.  B2 contains the two least
+   significant bits of V1, the more significant of the two in its MSB,
+   and the six most significant bits of V2.  B1 SHALL be placed first in
+   the RTP packet and B5 last.
+
+4.5.6 G729
+
+   G729 is specified in ITU-T Recommendation G.729, "Coding of speech at
+   8 kbit/s using conjugate structure-algebraic code excited linear
+   prediction (CS-ACELP)".  A reduced-complexity version of the G.729
+   algorithm is specified in Annex A to Rec. G.729.  The speech coding
+   algorithms in the main body of G.729 and in G.729 Annex A are fully
+   interoperable with each other, so there is no need to further
+   distinguish between them.  An implementation that signals or accepts
+   use of G729 payload format may implement either G.729 or G.729A
+   unless restricted by additional signaling specified elsewhere related
+   specifically to the encoding rather than the payload format.  The
+   G.729 and G.729 Annex A codecs were optimized to represent speech
+   with high quality, where G.729 Annex A trades some speech quality for
+   an approximate 50% complexity reduction [10].  See the next Section
+   (4.5.7) for other data rates added in later G.729 Annexes.  For all
+   data rates, the sampling frequency (and RTP timestamp clock rate) is
+   8,000 Hz.
+
+   A voice activity detector (VAD) and comfort noise generator (CNG)
+   algorithm in Annex B of G.729 is RECOMMENDED for digital simultaneous
+   voice and data applications and can be used in conjunction with G.729
+   or G.729 Annex A.  A G.729 or G.729 Annex A frame contains 10 octets,
+   while the G.729 Annex B comfort noise frame occupies 2 octets.
+   Receivers MUST accept comfort noise frames if restriction of their
+   use has not been signaled.  The MIME registration for G729 in RFC
+   3555 [7] specifies a parameter that MAY be used with MIME or SDP to
+   restrict the use of comfort noise frames.
+
+   A G729 RTP packet may consist of zero or more G.729 or G.729 Annex A
+   frames, followed by zero or one G.729 Annex B frames.  The presence
+   of a comfort noise frame can be deduced from the length of the RTP
+   payload.  The default packetization interval is 20 ms (two frames),
+   but in some situations it may be desirable to send 10 ms packets.  An
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 20]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+   example would be a transition from speech to comfort noise in the
+   first 10 ms of the packet.  For some applications, a longer
+   packetization interval may be required to reduce the packet rate.
+
+       0                   1                   2                   3
+       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |L|      L1     |    L2   |    L3   |       P1      |P|    C1   |
+      |0|             |         |         |               |0|         |
+      | |0 1 2 3 4 5 6|0 1 2 3 4|0 1 2 3 4|0 1 2 3 4 5 6 7| |0 1 2 3 4|
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |       C1      |  S1   | GA1 |  GB1  |    P2   |      C2       |
+      |          1 1 1|       |     |       |         |               |
+      |5 6 7 8 9 0 1 2|0 1 2 3|0 1 2|0 1 2 3|0 1 2 3 4|0 1 2 3 4 5 6 7|
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |   C2    |  S2   | GA2 |  GB2  |
+      |    1 1 1|       |     |       |
+      |8 9 0 1 2|0 1 2 3|0 1 2|0 1 2 3|
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+                    Figure 4: G.729 and G.729A bit packing
+
+   The transmitted parameters of a G.729/G.729A 10-ms frame, consisting
+   of 80 bits, are defined in Recommendation G.729, Table 8/G.729.  The
+   mapping of the these parameters is given below in Fig. 4.  The
+   diagrams show the bit packing in "network byte order", also known as
+   big-endian order.  The bits of each 32-bit word are numbered 0 to 31,
+   with the most significant bit on the left and numbered 0.  The octets
+   (bytes) of each word are transmitted most significant octet first.
+   The bits of each data field are numbered in the order as produced by
+   the G.729 C code reference implementation.
+
+   The packing of the G.729 Annex B comfort noise frame is shown in Fig.
+   5.
+
+          0                   1
+          0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+         |L|  LSF1   |  LSF2 |   GAIN  |R|
+         |S|         |       |         |E|
+         |F|         |       |         |S|
+         |0|0 1 2 3 4|0 1 2 3|0 1 2 3 4|V|    RESV = Reserved (zero)
+         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+                       Figure 5: G.729 Annex B bit packing
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 21]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+4.5.7 G729D and G729E
+
+   Annexes D and E to ITU-T Recommendation G.729 provide additional data
+   rates.  Because the data rate is not signaled in the bitstream, the
+   different data rates are given distinct RTP encoding names which are
+   mapped to distinct payload type numbers.  G729D indicates a 6.4
+   kbit/s coding mode (G.729 Annex D, for momentary reduction in channel
+   capacity), while G729E indicates an 11.8 kbit/s mode (G.729 Annex E,
+   for improved performance with a wide range of narrow-band input
+   signals, e.g., music and background noise).  Annex E has two
+   operating modes, backward adaptive and forward adaptive, which are
+   signaled by the first two bits in each frame (the most significant
+   two bits of the first octet).
+
+   The voice activity detector (VAD) and comfort noise generator (CNG)
+   algorithm specified in Annex B of G.729 may be used with Annex D and
+   Annex E frames in addition to G.729 and G.729 Annex A frames.  The
+   algorithm details for the operation of Annexes D and E with the Annex
+   B CNG are specified in G.729 Annexes F and G.  Note that Annexes F
+   and G do not introduce any new encodings.  Receivers MUST accept
+   comfort noise frames if restriction of their use has not been
+   signaled.  The MIME registrations for G729D and G729E in RFC 3555 [7]
+   specify a parameter that MAY be used with MIME or SDP to restrict the
+   use of comfort noise frames.
+
+   For G729D, an RTP packet may consist of zero or more G.729 Annex D
+   frames, followed by zero or one G.729 Annex B frame.  Similarly, for
+   G729E, an RTP packet may consist of zero or more G.729 Annex E
+   frames, followed by zero or one G.729 Annex B frame.  The presence of
+   a comfort noise frame can be deduced from the length of the RTP
+   payload.
+
+   A single RTP packet must contain frames of only one data rate,
+   optionally followed by one comfort noise frame.  The data rate may be
+   changed from packet to packet by changing the payload type number.
+   G.729 Annexes D, E and H describe what the encoding and decoding
+   algorithms must do to accommodate a change in data rate.
+
+   For G729D, the bits of a G.729 Annex D frame are formatted as shown
+   below in Fig. 6 (cf.  Table D.1/G.729).  The frame length is 64 bits.
+
+
+
+
+
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 22]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+       0                   1                   2                   3
+       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |L|      L1     |    L2   |    L3   |        P1     |     C1    |
+      |0|             |         |         |               |           |
+      | |0 1 2 3 4 5 6|0 1 2 3 4|0 1 2 3 4|0 1 2 3 4 5 6 7|0 1 2 3 4 5|
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      | C1  |S1 | GA1 | GB1 |  P2   |        C2       |S2 | GA2 | GB2 |
+      |     |   |     |     |       |                 |   |     |     |
+      |6 7 8|0 1|0 1 2|0 1 2|0 1 2 3|0 1 2 3 4 5 6 7 8|0 1|0 1 2|0 1 2|
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+                     Figure 6: G.729 Annex D bit packing
+
+   The net bit rate for the G.729 Annex E algorithm is 11.8 kbit/s and a
+   total of 118 bits are used.  Two bits are appended as "don't care"
+   bits to complete an integer number of octets for the frame.  For
+   G729E, the bits of a data frame are formatted as shown in the next
+   two diagrams (cf. Table E.1/G.729).  The fields for the G729E forward
+   adaptive mode are packed as shown in Fig. 7.
+
+       0                   1                   2                   3
+       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |0 0|L|      L1     |    L2   |    L3   |        P1     |P| C0_1|
+      |   |0|             |         |         |               |0|     |
+      |   | |0 1 2 3 4 5 6|0 1 2 3 4|0 1 2 3 4|0 1 2 3 4 5 6 7| |0 1 2|
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |       |   C1_1      |     C2_1    |   C3_1      |    C4_1     |
+      |       |             |             |             |             |
+      |3 4 5 6|0 1 2 3 4 5 6|0 1 2 3 4 5 6|0 1 2 3 4 5 6|0 1 2 3 4 5 6|
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      | GA1 |  GB1  |    P2   |   C0_2      |     C1_2    |   C2_2    |
+      |     |       |         |             |             |           |
+      |0 1 2|0 1 2 3|0 1 2 3 4|0 1 2 3 4 5 6|0 1 2 3 4 5 6|0 1 2 3 4 5|
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      | |    C3_2     |     C4_2    | GA2 | GB2   |DC |
+      | |             |             |     |       |   |
+      |6|0 1 2 3 4 5 6|0 1 2 3 4 5 6|0 1 2|0 1 2 3|0 1|
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+         Figure 7: G.729 Annex E (forward adaptive mode) bit packing
+
+   The fields for the G729E backward adaptive mode are packed as shown
+   in Fig. 8.
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 23]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+       0                   1                   2                   3
+       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |1 1|       P1      |P|       C0_1              |     C1_1      |
+      |   |               |0|                    1 1 1|               |
+      |   |0 1 2 3 4 5 6 7|0|0 1 2 3 4 5 6 7 8 9 0 1 2|0 1 2 3 4 5 6 7|
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |   |  C2_1       | C3_1        | C4_1        |GA1  | GB1   |P2 |
+      |   |             |             |             |     |       |   |
+      |8 9|0 1 2 3 4 5 6|0 1 2 3 4 5 6|0 1 2 3 4 5 6|0 1 2|0 1 2 3|0 1|
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |     |          C0_2           |       C1_2        |    C2_2   |
+      |     |                    1 1 1|                   |           |
+      |2 3 4|0 1 2 3 4 5 6 7 8 9 0 1 2|0 1 2 3 4 5 6 7 8 9|0 1 2 3 4 5|
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      | |    C3_2     |     C4_2    | GA2 | GB2   |DC |
+      | |             |             |     |       |   |
+      |6|0 1 2 3 4 5 6|0 1 2 3 4 5 6|0 1 2|0 1 2 3|0 1|
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+         Figure 8: G.729 Annex E (backward adaptive mode) bit packing
+
+4.5.8 GSM
+
+   GSM (Group Speciale Mobile) denotes the European GSM 06.10 standard
+   for full-rate speech transcoding, ETS 300 961, which is based on
+   RPE/LTP (residual pulse excitation/long term prediction) coding at a
+   rate of 13 kb/s [11,12,13].  The text of the standard can be obtained
+   from:
+
+   ETSI (European Telecommunications Standards Institute)
+   ETSI Secretariat: B.P.152
+   F-06561 Valbonne Cedex
+   France
+   Phone: +33 92 94 42 00
+   Fax:   +33 93 65 47 16
+
+   Blocks of 160 audio samples are compressed into 33 octets, for an
+   effective data rate of 13,200 b/s.
+
+4.5.8.1  General Packaging Issues
+
+   The GSM standard (ETS 300 961) specifies the bit stream produced by
+   the codec, but does not specify how these bits should be packed for
+   transmission.  The packetization specified here has subsequently been
+   adopted in ETSI Technical Specification TS 101 318.  Some software
+   implementations of the GSM codec use a different packing than that
+   specified here.
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 24]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+               field  field name  bits  field  field name  bits
+               ________________________________________________
+               1      LARc[0]     6     39     xmc[22]     3
+               2      LARc[1]     6     40     xmc[23]     3
+               3      LARc[2]     5     41     xmc[24]     3
+               4      LARc[3]     5     42     xmc[25]     3
+               5      LARc[4]     4     43     Nc[2]       7
+               6      LARc[5]     4     44     bc[2]       2
+               7      LARc[6]     3     45     Mc[2]       2
+               8      LARc[7]     3     46     xmaxc[2]    6
+               9      Nc[0]       7     47     xmc[26]     3
+               10     bc[0]       2     48     xmc[27]     3
+               11     Mc[0]       2     49     xmc[28]     3
+               12     xmaxc[0]    6     50     xmc[29]     3
+               13     xmc[0]      3     51     xmc[30]     3
+               14     xmc[1]      3     52     xmc[31]     3
+               15     xmc[2]      3     53     xmc[32]     3
+               16     xmc[3]      3     54     xmc[33]     3
+               17     xmc[4]      3     55     xmc[34]     3
+               18     xmc[5]      3     56     xmc[35]     3
+               19     xmc[6]      3     57     xmc[36]     3
+               20     xmc[7]      3     58     xmc[37]     3
+               21     xmc[8]      3     59     xmc[38]     3
+               22     xmc[9]      3     60     Nc[3]       7
+               23     xmc[10]     3     61     bc[3]       2
+               24     xmc[11]     3     62     Mc[3]       2
+               25     xmc[12]     3     63     xmaxc[3]    6
+               26     Nc[1]       7     64     xmc[39]     3
+               27     bc[1]       2     65     xmc[40]     3
+               28     Mc[1]       2     66     xmc[41]     3
+               29     xmaxc[1]    6     67     xmc[42]     3
+               30     xmc[13]     3     68     xmc[43]     3
+               31     xmc[14]     3     69     xmc[44]     3
+               32     xmc[15]     3     70     xmc[45]     3
+               33     xmc[16]     3     71     xmc[46]     3
+               34     xmc[17]     3     72     xmc[47]     3
+               35     xmc[18]     3     73     xmc[48]     3
+               36     xmc[19]     3     74     xmc[49]     3
+               37     xmc[20]     3     75     xmc[50]     3
+               38     xmc[21]     3     76     xmc[51]     3
+
+                      Table 2: Ordering of GSM variables
+
+
+
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 25]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+   Octet  Bit 0   Bit 1   Bit 2   Bit 3   Bit 4   Bit 5   Bit 6   Bit 7
+   _____________________________________________________________________
+       0    1       1       0       1    LARc0.0 LARc0.1 LARc0.2 LARc0.3
+       1 LARc0.4 LARc0.5 LARc1.0 LARc1.1 LARc1.2 LARc1.3 LARc1.4 LARc1.5
+       2 LARc2.0 LARc2.1 LARc2.2 LARc2.3 LARc2.4 LARc3.0 LARc3.1 LARc3.2
+       3 LARc3.3 LARc3.4 LARc4.0 LARc4.1 LARc4.2 LARc4.3 LARc5.0 LARc5.1
+       4 LARc5.2 LARc5.3 LARc6.0 LARc6.1 LARc6.2 LARc7.0 LARc7.1 LARc7.2
+       5  Nc0.0   Nc0.1   Nc0.2   Nc0.3   Nc0.4   Nc0.5   Nc0.6  bc0.0
+       6  bc0.1   Mc0.0   Mc0.1  xmaxc00 xmaxc01 xmaxc02 xmaxc03 xmaxc04
+       7 xmaxc05 xmc0.0  xmc0.1  xmc0.2  xmc1.0  xmc1.1  xmc1.2  xmc2.0
+       8 xmc2.1  xmc2.2  xmc3.0  xmc3.1  xmc3.2  xmc4.0  xmc4.1  xmc4.2
+       9 xmc5.0  xmc5.1  xmc5.2  xmc6.0  xmc6.1  xmc6.2  xmc7.0  xmc7.1
+      10 xmc7.2  xmc8.0  xmc8.1  xmc8.2  xmc9.0  xmc9.1  xmc9.2  xmc10.0
+      11 xmc10.1 xmc10.2 xmc11.0 xmc11.1 xmc11.2 xmc12.0 xmc12.1 xcm12.2
+      12  Nc1.0   Nc1.1   Nc1.2   Nc1.3   Nc1.4   Nc1.5   Nc1.6   bc1.0
+      13  bc1.1   Mc1.0   Mc1.1  xmaxc10 xmaxc11 xmaxc12 xmaxc13 xmaxc14
+      14 xmax15  xmc13.0 xmc13.1 xmc13.2 xmc14.0 xmc14.1 xmc14.2 xmc15.0
+      15 xmc15.1 xmc15.2 xmc16.0 xmc16.1 xmc16.2 xmc17.0 xmc17.1 xmc17.2
+      16 xmc18.0 xmc18.1 xmc18.2 xmc19.0 xmc19.1 xmc19.2 xmc20.0 xmc20.1
+      17 xmc20.2 xmc21.0 xmc21.1 xmc21.2 xmc22.0 xmc22.1 xmc22.2 xmc23.0
+      18 xmc23.1 xmc23.2 xmc24.0 xmc24.1 xmc24.2 xmc25.0 xmc25.1 xmc25.2
+      19  Nc2.0   Nc2.1   Nc2.2   Nc2.3   Nc2.4   Nc2.5   Nc2.6   bc2.0
+      20  bc2.1   Mc2.0   Mc2.1  xmaxc20 xmaxc21 xmaxc22 xmaxc23 xmaxc24
+      21 xmaxc25 xmc26.0 xmc26.1 xmc26.2 xmc27.0 xmc27.1 xmc27.2 xmc28.0
+      22 xmc28.1 xmc28.2 xmc29.0 xmc29.1 xmc29.2 xmc30.0 xmc30.1 xmc30.2
+      23 xmc31.0 xmc31.1 xmc31.2 xmc32.0 xmc32.1 xmc32.2 xmc33.0 xmc33.1
+      24 xmc33.2 xmc34.0 xmc34.1 xmc34.2 xmc35.0 xmc35.1 xmc35.2 xmc36.0
+      25 Xmc36.1 xmc36.2 xmc37.0 xmc37.1 xmc37.2 xmc38.0 xmc38.1 xmc38.2
+      26  Nc3.0   Nc3.1   Nc3.2   Nc3.3   Nc3.4   Nc3.5   Nc3.6   bc3.0
+      27  bc3.1   Mc3.0   Mc3.1  xmaxc30 xmaxc31 xmaxc32 xmaxc33 xmaxc34
+      28 xmaxc35 xmc39.0 xmc39.1 xmc39.2 xmc40.0 xmc40.1 xmc40.2 xmc41.0
+      29 xmc41.1 xmc41.2 xmc42.0 xmc42.1 xmc42.2 xmc43.0 xmc43.1 xmc43.2
+      30 xmc44.0 xmc44.1 xmc44.2 xmc45.0 xmc45.1 xmc45.2 xmc46.0 xmc46.1
+      31 xmc46.2 xmc47.0 xmc47.1 xmc47.2 xmc48.0 xmc48.1 xmc48.2 xmc49.0
+      32 xmc49.1 xmc49.2 xmc50.0 xmc50.1 xmc50.2 xmc51.0 xmc51.1 xmc51.2
+
+                        Table 3: GSM payload format
+
+   In the GSM packing used by RTP, the bits SHALL be packed beginning
+   from the most significant bit.  Every 160 sample GSM frame is coded
+   into one 33 octet (264 bit) buffer.  Every such buffer begins with a
+   4 bit signature (0xD), followed by the MSB encoding of the fields of
+   the frame.  The first octet thus contains 1101 in the 4 most
+   significant bits (0-3) and the 4 most significant bits of F1 (0-3) in
+   the 4 least significant bits (4-7).  The second octet contains the 2
+   least significant bits of F1 in bits 0-1, and F2 in bits 2-7, and so
+   on.  The order of the fields in the frame is described in Table 2.
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 26]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+4.5.8.2   GSM Variable Names and Numbers
+
+   In the RTP encoding we have the bit pattern described in Table 3,
+   where F.i signifies the ith bit of the field F, bit 0 is the most
+   significant bit, and the bits of every octet are numbered from 0 to 7
+   from most to least significant.
+
+4.5.9 GSM-EFR
+
+   GSM-EFR denotes GSM 06.60 enhanced full rate speech transcoding,
+   specified in ETS 300 726 which is available from ETSI at the address
+   given in Section 4.5.8.  This codec has a frame length of 244 bits.
+   For transmission in RTP, each codec frame is packed into a 31 octet
+   (248 bit) buffer beginning with a 4-bit signature 0xC in a manner
+   similar to that specified here for the original GSM 06.10 codec.  The
+   packing is specified in ETSI Technical Specification TS 101 318.
+
+4.5.10 L8
+
+   L8 denotes linear audio data samples, using 8-bits of precision with
+   an offset of 128, that is, the most negative signal is encoded as
+   zero.
+
+4.5.11 L16
+
+   L16 denotes uncompressed audio data samples, using 16-bit signed
+   representation with 65,535 equally divided steps between minimum and
+   maximum signal level, ranging from -32,768 to 32,767.  The value is
+   represented in two's complement notation and transmitted in network
+   byte order (most significant byte first).
+
+   The MIME registration for L16 in RFC 3555 [7] specifies parameters
+   that MAY be used with MIME or SDP to indicate that analog pre-
+   emphasis was applied to the signal before quantization or to indicate
+   that a multiple-channel audio stream follows a different channel
+   ordering convention than is specified in Section 4.1.
+
+4.5.12 LPC
+
+   LPC designates an experimental linear predictive encoding contributed
+   by Ron Frederick, which is based on an implementation written by Ron
+   Zuckerman posted to the Usenet group comp.dsp on June 26, 1992.  The
+   codec generates 14 octets for every frame.  The framesize is set to
+   20 ms, resulting in a bit rate of 5,600 b/s.
+
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 27]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+4.5.13 MPA
+
+   MPA denotes MPEG-1 or MPEG-2 audio encapsulated as elementary
+   streams.  The encoding is defined in ISO standards ISO/IEC 11172-3
+   and 13818-3.  The encapsulation is specified in RFC 2250 [14].
+
+   The encoding may be at any of three levels of complexity, called
+   Layer I, II and III.  The selected layer as well as the sampling rate
+   and channel count are indicated in the payload.  The RTP timestamp
+   clock rate is always 90,000, independent of the sampling rate.
+   MPEG-1 audio supports sampling rates of 32, 44.1, and 48 kHz (ISO/IEC
+   11172-3, section 1.1; "Scope").  MPEG-2 supports sampling rates of
+   16, 22.05 and 24 kHz.  The number of samples per frame is fixed, but
+   the frame size will vary with the sampling rate and bit rate.
+
+   The MIME registration for MPA in RFC 3555 [7] specifies parameters
+   that MAY be used with MIME or SDP to restrict the selection of layer,
+   channel count, sampling rate, and bit rate.
+
+4.5.14 PCMA and PCMU
+
+   PCMA and PCMU are specified in ITU-T Recommendation G.711.  Audio
+   data is encoded as eight bits per sample, after logarithmic scaling.
+   PCMU denotes mu-law scaling, PCMA A-law scaling.  A detailed
+   description is given by Jayant and Noll [15].  Each G.711 octet SHALL
+   be octet-aligned in an RTP packet.  The sign bit of each G.711 octet
+   SHALL correspond to the most significant bit of the octet in the RTP
+   packet (i.e., assuming the G.711 samples are handled as octets on the
+   host machine, the sign bit SHALL be the most significant bit of the
+   octet as defined by the host machine format).  The 56 kb/s and 48
+   kb/s modes of G.711 are not applicable to RTP, since PCMA and PCMU
+   MUST always be transmitted as 8-bit samples.
+
+   See Section 4.1 regarding silence suppression.
+
+4.5.15 QCELP
+
+   The Electronic Industries Association (EIA) & Telecommunications
+   Industry Association (TIA) standard IS-733, "TR45: High Rate Speech
+   Service Option for Wideband Spread Spectrum Communications Systems",
+   defines the QCELP audio compression algorithm for use in wireless
+   CDMA applications.  The QCELP CODEC compresses each 20 milliseconds
+   of 8,000 Hz, 16-bit sampled input speech into one of four different
+   size output frames:  Rate 1 (266 bits), Rate 1/2 (124 bits), Rate 1/4
+   (54 bits) or Rate 1/8 (20 bits).  For typical speech patterns, this
+   results in an average output of 6.8 kb/s for normal mode and 4.7 kb/s
+   for reduced rate mode.  The packetization of the QCELP audio codec is
+   described in [16].
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 28]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+4.5.16 RED
+
+   The redundant audio payload format "RED" is specified by RFC 2198
+   [17].  It defines a means by which multiple redundant copies of an
+   audio packet may be transmitted in a single RTP stream.  Each packet
+   in such a stream contains, in addition to the audio data for that
+   packetization interval, a (more heavily compressed) copy of the data
+   from a previous packetization interval.  This allows an approximation
+   of the data from lost packets to be recovered upon decoding of a
+   subsequent packet, giving much improved sound quality when compared
+   with silence substitution for lost packets.
+
+4.5.17 VDVI
+
+   VDVI is a variable-rate version of DVI4, yielding speech bit rates of
+   between 10 and 25 kb/s.  It is specified for single-channel operation
+   only.  Samples are packed into octets starting at the most-
+   significant bit.  The last octet is padded with 1 bits if the last
+   sample does not fill the last octet.  This padding is distinct from
+   the valid codewords.  The receiver needs to detect the padding
+   because there is no explicit count of samples in the packet.
+
+   It uses the following encoding:
+
+            DVI4 codeword  VDVI bit pattern
+            _______________________________
+                        0  00
+                        1  010
+                        2  1100
+                        3  11100
+                        4  111100
+                        5  1111100
+                        6  11111100
+                        7  11111110
+                        8  10
+                        9  011
+                       10  1101
+                       11  11101
+                       12  111101
+                       13  1111101
+                       14  11111101
+                       15  11111111
+
+
+
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 29]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+5.  Video
+
+   The following sections describe the video encodings that are defined
+   in this memo and give their abbreviated names used for
+   identification.  These video encodings and their payload types are
+   listed in Table 5.
+
+   All of these video encodings use an RTP timestamp frequency of 90,000
+   Hz, the same as the MPEG presentation time stamp frequency.  This
+   frequency yields exact integer timestamp increments for the typical
+   24 (HDTV), 25 (PAL), and 29.97 (NTSC) and 30 Hz (HDTV) frame rates
+   and 50, 59.94 and 60 Hz field rates.  While 90 kHz is the RECOMMENDED
+   rate for future video encodings used within this profile, other rates
+   MAY be used.  However, it is not sufficient to use the video frame
+   rate (typically between 15 and 30 Hz) because that does not provide
+   adequate resolution for typical synchronization requirements when
+   calculating the RTP timestamp corresponding to the NTP timestamp in
+   an RTCP SR packet.  The timestamp resolution MUST also be sufficient
+   for the jitter estimate contained in the receiver reports.
+
+   For most of these video encodings, the RTP timestamp encodes the
+   sampling instant of the video image contained in the RTP data packet.
+   If a video image occupies more than one packet, the timestamp is the
+   same on all of those packets.  Packets from different video images
+   are distinguished by their different timestamps.
+
+   Most of these video encodings also specify that the marker bit of the
+   RTP header SHOULD be set to one in the last packet of a video frame
+   and otherwise set to zero.  Thus, it is not necessary to wait for a
+   following packet with a different timestamp to detect that a new
+   frame should be displayed.
+
+5.1  CelB
+
+   The CELL-B encoding is a proprietary encoding proposed by Sun
+   Microsystems.  The byte stream format is described in RFC 2029 [18].
+
+5.2 JPEG
+
+   The encoding is specified in ISO Standards 10918-1 and 10918-2.  The
+   RTP payload format is as specified in RFC 2435 [19].
+
+5.3 H261
+
+   The encoding is specified in ITU-T Recommendation H.261, "Video codec
+   for audiovisual services at p x 64 kbit/s".  The packetization and
+   RTP-specific properties are described in RFC 2032 [20].
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 30]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+5.4 H263
+
+   The encoding is specified in the 1996 version of ITU-T Recommendation
+   H.263, "Video coding for low bit rate communication".  The
+   packetization and RTP-specific properties are described in RFC 2190
+   [21].  The H263-1998 payload format is RECOMMENDED over this one for
+   use by new implementations.
+
+5.5 H263-1998
+
+   The encoding is specified in the 1998 version of ITU-T Recommendation
+   H.263, "Video coding for low bit rate communication".  The
+   packetization and RTP-specific properties are described in RFC 2429
+   [22].  Because the 1998 version of H.263 is a superset of the 1996
+   syntax, this payload format can also be used with the 1996 version of
+   H.263, and is RECOMMENDED for this use by new implementations.  This
+   payload format does not replace RFC 2190, which continues to be used
+   by existing implementations, and may be required for backward
+   compatibility in new implementations.  Implementations using the new
+   features of the 1998 version of H.263 MUST use the payload format
+   described in RFC 2429.
+
+5.6 MPV
+
+   MPV designates the use of MPEG-1 and MPEG-2 video encoding elementary
+   streams as specified in ISO Standards ISO/IEC 11172 and 13818-2,
+   respectively.  The RTP payload format is as specified in RFC 2250
+   [14], Section 3.
+
+   The MIME registration for MPV in RFC 3555 [7] specifies a parameter
+   that MAY be used with MIME or SDP to restrict the selection of the
+   type of MPEG video.
+
+5.7 MP2T
+
+   MP2T designates the use of MPEG-2 transport streams, for either audio
+   or video.  The RTP payload format is described in RFC 2250 [14],
+   Section 2.
+
+
+
+
+
+
+
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 31]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+5.8 nv
+
+   The encoding is implemented in the program `nv', version 4, developed
+   at Xerox PARC by Ron Frederick.  Further information is available
+   from the author:
+
+   Ron Frederick
+   Blue Coat Systems Inc.
+   650 Almanor Avenue
+   Sunnyvale, CA 94085
+   United States
+   EMail: ronf@bluecoat.com
+
+6.  Payload Type Definitions
+
+   Tables 4 and 5 define this profile's static payload type values for
+   the PT field of the RTP data header.  In addition, payload type
+   values in the range 96-127 MAY be defined dynamically through a
+   conference control protocol, which is beyond the scope of this
+   document.  For example, a session directory could specify that for a
+   given session, payload type 96 indicates PCMU encoding, 8,000 Hz
+   sampling rate, 2 channels.  Entries in Tables 4 and 5 with payload
+   type "dyn" have no static payload type assigned and are only used
+   with a dynamic payload type.  Payload type 2 was assigned to G721 in
+   RFC 1890 and to its equivalent successor G726-32 in draft versions of
+   this specification, but its use is now deprecated and that static
+   payload type is marked reserved due to conflicting use for the
+   payload formats G726-32 and AAL2-G726-32 (see Section 4.5.4).
+   Payload type 13 indicates the Comfort Noise (CN) payload format
+   specified in RFC 3389 [9].  Payload type 19 is marked "reserved"
+   because some draft versions of this specification assigned that
+   number to an earlier version of the comfort noise payload format.
+   The payload type range 72-76 is marked "reserved" so that RTCP and
+   RTP packets can be reliably distinguished (see Section "Summary of
+   Protocol Constants" of the RTP protocol specification).
+
+   The payload types currently defined in this profile are assigned to
+   exactly one of three categories or media types:  audio only, video
+   only and those combining audio and video.  The media types are marked
+   in Tables 4 and 5 as "A", "V" and "AV", respectively.  Payload types
+   of different media types SHALL NOT be interleaved or multiplexed
+   within a single RTP session, but multiple RTP sessions MAY be used in
+   parallel to send multiple media types.  An RTP source MAY change
+   payload types within the same media type during a session.  See the
+   section "Multiplexing RTP Sessions" of RFC 3550 for additional
+   explanation.
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 32]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+               PT   encoding    media type  clock rate   channels
+                    name                    (Hz)
+               ___________________________________________________
+               0    PCMU        A            8,000       1
+               1    reserved    A
+               2    reserved    A
+               3    GSM         A            8,000       1
+               4    G723        A            8,000       1
+               5    DVI4        A            8,000       1
+               6    DVI4        A           16,000       1
+               7    LPC         A            8,000       1
+               8    PCMA        A            8,000       1
+               9    G722        A            8,000       1
+               10   L16         A           44,100       2
+               11   L16         A           44,100       1
+               12   QCELP       A            8,000       1
+               13   CN          A            8,000       1
+               14   MPA         A           90,000       (see text)
+               15   G728        A            8,000       1
+               16   DVI4        A           11,025       1
+               17   DVI4        A           22,050       1
+               18   G729        A            8,000       1
+               19   reserved    A
+               20   unassigned  A
+               21   unassigned  A
+               22   unassigned  A
+               23   unassigned  A
+               dyn  G726-40     A            8,000       1
+               dyn  G726-32     A            8,000       1
+               dyn  G726-24     A            8,000       1
+               dyn  G726-16     A            8,000       1
+               dyn  G729D       A            8,000       1
+               dyn  G729E       A            8,000       1
+               dyn  GSM-EFR     A            8,000       1
+               dyn  L8          A            var.        var.
+               dyn  RED         A                        (see text)
+               dyn  VDVI        A            var.        1
+
+               Table 4: Payload types (PT) for audio encodings
+
+
+
+
+
+
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 33]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+               PT      encoding    media type  clock rate
+                       name                    (Hz)
+               _____________________________________________
+               24      unassigned  V
+               25      CelB        V           90,000
+               26      JPEG        V           90,000
+               27      unassigned  V
+               28      nv          V           90,000
+               29      unassigned  V
+               30      unassigned  V
+               31      H261        V           90,000
+               32      MPV         V           90,000
+               33      MP2T        AV          90,000
+               34      H263        V           90,000
+               35-71   unassigned  ?
+               72-76   reserved    N/A         N/A
+               77-95   unassigned  ?
+               96-127  dynamic     ?
+               dyn     H263-1998   V           90,000
+
+               Table 5: Payload types (PT) for video and combined
+                        encodings
+
+   Session participants agree through mechanisms beyond the scope of
+   this specification on the set of payload types allowed in a given
+   session.  This set MAY, for example, be defined by the capabilities
+   of the applications used, negotiated by a conference control protocol
+   or established by agreement between the human participants.
+
+   Audio applications operating under this profile SHOULD, at a minimum,
+   be able to send and/or receive payload types 0 (PCMU) and 5 (DVI4).
+   This allows interoperability without format negotiation and ensures
+   successful negotiation with a conference control protocol.
+
+7.  RTP over TCP and Similar Byte Stream Protocols
+
+   Under special circumstances, it may be necessary to carry RTP in
+   protocols offering a byte stream abstraction, such as TCP, possibly
+   multiplexed with other data.  The application MUST define its own
+   method of delineating RTP and RTCP packets (RTSP [23] provides an
+   example of such an encapsulation specification).
+
+8.  Port Assignment
+
+   As specified in the RTP protocol definition, RTP data SHOULD be
+   carried on an even UDP port number and the corresponding RTCP packets
+   SHOULD be carried on the next higher (odd) port number.
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 34]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+   Applications operating under this profile MAY use any such UDP port
+   pair.  For example, the port pair MAY be allocated randomly by a
+   session management program.  A single fixed port number pair cannot
+   be required because multiple applications using this profile are
+   likely to run on the same host, and there are some operating systems
+   that do not allow multiple processes to use the same UDP port with
+   different multicast addresses.
+
+   However, port numbers 5004 and 5005 have been registered for use with
+   this profile for those applications that choose to use them as the
+   default pair.  Applications that operate under multiple profiles MAY
+   use this port pair as an indication to select this profile if they
+   are not subject to the constraint of the previous paragraph.
+   Applications need not have a default and MAY require that the port
+   pair be explicitly specified.  The particular port numbers were
+   chosen to lie in the range above 5000 to accommodate port number
+   allocation practice within some versions of the Unix operating
+   system, where port numbers below 1024 can only be used by privileged
+   processes and port numbers between 1024 and 5000 are automatically
+   assigned by the operating system.
+
+9.  Changes from RFC 1890
+
+   This RFC revises RFC 1890.  It is mostly backwards-compatible with
+   RFC 1890 except for functions removed because two interoperable
+   implementations were not found.  The additions to RFC 1890 codify
+   existing practice in the use of payload formats under this profile.
+   Since this profile may be used without using any of the payload
+   formats listed here, the addition of new payload formats in this
+   revision does not affect backwards compatibility.  The changes are
+   listed below, categorized into functional and non-functional changes.
+
+   Functional changes:
+
+   o  Section 11, "IANA Considerations" was added to specify the
+      registration of the name for this profile.  That appendix also
+      references a new Section 3 "Registering Additional Encodings"
+      which establishes a policy that no additional registration of
+      static payload types for this profile will be made beyond those
+      added in this revision and included in Tables 4 and 5.  Instead,
+      additional encoding names may be registered as MIME subtypes for
+      binding to dynamic payload types.  Non-normative references were
+      added to RFC 3555 [7] where MIME subtypes for all the listed
+      payload formats are registered, some with optional parameters for
+      use of the payload formats.
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 35]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+   o  Static payload types 4, 16, 17 and 34 were added to incorporate
+      IANA registrations made since the publication of RFC 1890, along
+      with the corresponding payload format descriptions for G723 and
+      H263.
+
+   o  Following working group discussion, static payload types 12 and 18
+      were added along with the corresponding payload format
+      descriptions for QCELP and G729.  Static payload type 13 was
+      assigned to the Comfort Noise (CN) payload format defined in RFC
+      3389.  Payload type 19 was marked reserved because it had been
+      temporarily allocated to an earlier version of Comfort Noise
+      present in some draft revisions of this document.
+
+   o  The payload format for G721 was renamed to G726-32 following the
+      ITU-T renumbering, and the payload format description for G726 was
+      expanded to include the -16, -24 and -40 data rates.  Because of
+      confusion regarding draft revisions of this document, some
+      implementations of these G726 payload formats packed samples into
+      octets starting with the most significant bit rather than the
+      least significant bit as specified here.  To partially resolve
+      this incompatibility, new payload formats named AAL2-G726-16, -24,
+      -32 and -40 will be specified in a separate document (see note in
+      Section 4.5.4), and use of static payload type 2 is deprecated as
+      explained in Section 6.
+
+   o  Payload formats G729D and G729E were added following the ITU-T
+      addition of Annexes D and E to Recommendation G.729.  Listings
+      were added for payload formats GSM-EFR, RED, and H263-1998
+      published in other documents subsequent to RFC 1890.  These
+      additional payload formats are referenced only by dynamic payload
+      type numbers.
+
+   o  The descriptions of the payload formats for G722, G728, GSM, VDVI
+      were expanded.
+
+   o  The payload format for 1016 audio was removed and its static
+      payload type assignment 1 was marked "reserved" because two
+      interoperable implementations were not found.
+
+   o  Requirements for congestion control were added in Section 2.
+
+   o  This profile follows the suggestion in the revised RTP spec that
+      RTCP bandwidth may be specified separately from the session
+      bandwidth and separately for active senders and passive receivers.
+
+   o  The mapping of a user pass-phrase string into an encryption key
+      was deleted from Section 2 because two interoperable
+      implementations were not found.
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 36]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+   o  The "quadrophonic" sample ordering convention for four-channel
+      audio was removed to eliminate an ambiguity as noted in Section
+      4.1.
+
+   Non-functional changes:
+
+   o  In Section 4.1, it is now explicitly stated that silence
+      suppression is allowed for all audio payload formats.  (This has
+      always been the case and derives from a fundamental aspect of
+      RTP's design and the motivations for packet audio, but was not
+      explicit stated before.)  The use of comfort noise is also
+      explained.
+
+   o  In Section 4.1, the requirement level for setting of the marker
+      bit on the first packet after silence for audio was changed from
+      "is" to "SHOULD be", and clarified that the marker bit is set only
+      when packets are intentionally not sent.
+
+   o  Similarly, text was added to specify that the marker bit SHOULD be
+      set to one on the last packet of a video frame, and that video
+      frames are distinguished by their timestamps.
+
+   o  RFC references are added for payload formats published after RFC
+      1890.
+
+   o  The security considerations and full copyright sections were
+      added.
+
+   o  According to Peter Hoddie of Apple, only pre-1994 Macintosh used
+      the 22254.54 rate and none the 11127.27 rate, so the latter was
+      dropped from the discussion of suggested sampling frequencies.
+
+   o  Table 1 was corrected to move some values from the "ms/packet"
+      column to the "default ms/packet" column where they belonged.
+
+   o  Since the Interactive Multimedia Association ceased operations, an
+      alternate resource was provided for a referenced IMA document.
+
+   o  A note has been added for G722 to clarify a discrepancy between
+      the actual sampling rate and the RTP timestamp clock rate.
+
+   o  Small clarifications of the text have been made in several places,
+      some in response to questions from readers.  In particular:
+
+      -  A definition for "media type" is given in Section 1.1 to allow
+         the explanation of multiplexing RTP sessions in Section 6 to be
+         more clear regarding the multiplexing of multiple media.
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 37]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+      -  The explanation of how to determine the number of audio frames
+         in a packet from the length was expanded.
+
+      -  More description of the allocation of bandwidth to SDES items
+         is given.
+
+      -  A note was added that the convention for the order of channels
+         specified in Section 4.1 may be overridden by a particular
+         encoding or payload format specification.
+
+      -  The terms MUST, SHOULD, MAY, etc. are used as defined in RFC
+         2119.
+
+   o  A second author for this document was added.
+
+10. Security Considerations
+
+   Implementations using the profile defined in this specification are
+   subject to the security considerations discussed in the RTP
+   specification [1].  This profile does not specify any different
+   security services.  The primary function of this profile is to list a
+   set of data compression encodings for audio and video media.
+
+   Confidentiality of the media streams is achieved by encryption.
+   Because the data compression used with the payload formats described
+   in this profile is applied end-to-end, encryption may be performed
+   after compression so there is no conflict between the two operations.
+
+   A potential denial-of-service threat exists for data encodings using
+   compression techniques that have non-uniform receiver-end
+   computational load.  The attacker can inject pathological datagrams
+   into the stream which are complex to decode and cause the receiver to
+   be overloaded.
+
+   As with any IP-based protocol, in some circumstances a receiver may
+   be overloaded simply by the receipt of too many packets, either
+   desired or undesired.  Network-layer authentication MAY be used to
+   discard packets from undesired sources, but the processing cost of
+   the authentication itself may be too high.  In a multicast
+   environment, source pruning is implemented in IGMPv3 (RFC 3376) [24]
+   and in multicast routing protocols to allow a receiver to select
+   which sources are allowed to reach it.
+
+
+
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 38]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+11. IANA Considerations
+
+   The RTP specification establishes a registry of profile names for use
+   by higher-level control protocols, such as the Session Description
+   Protocol (SDP), RFC 2327 [6], to refer to transport methods.  This
+   profile registers the name "RTP/AVP".
+
+   Section 3 establishes the policy that no additional registration of
+   static RTP payload types for this profile will be made beyond those
+   added in this document revision and included in Tables 4 and 5.  IANA
+   may reference that section in declining to accept any additional
+   registration requests.  In Tables 4 and 5, note that types 1 and 2
+   have been marked reserved and the set of "dyn" payload types included
+   has been updated.  These changes are explained in Sections 6 and 9.
+
+12.  References
+
+12.1 Normative References
+
+   [1]  Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson,
+        "RTP:  A Transport Protocol for Real-Time Applications", RFC
+        3550, July 2003.
+
+   [2]  Bradner, S., "Key Words for Use in RFCs to Indicate Requirement
+        Levels", BCP 14, RFC 2119, March 1997.
+
+   [3]  Apple Computer, "Audio Interchange File Format AIFF-C", August
+        1991.  (also ftp://ftp.sgi.com/sgi/aiff-c.9.26.91.ps.Z).
+
+12.2 Informative References
+
+   [4]  Braden, R., Clark, D. and S. Shenker, "Integrated Services in
+        the Internet Architecture: an Overview", RFC 1633, June 1994.
+
+   [5]  Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z. and W.
+        Weiss, "An Architecture for Differentiated Service", RFC 2475,
+        December 1998.
+
+   [6]  Handley, M. and V. Jacobson, "SDP: Session Description
+        Protocol", RFC 2327, April 1998.
+
+   [7]  Casner, S. and P. Hoschka, "MIME Type Registration of RTP
+        Payload Types", RFC 3555, July 2003.
+
+   [8]  Freed, N., Klensin, J. and J. Postel, "Multipurpose Internet
+        Mail Extensions (MIME) Part Four: Registration Procedures", BCP
+        13, RFC 2048, November 1996.
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 39]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+   [9]  Zopf, R., "Real-time Transport Protocol (RTP) Payload for
+        Comfort Noise (CN)", RFC 3389, September 2002.
+
+   [10] Deleam, D. and J.-P. Petit, "Real-time implementations of the
+        recent ITU-T low bit rate speech coders on the TI TMS320C54X
+        DSP: results, methodology, and applications", in Proc. of
+        International Conference on Signal Processing, Technology, and
+        Applications (ICSPAT) , (Boston, Massachusetts), pp. 1656--1660,
+        October 1996.
+
+   [11] Mouly, M. and M.-B. Pautet, The GSM system for mobile
+        communications Lassay-les-Chateaux, France: Europe Media
+        Duplication, 1993.
+
+   [12] Degener, J., "Digital Speech Compression", Dr. Dobb's Journal,
+        December 1994.
+
+   [13] Redl, S., Weber, M. and M. Oliphant, An Introduction to GSM
+        Boston: Artech House, 1995.
+
+   [14] Hoffman, D., Fernando, G., Goyal, V. and M. Civanlar, "RTP
+        Payload Format for MPEG1/MPEG2 Video", RFC 2250, January 1998.
+
+   [15] Jayant, N. and P. Noll, Digital Coding of Waveforms--Principles
+        and Applications to Speech and Video Englewood Cliffs, New
+        Jersey: Prentice-Hall, 1984.
+
+   [16] McKay, K., "RTP Payload Format for PureVoice(tm) Audio", RFC
+        2658, August 1999.
+
+   [17] Perkins, C., Kouvelas, I., Hodson, O., Hardman, V., Handley, M.,
+        Bolot, J.-C., Vega-Garcia, A. and S. Fosse-Parisis, "RTP Payload
+        for Redundant Audio Data", RFC 2198, September 1997.
+
+   [18] Speer, M. and D. Hoffman, "RTP Payload Format of Sun's CellB
+        Video Encoding", RFC 2029, October 1996.
+
+   [19] Berc, L., Fenner, W., Frederick, R., McCanne, S. and P. Stewart,
+        "RTP Payload Format for JPEG-Compressed Video", RFC 2435,
+        October 1998.
+
+   [20] Turletti, T. and C. Huitema, "RTP Payload Format for H.261 Video
+        Streams", RFC 2032, October 1996.
+
+   [21] Zhu, C., "RTP Payload Format for H.263 Video Streams", RFC 2190,
+        September 1997.
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 40]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+   [22] Bormann, C., Cline, L., Deisher, G., Gardos, T., Maciocco, C.,
+        Newell, D., Ott, J., Sullivan, G., Wenger, S. and C. Zhu, "RTP
+        Payload Format for the 1998 Version of ITU-T Rec. H.263 Video
+        (H.263+)", RFC 2429, October 1998.
+
+   [23] Schulzrinne, H., Rao, A. and R. Lanphier, "Real Time Streaming
+        Protocol (RTSP)", RFC 2326, April 1998.
+
+   [24] Cain, B., Deering, S., Kouvelas, I., Fenner, B. and A.
+        Thyagarajan, "Internet Group Management Protocol, Version 3",
+        RFC 3376, October 2002.
+
+13. Current Locations of Related Resources
+
+   Note:  Several sections below refer to the ITU-T Software Tool
+   Library (STL).  It is available from the ITU Sales Service, Place des
+   Nations, CH-1211 Geneve 20, Switzerland (also check
+   http://www.itu.int).  The ITU-T STL is covered by a license defined
+   in ITU-T Recommendation G.191, "Software tools for speech and audio
+   coding standardization".
+
+   DVI4
+
+   An archived copy of the document IMA Recommended Practices for
+   Enhancing Digital Audio Compatibility in Multimedia Systems (version
+   3.0), which describes the IMA ADPCM algorithm, is available at:
+
+      http://www.cs.columbia.edu/~hgs/audio/dvi/
+
+   An implementation is available from Jack Jansen at
+
+      ftp://ftp.cwi.nl/local/pub/audio/adpcm.shar
+
+   G722
+
+   An implementation of the G.722 algorithm is available as part of the
+   ITU-T STL, described above.
+
+   G723
+
+   The reference C code implementation defining the G.723.1 algorithm
+   and its Annexes A, B, and C are available as an integral part of
+   Recommendation G.723.1 from the ITU Sales Service, address listed
+   above.  Both the algorithm and C code are covered by a specific
+   license.  The ITU-T Secretariat should be contacted to obtain such
+   licensing information.
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 41]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+   G726
+
+   G726 is specified in the ITU-T Recommendation G.726, "40, 32, 24, and
+   16 kb/s Adaptive Differential Pulse Code Modulation (ADPCM)".  An
+   implementation of the G.726 algorithm is available as part of the
+   ITU-T STL, described above.
+
+   G729
+
+   The reference C code implementation defining the G.729 algorithm and
+   its Annexes A through I are available as an integral part of
+   Recommendation G.729 from the ITU Sales Service, listed above.  Annex
+   I contains the integrated C source code for all G.729 operating
+   modes.  The G.729 algorithm and associated C code are covered by a
+   specific license.  The contact information for obtaining the license
+   is available from the ITU-T Secretariat.
+
+   GSM
+
+   A reference implementation was written by Carsten Bormann and Jutta
+   Degener (then at TU Berlin, Germany).  It is available at
+
+      http://www.dmn.tzi.org/software/gsm/
+
+   Although the RPE-LTP algorithm is not an ITU-T standard, there is a C
+   code implementation of the RPE-LTP algorithm available as part of the
+   ITU-T STL.  The STL implementation is an adaptation of the TU Berlin
+   version.
+
+   LPC
+
+   An implementation is available at
+
+      ftp://parcftp.xerox.com/pub/net-research/lpc.tar.Z
+
+   PCMU, PCMA
+
+   An implementation of these algorithms is available as part of the
+   ITU-T STL, described above.
+
+14. Acknowledgments
+
+   The comments and careful review of Simao Campos, Richard Cox and AVT
+   Working Group participants are gratefully acknowledged.  The GSM
+   description was adopted from the IMTC Voice over IP Forum Service
+   Interoperability Implementation Agreement (January 1997).  Fred Burg
+   and Terry Lyons helped with the G.729 description.
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 42]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+15. Intellectual Property Rights Statement
+
+   The IETF takes no position regarding the validity or scope of any
+   intellectual property or other rights that might be claimed to
+   pertain to the implementation or use of the technology described in
+   this document or the extent to which any license under such rights
+   might or might not be available; neither does it represent that it
+   has made any effort to identify any such rights.  Information on the
+   IETF's procedures with respect to rights in standards-track and
+   standards-related documentation can be found in BCP-11.  Copies of
+   claims of rights made available for publication and any assurances of
+   licenses to be made available, or the result of an attempt made to
+   obtain a general license or permission for the use of such
+   proprietary rights by implementors or users of this specification can
+   be obtained from the IETF Secretariat.
+
+   The IETF invites any interested party to bring to its attention any
+   copyrights, patents or patent applications, or other proprietary
+   rights which may cover technology that may be required to practice
+   this standard.  Please address the information to the IETF Executive
+   Director.
+
+16. Authors' Addresses
+
+   Henning Schulzrinne
+   Department of Computer Science
+   Columbia University
+   1214 Amsterdam Avenue
+   New York, NY 10027
+   United States
+
+   EMail: schulzrinne@cs.columbia.edu
+
+
+   Stephen L. Casner
+   Packet Design
+   3400 Hillview Avenue, Building 3
+   Palo Alto, CA 94304
+   United States
+
+   EMail: casner@acm.org
+
+
+
+
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 43]
+
+RFC 3551                    RTP A/V Profile                    July 2003
+
+
+17. Full Copyright Statement
+
+   Copyright (C) The Internet Society (2003).  All Rights Reserved.
+
+   This document and translations of it may be copied and furnished to
+   others, and derivative works that comment on or otherwise explain it
+   or assist in its implementation may be prepared, copied, published
+   and distributed, in whole or in part, without restriction of any
+   kind, provided that the above copyright notice and this paragraph are
+   included on all such copies and derivative works.  However, this
+   document itself may not be modified in any way, such as by removing
+   the copyright notice or references to the Internet Society or other
+   Internet organizations, except as needed for the purpose of
+   developing Internet standards in which case the procedures for
+   copyrights defined in the Internet Standards process must be
+   followed, or as required to translate it into languages other than
+   English.
+
+   The limited permissions granted above are perpetual and will not be
+   revoked by the Internet Society or its successors or assigns.
+
+   This document and the information contained herein is provided on an
+   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
+   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
+   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
+   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
+   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+
+Acknowledgement
+
+   Funding for the RFC Editor function is currently provided by the
+   Internet Society.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Schulzrinne & Casner        Standards Track                    [Page 44]
+