From 4bfd864f10b68b71482b35c818559068ef8d5797 Mon Sep 17 00:00:00 2001
From: Thomas Voss <mail@thomasvoss.com>
Date: Wed, 27 Nov 2024 20:54:24 +0100
Subject: doc: Add RFC documents

---
 doc/rfc/rfc7667.txt | 2691 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 2691 insertions(+)
 create mode 100644 doc/rfc/rfc7667.txt

diff --git a/doc/rfc/rfc7667.txt b/doc/rfc/rfc7667.txt
new file mode 100644
index 0000000..6686d0b
--- /dev/null
+++ b/doc/rfc/rfc7667.txt
@@ -0,0 +1,2691 @@
+
+
+
+
+
+
+Internet Engineering Task Force (IETF)                     M. Westerlund
+Request for Comments: 7667                                      Ericsson
+Obsoletes: 5117                                                S. Wenger
+Category: Informational                                            Vidyo
+ISSN: 2070-1721                                            November 2015
+
+
+                             RTP Topologies
+
+Abstract
+
+   This document discusses point-to-point and multi-endpoint topologies
+   used in environments based on the Real-time Transport Protocol (RTP).
+   In particular, centralized topologies commonly employed in the video
+   conferencing industry are mapped to the RTP terminology.
+
+   This document is updated with additional topologies and replaces RFC
+   5117.
+
+Status of This Memo
+
+   This document is not an Internet Standards Track specification; it is
+   published for informational purposes.
+
+   This document is a product of the Internet Engineering Task Force
+   (IETF).  It represents the consensus of the IETF community.  It has
+   received public review and has been approved for publication by the
+   Internet Engineering Steering Group (IESG).  Not all documents
+   approved by the IESG are a candidate for any level of Internet
+   Standard; see Section 2 of RFC 5741.
+
+   Information about the current status of this document, any errata,
+   and how to provide feedback on it may be obtained at
+   http://www.rfc-editor.org/info/rfc7667.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Westerlund & Wenger           Informational                     [Page 1]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+Copyright Notice
+
+   Copyright (c) 2015 IETF Trust and the persons identified as the
+   document authors.  All rights reserved.
+
+   This document is subject to BCP 78 and the IETF Trust's Legal
+   Provisions Relating to IETF Documents
+   (http://trustee.ietf.org/license-info) in effect on the date of
+   publication of this document.  Please review these documents
+   carefully, as they describe your rights and restrictions with respect
+   to this document.  Code Components extracted from this document must
+   include Simplified BSD License text as described in Section 4.e of
+   the Trust Legal Provisions and are provided without warranty as
+   described in the Simplified BSD License.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Westerlund & Wenger           Informational                     [Page 2]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+Table of Contents
+
+   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   4
+   2.  Definitions . . . . . . . . . . . . . . . . . . . . . . . . .   5
+     2.1.  Glossary  . . . . . . . . . . . . . . . . . . . . . . . .   5
+     2.2.  Definitions Related to RTP Grouping Taxonomy  . . . . . .   5
+   3.  Topologies  . . . . . . . . . . . . . . . . . . . . . . . . .   6
+     3.1.  Point to Point  . . . . . . . . . . . . . . . . . . . . .   6
+     3.2.  Point to Point via Middlebox  . . . . . . . . . . . . . .   7
+       3.2.1.  Translators . . . . . . . . . . . . . . . . . . . . .   7
+       3.2.2.  Back-to-Back RTP sessions . . . . . . . . . . . . . .  11
+     3.3.  Point to Multipoint Using Multicast . . . . . . . . . . .  12
+       3.3.1.  Any-Source Multicast (ASM)  . . . . . . . . . . . . .  12
+       3.3.2.  Source-Specific Multicast (SSM) . . . . . . . . . . .  14
+       3.3.3.  SSM with Local Unicast Resources  . . . . . . . . . .  15
+     3.4.  Point to Multipoint Using Mesh  . . . . . . . . . . . . .  17
+     3.5.  Point to Multipoint Using the RFC 3550 Translator . . . .  20
+       3.5.1.  Relay - Transport Translator  . . . . . . . . . . . .  20
+       3.5.2.  Media Translator  . . . . . . . . . . . . . . . . . .  21
+     3.6.  Point to Multipoint Using the RFC 3550 Mixer Model  . . .  22
+       3.6.1.  Media-Mixing Mixer  . . . . . . . . . . . . . . . . .  24
+       3.6.2.  Media-Switching Mixer . . . . . . . . . . . . . . . .  27
+     3.7.  Selective Forwarding Middlebox  . . . . . . . . . . . . .  29
+     3.8.  Point to Multipoint Using Video-Switching MCUs  . . . . .  33
+     3.9.  Point to Multipoint Using RTCP-Terminating MCU  . . . . .  34
+     3.10. Split Component Terminal  . . . . . . . . . . . . . . . .  35
+     3.11. Non-symmetric Mixer/Translators . . . . . . . . . . . . .  38
+     3.12. Combining Topologies  . . . . . . . . . . . . . . . . . .  38
+   4.  Topology Properties . . . . . . . . . . . . . . . . . . . . .  39
+     4.1.  All-to-All Media Transmission . . . . . . . . . . . . . .  39
+     4.2.  Transport or Media Interoperability . . . . . . . . . . .  40
+     4.3.  Per-Domain Bitrate Adaptation . . . . . . . . . . . . . .  40
+     4.4.  Aggregation of Media  . . . . . . . . . . . . . . . . . .  41
+     4.5.  View of All Session Participants  . . . . . . . . . . . .  41
+     4.6.  Loop Detection  . . . . . . . . . . . . . . . . . . . . .  42
+     4.7.  Consistency between Header Extensions and RTCP  . . . . .  42
+   5.  Comparison of Topologies  . . . . . . . . . . . . . . . . . .  42
+   6.  Security Considerations . . . . . . . . . . . . . . . . . . .  43
+   7.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  45
+     7.1.  Normative References  . . . . . . . . . . . . . . . . . .  45
+     7.2.  Informative References  . . . . . . . . . . . . . . . . .  45
+   Acknowledgements  . . . . . . . . . . . . . . . . . . . . . . . .  48
+   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  48
+
+
+
+
+
+
+
+
+Westerlund & Wenger           Informational                     [Page 3]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+1.  Introduction
+
+   Real-time Transport Protocol (RTP) [RFC3550] topologies describe
+   methods for interconnecting RTP entities and their processing
+   behavior for RTP and the RTP Control Protocol (RTCP).  This document
+   tries to address past and existing confusion, especially with respect
+   to terms not defined in RTP but in common use in the communication
+   industry, such as the Multipoint Control Unit or MCU.
+
+   When the Audio-Visual Profile with Feedback (AVPF) [RFC4585] was
+   developed, the main emphasis lay in the efficient support of
+   point-to-point and small multipoint scenarios without centralized
+   multipoint control.  In practice, however, most multipoint
+   conferences operate utilizing centralized units referred to as MCUs.
+   MCUs may implement mixer or translator functionality (in RTP
+   [RFC3550] terminology) and signaling support.  They may also contain
+   additional application-layer functionality.  This document focuses on
+   the media transport aspects of the MCU that can be realized using
+   RTP, as discussed below.  Further considered are the properties of
+   mixers and translators, and how some types of deployed MCUs deviate
+   from these properties.
+
+   This document also codifies new multipoint architectures that have
+   recently been introduced and that were not anticipated in RFC 5117;
+   thus, this document replaces [RFC5117].  These architectures use
+   scalable video coding and simulcasting, and their associated
+   centralized units are referred to as Selective Forwarding Middleboxes
+   (SFMs).  This codification provides a common information basis for
+   future discussion and specification work.
+
+   The new topologies are Point to Point via Middlebox (Section 3.2),
+   Source-Specific Multicast (Section 3.3.2), SSM with Local Unicast
+   Resources (Section 3.3.3), Point to Multipoint Using Mesh
+   (Section 3.4), Selective Forwarding Middlebox (Section 3.7), and
+   Split Component Terminal (Section 3.10).  The Point to Multipoint
+   Using the RFC 3550 Mixer Model (Section 3.6) has been significantly
+   expanded to cover two different versions, namely Media-Mixing Mixer
+   (Section 3.6.1) and Media-Switching Mixer (Section 3.6.2).
+
+   The document's attempt to clarify and explain sections of the RTP
+   spec [RFC3550] is informal.  It is not intended to update or change
+   what is normatively specified within RFC 3550.
+
+
+
+
+
+
+
+
+
+Westerlund & Wenger           Informational                     [Page 4]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+2.  Definitions
+
+2.1.  Glossary
+
+   ASM:  Any-Source Multicast
+
+   AVPF:  The extended RTP profile for RTCP-based feedback
+
+   CSRC:  Contributing Source
+
+   Link:  The data transport to the next IP hop
+
+   Middlebox:  A device that is on the Path that media travel between
+      two endpoints
+
+   MCU:  Multipoint Control Unit
+
+   Path:  The concatenation of multiple links, resulting in an
+      end-to-end data transfer.
+
+   PtM:  Point to Multipoint
+
+   PtP:  Point to Point
+
+   SFM:  Selective Forwarding Middlebox
+
+   SSM:  Source-Specific Multicast
+
+   SSRC:  Synchronization Source
+
+2.2.  Definitions Related to RTP Grouping Taxonomy
+
+   The following definitions have been taken from [RFC7656].
+
+   Communication Session:  A Communication Session is an association
+      among two or more Participants communicating with each other via
+      one or more Multimedia Sessions.
+
+   Endpoint:  A single addressable entity sending or receiving RTP
+      packets.  It may be decomposed into several functional blocks, but
+      as long as it behaves as a single RTP stack entity, it is
+      classified as a single "endpoint".
+
+   Media Source:  A Media Source is the logical source of a time
+      progressing digital media stream synchronized to a reference
+      clock.  This stream is called a Source Stream.
+
+
+
+
+
+Westerlund & Wenger           Informational                     [Page 5]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+   Multimedia Session:   A Multimedia Session is an association among a
+      group of participants engaged in communication via one or more RTP
+      sessions.
+
+3.  Topologies
+
+   This section defines several topologies that are relevant for codec
+   control but also for RTP usage in other contexts.  The section
+   starts with point-to-point cases, with or without middleboxes.
+   These are followed by a number of different methods for establishing
+   point-to-multipoint communication, structured around the most
+   fundamental enablers, i.e., multicast, a mesh of connections,
+   translators, mixers, and finally MCUs and SFMs.  The section ends by
+   discussing decomposed terminals, asymmetric middlebox behaviors, and
+   combining topologies.
+
+   The topologies may be referenced in other documents by a shortcut
+   name, indicated by the prefix "Topo-".
+
+   For each of the RTP-defined topologies, we discuss how RTP, RTCP, and
+   the carried media are handled.  With respect to RTCP, we also discuss
+   the handling of RTCP feedback messages as defined in [RFC4585] and
+   [RFC5104].
+
+3.1.  Point to Point
+
+   Shortcut name: Topo-Point-to-Point
+
+   The Point-to-Point (PtP) topology (Figure 1) consists of two
+   endpoints, communicating using unicast.  Both RTP and RTCP traffic
+   are conveyed endpoint to endpoint, using unicast traffic only (even
+   if, in exotic cases, this unicast traffic happens to be conveyed over
+   an IP multicast address).
+
+                            +---+         +---+
+                            | A |<------->| B |
+                            +---+         +---+
+
+                         Figure 1: Point to Point
+
+   The main property of this topology is that A sends to B, and only B,
+   while B sends to A, and only A.  This avoids all complexities of
+   handling multiple endpoints and combining the requirements stemming
+   from them.  Note that an endpoint can still use multiple RTP
+   Synchronization Sources (SSRCs) in an RTP session.  The number of RTP
+   sessions in use between A and B can also be of any number, subject
+   only to system-level limitations like the number range of ports.
+
+
+
+
+Westerlund & Wenger           Informational                     [Page 6]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+   RTCP feedback messages for the indicated SSRCs are communicated
+   directly between the endpoints.  Therefore, this topology poses
+   minimal (if any) issues for any feedback messages.  For RTP sessions
+   that use multiple SSRCs per endpoint, it can be relevant to implement
+   support for cross-reporting suppression as defined in "Sending
+   Multiple Media Streams in a Single RTP Session" [MULTI-STREAM-OPT].
+
+3.2.  Point to Point via Middlebox
+
+   This section discusses cases where two endpoints communicate but have
+   one or more middleboxes involved in the RTP session.
+
+3.2.1.  Translators
+
+   Shortcut name: Topo-PtP-Translator
+
+   Two main categories of translators can be distinguished: Transport
+   Translators and Media Translators.  Both translator types share
+   common attributes that separate them from mixers.  For each RTP
+   stream that the translator receives, it generates an individual RTP
+   stream in the other domain.  A translator keeps the SSRC for an RTP
+   stream across the translation, whereas a mixer can select a single
+   RTP stream from multiple received RTP streams (in cases like audio/
+   video switching) or send out an RTP stream composed of multiple mixed
+   media received in multiple RTP streams (in cases like audio mixing or
+   video tiling), but always under its own SSRC, possibly using the CSRC
+   field to indicate the source(s) of the content.  Mixers are more
+   common in point-to-multipoint cases than in PtP.  The reason is that
+   in PtP use cases, the primary focus of a middlebox is enabling
+   interoperability between otherwise non-interoperable endpoints, for
+   example, by transcoding to a codec the receiver supports, which can
+   be done by a Media Translator.
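+
+   The difference in SSRC handling between a translator and a mixer can
+   be sketched in a few lines of Python.  This is an illustrative
+   sketch only: the function names are hypothetical, header extensions
+   and padding are ignored, and the payload processing (transcoding or
+   mixing) is left out entirely.
+
```python
import struct

RTP_VERSION = 2


def build_rtp_header(seq, timestamp, ssrc, csrcs=(), payload_type=96):
    """Pack an RTP fixed header (RFC 3550, Section 5.1) plus CSRC list."""
    if len(csrcs) > 15:
        raise ValueError("the CC field is 4 bits, so at most 15 CSRCs fit")
    first = (RTP_VERSION << 6) | len(csrcs)  # V=2, P=0, X=0, CC
    second = payload_type & 0x7F             # M=0, PT
    header = struct.pack("!BBHII", first, second, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc)
    return header + b"".join(struct.pack("!I", c) for c in csrcs)


def parse_rtp_header(packet):
    """Return (seq, timestamp, ssrc, csrcs) from a packed RTP header."""
    first, _second, seq, timestamp, ssrc = struct.unpack("!BBHII",
                                                         packet[:12])
    cc = first & 0x0F
    csrcs = struct.unpack("!%dI" % cc, packet[12:12 + 4 * cc])
    return seq, timestamp, ssrc, list(csrcs)


def translate(packet):
    """Translator (hypothetical): the outgoing stream keeps the SSRC."""
    seq, timestamp, ssrc, csrcs = parse_rtp_header(packet)
    return build_rtp_header(seq, timestamp, ssrc, csrcs)


def mix(packets, mixer_ssrc, seq, timestamp):
    """Mixer (hypothetical): one outgoing stream under the mixer's own
    SSRC, with the contributing senders listed as CSRCs."""
    contributors = [parse_rtp_header(p)[2] for p in packets]
    return build_rtp_header(seq, timestamp, mixer_ssrc, contributors)
```
+
+   A packet passed through translate() carries the same SSRC on both
+   sides, whereas mix() emits a packet under the mixer's SSRC and lists
+   the original senders in the CSRC list.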
+
+   As specified in Section 7.1 of [RFC3550], the SSRC space is common
+   for all participants in the RTP session, regardless of which side of
+   the translator they reside on.  Therefore, it is the
+   responsibility of the endpoints (as the RTP session participants) to
+   run SSRC collision detection, and the SSRC is thus a field the
+   translator cannot change.  Any Source Description (SDES) information
+   associated with an SSRC or CSRC also needs to be forwarded between
+   the domains for any SSRC/CSRC used in the different domains.
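+
+   As a rough indication of how likely an SSRC collision is, the
+   large-group approximation given in Section 8.1 of [RFC3550] can be
+   evaluated directly.  The sketch below assumes that all sources pick
+   their 32-bit identifiers independently and uniformly at random; the
+   function name is illustrative.
+
```python
import math


def ssrc_collision_probability(n_sources, id_bits=32):
    """Approximate probability that at least two of n_sources randomly
    chosen identifiers coincide: 1 - exp(-N**2 / 2**(L+1)).

    The approximation is intended for large N and is not exact for
    very small groups.
    """
    x = (n_sources ** 2) / 2 ** (id_bits + 1)
    return -math.expm1(-x)  # equals 1 - exp(-x), but numerically stable
```
+
+   For a session with 1000 sources, this evaluates to roughly 10**-4,
+   which is why collisions are rare but still need to be detected.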
+
+   A translator commonly does not use an SSRC of its own and is not
+   visible as an active participant in the RTP session.  One reason to
+   have its own SSRC is when a translator acts as a quality monitor that
+   sends RTCP reports and therefore is required to have an SSRC.
+   Another example is the case when a translator is prepared to use RTCP
+   feedback messages.  This may, for example, occur in a translator
+
+
+
+Westerlund & Wenger           Informational                     [Page 7]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+   configured to detect packet loss of important video packets, and it
+   wants to trigger repair by the media sending endpoint, by sending
+   feedback messages.  While such feedback could use the SSRC of the
+   target for the translator (the receiving endpoint), this in turn
+   would require translation of the target RTCP reports to make them
+   consistent.  It may be simpler to expose an additional SSRC in the
+   session.  The only concern is that endpoints failing to support the
+   full RTP specification may have issues with multiple SSRCs reporting
+   on the RTP streams sent by that endpoint, as this use case may be
+   viewed as exotic by implementers.
+
+   In general, a translator implementation should consider which RTCP
+   feedback messages or codec-control messages it needs to understand in
+   relation to the functionality of the translator itself.  This is
+   completely in line with the requirement to also translate RTCP
+   messages between the domains.
+
+3.2.1.1.  Transport Relay/Anchoring
+
+   Shortcut name: Topo-PtP-Relay
+
+   There exist a number of different types of middleboxes that might be
+   inserted between two endpoints on the transport level, e.g., to
+   perform changes on the IP/UDP headers, and are, therefore, basic
+   Transport Translators.  These middleboxes come in many variations
+   including NAT [RFC3022] traversal by pinning the media path to a
+   public address domain relay and network topologies where the RTP
+   stream is required to pass a particular point for audit by employing
+   relaying, or preserving privacy by hiding each peer's transport
+   addresses from the other party.  Other protocols or functionalities
+   that provide this behavior are Traversal Using Relays around NAT
+   (TURN) [RFC5766] servers, Session Border Gateways, and Media
+   Processing Nodes with media anchoring functionalities.
+
+                     +---+        +---+         +---+
+                     | A |<------>| T |<------->| B |
+                     +---+        +---+         +---+
+
+                 Figure 2: Point to Point with Translator
+
+   A common element in these functions is that they are normally
+   transparent at the RTP level, i.e., they perform no changes on any
+   RTP or RTCP packet fields and only affect the lower layers.  They
+   may, however, affect the path over which RTP and RTCP packets are
+   routed between the endpoints, and thereby indirectly affect the RTP
+   session.  Given this RTP-level transparency, one could believe that
+   Transport Translator-type middleboxes do not need to be included in
+   this document.  This topology, however, can raise additional
+
+
+
+Westerlund & Wenger           Informational                     [Page 8]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+   requirements in the RTP implementation and its interactions with the
+   signaling solution.  Both in signaling and in certain RTCP fields,
+   network addresses other than those of the relay can occur since B has
+   a different network address than the relay (T).  Implementations that
+   cannot support this will also not work correctly when endpoints are
+   subject to NAT.
+
+   The Transport Relay implementations also have to take into account
+   security considerations.  In particular, source address filtering of
+   incoming packets is usually important in relays, to prevent
+   attackers from injecting traffic into a session.  In the absence of
+   adequate security in the relay, one peer may believe that such
+   traffic comes from the other peer.
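+
+   A minimal sketch of such source address filtering is shown below.
+   It assumes a hypothetical two-party relay that learns both peers'
+   transport addresses from signaling; the class and method names are
+   illustrative and not taken from any particular implementation.
+
```python
class TransportRelay:
    """Hypothetical two-party relay pinned to signaled peer addresses."""

    def __init__(self, addr_a, addr_b):
        # Each peer's packets are forwarded to the other peer.
        self._forward_to = {addr_a: addr_b, addr_b: addr_a}

    def route(self, source_addr, packet):
        """Return (destination, packet), or None when the packet does
        not come from one of the two expected transport addresses."""
        destination = self._forward_to.get(source_addr)
        if destination is None:
            return None  # drop: unknown source, possibly injected
        # The RTP/RTCP packet itself is forwarded unmodified.
        return destination, packet
```
+
+   Address filtering of this kind does not authenticate packets, since
+   source addresses can be spoofed; it is one layer of defense, not a
+   replacement for securing the media itself.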
+
+3.2.1.2.  Transport Translator
+
+   Shortcut name: Topo-Trn-Translator
+
+   Transport Translators (Topo-Trn-Translator) do not modify the RTP
+   stream itself but are concerned with transport parameters.  Transport
+   parameters, in the sense of this section, comprise the transport
+   addresses (to bridge different domains such as unicast to multicast)
+   and the media packetization to allow other transport protocols to be
+   interconnected to a session (in gateways).
+
+   Translators that bridge between different protocol worlds need to be
+   concerned about the mapping of the SSRC/CSRC (Contributing Source)
+   concept to the non-RTP protocol.  When designing a translator to a
+   non-RTP-based media transport, an important consideration is how to
+   handle different sources and their identities.  This problem space is
+   not discussed henceforth.
+
+   Of the Transport Translators, this memo is primarily interested in
+   those that use RTP on both sides, and this is assumed henceforth.
+
+   The most basic Transport Translators that operate below the RTP level
+   were already discussed in Section 3.2.1.1.
+
+3.2.1.3.  Media Translator
+
+   Shortcut name: Topo-Media-Translator
+
+   Media Translators (Topo-Media-Translator) modify the media inside the
+   RTP stream.  This process is commonly known as transcoding.  The
+   modification of the media can be as small as removing parts of the
+   stream, and it can go all the way to a full decoding and re-encoding
+   (down to the sample level or equivalent) utilizing a different media
+
+
+
+
+Westerlund & Wenger           Informational                     [Page 9]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+   codec.  Media Translators are commonly used to connect endpoints
+   without a common interoperability point in the media encoding.
+
+   Stand-alone Media Translators are rare.  Most commonly, a combination
+   of Transport and Media Translator is used to translate both the media
+   and the transport aspects of the RTP stream carrying the media
+   between two transport domains.
+
+   When media translation occurs, the translator's task regarding
+   handling of RTCP traffic becomes substantially more complex.  In this
+   case, the translator needs to rewrite endpoint B's RTCP receiver
+   reports before forwarding them to endpoint A.  The rewriting is
+   needed as the RTP stream received by B is not the same RTP stream as
+   the other participants receive.  For example, the number of packets
+   transmitted to B may be lower than what A sends, due to the different
+   media format and data rate.  Therefore, if the receiver reports were
+   forwarded without changes, the extended highest sequence number would
+   indicate that B was substantially behind in reception, while it most
+   likely would not be.  Therefore, the translator must translate that
+   number to a corresponding sequence number for the stream the
+   translator received.  Similar requirements exist for most other
+   fields in the RTCP receiver reports.
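+
+   One possible way to translate the extended highest sequence number
+   is for the translator to remember, for each packet it sends to B,
+   the newest packet from A that contributed to it.  The sketch below
+   assumes extended (non-wrapping) sequence numbers; the class and its
+   bounded history size are hypothetical.
+
```python
from collections import OrderedDict


class SeqNumberMap:
    """Maps sequence numbers of the translated stream (towards B) back
    to sequence numbers of the original stream (from A)."""

    def __init__(self, max_entries=4096):
        self._out_to_in = OrderedDict()
        self._max_entries = max_entries

    def record(self, out_ext_seq, in_ext_seq):
        """Note that outgoing packet out_ext_seq was produced after
        consuming incoming packets up to and including in_ext_seq."""
        self._out_to_in[out_ext_seq] = in_ext_seq
        while len(self._out_to_in) > self._max_entries:
            self._out_to_in.popitem(last=False)  # forget oldest entry

    def translate_highest(self, reported_out_ext_seq):
        """Translate B's reported extended highest sequence number into
        the numbering of A's stream, or None if it is not known."""
        return self._out_to_in.get(reported_out_ext_seq)
```
+
+   For instance, after record(504, 1009), a receiver report from B
+   with extended highest sequence number 504 would be forwarded to A
+   with the value 1009.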
+
+   A Media Translator may in some cases act on behalf of the "real"
+   source (the endpoint originally sending the media to the translator)
+   and respond to RTCP feedback messages.  This may occur, for example,
+   when a receiving endpoint requests a bandwidth reduction, and the
+   Media Translator has not detected any congestion or other reasons for
+   bandwidth reduction between the sending endpoint and itself.  In that
+   case, it is sensible that the Media Translator reacts to codec
+   control messages itself, for example, by transcoding to a lower media
+   rate.
+
+   A variant of translator behavior worth pointing out is the one
+   depicted in Figure 3, in which an endpoint A sends an RTP stream
+   containing media (only) to B.  On the path, there is a device T that
+   manipulates the RTP streams on A's behalf.  One common example is
+   that T adds a second RTP stream containing Forward Error Correction
+   (FEC) information in order to protect A's (non FEC-protected) RTP
+   stream.  In this case, T needs to semantically bind the new FEC RTP
+   stream to A's media-carrying RTP stream, for example, by using the
+   same CNAME as A.
+
+
+
+
+
+
+
+
+
+Westerlund & Wenger           Informational                    [Page 10]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+                 +------+        +------+         +------+
+                 |      |        |      |         |      |
+                 |  A   |------->|  T   |-------->|  B   |
+                 |      |        |      |---FEC-->|      |
+                 +------+        +------+         +------+
+
+                   Figure 3: Media Translator Adding FEC
+
+   There may also be cases where information is added into the original
+   RTP stream, while leaving most or all of the original RTP packets
+   intact (with the exception of certain RTP header fields, such as the
+   sequence number).  One example is the injection of metadata into the
+   RTP stream, carried in its own RTP packets.
+
+   Similarly, a Media Translator can sometimes remove information from
+   the RTP stream, while otherwise leaving the remaining RTP packets
+   unchanged (again with the exception of certain RTP header fields).
+
+   Either type of functionality where T manipulates the RTP stream, or
+   adds an accompanying RTP stream, on behalf of A is also covered under
+   the Media Translator definition.
+
+3.2.2.  Back-to-Back RTP sessions
+
+   Shortcut name: Topo-Back-To-Back
+
+   There exist middleboxes that interconnect two endpoints (A and B)
+   through themselves (MB), but not by being part of a common RTP
+   session.  Instead, they establish two different RTP sessions: one
+   between A and the middlebox and another between the middlebox and B.
+   This topology is called Topo-Back-To-Back.
+
+                   |<--Session A-->|  |<--Session B-->|
+                 +------+        +------+         +------+
+                 |  A   |------->|  MB  |-------->|  B   |
+                 +------+        +------+         +------+
+
+           Figure 4: Back-to-Back RTP Sessions through Middlebox
+
+   The middlebox acts as an application-level gateway and bridges the
+   two RTP sessions.  This bridging can be as basic as forwarding the
+   RTP payloads between the sessions or more complex including media
+   transcoding.  The difference of this topology relative to the single
+   RTP session context is the handling of the SSRCs and the other
+   session-related identifiers, such as CNAMEs.  With two different RTP
+   sessions, these can be freely changed and it becomes the middlebox's
+   responsibility to maintain the correct relations.
+
+
+
+
+Westerlund & Wenger           Informational                    [Page 11]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+   Signaling or other functionality above the RTP level that references
+   RTP streams may be what is most impacted by using two RTP sessions
+   and changing identifiers.  The structure with two RTP sessions also
+   puts a congestion control requirement on the middlebox, because it
+   becomes fully responsible for the media stream it sources into each
+   of the sessions.
+
+   Adherence to congestion control can be solved locally on each of the
+   two segments or by bridging statistics from the receiving endpoint
+   through the middlebox to the sending endpoint.  From an
+   implementation point of view, however, the latter requires dealing
+   with a number of inconsistencies.  For example, packet loss must be
+   detected for an RTP stream sent from A to the middlebox, and that
+   loss must be reported through a skipped sequence number in the RTP
+   stream from the middlebox to B.  This coupling and the resulting
+   inconsistencies are conceptually easier to handle when considering
+   the two RTP streams as belonging to a single RTP session.
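+
+   Reporting loss through skipped sequence numbers can be sketched as
+   follows.  The sketch assumes that packets from A arrive in order and
+   without duplicates, which a real middlebox cannot rely on; the class
+   name is hypothetical.
+
```python
class LossPreservingForwarder:
    """Assigns outgoing sequence numbers (towards B) so that gaps seen
    on the incoming stream (from A) reappear as gaps on the outgoing
    stream."""

    def __init__(self, first_out_seq):
        self._next_out = first_out_seq & 0xFFFF
        self._last_in = None

    def forward(self, in_seq):
        """Return the outgoing 16-bit sequence number for the packet
        that arrived with sequence number in_seq."""
        if self._last_in is not None:
            # Packets missing between the previous packet and this one.
            missing = (in_seq - self._last_in - 1) & 0xFFFF
            # Skip as many outgoing numbers as packets were lost.
            self._next_out = (self._next_out + missing) & 0xFFFF
        self._last_in = in_seq
        out_seq = self._next_out
        self._next_out = (out_seq + 1) & 0xFFFF
        return out_seq
```
+
+   With a first outgoing number of 100, incoming packets 10, 11, and 13
+   are sent to B as 100, 101, and 103, so B observes the loss of one
+   packet even though the loss occurred between A and the middlebox.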
+
+3.3.  Point to Multipoint Using Multicast
+
+   Multicast is an IP-layer functionality that is available in some
+   networks.  Two main flavors can be distinguished: Any-Source
+   Multicast (ASM) [RFC1112], where any multicast group participant can
+   send to the group address and expect the packet to reach all group
+   participants, and Source-Specific Multicast (SSM) [RFC3569], where
+   only a particular IP host sends to the multicast group.  Each of
+   these models is discussed below in its respective section.
+
+3.3.1.  Any-Source Multicast (ASM)
+
+   Shortcut name: Topo-ASM (was Topo-Multicast)
+
+                                   +-----+
+                        +---+     /       \    +---+
+                        | A |----/         \---| B |
+                        +---+   /   Multi-  \  +---+
+                               +    cast     +
+                        +---+   \  Network  /  +---+
+                        | C |----\         /---| D |
+                        +---+     \       /    +---+
+                                   +-----+
+
+               Figure 5: Point to Multipoint Using Multicast
+
+
+
+
+
+
+
+
+Westerlund & Wenger           Informational                    [Page 12]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+   Point to Multipoint (PtM) is defined here as using a multicast
+   topology as a transmission model, in which traffic from any multicast
+   group participant reaches all the other multicast group participants,
+   except for cases such as:
+
+   o  packet loss, or
+
+   o  when a multicast group participant does not wish to receive the
+      traffic for a specific multicast group and, therefore, has not
+      subscribed to the IP multicast group in question.  This scenario
+      can occur, for example, where a Multimedia Session is distributed
+      using two or more multicast groups, and a multicast group
+      participant is subscribed only to a subset of these sessions.
+
+   In the above context, "traffic" encompasses both RTP and RTCP
+   traffic.  The number of multicast group participants can vary between
+   one and many, as RTP and RTCP scale to very large multicast groups
+   (the theoretical limit of the number of participants in a single RTP
+   session is in the range of billions).  The above can be realized
+   using ASM.
+
+   For feedback usage, it is useful to define a "small multicast group"
+   as a group where the number of multicast group participants is so low
+   (and other factors, such as connectivity, are so good) that it
+   allows the participants to use early or immediate feedback, as
+   defined in AVPF [RFC4585].  Even when the environment would allow for
+   the use of a small multicast group, some applications may still want
+   to use the more limited options for RTCP feedback available to large
+   multicast groups, for example, when there is a likelihood that the
+   threshold of the small multicast group (in terms of multicast group
+   participants) may be exceeded during the lifetime of a session.
+
+   RTCP feedback messages in multicast reach, like media data, every
+   subscriber (subject to packet losses and multicast group
+   subscription).  Therefore, the feedback suppression mechanism
+   discussed in [RFC4585] is typically required.  Each individual
+   endpoint that is a multicast group participant needs to process every
+   feedback message it receives, not only to determine if it is affected
+   or if the feedback message applies only to some other endpoint but
+   also to derive timing restrictions for the sending of its own
+   feedback messages, if any.
+
+
+
+
+
+
+
+
+
+
+Westerlund & Wenger           Informational                    [Page 13]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+3.3.2.  Source-Specific Multicast (SSM)
+
+   Shortcut name: Topo-SSM
+
+   In Any-Source Multicast, any of the multicast group participants can
+   send to all the other multicast group participants, by sending a
+   packet to the multicast group.  In contrast, Source-Specific
+   Multicast [RFC3569][RFC4607] refers to scenarios where only a single
+   source (Distribution Source) can send to the multicast group,
+   creating a topology that looks like the one below:
+
+          +--------+       +-----+
+          |Media   |       |     |       Source-Specific
+          |Sender 1|<----->| D S |          Multicast
+          +--------+       | I O |  +--+----------------> R(1)
+                           | S U |  |  |                    |
+          +--------+       | T R |  |  +-----------> R(2)   |
+          |Media   |<----->| R C |->+  |           :   |    |
+          |Sender 2|       | I E |  |  +------> R(n-1) |    |
+          +--------+       | B   |  |  |          |    |    |
+              :            | U   |  +--+--> R(n)  |    |    |
+              :            | T +-|          |     |    |    |
+              :            | I | |<---------+     |    |    |
+          +--------+       | O |F|<---------------+    |    |
+          |Media   |       | N |T|<--------------------+    |
+          |Sender M|<----->|   | |<-------------------------+
+          +--------+       +-----+       RTCP Unicast
+
+          FT = Feedback Target
+          Transport from the Feedback Target to the Distribution
+          Source is via unicast or multicast RTCP if they are not
+          co-located.
+
+       Figure 6: Point to Multipoint Using Source-Specific Multicast
+
+   In the SSM topology (Figure 6), a number of RTP sending endpoints
+   (RTP sources henceforth) (1 to M) are allowed to send media to the
+   SSM group.  These sources send media to a dedicated Distribution
+   Source, which forwards the RTP streams to the multicast group on
+   behalf of the original RTP sources.  The RTP streams reach the
+   receiving endpoints (receivers henceforth) (R(1) to R(n)).  The
+   receivers' RTCP messages cannot be sent to the multicast group, as
+   the SSM multicast group by definition has only a single IP sender.
+   To support RTCP, an RTP extension for SSM [RFC5760] was defined.  It
+   uses unicast transmission to send RTCP from each of the receivers to
+   one or more Feedback Targets (FT).  The Feedback Targets relay the
+   RTCP unmodified, or provide a summary of the participants' RTCP
+   reports towards the whole group by forwarding the RTCP traffic to the
+   Distribution Source.  Figure 6 only shows a single Feedback Target
+   integrated in the Distribution Source, but for scalability the FT can
+   be distributed and each instance can have responsibility for
+   subgroups of the receivers.  For summary reports, however, there
+   typically must be a single Feedback Target aggregating all the
+   summaries to a common message to the whole receiver group.
+
+   The RTP extension for SSM specifies how feedback (both reception
+   information and specific feedback events) is handled.  The more
+   general problems associated with the use of multicast, where everyone
+   receives what the Distribution Source sends, need to be accounted
+   for.
+
+   The aforementioned situation results in common behavior for RTP
+   multicast:
+
+   1.  Multicast applications often use a group of RTP sessions, not
+       one.  Each endpoint needs to be a member of most or all of these
+       RTP sessions in order to perform well.
+
+   2.  Within each RTP session, the number of media sinks is likely to
+       be much larger than the number of RTP sources.
+
+   3.  Multicast applications need signaling functions to identify the
+       relationships between RTP sessions.
+
+   4.  Multicast applications need signaling functions to identify the
+       relationships between SSRCs in different RTP sessions.
+
+   All multicast configurations share a signaling requirement: all of
+   the endpoints need to have the same RTP and payload type
+   configuration.  Otherwise, endpoint A could, for example, be using
+   payload type 97 to identify the video codec H.264, while endpoint B
+   would identify it as MPEG-2, with unpredictable but almost certainly
+   not visually pleasing results.
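+
+   As a rough illustration of this requirement, the following Python
+   sketch compares the payload type maps of several endpoints and
+   reports any payload type that is bound to different codecs.  The
+   function name, the dictionary layout, and the codec labels are
+   assumptions made for this example; they are not defined by RTP or
+   by this document.
+
```python
def payload_type_conflicts(configs):
    """Return {payload_type: {endpoint: codec}} for every payload type
    that at least two endpoints map to different codecs.

    `configs` is a hypothetical structure: endpoint name -> {PT: codec}.
    """
    seen = {}
    for endpoint, pt_map in configs.items():
        for pt, codec in pt_map.items():
            seen.setdefault(pt, {})[endpoint] = codec
    return {
        pt: by_endpoint
        for pt, by_endpoint in seen.items()
        if len(set(by_endpoint.values())) > 1
    }


# Endpoint B interprets payload type 97 differently from A and C.
configs = {
    "A": {97: "H264"},
    "B": {97: "MPEG2"},
    "C": {97: "H264"},
}
conflicts = payload_type_conflicts(configs)
```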
+
+   Security solutions for this type of group communication are also
+   challenging.  First, the key management and the security protocol
+   must support group communication.  Source authentication becomes more
+   difficult and requires specialized solutions.  For more discussion on
+   this, please review "Options for Securing RTP Sessions" [RFC7201].
+
+3.3.3.  SSM with Local Unicast Resources
+
+   Shortcut name: Topo-SSM-RAMS
+
+   "Unicast-Based Rapid Acquisition of Multicast RTP Sessions" [RFC6285]
+   results in additional extensions to SSM topology.
+
+    -----------                                       --------------
+   |           |------------------------------------>|              |
+   |           |.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.->|              |
+   |           |                                     |              |
+   | Multicast |          ----------------           |              |
+   |  Source   |         | Retransmission |          |              |
+   |           |-------->|  Server (RS)   |          |              |
+   |           |.-.-.-.->|                |          |              |
+   |           |         |  ------------  |          |              |
+    -----------          | |  Feedback  | |<.=.=.=.=.|              |
+                         | | Target (FT)| |<~~~~~~~~~| RTP Receiver |
+   PRIMARY MULTICAST     |  ------------  |          |   (RTP_Rx)   |
+   RTP SESSION with      |                |          |              |
+   UNICAST FEEDBACK      |                |          |              |
+                         |                |          |              |
+   - - - - - - - - - - - |- - - - - - - - |- - - - - |- - - - - - - |- -
+                         |                |          |              |
+   UNICAST BURST         |  ------------  |          |              |
+   (or RETRANSMISSION)   | |   Burst/   | |<~~~~~~~~>|              |
+   RTP SESSION           | |  Retrans.  | |.........>|              |
+                         | |Source (BRS)| |<.=.=.=.=>|              |
+                         |  ------------  |          |              |
+                         |                |          |              |
+                          ----------------            --------------
+
+      -------> Multicast RTP Stream
+      .-.-.-.> Multicast RTCP Stream
+      .=.=.=.> Unicast RTCP Reports
+      ~~~~~~~> Unicast RTCP Feedback Messages
+      .......> Unicast RTP Stream
+
+             Figure 7: SSM with Local Unicast Resources (RAMS)
+
+   The rapid acquisition extension allows an endpoint joining an SSM
+   multicast session to request media starting with the last sync point
+   (from where media can be decoded without requiring context
+   established by the decoding of prior packets) to be sent at high
+   speed until such time where, after the decoding of these burst-
+   delivered media packets, the correct media timing is established,
+   i.e., media packets are received within adequate buffer intervals for
+   this application.  This is accomplished by first establishing a
+   unicast PtP RTP session between the Burst/Retransmission Source (BRS)
+   (Figure 7) and the RTP Receiver.  The unicast session is used to
+   transmit cached packets from the multicast group at higher than
+   normal speed in order to synchronize the receiver to the ongoing
+   multicast RTP stream.  Once the RTP receiver and its decoder have
+   caught up with the multicast session's current delivery, the receiver
+   switches over to receiving directly from the multicast group.  In
+   many deployed applications, the (still existing) PtP RTP session is
+   used as a repair channel, i.e., for RTP Retransmission traffic of
+   those packets that were not received from the multicast group.
+
+3.4.  Point to Multipoint Using Mesh
+
+   Shortcut name: Topo-Mesh
+
+                             +---+      +---+
+                             | A |<---->| B |
+                             +---+      +---+
+                               ^         ^
+                                \       /
+                                 \     /
+                                  v   v
+                                  +---+
+                                  | C |
+                                  +---+
+
+                 Figure 8: Point to Multipoint Using Mesh
+
+   Based on the RTP session definition, it is clearly possible to have a
+   joint RTP session involving three or more endpoints over multiple
+   unicast transport flows, like the joint three-endpoint session
+   depicted above.  In this case, A needs to send its RTP streams and
+   RTCP packets to both B and C over their respective transport flows.
+   As long as all endpoints do the same, everyone will have a joint view
+   of the RTP session.
+
+   This topology does not create any additional requirements beyond the
+   need to have multiple transport flows associated with a single RTP
+   session.  Note that an endpoint may use a single local port to
+   receive all these transport flows (in which case the sending port, IP
+   address, or SSRC can be used to demultiplex), or it might have
+   separate local reception ports for each of the endpoints.
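+
+   The single-port case can be sketched in Python as follows.  The
+   sketch reads the SSRC from the fixed 12-byte RTP header (RFC 3550,
+   Section 5.1) and derives a demultiplexing key from either the
+   sender's transport address or the SSRC.  The helper names and the
+   "by" parameter are assumptions made for this example only.
+
```python
import struct


def rtp_ssrc(packet: bytes) -> int:
    """Return the SSRC stored in bytes 8..11 of the fixed RTP header."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    return struct.unpack_from("!I", packet, 8)[0]


def demux_key(packet: bytes, source_addr, by: str = "address"):
    """Pick the key used to sort packets arriving on one local port."""
    if by == "address":
        return source_addr          # (IP address, sending port)
    if by == "ssrc":
        return rtp_ssrc(packet)
    raise ValueError("unknown demultiplexing mode: %r" % by)


def make_rtp(ssrc: int, seq: int = 0, timestamp: int = 0,
             payload_type: int = 96, payload: bytes = b"") -> bytes:
    """Build a minimal RTP packet (V=2, no padding, no extension, CC=0)."""
    header = struct.pack("!BBHII", 0x80, payload_type & 0x7F,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)
    return header + payload


packet_from_b = make_rtp(ssrc=0x11223344, payload=b"\x00" * 4)
addr_b = ("192.0.2.1", 5004)        # documentation address, hypothetical
```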
+
+         +-A--------------------+
+         |+---+                 |
+         ||CAM|                 |                 +-B-----------+
+         |+---+     +-UDP1------|                 |-UDP1------+ |
+         |  |       | +-RTP1----|                 |-RTP1----+ | |
+         |  V       | | +-Video-|                 |-Video-+ | | |
+         |+----+    | | |       |<----------------|BV1    | | | |
+         ||ENC |----+-+-+--->AV1|---------------->|       | | | |
+         |+----+    | | +-------|                 |-------+ | | |
+         |  |       | +---------|                 |---------+ | |
+         |  |       +-----------|                 |-----------+ |
+         |  |                   |                 +-------------+
+         |  |                   |
+         |  |                   |                 +-C-----------+
+         |  |       +-UDP2------|                 |-UDP2------+ |
+         |  |       | +-RTP1----|                 |-RTP1----+ | |
+         |  |       | | +-Video-|                 |-Video-+ | | |
+         |  +-------+-+-+--->AV1|---------------->|       | | | |
+         |          | | |       |<----------------|CV1    | | | |
+         |          | | +-------|                 |-------+ | | |
+         |          | +---------|                 |---------+ | |
+         |          +-----------|                 |-----------+ |
+         +----------------------+                 +-------------+
+
+          Figure 9: A Multi-Unicast Mesh with a Joint RTP Session
+
+   Figure 9 depicts endpoint A's view of using a common RTP session when
+   establishing the mesh as shown in Figure 8.  There is only one RTP
+   session (RTP1) but two transport flows (UDP1 and UDP2).  The Media
+   Source (CAM) is encoded and transmitted over the SSRC (AV1) across
+   both transport layers.  However, as this is a joint RTP session, the
+   two streams must be the same.  Thus, a congestion control adaptation
+   needed for the paths A to B and A to C needs to use the most
+   restricting path's properties.
+
+   An alternative structure for establishing the above topology is to
+   use independent RTP sessions between each pair of peers, i.e., three
+   different RTP sessions.  In some scenarios, the same RTP stream may
+   be sent from the transmitting endpoint; however, this structure also
+   supports local adaptation taking place in one or more of the RTP
+   streams, rendering them non-identical.
+
+          +-A----------------------+              +-B-----------+
+          |+---+                   |              |             |
+          ||MIC|       +-UDP1------|              |-UDP1------+ |
+          |+---+       | +-RTP1----|              |-RTP1----+ | |
+          | |  +----+  | | +-Audio-|              |-Audio-+ | | |
+          | +->|ENC1|--+-+-+--->AA1|------------->|       | | | |
+          | |  +----+  | | |       |<-------------|BA1    | | | |
+          | |          | | +-------|              |-------+ | | |
+          | |          | +---------|              |---------+ | |
+          | |          +-----------|              |-----------+ |
+          | |          ------------|              |-------------|
+          | |                      |              |-------------+
+          | |                      |
+          | |                      |              +-C-----------+
+          | |                      |              |             |
+          | |          +-UDP2------|              |-UDP2------+ |
+          | |          | +-RTP2----|              |-RTP2----+ | |
+          | |  +----+  | | +-Audio-|              |-Audio-+ | | |
+          | +->|ENC2|--+-+-+--->AA2|------------->|       | | | |
+          |    +----+  | | |       |<-------------|CA1    | | | |
+          |            | | +-------|              |-------+ | | |
+          |            | +---------|              |---------+ | |
+          |            +-----------|              |-----------+ |
+          +------------------------+              +-------------+
+
+      Figure 10: A Multi-Unicast Mesh with an Independent RTP Session
+
+   Let's review the topology from A's perspective when independent RTP
+   sessions are used (Figure 10), by considering both how the media is
+   handled and how the RTP sessions are set up.  A's microphone is
+   captured and the audio is fed into two different
+   encoder instances, each with a different independent RTP session,
+   i.e., RTP1 and RTP2, respectively.  The SSRCs (AA1 and AA2) in each
+   RTP session are completely independent, and the media bitrate
+   produced by the encoders can also be tuned differently to address any
+   congestion control requirements differing for the paths A to B
+   compared to A to C.
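+
+   The difference in rate adaptation between the joint session and
+   independent sessions can be sketched in Python as follows.  The
+   per-path bitrate estimates are assumed to come from some congestion
+   control algorithm that is outside the scope of this example, and
+   the function names are hypothetical.
+
```python
def joint_session_rate(path_estimates_bps):
    """One stream is sent on every path, so the most restrictive path
    decides the bitrate for all receivers."""
    return min(path_estimates_bps.values())


def independent_session_rates(path_estimates_bps):
    """Each RTP session has its own encoder instance, so every path can
    be served at its own estimated rate."""
    return dict(path_estimates_bps)


# Hypothetical estimates for the paths A to B and A to C.
estimates = {"B": 2_000_000, "C": 500_000}
joint = joint_session_rate(estimates)
independent = independent_session_rates(estimates)
```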
+
+   From a topologies viewpoint, an important difference exists in the
+   behavior around RTCP.  First, when a single RTP session spans all
+   three endpoints A, B, and C, and their connecting RTP streams, a
+   common RTCP bandwidth is calculated and used for this single joint
+   session.  In contrast, when there are multiple independent RTP
+   sessions, each RTP session has its local RTCP bandwidth allocation.
+
+   Further, when multiple sessions are used, endpoints not directly
+   involved in a session do not have any awareness of the conditions in
+   those sessions.  For example, in the case of the three-endpoint
+   configuration in Figure 8, endpoint A has no awareness of the
+   conditions occurring in the session between endpoints B and C
+   (whereas if a single RTP session were used, it would have such
+   awareness).
+
+   Loop detection is also affected.  With independent RTP sessions, the
+   SSRC/CSRC cannot be used to determine when an endpoint receives its
+   own media stream, or a mixed media stream including its own media
+   stream (a condition known as a loop).  The identification of loops
+   and, in most cases, their avoidance, has to be achieved by other
+   means, for example, through signaling or the use of an RTP external
+   namespace binding SSRC/CSRC among any communicating RTP sessions in
+   the mesh.
+
+3.5.  Point to Multipoint Using the RFC 3550 Translator
+
+   This section discusses some additional point-to-multipoint usages of
+   translators, compared to the point-to-point cases in Section 3.2.1.
+
+3.5.1.  Relay - Transport Translator
+
+   Shortcut name: Topo-PtM-Trn-Translator
+
+   This section discusses Transport Translator-only usages to enable
+   multipoint sessions.
+
+                        +-----+
+             +---+     /       \     +------------+      +---+
+             | A |<---/         \    |            |<---->| B |
+             +---+   /           \   |            |      +---+
+                    +  Multicast  +->| Translator |
+             +---+   \  Network  /   |            |      +---+
+             | C |<---\         /    |            |<---->| D |
+             +---+     \       /     +------------+      +---+
+                        +-----+
+
+              Figure 11: Point to Multipoint Using Multicast
+
+   Figure 11 depicts an example of a Transport Translator performing at
+   least IP address translation.  It allows the (non-multicast-capable)
+   endpoints B and D to take part in an Any-Source Multicast session
+   involving endpoints A and C, by having the translator forward their
+   unicast traffic to the multicast addresses in use, and vice versa.
+   It must also forward B's traffic to D, and vice versa, to provide
+   both B and D with a complete view of the session.
+
+                   +---+      +------------+      +---+
+                   | A |<---->|            |<---->| B |
+                   +---+      |            |      +---+
+                              | Translator |
+                   +---+      |            |      +---+
+                   | C |<---->|            |<---->| D |
+                   +---+      +------------+      +---+
+
+         Figure 12: RTP Translator (Relay) with Only Unicast Paths
+
+   Another translator scenario is depicted in Figure 12.  The translator
+   in this case connects multiple endpoints through unicast.  This can
+   be implemented using a very simple Transport Translator which, in
+   this document, is called a relay.  The relay forwards all traffic it
+   receives, both RTP and RTCP, to all other endpoints.  In doing so, a
+   multicast network is emulated without relying on a multicast-capable
+   network infrastructure.
+
+   For RTCP feedback, this results in a similar set of considerations to
+   those described in the ASM RTP topology.  It also puts some
+   additional signaling requirements onto the session establishment; for
+   example, a common configuration of RTP payload types is required.
+
+   Transport Translators and relays should always consider implementing
+   source address filtering, to prevent attackers from using the
+   listening ports on the translator to inject traffic.  The translator
+   can, however, go one step further, especially if explicit SSRC
+   signaling is used, to prevent endpoints from sending SSRCs other than
+   their own (that are, for example, used by other participants in the
+   session).  This can improve the security properties of the session,
+   despite the use of group keys that on a cryptographic level allow
+   anyone to impersonate another in the same RTP session.
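+
+   A minimal Python sketch of such a relay is given below.  It forwards
+   a packet to every endpoint except the sender, drops packets from
+   unknown source addresses, and, when an SSRC-to-endpoint mapping has
+   been signaled, drops packets whose SSRC belongs to someone else.
+   The class, its method, and the "ssrc_owner" mapping are assumptions
+   made for this example; endpoints are represented by plain labels
+   rather than real transport addresses.
+
```python
class Relay:
    def __init__(self, endpoints, ssrc_owner=None):
        # Transport addresses that are allowed to send to the relay.
        self.endpoints = set(endpoints)
        # Hypothetical result of explicit SSRC signaling: SSRC -> endpoint.
        self.ssrc_owner = dict(ssrc_owner or {})

    def forward(self, source_addr, ssrc):
        """Return the endpoints that should receive this packet."""
        if source_addr not in self.endpoints:
            return []                      # source address filtering
        if self.ssrc_owner and self.ssrc_owner.get(ssrc) != source_addr:
            return []                      # SSRC not signaled for sender
        return sorted(self.endpoints - {source_addr})


relay = Relay(["A", "B", "C", "D"], ssrc_owner={1: "A", 2: "B"})
accepted = relay.forward("A", 1)           # A sends its own SSRC
spoofed = relay.forward("A", 2)            # A tries to use B's SSRC
injected = relay.forward("X", 1)           # unknown source address
```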
+
+   A translator that doesn't change the RTP/RTCP packet content can be
+   operated without requiring it to have access to the security contexts
+   used to protect the RTP/RTCP traffic between the participants.
+
+3.5.2.  Media Translator
+
+   In the context of multipoint communications, a Media Translator is
+   not providing new mechanisms to establish a multipoint session.  It
+   is more of an enabler, or facilitator, that ensures a given endpoint
+   or a defined subset of endpoints can participate in the session.
+
+   If endpoint B in Figure 11 were behind a limited network path, the
+   translator may perform media transcoding to allow the traffic
+   received from the other endpoints to reach B without overloading the
+   path.  This transcoding can help the other endpoints in the multicast
+   part of the session, by not requiring the quality transmitted by A to
+   be lowered to the bitrates that B is actually capable of receiving
+   (and vice versa).
+
+3.6.  Point to Multipoint Using the RFC 3550 Mixer Model
+
+   Shortcut name: Topo-Mixer
+
+   A mixer is a middlebox that aggregates multiple RTP streams that are
+   part of a session by generating one or more new RTP streams and, in
+   most cases, by manipulating the media data.  One common application
+   for a mixer is to allow a participant to receive a session with a
+   reduced amount of resources.
+
+                        +-----+
+             +---+     /       \     +-----------+      +---+
+             | A |<---/         \    |           |<---->| B |
+             +---+   /   Multi-  \   |           |      +---+
+                    +    cast     +->|   Mixer   |
+             +---+   \  Network  /   |           |      +---+
+             | C |<---\         /    |           |<---->| D |
+             +---+     \       /     +-----------+      +---+
+                        +-----+
+
+       Figure 13: Point to Multipoint Using the RFC 3550 Mixer Model
+
+   A mixer can be viewed as a device terminating the RTP streams
+   received from other endpoints in the same RTP session.  Using the
+   media data carried in the received RTP streams, a mixer generates
+   derived RTP streams that are sent to the receiving endpoints.
+
+   The content that the mixer provides is the mixed aggregate of what
+   the mixer receives over the PtP or PtM paths, which are part of the
+   same Communication Session.
+
+   The mixer creates the Media Source and the source RTP stream just
+   like an endpoint, as it mixes the content (often in the uncompressed
+   domain) and then encodes and packetizes it for transmission to a
+   receiving endpoint.  The CSRC Count (CC) and CSRC fields in the RTP
+   header can be used to indicate the contributors to the newly
+   generated RTP stream.  The SSRCs of the to-be-mixed streams on the
+   mixer input appear as the CSRCs at the mixer output.  That output
+   stream uses a unique SSRC that identifies the mixer's stream.  The
+   CSRC should be forwarded between the different endpoints to allow for
+   loop detection and identification of sources that are part of the
+   Communication Session.  Note that Section 7.1 of RFC 3550 requires
+   the SSRC space to be shared between domains for these reasons.  This
+   also implies that any SDES information normally needs to be forwarded
+   across the mixer.
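+
+   The use of the CC and CSRC fields can be sketched in Python as
+   follows.  The sketch builds the RTP header of a mixed packet, with
+   the input SSRCs carried as CSRCs (at most 15, since CC is a 4-bit
+   field), and shows how a receiver could check for a loop.  The
+   function names and the SSRC values are assumptions made for this
+   example.
+
```python
import struct


def mixer_rtp_header(mixer_ssrc, contributing_ssrcs, seq, timestamp,
                     payload_type):
    """Fixed RTP header (V=2) followed by the CSRC list."""
    csrcs = list(contributing_ssrcs)[:15]      # CC is only 4 bits wide
    first_byte = (2 << 6) | len(csrcs)         # V=2, P=0, X=0, CC
    header = struct.pack("!BBHII", first_byte, payload_type & 0x7F,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF,
                         mixer_ssrc & 0xFFFFFFFF)
    return header + b"".join(struct.pack("!I", c) for c in csrcs)


def parse_sources(packet):
    """Return (SSRC, [CSRC, ...]) from an RTP packet."""
    cc = packet[0] & 0x0F
    ssrc = struct.unpack_from("!I", packet, 8)[0]
    csrcs = list(struct.unpack_from("!%dI" % cc, packet, 12))
    return ssrc, csrcs


def is_loop(packet, own_ssrc):
    """True if the receiver's own SSRC shows up as SSRC or CSRC."""
    ssrc, csrcs = parse_sources(packet)
    return own_ssrc == ssrc or own_ssrc in csrcs


# Hypothetical SSRC values for the mixer output and two mixed inputs.
header = mixer_rtp_header(0xAAAA, [0xB1, 0xC1], seq=1, timestamp=160,
                          payload_type=0)
```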
+
+   The mixer is responsible for generating RTCP packets in accordance
+   with its role.  It is an RTP receiver and should therefore send RTCP
+   receiver reports for the RTP streams it receives and terminates.  In
+   its role as an RTP sender, it should also generate RTCP sender
+   reports for those RTP streams it sends.  As specified in Section 7.3
+   of RFC 3550, a mixer must not forward RTCP unaltered between the two
+   domains.
+
+   The mixer depicted in Figure 13 is involved in three domains that
+   need to be separated: the Any-Source Multicast network (including
+   endpoints A and C), endpoint B, and endpoint D.  Assuming all four
+   endpoints in the conference are interested in receiving content from
+   all other endpoints, the mixer produces different mixed RTP streams
+   for B and D, as the one to B may contain content received from D, and
+   vice versa.  However, the mixer may only need one SSRC per media type
+   in each domain where it is the receiving entity and transmitter of
+   mixed content.
+
+   In the multicast domain, a mixer still needs to provide a mixed view
+   of the other domains.  This makes the mixer simpler to implement and
+   avoids any issues with advanced RTCP handling or loop detection,
+   which would be problematic if the mixer were providing non-symmetric
+   behavior.  Please see Section 3.11 for more discussion on this topic.
+   The mixing operation, however, in each domain could potentially be
+   different.
+
+   A mixer is responsible for receiving RTCP feedback messages and
+   handling them appropriately.  The definition of "appropriate" depends
+   on the message itself and the context.  In some cases, the reception
+   of a codec-control message by the mixer may result in the generation
+   and transmission of RTCP feedback messages by the mixer to the
+   endpoints in the other domain(s).  In other cases, a message is
+   handled by the mixer locally and therefore not forwarded to any other
+   domain.
+
+   When replacing the multicast network in Figure 13 (to the left of the
+   mixer) with individual unicast paths as depicted in Figure 14, the
+   mixer model is very similar to the one discussed in Section 3.9
+   below.  Please see the discussion in Section 3.9 about the
+   differences between these two models.
+
+                   +---+      +------------+      +---+
+                   | A |<---->|            |<---->| B |
+                   +---+      |            |      +---+
+                              |   Mixer    |
+                   +---+      |            |      +---+
+                   | C |<---->|            |<---->| D |
+                   +---+      +------------+      +---+
+
+               Figure 14: RTP Mixer with Only Unicast Paths
+
+   We now discuss in more detail the different mixing operations that a
+   mixer can perform and how they can affect RTP and RTCP behavior.
+
+3.6.1.  Media-Mixing Mixer
+
+   The Media-Mixing Mixer is likely the one that most think of when they
+   hear the term "mixer".  Its basic mode of operation is that it
+   receives RTP streams from several endpoints and selects the stream(s)
+   to be included in a media-domain mix.  The selection can be through
+   static configuration or by dynamic, content-dependent means such as
+   voice activation.  The mixer then creates a single outgoing RTP
+   stream from this mix.
+
+   The most commonly deployed Media-Mixing Mixer is probably the audio
+   mixer, used in voice conferencing, where the output consists of a
+   mixture of all the input audio signals; this needs minimal signaling
+   to be successfully set up.  From a signal processing viewpoint, audio
+   mixing is relatively straightforward and commonly possible for a
+   reasonable number of endpoints.  Assume, for example, that one wants
+   to mix N streams from N different endpoints.  The mixer needs to
+   decode those N streams, typically into the sample domain, and then
+   produce N or N+1 mixes.  Different mixes are needed so that each
+   endpoint gets a mix of all sources except its own, as including its
+   own source would result in an echo.  When N is lower than the number
+   of all endpoints, one may produce a mix of all N streams for the
+   group of endpoints that are currently not included in the mix; thus,
+   N+1 mixes.  These audio streams are then encoded again, RTP
+   packetized, and sent out.  In
+   many cases, audio level normalization, noise suppression, and similar
+   signal processing steps are also required or desirable before the
+   actual mixing process commences.
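+
+   The N and N+1 mixes described above can be sketched in Python as
+   follows.  Each of the N active endpoints receives the sum of all
+   active streams minus its own, and all remaining endpoints share one
+   additional mix of all N streams.  The sketch assumes already
+   decoded, equal-length frames of integer samples and omits
+   normalization and clipping; the function name is hypothetical.
+
```python
def mix_minus(active, all_endpoints):
    """Return endpoint -> mixed samples.

    `active` maps each of the N mixed endpoints to its decoded samples.
    """
    # Sample-wise sum over all N active streams.
    total = [sum(column) for column in zip(*active.values())]
    mixes = {}
    for endpoint in all_endpoints:
        if endpoint in active:
            # Remove the endpoint's own contribution to avoid an echo.
            own = active[endpoint]
            mixes[endpoint] = [t - s for t, s in zip(total, own)]
        else:
            # The shared "+1" mix for endpoints that are not in the mix.
            mixes[endpoint] = list(total)
    return mixes


active = {"A": [1, 2], "B": [10, 20], "C": [100, 200]}
mixes = mix_minus(active, ["A", "B", "C", "D"])
```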
+
+   In video, the term "mixing" has a different interpretation than in
+   audio.  It is commonly used to refer to the process of spatially
+   combining contributed video streams, which is also known as "tiling".
+   The reconstructed, appropriately scaled down videos can be spatially
+   arranged in a set of tiles, with each tile containing the video from
+   an endpoint (typically showing a human participant).  Tiles can be of
+   different sizes so that, for example, a particularly important
+   participant, or the loudest speaker, is being shown in a larger tile
+   than other participants.  A self-view picture can be included in the
+   tiling, which can be either locally produced or fed back from a
+   mixer-received and reconstructed video image.  Such remote loopback
+   allows for confidence monitoring, i.e., it enables the participant to
+   see himself/herself in the same quality as other participants see
+   him/her.  The tiling normally operates on reconstructed video in the
+   sample domain.  The tiled image is encoded, packetized, and sent by
+   the mixer to the receiving endpoints.  It is possible that a
+   middlebox with media mixing duties contains only a single mixer of
+   the aforementioned type, in which case all participants necessarily
+   see the same tiled video, even if it is being sent over different RTP
+   streams.  More common, however, are mixing arrangements where an
+   individual mixer is available for each outgoing port of the
+   middlebox, allowing individual compositions for each receiving
+   endpoint (a feature commonly referred to as personalized layout).
+
+   One problem with media mixing is that it consumes large amounts of
+   both media processing resources (for the decoding and mixing process
+   in the uncompressed domain) and encoding resources (for the encoding
+   of the mixed signal).  Another problem is the quality degradation
+   created by decoding and re-encoding the media, which is the result of
+   the lossy nature of the most commonly used media codecs.  A third
+   problem is the latency introduced by the media mixing, which can be
+   substantial and annoyingly noticeable in case of video, or in case of
+   audio if that mixed audio is lip-synchronized with high-latency
+   video.  The advantage of media mixing is that it is straightforward
+   for the endpoints to handle the single media stream (which includes
+   the mixed aggregate of many sources), as they don't need to handle
+   multiple decodings, local mixing, and composition.  In fact, mixers
+   were introduced in pre-RTP times so that legacy, single stream
+   receiving endpoints (that, in some protocol environments, actually
+   didn't need to be aware of the multipoint nature of the conference)
+   could successfully participate in what a user would recognize as a
+   multiparty video conference.
+
+           +-A---------+          +-MIXER----------------------+
+           | +-RTP1----|          |-RTP1------+        +-----+ |
+           | | +-Audio-|          |-Audio---+ | +---+  |     | |
+           | | |    AA1|--------->|---------+-+-|DEC|->|     | |
+           | | |       |<---------|MA1 <----+ | +---+  |     | |
+           | | |       |          |(BA1+CA1)|\| +---+  |     | |
+           | | +-------|          |---------+ +-|ENC|<-| B+C | |
+           | +---------|          |-----------+ +---+  |     | |
+           +-----------+          |                    |     | |
+                                  |                    |  M  | |
+           +-B---------+          |                    |  E  | |
+           | +-RTP2----|          |-RTP2------+        |  D  | |
+           | | +-Audio-|          |-Audio---+ | +---+  |  I  | |
+           | | |    BA1|--------->|---------+-+-|DEC|->|  A  | |
+           | | |       |<---------|MA2 <----+ | +---+  |     | |
+           | | +-------|          |(AA1+CA1)|\| +---+  |     | |
+           | +---------|          |---------+ +-|ENC|<-| A+C | |
+           +-----------+          |-----------+ +---+  |     | |
+                                  |                    |  M  | |
+           +-C---------+          |                    |  I  | |
+           | +-RTP3----|          |-RTP3------+        |  X  | |
+           | | +-Audio-|          |-Audio---+ | +---+  |  E  | |
+           | | |    CA1|--------->|---------+-+-|DEC|->|  R  | |
+           | | |       |<---------|MA3 <----+ | +---+  |     | |
+           | | +-------|          |(AA1+BA1)|\| +---+  |     | |
+           | +---------|          |---------+ +-|ENC|<-| A+B | |
+           +-----------+          |-----------+ +---+  +-----+ |
+                                  +----------------------------+
+
+            Figure 15: Session and SSRC Details for Media Mixer
+
+   From an RTP perspective, media mixing can be a very simple process,
+   as can be seen in Figure 15.  The mixer presents one SSRC towards the
+   receiving endpoint, e.g., MA1 to Peer A, where the associated stream
+   is the media mix of the other endpoints.  As each peer, in this
+   example, receives a different version of a mix from the mixer, there
+   is no actual relation between the different RTP sessions in terms of
+   actual media or transport-level information.  There are, however,
+   common relationships between RTP1-RTP3, namely SSRC space and
+   identity information.  When A receives the MA1 stream, which is a
+   combination of BA1 and CA1 streams, the mixer may include CSRC
+   information in the MA1 stream to identify the Contributing Sources
+   BA1 and CA1, allowing the receiver to identify the Contributing
+   Sources even if this were not possible through the media itself or
+   through other signaling means.
+
+   The CSRC has, in turn, utility in RTP extensions, like the RTP header
+   extension for Mixer-to-Client Audio Level Indication [RFC6465].  If
+
+
+
+Westerlund & Wenger           Informational                    [Page 26]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+   the SSRCs from the endpoint to mixer paths are used as CSRCs in
+   another RTP session, then RTP1, RTP2, and RTP3 become one joint
+   session as they have a common SSRC space.  At this stage, the mixer
+   also needs to consider which RTCP information it needs to expose in
+   the different paths.  In the above scenario, a mixer would normally
+   expose nothing more than the SDES information and RTCP BYE for a CSRC
+   leaving the session.  The main goal would be to enable the correct
+   binding against the application logic and other information sources.
+   This also enables loop detection in the RTP session.
+
+3.6.2.  Media-Switching Mixer
+
+   Media-Switching Mixers are used in limited functionality scenarios
+   where no, or only very limited, concurrent presentation of multiple
+   sources is required by the application and also in more complex
+   multi-stream usages with receiver mixing or tiling, possibly
+   combined with simulcast and/or scalability between source and mixer.
+   An RTP mixer based on media switching avoids the media decoding and
+   encoding operations in the mixer, as it conceptually forwards the
+   encoded media stream as it was being sent to the mixer.  It does not
+   avoid, however, the decryption and re-encryption cycle as it rewrites
+   RTP headers.  Forwarding media (in contrast to reconstructing-mixing-
+   encoding media) reduces the amount of computational resources needed
+   in the mixer and increases the media quality (both in terms of
+   fidelity and reduced latency).
+
+   A Media-Switching Mixer maintains a pool of SSRCs representing
+   conceptual or functional RTP streams that the mixer can produce.
+   These RTP streams are created by selecting media from one of the RTP
+   streams received by the mixer and forwarded to the peer using the
+   mixer's own SSRCs.  The mixer can switch between available sources if
+   that is required by the concept for the source, like the currently
+   active speaker.  Note that the mixer, in most cases, still needs to
+   perform a certain amount of media processing, as many media formats
+   do not allow a receiver to "tune into" the stream at arbitrary
+   points in their bitstream.
+
+   To achieve a coherent RTP stream from the mixer's SSRC, the mixer
+   needs to rewrite the incoming RTP packet's header.  First, the SSRC
+   field must be set to the value of the mixer's SSRC.  Second, the
+   sequence number must be the next in the mixer's outgoing sequence.
+   Third, the RTP timestamp value needs to be adjusted using an offset
+   that changes each time the mixer switches the Media Source.
+   Finally, depending on the negotiation of the RTP payload type, the
+   value representing this particular RTP payload configuration may have
+   to be changed if the different endpoint-to-mixer paths have not
+   arrived at the same numbering for a given configuration.  This also
+
+
+
+
+Westerlund & Wenger           Informational                    [Page 27]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+   requires that the different endpoints support a common set of codecs,
+   otherwise media transcoding for codec compatibility would still be
+   required.
+
+   We now consider the operation of a Media-Switching Mixer that
+   supports a video conference with six participating endpoints (A-F)
+   where the two most recent speakers in the conference are shown to
+   each receiving endpoint.  Thus, the mixer has two SSRCs sending video
+   to each peer, and each peer is capable of locally handling two video
+   streams simultaneously.
+
+         +-A---------+             +-MIXER----------------------+
+         | +-RTP1----|             |-RTP1------+        +-----+ |
+         | | +-Video-|             |-Video---+ |        |     | |
+         | | |    AV1|------------>|---------+-+------->|  S  | |
+         | | |       |<------------|MV1 <----+-+-BV1----|  W  | |
+         | | |       |<------------|MV2 <----+-+-EV1----|  I  | |
+         | | +-------|             |---------+ |        |  T  | |
+         | +---------|             |-----------+        |  C  | |
+         +-----------+             |                    |  H  | |
+                                   |                    |     | |
+         +-B---------+             |                    |  M  | |
+         | +-RTP2----|             |-RTP2------+        |  A  | |
+         | | +-Video-|             |-Video---+ |        |  T  | |
+         | | |    BV1|------------>|---------+-+------->|  R  | |
+         | | |       |<------------|MV3 <----+-+-AV1----|  I  | |
+         | | |       |<------------|MV4 <----+-+-EV1----|  X  | |
+         | | +-------|             |---------+ |        |     | |
+         | +---------|             |-----------+        |     | |
+         +-----------+             |                    |     | |
+                                   :                    :     : :
+                                   :                    :     : :
+         +-F---------+             |                    |     | |
+         | +-RTP6----|             |-RTP6------+        |     | |
+         | | +-Video-|             |-Video---+ |        |     | |
+         | | |    FV1|------------>|---------+-+------->|     | |
+         | | |       |<------------|MV11 <---+-+-AV1----|     | |
+         | | |       |<------------|MV12 <---+-+-EV1----|     | |
+         | | +-------|             |---------+ |        |     | |
+         | +---------|             |-----------+        +-----+ |
+         +-----------+             +----------------------------+
+
+
+                   Figure 16: Media-Switching RTP Mixer
+
+
+
+
+
+
+
+Westerlund & Wenger           Informational                    [Page 28]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+   The Media-Switching Mixer can, similarly to the Media-Mixing Mixer,
+   reduce the bitrate required for media transmission towards the
+   different peers by selecting and forwarding only a subset of RTP
+   streams it receives from the sending endpoints.  In case the mixer
+   receives simulcast transmissions or a scalable encoding of the Media
+   Source, the mixer has more degrees of freedom to select streams or
+   subsets of streams to forward to a receiving endpoint, both based on
+   transport or endpoint restrictions as well as application logic.
+
+   To ensure that a media receiver in an endpoint can correctly decode
+   the media in the RTP stream after a switch, a codec that uses
+   temporal prediction needs to start its decoding from independent
+   refresh points, or points in the bitstream offering similar
+   functionality (like "dirty refresh points").  For some codecs, for
+   example, frame-based speech and audio codecs, this is easily achieved
+   by starting the decoding at RTP packet boundaries, as each packet
+   boundary provides a refresh point (assuming proper packetization on
+   the encoder side).  For other codecs, particularly in video, refresh
+   points are less common in the bitstream or may not be present at all
+   without an explicit request to the respective encoder.  The Full
+   Intra Request [RFC5104] RTCP codec control message has been defined
+   for this purpose.
+
+   In this type of mixer, one could consider fully terminating the RTP
+   sessions between the different endpoint and mixer paths.  The
+   arguments and considerations discussed in Section 3.9 also apply
+   here.
+
+3.7.  Selective Forwarding Middlebox
+
+   Another method for handling media in the RTP mixer is to "project",
+   or make available, all potential RTP sources (SSRCs) into a per-
+   endpoint, independent RTP session.  The middlebox can select which of
+   the potential sources that are currently actively transmitting media
+   will be sent to each of the endpoints.  This is similar to the Media-
+   Switching Mixer but has some important differences in RTP details.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Westerlund & Wenger           Informational                    [Page 29]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+          +-A---------+             +-Middlebox-----------------+
+          | +-RTP1----|             |-RTP1------+       +-----+ |
+          | | +-Video-|             |-Video---+ |       |     | |
+          | | |    AV1|------------>|---------+-+------>|     | |
+          | | |       |<------------|BV1 <----+-+-------|  S  | |
+          | | |       |<------------|CV1 <----+-+-------|  W  | |
+          | | |       |<------------|DV1 <----+-+-------|  I  | |
+          | | |       |<------------|EV1 <----+-+-------|  T  | |
+          | | |       |<------------|FV1 <----+-+-------|  C  | |
+          | | +-------|             |---------+ |       |  H  | |
+          | +---------|             |-----------+       |     | |
+          +-----------+             |                   |  M  | |
+                                    |                   |  A  | |
+          +-B---------+             |                   |  T  | |
+          | +-RTP2----|             |-RTP2------+       |  R  | |
+          | | +-Video-|             |-Video---+ |       |  I  | |
+          | | |    BV1|------------>|---------+-+------>|  X  | |
+          | | |       |<------------|AV1 <----+-+-------|     | |
+          | | |       |<------------|CV1 <----+-+-------|     | |
+          | | |       | :    :    : |: :  : : : : :  : :|     | |
+          | | |       |<------------|FV1 <----+-+-------|     | |
+          | | +-------|             |---------+ |       |     | |
+          | +---------|             |-----------+       |     | |
+          +-----------+             |                   |     | |
+                                    :                   :     : :
+                                    :                   :     : :
+          +-F---------+             |                   |     | |
+          | +-RTP6----|             |-RTP6------+       |     | |
+          | | +-Video-|             |-Video---+ |       |     | |
+          | | |    FV1|------------>|---------+-+------>|     | |
+          | | |       |<------------|AV1 <----+-+-------|     | |
+          | | |       | :    :    : |: :  : : : : :  : :|     | |
+          | | |       |<------------|EV1 <----+-+-------|     | |
+          | | +-------|             |---------+ |       |     | |
+          | +---------|             |-----------+       +-----+ |
+          +-----------+             +---------------------------+
+
+                 Figure 17: Selective Forwarding Middlebox
+
+   In the six endpoint conference depicted above (in Figure 17), one can
+   see that endpoint A is aware of five incoming SSRCs, BV1-FV1.  If
+   this middlebox intends to have a similar behavior as in Section 3.6.2
+   where the mixer provides the endpoints with the two latest speaking
+   endpoints, then only two out of these five SSRCs need to transmit
+   media to A concurrently.  As the middlebox selects the source in the
+   different RTP sessions that transmit media to the endpoints, each RTP
+   stream requires the rewriting of certain RTP header fields when being
+   projected from one session into another.  In particular, the sequence
+
+
+
+Westerlund & Wenger           Informational                    [Page 30]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+   number needs to be consecutively incremented based on the packet
+   actually being transmitted in each RTP session.  Therefore, the RTP
+   sequence number offset will change each time a source is turned on in
+   an RTP session.  The timestamp (possibly offset) stays the same.
+
+   The RTP sessions can be considered independent, which means that the
+   SSRC numbers used can also be handled independently.  This simplifies
+   the SSRC collision detection and avoidance but requires tools such as
+   remapping tables between the RTP sessions.  Using independent RTP
+   sessions is not required, as it is possible for the switching
+   behavior to also perform with a common SSRC space.  However, in this
+   case, collision detection and handling becomes a different problem.
+   It is up to the implementation to use a single common SSRC space or
+   separate ones.
+
+   Using separate SSRC spaces has some implications.  For example, the
+   RTP stream that is being sent by endpoint B to the middlebox (BV1)
+   may use an SSRC value of 12345678.  When that RTP stream is sent to
+   endpoint F by the middlebox, it can use any SSRC value, e.g.,
+   87654321.  As a result, each endpoint may have a different view of
+   the application usage of a particular SSRC.  Any RTP-level identity
+   information, such as SDES items, also needs to update the SSRC
+   referenced, if the included SDES items are intended to be global.
+   Thus, the application must not use SSRCs as references to RTP streams
+   when communicating with other peers directly.  This also affects loop
+   detection, which will fail to work as there is no common namespace
+   and identities across the different legs in the Communication Session
+   on the RTP level.  Instead, this responsibility falls onto higher
+   layers.
+
+   The middlebox is also responsible for receiving any RTCP codec
+   control requests coming from an endpoint and deciding if it can act
+   on the request locally or needs to translate the request into the RTP
+   session/transport leg that contains the Media Source.  Both endpoints
+   and the middlebox need to implement conference-related codec control
+   functionalities to provide a good experience.  Commonly used are Full
+   Intra Request to request from the Media Source that switching points
+   be provided between the sources and Temporary Maximum Media Bitrate
+   Request (TMMBR) to enable the middlebox to aggregate congestion
+   control responses towards the Media Source so that it can adjust
+   its bitrate (obviously, only in case the limitation is not in the
+   source to middlebox link).
+
+   The Selective Forwarding Middlebox has been introduced in recently
+   developed videoconferencing systems in conjunction with, and to
+   capitalize on, scalable video coding as well as simulcasting.  An
+   example of scalable video coding is Annex G of H.264, but other
+   codecs, including H.264 AVC and VP8, also exhibit scalability, albeit
+
+
+
+Westerlund & Wenger           Informational                    [Page 31]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+   only in the temporal dimension.  In both scalable coding and
+   simulcast cases, the video signal is represented by a set of two or
+   more bitstreams, providing a corresponding number of distinct
+   fidelity points.  The middlebox selects which parts of a scalable
+   bitstream (or which bitstream, in the case of simulcasting) to
+   forward to each of the receiving endpoints.  The decision may be
+   driven by a number of factors, such as available bitrate, desired
+   layout, etc.  Contrary to transcoding MCUs, SFMs have extremely low
+   delay and provide features that are typically associated with high-
+   end systems (personalized layout, error localization) without any
+   signal processing at the middlebox.  They are also capable of scaling
+   to a large number of concurrent users, and--due to their very low
+   delay--can also be cascaded.
+
+   This version of the middlebox also puts different requirements on the
+   endpoint when it comes to decoder instances and handling of the RTP
+   streams providing media.  As each projected SSRC can, at any time,
+   provide media, the endpoint either needs to be able to handle as many
+   decoder instances as the middlebox received, or have efficient
+   switching of decoder contexts in a more limited set of actual decoder
+   instances to cope with the switches.  The application also gets more
+   responsibility to update how the media provided is to be presented to
+   the user.
+
+   Note that this topology could potentially be seen as a Media
+   Translator that includes an on/off logic as part of its media
+   translation.  The topology has the property that all SSRCs present in
+   the session are visible to an endpoint.  It also has mixer aspects,
+   as the streams it provides are not simply translated versions;
+   instead, they have a conceptual property assigned to them and can be
+   both turned on/off as well as fully or partially delivered.  Thus,
+   this topology appears to be some hybrid between the translator and
+   mixer model.
+
+   The differences between a Selective Forwarding Middlebox and a
+   Media-Switching Mixer (Section 3.6.2) are minor, and they share most
+   properties.  The above requirement of having a large number of
+   decoding instances, or efficient switching of decoder contexts,
+   is one point of difference.  The other is how the
+   identification is performed, where the mixer uses CSRC to provide
+   information on what is included in a particular RTP stream that
+   represents a particular concept.  Selective forwarding gets the
+   source information through the SSRC and instead uses other mechanisms
+   to indicate the stream's intended usage, if needed.
+
+
+
+
+
+
+
+Westerlund & Wenger           Informational                    [Page 32]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+3.8.  Point to Multipoint Using Video-Switching MCUs
+
+   Shortcut name: Topo-Video-switch-MCU
+
+                   +---+      +------------+      +---+
+                   | A |------| Multipoint |------| B |
+                   +---+      |  Control   |      +---+
+                              |   Unit     |
+                   +---+      |   (MCU)    |      +---+
+                   | C |------|            |------| D |
+                   +---+      +------------+      +---+
+
+        Figure 18: Point to Multipoint Using a Video-Switching MCU
+
+   This PtM topology was popular in early implementations of multipoint
+   videoconferencing systems due to its simplicity, and the
+   corresponding middlebox design has been known as a "video-switching
+   MCU".  The more complex RTCP-terminating MCUs, discussed in the next
+   section, became the norm, however, when technology allowed
+   implementations at acceptable costs.
+
+   A video-switching MCU forwards to a participant a single media
+   stream, selected from the available streams.  The criteria for
+   selection are often based on voice activity in the audio-visual
+   conference, but other conference management mechanisms (like
+   presentation mode or explicit floor control) are known to exist as
+   well.
+
+   The video-switching MCU may also perform media translation to modify
+   the content in bitrate, encoding, or resolution.  However, it still
+   may indicate the original sender of the content through the SSRC.  In
+   this case, the values of the CC and CSRC fields are retained.
+
+   If the MCU does not terminate RTP, the RTCP sender reports are
+   forwarded for the currently selected sender.  All RTCP receiver
+   reports are freely forwarded between the endpoints.  In addition, the
+   MCU may also originate RTCP control traffic in order to control the
+   session and/or report on status from its viewpoint.
+
+   The video-switching MCU has most of the attributes of a translator.
+   However, its stream selection is a mixing behavior.  This behavior
+   has some RTP and RTCP issues associated with it.  The suppression of
+   all but one RTP stream results in most participants seeing only a
+   subset of the sent RTP streams at any given time, often a single RTP
+   stream per conference.  Therefore, RTCP receiver reports only report
+   on these RTP streams.  Consequently, the endpoints emitting RTP
+   streams that are not currently forwarded receive a view of the
+   session that indicates their RTP streams disappear somewhere en
+
+
+
+Westerlund & Wenger           Informational                    [Page 33]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+   route.  This makes the use of RTCP for congestion control, or any
+   type of quality reporting, very problematic.
+
+   To avoid the aforementioned issues, the MCU needs to implement two
+   features.  First, it needs to act as a mixer (see Section 3.6) and
+   forward the selected RTP stream under its own SSRC and with the
+   appropriate CSRC values.  Second, the MCU needs to modify the RTCP
+   RRs it forwards between the domains.  As a result, it is recommended
+   that one implement a centralized video-switching conference using a
+   mixer according to RFC 3550, instead of the shortcut implementation
+   described here.
+
+3.9.  Point to Multipoint Using RTCP-Terminating MCU
+
+   Shortcut name: Topo-RTCP-terminating-MCU
+
+                   +---+      +------------+      +---+
+                   | A |<---->| Multipoint |<---->| B |
+                   +---+      |  Control   |      +---+
+                              |   Unit     |
+                   +---+      |   (MCU)    |      +---+
+                   | C |<---->|            |<---->| D |
+                   +---+      +------------+      +---+
+
+        Figure 19: Point to Multipoint Using an RTCP-Terminating MCU
+
+   In this PtM scenario, each endpoint runs an RTP point-to-point
+   session between itself and the MCU.  This is a very commonly deployed
+   topology in multipoint video conferencing.  The content that the MCU
+   provides to each participant is either:
+
+   a.  a selection of the content received from the other endpoints or
+
+   b.  the mixed aggregate of what the MCU receives from the other PtP
+       paths, which are part of the same Communication Session.
+
+   In case (a), the MCU may modify the content in terms of bitrate,
+   encoding format, or resolution.  No explicit RTP mechanism is used to
+   establish the relationship between the original RTP stream of the
+   media being sent and the RTP stream the MCU sends.  In other words,
+   the outgoing RTP streams typically use a different SSRC, and may well
+   use a different payload type (PT), even if this different PT happens
+   to be mapped to the same media type.  This is a result of the
+   individually negotiated RTP session for each endpoint.
+
+   In case (b), the MCU is the Media Source and generates the Source RTP
+   Stream as it mixes the received content and then encodes and
+   packetizes it for transmission to an endpoint.  According to RTP
+
+
+
+Westerlund & Wenger           Informational                    [Page 34]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+   [RFC3550], the SSRCs of the contributors are to be signaled using the
+   CSRC/CC mechanism.  In practice, today, most deployed MCUs do not
+   implement this feature.  Instead, the identification of the endpoints
+   whose content is included in the mixer's output is not indicated
+   through any explicit RTP mechanism.  That is, most deployed MCUs set
+   the CC field in the RTP header to zero, thereby indicating no
+   available CSRC information, even if they could identify the original
+   sending endpoints as suggested in RTP.
+
+   The main feature that sets this topology apart from what RFC 3550
+   describes is the breaking of the common RTP session across the
+   centralized device, such as the MCU.  This results in the loss of
+   explicit RTP-level indication of all participants.  If one were using
+   the mechanisms available in RTP and RTCP to signal this explicitly,
+   the topology would follow the approach of an RTP mixer.  The lack of
+   explicit indication has at least the following potential problems:
+
+   1.  Loop detection cannot be performed on the RTP level.  When
+       carelessly connecting two misconfigured MCUs, a loop could be
+       generated.
+
+   2.  There is no information about active media senders available in
+       the RTP packet.  As this information is missing, receivers cannot
+       use it.  It also deprives the client of information related to
+       currently active senders in a machine-usable way, thus preventing
+       clients from indicating currently active speakers in user
+       interfaces, etc.
+
+   Note that many/most deployed MCUs (and video conferencing endpoints)
+   rely on signaling-layer mechanisms for the identification of the
+   Contributing Sources, for example, a SIP conferencing package
+   [RFC4575].  This alleviates, to some extent, the aforementioned
+   issues resulting from ignoring RTP's CSRC mechanism.
+
+3.10.  Split Component Terminal
+
+   Shortcut name: Topo-Split-Terminal
+
+   In some applications, for example, in some telepresence systems,
+   terminals may not be integrated into a single functional unit but
+   composed of more than one subunit.  For example, a telepresence room
+   terminal employing multiple cameras and monitors may consist of
+   multiple video conferencing subunits, each capable of handling a
+   single camera and monitor.  Another example would be a video
+   conferencing terminal in which audio is handled by one subunit, and
+   video by another.  Each of these subunits uses its own physical
+   network interface (for example: Ethernet jack) and network address.
+
+
+
+
+Westerlund & Wenger           Informational                    [Page 35]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+   The various (media processing) subunits need (logically and
+   physically) to be interconnected by control functionality, but their
+   media plane functionality may be split.  These types of terminals are
+   referred to as split component terminals.  Historically, the earliest
+   split component terminals were perhaps the independent audio and
+   video conference software tools used over the MBONE in the late
+   1990s.
+
+   An example for such a split component terminal is depicted in
+   Figure 20.  Within split component terminal A, at least audio and
+   video subunits are addressed by their own network addresses.  In some
+   of these systems, the control stack subunit may also have its own
+   network address.
+
+   From an RTP viewpoint, each of the subunits terminates RTP and acts
+   as an endpoint in the sense that each subunit includes its own,
+   independent RTP stack.  However, as the subunits are semantically
+   part of the same terminal, it is appropriate that this semantic
+   relationship is expressed in RTCP protocol elements, namely in the
+   CNAME.
+
+               +---------------------+
+               | Endpoint A          |
+               | Local Area Network  |
+               |      +------------+ |
+               |   +->| Audio      |<+-RTP---\
+               |   |  +------------+ |        \    +------+
+               |   |  +------------+ |         +-->|      |
+               |   +->| Video      |<+-RTP-------->|  B   |
+               |   |  +------------+ |         +-->|      |
+               |   |  +------------+ |        /    +------+
+               |   +->| Control    |<+-SIP---/
+               |      +------------+ |
+               +---------------------+
+
+                    Figure 20: Split Component Terminal
+
+   It is further sensible that the subunits share a common clock from
+   which RTP and RTCP clocks are derived, to facilitate synchronization
+   and avoid clock drift.
+
+   To indicate that audio and video Source Streams generated by
+   different subunits share a common clock, and can be synchronized, the
+   RTP streams generated from those Source Streams need to include the
+   same CNAME in their RTCP SDES packets.  The use of a common CNAME for
+   RTP flows carried in different transport-layer flows is entirely
+   normal for RTP and RTCP senders, and fully compliant RTP endpoints,
+   middleboxes, and other tools should have no problem with this.
+
+
+
+Westerlund & Wenger           Informational                    [Page 36]
+
+RFC 7667                     RTP Topologies                November 2015
+
+
+   However, outside of the split component terminal scenario (and
+   perhaps a multihomed endpoint scenario, which is not further
+   discussed herein), the use of a common CNAME in RTP streams sent from
+   separate endpoints (as opposed to a common CNAME for RTP streams sent
+   on different transport-layer flows between two endpoints) is rare.
+   It has been reported that at least some third-party tools like some
+   network monitors do not gracefully handle endpoints that use a common
+   CNAME across multiple transport-layer flows: they report an error
+   condition in which two separate endpoints are using the same CNAME.
+   Depending on the sophistication of the support staff, such erroneous
+   reports can lead to support issues.
+
+   The aforementioned support issue can sometimes be avoided if each of
+   the subunits of a split component terminal is configured to use a
+   different CNAME, with the synchronization between the RTP streams
+   being indicated by some non-RTP signaling channel rather than using a
+   common CNAME sent in RTCP.  This complicates the signaling,
+   especially in cases where there are multiple SSRCs in use with
+   complex synchronization requirements, as is the case in many current
+   telepresence systems.  Unless one uses RTCP terminating topologies
+   such as Topo-RTCP-terminating-MCU, sessions involving more than one
+   video subunit with a common CNAME are close to unavoidable.
+
+   The different RTP streams comprising a split terminal system can form
+   a single RTP session or they can form multiple RTP sessions,
+   depending on the visibility of their SSRC values in RTCP reports.  If
+   the receiver of the RTP streams sent by the split terminal sends
+   reports relating to all of the RTP flows (i.e., to each SSRC) in each
+   RTCP report, then a single RTP session is formed.  Alternatively, if
+   the receiver of the RTP streams sent by the split terminal does not
+   send cross-reports in RTCP, then the audio and video form separate
+   RTP sessions.
+
+   For example, in Figure 20, B will send RTCP reports to each of the
+   subunits of A.  If the RTCP packets that B sends to the audio subunit
+   of A include reports on the reception quality of the video as well as
+   the audio, and similarly if the RTCP packets that B sends to the
+   video subunit of A include reports on the reception quality of the
+   audio as well as video, then a single RTP session is formed.
+   However, if the RTCP packets B sends to the audio subunit of A only
+   report on the received audio, and the RTCP packets B sends to the
+   video subunit of A only report on the received video, then there are
+   two separate RTP sessions.
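+
+   The distinction drawn in this example can be sketched in code.  The
+   Python fragment below is illustrative only; the function name and
+   the data layout (a mapping from the destination subunit to the set
+   of SSRCs reported on) are assumptions of this sketch, not anything
+   defined by RTP.
+
```python
def forms_single_session(reports_by_destination, all_ssrcs):
    """Return True if every RTCP report covers every SSRC (cross-reporting).

    reports_by_destination maps a destination subunit name to the set of
    SSRCs the receiver reports on in the RTCP it sends to that subunit.
    Both the function name and the data layout are hypothetical.
    """
    required = set(all_ssrcs)
    return all(required <= set(reported)
               for reported in reports_by_destination.values())


# Hypothetical SSRC values for the audio and video subunits of A.
AUDIO_SSRC, VIDEO_SSRC = 0x1111, 0x2222

# B reports on both streams towards each subunit of A.
cross = {"audio": {AUDIO_SSRC, VIDEO_SSRC}, "video": {AUDIO_SSRC, VIDEO_SSRC}}
# B reports only on the stream that each subunit itself sent.
separate = {"audio": {AUDIO_SSRC}, "video": {VIDEO_SSRC}}

print(forms_single_session(cross, [AUDIO_SSRC, VIDEO_SSRC]))
print(forms_single_session(separate, [AUDIO_SSRC, VIDEO_SSRC]))
```
+
+   With cross-reports the first call yields True (one RTP session);
+   without them the second call yields False (separate RTP sessions).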
+
+   Forming a single RTP session across the RTP streams sent by the
+   different subunits of a split terminal gives each subunit visibility
+   into reception quality of RTP streams sent by the other subunits.
+   This information can help diagnose reception quality problems, but at
+   the cost of increased RTCP bandwidth use.
+
+   RTP streams sent by the subunits of a split terminal need to use the
+   same CNAME in their RTCP packets if they are to be synchronized,
+   irrespective of whether a single RTP session is formed or not.
+
+3.11.  Non-symmetric Mixer/Translators
+
+   Shortcut name: Topo-Asymmetric
+
+   It is theoretically possible to construct an MCU that is a mixer in
+   one direction and a translator in another.  The main reason to
+   consider this would be to allow topologies similar to Figure 13,
+   where the mixer does not need to mix in the direction from B or D
+   towards the multicast domains with A and C.  Instead, the RTP streams
+   from B and D are forwarded without changes.  Avoiding this mixing
+   would save media processing resources that perform the mixing in
+   cases where it isn't needed.  However, there would still be a need to
+   mix B's media towards D.  Only in the direction B -> multicast domain
+   or D -> multicast domain would it be possible to work as a
+   translator.  In all other directions, it would function as a mixer.
+
+   The mixer/translator would still need to process and change the RTCP
+   before forwarding it in the directions of B or D to the multicast
+   domain.  One issue is that A and C do not know about the mixed-media
+   stream the mixer sends to either B or D.  Therefore, any reports
+   related to these streams must be removed.  Also, receiver reports
+   related to A's and C's RTP streams would be missing.  To avoid A and
+   C thinking that B and D aren't receiving A and C at all, the mixer
+   needs to insert locally generated reports reflecting the situation
+   for the streams from A and C into B's and D's sender reports.  In the
+   opposite direction, the receiver reports from A and C about B's and
+   D's streams also need to be aggregated into the mixer's receiver
+   reports sent to B and D.  Since B and D only have the mixer as source
+   for the stream, all RTCP from A and C must be suppressed by the
+   mixer.
+
+   This topology is so problematic, and it is so easy to get the RTCP
+   processing wrong, that it is not recommended for implementation.
+
+3.12.  Combining Topologies
+
+   Topologies can be combined and linked to each other using mixers or
+   translators.  However, care must be taken in handling the SSRC/CSRC
+   space.  A mixer does not forward RTCP from sources in other domains,
+   but instead generates its own RTCP packets for each domain it mixes
+   into, including the necessary SDES information for both the CSRCs and
+   the SSRCs.  Thus, in a mixed domain, the only SSRCs seen will be the
+   ones present in the domain, while there can be CSRCs from all the
+   domains connected together with a combination of mixers and
+   translators.  The combined SSRC and CSRC space is common over any
+   translator or mixer.  This is important to facilitate loop detection,
+   something that is likely to be even more important in combined
+   topologies due to the mixed behavior between the domains.  Any
+   hybrid, like the Topo-Video-switch-MCU or Topo-Asymmetric, requires
+   considerable thought on how RTCP is dealt with.
+
+4.  Topology Properties
+
+   The topologies discussed in Section 3 have different properties.
+   This section describes these properties.  Note that, even if a
+   certain property is supported within a particular topology concept,
+   the necessary functionality may be optional to implement.
+
+4.1.  All-to-All Media Transmission
+
+   To recapitulate, multicast, and in particular ASM, provides the
+   functionality that everyone may send to, or receive from, everyone
+   else within the session.  SSM can provide a similar functionality by
+   having anyone intending to participate as a sender send its media
+   to the SSM Distribution Source.  The SSM Distribution Source forwards
+   the media to all receivers subscribed to the multicast group.  Mesh,
+   MCUs, mixers, Selective Forwarding Middleboxes (SFMs), and
+   translators may all provide that functionality at least on some basic
+   level.  However, there are some differences in which type of
+   reachability they provide.
+
+   The topologies that come closest to emulating Any-Source IP
+   Multicast, with all-to-all transmission capabilities, are the
+   Transport Translator function called "relay" in Section 3.5, as well
+   as the Mesh with joint RTP sessions (Section 3.4).  Media
+   Translators, Mesh with independent RTP Sessions, mixers, SFMs, and
+   the MCU variants do not provide a fully meshed forwarding on the
+   transport level; instead, they only allow limited forwarding of
+   content from the other session participants.
+
+   The "all-to-all media transmission" requires that any media
+   transmitting endpoint considers the path to the least-capable
+   receiving endpoint.  Otherwise, the media transmissions may overload
+   that path.  Therefore, a sending endpoint needs to monitor the path
+   from itself to any of the receiving endpoints, to detect the
+   currently least-capable receiver and adapt its sending rate
+   accordingly.  As multiple endpoints may send simultaneously, the
+   available resources may vary.  RTCP's receiver reports help perform
+   this monitoring, at least on a medium time scale.
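+
+   The adaptation rule described above can be sketched as follows.
+   This is a minimal illustration, assuming the sender already holds a
+   per-receiver capacity estimate (for instance, one it derived from
+   RTCP receiver reports); the function name and the example figures
+   are hypothetical.
+
```python
def least_capable_receiver(path_estimates_bps):
    """Return (receiver, rate) for the path with the lowest estimate.

    path_estimates_bps maps a receiver name to the sender's current
    estimate, in bits per second, of what the path to it can carry.
    How that estimate is derived from RTCP is outside this sketch.
    """
    if not path_estimates_bps:
        raise ValueError("no receiving endpoints to adapt to")
    receiver = min(path_estimates_bps, key=path_estimates_bps.get)
    return receiver, path_estimates_bps[receiver]


# Hypothetical estimates for three receiving endpoints.
estimates = {"B": 2_000_000, "C": 750_000, "D": 1_200_000}
receiver, send_rate_bps = least_capable_receiver(estimates)
print(receiver, send_rate_bps)
```
+
+   The sender would cap its rate at the returned value and repeat the
+   selection whenever new reports change the estimates.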
+
+   The resource consumption for performing all-to-all transmission
+   varies depending on the topology.  Both ASM and SSM have the benefit
+   that only one copy of each packet traverses a particular link.  Using
+   a relay causes the transmission of one copy of a packet per
+   endpoint-to-relay path and packet transmitted.  However, in most
+   cases, the links carrying the multiple copies will be the ones close
+   to the relay (which can be assumed to be part of the network
+   infrastructure with good connectivity to the backbone) rather than
+   the endpoints (which may be behind slower access links).  The Mesh
+   topologies cause N-1 streams of transmitted packets to traverse the
+   first-hop link from the endpoint, in a mesh with N endpoints.  How
+   much of their paths these streams share beyond the first hop is
+   highly situation dependent.
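+
+   The per-topology packet counts given in this paragraph can be
+   worked through with a short sketch.  The function names and the
+   topology labels are assumptions made for illustration; the counts
+   follow the text above (one copy for ASM, SSM, and relay senders,
+   N-1 copies for a mesh sender).
+
```python
def first_hop_streams(topology, n_endpoints):
    """Copies of one media packet on the sending endpoint's first hop."""
    if n_endpoints < 2:
        raise ValueError("need at least two endpoints")
    if topology in ("asm", "ssm", "relay"):
        # Multicast replicates in the network; with SSM or a relay the
        # sender transmits a single unicast copy to the middle node.
        return 1
    if topology == "mesh":
        # One unicast copy towards each of the other N-1 endpoints.
        return n_endpoints - 1
    raise ValueError(f"unknown topology: {topology}")


def relay_fanout(n_endpoints):
    """Copies of one received packet that a relay sends onwards."""
    return n_endpoints - 1


for n in (3, 10):
    print(n, first_hop_streams("mesh", n),
          first_hop_streams("relay", n), relay_fanout(n))
```
+
+   The mesh cost lands on the endpoint's access link, whereas the
+   relay moves the same fan-out to links near the relay.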
+
+   The transmission of RTCP by design adapts to any changes in the
+   number of participants due to the transmission algorithm, defined in
+   the RTP specification [RFC3550], and the extensions in AVPF [RFC4585]
+   (when applicable).  That way, the resources utilized for RTCP stay
+   within the bounds configured for the session.
+
+4.2.  Transport or Media Interoperability
+
+   All translators, mixers, RTCP-terminating MCUs, and Mesh with
+   individual RTP sessions allow the media encoding or the transport to
+   be changed to match the properties of the other domain, thereby
+   providing extended interoperability in cases where the endpoints lack
+   a common set of media codecs and/or transport protocols.  Selective
+   Forwarding Middleboxes can adapt the transport and (at least)
+   selectively forward the encoded streams that match a receiving
+   endpoint's capability.  An additional translator is required to
+   change the media encoding if the encoded streams do not match the
+   receiving endpoint's capabilities.
+
+4.3.  Per-Domain Bitrate Adaptation
+
+   Endpoints are often connected to each other with a heterogeneous set
+   of paths.  This makes congestion control in a Point-to-Multipoint set
+   problematic.  In the ASM, SSM, Mesh with common RTP session, and
+   Transport Relay scenarios, each individual sending endpoint has to
+   adapt to the receiving endpoint behind the least-capable path,
+   yielding suboptimal quality for the endpoints behind the more capable
+   paths.  This is no longer an issue when Media Translators, mixers,
+   SFMs, or MCUs are involved, as each endpoint only needs to adapt to
+   the slowest path within its own domain.  The translator, mixer, SFM,
+   or MCU topologies all require their respective outgoing RTP streams
+   to adjust the bitrate, packet rate, etc., to adapt to the least-
+   capable path in each of the other domains.  That way one can avoid
+   lowering the quality to the least-capable endpoint in all the domains
+   at the cost (complexity, delay, equipment) of the mixer, SFM, or
+   translator, and potentially the media sender (multicast/layered
+   encoding and sending the different representations).
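+
+   The benefit of per-domain adaptation can be shown with a small
+   sketch.  The domain names, endpoint names, and capacities below are
+   hypothetical, and the two functions only contrast the two cases
+   described above; they do not model any particular middlebox.
+
```python
def single_domain_rate(domains):
    """Rate when every sender must adapt to the worst path anywhere."""
    return min(rate for paths in domains.values() for rate in paths.values())


def per_domain_rates(domains):
    """Rate per domain when a middlebox adapts each domain separately."""
    return {name: min(paths.values()) for name, paths in domains.items()}


# Hypothetical path capacities in bits per second, grouped by domain.
domains = {
    "office": {"A": 4_000_000, "C": 3_500_000},
    "mobile": {"B": 600_000, "D": 900_000},
}
print(single_domain_rate(domains))
print(per_domain_rates(domains))
```
+
+   Without a middlebox, all endpoints are held to the slowest path
+   overall; with one, only the domain that contains that path is.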
+
+4.4.  Aggregation of Media
+
+   In the all-to-all media property mentioned above and provided by ASM,
+   SSM, Mesh with common RTP session, and relay, all simultaneous media
+   transmissions share the available bitrate.  For endpoints with
+   limited reception capabilities, this may result in a situation where
+   even a minimal, acceptable media quality cannot be accomplished,
+   because multiple RTP streams need to share the same resources.  One
+   solution to this problem is to use a mixer, or MCU, to aggregate the
+   multiple RTP streams into a single one, where the single RTP stream
+   takes up fewer resources in terms of bitrate.  This aggregation can
+   be performed according to different methods.  Mixing and selection
+   are two common methods.  Selection is almost always possible and
+   easy to implement.  Mixing requires resources in the mixer and may be
+   relatively easy and not impair the quality too badly (audio) or quite
+   difficult (video tiling, which is not only computationally complex
+   but also reduces the pixel count per stream, with corresponding loss
+   in perceptual quality).
+
+4.5.  View of All Session Participants
+
+   The RTP protocol includes functionality to identify the session
+   participants through the use of the SSRC and CSRC fields.  In
+   addition, it is capable of carrying some further identity information
+   about these participants using the RTCP SDES.  In topologies that
+   provide a full all-to-all functionality, i.e., ASM, Mesh with common
+   RTP session, and relay, a compliant RTP implementation offers the
+   functionality directly as specified in RTP.  In topologies that do
+   not offer all-to-all communication, it is necessary that RTCP is
+   handled correctly in domain bridging functions.  RTP includes
+   explicit specification text for translators and mixers, and for SFMs
+   the required functionality can be derived from that text.  However,
+   the MCU described in Section 3.8 cannot offer the full functionality
+   for session participant identification through RTP means.  The
+   topologies that create independent RTP sessions per endpoint or pair
+   of endpoints, like a Back-to-Back RTP session, Mesh with independent
+   RTP sessions, and the RTCP terminating MCU (Section 3.9), with the
+   exception of SFM, do not support RTP-based identification of session
+   participants.  In all those cases, other non-RTP-based mechanisms
+   need to be implemented if such knowledge is required or desirable.
+   When it comes to SFM, the SSRC namespace is not necessarily joint.
+   Instead, identification will require knowledge of SSRC/CSRC mappings
+   that the SFM performed; see Section 3.7.
+
+4.6.  Loop Detection
+
+   In complex topologies with multiple interconnected domains, it is
+   possible to unintentionally form media loops.  RTP and RTCP support
+   detecting such loops, as long as the SSRC and CSRC identities are
+   maintained and correctly set in forwarded packets.  Loop detection
+   will work in ASM, SSM, Mesh with joint RTP session, and relay.  It is
+   likely that loop detection works for the video-switching MCU,
+   Section 3.8, at least as long as it forwards the RTCP between the
+   endpoints.  However, the Back-to-Back RTP sessions, Mesh with
+   independent RTP sessions, and SFMs will definitely break the loop
+   detection mechanism.
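+
+   Why loop detection depends on preserved SSRC and CSRC values can be
+   seen from a simplified check.  The sketch below is an assumption-
+   laden illustration, not the full RFC 3550 procedure, which also
+   compares source transport addresses to tell loops from SSRC
+   collisions; the type and function names are hypothetical.
+
```python
from dataclasses import dataclass, field


@dataclass
class RtpIdentifiers:
    """Identifier fields of one received RTP packet (hypothetical type)."""
    ssrc: int
    csrcs: list = field(default_factory=list)


def own_identifier_seen(own_ssrc, packet):
    """True if our own SSRC comes back as the SSRC or as a CSRC.

    Seeing it in the CSRC list suggests a loop through a mixer; seeing
    it as the SSRC may be either a loop or an SSRC collision.
    """
    return packet.ssrc == own_ssrc or own_ssrc in packet.csrcs


OWN = 0xA1
print(own_identifier_seen(OWN, RtpIdentifiers(ssrc=0xB2, csrcs=[0xC3])))
print(own_identifier_seen(OWN, RtpIdentifiers(ssrc=0xB2, csrcs=[0xA1])))
```
+
+   A middlebox that rewrites or drops these identifiers removes the
+   only information such a check can act on.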
+
+4.7.  Consistency between Header Extensions and RTCP
+
+   Some RTP header extensions have relevance not only end to end but
+   also hop to hop, meaning at least some of the middleboxes in the path
+   are aware of their potential presence through signaling, intercept
+   and interpret such header extensions, and potentially also rewrite or
+   generate them.  Modern header extensions generally follow "A General
+   Mechanism for RTP Header Extensions" [RFC5285], which allows for all
+   of the above.  Examples of such header extensions include the Media
+   ID (MID) in [SDP-BUNDLE].  At the time of writing, there was also a
+   proposal for how to include some SDES into an RTP header extension
+   [RTCP-SDES].
+
+   When such header extensions are in use, any middlebox that
+   understands them must ensure consistency between the extensions it
+   sees and/or generates and the RTCP it receives and generates.  For
+   example, the MID of the bundle is sent in an RTP header extension and
+   also in an RTCP SDES message.  This apparent redundancy was
+   introduced as unaware middleboxes may choose to discard RTP header
+   extensions.  Obviously, inconsistency between the MID sent in the RTP
+   header extension and in the RTCP SDES message could lead to
+   undesirable results, and, therefore, consistency is needed.
+   Middleboxes unaware of the nature of a header extension, as specified
+   in [RFC5285], are free to forward or discard header extensions.
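+
+   The consistency requirement for the MID example can be sketched as
+   a small bookkeeping class.  This is a minimal illustration under
+   the assumption that a middlebox keys its state by SSRC; the class
+   and method names are hypothetical and no real library is implied.
+
```python
class MidTracker:
    """Remember the MID first seen for each SSRC and flag conflicts."""

    def __init__(self):
        self._mid_by_ssrc = {}

    def observe(self, ssrc, mid):
        """Record a MID from either source; return False on a mismatch."""
        known = self._mid_by_ssrc.setdefault(ssrc, mid)
        return known == mid


tracker = MidTracker()
print(tracker.observe(0x1234, "audio"))  # seen in an RTP header extension
print(tracker.observe(0x1234, "audio"))  # same value seen in RTCP SDES
print(tracker.observe(0x1234, "video"))  # conflicting value
```
+
+   The same method is called for values taken from RTP header
+   extensions and from RTCP SDES, so a disagreement between the two
+   shows up as a False return.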
+
+5.  Comparison of Topologies
+
+   The table below attempts to summarize the properties of the different
+   topologies.  The legend for the topology abbreviations is:
+   Topo-Point-to-Point (PtP), Topo-ASM (ASM), Topo-SSM (SSM), Topo-Trn-
+   Translator (TT), Topo-Media-Translator (including Transport
+   Translator) (MT), Topo-Mesh with joint session (MJS), Topo-Mesh with
+   individual sessions (MIS), Topo-Mixer (Mix), Topo-Asymmetric (ASY),
+   Topo-Video-switch-MCU (VSM), Topo-RTCP-terminating-MCU (RTM), and
+   Selective Forwarding Middlebox (SFM).  In the table below, Y
+   indicates Yes or full support, N indicates No support, (Y) indicates
+   partial support, and N/A indicates not applicable.
+
+   Property             PtP  ASM SSM  TT MT MJS MIS Mix ASY VSM RTM SFM
+   ---------------------------------------------------------------------
+   All-to-All Media      N    Y  (Y)  Y  Y   Y  (Y) (Y) (Y) (Y) (Y) (Y)
+   Interoperability      N/A  N   N   Y  Y   Y   Y   Y   Y   N   Y   Y
+   Per-Domain Adaptation N/A  N   N   N  Y   N   Y   Y   Y   N   Y   Y
+   Aggregation of Media  N    N   N   N  N   N   N   Y  (Y)  Y   Y   N
+   Full Session View     Y    Y   Y   Y  Y   Y   N   Y   Y  (Y)  N   Y
+   Loop Detection        Y    Y   Y   Y  Y   Y   N   Y   Y  (Y)  N   N
+
+   Please note that the Media Translator also includes the Transport
+   Translator functionality.
+
+6.  Security Considerations
+
+   The use of mixers, SFMs, and translators has impact on security and
+   the security functions used.  The primary issue is that mixers, SFMs,
+   and translators modify packets, thus preventing the use of integrity
+   and source authentication, unless they are trusted devices that take
+   part in the security context, e.g., the device can send Secure Real-
+   time Transport Protocol (SRTP) and Secure Real-time Transport Control
+   Protocol (SRTCP) [RFC3711] packets to endpoints in the Communication
+   Session.  If encryption is employed, the Media Translator, SFM, and
+   mixer need to be able to decrypt the media to perform their
+   functions.  A Transport Translator may be used without access to the
+   encrypted payload in cases where it translates parts that are not
+   included in the encryption and integrity protection, for example, IP
+   address and UDP port numbers in a media stream using SRTP [RFC3711].
+   However, in general, the translator, SFM, or mixer needs to be part
+   of the signaling context and get the necessary security associations
+   (e.g., SRTP crypto contexts) established with its RTP session
+   participants.
+
+   Including the mixer, SFM, and translator in the security context
+   allows the entity, if subverted or misbehaving, to perform a number
+   of very serious attacks as it has full access.  It can perform all
+   the attacks possible (see RFC 3550 and any applicable profiles) as if
+   the media session were not protected at all, while giving the
+   impression to the human session participants that they are protected.
+
+   Transport Translators have no interactions with cryptography that
+   works above the transport layer, such as SRTP, since that sort of
+   translator leaves the RTP header and payload unaltered.  Media
+   Translators, on the other hand, have strong interactions with
+   cryptography, since they alter the RTP payload.  A Media Translator
+   in a session that uses cryptographic protection needs to perform
+   cryptographic processing to both inbound and outbound packets.
+
+   A Media Translator may need to use different cryptographic keys for
+   the inbound and outbound processing.  For SRTP, different keys are
+   required, because an RFC 3550 Media Translator leaves the SSRC
+   unchanged during its packet processing, and SRTP key sharing is only
+   allowed when distinct SSRCs can be used to protect distinct packet
+   streams.
+
+   When the Media Translator uses different keys to process inbound and
+   outbound packets, each session participant needs to be provided with
+   the appropriate key, depending on whether they are listening to the
+   translator or the original source.  (Note that there is an
+   architectural difference between RTP media translation, in which
+   participants can rely on the RTP payload type field of a packet to
+   determine appropriate processing, and cryptographically protected
+   media translation, in which participants must use information that is
+   not carried in the packet.)
+
+   When using security mechanisms with translators, SFMs, and mixers, it
+   is possible that the translator, SFM, or mixer could create different
+   security associations for the different domains they are working in.
+   Doing so has some implications:
+
+   First, it might weaken security if the mixer/translator accepts a
+   weaker algorithm or key in one domain than in another.
+   Therefore, care should be taken that appropriately strong security
+   parameters are negotiated in all domains.  In many cases,
+   "appropriate" translates to "similar" strength.  If a key-management
+   system does allow the negotiation of security parameters resulting in
+   a different strength of the security, then this system should notify
+   the participants in the other domains about this.
+
+   Second, the number of crypto contexts (keys and security-related
+   state) needed (for example, in SRTP [RFC3711]) may vary between
+   mixers, SFMs, and translators.  A mixer normally needs to represent
+   only a single SSRC per domain and therefore needs to create only one
+   security association (SRTP crypto context) per domain.  In contrast,
+   a translator needs one security association per participant it
+   translates towards, in the opposite domain.  Considering Figure 11,
+   the translator needs two security associations towards the multicast
+   domain: one for B and one for D.  It may be forced to maintain a set
+   of totally independent security associations between itself and B and
+   D, respectively, so as to avoid two-time pad occurrences.  These
+   contexts must also be capable of handling all the sources present in
+   the other domains.  Hence, using completely independent security
+   associations (for certain keying mechanisms) may force a translator
+   to handle N*DM keys and related state, where N is the total number of
+   SSRCs used over all domains and DM is the total number of domains.
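+
+   The N*DM bound can be made concrete with a short calculation.  The
+   function names and the example numbers are hypothetical; the sketch
+   only restates the counts given in this paragraph.
+
```python
def translator_key_state(total_ssrcs, total_domains):
    """Upper bound N*DM on the keys a translator may have to handle."""
    return total_ssrcs * total_domains


def mixer_contexts(total_domains):
    """A mixer normally needs one crypto context per domain."""
    return total_domains


# Hypothetical deployment: 12 SSRCs in total, spread over 3 domains.
n_ssrcs, n_domains = 12, 3
print(translator_key_state(n_ssrcs, n_domains))
print(mixer_contexts(n_domains))
```
+
+   For these example numbers, the translator may need 36 keys with
+   related state, against 3 crypto contexts for a mixer.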
+
+   The ASM, SSM, Relay, and Mesh (with common RTP session) topologies
+   each have multiple endpoints that require shared knowledge about the
+   different crypto contexts for the endpoints.  These multiparty
+   topologies have special requirements on the key management as well as
+   the security functions.  Specifically, source authentication in these
+   environments has special requirements.
+
+   There exist a number of different mechanisms to provide keys to the
+   different participants.  One example is the choice between group keys
+   and unique keys per SSRC.  The appropriate keying model is impacted
+   by the topologies one intends to use.  The final security properties
+   are dependent on both the topologies in use and the keying
+   mechanisms' properties and need to be considered by the application.
+   Exactly which mechanisms are used is outside of the scope of this
+   document.  Please review RTP Security Options [RFC7201] to get a
+   better understanding of most of the available options.
+
+7.  References
+
+7.1.  Normative References
+
+   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
+              Jacobson, "RTP: A Transport Protocol for Real-Time
+              Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
+              July 2003, <http://www.rfc-editor.org/info/rfc3550>.
+
+   [RFC4585]  Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
+              "Extended RTP Profile for Real-time Transport Control
+              Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
+              DOI 10.17487/RFC4585, July 2006,
+              <http://www.rfc-editor.org/info/rfc4585>.
+
+   [RFC7656]  Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and
+              B. Burman, Ed., "A Taxonomy of Grouping Semantics and
+              Mechanisms for Real-Time Transport Protocol (RTP)
+              Sources", RFC 7656, November 2015,
+              <http://www.rfc-editor.org/info/rfc7656>.
+
+7.2.  Informative References
+
+   [MULTI-STREAM-OPT]
+              Lennox, J., Westerlund, M., Wu, W., and C. Perkins,
+              "Sending Multiple Media Streams in a Single RTP Session:
+              Grouping RTCP Reception Statistics and Other Feedback",
+              Work in Progress, draft-ietf-avtcore-rtp-multi-stream-
+              optimisation-08, October 2015.
+
+   [RFC1112]  Deering, S., "Host extensions for IP multicasting", STD 5,
+              RFC 1112, DOI 10.17487/RFC1112, August 1989,
+              <http://www.rfc-editor.org/info/rfc1112>.
+
+   [RFC3022]  Srisuresh, P. and K. Egevang, "Traditional IP Network
+              Address Translator (Traditional NAT)", RFC 3022,
+              DOI 10.17487/RFC3022, January 2001,
+              <http://www.rfc-editor.org/info/rfc3022>.
+
+   [RFC3569]  Bhattacharyya, S., Ed., "An Overview of Source-Specific
+              Multicast (SSM)", RFC 3569, DOI 10.17487/RFC3569, July
+              2003, <http://www.rfc-editor.org/info/rfc3569>.
+
+   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
+              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
+              RFC 3711, DOI 10.17487/RFC3711, March 2004,
+              <http://www.rfc-editor.org/info/rfc3711>.
+
+   [RFC4575]  Rosenberg, J., Schulzrinne, H., and O. Levin, Ed., "A
+              Session Initiation Protocol (SIP) Event Package for
+              Conference State", RFC 4575, DOI 10.17487/RFC4575, August
+              2006, <http://www.rfc-editor.org/info/rfc4575>.
+
+   [RFC4607]  Holbrook, H. and B. Cain, "Source-Specific Multicast for
+              IP", RFC 4607, DOI 10.17487/RFC4607, August 2006,
+              <http://www.rfc-editor.org/info/rfc4607>.
+
+   [RFC5104]  Wenger, S., Chandra, U., Westerlund, M., and B. Burman,
+              "Codec Control Messages in the RTP Audio-Visual Profile
+              with Feedback (AVPF)", RFC 5104, DOI 10.17487/RFC5104,
+              February 2008, <http://www.rfc-editor.org/info/rfc5104>.
+
+   [RFC5117]  Westerlund, M. and S. Wenger, "RTP Topologies", RFC 5117,
+              DOI 10.17487/RFC5117, January 2008,
+              <http://www.rfc-editor.org/info/rfc5117>.
+
+   [RFC5285]  Singer, D. and H. Desineni, "A General Mechanism for RTP
+              Header Extensions", RFC 5285, DOI 10.17487/RFC5285, July
+              2008, <http://www.rfc-editor.org/info/rfc5285>.
+
+   [RFC5760]  Ott, J., Chesterfield, J., and E. Schooler, "RTP Control
+              Protocol (RTCP) Extensions for Single-Source Multicast
+              Sessions with Unicast Feedback", RFC 5760,
+              DOI 10.17487/RFC5760, February 2010,
+              <http://www.rfc-editor.org/info/rfc5760>.
+
+   [RFC5766]  Mahy, R., Matthews, P., and J. Rosenberg, "Traversal Using
+              Relays around NAT (TURN): Relay Extensions to Session
+              Traversal Utilities for NAT (STUN)", RFC 5766,
+              DOI 10.17487/RFC5766, April 2010,
+              <http://www.rfc-editor.org/info/rfc5766>.
+
+   [RFC6285]  Ver Steeg, B., Begen, A., Van Caenegem, T., and Z. Vax,
+              "Unicast-Based Rapid Acquisition of Multicast RTP
+              Sessions", RFC 6285, DOI 10.17487/RFC6285, June 2011,
+              <http://www.rfc-editor.org/info/rfc6285>.
+
+   [RFC6465]  Ivov, E., Ed., Marocco, E., Ed., and J. Lennox, "A Real-
+              time Transport Protocol (RTP) Header Extension for Mixer-
+              to-Client Audio Level Indication", RFC 6465,
+              DOI 10.17487/RFC6465, December 2011,
+              <http://www.rfc-editor.org/info/rfc6465>.
+
+   [RFC7201]  Westerlund, M. and C. Perkins, "Options for Securing RTP
+              Sessions", RFC 7201, DOI 10.17487/RFC7201, April 2014,
+              <http://www.rfc-editor.org/info/rfc7201>.
+
+   [RTCP-SDES]
+              Westerlund, M., Burman, B., Even, R., and M. Zanaty, "RTP
+              Header Extension for RTCP Source Description Items", Work
+              in Progress, draft-ietf-avtext-sdes-hdr-ext-02, July 2015.
+
+   [SDP-BUNDLE]
+              Holmberg, C., Alvestrand, H., and C. Jennings,
+              "Negotiating Media Multiplexing Using the Session
+              Description Protocol (SDP)", Work in Progress,
+              draft-ietf-mmusic-sdp-bundle-negotiation-23, July 2015.
+
+Acknowledgements
+
+   The authors would like to thank Mark Baugher, Bo Burman, Ben
+   Campbell, Umesh Chandra, Alex Eleftheriadis, Roni Even, Ladan Gharai,
+   Geoff Hunt, Suresh Krishnan, Keith Lantz, Jonathan Lennox, Scarlet
+   Liuyan, Suhas Nandakumar, Colin Perkins, and Dan Wing for their help
+   in reviewing and improving this document.
+
+Authors' Addresses
+
+   Magnus Westerlund
+   Ericsson
+   Farogatan 2
+   SE-164 80 Kista
+   Sweden
+
+   Phone: +46 10 714 82 87
+   Email: magnus.westerlund@ericsson.com
+
+
+   Stephan Wenger
+   Vidyo
+   433 Hackensack Ave
+   Hackensack, NJ  07601
+   United States
+
+   Email: stewe@stewe.org
-- 
cgit v1.2.3