Diffstat (limited to 'doc/rfc/rfc7262.txt')
-rw-r--r-- doc/rfc/rfc7262.txt 675
1 files changed, 675 insertions, 0 deletions
diff --git a/doc/rfc/rfc7262.txt b/doc/rfc/rfc7262.txt
new file mode 100644
index 0000000..8e3ed39
--- /dev/null
+++ b/doc/rfc/rfc7262.txt
@@ -0,0 +1,675 @@
+
+
+
+
+
+
+Internet Engineering Task Force (IETF) A. Romanow
+Request for Comments: 7262 Cisco Systems
+Category: Informational S. Botzko
+ISSN: 2070-1721 Polycom
+ M. Barnes
+ MLB@Realtime Communications, LLC
+ June 2014
+
+
+ Requirements for Telepresence Multistreams
+
+Abstract
+
+ This memo discusses the requirements for specifications that enable
+ telepresence interoperability by describing behaviors and protocols
+ for Controlling Multiple Streams for Telepresence (CLUE). In
+ addition, the problem statement and related definitions are also
+ covered herein.
+
+Status of This Memo
+
+ This document is not an Internet Standards Track specification; it is
+ published for informational purposes.
+
+ This document is a product of the Internet Engineering Task Force
+ (IETF). It represents the consensus of the IETF community. It has
+ received public review and has been approved for publication by the
+ Internet Engineering Steering Group (IESG). Not all documents
+ approved by the IESG are a candidate for any level of Internet
+ Standard; see Section 2 of RFC 5741.
+
+ Information about the current status of this document, any errata,
+ and how to provide feedback on it may be obtained at
+ http://www.rfc-editor.org/info/rfc7262.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Romanow, et al. Informational [Page 1]
+
+RFC 7262 CLUE Telepresence Requirements June 2014
+
+
+Copyright Notice
+
+ Copyright (c) 2014 IETF Trust and the persons identified as the
+ document authors. All rights reserved.
+
+ This document is subject to BCP 78 and the IETF Trust's Legal
+ Provisions Relating to IETF Documents
+ (http://trustee.ietf.org/license-info) in effect on the date of
+ publication of this document. Please review these documents
+ carefully, as they describe your rights and restrictions with respect
+ to this document. Code Components extracted from this document must
+ include Simplified BSD License text as described in Section 4.e of
+ the Trust Legal Provisions and are provided without warranty as
+ described in the Simplified BSD License.
+
+Table of Contents
+
+ 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
+ 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3
+ 3. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 4
+ 4. Problem Statement . . . . . . . . . . . . . . . . . . . . . . 5
+ 5. Requirements . . . . . . . . . . . . . . . . . . . . . . . . 6
+ 6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 10
+ 7. Security Considerations . . . . . . . . . . . . . . . . . . . 10
+ 8. Informative References . . . . . . . . . . . . . . . . . . . 11
+
+1. Introduction
+
+ Telepresence systems greatly improve collaboration. In a
+ telepresence conference (as used herein), the goal is to create an
+ environment that gives the users a feeling of (co-located) presence
+ -- the feeling that a local user is in the same room with other local
+ users and remote parties. Currently, systems from different vendors
+ often do not interoperate because they do the same tasks differently,
+ as discussed in the Problem Statement section below (see Section 4).
+
+ The approach taken in this memo is to set requirements for a future
+ specification(s) that, when fulfilled by an implementation of the
+ specification(s), provide for interoperability between IETF protocol-
+ based telepresence systems. It is anticipated that a solution for
+ the requirements set out in this memo will likely involve the exchange
+ of adequate information about participating sites; this information
+ is currently not standardized by the IETF.
+
+ The purpose of this document is to describe the requirements for a
+ specification that enables interworking between different SIP-based
+ [RFC3261] telepresence systems, by exchanging and negotiating
+ appropriate information. In the context of the requirements in this
+
+
+
+Romanow, et al. Informational [Page 2]
+
+RFC 7262 CLUE Telepresence Requirements June 2014
+
+
+ document and related solution documents, this includes both point-to-
+ point SIP sessions as well as SIP-based conferences as described in
+ the SIP conferencing framework [RFC4353] and the SIP-based conference
+ control [RFC4579] specifications. Non-IETF protocol-based systems,
+ such as those based on ITU-T Rec. H.323 [ITU.H323], are out of scope.
+ These requirements are for the specification; they are not
+ requirements on the telepresence systems implementing the solution/
+ protocol that will be specified.
+
+ Today, telepresence systems of different vendors can follow radically
+ different architectural approaches while offering a similar user
+ experience. CLUE will not dictate telepresence architectural and
+ implementation choices; however, it will describe a protocol
+ architecture for CLUE and how it relates to other protocols. CLUE
+ enables interoperability between telepresence systems by exchanging
+ information about the systems' characteristics. Systems can use this
+ information to control their behavior to allow for interoperability
+ between those systems.
+
+ A telepresence session requires at least one sending and one
+ receiving endpoint. Multiparty telepresence sessions include more
+ than 2 endpoints and centralized infrastructure such as Multipoint
+ Control Units (MCUs) or equivalent. CLUE specifies the syntax,
+ semantics, and control flow of information to enable the best
+ possible user experience at those endpoints.
+
+ Sending endpoints, or MCUs, are not mandated to use any of the CLUE
+ specifications that describe their capabilities, attributes, or
+ behavior. Similarly, it is not envisioned that endpoints or MCUs
+ will ever have to take information received into account. However,
+ by making available as much information as possible, and by taking
+ into account as much information as has been received or exchanged,
+ MCUs and endpoints are expected to select operation modes that enable
+ the best possible user experience under their constraints.
+
+ The document structure is as follows: definitions are set out,
+ followed by a description of the problem of telepresence
+ interoperability that led to this work. Then the requirements for a
+ specification addressing the current shortcomings are enumerated and
+ discussed.
+
+2. Terminology
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+ document are to be interpreted as described in RFC 2119 [RFC2119].
+
+
+
+
+
+Romanow, et al. Informational [Page 3]
+
+RFC 7262 CLUE Telepresence Requirements June 2014
+
+
+3. Definitions
+
+ The following terms are used throughout this document and serve as a
+ reference for other documents.
+
+ Audio Mixing: refers to the accumulation of scaled audio signals
+ to produce a single audio stream. See "RTP Topologies" [RFC5117].
+
+ Conference: used as defined in "A Framework for Conferencing
+ with the Session Initiation Protocol (SIP)" [RFC4353].
+
+ Endpoint: The logical point of final termination through
+ receiving, decoding and rendering, and/or initiation through
+ capturing, encoding, and sending of media streams. An endpoint
+ consists of one or more physical devices that source and sink
+ media streams, and exactly one participant [RFC4353] (which, in
+ turn, includes exactly one SIP user agent). In contrast to an
+ endpoint, an MCU may also send and receive media streams, but it
+ is not the initiator or the final terminator in the sense that
+ media is captured or rendered. Endpoints can be anything from
+ multiscreen/multicamera rooms to handheld devices.
+
+ Endpoint Characteristics: include placement of capture and
+ rendering devices, capture/render angle, resolution of cameras and
+ screens, spatial location, and mixing parameters of microphones.
+ Endpoint characteristics are not specific to individual media
+ streams sent by the endpoint.
+
+ Layout: How rendered media streams are spatially arranged with
+ respect to each other on a telepresence endpoint with a single
+ screen and a single loudspeaker, and how rendered media streams
+ are arranged with respect to each other on a telepresence endpoint
+ with multiple screens or loudspeakers. Note that audio as well as
+ video are encompassed by the term layout -- in other words,
+ included is the placement of audio streams on loudspeakers as well
+ as video streams on video screens.
+
+ Local: Sender and/or receiver physically co-located ("local") in
+ the context of the discussion.
+
+ MCU: Multipoint Control Unit (MCU) - a device that connects two or
+ more endpoints together into one single multimedia conference
+ [RFC5117]. An MCU may include a mixer [RFC4353].
+
+ Media: Any data that, after suitable encoding, can be conveyed
+ over RTP, including audio, video, or timed text.
+
+
+
+
+
+Romanow, et al. Informational [Page 4]
+
+RFC 7262 CLUE Telepresence Requirements June 2014
+
+
+ Model: a set of assumptions a telepresence system of a given
+ vendor adheres to and expects the remote telepresence system(s) to
+ also adhere to.
+
+ Remote: Sender and/or receiver on the other side of the
+ communication channel (depending on context); i.e., not local. A
+ remote can be an endpoint or an MCU.
+
+ Render: the process of generating a representation from media,
+ such as displayed motion video or sound emitted from loudspeakers.
+
+ Telepresence: an environment that gives non-co-located users or
+ user groups a feeling of (co-located) presence -- the feeling that
+ a local user is in the same room with other local users and the
+ remote parties. The inclusion of Remote parties is achieved
+ through multimedia communication including at least audio and
+ video signals of high fidelity.
+
+4. Problem Statement
+
+ In order to create a "being there" experience characteristic of
+ telepresence, media inputs need to be transported, received, and
+ coordinated between participating systems. Different telepresence
+ systems take diverse approaches in crafting a solution, or they
+ implement similar solutions quite differently.
+
+ They use disparate techniques, and they describe, control and
+ negotiate media in dissimilar fashions. Such diversity creates an
+ interoperability problem. The same issues are solved in different
+ ways by different systems, so that they are not directly
+ interoperable. This makes interworking difficult at best and
+ sometimes impossible.
+
+ Worse, many telepresence systems use proprietary protocol extensions
+ to solve telepresence-related problems, even if those extensions are
+ based on common standards such as SIP.
+
+ Some degree of interworking between systems from different vendors is
+ possible through transcoding and translation. This requires
+ additional devices, which are expensive, are often not entirely
+ automatic, and sometimes introduce unwelcome side effects, such as
+ additional delay or degraded performance. Specialized knowledge is
+ currently required to operate a telepresence conference with
+ endpoints from different vendors, for example to configure
+ transcoding and translating devices. Often such conferences do not
+ start as planned or are interrupted by difficulties that arise.
+
+
+
+
+
+Romanow, et al. Informational [Page 5]
+
+RFC 7262 CLUE Telepresence Requirements June 2014
+
+
+ The general problem that needs to be solved can be described as
+ follows. Today, each endpoint renders the audio and video captures
+ it receives according to an implicitly assumed model that stipulates
+ how to produce a realistic depiction of the remote location. If all
+ endpoints are manufactured by the same vendor, they all share the
+ same implicit model and render the received captures correctly.
+ However, if the devices are from different vendors, the models used
+ for rendering presence can and usually do differ. The result can be
+ that the telepresence systems actually connect, but the user
+ experience suffers. For example, one system may assume that the first
+ video stream is captured from the right camera, whereas the other
+ assumes the first video stream is captured from the left camera.
+
+ If Alice and Bob are at different sites, Alice needs to tell Bob
+ about the camera and sound equipment arrangement at her site so that
+ Bob's receiver can create an accurate rendering of her site. Alice
+ and Bob need to agree on what the salient characteristics are as well
+ as how to represent and communicate them. Characteristics may
+ include number, placement, capture/render angle, resolution of
+ cameras and screens, spatial location, and audio mixing parameters of
+ microphones.
+
+ The telepresence multistream work seeks to describe the sender
+ situation in a way that allows the receiver to render it
+ realistically even though it may have a different rendering model
+ than the sender.
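
The left/right mismatch described in this section can be illustrated with a
minimal Python sketch. Everything named here is hypothetical: the two vendor
models, the `render_implicit` and `render_explicit` helpers, and the position
labels are invented for illustration and are not defined by CLUE or by any
vendor. The sketch only shows why an explicit description of the sender's
situation removes the dependence on a shared implicit model.

```python
# Hypothetical implicit models: which scene position each vendor assumes
# for stream 0, stream 1, stream 2 (assumption for illustration only).
IMPLICIT_MODELS = {
    "vendor_a": ["left", "center", "right"],   # stream 0 = left camera
    "vendor_b": ["right", "center", "left"],   # stream 0 = right camera
}


def render_implicit(streams, receiver_model):
    """Place streams using only the receiver's assumed stream order."""
    positions = IMPLICIT_MODELS[receiver_model]
    return {positions[i]: stream for i, stream in enumerate(streams)}


def render_explicit(described_streams):
    """Place streams using the position the sender declared for each one."""
    return {position: stream for position, stream in described_streams}


# The sender behaves like vendor_b: its first stream is the right camera.
sent = ["cam_right", "cam_center", "cam_left"]

# A vendor_a receiver applies its own model, so the scene is mirrored.
mirrored = render_implicit(sent, "vendor_a")

# With an explicit description, the receiver no longer has to guess.
described = [("right", "cam_right"), ("center", "cam_center"), ("left", "cam_left")]
correct = render_explicit(described)
```

In the implicit case, the receiver shows the right-hand camera on its left
screen; in the explicit case, each capture lands where the sender said it
belongs, regardless of stream order.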
+
+5. Requirements
+
+ Although some aspects of these requirements can be met by existing
+ technology, such as the Session Description Protocol (SDP) [RFC4566],
+ they are stated here to have a complete record of the requirements
+ for CLUE. Determining whether a requirement needs new work or not
+ will be part of the solution development, and is not discussed in
+ this document. Note that the term "solution" is used in these
+ requirements to mean the protocol specifications, including
+ extensions to existing protocols as well as any new protocols,
+ developed to support the use cases. The solution might introduce
+ additional functionality that is not mapped directly to these
+ requirements; e.g., the detailed information carried in the signaling
+ protocol(s). In cases where the requirements are directly relevant
+ to specific use cases as described in [RFC7205], a reference to the
+ use case is provided.
+
+
+
+
+
+
+
+
+Romanow, et al. Informational [Page 6]
+
+RFC 7262 CLUE Telepresence Requirements June 2014
+
+
+ REQ-1: The solution MUST support a description of the spatial
+ arrangement of source video images sent in video streams
+ that enables a satisfactory reproduction at the receiver of
+ the original scene. This applies to each site in a point-
+ to-point or a multipoint meeting and refers to the spatial
+ ordering within a site, not to the ordering of images
+ between sites.
+
+ This requirement relates to all the use cases described in
+ [RFC7205].
+
+ REQ-1a: The solution MUST support a means of allowing the
+ preservation of the order of images in the captured
+ scene. For example, if John is to Susan's right in
+ the image capture, John is also to Susan's right in
+ the rendered image.
+
+ REQ-1b: The solution MUST support a means of allowing the
+ preservation of order of images in the scene in two
+ dimensions - horizontal and vertical.
+
+ REQ-1c: The solution MUST support a means to identify the
+ relative location, within a scene, of the point of
+ capture of individual video captures in three
+ dimensions.
+
+ REQ-1d: The solution MUST support a means to identify the
+ area of coverage, within a scene, of individual
+ video captures in three dimensions.
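
As a rough illustration of the kind of information REQ-1 and its
sub-requirements call for, the following Python sketch models a video
capture with a point of capture and an area of coverage in three dimensions,
and recovers the left-to-right order of captures from that description. The
`VideoCapture` type, its field names, the axis convention, and the `strip`
helper are assumptions made for this sketch; the requirements do not
prescribe any particular data model.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (x, y, z) in a scene-local frame. The axis convention (x grows from left
# to right, z grows upward) and the units are assumptions for this sketch.
Point3D = Tuple[float, float, float]


@dataclass(frozen=True)
class VideoCapture:
    capture_id: str
    capture_point: Point3D               # cf. REQ-1c: point of capture
    coverage_area: Tuple[Point3D, ...]   # cf. REQ-1d: corners of the area


def coverage_center(capture: VideoCapture) -> Point3D:
    """Average the corner points of the area of coverage."""
    n = len(capture.coverage_area)
    xs, ys, zs = zip(*capture.coverage_area)
    return (sum(xs) / n, sum(ys) / n, sum(zs) / n)


def order_along(captures: List[VideoCapture], axis: int) -> List[str]:
    """Return capture ids sorted along one axis (0 = horizontal, 2 = vertical)."""
    ordered = sorted(captures, key=lambda c: coverage_center(c)[axis])
    return [c.capture_id for c in ordered]


def strip(x_min: float, x_max: float) -> Tuple[Point3D, ...]:
    """Hypothetical helper: a 1-unit-high rectangle 2 units from the cameras."""
    return ((x_min, 2.0, 0.0), (x_max, 2.0, 0.0),
            (x_min, 2.0, 1.0), (x_max, 2.0, 1.0))


# Captures listed in arbitrary (stream) order, not scene order.
captures = [
    VideoCapture("center", (0.0, 0.0, 1.0), strip(-0.5, 0.5)),
    VideoCapture("right", (1.0, 0.0, 1.0), strip(0.5, 1.5)),
    VideoCapture("left", (-1.0, 0.0, 1.0), strip(-1.5, -0.5)),
]
left_to_right = order_along(captures, axis=0)
```

Because the order is derived from the described geometry rather than from
the order in which streams arrive, a receiver can preserve the arrangement
of the captured scene (cf. REQ-1a and REQ-1b) even when the streams are
listed differently.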
+
+ REQ-2: The solution MUST support a description of the spatial
+ arrangement of captured source audio sent in audio streams
+ that enables a satisfactory reproduction at the receiver in
+ a spatially correct manner. This applies to each site in a
+ point-to-point or a multipoint meeting and refers to the
+ spatial ordering within a site, not to the ordering of channels
+ between sites.
+
+ This requirement relates to all the use cases described in
+ [RFC7205], but is particularly important in the
+ Heterogeneous Systems use case.
+
+ REQ-2a: The solution MUST support a means of preserving the
+ spatial order of audio in the captured scene. For
+ example, if John sounds as if he is on Susan's
+ right in the captured audio, John's voice is also
+ placed on Susan's right in the rendered image.
+
+
+
+
+Romanow, et al. Informational [Page 7]
+
+RFC 7262 CLUE Telepresence Requirements June 2014
+
+
+ REQ-2b: The solution MUST support a means to identify the
+ number and spatial arrangement of audio channels
+ including monaural, stereophonic (2.0), and 3.0
+ (left, center, right) audio channels.
+
+ REQ-2c: The solution MUST support a means to identify the
+ point of capture of individual audio captures in
+ three dimensions.
+
+ REQ-2d: The solution MUST support a means to identify the
+ area of coverage of individual audio captures in
+ three dimensions.
+
+ REQ-3: The solution MUST enable individual audio streams to be
+ associated with one or more video image captures, and
+ individual video image captures to be associated with one or
+ more audio captures, for the purpose of rendering proper
+ position.
+
+ This requirement relates to all the use cases described in
+ [RFC7205].
+
+ REQ-4: The solution MUST enable interoperability between endpoints
+ that have a different number of similar devices. For
+ example, an endpoint may have 1 screen, 1 loudspeaker, 1
+ camera, and 1 microphone, and another endpoint may have 3 screens,
+ 2 loudspeakers, 3 cameras, and 2 microphones. Or, in a
+ multipoint conference, an endpoint may have 1 screen,
+ another may have 2 screens, and a third may have 3 screens.
+ This includes endpoints where the number of devices of a
+ given type is zero.
+
+ This requirement relates to the Point-to-Point Meeting:
+ Symmetric and Multipoint Meeting use cases described in
+ [RFC7205].
+
+ REQ-5: The solution MUST support means of enabling interoperability
+ between telepresence endpoints where cameras are of
+ different picture aspect ratios.
+
+ REQ-6: The solution MUST provide scaling information that enables
+ rendering of a video image at the actual size of the
+ captured scene.
+
+ REQ-7: The solution MUST support means of enabling interoperability
+ between telepresence endpoints where displays are of
+ different resolutions.
+
+
+
+
+Romanow, et al. Informational [Page 8]
+
+RFC 7262 CLUE Telepresence Requirements June 2014
+
+
+ REQ-8: The solution MUST support methods for handling different bit
+ rates in the same conference.
+
+ REQ-9: The solution MUST support means of enabling interoperability
+ between endpoints that send and receive different numbers of
+ media streams.
+
+ This requirement relates to the Heterogeneous Systems and
+ Multipoint Meeting use cases.
+
+ REQ-10: The solution MUST ensure that endpoints that support
+ telepresence extensions can establish a session with a SIP
+ endpoint that does not support the telepresence extensions.
+ For example, in the case of a SIP endpoint that supports a
+ single audio and a single video stream, an endpoint that
+ supports the telepresence extensions would set up a session
+ with a single audio and single video stream using existing
+ SIP and SDP mechanisms.
+
+ REQ-11: The solution MUST support a mechanism for determining
+ whether or not an endpoint or MCU is capable of telepresence
+ extensions.
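
The fallback behavior in REQ-10, combined with the capability check in
REQ-11, might be sketched as follows. This is only an illustration under
stated assumptions: the `Peer` type, the `supports_telepresence` flag, and
the `plan_session` function are hypothetical, and capping stream counts at
the smaller of the two sides is an invented policy, not something the
requirements specify. How the capability is actually discovered is left to
the solution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    name: str
    supports_telepresence: bool  # hypothetical result of the REQ-11 check
    max_audio: int = 1
    max_video: int = 1


def plan_session(local: Peer, remote: Peer) -> dict:
    """Choose a session shape both sides can handle."""
    if local.supports_telepresence and remote.supports_telepresence:
        # Assumed policy: cap each media type at what both sides support.
        return {
            "mode": "telepresence",
            "audio": min(local.max_audio, remote.max_audio),
            "video": min(local.max_video, remote.max_video),
        }
    # REQ-10 example: fall back to a single audio and a single video stream.
    return {"mode": "basic", "audio": 1, "video": 1}


room = Peer("room", True, max_audio=2, max_video=3)
legacy_phone = Peer("legacy", False)
small_room = Peer("small-room", True, max_audio=1, max_video=2)

basic_plan = plan_session(room, legacy_phone)
multi_plan = plan_session(room, small_room)
```

The point of the sketch is the branch structure: an endpoint that supports
the extensions still needs a path that relies only on existing SIP and SDP
mechanisms when the remote side does not.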
+
+ REQ-12: The solution MUST support a means to enable more than two
+ endpoints to participate in a teleconference.
+
+ This requirement relates to the Multipoint Meeting use case.
+
+ REQ-13: The solution MUST support both transcoding and switching
+ approaches for providing multipoint conferences.
+
+ REQ-14: The solution MUST support mechanisms to allow media from one
+ source endpoint and/or multiple source endpoints to be sent
+ to a remote endpoint at a particular point in time. Which
+ media is sent at a point in time may be based on local
+ policy.
+
+ REQ-15: The solution MUST provide mechanisms to support the
+ following:
+
+ * Presentations with different media sources
+
+ * Presentations for which the media streams are visible to
+ all endpoints
+
+
+
+
+
+
+
+Romanow, et al. Informational [Page 9]
+
+RFC 7262 CLUE Telepresence Requirements June 2014
+
+
+ * Multiple, simultaneous presentation media streams,
+ including presentation media streams that are spatially
+ related to each other.
+
+ This requirement relates to the Presentation use case.
+
+ REQ-16: The specification of any new protocols for the solution MUST
+ provide extensibility mechanisms.
+
+ REQ-17: The solution MUST support a mechanism for allowing
+ information about media captures to change during a
+ conference.
+
+ REQ-18: The solution MUST provide a mechanism for the secure
+ exchange of information about the media captures.
+
+6. Acknowledgements
+
+ This document has benefited from all the comments on the CLUE mailing
+ list and a number of discussions. So many people contributed that it
+ is not possible to list them all. However, the comments provided by
+ Roberta Presta, Christian Groves and Paul Coverdale during WGLC were
+ particularly helpful in completing the WG document.
+
+7. Security Considerations
+
+ REQ-18 identifies the need to securely transport the information
+ about media captures. It is important to note that session setup for
+ a telepresence session will use SIP for basic session setup and
+ either SIP or the Centralized Conferencing Manipulation Protocol
+ (CCMP) [RFC6503] for a multiparty telepresence session. Information
+ carried in the SIP signaling can be secured by the SIP security
+ mechanisms as defined in [RFC3261]. In the case of conference
+ control using CCMP, the security model and mechanisms as defined in
+ the Centralized Conferencing (XCON) Framework [RFC5239] and CCMP
+ [RFC6503] documents would meet the requirement. Any additional
+ signaling mechanism used to transport the information about media
+ captures needs to define the mechanisms by which the information is
+ secured. The details of the mechanisms need to be defined and
+ described in the CLUE framework document and related solution
+ document(s).
+
+
+
+
+
+
+
+
+
+
+Romanow, et al. Informational [Page 10]
+
+RFC 7262 CLUE Telepresence Requirements June 2014
+
+
+8. Informative References
+
+ [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119, March 1997.
+
+ [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
+ A., Peterson, J., Sparks, R., Handley, M., and E.
+ Schooler, "SIP: Session Initiation Protocol", RFC 3261,
+ June 2002.
+
+ [RFC4353] Rosenberg, J., "A Framework for Conferencing with the
+ Session Initiation Protocol (SIP)", RFC 4353, February
+ 2006.
+
+ [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
+ Description Protocol", RFC 4566, July 2006.
+
+ [RFC4579] Johnston, A. and O. Levin, "Session Initiation Protocol
+ (SIP) Call Control - Conferencing for User Agents", BCP
+ 119, RFC 4579, August 2006.
+
+ [RFC5117] Westerlund, M. and S. Wenger, "RTP Topologies", RFC 5117,
+ January 2008.
+
+ [RFC5239] Barnes, M., Boulton, C., and O. Levin, "A Framework for
+ Centralized Conferencing", RFC 5239, June 2008.
+
+ [RFC6503] Barnes, M., Boulton, C., Romano, S., and H. Schulzrinne,
+ "Centralized Conferencing Manipulation Protocol", RFC
+ 6503, March 2012.
+
+ [RFC7205] Romanow, A., Botzko, S., Duckworth, M., and R. Even, "Use
+ Cases for Telepresence Multistreams", RFC 7205, April
+ 2014.
+
+ [ITU.H323] ITU-T, "Packet-based Multimedia Communications Systems",
+ ITU-T Recommendation H.323, December 2009.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Romanow, et al. Informational [Page 11]
+
+RFC 7262 CLUE Telepresence Requirements June 2014
+
+
+Authors' Addresses
+
+ Allyn Romanow
+ Cisco Systems
+ San Jose, CA 95134
+ USA
+
+ EMail: allyn@cisco.com
+
+
+ Stephen Botzko
+ Polycom
+ Andover, MA 01810
+ USA
+
+ EMail: stephen.botzko@polycom.com
+
+
+ Mary Barnes
+ MLB@Realtime Communications, LLC
+
+ EMail: mary.ietf.barnes@gmail.com
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Romanow, et al. Informational [Page 12]
+