1 files changed, 955 insertions, 0 deletions
diff --git a/doc/rfc/rfc7205.txt b/doc/rfc/rfc7205.txt
new file mode 100644
index 0000000..5de54f8
--- /dev/null
+++ b/doc/rfc/rfc7205.txt
@@ -0,0 +1,955 @@
+
+
+
+
+
+
+Internet Engineering Task Force (IETF)                        A. Romanow
+Request for Comments: 7205                                         Cisco
+Category: Informational                                        S. Botzko
+ISSN: 2070-1721                                             M. Duckworth
+                                                                 Polycom
+                                                            R. Even, Ed.
+                                                     Huawei Technologies
+                                                              April 2014
+
+
+                Use Cases for Telepresence Multistreams
+
+Abstract
+
+   Telepresence conferencing systems seek to create an environment that
+   gives users (or user groups) that are not co-located a feeling of co-
+   located presence through multimedia communication that includes at
+   least audio and video signals of high fidelity.  A number of
+   techniques for handling audio and video streams are used to create
+   this experience.  When these techniques are not similar,
+   interoperability between different systems is difficult at best, and
+   often not possible.  Conveying information about the relationships
+   between multiple streams of media would enable senders and receivers
+   to make choices to allow telepresence systems to interwork.  This
+   memo describes the most typical and important use cases for sending
+   multiple streams in a telepresence conference.
+
+Status of This Memo
+
+   This document is not an Internet Standards Track specification; it is
+   published for informational purposes.
+
+   This document is a product of the Internet Engineering Task Force
+   (IETF).  It represents the consensus of the IETF community.  It has
+   received public review and has been approved for publication by the
+   Internet Engineering Steering Group (IESG).  Not all documents
+   approved by the IESG are a candidate for any level of Internet
+   Standard; see Section 2 of RFC 5741.
+
+   Information about the current status of this document, any errata,
+   and how to provide feedback on it may be obtained at
+   http://www.rfc-editor.org/info/rfc7205.
+
+
+
+
+
+
+
+
+
+Romanow, et al.               Informational                     [Page 1]
+
+RFC 7205                 Telepresence Use Cases               April 2014
+
+
+Copyright Notice
+
+   Copyright (c) 2014 IETF Trust and the persons identified as the
+   document authors.  All rights reserved.
+
+   This document is subject to BCP 78 and the IETF Trust's Legal
+   Provisions Relating to IETF Documents
+   (http://trustee.ietf.org/license-info) in effect on the date of
+   publication of this document.  Please review these documents
+   carefully, as they describe your rights and restrictions with respect
+   to this document.  Code Components extracted from this document must
+   include Simplified BSD License text as described in Section 4.e of
+   the Trust Legal Provisions and are provided without warranty as
+   described in the Simplified BSD License.
+
+Table of Contents
+
+   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
+   2.  Overview of Telepresence Scenarios  . . . . . . . . . . . . .   4
+   3.  Use Cases . . . . . . . . . . . . . . . . . . . . . . . . . .   6
+     3.1.  Point-to-Point Meeting: Symmetric . . . . . . . . . . . .   7
+     3.2.  Point-to-Point Meeting: Asymmetric  . . . . . . . . . . .   7
+     3.3.  Multipoint Meeting  . . . . . . . . . . . . . . . . . . .   9
+     3.4.  Presentation  . . . . . . . . . . . . . . . . . . . . . .  10
+     3.5.  Heterogeneous Systems . . . . . . . . . . . . . . . . . .  11
+     3.6.  Multipoint Education Usage  . . . . . . . . . . . . . . .  12
+     3.7.  Multipoint Multiview (Virtual Space)  . . . . . . . . . .  14
+     3.8.  Multiple Presentation Streams - Telemedicine  . . . . . .  15
+   4.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  16
+   5.  Security Considerations . . . . . . . . . . . . . . . . . . .  16
+   6.  Informative References  . . . . . . . . . . . . . . . . . . .  16
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Romanow, et al.               Informational                     [Page 2]
+
+RFC 7205                 Telepresence Use Cases               April 2014
+
+
+1.  Introduction
+
+   Telepresence applications try to provide a "being there" experience
+   for conversational video conferencing.  Often, this telepresence
+   application is described as "immersive telepresence" in order to
+   distinguish it from traditional video conferencing and from other
+   forms of remote presence not related to conversational video
+   conferencing, such as avatars and robots.  The salient
+   characteristics of telepresence are often described as: being actual
+   sized, providing immersive video, preserving interpersonal
+   interaction, and allowing non-verbal communication.
+
+   Although telepresence systems are based on open standards such as RTP
+   [RFC3550], SIP [RFC3261], H.264 [ITU.H264], and the H.323 [ITU.H323]
+   suite of protocols, they cannot easily interoperate with each other
+   without operator assistance and expensive additional equipment that
+   translates from one vendor's protocol to another.
+
+   The basic features that give telepresence its distinctive
+   characteristics are implemented in disparate ways in different
+   systems.  Currently, telepresence systems from diverse vendors
+   interoperate to some extent, but this is not supported in a
+   standards-based fashion.  Interworking requires that translation and
+   transcoding devices be included in the architecture.  Such devices
+   increase latency, reducing the quality of interpersonal interaction.
+   Use of these devices is often not automatic; it frequently requires
+   substantial manual configuration and a detailed understanding of the
+   nature of underlying audio and video streams.  This state of affairs
+   is not acceptable for the continued growth of telepresence -- these
+   systems should have the same ease of interoperability as do
+   telephones.  Thus, a standard way of describing the multiple streams
+   constituting the media flows and the fundamental aspects of their
+   behavior would allow telepresence systems to interwork.
+
+   This document presents a set of use cases describing typical
+   scenarios.  Requirements will be derived from these use cases in a
+   separate document.  The use cases are described from the viewpoint of
+   the users.  They are illustrative of the user experience that needs
+   to be supported.  It is possible to implement these use cases in a
+   variety of different ways.
+
+   Many different scenarios need to be supported.  This document
+   describes in detail the most common and basic use cases.  These will
+   cover most of the requirements.  There may be additional scenarios
+   that bring new features and requirements that can be used to extend
+   the initial work.
+
+
+
+
+
+Romanow, et al.               Informational                     [Page 3]
+
+RFC 7205                 Telepresence Use Cases               April 2014
+
+
+   Point-to-point and multipoint telepresence conferences are
+   considered.  In some use cases, the number of screens is the same at
+   all sites; in others, the number of screens differs at different
+   sites.  Both use cases are considered.  Also included is a use case
+   describing display of presentation material or content.
+
+   The multipoint use cases may include a variety of systems from
+   conference room systems to handheld devices, and such a use case is
+   described in the document.
+
+   This document's structure is as follows: Section 2 gives an overview
+   of scenarios, and Section 3 describes use cases.
+
+2.  Overview of Telepresence Scenarios
+
+   This section describes the general characteristics of the use cases
+   and what the scenarios are intended to show.  The typical setting is
+   a business conference, which was the initial focus of telepresence.
+   Recently, consumer products are also being developed.  We
+   specifically do not include in our scenarios the physical
+   infrastructure aspects of telepresence, such as room construction,
+   layout, and decoration.  Furthermore, these use cases do not describe
+   all the aspects needed to create the best user experience (for
+   example, the human factors).
+
+   We also specifically do not attempt to precisely define the
+   boundaries between telepresence systems and other systems, nor do we
+   attempt to identify the "best" solution for each presented scenario.
+
+   Telepresence systems are typically composed of one or more video
+   cameras and encoders and one or more display screens of large size
+   (diagonal around 60 inches).  Microphones pick up sound, and audio
+   codec(s) produce one or more audio streams.  The cameras used to
+   capture the telepresence users are referred to as "participant
+   cameras" (and likewise for screens).  There may also be other
+   cameras, such as for document display.  These will be referred to as
+   "presentation cameras" or "content cameras", which generally have
+   different formats, aspect ratios, and frame rates from the
+   participant cameras.  The presentation streams may be shown on
+   participant screens or on auxiliary display screens.  A user's
+   computer may also serve as a virtual content camera, generating an
+   animation or playing a video for display to the remote participants.
+
+   We describe such a telepresence system as sending one or more video
+   streams, audio streams, and presentation streams to the remote
+   system(s).
+
+
+
+
+
+Romanow, et al.               Informational                     [Page 4]
+
+RFC 7205                 Telepresence Use Cases               April 2014
+
+
+   The fundamental parameters describing today's typical telepresence
+   scenarios include:
+
+   1.   The number of participating sites
+
+   2.   The number of visible seats at a site
+
+   3.   The number of cameras
+
+   4.   The number and type of microphones
+
+   5.   The number of audio channels
+
+   6.   The screen size
+
+   7.   The screen capabilities -- such as resolution, frame rate,
+        aspect ratio
+
+   8.   The arrangement of the screens in relation to each other
+
+   9.   The number of primary screens at each site
+
+   10.  Type and number of presentation screens
+
+   11.  Multipoint conference display strategies -- for example, the
+        camera-to-screen mappings may be static or dynamic
+
+   12.  The camera point of capture
+
+   13.  The cameras fields of view and how they spatially relate to each
+        other
+
+   As discussed in the introduction, the basic features that give
+   telepresence its distinctive characteristics are implemented in
+   disparate ways in different systems.
+
+   There is no agreed upon way to adequately describe the semantics of
+   how streams of various media types relate to each other.  Without a
+   standard for stream semantics to describe the particular roles and
+   activities of each stream in the conference, interoperability is
+   cumbersome at best.
+
+   In a multiple-screen conference, the video and audio streams sent
+   from remote participants must be understood by receivers so that they
+   can be presented in a coherent and life-like manner.  This includes
+   the ability to present remote participants at their actual size for
+   their apparent distance, while maintaining correct eye contact,
+
+
+
+
+Romanow, et al.               Informational                     [Page 5]
+
+RFC 7205                 Telepresence Use Cases               April 2014
+
+
+   gesticular cues, and simultaneously providing a spatial audio sound
+   stage that is consistent with the displayed video.
+
+   The receiving device that decides how to render incoming information
+   needs to understand a number of variables such as the spatial
+   position of the speaker, the field of view of the cameras, the camera
+   zoom, which media stream is related to each of the screens, etc.  It
+   is not simply that individual streams must be adequately described,
+   to a large extent this already exists, but rather that the semantics
+   of the relationships between the streams must be communicated.  Note
+   that all of this is still required even if the basic aspects of the
+   streams, such as the bit rate, frame rate, and aspect ratio, are
+   known.  Thus, this problem has aspects considerably beyond those
+   encountered in interoperation of video conferencing systems that have
+   a single camera/screen.
+
+3.  Use Cases
+
+   The use cases focus on typical implementations.  There are a number
+   of possible variants for these use cases; for example, the audio
+   supported may differ at the end points (such as mono or stereo versus
+   surround sound), etc.
+
+   Many of these systems offer a "full conference room" solution, where
+   local participants sit at one side of a table and remote participants
+   are displayed as if they are sitting on the other side of the table.
+   The cameras and screens are typically arranged to provide a panoramic
+   view of the remote room (left to right from the local user's
+   viewpoint).
+
+   The sense of immersion and non-verbal communication is fostered by a
+   number of technical features, such as:
+
+   1.  Good eye contact, which is achieved by careful placement of
+       participants, cameras, and screens.
+
+   2.  Camera field of view and screen sizes are matched so that the
+       images of the remote room appear to be full size.
+
+   3.  The left side of each room is presented on the right screen at
+       the far end; similarly, the right side of the room is presented
+       on the left screen.  The effect of this is that participants of
+       each site appear to be sitting across the table from each other.
+       If 2 participants on the same site glance at each other, all
+       participants can observe it.  Likewise, if a participant at one
+       site gestures to a participant on the other site, all
+       participants observe the gesture itself and the participants it
+       includes.
+
+
+
+Romanow, et al.               Informational                     [Page 6]
+
+RFC 7205                 Telepresence Use Cases               April 2014
+
+
+3.1.  Point-to-Point Meeting: Symmetric
+
+   In this case, each of the 2 sites has an identical number of screens,
+   with cameras having fixed fields of view, and 1 camera for each
+   screen.  The sound type is the same at each end.  As an example,
+   there could be 3 cameras and 3 screens in each room, with stereo
+   sound being sent and received at each end.
+
+   Each screen is paired with a corresponding camera.  Each camera/
+   screen pair is typically connected to a separate codec, producing an
+   encoded stream of video for transmission to the remote site, and
+   receiving a similarly encoded stream from the remote site.
+
+   Each system has one or multiple microphones for capturing audio.  In
+   some cases, stereophonic microphones are employed.  In other systems,
+   a microphone may be placed in front of each participant (or pair of
+   participants).  In typical systems, all the microphones are connected
+   to a single codec that sends and receives the audio streams as either
+   stereo or surround sound.  The number of microphones and the number
+   of audio channels are often not the same as the number of cameras.
+   Also, the number of microphones is often not the same as the number
+   of loudspeakers.
+
+   The audio may be transmitted as multi-channel (stereo/surround sound)
+   or as distinct and separate monophonic streams.  Audio levels should
+   be matched, so the sound levels at both sites are identical.
+   Loudspeaker and microphone placements are chosen so that the sound
+   "stage" (orientation of apparent audio sources) is coordinated with
+   the video.  That is, if a participant at one site speaks, the
+   participants at the remote site perceive her voice as originating
+   from her visual image.  In order to accomplish this, the audio needs
+   to be mapped at the received site in the same fashion as the video.
+   That is, audio received from the right side of the room needs to be
+   output from loudspeaker(s) on the left side at the remote site, and
+   vice versa.
+
+3.2.  Point-to-Point Meeting: Asymmetric
+
+   In this case, each site has a different number of screens and cameras
+   than the other site.  The important characteristic of this scenario
+   is that the number of screens is different between the 2 sites.  This
+   creates challenges that are handled differently by different
+   telepresence systems.
+
+   This use case builds on the basic scenario of 3 screens to 3 screens.
+   Here, we use the common case of 3 screens and 3 cameras at one site,
+   and 1 screen and 1 camera at the other site, connected by a point-to-
+   point call.  The screen sizes and camera fields of view at both sites
+
+
+
+Romanow, et al.               Informational                     [Page 7]
+
+RFC 7205                 Telepresence Use Cases               April 2014
+
+
+   are basically similar, such that each camera view is designed to show
+   2 people sitting side by side.  Thus, the 1-screen room has up to 2
+   people seated at the table, while the 3-screen room may have up to 6
+   people at the table.
+
+   The basic considerations of defining left and right and indicating
+   relative placement of the multiple audio and video streams are the
+   same as in the 3-3 use case.  However, handling the mismatch between
+   the 2 sites of the number of screens and cameras requires more
+   complicated maneuvers.
+
+   For the video sent from the 1-camera room to the 3-screen room,
+   usually what is done is to simply use 1 of the 3 screens and keep the
+   second and third screens inactive or, for example, put up the current
+   date.  This would maintain the "full-size" image of the remote side.
+
+   For the other direction, the 3-camera room sending video to the
+   1-screen room, there are more complicated variations to consider.
+   Here are several possible ways in which the video streams can be
+   handled.
+
+   1.  The 1-screen system might simply show only 1 of the 3 camera
+       images, since the receiving side has only 1 screen.  2 people are
+       seen at full size, but 4 people are not seen at all.  The choice
+       of which one of the 3 streams to display could be fixed, or could
+       be selected by the users.  It could also be made automatically
+       based on who is speaking in the 3-screen room, such that the
+       people in the 1-screen room always see the person who is
+       speaking.  If the automatic selection is done at the sender, the
+       transmission of streams that are not displayed could be
+       suppressed, which would avoid wasting bandwidth.
+
+   2.  The 1-screen system might be capable of receiving and decoding
+       all 3 streams from all 3 cameras.  The 1-screen system could then
+       compose the 3 streams into 1 local image for display on the
+       single screen.  All 6 people would be seen, but smaller than full
+       size.  This could be done in conjunction with reducing the image
+       resolution of the streams, such that encode/decode resources and
+       bandwidth are not wasted on streams that will be downsized for
+       display anyway.
+
+   3.  The 3-screen system might be capable of including all 6 people in
+       a single stream to send to the 1-screen system.  For example, it
+       could use PTZ (Pan Tilt Zoom) cameras to physically adjust the
+       cameras such that 1 camera captures the whole room of 6 people.
+       Or, it could recompose the 3 camera images into 1 encoded stream
+       to send to the remote site.  These variations also show all 6
+       people but at a reduced size.
+
+
+
+Romanow, et al.               Informational                     [Page 8]
+
+RFC 7205                 Telepresence Use Cases               April 2014
+
+
+   4.  Or, there could be a combination of these approaches, such as
+       simultaneously showing the speaker in full size with a composite
+       of all 6 participants in a smaller size.
+
+   The receiving telepresence system needs to have information about the
+   content of the streams it receives to make any of these decisions.
+   If the systems are capable of supporting more than one strategy,
+   there needs to be some negotiation between the 2 sites to figure out
+   which of the possible variations they will use in a specific point-
+   to-point call.
+
+3.3.  Multipoint Meeting
+
+   In a multipoint telepresence conference, there are more than 2 sites
+   participating.  Additional complexity is required to enable media
+   streams from each participant to show up on the screens of the other
+   participants.
+
+   Clearly, there are a great number of topologies that can be used to
+   display the streams from multiple sites participating in a
+   conference.
+
+   One major objective for telepresence is to be able to preserve the
+   "being there" user experience.  However, in multi-site conferences,
+   it is often (in fact, usually) not possible to simultaneously provide
+   full-size video, eye contact, and common perception of gestures and
+   gaze by all participants.  Several policies can be used for stream
+   distribution and display: all provide good results, but they all make
+   different compromises.
+
+   One common policy is called site switching.  Let's say the speaker is
+   at site A and the other participants are at various "remote" sites.
+   When the room at site A shown, all the camera images from site A are
+   forwarded to the remote sites.  Therefore, at each receiving remote
+   site, all the screens display camera images from site A.  This can be
+   used to preserve full-size image display, and also provide full
+   visual context of the displayed far end, site A.  In site switching,
+   there is a fixed relation between the cameras in each room and the
+   screens in remote rooms.  The room or participants being shown are
+   switched from time to time based on who is speaking or by manual
+   control, e.g., from site A to site B.
+
+   Segment switching is another policy choice.  In segment switching
+   (assuming still that site A is where the speaker is, and "remote"
+   refers to all the other sites), rather than sending all the images
+   from site A, only the speaker at site A is shown.  The camera images
+   of the current speaker and previous speakers (if any) are forwarded
+   to the other sites in the conference.  Therefore, the screens in each
+
+
+
+Romanow, et al.               Informational                     [Page 9]
+
+RFC 7205                 Telepresence Use Cases               April 2014
+
+
+   site are usually displaying images from different remote sites -- the
+   current speaker at site A and the previous ones.  This strategy can
+   be used to preserve full-size image display and also capture the non-
+   verbal communication between the speakers.  In segment switching, the
+   display depends on the activity in the remote rooms (generally, but
+   not necessarily based on audio/speech detection).
+
+   A third possibility is to reduce the image size so that multiple
+   camera views can be composited onto one or more screens.  This does
+   not preserve full-size image display, but it provides the most visual
+   context (since more sites or segments can be seen).  Typically in
+   this case, the display mapping is static, i.e., each part of each
+   room is shown in the same location on the display screens throughout
+   the conference.
+
+   Other policies and combinations are also possible.  For example,
+   there can be a static display of all screens from all remote rooms,
+   with part or all of one screen being used to show the current speaker
+   at full size.
+
+3.4.  Presentation
+
+   In addition to the video and audio streams showing the participants,
+   additional streams are used for presentations.
+
+   In systems available today, generally only one additional video
+   stream is available for presentations.  Often, this presentation
+   stream is half-duplex in nature, with presenters taking turns.  The
+   presentation stream may be captured from a PC screen, or it may come
+   from a multimedia source such as a document camera, camcorder, or a
+   DVD.  In a multipoint meeting, the presentation streams for the
+   currently active presentation are always distributed to all sites in
+   the meeting, so that the presentations are viewed by all.
+
+   Some systems display the presentation streams on a screen that is
+   mounted either above or below the 3 participant screens.  Other
+   systems provide screens on the conference table for observing
+   presentations.  If multiple presentation screens are used, they
+   generally display identical content.  There is considerable variation
+   in the placement, number, and size of presentation screens.
+
+   In some systems, presentation audio is pre-mixed with the room audio.
+   In others, a separate presentation audio stream is provided (if the
+   presentation includes audio).
+
+
+
+
+
+
+
+Romanow, et al.               Informational                    [Page 10]
+
+RFC 7205                 Telepresence Use Cases               April 2014
+
+
+   In H.323 [ITU.H323] systems, H.239 [ITU.H239] is typically used to
+   control the video presentation stream.  In SIP systems, similar
+   control mechanisms can be provided using the Binary Floor Control
+   Protocol (BFCP) [RFC4582] for the presentation token.  These
+   mechanisms are suitable for managing a single presentation stream.
+
+   Although today's systems remain limited to a single video
+   presentation stream, there are obvious uses for multiple presentation
+   streams:
+
+   1.  Frequently, the meeting convener is following a meeting agenda,
+       and it is useful for her to be able to show that agenda to all
+       participants during the meeting.  Other participants at various
+       remote sites are able to make presentations during the meeting,
+       with the presenters taking turns.  The presentations and the
+       agenda are both shown, either on separate screens, or perhaps
+       rescaled and shown on a single screen.
+
+   2.  A single multimedia presentation can itself include multiple
+       video streams that should be shown together.  For instance, a
+       presenter may be discussing the fairness of media coverage.  In
+       addition to slides that support the presenter's conclusions, she
+       also has video excerpts from various news programs that she shows
+       to illustrate her findings.  She uses a DVD player for the video
+       excerpts so that she can pause and reposition the video as
+       needed.
+
+   3.  An educator who is presenting a multiscreen slide show.  This
+       show requires that the placement of the images on the multiple
+       screens at each site be consistent.
+
+   There are many other examples where multiple presentation streams are
+   useful.
+
+3.5.  Heterogeneous Systems
+
+   It is common in meeting scenarios for people to join the conference
+   from a variety of environments, using different types of endpoint
+   devices.  A multiscreen immersive telepresence conference may include
+   someone on a PC-based video conferencing system, a participant
+   calling in by phone, and (soon) someone on a handheld device.
+
+   What experience/view will each of these devices have?
+
+   Some may be able to handle multiple streams, and others can handle
+   only a single stream.  (Here, we are not talking about legacy
+   systems, but rather systems built to participate in such a
+   conference, although they are single stream only.)  In a single video
+
+
+
+Romanow, et al.               Informational                    [Page 11]
+
+RFC 7205                 Telepresence Use Cases               April 2014
+
+
+   stream, the stream may contain one or more compositions depending on
+   the available screen space on the device.  In most cases, an
+   intermediate transcoding device will be relied upon to produce a
+   single stream, perhaps with some kind of continuous presence.
+
+   Bit rates will vary -- the handheld device and phone having lower bit
+   rates than PC and multiscreen systems.
+
+   Layout is accomplished according to different policies.  For example,
+   a handheld device and PC may receive the active speaker stream.  The
+   decision can either be made explicitly by the receiver or by the
+   sender if it can receive some kind of rendering hint.  The same is
+   true for audio -- i.e., that it receives a mixed stream or a number
+   of the loudest speakers if mixing is not available in the network.
+
+   For the PC-based conferencing participant, the user's experience
+   depends on the application.  It could be single stream, similar to a
+   handheld device but with a bigger screen.  Or, it could be multiple
+   streams, similar to an immersive telepresence system but with a
+   smaller screen.  Control for manipulation of streams can be local in
+   the software application, or in another location and sent to the
+   application over the network.
+
+   The handheld device is the most extreme.  How will that participant
+   be viewed and heard?  It should be an equal participant, though the
+   bandwidth will be significantly less than an immersive system.  A
+   receiver may choose to display output coming from a handheld device
+   differently based on the resolution, but that would be the case with
+   any low-resolution video stream, e.g., from a powerful PC on a bad
+   network.
+
+   The handheld device will send and receive a single video stream,
+   which could be a composite or a subset of the conference.  The
+   handheld device could say what it wants or could accept whatever the
+   sender (conference server or sending endpoint) thinks is best.  The
+   handheld device will have to signal any actions it wants to take the
+   same way that an immersive system signals actions.
+
+3.6.  Multipoint Education Usage
+
+   The importance of this example is that the multiple video streams are
+   not used to create an immersive conferencing experience with
+   panoramic views at all the sites.  Instead, the multiple streams are
+   dynamically used to enable full participation of remote students in a
+   university class.  In some instances, the same video stream is
+   displayed on multiple screens in the room; in other instances, an
+   available stream is not displayed at all.
+
+
+
+
+Romanow, et al.               Informational                    [Page 12]
+
+RFC 7205                 Telepresence Use Cases               April 2014
+
+
+   The main site is a university auditorium that is equipped with 3
+   cameras.  One camera is focused on the professor at the podium.  A
+   second camera is mounted on the wall behind the professor and
+   captures the class in its entirety.  The third camera is co-located
+   with the second and is designed to capture a close-up view of a
+   questioner in the audience.  It automatically zooms in on that
+   student using sound localization.
+
+   Although the auditorium is equipped with 3 cameras, it is only
+   equipped with 2 screens.  One is a large screen located at the front
+   so that the class can see it.  The other is located at the rear so
+   the professor can see it.  When someone asks a question, the front
+   screen shows the questioner.  Otherwise, it shows the professor
+   (ensuring everyone can easily see her).
+
+   The remote sites are typical immersive telepresence rooms, each with
+   3 camera/screen pairs.
+
+   All remote sites display the professor on the center screen at full
+   size.  A second screen shows the entire classroom view when the
+   professor is speaking.  However, when a student asks a question, the
+   second screen shows the close-up view of the student at full size.
+   Sometimes the student is in the auditorium; sometimes the speaking
+   student is at another remote site.  The remote systems never display
+   the students that are actually in that room.
+
+   If someone at a remote site asks a question, then the screen in the
+   auditorium will show the remote student at full size (as if they were
+   present in the auditorium itself).  The screen in the rear also shows
+   this questioner, allowing the professor to see and respond to the
+   student without needing to turn her back on the main class.
+
+   When no one is asking a question, the screen in the rear briefly
+   shows a full-room view of each remote site in turn, allowing the
+   professor to monitor the entire class (remote and local students).
+   The professor can also use a control on the podium to see a
+   particular site -- she can choose either a full-room view or a
+   single-camera view.
+
+   Realization of this use case does not require any negotiation between
+   the participating sites.  Endpoint devices (and a Multipoint Control
+   Unit (MCU), if present) need to know who is speaking and what video
+   stream includes the view of that speaker.  The remote systems need
+   some knowledge of which stream should be placed in the center.  The
+   ability of the professor to see specific sites (or for the system to
+   show all the sites in turn) would also require the auditorium system
+
+
+
+
+
+Romanow, et al.               Informational                    [Page 13]
+
+RFC 7205                 Telepresence Use Cases               April 2014
+
+
+   to know what sites are available and to be able to request a
+   particular view of any site.  Bandwidth is optimized if video that is
+   not being shown at a particular site is not distributed to that site.
+
+3.7.  Multipoint Multiview (Virtual Space)
+
+   This use case describes a virtual space multipoint meeting with good
+   eye contact and spatial layout of participants.  The use case was
+   proposed very early in the development of video conferencing systems
+   as described in 1983 by Allardyce and Randal [virtualspace].  The use
+   case is illustrated in Figure 2-5 of their report.  The virtual space
+   expands the point-to-point case by having all multipoint conference
+   participants "seated" in a virtual room.  In this case, each
+   participant has a fixed "seat" in the virtual room, so each
+   participant expects to see a different view having a different
+   participant on his left and right side.  Today, the use case is
+   implemented in multiple telepresence-type video conferencing systems
+   on the market.  The term "virtual space" was used in their report.
+   The main difference between the result obtained with modern systems
+   and those from 1983 are larger screen sizes.
+
+   Virtual space multipoint as defined here assumes endpoints with
+   multiple cameras and screens.  Usually, there is the same number of
+   cameras and screens at a given endpoint.  A camera is positioned
+   above each screen.  A key aspect of virtual space multipoint is the
+   details of how the cameras are aimed.  The cameras are each aimed on
+   the same area of view of the participants at the site.  Thus, each
+   camera takes a picture of the same set of people but from a different
+   angle.  Each endpoint sender in the virtual space multipoint meeting
+   therefore offers a choice of video streams to remote receivers, each
+   stream representing a different viewpoint.  For example, a camera
+   positioned above a screen to a participant's left may take video
+   pictures of the participant's left ear; while at the same time, a
+   camera positioned above a screen to the participant's right may take
+   video pictures of the participant's right ear.
+
+   Since a sending endpoint has a camera associated with each screen, an
+   association is made between the receiving stream output on a
+   particular screen and the corresponding sending stream from the
+   camera associated with that screen.  These associations are repeated
+   for each screen/camera pair in a meeting.  The result of this system
+   is a horizontal arrangement of video images from remote sites, one
+   per screen.  The image from each screen is paired with the camera
+   output from the camera above that screen, resulting in excellent eye
+   contact.
+
+
+
+
+
+
+Romanow, et al.               Informational                    [Page 14]
+
+RFC 7205                 Telepresence Use Cases               April 2014
+
+
+3.8.  Multiple Presentation Streams - Telemedicine
+
+   This use case describes a scenario where multiple presentation
+   streams are used.  In this use case, the local site is a surgery room
+   connected to one or more remote sites that may have different
+   capabilities.  At the local site, 3 main cameras capture the whole
+   room (the typical 3-camera telepresence case).  Also, multiple
+   presentation inputs are available: a surgery camera that is used to
+   provide a zoomed view of the operation, an endoscopic monitor, a
+   flouroscope (X-ray imaging), an ultrasound diagnostic device, an
+   electrocardiogram (ECG) monitor, etc.  These devices are used to
+   provide multiple local video presentation streams to help the surgeon
+   monitor the status of the patient and assist in the surgical process.
+
+   The local site may have 3 main screens and one (or more) presentation
+   screen(s).  The main screens can be used to display the remote
+   experts.  The presentation screen(s) can be used to display multiple
+   presentation streams from local and remote sites simultaneously.  The
+   3 main cameras capture different parts of the surgery room.  The
+   surgeon can decide the number, the size, and the placement of the
+   presentations displayed on the local presentation screen(s).  He can
+   also indicate which local presentation captures are provided for the
+   remote sites.  The local site can send multiple presentation captures
+   to remote sites, and it can receive from them multiple presentations
+   related to the patient or the procedure.
+
+   One type of remote site is a single- or dual-screen and one-camera
+   system used by a consulting expert.  In the general case, the remote
+   sites can be part of a multipoint telepresence conference.  The
+   presentation screens at the remote sites allow the experts to see the
+   details of the operation and related data.  Like the main site, the
+   experts can decide the number, the size, and the placement of the
+   presentations displayed on the presentation screens.  The
+   presentation screens can display presentation streams from the
+   surgery room, from other remote sites, or from local presentation
+   streams.  Thus, the experts can also start sending presentation
+   streams that can carry medical records, pathology data, or their
+   references and analysis, etc.
+
+   Another type of remote site is a typical immersive telepresence room
+   with 3 camera/screen pairs, allowing more experts to join the
+   consultation.  These sites can also be used for education.  The
+   teacher, who is not necessarily the surgeon, and the students are in
+   different remote sites.  Students can observe and learn the details
+   of the whole procedure, while the teacher can explain and answer
+   questions during the operation.
+
+
+
+
+
+Romanow, et al.               Informational                    [Page 15]
+
+RFC 7205                 Telepresence Use Cases               April 2014
+
+
+   All remote education sites can display the surgery room.  Another
+   option is to display the surgery room on the center screen, and the
+   rest of the screens can show the teacher and the student who is
+   asking a question.  For all the above sites, multiple presentation
+   screens can be used to enhance visibility: one screen for the zoomed
+   surgery stream and the others for medical image streams, such as MRI
+   images, cardiograms, ultrasonic images, and pathology data.
+
+4.  Acknowledgements
+
+   The document has benefitted from input from a number of people
+   including Alex Eleftheriadis, Marshall Eubanks, Tommy Andre Nyquist,
+   Mark Gorzynski, Charles Eckel, Nermeen Ismail, Mary Barnes, Pascal
+   Buhler, and Jim Cole.
+
+   Special acknowledgement to Lennard Xiao, who contributed the text for
+   the telemedicine use case, and to Claudio Allocchio for his detailed
+   review of the document.
+
+5.  Security Considerations
+
+   While there are likely to be security considerations for any solution
+   for telepresence interoperability, this document has no security
+   considerations.
+
+6.  Informative References
+
+   [ITU.H239]  ITU-T, "Role management and additional media channels for
+               H.300-series terminals", ITU-T Recommendation H.239,
+               September 2005.
+
+   [ITU.H264]  ITU-T, "Advanced video coding for generic audiovisual
+               services", ITU-T Recommendation H.264, April 2013.
+
+   [ITU.H323]  ITU-T, "Packet-based Multimedia Communications Systems",
+               ITU-T Recommendation H.323, December 2009.
+
+   [RFC3261]   Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
+               A., Peterson, J., Sparks, R., Handley, M., and E.
+               Schooler, "SIP: Session Initiation Protocol", RFC 3261,
+               June 2002.
+
+   [RFC3550]   Schulzrinne, H., Casner, S., Frederick, R., and V.
+               Jacobson, "RTP: A Transport Protocol for Real-Time
+               Applications", STD 64, RFC 3550, July 2003.
+
+   [RFC4582]   Camarillo, G., Ott, J., and K. Drage, "The Binary Floor
+               Control Protocol (BFCP)", RFC 4582, November 2006.
+
+
+
+Romanow, et al.               Informational                    [Page 16]
+
+RFC 7205                 Telepresence Use Cases               April 2014
+
+
+   [virtualspace]
+               Allardyce, L. and L. Randall, "Development of
+               Teleconferencing Methodologies with Emphasis on Virtual
+               Space Video and Interactive Graphics", April 1983,
+               <http://www.dtic.mil/docs/citations/ADA127738>.
+
+Authors' Addresses
+
+   Allyn Romanow
+   Cisco
+   San Jose, CA  95134
+   US
+
+   EMail: allyn@cisco.com
+
+
+   Stephen Botzko
+   Polycom
+   Andover, MA  01810
+   US
+
+   EMail: stephen.botzko@polycom.com
+
+
+   Mark Duckworth
+   Polycom
+   Andover, MA  01810
+   US
+
+   EMail: mark.duckworth@polycom.com
+
+
+   Roni Even (editor)
+   Huawei Technologies
+   Tel Aviv
+   Israel
+
+   EMail: roni.even@mail01.huawei.com
+
+
+
+
+
+
+
+
+
+
+
+
+
+Romanow, et al.               Informational                    [Page 17]
+