diff options
Diffstat (limited to 'doc/rfc/rfc5897.txt')
-rw-r--r-- | doc/rfc/rfc5897.txt | 1291 |
1 files changed, 1291 insertions, 0 deletions
diff --git a/doc/rfc/rfc5897.txt b/doc/rfc/rfc5897.txt new file mode 100644 index 0000000..06ae2f7 --- /dev/null +++ b/doc/rfc/rfc5897.txt @@ -0,0 +1,1291 @@ + + + + + + +Internet Engineering Task Force (IETF) J. Rosenberg +Request for Comments: 5897 jdrosen.net +Category: Informational June 2010 +ISSN: 2070-1721 + + + Identification of Communications Services + in the Session Initiation Protocol (SIP) + +Abstract + + This document considers the problem of service identification in the + Session Initiation Protocol (SIP). Service identification is the + process of determining the user-level use case that is driving the + signaling being utilized by the user agent (UA). This document + discusses the uses of service identification, and outlines several + architectural principles behind the process. It identifies perils + when service identification is not done properly -- including fraud, + interoperability failures, and stifling of innovation. It then + outlines a set of recommended practices for service identification. + +Status of This Memo + + This document is not an Internet Standards Track specification; it is + published for informational purposes. + + This document is a product of the Internet Engineering Task Force + (IETF). It represents the consensus of the IETF community. It has + received public review and has been approved for publication by the + Internet Engineering Steering Group (IESG). Not all documents + approved by the IESG are a candidate for any level of Internet + Standard; see Section 2 of RFC 5741. + + Information about the current status of this document, any errata, + and how to provide feedback on it may be obtained at + http://www.rfc-editor.org/info/rfc5897. + +Copyright Notice + + Copyright (c) 2010 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents + (http://trustee.ietf.org/license-info) in effect on the date of + publication of this document. Please review these documents + carefully, as they describe your rights and restrictions with respect + to this document. Code Components extracted from this document must + + + +Rosenberg Informational [Page 1] + +RFC 5897 Service ID in SIP June 2010 + + + include Simplified BSD License text as described in Section 4.e of + the Trust Legal Provisions and are provided without warranty as + described in the Simplified BSD License. + +Table of Contents + + 1. Introduction ....................................................3 + 2. Services and Service Identification .............................4 + 3. Example Services ................................................6 + 3.1. IPTV vs. Multimedia ........................................6 + 3.2. Gaming vs. Voice Chat ......................................7 + 3.3. Gaming vs. Voice Chat #2 ...................................7 + 3.4. Configuration vs. Pager Messaging ..........................7 + 4. Using Service Identification ....................................8 + 4.1. Application Invocation in the User Agent ...................8 + 4.2. Application Invocation in the Network ......................9 + 4.3. Network Quality-of-Service Authorization ..................10 + 4.4. Service Authorization .....................................10 + 4.5. Accounting and Billing ....................................11 + 4.6. Negotiation of Service ....................................11 + 4.7. Dispatch to Devices .......................................11 + 5. Key Principles of Service Identification .......................12 + 5.1. Services Are a By-Product of Signaling ....................12 + 5.2. Identical Signaling Produces Identical Services ...........13 + 5.3. Do What I Say, Not What I Mean ............................14 + 5.4. Declarative Service Identifiers Are Redundant .............15 + 5.5. URIs Are Key for Differentiated Signaling .................15 + 6. Perils of Declarative Service Identification ...................16 + 6.1. Fraud .....................................................16 + 6.2. Systematic Interoperability Failures ......................17 + 6.3. Stifling of Service Innovation ............................18 + 7. Recommendations ................................................20 + 7.1. Use Derived Service Identification ........................20 + 7.2. Design for SIP's Negotiative Expressiveness ...............20 + 7.3. Presence ..................................................21 + 7.4. Intra-Domain ..............................................21 + 7.5. Device Dispatch ...........................................21 + 8. Security Considerations ........................................22 + 9. Acknowledgements ...............................................22 + 10. Informative References ........................................22 + + + + + + + + + + + +Rosenberg Informational [Page 2] + +RFC 5897 Service ID in SIP June 2010 + + +1. Introduction + + The Session Initiation Protocol (SIP) [RFC3261] defines mechanisms + for initiating and managing communications sessions between agents. + SIP allows for a broad array of session types between agents. It can + manage audio sessions, ranging from low-bitrate voice-only up to + multi-channel high-fidelity music. It can manage video sessions, + ranging from small, "talking-head" style video chat, up to high- + definition multipoint video conferencing and ranging from low- + bandwidth user-generated content, up to high-definition movie and TV + content. SIP endpoints can be anything -- adaptors that convert an + old analog telephone to Voice over IP (VoIP), dedicated hardphones, + fancy hardphones with rich displays and user entry capabilities, + softphones on a PC, buddy-list and presence applications on a PC, + dedicated videoconferencing peripherals, and speakerphones. + + This breadth of applicability is SIP's greatest asset, but it also + introduces numerous challenges. One of these is that, when an + endpoint generates a SIP INVITE for a session, or receives one, that + session can potentially be within the context of any number of + different use cases and endpoint types. For example, a SIP INVITE + with a single audio stream could represent a Push-To-Talk session + between mobile devices, a VoIP session between softphones, or audio- + based access to stored content on a server. + + Each of these different use cases represents a different service. + The service is the user-visible use case that is driving the behavior + of the user agents and servers in the SIP network. + + The differing services possible with SIP have driven implementors and + system designers to seek techniques for service identification. + Service identification is the process of determining and/or signaling + the specific use case that is driving the signaling being generated + by a user agent. At first glance, this seems harmless and easy + enough. It is tempting to define a new header, "Service-ID", for + example, and have a user agent populate it with any number of well- + known tokens that define what the service is. It could then be + consumed for any number of purposes. A token placed into the + signaling for this purpose is called a service identifier. + + Service identification and service identifiers, when used properly, + can be beneficial. However, when done improperly, service + identification can lead to fraud, systemic interoperability failures, + and a complete stifling of the innovation that SIP was meant to + achieve. The purpose of this document is to describe service + identification in more detail and describe how these problems arise. + + + + + +Rosenberg Informational [Page 3] + +RFC 5897 Service ID in SIP June 2010 + + + Section 2 begins by defining a service and the service identification + problem. Section 3 gives some concrete examples of services and why + they can be challenging to identify. Section 4 explores the ways in + which a service identification can be utilized within a network. + Next, Section 5 discusses the key architectural principles of service + identification. Section 6 describes what declarative service + invocation is, and how it can lead to fraud, interoperability + failures, and stifling of service innovation. + + Consequently, this document concludes that declarative service + identification -- the process by which a user agent inserts a moniker + into a message that defines the desired service, separate from + explicit and well-defined protocol mechanisms -- is harmful. + + Instead of performing declarative service identification, this + document recommends derived service identification, and gives several + recommendations around it in Section 7: + + 1. The identity of a service should always be derived from the + explicit signaling in the protocol messages and other contextual + information, and never indicated by the user through a separate + identifier placed into the message. + + 2. The process of service identification based on signaling messages + must be designed to SIP's negotiative expressiveness, and + therefore handle heterogeneity and not assume a fixed set of use + cases. + + 3. Presence can help in providing URIs that can be utilized to + connect to specific services, thereby creating explicit + indications in the signaling that can be used to derive a service + identity. + + 4. Service identities placed into signaling messages for the + purposes of caching the service identity are strictly for intra- + domain usage. + + 5. Device dispatch should be based on feature tags that map to well- + defined SIP extensions and capabilities. Service dispatch should + not be based on abstract service identifiers. + +2. Services and Service Identification + + The problem of identifying services within SIP is not a new one. The + problem has been considered extensively in the context of presence. + In particular, the presence data model for SIP [RFC4479] defines the + concept of a service as one of the core notions that presence + describes. Services are described in Section 3.3 of RFC 4479. + + + +Rosenberg Informational [Page 4] + +RFC 5897 Service ID in SIP June 2010 + + + Essentially, the service is the user-visible use case that is driving + the behavior of the user agents and servers in the SIP network. + Being user-visible means that there is a difference in user + experience between two services that are different. That user + experience can be part of the call, or outside of the call. Within a + call, the user experience can be based on different media types (an + audio call vs. a video chat), different content within a particular + media type (stored content, such as a movie or TV session), different + devices (a wireless device for "telephony" vs. a PC application for + "voice chat"), different user interfaces (a buddy-list view of voice + on a PC application vs. a software emulation of a hardphone), + different communities that can be accessed (voice chat with other + users that have the same voice chat client vs. voice communications + with any endpoint on the Public Switched Telephone Network (PSTN)), + or different applications that are invoked by the user (manually + selecting a Push-To-Talk application from a wireless phone vs. a + telephony application). Outside of a call, the difference in user + experience can be a billing one (cheaper for one service than + another), a notification feature for one and not another (for + example, an IM that gets sent whenever a user makes a call), and + so on. + + In some cases, there is very little difference in the underlying + technology that will support two different services, and in other + cases, there are big differences. However, for the purposes of this + discussion, the key definition is that two services are distinct when + there is a perceived difference by the user in the two services. + + This leads naturally to the desire to perform service identification. + Service identification is defined as the process of: + + 1. determining the underlying service that is driving a particular + signaling exchange, + + 2. associating that service with a service identifier, and + + 3. attaching that moniker to a signaling message (typically a SIP + INVITE). + + Once service identification is performed, the service identifier can + then be used for various purposes within the network. Service + identification can be done in the endpoints, in which case the UA + would insert the moniker directly into the signaling message based on + its awareness of the service. Or, it can be done within a server in + the network (such as a proxy), based on inspection of the SIP + message, or based on hints placed into the message by the user. + + + + + +Rosenberg Informational [Page 5] + +RFC 5897 Service ID in SIP June 2010 + + + When service identification is performed entirely by inspecting the + signaling, this is called derived service identification. When it is + done based on knowledge possessed only by the invoking user agent, it + is called declarative service identification. Declarative service + identification can only be done in user agents, by definition. + +3. Example Services + + It is very useful to consider several example services, especially + ones that appear difficult to differentiate from each other. In + cases where it is hard to differentiate, service identification -- + and in particular, declarative service identification -- appears + highly attractive (and indeed, required). + +3.1. IPTV vs. Multimedia + + IP Television (IPTV) is the usage of IP networks to access + traditional television content, such as movies and shows. SIP can be + utilized to establish a session to a media server in a network, which + then serves up multimedia content and streams it as an audio and + video stream towards the client. Whether SIP is ideal for IPTV is, + in itself, a good question. However, such a discussion is outside + the scope of this document. + + Consider multimedia conferencing. The user accesses a voice and + video conference at a conference server. The user might join in + listen-only mode, in which case the user receives audio and video + streams, but does not send. + + These two services -- IPTV and listen-only multimedia conferencing -- + clearly appear as different services. They have different user + experiences and applications. A user is unlikely to ever be confused + about whether a session is IPTV or listen-only multimedia + conferencing. Indeed, they are likely to have different software + applications or endpoints for the two services. + + However, these two services look remarkably alike based on the + signaling. Both utilize audio and video. Both could utilize the + same codecs. Both are unidirectional streams (from a server in the + network to the client). Thus, it would appear on the surface that + there is no way to differentiate them, based on inspection of the + signaling alone. + + + + + + + + + +Rosenberg Informational [Page 6] + +RFC 5897 Service ID in SIP June 2010 + + +3.2. Gaming vs. Voice Chat + + Consider an interactive game, played between two users from their + mobile devices. The game involves the users sending each other game + moves, using a messaging channel, in addition to voice. In another + service, users have a voice and IM chat conversation using a buddy- + list application on their PC. + + In both services, there are two media streams -- audio and messaging. + The audio uses the same codecs. Both use the Message Session Relay + Protocol (MSRP) [RFC4975]. In both cases, the caller would send an + INVITE to the Address of Record (AOR) of the target user. However, + these represent fairly different services, in terms of user + experience. + +3.3. Gaming vs. Voice Chat #2 + + Consider a variation on the example in Section 3.2. In this + variation, two users are playing an interactive game between their + phones. However, the game itself is set up and controlled using a + proprietary mechanism -- not using SIP at all. However, the client + application allows the user to chat with their opponent. The chat + session is a simple voice session set up between the players. + + Compare this with a basic telephone call between the two users. Both + involve a single audio session. Both use the same codecs. They + appear to be identical. However, different user experiences are + needed. For example, we desire traditional telephony features (such + as call forwarding and call screening) to be applied in the telephone + service, but not in the gaming chat service. + +3.4. Configuration vs. Pager Messaging + + The SIP MESSAGE method [RFC3428] provides a way to send one-shot + messages to a particular AOR. This specification is primarily aimed + at Short Message Service (SMS)-style messaging, commonly found in + wireless phones. Receipt of a MESSAGE request would cause the + messaging application on a phone to launch, allowing the user to + browse the message history and respond. + + However, a MESSAGE request is sometimes used for the delivery of + content to a device for other purposes. For example, some providers + use it to deliver configuration updates, such as new phone settings + or parameters, or to indicate that a new version of firmware is + available. Though not designed for this purpose, the MESSAGE method + gets used since, in existing wireless networks, SMS is used for this + purpose, and the MESSAGE request is the SIP equivalent of SMS. + + + + +Rosenberg Informational [Page 7] + +RFC 5897 Service ID in SIP June 2010 + + + Consequently, the MESSAGE request sent to a phone can be for two + different services. One would require invocation of a messaging app, + whereas the other would be consumed by the software in the phone, + without any user interaction at all. + +4. Using Service Identification + + It is important to understand what the service identity would be + utilized for, if known. This section discusses the primary uses. + These are application invocation in user agents and the network, + Quality of Service authorization, service authorization, accounting + and billing, service negotiation, and device dispatch. + +4.1. Application Invocation in the User Agent + + In some of the examples above, there were multiple software + applications executing on the host. One common way of achieving this + is to utilize a common SIP user agent implementation that listens for + requests on a single port. When an incoming INVITE or MESSAGE + arrives, it must be delivered to the appropriate application + software. When each service is bound to a distinct software + application, it would seem that the service identity is needed to + dispatch the message to the appropriate piece of software. This is + shown in Figure 1. + + +---------------------------------+ + | | + | +-------------+ +-------------+ | + | | UI | | UI | | + | +-------------+ +-------------+ | + | +-------------+ +-------------+ | + | | | | | | + | | Service 1 | | Service 2 | | + | | | | | | + | +-------------+ +-------------+ | + | +-----------------------------+ | + | | | | + | | SIP | | + | | Layer | | + | | | | + | +-----------------------------+ | + | | + +---------------------------------+ + + Physical Device + + Figure 1 + + + + +Rosenberg Informational [Page 8] + +RFC 5897 Service ID in SIP June 2010 + + + The role of the SIP layer is to parse incoming messages, handle the + SIP state machinery for transactions and dialogs, and then dispatch + requests to the appropriate service. This software architecture is + analogous to the way web servers frequently work. An HTTP server + listens on port 80 for requests, and based on the HTTP Request-URI, + dispatches the request to a number of disparate applications. The + same is happening here. For the example services in Section 3.2, an + incoming INVITE for the gaming service would be delivered to the + gaming application software. An incoming INVITE for the voice chat + service would be delivered to the voice chat application software. + The example in Section 3.3 is similar. For the examples in + Section 3.4, a MESSAGE request for user-to-user messaging would be + delivered to the messaging or SMS app, and a MESSAGE request + containing configuration data would be delivered to a configuration + update application. + + Unlike the web, however, in all three use cases, the user initiating + communications has (or appears to have -- more below) only a single + identifier for the recipient -- their AOR. Consequently, the SIP + Request-URI cannot be used for dispatching, as it is identical in all + three cases. + +4.2. Application Invocation in the Network + + Another usage of a service identifier would be to cause servers in + the SIP network to provide additional processing, based on the + service. For example, an INVITE issued by a user agent for IPTV + would pass through a server that does some kind of content rights + management, authorizing whether the user is allowed to access that + content. On the other hand, an INVITE issued by a user for + multimedia conferencing would pass through a server providing + "traditional" telephony features, such as outbound call screening and + call recording. It would make no sense for the INVITE associated + with IPTV to have outbound call screening and call recording applied, + and it would make no sense for the multimedia conferencing INVITE to + be processed by the content rights management server. Indeed, in + these cases, it's not just an efficiency issue (invoking servers when + not needed), but rather, truly incorrect behavior can occur. For + example, if an outbound call screening application is set to block + outbound calls to everything except for the phone numbers of friends + and family, an IPTV request that gets processed by such a server + would be blocked (as it's not targeted to the AOR of a friend or + family member). This would block a user's attempt to access IPTV + services, when that was not the goal at all. + + Similarly, a MESSAGE request as described in Section 3.4 might need + to pass through a message server for filtering when it is associated + with chat, but not when it is associated with a configuration update. + + + +Rosenberg Informational [Page 9] + +RFC 5897 Service ID in SIP June 2010 + + + Consider a filter that gets applied to MESSAGE requests, and that + filter runs in a server in the network. The filter operation + prevents user Joe from sending messages to user Bob that contain the + words "stock" or "purchase", due to some regulations that disallow + Joe and Bob from discussing stock trading. However, a MESSAGE for + configuration purposes might contain an XML document that uses the + token "stock" as some kind of attribute. This configuration update + would be discarded by the filtering server, when it should not have + been. + +4.3. Network Quality-of-Service Authorization + + The IP network can provide differing levels of Quality of Service + (QoS) to IP packets. This service can include guaranteed throughput, + latency, or loss characteristics. Typically, the user agent will + make some kind of QoS request, either using explicit signaling + protocols (such as the Resource ReSerVation Protocol (RSVP) + [RFC2205]) or through marking of a Diffserv value in packets. The + network will need to make a policy decision based on whether or not + these QoS treatments are authorized. One common authorization policy + is to check if the user has invoked a service using SIP that they are + authorized to invoke, and that this service requires the level of QoS + treatment the user has requested. + + For example, consider IPTV and multimedia conferencing as described + in Section 3.1. IPTV is a non-real-time service. Consequently, + media traffic for IPTV would be authorized for bandwidth guarantees, + but not for latency or loss guarantees. On the other hand, + multimedia conferencing is in real time. Its traffic would require + bandwidth, loss, and latency guarantees from the network. + + Consequently, if a user should make an RSVP reservation for a media + stream, and ask for latency guarantees for that stream, the network + would choose to be able to authorize it if the service was multimedia + conferencing, but not if it was IPTV. This would require the server + performing the QoS authorization to know the service associated with + the INVITE that set up the session. + +4.4. Service Authorization + + Frequently, a network administrator will want to authorize whether a + user is allowed to invoke a particular service. Not all users will + be authorized to use all services that are provided. For example, a + user may not be authorized to access IPTV services, whereas they are + authorized to utilize multimedia processing. A user might not be + able to utilize a multiplayer gaming service, whereas they are + authorized to utilize voice chat services. + + + + +Rosenberg Informational [Page 10] + +RFC 5897 Service ID in SIP June 2010 + + + Consequently, when an INVITE arrives at a server in the network, the + server will need to determine what the requested service is, so that + the server can make an authorization decision. + +4.5. Accounting and Billing + + Service authorization and accounting/billing go hand in hand. One of + the primary reasons for authorizing that a user can utilize a service + is that they are being billed differently based on the type of + service. Consequently, one of the goals of a service identity is to + be able to include it in accounting records, so that the appropriate + billing model can be applied. + + For example, in the case of IPTV, a service provider can bill based + on the content (US $5 per movie, perhaps), whereas for multimedia + conferencing, they can bill by the minute. This requires the + accounting streams to indicate which service was invoked for the + particular session. + +4.6. Negotiation of Service + + In some cases, when the caller initiates a session, they don't + actually know which service will be utilized. Rather, they might + choose to offer up all of the services they have available to the + called party, and then let the called party decide, or let the system + make a decision based on overlapping service capabilities. + + As an example, a user can do both the game and the voice chat service + described in Section 3.2. The user initiates a session to a target + AOR, but the devices used by the target can only support voice chat. + The called device returns, in its call acceptance, an indication that + only voice chat can be used. Consequently, voice chat gets utilized + for the session. + +4.7. Dispatch to Devices + + When a user has multiple devices, each with varying capabilities in + terms of service, it is useful to dispatch an incoming request to the + right device based on whether the device can support the service that + has been requested. + + For example, if a user initiates a gaming session with voice chat, + and the target user has two devices -- one that can support the + gaming service, and another that cannot -- the INVITE should be + dispatched to the device that supports the gaming session. + + + + + + +Rosenberg Informational [Page 11] + +RFC 5897 Service ID in SIP June 2010 + + +5. Key Principles of Service Identification + + In this section, we describe several key principles of service + identification: + + 1. Services are a by-product of signaling + + 2. Identical signaling produces identical services + + 3. Declarative service identification is an example of "Do What I + Mean" (DWIM) + + 4. Declarative service identifiers are redundant + + 5. URIs are a key mechanism for producing differentiated signaling + +5.1. Services Are a By-Product of Signaling + + Declarative service identification -- the addition of a service + identifier by clients in order to inform other entities of what the + service is -- is a very compelling solution to solving the use cases + described above. It provides a clear way for each of the use cases + to be differentiated. On the other hand, derived service + identification appears "hard", since the signaling appears to be the + same for these different services. + + Declarative service identification misses a key point, which cannot + be stressed enough, and which represents the core architectural + principle to be understood here: + + A service is the byproduct of the signaling and the context around + it (the user profile, time of day, and so on) -- the effects of + the signaling message once it is launched into the network. The + service identity is therefore always derivable from the signaling + and its context without additional identifiers. In other words, + derived service identification is always possible when signaling + is being properly handled. + + When a user sends an INVITE request to the network and targets that + request at an IPTV server, and includes the Session Description + Protocol (SDP) for audio and video streaming, the *result* of sending + such an INVITE is that an IPTV session occurs. The entire purpose of + the INVITE is to establish such a session, and therefore, invoke the + service. Thus, a service is not something that is different from the + rest of the signaling message. A service is what the user gets after + the network and other user agents have processed a signaling message. + + + + + +Rosenberg Informational [Page 12] + +RFC 5897 Service ID in SIP June 2010 + + + It may seem that delayed offers (SIP INVITE requests that lack SDP) + make it impossible to perform derived service identification. After + all, in some of the cases above, the differentiation was done using + the SDP in the request. What if it's not there? The answer is + simple -- if it's not there, and the SDP is being offered by the + called party, you cannot in fact know the service at the time of the + INVITE. That's the whole point of delayed offer -- to give the + called party the chance to offer up what it wants for the session. + In cases where service identification is needed at request time, + delayed offer cannot be used. + +5.2. Identical Signaling Produces Identical Services + + This principle is a natural conclusion of the previous assertion. If + a service is the byproduct of signaling, how can a user have + different experiences and different services when the signaling + message is the same? They cannot. + + But how can that be? From the examples in Section 3, it would seem + that there are services that are different, but have identical + signaling. If we hold true to the assertion, there is in fact only + one logical conclusion: + + If two services are different, but their signaling appears to be + the same, it is because one or more of the following is true: + + 1. there is in fact something different that has been overlooked + + 2. something has been implied from the signaling, when in fact it + should have been signaled explicitly + + 3. the signaling mechanism should be changed so that there is, in + fact, something that is different + + To illustrate this, let us take each of the example services in + Section 3 and investigate whether there is, or should be, something + different in the signaling in each case. + + IPTV vs. Multimedia Conferencing: The two services described in + Section 3.1 appear to have identical signaling. They both involve + audio and video streams, both of which are unidirectional. Both + might utilize the same codecs. However, there is another + important difference in the signaling -- the target URI. In the + case of IPTV, the request is targeted at a media server or to a + particular piece of content to be viewed. In the case of + multimedia conferencing, the target is a conference server. The + administrator of the domain can therefore examine the Request-URI + + + + +Rosenberg Informational [Page 13] + +RFC 5897 Service ID in SIP June 2010 + + + and figure out whether it is targeted for a conference server or a + content server, and use that to derive the service associated with + the request. + + Gaming vs. Voice Chat: Though both sessions involve MSRP and voice, + and both are targeted to the same AOR of the called user, there is + a difference. The MSRP messages for the gaming session carry + content that is game specific, whereas the MSRP messages for the + voice chat are just regular text, meant for rendering to a user. + Thus, the MSRP session in the SDP will indicate the specific + content type that MSRP is carrying, and this type will differ in + both cases. Even if the game moves look like text, since they are + being consumed by an automata, there is an underlying schema that + dictates their content, and therefore, this schema represents the + actual content type that should be signaled. + + Gaming vs. Voice Chat #2: In this case, both sessions involve only + voice, and both are targeted at the same AOR. Indeed, there truly + is nothing different -- if indeed the signaling works this way. + However, there is an alternative mechanism for performing the + signaling. For the gaming session, the proprietary protocol can + be used to exchange a URI that can be used to identify the voice + chat function on the phone that is associated with the game (for + example, a Globally Routable User Agent URI (GRUU) can be used + [RFC5627]). Indeed, the gaming chat is not targeting the USER -- + it's targeting the gaming instance on the phone. Thus, if a + special GRUU is used for the gaming chat, this makes the signaling + different between these two services. + + Configuration vs. Pager Messaging: Just as in the case of gaming vs. + voice chat, the content type of the messages differentiates the + service that occurs as a consequence of the messages. + +5.3. Do What I Say, Not What I Mean + + "Do What I Mean", abbreviated as DWIM, is a concept in computer + science. It is sometimes used to describe a function that tries to + intelligently guess at what the user intended. It is in contrast to + "Do What I Say", or DWIS, which describes a function that behaves + concretely based on the inputs provided. Systems built on the DWIM + concept can have unexpected behaviors, because they are driven by + unstated rules. + + Declarative service identification is an example of DWIM. The + service identifier has no well-defined impact on the state machinery + or protocols in the system; it has various side effects based on an + assumption of what is meant by the service identifier. Derived + service identification, on the other hand, is an expression of the + + + +Rosenberg Informational [Page 14] + +RFC 5897 Service ID in SIP June 2010 + + + principle of DWIS -- the behavior of the system is based entirely on + the specifics of the protocol and are well defined by the protocol + specification. The service identifier is just a shorthand for + summarizing things that are well defined by signaling. + + As a litmus test to differentiate the two cases, consider the + following question. If a request contained a service identifier, and + that request were processed by a domain that didn't understand the + concept of service identifiers at all, would the request be rejected + if that service were not supported, or would it complete but do the + wrong thing? If it is the latter case, it's DWIM. If it's the + former, it's DWIS. + +5.4. Declarative Service Identifiers Are Redundant + + Because a declarative service identifier is, by definition, inside of + the signaling message, and because the signaling itself completely + defines the behavior of the service, another natural conclusion is + that a declarative service identifier is redundant with the signaling + itself. It says nothing that could not or should not otherwise be + derived from examination of the signaling. + +5.5. URIs Are Key for Differentiated Signaling + + In the IPTV example and in the second gaming example, it was + ultimately the Request-URI that was (or should be) different between + the two services. This is important. In many cases where services + appear the same, it is because the resource that is being targeted is + not, in fact, the user. Rather, it is a resource that is linked with + the user. This resource might be an instance of a software + application on the particular device of a user, or a resource in the + network that acts on behalf of the user. + + The Request-URI is an infinitely large namespace for identifying + these resources. It is an ideal mechanism for providing + differentiation when there would otherwise be none. + + Returning again to the example in Section 3.3, we can see that it + does make more sense to target the gaming chat session at a software + instance on the user's phone, rather than at the user themselves. + The gaming chat session should really only go to the phone on which + the user is playing the game. The software instance does indeed live + only on that phone, whereas the user themselves can be contacted in + many ways. We don't want telephony features invoked for the gaming + chat session, because those features only make sense when someone is + trying to communicate with the USER. When someone is trying to + + + + + +Rosenberg Informational [Page 15] + +RFC 5897 Service ID in SIP June 2010 + + + communicate with a software instance that acts on behalf of the user, + a different set of rules apply, since the target of the request is + completely different. + +6. Perils of Declarative Service Identification + + Based on these principles, several perils of declarative service + identification can be described. They are: + + 1. Declarative service identification can be used for fraud + + 2. Declarative service identification can hurt interoperability + + 3. Declarative service identification can stifle service innovation + +6.1. Fraud + + Declarative service identification can lead to fraud. If a provider + uses the service identifier for billing and accounting purposes, or + for authorization purposes, it opens an avenue for attack. The user + can construct the signaling message so that its actual effect (which + is the service the user will receive), is what the user desires, but + the user places a service identifier into the request (which is what + is used for billing and authorization) that identifies a cheaper + service, or one that the user is not authorized to receive. In such + a case, the user will receive service, and not be billed properly for + it. + + If, however, the domain administrator derived the service identifier + from the signaling itself (derived service identification), the user + cannot lie. If they did lie, they wouldn't get the desired service. + + Consider the example of IPTV vs. multimedia conferencing. If + multimedia conferencing is cheaper, the user could send an INVITE for + an IPTV session, but include a service identifier that indicates + multimedia conferencing. The user gets the service associated with + IPTV, but at the cost of multimedia conferencing. + + This same principle shows up in other places -- for example, in the + identification of an emergency services call [ECRIT-FRAMEWORK]. It + is desirable to give emergency services calls special treatment, such + as being free and authorized even when the user cannot otherwise make + calls, and to give them priority. If emergency calls were indicated + through something other than the target of the call being an + emergency services URN [RFC5031], it would open an avenue for fraud. + The user could place any desired URI in the request-URI, and indicate + separately, through a declarative identifier, that the call is an + emergency services call. This would then get special treatment but + + + +Rosenberg Informational [Page 16] + +RFC 5897 Service ID in SIP June 2010 + + + of course would get routed to the target URI. The only way to + prevent this fraud is to consider an emergency call as any call whose + target is an emergency services URN. Thus, the service + identification here is based on the target of the request. When the + target is an emergency services URN, the request can get special + treatment. The user cannot lie, since there is no way to separately + indicate that this is an emergency call, besides targeting it to an + emergency URN. + +6.2. Systematic Interoperability Failures + + How can declarative service identification cause loss of + interoperability? When an identifier is used to drive functionality + -- such as dispatch on the phones, in the network, or QoS + authorization -- it means that the wrong thing can happen when this + field is not set properly. Consider a user in domain 1, calling a + user in domain 2. Domain 1 provides the user with a service they + call "voice chat", which utilizes voice and IM for real-time + conversation, driven off of a buddy-list application on a PC. + Domain 2 provides their users with a service they call "text + telephony", which is a voice service on a wireless device that also + allows the user to send text messages. Consider the case where + domain 1 and domain 2 both have their user agents insert a service + identifier into the request, and then use that to perform QoS + authorization, accounting, and invocation of applications in the + network and in the device. The user in domain 1 calls the user in + domain 2, and inserts the identifier "Voice Chat" into the INVITE. + When this arrives at the server in domain 2, the service identifier + is unknown. Consequently, the request does not get the proper QoS + treatment, even if the call itself will succeed. + + If, on the other hand, derived service identification were used, the + service identifier could be removed by domain 2, and then recomputed + based on the signaling to match its own notion of services. In this + case, domain 2 could derive the "text telephony" identifier, and the + request completes successfully. + + Declarative service identification, used between domains, causes + interoperability failures unless all interconnected domains agree on + exactly the same set of services and how to name them. Of course, + lack of service identifiers does not guarantee service + interoperability. However, SIP was built with rich tools for + negotiation of capabilities at a finely granular level. One user + agent can make a call using audio and video, but if the receiving UA + only supports audio, SIP allows both sides to negotiate down to the + lowest common denominator. Thus, communication is still provided. + As another example, if one agent initiates a Push-To-Talk session + (which is audio with a companion floor control mechanism), and the + + + +Rosenberg Informational [Page 17] + +RFC 5897 Service ID in SIP June 2010 + + + other side only did regular audio, SIP would be able to negotiate + back down to a regular voice call. As another example, if a calling + user agent is running a high-definition video conferencing endpoint, + and the called user agent supports just a regular video endpoint, the + codecs themselves can negotiate downward to a lower rate, picture + size, and so on. Thus, interoperability is achieved. Interestingly, + the final "service" may no longer be well characterized by the + service identifier that would have been placed in the original + INVITE. For example, in this case, if the original INVITE from the + caller had contained the service identifier "hi-fi video", but the + video gets negotiated down to a lower rate and picture size, the + service identifier is no longer really appropriate. That is why + services need to be derived by signaling -- because the signaling + itself provides negotiation and interoperability between different + domains. + + This illustrates another key aspect of the interoperability problem. + Declarative service identification will result in inconsistencies + between its service identifiers and the results of any SIP + negotiation that might otherwise be applied in the session. + + When a service identifier becomes something that both proxies and the + user agent need to understand in order to properly treat a request + (which is the case for declarative service identification), it + becomes equivalent to including a token in the Proxy-Require and + Require header fields of every single SIP request. The very reason + that [RFC4485] frowns upon usage of Require and certainly Proxy- + Require is the huge impact on interoperability it causes. It is for + this same reason that declarative service identification needs to be + avoided. + +6.3. Stifling of Service Innovation + + The probability that any two service providers end up with the same + set of services, and give those services the same names, becomes + smaller and smaller as the number of providers grow. Indeed, it + would almost certainly require a centralized authority to identify + what the services are, how they work, and what they are named. This, + in turn, leads to a requirement for complete homogeneity in order to + facilitate interconnection. Two providers cannot usefully + interconnect unless they agree on the set of services they are + offering to their customers and each do the same thing. This is + because each provider has become dependent on inclusion of the proper + service identifier in the request, in order for the overall treatment + of the request to proceed correctly. This is, in a very real sense, + anathema to the entire notion of SIP, which is built on the idea that + heterogeneous domains can interconnect and still get + interoperability. + + + +Rosenberg Informational [Page 18] + +RFC 5897 Service ID in SIP June 2010 + + + Declarative service identification leads to a requirement for + homogeneity in service definitions across providers that + interconnect, ruining the very service heterogeneity that SIP was + meant to bring. + + Indeed, Metcalfe's Law says that the value of a network grows with + the square of the number of participants. As a consequence of this, + once a bunch of large domains did get together, agree on a set of + services, and then agree on a set of well-known identifiers for those + services, it would force other providers to also deploy the same + services, in order to obtain the value that interconnection brings. + This, in turn, will stifle innovation, and quickly force the set of + services in SIP to become fixed and never expand beyond the ones + initially agreed upon. This, too, is anathema to the very framework + on which SIP is built, and defeats much of the purpose of why + providers have chosen to deploy SIP in their own networks. + + Consider the following example. Several providers get together and + standardize on a bunch of service identifiers. One of these uses + audio and video (say, "multimedia conversation"). This service is + successful and is widely utilized. Endpoints look for this + identifier to dispatch calls to the right software applications, and + the network looks for it to invoke features, perform accounting, and + provide QoS. A new provider gets the idea for a new service (say, + "avatar-enhanced multimedia conversation"). In this service, there + is audio and video, but there is a third stream, which renders an + avatar. A caller can press buttons on their phone, to cause the + avatar on the other person's device to show emotion, make noise, and + so on. This is similar to the way emoticons are used today in IM. + This service is enabled by adding a third media stream (and + consequently, a third m-line) to the SDP. + + Normally, this service would be backwards-compatible with a regular + audio-video endpoint, which would just reject the third media stream. + However, because a large network has been deployed that is expecting + to see the token, "multimedia conversation" and its associated audio+ + video service, it is nearly impossible for the new provider to roll + out this new service. If they did, it would fail completely, or + partially fail, when their users call users in other provider + domains. + + + + + + + + + + + +Rosenberg Informational [Page 19] + +RFC 5897 Service ID in SIP June 2010 + + +7. Recommendations + + From these principles, several recommendations can be made. + +7.1. Use Derived Service Identification + + Derived service identification -- where an identifier for a service + is obtained by inspection of the signaling and of other contextual + data (such as subscriber profile) -- is reasonable, and when done + properly, does not lead to the perils described above. However, + declarative service identification -- where user agents indicate what + the service is, separate from the rest of the signaling -- leads to + the perils described above. + + If it appears that the signaling currently defined in standards is + not sufficient to identify the service, it may be due to lack of + sufficient signaling to convey what is needed, or may be because + request URIs should be used for differentiation and they are not + being used. By applying the litmus tests described in Section 5.3, + network designers can determine whether or not the system is + attempting to perform declarative service identification. + +7.2. Design for SIP's Negotiative Expressiveness + + One of SIP's key strengths is its ability to negotiate a common view + of a session between participants. This means that the service that + is ultimately received can vary wildly, depending on the types of + endpoints in the call and their capabilities. Indeed, this fact + becomes even more evident when calls are set up between domains. + + As such, when performing derived service identification, domains + should be aware that sessions may arrive from different networks and + different endpoints. Consequently, the service identification + algorithm must be complete -- meaning it computes the best answer for + any possible signaling message that might be received and any session + that might be set up. + + In a homogeneous environment, the process of service identification + is easy. The service provider will know the set of services they are + providing, and based on the specific call flows for each specific + service, can construct rules to differentiate one service from + another. However, when different providers interconnect, or when + different endpoints are introduced, assumptions about what services + are used, and how they are signaled, no longer apply. To provide the + best user experience possible, a provider doing service + identification needs to perform a "best-match" operation, such that + + + + + +Rosenberg Informational [Page 20] + +RFC 5897 Service ID in SIP June 2010 + + + any legal SIP signaling -- not just the specific call flows running + within their own network amongst a limited set of endpoints -- is + mapped to the appropriate service. + +7.3. Presence + + Presence can help a great deal with providing unique URIs for + different services. When a user wishes to contact another user, and + knows only the AOR for the target (which is usually the case), the + user can fetch the presence document for the target. That document, + in turn, can contain numerous service URIs for contacting the target + with different services. Those URIs can then be used in the Request- + URI for differentiation. When possible, this is the best solution to + the problem. + +7.4. Intra-Domain + + Service identifiers themselves are not bad; derived service + identification allows each domain to cache the results of the service + identification process for usage by another network element within + the same domain. However, service identifiers are fundamentally + useful within a particular domain, and any such header must be + stripped at a network boundary. Consequently, the process of service + identification and their associated service identifiers is always an + intra-domain operation. + +7.5. Device Dispatch + + Device dispatch should be done following the principles of [RFC3841], + using implicit preferences based on the signaling. For example, + [RFC5688] defines a new UA capability that can be used to dispatch + requests based on different types of application media streams. + + However, it is a mistake to try and use a service identifier as a UA + capability. Consider a service called "multimedia telephony", which + adds video to the existing PSTN experience. A user has two devices, + one of which is used for multimedia telephony and the other strictly + for a voice-assisted game. It is tempting to have the telephony + device include a UA capability [RFC3840] called "multimedia + telephony" in its registration. A calling multimedia telephony + device can then include the Accept-Contact header field [RFC3841] + containing this feature tag. The proxy serving the called party, + applying the basic algorithms of [RFC3841], will correctly route the + call to the terminating device. + + However, if the calling party is not within the same domain, and the + calling domain does not know about or use this feature tag, there + will be no Accept-Contact header field, even if the calling party was + + + +Rosenberg Informational [Page 21] + +RFC 5897 Service ID in SIP June 2010 + + + using a service that is a good match for "multimedia telephony". In + such a case, the call may be delivered to both devices, but it will + yield a poorer user experience. That's because device dispatch was + done using declarative service identification. + + The best way to avoid this problem is to use feature tags that can be + matched to well-defined signaling features -- media types, required + SIP extensions, and so on. In particular, the golden rule is that + the granularity of feature tags must be equivalent to the granularity + of individual features that can be signaled in SIP. + +8. Security Considerations + + Oftentimes, the service associated with a request is utilized for + purposes such as authorization, accounting, and billing. When + service identification is not done properly, the possibility of + unauthorized service use and network fraud is introduced. It is for + this reason, discussed extensively in Section 6.1, that the usage of + declarative service identifiers inserted by a UA is not recommended. + +9. Acknowledgements + + This document is based on discussions with Paul Kyzivat and + Andrew Allen, who contributed significantly to the ideas here. Much + of the content in this document is a result of discussions amongst + participants in the SIPPING mailing list, including Dean Willis, + Tom Taylor, Eric Burger, Dale Worley, Christer Holmberg, and + John Elwell, amongst many others. Thanks to Spencer Dawkins, + Tolga Asveren, Mahesh Anjanappa, and Claudio Allochio for reviews of + this document. + +10. Informative References + + [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, + A., Peterson, J., Sparks, R., Handley, M., and E. + Schooler, "SIP: Session Initiation Protocol", RFC 3261, + June 2002. + + [RFC4479] Rosenberg, J., "A Data Model for Presence", RFC 4479, + July 2006. + + [RFC4485] Rosenberg, J. and H. Schulzrinne, "Guidelines for Authors + of Extensions to the Session Initiation Protocol (SIP)", + RFC 4485, May 2006. + + [RFC4975] Campbell, B., Mahy, R., and C. Jennings, "The Message + Session Relay Protocol (MSRP)", RFC 4975, September 2007. + + + + +Rosenberg Informational [Page 22] + +RFC 5897 Service ID in SIP June 2010 + + + [RFC5031] Schulzrinne, H., "A Uniform Resource Name (URN) for + Emergency and Other Well-Known Services", RFC 5031, + January 2008. + + [ECRIT-FRAMEWORK] + Rosen, B., Schulzrinne, H., Polk, J., and A. Newton, + "Framework for Emergency Calling using Internet + Multimedia", Work in Progress, July 2009. + + [RFC5627] Rosenberg, J., "Obtaining and Using Globally Routable User + Agent URIs (GRUUs) in the Session Initiation Protocol + (SIP)", RFC 5627, October 2009. + + [RFC5688] Rosenberg, J., "A Session Initiation Protocol (SIP) Media + Feature Tag for MIME Application Subtypes", RFC 5688, + January 2010. + + [RFC3428] Campbell, B., Rosenberg, J., Schulzrinne, H., Huitema, C., + and D. Gurle, "Session Initiation Protocol (SIP) Extension + for Instant Messaging", RFC 3428, December 2002. + + [RFC3841] Rosenberg, J., Schulzrinne, H., and P. Kyzivat, "Caller + Preferences for the Session Initiation Protocol (SIP)", + RFC 3841, August 2004. + + [RFC3840] Rosenberg, J., Schulzrinne, H., and P. Kyzivat, + "Indicating User Agent Capabilities in the Session + Initiation Protocol (SIP)", RFC 3840, August 2004. + + [RFC2205] Braden, B., Zhang, L., Berson, S., Herzog, S., and S. + Jamin, "Resource ReSerVation Protocol (RSVP) -- Version 1 + Functional Specification", RFC 2205, September 1997. + +Author's Address + + Jonathan Rosenberg + jdrosen.net + Monmouth, NJ + USA + + EMail: jdrosen@jdrosen.net + URI: http://www.jdrosen.net + + + + + + + + + +Rosenberg Informational [Page 23] + |