Diffstat (limited to 'doc/rfc/rfc4396.txt')
-rw-r--r-- | doc/rfc/rfc4396.txt | 3699 |
1 files changed, 3699 insertions, 0 deletions
diff --git a/doc/rfc/rfc4396.txt b/doc/rfc/rfc4396.txt new file mode 100644 index 0000000..be4f173 --- /dev/null +++ b/doc/rfc/rfc4396.txt @@ -0,0 +1,3699 @@ + + + + + + +Network Working Group J. Rey +Request for Comments: 4396 Y. Matsui +Category: Standards Track Panasonic + February 2006 + + + RTP Payload Format + for 3rd Generation Partnership Project (3GPP) Timed Text + +Status of This Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (2006). + +Abstract + + This document specifies an RTP payload format for the transmission of + 3GPP (3rd Generation Partnership Project) timed text. 3GPP timed + text is a time-lined, decorated text media format with defined + storage in a 3GP file. Timed Text can be synchronized with + audio/video contents and used in applications such as captioning, + titling, and multimedia presentations. In the following sections, + the problems of streaming timed text are addressed, and a payload + format for streaming 3GPP timed text over RTP is specified. + + + + + + + + + + + + + + + + + + + + +Rey & Matsui Standards Track [Page 1] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + +Table of Contents + + 1. Introduction ....................................................3 + 2. Motivation, Requirements, and Design Rationale ..................3 + 2.1. Motivation .................................................3 + 2.2. Basic Components of the 3GPP Timed Text Media Format .......4 + 2.3. Requirements ...............................................5 + 2.4. Limitations ................................................6 + 2.5. 
Design Rationale ...........................................7 + 3. Terminology ....................................................10 + 4. RTP Payload Format for 3GPP Timed Text .........................12 + 4.1. Payload Header Definitions ................................13 + 4.1.1. Common Payload Header Fields .......................15 + 4.1.2. TYPE 1 Header ......................................17 + 4.1.3. TYPE 2 Header ......................................20 + 4.1.4. TYPE 3 Header ......................................23 + 4.1.5. TYPE 4 Header ......................................24 + 4.1.6. TYPE 5 Header ......................................25 + 4.2. Buffering of Sample Descriptions ..........................25 + 4.2.1. Dynamic SIDX Wraparound Mechanism ..................26 + 4.3. Finding Payload Header Values in 3GP Files ................28 + 4.4. Fragmentation of Timed Text Samples .......................31 + 4.5. Reassembling Text Samples at the Receiver .................33 + 4.6. On Aggregate Payloads .....................................35 + 4.7. Payload Examples ..........................................39 + 4.8. Relation to RFC 3640 ......................................43 + 4.9. Relation to RFC 2793 ......................................44 + 5. Resilient Transport ............................................45 + 6. Congestion Control .............................................46 + 7. Scene Description ..............................................47 + 7.1. Text Rendering Position and Composition ...................47 + 7.2. SMIL Usage ................................................48 + 7.3. Finding Layout Values in a 3GP File .......................48 + 8. 3GPP Timed Text Media Type .....................................49 + 9. SDP Usage ......................................................53 + 9.1. Mapping to SDP ............................................53 + 9.2. Parameter Usage in the SDP Offer/Answer Model .............53 + 9.2.1. 
Unicast Usage ......................................54 + 9.2.2. Multicast Usage ....................................57 + 9.3. Offer/Answer Examples .....................................58 + 9.4. Parameter Usage outside of Offer/Answer ...................60 + 10. IANA Considerations ...........................................60 + 11. Security Considerations .......................................60 + 12. References ....................................................61 + 12.1. Normative References .....................................61 + 12.2. Informative References ...................................61 + 13. Basics of the 3GP File Structure ..............................64 + 14. Acknowledgements ..............................................65 + + + +Rey & Matsui Standards Track [Page 2] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + +1. Introduction + + 3GPP timed text is a media format for time-lined, decorated text + specified in the 3GPP Technical Specification TS 26.245, "Transparent + end-to-end packet switched streaming service (PSS); Timed Text Format + (Release 6)" [1]. Besides plain text, the 3GPP timed text format + allows the creation of decorated text such as that for karaoke + applications, scrolling text for newscasts, or hyperlinked text. + These contents may or may not be synchronized with other media, such + as audio or video. + + The purpose of this document is to provide a means to stream 3GPP + timed text contents using RTP [3]. This includes the streaming of + timed text being read out of a (3GP) file, as well as the streaming + of timed text generated in real-time, a.k.a. live streaming. + + Section 2 contains the motivation for this document, an overview of + the media format, the requirements, and the design rationale. + Section 3 defines the terminology used. 
Section 4 specifies the
+   payload headers, the fragmentation and re-assembly rules for text
+   samples, the rules for payload aggregation, and the relations of this
+   document to RFC 3640 [12] and RFC 2793 [22]. Section 5 specifies
+   some simple schemes for resilient transport and gives pointers to
+   other possible mechanisms. Section 6 addresses congestion control.
+   Section 7 specifies scene description. Section 8 defines the media
+   type. Section 9 specifies SDP for unicast and multicast sessions,
+   including usage in the Offer/Answer model [13]. Sections 10 and 11
+   address IANA and security considerations. Section 12 lists
+   references. Basics of the 3GP File Structure are in Section 13.
+
+2.  Motivation, Requirements, and Design Rationale
+
+2.1.  Motivation
+
+   The 3GPP timed text format was developed for use in the services
+   specified in the 3GPP Transparent End-to-end Packet-switched
+   Streaming Services (3GPP PSS) specification [16].
+
+   As of today, PSS allows downloading 3GPP timed text contents stored
+   in 3GP files. However, due to the lack of an RTP payload format, it
+   is not possible to stream 3GPP timed text contents over RTP.
+
+   This document specifies such a payload format.
+
+2.2.  Basic Components of the 3GPP Timed Text Media Format
+
+   Before going into the details of the design, it is necessary to know
+   how the media format is constructed. We can identify four distinct
+   functional components: layout information, default formatting, text
+   strings, and decoration. The following list briefly explains these
+   and matches them to their designations in a 3GP file:
+
+      o Initial spatial layout information related to the text
+        strings: These are the height and width of the text region
+        where text is displayed, the position of the text region in
+        the display, and the layer or proximity of the text to the
+        user.
In 3GP files, this information is contained in the + Track Header Box (3GP file designations are capitalized for + clarity). + + o Default settings for formatting and positioning of text: style + (font, size, color,...), background color, horizontal and + vertical justification, line width, scrolling, etc. For 3GP + files, this corresponds to the Sample Descriptions. + + o The actual text strings: encoded characters using either UTF-8 + [18] or UTF-16 [19] encoding. + + o The decoration: If some characters have different style, + delay, blink, etc., this needs to be indicated. The + decoration is only present in the text samples if it is + actually needed. Otherwise, the default settings as above + apply. In 3GP files, within each Text Sample, the decoration + (i.e., Modifier Boxes) is appended to the text strings, if + needed. At the time of writing this payload format, the + following modifiers are specified in the 3GPP timed text media + format specification [1]: + + - text highlight + - highlight color + - blinking text + - karaoke feature + - hyperlink + - text delay + - text style + - positioning of the text box + - text wrap indication + + + + + + + +Rey & Matsui Standards Track [Page 4] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + +2.3. Requirements + + Once the basic components are known, it is necessary to define which + requirements the payload format shall fulfill: + + 1. It shall enable both live streaming and streaming from a 3GP + file. + + Informative note: For the purpose of this document, the + term "live streaming" refers to those scenarios where + the timed text stream is sent from a live encoder. Upon + reception, the content may or may not be stored in a 3GP + file. Typically, in live streaming applications, the + sender encapsulates the timed text content in RTP + packets following the guidelines given in this document. + At the receiving side, a buffer is used to cancel the + network delay and delay jitter. 
If receiver and sender + support packet loss resilience mechanisms (see Section + 5), it may also be possible to recover from packet + losses. Note that how sender and receiver actually + manage and dimension the buffers is an implementation + design choice. + + 2. Furthermore, it shall be possible for an RTP receiver using this + payload format, and capable of storing in 3GP format, to obtain + all necessary information from the RTP packets for storing the + received text contents according to the 3GP file format. This + file may or may not be the same as the original file. + + Informative note: The 3GP file format itself is based on + the ISO Base Media File Format recommendation [2]. + Section 13.1 gives some insight into the 3GP file + structure. Further, Sections 4.3 and 7.3 specify where + the information needed for filling in payload headers is + found in a 3GP file. For live streaming, appropriate + values complying with the format and units described in + [1] shall be used. Where needed, clarifications on + appropriate values are given in this document. + + 3. It shall enable efficient and resilient transport of timed text + contents over RTP. In particular: + + a. Enable the transmission of the sample descriptions by both + out-of-band and in-band means. Sample descriptions are + important information, which potentially apply to several + text samples. These default formatting settings are + typically transmitted out-of-band (reliably) once at the + initialization phase. If additional sample descriptions + + + +Rey & Matsui Standards Track [Page 5] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + are needed in the course of a session, these may also be + sent out-of-band or in-band. In-band transmission, + although unreliable, may be more appropriate for sending + sample descriptions if these should be sent frequently, as + opposed to establishing an additional communication channel + for SDP, for example. 
It is also useful in cases where an + out-of-band channel may not be available and for live + streaming, where contents are not known a priori. Thus, + the payload format shall enable out-of-band and in-band + transmission of sample descriptions. Section 4.1.6 + specifies a payload header for transmitting sample + descriptions in-band. Section 9 specifies how sample + descriptions are mapped to SDP. + + b. Enable the fragmentation of a text sample into several RTP + packets in order to cover a wide range of applications and + network environments. In general, fragmentation should be + a rare event, given the low bit rates and relatively small + text sample sizes. However, the 3GPP Timed Text media + format does allow for larger text samples. Therefore, the + payload format shall take this into account and provide a + means for coping with fragmentation and reassembly. Section + 4.4 deals with fragmentation. + + c. Enable the aggregation of units into an RTP packet for + making the transport more efficient. In a mobile + communication environment, a typical text sample size is + around 100-200 bytes. If the available bit rate and the + packet size allow it, units should be aggregated into one + RTP packet. Section 4.6 deals with aggregation. + + d. Enable the use of resilient transport mechanisms, such as + repetition, retransmission [11], and FEC [7] (see Section + 5). For a more general discussion, refer to RFC 2354 [8], + which discusses available mechanisms for stream repair. + +2.4. Limitations + + The payload headers have been optimized in size for RTP. Instead + of using 32-bit (S)LEN, SDUR, and SIDX header fields, which would + carry many unused bits much of the time, it has been a design + choice to reduce the size of these fields. As a consequence, this + payload format has reduced maximum values with respect to sizes and + durations of (text) samples and sample descriptions. 
These maximum + values differ from those allowed in 3GP files, where they are + expressed using 32-bit (unsigned) integers. In some cases, + + + + + +Rey & Matsui Standards Track [Page 6] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + extension mechanisms are provided to deal with larger values. + However, it is noted that the values used here should be enough for + the streaming applications targeted. + + The following limitations apply: + + 1. The maximum size of text samples carried in RTP packets is + restricted to be a 16-bit (unsigned) integer (this includes the + text strings and modifiers). This means a maximum size for the + unit would be about 64 Kbytes. No extension mechanism is + provided. + + 2. The sample description index values are restricted to be an 8- + bit (unsigned) integer. An extension mechanism is given in + Section 4.3. + + 3. The text sample duration is restricted to be a 24-bit (unsigned) + integer. This yields a maximum duration at a timestamp + clockrate of 1000 Hz of about 4.6 hours. Nevertheless, an + extension mechanism is provided in Section 4.3. + + 4. Sample descriptions are also restricted in size: If the size + cannot be expressed as a 16-bit (unsigned) integer, the sample + description shall not be conveyed. As in the case of the sample + size, no extension mechanism is provided. + + 5. A further limitation concerns the UTF-16 encodings supported: + Only transport of text strings following big endian byte order + is supported. See Section 4.1.1 for details. + +2.5. Design Rationale + + The following design choices were made: + + 1. 'Unit' approach: The payload formats specified in this document + follow a simple scheme: a 3-byte common header (Common Payload + Header) followed by a specific header for each text sample + (fragment) type. Following these headers, the text sample + contents are placed (Section 4.1.1 and following). This + structure is called a 'unit'. 
+
+      The following units have been devised to comply with the
+      requirements mentioned in Section 2.3:
+
+         a. A TYPE 1 unit that contains one complete text sample,
+
+         b. A TYPE 2 unit that contains a complete text string or a
+            fragment thereof,
+
+         c. A TYPE 3 unit that contains the complete modifiers or only
+            the first fragment thereof,
+
+         d. A TYPE 4 unit that contains one modifier fragment other
+            than the first, and
+
+         e. A TYPE 5 unit that contains one sample description.
+
+      This 'unit' approach was motivated by the following reasons:
+
+         1. It allows a simple classification of the text samples and
+            text sample fragments that can be conveyed by the
+            payload format.
+
+         2. It enables easy interoperability with RFC 3640 [12].
+            During the development of this payload format, interest
+            was shown from MPEG-4 standardization participants in
+            developing a common payload structure for the transport
+            of 3GPP Timed Text. While interoperability is not
+            strictly necessary for this payload format to work, it
+            has been pursued in this payload format. Section 4.8
+            explains how this is done.
+
+   2. Character count is not implemented. This payload format does
+      detect lost text sample fragments, but it does not enable an
+      RTP receiver to find out the exact number of text characters
+      lost. In fact, the fragment size included in the payload
+      headers does not help in finding the number of lost characters
+      because the UTF-8/UTF-16 [18][19] encodings used yield a
+      variable number of bytes per character.
+
+      For finding the exact number of lost characters, an additional
+      field reflecting the character count (and possibly the character
+      offset) upon fragmentation would be required. This would
+      additionally require that the entity performing fragmentation
+      count the characters included in each text fragment.
+
+      One benefit of having a character count would be that the
+      display application would be able to replace missing characters
+      with some other character representing character loss. For
+      example:
+
+         If we take the text "Some text is lost now" and assume the
+         loss of a packet containing the text in the middle, this
+         could be displayed (with a character count):
+
+            "Some ############now"
+
+         As opposed to:
+
+            "Some #now"
+
+         which is what this payload format enables ("#" indicates a
+         missing character or packet, respectively).
+
+      However, it is the consensus of the working group that for
+      applications such as subtitling applications and multimedia
+      presentations that use this payload format, such partial error
+      correction is not worth the cost of including two additional
+      fields; namely, character count and character offset. Instead,
+      it is recommended that some more overhead be invested to provide
+      full error correction by protecting the text sample fragments
+      using the measures outlined in Section 5.
+
+   3. Fragment re-assembly: In order to re-assemble the text samples,
+      offset information is needed. Instead of a character or byte
+      offset, a single byte, TOTAL/THIS, is used. These two values
+      indicate the total number and current index of fragments of a
+      text sample. This is simpler than having a character offset
+      field in each fragment. Details are in Section 4.1.3.
+
+   4. A length field, LEN, is present in the common header fields.
+      While the length in the RTP payload format is not needed by most
+      RTP applications (typically lower layers, like UDP, provide this
+      information), it does ease interoperability with RFC 3640. This
+      is because the Access Units (AUs) used for carriage of data in
+      RFC 3640 must include a length indication. Details are in
+      Section 4.8.
+
+   5.
The header fields in the specific payload headers (TYPE headers + in Sections 4.1.2 to 4.1.6) have been arranged for easy + processing on 32-bit machines. For this reason, the fields SIDX + and SDUR are swapped in TYPE 1 unit, compared to the other + units. + + + + + + + + + + + + + + + +Rey & Matsui Standards Track [Page 9] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + +3. Terminology + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in RFC 2119 [5]. + + Furthermore, the following terms are used and have specific meaning + within the context of this document: + + text sample or whole text sample + + In the 3GPP Timed Text media format [1], these terms refer to a + unit of timed text data as contained in the source (3GP) file. + This includes the text string byte count, possibly a Byte Order + Mark, the text string and any modifiers that may follow. Its + equivalent in audio/video would be a frame. + + In this document, however, a text sample contains only text + strings followed by zero or more modifiers. This definition of + text sample excludes the 16-bit text string byte count and the + 16-bit Byte Order Mark (BOM) present in 3GP file text samples + (see Section 4.3 and Figure 9). The 16-bit BOM is not + transported in RTP, as explained in Section 4.1.1. + + text strings + + The actual text characters encoded either as UTF-8 or UTF-16. + When using this payload format, the text string does not contain + any byte order mark (BOM). See Figure 9 for details. + + fragment or text sample fragment + + A fraction of a text sample. A fragment may contain either text + strings or modifier (decoration) contents, but not both at the + same time. + + sample contents + + General term to identify timed text data transported when using + this payload format. 
Sample contents may be one or several text + samples, sample descriptions, and sample fragments (note that, + as per Section 4.6, there is only one case in which more than + one fragment may be included in a payload). + + + + + + + + +Rey & Matsui Standards Track [Page 10] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + decoration or modifiers + + These terms are used interchangeably throughout the document to + denote the contents of the text sample that modify the default + text formatting. Modifiers may, for example, specify different + font size for a particular sequence of characters or define + karaoke timing for the sample. + + sample description + + Information that is potentially shared by more than one text + sample. In a 3GP file, a sample description is stored in a + place where it can be shared. It contains setup and default + information such as scrolling direction, text box position, + delay value, default font, background color, etc. + + units or transport units + + The payload headers specified in this document encapsulate text + samples, fragments thereof, and sample descriptions by placing a + common header and specific payload header (Sections 4.1.1 to + 4.1.6) before them, thus building what is here called a + (transport) unit. + + aggregation or aggregate packet + + The payload of an aggregate (RTP) packet consists of several + (transport) units. + + track or stream + + 3GP files contain audio/video and text tracks. This document + enables streaming of text tracks using RTP. Therefore, these + terms are used interchangeably in this document in the context + of 3GP files. + + Media Header Box / Track Header Box / ... + + The 3GP file format makes use of these structures defined in the + ISO Base File Format [2]. When referring to these in this + document, initials are capitalized for clarity. + + + + + + + + + + +Rey & Matsui Standards Track [Page 11] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + +4. 
RTP Payload Format for 3GPP Timed Text
+
+   The format of an RTP packet containing 3GPP timed text is shown
+   below:
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |V=2|P|X|  CC   |M|     PT      |       sequence number         |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                           timestamp                           |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |           synchronization source (SSRC) identifier            |
+  /+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | |U|   R   | TYPE|             LEN               |               :
+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               :
+U| :           (variable header fields depending on TYPE           :
+N| :                                                               :
+I< +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+T| |                                                               |
+ | :                    SAMPLE CONTENTS                            :
+ | |                                               +-+-+-+-+-+-+-+-+
+ | |                                               |
+  \+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+              Figure 1. 3GPP Timed Text RTP Packet Format
+
+   Marker bit (M): The marker bit SHALL be set to 1 if the RTP packet
+   includes one or more whole text samples or the last fragment of a
+   text sample; otherwise, it is set to zero (0).
+
+   Timestamp: The timestamp MUST indicate the sampling instant of the
+   earliest (or only) unit contained in the RTP packet. The initial
+   value SHOULD be randomly determined, as specified in RTP [3].
+
+      The timestamp value should provide enough timing resolution for
+      expressing the duration of text samples, for synchronizing text
+      with other media, and for performing RTP Control Protocol (RTCP)
+      measurements such as the interarrival delay jitter or the RTCP
+      Packet Receipt Times Report Block (Section 4.3 of RFC 3611
+      [20]). This complies with RTP [3], Section 5.1:
+
+         "The resolution of the clock MUST be sufficient for the
+         desired synchronization accuracy and for measuring packet
+         arrival jitter (one tick per video frame is typically not
+         sufficient)".
+ + + + + +Rey & Matsui Standards Track [Page 12] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + The above observation applies to both timed text tracks included + in a 3GP file and live streaming sessions. In the case of a 3GP + timed text track, the timestamp clockrate is the value of the + "timescale" parameter in the Media Header Box for that text + track. Each track in a 3GP file MAY have its own clockrate as + specified in the Media Header Box. Likewise, live streaming + applications SHALL use an appropriate timestamp clockrate. A + default value of 1000 Hz is RECOMMENDED. Other timestamp + clockrates MAY be used. In this case, the typical behavior here + is to match the 3GPP timed text clockrate to that used by an + associated audio or video stream. + + In an aggregate payload, units MUST be placed in play-out order, + i.e., earliest first in the payload. If TYPE 1 units are + aggregated, the timestamp of the subsequent units MUST be + obtained by adding the timed text sample duration of previous + samples to the RTP timestamp value. There are two exceptions to + this rule: TYPE 5 units and an aggregate payload containing two + fragments of the same text sample. The details of the timestamp + calculation are given in Section 4.6. + + Finally, timestamp clockrates MUST be signaled by out-of-band + means at session setup, e.g., using the media type "rate" + parameter in SDP. See Section 9 for details. + + Payload Type (PT): The payload type is set dynamically and sent by + out-of-band means. + + The usage of the remaining RTP header fields (namely, V, P, X, CC, SN + and SSRC) follows the rules of RTP and the profile in use. + +4.1. Payload Header Definitions + + The (transport) units specified in this document consist of a set of + common fields (U, R, TYPE, LEN), followed by specific header fields + (TYPES 1-5) and text sample contents. See Figure 1 and Figure 2. + + In Figure 2, two example RTP packets are depicted. 
The first
+   contains an aggregate RTP payload with two complete text samples, and
+   the second contains one text sample fragment. After each unit header
+   is explained, detailed payload examples follow in Section 4.7.
+
+                    +----------------------+
+                    |                      |
+                    |      RTP Header      |
+                    |                      |
+           ---------+----------------------+
+           |        |                      |
+           |        |COMMON + TYPE 1 Header|
+           |        ........................
+    UNIT 1 -        |                      |
+           |        |     Text Sample      |
+           |        |                      |
+           |-------\........................
+           -------/ |                      |
+           |        |COMMON + TYPE 1 Header|
+           |        ........................
+    UNIT 2 -        |                      |
+           |        |     Text Sample      |
+           |        |                      |
+           |        |                      |
+           ---------+----------------------+
+
+                    +----------------------+
+                    |                      |
+                    |      RTP Header      |
+                    |                      |
+           ---------+----------------------+
+           |        |   COMMON + TYPE 2    |
+           |        |   (or 3 or 4) Hdr    |
+           |        ........................
+    UNIT 3 -        |                      |
+           |        | Text Sample Fragment |
+           |        |                      |
+           |        |                      |
+           ---------+----------------------+
+
+                    Figure 2. Example RTP packets
+
+4.1.1.  Common Payload Header Fields
+
+   The fields common to all payload headers have the following format:
+
+    0                   1                   2
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |U|   R   |TYPE |             LEN               |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+              Figure 3. Common payload header fields
+
+   Where:
+
+   o U (1 bit) "UTF Transformation flag": This is used to inform RTP
+     receivers whether UTF-8 (U=0) or UTF-16 (U=1) was used to encode
+     the text string. UTF-16 text strings transported by this payload
+     format MUST be serialized in big endian order, a.k.a. network byte
+     order.
+
+      Informative note: Timed text clients complying with the 3GPP
+      Timed Text format [1] are only required to understand the big
+      endian serialization.
Thus, in order to ease interoperability, + the reverse serialization (little endian) is not supported by + this payload format. + + For the payload formats defined in this document, the U bit is only + used in TYPE 1 and TYPE 2 headers. Senders MUST set the U bit to + zero in TYPE 3, TYPE 4, and TYPE 5 headers. Consequently, + receivers MUST ignore the U bit in TYPE 3, TYPE 4, and TYPE 5 + headers. + + o R (4 bits) "Reserved bits": for future extensions. This field MUST + be set to zero (0x0) and MUST be ignored by receivers. + + o TYPE (3 bits) "Type Field": This field specifies which specific + header fields follow. The following TYPE values are defined: + + - TYPE 1, for a whole text sample. + - TYPE 2, for a text string fragment (without modifiers). + - TYPE 3, for a whole modifier box or the first fragment of a + modifier box. + - TYPE 4, for a modifier fragment other than first. + - TYPE 5, for a sample description. Exactly one header per + sample description. + - TYPE 0, 6, and 7 are reserved for future extensions. Note + that future extensions are possible, e.g., a unit that + explicitly signals the number of characters present in a + + + +Rey & Matsui Standards Track [Page 15] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + fragment (see Section 2.5). In order to guarantee backwards- + compatibility, it SHALL be possible that older clients ignore + (newer) units they do not understand, without invalidating the + timestamp calculation mechanisms or otherwise preventing them + from decoding the other units. + + o Finally, the LEN (16 bits) "Length Field": indicates the size (in + bytes) of this header field and all the fields following, i.e., the + LEN field followed by the unit payload: text strings and modifiers + (if any). This definition only excludes the initial U/R/TYPE byte + of the common header. The LEN field follows network byte order. 
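Editorial sketch (not part of RFC 4396): the common payload header fields described above can be decoded as shown below. This is a minimal Python illustration under stated assumptions: the function name `parse_common_header` and the dictionary keys are hypothetical, and the bit positions are read off Figure 3 (U is the most significant bit of the first byte, followed by four R bits and three TYPE bits; LEN is a 16-bit integer in network byte order).

```python
import struct


def parse_common_header(unit: bytes) -> dict:
    """Decode the 3-byte common payload header (U, R, TYPE, LEN)."""
    if len(unit) < 3:
        raise ValueError("unit is shorter than the 3-byte common header")
    # "!BH": one unsigned byte followed by a big-endian 16-bit integer.
    first, length = struct.unpack("!BH", unit[:3])
    return {
        "utf16": bool(first >> 7),        # U: 0 = UTF-8, 1 = UTF-16 (big endian)
        "reserved": (first >> 3) & 0x0F,  # R: receivers ignore this field
        "type": first & 0x07,             # TYPE: 1-5 defined; 0, 6, 7 reserved
        "len": length,                    # LEN: all bytes after the first byte
    }


# Hypothetical example bytes: U=1, R=0, TYPE=1, LEN=8.
hdr = parse_common_header(bytes([0x81, 0x00, 0x08]))
```

Because LEN excludes only the initial U/R/TYPE byte, a unit occupies LEN + 1 bytes in total, which is how a receiver can step from one unit to the next in an aggregate payload.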
+ + The way in which LEN is obtained when streaming out of a 3GP file + depends on the particular unit type. This is explained for each + unit in the sections below. + + For live streaming, both sample length and the LEN value for the + current fragment MUST be calculated during the sampling process or + during fragmentation. + + In general, LEN may take the following values: + + - TYPE = 1, LEN >= 8 + - TYPE = 2, LEN > 9 + - TYPE = 3, LEN > 6 + - TYPE = 4, LEN > 6 + - TYPE = 5, LEN > 3 + + Receivers MUST discard units that do not comply with these values. + However, the RTP header fields and the rest of the units in the + payload (if any) are still useful, as guaranteed by the requirement + for future extensions above. + + In the following subsections the different payload headers for the + values of TYPE are specified. + + + + + + + + + + + + + + + + +Rey & Matsui Standards Track [Page 16] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + +4.1.2. TYPE 1 Header + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |U| R |TYPE | LEN (always >=8) | SIDX | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SDUR | TLEN | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | TLEN | + +-+-+-+-+-+-+-+-+ + + Figure 4. TYPE 1 Header Format + + This header type is used to transport whole text samples. This unit + should be the most common case, i.e., the text sample should usually + be small enough to be transported in one unit without having to + separate text strings from modifiers. In an aggregate (RTP packet) + payload containing several text samples, every sample is preceded by + its own TYPE 1 header (see Figure 12). + + Informative note: As indicated in Section 3, "Terminology", a + text sample is composed of the text strings followed by the + modifiers (if any). This is also how text samples are stored in + 3GP files. 
The separation of a text sample into text strings + and modifiers is only needed for large samples (or small + available IP MTU sizes; see Section 4.4), and it is accomplished + with TYPE 2 and TYPE 3 headers, as explained in the sections + below. + + Note also that empty text samples are considered whole text samples, + although they do not contain sample contents. Empty text samples may + be used to clear the display or to put an end to samples of unknown + duration, for example. Units without sample contents SHALL have a + LEN field value of 8 (0x0008). + + The fields above have the following meaning: + + o U, R, and TYPE, as defined in Section 4.1.1. + + o LEN, in this case, represents the length of the (complete) text + sample plus eight (8) bytes of headers. For finding the length of + the text sample in the Sample Size Box of 3GP files, see Section + 4.3. + + o SIDX (8 bits) "Text Sample Entry Index": This is an index used to + identify the sample descriptions. + + + + +Rey & Matsui Standards Track [Page 17] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + The SIDX field is used to find the sample description corresponding + to the unit's payload. There are two types of SIDX values: static + and dynamic. + + Static SIDX values are used to identify sample descriptions that + MUST be sent out-of-band and MUST remain active during the whole + session. A static SIDX value is unequivocally linked to one + particular sample description during the whole session. Carrying + many sample descriptions out-of-band SHOULD be avoided, since these + may become large and, ultimately, transport is not the goal of the + out-of-band channel. Thus, this feature is RECOMMENDED for + transporting those sample descriptions that provide a set of + minimum default format settings. Static SIDX values MUST fall in + the (closed) interval [129,254]. + + Dynamic SIDX values are used for sample descriptions sent in-band. 
+ Sample descriptions MAY be sent in-band for several reasons: + because they are generated in real time, for transport resiliency, + or both. A dynamic SIDX value is unequivocally linked to one + particular sample description during the period in which that description is + active in the session, and it SHALL NOT be modified during that + period. This period MAY be shorter than or equal to the session + duration. This period is not known a priori. A maximum of 64 + simultaneously active dynamic SIDX values is allowed at any moment. + Dynamic SIDX values MUST fall in the closed interval [0,127]. Sixty-four + active values should be enough for both recorded content and live streaming + applications. Nevertheless, a wraparound mechanism is provided in + Section 4.2.1 to handle streaming sessions where more than 64 SIDX + values might be needed. Servers MAY make use of dynamic sample + descriptions. Clients MUST be able to receive and interpret + dynamic sample descriptions. + + Finally, SIDX values 128 and 255 are reserved for future use. + + o SDUR (24 bits) "Text Sample Duration": indicates the duration of + the text sample in RTP timestamp units. For this + field, a length of 3 bytes is preferred to 2 bytes. This is + because, for a typical clockrate of 1000 Hz, 16 bits would allow + for a maximum duration of just 65 seconds, which might be too short + for some streams. On the other hand, 24 bits at 1000 Hz allow for + a maximum duration of about 4.6 hours, while at 90 kHz this value + is about 3 minutes. These values should be enough for streaming + applications. However, if a larger duration is needed, the + extension mechanism specified in Section 4.3 SHALL be used. + + Apart from defining the time period during which the text is + displayed, the duration field is also used to find the timestamp of + subsequent units within the aggregate RTP packet payload (if any). 
+ This is explained in Section 4.6. + + Text samples generally have a known duration at the time of + transmission. However, in some cases such as live streaming, the + time for which a piece of text shall be presented might not be known a + priori. Thus, the value zero, SDUR=0 (0x000000), is reserved to + signal unknown duration. The amount of time that a sample of + unknown duration is presented is determined by the timestamp of the + next sample that shall be displayed at the receiver: Text samples + of unknown duration SHALL be displayed until the next text sample + becomes active, as indicated by its timestamp. + + The next example illustrates how units of unknown duration MUST be + presented. If no subsequent text sample is available, what is + displayed is an implementation issue. For example, a + server could send an empty sample to clear the text box. + + Example: Imagine you are in an airport watching the latest news + report while you wait for your plane. Airports are loud, so the + news report is transcribed in the lower area of the screen. + This area displays two lines of text: the headlines and the + words spoken by the news speaker. As usual, the headlines are + shown for a longer time than the rest. This time is, in + principle, unknown to the stream server, which is streaming + live. A headline is replaced only when the next headline is + received. + + However, upon storing a text sample with SDUR=0 in a 3GP file, the + SDUR value MUST be changed to the effective duration of the text + sample, which MUST always be greater than zero (note that the ISO + file format [2] explicitly forbids a sample duration of zero). The + effective duration MUST be calculated as the timestamp difference + between the current sample (with unknown duration) and the next + text sample that is displayed. 
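The effective duration of a sample received with SDUR=0 can be computed as sketched below. This is an illustration only: the function name effective_duration is hypothetical, and taking the difference modulo 2^32 is an assumption based on the RTP timestamp being a 32-bit unsigned field, so that a timestamp wraparound between the two samples is still handled.

```python
RTP_TS_MODULUS = 1 << 32  # RTP timestamps are 32-bit unsigned values


def effective_duration(ts_current: int, ts_next: int) -> int:
    """Duration, in RTP timestamp units, of a sample sent with SDUR=0.

    ts_current is the timestamp of the sample of unknown duration and
    ts_next is the timestamp of the next text sample that is displayed.
    """
    duration = (ts_next - ts_current) % RTP_TS_MODULUS
    if duration == 0:
        # A stored sample duration of zero is forbidden.
        raise ValueError("effective duration must be greater than zero")
    return duration
```

A value obtained this way can exceed 2^24-1; the 24-bit limit applies to the SDUR field on the wire, not to the duration stored in the file.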
+ + Note that samples of unknown duration SHALL NOT use features that + require knowledge of the duration of the sample up front. In [1], such + features are scrolling and karaoke. This also applies to + future extensions of the Timed Text format. Furthermore, only + sample descriptions (TYPE 5 units) MAY follow units of unknown + duration in the same aggregate payload. Otherwise, it would not be + possible to calculate the timestamp of these other units. + + For text contents stored in 3GP files, see Section 4.3 for details + on how to extract the duration value. For live streaming, live + encoders SHALL assign appropriate values and units according to [1] + and later releases. + + o TLEN (16 bits), "Text String Length", is a byte count of the text + string. The decoder needs the text string length in order to know + where the modifiers in the payload start. TLEN is not present in + text string fragments (TYPE 2) since it can be derived + from the LEN values of each fragment. + + The TLEN value is obtained from the text samples as contained in + 3GP files. Refer to Section 4.3. For live content, the TLEN MUST + be obtained during the sampling process. + + o Finally, the actual text sample is placed after the TLEN field. As + defined in Section 3, a text sample consists of a string of + characters encoded using either UTF-8 or UTF-16, followed by zero + or more modifiers. Note also that no BOM and no byte count are + included in the strings carried in the payload (as opposed to text + samples stored in 3GP files [1]). + +4.1.3. 
TYPE 2 Header + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |U| R |TYPE | LEN( always >9) | TOTAL | THIS | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SDUR | SIDX | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SLEN | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 5. TYPE 2 Header Format + + This header type is used to transport either a whole text string or a + fragment of it. TYPE 2 units SHALL NOT contain modifiers. In + detail: + + o U, R, and TYPE, as defined in Section 4.1.1. + + o SIDX and SDUR, as defined in Section 4.1.2. + + Note that the U, SIDX, and SDUR fields are meaningful since + partial text strings can also be displayed. + + o The LEN field (16 bits) indicates the length of the text string + fragment plus nine (9) bytes of headers. Its value is calculated + upon fragmentation. LEN MUST always be greater than nine (0x0009). + Otherwise, the unit MUST be discarded. + + + + + +Rey & Matsui Standards Track [Page 20] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + According to the guidelines in Section 4.4, text strings MUST be + split at character boundaries for allowing the display of text + fragments. Therefore, a text fragment MUST contain at least one + character in either UTF-8 or UTF-16. Actually, this is just a + formalism since by observing the guidelines, much larger fragments + should be created. + + Note also that TYPE 2 units do not contain an explicit text string + length, TLEN (see TYPE 1). This is because TYPE 2 units do not + contain any modifiers after the text string. If needed, the length + of the received string can be obtained using the LEN values of the + TYPE 2 units. + + o The SLEN field (16 bits) indicates the size (in bytes) of the + original (whole) text sample to which this fragment belongs. 
This + length comprises the text string plus any modifier boxes present + (and includes neither the byte order mark nor the text string + length as mentioned in Section 3, "Terminology"). + + Regarding the text sample length: Timed text samples are not + generated at regular intervals, nor is there a default sample size. + If 3GP files are streamed, the length of the text samples is + calculated beforehand and included in the track itself, while for + live encoding it is the real time encoder that SHALL choose an + appropriate size for each text sample. In this case, the amount of + text 'captured' in a sample depends on the text source and the + particular application (see examples below). Samples may, e.g., be + tailored to match the packet MTU as closely as possible or to + provide a given redundancy for the available bit rate. The + encoding application MUST also take into account the delay + constraints of the real-time session and assess whether FEC, + retransmission, or other similar techniques are reasonable options + for stream repair. + + The following examples shall illustrate how a real-time encoder may + choose its settings to adapt to the scenario constraints. + + Example: Imagine a newscast scenario, where the spoken news is + transcribed and synchronized with the image and voice of the + reporter. We assume that the news speaker talks at an average + speed of 5 words per second with an average word length of 5 + characters plus one space per word, i.e., 30 characters per + second. We assume an available IP MTU of 576 bytes and an + available bitrate of 576*8 bits per second = 4.6 Kbps. We + assume each character can be encoded using 2 bytes in UTF-16. + In this scenario, several constraints may apply; for example: + available IP MTU, available bandwidth, allowable delay, and + required redundancy. 
If the target were to minimize the + packet overhead, a text sample covering 8 seconds of text + would be closest to the IP MTU: + + IP/UDP/RTP/TYPE1 Header + (8-second text sample) + = 20 + 8 + 12 + 8 + (~6 chars/word * 5 words/s * 8 s * 2 bytes/char) + = 528 bytes < 576 bytes + + For other scenarios, like lossy networks, it may happen that just + one packet per sample provides too little redundancy. In this case, a + choice could be that the encoder 'collects' text every second, thus + yielding text samples (TYPE 1 units) of 68 bytes, TYPE 1 header + included. We can, e.g., include three contiguous text samples in + one RTP payload: the current and last two text samples (see below). + This amounts to a total IP packet size of 20 + 8 + 12 + 3*(8 + 60) + = 244 bytes. Now, with the same available bitrate of 4.6 Kbps, + these 244-byte packets can be sent redundantly up to two times per + second: + + RTP payload (1,2,3)(1,2,3) (2,3,4)(2,3,4) (3,4,5)(3,4,5) ... + Time: <----1s------> <----1s------> <-----1s-----> ... + + This means that each text sample is sent at least six times, + which should provide enough redundancy. Although not as + bandwidth efficient (488*8 < 528*8 < 576*8 bps) as the + previous packetization, this option increases the stream + redundancy while still meeting the delay and bandwidth + constraints. + + Another example would be a user sending timed text from a + type-in area in the display. In this case, the text sample is + created as soon as the user clicks the 'send' button. + Depending on the packet length, fragmentation may be needed. + + In a video conferencing application, text is synchronized with + audio and video. Thus, the text samples shall be displayed + long enough to be read by a human, shall fit in the video + screen, and shall 'capture' the audio contents rendered during + the time the corresponding video and audio are rendered. 
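The packet sizes quoted in the newscast example can be rechecked with a few lines of arithmetic. The sketch below only restates the figures assumed in that example (20-byte IP, 8-byte UDP, and 12-byte RTP headers, an 8-byte TYPE 1 header, 30 characters per second, 2 bytes per character); the function and constant names are hypothetical.

```python
IP_HDR, UDP_HDR, RTP_HDR, TYPE1_HDR = 20, 8, 12, 8  # header sizes in bytes
CHARS_PER_SECOND = 30  # 5 words/s * (5 characters + 1 space)
BYTES_PER_CHAR = 2     # UTF-16 assumption taken from the example
IP_MTU = 576


def packet_bytes(seconds_per_sample: int, samples_per_packet: int = 1) -> int:
    """IP packet size when every sample is carried as one TYPE 1 unit."""
    text = CHARS_PER_SECOND * seconds_per_sample * BYTES_PER_CHAR
    unit = TYPE1_HDR + text
    return IP_HDR + UDP_HDR + RTP_HDR + samples_per_packet * unit


single = packet_bytes(8)        # one 8-second sample per packet: 528
redundant = packet_bytes(1, 3)  # three 1-second samples per packet: 244
bits_per_second = 2 * redundant * 8  # each packet sent twice per second: 3904

assert single < IP_MTU
assert bits_per_second < IP_MTU * 8  # within the 4.6 Kbps budget
```

The second packetization uses 3904 bits per second, which stays within the assumed budget of 576*8 = 4608 bits per second.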
+ + For stored content, see Section 4.3 for details on how to find the + SLEN value in a 3GP file. For live content, the SLEN MUST be + obtained during the sampling process. + + Finally, note that clients MAY use SLEN to reserve buffer space for the + remaining fragments of a text sample. + + o The fields TOTAL (4 bits) and THIS (4 bits) indicate the total + number of fragments into which the original text sample (i.e., the + text string and its modifiers) has been split and the position + of the current fragment in that sequence, respectively. Note + that the sequence number alone cannot replace the functionality of + the THIS field, since packets (and fragments) may be repeated, + e.g., as in repeated transmission (see Section 5). Thus, an + indication of the "fragment offset" is needed. + + The usual "byte offset" field is not used here for two reasons: a) + it would take one more byte and b) it does not provide any + information on the character offset. UTF-8/UTF-16 text strings + have, in general, a variable character length ranging from 1 to 6 + bytes. Therefore, the TOTAL/THIS solution is preferred. It could + also be argued that the LEN and SLEN fields be used for this + purpose, but while they would provide information about the + completeness of the text sample, they do not specify the order of + the fragments. + + In all cases (TYPEs 2, 3 and 4), if the value of THIS is greater + than TOTAL or if TOTAL equals zero (0x0), the fragment SHALL be + discarded. + + o Finally, the sample contents following the SLEN field consist of a + fragment of the UTF-8/UTF-16 character string; no modifiers follow. + +4.1.4. 
TYPE 3 Header + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |U| R |TYPE | LEN (always >6) |TOTAL | THIS | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SDUR | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 6. TYPE 3 Header Format + + This header type is used to transport either the entire modifier + contents present in a text sample or just the first fragment of them. + This depends on whether the modifier boxes fit in the current RTP + payload. + + If a text sample containing modifiers is fragmented, this header MUST + be used to transport the first fragment or, if possible, the complete + modifiers. + + In detail: + + o The U, R, and TYPE fields are defined as in Section 4.1.1. + + o LEN indicates the length of the modifier contents plus six (6) + bytes of headers, following the definition in Section 4.1.1. Its value is + obtained upon fragmentation. Additionally, the LEN field MUST be + greater than six (0x0006). Otherwise, the unit MUST be discarded. + + o The TOTAL/THIS field has the same meaning as for TYPE 2. + + For TYPE 3 units containing the last (trailing) modifier fragment, + the value of TOTAL MUST be equal to that of THIS (TOTAL=THIS). In + addition, TOTAL=THIS MUST be greater than one, because the total + number of fragments of a text sample is logically always larger + than one. + + Otherwise, if TOTAL is different from THIS in a TYPE 3 unit, this + means that the unit contains the first fragment of the modifiers. + + o The SDUR has the same definition as for TYPE 1. Since the fragments + are always transported in their own RTP packets, this field is only + needed to know how long this fragment is valid. This may, e.g., be + used to determine how long it should be kept in the display buffer. + + Note that the SLEN and SIDX fields are not present in TYPE 3 unit + headers. 
This is because a) these fragments do not contain text + strings and b) these types of fragments are applied over text string + fragments, which already contain this information. + +4.1.5. TYPE 4 Header + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |U| R |TYPE | LEN (always >6) |TOTAL | THIS | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SDUR | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 7. TYPE 4 Header Format + + This header type is placed before modifier fragments other than the + first one. + + The U, R, and TYPE fields are used as per Section 4.1.1. + + As for TYPE 3, LEN indicates the length of the modifier contents + plus six (6) bytes of headers and + SHALL also be obtained upon fragmentation. The LEN field MUST be + greater than six (0x0006). Otherwise, the unit MUST be discarded. + + TOTAL/THIS is used as in TYPE 2. + + The SDUR field is defined as in TYPE 1. The reasoning behind the + absence of SLEN and SIDX is the same as in TYPE 3 units. + +4.1.6. TYPE 5 Header + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |U| R |TYPE | LEN (always >3) | SIDX | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 8. TYPE 5 Header Format + + This header type is used to transport (dynamic) sample descriptions. + Every sample description MUST have its own TYPE 5 header. + + The U, R, and TYPE fields are used as per Section 4.1.1. + + The LEN field indicates the length of the sample description, plus + three bytes accounting for the SIDX field and the LEN field itself. Thus, this + field MUST be greater than three (0x0003). Otherwise, the unit MUST + be discarded. 
+ + If the sample is streamed from a 3GP file, the length of the sample + description contents (i.e., what comes after SIDX in the unit itself) + is obtained from the file (see Section 4.3). + + The SIDX field contains a dynamic SIDX value assigned to the sample + description carried as sample content of this unit. As only dynamic + sample descriptions are carried using TYPE 5, the possible SIDX + values are in the (closed) interval [0,127]. + + Senders MAY make use of TYPE 5 units. All receivers MUST implement + support for TYPE 5 units, since it adds minimum complexity and may + increase the robustness of the streaming session. + + The next section specifies how SIDX values are calculated. + +4.2. Buffering of Sample Descriptions + + The buffering of sample descriptions is a matter of the client's + timed text codec implementation. In order to work properly, this + payload format requires that: + + o Static sample descriptions MUST be buffered at the client, at + least, for the duration of the session. + + + + + +Rey & Matsui Standards Track [Page 25] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + o If dynamic sample descriptions are used, their buffering and + update of the SIDX values MUST follow the mechanism described in + the next section. + +4.2.1. Dynamic SIDX Wraparound Mechanism + + The use of dynamic sample descriptions by senders is OPTIONAL. + However, if they are used, senders MUST implement this mechanism. + Receivers MUST always implement it. + + Dynamic SIDX values remain active either during the entire duration + of the session (if used just once) or in different intervals of it + (if used once or more). + + Note: In the following, SIDX means dynamic SIDX. + + For choosing the wraparound mechanism, the following rationale was + used: There are 128 dynamic SIDX values possible, [0..127]. 
If one + chooses to allow a maximum of 127 to be used as dynamic SIDXs, then + any reordered packet with a new sample description would make the + mechanism fail. For example, if the last packet received is SIDX=5, + then all 127 values except SIDX=6 would be "active". Now, if a + reordered packet arrives with a new description, SIDX=9, it will be + mistakenly discarded, because the SIDX=9 is, at that moment, marked + as "active" and active sample descriptions shall not be re-written. + Therefore, a "guard interval" is introduced. This guard interval + reduces the number of active SIDXs at any point in time to 64. + Although most timed text applications will probably need less than 64 + sample descriptions during a session (in total), a wraparound + mechanism to handle the need for more is described here. + + Thereby, a sliding window of 64 active SIDX values is used. Values + within the window are "active"; all others are marked "inactive". An + SIDX value becomes active if at least one sample description + identified by that SIDX has been received. Since sample descriptions + MAY be sent redundantly, it is possible that a client receives a + given SIDX several times. However, active sample descriptions SHALL + NOT be overwritten: The receiver SHALL ignore redundant sample + descriptions and it MUST use the already cached copy. The "guard + interval" of (64) inactive values ensures that the correct + association SIDX <-> sample description is always used. + + Informative note: As for the "guard interval" value itself, 64 + as 128/2 was considered simple enough while still meeting the + expected maximum number of sample descriptions. Besides that, + there's no other motivation for choosing 64 or a different + value. 
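The window bookkeeping specified in this section, with the active interval [X+65 modulo(128), X] and the guard interval [X+1 modulo(128), X+64 modulo(128)], can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the class DynamicSidxBuffer and its method names are hypothetical, and sample descriptions are treated as opaque byte strings.

```python
SIDX_RANGE = 128  # dynamic SIDX values lie in [0,127]
GUARD = 64        # size of the inactive "guard interval"


class DynamicSidxBuffer:
    def __init__(self):
        self.x = None           # last SIDX that moved the window
        self.descriptions = {}  # SIDX -> cached sample description

    def is_active(self, sidx: int) -> bool:
        """True if sidx lies in [X+65 mod 128, X]."""
        if self.x is None:
            return False        # everything starts out inactive
        distance = (sidx - self.x) % SIDX_RANGE
        return distance == 0 or distance > GUARD

    def receive(self, z: int, description: bytes) -> bool:
        """Process a sample description; return True if it was stored."""
        if not 0 <= z < SIDX_RANGE:
            raise ValueError("dynamic SIDX must be in [0,127]")
        if self.x is None or not self.is_active(z):
            # Z is in the inactive set: move the window so that X=Z and
            # drop every description that falls into the guard interval.
            self.x = z
            self.descriptions[z] = description
            for sidx in list(self.descriptions):
                if not self.is_active(sidx):
                    del self.descriptions[sidx]
            return True
        if z not in self.descriptions:
            self.descriptions[z] = description
            return True
        return False            # redundant copy: keep the cached one
```

For instance, after receive(4, ...) the values 5 to 68 are inactive; a later receive(6, ...) moves the window so that 0 to 6 and 71 to 127 are active, while a repeated receive(4, ...) leaves the cached description untouched.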
+ + + + +Rey & Matsui Standards Track [Page 26] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + The following algorithm is used to buffer dynamic sample descriptions + and to maintain the dynamic SIDX values: + + Let X be the last SIDX received that updated the range of active + sample descriptions. Let Y be a value within the allowed range for + dynamic SIDX: [0,127], and different from X. Let Z be the SIDX of + the last received sample description. Then: + + 1. Initialize all dynamic SIDX values as inactive. For stored + contents, read the sample description index in the Sample to + Chunk box ("stsc") for that sample. For live streaming, the + first value MAY be zero or any other value in the interval + above. Go to step 2. + + 2. First, in-band sample description with SIDX=Z is received and + stored; set X=Z. Go to step 3. + + 3. Any SIDX within the interval [X+1 modulo(128), X+64 modulo(128)] + is marked as inactive, and any corresponding sample description + is deleted. Any SIDX within the interval [X+65 modulo(128), X] + is set active. Go to step 4 (wait state). + + 4. Wait for next sample description. Once the client is + initialized, the interval of active SIDX values MUST change + whenever a sample description with an SIDX value in the inactive + set is received. That is, upon reception of a sample + description with SIDX=Z, do the following: + + a. If Z is in the (closed) interval [X+1 modulo(128), X+64 + modulo(128)] then set X=Z, store the sample description, and + go to step 3. + + b. Else, Z must be in the interval [X+65 modulo(128), X], thus: + + i. If SIDX=Z is not stored, then store the sample + description. Go to beginning of step 4 (wait state). + ii. Else, go to the beginning of step 4 (wait state). + + Informative note: It is allowed that any value of SIDX=X be sent + in the interval [0,127]. 
For example, if [64..127] is the + current active set and SIDX=0 is sent, a new sample description + is defined (0) and an old one deleted (64); thus [65..127] and + [0] are active. Similarly, one could now send SIDX=64, thus + inverting the active and inactive sets. + + Example: + If X=4, any SIDX in the interval [5,68] is inactive. Active + SIDX values are in the complementary interval [69,127] plus + + + +Rey & Matsui Standards Track [Page 27] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + [0,4]. For example, if the client receives a SIDX=6, then the + active interval is now different: [0,6] plus [71,127]. If the + received SIDX is in the current active interval, no change SHALL + be applied. + +4.3. Finding Payload Header Values in 3GP Files + + For the purpose of streaming timed text contents, some values in the + boxes contained in a 3GP file are mapped to fields of this payload + header. This section explains where to find those values. + + Additionally, for the duration and sample description indexes, + extension mechanisms are provided. All senders MUST implement the + extension mechanisms described herein. + + If the file is streamed out of a 3GP file, the following guidelines + SHALL be followed. + + Note: All fields in the objects (boxes) of a 3GP file are found + in network byte order. + + Information obtained from the Sample Table Box (stbl): + + o Sample Descriptions and Sample Description length: The Sample + Description box (stsd, inside the stbl) contains the sample + descriptions. For timed text media, each element of stsd is a + timed text sample entry (type "tx3g"). + + The (unsigned) 32 bits of the "size" field in the stsd box + represent the length (in bytes) of the sample description, as + carried in TYPE 5 units. On the other hand, the LEN field of + TYPE 5 units is restricted to 16 bits. 
Therefore, if the + value of "size" is greater than (2^16-1-3)[bytes], then the + sample description SHALL NOT be streamed with this payload + format. There is no extension mechanism defined in this case, + since fragmentation of sample descriptions is not defined + (sample descriptions are typically up to some 200 bytes in + size). Note: The three (3) accounts for the TYPE 5 header + fields included in the LEN value. + + o SDUR from the Decoding Time to Sample Box (stts). The + (unsigned) 32 bits of the "sample delta" field are used for + calculating SDUR. However, since the SDUR field is only 3 + bytes long, text samples with duration values larger than + (2^24-1)/(timestamp clockrate)[seconds] cannot be streamed + directly. The solution is simple: Copies of the corresponding + text sample SHALL be sent. Thereby, the timestamp and + duration values SHALL be adjusted so that a continuous display + is guaranteed, as if just one sample had been sent. + That is, a sample with timestamp TS and duration SDUR can be + sent as two samples having timestamps TS1 and TS2 and + durations SDUR1 and SDUR2, such that TS1=TS, TS2=TS1+SDUR1, + and SDUR=SDUR1+SDUR2. + + o Text sample length from the Sample Size Box (stsz). The + (unsigned) 32 bits of the "sample size" or "entry size" (one + of them, depending on whether the sample size is fixed or + variable) indicate the length (in bytes) of the 3GP text + sample. For obtaining the length of the (actual) streamed + text sample, the length of the text string byte count (2 + bytes) and, in case of UTF-16 strings, the length of the BOM + (also 2 bytes) SHALL be deducted. This is illustrated in + Figure 9. + + Text Sample according to 3GPP TS 26.245 + + TEXT SAMPLE (length=stsz) + .--------------------------------------------------. + / \ + TEXT STRING (length=TBC) + .------------------------------------. 
+ / \ + TBC BOM MODIFIERS + +---+---+----------------------------------+-----------+ + || + || TBC BOM -> TLEN field + || +---+---+ U bit + || + \/ + + Text Sample according to this Payload Format + + TEXT SAMPLE (length=SLEN w/o TBC,BOM) + .--------------------------------------------. + / \ + TEXT STRING (length=TLEN) + .--------------------------------. + / \ + TEXT STRING MODIFIERS + +----------------------------------+-----------+ + + KEY: + TBC = Text string Byte Count + BOM = Byte Order Mark + + Figure 9. Text sample composition + + + +Rey & Matsui Standards Track [Page 29] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + Moreover, since the LEN field in TYPE 1 unit header is 16 bits + long, larger text sample sizes than (2^16-1-8) [bytes] SHALL + NOT be streamed. Also, in this case, no extension mechanism + is defined. This is because this maximum is considered enough + for the targeted streaming applications. (Note: The eight (8) + accounts for the TYPE 1 header fields included in the LEN + value). + + o SIDX from the Sample to Chunk Box (stsc): The stsc Box is used + to find samples and their corresponding sample descriptions. + These are referenced by the "sample description index", a + 32-bit (unsigned) integer. If possible, these indices may be + directly mapped to the SIDX field. However, there are several + cases where this may not be possible: + + a) The total number of indices used is greater than + the number of indices available, i.e., if the static + sample descriptions are more than 127 or the dynamic ones + are more than 64. + + b) The original SIDX value ranges do not fit in the + allowed ranges for static (129-254) or dynamic (0-127) + values. + + Therefore, when assigning SIDX values to the sample + descriptions, the following guidelines are provided: + + o Static sample descriptions can simply be assigned + consecutive values within the range 129-254 (closed + interval). 
This range should be more than enough for static + sample descriptions. + + o As for dynamic sample descriptions: + + a) Streams that use fewer than 64 dynamic sample + descriptions SHOULD use consecutive values for SIDX + anywhere in the range 0-127 (closed interval). + + b) For streams with more than 64 sample descriptions, + the SIDX values MUST be assigned in usage order, and if + any sample description is to be used again after it has been + set inactive, it will need to be re-sent and assigned a + new SIDX value (according to the algorithm in Section + 4.2.1). + + Information obtained from the Media Data Box: + + o Text strings, TLEN, U bit, and modifiers from the Media Data + Box (mdat). Text strings, 16-bit text string byte count, Byte + Order Mark (BOM, indicating UTF encoding), and modifier boxes + can be found here. + + For TYPE 1 units, the value of TLEN is extracted from the text + string byte count that precedes the text string in the text + sample, as stored in the 3GP file. If UTF-16 encoding is + used, two (2) more bytes have to be deducted from this byte + count beforehand, in order to exclude the BOM. See Figure 9. + +4.4. Fragmentation of Timed Text Samples + + This section explains why text samples may have to be fragmented and + discusses some of the possible approaches to doing it. A solution is + proposed together with rules and recommendations for fragmenting and + transporting text samples. + + 3GPP Timed Text applications are expected to operate at low bitrates. + This fact, added to the small size of timed text samples (typically + one or two hundred bytes), makes fragmentation of text samples a rare + event. Samples should usually fit into the MTU of the + network path in use. 
+ + Nevertheless, some text strings (e.g., ending roll in a movie) and + some modifier boxes (i.e., for hyperlinks, for karaoke, or for + styles) may become large. This may also apply for future modifier + boxes. In such cases, the first option to consider is whether it is + possible to adjust the encoding (e.g., the size of sample) in such a + way that fragmentation is avoided. If it is, this is preferred to + fragmentation and SHOULD be done. + + Otherwise, if this is not possible or other constraints prevent it, + fragmentation MAY be used, and the basic guidelines given in this + document MUST be followed: + + o It is RECOMMENDED that text samples be fragmented as seldom as + possible, i.e., the least possible number of fragments is created + out of a text sample. + + o If there is some bitrate and free space in the payload available, + sample descriptions (if at hand) SHOULD be aggregated. + + o Text strings MUST split at character boundaries; see TYPE 2 header. + Otherwise, it is not possible to display the text contents of a + fragment if a previous fragment was lost. As a consequence, text + + + +Rey & Matsui Standards Track [Page 31] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + string fragmentation requires knowledge of the UTF-8/UTF-16 + encoding formats to determine character boundaries. + + o Unlike text strings, the modifier boxes are NOT REQUIRED to be + split at meaningful boundaries. However, it is RECOMMENDED that + this be done whenever possible. This decreases the effects of + packet loss. This payload format does not ensure that partially + received modifiers are applied to text strings. If only part of + the modifiers is received, it is an application issue how to deal + with these, i.e., whether or not to use them. 
+ + Informative note: Ensuring that partially received modifiers can + be applied to text strings in all cases (for all modifier types + and for all fragment loss constellations) would place additional + requirements on the payload format. In particular, this would + require that: a) senders understand the semantics of the + modifier boxes and b) specific fragment headers for each of the + modifier boxes are defined, in addition to the payload formats + defined below. Understanding the modifiers semantics means + knowing, e.g., where each modifier starts and ends, which text + fragments are affected, which modifiers may or may not be split, + or what the fields indicate. This is necessary to be able to + split the modifiers in such a way that each fragment can be + applied independently of previous packet losses. This would + require a more intelligent fragmentation entity and more complex + headers. Given the low probability of fragmentation and the + desire to keep the requirements low, it does not seem reasonable + to specify such modifier box specific headers. + + o Modifier and text string fragments SHOULD be protected against + packet losses, i.e., using FEC [7], retransmission [11], repetition + (Section 5), or an equivalent technique. This minimizes the + effects of packet loss. + + o An additional requirement when fragmenting text samples is that the + start of the modifiers MUST be indicated using the payload header + defined for that purpose, i.e., a TYPE 3 unit MUST be used (see + Section 4.1.4). This enables a receiver to detect the start of the + modifiers as long as there are not two or more consecutive packet + losses. + + o Finally, sample descriptions SHALL NOT be fragmented because they + contain important information that may affect several text samples. + + + + + + + + +Rey & Matsui Standards Track [Page 32] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + +4.5. 
Reassembling Text Samples at the Receiver + + The payload headers defined in this document allow reassembling + fragmented text samples. For this purpose, the standard RTP + timestamp, the duration field (SDUR), and the fields TOTAL/THIS in + the payload headers are used. + + Units that belong to the same text sample MUST have the same + timestamp. TYPE 5 units do not comply with this rule since they are + not part of any particular text sample. + + The process for collecting the different fragments (units) of a text + sample is as follows: + + 1. Search for units having the same timestamp value, i.e., units + that belong to the same text sample or sample descriptions that + shall become available at that time instant. If several units + of the same sample are repeated, only one of them SHALL be used. + Repeated units are those that have the same timestamp and the + same values for TOTAL/THIS. + + Note that, as mentioned in Section 4.1.1, the receiver + SHALL ignore units with an unrecognized TYPE value. + However, the RTP header fields and the rest of the units + (if any) in the payload are still useful. + + 2. Check within this set whether any of the units from the text + sample is missing. This is done using the TOTAL and THIS + fields; the TOTAL field indicates how many fragments were + created out of the text sample, and the THIS field indicates the + position of this fragment in the text sample. As a result of this + operation, two outcomes are possible: + + a. No fragment is missing. Then, the THIS field SHALL be used + to order the fragments and reassemble the text sample + before forwarding it to the decoding application. Special + care SHALL be taken when reassembling the text string as + indicated in bullet 4 below. + + b. One or more fragments are missing: Check whether each missing + fragment belongs to the text string or to the modifiers. + TYPE 2 units identify text string fragments, and TYPE 3 and + 4 identify modifier fragments: + + i.
If the fragment or fragments missing belong to the text + string and the modifiers were received complete, then + the received text characters may, at least, be + displayed as plain text. Some modifiers may only be + + + +Rey & Matsui Standards Track [Page 33] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + applied as long as it is possible to identify the + character numbers, e.g., if only the last text string + fragment is lost. This is the case for modifiers + defining specific font styles ('styl'), highlighted + characters ('hlit'), karaoke feature ('krok'), and + blinking characters ('blnk'). Other modifiers such as + 'dlay' or 'tbox' can be applied without the knowledge + of the character number. It is an application issue to + decide whether or not to apply the modifiers. + + ii. If the fragment missing belongs to the modifiers and + the text strings were received complete, then the + incomplete modifiers may be used. The text string + SHOULD at least be displayed as plain text. As + mentioned in Section 4.4, modifiers may split without + observing meaningful boundaries. Hence, it may not + always be possible to make use of partially received + modifiers. However, to avoid this, it is RECOMMENDED + that the modifiers do split at meaningful boundaries. + + iii. A third possibility is that it is not possible to + discern whether modifiers or text strings were received + complete. For example, if the TYPE 3 unit of a sample + plus the following or preceding packet is lost, there + is no way for the RTP receiver to know if one or both + packets lost belong to the modifiers or if there are + also some missing text strings. Repetition, FEC, + retransmission, or other protection mechanisms as per + section 4.6 are RECOMMENDED to avoid this situation. + + iv. Finally, if it is sure that neither text strings nor + modifiers were received complete, then the text strings + and the modifiers may be rendered partially or may be + discarded. 
This is an application choice. + + 3. Sample descriptions can be directly associated with the + reassembled text samples, via the sample description index + (SIDX). + + 4. Reassembling of text strings: Since the text strings transported + in RTP packets MUST NOT include any byte order mark (BOM), the + receiver MUST prepend it to the reassembled UTF-16 string before + handing it to the timed text decoder (see Figure 9). The value + of the BOM is 0xFEFF because only big endian serialization of + UTF-16 strings is supported by this payload format. + +4.6. On Aggregate Payloads + + Units SHOULD be aggregated to avoid overhead, whenever possible. The + aggregate payloads MUST comply with one of the following ordered + configurations: + + 1. Zero or more sample descriptions (TYPE 5) followed by zero or more + whole text samples (TYPE 1 units). At least one unit of either + type MUST be present. + + 2. Zero or more sample descriptions followed by zero or one modifier + fragment, either TYPE 3 or TYPE 4. At least one unit MUST be + present. + + 3. Zero or more sample descriptions, followed by zero or one text + string fragment (TYPE 2), followed by zero or one TYPE 3 unit. If + a TYPE 2 unit and a TYPE 3 unit are present, then they MUST belong + to the same text sample. At least one unit MUST be present. + + Some observations: + + o Different aggregates than the ones listed above SHALL NOT be used. + + o Sample descriptions MUST be placed in the aggregate payload before + the occurrence of any non-TYPE 5 units. + + o Correct reception of TYPE 5 units is important since their contents + may be referenced by several other units in the stream. + + Receivers are unable to use text samples until their corresponding + sample descriptions are received.
Accordingly, a sender SHOULD + send multiple copies of a sample description to ensure reliability + (see Section 5). Receivers MAY use payload-specific feedback + messages [21] to tell a sender that they have received a particular + sample description. + + o Regarding timestamp calculation: In general, the rules for + calculating the timestamp of units in an aggregate payload depend + on the type of unit. Based on the possible constellations for + aggregate payloads, as above, we have: + + o Sample descriptions MUST receive the RTP timestamp of the + packet in which they are included. + + Note that for TYPE 5 units, the timestamp actually does not + represent the instant when they are played out, but instead + the instant at which they become available for use. + + + + +Rey & Matsui Standards Track [Page 35] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + o For the first configuration: The first TYPE 1 unit receives + the RTP timestamp. The timestamp of any subsequent TYPE 1 + unit MUST be obtained by adding sample duration and + timestamp, both of the preceding TYPE 1 unit. + + o For the second and third configuration, all units, TYPE 2, + 3, and 4, MUST receive the RTP timestamp. + + Refer to detailed examples on the timestamp calculation + below. + + o As per configuration 3 above, a payload MAY contain several + fragments of one (and only one) text sample. If it does, then + exactly one TYPE 2 unit followed by exactly one TYPE 3 unit is + allowed in the same payload. This is in line with RFC 3640 [12], + Section 2.4, which explicitly disallows combining fragments of + different samples in the same RTP payload. Note that, in this + special case, no timestamp calculation is needed. That is, the RTP + timestamp of both units is equal to the timestamp in the packet's + RTP header. + + o Finally, note that the use of empty text samples allows for + aggregating non-consecutive TYPE 1 units in the same payload. 
Two + text samples, with timestamps TS1 and TS3 and durations SDUR1 and + SDUR3, are not consecutive if it holds TS1+SDUR1 < TS3. A solution + for this is to include an empty TYPE 1 unit with duration SDUR2 + between them, such that TS2+SDUR2 = TS1+SDUR1+SDUR2 = TS3. + + Some examples of aggregate payloads are illustrated in Figure 10. + (Note: The figure is not scaled.) + + + + + + + + + + + + + + + + + + + + + +Rey & Matsui Standards Track [Page 36] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + N/A TS1 TS2 TS3 + +------+-----+------+-----+ + |TYPE5 |TYPE1|TYPE1 |TYPE1| + +------+-----+------+-----+ + N/A sdur1 sdur2 sdur3 + + N/A TS4 + +-----+-------+ + |TYPE5| TYPE 1| a) + +-----+-------+ + N/A sdur4 + + TS4 TS4 TS4 + +--------------+ +--------------+ + | TYPE2 | |TYPE2 |TYPE 3 | b) + +--------------+ +--------------+ + sdur4 sdur4 sdur4 + + TS4 TS4 + +--------------+ +--------------+ + | TYPE2| TYPE 3| | TYPE4 | c) + +--------------+ +--------------+ + sdur4 sdur4 sdur4 + + |----------PAYLOAD 1------| |--PAYLOAD 2---| |--PAYLOAD 3---| + rtpts1 rtpts2 rtpts3 + + KEY: + TSx = Text Sample x + rtptsy = the standard RTP timestamp for PAYLOAD y + sdurx = the duration of Text Sample x + N/A = not applicable + + Figure 10. Example aggregate payloads + + In Figure 10, four text samples (TS1 through TS4) are sent using + three RTP packets. These configurations have been chosen to show how + the 5 TYPE headers are used. Additionally, three different + possibilities for the last text sample, TS4, are depicted: a), b), + and c). + + In Figure 11, option b) from Figure 10 is chosen to illustrate how + the timestamp for each unit is found. 
+ + N/A TS1 TS2 TS3 TS4 TS4 TS4 + +------+-----+------+-----+ +--------------+ +--------------+ + |TYPE5 |TYPE1|TYPE1 |TYPE1| | TYPE2 | |TYPE2 |TYPE 3 | + +------+-----+------+-----+ +--------------+ +--------------+ + N/A sdur1 sdur2 sdur3 sdur4 sdur4 sdur4 + + (#1) (#2) (#3) (#4) (#5) (#6) (#7) + + |----------PAYLOAD 1------| |--PAYLOAD 2---| |--PAYLOAD 3---| + rtpts1 rtpts2 rtpts3 + + Figure 11. Selected payloads from Figure 10 + + Assuming TSx means Text Sample x, rtptsy represents the standard RTP + timestamp for PAYLOAD y and sdurx, the duration of Text Sample x, the + timestamp for unit #z, ts(#z), can be found as the sum of rtptsy and + the cumulative sum of the durations of preceding units in that + payload (except in the case of PAYLOAD 3 as per configuration 3 above). Thus, + we have: + + 1. for the units in the first aggregate payload, PAYLOAD 1: + + ts(#1) = rtpts1 + ts(#2) = rtpts1 + ts(#3) = rtpts1 + sdur1 + ts(#4) = rtpts1 + sdur1 + sdur2 + + Note that the TYPE 5 and the first TYPE 1 unit both have the + RTP timestamp. + + 2. for PAYLOAD 2: + + ts(#5) = rtpts2 + + 3. for PAYLOAD 3: + + ts(#6) = ts(#7) = rtpts2 = rtpts3 + + According to configuration 3 above, the TYPE 2 and the TYPE 3 + units shall belong to the same sample. Hence, rtpts3 must be + equal to rtpts2. For the same reason, the value of SDUR is + not used to calculate the timestamp of the next unit. + +4.7.
Payload Examples + + Some examples of payloads using the defined headers are shown below: + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |V=2|P|X| CC |M| PT | sequence number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | timestamp | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | synchronization source (SSRC) identifier | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |U| R |TYPE1| LEN (always >=8) | SIDX | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SDUR | TLEN | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | TLEN | | + +---------------+ | + | text string (no.bytes=TLEN) | + | | + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | modifiers (no.bytes=LEN - 8 - TLEN) | + | | + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |U| R |TYPE1| LEN (always >=8) | SIDX | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SDUR | TLEN | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | TLEN | | + +---------------+ | + | text string (no.bytes=TLEN) | + | | + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | modifiers (no.bytes=LEN - 8 - TLEN) | + | +-+-+-+-+-+-+-+-+ + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 12. A payload carrying two TYPE 1 units + + In Figure 12, an RTP packet carrying two TYPE 1 units is depicted. + It can be seen how the length fields LEN and TLEN can be used to find + the start of the next unit (LEN), the start of the modifiers (TLEN), + and the length of the modifiers (LEN-TLEN). 
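Informative example: the walk over LEN and TLEN described for Figure 12 can be sketched as follows, together with the timestamp rule for configuration 1 in Section 4.6. The figure labels the modifier length as LEN - 8 - TLEN because LEN also counts the eight header bytes from LEN through TLEN, and the sketch uses that form. The function name, the dictionary layout, and the choice to skip other unit types are assumptions for illustration. The sketch also assumes that the first header byte (U, R, TYPE) is not counted in LEN, as the field layout in the figures suggests.

```python
import struct


def parse_units(payload: bytes, rtp_ts: int):
    """Walk an aggregate payload of TYPE 5 and TYPE 1 units (configuration 1)."""
    units = []
    offset = 0
    next_ts = rtp_ts  # the first TYPE 1 unit receives the RTP timestamp
    while offset < len(payload):
        first = payload[offset]
        u_bit = first >> 7          # leftmost bit of the first byte
        unit_type = first & 0x07    # rightmost three bits of the first byte
        (length,) = struct.unpack_from(">H", payload, offset + 1)
        end = offset + 1 + length   # LEN covers itself and everything after it
        if length < 3 or end > len(payload):
            raise ValueError("malformed or truncated unit")
        sidx = payload[offset + 3]
        if unit_type == 5:
            # Sample descriptions take the RTP timestamp of their packet.
            units.append({"type": 5, "sidx": sidx, "ts": rtp_ts,
                          "description": payload[offset + 4:end]})
        elif unit_type == 1:
            sdur = int.from_bytes(payload[offset + 4:offset + 7], "big")
            (tlen,) = struct.unpack_from(">H", payload, offset + 7)
            text_start = offset + 9
            units.append({"type": 1, "u": u_bit, "sidx": sidx, "ts": next_ts,
                          "sdur": sdur,
                          "text": payload[text_start:text_start + tlen],
                          # the remaining LEN - 8 - TLEN bytes are modifiers
                          "modifiers": payload[text_start + tlen:end]})
            next_ts = (next_ts + sdur) & 0xFFFFFFFF  # RTP timestamps are 32 bits
        # Units of any other TYPE are skipped in this sketch.
        offset = end
    return units
```

Each TYPE 1 unit after the first gets the timestamp of the preceding TYPE 1 unit plus that unit's SDUR, which matches ts(#3) and ts(#4) in Section 4.6.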
+ + + +Rey & Matsui Standards Track [Page 39] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |V=2|P|X| CC |M| PT | sequence number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | timestamp | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | synchronization source (SSRC) identifier | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |U| R |TYPE5| LEN( always >3) | SIDX | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | | + | sample description (no.bytes=LEN - 3) | + | | + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |U| R |TYPE1| LEN (always >=8) | SIDX | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SDUR | TLEN | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | TLEN | | + +-+-+-+-+-+-+-+-+ | + | text string fragment (no.bytes=TLEN) | + | | + | | + | +-+-+-+-+-+-+-+-+ + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 13. An RTP packet carrying a TYPE 5 and a TYPE 1 unit + + In Figure 13, a sample description and a TYPE 1 unit are aggregated. + The TYPE 1 unit happens to contain only text strings and is small, so + an additional TYPE 5 unit is included to take advantage of the + available bits in the packet. 
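Informative example: a sender-side sketch of the aggregate shown in Figure 13, serializing one TYPE 5 unit followed by one TYPE 1 unit. The helper names are hypothetical. Setting the reserved R bits to zero, and setting the U bit of a TYPE 5 unit to zero, are assumptions of this sketch and are not taken from the text above.

```python
import struct


def _first_byte(u_bit: int, unit_type: int) -> bytes:
    # U in the leftmost bit, R (assumed zero) in the next four, TYPE in the last three.
    return bytes([((u_bit & 1) << 7) | (unit_type & 0x07)])


def pack_type5(sidx: int, description: bytes) -> bytes:
    """Serialize a sample description unit; LEN = 3 + description length."""
    length = 3 + len(description)
    if length > 0xFFFF:
        raise ValueError("sample description does not fit the 16-bit LEN field")
    return _first_byte(0, 5) + struct.pack(">HB", length, sidx) + description


def pack_type1(u_bit: int, sidx: int, sdur: int, text: bytes,
               modifiers: bytes = b"") -> bytes:
    """Serialize a whole text sample; LEN = 8 + TLEN + modifier length."""
    length = 8 + len(text) + len(modifiers)
    if length > 0xFFFF:
        raise ValueError("text sample does not fit the 16-bit LEN field")
    return (_first_byte(u_bit, 1)
            + struct.pack(">HB", length, sidx)
            + sdur.to_bytes(3, "big")          # SDUR is a 24-bit field
            + struct.pack(">H", len(text))     # TLEN
            + text + modifiers)


# Sample descriptions come first in the aggregate, as required by Section 4.6.
payload = pack_type5(200, b"DESC") + pack_type1(0, 200, 1000, b"Hi")
```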
+ + + + + + + + + + + + + + + + +Rey & Matsui Standards Track [Page 40] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |V=2|P|X| CC |M| PT | sequence number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | timestamp | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | synchronization source (SSRC) identifier | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |U| R |TYPE2| LEN( always >9) |TOTAL=4|THIS=1 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SDUR | SIDX | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SLEN | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | + | text string fragment (no.bytes=LEN - 9) | + | | + : : + : : + | +-+-+-+-+-+-+-+-+ + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 14. Payload with first text string fragment of a sample + + In Figures 14, 15, and 16, a text sample is split into three RTP + packets. In Figure 14, the text string is big and takes the whole + packet length. In Figure 15, the only possibility for carrying two + fragments of the same text sample is represented (see configuration 3 + in Section 4.6). The last packet, shown in Figure 16, carries the + last modifier fragment, a TYPE 4. 
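Informative example: the receiver-side handling of the four units in Figures 14 through 16 can be sketched using the TOTAL and THIS fields, following the procedure in Section 4.5. The function name and the dictionary keys are hypothetical. The sketch assumes that THIS counts from 1 (as in the figures), that U=1 signals UTF-16, and that all units passed in share one RTP timestamp. Partial rendering of incomplete samples is left out.

```python
UTF16_BE_BOM = b"\xfe\xff"


def reassemble(units):
    """Reassemble one text sample from fragment units sharing a timestamp.

    Each unit is a dict with keys: type (2, 3 or 4), u, total, this, data.
    Returns (text, modifiers, missing); text and modifiers are None when
    any fragment is missing.
    """
    by_pos = {}
    for unit in units:
        by_pos.setdefault(unit["this"], unit)  # repeated unit: keep one copy only
    if not by_pos:
        raise ValueError("no units to reassemble")
    first = next(iter(by_pos.values()))
    total = first["total"]
    missing = [n for n in range(1, total + 1) if n not in by_pos]
    if missing:
        return None, None, missing
    ordered = [by_pos[n] for n in range(1, total + 1)]  # order by THIS
    text = b"".join(u["data"] for u in ordered if u["type"] == 2)
    modifiers = b"".join(u["data"] for u in ordered if u["type"] in (3, 4))
    if first["u"] == 1:
        text = UTF16_BE_BOM + text  # the BOM is not sent, so the receiver restores it
    return text, modifiers, missing
```

When the returned list of missing positions is not empty, the application decides whether to render what was received, as discussed in Section 4.5.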
+ + + + + + + + + + + + + + + + + + + + +Rey & Matsui Standards Track [Page 41] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |V=2|P|X| CC |M| PT | sequence number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | timestamp | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | synchronization source (SSRC) identifier | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |U| R |TYPE2| LEN( always >9) |TOTAL=4|THIS=2 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SDUR | SIDX | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SLEN | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | + | text string fragment (no.bytes=LEN - 9) | + | | + | | + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |U| R |TYPE3| LEN( always >6) |TOTAL=4|THIS=3 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SDUR | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | + | | + | modifiers (no.bytes=LEN - 6) | + | +-+-+-+-+-+-+-+-+ + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 15. 
An RTP packet carrying a TYPE 2 unit and a TYPE 3 unit + + + + + + + + + + + + + + + + + + + + +Rey & Matsui Standards Track [Page 42] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |V=2|P|X| CC |M| PT | sequence number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | timestamp | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | synchronization source (SSRC) identifier | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |U| R |TYPE4| LEN( always >6) |TOTAL=4|THIS=4 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SDUR | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | + | | + | modifiers (no.bytes=LEN - 6) | + | +-+-+-+-+-+-+-+-+ + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 16. An RTP packet carrying last modifiers fragment (TYPE 4) + +4.8. Relation to RFC 3640 + + RFC 3640 [12] defines a payload format for the transport of any non- + multiplexed MPEG-4 elementary stream. One of the various MPEG-4 + elementary stream types is MPEG-4 timed text streams, specified in + MPEG-4 part 17 [26], also known as ISO/IEC 14496-17. MPEG-4 timed + text streams are capable of carrying 3GPP timed text data, as + specified in 3GPP TS 26.245 [1]. + + MPEG-4 timed text streams are intentionally constructed so as to + guarantee interoperability between RFC 3640 and this payload format. + This means that the construction of the RTP packets carrying timed + text is the same. That is, the MPEG-4 timed text elementary stream + as per ISO/IEC 14496-17 is identical to the (aggregate) payloads + constructed using this payload format. + + Figure 17 illustrates the process of constructing an RTP packet + containing timed text. 
As can be seen in the partition block, the + (transport) units used in this payload format are identical to the + Timed Text Units (TTUs) defined in ISO/IEC 14496-17. Likewise, the + rules for payload aggregation as per Section 4.6 are identical to + those defined in ISO/IEC 14496-17 and are compliant with RFC 3640. + As a result, an RTP packet that uses this payload format is identical + to an RTP packet using RFC 3640 conveying TTUs according to ISO/IEC + 14496-17. In particular, MPEG-4 Part 17 specifies that when using + + + + + +Rey & Matsui Standards Track [Page 43] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + RFC 3640 for transporting timed text streams, the "streamType" + parameter value is set to 0x0D, and the value of the + "objectTypeIndication" in "config" takes the value 0x08. + + +--------------------------------------+ + Text samples | +--------------+ +--------------+ | + as per 3GPP | |Text Sample 1 | |Text Sample N | | + TS 26245 | +--------------+ +--------------+ | + +--------------------------------------+ + \/ + +-------------------------------------------------------------------+ + | Partition Text Samples into units. TTU[i]= TYPE i units. | + | | + |[U R TYPE LEN][{TOTAL,THIS}SIDX{SDUR}{TLEN}{SLEN}][SampleContents] | + |{..} means present if applicable, [..] means always present | + +-------------------------------------------------------------------+ + \/ \/ + +-------------------------------------------------------------------+ + | Aggregation (if possible) | + +-------------------------------------------------------------------+ + \/ \/ + +-------------------------------------------------------------------+ + | RTP Entity adds and fills RTP header and Sends RTP packet, where | + | RTP packets according to this Payload Format = | + | RTP packets carrying MPEG-4 Timed Text ES over RFC 3640 | + +-------------------------------------------------------------------+ + + Figure 17. 
Relation to RFC 3640 + + Note: The use of RFC 3640 for transport of ISO/IEC 14496-17 data does + not require any new SDP parameters or any new mode definition. + +4.9. Relation to RFC 2793 + + RFC 2793 [22] and its revision, RFC 4103 [23], specify a protocol for + enabling text conversation. Typical applications of this payload + format are text communication terminals and text conferencing tools. + Text session contents are specified in ITU-T Recommendation T.140 + [24]. T.140 text is UTF-8 coded as specified in T.140 [24] with no + extra framing. The T140block contains one or more T.140 code + elements as specified in T.140. Code elements are control sequences + such as "New Line", "Interrupt", "String Terminator", or "Start of + String". Most T.140 code elements are single ISO 10646 [25] + characters, but some are multiple character sequences. Each + character is UTF-8 encoded [18] into one or more octets. + + + + + + +Rey & Matsui Standards Track [Page 44] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + This payload format may also be used for conversational applications + (even for instant messaging). However, this is not its main target. + The differentiating feature of 3GPP Timed Text media format is that + it allows text decoration. This is especially useful in multimedia + presentations, karaoke, commercial banners, news tickers, clickable + text strings, and captions. T.140 text contents used in RFC 2793 do + not allow the use of text decoration. + + Furthermore, the conversational text RTP payload format recommends a + method to include redundant text from already transmitted packets in + order to reduce the risk of text loss caused by packet loss. Thereby + payloads would include a redundant copy of the last payload sent. + This payload format does not describe such a method, but this is also + applicable here. As explained in Section 5, packet redundancy SHOULD + be used, whenever possible. 
The aggregation guidelines in Section + 4.6 allow redundant payloads. + +5. Resilient Transport + + Apart from the basic fragmentation guidelines described in the + section above, the simplest option for packet-loss-resilient + transport is packet repetition. This mechanism may consist of a + strict window-based repetition mechanism or, simply, a repetition + mechanism in a wider sense, where new and old packets are mixed, for + example. + + A server MAY decide to use repetition as a measure for packet loss + resilience. Thereby, a server MAY send the same RTP payloads or just + some of the units from the payloads. + + As for the case of complete payloads, single repeated units MUST + exactly match the same units sent in the first transmission; i.e., if + fragmentation is needed, it SHALL be performed only once for each + text sample. Only then can a receiver use the already received and + the repeated units to reconstruct the original text samples. Since + the RTP timestamp is used to group together the fragments of a + sample, care must be taken to preserve the timing of units when + constructing new RTP packets. + + For example, if a text sample was originally sent as a single + non-fragmented text sample (one TYPE 1 unit), a repetition of + that sample MUST be sent also as a single non-fragmented text + sample in one unit. Likewise, if the original text sample was + fragmented and spread over several RTP packets (say, a total of + 3 units), then the repeated fragments SHALL also have the same + byte boundaries and use the same unit headers and bytes per + fragment. + + With repetition, repeated units resolve to the same timestamp as + their originals. Where redundant units are available, only one of + them SHALL be used.
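Informative example: one way for a sender to honor the rule that fragmentation is performed only once per text sample is to cache the units produced for each sample and reuse them for every repetition. The class name, the sample identifier, and the fragmenter callback are hypothetical and only illustrate the idea.

```python
class FragmentCache:
    """Remember the units created for each text sample so repeats are identical."""

    def __init__(self, fragmenter):
        # fragmenter: callable turning a text sample into a sequence of units
        self._fragmenter = fragmenter
        self._units = {}

    def units_for(self, sample_id, sample):
        """Return the units for a sample, fragmenting it on first use only."""
        if sample_id not in self._units:
            self._units[sample_id] = tuple(self._fragmenter(sample))
        return self._units[sample_id]
```

Because the cached units are reused unchanged, repeated fragments keep the same byte boundaries and unit headers as the first transmission.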
+ + Regarding the RTP header fields: + + o If the whole RTP payload is repeated, all payload-specific fields + in the RTP header (the M, TS and PT fields) MUST keep their + original values except the sequence number, which MUST be + incremented to comply with RTP (the fields TOTAL/THIS enable to + re-assemble fragments with different sequence numbers). + + o In packets containing single repeated units, the general rules in + Section 3 for assigning values to the RTP header fields apply. + Keeping the value of the RTP timestamp to preserve the timing of + the units is particularly relevant here. + + Apart from repetition, other mechanisms such as FEC [7], + retransmission [11], or similar techniques could be used to cope with + packet losses. + +6. Congestion Control + + Congestion control for RTP SHALL be implemented in accordance with + RTP [3] and the applicable RTP profile, e.g., RTP/AVP [17]. + + When using this payload format, mainly two factors may affect the + congestion control: + + o The use of (unit) aggregation may make the payload format more + bandwidth efficient, by avoiding header overhead and thus reducing + the used bitrate. + + o The use of resilient transport mechanisms: Although timed text + applications typically operate at low bitrates, the increase due to + resilient transport shall be considered for congestion control + mechanisms. This applies to all mechanisms but especially to less + efficient ones like repetition. + + + + + + + + + + + + + +Rey & Matsui Standards Track [Page 46] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + +7. Scene Description + +7.1. Text Rendering Position and Composition + + In order to set up a timed text session, regardless of the stream + being stored in a 3GP file or streamed live, some initial layout + information is needed by the communicating peers. 
+ + +-------------------------------------------+ + | <-> tx | +-------------+ + | +-------------------------------+ |<---|Display Area | + | ^ | | | +-------------+ + | : | | | + | :ty| | | +-------------+ + | : | |<---------|Video track | + | : | | | +-------------+ + | : | | | + | : | | | + | : | | | + | v | | | + | - | x-------------------------+ | | +-------------+ + |h ^ | | |<-----------|Text Track | + |e : +---|-------------------------|-+ | +-------------+ + |i : | +---------------------+ | | + |g : | | | | | +-------------+ + |h : | | |<------------ |Text Box | + |t v | +---------------------+ | | +-------------+ + | - +-------------------------+ | + +-------------------------------------------+ + <........................> + w i d t h + + Figure 18. Illustration of text rendering position and composition + + The parameters used for negotiating the position and size of the text + track in the display area are shown in Figure 18. These are the + "width" and "height" of the text track, its translation values, "tx" + and "ty", and its "layer" or proximity to the user. + + At the same time, the sender of the stream needs to know the + receiver's capabilities. In this case, the maximum allowable values + for the text track height and width: "max-h" and "max-w", for the + stream the receiver shall display. + + This layout information MUST be conveyed in a reliable form before + the start of the session, e.g., during session announcement or in an + Offer/Answer (O/A) exchange. An example of a reliable transport may + be the out-of-band channel used for SDP. Sections 8 and 9 provide + + + +Rey & Matsui Standards Track [Page 47] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + details on the mapping of these parameters to SDP descriptions and + their usage in O/A. + + For stored content, the layout values expressing stream properties + MUST be obtained from the Track Header Box. See Section 7.3. 
+ + For live streaming, appropriate values as negotiated during session + setup shall be used. + +7.2. SMIL Usage + + The attributes contained in the Track Header Boxes of a 3GP file only + specify the spatial relationship of the tracks within the given 3GP + file. + + If multiple 3GP files are sent, they require spatial synchronization. + For example, for a text and video stream, the positions of the text + and video tracks in Figure 18 shall be determined. For this purpose, + SMIL [9] MAY be used. + + SMIL assigns regions in the display to each of those files and places + the tracks within those regions. Generally, in SMIL, the position of + one track (or stream) is expressed relative to another track. This + is different from the 3GP file, where the upper left corner is the + reference for all translation offsets. Hence, only if the position + in SMIL is relative to the video track origin, then this translation + offset has the same value as (tx, ty) in the 3GP file. + + Note also that the original track header information is used for each + track only within its region, as assigned by SMIL. Therefore, even + if SMIL scene description is used, the track header information + pieces SHOULD be sent anyway, as they represent the intrinsic media + properties. See 3GPP SMIL Language Profile in [27] for details. + +7.3. Finding Layout Values in a 3GP File + + In a 3GP file, within the Track Header Box (tkhd): + + o tx, ty: These values specify the translation offset of the + (text) track relative to the upper left corner of the video + track, if present. They are the second but last and third but + last values in the unity matrix; values are fixed-point 16.16 + values, restricted to be (signed) integers (i.e., the lower 16 + bits of each value shall be all zeros). Therefore, only the + first 16 bits are used for obtaining the value of the media + type parameters. 
+ + + + + +Rey & Matsui Standards Track [Page 48] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + o width, height: They have the same name in the tkhd box. All + (unsigned) 32 bits are meaningful. + + o layer: All (signed) 16 bits are used. + +8. 3GPP Timed Text Media Type + + The media subtype for the 3GPP Timed Text codec is allocated from the + standards tree. The top-level media type under which this payload + format is registered is 'video'. This registration is done using the + template defined in [29] and following RFC 3555 [28]. + + The receiver MUST ignore any unrecognized parameter. + + Media type: video + + Media subtype: 3gpp-tt + + Required parameters + + rate: + Refer to Section 3 in RFC 4396. + + sver: + The parameter "sver" contains a list of supported + backwards-compatible versions of the timed text format + specification (3GPP TS 26.245) that the sender accepts + to receive (and that are the same that it would be + willing to send). The first value is the value + preferred to receive (or preferred to send). The first + value MAY be followed by a comma-separated list of + versions that SHOULD be used as alternatives. The order + is meaningful, being first the most preferred and last + the least preferred. Each entry has the format + Zi(xi*256+yi), where "Zi" is the number of the Release + and "xi" and "yi" are taken from the 3GPP specification + version (i.e., vZi.xi.yi). For example, for 3GPP TS + 26.245 v6.0.0, Zi(xi*256+yi)=6(0), the version value is + "60". (Note that "60" is the concatenation of the + values Zi=6 and (xi*256+yi)=0 and not their product.) + + If no "sver" value is available, for example, when + streaming out of a 3GP file, the default value "60", + corresponding to the 3GPP Release 6 version of 3GPP TS + 26.245, SHALL be used. 
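The "sver" entry format described above can be sketched as a small helper. This is only an illustration of the Zi(xi*256+yi) rule as worded in this section; the function name `sver_value` is hypothetical and not part of the registration.

```python
def sver_value(release: int, x: int, y: int) -> str:
    """Build one "sver" entry for 3GPP TS 26.245 version v<release>.<x>.<y>.

    The entry is the release number concatenated (not multiplied) with
    the decimal value of x*256 + y.
    """
    if release < 0 or x < 0 or y < 0:
        raise ValueError("version components must be non-negative")
    return f"{release}{x * 256 + y}"
```

Under this reading, v6.0.0 yields "60" (the default value) and v6.1.0 yields "6256".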
+ + + + + + +Rey & Matsui Standards Track [Page 49] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + Optional parameters: + + tx: + This parameter indicates the horizontal translation + offset in pixels of the text track with respect to the + origin of the video track. This value is the decimal + representation of a 16-bit signed integer. Refer to TS + 3GPP 26.245 for an illustration of this parameter. + + ty: + This parameter indicates the vertical translation offset + in pixels of the text track with respect to the origin + of the video track. This value is the decimal + representation of a 16-bit signed integer. Refer to TS + 3GPP 26.245 for an illustration of this parameter. + + layer: + This parameter indicates the proximity of the text track + to the viewer. More negative values mean closer to the + viewer. This parameter has no units. This value is the + decimal representation of a 16-bit signed integer. + + tx3g: + This parameter MUST be used for conveying sample + descriptions out-of-band. It contains a comma-separated + list of base64-encoded entries. The entries of this + list MAY follow any particular order and the list SHALL + NOT be empty. Each entry is the result of running + base64 encoding over the concatenation of the (static) + SIDX value as an 8-bit unsigned integer and the (static) + sample description for that SIDX, in that order. The + format of a sample description entry can be found in + 3GPP TS 26.245 Release 6 and later releases. All + servers and clients MUST understand this parameter and + MUST be capable of using the sample description(s) + contained in it. Please refer to RFC 3548 [6] for + details on the base64 encoding. + + width: + This parameter indicates the width in pixels of the text + track or area of the text being sent. This value is the + decimal representation of a 32-bit unsigned integer. + Refer to TS 3GPP 26.245 for an illustration of this + parameter. 
+ + + + + + + +Rey & Matsui Standards Track [Page 50] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + height: + This parameter indicates the height in pixels of the + text track being sent. This value is the decimal + representation of a 32-bit unsigned integer. Refer to + TS 3GPP 26.245 for an illustration of this parameter. + + max-w: + This parameter indicates display capabilities. This is + the maximum "width" value that the sender of this + parameter supports. This value is the decimal + representation of a 32-bit unsigned integer. + + max-h: + This parameter indicates display capabilities. This is + the maximum "height" value that the sender of this + parameter supports. This value is the decimal + representation of a 32-bit unsigned integer. + + Encoding considerations: + + This media type is framed (see Section 4.8 in [29]) and + partially contains binary data. + + Restrictions on usage: + + This media type depends on RTP framing, and hence is only + defined for transfer via RTP [3]. Transport within other + framing protocols is not defined at this time. + + Security considerations: + + Please refer to Section 11 of RFC 4396. + + Interoperability considerations: + + The 3GPP Timed Text media format and its file storage is + specified in Release 6 of 3GPP TS 26.245, "Transparent end-to- + end packet switched streaming service (PSS); Timed Text Format + (Release 6)". Note also that 3GPP may in future releases + specify extensions or updates to the timed text media format in + a backwards-compatible way, e.g., new modifier boxes or + extensions to the sample descriptions. The payload format + defined in RFC 4396 allows for such extensions. For future 3GPP + Releases of the Timed Text Format, the parameter "sver" is used + to identify the exact specification used. 
+ + + + + + +Rey & Matsui Standards Track [Page 51] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + The defined storage format for 3GPP Timed Text format is the + 3GPP File Format (3GP) [30]. 3GP files may be transferred using + the media type video/3gpp as registered by RFC 3839 [31]. The + 3GPP File Format is a container file that may contain, e.g., + audio and video that may be synchronized with the 3GPP Timed + Text. + + Published specification: RFC 4396 + + Applications which use this media type: + + Multimedia streaming applications. + + Additional information: + + The 3GPP Timed Text media format is specified in 3GPP TS 26.245, + "Transparent end-to-end packet switched streaming service (PSS); + Timed Text Format (Release 6)". This document and future + extensions to the 3GPP Timed Text format are publicly available + at http://www.3gpp.org. + + Magic number(s): None. + + File extension(s): None. + + Macintosh File Type Code(s): None. + + Person & email address to contact for further information: + + Jose Rey, jose.rey@eu.panasonic.com + Yoshinori Matsui, matsui.yoshinori@jp.panasonic.com + Audio/Video Transport Working Group. + + Intended usage: COMMON + + Authors: + Jose Rey + Yoshinori Matsui + + Change controller: IETF Audio/Video Transport Working Group delegated + from the IESG. + + + + + + + + + + +Rey & Matsui Standards Track [Page 52] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + +9. SDP Usage + +9.1. Mapping to SDP + + The information carried in the media type specification has a + specific mapping to fields in SDP [4]. If SDP is used to specify + sessions using this payload format, the mapping is done as follows: + + o The media type ("video") goes in the SDP "m=" as the media name. 
+ + m=video <port number> RTP/<RTP profile> <dynamic payload type> + + o The media subtype ("3gpp-tt") and the timestamp clockrate "rate" + (the RECOMMENDED 1000 Hz or other value) go in SDP "a=rtpmap" line + as the encoding name and rate, respectively: + + a=rtpmap:<payload type> 3gpp-tt/1000 + + o The REQUIRED parameter "sver" goes in the SDP "a=fmtp" attribute by + copying it directly from the media type string as a semicolon- + separated parameter=value pair. + + o The OPTIONAL parameters "tx", "ty", "layer", "tx3g", "width", + "height", "max-w" and "max-h" go in the SDP "a=fmtp" attribute by + copying them directly from the media type string as a semicolon + separated list of parameter=value(s) pairs: + + a=fmtp:<dynamic payload type> <parameter + name>=<value>[,<value>][; <parameter name>=<value>] + + o Any parameter unknown to the device that uses the SDP SHALL be + ignored. For example, parameters added to the media format in + later specifications MAY be copied into the SDP and SHALL be + ignored by receivers that do not understand them. + +9.2. Parameter Usage in the SDP Offer/Answer Model + + In this section, the meaning of the SDP parameters defined in this + document within the Offer/Answer [13] context is explained. + + In unicast, sender and receiver typically negotiate the streams, + i.e., which codecs and parameter values are used in the session. + This is also possible in multicast to a lesser extent. + + Additionally, the meaning of the parameters MAY vary depending on + which direction is used. In the following sections, a + "<directionality> offer" means an offer that contains a stream set to + <directionality>. <directionality> may take the values sendrecv, + + + +Rey & Matsui Standards Track [Page 53] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + sendonly, and recvonly. Similar considerations apply for answers. + For example, an answer to a sendonly offer is a recvonly answer. + +9.2.1. 
Unicast Usage + + The following types of parameters are used in this payload format: + + 1. Declarative parameters: Offerer and answerer declare the values + they will use for the incoming (sendrecv/recvonly) or outgoing + (sendonly) stream. Offerer and answerer MAY use different + values. + + a. "tx", "ty", and "layer": These are parameters describing + where the received text track is placed. Depending on the + directionality: + + i. They MUST appear in all sendrecv offers and answers and + in all recvonly offers and answers (thus applying to + the incoming stream). In the case of sendrecv offers + and answers and in recvonly offers, these values SHOULD + be used by the sender of the stream unless it has a + particular preference, in which case, it MUST make sure + that these different values do not corrupt the + presentation. For recvonly answers, the answerer MAY + accept the proposed values for the incoming stream (in + a sendonly offer; see ii. below) or respond with + different ones. The offerer MUST use the returned + values. + + ii. They MAY appear in sendonly offers and MUST appear in + sendonly answers. In sendonly offers, they specify the + values that the offerer proposes for sending (see + example in Section 9.3). In sendonly answers, these + values SHOULD be copied from the corresponding recvonly + offer upon accepting the stream, unless a particular + preference by the receiver of the stream exists, as + explained in the previous point. + + 2. Parameters describing the display capabilities, "max-h" and + "max-w", which indicate the maximum dimensions of the text track + (text display area) for the incoming stream "tx" and "ty" values + (see Figure 18). "max-h" and "max-w" MUST be included in all + offers and answers where "tx" and "ty" refer to the incoming + stream, thus excluding sendonly offers and answers (see example + in Section 9.3), where they SHALL NOT be present. 
+ + + + + + +Rey & Matsui Standards Track [Page 54] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + 3. Parameters describing the sent stream properties, i.e., the + sender of the stream decides upon the values of these: + + a. "width" and "height" specify the text track dimensions. + They SHALL ALWAYS be present in sendrecv and sendonly + offers and answers. For recvonly answers, the answerer + MUST include the offered parameter values (if any) verbatim + in the answer upon accepting the stream. + + b. "tx3g" contains static sample descriptions. It MAY only be + present in sendrecv and sendonly offers and answers. This + parameter applies to the stream that offerers or answerers + send. + + 4. Negotiable parameters, which MUST be agreed on. This is the + case of "sver". This parameter MUST be present in every offer + and answer. The answerer SHALL choose one supported value from + the offerer's list, or else it MUST remove the stream or reject + the session. + + 5. Symmetric parameters: "rate", timestamp clockrate, belongs to + this class. Symmetric parameters MUST be echoed verbatim in the + answer. Otherwise, the stream MUST be removed or the session + rejected. 
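The "sver" negotiation rule in item 4 can be sketched as follows. This is a minimal illustration, not a normative procedure: the function name `choose_sver` is hypothetical, and picking the first supported entry in the offerer's order of preference is an assumption (the text only requires that one supported value be chosen).

```python
from typing import Iterable, Optional


def choose_sver(offered: str, supported: Iterable[str]) -> Optional[str]:
    """Pick one "sver" value for the answer.

    `offered` is the comma-separated list from the offer, most preferred
    first. Returns None when no offered version is supported, in which
    case the stream has to be removed or the session rejected.
    """
    supported_set = set(supported)
    for version in (v.strip() for v in offered.split(",")):
        if version in supported_set:
            return version
    return None
```

With an offer of "6256,60", an answerer that supports only "60" would answer with "60". An answerer that supports neither value gets None and cannot accept the stream.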
+
+   The following table summarizes all options:
+
+   +..---------------------------+----------+----------+----------+
+   |  ``--..__  Directionality/  | sendrecv | recvonly | sendonly |
+   +  Type of  ``--..__  O or A  +----------+----------+----------+
+   |  Parameter        ``--..__  |   O/A    |   O/A    |   O/A    |
+   +--------------+------------``+----------+----------+----------+
+   | Declarative  |tx, ty, layer |   M/M    |   M/M    |   m/M    |
+   |              |              |          |          |          |
+   +--------------+--------------+----------+----------+----------+
+   | Display      |max-h, max-w  |   M/M    |   M/M    |   -/-    |
+   | Capabilities |              |          |          |          |
+   +--------------+--------------+----------+----------+----------+
+   | Stream       |height, width |   M/M    |   -/(M)  |   M/M    |
+   | properties   |tx3g          |   m/m    |   -/-    |   m/m    |
+   |              |              |          |          |          |
+   +--------------+--------------+----------+----------+----------+
+   | Negotiable   |sver          |   M/M    |   M/M    |   M/M    |
+   |              |              |          |          |          |
+   +--------------+--------------+----------+----------+----------+
+   | Symmetric    |rate          |   M/M    |   M/M    |   M/M    |
+   +--------------+--------------+----------+----------+----------+
+
+           Table 1.  Parameter usage in Unicast Offer / Answer.
+
+   KEY:
+      o  M means MUST be present.
+      o  m means MAY be present (such as proposed values).
+      o  (M) or (m) means MUST or MAY, if applicable.
+      o  a hyphen ("-") means the parameter MUST NOT be present.
+
+   Other observations regarding parameter usage:
+
+   o  Translation and transparency values: In sendonly offers, "tx",
+      "ty", and "layer" indicate proposed values.  This is useful for
+      visually composed sessions where the different streams occupy
+      different parts of the display, e.g., a video stream and the
+      captions.  These are just suggested values; the peer rendering
+      the text ultimately decides where to place the text track.
+ + o Text track (area) dimensions, "height" and "width": In the case + of sendonly offers, an answerer accepting the offer MUST be + prepared to render the stream using these values. If any of + these conditions are not met, the stream MUST be removed or the + session rejected. + + o Display capabilities, "max-h" and "max-w": An answerer sending a + stream SHALL ensure that the "height" and "width" values in the + answer are compatible with the offerer's signaled capabilities. + + + + +Rey & Matsui Standards Track [Page 56] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + o Version handling via "sver": The idea is that offerer and + answerer communicate using the same version. This is achieved by + letting the answerer choose from a list of supported versions, + "sver". For recvonly streams, the first value in the list is the + preferred version to receive. Consequently, for sendonly (and + sendrecv) streams, the first value is the one preferred for + sending (and receiving). The answerer MUST choose one value and + return it in the answer. Upon receiving the answer, the offerer + SHALL be prepared to send (sendonly and sendrecv) and receive + (recvonly and sendrecv) a stream using that version. If none of + the versions in the list is supported, the stream MUST be removed + or the session rejected. Note that, if alternative non- + compatible versions are offered, then this SHALL be done using + different payload types. + +9.2.2. Multicast Usage + + In multicast, the parameter usage is similar to the unicast case, + except as follows: + + o the parameters "tx", "ty", and "layer" in multicast offers only + have meaning for sendrecv and recvonly streams. In order for all + clients to have the same vision of the session, they MUST be used + symmetrically. + + o for "height", "width", and "tx3g" (for sendrecv and sendonly), + multicast offers specify which values of these parameters the + participants MUST use for sending. 
Thus, if the stream is
+      accepted, the answerer MUST also include them verbatim in the
+      answer (also "tx3g", if present).
+
+   o  The capability parameters, "max-h" and "max-w", SHALL NOT be used
+      in multicast.  If the offered text track should change in size, a
+      new offer SHALL be used instead.
+
+   o  Regarding version handling:
+
+      In the case of multicast offers, an answerer MAY accept a multicast
+      offer as long as one of the versions listed in the "sver" is
+      supported.  Therefore, if the stream is accepted, the answerer MUST
+      choose its preferred version, but, unlike in unicast, the offerer
+      SHALL NOT change the offered stream to this chosen version because
+      there may be other session participants that do support the newer
+      extensions.  Consequently, different session participants may end
+      up using different backwards-compatible media format versions.  It
+      is RECOMMENDED that the multicast offer contains a limited number
+      of versions, in order for all participants to have the same view of
+      the session.  This is a responsibility of the session creator.  If
+      none of the offered versions is supported, the stream SHALL be
+      removed or the session rejected.  Also in this case, if alternative
+      non-compatible versions are offered, then this SHALL be done using
+      different payload types.
+
+9.3.  Offer/Answer Examples
+
+   In these unicast O/A examples, the long lines are wrapped around.
+   Static sample descriptions are shortened for clarity.
+
+   For sendrecv:
+
+   O -> A
+
+      m=video <port> RTP/AVP 98
+      a=rtpmap:98 3gpp-tt/1000
+      a=fmtp:98 tx=100; ty=100; layer=0; height=80; width=100; max-h=120;
+         max-w=160; sver=6256,60; tx3g=81...
+      a=sendrecv
+
+   A -> O
+
+      m=video <port> RTP/AVP 98..
+      a=rtpmap:98 3gpp-tt/1000
+      a=fmtp:98 tx=100; ty=95; layer=0; height=90; width=100; max-h=100;
+         max-w=160; sver=60; tx3g=82...
+      a=sendrecv
+
+   In this example, the offerer is telling the answerer where it will
+   place the received stream and what is the maximum height and width
+   allowable for the stream that it will receive.  Also, it tells the
+   answerer the dimensions of the text track for the stream sent and
+   which sample description it shall use.  It offers two versions, 6256
+   and 60.  The answerer responds with an equivalent set of parameters
+   for the stream it receives.  In this case, the answerer's "max-h" and
+   "max-w" are compatible with the offerer's "height" and "width".
+   Otherwise, the answerer would have to remove this stream, and the
+   offerer would have to issue a new offer taking the answerer's
+   capabilities into account.  This is possible only if multiple payload
+   types are present in the initial offer so that at least one of them
+   matches the answerer's capabilities as expressed by "max-h" and
+   "max-w" in the negative answer.  Note also that the answerer's text
+   box dimensions fit within the maximum values signaled in the offer.
+   Finally, the answerer chooses to use version 60 of the timed text
+   format.
+
+   For recvonly:
+
+   Offerer -> Answerer
+
+      m=video <port> RTP/AVP 98
+      a=rtpmap:98 3gpp-tt/1000
+      a=fmtp:98 tx=100; ty=100; layer=0; max-h=120; max-w=160; sver=6256,60
+      a=recvonly
+
+   A -> O
+
+      m=video <port> RTP/AVP 98..
+      a=rtpmap:98 3gpp-tt/1000
+      a=fmtp:98 tx=100; ty=100; layer=0; height=90; width=100; sver=60;
+         tx3g=82...
+      a=sendonly
+
+   In this case, the offer is different from the previous case: It does
+   not include the stream properties "height", "width", and "tx3g".  The
+   answerer copies the "tx", "ty", and "layer" values, thus
+   acknowledging these.  "max-h" and "max-w" are not present in the
+   answer because the "tx" and "ty" (and "layer") in this special case
+   do not apply to the received stream, but to the sent stream.
Also,
+   if offerer and answerer had very different display sizes, it would
+   not be possible to express the answerer's capabilities.  In the
+   example above and for an answerer with a 50x50 display, the
+   translation values are already out of range.
+
+   For sendonly:
+
+   O -> A
+
+      m=video <port> RTP/AVP 98
+      a=rtpmap:98 3gpp-tt/1000
+      a=fmtp:98 tx=100; ty=100; layer=0; height=80; width=100;
+         sver=6256,60; tx3g=81...
+      a=sendonly
+
+   A -> O
+
+      m=video <port> RTP/AVP 98..
+      a=rtpmap:98 3gpp-tt/1000
+      a=fmtp:98 tx=100; ty=100; layer=0; height=80; width=100; max-h=100;
+         max-w=160; sver=60
+      a=recvonly
+
+   Note that "max-h" and "max-w" are not present in the offer.  Also,
+   with this answer, the answerer would accept the offer as is (thus
+   echoing "tx", "ty", "height", "width", and "layer") and additionally
+   inform the offerer about its capabilities: "max-h" and "max-w".
+
+   Another possible answer for this case would be:
+
+   A -> O
+
+      m=video <port> RTP/AVP 98..
+      a=rtpmap:98 3gpp-tt/1000
+      a=fmtp:98 tx=120; ty=105; layer=0; max-h=95; max-w=150; sver=60
+      a=recvonly
+
+   In this case, the answerer does not accept the values offered.  The
+   offerer MUST use these values or else remove the stream.
+
+9.4.  Parameter Usage outside of Offer/Answer
+
+   SDP may also be employed outside of the Offer/Answer context, for
+   instance for multimedia sessions that are announced through the
+   Session Announcement Protocol (SAP) [14] or streamed through the Real
+   Time Streaming Protocol (RTSP) [15].
+
+   In this case, the receiver of a session description is required to
+   support the parameters and given values for the streams, or else it
+   MUST reject the session.  It is the responsibility of the sender (or
+   creator) of the session descriptions to define the session parameters
+   so that the probability of unsuccessful session setup is minimized.
+ This is out of the scope of this document. + +10. IANA Considerations + + IANA has registered the media subtype name "3gpp-tt" for the media + type "video" as specified in Section 8 of this document. + +11. Security Considerations + + RTP packets using the payload format defined in this specification + are subject to the security considerations discussed in the RTP + specification [3] and any applicable RTP profile, e.g., AVP [17]. + + In particular, an attacker may invalidate the current set of active + sample descriptions at the client by means of repeating a packet with + an old sample description, i.e., replay attack. This would mean that + the display of the text would be corrupted, if displayed at all. + Another form of attack may consist of sending redundant fragments, + whose boundaries do not match the exact boundaries of the originals + + + +Rey & Matsui Standards Track [Page 60] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + (as indicated by LEN) or fragments that carry different sample + lengths (SLEN). This may cause a decoder to crash. + + These types of attack may easily be avoided by using source + authentication and integrity protection. + + Additionally, peers in a timed text session may desire to retain + privacy in their communication, i.e., confidentiality. + + This payload format does not provide any mechanisms for achieving + these. Confidentiality, integrity protection, and authentication + have to be solved by a mechanism external to this payload format, + e.g., SRTP [10]. + +12. References + +12.1. Normative References + + [1] Transparent end-to-end packet switched streaming service (PSS); + Timed Text Format (Release 6), TS 26.245 v 6.0.0, June 2004. + + [2] ISO/IEC 14496-12:2004 Information technology - Coding of audio- + visual objects - Part 12: ISO base media file format. + + [3] Schulzrinne, H., Casner, S., Frederick, R., and V. 
Jacobson, + "RTP: A Transport Protocol for Real-Time Applications", STD 64, + RFC 3550, July 2003. + + [4] Handley, M. and V. Jacobson, "SDP: Session Description + Protocol", RFC 2327, April 1998. + + [5] Bradner, S., "Key words for use in RFCs to Indicate Requirement + Levels", BCP 14, RFC 2119, March 1997. + + [6] Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", + RFC 3548, July 2003. + +12.2. Informative References + + [7] Rosenberg, J. and H. Schulzrinne, "An RTP Payload Format for + Generic Forward Error Correction", RFC 2733, December 1999. + + [8] Perkins, C. and O. Hodson, "Options for Repair of Streaming + Media", RFC 2354, June 1998. + + [9] W3C, "Synchronised Multimedia Integration Language (SMIL 2.0)", + August, 2001. + + + + +Rey & Matsui Standards Track [Page 61] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + [10] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. + Norrman, "The Secure Real-time Transport Protocol (SRTP)", RFC + 3711, March 2004. + + [11] Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R. Hakenberg, + "RTP Retransmission Payload Format", Work in Progress, September + 2005. + + [12] van der Meer, J., Mackie, D., Swaminathan, V., Singer, D., and + P. Gentric, "RTP Payload Format for Transport of MPEG-4 + Elementary Streams", RFC 3640, November 2003. + + [13] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with + Session Description Protocol (SDP)", RFC 3264, June 2002. + + [14] Handley, M., Perkins, C., and E. Whelan, "Session Announcement + Protocol", RFC 2974, October 2000. + + [15] Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time Streaming + Protocol (RTSP)", RFC 2326, April 1998. + + [16] Transparent end-to-end packet switched streaming service (PSS); + Protocols and codecs (Release 6), TS 26.234 v 6.1.0, September + 2004. + + [17] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video + Conferences with Minimal Control", STD 65, RFC 3551, July 2003. 
+ + [18] Yergeau, F., "UTF-8, a transformation format of ISO 10646", STD + 63, RFC 3629, November 2003. + + [19] Hoffman, P. and F. Yergeau, "UTF-16, an encoding of ISO 10646", + RFC 2781, February 2000. + + [20] Friedman, T., Caceres, R., and A. Clark, "RTP Control Protocol + Extended Reports (RTCP XR)", RFC 3611, November 2003. + + [21] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, + "Extended RTP Profile for RTCP-based Feedback (RTP/AVPF)", Work + in Progress, August 2004. + + [22] Hellstrom, G., "RTP Payload for Text Conversation", RFC 2793, + May 2000. + + [23] Hellstrom, G. and P. Jones, "RTP Payload for Text Conversation", + RFC 4103, June 2005. + + + + + +Rey & Matsui Standards Track [Page 62] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + [24] ITU-T Recommendation T.140 (1998) - Text conversation protocol + for multimedia application, with amendment 1, (2000). + + [25] ISO/IEC 10646-1: (1993), Universal Multiple Octet Coded + Character Set. + + [26] ISO/IEC FCD 14496-17 Information technology - Coding of audio- + visual objects - Part 17: Streaming text format, Work in + progress, June 2004. + + [27] Transparent end-to-end Packet-switched Streaming Service (PSS); + 3GPP SMIL language profile, (Release 6), TS 26.246 v 6.0.0, June + 2004. + + [28] Casner, S. and P. Hoschka, "MIME Type Registration of RTP + Payload Formats", RFC 3555, July 2003. + + [29] Freed, N. and J. Klensin, "Media Type Specifications and + Registration Procedures", BCP 13, RFC 4288, December 2005. + + [30] Transparent end-to-end packet switched streaming service (PSS); + 3GPP file format (3GP) (Release 6), TS 26.244 V6.3. March 2005. + + [31] Castagno, R. and D. Singer, "MIME Type Registrations for 3rd + Generation Partnership Project (3GPP) Multimedia files", RFC + 3839, July 2004. + + + + + + + + + + + + + + + + + + + + + + + + + +Rey & Matsui Standards Track [Page 63] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + +13. 
Basics of the 3GP File Structure + + This section provides a coarse overview of the 3GP file structure, + which follows the ISO Base Media file Format [2]. + + Each 3GP file consists of "Boxes". In general, a 3GP file contains + the File Type Box (ftyp), the Movie Box (moov), and the Media Data + Box (mdat). The File Type Box identifies the type and properties of + the 3GP file itself. The Movie Box and the Media Data Box, serving + as containers, include their own boxes for each media. Boxes start + with a header, which indicates both size and type (these fields are + called, namely, "size" and "type"). Additionally, each box type may + include a number of boxes. + + In the following, only those boxes are mentioned that are useful for + the purposes of this payload format. + + The Movie Box (moov) contains one or more Track Boxes (trak), which + include information about each track. A Track Box contains, among + others, the Track Header Box (tkhd), the Media Header Box (mdhd), and + the Media Information Box (minf). + + The Track Header Box specifies the characteristics of a single track, + where a track is, in this case, the streamed text during a session. + Exactly one Track Header Box is present for a track. It contains + information about the track, such as the spatial layout (width and + height), the video transformation matrix, and the layer number. + Since these pieces of information are essential and static (i.e., + constant) for the duration of the session, they must be sent prior to + the transmission of any text samples. + + The Media Header Box contains the "timescale" or number of time units + that pass in one second, i.e., cycles per second or Hertz. The Media + Information Box includes the Sample Table Box (stbl), which contains + all the time and data indexing of the media samples in a track. Using + this box, it is possible to locate samples in time and to determine + their type, size, container, and offset into that container. 
Inside + the Sample Table Box, we can find the Sample Description Box (stsd, + for finding sample descriptions), the Decoding Time to Sample Box + (stts, for finding sample duration), the Sample Size Box (stsz), and + the Sample to Chunk Box (stsc, for finding the sample description + index). + + Finally, the Media Data Box contains the media data itself. In timed + text tracks, this box contains text samples. Its equivalent to audio + and video is audio and video frames, respectively. The text sample + consists of the text length, the text string, and one or several + Modifier Boxes. The text length is the size of the text in bytes. + + + +Rey & Matsui Standards Track [Page 64] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + + The text string is plain text to render. The Modifier Box is + information to render in addition to the text, such as color, font, + etc. + +14. Acknowledgements + + The authors would like to thank Dave Singer, Jan van der Meer, Magnus + Westerlund, and Colin Perkins for their comments and suggestions + about this document. + + The authors would also like to thank Markus Gebhard for the free and + publicly available JavE ASCII Editor (used for the ASCII drawings in + this document) and Henrik Levkowetz for the Idnits web service. + +Authors' Addresses + + Jose Rey + Panasonic R&D Center Germany GmbH + Monzastr. 4c + D-63225 Langen, Germany + + EMail: jose.rey@eu.panasonic.com + Phone: +49-6103-766-134 + Fax: +49-6103-766-166 + + + Yoshinori Matsui + Matsushita Electric Industrial Co., LTD. + 1006 Kadoma + Kadoma-shi, Osaka, Japan + + EMail: matsui.yoshinori@jp.panasonic.com + Phone: +81 6 6900 9689 + Fax: +81 6 6900 9699 + + + + + + + + + + + + + + + + + +Rey & Matsui Standards Track [Page 65] + +RFC 4396 Payload Format for 3GPP Timed Text February 2006 + + +Full Copyright Statement + + Copyright (C) The Internet Society (2006). 
+ + This document is subject to the rights, licenses and restrictions + contained in BCP 78, and except as set forth therein, the authors + retain all their rights. + + This document and the information contained herein are provided on an + "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS + OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET + ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, + INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE + INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED + WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +Intellectual Property + + The IETF takes no position regarding the validity or scope of any + Intellectual Property Rights or other rights that might be claimed to + pertain to the implementation or use of the technology described in + this document or the extent to which any license under such rights + might or might not be available; nor does it represent that it has + made any independent effort to identify any such rights. Information + on the procedures with respect to rights in RFC documents can be + found in BCP 78 and BCP 79. + + Copies of IPR disclosures made to the IETF Secretariat and any + assurances of licenses to be made available, or the result of an + attempt made to obtain a general license or permission for the use of + such proprietary rights by implementers or users of this + specification can be obtained from the IETF on-line IPR repository at + http://www.ietf.org/ipr. The IETF invites any interested party to + bring to its attention any copyrights, patents or patent + applications, or other proprietary rights that may cover technology + that may be required to implement this standard. Please address the + information to the IETF at ietf-ipr@ietf.org. + +Acknowledgement + + Funding for the RFC Editor function is provided by the IETF + Administrative Support Activity (IASA). 