author | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
---|---|---|
committer | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
commit | 4bfd864f10b68b71482b35c818559068ef8d5797 | |
tree | e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc6295.txt | |
parent | ea76e11061bda059ae9f9ad130a9895cc85607db | |
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc6295.txt')
-rw-r--r-- | doc/rfc/rfc6295.txt | 9579 |
1 files changed, 9579 insertions, 0 deletions
diff --git a/doc/rfc/rfc6295.txt b/doc/rfc/rfc6295.txt new file mode 100644 index 0000000..2be4a6a --- /dev/null +++ b/doc/rfc/rfc6295.txt @@ -0,0 +1,9579 @@ + + + + + + +Internet Engineering Task Force (IETF) J. Lazzaro +Request for Comments: 6295 J. Wawrzynek +Obsoletes: 4695 UC Berkeley +Category: Standards Track June 2011 +ISSN: 2070-1721 + + + RTP Payload Format for MIDI + +Abstract + + This memo describes a Real-time Transport Protocol (RTP) payload + format for the MIDI (Musical Instrument Digital Interface) command + language. The format encodes all commands that may legally appear on + a MIDI 1.0 DIN cable. The format is suitable for interactive + applications (such as network musical performance) and content- + delivery applications (such as file streaming). The format may be + used over unicast and multicast UDP and TCP, and it defines tools for + graceful recovery from packet loss. Stream behavior, including the + MIDI rendering method, may be customized during session setup. The + format also serves as a mode for the mpeg4-generic format, to support + the MPEG 4 Audio Object Types for General MIDI, Downloadable Sounds + Level 2, and Structured Audio. This document obsoletes RFC 4695. + +Status of This Memo + + This is an Internet Standards Track document. + + This document is a product of the Internet Engineering Task Force + (IETF). It represents the consensus of the IETF community. It has + received public review and has been approved for publication by the + Internet Engineering Steering Group (IESG). Further information on + Internet Standards is available in Section 2 of RFC 5741. + + Information about the current status of this document, any errata, + and how to provide feedback on it may be obtained at + http://www.rfc-editor.org/info/rfc6295. + +Copyright Notice + + Copyright (c) 2011 IETF Trust and the persons identified as the + document authors. All rights reserved. 
+ + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents + (http://trustee.ietf.org/license-info) in effect on the date of + publication of this document. Please review these documents + carefully, as they describe your rights and restrictions with respect + + + +Lazzaro & Wawrzynek Standards Track [Page 1] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + to this document. Code Components extracted from this document must + include Simplified BSD License text as described in Section 4.e of + the Trust Legal Provisions and are provided without warranty as + described in the Simplified BSD License. + +Table of Contents + + 1. Introduction ....................................................4 + 1.1. Terminology ................................................6 + 1.2. Bitfield Conventions .......................................6 + 2. Packet Format ...................................................6 + 2.1. RTP Header .................................................7 + 2.2. MIDI Payload ..............................................11 + 3. MIDI Command Section ...........................................13 + 3.1. Timestamps ................................................14 + 3.2. Command Coding ............................................16 + 4. The Recovery Journal System ....................................22 + 5. Recovery Journal Format ........................................24 + 6. Session Description Protocol ...................................28 + 6.1. Session Descriptions for Native Streams ...................29 + 6.2. Session Descriptions for mpeg4-generic Streams ............30 + 6.3. Parameters ................................................33 + 7. Extensibility ..................................................34 + 8. Congestion Control .............................................35 + 9. Security Considerations ........................................35 + 10. 
Acknowledgements ..............................................36 + 11. IANA Considerations ...........................................37 + 11.1. rtp-midi Media Type Registration .........................38 + 11.1.1. Repository Request for audio/rtp-midi .............40 + 11.2. mpeg4-generic Media Type Registration ....................42 + 11.2.1. Repository Request for Mode rtp-midi for + mpeg4-generic .....................................44 + 11.3. asc Media Type Registration ..............................46 + 12. Changes from RFC 4695 .........................................48 + Appendix A. The Recovery Journal Channel Chapters .................52 + A.1. Recovery Journal Definitions ..............................52 + A.2. Chapter P: MIDI Program Change ............................56 + A.3. Chapter C: MIDI Control Change ............................57 + A.3.1. Log Inclusion Rules ................................58 + A.3.2. Controller Log Format ..............................59 + A.3.3. Log List Coding Rules ..............................61 + A.3.4. The Parameter System ...............................64 + A.4. Chapter M: MIDI Parameter System ..........................66 + A.4.1. Log Inclusion Rules ................................68 + A.4.2. Log Coding Rules ...................................69 + A.4.2.1. The Value Tool ..............................71 + A.4.2.2. The Count Tool ..............................74 + A.5. Chapter W: MIDI Pitch Wheel ...............................74 + + + +Lazzaro & Wawrzynek Standards Track [Page 2] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + A.6. Chapter N: MIDI NoteOff and NoteOn ........................75 + A.6.1. Header Structure ...................................77 + A.6.2. Note Structures ....................................78 + A.7. Chapter E: MIDI Note Command Extras .......................79 + A.7.1. Note Log Format ....................................80 + A.7.2. 
Log Inclusion Rules ................................80 + A.8. Chapter T: MIDI Channel Aftertouch ........................81 + A.9. Chapter A: MIDI Poly Aftertouch ..........................82 + Appendix B. The Recovery Journal System Chapters ..................83 + B.1. System Chapter D: Simple System Commands ..................83 + B.1.1. Undefined System Commands .....................84 + B.2. System Chapter V: Active Sense Command ....................87 + B.3. System Chapter Q: Sequencer State Commands ................87 + B.3.1. Non-Compliant Sequencers ......................89 + B.4. System Chapter F: MIDI Time Code Tape Position ............90 + B.4.1. Partial Frames ....................................93 + B.5. System Chapter X: System Exclusive ........................94 + B.5.1. Chapter Format ................................94 + B.5.2. Log Inclusion Semantics .......................96 + B.5.3. TCOUNT and COUNT Fields .......................99 + Appendix C. Session Configuration Tools ....... ..................100 + C.1. Configuration Tools: Stream Subsetting ...................101 + C.2. Configuration Tools: The Journalling System ..............106 + C.2.1. The j_sec Parameter ...............................106 + C.2.2. The j_update Parameter ............................107 + C.2.2.1. The anchor Sending Policy ..................108 + C.2.2.2. The closed-loop Sending Policy .............109 + C.2.2.3. The open-loop Sending Policy ...............113 + C.2.3. Recovery Journal Chapter Inclusion Parameters .....114 + C.3. Configuration Tools: Timestamp Semantics .................119 + C.3.1. The comex Algorithm ...............................120 + C.3.2. The async Algorithm ...............................121 + C.3.3. The buffer Algorithm ..............................122 + C.4. Configuration Tools: Packet Timing Tools .................123 + C.4.1. Packet Duration Tools .............................123 + C.4.2. 
The guardtime Parameter ...........................124 + C.5. Configuration Tools: Stream Description ..................125 + C.6. Configuration Tools: MIDI Rendering ......................131 + C.6.1. The multimode Parameter ...........................132 + C.6.2. Renderer Specification ............................133 + C.6.3. Renderer Initialization ...........................135 + C.6.4. MIDI Channel Mapping ..............................137 + C.6.4.1. The smf_info Parameter .....................138 + C.6.4.2. The smf_inline, smf_url, and smf_cid + Parameters .................................140 + C.6.4.3. The chanmask Parameter .....................140 + C.6.5. The audio/asc Media Type ..........................141 + C.7. Interoperability .........................................143 + + + +Lazzaro & Wawrzynek Standards Track [Page 3] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + C.7.1. MIDI Content-Streaming Applications ...............144 + C.7.2. MIDI Network Musical Performance Applications .....147 + Appendix D. Parameter Syntax Definitions .... ....................153 + Appendix E. A MIDI Overview for Networking Specialists ...........160 + E.1. Commands Types ...........................................162 + E.2. Running Status ...........................................163 + E.3. Command Timing ...........................................163 + E.4. AudioSpecificConfig Templates for MMA Renderers ..........164 + References .......................................................169 + Normative References ..........................................169 + Informative References ........................................170 + +1. Introduction + + This document obsoletes [RFC4695]. + + The Internet Engineering Task Force (IETF) has developed a set of + focused tools for multimedia networking ([RFC3550] [RFC4566] + [RFC3261] [RFC2326]). 
These tools can be combined in different ways + to support a variety of real-time applications over Internet Protocol + (IP) networks. + + For example, a telephony application might use the Session Initiation + Protocol (SIP, [RFC3261]) to set up a phone call. Call setup would + include negotiations to agree on a common audio codec [RFC3264]. + Negotiations would use the Session Description Protocol (SDP, + [RFC4566]) to describe candidate codecs. + + After a call is set up, audio data would flow between the parties + using the Real Time Protocol (RTP, [RFC3550]) under any applicable + profile (for example, the Audio/Visual Profile (AVP, [RFC3551])). + The tools used in this telephony example (SIP, SDP, and RTP) might be + combined in a different way to support a content-streaming + application, perhaps in conjunction with other tools, such as the + Real Time Streaming Protocol (RTSP, [RFC2326]). + + The MIDI (Musical Instrument Digital Interface) command language + [MIDI] is widely used in musical applications that are analogous to + the examples described above. On stage and in the recording studio, + MIDI is used for the interactive remote control of musical + instruments, an application similar in spirit to telephony. On web + pages, Standard MIDI Files (SMFs, [MIDI]) rendered using the General + MIDI standard [MIDI] provide a low-bandwidth substitute for audio + streaming. + + [RFC4695] was motivated by a simple premise: if MIDI performances + could be sent as RTP streams that are managed by IETF session tools, + a hybridization of the MIDI and IETF application domains might occur. + + + +Lazzaro & Wawrzynek Standards Track [Page 4] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + For example, interoperable MIDI networking might foster network music + performance applications, in which a group of musicians located at + different physical locations interact over a network to perform as + they would if they were located in the same room [NMP]. 
As a second + example, the streaming community might begin to use MIDI for low- + bitrate audio coding, perhaps in conjunction with normative sound- + synthesis methods [MPEGSA]. + + Five years after [RFC4695], these applications have not yet reached + the mainstream. However, experiments in academia and industry + continue. This memo, which obsoletes [RFC4695] and fixes minor + errata (see Section 12), has been written in service of these + experiments. + + To enable MIDI applications to use RTP, this memo defines an RTP + payload format and its media type. Sections 2-5 and Appendices A and + B define the RTP payload format. Section 6 and Appendices C and D + define the media types identifying the payload format, the parameters + needed for configuration, and the utilization of the parameters in + SDP. + + Appendix C also includes interoperability guidelines for the example + applications described above: network musical performance using SIP + (Appendix C.7.2) and content streaming using RTSP (Appendix C.7.1). + + Another potential application area for RTP MIDI is MIDI networking + for professional audio equipment and electronic musical instruments. + We do not offer interoperability guidelines for this application in + this memo. However, RTP MIDI has been designed with stage and studio + applications in mind, and we expect that efforts to define a stage + and studio framework will rely on RTP MIDI for MIDI transport + services. + + Some applications may require MIDI media delivery at a certain + service quality level (latency, jitter, packet loss, etc.). RTP + itself does not provide service guarantees. However, applications + may use lower-layer network protocols to configure the quality of the + transport services that RTP uses. These protocols may act to reserve + network resources for RTP flows [RFC2205] or may simply direct RTP + traffic onto a dedicated "media network" in a local installation. 
+ Note that RTP and the MIDI payload format do provide tools that + applications may use to achieve the best possible real-time + performance at a given service level. + + This memo normatively defines the syntax and semantics of the MIDI + payload format. However, this memo does not define algorithms for + sending and receiving packets. An ancillary document [RFC4696] + + + + +Lazzaro & Wawrzynek Standards Track [Page 5] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + provides informative guidance on algorithms. Supplemental + information may be found in related conference publications [NMP] + [GRAME]. + + Throughout this memo, the phrase "native stream" refers to a stream + that uses the rtp-midi media type. The phrase "mpeg4-generic stream" + refers to a stream that uses the mpeg4-generic media type (in mode + rtp-midi) to operate in an MPEG 4 environment [RFC3640]. Section 6 + describes this distinction in detail. + +1.1. Terminology + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in BCP 14, RFC 2119 + [RFC2119]. + +1.2. Bitfield Conventions + + Several bitfield coding idioms are used in this document. As most of + these idioms only appear in Appendices A and B, we define them in + Appendix A.1. + + However, a few of these idioms also appear in the main text of this + document. For convenience, we describe them below: + + o R flag bit. R flag bits are reserved for future use. Senders + MUST set R bits to 0. Receivers MUST ignore R bit values. + + o LENGTH field. All fields named LENGTH (as distinct from LEN) code + the number of octets in the structure that contains it, including + the header it resides in and all hierarchical levels below it. 
If + a structure contains a LENGTH field, a receiver MUST use the + LENGTH field value to advance past the structure during parsing, + rather than use knowledge about the internal format of the + structure. + +2. Packet Format + + In this section, we introduce the format of RTP MIDI packets. The + description includes some background information on RTP for the + benefit of MIDI implementors new to IETF tools. Implementors should + consult [RFC3550] for an authoritative description of RTP. + + This memo assumes that the reader is familiar with MIDI syntax and + semantics. Appendix E provides a MIDI overview, at a level of detail + sufficient to understand most of this memo. Implementors should + consult [MIDI] for an authoritative description of MIDI. + + + +Lazzaro & Wawrzynek Standards Track [Page 6] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + The MIDI payload format maps a MIDI command stream (16 voice channels + + systems) onto an RTP stream. An RTP media stream is a sequence of + logical packets that share a common format. Each packet consists of + two parts: the RTP header and the MIDI payload. Figure 1 shows this + format (vertical space delineates the header and payload). + + We describe RTP packets as "logical" packets to highlight the fact + that RTP itself is not a network-layer protocol. Instead, RTP + packets are mapped onto network protocols (such as unicast UDP, + multicast UDP, or TCP) by an application [ALF]. The interleaved mode + of the Real Time Streaming Protocol (RTSP, [RFC2326]) is an example + of an RTP mapping to TCP transport, as is [RFC4571]. + +2.1. RTP Header + + [RFC3550] provides a complete description of the RTP header fields. + In this section, we clarify the role of a few RTP header fields for + MIDI applications. All fields are coded in network byte order (big- + endian). 
+ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | V |P|X| CC |M| PT | Sequence number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Timestamp | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SSRC | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | MIDI command section ... | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Journal section ... | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 1 -- Packet Format + + The behavior of the 1-bit M field depends on the media type of the + stream. For native streams, the M bit MUST be set to 1 if the MIDI + command section has a non-zero LEN field and MUST be set to 0 + otherwise. For mpeg4-generic streams, the M bit MUST be set to 1 for + all packets (to conform to [RFC3640]). + + In an RTP MIDI stream, the 16-bit sequence number field is + initialized to a randomly chosen value and is incremented by one + (modulo 2^16) for each packet sent in the stream. A related + + + +Lazzaro & Wawrzynek Standards Track [Page 7] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + quantity, the 32-bit extended packet sequence number, may be computed + by tracking rollovers of the 16-bit sequence number. Note that + different receivers of the same stream may compute different extended + packet sequence numbers, depending on when the receiver joined the + session. + + The 32-bit timestamp field sets the base timestamp value for the + packet. The payload codes MIDI command timing relative to this + value. The timestamp units are set by the clock rate parameter. For + example, if the clock rate has a value of 44100 Hz, two packets whose + base timestamp values differ by 2 seconds have RTP timestamp fields + that differ by 88200. 
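As an illustration only (not part of the memo), the clock-rate arithmetic and the extended sequence number tracking described above can be sketched in Python. The names `ticks_between` and `ExtendedSeq` are hypothetical, and the rollover heuristic (treating a forward jump of less than 2^15 as "newer") is a simplifying assumption; [RFC3550] describes a more complete procedure.

```python
def ticks_between(seconds, clock_rate_hz):
    """RTP timestamp difference (modulo 2^32) for an interval in seconds."""
    return round(seconds * clock_rate_hz) % (1 << 32)


class ExtendedSeq:
    """Tracks 16-bit sequence number rollovers seen by one receiver.

    The result depends on when the receiver joined the session, so two
    receivers of the same stream may compute different values.
    """

    def __init__(self):
        self.cycles = 0      # number of 16-bit rollovers observed so far
        self.max_seq = None  # most recent "newest" 16-bit sequence number

    def update(self, seq):
        """Return the 32-bit extended sequence number for this packet."""
        if self.max_seq is None:
            self.max_seq = seq
            return seq
        ahead = (seq - self.max_seq) & 0xFFFF
        if ahead < 0x8000:
            # Packet is at or ahead of the newest one; a numerically
            # smaller value means the 16-bit counter wrapped.
            if seq < self.max_seq:
                self.cycles += 1
            self.max_seq = seq
            cycles = self.cycles
        else:
            # Late (reordered) packet; it may belong to the previous cycle.
            cycles = self.cycles - 1 if seq > self.max_seq else self.cycles
        return ((cycles << 16) | seq) & 0xFFFFFFFF
```

With a 44100 Hz clock, `ticks_between(2, 44100)` gives the 88200-tick difference used in the example above.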
+ + Note that the clock rate parameter is not encoded within each RTP + MIDI packet. A receiver of an RTP MIDI stream becomes aware of the + clock rate as part of the session setup process. For example, if a + session management tool uses the Session Description Protocol (SDP, + [RFC4566]) to describe a media session, the clock rate parameter is + set using the rtpmap attribute. We show examples of session setup in + Section 6. + + For RTP MIDI streams destined to be rendered into audio, the clock + rate SHOULD be an audio sample rate of 32 KHz or higher. This + recommendation is due to the sensitivity of human musical perception + to small timing errors in musical note sequences and due to the + timbral changes that occur when two near-simultaneous MIDI NoteOns + are rendered with a different timing than that desired by the content + author due to clock rate quantization. RTP MIDI streams that are not + destined for audio rendering (such as MIDI streams that control stage + lighting) MAY use a lower clock rate but SHOULD use a clock rate high + enough to avoid timing artifacts in the application. + + For RTP MIDI streams destined to be rendered into audio, the clock + rate SHOULD be chosen from rates in common use in professional audio + applications or in consumer audio distribution. At the time of this + writing, these rates include 32 KHz, 44.1 KHz, 48 KHz, 64 KHz, 88.2 + KHz, 96 KHz, 176.4 KHz, and 192 KHz. If the RTP MIDI session is a + part of a synchronized media session that includes another (non-MIDI) + RTP audio stream with a clock rate of 32 KHz or higher, the RTP MIDI + stream SHOULD use a clock rate that matches the clock rate of the + other audio stream. However, if the RTP MIDI stream is destined to + be rendered into audio, the RTP MIDI stream SHOULD NOT use a clock + rate lower than 32 KHz, even if this second stream has a clock rate + lower than 32 KHz. 
+ + Timestamps of consecutive packets do not necessarily increment at a + fixed rate because RTP MIDI packets are not necessarily sent at a + fixed rate. The degree of packet transmission regularity reflects + + + +Lazzaro & Wawrzynek Standards Track [Page 8] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + the underlying application dynamics. Interactive applications may + vary the packet-sending rate to track the gestural rate of a human + performer, whereas content-streaming applications may send packets at + a fixed rate. + + Therefore, the timestamps for two sequential RTP packets may be + identical, or the second packet may have a timestamp arbitrarily + larger than the first packet (modulo 2^32). Section 3 places + additional restrictions on the RTP timestamps for two sequential RTP + packets, as does the guardtime parameter (Appendix C.4.2). + + We use the term "media time" to denote the temporal duration of the + media coded by an RTP packet. The media time coded by a packet is + computed by subtracting the last command timestamp in the MIDI + command section from the RTP timestamp (modulo 2^32). If the MIDI + list of the MIDI command section of a packet is empty, the media time + coded by the packet is 0 ms. Appendix C.4.1 discusses media time + issues in detail. + + We now define RTP session semantics, in the context of sessions + specified using the Session Description Protocol [RFC4566]. A + session description media line ("m=") specifies an RTP session. An + RTP session has an independent space of 2^32 synchronization sources. + Synchronization source identifiers are coded in the SSRC header field + of RTP session packets. The payload types that may appear in the PT + header field of RTP session packets are listed at the end of the + media line. + + Several RTP MIDI streams may appear in an RTP session. Each stream + is distinguished by a unique SSRC value and has a unique sequence + number and RTP timestamp space. 
Multiple streams in the RTP session + may be sent by a single party. Multiple parties may send streams in + the RTP session. An RTP MIDI stream encodes data for a single MIDI + command name space (16 voice channels + systems). + + Streams in an RTP session may use different payload types or they may + use the same payload type. However, each party may send, at most, + one RTP MIDI stream for each payload type mapped to an RTP MIDI + payload format in an RTP session. Recall that dynamic binding of + payload type numbers in [RFC4566] lets a party map many payload type + numbers to the RTP MIDI payload format; thus, a party may send many + RTP MIDI streams in a single RTP session. Pairs of streams (unicast + or multicast) that communicate between two parties in an RTP session + and that share a payload type have the same association as a MIDI + cable pair that cross-connects two devices in a MIDI 1.0 DIN network. + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 9] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + The RTP session architecture described above is efficient in its use + of network ports, as one RTP session (using a port pair per party) + supports the transport of many MIDI name spaces (16 MIDI channels + + systems). We define tools for grouping and labelling MIDI name + spaces across streams and sessions in Appendix C.5 of this memo. + + The RTP header timestamps for each stream in an RTP session have + separately and randomly chosen initialization values. Receivers use + the timing fields encoded in the RTP Control Protocol (RTCP, + [RFC3550]) sender reports to synchronize the streams sent by a party. + The SSRC values for each stream in an RTP session are also separately + and randomly chosen, as described in [RFC3550]. Receivers use the + CNAME field encoded in RTCP sender reports to verify that streams + were sent by the same party and to detect SSRC collisions, as + described in [RFC3550]. 
+ + In some applications, a receiver renders MIDI commands into audio (or + into control actions, such as the rewind of a tape deck or the + dimming of stage lights). In other applications, a receiver presents + a MIDI stream to software programs via an Application Programming + Interface (API). Appendix C.6 defines session configuration tools to + specify what receivers should do with a MIDI command stream. + + If a multimedia session uses different RTP MIDI streams to send + different classes of media, the streams MUST be sent over different + RTP sessions. For example, if a multimedia session uses one MIDI + stream for audio and a second MIDI stream to control a lighting + system, the audio and lighting streams MUST be sent over different + RTP sessions, each with its own media line. + + Session description tools defined in Appendix C.5 let a sending party + split a single MIDI name space (16 voice channels + systems) over + several RTP MIDI streams. Split transport of a MIDI command stream + is a delicate task because correct command stream reconstruction by a + receiver depends on exact timing synchronization across the streams. + + To support split name spaces, we define the following requirements: + + o A party MUST NOT send several RTP MIDI streams that share a MIDI + name space in the same RTP session. Instead, each stream MUST be + sent from a different RTP session. + + o If several RTP MIDI streams sent by a party share a MIDI name + space, all streams MUST use the same SSRC value and MUST use the + same randomly chosen RTP timestamp initialization value. + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 10] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + These rules let a receiver identify streams that share a MIDI name + space (by matching SSRC values) and also let a receiver accurately + reconstruct the source MIDI command stream (by using RTP timestamps + to interleave commands from the two streams). 
Care MUST be taken by + senders to ensure that SSRC changes due to collisions are reflected + in both streams. Receivers MUST regularly examine the RTCP CNAME + fields associated with the linked streams to ensure that the assumed + link is legitimate and not the result of an SSRC collision by another + sender. + + Except for the special cases described above, a party may send many + RTP MIDI streams in the same session. However, it is sometimes + advantageous for two RTP MIDI streams to be sent over different RTP + sessions. For example, two streams may need different values for RTP + session-level attributes (such as the sendonly and recvonly + attributes). As a second example, two RTP sessions may be needed to + send two unicast streams in a multimedia session that originate on + different computers (with different IP numbers). Two RTP sessions + are needed in this case because transport addresses are specified on + the RTP-session or multimedia-session level, not on a payload type + level. + + On a final note, in some uses of MIDI, parties send bidirectional + traffic to conduct transactions (such as file exchange). These + commands were designed to work over MIDI 1.0 DIN cable networks and + may be configured in a multicast topology, which uses pure "party- + line" signalling. Thus, if a multimedia session ensures a multicast + connection between all parties, bidirectional MIDI commands will work + without additional support from the RTP MIDI payload format. + +2.2. MIDI Payload + + The payload (Figure 1) MUST begin with the MIDI command section. The + MIDI command section codes a (possibly empty) list of timestamped + MIDI commands and provides the essential service of the payload + format. + + The payload MAY also contain a journal section. The journal section + provides resiliency by coding the recent history of the stream. A + flag in the MIDI command section codes the presence of a journal + section in the payload. 
+ + Section 3 defines the MIDI command section. Sections 4 and 5 and + Appendices A and B define the recovery journal, the default format + for the journal section. Here, we describe how these payload + sections operate in a stream in an RTP session. + + + + + +Lazzaro & Wawrzynek Standards Track [Page 11] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + The journalling method for a stream is set at the start of a session + and MUST NOT be changed thereafter. A stream may be set to use the + recovery journal, to use an alternative journal format (none are + defined in this memo), or not to use a journal. + + The default journalling method of a stream is inferred from its + transport type. Streams that use unreliable transport (such as UDP) + default to using the recovery journal. Streams that use reliable + transport (such as TCP) default to not using a journal. Appendix + C.2.1 defines session configuration tools for overriding these + defaults. For all types of transport, a sender MUST transmit an RTP + packet stream with consecutive sequence numbers (modulo 2^16). + + If a stream uses the recovery journal, every payload in the stream + MUST include a journal section. If a stream does not use + journalling, a journal section MUST NOT appear in a stream payload. + If a stream uses an alternative journal format, the specification for + the journal format defines an inclusion policy. + + If a stream is sent over UDP transport, the Maximum Transmission Unit + (MTU) of the underlying network limits the practical size of the + payload section (for example, an Ethernet MTU is 1500 octets) for + applications where predictable and minimal packet transmission + latency is critical. A sender SHOULD NOT create RTP MIDI UDP packets + whose sizes exceed the MTU of the underlying network. Instead, the + sender SHOULD take steps to keep the maximum packet size under the + MTU limit. + + These steps may take many forms. 
The default closed-loop recovery + journal sending policy (defined in Appendix C.2.2.2) uses RTP Control + Protocol (RTCP, [RFC3550]) feedback to manage the RTP MIDI packet + size. In addition, Section 3.2 and Appendix B.5.2 provide specific + tools for managing the size of packets that code MIDI System + Exclusive (0xF0) commands. Appendix C.5 defines session + configuration tools that may be used to split a dense MIDI name space + into several UDP streams (each sent in a different RTP session, per + Section 2.1) so that the payload fits comfortably into an MTU. + Another option is to use TCP. Section 4.3 of [RFC4696] provides non- + normative advice for packet size management. + + + + + + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 12] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + +3. MIDI Command Section + + Figure 2 shows the format of the MIDI command section. + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |B|J|Z|P|LEN... | MIDI list ... | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 2 -- MIDI Command Section + + The MIDI command section begins with a variable-length header. + + The header field LEN codes the number of octets in the MIDI list that + follow the header. If the header flag B is 0, the header is one + octet long, and LEN is a 4-bit field, supporting a maximum MIDI list + length of 15 octets. + + If B is 1, the header is two octets long, and LEN is a 12-bit field, + supporting a maximum MIDI list length of 4095 octets. LEN is coded + in network byte order (big-endian): the 4 bits of LEN that appear in + the first header octet code the most significant 4 bits of the 12-bit + LEN value. + + A LEN value of 0 is legal, and it codes an empty MIDI list. + + If the J header bit is set to 1, a journal section MUST appear after + the MIDI command section in the payload. 
If the J header bit is set + to 0, the payload MUST NOT contain a journal section. + + We define the semantics of the P header bit in Section 3.2. + + If the LEN header field is nonzero, the MIDI list has the structure + shown in Figure 3. + + + + + + + + + + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 13] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Delta Time 0 (1-4 octets long, or 0 octets if Z = 0) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | MIDI Command 0 (1 or more octets long) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Delta Time 1 (1-4 octets long) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | MIDI Command 1 (1 or more octets long) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | ... | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Delta Time N (1-4 octets long) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | MIDI Command N (0 or more octets long) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 3 -- MIDI List Structure + + If the header flag Z is 1, the MIDI list begins with a complete MIDI + command (coded in the MIDI Command 0 field in Figure 3) preceded by a + delta time (coded in the Delta Time 0 field). If Z is 0, the Delta + Time 0 field is not present in the MIDI list, and the command coded + in the MIDI Command 0 field has an implicit delta time of 0. + + The MIDI list structure may also optionally encode a list of N + additional complete MIDI commands, each coded in a MIDI Command K + field. Each additional command MUST be preceded by a Delta Time K + field, which codes the command's delta time. We discuss exceptions + to the "command fields code complete MIDI commands" rule in Section + 3.2. 
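The header rules above (Figure 2) can be illustrated with a minimal, non-normative sketch. The function name and the dictionary keys are our own illustrative choices and are not defined by this memo; the sketch assumes the caller passes the RTP payload starting at the first octet of the MIDI command section.

```python
from typing import Dict


def parse_command_section_header(payload: bytes) -> Dict[str, object]:
    """Parse the B, J, Z, P flags and the LEN field of Figure 2.

    Hypothetical helper: the name and return layout are assumptions.
    """
    if not payload:
        raise ValueError("empty payload")
    first = payload[0]
    b = (first >> 7) & 1          # B: 1 selects the two-octet header
    j = (first >> 6) & 1          # J: 1 means a journal section follows
    z = (first >> 5) & 1          # Z: 1 means Delta Time 0 is present
    p = (first >> 4) & 1          # P: phantom status octet (Section 3.2)
    length = first & 0x0F         # 4-bit LEN, or top 4 bits of 12-bit LEN
    header_size = 1
    if b:
        if len(payload) < 2:
            raise ValueError("truncated two-octet header")
        length = (length << 8) | payload[1]   # big-endian 12-bit LEN
        header_size = 2
    if len(payload) < header_size + length:
        raise ValueError("MIDI list shorter than LEN")
    return {
        "B": b, "J": j, "Z": z, "P": p, "LEN": length,
        "header_size": header_size,
        "midi_list": payload[header_size:header_size + length],
    }
```

For example, a payload that begins with the octet 0x43 carries a one-octet header with J = 1 and LEN = 3, so the three octets after the header form the MIDI list and a journal section follows it.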
+ + The final MIDI command field (i.e., the MIDI Command N field, shown + in Figure 3) in the MIDI list MAY be empty. Moreover, a MIDI list + MAY consist of a single delta time (encoded in the Delta Time 0 + field) without an associated command (which would have been encoded + in the MIDI Command 0 field). These rules enable MIDI coding + features that are explained in Section 3.1. We delay the + explanations because an understanding of RTP MIDI timestamps is + necessary to describe the features. + +3.1. Timestamps + + In this section, we describe how RTP MIDI encodes a timestamp for + each MIDI list command. Command timestamps have the same units as + RTP packet header timestamps (described in Section 2.1 and + [RFC3550]). Recall that RTP timestamps have units of seconds, whose + scaling is set during session configuration (see Section 6.1 and + [RFC4566]). + + + +Lazzaro & Wawrzynek Standards Track [Page 14] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + As shown in Figure 3, the MIDI list encodes time using a compact + delta time format. The RTP MIDI delta time syntax is a modified form + of the MIDI File delta time syntax [MIDI]. RTP MIDI delta times use + 1-4 octet fields to encode 32-bit unsigned integers. Figure 4 shows + the encoded and decoded forms of delta times. Note that delta time + values may be legally encoded in multiple formats; for example, there + are four legal ways to encode the zero delta time (0x00, 0x8000, + 0x808000, 0x80808000). + + RTP MIDI uses delta times to encode a timestamp for each MIDI + command. The timestamp for MIDI Command K is the summation (modulo + 2^32) of the RTP timestamp and decoded delta times 0 through K. This + cumulative coding technique, borrowed from MIDI File delta time + coding, is efficient because it reduces the number of multi-octet + delta times. + + All command timestamps in a packet MUST be less than or equal to the + RTP timestamp of the next packet in the stream (modulo 2^32). 
+ + This restriction ensures that a particular RTP MIDI packet in a + stream is uniquely responsible for encoding time, starting at the + moment after the RTP timestamp encoded in the RTP packet header and + ending at the moment before the final command timestamp encoded in + the MIDI list. The "moment before" and "moment after" qualifiers + acknowledge the "less than or equal" semantics (as opposed to + "strictly less than") in the sentence above this paragraph. + + Note that it is possible to "pad" the end of an RTP MIDI packet with + time that is guaranteed to be void of MIDI commands, by setting the + "Delta Time N" field of the MIDI list to the end of the void time and + by omitting its corresponding "MIDI Command N" field (a syntactic + construction the preamble of Section 3 expressly made legal). + + In addition, it is possible to code an RTP MIDI packet to express + that a period of time in the stream is void of MIDI commands. The + RTP timestamp in the header would code the start of the void time. + The MIDI list of this packet would consist of a "Delta Time 0" field + that coded the end of the void time. No other fields would be + present in the MIDI list (a syntactic construction the preamble of + Section 3 also expressly made legal). + + By default, a command timestamp indicates the execution time for the + command. The difference between two timestamps indicates the time + delay between the execution of the commands. This difference may be + zero, coding simultaneous execution. In this memo, we refer to this + interpretation of timestamps as "comex" (COMmand EXecution) + semantics. We formally define comex semantics in Appendix C.3. + + + + +Lazzaro & Wawrzynek Standards Track [Page 15] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + The comex interpretation of timestamps works well for transcoding a + Standard MIDI File (SMF) into an RTP MIDI stream, as SMFs code a + timestamp for each MIDI command stored in the file. 
To transcode an + SMF that uses metric time markers, use the SMF tempo map (encoded in + the SMF as meta-events) to convert metric SMF timestamp units into + seconds-based RTP timestamp units. + + The comex interpretation also works well for MIDI hardware + controllers that are coding raw sensor data directly onto an RTP MIDI + stream. Note that this controller design is preferable to a design + that converts raw sensor data into a MIDI 1.0 cable command stream + and then transcodes the stream onto an RTP MIDI stream. + + The comex interpretation of timestamps is usually not the best + timestamp interpretation for transcoding a MIDI source that uses + implicit command timing (such as MIDI 1.0 DIN cables) into an RTP + MIDI stream. Appendix C.3 defines alternatives to comex semantics + and describes session configuration tools for selecting the timestamp + interpretation semantics for a stream. + + One-Octet Delta Time: + + Encoded form: 0ddddddd + Decoded form: 00000000 00000000 00000000 0ddddddd + + Two-Octet Delta Time: + + Encoded form: 1ccccccc 0ddddddd + Decoded form: 00000000 00000000 00cccccc cddddddd + + Three-Octet Delta Time: + + Encoded form: 1bbbbbbb 1ccccccc 0ddddddd + Decoded form: 00000000 000bbbbb bbcccccc cddddddd + + Four-Octet Delta Time: + + Encoded form: 1aaaaaaa 1bbbbbbb 1ccccccc 0ddddddd + Decoded form: 0000aaaa aaabbbbb bbcccccc cddddddd + + Figure 4 -- Decoding Delta Time Formats + +3.2. Command Coding + + Each non-empty MIDI Command field in the MIDI list codes one of the + MIDI command types that may legally appear on a MIDI 1.0 DIN cable. + Standard MIDI File meta-events do not fit this definition and MUST + NOT appear in the MIDI list. As a rule, each MIDI Command field + + + +Lazzaro & Wawrzynek Standards Track [Page 16] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + codes a complete command, in the binary command format defined in + [MIDI]. In the remainder of this section, we describe exceptions to + this rule. 
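As a non-normative aside, the delta time formats of Figure 4 and the cumulative timestamp rule of Section 3.1 can be sketched as follows. The function names are hypothetical and are not part of the payload format; the sketch assumes the delta times of a MIDI list have already been located in the packet.

```python
from typing import List, Tuple


def decode_delta_time(buf: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one 1-4 octet delta time (Figure 4).

    Returns (decoded value, number of octets consumed).
    """
    value = 0
    for i in range(4):
        if offset + i >= len(buf):
            raise ValueError("truncated delta time")
        octet = buf[offset + i]
        value = (value << 7) | (octet & 0x7F)   # append 7 payload bits
        if octet & 0x80 == 0:                   # top bit 0 ends the field
            return value, i + 1
    raise ValueError("delta time longer than four octets")


def command_timestamps(rtp_timestamp: int, deltas: List[int]) -> List[int]:
    """Timestamp of command K: the RTP timestamp plus decoded delta
    times 0 through K, modulo 2^32."""
    result = []
    ts = rtp_timestamp
    for delta in deltas:
        ts = (ts + delta) & 0xFFFFFFFF
        result.append(ts)
    return result
```

Under this sketch, the four legal encodings of the zero delta time listed in Section 3.1 all decode to 0 and differ only in the number of octets consumed.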
+ + The first MIDI channel command in the MIDI list MUST include a status + octet. Running status coding, as defined in [MIDI], MAY be used for + all subsequent MIDI channel commands in the list. As in [MIDI], + System Common and System Exclusive messages (0xF0 ... 0xF7) cancel + the running status state, but System Real-Time messages (0xF8 ... + 0xFF) do not affect the running status state. All system commands in + the MIDI list MUST include a status octet. + + As we note above, the first channel command in the MIDI list MUST + include a status octet. However, the corresponding command in the + original MIDI source data stream might not have a status octet (in + this case, the source would be coding the command using running + status). If the status octet of the first channel command in the + MIDI list does not appear in the source data stream, the P (phantom) + header bit MUST be set to 1. In all other cases, the P bit MUST be + set to 0. + + Note that the P bit describes the MIDI source data stream, not the + MIDI list encoding; regardless of the state of the P bit, the MIDI + list MUST include the status octet. + + As receivers MUST be able to decode running status, sender + implementors should feel free to use running status to improve + bandwidth efficiency. However, senders SHOULD NOT introduce timing + jitter into an existing MIDI command stream through an inappropriate + use or removal of running status coding. This warning primarily + applies to senders whose RTP MIDI streams may be transcoded onto a + MIDI 1.0 DIN cable [MIDI] by the receiver: both the timestamps and + the command coding (running status or not) must comply with the + physical restrictions of implicit time coding over a slow serial + line. + + On a MIDI 1.0 DIN cable [MIDI], a System Real-Time command may be + embedded inside of another "host" MIDI command. 
This syntactic + construction is not supported in the payload format: a MIDI Command + field in the MIDI list codes exactly one MIDI command (partially or + completely). + + To encode an embedded System Real-Time command, senders MUST extract + the command from its host and code it in the MIDI list as a separate + command. The host command and System Real-Time command SHOULD appear + in the same MIDI list. The delta time of the System Real-Time + command SHOULD result in a command timestamp that encodes the System + Real-Time command placement in its original embedded position. + + + +Lazzaro & Wawrzynek Standards Track [Page 17] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Two methods are provided for encoding MIDI System Exclusive (SysEx) + commands in the MIDI list. A SysEx command may be encoded in a MIDI + Command field verbatim: a 0xF0 octet, followed by an arbitrary number + of data octets, followed by a 0xF7 octet. + + Alternatively, a SysEx command may be encoded as multiple segments. + The command is divided into two or more SysEx command segments; each + segment is encoded in its own MIDI Command field in the MIDI list. + + The payload format supports segmentation in order to encode SysEx + commands that encode information in the temporal pattern of data + octets. By encoding these commands as a series of segments, each + data octet may be associated with a distinct delta time. + Segmentation also supports the coding of large SysEx commands across + several packets. + + To segment a SysEx command, first partition its data octet list into + two or more sublists. The last sublist MAY be empty (i.e., contain + no octets); all other sublists MUST contain at least one data octet. + To complete the segmentation, add the status octets defined in Figure + 5 to the head and tail of the first, last, and any "middle" sublists. + Figure 6 shows example segmentations of a SysEx command. 
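The segmentation procedure above can be sketched in a few lines. This is a non-normative illustration: the function name and the "sizes" argument (the number of data octets placed in each sublist) are our own assumptions, and the head and tail status octets are those listed in Figure 5.

```python
from typing import List


def segment_sysex(command: bytes, sizes: List[int]) -> List[bytes]:
    """Split a verbatim SysEx command (0xF0 ... 0xF7) into segments.

    sizes[i] is the number of data octets in sublist i. Hypothetical
    helper: the name and signature are illustrative assumptions.
    """
    if len(command) < 2 or command[0] != 0xF0 or command[-1] != 0xF7:
        raise ValueError("not a verbatim SysEx command")
    data = command[1:-1]
    if len(sizes) < 2:
        raise ValueError("a segmentation needs two or more sublists")
    if sum(sizes) != len(data):
        raise ValueError("sizes must cover every data octet exactly once")
    # Only the last sublist may be empty.
    if any(n < 1 for n in sizes[:-1]) or sizes[-1] < 0:
        raise ValueError("only the last sublist may be empty")
    segments = []
    pos = 0
    last = len(sizes) - 1
    for i, n in enumerate(sizes):
        head = 0xF0 if i == 0 else 0xF7      # first sublist keeps 0xF0
        tail = 0xF7 if i == last else 0xF0   # last sublist ends in 0xF7
        segments.append(bytes([head]) + data[pos:pos + n] + bytes([tail]))
        pos += n
    return segments
```

Applied to the original command of Figure 6, the sizes [4, 4] and [2, 2, 4] yield the first two-segment and the three-segment examples of that figure, and eight sublists of one octet followed by an empty last sublist yield the segmentation with the largest number of segments.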
+ + A sender MAY cancel a segmented SysEx command transmission that is in + progress by sending the "cancel" sublist shown in Figure 5. A + "cancel" sublist MAY follow a "first" or "middle" sublist in the + transmission but MUST NOT follow a "last" sublist. The cancel MUST + be empty (thus, 0xF7 0xF4 is the only legal cancel sublist). + + The cancellation feature is needed because Appendix C.1 defines + configuration tools that let session parties exclude certain SysEx + commands in the stream. Senders that transcode a MIDI source onto an + RTP MIDI stream under these constraints have the responsibility of + excluding undesired commands from the RTP MIDI stream. + + The cancellation feature lets a sender start the transmission of a + command before the MIDI source has sent the entire command. If a + sender determines that the command whose transmission is in progress + should not appear on the RTP stream, it cancels the command. Without + a method for cancelling a SysEx command transmission, senders would + be forced to use a high-latency store-and-forward approach to + transcoding SysEx commands onto RTP MIDI packets, in order to + validate each SysEx command before transmission. + + The recommended receiver reaction to a cancellation depends on the + capabilities of the receiver. For example, a sound synthesizer that + is directly parsing RTP MIDI packets and rendering them to audio will + + + + +Lazzaro & Wawrzynek Standards Track [Page 18] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + be aware of the fact that SysEx commands may be cancelled in RTP + MIDI. These receivers SHOULD detect a SysEx cancellation in the MIDI + list and act as if they had never received the SysEx command. + + As a second example, a synthesizer may be receiving MIDI data from an + RTP MIDI stream via a MIDI DIN cable (or a software API emulation of + a MIDI DIN cable). 
In this case, an RTP-MIDI-aware system receives + the RTP MIDI stream and transcodes it onto the MIDI DIN cable (or its + emulation). Upon the receipt of the cancel sublist, the RTP-MIDI- + aware transcoder might have already sent the first part of the SysEx + command on the MIDI DIN cable to the receiver. + + Unfortunately, the MIDI DIN cable protocol cannot directly code + "cancel SysEx in progress" semantics. However, MIDI DIN cable + receivers begin SysEx processing after the complete command arrives. + The receiver checks to see if it recognizes the command (coded in the + first few octets) and then checks to see if the command is the + correct length. Thus, in practice, a transcoder can cancel a SysEx + command by sending an 0xF7 to (prematurely) end the SysEx command -- + the receiver will detect the incorrect command length and discard the + command. + + Appendix C.1 defines configuration tools that may be used to prohibit + SysEx command cancellation. + + The relative ordering of SysEx command segments in a MIDI list must + match the relative ordering of the sublists in the original SysEx + command. By default, commands other than System Real-Time MIDI + commands MUST NOT appear between SysEx command segments (Appendix C.1 + defines configuration tools to change this default to let other + commands types appear between segments). If the command segments of + a SysEx command are placed in the MIDI lists of two or more RTP + packets, the segment ordering rules apply to the concatenation of all + affected MIDI lists. 
+ + ----------------------------------------------------------- + | Sublist Position | Head Status Octet | Tail Status Octet | + |-----------------------------------------------------------| + | first | 0xF0 | 0xF0 | + |-----------------------------------------------------------| + | middle | 0xF7 | 0xF0 | + |-----------------------------------------------------------| + | last | 0xF7 | 0xF7 | + |-----------------------------------------------------------| + | cancel | 0xF7 | 0xF4 | + ----------------------------------------------------------- + + Figure 5 -- Command Segmentation Status Octets + + + +Lazzaro & Wawrzynek Standards Track [Page 19] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + [MIDI] permits 0xF7 octets that are not part of a (0xF0, 0xF7) pair + to appear on a MIDI 1.0 DIN cable. Unpaired 0xF7 octets have no + semantic meaning in MIDI apart from cancelling running status. + + Unpaired 0xF7 octets MUST NOT appear in the MIDI list of the MIDI + Command section. We impose this restriction to avoid interference + with the command segmentation coding defined in Figure 5. + + SysEx commands carried on a MIDI 1.0 DIN cable may use the "dropped + 0xF7" construction [MIDI]. In this coding method, the 0xF7 octet is + dropped from the end of the SysEx command, and the status octet of + the next MIDI command acts both to terminate the SysEx command and + start the next command. To encode this construction in the payload + format, follow these steps: + + o Determine the appropriate delta times for the SysEx command and + the command that follows the SysEx command. + + o Insert the "dropped" 0xF7 octet at the end of the SysEx command to + form the standard SysEx syntax. + + o Code both commands into the MIDI list using the rules above. + + o Replace the 0xF7 octet that terminates the verbatim SysEx encoding + or the last segment of the segmented SysEx encoding with a 0xF5 + octet. 
This substitution informs the receiver of the original + "dropped 0xF7" coding. + + [MIDI] reserves the undefined System Common commands 0xF4 and 0xF5 + and the undefined System Real-Time commands 0xF9 and 0xFD for future + use. By default, undefined commands MUST NOT appear in a MIDI + Command field in the MIDI list, with the exception of the 0xF5 octets + used to code the "dropped 0xF7" construction and the 0xF4 octets used + by SysEx "cancel" sublists. + + During session configuration, a stream may be customized to transport + undefined commands (Appendix C.1). For this case, we now define how + senders encode undefined commands in the MIDI list. + + An undefined System Real-Time command MUST be coded using the System + Real-Time rules. + + If the undefined System Common commands are put to use in a future + version of [MIDI], the command will begin with an 0xF4 or 0xF5 status + octet, followed by an arbitrary number of data octets (i.e., zero or + more data bytes). To encode these commands, senders MUST terminate + the command with an 0xF7 octet and place the modified command into + the MIDI Command field. + + + +Lazzaro & Wawrzynek Standards Track [Page 20] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Unfortunately, non-compliant uses of the undefined System Common + commands may appear in MIDI implementations. To model these + commands, we assume that the command begins with an 0xF4 or 0xF5 + status octet, followed by zero or more data octets, followed by zero + or more trailing 0xF7 status octets. To encode the command, senders + MUST first remove all trailing 0xF7 status octets from the command. + Then, senders MUST terminate the command with an 0xF7 octet and place + the modified command into the MIDI Command field. 
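The two encoding rules above for undefined System Common commands reduce to the same operation, which the following non-normative sketch illustrates. The function name is hypothetical. The sketch applies only to streams that have been configured to transport undefined commands (Appendix C.1), and it assumes that the input holds exactly one command as it appeared in the MIDI source.

```python
def encode_undefined_system_common(command: bytes) -> bytes:
    """Encode an undefined System Common command (0xF4 or 0xF5).

    Removes all trailing 0xF7 status octets, then terminates the
    command with a single 0xF7 octet. Hypothetical helper name.
    """
    if not command or command[0] not in (0xF4, 0xF5):
        raise ValueError("command must start with 0xF4 or 0xF5")
    body = command.rstrip(b"\xf7")           # drop trailing 0xF7 octets
    if any(octet & 0x80 for octet in body[1:]):
        raise ValueError("data octets must have the top bit clear")
    return body + b"\xf7"                    # exactly one terminator
```

A compliant command with no trailing 0xF7 octets and a non-compliant command with several of them therefore produce the same MIDI Command field.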
+ + Note that we include the trailing octets in our model as a cautionary + measure: if such commands appeared in a non-compliant use of an + undefined System Common command, an RTP MIDI encoding of the command + that did not remove trailing octets could be mistaken for an encoding + of the "middle" or "last" sublist of a segmented SysEx command + (Figure 5) under certain packet loss conditions. + + Original SysEx command: + + 0xF0 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0xF7 + + A two-segment segmentation: + + 0xF0 0x01 0x02 0x03 0x04 0xF0 + + 0xF7 0x05 0x06 0x07 0x08 0xF7 + + A different two-segment segmentation: + + 0xF0 0x01 0xF0 + + 0xF7 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0xF7 + + A three-segment segmentation: + + 0xF0 0x01 0x02 0xF0 + + 0xF7 0x03 0x04 0xF0 + + 0xF7 0x05 0x06 0x07 0x08 0xF7 + + The segmentation with the largest number of segments: + + 0xF0 0x01 0xF0 + + 0xF7 0x02 0xF0 + + 0xF7 0x03 0xF0 + + + + +Lazzaro & Wawrzynek Standards Track [Page 21] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + 0xF7 0x04 0xF0 + + 0xF7 0x05 0xF0 + + 0xF7 0x06 0xF0 + + 0xF7 0x07 0xF0 + + 0xF7 0x08 0xF0 + + 0xF7 0xF7 + + + Figure 6 -- Example Segmentations + +4. The Recovery Journal System + + The recovery journal is the default resiliency tool for unreliable + transport. In this section, we normatively define the roles that + senders and receivers play in the recovery journal system. + + MIDI is a fragile code. A single lost command in a MIDI command + stream may produce an artifact in the rendered performance. We + normatively classify rendering artifacts into two categories: + + o Transient artifacts. Transient artifacts produce immediate but + short-term glitches in the performance. For example, a lost + NoteOn (0x9) command produces a transient artifact: one note fails + to play, but the artifact does not extend beyond the end of that + note. + + o Indefinite artifacts. Indefinite artifacts produce long-lasting + errors in the rendered performance. 
For example, a lost NoteOff + (0x8) command may produce an indefinite artifact: the note that + should have been ended by the lost NoteOff command may sustain + indefinitely. As a second example, the loss of a Control Change + (0xB) command for controller number 7 (Channel Volume) may produce + an indefinite artifact: after the loss, all notes on the channel + may play too softly or too loudly. + + The purpose of the recovery journal system is to satisfy the recovery + journal mandate: the MIDI performance rendered from an RTP MIDI + stream sent over unreliable transport MUST NOT contain indefinite + artifacts. + + The recovery journal system does not use packet retransmission to + satisfy this mandate. Instead, each packet includes a special + section called the recovery journal. + + + +Lazzaro & Wawrzynek Standards Track [Page 22] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + The recovery journal codes the history of the stream back to an + earlier packet called the checkpoint packet. The range of coverage + for the journal is called the checkpoint history. The recovery + journal codes the information necessary to recover from the loss of + an arbitrary number of packets in the checkpoint history. Appendix + A.1 normatively defines the checkpoint history. + + When a receiver detects a packet loss, it compares its own knowledge + about the history of the stream with the history information coded in + the recovery journal of the packet that ends the loss event. By + noting the differences in these two versions of the past, a receiver + is able to transform all indefinite artifacts in the rendered + performance into transient artifacts by executing MIDI commands to + repair the stream. + + We now state the normative role for senders in the recovery journal + system. + + Senders prepare a recovery journal for every packet in the stream. + In doing so, senders choose the checkpoint packet identity for the + journal. 
Senders make this choice by applying a sending policy. + Appendix C.2.2 normatively defines three sending policies: "closed- + loop", "open-loop", and "anchor". + + By default, senders MUST use the closed-loop sending policy. If the + session description overrides this default policy by using the + parameter j_update defined in Appendix C.2.2, senders MUST use the + specified policy. + + After choosing the checkpoint packet identity for a packet, the + sender creates the recovery journal. By default, this journal MUST + conform to the normative semantics in Section 5 and Appendices A and + B in this memo. In Appendix C.2.3, we define parameters that modify + the normative semantics for recovery journals. If the session + description uses these parameters, the journal created by the sender + MUST conform to the modified semantics. + + Next, we state the normative role for receivers in the recovery + journal system. + + A receiver MUST detect each RTP sequence number break in a stream. + If the sequence number break is due to a packet loss event (as + defined in [RFC3550]), the receiver MUST repair all indefinite + artifacts in the rendered MIDI performance caused by the loss. If + the sequence number break is due to an out-of-order packet (as + defined in [RFC3550]), the receiver MUST NOT take actions that + introduce indefinite artifacts (ignoring the out-of-order packet is a + safe option). + + + +Lazzaro & Wawrzynek Standards Track [Page 23] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Receivers take special precautions when entering or exiting a + session. A receiver MUST process the first received packet in a + stream as if it were a packet that ends a loss event. Upon exiting a + session, a receiver MUST ensure that the rendered MIDI performance + does not end with indefinite artifacts. + + Receivers are under no obligation to perform indefinite artifact + repairs at the moment a packet arrives. 
A receiver that uses a + playout buffer may choose to wait until the moment of rendering + before processing the recovery journal, as the "lost" packet may be a + late packet that arrives in time to use. + + Next, we state the normative role for the creator of the session + description in the recovery journal system. The sender, the + receivers, and other parties may take part in creating or approving + the session description, depending on the application. + + A session description that specifies the default closed-loop sending + policy and the default recovery journal semantics satisfies the + recovery journal mandate. However, these default behaviors may not + be appropriate for all sessions. If the creators of a session + description use the parameters defined in Appendix C.2 to override + these defaults, the creators MUST ensure that the parameters define a + system that satisfies the recovery journal mandate. + + Finally, we note that this memo does not specify sender or receiver + recovery journal algorithms. Implementations are free to use any + algorithm that conforms to the requirements in this section. The + non-normative [RFC4696] discusses sender and receiver algorithm + design. + +5. Recovery Journal Format + + This section introduces the structure of the recovery journal and + defines the bitfields of recovery journal headers. Appendices A and + B complete the bitfield definition of the recovery journal. + + The recovery journal has a three-level structure: + + o Top-level header. + + o Channel and system journal headers. These headers encode recovery + information for a single voice channel (channel journal) or for + all system commands (system journal). + + o Chapters. Chapters describe recovery information for a single + MIDI command type. + + + + +Lazzaro & Wawrzynek Standards Track [Page 24] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Figure 7 shows the top-level structure of the recovery journal. 
The + recovery journal consists of a 3-octet header followed by an optional + system journal (labeled S-journal in Figure 7) and an optional list + of channel journals. Figure 8 shows the recovery journal header + format. + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Recovery journal header | S-journal ... | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Channel journals ... | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 7 -- Top-Level Recovery Journal Format + + 0 1 2 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |S|Y|A|H|TOTCHAN| Checkpoint Packet Seqnum | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 8 -- Recovery Journal Header + + If the Y header bit is set to 1, the system journal appears in the + recovery journal, directly following the recovery journal header. + + If the A header bit is set to 1, the recovery journal ends with a + list of (TOTCHAN + 1) channel journals (the 4-bit TOTCHAN header + field is interpreted as an unsigned integer). + + A MIDI channel MAY be represented by (at most) one channel journal in + a recovery journal. Channel journals MUST appear in the recovery + journal in ascending channel-number order. + + If A and Y are both zero, the recovery journal only contains its + 3-octet header and is considered to be an "empty" journal. + + The S (single-packet loss) bit appears in most recovery journal + structures, including the recovery journal header. The S bit helps + receivers efficiently parse the recovery journal in the common case + of the loss of a single packet. Appendix A.1 defines S-bit + semantics. + + The H bit indicates if MIDI channels in the stream have been + configured to use the enhanced Chapter C encoding (Appendix A.3.3). 
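As a non-normative illustration, the recovery journal header of Figure 8 may be unpacked as shown below. The function name and the dictionary keys are our own assumptions; the sketch only reads the 3-octet header and does not parse the system journal or the channel journals that may follow it.

```python
from typing import Dict


def parse_recovery_journal_header(journal: bytes) -> Dict[str, object]:
    """Unpack the 3-octet recovery journal header (Figure 8).

    Hypothetical helper: the name and return layout are assumptions.
    """
    if len(journal) < 3:
        raise ValueError("recovery journal header is 3 octets long")
    first = journal[0]
    s = (first >> 7) & 1
    y = (first >> 6) & 1          # 1: a system journal follows the header
    a = (first >> 5) & 1          # 1: channel journals end the journal
    h = (first >> 4) & 1
    totchan = first & 0x0F        # 4-bit unsigned integer
    return {
        "S": s, "Y": y, "A": a, "H": h, "TOTCHAN": totchan,
        # (TOTCHAN + 1) channel journals are present only if A is 1.
        "channel_journals": totchan + 1 if a else 0,
        # 16-bit sequence number in network byte order.
        "checkpoint_seqnum": int.from_bytes(journal[1:3], "big"),
        # A journal with A = 0 and Y = 0 holds only its header.
        "empty": a == 0 and y == 0,
    }
```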
+ + + + + +Lazzaro & Wawrzynek Standards Track [Page 25] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + By default, the payload format does not use enhanced Chapter C + encoding. In this default case, the H bit MUST be set to 0 for all + packets in the stream. + + If the stream has been configured so that controller numbers for one + or more MIDI channels use enhanced Chapter C encoding, the H bit MUST + be set to 1 in all packets in the stream. In Appendix C.2.3, we show + how to configure a stream to use enhanced Chapter C encoding. + + The 16-bit Checkpoint Packet Seqnum header field codes the sequence + number of the checkpoint packet for this journal, in network byte + order (big-endian). The choice of the checkpoint packet sets the + depth of the checkpoint history for the journal (defined in Appendix + A.1). + + Receivers may use the Checkpoint Packet Seqnum field of the packet + that ends a loss event to verify that the journal checkpoint history + covers the entire loss event. The checkpoint history covers the loss + event if the Checkpoint Packet Seqnum field is less than or equal to + one plus the highest RTP sequence number previously received on the + stream (modulo 2^16). + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |S| CHAN |H| LENGTH |P|C|M|W|N|E|T|A| Chapters ... | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 9 -- Channel Journal Format + + Figure 9 shows the structure of a channel journal: a 3-octet header + followed by a list of leaf elements called channel chapters. A + channel journal encodes information about MIDI commands on the MIDI + channel coded by the 4-bit CHAN header field. Note that CHAN uses + the same bit encoding as the channel nibble in MIDI Channel Messages + (the cccc field in Figure E.1 of Appendix E). + + The 10-bit LENGTH field codes the length of the channel journal. 
The + semantics for LENGTH fields are uniform throughout the recovery + journal and are defined in Appendix A.1. + + The third octet of the channel journal header is the Table of + Contents (TOC) of the channel journal. The TOC is a set of bits that + encode the presence of a chapter in the journal. Each chapter + contains information about a certain class of MIDI channel command: + + o Chapter P: MIDI Program Change (0xC) + o Chapter C: MIDI Control Change (0xB) + + + +Lazzaro & Wawrzynek Standards Track [Page 26] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + o Chapter M: MIDI Parameter System (part of 0xB) + o Chapter W: MIDI Pitch Wheel (0xE) + o Chapter N: MIDI NoteOff (0x8), NoteOn (0x9) + o Chapter E: MIDI Note Command Extras (0x8, 0x9) + o Chapter T: MIDI Channel Aftertouch (0xD) + o Chapter A: MIDI Poly Aftertouch (0xA) + + Chapters appear in a list following the header, in order of their + appearance in the TOC. Appendices A.2-A.9 describe the bitfield + format for each chapter and define the conditions under which a + chapter type MUST appear in the recovery journal. If any chapter + types are required for a channel, an associated channel journal MUST + appear in the recovery journal. + + The H bit indicates if controller numbers on a MIDI channel have been + configured to use the enhanced Chapter C encoding (Appendix A.3.3). + + By default, controller numbers on a MIDI channel do not use enhanced + Chapter C encoding. In this default case, the H bit MUST be set to 0 + for all channel journal headers for the channel in the recovery + journal, for all packets in the stream. + + However, if at least one controller number for a MIDI channel has + been configured to use the enhanced Chapter C encoding, the H bit for + its channel journal MUST be set to 1, for all packets in the stream. + + In Appendix C.2.3, we show how to configure a controller number to + use enhanced Chapter C encoding. 
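The channel journal header of Figure 9 can be unpacked with a short, non-normative sketch. The function name and the dictionary keys are hypothetical. The sketch reports the LENGTH field as a raw 10-bit value; its interpretation is defined in Appendix A.1 and is not modeled here.

```python
from typing import Dict

# TOC bit order in the third header octet, most significant bit first.
CHAPTER_ORDER = "PCMWNETA"


def parse_channel_journal_header(header: bytes) -> Dict[str, object]:
    """Unpack the 3-octet channel journal header (Figure 9).

    Hypothetical helper: the name and return layout are assumptions.
    """
    if len(header) < 3:
        raise ValueError("channel journal header is 3 octets long")
    word = int.from_bytes(header[:3], "big")   # 24 bits, big-endian
    toc = word & 0xFF
    return {
        "S": (word >> 23) & 1,
        "CHAN": (word >> 19) & 0x0F,           # 4-bit channel number
        "H": (word >> 18) & 1,
        "LENGTH": (word >> 8) & 0x3FF,         # raw 10-bit field
        # Chapters follow the header in the order of their TOC bits.
        "chapters": [name for i, name in enumerate(CHAPTER_ORDER)
                     if toc & (0x80 >> i)],
    }
```

For example, the header octets 0x10 0x05 0x88 describe a journal for CHAN = 2 with LENGTH = 5 that contains Chapter P followed by Chapter N.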
+ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |S|D|V|Q|F|X| LENGTH | System chapters ... | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 10 -- System Journal Format + + Figure 10 shows the structure of the system journal: a 2-octet header + followed by a list of system chapters. Each chapter codes + information about a specific class of MIDI system commands: + + o Chapter D: Song Select (0xF3), Tune Request (0xF6), Reset (0xFF), + undefined system commands (0xF4, 0xF5, 0xF9, 0xFD) + o Chapter V: Active Sense (0xFE) + o Chapter Q: Sequencer State (0xF2, 0xF8, 0xF9, 0xFA, 0xFB, 0xFC) + o Chapter F: MIDI Time Code (MTC) Tape Position (0xF1, 0xF0 0x7F + 0xcc 0x01 0x01) + o Chapter X: System Exclusive (all other 0xF0) + + + +Lazzaro & Wawrzynek Standards Track [Page 27] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + The 10-bit LENGTH field codes the size of the system journal and + conforms to semantics described in Appendix A.1. + + The D, V, Q, F, and X header bits form a Table of Contents (TOC) for + the system journal. A TOC bit that is set to 1 codes the presence of + a chapter in the journal. Chapters appear in a list following the + header, in the order of their appearance in the TOC. + + Appendix B describes the bitfield format for the system chapters and + defines the conditions under which a chapter type MUST appear in the + recovery journal. If any system chapter type is required to appear + in the recovery journal, the system journal MUST appear in the + recovery journal. + +6. Session Description Protocol + + RTP does not perform session management. Instead, RTP works together + with session management tools, such as the Session Initiation + Protocol (SIP, [RFC3261]) and the Real Time Streaming Protocol (RTSP, + [RFC2326]). 
+ + RTP payload formats define media type parameters for use in session + management (for example, this memo defines rtp-midi as the media type + for native RTP MIDI streams). + + In most cases, session management tools use the media type parameters + via another standard, the Session Description Protocol (SDP, + [RFC4566]). + + SDP is a textual format for specifying session descriptions. Session + descriptions specify the network transport and media encoding for RTP + sessions. Session management tools coordinate the exchange of + session descriptions between participants ("parties"). + + Some session management tools use SDP to negotiate details of media + transport (network addresses, ports, etc.). We refer to this use of + SDP as "negotiated usage". One example of negotiated usage is the + Offer/Answer protocol ([RFC3264] and Appendix C.7.2 in this memo) as + used by SIP. + + Other session management tools use SDP to declare the media encoding + for the session but use other techniques to negotiate network + transport. We refer to this use of SDP as "declarative usage". One + example of declarative usage is RTSP ([RFC2326] and Appendix C.7.1 in + this memo). + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 28] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Below, we show session description examples for native (Section 6.1) + and mpeg4-generic (Section 6.2) streams. In Section 6.3, we + introduce session configuration tools that may be used to customize + streams. + +6.1. Session Descriptions for Native Streams + + The session description below defines a unicast UDP RTP session (via + a media ("m=") line) whose sole payload type (96) is mapped to a + minimal native RTP MIDI stream. + + v=0 + o=lazzaro 2520644554 2838152170 IN IP4 first.example.net + s=Example + t=0 0 + m=audio 5004 RTP/AVP 96 + c=IN IP4 192.0.2.94 + a=rtpmap:96 rtp-midi/44100 + + The rtpmap attribute line uses the rtp-midi media type to specify an + RTP MIDI native stream. 
The clock rate specified on the rtpmap line + (in the example above, 44100 Hz) sets the scaling for the RTP + timestamp header field (see Section 2.1 and also [RFC3550]). + + Note that this document does not specify a default clock rate value + for RTP MIDI. When RTP MIDI is used with SDP, parties MUST use the + rtpmap line to communicate the clock rate. Guidance for selecting + the RTP MIDI clock rate value appears in Section 2.1. + + We consider the RTP MIDI stream shown above to be "minimal" because + the session description does not customize the stream with + parameters. Without such customization, a native RTP MIDI stream has + these characteristics: + + 1. If the stream uses unreliable transport (unicast UDP, multicast + UDP, etc.), the recovery journal system is in use, and the RTP + payload contains both the MIDI command section and the journal + section. If the stream uses reliable transport (such as TCP), + the stream does not use journalling, and the payload contains + only the MIDI command section (Section 2.2). + + 2. If the stream uses the recovery journal system, the recovery + journal system uses the default sending policy and the default + journal semantics (Section 4). + + 3. In the MIDI command section of the payload, command timestamps + use the default comex semantics (Section 3). + + + + +Lazzaro & Wawrzynek Standards Track [Page 29] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + 4. The recommended temporal duration ("media time") of an RTP packet + ranges from 0 to 200 ms, and the RTP timestamp difference between + sequential packets in the stream may be arbitrarily large + (Section 2.1). + + 5. If more than one minimal rtp-midi stream appears in a session, + the MIDI name spaces for these streams are independent: channel 1 + in the first stream does not reference the same MIDI channel as + channel 1 in the second stream (see Appendix C.5 for a discussion + of the independence of minimal rtp-midi streams). + + 6. 
The rendering method for the stream is not specified. What the + receiver "does" with a minimal native MIDI stream is out of the + scope of this memo. For example, in content creation + environments, a user may manually configure client software to + render the stream with a specific software package. + + As is standard in RTP, RTP sessions managed by SIP are sendrecv by + default (parties send and receive MIDI), and RTP sessions managed by + RTSP are recvonly by default (server sends and client receives). + + In sendrecv RTP MIDI sessions for the session description shown + above, the 16 voice channel + systems MIDI name space is unique for + each sender. Thus, in a two-party session, the voice channel 0 sent + by one party is distinct from the voice channel 0 sent by the other + party. + + This behavior corresponds to what occurs when two MIDI 1.0 DIN + devices are cross-connected with two MIDI cables (one cable routing + MIDI Out from the first device into MIDI In of the second device and + a second cable routing MIDI In from the first device into MIDI Out of + the second device). We define this "association" formally in Section + 2.1. + + MIDI 1.0 DIN networks may be configured in a "party-line" multicast + topology. For these networks, the MIDI protocol itself provides + tools for addressing specific devices in transactions on a multicast + network and for device discovery. Thus, apart from providing a 1-to- + many forward path and a many-to-1 reverse path, IETF protocols do not + need to provide any special support for MIDI multicast networking. + +6.2. Session Descriptions for mpeg4-generic Streams + + An mpeg4-generic [RFC3640] RTP MIDI stream uses an MPEG 4 Audio + Object Type to render MIDI into audio. Three Audio Object Types + accept MIDI input: + + + + + +Lazzaro & Wawrzynek Standards Track [Page 30] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + o General MIDI (Audio Object Type ID 15), based on the General MIDI + rendering standard [MIDI]. 
+ + o Wavetable Synthesis (Audio Object Type ID 14), based on the + Downloadable Sounds Level 2 (DLS 2) rendering standard [DLS2]. + + o Main Synthetic (Audio Object Type ID 13), based on Structured + Audio and the programming language SAOL [MPEGSA]. The name of the + language (SAOL) is an acronym that expands to Structured Audio + Orchestra Language. + + The primary service of an mpeg4-generic stream is to code Access + Units (AUs). We define the mpeg4-generic RTP MIDI AU as the MIDI + payload shown in Figure 1 of Section 2.1 of this memo: a MIDI command + section optionally followed by a journal section. + + Exactly one RTP MIDI AU MUST be mapped to one mpeg4-generic RTP MIDI + packet. The mpeg4-generic options for placing several AUs in an RTP + packet MUST NOT be used with RTP MIDI. The mpeg4-generic options for + fragmenting and interleaving AUs MUST NOT be used with RTP MIDI. The + mpeg4-generic RTP packet payload (Figure 1 in [RFC3640]) MUST contain + empty AU Header and Auxiliary sections. These rules yield + mpeg4-generic packets that are structurally identical to native RTP + MIDI packets, an essential property for the correct operation of the + payload format. + + The session description that follows defines a unicast UDP RTP + session (via a media ("m=") line) whose sole payload type (96) is + mapped to a minimal mpeg4-generic RTP MIDI stream. This example uses + the General MIDI Audio Object Type under Synthesis Profile @ Level 2. + + v=0 + o=lazzaro 2520644554 2838152170 IN IP6 first.example.net + s=Example + t=0 0 + m=audio 5004 RTP/AVP 96 + c=IN IP6 2001:DB8::7F2E:172A:1E24 + a=rtpmap:96 mpeg4-generic/44100 + a=fmtp:96 streamtype=5; mode=rtp-midi; profile-level-id=12; + config=7A0A0000001A4D546864000000060000000100604D54726B0000 + 000600FF2F000 + + (The a=fmtp line has been wrapped to fit the page to accommodate memo + formatting restrictions; it comprises a single line in SDP.)
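The rtpmap and fmtp lines of the session description above can be split into their fields with plain string handling. The sketch below is a minimal Python illustration under stated assumptions, not a full SDP parser: the function names parse_rtpmap, parse_fmtp, and audio_object_type are hypothetical, the fmtp line is assumed to be already joined into a single line, and only the first octet of the config value is decoded. Reading the Audio Object Type from the first 5 bits of the config value follows the rule this memo gives for AudioSpecificConfig.

```python
def parse_rtpmap(line: str):
    """Parse 'a=rtpmap:<pt> <encoding>/<clock rate>'."""
    value = line.split(":", 1)[1]
    pt, encoding = value.split(" ", 1)
    name, rate = encoding.split("/")[:2]
    return int(pt), name, int(rate)


def parse_fmtp(line: str):
    """Parse 'a=fmtp:<pt> key=value; key=value; ...' into a dict."""
    value = line.split(":", 1)[1]
    pt, params = value.split(" ", 1)
    out = {}
    for item in params.split(";"):
        item = item.strip()
        if item:
            key, _, val = item.partition("=")
            out[key.strip()] = val.strip()
    return int(pt), out


def audio_object_type(config_hex: str) -> int:
    """Return the first 5 bits of the config value as an unsigned integer."""
    return int(config_hex[:2], 16) >> 3


# The wrapped config value from the example, joined back into one string.
CONFIG = ("7A0A0000001A4D546864000000060000000100604D54726B0000"
          "000600FF2F000")
FMTP = ("a=fmtp:96 streamtype=5; mode=rtp-midi; profile-level-id=12; "
        "config=" + CONFIG)

pt, name, rate = parse_rtpmap("a=rtpmap:96 mpeg4-generic/44100")
fmtp_pt, params = parse_fmtp(FMTP)
aot = audio_object_type(params["config"])
```

With the example values, rate is 44100, params["mode"] is "rtp-midi", and aot is 15 (General MIDI), since 0x7A is binary 01111010 and its top 5 bits are 01111.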
+ + The fmtp attribute line codes the four parameters (streamtype, mode, + profile-level-id, and config) that are required in all mpeg4-generic + session descriptions [RFC3640]. For RTP MIDI streams, the streamtype + + + +Lazzaro & Wawrzynek Standards Track [Page 31] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + parameter MUST be set to 5, the mode parameter MUST be set to rtp- + midi, and the profile-level-id parameter MUST be set to the MPEG-4 + Profile Level for the stream. For the Synthesis Profile, legal + profile-level-id values are 11, 12, and 13, coding low (11), medium + (12), or high (13) decoder computational complexity, as defined by + MPEG conformance tests. + + In a minimal RTP MIDI session description, the config value MUST be a + hexadecimal encoding [RFC3640] of the AudioSpecificConfig data block + [MPEGAUDIO] for the stream. AudioSpecificConfig encodes the Audio + Object Type for the stream and also encodes initialization data (SAOL + programs, DLS 2 wave tables, etc.). Standard MIDI Files encoded in + AudioSpecificConfig in a minimal session description MUST be ignored + by the receiver. + + Receivers determine the rendering algorithm for the session by + interpreting the first 5 bits of AudioSpecificConfig as an unsigned + integer that codes the Audio Object Type. In our example above, the + 5 bits are coded within the first two nibbles ("7A") of the config + string. The Audio Object Type coded within "7A" is Audio Object Type + 15 (General MIDI). In Appendix E.4, we derive the config string + value in the session description shown above; the starting point of + the derivation is the MPEG bitstreams defined in [MPEGSA] and + [MPEGAUDIO]. + + We consider the stream to be "minimal" because the session + description does not customize the stream through the use of + parameters, other than the 4 required mpeg4-generic parameters + described above. 
In Section 6.1, we describe the behavior of a + minimal native stream as a numbered list of characteristics. Items + 1-4 on that list also describe the minimal mpeg4-generic stream, but + items 5 and 6 require restatements, as listed below: + + 5. If more than one minimal mpeg4-generic stream appears in a + session, each stream uses an independent instance of the Audio + Object Type coded in the config parameter value. + + 6. A minimal mpeg4-generic stream encodes the AudioSpecificConfig as + an inline hexadecimal constant. If a session description is sent + over UDP, it may be impossible to transport large + AudioSpecificConfig blocks within the Maximum Transmission Unit + (MTU) of the underlying network (for Ethernet, the MTU is 1500 + octets). In some cases, the AudioSpecificConfig block may exceed + the maximum size of the UDP packet itself. + + The comments in Section 6.1 on SIP and RTSP stream directional + defaults, sendrecv MIDI channel usage, and MIDI 1.0 DIN multicast + networks also apply to mpeg4-generic RTP MIDI sessions. + + + +Lazzaro & Wawrzynek Standards Track [Page 32] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + In sendrecv sessions, each party's session description MUST use + identical values for the mpeg4-generic parameters (including the + required streamtype, mode, profile-level-id, and config parameters). + As a consequence, each party uses an identically configured MPEG 4 + Audio Object Type to render MIDI commands into audio. The preamble + to Appendix C discusses a way to create "virtual sendrecv" sessions + that do not have this restriction. + +6.3. Parameters + + This section introduces parameters for session configuration for RTP + MIDI streams. In session descriptions, parameters modify the + semantics of a payload type. Parameters are specified on an fmtp + attribute line. See the session description example in Section 6.2 + for an example of a fmtp attribute line. 
+ + The parameters add features to the minimal streams described in + Sections 6.1 and 6.2 and support several types of services: + + o Stream subsetting. By default, all MIDI commands that are legal + to appear on a MIDI 1.0 DIN cable may appear in an RTP MIDI + stream. The cm_unused parameter overrides this default by + prohibiting certain commands from appearing in the stream. The + cm_used parameter is used in conjunction with cm_unused to + simplify the specification of complex exclusion rules. We + describe cm_unused and cm_used in Appendix C.1. + + o Journal customization. The j_sec and j_update parameters + configure the use of the journal section. The ch_default, + ch_never, and ch_anchor parameters configure the semantics of the + recovery journal chapters. These parameters are described in + Appendix C.2 and override the default stream behaviors 1 and 2 + (listed in Section 6.1 and referenced in Section 6.2). + + o MIDI command timestamp semantics. The tsmode, octpos, mperiod, + and linerate parameters customize the semantics of timestamps in + the MIDI command section. These parameters let RTP MIDI + accurately encode the implicit time coding of MIDI 1.0 DIN cables. + These parameters are described in Appendix C.3 and override + default stream behavior 3 (listed in Section 6.1 and referenced in + Section 6.2). + + o Media time. The rtp_ptime and rtp_maxptime parameters define the + temporal duration ("media time") of an RTP MIDI packet. The + guardtime parameter sets the minimum sending rate of stream + packets. These parameters are described in Appendix C.4 and + override default stream behavior 4 (listed in Section 6.1 and + referenced in Section 6.2). + + + +Lazzaro & Wawrzynek Standards Track [Page 33] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + o Stream description. The musicport parameter labels the MIDI name + space of RTP streams in a multimedia session. Musicport is + described in Appendix C.5. 
The musicport parameter overrides + default stream behavior 5 (in Sections 6.1 and 6.2). + + o MIDI rendering. Several parameters specify the MIDI rendering + method of a stream. These parameters are described in Appendix + C.6 and override default stream behavior 6 (in Sections 6.1 and + 6.2). + + In Appendix C.7, we specify interoperability guidelines for two RTP + MIDI application areas: content streaming using RTSP (Appendix C.7.1) + and network musical performance using SIP (Appendix C.7.2). + +7. Extensibility + + The payload format defined in this memo exclusively encodes all + commands that may legally appear on a MIDI 1.0 DIN cable. + + Many worthy uses of MIDI over RTP do not fall within the narrow scope + of the payload format. For example, the payload format does not + support the direct transport of Standard MIDI File (SMF) meta-event + and metric timing data. As a second example, the payload format does + not define transport tools for user-defined commands (apart from + tools to support System Exclusive commands [MIDI]). + + The payload format does not provide an extension mechanism to support + new features of this nature, by design. Instead, we encourage the + development of new payload formats for specialized musical + applications. The IETF session management tools [RFC3264] [RFC2326] + support codec negotiation, to facilitate the use of new payload + formats in a backward-compatible way. + + However, the payload format does provide several extensibility tools, + which we list below: + + o Journalling. As described in Appendix C.2, new token values for + the j_sec and j_update parameters may be defined in IETF + Standards-Track documents. This mechanism supports the design of + new journal formats and the definition of new journal sending + policies. + + o Rendering. The payload format may be extended to support new MIDI + renderers (Appendix C.6.2). 
Certain general aspects of the RTP + MIDI rendering process may also be extended, via the definition of + new token values for the render (Appendix C.6) and smf_info + (Appendix C.6.4.1) parameters. + + + + +Lazzaro & Wawrzynek Standards Track [Page 34] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + o Undefined commands. [MIDI] reserves 4 MIDI system commands for + future use (0xF4, 0xF5, 0xF9, 0xFD). If updates to [MIDI] define + the reserved commands, IETF Standards-Track documents may be + defined to provide resiliency support for the commands. Opaque + LEGAL fields appear in System Chapter D for this purpose (Appendix + B.1.1). + + A final form of extensibility involves the inclusion of the payload + format in framework documents. Framework documents describe how to + combine protocols to form a platform for interoperable applications. + For example, a stage and studio framework might define how to use SIP + [RFC3261], RTSP [RFC2326], SDP [RFC4566], and RTP [RFC3550] to + support media networking for professional audio equipment and + electronic musical instruments. + +8. Congestion Control + + The RTP congestion control requirements defined in [RFC3550] apply to + RTP MIDI sessions, and implementors should carefully read the + congestion control section in [RFC3550]. As noted in [RFC3550], all + transport protocols used on the Internet need to address congestion + control in some way, and RTP is not an exception. + + In addition, the congestion control requirements defined in [RFC3551] + apply to RTP MIDI sessions run under applicable profiles. The basic + congestion control requirement defined in [RFC3551] is that RTP + sessions that use UDP transport should monitor packet loss (via RTCP + or other means) to ensure that the RTP stream competes fairly with + TCP flows that share the network. + + Finally, RTP MIDI has congestion control issues that are unique for + an audio RTP payload format. 
In applications such as network musical + performance [NMP], the packet rate is linked to the gestural rate of + a human performer. Senders MUST monitor the MIDI command source for + patterns that result in excessive packet rates and take actions + during RTP transcoding to reduce the RTP packet rate. [RFC4696] + offers implementation guidance on this issue. + +9. Security Considerations + + Implementors should carefully read the Security Considerations + sections of the RTP [RFC3550], AVP [RFC3551], and other RTP profile + documents, as the issues discussed in these sections directly apply + to RTP MIDI streams. Implementors should also review the Secure + Real-time Transport Protocol (SRTP, [RFC3711]), an RTP profile that + addresses the security issues discussed in [RFC3550] and [RFC3551]. + + + + + +Lazzaro & Wawrzynek Standards Track [Page 35] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Here, we discuss security issues that are unique to the RTP MIDI + payload format. + + When using RTP MIDI, authentication of incoming RTP and RTCP packets + is RECOMMENDED. Per-packet authentication may be provided by SRTP or + by other means. Without the use of authentication, attackers could + forge MIDI commands into an ongoing stream, damaging speakers and + eardrums. An attacker could also craft RTP and RTCP packets to + exploit known bugs in the client and take effective control of a + client machine. + + Session management tools (such as SIP [RFC3261]) SHOULD use + authentication during the transport of all session descriptions + containing RTP MIDI media streams. For SIP, the Security + Considerations section in [RFC3261] provides an overview of possible + authentication mechanisms. RTP MIDI session descriptions should use + authentication because the session descriptions may code + initialization data using the parameters described in Appendix C. 
If + an attacker inserts bogus initialization data into a session + description, the attacker can corrupt the session or forge a client attack. + + Session descriptions may also code renderer initialization data by + reference, via the url (Appendix C.6.3) and smf_url (Appendix + C.6.4.2) parameters. If the coded URL is spoofed, both session and + client are open to attack, even if the session description itself is + authenticated. Therefore, URLs specified in url and smf_url + parameters SHOULD use HTTP over TLS [RFC2818]. + + Section 2.1 allows streams sent by a party in two RTP sessions to + have the same SSRC value and the same RTP timestamp initialization + value, under certain circumstances. Normally, these values are + randomly chosen for each stream in a session, to make plaintext + guessing harder to do if the payloads are encrypted. Thus, Section + 2.1 weakens this aspect of RTP security. + +10. Acknowledgements + + We thank the networking, media compression, and computer music + community members who have commented or contributed to the effort, + including Kurt B, Cynthia Bruyns, Steve Casner, Paul Davis, Robin + Davies, Joanne Dow, Tobias Erichsen, Roni Even, Nicolas Falquet, + Adrian Farrel, Dominique Fober, Philippe Gentric, Michael Godfrey, + Chris Grigg, Todd Hager, Alfred Hoenes, Russ Housley, Michel Jullian, + Phil Kerr, Young-Kwon Lim, Jessica Little, Jan van der Meer, Alexey + Melnikov, Colin Perkins, Charlie Richmond, Herbie Robinson, Dan + Romascanu, Larry Rowe, Eric Scheirer, Dave Singer, Martijn Sipkema, + Robert Sparks, William Stewart, Kent Terry, Sean Turner, Magnus + Westerlund, Tom White, Jim Wright, Doug Wyatt, and Giorgio Zoia.
We + + + +Lazzaro & Wawrzynek Standards Track [Page 36] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + also thank the members of the San Francisco Bay Area music and audio + community for creating the context for the work, including Don + Buchla, Chris Chafe, Richard Duda, Dan Ellis, Adrian Freed, Ben Gold, + Jaron Lanier, Roger Linn, Richard Lyon, Dana Massie, Max Mathews, + Keith McMillen, Carver Mead, Nelson Morgan, Tom Oberheim, Malcolm + Slaney, Dave Smith, Julius Smith, David Wessel, and Matt Wright. + +11. IANA Considerations + + The bulk of this section is a verbatim reproduction of the IANA + considerations that appear in Section 11 of [RFC4695]. Preceding + this reproduction, we list several issues concerning this memo that + are related to the IANA considerations, as follows: + + o All existing IANA references to [RFC4695] have been deleted, and + replaced with references to this memo. In addition, a reference + to this memo has been added to the audio/mpeg4-generic MIME type + registration. + + o In Section 11.3, a sentence has been added to the Encoding + Considerations asc Media Type Registration: "Disk files that store + this data object use the file extension ".acn"". + + The reproduction of the [RFC4695] IANA considerations section appears + directly below. + + This section makes a series of requests to IANA. The IANA has + completed registration/assignments of the below requests. + + The subsections that follow hold the actual, detailed requests. All + registrations in this section are in the IETF tree and follow the + rules of [RFC4288] and [RFC4855], as appropriate. + + In Section 11.1, we request the registration of a new media type: + audio/rtp-midi. Paired with this request is a request for a + repository for new values for several parameters associated with + audio/rtp-midi. We request this repository in Section 11.1.1. 
+ + In Section 11.2, we request the registration of a new value (rtp- + midi) for the mode parameter of the mpeg4-generic media type. The + mpeg4-generic media type is defined in [RFC3640], and [RFC3640] + defines a repository for the mode parameter. However, we believe we + are the first to request the registration of a mode value, so we + believe the registry for mode has not yet been created by IANA. + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 37] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Paired with our mode parameter value request for mpeg4-generic is a + request for a repository for new values for several parameters we + have defined for use with the rtp-midi mode value. We request this + repository in Section 11.2.1. + + In Section 11.3, we request the registration of a new media type: + audio/asc. No repository request is associated with this request. + +11.1. rtp-midi Media Type Registration + + This section requests the registration of the rtp-midi subtype for + the audio media type. We request the registration of the parameters + listed in the "optional parameters" section below (both the "non- + extensible parameters" and the "extensible parameters" lists). We + also request the creation of repositories for the "extensible + parameters"; the details of this request appear in Section 11.1.1. + + Media type name: + + audio + + Subtype name: + + rtp-midi + + Required parameters: + + rate: The RTP timestamp clock rate. See Sections 2.1 and 6.1 + for usage details. + + Optional parameters: + + Non-extensible parameters: + + ch_anchor: See Appendix C.2.3 for usage details. + ch_default: See Appendix C.2.3 for usage details. + ch_never: See Appendix C.2.3 for usage details. + cm_unused: See Appendix C.1 for usage details. + cm_used: See Appendix C.1 for usage details. + chanmask: See Appendix C.6.4.3 for usage details. + cid: See Appendix C.6.3 for usage details. + guardtime: See Appendix C.4.2 for usage details. 
+ inline: See Appendix C.6.3 for usage details. + linerate: See Appendix C.3 for usage details. + mperiod: See Appendix C.3 for usage details. + multimode: See Appendix C.6.1 for usage details. + musicport: See Appendix C.5 for usage details. + octpos: See Appendix C.3 for usage details. + + + +Lazzaro & Wawrzynek Standards Track [Page 38] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + rinit: See Appendix C.6.3 for usage details. + rtp_maxptime: See Appendix C.4.1 for usage details. + rtp_ptime: See Appendix C.4.1 for usage details. + smf_cid: See Appendix C.6.4.2 for usage details. + smf_inline: See Appendix C.6.4.2 for usage details. + smf_url: See Appendix C.6.4.2 for usage details. + tsmode: See Appendix C.3 for usage details. + url: See Appendix C.6.3 for usage details. + + Extensible parameters: + + j_sec: See Appendix C.2.1 for usage details. See + Section 11.1.1 for repository details. + j_update: See Appendix C.2.2 for usage details. See + Section 11.1.1 for repository details. + render: See Appendix C.6 for usage details. See + Section 11.1.1 for repository details. + subrender: See Appendix C.6.2 for usage details. See + Section 11.1.1 for repository details. + smf_info: See Appendix C.6.4.1 for usage details. See + Section 11.1.1 for repository details. + + Encoding considerations: + + The format for this type is framed and binary. + + Restrictions on usage: + + This type is only defined for real-time transfers of MIDI + streams via RTP. Stored-file semantics for rtp-midi may + be defined in the future. + + Security considerations: + + See Section 9 of this memo. + + Interoperability considerations: + + None. + + Published specification: + + This memo and [MIDI] serve as the normative specification. In + addition, references [NMP], [GRAME], and [RFC4696] provide + non-normative implementation guidance. 
+ + + + + + +Lazzaro & Wawrzynek Standards Track [Page 39] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Applications that use this media type: + + Audio content-creation hardware, such as MIDI controller piano + keyboards and MIDI audio synthesizers. Audio content-creation + software, such as music sequencers, digital audio workstations, + and soft synthesizers. Computer operating systems, for network + support of MIDI Application Programmer Interfaces. Content + distribution servers and terminals may use this media type for + low bitrate music coding. + + Additional information: + + None. + + Person & email address to contact for further information: + + John Lazzaro <lazzaro@cs.berkeley.edu> + + Intended usage: + + COMMON. + + Author: + + John Lazzaro <lazzaro@cs.berkeley.edu> + + Change controller: + + IETF Audio/Video Transport Working Group delegated + from the IESG. + +11.1.1. Repository Request for audio/rtp-midi + + For the rtp-midi subtype, we request the creation of repositories for + extensions to the following parameters (which are those listed as + "extensible parameters" in Section 11.1). + + j_sec: + + Registrations for this repository may only occur + via an IETF Standards-Track document. Appendix C.2.1 + of this memo describes appropriate registrations for this + repository. + + Initial values for this repository appear below: + + "none": Defined in Appendix C.2.1 of this memo. + "recj": Defined in Appendix C.2.1 of this memo. + + + +Lazzaro & Wawrzynek Standards Track [Page 40] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + j_update: + + Registrations for this repository may only occur + via an IETF Standards-Track document. Appendix C.2.2 + of this memo describes appropriate registrations for this + repository. + + Initial values for this repository appear below: + + "anchor": Defined in Appendix C.2.2 of this memo. + "open-loop": Defined in Appendix C.2.2 of this memo. + "closed-loop": Defined in Appendix C.2.2 of this memo. 
+ + render: + + Registrations for this repository MUST include a + specification of the usage of the proposed value. + See the preamble of Appendix C.6 for details + (the paragraph that begins "Other render token ..."). + + Initial values for this repository appear below: + + "unknown": Defined in Appendix C.6 of this memo. + "synthetic": Defined in Appendix C.6 of this memo. + "api": Defined in Appendix C.6 of this memo. + "null": Defined in Appendix C.6 of this memo. + + subrender: + + Registrations for this repository MUST include a + specification of the usage of the proposed value. + See Appendix C.6.2 for details (the paragraph + that begins "Other subrender token ..."). + + Initial values for this repository appear below: + + "default": Defined in Appendix C.6.2 of this memo. + + smf_info: + + Registrations for this repository MUST include a + specification of the usage of the proposed value. + See Appendix C.6.4.1 for details (the paragraph + that begins "Other smf_info token ..."). + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 41] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Initial values for this repository appear below: + + "ignore": Defined in Appendix C.6.4.1 of this memo. + "sdp_start": Defined in Appendix C.6.4.1 of this memo. + "identity": Defined in Appendix C.6.4.1 of this memo. + +11.2. mpeg4-generic Media Type Registration + + This section requests the registration of the rtp-midi value for the + mode parameter of the mpeg4-generic media type. The mpeg4-generic + media type is defined in [RFC3640], and [RFC3640] defines a + repository for the mode parameter. We are registering mode rtp-midi + to support the MPEG Audio codecs [MPEGSA] that use MIDI. + + In conjunction with this registration request, we request the + registration of the parameters listed in the "optional parameters" + section below (both the "non-extensible parameters" and the + "extensible parameters" lists). 
We also request the creation of
+   repositories for the "extensible parameters"; the details of this
+   request appear in Section 11.2.1.
+
+   Media type name:
+
+       audio
+
+   Subtype name:
+
+       mpeg4-generic
+
+   Required parameters:
+
+       The mode parameter is required by [RFC3640]. [RFC3640]
+       requests a repository for mode so that new values for mode
+       may be added. We request that the value rtp-midi be
+       added to the mode repository.
+
+       In mode rtp-midi, the mpeg4-generic parameter rate is
+       a required parameter. Rate specifies the RTP timestamp
+       clock rate. See Sections 2.1 and 6.2 for usage details
+       of rate in mode rtp-midi.
+
+   Optional parameters:
+
+       We request registration of the following parameters
+       for use in mode rtp-midi for mpeg4-generic.
+
+       Non-extensible parameters:
+
+          ch_anchor:     See Appendix C.2.3 for usage details.
+          ch_default:    See Appendix C.2.3 for usage details.
+          ch_never:      See Appendix C.2.3 for usage details.
+          cm_unused:     See Appendix C.1 for usage details.
+          cm_used:       See Appendix C.1 for usage details.
+          chanmask:      See Appendix C.6.4.3 for usage details.
+          cid:           See Appendix C.6.3 for usage details.
+          guardtime:     See Appendix C.4.2 for usage details.
+          inline:        See Appendix C.6.3 for usage details.
+          linerate:      See Appendix C.3 for usage details.
+          mperiod:       See Appendix C.3 for usage details.
+          multimode:     See Appendix C.6.1 for usage details.
+          musicport:     See Appendix C.5 for usage details.
+          octpos:        See Appendix C.3 for usage details.
+          rinit:         See Appendix C.6.3 for usage details.
+          rtp_maxptime:  See Appendix C.4.1 for usage details.
+          rtp_ptime:     See Appendix C.4.1 for usage details.
+          smf_cid:       See Appendix C.6.4.2 for usage details.
+          smf_inline:    See Appendix C.6.4.2 for usage details.
+          smf_url:       See Appendix C.6.4.2 for usage details.
+          tsmode:        See Appendix C.3 for usage details.
+          url:           See Appendix C.6.3 for usage details.
+ + Extensible parameters: + + j_sec: See Appendix C.2.1 for usage details. + See Section 11.2.1 for repository details. + j_update: See Appendix C.2.2 for usage details. + See Section 11.2.1 for repository details. + render: See Appendix C.6 for usage details. + See Section 11.2.1 for repository details. + subrender: See Appendix C.6.2 for usage details. + See Section 11.2.1 for repository details. + smf_info: See Appendix C.6.4.1 for usage details. + See Section 11.2.1 for repository details. + + Encoding considerations: + + The format for this type is framed and binary. + + Restrictions on usage: + + Only defined for real-time transfers of audio/mpeg4-generic + RTP streams with mode=rtp-midi. + + + + + +Lazzaro & Wawrzynek Standards Track [Page 43] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Security considerations: + + See Section 9 of this memo. + + Interoperability considerations: + + Except for the marker bit (Section 2.1), the packet formats + for audio/rtp-midi and audio/mpeg4-generic (mode rtp-midi) + are identical. The formats differ in use: audio/mpeg4-generic + is for MPEG work, and audio/rtp-midi is for all other work. + + Published specification: + + This memo, [MIDI], and [MPEGSA] are the normative references. + In addition, [NMP], [GRAME], and [RFC4696] provide + non-normative implementation guidance. + + Applications that use this media type: + + MPEG 4 servers and terminals that support [MPEGSA]. + + Additional information: + + None. + + Person & email address to contact for further information: + + John Lazzaro <lazzaro@cs.berkeley.edu> + + Intended usage: + + COMMON. + + Author: + + John Lazzaro <lazzaro@cs.berkeley.edu> + + Change controller: + + IETF Audio/Video Transport Working Group delegated + from the IESG. + +11.2.1. 
Repository Request for Mode rtp-midi for mpeg4-generic + + For mode rtp-midi of the mpeg4-generic subtype, we request the + creation of repositories for extensions to the following parameters + (which are those listed as "extensible parameters" in Section 11.2). + + + + +Lazzaro & Wawrzynek Standards Track [Page 44] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + j_sec: + + Registrations for this repository may only occur + via an IETF Standards-Track document. Appendix C.2.1 + of this memo describes appropriate registrations for this + repository. + + Initial values for this repository appear below: + + "none": Defined in Appendix C.2.1 of this memo. + "recj": Defined in Appendix C.2.1 of this memo. + + j_update: + + Registrations for this repository may only occur + via an IETF Standards-Track document. Appendix C.2.2 + of this memo describes appropriate registrations for this + repository. + + Initial values for this repository appear below: + + "anchor": Defined in Appendix C.2.2 of this memo. + "open-loop": Defined in Appendix C.2.2 of this memo. + "closed-loop": Defined in Appendix C.2.2 of this memo. + + render: + + Registrations for this repository MUST include a + specification of the usage of the proposed value. + See the preamble of Appendix C.6 for details + (the paragraph that begins "Other render token ..."). + + Initial values for this repository appear below: + + "unknown": Defined in Appendix C.6 of this memo. + "synthetic": Defined in Appendix C.6 of this memo. + "null": Defined in Appendix C.6 of this memo. + + subrender: + + Registrations for this repository MUST include a + specification of the usage of the proposed value. + See Appendix C.6.2 for details (the paragraph + that begins "Other subrender token ..." and + subsequent paragraphs). 
Note that the text in
+      Appendix C.6.2 contains restrictions on subrender
+      registrations for mpeg4-generic (the sentence that
+      begins "Registrations for mpeg4-generic subrender
+      values ...").
+
+      Initial values for this repository appear below:
+
+      "default": Defined in Appendix C.6.2 of this memo.
+
+   smf_info:
+
+      Registrations for this repository MUST include a
+      specification of the usage of the proposed value.
+      See Appendix C.6.4.1 for details (the paragraph
+      that begins "Other smf_info token ...").
+
+      Initial values for this repository appear below:
+
+      "ignore": Defined in Appendix C.6.4.1 of this memo.
+      "sdp_start": Defined in Appendix C.6.4.1 of this memo.
+      "identity": Defined in Appendix C.6.4.1 of this memo.
+
+11.3. asc Media Type Registration
+
+   This section registers asc as a subtype for the audio media type. We
+   register this subtype to support the remote transfer of the "config"
+   parameter of the mpeg4-generic media type [RFC3640] when it is used
+   with mpeg4-generic mode rtp-midi (registered in Section 11.2 above).
+   We explain the mechanics of using audio/asc to set the config
+   parameter in Section 6.2 and Appendix C.6.5 of this document.
+
+   Note that this registration is a new subtype registration and is not
+   an addition to a repository defined by MPEG-related memos (such as
+   [RFC3640]). Also, note that this request for audio/asc does not
+   register parameters and does not request the creation of a
+   repository.
+
+   Media type name:
+
+       audio
+
+   Subtype name:
+
+       asc
+
+   Required parameters:
+
+       None.
+
+   Optional parameters:
+
+       None.
+
+   Encoding considerations:
+
+       The native form of the data object is binary data,
+       zero-padded to an octet boundary.
Disk files that + store this data object use the file extension ".acn". + + Restrictions on usage: + + This type is only defined for data object (stored file) + transfer. The most common transports for the type are + HTTP and SMTP. + + Security considerations: + + See Section 9 of this memo. + + Interoperability considerations: + + None. + + Published specification: + + The audio/asc data object is the AudioSpecificConfig + binary data structure, which is normatively defined in + [MPEGAUDIO]. + + Applications that use this media type: + + MPEG 4 Audio servers and terminals that support + audio/mpeg4-generic RTP streams for mode rtp-midi. + + Additional information: + + None. + + Person & email address to contact for further information: + + John Lazzaro <lazzaro@cs.berkeley.edu> + + Intended usage: + + COMMON. + + + + + +Lazzaro & Wawrzynek Standards Track [Page 47] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Author: + + John Lazzaro <lazzaro@cs.berkeley.edu> + + Change controller: + + IETF Audio/Video Transport Working Group delegated + from the IESG. + +12. Changes from RFC 4695 + + This document fixes errors found in RFC 4695 by reviewers. We thank + Alfred Hoenes, Roni Even, and Alexey Melnikov for reporting the + errors. To our knowledge, there are no interoperability issues + associated with the errors that are fixed by this document. In this + section, we list the error fixes. + + In Section 3 of RFC 4695, the bitfield format shown in Figure 3 is + inconsistent with the normative text that (correctly) describes the + bitfield. Specifically, Figure 3 in RFC 4695 incorrectly states the + dependence of the Delta Time 0 field on the Z header bit. Figure 3 + in this document corrects this error. To our knowledge, this error + did not result in incorrect implementations of RFC 4695. + + The remaining errors are in Appendices C and D and concern session + configuration parameters. 
The numbered list ((1) through (11)) below + describes these errors in detail, in order of appearance in the + document. To our knowledge, there are no interoperability issues + associated with these errors, as implementation activity has so far + focused on an application domain that does not use SDP for session + management. + + (1) In Appendices C.1 and C.2.3 of RFC 4695, an ABNF rule related to + System Chapter X is incorrectly defined as: + + <parameter> = "__" <h-list> ["_" <h-list>] "__" + + The correct version of this rule is: + + <parameter> = "__" <h-list> *( "_" <h-list> ) "__" + + (2) In Appendix C.6.3 of RFC 4695, the URIs permitted to be assigned + to the url parameter are not stated clearly. URIs assigned to url + MUST specify either HTTP or HTTP over TLS transport protocols. + + In Appendix C.7.1 and C.7.2 of RFC 4695, the transport + interoperability requirements for the url parameter are not stated + + + + +Lazzaro & Wawrzynek Standards Track [Page 48] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + clearly. For both C.7.1 and C.7.2, HTTP is REQUIRED and HTTP over + TLS is OPTIONAL. + + (3) In Appendix C.6.5, the filename extension ".acn" has been defined + for use with AudioSpecificConfig. + + (4) Both fmtp lines in both session description examples in Appendix + C.7.2 of RFC 4695 contain instances of the same syntax error (a + missing ";" at a line wrap after a cm_used assignment). + + (5) In the session description examples in Appendix C.5, C.6, and C.7 + of RFC 4695, the parameter assignment + + rinit="audio/asc"; + + is incorrect. The correct parameter assignment appears below: + + rinit=audio/asc; + + Note that this error also appears in the session descriptions shown + in Figures 1 and 2 of the informative RFC 4696. We are not aware of + existing implementations that use the rinit parameter, and so the + incorrect examples in RFC 4695 and RFC 4696 should not cause + interoperability problems. 
+
+   (6) In Appendix D of RFC 4695, all uses of "*ietf-extension" in rules
+   are in error and should be replaced with "ietf-extension". Likewise,
+   all uses of "*extension" are in error and should be replaced with
+   "extension". This bug incorrectly lets the null token be assigned to
+   the j_sec, j_update, render, smf_info, and subrender parameters.
+
+   (7) In Appendix D of RFC 4695, the definitions of <command-type> and
+   <chapter-rules> incorrectly allow lowercase letters to appear in
+   these strings. The correct definitions of these rules appear below:
+
+   command-type = [A] [B] [C] [F] [G] [H] [J] [K] [M]
+                  [N] [P] [Q] [T] [V] [W] [X] [Y] [Z]
+
+   chapter-list = [A] [B] [C] [D] [E] [F] [G] [H] [J] [K]
+                  [M] [N] [P] [Q] [T] [V] [W] [X] [Y] [Z]
+
+   A = %x41
+   B = %x42
+   C = %x43
+   D = %x44
+   E = %x45
+   F = %x46
+   G = %x47
+   H = %x48
+   J = %x4A
+   K = %x4B
+   M = %x4D
+   N = %x4E
+   P = %x50
+   Q = %x51
+   T = %x54
+   V = %x56
+   W = %x57
+   X = %x58
+   Y = %x59
+   Z = %x5A
+
+   (8) In Appendix D of RFC 4695, the definitions of
+   <nonzero-four-octet>, <four-octet>, and <midi-chan> are incorrect.
+   The correct definitions of these rules appear below:
+
+   nonzero-four-octet = (NZ-DIGIT 0*8(DIGIT))
+                      / (%x31-33 9(DIGIT))
+                      / ("4" %x30-31 8(DIGIT))
+                      / ("42" %x30-38 7(DIGIT))
+                      / ("429" %x30-33 6(DIGIT))
+                      / ("4294" %x30-38 5(DIGIT))
+                      / ("42949" %x30-35 4(DIGIT))
+                      / ("429496" %x30-36 3(DIGIT))
+                      / ("4294967" %x30-31 2(DIGIT))
+                      / ("42949672" %x30-38 (DIGIT))
+                      / ("429496729" %x30-34)
+
+   four-octet = "0" / nonzero-four-octet
+   midi-chan  = DIGIT / ("1" %x30-35)
+
+   DIGIT    = %x30-39
+   NZ-DIGIT = %x31-39
+
+   (9) In Appendix D of RFC 4695, the rule <hex-octet> is incorrect. The
+   correct definition of this rule appears below.
+
+   hex-octet = %x30-37 U-HEXDIG
+   U-HEXDIG  = DIGIT / A / B / C / D / E / F
+
+   ; DIGIT as defined in (8) above
+   ; A, B, C, D, E, F as defined in (7) above
+
+   (10) In Appendix D, the <mime-subtype> rule now points to the
+   <subtype-name> rule in [RFC4288].
+
+   (11) In Appendix D of RFC 4695, the rules <base64-unit> and
+   <base64-pad> are defined unclearly. The rewritten rules appear
+   below:
+
+   base64-unit = 4(base64-char)
+   base64-pad  = (2(base64-char) "==") / (3(base64-char) "=")
+
+Appendix A. The Recovery Journal Channel Chapters
+
+A.1. Recovery Journal Definitions
+
+   This appendix defines the terminology and the coding idioms that are
+   used in the recovery journal bitfield descriptions in Section 5
+   (journal header structure), Appendices A.2 to A.9 (channel journal
+   chapters), and Appendices B.1 to B.5 (system journal chapters).
+
+   We assume that the recovery journal resides in the journal section of
+   an RTP packet with sequence number I ("packet I") and that the
+   Checkpoint Packet Seqnum field in the top-level recovery journal
+   header refers to a previous packet with sequence number C (an
+   exception is the self-referential C = I case). Unless stated
+   otherwise, algorithms are assumed to use modulo 2^16 arithmetic for
+   calculations on 16-bit sequence numbers and modulo 2^32 arithmetic
+   for calculations on 32-bit extended sequence numbers.
+
+   Several bitfield coding idioms appear throughout the recovery journal
+   system with consistent semantics. Most recovery journal elements
+   begin with an "S" (Single-packet loss) bit.
S bits are designed to + help receivers efficiently parse through the recovery journal + hierarchy in the common case of the loss of a single packet. + + As a rule, S bits MUST be set to 1. However, an exception applies if + a recovery journal element in packet I encodes data about a command + stored in the MIDI command section of packet I - 1. In this case, + the S bit of the recovery journal element MUST be set to 0. If a + recovery journal element has its S bit set to 0, all higher-level + recovery journal elements that contain it MUST also have S bits that + are set to 0, including the top-level recovery journal header. + + Other consistent bitfield coding idioms are described below: + + o R flag bit. R flag bits are reserved for future use. Senders + MUST set R bits to 0. Receivers MUST ignore R bit values. + + o LENGTH field. All fields named LENGTH (as distinct from LEN) code + the number of octets in the structure that contains it, including + the header it resides in and all hierarchical levels below it. If + a structure contains a LENGTH field, a receiver MUST use the + LENGTH field value to advance past the structure during parsing, + rather than use knowledge about the internal format of the + structure. + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 52] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + We now define normative terms used to describe recovery journal + semantics. + + o Checkpoint history. The checkpoint history of a recovery journal + is the concatenation of the MIDI command sections of packets C + through I - 1. The final command in the MIDI command section for + packet I - 1 is considered the most recent command; the first + command in the MIDI command section for packet C is the oldest + command. If command X is less recent than command Y, X is + considered to be "before Y". A checkpoint history with no + commands is considered to be empty. 
The checkpoint history never + contains the MIDI command section of packet I (the packet + containing the recovery journal), so if C == I, the checkpoint + history is empty by definition. + + o Session history. The session history of a recovery journal is the + concatenation of MIDI command sections from the first packet of + the session up to packet I - 1. The definitions of command + recency and history emptiness follow those in the checkpoint + history. The session history never contains the MIDI command + section of packet I, so the session history of the first packet in + the session is empty by definition. + + o Finished/unfinished commands. If all octets of a MIDI command + appear in the session history, the command is defined as being + finished. If some but not all octets of a command appear in the + session history, the command is defined as being unfinished. + Unfinished commands occur if segments of a SysEx command appear in + several RTP packets. For example, if a SysEx command is coded as + 3 segments, with segment 1 in packet K, segment 2 in packet K + 1, + and segment 3 in packet K + 2, the session histories for packets K + + 1 and K + 2 contain unfinished versions of the command. A + session history contains a finished version of a cancelled SysEx + command if the history contains the cancel sublist for the + command. + + o Reset State commands. Reset State (RS) commands reset renderers + to an initialized "powerup" condition. The RS commands are System + Reset (0xFF), General MIDI System Enable (0xF0 0x7E 0xcc 0x09 0x01 + 0xF7), General MIDI 2 System Enable (0xF0 0x7E 0xcc 0x09 0x03 + 0xF7), General MIDI System Disable (0xF0 0x7E 0xcc 0x09 0x00 + 0xF7), Turn DLS On (0xF0 0x7E 0xcc 0x0A 0x01 0xF7), and Turn DLS + Off (0xF0 0x7E 0xcc 0x0A 0x02 0xF7). Registrations of subrender + parameter token values (Appendix C.6.2) and IETF Standards-Track + documents MAY specify additional RS commands. + + o Active commands. 
Active commands are MIDI commands that do not + appear before a Reset State command in the session history. + + + +Lazzaro & Wawrzynek Standards Track [Page 53] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + o N-active commands. N-active commands are MIDI commands that do + not appear before one of the following commands in the session + history: MIDI Control Change numbers 123-127 (numbers with All + Notes Off semantics) or 120 (All Sound Off), and any Reset State + command. + + o C-active commands. C-active commands are MIDI commands that do + not appear before one of the following commands in the session + history: MIDI Control Change number 121 (Reset All Controllers) + and any Reset State command. + + o Oldest-first ordering rule. Several recovery journal chapters + contain a list of elements, where each element is associated with + a MIDI command that appears in the session history. In most + cases, the chapter definition requires that list elements be + ordered in accordance with the "oldest-first ordering rule". + Below, we normatively define this rule. + + Elements associated with the most recent command in the session + history coded in the list MUST appear at the end of the list. + + Elements associated with the oldest command in the session history + coded in the list MUST appear at the start of the list. + + All other list elements MUST be arranged with respect to these + boundary elements, to produce a list ordering that strictly + reflects the relative session history recency of the commands + coded by the elements in the list. + + o Parameter system. A MIDI feature that provides two sets of 16,384 + parameters to expand the 0-127 controller number space. The + Registered Parameter Numbers (RPN) system and the Non-Registered + Parameter Numbers (NRPN) system each provide 16,384 parameters. + + o Parameter system transaction. RPN and NRPN values are changed by + a series of Control Change commands that form a parameter system + transaction. 
A canonical transaction begins with two Control + Change commands to set the parameter number (controller numbers 99 + and 98 for NRPN parameters, controller numbers 101 and 100 for RPN + parameters). The transaction continues with an arbitrary number + of Data Entry (controller numbers 6 and 38), Data Increment + (controller number 96), and Data Decrement (controller number 97) + Control Change commands to set the parameter value. The + transaction ends with a second pair of (99, 98) or (101, 100) + Control Change commands that specify the null parameter (Most + Significant Bit (MSB) value 0x7F, Least Significant Bit (LSB) + value 0x7F). + + + + +Lazzaro & Wawrzynek Standards Track [Page 54] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Several variants of the canonical transaction sequence are + possible. Most commonly, the terminal pair of (99, 98) or (101, + 100) Control Change commands may specify a parameter other than + the null parameter. In this case, the command pair terminates the + first transaction and starts a second transaction. The command + pair is considered to be a part of both transactions. This + variant is legal and recommended in [MIDI]. We refer to this + variant as a "type 1 variant". + + Less commonly, the MSB (99 or 101) or LSB (98 or 100) command of a + (99, 98) or (101, 100) Control Change pair may be omitted. + + If the MSB command is omitted, the transaction uses the MSB value + of the most recent C-active Control Change command for controller + number 99 or 101 that appears in the session history. We refer to + this variant as a "type 2 variant". + + If the LSB command is omitted, the LSB value 0x00 is assumed. We + refer to this variant as a "type 3 variant". The type 2 and type + 3 variants are defined as legal but are not recommended in [MIDI]. + + System Real-Time commands may appear at any point during a + transaction (even between octets of individual commands in the + transaction). 
More generally, [MIDI] does not forbid the + appearance of unrelated MIDI commands during an open transaction. + As a rule, these commands are considered to be "outside" the + transaction and do not affect the status of the transaction in any + way. Exceptions to this rule are commands whose semantics act to + terminate transactions: Reset State commands and Control Change + (0xB) for controller number 121 (Reset All Controllers) [RP015]. + + o Initiated parameter system transaction. A canonical parameter + system transaction whose (99, 98) or (101, 100) initial Control + Change command pair appears in the session history is considered + to be an initiated parameter system transaction. This definition + also holds for type 1 variants. For type 2 variants (dropped + MSB), a transaction whose initial LSB Control Change command + appears in the session history is an initiated transaction. For + type 3 variants (dropped LSB), a transaction is considered to be + initiated if at least one transaction command follows the initial + MSB (99 or 101) Control Change command in the session history. + The completion of a transaction does not nullify its "initiated" + status. + + o Session history reference counts. Several recovery journal + chapters include a reference count field, which codes the total + number of commands of a type that appear in the session history. + Examples include the Reset and Tune Request command logs (Appendix + + + +Lazzaro & Wawrzynek Standards Track [Page 55] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + B.1, "System Chapter D") and the Active Sense command (Appendix + B.2, "System Chapter V"). Upon the detection of a loss event, + reference count fields let a receiver deduce if any instances of + the command have been lost, by comparing the journal reference + count with its own reference count. 
Thus, a reference count field
+      makes sense, even for command types in which knowing the NUMBER of
+      lost commands is irrelevant (as is true with all of the example
+      commands mentioned above).
+
+   The chapter definitions in Appendices A.2 to A.9 and B.1 to B.5
+   reflect the default recovery journal behavior. The ch_default,
+   ch_never, and ch_anchor parameters modify these definitions, as
+   described in Appendix C.2.3.
+
+   The chapter definitions specify if data MUST be present in the
+   journal. Senders MAY also include non-required data in the journal.
+   This optional data MUST comply with the normative chapter definition.
+   For example, if a chapter definition states that a field codes data
+   from the most recent active command in the session history, the
+   sender MUST NOT code inactive commands or older commands in the
+   field.
+
+   Finally, we note that a channel journal only encodes information
+   about MIDI commands appearing on the MIDI channel the journal
+   protects. All references to MIDI commands in Appendices A.2 to A.9
+   should be read as "MIDI commands appearing on this channel".
+
+A.2. Chapter P: MIDI Program Change
+
+   A channel journal MUST contain Chapter P if an active Program Change
+   (0xC) command appears in the checkpoint history. Figure A.2.1 shows
+   the format for Chapter P.
+
+       0                   1                   2
+       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |S|   PROGRAM   |B|   BANK-MSB  |X|  BANK-LSB   |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+                 Figure A.2.1 -- Chapter P Format
+
+   The chapter has a fixed size of 24 bits. The PROGRAM field indicates
+   the data value of the most recent active Program Change command in
+   the session history. By default, the B, BANK-MSB, X, and BANK-LSB
+   fields MUST be set to 0. Below, we define exceptions to this default
+   condition.
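As an illustration of the fixed 24-bit layout in Figure A.2.1, the following Python sketch unpacks the three octets of Chapter P into its six fields. The names ChapterP and parse_chapter_p are hypothetical and are not defined by this memo. The sketch only splits bits; it does not apply the sender rules that govern the B, X, and bank fields.

```python
from typing import NamedTuple


class ChapterP(NamedTuple):
    """Fields of Chapter P, in the order shown in Figure A.2.1 (names are illustrative)."""
    s: int         # S (single-packet loss) bit
    program: int   # 7-bit PROGRAM field
    b: int         # B bit
    bank_msb: int  # 7-bit BANK-MSB field
    x: int         # X bit
    bank_lsb: int  # 7-bit BANK-LSB field


def parse_chapter_p(data: bytes) -> ChapterP:
    """Split the first three octets of data into the Chapter P fields."""
    if len(data) < 3:
        raise ValueError("Chapter P has a fixed size of 24 bits (3 octets)")
    o0, o1, o2 = data[0], data[1], data[2]
    # In each octet, the top bit is a flag and the low 7 bits are a data value.
    return ChapterP(
        s=o0 >> 7, program=o0 & 0x7F,
        b=o1 >> 7, bank_msb=o1 & 0x7F,
        x=o2 >> 7, bank_lsb=o2 & 0x7F,
    )
```

For instance, the octets 0x85 0x00 0x00 decode to S = 1 and PROGRAM = 5, with the B, BANK-MSB, X, and BANK-LSB fields all 0, which matches the default condition described above.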
+
+   If an active Control Change (0xB) command for controller number 0
+   (Bank Select MSB) appears before the Program Change command in the
+   session history, the B bit MUST be set to 1, and the BANK-MSB field
+   MUST code the data value of the Control Change command.
+
+   If B is set to 1, the BANK-LSB field MUST code the data value of the
+   most recent Control Change command for controller number 32 (Bank
+   Select LSB) that preceded the Program Change command coded in the
+   PROGRAM field and followed the Control Change command coded in the
+   BANK-MSB field. If no such Control Change command exists, the
+   BANK-LSB field MUST be set to 0.
+
+   If B is set to 1 and if a Control Change command for controller
+   number 121 (Reset All Controllers) appears in the MIDI stream between
+   the Control Change command coded by the BANK-MSB field and the
+   Program Change command coded by the PROGRAM field, the X bit MUST be
+   set to 1.
+
+   Note that [RP015] specifies that Reset All Controllers does not reset
+   the values of controller numbers 0 (Bank Select MSB) and 32 (Bank
+   Select LSB). Thus, the X bit does not affect how receivers will use
+   the BANK-LSB and BANK-MSB values when recovering from a lost Program
+   Change command. The X bit serves to aid recovery in MIDI
+   applications where controller numbers 0 and 32 are used in a
+   non-standard way.
+
+A.3. Chapter C: MIDI Control Change
+
+   Figure A.3.1 shows the format for Chapter C.
+
+       0                   1                   2                   3
+       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |S|     LEN     |S|   NUMBER    |A|  VALUE/ALT  |S|   NUMBER    |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |A|  VALUE/ALT  |  ....                                         |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+                     Figure A.3.1 -- Chapter C Format
+
+   The chapter consists of a 1-octet header followed by a
+   variable-length list of 2-octet controller logs. The list MUST
+   contain at least one controller log. The 7-bit LEN field codes the
+   number of controller logs in the list, minus one. We define the
+   semantics of the controller log fields in Appendix A.3.2.
+
+   A channel journal MUST contain Chapter C if the rules defined in this
+   appendix require that one or more controller logs appear in the list.
+
+A.3.1. Log Inclusion Rules
+
+   A controller log encodes information about a particular Control
+   Change command in the session history.
+
+   In the default use of the payload format, list logs MUST encode
+   information about the most recent active command in the session
+   history for a controller number. Logs encoding earlier commands MUST
+   NOT appear in the list.
+
+   Also, as a rule, the list MUST contain a log for the most recent
+   active command for a controller number that appears in the checkpoint
+   history. Below, we define exceptions to this rule:
+
+   o  MIDI streams may transmit 14-bit controller values using paired
+      Most Significant Byte (MSB, controller numbers 0-31, 99, 101) and
+      Least Significant Byte (LSB, controller numbers 32-63, 98, 100)
+      Control Change commands [MIDI].
+
+      If the most recent active Control Change command in the session
+      history for a 14-bit controller pair uses the MSB number, Chapter
+      C MAY omit the controller log for the most recent active Control
+      Change command for the associated LSB number, as the command
+      ordering makes this LSB value irrelevant. However, this exception
+      MUST NOT be applied if the sender is not certain that the MIDI
+      source uses 14-bit semantics for the controller number pair.
Note + that some MIDI sources ignore 14-bit controller semantics and use + the LSB controller numbers as independent 7-bit controllers. + + o If active Control Change commands for controller numbers 0 (Bank + Select MSB) or 32 (Bank Select LSB) appear in the checkpoint + history and if the command instances are also coded in the BANK- + MSB and BANK-LSB fields of the Chapter P (Appendix A.2), Chapter C + MAY omit the controller logs for the commands. + + o Several controller number pairs are defined to be mutually + exclusive. Controller numbers 124 (Omni Off) and 125 (Omni On) + form a mutually exclusive pair, as do controller numbers 126 + (Mono) and 127 (Poly). + + If active Control Change commands for one or both members of a + mutually exclusive pair appear in the checkpoint history, a log + for the controller number of the most recent command for the pair + in the checkpoint history MUST appear in the controller list. + However, the list MAY omit the controller log for the most recent + active command for the other number in the pair. + + + + + +Lazzaro & Wawrzynek Standards Track [Page 58] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + If active Control Change commands for one or both members of a + mutually exclusive pair appear in the session history, and if a + log for the controller number of the most recent command for the + pair does not appear in the controller list, a log for the most + recent command for the other number of the pair MUST NOT appear in + the controller list. + + o If an active Control Change command for controller number 121 + (Reset All Controllers) appears in the session history, the + controller list MAY omit logs for Control Change commands that + precede the Reset All Controllers command in the session history, + under certain conditions. 
+ + Namely, a log MAY be omitted if the sender is certain that a + command stream follows the Reset All Controllers semantics defined + in [RP015] and if the log codes a controller number for which + [RP015] specifies a reset value. + + For example, [RP015] specifies that controller number 1 + (Modulation Wheel) is reset to the value 0, and thus a controller + log for Modulation Wheel MAY be omitted from the controller log + list. In contrast, [RP015] specifies that controller number 7 + (Channel Volume) is not reset, and thus a controller log for + Channel Volume MUST NOT be omitted from the controller log list. + + o Appendix A.3.4 defines exception rules for the MIDI Parameter + System controller numbers 6, 38, and 96-101. + +A.3.2. Controller Log Format + + Figure A.3.2 shows the controller log structure of Chapter C. + + 0 1 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |S| NUMBER |A| VALUE/ALT | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure A.3.2 -- Chapter C Controller Log + + The 7-bit NUMBER field identifies the controller number of the coded + command. The 7-bit VALUE/ALT field codes recovery information for + the command. The A bit sets the format of the VALUE/ALT field. + + A log encodes recovery information using one of the following tools: + the value tool, the toggle tool, or the count tool. + + + + + +Lazzaro & Wawrzynek Standards Track [Page 59] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + A log uses the value tool if the A bit is set to 0. The value tool + codes the 7-bit data value of a command in the VALUE/ALT field. The + value tool works best for controllers that code a continuous + quantity, such as number 1 (Modulation Wheel). + + The A bit is set to 1 to code the toggle or count tool. These tools + work best for controllers that code discrete actions. Figure A.3.3 + shows the controller log for these tools. 
+ + 0 1 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |S| NUMBER |1|T| ALT | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure A.3.3 -- Controller Log for ALT Tools + + A log uses the toggle tool if the T bit is set to 0. A log uses the + count tool if the T bit is set to 1. Both methods use the 6-bit ALT + field as an unsigned integer. + + The toggle tool works best for controllers that act as on/off + switches, such as 64 (Damper Pedal (Sustain)). These controllers + code the "off" state with control values 0-63 and the "on" state with + 64-127. + + For the toggle tool, the ALT field codes the total number of toggles + (off->on and on->off) due to Control Change commands in the session + history, up to and including a toggle caused by the command coded by + the log. The toggle count includes toggles caused by Control Change + commands for controller number 121 (Reset All Controllers). + + Toggle counting is performed modulo 64. The toggle count is reset at + the start of a session and whenever a Reset State command (Appendix + A.1) appears in the session history. When these reset events occur, + the toggle count for a controller is set to 0 (for controllers whose + default value is 0-63) or 1 (for controllers whose default value is + 64-127). + + The Damper Pedal (Sustain) controller illustrates the benefits of the + toggle tool over the value tool for switch controllers. As often + used in piano applications, the "on" state of the controller lets + notes resonate, while the "off" state immediately damps notes to + silence. The loss of the "off" command in an "on->off->on" sequence + results in ringing notes that should have been damped silent. The + toggle tool lets receivers detect this lost "off" command, but the + value tool does not. + + + + +Lazzaro & Wawrzynek Standards Track [Page 60] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + The count tool is conceptually similar to the toggle tool. 
For the + count tool, the ALT field codes the total number of Control Change + commands in the session history, up to and including the command + coded by the log. Command counting is performed modulo 64. The + command count is set to 0 at the start of the session and is reset to + 0 whenever a Reset State command (Appendix A.1) appears in the + session history. + + Because the count tool ignores the data value, it is a good match for + controllers whose controller value is ignored, such as number 123 + (All Notes Off). More generally, the count tool may be used to code + a (modulo 64) identification number for a command. + +A.3.3. Log List Coding Rules + + In this section, we describe the organization of controller logs in + the Chapter C log list. + + A log encodes information about a particular Control Change command + in the session history. In most cases, a command SHOULD be coded by + a single tool (and, thus, a single log). If a number is coded with a + single tool and this tool is the count tool, recovery Control Change + commands generated by a receiver SHOULD use the default control value + for the controller. + + However, a command MAY be coded by several tool types (and, thus, + several logs, each using a different tool). This technique may + improve recovery performance for controllers with complex semantics, + such as controller number 84 (Portamento Control) or controller + number 121 (Reset All Controllers) when used with a non-zero data + octet (with the semantics described in [DLS2]). + + If a command is encoded by multiple tools, the logs MUST be placed in + the list in the following order: count tool log (if any), followed by + value tool log (if any), followed by toggle tool log (if any). + + The Chapter C log list MUST obey the oldest-first ordering rule + (defined in Appendix A.1). 
Note that this ordering preserves the + information necessary for the recovery of 14-bit controller values + without precluding the use of MSB and LSB controller pairs as + independent 7-bit controllers. + + In the default use of the payload format, all logs that appear in the + list for a controller number encode information about one Control + Change command -- namely, the most recent active Control Change + command in the session history for the number. + + + + + +Lazzaro & Wawrzynek Standards Track [Page 61] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + This coding scheme provides good recovery performance for the + standard uses of Control Change commands defined in [MIDI]. However, + not all MIDI applications restrict the use of Control Change commands + to those defined in [MIDI]. + + For example, consider the common MIDI encoding of rotary encoders + ("infinite" rotation knobs). The mixing console MIDI convention + defined in [LCP] codes the position of rotary encoders as a series of + Control Change commands. Each command encodes a relative change of + knob position from the last update (expressed as a clockwise or + counter-clockwise knob-turning angle). + + As the knob position is encoded incrementally over a series of + Control Change commands, the best recovery performance is obtained if + the log list encodes all Control Change commands for encoder + controller numbers that appear in the checkpoint history, not only + the most recent command. + + To support application areas that use Control Change commands in this + way, Chapter C may be configured to encode information about several + Control Change commands for a controller number. We use the term + "enhanced" to describe this encoding method, which we describe below. + + In Appendix C.2.3, we show how to configure a stream to use enhanced + Chapter C encoding for specific controller numbers. 
In Section 5 in + the main text, we show how the H bits in the recovery journal header + (Figure 8) and in the channel journal header (Figure 9) indicate the + use of enhanced Chapter C encoding. + + Here, we define how to encode a Chapter C log list that uses the + enhanced encoding method. + + Senders that use the enhanced encoding method for a controller number + MUST obey the rules below. These rules let a receiver determine + which logs in the list correspond to lost commands. Note that these + rules override the exceptions listed in Appendix A.3.1. + + o If N commands for a controller number are encoded in the list, the + commands MUST be the N most recent commands for the controller + number in the session history. For example, for N = 2, the sender + MUST encode the most recent command and the second most recent + command, not the most recent command and the third most recent + command. + + o If a controller number uses enhanced encoding, the encoding of the + least recent command for the controller number in the log list + MUST include a count tool log. In addition, if commands are + + + + +Lazzaro & Wawrzynek Standards Track [Page 62] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + encoded for the controller number whose logs have S bits set to 0, + the encoding of the least recent command with S = 0 logs MUST + include a count tool log. + + The count tool is OPTIONAL for the other commands for the + controller number encoded in the list, as a receiver is able to + efficiently deduce the count tool value for these commands for + both single-packet and multi-packet loss events. + + o The use of the value and toggle tools MUST be identical for all + commands for a controller number encoded in the list. For + example, either a value tool log MUST appear for all commands for + the controller number coded in the list or, alternatively, value + tool logs for the controller number MUST NOT appear in the list. 
+ Likewise, either a toggle tool log MUST appear for all commands + for the controller number coded in the list or, alternatively, + toggle tool logs for the controller number MUST NOT appear in the + list. + + o If a command is encoded by multiple tools, the logs MUST be placed + in the list in the following order: count tool log (if any), + followed by value tool log (if any), followed by toggle tool log + (if any). + + These rules permit a receiver recovering from a packet loss to use + the count tool log to match the commands encoded in the list with its + own history of the stream, as we describe below. Note that the text + below describes a non-normative algorithm; receivers are free to use + any algorithm to match their history with the log list. + + In a typical implementation of the enhanced encoding method, a + receiver computes and stores count, value, and toggle tool data field + values for the most recent Control Change command it has received for + a controller number. + + After a loss event, a receiver parses the Chapter C list and + processes list logs for a controller number that uses enhanced + encoding as follows. + + The receiver compares the count tool ALT field for the least recent + command for the controller number in the list against its stored + count data for the controller number to determine if recovery is + necessary for the command coded in the list. The value and toggle + tool logs (if any) that directly follow the count tool log are + associated with this least recent command. + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 63] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + To check more recent commands for the controller, the receiver + detects additional value and/or toggle tool logs for the controller + number in the list and infers count tool data for the command coded + by these logs. This inferred data is used to determine if recovery + is necessary for the command coded by the value and/or toggle tool + logs. 
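The matching procedure described above can be sketched in a few lines of Python. This is a non-normative illustration under stated assumptions: the function name and its arguments are hypothetical, the logs for one controller number are assumed to have already been grouped per command in oldest-first order, S bits are ignored, and fewer than 64 commands are assumed to have been lost (otherwise the modulo 64 count is ambiguous).

```python
def lost_commands(stored_count, first_count, values):
    """Return the listed command values the receiver has not yet executed.

    stored_count: count (modulo 64) of the most recent command the
                  receiver executed for this controller number.
    first_count:  ALT field of the count tool log for the least recent
                  command in the list.
    values:       value tool data for each listed command, oldest first.

    All names here are illustrative assumptions, not part of the format.
    """
    n = len(values)
    # Infer the count of the most recent listed command: counts increase
    # by one per command, modulo 64.
    last_count = (first_count + n - 1) % 64
    # Number of commands sent since the receiver's stored command.
    missed = (last_count - stored_count) % 64
    # Only commands that appear in the list can be recovered from it.
    missed = min(missed, n)
    return values[n - missed:]
```

For example, a receiver whose stored count is 63 and that sees a list starting at count 62 with four commands infers counts 62, 63, 0, and 1, and executes only the last two commands.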
+ + In this way, a receiver is able to execute only lost commands, + without executing a command twice. While recovering from a single + packet loss, a receiver may skip through S = 1 logs in the list, as + the first S = 0 log for an enhanced controller number is always a + count tool log. + + Note that the requirements in Appendix C.2.2.2 for protective sender + and receiver actions during session startup for multicast operation + are of particular importance for enhanced encoding, as receivers need + to initialize their count tool data structures with recovery journal + data in order to match commands correctly after a loss event. + + Finally, we note in passing that in some applications of rotary + encoders, a good user experience may be possible without the use of + enhanced encoding. These applications are distinguished by visual + feedback of encoding position that is driven by the post-recovery + rotary encoding stream and relatively low packet loss. In these + domains, recovery performance may be acceptable for rotary encoders + if the log list encodes only the most recent command and if both + count and value logs appear for the command. + +A.3.4. The Parameter System + + Readers may wish to review the Appendix A.1 definitions of "parameter + system", "parameter system transaction", and "initiated parameter + system transaction" before reading this section. + + Parameter system transactions update a MIDI Registered Parameter + Numbers (RPN) or Non-Registered Parameter Numbers (NRPN) value. 
A + parameter system transaction is a sequence of Control Change commands + that may use the following controller numbers: + + o Data Entry MSB (6) + o Data Entry LSB (38) + o Data Increment (96) + o Data Decrement (97) + o Non-Registered Parameter Number (NRPN) LSB (98) + o Non-Registered Parameter Number (NRPN) MSB (99) + o Registered Parameter Numbers (RPN) LSB (100) + o Registered Parameter Numbers (RPN) MSB (101) + + + + +Lazzaro & Wawrzynek Standards Track [Page 64] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Control Change commands that are a part of a parameter system + transaction MUST NOT be coded in Chapter C controller logs. Instead, + these commands are coded in Chapter M, the MIDI Parameter chapter + defined in Appendix A.4. + + However, Control Change commands that use the listed controllers as + general-purpose controllers (i.e., outside of a parameter system + transaction) MUST NOT be coded in Chapter M. + + Instead, the controllers are coded in Chapter C controller logs. The + controller logs follow the coding rules stated in Appendix A.3.2 and + A.3.3. The rules for coding paired LSB and MSB controllers, as + defined in Appendix A.3.1, apply to the pairs (6, 38), (99, 98), and + (101, 100) when coded in Chapter C. + + If active Control Change commands for controller numbers 6, 38, or + 96-101 appear in the checkpoint history, and these commands are used + as general-purpose controllers, the most recent general-purpose + command instance for these controller numbers MUST appear as entries + in the Chapter C controller list. + + MIDI syntax permits a source to use controllers 6, 38, 96, and 97 as + parameter-system controllers and general-purpose controllers in the + same stream. An RTP MIDI sender MUST deduce the role of each Control + Change command for these controller numbers by noting the placement + of the command in the stream and MUST use this information to code + the command in Chapter C or Chapter M, as appropriate. 
+ + Specifically, active Control Change commands for controllers 6, 38, + 96, and 97 act in a general-purpose way when + + o no active Control Change commands that set an RPN or NRPN + parameter number appear in the session history, or + + o the most recent active Control Change commands in the session + history that set an RPN or NRPN parameter number code the null + parameter (MSB value 0x7F, LSB value 0x7F), or + + o a Control Change command for controller number 121 (Reset All + Controllers) appears more recently in the session history than all + active Control Change commands that set an RPN or NRPN parameter + number (see [RP015] for details). + + Finally, we note that a MIDI source that follows the recommendations + of [MIDI] exclusively uses numbers 98-101 as parameter system + controllers. Alternatively, a MIDI source may exclusively use 98-101 + as general-purpose controllers and lose the ability to perform + parameter system transactions in a stream. + + + +Lazzaro & Wawrzynek Standards Track [Page 65] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + In the language of [MIDI], the general-purpose use of controllers + 98-101 constitutes a non-standard controller assignment. As most + real-world MIDI sources use the standard controller assignment for + controller numbers 98-101, an RTP MIDI sender SHOULD assume these + controllers act as parameter system controllers, unless it knows that + a MIDI source uses controller numbers 98-101 in a general-purpose + way. + +A.4. Chapter M: MIDI Parameter System + + Readers may wish to review the Appendix A.1 definitions for "C-active + commands", "parameter system", "parameter system transaction", and + "initiated parameter system transaction" before reading this + appendix. + + Chapter M protects parameter system transactions for Registered + Parameter Numbers (RPN) and Non-Registered Parameter Numbers (NRPN) + values. Figure A.4.1 shows the format for Chapter M. 
+ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |S|P|E|U|W|Z| LENGTH |Q| PENDING | Log list ... | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure A.4.1 -- Top-Level Chapter M Format + + Chapter M begins with a 2-octet header. If the P header bit is set + to 1, a 1-octet field follows the header, coding the 7-bit PENDING + value and its associated Q bit. + + The 10-bit LENGTH field codes the size of Chapter M and conforms to + semantics described in Appendix A.1. + + Chapter M ends with a list of zero or more variable-length parameter + logs. Appendix A.4.2 defines the bitfield format of a parameter log. + Appendix A.4.1 defines the inclusion semantics of the log list. + + A channel journal MUST contain Chapter M if the rules defined in + Appendix A.4.1 require that one or more parameter logs appear in the + list. + + A channel journal also MUST contain Chapter M if the most recent C- + active Control Change command involved in a parameter system + transaction in the checkpoint history is + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 66] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + o an RPN MSB (101) or NRPN MSB (99) controller, or + + o an RPN LSB (100) or NRPN LSB (98) controller that completes the + coding of the null parameter (MSB value 0x7F, LSB value 0x7F). + + This rule provides loss protection for partially transmitted + parameter numbers and for the null parameter numbers. + + If the most recent C-active Control Change command involved in a + parameter system transaction in the session history is for the RPN + MSB or NRPN MSB controller, the P header bit MUST be set to 1, and + the PENDING field (and its associated Q bit) MUST follow the Chapter + M header. Otherwise, the P header bit MUST be set to 0, and the + PENDING field and Q bit MUST NOT appear in Chapter M. 
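As an illustration of the header layout in Figure A.4.1, the following Python sketch unpacks the 2-octet Chapter M header and, when P = 1, the octet that carries the Q bit and the PENDING field. The function name and the dictionary keys are hypothetical; the sketch assumes the usual convention that bit 0 in the figure is the most significant bit of the first octet, and it performs no validation.

```python
def parse_chapter_m_header(data):
    """Unpack the Chapter M header from a bytes-like object.

    Returns (fields, offset), where offset is the index of the first
    octet of the parameter log list. Names are illustrative only.
    """
    b0, b1 = data[0], data[1]
    fields = {
        "S": (b0 >> 7) & 1,
        "P": (b0 >> 6) & 1,
        "E": (b0 >> 5) & 1,
        "U": (b0 >> 4) & 1,
        "W": (b0 >> 3) & 1,
        "Z": (b0 >> 2) & 1,
        # 10-bit LENGTH: low 2 bits of octet 0, then all of octet 1.
        "LENGTH": ((b0 & 0x03) << 8) | b1,
    }
    offset = 2
    if fields["P"]:
        # One extra octet: Q bit followed by the 7-bit PENDING value.
        fields["Q"] = (data[2] >> 7) & 1
        fields["PENDING"] = data[2] & 0x7F
        offset = 3
    return fields, offset
```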
+ + If PENDING codes an NRPN MSB controller, the Q bit MUST be set to 1. + If PENDING codes an RPN MSB controller, the Q bit MUST be set to 0. + + The E header bit codes the current transaction state of the MIDI + stream. If E = 1, an initiated transaction is in progress. Below, + we define the rules for setting the E header bit: + + o If no C-active parameter system transaction Control Change + commands appear in the session history, the E bit MUST be set to + 0. + + o If the P header bit is set to 1, the E bit MUST be set to 0. + + o If the most recent C-active parameter system transaction Control + Change command in the session history is for the NRPN LSB or RPN + LSB controller number and if this command acts to complete the + coding of the null parameter (MSB value 0x7F, LSB value 0x7F), the + E bit MUST be set to 0. + + o Otherwise, an initiated transaction is in progress, and the E bit + MUST be set to 1. + + The U, W, and Z header bits code properties that are shared by all + parameter logs in the list. If these properties are set, parameter + logs may be coded with improved efficiency (we explain how in A.4.2). + + By default, the U, W, and Z bits MUST be set to 0. If all parameter + logs in the list code RPN parameters, the U bit MAY be set to 1. If + all parameter logs in the list code NRPN parameters, the W bit MAY be + set to 1. If the parameter numbers of all RPN and NRPN logs in the + list lie in the range 0-127 (and thus have an MSB value of 0), the Z + bit MAY be set to 1. + + + + +Lazzaro & Wawrzynek Standards Track [Page 67] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Note that C-active semantics appear in the preceding paragraphs + because [RP015] specifies that pending Parameter System transactions + are closed by a Control Change command for controller number 121 + (Reset All Controllers). + +A.4.1. Log Inclusion Rules + + Parameter logs code recovery information for a specific RPN or NRPN + parameter. 
+ + A parameter log MUST appear in the list if an active Control Change + command that forms a part of an initiated transaction for the + parameter appears in the checkpoint history. + + An exception to this rule applies if the checkpoint history only + contains transaction Control Change commands for controller numbers + 98-101 that act to terminate the transaction. In this case, a log + for the parameter MAY be omitted from the list. + + A log MAY appear in the list if an active Control Change command that + forms a part of an initiated transaction for the parameter appears in + the session history. Otherwise, a log for the parameter MUST NOT + appear in the list. + + Multiple logs for the same RPN or NRPN parameter MUST NOT appear in + the log list. + + The parameter log list MUST obey the oldest-first ordering rule + (defined in Appendix A.1), with the phrase "parameter transaction" + replacing the word "command" in the rule definition. + + Parameter logs associated with the RPN or NRPN null parameter (LSB = + 0x7F, MSB = 0x7F) MUST NOT appear in the log list. Chapter M uses + the E header bit (Figure A.4.1) and the log list ordering rules to + code null parameter semantics. + + Note that "active" semantics (rather than "C-active" semantics) + appear in the preceding paragraphs because [RP015] specifies that + pending Parameter System transactions are not reset by a Control + Change command for controller number 121 (Reset All Controllers). + However, the rule that follows uses C-active semantics because it + concerns the protection of the transaction system itself, and [RP015] + specifies that Reset All Controllers acts to close a transaction in + progress. + + In most cases, parameter logs for RPN and NRPN parameters that are + assigned to the ch_never parameter (Appendix C.2.3) MAY be omitted + from the list. 
An exception applies if + + +Lazzaro & Wawrzynek Standards Track [Page 68] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + o the log codes the most recent initiated transaction in the session + history, and + + o a C-active command that forms a part of the transaction appears in + the checkpoint history, and + + o the E header bit for the top-level Chapter M header (Figure A.4.1) + is set to 1. + + In this case, a log for the parameter MUST appear in the list. This + log informs receivers recovering from a loss that a transaction is in + progress so that the receiver is able to correctly interpret RPN or + NRPN Control Change commands that follow the loss event. + +A.4.2. Log Coding Rules + + Figure A.4.2 shows the parameter log structure of Chapter M. + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |S| PNUM-LSB |Q| PNUM-MSB |J|K|L|M|N|T|V|R| Fields ... | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure A.4.2 -- Parameter Log Format + + The log begins with a header, whose default size (as shown in Figure + A.4.2) is 3 octets. If the Q header bit is set to 0, the log encodes + an RPN parameter. If Q = 1, the log encodes an NRPN parameter. The + 7-bit PNUM-MSB and PNUM-LSB fields code the parameter number and + reflect the Control Change command data values for controllers 99 and + 98 (for NRPN parameters) or 101 and 100 (for RPN parameters). + + The J, K, L, M, and N header bits form a Table of Contents (TOC) for + the log and signal the presence of fixed-sized fields that follow the + header. A header bit that is set to 1 codes the presence of a field + in the log. The ordering of fields in the log follows the ordering + of the header bits in the TOC. Appendices A.4.2.1 and A.4.2.2 define + the fields associated with each TOC header bit. 
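To make the header and TOC layout concrete, the following Python sketch unpacks the default 3-octet parameter log header and uses the TOC bits to compute the total log size. The function and key names are hypothetical. The field sizes are taken from the figures in Appendices A.4.2.1 and A.4.2.2 (one octet each for J, K, and N; two octets each for L and M). The sketch does not handle the 2-octet modified header that applies when the top-level Chapter M header codes Z = 1 with U = 1 or W = 1.

```python
# Size in octets of the field signaled by each TOC bit, in TOC order.
FIELD_SIZES = (("J", 1), ("K", 1), ("L", 2), ("M", 2), ("N", 1))


def parse_parameter_log_header(data):
    """Unpack a default (3-octet) parameter log header.

    Returns (header, log_size), where log_size counts the header plus
    all fields signaled by the TOC. Names are illustrative only.
    """
    header = {
        "S": (data[0] >> 7) & 1,
        "PNUM_LSB": data[0] & 0x7F,
        "Q": (data[1] >> 7) & 1,  # 0 = RPN, 1 = NRPN
        "PNUM_MSB": data[1] & 0x7F,
    }
    # Third octet: J K L M N T V R, most significant bit first.
    for position, name in enumerate("JKLMNTVR"):
        header[name] = (data[2] >> (7 - position)) & 1
    log_size = 3 + sum(size for name, size in FIELD_SIZES if header[name])
    return header, log_size
```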
+ + The T and V header bits code information about the parameter log but + are not part of the TOC. A set T or V bit does not signal the + presence of any parameter log field. + + If the rules in Appendix A.4.1 state that a log for a given parameter + MUST appear in Chapter M, the log MUST code sufficient information to + protect the parameter from the loss of active parameter transaction + Control Change commands in the checkpoint history. + + + +Lazzaro & Wawrzynek Standards Track [Page 69] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + This rule does not apply if the parameter coded by the log is + assigned to the ch_never parameter (Appendix C.2.3). In this case, + senders MAY choose to set the J, K, L, M, and N TOC bits to 0, coding + a parameter log with no fields. + + Note that logs to protect parameters that are assigned to ch_never + are REQUIRED under certain conditions (see Appendix A.4.1). The + purpose of the log is to inform receivers recovering from a loss that + a transaction is in progress so that the receiver is able to + correctly interpret RPN or NRPN Control Change commands that follow + the loss event. + + Parameter logs provide two tools for parameter protection: the value + tool and the count tool. Depending on the semantics of the + parameter, senders may use either tool, both tools, or neither tool + to protect a given parameter. + + The value tool codes information a receiver may use to determine the + current value of an RPN or NRPN parameter. If a parameter log uses + the value tool, the V header bit MUST be set to 1, and the semantics + defined in Appendix A.4.2.1 for setting the J, K, L, and M TOC bits + MUST be followed. If a parameter log does not use the value tool, + the V bit MUST be set to 0, and the J, K, L, and M TOC bits MUST also + be set to 0. + + The count tool codes the number of transactions for an RPN or NRPN + parameter. 
If a parameter log uses the count tool, the T header bit + MUST be set to 1, and the semantics defined in Appendix A.4.2.2 for + setting the N TOC bit MUST be followed. If a parameter log does not + use the count tool, the T bit and the N TOC bit MUST be set to 0. + + Note that V and T are set if the sender uses value (V) or count (T) + tool for the log on an ongoing basis. Thus, V may be set even if J = + K = L = M = 0, and T may be set even if N = 0. + + In many cases, all parameters coded in the log list are of one type + (RPN parameters or NRPN parameters), and all parameter numbers lie in + the range 0-127. As described in Appendix A.4, senders MAY signal + this condition by setting the top-level Chapter M header bit Z to 1 + (to code the restricted range) and by setting the U or W bit to 1 (to + code the parameter type). + + If the top-level Chapter M header codes Z = 1 and either U = 1 or W = + 1, all logs in the parameter log list MUST use a modified header + format. This modification deletes bits 8-15 of the bitfield shown in + Figure A.4.2 to yield a 2-octet header. The values of the deleted + PNUM-MSB and Q fields may be inferred from the U, W, and Z bit + values. + + + +Lazzaro & Wawrzynek Standards Track [Page 70] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + +A.4.2.1. The Value Tool + + The value tool uses several fields to track the value of an RPN or + NRPN parameter. + + The J TOC bit codes the presence of the octet shown in Figure A.4.3 + in the field list. + + 0 + 0 1 2 3 4 5 6 7 + +-+-+-+-+-+-+-+-+ + |X| ENTRY-MSB | + +-+-+-+-+-+-+-+-+ + + Figure A.4.3 -- ENTRY-MSB Field + + The 7-bit ENTRY-MSB field codes the data value of the most recent + active Control Change command for controller number 6 (Data Entry + MSB) in the session history that appears in a transaction for the log + parameter. 
+ + The X bit MUST be set to 1 if the command coded by ENTRY-MSB precedes + the most recent Control Change command for controller 121 (Reset All + Controllers) in the session history. Otherwise, the X bit MUST be + set to 0. + + A parameter log that uses the value tool MUST include the ENTRY-MSB + field if an active Control Change command for controller number 6 + appears in the checkpoint history. + + Note that [RP015] specifies that Control Change commands for + controller 121 (Reset All Controllers) do not reset RPN and NRPN + values, and thus the X bit would not play a recovery role for MIDI + systems that comply with [RP015]. + + However, certain renderers (such as DLS 2 [DLS2]) specify that + certain RPN values are reset for some uses of Reset All Controllers. + The X bit (and other bitfield features of this nature in this + appendix) plays a role in recovery for renderers of this type. + + The K TOC bit codes the presence of the octet shown in Figure A.4.4 + in the field list. + + + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 71] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + 0 + 0 1 2 3 4 5 6 7 + +-+-+-+-+-+-+-+-+ + |X| ENTRY-LSB | + +-+-+-+-+-+-+-+-+ + + Figure A.4.4 -- ENTRY-LSB Field + + The 7-bit ENTRY-LSB field codes the data value of the most recent + active Control Change command for controller number 38 (Data Entry + LSB) in the session history that appears in a transaction for the log + parameter. + + The X bit MUST be set to 1 if the command coded by ENTRY-LSB precedes + the most recent Control Change command for controller 121 (Reset All + Controllers) in the session history. Otherwise, the X bit MUST be + set to 0. + + As a rule, a parameter log that uses the value tool MUST include the + ENTRY-LSB field if an active Control Change command for controller + number 38 appears in the checkpoint history. 
However, the ENTRY-LSB + field MUST NOT appear in a parameter log if the Control Change + command associated with the ENTRY-LSB precedes a Control Change + command for controller number 6 (Data Entry MSB) that appears in a + transaction for the log parameter in the session history. + + The L TOC bit codes the presence of the octets shown in Figure A.4.5 + in the field list. + + 0 1 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |G|X| A-BUTTON | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure A.4.5 -- A-BUTTON Field + + The 14-bit A-BUTTON field codes a count of the number of active + Control Change commands for controller numbers 96 and 97 (Data + Increment and Data Decrement) in the session history that appear in a + transaction for the log parameter. + + The M TOC bit codes the presence of the octets shown in Figure A.4.6 + in the field list. + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 72] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + 0 1 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |G|R| C-BUTTON | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure A.4.6 -- C-BUTTON Field + + The 14-bit C-BUTTON field has semantics identical to A-BUTTON, except + that Data Increment and Data Decrement Control Change commands that + precede the most recent Control Change command for controller 121 + (Reset All Controllers) in the session history are not counted. + + For both A-BUTTON and C-BUTTON, Data Increment and Data Decrement + Control Change commands are not counted if they precede Control + Change commands for controller numbers 6 (Data Entry MSB) or 38 + (Data Entry LSB) that appear in a transaction for the log parameter + in the session history. + + The A-BUTTON and C-BUTTON fields are interpreted as unsigned + integers, and the G bit associated with the field codes the sign of + the integer (G = 0 for positive or zero, G = 1 for negative). 
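The sign-and-magnitude layout shared by Figures A.4.5 and A.4.6 can be decoded as in the following Python sketch. The function name is hypothetical, and the sketch assumes that bit 0 in the figures is the most significant bit of the first octet. The second header bit is returned unchanged: it is the X bit for A-BUTTON and the R bit for C-BUTTON.

```python
def decode_button_field(octets):
    """Decode a 2-octet A-BUTTON or C-BUTTON field.

    Returns (count, flag), where count is a signed integer built from
    the G bit and the 14-bit magnitude, and flag is the second header
    bit (X for A-BUTTON, R for C-BUTTON). Names are illustrative only.
    """
    word = (octets[0] << 8) | octets[1]
    g = (word >> 15) & 1        # sign: 0 = positive or zero, 1 = negative
    flag = (word >> 14) & 1     # X or R, depending on the field
    magnitude = word & 0x3FFF   # 14-bit unsigned magnitude
    count = -magnitude if g else magnitude
    return count, flag
```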
+ + To compute and code the count value, initialize the count value to 0, + add 1 for each qualifying Data Increment command, and subtract 1 for + each qualifying Data Decrement command. After each addition or + subtraction, limit the count magnitude to 16383. The G bit codes the + sign of the count, and the A-BUTTON or C-BUTTON field codes the count + magnitude. + + For the A-BUTTON field, if the most recent qualified Data Increment + or Data Decrement command precedes the most recent Control Change + command for controller 121 (Reset All Controllers) in the session + history, the X bit associated with A-BUTTON field MUST be set to 1. + Otherwise, the X bit MUST be set to 0. + + A parameter log that uses the value tool MUST include the A-BUTTON + and C-BUTTON fields if an active Control Change command for + controller numbers 96 or 97 appears in the checkpoint history. + However, to improve coding efficiency, this rule has several + exceptions: + + o If the log includes the A-BUTTON field, and if the X bit of the A- + BUTTON field is set to 1, the C-BUTTON field (and its associated R + and G bits) MAY be omitted from the log. + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 73] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + o If the log includes the A-BUTTON field, and if the A-BUTTON and C- + BUTTON fields (and their associated G bits) code identical values, + the C-BUTTON field (and its associated R and G bits) MAY be + omitted from the log. + +A.4.2.2. The Count Tool + + The count tool tracks the number of transactions for an RPN or NRPN + parameter. The N TOC bit codes the presence of the octet shown in + Figure A.4.7 in the field list. + + 0 + 0 1 2 3 4 5 6 7 + +-+-+-+-+-+-+-+-+ + |X| COUNT | + +-+-+-+-+-+-+-+-+ + + Figure A.4.7 -- COUNT Field + + The 7-bit COUNT codes the number of initiated transactions for the + log parameter that appear in the session history. 
Initiated + transactions are counted if they contain one or more active Control + Change commands, including commands for controllers 98-101 that + initiate the parameter transaction. + + If the most recent counted transaction precedes the most recent + Control Change command for controller 121 (Reset All Controllers) in + the session history, the X bit associated with the COUNT field MUST + be set to 1. Otherwise, the X bit MUST be set to 0. + + Transaction counting is performed modulo 128. The transaction count + is set to 0 at the start of a session and is reset to 0 whenever a + Reset State command (Appendix A.1) appears in the session history. + + A parameter log that uses the count tool MUST include the COUNT field + if an active command that increments the transaction count (modulo + 128) appears in the checkpoint history. + +A.5. Chapter W: MIDI Pitch Wheel + + A channel journal MUST contain Chapter W if a C-active MIDI Pitch + Wheel (0xE) command appears in the checkpoint history. Figure A.5.1 + shows the format for Chapter W. + + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 74] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + 0 1 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |S| FIRST |R| SECOND | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure A.5.1 -- Chapter W Format + + The chapter has a fixed size of 16 bits. The FIRST and SECOND fields + are the 7-bit values of the first and second data octets of the most + recent active Pitch Wheel command in the session history. + + Note that Chapter W encodes C-active commands and thus does not + encode active commands that are not C-active (see the second-to-last + paragraph of Appendix A.1 for an explanation of chapter inclusion + text in this regard). + + Chapter W does not encode "active but not C-active" commands because + [RP015] declares that Control Change commands for controller number + 121 (Reset All Controllers) act to reset the Pitch Wheel value to 0. 
+   If Chapter W encoded "active but not C-active" commands, a repair
+   operation following a Reset All Controllers command could incorrectly
+   repair the stream with a stale Pitch Wheel value.
+
+A.6.  Chapter N: MIDI NoteOff and NoteOn
+
+   In this appendix, we consider NoteOn commands with zero velocity to
+   be NoteOff commands.  Readers may wish to review the Appendix A.1
+   definition of "N-active commands" before reading this appendix.
+
+   Chapter N completely protects note commands in streams that alternate
+   between NoteOn and NoteOff commands for a particular note number.
+   However, in rare applications, multiple overlapping NoteOn commands
+   may appear for a note number.  Chapter E, described in Appendix A.7,
+   augments Chapter N to completely protect these streams.
+
+   A channel journal MUST contain Chapter N if an N-active MIDI NoteOn
+   (0x9) or NoteOff (0x8) command appears in the checkpoint history.
+   Figure A.6.1 shows the format for Chapter N.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |B|     LEN     |  LOW  | HIGH  |S|   NOTENUM   |Y|  VELOCITY   |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |S|   NOTENUM   |Y|  VELOCITY   |             ....              |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |    OFFBITS    |    OFFBITS    |     ....      |    OFFBITS    |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+                  Figure A.6.1 -- Chapter N Format
+
+   Chapter N consists of a 2-octet header followed by at least one of
+   the following data structures:
+
+      o A list of note logs to code NoteOn commands.
+      o A NoteOff bitfield structure to code NoteOff commands.
+
+   We define the header bitfield semantics in Appendix A.6.1.  We define
+   the note log semantics and the NoteOff bitfield semantics in Appendix
+   A.6.2.
+ + If one or more N-active NoteOn or NoteOff commands in the checkpoint + history reference a note number, the note number MUST be coded in + either the note log list or the NoteOff bitfield structure. + + The note log list MUST contain an entry for all note numbers whose + most recent checkpoint history appearance is in an N-active NoteOn + command. The NoteOff bitfield structure MUST contain a set bit for + all note numbers whose most recent checkpoint history appearance is + in an N-active NoteOff command. + + A note number MUST NOT be coded in both structures. + + All note logs and NoteOff bitfield set bits MUST code the most recent + N-active NoteOn or NoteOff reference to a note number in the session + history. + + The note log list MUST obey the oldest-first ordering rule (defined + in Appendix A.1). + + + + + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 76] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + +A.6.1. Header Structure + + The header for Chapter N, shown in Figure A.6.2, codes the size of + the note list and bitfield structures. + + 0 1 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |B| LEN | LOW | HIGH | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure A.6.2 -- Chapter N Header + + The LEN field, a 7-bit integer value, codes the number of 2-octet + note logs in the note list. Zero is a valid value for LEN and codes + an empty note list. + + The 4-bit LOW and HIGH fields code the number of OFFBITS octets that + follow the note log list. LOW and HIGH are unsigned integer values. + If LOW <= HIGH, there are (HIGH - LOW + 1) OFFBITS octets in the + chapter. The value pairs (LOW = 15, HIGH = 0) and (LOW = 15, HIGH = + 1) code an empty NoteOff bitfield structure (i.e., no OFFBITS + octets). Other (LOW > HIGH) value pairs MUST NOT appear in the + header. + + The B bit provides S-bit functionality (Appendix A.1) for the NoteOff + bitfield structure. By default, the B bit MUST be set to 1. 
+ However, if the MIDI command section of the previous packet (packet I + - 1, with I as defined in Appendix A.1) includes a NoteOff command + for the channel, the B bit MUST be set to 0. If the B bit is set to + 0, the higher-level recovery journal elements that contain Chapter N + MUST have S bits that are set to 0, including the top-level journal + header. + + The LEN value of 127 codes a note list length of 127 or 128 note + logs, depending on the values of LOW and HIGH. If LEN = 127, LOW = + 15, and HIGH = 0, the note list holds 128 note logs, and the NoteOff + bitfield structure is empty. For other values of LOW and HIGH, LEN = + 127 codes that the note list contains 127 note logs. In this case, + the chapter has (HIGH - LOW + 1) NoteOff OFFBITS octets if LOW <= + HIGH and has no OFFBITS octets if LOW = 15 and HIGH = 1. + + + + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 77] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + +A.6.2. Note Structures + + Figure A.6.3 shows the 2-octet note log structure. + + 0 1 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |S| NOTENUM |Y| VELOCITY | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure A.6.3 -- Chapter N Note Log + + The 7-bit NOTENUM field codes the note number for the log. A note + number MUST NOT be represented by multiple note logs in the note + list. + + The 7-bit VELOCITY field codes the velocity value for the most recent + N-active NoteOn command for the note number in the session history. + Multiple overlapping NoteOns for a given note number may be coded + using Chapter E, as discussed in Appendix A.7. + + VELOCITY is never zero; NoteOn commands with zero velocity are coded + as NoteOff commands in the NoteOff bitfield structure. + + The note log does not code the execution time of the NoteOn command. + However, the Y bit codes a hint from the sender about the NoteOn + execution time. 
The Y bit codes a recommendation to play (Y = 1) or
+   skip (Y = 0) the NoteOn command recovered from the note log.  See
+   Section 4.2 of [RFC4696] for non-normative guidance on the use of the
+   Y bit.
+
+   Figure A.6.1 shows the NoteOff bitfield structure as the list of
+   OFFBITS octets at the end of the chapter.  A NoteOff OFFBITS octet
+   codes NoteOff information for eight consecutive MIDI note numbers,
+   with the most significant bit representing the lowest note number.
+   The most significant bit of the first OFFBITS octet codes the note
+   number 8*LOW; the most significant bit of the last OFFBITS octet
+   codes the note number 8*HIGH.
+
+   A set bit codes a NoteOff command for the note number.  In the most
+   efficient coding for the NoteOff bitfield structure, the first and
+   last octets of the structure contain at least one set bit.  Note that
+   Chapter N does not code NoteOff velocity data.
+
+   Note that in the general case, the recovery journal does not code the
+   relative placement of a NoteOff command and a Control Change command
+   for controller 64 (Damper Pedal (Sustain)).  In many cases, a
+   receiver processing a loss event may deduce this relative placement
+   from the history of the stream and thus determine if a NoteOff note
+   is sustained by the pedal.  If such a determination is not possible,
+   receivers SHOULD err on the side of silencing pedal sustains, as
+   erroneously sustained notes may produce unpleasant (albeit transient)
+   artifacts.
+
+A.7.  Chapter E: MIDI Note Command Extras
+
+   Readers may wish to review the Appendix A.1 definition of "N-active
+   commands" before reading this appendix.  In this appendix, a NoteOn
+   command with a velocity of 0 is considered to be a NoteOff command
+   with a release velocity value of 64.
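Before moving further into Chapter E, it may help to see the Chapter N rules of Appendix A.6 gathered in one place. The Python sketch below parses a Chapter N bitfield under the LEN, LOW, and HIGH rules of Appendix A.6.1 and the note log and OFFBITS rules of Appendix A.6.2. It is an illustrative reading of the text, not a reference implementation; the function name `parse_chapter_n` and the dictionary keys are assumptions made for this example.

```python
def parse_chapter_n(data):
    """Parse one Chapter N from the start of `data` (a bytes-like object)."""
    if len(data) < 2:
        raise ValueError("the Chapter N header is two octets long")
    b_bit = data[0] >> 7
    length = data[0] & 0x7F
    low = data[1] >> 4
    high = data[1] & 0x0F

    # Number of OFFBITS octets that follow the note log list.
    if low <= high:
        n_offbits = high - low + 1
    elif (low, high) in ((15, 0), (15, 1)):
        n_offbits = 0                      # empty NoteOff bitfield structure
    else:
        raise ValueError("this LOW > HIGH pair MUST NOT appear")

    # LEN = 127 with LOW = 15 and HIGH = 0 codes 128 note logs.
    n_logs = 128 if (length == 127 and (low, high) == (15, 0)) else length

    size = 2 + 2 * n_logs + n_offbits
    if len(data) < size:
        raise ValueError("truncated Chapter N")

    note_ons = []
    for i in range(n_logs):
        first, second = data[2 + 2 * i], data[3 + 2 * i]
        note_ons.append({
            "s": first >> 7,
            "notenum": first & 0x7F,
            "y": second >> 7,              # sender hint: 1 = play, 0 = skip
            "velocity": second & 0x7F,
        })

    note_offs = []
    base = 2 + 2 * n_logs
    for i in range(n_offbits):
        octet = data[base + i]
        for bit in range(8):
            # The most significant bit codes the lowest note number.
            if octet & (0x80 >> bit):
                note_offs.append(8 * (low + i) + bit)

    return {"b": b_bit, "note_ons": note_ons,
            "note_offs": note_offs, "size": size}
```

For example, the five octets `81 77 3C E4 02` decode to one note log (note 60, velocity 100, Y = 1) and one OFFBITS octet covering notes 56 to 63, in which the set bit marks a NoteOff for note 62.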
+
+   Chapter E encodes recovery information about MIDI NoteOn (0x9) and
+   NoteOff (0x8) command features that rarely appear in MIDI streams.
+   Receivers use Chapter E to reduce transient artifacts for streams
+   where several NoteOn commands appear for a note number without an
+   intervening NoteOff.  Receivers also use Chapter E to reduce
+   transient artifacts for streams that use NoteOff release velocity.
+   Chapter E supplements the note information coded in Chapter N
+   (Appendix A.6).
+
+   Figure A.7.1 shows the format for Chapter E.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |S|     LEN     |S|   NOTENUM   |V|  COUNT/VEL  |S|   NOTENUM   |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |V|  COUNT/VEL  |  ....                                         |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+                  Figure A.7.1 -- Chapter E Format
+
+   The chapter consists of a 1-octet header followed by a
+   variable-length list of 2-octet note logs.  Appendix A.7.1 defines
+   the bitfield format for a note log.
+
+   The log list MUST contain at least one note log.  The 7-bit LEN
+   header field codes the number of note logs in the list, minus one.  A
+   channel journal MUST contain Chapter E if the rules defined in this
+   appendix require that one or more note logs appear in the list.  The
+   note log list MUST obey the oldest-first ordering rule (defined in
+   Appendix A.1).
+
+A.7.1.  Note Log Format
+
+   Figure A.7.2 reproduces the note log structure of Chapter E.
+
+       0                   1
+       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |S|   NOTENUM   |V|  COUNT/VEL  |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+                 Figure A.7.2 -- Chapter E Note Log
+
+   A note log codes information about the MIDI note number coded by the
+   7-bit NOTENUM field.
The nature of the information depends on the + value of the V flag bit. + + If the V bit is set to 1, the COUNT/VEL field codes the release + velocity value for the most recent N-active NoteOff command for the + note number that appears in the session history. + + If the V bit is set to 0, the COUNT/VEL field codes a reference count + of the number of NoteOn and NoteOff commands for the note number that + appears in the session history. + + The reference count is set to 0 at the start of the session. NoteOn + commands increment the count by 1. NoteOff commands decrement the + count by 1. However, a decrement that generates a negative count + value is not performed. + + If the reference count is in the range 0-126, the 7-bit COUNT/VEL + field codes an unsigned integer representation of the count. If the + count is greater than or equal to 127, COUNT/VEL is set to 127. + + By default, the count is reset to 0 whenever a Reset State command + (Appendix A.1) appears in the session history and whenever MIDI + Control Change commands for controller numbers 123-127 (numbers with + All Notes Off semantics) or 120 (All Sound Off) appear in the session + history. + +A.7.2. Log Inclusion Rules + + If the most recent N-active NoteOn or NoteOff command for a note + number in the checkpoint history is a NoteOff command with a release + velocity value other than 64, a note log whose V bit is set to 1 MUST + appear in Chapter E for the note number. + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 80] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + If the most recent N-active NoteOn or NoteOff command for a note + number in the checkpoint history is a NoteOff command, and if the + reference count for the note number is greater than 0, a note log + whose V bit is set to 0 MUST appear in Chapter E for the note number. 
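The reference count of Appendix A.7.1, on which these inclusion rules depend, can be tracked with a small amount of per-note state. The Python sketch below follows the default counting rules stated there. The class name `NoteRefCount` and its method names are hypothetical, one instance is assumed per note number on a channel, and the caller is assumed to have already mapped NoteOn commands with zero velocity to NoteOff.

```python
class NoteRefCount:
    """Default Chapter E reference count for one MIDI note number."""

    def __init__(self):
        self.count = 0                 # set to 0 at the start of the session

    def note_on(self):
        self.count += 1

    def note_off(self):
        # A decrement that would produce a negative count is not performed.
        if self.count > 0:
            self.count -= 1

    def control_change(self, controller):
        # Controllers 123-127 (All Notes Off semantics) and 120
        # (All Sound Off) reset the count; others leave it unchanged.
        if controller == 120 or 123 <= controller <= 127:
            self.count = 0

    def reset_state(self):
        # Call when a Reset State command (Appendix A.1) appears.
        self.count = 0

    def coded_value(self):
        # COUNT/VEL saturates at 127 for counts of 127 or more.
        return min(self.count, 127)
```

Keeping the internal count unbounded and saturating only in `coded_value` is a design choice of this sketch; the text specifies only the coded value.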
+ + If the most recent N-active NoteOn or NoteOff command for a note + number in the checkpoint history is a NoteOn command, and if the + reference count for the note number is greater than 1, a note log + whose V bit is set to 0 MUST appear in Chapter E for the note number. + + At most, two note logs MAY appear in Chapter E for a note number: one + log whose V bit is set to 0 and one log whose V bit is set to 1. + + Chapter E codes a maximum of 128 note logs. If the log inclusion + rules yield more than 128 REQUIRED logs, note logs whose V bit is set + to 1 MUST be dropped from Chapter E in order to reach the 128-log + limit. Note logs whose V bit is set to 0 MUST NOT be dropped. + + Most MIDI streams do not use NoteOn and NoteOff commands in ways that + would trigger the log inclusion rules. For these streams, Chapter E + would never be REQUIRED to appear in a channel journal. + + The ch_never parameter (Appendix C.2.3) may be used to configure the + log inclusion rules for Chapter E. + +A.8. Chapter T: MIDI Channel Aftertouch + + A channel journal MUST contain Chapter T if an N-active and C-active + MIDI Channel Aftertouch (0xD) command appears in the checkpoint + history. Figure A.8.1 shows the format for Chapter T. + + 0 + 0 1 2 3 4 5 6 7 + +-+-+-+-+-+-+-+-+ + |S| PRESSURE | + +-+-+-+-+-+-+-+-+ + + Figure A.8.1 -- Chapter T Format + + The chapter has a fixed size of 8 bits. The 7-bit PRESSURE field + holds the pressure value of the most recent N-active and C-active + Channel Aftertouch command in the session history. + + Chapter T only encodes commands that are C-active and N-active. We + define a C-active restriction because [RP015] declares that a Control + Change command for controller 121 (Reset All Controllers) acts to + reset the channel pressure to 0 (see the discussion at the end of + Appendix A.5 for a more complete rationale). 
+
+   We define an N-active restriction on the assumption that aftertouch
+   commands are linked to note activity, and thus Channel Aftertouch
+   commands that are not N-active are stale and should not be used to
+   repair a stream.
+
+A.9.  Chapter A: MIDI Poly Aftertouch
+
+   A channel journal MUST contain Chapter A if a C-active Poly
+   Aftertouch (0xA) command appears in the checkpoint history.  Figure
+   A.9.1 shows the format for Chapter A.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |S|     LEN     |S|   NOTENUM   |X|  PRESSURE   |S|   NOTENUM   |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |X|  PRESSURE   |  ....                                         |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+                  Figure A.9.1 -- Chapter A Format
+
+   The chapter consists of a 1-octet header followed by a
+   variable-length list of 2-octet note logs.  A note log MUST appear
+   for a note number if a C-active Poly Aftertouch command for the note
+   number appears in the checkpoint history.  A note number MUST NOT be
+   represented by multiple note logs in the note list.  The note log
+   list MUST obey the oldest-first ordering rule (defined in Appendix
+   A.1).
+
+   The 7-bit LEN field codes the number of note logs in the list, minus
+   one.  Figure A.9.2 reproduces the note log structure of Chapter A.
+
+       0                   1
+       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+      |S|   NOTENUM   |X|  PRESSURE   |
+      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+                 Figure A.9.2 -- Chapter A Note Log
+
+   The 7-bit PRESSURE field codes the pressure value of the most recent
+   C-active Poly Aftertouch command in the session history for the MIDI
+   note number coded in the 7-bit NOTENUM field.
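The Chapter A layout described above is simple enough to walk in a few lines. The following Python sketch reads the 1-octet header and the list of 2-octet note logs, and rejects a chapter in which a note number appears twice. The function name `parse_chapter_a` and the dictionary keys are illustrative assumptions; the X bit is returned as coded, without interpretation.

```python
def parse_chapter_a(data):
    """Parse one Chapter A; return (s_bit, note_logs, octets_consumed)."""
    if len(data) < 1:
        raise ValueError("the Chapter A header is one octet long")
    s_bit = data[0] >> 7
    n_logs = (data[0] & 0x7F) + 1      # LEN codes the list length minus one
    size = 1 + 2 * n_logs
    if len(data) < size:
        raise ValueError("truncated Chapter A")

    logs = []
    seen = set()
    for i in range(n_logs):
        first, second = data[1 + 2 * i], data[2 + 2 * i]
        notenum = first & 0x7F
        if notenum in seen:
            raise ValueError("note number coded by multiple note logs")
        seen.add(notenum)
        logs.append({
            "s": first >> 7,
            "notenum": notenum,
            "x": second >> 7,
            "pressure": second & 0x7F,
        })
    return s_bit, logs, size
```

Because LEN is stored minus one, a header octet of 0x01 announces two note logs, and the smallest legal chapter is three octets long.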
+ + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 82] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + As a rule, the X bit MUST be set to 0. However, the X bit MUST be + set to 1 if the command coded by the log appears before one of the + following commands in the session history: MIDI Control Change + numbers 123-127 (numbers with All Notes Off semantics) or 120 (All + Sound Off). + + We define C-active restrictions for Chapter A because [RP015] + declares that a Control Change command for controller 121 (Reset All + Controllers) acts to reset the polyphonic pressure to 0 (see the + discussion at the end of Appendix A.5 for a more complete rationale). + +Appendix B. The Recovery Journal System Chapters + +B.1. System Chapter D: Simple System Commands + + The system journal MUST contain Chapter D if an active MIDI Reset + (0xFF), MIDI Tune Request (0xF6), MIDI Song Select (0xF3), undefined + MIDI System Common (0xF4 and 0xF5), or undefined MIDI System Real- + Time (0xF9 and 0xFD) command appears in the checkpoint history. + + Figure B.1.1 shows the variable-length format for Chapter D. + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |S|B|G|H|J|K|Y|Z| Command logs ... | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure B.1.1 -- System Chapter D Format + + The chapter consists of a 1-octet header followed by one or more + command logs. Header flag bits indicate the presence of command logs + for the Reset (B = 1), Tune Request (G = 1), Song Select (H = 1), + undefined System Common 0xF4 (J = 1), undefined System Common 0xF5 (K + = 1), undefined System Real-Time 0xF9 (Y = 1), or undefined System + Real-Time 0xFD (Z = 1) commands. + + Command logs appear in a list following the header, in the order that + the flag bits appear in the header. 
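Since the command logs follow the header in flag-bit order, a parser can derive the log sequence directly from the header octet. The Python sketch below does this for the Chapter D header of Figure B.1.1. The table name `CHAPTER_D_FLAGS`, the function name `chapter_d_log_order`, and the log labels are hypothetical names chosen for this example.

```python
# Header bit masks in the order the flags appear (S is handled separately).
CHAPTER_D_FLAGS = (
    (0x40, "reset"),           # B
    (0x20, "tune_request"),    # G
    (0x10, "song_select"),     # H
    (0x08, "undefined_f4"),    # J
    (0x04, "undefined_f5"),    # K
    (0x02, "undefined_f9"),    # Y
    (0x01, "undefined_fd"),    # Z
)


def chapter_d_log_order(header):
    """Return (s_bit, names of the command logs in on-the-wire order)."""
    s_bit = (header >> 7) & 1
    present = [name for mask, name in CHAPTER_D_FLAGS if header & mask]
    if not present:
        # The chapter is a header followed by one or more command logs.
        raise ValueError("Chapter D header announces no command logs")
    return s_bit, present
```

A header octet of 0x50, for instance, announces a Reset log followed by a Song Select log.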
+ + + + + + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 83] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Figure B.1.2 shows the 1-octet command log format for the Reset and + Tune Request commands. + + 0 + 0 1 2 3 4 5 6 7 + +-+-+-+-+-+-+-+-+ + |S| COUNT | + +-+-+-+-+-+-+-+-+ + + Figure B.1.2 -- Command Log for Reset and Tune Request + + Chapter D MUST contain the Reset command log if an active Reset + command appears in the checkpoint history. The 7-bit COUNT field + codes the total number of Reset commands (modulo 128) present in the + session history. + + Chapter D MUST contain the Tune Request command log if an active Tune + Request command appears in the checkpoint history. The 7-bit COUNT + field codes the total number of Tune Request commands (modulo 128) + present in the session history. + + For these commands, the COUNT field acts as a reference count. See + the definition of "session history reference counts" in Appendix A.1 + for more information. + + Figure B.1.3 shows the 1-octet command log format for the Song Select + command. + + 0 + 0 1 2 3 4 5 6 7 + +-+-+-+-+-+-+-+-+ + |S| VALUE | + +-+-+-+-+-+-+-+-+ + + Figure B.1.3 -- Song Select Command Log Format + + Chapter D MUST contain the Song Select command log if an active Song + Select command appears in the checkpoint history. The 7-bit VALUE + field codes the song number of the most recent active Song Select + command in the session history. + +B.1.1. Undefined System Commands + + In this section, we define the Chapter D command logs for the + undefined system commands. [MIDI] reserves the undefined system + commands 0xF4, 0xF5, 0xF9, and 0xFD for future use. At the time of + this writing, any MIDI command stream that uses these commands is + + + + +Lazzaro & Wawrzynek Standards Track [Page 84] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + non-compliant with [MIDI]. 
However, future versions of [MIDI] may + define these commands, and a few products do use these commands in a + non-compliant manner. + + Figure B.1.4 shows the variable-length command log format for the + undefined System Common commands (0xF4 and 0xF5). + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |S|C|V|L|DSZ| LENGTH | COUNT | VALUE ... | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | LEGAL ... | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure B.1.4 -- Undefined System Common Command Log Format + + The command log codes a single command type (0xF4 or 0xF5, not both). + Chapter D MUST contain a command log if an active 0xF4 command + appears in the checkpoint history and MUST contain an independent + command log if an active 0xF5 command appears in the checkpoint + history. + + A Chapter D Undefined System Common command log consists of a two- + octet header followed by a variable number of data fields. Header + flag bits indicate the presence of the COUNT field (C = 1), the VALUE + field (V = 1), and the LEGAL field (L = 1). The 10-bit LENGTH field + codes the size of the command log and conforms to semantics described + in Appendix A.1. + + The 2-bit DSZ field codes the number of data octets in the command + instance that appears most recently in the session history. If DSZ = + 0-2, the command has 0-2 data octets. If DSZ = 3, the command has 3 + or more command data octets. + + We now define the default rules for the use of the COUNT, VALUE, and + LEGAL fields. The session configuration tools defined in Appendix + C.2.3 may be used to override this behavior. + + By default, if the DSZ field is set to 0, the command log MUST + include the COUNT field. The 8-bit COUNT field codes the total + number of commands of the type coded by the log (0xF4 or 0xF5) + present in the session history, modulo 256. 
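To make the two-octet header of Figure B.1.4 concrete, the Python sketch below splits it into its fields and checks the default DSZ = 0 rule stated above. The function names and dictionary keys are assumptions made for this example. LENGTH is returned as coded, since its interpretation follows the Appendix A.1 semantics, and the default rule may be overridden by the session configuration tools of Appendix C.2.3.

```python
def parse_syscommon_log_header(data):
    """Split the 2-octet Undefined System Common log header into fields."""
    if len(data) < 2:
        raise ValueError("the command log header is two octets long")
    first, second = data[0], data[1]
    return {
        "s": first >> 7,
        "c": (first >> 6) & 1,         # COUNT field present
        "v": (first >> 5) & 1,         # VALUE field present
        "l": (first >> 4) & 1,         # LEGAL field present
        "dsz": (first >> 2) & 0x3,     # 2-bit data size code
        "length": ((first & 0x3) << 8) | second,   # 10-bit LENGTH
    }


def violates_default_count_rule(fields):
    """True if DSZ = 0 but the COUNT field is absent (default rules only)."""
    return fields["dsz"] == 0 and fields["c"] == 0
```

The check covers only the default behavior; a session that reconfigures Chapter D logging would need a different test.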
+
+   By default, if the DSZ field is set to 1-3, the command log MUST
+   include the VALUE field.  The variable-length VALUE field codes a
+   verbatim copy of the data octets for the most recent use of the
+   command type coded by the log (0xF4 or 0xF5) in the session history.
+   The most significant bit of the final data octet MUST be set to 1,
+   and the most significant bit of all other data octets MUST be set to
+   0.
+
+   The LEGAL field is reserved for future use.  If an update to [MIDI]
+   defines the 0xF4 or 0xF5 command, an IETF Standards-Track document
+   may define the LEGAL field.  Until such a document appears, senders
+   MUST NOT use the LEGAL field, and receivers MUST use the LENGTH field
+   to skip over the LEGAL field.  The LEGAL field would be defined by
+   the IETF if the semantics of the new 0xF4 or 0xF5 command could not
+   be protected from packet loss via the use of the COUNT and VALUE
+   fields.
+
+   Figure B.1.5 shows the variable-length command log format for the
+   undefined System Real-Time commands (0xF9 and 0xFD).
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |S|C|L| LENGTH  |     COUNT     |  LEGAL ...                    |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+       Figure B.1.5 -- Undefined System Real-Time Command Log Format
+
+   The command log codes a single command type (0xF9 or 0xFD, not both).
+   Chapter D MUST contain a command log if an active 0xF9 command
+   appears in the checkpoint history and MUST contain an independent
+   command log if an active 0xFD command appears in the checkpoint
+   history.
+
+   A Chapter D Undefined System Real-Time command log consists of a
+   one-octet header followed by a variable number of data fields.
+   Header flag bits indicate the presence of the COUNT field (C = 1) and
+   the LEGAL field (L = 1).
The 5-bit LENGTH field codes the size of the + command log and conforms to semantics described in Appendix A.1. + + We now define the default rules for the use of the COUNT and LEGAL + fields. The session configuration tools defined in Appendix C.2.3 + may be used to override this behavior. + + The 8-bit COUNT field codes the total number of commands of the type + coded by the log present in the session history, modulo 256. By + default, the COUNT field MUST be present in the command log. + + The LEGAL field is reserved for future use. If an update to [MIDI] + defines the 0xF9 or 0xFD command, an IETF Standards-Track document + may define the LEGAL field to protect the command. Until such a + document appears, senders MUST NOT use the LEGAL field, and receivers + + + +Lazzaro & Wawrzynek Standards Track [Page 86] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + MUST use the LENGTH field to skip over the LEGAL field. The LEGAL + field would be defined by the IETF if the semantics of the new 0xF9 + or 0xFD command could not be protected from packet loss via the use + of the COUNT field. + + Finally, we note that some non-standard uses of the undefined System + Real-Time commands act to implement non-compliant variants of the + MIDI sequencer system. In Appendix B.3.1, we describe resiliency + tools for the MIDI sequencer system that provide some protection in + this case. + +B.2. System Chapter V: Active Sense Command + + The system journal MUST contain Chapter V if an active MIDI Active + Sense (0xFE) command appears in the checkpoint history. Figure B.2.1 + shows the format for Chapter V. + + 0 + 0 1 2 3 4 5 6 7 + +-+-+-+-+-+-+-+-+ + |S| COUNT | + +-+-+-+-+-+-+-+-+ + + Figure B.2.1 -- System Chapter V Format + + The 7-bit COUNT field codes the total number of Active Sense commands + (modulo 128) present in the session history. The COUNT field acts as + a reference count. 
See the definition of "session history reference + counts" in Appendix A.1 for more information. + +B.3. System Chapter Q: Sequencer State Commands + + This appendix describes Chapter Q, the system chapter for the MIDI + sequencer commands. + + The system journal MUST contain Chapter Q if an active MIDI Song + Position Pointer (0xF2), MIDI Clock (0xF8), MIDI Start (0xFA), MIDI + Continue (0xFB), or MIDI Stop (0xFC) command appears in the + checkpoint history and if the rules defined in this appendix require + a change in the Chapter Q bitfield contents because of the command + appearance. + + + + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 87] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Figure B.3.1 shows the variable-length format for Chapter Q. + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |S|N|D|C|T| TOP | CLOCK | TIMETOOLS ... | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | ... | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure B.3.1 -- System Chapter Q Format + + Chapter Q consists of a 1-octet header followed by several optional + fields, in the order shown in Figure B.3.1. + + Header flag bits signal the presence of the 16-bit CLOCK field (C = + 1) and the 24-bit TIMETOOLS field (T = 1). The 3-bit TOP header + field is interpreted as an unsigned integer, as are CLOCK and + TIMETOOLS. We describe the TIMETOOLS field in Appendix B.3.1. + + Chapter Q encodes the most recent state of the sequencer system. + Receivers use the chapter to resynchronize the sequencer after a + packet loss episode. Chapter fields encode the on/off state of the + sequencer, the current position in the song, and the downbeat. + + The N header bit encodes the relative occurrence of the Start, Stop, + and Continue commands in the session history. If an active Start or + Continue command appears most recently, the N bit MUST be set to 1. 
+ If an active Stop appears most recently, or if no active Start, Stop, + or Continue commands appear in the session history, the N bit MUST be + set to 0. + + The C header flag, the TOP header field, and the CLOCK field act to + code the current position in the sequence: + + o If C = 1, the 3-bit TOP header field and the 16-bit CLOCK field + are combined to form the 19-bit unsigned quantity 65536*TOP + + CLOCK. This value encodes the song position in units of MIDI + Clocks (24 clocks per quarter note), modulo 524288. Note that the + maximum song position value that may be coded by the Song Position + Pointer command is 98303 clocks (which may be coded with 17 bits) + and that MIDI-coded songs are generally constructed to avoid + durations longer than this value. However, the 19-bit size may be + useful for real-time applications, such as a drum machine MIDI + output that is sending clock commands for long periods of time. + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 88] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + o If C = 0, the song position is the start of the song. The C = 0 + position is identical to the position coded by C = 1, TOP = 0, and + CLOCK = 0, for the case where the song position is less than + 524288 MIDI clocks. In certain situations (defined later in this + section), normative text may require the C = 0 or the C = 1, TOP = + 0, CLOCK = 0 encoding of the start of the song. + + The C, TOP, and CLOCK fields MUST be set to code the current song + position, for both N = 0 and N = 1 conditions. If C = 0, the TOP + field MUST be set to 0. See [MIDI] for a precise definition of a + song position. + + The D header bit encodes information about the downbeat and acts to + qualify the song position coded by the C, TOP, and CLOCK fields. + + If the D bit is set to 1, the song position represents the most + recent position in the sequence that has played. 
If D = 1, the next + Clock command (if N = 1) or the next (Continue, Clock) pair (if N = + 0) acts to increment the song position by one clock and to play the + updated position. + + If the D bit is set to 0, the song position represents a position in + the sequence that has not yet been played. If D = 0, the next Clock + command (if N = 1) or the next (Continue, Clock) pair (if N = 0) acts + to play the point in the song coded by the song position. The song + position is not incremented. + + An example of a stream that uses D = 0 coding is one whose most + recent sequence command is a Start or Song Position Pointer command + (both N = 1 conditions). However, it is also possible to construct + examples where D = 0 and N = 0. A Start command immediately followed + by a Stop command is coded in Chapter Q by setting C = 0, D = 0, N = + 0, TOP = 0. + + If N = 1 (coding Start or Continue), D = 0 (coding that the downbeat + has yet to be played), and the song position is at the start of the + song, the C = 0 song position encoding MUST be used if a Start + command occurs more recently than a Continue command in the session + history, and the C = 1, TOP = 0, CLOCK = 0 song position encoding + MUST be used if a Continue command occurs more recently than a Start + command in the session history. + +B.3.1. Non-Compliant Sequencers + + The Chapter Q description in this appendix assumes that the sequencer + system counts off time with Clock commands, as mandated in [MIDI]. + However, a few non-compliant products do not use Clock commands to + count off time, but instead use non-standard methods. + + + +Lazzaro & Wawrzynek Standards Track [Page 89] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Chapter Q uses the TIMETOOLS field to provide resiliency support for + these non-standard products. By default, the TIMETOOLS field MUST + NOT appear in Chapter Q, and the T header bit MUST be set to 0. 
The + session configuration tools described in Appendix C.2.3 may be used + to select TIMETOOLS coding. + + Figure B.3.2 shows the format of the 24-bit TIMETOOLS field. + + 0 1 2 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | TIME | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure B.3.2 -- TIMETOOLS Format + + The TIME field is a 24-bit unsigned integer quantity, with units of + milliseconds. TIME codes an additive correction term for the song + position coded by the TOP, CLOCK, and C fields. TIME is coded in + network byte order (big-endian). + + A receiver computes the correct song position by converting TIME into + units of MIDI clocks and adding it to 65536*TOP + CLOCK (assuming C = + 1). Alternatively, a receiver may convert 65536*TOP + CLOCK into + milliseconds (assuming C = 1) and add it to TIME. The downbeat (D + header bit) semantics defined in Appendix B.3 apply to the corrected + song position. + +B.4. System Chapter F: MIDI Time Code Tape Position + + This appendix describes Chapter F, the system chapter for the MIDI + Time Code (MTC) commands. Readers may wish to review the Appendix + A.1 definition of "finished/unfinished commands" before reading this + appendix. + + The system journal MUST contain Chapter F if an active System Common + Quarter Frame command (0xF1) or an active finished System Exclusive + (Universal Real Time) MTC Full Frame command (F0 7F cc 01 01 hr mn sc + fr F7) appears in the checkpoint history. Otherwise, the system + journal MUST NOT contain Chapter F. + + + + + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 90] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Figure B.4.1 shows the variable-length format for Chapter F. + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |S|C|P|Q|D|POINT| COMPLETE ... 
| + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | ... | PARTIAL ... | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | ... | + +-+-+-+-+-+-+-+-+ + + Figure B.4.1 -- System Chapter F Format + + Chapter F holds information about recent MTC tape positions coded in + the session history. Receivers use Chapter F to resynchronize the + MTC system after a packet loss episode. + + Chapter F consists of a 1-octet header followed by several optional + fields, in the order shown in Figure B.4.1. The C and P header bits + form a Table of Contents (TOC) and signal the presence of the 32-bit + COMPLETE field (C = 1) and the 32-bit PARTIAL field (P = 1). + + The Q header bit codes information about the COMPLETE field format. + If Chapter F does not contain a COMPLETE field, Q MUST be set to 0. + + The D header bit codes the tape movement direction. If the tape is + moving forward, or if the tape direction is indeterminate, the D bit + MUST be set to 0. If the tape is moving in the reverse direction, + the D bit MUST be set to 1. In most cases, the ordering of commands + in the session history clearly defines the tape direction. However, + a few command sequences have an indeterminate direction (such as a + session history consisting of one Full Frame command). + + The 3-bit POINT header field is interpreted as an unsigned integer. + Appendix B.4.1 defines how the POINT field codes information about + the contents of the PARTIAL field. If Chapter F does not contain a + PARTIAL field, POINT MUST be set to 7 (if D = 0) or 0 (if D = 1). + + Chapter F MUST include the COMPLETE field if an active finished Full + Frame command appears in the checkpoint history or if an active + Quarter Frame command that completes the encoding of a frame value + appears in the checkpoint history. + + The COMPLETE field encodes the most recent active complete MTC frame + value that appears in the session history. 
This frame value may take + the form of a series of 8 active Quarter Frame commands (0xF1 0x0n + + + + +Lazzaro & Wawrzynek Standards Track [Page 91] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + through 0xF1 0x7n for forward tape movement, 0xF1 0x7n through 0xF1 + 0x0n for reverse tape movement) or may take the form of an active + finished Full Frame command. + + If the COMPLETE field encodes a Quarter Frame command series, the Q + header bit MUST be set to 1, and the COMPLETE field MUST have the + format shown in Figure B.4.2. The 4-bit fields MT0 through MT7 code + the data (lower) nibble for the Quarter Frame commands for Message + Type 0 through Message Type 7 [MIDI]. These nibbles encode a + complete frame value, in addition to fields reserved for future use + by [MIDI]. + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | MT0 | MT1 | MT2 | MT3 | MT4 | MT5 | MT6 | MT7 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure B.4.2 -- COMPLETE Field Format, Q = 1 + + In this usage, the frame value encoded in the COMPLETE field MUST be + offset by 2 frames (relative to the frame value encoded in the + Quarter Frame commands) if the frame value codes a 0xF1 0x0n through + 0xF1 0x7n command sequence. This offset compensates for the two- + frame latency of the Quarter Frame encoding for forward tape + movement. No offset is applied if the frame value codes a 0xF1 0x7n + through 0xF1 0x0n Quarter Frame command sequence. + + The most recent active complete MTC frame value may alternatively be + encoded by an active finished Full Frame command. In this case, the + Q header bit MUST be set to 0, and the COMPLETE field MUST have the + format shown in Figure B.4.3. The HR, MN, SC, and FR fields + correspond to the hr, mn, sc, and fr data octets of the Full Frame + command. 
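The two COMPLETE layouts (Figure B.4.2 for Q = 1, Figure B.4.3 for Q = 0) can be illustrated with a minimal Python sketch. The function names are hypothetical and are not defined by this memo. The sketch assumes the caller has already applied the 2-frame offset required for forward-moving Quarter Frame sequences, and that bit 0 in the figures is the most significant bit of the first octet.

```python
def pack_complete_q1(nibbles):
    """Pack MT0..MT7 (4-bit Quarter Frame data nibbles) into COMPLETE, Q = 1."""
    if len(nibbles) != 8 or any(not 0 <= n <= 0xF for n in nibbles):
        raise ValueError("expected eight 4-bit values (MT0..MT7)")
    value = 0
    for n in nibbles:
        # MT0 lands in the most significant nibble, MT7 in the least.
        value = (value << 4) | n
    return value.to_bytes(4, "big")


def pack_complete_q0(hr, mn, sc, fr):
    """Pack the Full Frame data octets hr, mn, sc, fr into COMPLETE, Q = 0."""
    for octet in (hr, mn, sc, fr):
        # MIDI data octets always have a clear most significant bit.
        if not 0 <= octet <= 0x7F:
            raise ValueError("Full Frame data octets must be in 0..127")
    return bytes([hr, mn, sc, fr])
```

A sender would choose between the two helpers according to which kind of command most recently completed a frame value, and set the Q header bit to match.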
+ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | HR | MN | SC | FR | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure B.4.3 -- COMPLETE Field Format, Q = 0 + +B.4.1. Partial Frames + + The most recent active session history command that encodes MTC frame + value data may be a Quarter Frame command other than a forward-moving + 0xF1 0x7n command (which completes a frame value for forward tape + movement) or a reverse-moving 0xF1 0x0n command (which completes a + frame value for reverse tape movement). + + We consider this type of Quarter Frame command to be associated with + a partial frame value. The Quarter Frame sequence that defines a + partial frame value MUST either start at Message Type 0 and increment + contiguously to an intermediate Message Type less than 7 or start at + Message Type 7 and decrement contiguously to an intermediate Message + Type greater than 0. A Quarter Frame command sequence that does not + follow this pattern is not associated with a partial frame value. + + Chapter F MUST include a PARTIAL field if the most recent active + command in the checkpoint history that encodes MTC frame value data + is a Quarter Frame command that is associated with a partial frame + value. Otherwise, Chapter F MUST NOT include a PARTIAL field. + + The partial frame value consists of the data (lower) nibbles of the + Quarter Frame command sequence. The PARTIAL field codes the partial + frame value, using the format shown in Figure B.4.2. Message Type + fields that are not associated with a Quarter Frame command MUST be + set to 0. + + The POINT header field identifies the Message Type fields in the + PARTIAL field that code valid data.
If P = 1, the POINT field MUST + encode the unsigned integer value formed by the lower 3 bits of the + upper nibble of the data value of the most recent active Quarter + Frame command in the session history. If D = 0 and P = 1, POINT MUST + take on a value in the range 0-6. If D = 1 and P = 1, POINT MUST + take on a value in the range 1-7. + + If D = 0, MT fields (Figure B.4.2) in the inclusive range from 0 up + to and including the POINT value encode the partial frame value. If + D = 1, MT fields in the inclusive range from 7 down to and including + the POINT value encode the partial frame value. Note that, unlike + the COMPLETE field encoding, senders MUST NOT add a 2-frame offset to + the partial frame value encoded in PARTIAL. + + For the default semantics, if a recovery journal contains Chapter F + and if the session history codes a legal [MIDI] series of Quarter + Frame and Full Frame commands, the chapter always contains a COMPLETE + or a PARTIAL field (and may contain both fields). Thus, a one-octet + Chapter F (C = P = 0) always codes the presence of an illegal command + sequence in the session history (under some conditions, the C = 1, P + + + +Lazzaro & Wawrzynek Standards Track [Page 93] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + = 0 condition may also code the presence of an illegal command + sequence). The illegal command sequence conditions are transient in + nature and usually indicate that a Quarter Frame command sequence + began with an intermediate Message Type. + +B.5. System Chapter X: System Exclusive + + This appendix describes Chapter X, the system chapter for MIDI System + Exclusive (SysEx) commands (0xF0). Readers may wish to review the + Appendix A.1 definition of "finished/unfinished commands" before + reading this appendix. + + Chapter X consists of a list of one or more command logs. Each log + in the list codes information about a specific finished or unfinished + SysEx command that appears in the session history. 
The system + journal MUST contain Chapter X if the rules defined in Appendix B.5.2 + require that one or more logs appear in the list. + + The log list is not preceded by a header. Instead, each log + implicitly encodes its own length. Given the length of the N'th list + log, the presence of the (N+1)'th list log may be inferred from the + LENGTH field of the system journal header (Figure 10 in Section 5 of + the main text). The log list MUST obey the oldest-first ordering + rule (defined in Appendix A.1). + +B.5.1. Chapter Format + + Figure B.5.1 shows the bitfield format for the Chapter X command + logs. + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |S|T|C|F|D|L|STA| TCOUNT | COUNT | FIRST ... | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | DATA ... | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure B.5.1 -- Chapter X Command Log Format + + A Chapter X command log consists of a 1-octet header followed by the + optional TCOUNT, COUNT, FIRST, and DATA fields. + + The T, C, F, and D header bits act as a Table of Contents (TOC) for + the log. If T is set to 1, the 1-octet TCOUNT field appears in the + log. If C is set to 1, the 1-octet COUNT field appears in the log. + If F is set to 1, the variable-length FIRST field appears in the log. + If D is set to 1, the variable-length DATA field appears in the log. + + + +Lazzaro & Wawrzynek Standards Track [Page 94] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + The L header bit sets the coding tool for the log. We define the log + coding tools in Appendix B.5.2. + + The STA field codes the status of the command coded by the log. The + 2-bit STA value is interpreted as an unsigned integer. If STA is 0, + the log codes an unfinished command. Non-zero STA values code + different classes of finished commands. 
An STA value of 1 codes a + cancelled command, an STA value of 2 codes a command that uses the + "dropped 0xF7" construction, and an STA value of 3 codes all other + finished commands. Section 3.2 in the main text describes cancelled + and "dropped 0xF7" commands. + + The S bit (Appendix A.1) of the first log in the list acts as the S + bit for Chapter X. For the other logs in the list, the S bit refers + to the log itself. The value of the "phantom" S bit associated with + the first log is defined by the following rules: + + o If the list codes one log, the phantom S-bit value is the same as + the Chapter X S-bit value. + + o If the list codes multiple logs, the phantom S-bit value is the + logical OR of the S-bit value of the first and second command logs + in the list. + + In all other respects, the S bit follows the semantics defined in + Appendix A.1. + + The FIRST field (present if F = 1) encodes a variable-length unsigned + integer value that specifies which SysEx data octets are encoded in + the DATA field of the log. The FIRST field consists of an octet + whose most significant bit is set to 0, optionally preceded by one or + more octets whose most significant bit is set to 1. The algorithm + shown in Figure B.5.2 decodes this format into an unsigned integer to + yield the value dec(FIRST). FIRST uses a variable-length encoding + because dec(FIRST) references a data octet in a SysEx command, and a + SysEx command may contain an arbitrary number of data octets.
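The variable-length FIRST encoding described above can be decoded with a short loop. The following Python sketch is illustrative only: the function name decode_first is hypothetical, and the four-octet upper limit is an assumption taken from the longest form listed in Figure B.5.2.

```python
def decode_first(buf, offset=0):
    """Return (dec(FIRST), octets consumed) for a FIRST field at buf[offset]."""
    value = 0
    # Assumption: at most four octets, the longest form in Figure B.5.2.
    for count, octet in enumerate(buf[offset:offset + 4], start=1):
        # Each octet contributes its lower 7 bits, most significant group first.
        value = (value << 7) | (octet & 0x7F)
        if octet & 0x80 == 0:
            # A clear most significant bit marks the final octet of FIRST.
            return value, count
    raise ValueError("FIRST field missing or not terminated within four octets")
```

Returning the consumed length alongside the value lets a parser locate the start of the DATA field that follows FIRST in the log.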
+ + One-Octet FIRST value: + + Encoded form: 0ddddddd + Decoded form: 00000000 00000000 00000000 0ddddddd + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 95] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Two-Octet FIRST value: + + Encoded form: 1ccccccc 0ddddddd + Decoded form: 00000000 00000000 00cccccc cddddddd + + Three-Octet FIRST value: + + Encoded form: 1bbbbbbb 1ccccccc 0ddddddd + Decoded form: 00000000 000bbbbb bbcccccc cddddddd + + Four-Octet FIRST value: + + Encoded form: 1aaaaaaa 1bbbbbbb 1ccccccc 0ddddddd + Decoded form: 0000aaaa aaabbbbb bbcccccc cddddddd + + + Figure B.5.2 -- Decoding FIRST Field Formats + + The DATA field (present if D = 1) encodes a modified version of the + data octets of the SysEx command coded by the log. Status octets + MUST NOT be coded in the DATA field. + + If F = 0, the DATA field begins with the first data octet of the + SysEx command and includes all subsequent data octets for the command + that appear in the session history. If F = 1, the DATA field begins + with the (dec(FIRST) + 1)'th data octet of the SysEx command and + includes all subsequent data octets for the command that appear in + the session history. Note that the word "command" in the + descriptions above refers to the original SysEx command as it appears + in the source MIDI data stream, not to a particular MIDI list SysEx + command segment. + + The length of the DATA field is coded implicitly, using the most + significant bit of each octet. The most significant bit of the final + octet of the DATA field MUST be set to 1. The most significant bit + of all other DATA octets MUST be set to 0. This coding method relies + on the fact that the most significant bit of a MIDI data octet is 0 + by definition. Apart from this length-coding modification, the DATA + field encodes a verbatim copy of all data octets it encodes. + +B.5.2. 
Log Inclusion Semantics + + Chapter X offers two tools to protect SysEx commands: the "recency" + tool and the "list" tool. The tool definitions use the concept of + the "SysEx type" of a command, which we now define. + + Each SysEx command instance in a session, excepting MTC Full Frame + commands, is said to have a "SysEx type". Types are used in equality + + + +Lazzaro & Wawrzynek Standards Track [Page 96] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + comparisons: two SysEx commands in a session are said to have "the + same SysEx type" or "different SysEx types". + + If efficiency is not a concern, a sender may follow a simple typing + rule: every SysEx command in the session history has a different + SysEx type, and thus no two commands in the session have the same + type. + + To improve efficiency, senders MAY implement exceptions to this rule. + These exceptions declare that certain sets of SysEx command instances + have the same SysEx type. Any command not covered by an exception + follows the simple rule. We list exceptions below: + + o All commands with identical data octet fields (same number of data + octets, same value for each data octet) have the same type. This + rule MUST be applied to all SysEx commands in the session or not + at all. Note that the implementation of this exception requires + no sender knowledge of the format and semantics of the SysEx + commands in the stream, merely the ability to count and compare + octets. + + o Two instances of the same command whose semantics set or report + the value of the same "parameter" have the same type. The + implementation of this exception requires specific knowledge of + the format and semantics of SysEx commands. In practice, a sender + implementation chooses to support this exception for certain + classes of commands (such as the Universal System Exclusive + commands defined in [MIDI]). 
If a sender supports this exception + for a particular command in a class (for example, the Universal + Real Time System Exclusive message for Master Volume, F0 7F cc 04 + 01 vv vv F7, defined in [MIDI]), it MUST support the exception for + all instances of this particular command in the session. + + We now use this definition of "SysEx type" to define the "recency" + tool and the "list" tool for Chapter X. + + By default, the Chapter X log list MUST code sufficient information + to protect the rendered MIDI performance from indefinite artifacts + caused by the loss of all finished or unfinished active SysEx + commands that appear in the checkpoint history (excluding finished + MTC Full Frame commands, which are coded in Chapter F (Appendix + B.4)). + + To protect a command of a specific SysEx type with the recency tool, + senders MUST code a log in the log list for the most recent finished + active instance of the SysEx type that appears in the checkpoint + history. Additionally, if an unfinished active instance of the SysEx + type appears in the checkpoint history, senders MUST code a log in + the log list for the unfinished command instance. The L header bit + of both command logs MUST be set to 0. + + To protect a command of a specific SysEx type with the list tool, + senders MUST code a log in the Chapter X log list for each finished + or unfinished active instance of the SysEx type that appears in the + checkpoint history. The L header bit of list tool command logs MUST + be set to 1. + + As a rule, a log REQUIRED by the list or recency tool MUST include a + DATA field that codes all data octets that appear in the checkpoint + history for the SysEx command instance associated with the log. The + FIRST field MAY be used to configure a DATA field that minimally + meets this requirement.
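The difference between the recency tool and the list tool can be seen in a small selection routine. This Python sketch is a simplified illustration, not a normative algorithm: the SysExInstance class, the select_logs function, and the string type labels are all hypothetical. It assumes the input holds only active SysEx instances from the checkpoint history in oldest-first order, and it ignores the special handling of cancelled commands.

```python
from dataclasses import dataclass


@dataclass
class SysExInstance:
    sysex_type: str   # hypothetical label identifying the SysEx type
    finished: bool    # False for an unfinished command


def select_logs(history, tools):
    """Return (history index, L bit) pairs for the instances that need a log.

    history: oldest-first list of active SysExInstance objects.
    tools:   maps each SysEx type label to "recency" or "list".
    """
    # Index of the most recent finished instance of each type.
    last_finished = {}
    for i, inst in enumerate(history):
        if inst.finished:
            last_finished[inst.sysex_type] = i

    selected = []
    for i, inst in enumerate(history):
        if tools[inst.sysex_type] == "list":
            # List tool: every instance is logged, with L = 1.
            selected.append((i, 1))
        elif not inst.finished or last_finished.get(inst.sysex_type) == i:
            # Recency tool: the latest finished instance and any
            # unfinished instance are logged, with L = 0.
            selected.append((i, 0))
    return selected
```

Because the input is scanned in order, the result already follows the oldest-first ordering rule for the log list.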
+ + An exception to this rule applies to cancelled commands (defined in + Section 3.2). REQUIRED command logs associated with cancelled + commands MAY be coded with no DATA field. However, if DATA appears + in the log, DATA MUST code all data octets that appear in the + checkpoint history for the command associated with the log. + + As defined by the preceding text in this section, by default all + finished or unfinished active SysEx commands that appear in the + checkpoint history (excluding finished MTC Full Frame commands) MUST + be protected by the list tool or the recency tool. + + For some MIDI source streams, this default yields a Chapter X whose + size is too large. For example, imagine that a sender begins to + transcode a SysEx command with 10,000 data octets onto a UDP RTP + stream "on the fly", by sending SysEx command segments as soon as + data octets are delivered by the MIDI source. After 1000 octets have + been sent, the expansion of Chapter X yields an RTP packet that is + too large to fit in the Maximum Transmission Unit (MTU) for the + stream. + + In this situation, if a sender uses the closed-loop sending policy + for SysEx commands, the RTP packet size may always be capped by + stalling the stream. In a stream stall, once the packet reaches a + maximum size, the sender refrains from sending new packets with non- + empty MIDI Command Sections until receiver feedback permits the + trimming of Chapter X. If the stream permits arbitrary commands to + appear between SysEx segments (selectable during configuration using + the tools defined in Appendix C.1), the sender may stall the SysEx + segment stream but continue to code other commands in the MIDI list. + + Stalls are a workable but suboptimal solution to Chapter X size + issues. 
As an alternative to stalls, senders SHOULD take preemptive + + + + +Lazzaro & Wawrzynek Standards Track [Page 98] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + action during session configuration to reduce the anticipated size of + Chapter X, using the methods described below: + + o Partitioned transport. Appendix C.5 provides tools for sending a + MIDI name space over several RTP streams. Senders may use these + tools to map a MIDI source into a low-latency UDP RTP stream (for + channel commands and short SysEx commands) and a reliable + [RFC4571] TCP stream (for bulk-data SysEx commands). The + cm_unused and cm_used parameters (Appendix C.1) may be used to + communicate the nature of the SysEx command partition. As TCP is + reliable, the RTP MIDI TCP stream would not use the recovery + journal. To minimize transmission latency for short SysEx + commands, senders may begin segmental transmission for all SysEx + commands over the UDP stream and then cancel the UDP transmission + of long commands (using tools described in Section 3.2) and resend + the commands over the TCP stream. + + o Selective protection. Journal protection may not be necessary for + all SysEx commands in a stream. The ch_never parameter (Appendix + C.2) may be used to communicate which SysEx commands are excluded + from Chapter X. + +B.5.3. TCOUNT and COUNT Fields + + If the T header bit is set to 1, the 8-bit TCOUNT field appears in + the command log. If the C header bit is set to 1, the 8-bit COUNT + field appears in the command log. TCOUNT and COUNT are interpreted + as unsigned integers. + + The TCOUNT field codes the total number of SysEx commands of the + SysEx type coded by the log that appear in the session history at the + moment after the (finished or unfinished) command coded by the log + enters the session history. 
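As a rough illustration of TCOUNT bookkeeping, the Python sketch below keeps one counter per SysEx type. The class name TypeCounter and the use of string type labels are hypothetical. Because TCOUNT is an 8-bit field, the sketch wraps each count modulo 256.

```python
from collections import defaultdict


class TypeCounter:
    """Track, per SysEx type, how many commands have entered the session history."""

    def __init__(self):
        self._counts = defaultdict(int)

    def command_entered(self, sysex_type):
        """Record one command of sysex_type and return the TCOUNT value to code."""
        # The value is taken after the command enters the session history,
        # so the counter is incremented first, then wrapped to 8 bits.
        self._counts[sysex_type] = (self._counts[sysex_type] + 1) % 256
        return self._counts[sysex_type]
```

A sender would call command_entered once for each finished or unfinished command instance and store the returned value in the TCOUNT field of that command's log.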
+ + The COUNT field codes the total number of SysEx commands that appear + in the session history, excluding commands that are excluded from + Chapter X via the ch_never parameter (Appendix C.2) at the moment + after the (finished or unfinished) command coded by the log enters + the session history. + + Command counting for TCOUNT and COUNT uses modulo-256 arithmetic. + MTC Full Frame command instances (Appendix B.4) are included in + command counting if the TCOUNT and COUNT definitions warrant their + inclusion, as are cancelled commands (Section 3.2). + + Senders use the TCOUNT and COUNT fields to track the identity and + (for TCOUNT) the sequence position of a command instance. Senders + MUST use the TCOUNT or COUNT fields if identity or sequence + + + +Lazzaro & Wawrzynek Standards Track [Page 99] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + information is necessary to protect the command type coded by the + log. + + If a sender uses the COUNT field in a session, the final command log + in every Chapter X in the stream MUST code the COUNT field. This + rule lets receivers resynchronize the COUNT value after a packet + loss. + +Appendix C. Session Configuration Tools + + In Sections 6.1 and 6.2 of the main text, we show session + descriptions for minimal native and mpeg4-generic RTP MIDI streams. + Minimal streams lack the flexibility to support some applications. + In this appendix, we describe how to customize stream behavior + through the use of the payload format parameters. + + The appendix begins with 6 sections, each devoted to parameters that + affect a particular aspect of stream behavior: + + o Appendix C.1 describes the stream subsetting system (cm_unused and + cm_used). + + o Appendix C.2 describes the journalling system (ch_anchor, + ch_default, ch_never, j_sec, j_update). + + o Appendix C.3 describes MIDI command timestamp semantics (linerate, + mperiod, octpos, tsmode). 
+ + o Appendix C.4 describes the temporal duration ("media time") of an + RTP MIDI packet (guardtime, rtp_maxptime, rtp_ptime). + + o Appendix C.5 concerns stream description (musicport). + + o Appendix C.6 describes MIDI rendering (chanmask, cid, inline, + multimode, render, rinit, subrender, smf_cid, smf_info, + smf_inline, smf_url, url). + + The parameters listed above may optionally appear in session + descriptions of RTP MIDI streams. If these parameters are used in an + SDP session description, the parameters appear on an fmtp attribute + line. This attribute line applies to the payload type associated + with the fmtp line. + + The parameters listed above add extra functionality ("features") to + minimal RTP MIDI streams. In Appendix C.7, we show how to use these + features to support two classes of applications: content streaming + + + + + +Lazzaro & Wawrzynek Standards Track [Page 100] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + using RTSP (Appendix C.7.1) and network musical performance using SIP + (Appendix C.7.2). + + The participants in a multimedia session MUST share a common view of + all of the RTP MIDI streams that appear in an RTP session, as defined + by a single media (m=) line. In some RTP MIDI applications, the + "common view" restriction makes it difficult to use sendrecv streams + (all parties send and receive), as each party has its own + requirements. For example, a two-party network musical performance + application may wish to customize the renderer on each host to match + the CPU performance of the host [NMP]. + + We solve this problem by using two RTP MIDI streams -- one sendonly, + one recvonly -- in lieu of one sendrecv stream. The data flows in + the two streams travel in opposite directions to control receivers + configured to use different renderers. In the third example in + Appendix C.5, we show how the musicport parameter may be used to + define virtual sendrecv streams. 
+ + As a general rule, the RTP MIDI protocol does not handle parameter + changes during a session well because the parameters describe + heavyweight or stateful configuration that is not easily changed once + a session has begun. Thus, parties SHOULD NOT expect that parameter + change requests during a session will be accepted by other parties. + However, implementors SHOULD support in-session parameter changes + that are easy to handle (for example, the guardtime parameter defined + in Appendix C.4) and SHOULD be capable of accepting requests for + changes of those parameters, as received by its session management + protocol (for example, re-offers in SIP [RFC3264]). + + Appendix D defines the Augmented Backus-Naur Form (ABNF, [RFC5234]) + syntax for the payload parameters. Section 11 provides information + to the Internet Assigned Numbers Authority (IANA) on the media types + and parameters defined in this document. + + Appendix C.6.5 defines the media type audio/asc, a stored object for + initializing mpeg4-generic renderers. As described in Appendix C.6, + the audio/asc media type is assigned to the rinit parameter to + specify an initialization data object for the default mpeg4-generic + renderer. Note that RTP stream semantics are not defined for + audio/asc. Therefore, the asc subtype MUST NOT appear on the rtpmap + line of a session description. + +C.1. Configuration Tools: Stream Subsetting + + As defined in Section 3.2 in the main text, the MIDI list of an RTP + MIDI packet may encode any MIDI command that may legally appear on a + MIDI 1.0 DIN cable. + + + +Lazzaro & Wawrzynek Standards Track [Page 101] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + In this appendix, we define two parameters (cm_unused and cm_used) + that modify this default condition by excluding certain types of MIDI + commands from the MIDI list of all packets in a stream. 
For example, + if a multimedia session partitions a MIDI name space into two RTP + MIDI streams, the parameters may be used to define which commands + appear in each stream. + + In this appendix, we define a simple language for specifying MIDI + command types. If a command type is assigned to cm_unused, the + commands coded by the string MUST NOT appear in the MIDI list. If a + command type is assigned to cm_used, the commands coded by the string + MAY appear in the MIDI list. + + The parameter list may code multiple assignments to cm_used and + cm_unused. Assignments have a cumulative effect and are applied in + the order of appearance in the parameter list. A later assignment of + a command type to the same parameter expands the scope of the earlier + assignment. A later assignment of a command type to the opposite + parameter cancels (partially or completely) the effect of an earlier + assignment. + + To initialize the stream subsetting system, "implicit" assignments to + cm_unused and cm_used are processed before processing the actual + assignments that appear in the parameter list. The System Common + undefined commands (0xF4, 0xF5) and the System Real-Time Undefined + commands (0xF9, 0xFD) are implicitly assigned to cm_unused. All + other command types are implicitly assigned to cm_used. + + Note that the implicit assignments code the default behavior of an + RTP MIDI stream as defined in Section 3.2 in the main text (namely, + that all commands that may legally appear on a MIDI 1.0 DIN cable may + appear in the stream). Also, note that assignments of the System + Common undefined commands (0xF4, 0xF5) apply to the use of these + commands in the MIDI source command stream, not the special use of + 0xF4 and 0xF5 in SysEx segment encoding defined in Section 3.2 in the + main text. 
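The order-dependent behavior of cm_unused and cm_used assignments, including the implicit initial assignments, can be sketched as set operations. This Python example is deliberately simplified: it identifies command types by status octet rather than by the letter syntax, it ignores channel lists and field lists, and the names IMPLICIT_UNUSED and resolve_unused are hypothetical.

```python
# Implicit cm_unused assignments: System Common undefined (0xF4, 0xF5)
# and System Real-Time undefined (0xF9, 0xFD) commands.
IMPLICIT_UNUSED = frozenset({0xF4, 0xF5, 0xF9, 0xFD})


def resolve_unused(assignments):
    """Return the set of status octets excluded from the MIDI list.

    assignments: list of (parameter, statuses) pairs in the order they
    appear in the parameter list, where parameter is "cm_unused" or
    "cm_used" and statuses is an iterable of status octets.
    """
    unused = set(IMPLICIT_UNUSED)
    for parameter, statuses in assignments:
        if parameter == "cm_unused":
            # A later assignment to the same parameter widens its scope.
            unused |= set(statuses)
        elif parameter == "cm_used":
            # An assignment to the opposite parameter cancels earlier ones.
            unused -= set(statuses)
        else:
            raise ValueError("unknown parameter: " + parameter)
    return unused
```

With an empty assignment list, the function returns only the four implicitly excluded status octets, which corresponds to the default stream behavior.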
+ + As a rule, parameter assignments obey the following syntax (see + Appendix D for ABNF): + + <parameter> = [channel list]<command-type list>[field list] + + The command-type list is mandatory; the channel and field lists are + optional. + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 102] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + The command-type list specifies the MIDI command types for which the + parameter applies. The command-type list is a concatenated sequence + of one or more of the letters (ABCFGHJKMNPQTVWXYZ). The letters code + the following command types: + + o A: Poly Aftertouch (0xA) + o B: System Reset (0xFF) + o C: Control Change (0xB) + o F: System Time Code (0xF1) + o G: System Tune Request (0xF6) + o H: System Song Select (0xF3) + o J: System Common Undefined (0xF4) + o K: System Common Undefined (0xF5) + o N: NoteOff (0x8), NoteOn (0x9) + o P: Program Change (0xC) + o Q: System Sequencer (0xF2, 0xF8, 0xFA, 0xFB, 0xFC) + o T: Channel Aftertouch (0xD) + o V: System Active Sense (0xFE) + o W: Pitch Wheel (0xE) + o X: SysEx (0xF0, 0xF7) + o Y: System Real-Time Undefined (0xF9) + o Z: System Real-Time Undefined (0xFD) + + In addition to the letters above, the letter M may also appear in the + command-type list. The letter M refers to the MIDI parameter system + (see definition in Appendix A.1 and in [MIDI]). An assignment of M + to cm_unused codes that no RPN or NRPN transactions may appear in the + MIDI list. + + Note that if cm_unused is assigned the letter M, Control Change (0xB) + commands for the controller numbers in the standard controller + assignment might still appear in the MIDI list. For an explanation, + see Appendix A.3.4 for a discussion of the "general-purpose" use of + parameter system controller numbers. + + In the text below, rules that apply to "MIDI voice channel commands" + also apply to the letter M. + + The letters in the command-type list MUST be uppercase and MUST + appear in alphabetical order. 
Letters other than + (ABCFGHJKMNPQTVWXYZ) that appear in the list MUST be ignored. + + For MIDI voice channel commands, the channel list specifies the MIDI + channels for which the parameter applies. If no channel list is + provided, the parameter applies to all MIDI channels (0-15). The + channel list takes the form of a list of channel numbers (0 through + 15) and dash-separated channel number ranges (i.e., 0-5, 8-12, etc.). + Dots (i.e., "." characters) separate elements in the channel list. + + + +Lazzaro & Wawrzynek Standards Track [Page 103] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Recall that system commands do not have a MIDI channel associated + with them. Thus, for most command-type letters that code system + commands (B, F, G, H, J, K, Q, V, Y, and Z), the channel list is + ignored. + + For the command-type letter X, the appearance of certain numbers in + the channel list codes special semantics. + + o The digit 0 codes that SysEx "cancel" sublists (Section 3.2 in the + main text) MUST NOT appear in the MIDI list. + + o The digit 1 codes that cancel sublists MAY appear in the MIDI list + (the default condition). + + o The digit 2 codes that commands other than System Real-Time MIDI + commands MUST NOT appear between SysEx command segments in the + MIDI list (the default condition). + + o The digit 3 codes that any MIDI command type may appear between + SysEx command segments in the MIDI list, with the exception of the + segmented encoding of a second SysEx command (verbatim SysEx + commands are OK). + + For command-type X, the channel list MUST NOT contain both digits 0 + and 1, and it MUST NOT contain both digits 2 and 3. For command-type + X, channel list numbers other than the numbers defined above are + ignored. If X does not have a channel list, the semantics marked + "the default condition" in the list above apply. + + The syntax for field lists in a parameter assignment follows the + syntax for channel lists. 
If no field list is provided, the + parameter applies to all controller or note numbers. + + For command-type C (Control Change), the field list codes the + controller numbers (0-255) for which the parameter applies. + + For command-type M (Parameter System), the field list codes the RPN + and NRPN controller numbers for which the parameter applies. The + number range 0-16383 specifies RPN controllers, the number range + 16384-32767 specifies NRPN controllers (16384 corresponds to NRPN + controller number 0, 32767 corresponds to NRPN controller number + 16383). + + For command-types N (NoteOn and NoteOff) and A (Poly Aftertouch), the + field list codes the note numbers for which the parameter applies. + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 104] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + For command-types J and K (System Common Undefined), the field list + consists of a single digit, which specifies the number of data octets + that follow the command octet. + + For command-type X (SysEx), the field list codes the number of data + octets that may appear in a SysEx command. Thus, the field list + 0-255 specifies SysEx commands with 255 or fewer data octets; the + field list 256-4294967295 specifies SysEx commands with more than 255 + data octets but excludes commands with 255 or fewer data octets; and + the field list 0 excludes all commands. + + A secondary parameter-assignment syntax customizes command-type X + (see Appendix D for complete ABNF): + + <parameter> = "__" <h-list> *( "_" <h-list> ) "__" + + The assignment defines the class of SysEx commands that obeys the + semantics of the assigned parameter. The command class is specified + by listing the permitted values of the first N data octets that + follow the SysEx 0xF0 command octet. Any SysEx command whose first N + data octets match the list is a member of the class. 
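
   The primary assignment syntax described earlier in this appendix
   ([channel list]<command-type list>[field list]) can be approximated
   with the Python sketch below. This is a non-normative illustration:
   the regular expression is an assumption that only approximates the
   ABNF in Appendix D, and the names parse_number_list and
   parse_assignment are hypothetical. An empty set in the result means
   that the corresponding list was not provided.

```python
import re

VALID_LETTERS = set("ABCFGHJKMNPQTVWXYZ")

# Assumed shape: optional number list, letters, optional number list.
_ASSIGNMENT = re.compile(r"([0-9.\-]*)([A-Z]+)([0-9.\-]*)")


def parse_number_list(text):
    """Expand a dot-separated list such as '4.11-13' into a set of ints."""
    numbers = set()
    if not text:
        return numbers
    for element in text.split("."):
        if "-" in element:
            low, high = element.split("-")
            if int(low) > int(high):
                raise ValueError("descending range: " + element)
            numbers.update(range(int(low), int(high) + 1))
        else:
            numbers.add(int(element))
    return numbers


def parse_assignment(value):
    """Return (channels, letters, fields) for one parameter assignment."""
    match = _ASSIGNMENT.fullmatch(value)
    if match is None:
        raise ValueError("malformed assignment: " + value)
    channels, letters, fields = match.groups()
    # Letters MUST appear in alphabetical order.
    if list(letters) != sorted(letters):
        raise ValueError("letters not in alphabetical order: " + letters)
    # Letters outside the defined set are ignored.
    kept = "".join(c for c in letters if c in VALID_LETTERS)
    return parse_number_list(channels), kept, parse_number_list(fields)
```

   The sketch returns raw numbers only. Interpreting them (for example,
   the special digits defined for command-type X, or the RPN and NRPN
   ranges for command-type M) is left to the caller.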
+ + Each <h-list> defines a data octet of the command as a dot-separated + (".") list of one or more hexadecimal constants (such as "7F") or + dash-separated hexadecimal ranges (such as "01-1F"). Underscores + ("_") separate each <h-list>. Double-underscores ("__") delineate + the data octet list. + + Using this syntax, each assignment specifies a single SysEx command + class. Session descriptions may use several assignments to cm_used + and cm_unused to specify complex behaviors. + + The example session description below illustrates the use of the + stream subsetting parameters: + + v=0 + o=lazzaro 2520644554 2838152170 IN IP6 first.example.net + s=Example + t=0 0 + m=audio 5004 RTP/AVP 96 + c=IN IP6 2001:DB8::7F2E:172A:1E24 + a=rtpmap:96 rtp-midi/44100 + a=fmtp:96 cm_unused=ACGHJKNMPTVWXYZ; cm_used=__7F_00-7F_01_01__ + + The session description configures the stream for use in clock + applications. All voice channels are unused, as are all system + commands except those used for MIDI Time Code (command-type F and the + + + + +Lazzaro & Wawrzynek Standards Track [Page 105] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Full Frame SysEx command that is matched by the string assigned to + cm_used), the System Sequencer commands (command-type Q), and System + Reset (command-type B). + +C.2. Configuration Tools: The Journalling System + + In this appendix, we define the payload format parameters that + configure stream journalling and the recovery journal system. + + The j_sec parameter (Appendix C.2.1) sets the journalling method for + the stream. The j_update parameter (Appendix C.2.2) sets the + recovery journal sending policy for the stream. Appendix C.2.2 also + defines the sending policies of the recovery journal system. + + Appendix C.2.3 defines several parameters that modify the recovery + journal semantics. These parameters change the default recovery + journal semantics as defined in Section 5 and Appendices A and B. 
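
   Returning briefly to the SysEx class syntax of Appendix C.1, the
   non-normative Python sketch below parses a class string such as
   "__7F_00-7F_01_01__" and tests whether a SysEx command belongs to
   the class. The function names are hypothetical. The argument
   sysex_data is assumed to hold the data octets that follow the 0xF0
   command octet.

```python
def parse_sysex_class(value):
    """Parse '__<h-list>_<h-list>...__' into a list of allowed-value sets.

    Element i of the result holds the permitted values of data octet i.
    """
    if len(value) < 5 or not (value.startswith("__") and value.endswith("__")):
        raise ValueError("missing double-underscore delimiters: " + value)
    octets = []
    for h_list in value[2:-2].split("_"):
        allowed = set()
        for element in h_list.split("."):
            if "-" in element:
                low, high = element.split("-")
                allowed.update(range(int(low, 16), int(high, 16) + 1))
            else:
                allowed.add(int(element, 16))
        octets.append(allowed)
    return octets


def in_sysex_class(octets, sysex_data):
    """True if the first N data octets match the N sets in 'octets'."""
    if len(sysex_data) < len(octets):
        return False
    return all(b in allowed for b, allowed in zip(sysex_data, octets))
```

   With the class string from the session description above, a MIDI
   Time Code Full Frame message (data octets 7F, device ID, 01, 01,
   followed by the time fields) is a member of the class, while a
   Universal Real-Time message with a different sub-ID is not.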
+ + The journalling method for a stream is set at the start of a session + and MUST NOT be changed thereafter. This requirement forbids changes + to the j_sec parameter once a session has begun. + + A related requirement, defined in the appendices below, forbids the + acceptance of parameter values that would violate the recovery + journal mandate. In many cases, a change in one of the parameters + defined in this appendix during an ongoing session would result in a + violation of the recovery journal mandate for an implementation; in + this case, the parameter change MUST NOT be accepted. + +C.2.1. The j_sec Parameter + + Section 2.2 defines the default journalling method for a stream. + Streams that use unreliable transport (such as UDP) default to using + the recovery journal. Streams that use reliable transport (such as + TCP) default to not using a journal. + + The parameter j_sec may be used to override this default. This memo + defines two symbolic values for j_sec: "none", to indicate that all + stream payloads MUST NOT contain a journal section, and "recj", to + indicate that all stream payloads MUST contain a journal section that + uses the recovery journal format. + + For example, the j_sec parameter might be set to "none" for a UDP + stream that travels between two hosts on a local network that is + known to provide reliable datagram delivery. + + The session description below configures a UDP stream that does not + use the recovery journal: + + + +Lazzaro & Wawrzynek Standards Track [Page 106] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + v=0 + o=lazzaro 2520644554 2838152170 IN IP4 first.example.net + s=Example + t=0 0 + m=audio 5004 RTP/AVP 96 + c=IN IP4 192.0.2.94 + a=rtpmap:96 rtp-midi/44100 + a=fmtp:96 j_sec=none + + Other IETF Standards-Track documents may define alternative journal + formats. These documents MUST define new symbolic values for the + j_sec parameter to signal the use of the format. 
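
   The selection of the journalling method can be summarized by the
   non-normative Python sketch below. The function name and the
   transport labels ("udp", "tcp") are hypothetical. The sketch rejects
   j_sec values it does not know, mirroring the requirement stated
   later in this appendix, and it does not attempt to check the
   recovery journal mandate.

```python
RELIABLE_TRANSPORTS = {"tcp"}    # illustrative label, not defined by this memo
KNOWN_J_SEC = {"none", "recj"}   # the two symbolic values defined here


def journalling_method(transport, j_sec=None):
    """Return "recj" or "none" for a stream.

    transport: lowercase or mixed-case transport label for the stream.
    j_sec: value of the j_sec parameter, or None if it is absent.
    """
    if j_sec is None:
        # Default: no journal on reliable transport, recovery journal
        # on unreliable transport.
        if transport.lower() in RELIABLE_TRANSPORTS:
            return "none"
        return "recj"
    if j_sec not in KNOWN_J_SEC:
        raise ValueError("unknown j_sec value: " + j_sec)
    return j_sec
```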
+ + Parties MUST NOT accept a j_sec value that violates the recovery + journal mandate (see Section 4 for details). If a session + description uses a j_sec value unknown to the recipient, the + recipient MUST NOT accept the description. + + Special j_sec issues arise when sessions are managed by session + management tools (like RTSP, [RFC2326]) that use SDP for "declarative + usage" purposes (see the preamble of Section 6 for details). For + these session management tools, SDP does not code transport details + (such as UDP or TCP) for the session. Instead, server and client + negotiate transport details via other means (for RTSP, the SETUP + method). + + In this scenario, the use of the j_sec parameter may be ill-advised, + as the creator of the session description may not yet know the + transport type for the session. In this case, the session + description SHOULD configure the journalling system using the + parameters defined in the remainder of Appendix C.2, but it SHOULD + NOT use j_sec to set the journalling status. Recall that if j_sec + does not appear in the session description, the default method for + choosing the journalling method is in effect (no journal for reliable + transport, recovery journal for unreliable transport). + + However, in declarative usage situations where the creator of the + session description knows that journalling is always required or + never required, the session description SHOULD use the j_sec + parameter. + +C.2.2. The j_update Parameter + + In Section 4, we use the term "sending policy" to describe the method + a sender uses to choose the checkpoint packet identity for each + recovery journal in a stream. In the subsections that follow, we + normatively define three sending policies: anchor, closed-loop, and + open-loop. + + + +Lazzaro & Wawrzynek Standards Track [Page 107] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + As stated in Section 4, the default sending policy for a stream is + the closed-loop policy. 
The j_update parameter may be used to + override this default. + + We define three symbolic values for j_update: "anchor", to indicate + that the stream uses the anchor sending policy, "open-loop", to + indicate that the stream uses the open-loop sending policy, and + "closed-loop", to indicate that the stream uses the closed-loop + sending policy. See Appendix C.2.3 for examples of session + descriptions that use the j_update parameter. + + Parties MUST NOT accept a j_update value that violates the recovery + journal mandate (Section 4). + + Other IETF Standards-Track documents may define additional sending + policies for the recovery journal system. These documents MUST + define new symbolic values for the j_update parameter to signal the + use of the new policy. If a session description uses a j_update + value unknown to the recipient, the recipient MUST NOT accept the + description. + +C.2.2.1. The anchor Sending Policy + + In the anchor policy, the sender uses the first packet in the stream + as the checkpoint packet for all packets in the stream. The anchor + policy satisfies the recovery journal mandate (Section 4), as the + checkpoint history always covers the entire stream. + + The anchor policy does not require the use of the RTP Control + Protocol (RTCP, [RFC3550]) or other feedback from receiver to sender. + Senders do not need to take special actions to ensure that received + streams start up free of artifacts, as the recovery journal always + covers the entire history of the stream. Receivers are relieved of + the responsibility of tracking the changing identity of the + checkpoint packet, because the checkpoint packet never changes. + + The main drawback of the anchor policy is bandwidth efficiency. + Because the checkpoint history covers the entire stream, the size of + the recovery journals produced by this policy usually exceeds the + journal size of alternative policies. 
For single-channel MIDI data + streams, the bandwidth overhead of the anchor policy is often + acceptable (see Appendix A.4 of [NMP]). For dense streams, the + closed-loop or open-loop policies may be more appropriate. + + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 108] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + +C.2.2.2. The closed-loop Sending Policy + + The closed-loop policy is the default policy of the recovery journal + system. For each packet in the stream, the policy lets senders + choose the smallest possible checkpoint history that satisfies the + recovery journal mandate. As smaller checkpoint histories generally + yield smaller recovery journals, the closed-loop policy reduces the + bandwidth of a stream, relative to the anchor policy. + + The closed-loop policy relies on feedback from receiver to sender. + The policy assumes that a receiver periodically informs the sender of + the highest sequence number it has seen so far in the stream, coded + in the 32-bit extension format defined in [RFC3550]. For RTCP, + receivers transmit this information in the Extended Highest Sequence + Number Received (EHSNR) field of Receiver Reports. RTCP Sender or + Receiver Reports MUST be sent by any participant in a session with + the closed-loop sending policy, unless another feedback mechanism has + been agreed upon. + + The sender may safely use receiver sequence number feedback to guide + checkpoint history management because Section 4 requires that + receivers repair indefinite artifacts whenever a packet loss event + occurs. + + We now normatively define the closed-loop policy. At the moment a + sender prepares an RTP packet for transmission, the sender is aware + of R >= 0 receivers for the stream. Senders may become aware of a + receiver via RTCP traffic from the receiver, via RTP packets from a + paired stream sent by the receiver to the sender, via messages from a + session management tool, or by other means. 
As receivers join and + leave a session, the value of R changes. + + Each known receiver k (1 <= k <= R) is associated with a 32-bit + extended packet sequence number M(k), where the extension reflects + the sequence number rollover count of the sender. + + If the sender has received at least one feedback report from receiver + k, M(k) is the most recent report of the highest RTP packet sequence + number seen by the receiver, normalized to reflect the rollover count + of the sender. + + If the sender has not received a feedback report from the receiver, + M(k) is the extended sequence number of the last packet the sender + transmitted before it became aware of the receiver. If the sender + became aware of this receiver before it sent the first packet in the + stream, M(k) is the extended sequence number of the first packet in + the stream. + + + + +Lazzaro & Wawrzynek Standards Track [Page 109] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Given this definition of M(k), we now state the closed-loop policy. + When preparing a new packet for transmission, a sender MUST choose a + checkpoint packet with extended sequence number N, such that M(k) >= + (N - 1) for all k, 1 <= k <= R, where R >= 1. The policy does not + restrict sender behavior in the R == 0 (no known receivers) case. + + Under the closed-loop policy as defined above, a sender may transmit + packets whose checkpoint history is shorter than the session history + (as defined in Appendix A.1). In this event, a new receiver that + joins the stream may experience indefinite artifacts. + + For example, if a Control Change (0xB) command for Channel Volume + (controller number 7) was sent early in a stream, and later a new + receiver joins the session, the closed-loop policy may permit all + packets sent to the new receiver to use a checkpoint history that + does not include the Channel Volume Control Change command. 
As a + result, the new receiver experiences an indefinite artifact and plays + all notes on a channel too loudly or too softly. + + To address this issue, the closed-loop policy states that whenever a + sender becomes aware of a new receiver, the sender MUST determine if + the receiver would be subject to indefinite artifacts under the + closed-loop policy. If so, the sender MUST ensure that the receiver + starts the session free of indefinite artifacts. For example, to + solve the Channel Volume issue described above, the sender may code + the current state of the Channel Volume controller numbers in the + recovery journal Chapter C, until it receives the first RTCP RR + report that signals that a packet containing this Chapter C has been + received. + + In satisfying this requirement, senders MAY infer the initial MIDI + state of the receiver from the session description. For example, the + stream example in Section 6.2 has the initial state defined in [MIDI] + for General MIDI. + + In a unicast RTP session, a receiver may safely assume that the + sender is aware of its presence as a receiver from the first packet + sent in the RTP stream. However, in other types of RTP sessions + (multicast, conference focus, RTP translator/mixer), a receiver is + often not able to determine if the sender is initially aware of its + presence as a receiver. + + To address this issue, the closed-loop policy states that if a + receiver participates in a session where it may have access to a + stream whose sender is not aware of the receiver, the receiver MUST + take actions to ensure that its rendered MIDI performance does not + contain indefinite artifacts. These protections will be necessarily + incomplete. 
For example, a receiver may monitor the Checkpoint + + + +Lazzaro & Wawrzynek Standards Track [Page 110] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Packet Seqnum for uncovered loss events and "err on the side of + caution" with respect to handling stuck notes due to lost MIDI + NoteOff commands, but the receiver is not able to compensate for the + lack of Channel Volume initialization data in the recovery journal. + + The receiver MUST NOT discontinue these protective actions until it + is certain that the sender is aware of its presence. If a receiver + is not able to ascertain sender awareness, the receiver MUST continue + these protective actions for the duration of the session. + + Note that in a multicast session where all parties are expected to + send and receive, the reception of RTCP receiver reports from the + sender about the RTP stream a receiver is multicasting back is + evidence of the sender's awareness that the RTP stream multicast by + the sender is being monitored by the receiver. Receivers may also + obtain sender awareness evidence from session management tools, or by + other means. In practice, ongoing observation of the Checkpoint + Packet Seqnum to determine if the sender is taking actions to prevent + loss events for a receiver is a good indication of sender awareness, + as is the sudden appearance of recovery journal chapters with + numerous Control Change controller data that was not foreshadowed by + recent commands coded in the MIDI list shortly after sending an RTCP + RR. + + The final set of normative closed-loop policy requirements concerns + how senders and receivers handle unplanned disruptions of RTCP + feedback from a receiver to a sender. By "unplanned", we refer to + disruptions that are not due to the signalled termination of an RTP + stream, via an RTCP BYE or via session management tools. 
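
   The checkpoint rule stated earlier in this section (M(k) >= N - 1
   for all known receivers) can be illustrated with the non-normative
   Python sketch below. The class and method names are hypothetical.
   The sketch assumes that all sequence numbers are already 32-bit
   extended values normalized to the rollover count of the sender, and
   it does not address the start-up protections for new receivers
   described above.

```python
class ClosedLoopSender:
    """Tracks M(k) per receiver and bounds the checkpoint choice."""

    def __init__(self, first_seq):
        self.first_seq = first_seq  # extended seq number of the first packet
        self.last_sent = None       # extended seq number of the last packet sent
        self.m = {}                 # receiver id -> M(k)

    def on_packet_sent(self, seq):
        self.last_sent = seq

    def add_receiver(self, receiver_id):
        # Before any feedback, M(k) is the last packet sent before the
        # sender became aware of the receiver, or the first packet of
        # the stream if nothing has been sent yet.
        if self.last_sent is None:
            self.m[receiver_id] = self.first_seq
        else:
            self.m[receiver_id] = self.last_sent

    def remove_receiver(self, receiver_id):
        self.m.pop(receiver_id, None)

    def on_feedback(self, receiver_id, highest_seq_seen):
        # M(k) is the most recent report from receiver k.
        self.m[receiver_id] = highest_seq_seen

    def latest_checkpoint(self):
        """Largest N such that M(k) >= N - 1 for every known receiver.

        Returns None when R == 0, where the policy places no restriction.
        """
        if not self.m:
            return None
        return min(self.m.values()) + 1
```

   A sender may choose any checkpoint packet at or before the returned
   bound; choosing the bound itself yields the shortest checkpoint
   history the rule permits.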
+ + As defined earlier in this section, the closed-loop policy states + that a sender MUST choose a checkpoint packet with extended sequence + number N, such that M(k) >= (N - 1) for all k, 1 <= k <= R, where R + >= 1. If the sender has received at least one feedback report from + receiver k, M(k) is the most recent report of the highest RTP packet + sequence number seen by the receiver, normalized to reflect the + rollover count of the sender. + + If this receiver k stops sending feedback to the sender, the M(k) + value used by the sender reflects the last feedback report from the + receiver. As time progresses without feedback from receiver k, this + fixed M(k) value forces the sender to increase the size of the + checkpoint history and thus increases the bandwidth of the stream. + + At some point, the sender may need to take action in order to limit + the bandwidth of the stream. In most envisioned uses of RTP MIDI, + long before this point is reached, the SSRC time-out mechanism + defined in [RFC3550] will remove the uncooperative receiver from the + + + +Lazzaro & Wawrzynek Standards Track [Page 111] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + session (note that the closed-loop policy does not suggest or require + any special sender behavior upon an SSRC time-out, other than the + sender actions related to changing R, described earlier in this + section). + + However, in rare situations, the bandwidth of the stream (due to a + lack of feedback reports from the sender) may become too large to + continue sending the stream to the receiver before the SSRC time-out + occurs for the receiver. In this case, the closed-loop policy states + that the sender should invoke the SSRC time-out for the receiver + early. + + We now discuss receiver responsibilities in the case of unplanned + disruptions of RTCP feedback from receiver to sender. 
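
   As a brief sender-side aside, the non-normative Python sketch below
   shows one way a sender might identify receivers whose silence is
   inflating the checkpoint history. The "lag" measure (packets sent
   beyond the last report of a receiver) and the max_lag threshold are
   assumptions used as a stand-in for a bandwidth limit; this memo does
   not define either, and the function name is hypothetical.

```python
def stale_receivers(m_values, next_seq, max_lag):
    """Return ids of receivers whose feedback lags by more than max_lag.

    m_values: dict mapping receiver id -> M(k) (extended sequence number).
    next_seq: extended sequence number of the packet about to be sent.
    max_lag:  assumed application limit on next_seq - M(k).
    """
    return sorted(k for k, m in m_values.items() if next_seq - m > max_lag)
```

   A sender that finds a receiver in this list before the normal SSRC
   time-out occurs could invoke the time-out for that receiver early,
   which removes its M(k) value from the checkpoint calculation.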
+ + In the unicast case, if a sender invokes the SSRC time-out mechanism + for a receiver, the receiver stops receiving packets from the sender. + The sender behavior imposed by the guardtime parameter (Appendix + C.4.2) lets the receiver conclude that an SSRC time-out has occurred + in a reasonable time period. + + In this case of a time-out, a receiver MUST keep sending RTCP + feedback, in order to re-establish the RTP flow from the sender. + Unless the receiver expects a prompt recovery of the RTP flow, the + receiver MUST take actions to ensure that the rendered MIDI + performance does not exhibit "very long transient artifacts" (for + example, by silencing NoteOns to prevent stuck notes) while awaiting + reconnection of the flow. + + In the multicast case, if a sender invokes the SSRC time-out + mechanism for a receiver, the receiver may continue to receive + packets, but the sender will no longer be using the M(k) feedback + from the receiver to choose each checkpoint packet. If the receiver + does not have additional information that precludes an SSRC time-out + (such as RTCP Receiver Reports from the sender about an RTP stream + the receiver is multicasting back to the sender), the receiver MUST + monitor the Checkpoint Packet Seqnum to detect an SSRC time-out. If + an SSRC time-out is detected, the receiver MUST follow the + instructions for SSRC time-outs described for the unicast case above. + + Finally, we note that the closed-loop policy is suitable for use in + RTP/RTCP sessions that use multicast transport. However, aspects of + the closed-loop policy do not scale well to sessions with large + numbers of participants. The sender state scales linearly with the + number of receivers, as the sender needs to track the identity and + M(k) value for each receiver k. The average recovery journal size is + not independent of the number of receivers, as the RTCP reporting + interval backoff slows down the rate of a full update of M(k) values. 
+ + + +Lazzaro & Wawrzynek Standards Track [Page 112] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + The backoff algorithm may also increase the amount of ancillary state + used by implementations of the normative sender and receiver + behaviors defined in Section 4. + +C.2.2.3. The open-loop Sending Policy + + The open-loop policy is suitable for sessions that are not able to + implement the receiver-to-sender feedback required by the closed-loop + policy and that are also not able to use the anchor policy because of + bandwidth constraints. + + The open-loop policy does not place constraints on how a sender + chooses the checkpoint packet for each packet in the stream. In the + absence of such constraints, a receiver may find that the recovery + journal in the packet that ends a loss event has a checkpoint history + that does not cover the entire loss event. We refer to loss events + of this type as uncovered loss events. + + To ensure that uncovered loss events do not compromise the recovery + journal mandate, the open-loop policy assigns specific recovery tasks + to senders, receivers, and the creators of session descriptions. The + underlying premise of the open-loop policy is that the indefinite + artifacts produced during uncovered loss events fall into two + classes. + + One class of artifacts is recoverable indefinite artifacts. + Receivers are able to repair recoverable artifacts that occur during + an uncovered loss event without intervention from the sender, at the + potential cost of unpleasant transient artifacts. + + For example, after an uncovered loss event, receivers are able to + repair indefinite artifacts due to NoteOff (0x8) commands that may + have occurred during the loss event, by executing NoteOff commands + for all active NoteOns commands. This action causes a transient + artifact (a sudden silent period in the performance) but ensures that + no stuck notes sound indefinitely. 
We refer to MIDI commands that + are amenable to repair in this fashion as recoverable MIDI commands. + + A second class of artifacts is unrecoverable indefinite artifacts. + If this class of artifact occurs during an uncovered loss event, the + receiver is not able to repair the stream. + + For example, after an uncovered loss event, receivers are not able to + repair indefinite artifacts due to Control Change (0xB) Channel + Volume (controller number 7) commands that have occurred during the + loss event. A repair is impossible because the receiver has no way + + + + + +Lazzaro & Wawrzynek Standards Track [Page 113] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + of determining the data value of a lost Channel Volume command. We + refer to MIDI commands that are fragile in this way as unrecoverable + MIDI commands. + + The open-loop policy does not specify how to partition the MIDI + command set into recoverable and unrecoverable commands. Instead, it + assumes that the creators of the session descriptions are able to + come to agreement on a suitable recoverable/unrecoverable MIDI + command partition for an application. + + Given these definitions, we now state the normative requirements for + the open-loop policy. + + In the open-loop policy, the creators of the session description MUST + use the ch_anchor parameter (defined in Appendix C.2.3) to protect + all unrecoverable MIDI command types from indefinite artifacts or + alternatively MUST use the cm_unused parameter (defined in Appendix + C.1) to exclude the command types from the stream. These options act + to shield command types from artifacts during an uncovered loss + event. + + In the open-loop policy, receivers MUST examine the Checkpoint Packet + Seqnum field of the recovery journal header after every loss event, + to check if the loss event is an uncovered loss event. Section 5 + shows how to perform this check. 
If an uncovered loss event has + occurred, a receiver MUST perform indefinite artifact recovery for + all MIDI command types that are not shielded by ch_anchor and + cm_unused parameter assignments in the session description. + + The open-loop policy does not place specific constraints on the + sender. However, the open-loop policy works best if the sender + manages the size of the checkpoint history to ensure that uncovered + losses occur infrequently, by taking into account the delay and loss + characteristics of the network. Also, as each checkpoint packet + change incurs the risk of an uncovered loss, senders should only move + the checkpoint if it reduces the size of the journal. + +C.2.3. Recovery Journal Chapter Inclusion Parameters + + The recovery journal chapter definitions (Appendices A and B) specify + under what conditions a chapter MUST appear in the recovery journal. + In most cases, the definition states that if a certain command + appears in the checkpoint history, a certain chapter type MUST appear + in the recovery journal to protect the command. + + In this section, we describe the chapter inclusion parameters. These + parameters modify the conditions under which a chapter appears in the + journal. These parameters are essential to the use of the open-loop + + + +Lazzaro & Wawrzynek Standards Track [Page 114] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + policy (Appendix C.2.2.3) and may also be used to simplify + implementations of the closed-loop (Appendix C.2.2.2) and anchor + (Appendix C.2.2.1) policies. + + Each parameter represents a type of chapter inclusion semantics. An + assignment to a parameter declares which chapters (or chapter + subsets) obey the inclusion semantics. We describe the assignment + syntax for these parameters later in this section. + + A party MUST NOT accept chapter inclusion parameter values that + violate the recovery journal mandate (Section 4). 
All assignments of + the subsetting parameters (cm_used and cm_unused) MUST precede the + first assignment of a chapter inclusion parameter in the parameter + list. + + Below, we normatively define the semantics of the chapter inclusion + parameters. For clarity, we define the action of parameters on + complete chapters. If a parameter is assigned a subset of a chapter, + the definition applies only to the chapter subset. + + o ch_never. A chapter assigned to the ch_never parameter MUST NOT + appear in the recovery journal (Appendices A.4.1 and A.4.2 define + exceptions to this rule for Chapter M). To signal the exclusion + of a chapter from the journal, an assignment to ch_never MUST be + made, even if the commands coded by the chapter are assigned to + cm_unused. This rule simplifies the handling of commands types + that may be coded in several chapters. + + o ch_default. A chapter assigned to the ch_default parameter MUST + follow the default semantics for the chapter, as defined in + Appendices A and B. + + o ch_anchor. A chapter assigned to the ch_anchor MUST obey a + modified version of the default chapter semantics. In the + modified semantics, all references to the checkpoint history are + replaced with references to the session history, and all + references to the checkpoint packet are replaced with references + to the first packet sent in the stream. + + Parameter assignments obey the following syntax (see Appendix D for + ABNF): + + <parameter> = [channel list]<chapter list>[field list] + + The chapter list is mandatory; the channel and field lists are + optional. Multiple assignments to parameters have a cumulative + effect and are applied in the order of parameter appearance in a + media description. 
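
   The ordering rule stated above (all cm_used and cm_unused
   assignments precede the first chapter inclusion parameter) can be
   checked with the non-normative Python sketch below. The function
   names are hypothetical. The input is assumed to be the parameter
   portion of an a=fmtp line, and the sketch splits naively on
   semicolons, which would not be safe for parameter values that
   themselves contain semicolons.

```python
SUBSETTING = {"cm_used", "cm_unused"}
CHAPTER_INCLUSION = {"ch_never", "ch_default", "ch_anchor"}


def split_fmtp(params):
    """Split 'name=value; name=value' into ordered (name, value) pairs."""
    pairs = []
    for item in params.split(";"):
        item = item.strip()
        if not item:
            continue
        name, _, value = item.partition("=")
        pairs.append((name.strip(), value.strip()))
    return pairs


def subsetting_precedes_chapters(pairs):
    """True if no subsetting parameter follows a chapter inclusion one."""
    seen_chapter_inclusion = False
    for name, _ in pairs:
        if name in CHAPTER_INCLUSION:
            seen_chapter_inclusion = True
        elif name in SUBSETTING and seen_chapter_inclusion:
            return False
    return True
```

   Because the pairs keep their order of appearance, the same list can
   then be walked to apply the cumulative assignment semantics.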

   To determine the semantics of a list of chapter inclusion parameter
   assignments, we begin by assuming an implicit assignment of all
   channel and system chapters to the ch_default parameter, with the
   default values for the channel list and field list for each chapter
   that are defined below.

   We then interpret the semantics of the actual parameter assignments,
   using the rules below.

   A later assignment of a chapter to the same parameter expands the
   scope of the earlier assignment. In most cases, a later assignment
   of a chapter to a different parameter cancels (partially or
   completely) the effect of an earlier assignment.

   The chapter list specifies the channel or system chapters for which
   the parameter applies. The chapter list is a concatenated sequence
   of one or more of the letters corresponding to the chapter types
   (ACDEFMNPQTVWX). In addition, the list may contain one or more of
   the letters for the subchapter types (BGHJKYZ) of System Chapter D.

   The letters in a chapter list MUST be uppercase and MUST appear in
   alphabetical order. Letters other than (ABCDEFGHJKMNPQTVWXYZ) that
   appear in the chapter list MUST be ignored.

   The channel list specifies the channel journals for which this
   parameter applies; if no channel list is provided, the parameter
   applies to all channel journals. The channel list takes the form of
   a list of channel numbers (0 through 15) and dash-separated channel
   number ranges (e.g., 0-5 or 8-12). Dots (i.e., "." characters)
   separate elements in the channel list.

   Several of the system chapters may be configured to have special
   semantics. Configuration occurs by specifying a channel list for the
   system channel, using the coding described below.
   (Note that MIDI system commands do not have a "channel" and thus the
   original purpose of the channel list does not apply to system
   chapters.) The expression "the digit N" in the text below refers to
   the inclusion of N as a "channel" in the channel list for a system
   chapter.

   For the J and K Chapter D subchapters (undefined System Common), the
   digit 0 codes that the parameter applies to the LEGAL field of the
   associated command log (Figure B.1.4 of Appendix B.1), the digit 1
   codes that the parameter applies to the VALUE field of the command
   log, and the digit 2 codes that the parameter applies to the COUNT
   field of the command log.

   For the Y and Z Chapter D subchapters (undefined System Real-Time),
   the digit 0 codes that the parameter applies to the LEGAL field of
   the associated command log (Figure B.1.5 of Appendix B.1) and the
   digit 1 codes that the parameter applies to the COUNT field of the
   command log.

   For Chapter Q (Sequencer State Commands), the digit 0 codes that the
   parameter applies to the default Chapter Q definition, which forbids
   the TIME field. The digit 1 codes that the parameter applies to the
   optional Chapter Q definition, which supports the TIME field.

   The syntax for field lists follows the syntax for channel lists. If
   no field list is provided, the parameter applies to all controller or
   note numbers. For Chapter C, if no field list is provided, the
   controller numbers do not use enhanced Chapter C encoding (Appendix
   A.3.3).

   For Chapter C, the field list may take on values in the range 0 to
   255. A field value X in the range 0-127 refers to a controller
   number X and indicates that the controller number does not use
   enhanced Chapter C encoding.
   A field value X in the range 128-255 refers to a controller number
   "X minus 128" and indicates that the controller number does use the
   enhanced Chapter C encoding.

   Assignments made to configure the Chapter C encoding method for a
   controller number MUST be made to the ch_default or ch_anchor
   parameters, as assignments to ch_never act to exclude the number from
   the recovery journal (and thus the indicated encoding method is
   irrelevant).

   A Chapter C field list MUST NOT encode conflicting information about
   the enhanced encoding status of a particular controller number. For
   example, values 0 and 128 MUST NOT both be coded by a field list.

   For Chapter M, the field list codes the RPN and NRPN controller
   numbers for which the parameter applies. The number range 0-16383
   specifies RPN controller numbers; the number range 16384-32767
   specifies NRPN controller numbers (16384 corresponds to NRPN
   controller number 0, 32767 corresponds to NRPN controller number
   16383).

   For Chapters N and A, the field list codes the note numbers for which
   the parameter applies. The note number range specified for Chapter N
   also applies to Chapter E.

   For Chapter E, the digit 0 codes that the parameter applies to
   Chapter E note logs whose V bit is set to 0, and the digit 1 codes
   that the parameter applies to note logs whose V bit is set to 1.

   For Chapter X, the field list codes the number of data octets that
   may appear in a SysEx command that is coded in the chapter. Thus,
   the field list 0-255 specifies SysEx commands with 255 or fewer data
   octets, the field list 256-4294967295 specifies SysEx commands with
   more than 255 data octets but excludes commands with 255 or fewer
   data octets, and the field list 0 excludes all commands.
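
   The numeric conventions described above for field lists can be
   sketched informally as follows. All function names are hypothetical,
   and ranges are kept as (low, high) pairs rather than expanded,
   because a Chapter X range such as 256-4294967295 is too large to
   enumerate. This is an illustration of the value mappings only, not a
   complete parser.

```python
def parse_ranges(text):
    """Parse a dot-separated list such as "4.11-13" into (low, high) pairs."""
    ranges = []
    for element in text.split("."):
        low, _, high = element.partition("-")
        ranges.append((int(low), int(high) if high else int(low)))
    return ranges


def decode_chapter_c(value):
    """Map a Chapter C field value to (controller number, uses enhanced coding)."""
    if not 0 <= value <= 255:
        raise ValueError("Chapter C field values lie in 0-255")
    return (value % 128, value >= 128)


def decode_chapter_m(value):
    """Map a Chapter M field value to ("RPN" or "NRPN", controller number)."""
    if not 0 <= value <= 32767:
        raise ValueError("Chapter M field values lie in 0-32767")
    return ("NRPN", value - 16384) if value >= 16384 else ("RPN", value)


def chapter_c_conflicts(ranges):
    """Return controllers listed both with and without enhanced coding."""
    plain, enhanced = set(), set()
    for low, high in ranges:
        for value in range(low, high + 1):
            controller, is_enhanced = decode_chapter_c(value)
            (enhanced if is_enhanced else plain).add(controller)
    return sorted(plain & enhanced)


print(decode_chapter_c(192), decode_chapter_m(16384))
print(chapter_c_conflicts(parse_ranges("0.128")))
```

   A non-empty result from chapter_c_conflicts corresponds to the
   forbidden case in which, for example, values 0 and 128 are both coded
   by one field list.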

   A secondary parameter assignment syntax customizes Chapter X (see
   Appendix D for complete ABNF):

      <parameter> = "__" <h-list> *( "_" <h-list> ) "__"

   The assignment defines a class of SysEx commands whose Chapter X
   coding obeys the semantics of the assigned parameter. The command
   class is specified by listing the permitted values of the first N
   data octets that follow the SysEx 0xF0 command octet. Any SysEx
   command whose first N data octets match the list is a member of the
   class.

   Each <h-list> defines a data octet of the command as a dot-separated
   (".") list of one or more hexadecimal constants (such as "7F") or
   dash-separated hexadecimal ranges (such as "01-1F"). Underscores
   ("_") separate each <h-list>. Double-underscores ("__") delineate
   the data octet list.

   Using this syntax, each assignment specifies a single SysEx command
   class. Session descriptions may use several assignments to the same
   (or different) parameters to specify complex Chapter X behaviors.
   The ordering behavior of multiple assignments follows the guidelines
   for chapter parameter assignments described earlier in this section.

   The example session description below illustrates the use of the
   chapter inclusion parameters:

   v=0
   o=lazzaro 2520644554 2838152170 IN IP6 first.example.net
   s=Example
   t=0 0
   m=audio 5004 RTP/AVP 96
   c=IN IP6 2001:DB8::7F2E:172A:1E24
   a=rtpmap:96 rtp-midi/44100
   a=fmtp:96 j_update=open-loop; cm_unused=ABCFGHJKMQTVWXYZ;
   cm_used=__7E_00-7F_09_01.02.03__;
   cm_used=__7F_00-7F_04_01.02__; cm_used=C7.64;
   ch_never=ABCDEFGHJKMQTVWXYZ; ch_never=4.11-13N;
   ch_anchor=P; ch_anchor=C7.64;
   ch_anchor=__7E_00-7F_09_01.02.03__;
   ch_anchor=__7F_00-7F_04_01.02__

   (The a=fmtp line has been wrapped to fit the page to accommodate memo
   formatting restrictions; it comprises a single line in SDP.)
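
   The secondary assignment syntax that appears in the example above can
   be read as a per-octet matching rule. The sketch below is one
   informal way to apply it; parse_sysex_class and in_sysex_class are
   hypothetical names, and the data argument is assumed to hold the
   octets that follow the 0xF0 command octet.

```python
def parse_sysex_class(text):
    """Parse "__7E_00-7F_09_01.02.03__" into one range list per data octet."""
    if len(text) < 5 or not (text.startswith("__") and text.endswith("__")):
        raise ValueError(f"not a secondary-syntax assignment: {text!r}")
    octets = []
    for h_list in text[2:-2].split("_"):
        ranges = []
        for element in h_list.split("."):
            low, _, high = element.partition("-")
            # Constants and range bounds are hexadecimal.
            ranges.append((int(low, 16), int(high or low, 16)))
        octets.append(ranges)
    return octets


def in_sysex_class(octets, data):
    """True if the first N data octets each match the corresponding h-list."""
    if len(data) < len(octets):
        return False
    return all(
        any(low <= byte <= high for low, high in ranges)
        for ranges, byte in zip(octets, data)
    )


sysex_class = parse_sysex_class("__7E_00-7F_09_01.02.03__")
print(in_sysex_class(sysex_class, bytes([0x7E, 0x7F, 0x09, 0x01])))
print(in_sysex_class(sysex_class, bytes([0x7E, 0x7F, 0x09, 0x04])))
```

   Only the first N data octets are examined, so longer commands that
   share the listed prefix belong to the same class.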

   The j_update parameter codes that the stream uses the open-loop
   policy. Most MIDI command-types are assigned to cm_unused and thus
   do not appear in the stream. As a consequence, the assignments to
   the first ch_never parameter reflect that most chapters are not in
   use.

   Chapter N for several MIDI channels is assigned to ch_never. Chapter
   N for MIDI channels other than 4, 11, 12, and 13 may appear in the
   recovery journal, using the (default) ch_default semantics. In
   practice, this assignment pattern would reflect knowledge about a
   resilient rendering method in use for the excluded channels.

   The MIDI Program Change command and several MIDI Control Change
   controller numbers are assigned to ch_anchor. Note that the ordering
   of the ch_anchor Chapter C assignment after the ch_never command acts
   to override the ch_never assignment for the listed controller numbers
   (7 and 64).

   The assignment of command-type X to cm_unused excludes most SysEx
   commands from the stream. Exceptions are made for General MIDI
   System On/Off commands and for the Master Volume and Balance
   commands, via the use of the secondary assignment syntax. The
   cm_used assignment codes the exception, and the ch_anchor assignment
   codes how these commands are protected in Chapter X.

C.3. Configuration Tools: Timestamp Semantics

   The MIDI command section of the payload format consists of a list of
   commands, each with an associated timestamp. The semantics of
   command timestamps may be set during session configuration using the
   parameters we describe in this section.

   The parameter tsmode specifies the timestamp semantics for a stream.
   The parameter takes on one of three token values: "comex", "async",
   or "buffer".

   The default "comex" value specifies that timestamps code the
   execution time for a command (Appendix C.3.1) and supports the
   accurate transcoding of Standard MIDI Files (SMFs, [MIDI]).
   The "comex" value is also RECOMMENDED for new MIDI user-interface
   controller designs. The "async" value specifies an asynchronous
   timestamp sampling algorithm for time-of-arrival sources (Appendix
   C.3.2). The "buffer" value specifies a synchronous timestamp
   sampling algorithm (Appendix C.3.3) for time-of-arrival sources.

   Ancillary parameters MAY follow tsmode in a media description. We
   define these parameters in Appendices C.3.2 and C.3.3.

C.3.1. The comex Algorithm

   The default "comex" (COMmand EXecution) tsmode value specifies that
   each timestamp codes the execution time for its command. With comex,
   the difference between two timestamps indicates the time delay
   between the execution of the commands. This difference may be zero,
   coding simultaneous execution.

   The comex interpretation of timestamps works well for transcoding a
   Standard MIDI File (SMF, [MIDI]) into an RTP MIDI stream, as SMFs
   code a timestamp for each MIDI command stored in the file. To
   transcode an SMF that uses metric time markers, use the SMF tempo map
   (encoded in the SMF as meta-events) to convert metric SMF timestamp
   units into seconds-based RTP timestamp units.

   New MIDI controller designs (piano keyboard, drum pads, etc.) that
   support RTP MIDI and that have direct access to sensor data SHOULD
   use comex interpretation for timestamps so that simultaneous gestural
   events may be accurately coded by RTP MIDI.

   Comex is a poor choice for transcoding MIDI 1.0 DIN cables [MIDI],
   for a reason that we will now explain. A MIDI DIN cable carries an
   asynchronous serial protocol (320 microseconds per MIDI byte). MIDI
   commands on a DIN cable are not tagged with timestamps. Instead,
   MIDI DIN receivers infer command timing from the time of arrival of
   the bytes.
   Thus, two two-byte MIDI commands that occur at a source
   simultaneously are encoded on a MIDI 1.0 DIN cable with a 640
   microsecond time offset. A MIDI DIN receiver is unable to tell if
   this time offset existed in the source performance or is an artifact
   of the serial speed of the cable. However, the RTP MIDI comex
   interpretation of timestamps declares that a timestamp offset between
   two commands reflects the timing of the source performance.

   This semantic mismatch is the reason that comex is a poor choice for
   transcoding MIDI DIN cables. Note that the choice of the RTP
   timestamp rate (Sections 6.1 and 6.2 in the main text) cannot fix
   this inaccuracy issue. In the sections that follow, we describe two
   alternative timestamp interpretations ("async" and "buffer") that are
   a better match to MIDI 1.0 DIN cable timing and to other MIDI
   time-of-arrival sources.

   The octpos, linerate, and mperiod ancillary parameters (defined
   below) SHOULD NOT be used with comex.

C.3.2. The async Algorithm

   The "async" tsmode value specifies the asynchronous sampling of a
   MIDI time-of-arrival source. In asynchronous sampling, the moment an
   octet is received from a source, it is labelled with a wall-clock
   time value. The time value has RTP timestamp units.

   The octpos ancillary parameter defines how RTP command timestamps are
   derived from octet time values. If octpos has the token value
   "first", a timestamp codes the time value of the first octet of the
   command. If octpos has the token value "last", a timestamp codes the
   time value of the last octet of the command. If the octpos parameter
   does not appear in the media description, the sender does not know
   which octet of the command the timestamp references (for example, the
   sender may be relying on an operating system service that does not
   specify this information).
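
   As a rough illustration of the octpos rule, the sketch below derives
   command timestamps from per-octet arrival times. It reuses the 320
   microseconds per octet figure quoted in Appendix C.3.1 and a 44100 Hz
   clock rate; the helper names, the nanosecond time base, and the
   round-to-nearest conversion are assumptions of this sketch rather
   than requirements of the format.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000


def ns_to_ticks(time_ns, clock_rate):
    """Convert nanoseconds to RTP timestamp units, rounding to nearest."""
    half = NANOSECONDS_PER_SECOND // 2
    return (time_ns * clock_rate + half) // NANOSECONDS_PER_SECOND


def command_timestamp(octet_times_ns, octpos, clock_rate):
    """Pick the octet named by octpos and express its time in RTP units."""
    if octpos == "first":
        chosen = octet_times_ns[0]
    elif octpos == "last":
        chosen = octet_times_ns[-1]
    else:
        raise ValueError("this sketch handles only octpos=first or octpos=last")
    return ns_to_ticks(chosen, clock_rate)


# Two two-byte commands sent back to back on a DIN cable: one octet
# is labelled every 320 microseconds, starting at time zero.
OCTET_NS = 320_000
first_command = [0 * OCTET_NS, 1 * OCTET_NS]
second_command = [2 * OCTET_NS, 3 * OCTET_NS]

for octpos in ("first", "last"):
    print(octpos,
          command_timestamp(first_command, octpos, 44100),
          command_timestamp(second_command, octpos, 44100))
```

   With either octpos value, the two commands end up 28 timestamp units
   apart, which is the 640 microsecond serialization offset expressed at
   44100 Hz.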

   The octpos semantics refer to the first or last octet of a command as
   it appears on a time-of-arrival MIDI source, not as it appears in an
   RTP MIDI packet. This distinction is significant because the RTP
   coding may contain octets that are not present in the source. For
   example, the status octet of the first MIDI command in a packet may
   have been added to the MIDI stream during transcoding to comply with
   the RTP MIDI running status requirements (Section 3.2).

   The linerate ancillary parameter defines the timespan of one MIDI
   octet on the transmission medium of the MIDI source to be sampled
   (such as a MIDI 1.0 DIN cable). The parameter has units of
   nanoseconds and takes on integral values. For MIDI 1.0 DIN cables,
   the correct linerate value is 320000 (this value is also the default
   value for the parameter).

   We now show a session description example for the async algorithm.
   Consider a sender that is transcoding a MIDI 1.0 DIN cable source
   into RTP. The sender runs on a computing platform that assigns time
   values to every incoming octet of the source, and the sender uses the
   time values to label the first octet of each command in the RTP
   packet. This session description describes the transcoding:

   v=0
   o=lazzaro 2520644554 2838152170 IN IP4 first.example.net
   s=Example
   t=0 0
   m=audio 5004 RTP/AVP 96
   c=IN IP4 192.0.2.94
   a=rtpmap:96 rtp-midi/44100
   a=sendonly
   a=fmtp:96 tsmode=async; linerate=320000; octpos=first

C.3.3. The buffer Algorithm

   The "buffer" tsmode value specifies the synchronous sampling of a
   MIDI time-of-arrival source.

   In synchronous sampling, octets received from a source are placed in
   a holding buffer upon arrival. At periodic intervals, the RTP sender
   examines the buffer. The sender removes complete commands from the
   buffer and codes those commands in an RTP packet.
   The command timestamp codes the moment of buffer examination,
   expressed in RTP timestamp units. Note that several commands may
   have the same timestamp value.

   The mperiod ancillary parameter defines the nominal periodic sampling
   interval. The parameter takes on positive integral values and has
   RTP timestamp units.

   The octpos ancillary parameter, defined in Appendix C.3.2 for
   asynchronous sampling, plays a different role in synchronous
   sampling. In synchronous sampling, the parameter specifies the
   timestamp semantics of a command whose octets span several sampling
   periods.

   If octpos has the token value "first", the timestamp reflects the
   arrival period of the first octet of the command. If octpos has the
   token value "last", the timestamp reflects the arrival period of the
   last octet of the command. The octpos semantics refer to the first
   or last octet of the command as it appears on a time-of-arrival
   source, not as it appears in the RTP packet.

   If the octpos parameter does not appear in the media description, the
   timestamp MAY reflect the arrival period of any octet of the command;
   senders use this option to signal a lack of knowledge about the
   timing details of the buffering process at subcommand granularity.

   We now show a session description example for the buffer algorithm.
   Consider a sender that is transcoding a MIDI 1.0 DIN cable source
   into RTP. The sender runs on a computing platform that places source
   data into a buffer upon receipt. The sender polls the buffer 1000
   times a second, extracts all complete commands from the buffer, and
   places the commands in an RTP packet.
   This session description describes the transcoding:

   v=0
   o=lazzaro 2520644554 2838152170 IN IP6 first.example.net
   s=Example
   t=0 0
   m=audio 5004 RTP/AVP 96
   c=IN IP6 2001:DB8::7F2E:172A:1E24
   a=rtpmap:96 rtp-midi/44100
   a=sendonly
   a=fmtp:96 tsmode=buffer; linerate=320000; octpos=last; mperiod=44

   The mperiod value of 44 is derived by dividing the clock rate
   specified by the rtpmap attribute (44100 Hz) by the 1000 Hz buffer
   sampling rate and rounding to the nearest integer. Command
   timestamps might not increment by exact multiples of 44, as the
   actual sampling period might not precisely match the nominal mperiod
   value.

C.4. Configuration Tools: Packet Timing Tools

   In this appendix, we describe session configuration tools for
   customizing the temporal behavior of MIDI stream packets.

C.4.1. Packet Duration Tools

   Senders control the granularity of a stream by setting the temporal
   duration ("media time") of the packets in the stream. Short media
   times (20 ms or less) often imply an interactive session. Longer
   media times (100 ms or more) usually indicate a content-streaming
   session. The RTP AVP profile [RFC3551] recommends audio packet media
   times in a range from 0 to 200 ms.

   By default, an RTP receiver dynamically senses the media time of
   packets in a stream and chooses the length of its playout buffer to
   match the stream. A receiver typically sizes its playout buffer to
   fit several audio packets and adjusts the buffer length to reflect
   the network jitter and the sender timing fidelity.

   Alternatively, the packet media time may be statically set during
   session configuration. Session descriptions MAY use the RTP MIDI
   parameter rtp_ptime to set the recommended media time for a packet.
   Session descriptions MAY also use the RTP MIDI parameter rtp_maxptime
   to set the maximum media time for a packet permitted in a stream.
   Both parameters MAY be used together to configure a stream.

   The values assigned to the rtp_ptime and rtp_maxptime parameters have
   the units of the RTP timestamp for the stream, as set by the rtpmap
   attribute (see Section 6.1). Thus, if rtpmap sets the clock rate of
   a stream to 44100 Hz, a maximum packet media time of 10 ms is coded
   by setting rtp_maxptime=441. As stated in the Appendix C preamble,
   the senders and receivers of a stream MUST agree on common values for
   rtp_ptime and rtp_maxptime if the parameters appear in the media
   description for the stream.

   0 ms is a reasonable media time value for MIDI packets and is often
   used in low-latency interactive applications. In a packet with a 0
   ms media time, all commands execute at the instant they are coded by
   the packet timestamp. The session description below configures all
   packets in the stream to have 0 ms media time:

   v=0
   o=lazzaro 2520644554 2838152170 IN IP4 first.example.net
   s=Example
   t=0 0
   m=audio 5004 RTP/AVP 96
   c=IN IP4 192.0.2.94
   a=rtpmap:96 rtp-midi/44100
   a=fmtp:96 rtp_ptime=0; rtp_maxptime=0

   The session attributes ptime and maxptime [RFC4566] MUST NOT be used
   to configure an RTP MIDI stream. Sessions MUST use rtp_ptime in lieu
   of ptime and MUST use rtp_maxptime in lieu of maxptime. RTP MIDI
   defines its own parameters for media time configuration because 0 ms
   values for ptime and maxptime are forbidden by [RFC3264] but are
   essential for certain applications of RTP MIDI.

   See the Appendix C.7 examples for additional discussion about using
   rtp_ptime and rtp_maxptime for session configuration.

C.4.2. The guardtime Parameter

   RTP permits a sender to stop sending audio packets for an arbitrary
   period of time during a session. When sending resumes, the RTP
   sequence number series continues unbroken, and the RTP timestamp
   value reflects the media time silence gap.

   This RTP feature has its roots in telephony, but it is also
   well-matched to interactive MIDI sessions, as players may fall silent
   for several seconds during (or between) songs.

   Certain MIDI applications benefit from a slight enhancement to this
   RTP feature. In interactive applications, receivers may use online
   network models to guide heuristics for handling lost and late RTP
   packets. These models may work poorly if a sender ceases packet
   transmission for long periods of time.

   Session descriptions may use the parameter guardtime to set a minimum
   sending rate for a media session. The value assigned to guardtime
   codes the maximum separation time between two sequential packets, as
   expressed in RTP timestamp units.

   Typical guardtime values are 500-2000 ms. This value range is not a
   normative bound, and parties SHOULD be prepared to process values
   outside this range.

   The congestion control requirements for sender implementations
   (described in Section 8 and [RFC3550]) take precedence over the
   guardtime parameter. Thus, if the guardtime parameter requests a
   minimum sending rate, but sending at this rate would violate the
   congestion control requirements, senders MUST ignore the guardtime
   parameter value. In this case, senders SHOULD use the lowest minimum
   sending rate that satisfies the congestion control requirements.

   Below, we show a session description that uses the guardtime
   parameter.

   v=0
   o=lazzaro 2520644554 2838152170 IN IP6 first.example.net
   s=Example
   t=0 0
   m=audio 5004 RTP/AVP 96
   c=IN IP6 2001:DB8::7F2E:172A:1E24
   a=rtpmap:96 rtp-midi/44100
   a=fmtp:96 guardtime=44100; rtp_ptime=0; rtp_maxptime=0

C.5. Configuration Tools: Stream Description

   As we discussed in Section 2.1, a party may send several RTP MIDI
   streams in the same RTP session, and several RTP sessions that carry
   MIDI may appear in a multimedia session.

   By default, the MIDI name space (16 channels + systems) of each RTP
   stream sent by a party in a multimedia session is independent. By
   independent, we mean three distinct things:

   o  If a party sends two RTP MIDI streams (A and B), MIDI voice
      channel 0 in stream A is a different "channel 0" than MIDI voice
      channel 0 in stream B.

   o  MIDI voice channel 0 in stream B is not considered to be "channel
      16" of a 32-channel MIDI voice channel space whose "channel 0" is
      channel 0 of stream A.

   o  Streams sent by different parties over different RTP sessions, or
      over the same RTP session but with different payload type numbers,
      do not share the association that is shared by a MIDI cable pair
      that cross-connects two devices in a MIDI 1.0 DIN network. By
      default, this association is only held by streams sent by
      different parties in the same RTP session that use the same
      payload type number.

   In this appendix, we show how to express that specific RTP MIDI
   streams in a multimedia session are not independent but instead are
   related in one of the three ways defined above. We use two tools to
   express these relations:

   o  The musicport parameter. This parameter is assigned a
      non-negative integer value between 0 and 4294967295. It appears
      in the fmtp lines of payload types.

   o  The FID grouping attribute [RFC5888] signals that several RTP
      sessions in a multimedia session are using the musicport parameter
      to express an inter-session relationship.

   If a multimedia session has several payload types whose musicport
   parameters are assigned the same integer value, streams using these
   payload types share an "identity relationship" (including streams
   that use the same payload type). Streams in an identity relationship
   share two properties:

   o  Identity relationship streams sent by the same party target the
      same MIDI name space. Thus, if streams A and B share an identity
      relationship, voice channel 0 in stream A is the same "channel 0"
      as voice channel 0 in stream B.

   o  Pairs of identity relationship streams that are sent by different
      parties share the association that is shared by a MIDI cable pair
      that cross-connects two devices in a MIDI 1.0 DIN network.

   A party MUST NOT send two RTP MIDI streams that share an identity
   relationship in the same RTP session. Instead, each stream MUST be
   in a separate RTP session. As explained in Section 2.1, this
   restriction is necessary to support the RTP MIDI method for the
   synchronization of streams that share a MIDI name space.

   If a multimedia session has several payload types whose musicport
   parameters are assigned sequential values (i.e., i, i+1, ... i+k),
   the streams using the payload types share an "ordered relationship".
   For example, if payload type A assigns 2 to musicport and payload
   type B assigns 3 to musicport, A and B are in an ordered
   relationship.

   Streams in an ordered relationship that are sent by the same party
   are considered by renderers to form a single larger MIDI space.
   For example, if stream A has a musicport value of 2 and stream B has
   a musicport value of 3, MIDI voice channel 0 in stream B is
   considered to be voice channel 16 in the larger MIDI space formed by
   the relationship. Note that it is possible for streams to
   participate in both an identity relationship and an ordered
   relationship.

   We now state several rules for using musicport:

   o  If streams from several RTP sessions in a multimedia session use
      the musicport parameter, the RTP sessions MUST be grouped using
      the FID grouping attribute defined in [RFC5888].

   o  An ordered or identity relationship MUST NOT contain both native
      RTP MIDI streams and mpeg4-generic RTP MIDI streams. An exception
      applies if a relationship consists of sendonly and recvonly (but
      not sendrecv) streams. In this case, the sendonly streams MUST
      NOT contain both types of streams, and the recvonly streams MUST
      NOT contain both types of streams.

   o  It is possible to construct identity relationships that violate
      the recovery journal mandate (for example, sending NoteOns for a
      voice channel on stream A and NoteOffs for the same voice channel
      on stream B). Parties MUST NOT generate (or accept) session
      descriptions that exhibit this flaw.

   o  Other payload formats MAY define musicport media type parameters.
      Formats would define these parameters so that their sessions could
      be bundled into RTP MIDI name spaces. The parameter definitions
      MUST be compatible with the musicport semantics defined in this
      appendix.

   As a rule, at most one payload type in a relationship may specify a
   MIDI renderer. An exception to the rule applies to relationships
   that contain sendonly and recvonly streams but no sendrecv streams.
   In this case, one sendonly session and one recvonly session may each
   define a renderer.

   Renderer specification in a relationship may be done using the tools
   described in Appendix C.6.
   These tools work for both native streams and mpeg4-generic streams.
   An mpeg4-generic stream that uses the Appendix C.6 tools MUST set all
   "config" parameters to the empty string ("").

   Alternatively, for mpeg4-generic streams, renderer specification may
   be done by setting one "config" parameter in the relationship to the
   renderer configuration string and all other config parameters to the
   empty string ("").

   We now define sender and receiver rules that apply when a party sends
   several streams that target the same MIDI name space.

   Senders MAY use the subsetting parameters (Appendix C.1) to predefine
   the partitioning of commands between streams, or they MAY use a
   dynamic partitioning strategy.

   Receivers that merge identity relationship streams into a single MIDI
   command stream MUST maintain the structural integrity of the MIDI
   commands coded in each stream during the merging process, in the same
   way that software that merges traditional MIDI 1.0 DIN cable flows is
   responsible for creating a merged command flow compatible with
   [MIDI].

   Senders MUST partition the name space so that the rendered MIDI
   performance does not contain indefinite artifacts (as defined in
   Section 4). This responsibility holds even if all streams are sent
   over reliable transport, as different stream latencies may yield
   indefinite artifacts. For example, stuck notes may occur in a
   performance split over two TCP streams, if NoteOn commands are sent
   on one stream and NoteOff commands are sent on the other.

   Senders MUST NOT split a Registered Parameter Numbers (RPN) or
   Non-Registered Parameter Numbers (NRPN) transaction appearing on a
   MIDI channel across multiple identity relationship sessions.
   Receivers MUST assume that the RPN/NRPN transactions that appear on
   different identity relationship sessions are independent and MUST
   preserve transactional integrity during the MIDI merge.

   A simple way to safely partition voice channel commands is to place
   all MIDI commands for a particular voice channel into the same
   session. Safe partitioning of MIDI system commands may be more
   complicated for sessions that extensively use System Exclusive.

   We now show several session description examples that use the
   musicport parameter.

   Our first session description example shows two RTP MIDI streams that
   drive the same General MIDI decoder. The sender partitions MIDI
   commands between the streams dynamically. The musicport values
   indicate that the streams share an identity relationship.

   v=0
   o=lazzaro 2520644554 2838152170 IN IP4 first.example.net
   s=Example
   t=0 0
   a=group:FID 1 2
   c=IN IP4 192.0.2.94
   m=audio 5004 RTP/AVP 96
   a=rtpmap:96 mpeg4-generic/44100
   a=mid:1
   a=fmtp:96 streamtype=5; mode=rtp-midi; profile-level-id=12;
   config=7A0A0000001A4D546864000000060000000100604D54726B0
   000000600FF2F000; musicport=12
   m=audio 5006 RTP/AVP 96
   a=rtpmap:96 mpeg4-generic/44100
   a=mid:2
   a=fmtp:96 streamtype=5; mode=rtp-midi; config="";
   profile-level-id=12; musicport=12

   (The a=fmtp lines have been wrapped to fit the page to accommodate
   memo formatting restrictions; they comprise single lines in SDP.)

   Recall that Section 2.1 defines rules for streams that target the
   same MIDI name space. Those rules, implemented in the example above,
   require that each stream resides in a separate RTP session and that
   the grouping mechanisms defined in [RFC5888] signal an inter-session
   relationship. The "group" and "mid" attribute lines implement this
   grouping mechanism.
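
   The identity and ordered relationships defined earlier in this
   appendix can be summarized with a small informal sketch. The function
   names and the dictionary that maps stream labels to musicport values
   are hypothetical; the sketch only classifies musicport values and
   computes the channel index within the larger MIDI space formed by an
   ordered relationship. It does not check the session-level rules (such
   as FID grouping) stated above.

```python
def find_relationships(musicports):
    """Classify streams by musicport value.

    musicports maps a stream label to its musicport value. Returns
    (identity, ordered): identity lists the labels sharing one value,
    ordered lists runs of sequential musicport values.
    """
    by_value = {}
    for label, value in musicports.items():
        by_value.setdefault(value, []).append(label)

    identity = [sorted(labels)
                for _, labels in sorted(by_value.items())
                if len(labels) > 1]

    ordered, run = [], []
    for value in sorted(by_value):
        if run and value != run[-1] + 1:
            if len(run) > 1:
                ordered.append(run)
            run = []
        run.append(value)
    if len(run) > 1:
        ordered.append(run)
    return identity, ordered


def extended_channel(musicport, channel, lowest_musicport):
    """Channel index within the larger space of an ordered relationship."""
    if not 0 <= channel <= 15:
        raise ValueError("MIDI voice channels are numbered 0 through 15")
    if musicport < lowest_musicport:
        raise ValueError("musicport lies below the start of the relationship")
    return (musicport - lowest_musicport) * 16 + channel


print(find_relationships({"A": 12, "B": 12}))
print(find_relationships({"A": 2, "B": 3}))
print(extended_channel(3, 0, 2))
```

   The last call mirrors the earlier example in which stream B
   (musicport 3) follows stream A (musicport 2), so that voice channel 0
   of stream B becomes voice channel 16 of the combined space.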
+ + A variant on this example, whose session description is not shown, + would use two streams in an identity relationship driving the same + MIDI renderer, each with a different transport type. One stream + would use UDP and would be dedicated to real-time messages. A second + stream would use TCP [RFC4571] and would be used for SysEx bulk data + messages. + + In the next example, two mpeg4-generic streams form an ordered + relationship to drive a Structured Audio decoder with 32 MIDI voice + channels. Both streams reside in the same RTP session. + + + + + + + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 129] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + v=0 + o=lazzaro 2520644554 2838152170 IN IP6 first.example.net + s=Example + t=0 0 + m=audio 5006 RTP/AVP 96 97 + c=IN IP6 2001:DB8::7F2E:172A:1E24 + a=rtpmap:96 mpeg4-generic/44100 + a=fmtp:96 streamtype=5; mode=rtp-midi; config=""; + profile-level-id=13; musicport=5 + a=rtpmap:97 mpeg4-generic/44100 + a=fmtp:97 streamtype=5; mode=rtp-midi; config=""; + profile-level-id=13; musicport=6; render=synthetic; + rinit=audio/asc; + url="http://example.com/cardinal.asc"; + cid="azsldkaslkdjqpwojdkmsldkfpe" + + (The a=fmtp lines have been wrapped to fit the page to accommodate + memo formatting restrictions; they comprise single lines in SDP.) + + The sequential musicport values for the two sessions establish the + ordered relationship. The musicport=5 session maps to Structured + Audio extended channels range 0-15; the musicport=6 session maps to + Structured Audio extended channels range 16-31. + + Both config strings are empty. The configuration data is specified + by parameters that appear in the fmtp line of the second media + description. We define this configuration method in Appendix C.6. + + The next example shows two RTP MIDI streams (one recvonly, one + sendonly) that form a "virtual sendrecv" session. 
Each stream + resides in a different RTP session (a requirement because sendonly + and recvonly are RTP session attributes). + + + + + + + + + + + + + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 130] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + v=0 + o=lazzaro 2520644554 2838152170 IN IP4 first.example.net + s=Example + t=0 0 + a=group:FID 1 2 + c=IN IP4 192.0.2.94 + m=audio 5004 RTP/AVP 96 + a=sendonly + a=rtpmap:96 mpeg4-generic/44100 + a=mid:1 + a=fmtp:96 streamtype=5; mode=rtp-midi; profile-level-id=12; + config=7A0A0000001A4D546864000000060000000100604D54726B0 + 000000600FF2F000; musicport=12 + m=audio 5006 RTP/AVP 96 + a=recvonly + a=rtpmap:96 mpeg4-generic/44100 + a=mid:2 + a=fmtp:96 streamtype=5; mode=rtp-midi; profile-level-id=12; + config=7A0A0000001A4D546864000000060000000100604D54726B0 + 000000600FF2F000; musicport=12 + + (The a=fmtp lines have been wrapped to fit the page to accommodate + memo formatting restrictions; they comprise single lines in SDP.) + + To signal the "virtual sendrecv" semantics, the two streams assign + musicport to the same value (12). As defined earlier in this + section, pairs of identity relationship streams that are sent by + different parties share the association that is shared by a MIDI + cable pair that cross-connects two devices in a MIDI 1.0 network. We + use the term "virtual sendrecv" because streams sent by different + parties in a true sendrecv session also have this property. + + As discussed in the preamble to Appendix C, the primary advantage of + the virtual sendrecv configuration is that each party can customize + the property of the stream it receives. In the example above, each + stream defines its own "config" string that could customize the + rendering algorithm for each party (in fact, the particular strings + shown in this example are identical, because General MIDI is not a + configurable MPEG 4 renderer). + +C.6. 
Configuration Tools: MIDI Rendering + + This appendix defines the session configuration tools for rendering. + + The render parameter specifies a rendering method for a stream. The + parameter is assigned a token value that signals the top-level + rendering class. This memo defines four token values for render: + "unknown", "synthetic", "api", and "null": + + + +Lazzaro & Wawrzynek Standards Track [Page 131] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + o An "unknown" renderer is a renderer whose nature is unspecified. + It is the default renderer for native RTP MIDI streams. + + o A "synthetic" renderer transforms the MIDI stream into audio + output (or sometimes into stage lighting changes or other + actions). It is the default renderer for mpeg4-generic RTP MIDI + streams. + + o An "api" renderer presents the command stream to applications via + an Application Programming Interface (API). + + o The "null" renderer discards the MIDI stream. + + The "null" render value plays special roles during Offer/Answer + negotiations [RFC3264]. A party uses the "null" value in an answer + to reject an offered renderer. Note that rejecting a renderer is + independent from rejecting a payload type (coded by removing the + payload type from a media line) and rejecting a media stream (coded + by zeroing the port of a media line that uses the renderer). + + Other render token values MAY be registered with IANA. The token + value MUST adhere to the ABNF for render tokens defined in Appendix + D. Registrations MUST include a complete specification of parameter + value usage, similar in depth to the specifications that appear + throughout Appendix C.6 for "synthetic" and "api" render values. If + a party is offered a session description that uses a render token + value that is not known to the party, the party MUST NOT accept the + renderer. Options include rejecting the renderer (using the "null" + value), the payload type, the media stream, or the session + description. 
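As a non-normative illustration, the Python sketch below shows how an answering party might choose the render value to return for one offered renderer. It takes the first of the options listed above (rejecting the renderer with "null"); rejecting the payload type, media stream, or session description is not modeled. The function name answer_render and the supported argument are hypothetical.

```python
# The four render token values defined in this memo.
DEFINED_RENDER_TOKENS = frozenset({"unknown", "synthetic", "api", "null"})


def answer_render(offered, supported=frozenset({"synthetic"})):
    """Return the render value to place in an answer for one offered renderer.

    `supported` is the set of render tokens this (hypothetical) party
    implements. Tokens that are not known to the party, or known but not
    implemented, are rejected by answering "null".
    """
    if offered in supported:
        return offered
    return "null"
```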
+ + Other parameters MAY follow a render parameter in a parameter list. + The additional parameters act to define the exact nature of the + renderer. For example, the subrender parameter (defined in Appendix + C.6.2) specifies the exact nature of the renderer. + + Special rules apply to using the render parameter in an mpeg4-generic + stream. We define these rules in Appendix C.6.5. + +C.6.1. The multimode Parameter + + A media description MAY contain several render parameters. By + default, if a parameter list includes several render parameters, a + receiver MUST choose exactly one renderer from the list to render the + stream. The multimode parameter may be used to override this + default. We define two token values for multimode: "one" and "all". + + + + + +Lazzaro & Wawrzynek Standards Track [Page 132] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + o The default "one" value requests rendering by exactly one of the + listed renderers. + + o The "all" value requests the synchronized rendering of the RTP + MIDI stream by all listed renderers, if possible. + + If the multimode parameter appears in a parameter list, it MUST + appear before the first render parameter assignment. + + Render parameters appear in the parameter list in order of decreasing + priority. A receiver MAY use the priority ordering to decide which + renderer(s) to retain in a session. + + If the "offer" in an Offer/Answer-style negotiation [RFC3264] + contains a parameter list with one or more render parameters, the + "answer" MUST set the render parameters of all unchosen renderers to + "null". + +C.6.2. Renderer Specification + + The render parameter (Appendix C.6 preamble) specifies, in a broad + sense, what a renderer does with a MIDI stream. In this appendix, we + describe the subrender parameter. The token value assigned to + subrender defines the exact nature of the renderer. 
Thus, render and + subrender combine to define a renderer, in the same way as MIME types + and MIME subtypes combine to define a type of media [RFC2045]. + + If the subrender parameter is used for a renderer definition, it MUST + appear immediately after the render parameter in the parameter list. + At most, one subrender parameter may appear in a renderer definition. + + This document defines one value for subrender: the value "default". + The "default" token specifies the use of the default renderer for the + stream type (native or mpeg4-generic). The default renderer for + native RTP MIDI streams is a renderer whose nature is unspecified + (see point 6 in Section 6.1 for details). The default renderer for + mpeg4-generic RTP MIDI streams is an MPEG 4 Audio Object Type whose + ID number is 13, 14, or 15 (see Section 6.2 for details). + + If a renderer definition does not use the subrender parameter, the + value "default" is assumed for subrender. + + Other subrender token values may be registered with IANA. We now + discuss guidelines for registering subrender values. + + A subrender value is registered for a specific stream type (native or + mpeg4-generic) and a specific render value (excluding "null" and + "unknown"). Registrations for mpeg4-generic subrender values are + + + +Lazzaro & Wawrzynek Standards Track [Page 133] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + restricted to new MPEG 4 Audio Object Types that accept MIDI input. + The syntax of the token MUST adhere to the token definition in + Appendix D. + + For "render=synthetic" renderers, a subrender value registration + specifies an exact method for transforming the MIDI stream into audio + (or sometimes into video or control actions, such as stage lighting). + For standardized renderers, this specification is usually a pointer + to a standards document, perhaps supplemented by RTP-MIDI-specific + information. 
For commercial products and open-source projects, this + specification usually takes the form of instructions for interfacing + the RTP MIDI stream with the product or project software. A + "render=synthetic" registration MAY specify additional Reset State + commands for the renderer (Appendix A.1). + + A "render=api" subrender value registration specifies how an RTP MIDI + stream interfaces with an API. This specification is usually a + pointer to programmer's documentation for the API, perhaps + supplemented by RTP-MIDI-specific information. + + A subrender registration MAY specify an initialization file (referred + to in this document as an initialization data object) for the stream. + The initialization data object MAY be encoded in the parameter list + (verbatim or by reference) using the coding tools defined in Appendix + C.6.3. An initialization data object MUST have a registered + [RFC4288] media type and subtype [RFC2045]. + + For "render=synthetic" renderers, the data object usually encodes + initialization data for the renderer (sample files, synthesis patch + parameters, reverberation room impulse responses, etc.). + + For "render=api" renderers, the data object usually encodes data + about the stream used by the API (for example, for an RTP MIDI stream + generated by a piano keyboard controller, the manufacturer and model + number of the keyboard, for use in GUI presentation). + + Usually, only one initialization object is encoded for a renderer. + If a renderer uses multiple data objects, the correct receiver + interpretation of multiple data objects MUST be defined in the + subrender registration. + + A subrender value registration may also specify additional + parameters, to appear in the parameter list immediately after + subrender. 
These parameter names MUST begin with the subrender value + followed by an underscore ("_") to avoid name space collisions with + future RTP MIDI parameter names (for example, a parameter "foo_bar" + defined for subrender value "foo"). + + We now specify guidelines for interpreting the subrender parameter + during session configuration. + + If a party is offered a session description that uses a renderer + whose subrender value is not known to the party, the party MUST NOT + accept the renderer. Options include rejecting the renderer (using + the "null" value), the payload type, the media stream, or the session + description. + + Receivers MUST be aware of the Reset State commands (Appendix A.1) + for the renderer specified by the subrender parameter and MUST ensure + that the renderer does not experience indefinite artifacts due to the + presence (or the loss) of a Reset State command. + +C.6.3. Renderer Initialization + + If the renderer for a stream uses an initialization data object, an + rinit parameter MUST appear in the parameter list immediately after + the subrender parameter. If the renderer parameter list does not + include a subrender parameter (recall the semantics for "default" in + Appendix C.6.2), the rinit parameter MUST appear immediately after + the render parameter. + + The value assigned to the rinit parameter MUST be the media + type/subtype [RFC2045] for the initialization data object. If an + initialization object type is registered with several media types, + including audio, the assignment to rinit MUST use the audio media + type. + + RTP MIDI supports several parameters for encoding initialization data + objects for renderers in the parameter list: inline, url, and cid. + + If the inline, url, and/or cid parameters are used by a renderer, + these parameters MUST immediately follow the rinit parameter. 
+ + If a url parameter appears for a renderer, an inline parameter MUST + NOT appear. If an inline parameter appears for a renderer, a url + parameter MUST NOT appear. However, neither url nor inline is + required to appear. If neither a url nor an inline parameter follows + rinit, the cid parameter MUST follow rinit. + + The inline parameter supports the inline encoding of the data object. + The parameter is assigned a double-quoted Base64 [RFC2045] encoding + of the binary data object, with no line breaks. Appendix E.4 shows + an example that constructs an inline parameter value. + + The url parameter is assigned a double-quoted string representation + of a Uniform Resource Locator (URL) for the data object. The string + MUST specify either a Hypertext Transfer Protocol URI (HTTP, + [RFC2616]) or an HTTP over TLS URI (HTTPS, [RFC2818]). The media + type/subtype for the data object SHOULD be specified in the + appropriate HTTP or HTTPS transport header. + + The cid parameter supports data object caching. The parameter is + assigned a double-quoted string value that encodes a globally unique + identifier for the data object. + + A cid parameter MAY immediately follow an inline parameter, in which + case the cid identifier value MUST be associated with the inline data + object. + + If a url parameter is present, and if the data object for the URL is + expected to be unchanged for the life of the URL, a cid parameter MAY + immediately follow the url parameter. The cid identifier value MUST + be associated with the data object for the URL. A cid parameter + assigned to the same identifier value SHOULD be specified following + the data object type/subtype in the appropriate HTTP transport + header. 
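As a non-normative illustration, the Python sketch below covers two of the rules stated above: producing an inline value (double-quoted Base64 with no line breaks) and checking the url/inline/cid combination that follows one rinit parameter. The function names encode_inline and check_data_object_params are hypothetical, and the checker does not attempt to verify the caching semantics of cid.

```python
import base64


def encode_inline(data):
    """Return a double-quoted, single-line Base64 value for a binary data object."""
    # base64.b64encode never inserts line breaks into its output.
    return '"' + base64.b64encode(data).decode("ascii") + '"'


def check_data_object_params(names):
    """Check the parameter names that immediately follow one rinit parameter.

    `names` lists those names in order, for example ["url", "cid"].
    Returns a list of problems; the list is empty if none were found.
    """
    names = list(names)
    problems = []
    has_url = "url" in names
    has_inline = "inline" in names
    if has_url and has_inline:
        problems.append("url and inline MUST NOT both appear")
    if not has_url and not has_inline and names[:1] != ["cid"]:
        problems.append("cid MUST follow rinit when url and inline are absent")
    return problems
```

For instance, check_data_object_params(["url", "cid"]) reports no problems, while a list that contains both "inline" and "url" reports one.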
+ + If a url parameter is present, and if the data object for the URL is + expected to change during the life of the URL, a cid parameter MUST + NOT follow the url parameter. A receiver interprets the presence of + a cid parameter as an indication that it is safe to use a cached copy + of the url data object; the absence of a cid parameter is an + indication that it is not safe to use a cached copy, as it may + change. + + Finally, the cid parameter MAY be used without the inline and url + parameters. In this case, the identifier references a local or + distributed catalog of data objects. + + In most cases, only one data object is coded in the parameter list + for each renderer. For example, the default renderer for + mpeg4-generic streams uses a single data object (see Appendix C.6.5 + for example usage). + + However, a subrender registration MAY permit the use of multiple data + objects for a renderer. If multiple data objects are encoded for a + renderer, each object encoding begins with an rinit parameter + followed by inline, url, and/or cid parameters. + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 136] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Initialization data objects MAY encapsulate a Standard MIDI File + (SMF). By default, the SMFs that are encapsulated in a data object + MUST be ignored by an RTP MIDI receiver. We define parameters to + override this default in Appendix C.6.4. + + To end this section, we offer guidelines for registering media types + for initialization data objects. These guidelines are in addition to + the information in [RFC4288]. + + Some initialization data objects are also capable of encoding MIDI + note information and thus complete audio performances. These objects + SHOULD be registered using the audio media type (so that the objects + may also be used for store-and-forward rendering) and the + "application" media type (to support editing tools). 
Initialization + objects without note storage, or initialization objects for non-audio + renderers, SHOULD be registered only for an "application" media type. + +C.6.4. MIDI Channel Mapping + + In this appendix, we specify how to map MIDI name spaces (16 voice + channels + systems) onto a renderer. + + In the general case: + + o A session may define an ordered relationship (Appendix C.5) that + presents more than one MIDI name space to a renderer. + + o A renderer may accept an arbitrary number of MIDI name spaces, or + it may expect a specific number of MIDI name spaces. + + A session description SHOULD provide a compatible MIDI name space to + each renderer in the session. If a receiver detects that a session + description has too many or too few MIDI name spaces for a renderer, + MIDI data from extra stream name spaces MUST be discarded, and extra + renderer name spaces MUST NOT be driven with MIDI data (except as + described in Appendix C.6.4.1). + + If a parameter list defines several renderers and assigns the "all" + token value to the multimode parameter, the same name space is + presented to each renderer. However, the chanmask parameter may be + used to mask out selected voice channels to each renderer. We define + chanmask and other MIDI management parameters in the subsections + below. + + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 137] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + +C.6.4.1. The smf_info Parameter + + The smf_info parameter defines the use of the SMFs encapsulated in + renderer data objects (if any). The smf_info parameter also defines + the use of SMFs coded in the smf_inline, smf_url, and smf_cid + parameters (defined in Appendix C.6.4.2). + + The smf_info parameter describes the render parameter that most + recently precedes it in the parameter list. 
The smf_info parameter + MUST NOT appear in parameter lists that do not use the render + parameter and MUST NOT appear before the first use of render in the + parameter list. + + We define three token values for smf_info: "ignore", "sdp_start", and + "identity": + + o The "ignore" value indicates that the SMFs MUST be discarded. + This behavior is the default SMF-rendering behavior. + + o The "sdp_start" value codes that SMFs MUST be rendered and that + the rendering MUST begin upon the acceptance of the session + description. If a receiver is offered a session description with + a renderer that uses an smf_info parameter set to "sdp_start" and + if the receiver does not support rendering SMFs, the receiver MUST + NOT accept the renderer associated with the smf_info parameter. + Options include rejecting the renderer (by setting the render + parameter to "null"), the payload type, the media stream, or the + entire session description. + + o The "identity" value indicates that the SMFs code the identity of + the renderer. The value is meant for use with the "unknown" + renderer (see Appendix C.6 preamble). The MIDI commands coded in + the SMF are informational in nature and MUST NOT be presented to a + renderer for audio presentation. In typical use, the SMF would + use SysEx Identity Reply commands (F0 7E nn 06 02, as defined in + [MIDI]) to identify devices and use device-specific SysEx commands + to describe the current state of the devices (patch memory + contents, etc.). + + Other smf_info token values MAY be registered with IANA. The token + value MUST adhere to the ABNF for render tokens defined in Appendix + D. Registrations MUST include a complete specification of parameter + usage, similar in depth to the specifications that appear in this + appendix for "sdp_start" and "identity". 
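As a non-normative illustration, the Python sketch below maps an smf_info token to the action a receiver might take for the SMFs of one renderer. The function name smf_info_action, the can_render_smf flag, and the returned action labels are hypothetical. Rejection is reported as a single "reject-renderer" outcome, although rejecting the payload type, media stream, or session description is equally permitted.

```python
def smf_info_action(token="ignore", can_render_smf=False):
    """Return a label for how a receiver treats the SMFs of one renderer."""
    if token == "ignore":
        # Default behavior: the SMFs are discarded.
        return "discard"
    if token == "sdp_start":
        # Rendering begins when the session description is accepted; a
        # receiver that cannot render SMFs must not accept the renderer.
        return "render-on-accept" if can_render_smf else "reject-renderer"
    if token == "identity":
        # The SMF is informational and is never presented to a renderer.
        return "read-as-identity"
    # A token value not known to the party: do not accept the renderer.
    return "reject-renderer"
```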
+ + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 138] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + If a party is offered a session description that uses an smf_info + parameter value that is not known to the party, the party MUST NOT + accept the renderer associated with the smf_info parameter. Options + include rejecting the renderer, the payload type, the media stream, + or the entire session description. + + We now define the rendering semantics for the "sdp_start" token value + in detail. + + The SMFs and RTP MIDI streams in a session description share the same + MIDI name space(s). In the simple case of a single RTP MIDI stream + and a single SMF, the SMF MIDI commands and RTP MIDI commands are + merged into a single name space and presented to the renderer. The + indefinite artifact responsibilities for merged MIDI streams defined + in Appendix C.5 also apply to merging RTP and SMF MIDI data. + + If a payload type codes multiple SMFs, the SMF name spaces are + presented as an ordered entity to the renderer. To determine the + ordering of SMFs for a renderer (which SMF is "first", which is + "second", etc.), use the following rules: + + o If the renderer uses a single data object, the order of appearance + of the SMFs in the object's internal structure defines the order + of the SMFs (the earliest SMF in the object is "first", the next + SMF in the object is "second", etc.). + + o If multiple data objects are encoded for a renderer, the + appearance of each data object in the parameter list sets the + relative order of the SMFs encoded in each data object (SMFs + encoded in parameters that appear earlier in the list are ordered + before SMFs encoded in parameters that appear later in the list). 
+ + o If SMFs are encoded in data objects parameters and in the + parameters defined in Appendix C.6.4.2, the relative order of the + data object parameters and Appendix C.6.4.2 parameters in the + parameter list sets the relative order of SMFs (SMFs encoded in + parameters that appear earlier in the list are ordered before SMFs + in parameters that appear later in the list). + + Given this ordering of SMFs, we now define the mapping of SMFs to + renderer name spaces. The SMF that appears first for a renderer maps + to the first renderer name space. The SMF that appears second for a + renderer maps to the second renderer name space, etc. If the + associated RTP MIDI streams also form an ordered relationship, the + first SMF is merged with the first name space of the relationship, + the second SMF is merged to the second name space of the + relationship, etc. + + + + +Lazzaro & Wawrzynek Standards Track [Page 139] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Unless the streams and the SMFs both use MIDI Time Code, the time + offset between SMF and stream data is unspecified. This restriction + limits the use of SMFs to applications where synchronization is not + critical, such as the transport of System Exclusive commands for + renderer initialization or human-SMF interactivity. + + Finally, we note that each SMF in the sdp_start discussion above + encodes exactly one MIDI name space (16 voice channels + systems). + Thus, the use of the Device Name SMF meta event to specify several + MIDI name spaces in an SMF is not supported for sdp_start. + +C.6.4.2. The smf_inline, smf_url, and smf_cid Parameters + + In some applications, the renderer data object may not encapsulate + SMFs, but an application may wish to use SMFs in the manner defined + in Appendix C.6.4.1. + + The smf_inline, smf_url, and smf_cid parameters address this + situation. 
These parameters use the syntax and semantics of the + inline, url, and cid parameters defined in Appendix C.6.3, except + that the encoded data object is an SMF. + + The smf_inline, smf_url, and smf_cid parameters belong to the render + parameter that most recently precedes them in the session description. + The smf_inline, smf_url, and smf_cid parameters MUST NOT appear in + parameter lists that do not use the render parameter and MUST NOT + appear before the first use of render in the parameter list. If + several smf_inline, smf_url, or smf_cid parameters appear for a + renderer, the order of the parameters defines the SMF name space + ordering. + +C.6.4.3. The chanmask Parameter + + The chanmask parameter instructs the renderer to ignore all MIDI + voice commands for certain channel numbers. The parameter value is a + concatenated string of "1" and "0" digits. Each string position maps + to a MIDI voice channel number (system channels may not be masked). + A "1" instructs the renderer to process the voice channel; a "0" + instructs the renderer to ignore the voice channel. + + The string length of the chanmask parameter value MUST be 16 (for a + single stream or an identity relationship) or a multiple of 16 (for + an ordered relationship). + + The chanmask parameter describes the render parameter that most + recently precedes it in the session description; chanmask MUST NOT + appear in parameter lists that do not use the render parameter and + MUST NOT appear before the first use of render in the parameter list. + + The chanmask parameter describes the final MIDI name spaces presented + to the renderer. The SMF and stream components of the MIDI name + spaces may not be independently masked. 
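As a non-normative illustration, the Python sketch below validates a chanmask value and splits it into one 16-entry mask per MIDI name space. The function name parse_chanmask is hypothetical. The sketch also assumes that the leftmost digit of each group of 16 corresponds to the lowest-numbered voice channel and that the first group corresponds to the first name space of an ordered relationship; the text above does not state either ordering explicitly.

```python
def parse_chanmask(mask):
    """Split a chanmask string into per-name-space lists of 16 booleans.

    True means the renderer processes the voice channel; False means the
    renderer ignores it.
    """
    if not mask or len(mask) % 16 != 0:
        raise ValueError("chanmask length must be 16 or a multiple of 16")
    if set(mask) - {"0", "1"}:
        raise ValueError('chanmask may contain only "1" and "0" digits')
    return [
        [digit == "1" for digit in mask[start:start + 16]]
        for start in range(0, len(mask), 16)
    ]
```

Under these assumptions, parse_chanmask("1111000000000000") yields a single name space in which only the first four voice channels are processed.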
+ + If a receiver is offered a session description with a renderer that + uses the chanmask parameter, and if the receiver does not implement + the semantics of the chanmask parameter, the receiver MUST NOT accept + the renderer unless the chanmask parameter value contains only "1"s. + +C.6.5. The audio/asc Media Type + + In Section 11.3, we register the audio/asc media type. The data + object for audio/asc is a binary encoding of the AudioSpecificConfig + data block used to initialize mpeg4-generic streams (Section 6.2 and + [MPEGAUDIO]). Disk files that store this data object use the file + extension ".acn". + + An mpeg4-generic parameter list MAY use the render, subrender, and + rinit parameters with the audio/asc media type for renderer + configuration. Several restrictions apply to the use of these + parameters in mpeg4-generic parameter lists: + + o An mpeg4-generic media description that uses the render parameter + MUST assign the empty string ("") to the mpeg4-generic "config" + parameter. The use of the streamtype, mode, and profile-level-id + parameters MUST follow the normative text in Section 6.2. + + o Sessions that use identity or ordered relationships MUST follow + the mpeg4-generic configuration restrictions in Appendix C.5. + + o The render parameter MUST be assigned the value "synthetic", + "unknown", "null", or a render value that has been added to the + IANA repository for use with mpeg4-generic RTP MIDI streams. The + "api" token value for render MUST NOT be used. + + o If a subrender parameter is present, it MUST immediately follow + the render parameter, and it MUST be assigned the token value + "default" or assigned a subrender value added to the IANA + repository for use with mpeg4-generic RTP MIDI streams. A + subrender parameter assignment may be left out of the renderer + configuration, in which case the implied value of subrender is the + default value of "default". 
+ + + + +Lazzaro & Wawrzynek Standards Track [Page 141] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + o If the render parameter is assigned the value "synthetic" and the + subrender parameter has the value "default" (assigned or implied), + the rinit parameter MUST be assigned the value audio/asc, and an + AudioSpecificConfig data object MUST be encoded using the + mechanisms defined in Appendices C.6.2 and C.6.3. The + AudioSpecificConfig data MUST encode one of the MPEG 4 Audio + Object Types defined for use with mpeg4-generic in Section 6.2. + If the subrender value is other than "default", refer to the + subrender registration for information on the use of audio/asc + with the renderer. + + o If the render parameter is assigned the value "null" or "unknown", + the data object MAY be omitted. + + Several general restrictions apply to the use of the audio/asc media + type in RTP MIDI: + + o A native stream MUST NOT assign audio/asc to rinit. The audio/asc + media type is not intended to be a general-purpose container for + rendering systems outside of MPEG usage. + + o The audio/asc media type defines a stored object type; it does not + define semantics for RTP streams. Thus, audio/asc MUST NOT appear + on an rtpmap line of a session description. + + Below, we show session description examples for audio/asc. The + session description below uses the inline parameter to code the + AudioSpecificConfig block for a mpeg4-generic General MIDI stream. + We derive the value assigned to the inline parameter in Appendix E.4. + The subrender token value of "default" is implied by the absence of + the subrender parameter in the parameter list. 
+ + v=0 + o=lazzaro 2520644554 2838152170 IN IP4 first.example.net + s=Example + t=0 0 + m=audio 5004 RTP/AVP 96 + c=IN IP4 192.0.2.94 + a=rtpmap:96 mpeg4-generic/44100 + a=fmtp:96 streamtype=5; mode=rtp-midi; config=""; + profile-level-id=12; render=synthetic; rinit=audio/asc; + inline="egoAAAAaTVRoZAAAAAYAAAABAGBNVHJrAAAABgD/LwAA" + + (The a=fmtp line has been wrapped to fit the page to accommodate memo + formatting restrictions; it comprises a single line in SDP.) + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 142] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + The session description below uses the url parameter to code the + AudioSpecificConfig block for the same General MIDI stream: + + v=0 + o=lazzaro 2520644554 2838152170 IN IP4 first.example.net + s=Example + t=0 0 + m=audio 5004 RTP/AVP 96 + c=IN IP4 192.0.2.94 + a=rtpmap:96 mpeg4-generic/44100 + a=fmtp:96 streamtype=5; mode=rtp-midi; config=""; + profile-level-id=12; render=synthetic; rinit=audio/asc; + url="http://example.net/oski.asc"; + cid="xjflsoeiurvpa09itnvlduihgnvet98pa3w9utnuighbuk" + + (The a=fmtp line has been wrapped to fit the page to accommodate memo + formatting restrictions; it comprises a single line in SDP.) + +C.7. Interoperability + + In this appendix, we define interoperability guidelines for two + application areas: + + o MIDI content-streaming applications. RTP MIDI is added to RTSP- + based content-streaming servers so that viewers may experience + MIDI performances (produced by a specified client-side renderer) + in synchronization with other streams (video, audio). + + o Long-distance network musical performance applications. RTP MIDI + is added to SIP-based voice chat or videoconferencing programs, as + an alternative, or as an addition, to audio and/or video RTP + streams. + + For each application, we define a core set of functionalities that + all implementations MUST implement. 
+ + The applications we address in this section are not an exhaustive + list of potential RTP MIDI uses. We expect framework documents for + other applications to be developed, within the IETF or within other + organizations. We discuss other potential application areas for RTP + MIDI in Section 1 of the main text of this memo. + + + + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 143] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + +C.7.1. MIDI Content-Streaming Applications + + In content-streaming applications, a user invokes an RTSP client to + initiate a request to an RTSP server to view a multimedia session. + For example, clicking on a web page link for an Internet Radio + channel launches an RTSP client that uses the link's RTSP URL to + contact the RTSP server hosting the radio channel. + + The content may be pre-recorded (for example, on-demand replay of + yesterday's football game) or "live" (for example, football game + coverage as it occurs), but in either case, the user is usually an + "audience member" as opposed to a "participant" (as the user would be + in telephony). + + Note that these examples describe the distribution of audio content + to an audience member. The interoperability guidelines in this + appendix address RTP MIDI applications of this nature, not + applications such as the transmission of raw MIDI command streams for + use in a professional environment (recording studio, performance + stage, etc.). + + In an RTSP session, a client accesses a session description that is + "declared" by the server, either via the RTSP DESCRIBE method or via + other means such as HTTP or email. The session description defines + the session from the perspective of the client. For example, if a + media line in the session description contains a non-zero port + number, it encodes the server's preference for the client's port + numbers for RTP and RTCP reception. 
Once media flow begins, the + server sends an RTP MIDI stream to the client, which renders it for + presentation, perhaps in synchrony with video or other audio streams. + + We now define the interoperability text for content-streaming RTSP + applications. + + In most cases, server interoperability responsibilities are described + in terms of limits on the "reference" session description a server + provides for a performance if it has no information about the + capabilities of the client. The reference session is a "lowest + common denominator" session that maximizes the odds that a client + will be able to view the session. If a server is aware of the + capabilities of the client, the server is free to provide a session + description customized for the client in the DESCRIBE reply. + + Clients MUST support unicast UDP RTP MIDI streams that use the + recovery journal with the closed-loop or the anchor sending policies. + Clients MUST be able to interpret stream subsetting and chapter + + + + + +Lazzaro & Wawrzynek Standards Track [Page 144] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + inclusion parameters in the session description that qualify the + sending policies. Client support of enhanced Chapter C encoding is + OPTIONAL. + + The reference session description offered by a server MUST send all + RTP MIDI UDP streams as unicast streams that use the recovery journal + and the closed-loop or anchor sending policies. Servers SHOULD use + the stream subsetting and chapter inclusion parameters in the + reference session description to simplify the rendering task of the + client. Server support of enhanced Chapter C encoding is OPTIONAL. + + Clients and servers MUST support the use of RTSP interleaved mode (a + method for interleaving RTP onto the RTSP TCP transport). + + Clients MUST be able to interpret the timestamp semantics signalled + by the "comex" value of the tsmode parameter (i.e., the timestamp + semantics of Standard MIDI Files [MIDI]). 
Servers MUST use the + "comex" value for the tsmode parameter in the reference session + description. + + Clients MUST be able to process an RTP MIDI stream whose packets + encode an arbitrary temporal duration ("media time"). Thus, in + practice, clients MUST implement a MIDI playout buffer. Clients MUST + NOT depend on the presence of rtp_ptime, rtp_maxptime, and guardtime + parameters in the session description in order to process packets, + but they SHOULD be able to use these parameters to improve packet + processing. + + Servers SHOULD strive to send RTP MIDI streams in the same way media + servers send conventional audio streams: a sequence of packets that + all code either the same temporal duration (non-normative example: 50 + ms packets) or one of an integral number of temporal durations + (non-normative example: 50 ms, 100 ms, 250 ms, or 500 ms packets). + Servers SHOULD encode information about the packetization method in + the rtp_ptime and rtp_maxptime parameters in the session description. + + Clients MUST be able to examine the render and subrender parameters to + determine if a multimedia session uses a renderer they support. + Clients MUST be able to interpret the default "one" value of the + multimode parameter to identify supported renderers from a list of + renderer descriptions. Clients MUST be able to interpret the + musicport parameter to the degree that it is relevant to the + renderers they support. Clients MUST be able to interpret the + chanmask parameter. + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 145] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Clients supporting renderers whose data object (as encoded by a + parameter value for inline) could exceed 300 octets in size MUST + support the url and cid parameters and thus must implement the HTTP + protocol in addition to RTSP. HTTP over TLS [RFC2818] support for + data objects is OPTIONAL. + + Servers MUST specify complete rendering systems for RTP MIDI streams.
+ Note that a minimal RTP MIDI native stream does not meet this + requirement (Section 6.1), as the rendering method for such streams + is "not specified". + + At the time of writing this memo, the only way for servers to specify + a complete rendering system is to specify an mpeg4-generic RTP MIDI + stream in mode rtp-midi (Section 6.2 and Appendix C.6.5). As a + consequence, the only rendering systems that may be presently used + are General MIDI [MIDI], DLS 2 [DLS2], or Structured Audio [MPEGSA]. + Note that the maximum inline value for General MIDI is well under 300 + octets (and thus clients need not support the url parameter) and that + the maximum inline values for DLS 2 and Structured Audio may be much + larger than 300 octets (and thus clients MUST support the url + parameter). + + We anticipate that the owners of rendering systems (both standardized + and proprietary) will register subrender parameters for their + renderers. Once registration occurs, native RTP MIDI sessions may + use render and subrender (Appendix C.6.2) to specify complete + rendering systems for RTSP content-streaming multimedia sessions. + + Servers MUST NOT use the sdp_start value for the smf_info parameter + in the reference session description, as this use would require that + clients be able to parse and render Standard MIDI Files. + + Clients MUST support mpeg4-generic mode rtp-midi General MIDI (GM) + sessions, at a polyphony limited by the hardware capabilities of the + client. This requirement provides a "lowest common denominator" + rendering system for content providers to target. Note that this + requirement does not force implementors of a non-GM renderer (such as + DLS 2 or Structured Audio) to add a second rendering engine. + Instead, a client may satisfy the requirement by including a set of + voice patches that implement the GM instrument set and using this + emulation for mpeg4-generic GM sessions. 
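The 300-octet inline threshold discussed above can be checked mechanically. The following is a minimal sketch, not part of RTP MIDI: the names INLINE_LIMIT_OCTETS, inline_object_size, and needs_url_support are hypothetical, and the sketch assumes that the threshold applies to the decoded data object rather than to its base64 text. It uses the inline value that appears in this memo's General MIDI session descriptions.

```python
import base64

# Threshold quoted in this appendix; the constant name is hypothetical.
INLINE_LIMIT_OCTETS = 300


def inline_object_size(inline_value: str) -> int:
    """Size in octets of the data object carried by an inline value.

    Assumption: the limit is measured on the decoded object.
    """
    return len(base64.b64decode(inline_value, validate=True))


def needs_url_support(inline_value: str) -> bool:
    """True if this particular object is too large for the inline-only case."""
    return inline_object_size(inline_value) > INLINE_LIMIT_OCTETS


# Inline value from the General MIDI examples in this memo.
GM_INLINE = "egoAAAAaTVRoZAAAAAYAAAABAGBNVHJrAAAABgD/LwAA"

if __name__ == "__main__":
    print(inline_object_size(GM_INLINE), needs_url_support(GM_INLINE))
```

The requirement in the text is stated per renderer (whether its data objects could exceed the limit), so a per-object check like this one is only a sanity test for a given session description.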
+ + It is RECOMMENDED that servers use General MIDI as the renderer for + the reference session description because clients are REQUIRED to + support it. We do not require General MIDI as the reference renderer + because it is an inappropriate choice for normative applications. + + + + + +Lazzaro & Wawrzynek Standards Track [Page 146] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Servers using General MIDI as a "lowest common denominator" renderer + SHOULD use Universal Real-Time SysEx Maximum Instantaneous Polyphony + (MIP) messages [SPMIDI] to communicate the priority of voices to + polyphony-limited clients. + +C.7.2. MIDI Network Musical Performance Applications + + In Internet telephony and videoconferencing applications, parties + interact over an IP network as they would face-to-face. Good user + experiences require low end-to-end audio latency and tight + audiovisual synchronization (for "lip-sync"). The Session Initiation + Protocol (SIP, [RFC3261]) is used for session management. + + In this appendix section, we define interoperability guidelines for + using RTP MIDI streams in interactive SIP applications. Our primary + interest is supporting Network Musical Performances (NMPs), where + musicians in different locations interact over the network as if they + were in the same room. See [NMP] for background information on NMP, + and see [RFC4696] for a discussion of low-latency RTP MIDI + implementation techniques for NMP. + + Note that the goal of NMP applications is telepresence: the parties + should hear audio that is close to what they would hear if they were + in the same room. The interoperability guidelines in this appendix + address RTP MIDI applications of this nature, not applications such + as the transmission of raw MIDI command streams for use in a + professional environment (recording studio, performance stage, etc.). + + We focus on session management for two-party unicast sessions that + specify a renderer for RTP MIDI streams. 
Within this limited scope, + the guidelines defined here are sufficient to let applications + interoperate. We define the REQUIRED capabilities of RTP MIDI + senders and receivers in NMP sessions and define how session + descriptions exchanged are used to set up network musical performance + sessions. + + SIP lets parties negotiate details of the session using the + Offer/Answer protocol [RFC3264]. However, RTP MIDI has so many + parameters that "blind" negotiations between two parties might not + yield a common session configuration. + + Thus, we now define a set of capabilities that NMP parties MUST + support. Session description offers whose options lie outside the + envelope of REQUIRED party behavior risk negotiation failure. We + also define session description idioms that the RTP MIDI part of an + offer MUST follow in order to structure the offer for simpler + analysis. + + + + +Lazzaro & Wawrzynek Standards Track [Page 147] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + We use the term "offerer" for the party making a SIP offer and + "answerer" for the party answering the offer. Finally, we note that + unless it is qualified by the adjective "sender" or "receiver", a + statement that a party MUST support X implies that it MUST support X + for both sending and receiving. + + If an offerer wishes to define a "sendrecv" RTP MIDI stream, it may + use a true sendrecv session or the "virtual sendrecv" construction + described in the preamble to Appendix C and in Appendix C.5. A true + sendrecv session indicates that the offerer wishes to participate in + a session where both parties use identically configured renderers. A + virtual sendrecv session indicates that the offerer is willing to + participate in a session where the two parties may be using different + renderer configurations. Thus, parties MUST be prepared to see both + real and virtual sendrecv sessions in an offer. + + Parties MUST support unicast UDP transport of RTP MIDI streams. 
+ These streams MUST use the recovery journal with the closed-loop or + anchor sending policies. These streams MUST use the stream + subsetting and chapter inclusion parameters to declare the types of + MIDI commands that will be sent on the stream (for sendonly streams) + or will be processed (for recvonly streams), including the size + limits on System Exclusive commands. Support of enhanced Chapter C + encoding is OPTIONAL. + + Note that both TCP and multicast UDP support are OPTIONAL. We make + TCP OPTIONAL because we expect NMP renderers to rely on data objects + (signalled by rinit and associated parameters) for initialization at + the start of the session and only to use System Exclusive commands + for interactive control during the session. These interactive + commands are small enough to be protected via the recovery journal + mechanism of RTP MIDI UDP streams. + + We now discuss timestamps, packet timing, and packet-sending + algorithms. + + Recall that the tsmode parameter controls the semantics of command + timestamps in the MIDI list of RTP packets. + + Parties MUST support clock rates of 44.1 kHz, 48 kHz, 88.2 kHz, and + 96 kHz. Parties MUST support streams using the "comex", "async", and + "buffer" tsmode values. Recvonly offers MUST offer the default + "comex". + + Parties MUST support a wide range of packet temporal durations: from + rtp_ptime and rtp_maxptime values of 0, to rtp_ptime and rtp_maxptime + values that code 100 ms. Thus, receivers MUST be able to implement a + playout buffer. + + + +Lazzaro & Wawrzynek Standards Track [Page 148] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Offers and answers MUST present rtp_ptime, rtp_maxptime, and + guardtime values that support the latency that users would expect in + the application, subject to bandwidth constraints. 
As senders MUST + abide by values set for these parameters in a session description, a + receiver SHOULD use these values to size its playout buffer to + produce the lowest reliable latency for a session. Implementors + should refer to [RFC4696] for information on packet-sending + algorithms for latency-sensitive applications. Parties MUST be able + to implement the semantics of the guardtime parameter for times from + 5 ms to 5000 ms. + + We now discuss the use of the render parameter. + + Sessions MUST specify complete rendering systems for all RTP MIDI + streams. Note that a minimal RTP MIDI native stream does not meet + this requirement (Section 6.1), as the rendering method for such + streams is "not specified". + + At the time of this writing, the only way for parties to specify a + complete rendering system is to specify an mpeg4-generic RTP MIDI + stream in mode rtp-midi (Section 6.2 and Appendix C.6.5). We + anticipate that the owners of rendering systems (both standardized + and proprietary) will register subrender values for their renderers. + Once IANA registration occurs, native RTP MIDI sessions may use + render and subrender (Appendix C.6.2) to specify complete rendering + systems for SIP network musical performance multimedia sessions. + + All parties MUST support General MIDI (GM) sessions at a polyphony + limited by the hardware capabilities of the party. This requirement + provides a "lowest common denominator" rendering system, without + which practical interoperability will be quite difficult. When using + GM, parties SHOULD use Universal Real-Time SysEx MIP messages + [SPMIDI] to communicate the priority of voices to polyphony-limited + clients. + + Note that this requirement does not force implementors of a non-GM + renderer (for mpeg4-generic sessions, DLS 2, or Structured Audio) to + add a second rendering engine. 
Instead, a client may satisfy the + requirement by including a set of voice patches that implement the GM + instrument set and using this emulation for mpeg4-generic GM + sessions. We require GM support so that an offerer that wishes to + maximize interoperability may do so by offering GM if its preferred + renderer is not accepted by the answerer. + + Offerers MUST NOT present several renderers as options in a session + description by listing several payload types on a media line, as + Section 2.1 uses this construct to let a party send several RTP MIDI + streams in the same RTP session. + + + +Lazzaro & Wawrzynek Standards Track [Page 149] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Instead, an offerer wishing to present rendering options SHOULD offer + a single payload type that offers several renderers. In this + construct, the parameter list codes a list of render parameters (each + followed by its support parameters). As discussed in Appendix C.6.1, + the order of renderers in the list declares the offerer's preference. + The "unknown" and "null" values MUST NOT appear in the offer. The + answer MUST set all render values except the desired renderer to + "null". Thus, "unknown" MUST NOT appear in the answer. + + We use SHOULD instead of MUST in the first sentence in the paragraph + above because this technique does not work in all situations (for + example, if an offerer wishes to offer both mpeg4-generic renderers + and native RTP MIDI renderers as options). In this case, the offerer + MUST present a series of session descriptions, each offering a single + renderer, until the answerer accepts a session description. + + Parties MUST support the musicport, chanmask, subrender, rinit, and + inline parameters. Parties supporting renderers whose data object + (as encoded by a parameter value for inline) could exceed 300 octets + in size MUST support the url and cid parameters and thus must + implement the HTTP protocol. 
HTTP over TLS [RFC2818] support for + data objects is OPTIONAL. Note that in mpeg4-generic, General MIDI + data objects cannot exceed 300 octets, but DLS 2 and Structured Audio + data objects may. Support for the other rendering parameters + (smf_cid, smf_info, smf_inline, smf_url) is OPTIONAL. + + Thus far in this document, our discussion has assumed that the only + MIDI flows that drive a renderer are the network flows described in + the session description. In NMP applications, this assumption would + require two rendering engines: one for local use by a party and a + second for the remote party. + + In practice, applications may wish to have both parties share a + single rendering engine. In this case, the session description MUST + use a virtual sendrecv session and MUST use the stream subsetting and + chapter inclusion parameters to declare which MIDI channels are + intended for use by each party. If two parties are sharing a MIDI + channel, the application MUST ensure that appropriate MIDI merging + occurs at the input to the renderer. + + We now discuss the use of (non-MIDI) audio streams in the session. + + Audio streams may be used for two purposes: as a "talkback" channel + for parties to converse or as a way to conduct a performance that + includes MIDI and audio channels. In the latter case, offers MUST + use sample rates and packet temporal durations for the audio and + MIDI streams that support low-latency synchronized rendering. + + + + +Lazzaro & Wawrzynek Standards Track [Page 150] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + We now show an example of an offer/answer exchange in a network + musical performance application. + + Below, we show an offer that complies with the interoperability text + in this appendix section.
+ + v=0 + o=first 2520644554 2838152170 IN IP4 first.example.net + s=Example + t=0 0 + a=group:FID 1 2 + c=IN IP4 192.0.2.94 + m=audio 16112 RTP/AVP 96 + a=recvonly + a=mid:1 + a=rtpmap:96 mpeg4-generic/44100 + a=fmtp:96 streamtype=5; mode=rtp-midi; config=""; + profile-level-id=12; cm_unused=ABCFGHJKMNPQTVWXYZ; cm_used=2NPTW; + cm_used=2C0.1.7.10.11.64.121.123; cm_used=2M0.1.2; + cm_used=X0-16; ch_never=ABCDEFGHJKMNPQTVWXYZ; + ch_default=2NPTW; ch_default=2C0.1.7.10.11.64.121.123; + ch_default=2M0.1.2; ch_default=X0-16; + rtp_ptime=0; rtp_maxptime=0; guardtime=44100; + musicport=1; render=synthetic; rinit=audio/asc; + inline="egoAAAAaTVRoZAAAAAYAAAABAGBNVHJrAAAABgD/LwAA" + m=audio 16114 RTP/AVP 96 + a=sendonly + a=mid:2 + a=rtpmap:96 mpeg4-generic/44100 + a=fmtp:96 streamtype=5; mode=rtp-midi; config=""; + profile-level-id=12; cm_unused=ABCFGHJKMNPQTVWXYZ; cm_used=1NPTW; + cm_used=1C0.1.7.10.11.64.121.123; cm_used=1M0.1.2; + cm_used=X0-16; ch_never=ABCDEFGHJKMNPQTVWXYZ; + ch_default=1NPTW; ch_default=1C0.1.7.10.11.64.121.123; + ch_default=1M0.1.2; ch_default=X0-16; + rtp_ptime=0; rtp_maxptime=0; guardtime=44100; + musicport=1; render=synthetic; rinit=audio/asc; + inline="egoAAAAaTVRoZAAAAAYAAAABAGBNVHJrAAAABgD/LwAA" + + (The a=fmtp lines have been wrapped to fit the page to accommodate + memo formatting restrictions; they comprise single lines in SDP.) + + The owner line (o=) identifies the session owner as "first". + + The session description defines two MIDI streams: a recvonly stream + on which "first" receives a performance and a sendonly stream that + "first" uses to send a performance. The recvonly port number encodes + the ports on which "first" wishes to receive RTP (16112) and RTCP + + + +Lazzaro & Wawrzynek Standards Track [Page 151] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + (16113) media at IP4 address 192.0.2.94. The sendonly port number + encodes the port on which "first" wishes to receive RTCP for the + stream (16115).
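The port bookkeeping described above can be sketched in a few lines. This is an illustrative helper only: the function name media_ports and the abridged OFFER string are assumptions introduced here, and the sketch assumes the default convention, visible in the offer above, that RTCP uses the port one higher than the RTP port on the media line.

```python
# Abridged copy of the offer above: only the lines this sketch inspects.
OFFER = """\
m=audio 16112 RTP/AVP 96
a=recvonly
m=audio 16114 RTP/AVP 96
a=sendonly
"""

DIRECTIONS = ("a=sendonly", "a=recvonly", "a=sendrecv", "a=inactive")


def media_ports(sdp_text):
    """Return one dict per m= line with its RTP port, RTCP port, and direction."""
    streams = []
    for raw in sdp_text.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # m=<media> <port> <proto> <fmt>: the port is the second field.
            port = int(line.split()[1])
            # Assumption: RTCP uses the next higher port (no explicit override).
            streams.append({"rtp": port, "rtcp": port + 1,
                            "direction": "sendrecv"})
        elif line in DIRECTIONS and streams:
            # Media-level direction attribute for the most recent m= line.
            streams[-1]["direction"] = line[2:]
    return streams


if __name__ == "__main__":
    for stream in media_ports(OFFER):
        print(stream)
```

For the sendonly stream, only the RTCP port (16115) is meaningful to "first" as a receive port, which matches the description in the text.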
+ + The musicport parameters code that the two streams share an identity + relationship and thus form a virtual sendrecv stream. + + Both streams are mpeg4-generic RTP MIDI streams that specify a + General MIDI renderer. The stream subsetting parameters code that + the recvonly stream uses MIDI channel 1 exclusively for voice + commands and that the sendonly stream uses MIDI channel 2 exclusively + for voice commands. This mapping permits the application software to + share a single renderer for local and remote performers. + + We now show the answer to the offer. + + v=0 + o=second 2520644554 2838152170 IN IP4 second.example.net + s=Example + t=0 0 + a=group:FID 1 2 + c=IN IP4 192.0.2.105 + m=audio 5004 RTP/AVP 96 + a=sendonly + a=mid:1 + a=rtpmap:96 mpeg4-generic/44100 + a=fmtp:96 streamtype=5; mode=rtp-midi; config=""; + profile-level-id=12; cm_unused=ABCFGHJKMNPQTVWXYZ; cm_used=2NPTW; + cm_used=2C0.1.7.10.11.64.121.123; cm_used=2M0.1.2; + cm_used=X0-16; ch_never=ABCDEFGHJKMNPQTVWXYZ; + ch_default=2NPTW; ch_default=2C0.1.7.10.11.64.121.123; + ch_default=2M0.1.2; ch_default=X0-16; + rtp_ptime=0; rtp_maxptime=882; guardtime=44100; + musicport=1; render=synthetic; rinit=audio/asc; + inline="egoAAAAaTVRoZAAAAAYAAAABAGBNVHJrAAAABgD/LwAA" + m=audio 5006 RTP/AVP 96 + a=recvonly + a=mid:2 + a=rtpmap:96 mpeg4-generic/44100 + a=fmtp:96 streamtype=5; mode=rtp-midi; config=""; + profile-level-id=12; cm_unused=ABCFGHJKMNPQTVWXYZ; cm_used=1NPTW; + cm_used=1C0.1.7.10.11.64.121.123; cm_used=1M0.1.2; + cm_used=X0-16; ch_never=ABCDEFGHJKMNPQTVWXYZ; + ch_default=1NPTW; ch_default=1C0.1.7.10.11.64.121.123; + ch_default=1M0.1.2; ch_default=X0-16; + rtp_ptime=0; rtp_maxptime=0; guardtime=88200; + musicport=1; render=synthetic; rinit=audio/asc; + inline="egoAAAAaTVRoZAAAAAYAAAABAGBNVHJrAAAABgD/LwAA" + + + +Lazzaro & Wawrzynek Standards Track [Page 152] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + (The a=fmtp lines have been wrapped to fit the page to accommodate + memo
formatting restrictions; they comprise single lines in SDP.) + + The owner line (o=) identifies the session owner as "second". + + The port numbers for both media streams are non-zero; thus, "second" + has accepted the session description. The stream marked "sendonly" + in the offer is marked "recvonly" in the answer and vice versa, + coding the different view of the session held by "second". The IP4 + address (192.0.2.105), RTP ports (5004 and 5006), and RTCP ports + (5005 and 5007) have been changed by "second" to match its transport + wishes. + + In addition, "second" has made several parameter changes: + rtp_maxptime for the sendonly stream has been changed to code 20 ms + (882 in clock units), and the guardtime for the recvonly stream has + been doubled. As these parameter modifications request capabilities + that are REQUIRED to be implemented by interoperable parties, + "second" can make these changes with confidence that "first" can + abide by them. + +Appendix D. Parameter Syntax Definitions + + In this appendix, we define the syntax for the RTP MIDI media type + parameters in Augmented Backus-Naur Form (ABNF, [RFC5234]). When + using these parameters with SDP, all parameters MUST appear on a + single fmtp attribute line of an RTP MIDI media description. For + mpeg4-generic RTP MIDI streams, this line MUST also include any + mpeg4-generic parameters (usage described in Section 6.2). An fmtp + attribute line may be defined (after [RFC3640]) as: + + ; + ; SDP fmtp line definition + ; + + fmtp = "a=fmtp:" token SP param-assign 0*(";" SP param-assign) CRLF + + where <token> codes the RTP payload type. Note that white space MUST + NOT appear between the "a=fmtp:" and the RTP payload type. + + We now define the syntax of the parameters defined in Appendix C. + The definition takes the form of the incremental assembly of the + <param-assign> token. See [RFC3640] for the syntax of the + mpeg4-generic parameters discussed in Section 6.2.
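As an informal companion to the fmtp line definition above, the sketch below splits an unwrapped fmtp line into its payload type and an ordered list of parameter assignments. It is a simplified illustration, not a validating ABNF parser: the function names split_outside_quotes and parse_fmtp are hypothetical, and the sketch assumes the line has already been unfolded into a single SDP line. Two details from this memo shape the design: a parameter name such as cm_used may appear several times, so assignments are kept as a list rather than a dictionary, and quoted values such as cid may contain ";", so the split ignores separators inside double quotes.

```python
def split_outside_quotes(text, sep=";"):
    """Split text on sep, ignoring separators inside double-quoted values."""
    parts, current, in_quotes = [], [], False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
        if ch == sep and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return [p.strip() for p in parts if p.strip()]


def parse_fmtp(line):
    """Return (payload_type, [(name, value), ...]) for one fmtp line."""
    prefix = "a=fmtp:"
    if not line.startswith(prefix):
        raise ValueError("not an fmtp attribute line")
    payload_type, _, rest = line[len(prefix):].strip("\r\n").partition(" ")
    if not payload_type:
        # White space is not allowed between "a=fmtp:" and the payload type.
        raise ValueError("missing payload type")
    params = []
    for item in split_outside_quotes(rest):
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError("parameter without '=': " + item)
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]  # drop the enclosing DQUOTEs
        params.append((name, value))
    return payload_type, params


if __name__ == "__main__":
    example = ('a=fmtp:96 streamtype=5; mode=rtp-midi; config=""; '
               'cm_used=2NPTW; cm_used=2M0.1.2; rtp_maxptime=882')
    print(parse_fmtp(example))
```

A caller interested in timing could convert a value such as rtp_maxptime to seconds by dividing it by the clock rate from the rtpmap line (for example, 882 / 44100 = 0.02 s).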
+ + ; + ; + ; top-level definition for all parameters + ; + + + +Lazzaro & Wawrzynek Standards Track [Page 153] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + ; + + ; + ; Parameters defined in Appendix C.1 + + param-assign = ("cm_unused=" (([channel-list] command-type + [f-list]) / sysex-data)) + + param-assign =/ ("cm_used=" (([channel-list] command-type + [f-list]) / sysex-data)) + + ; + ; Parameters defined in Appendix C.2 + + param-assign =/ ("j_sec=" ("none" / "recj" / ietf-extension)) + + param-assign =/ ("j_update=" ("anchor" / "closed-loop" / + "open-loop" / ietf-extension)) + + param-assign =/ ("ch_default=" (([channel-list] chapter-list + [f-list]) / sysex-data)) + + param-assign =/ ("ch_never=" (([channel-list] chapter-list + [f-list]) / sysex-data)) + + param-assign =/ ("ch_anchor=" (([channel-list] chapter-list + [f-list]) / sysex-data)) + + ; + ; Parameters defined in Appendix C.3 + + param-assign =/ ("tsmode=" ("comex" / "async" / "buffer")) + + param-assign =/ ("linerate=" nonzero-four-octet) + + param-assign =/ ("octpos=" ("first" / "last")) + + param-assign =/ ("mperiod=" nonzero-four-octet) + + ; + ; Parameter defined in Appendix C.4 + + param-assign =/ ("guardtime=" nonzero-four-octet) + + param-assign =/ ("rtp_ptime=" four-octet) + + param-assign =/ ("rtp_maxptime=" four-octet) + + + + +Lazzaro & Wawrzynek Standards Track [Page 154] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + ; + ; Parameters defined in Appendix C.5 + + param-assign =/ ("musicport=" four-octet) + + ; + ; Parameters defined in Appendix C.6 + + param-assign =/ ("chanmask=" 1*( 16(BIT) )) + + param-assign =/ ("cid=" DQUOTE cid-block DQUOTE) + + param-assign =/ ("inline=" DQUOTE base-64-block DQUOTE) + + param-assign =/ ("multimode=" ("all" / "one")) + + param-assign =/ ("render=" ("synthetic" / "api" / "null" / + "unknown" / extension)) + + param-assign =/ ("rinit=" mime-type "/" mime-subtype) + + param-assign =/ ("smf_cid=" DQUOTE cid-block DQUOTE) + + 
param-assign =/ ("smf_info=" ("ignore" / "identity" / + "sdp_start" / extension)) + + param-assign =/ ("smf_inline=" DQUOTE base-64-block DQUOTE) + + param-assign =/ ("smf_url=" DQUOTE uri-element DQUOTE) + + param-assign =/ ("subrender=" ("default" / extension)) + + param-assign =/ ("url=" DQUOTE uri-element DQUOTE) + + ; + ; list definitions for the cm_ command-type + ; + + command-type = [A] [B] [C] [F] [G] [H] [J] [K] [M] + [N] [P] [Q] [T] [V] [W] [X] [Y] [Z] + + ; + ; list definitions for the ch_ chapter-list + ; + + chapter-list = [A] [B] [C] [D] [E] [F] [G] [H] [J] [K] + [M] [N] [P] [Q] [T] [V] [W] [X] [Y] [Z] + + + + +Lazzaro & Wawrzynek Standards Track [Page 155] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + ; + ; list definitions for the channel-list (used in ch_* / cm_* params) + ; + + channel-list = midi-chan-element *("." midi-chan-element) + + midi-chan-element = midi-chan / midi-chan-range + + midi-chan-range = midi-chan "-" midi-chan + ; + ; Decimal value of left midi-chan + ; MUST be strictly less than + ; decimal value of right midi-chan. + + midi-chan = DIGIT / ("1" %x30-35) ; "0" .. "15" + + ; + ; list definitions for the ch_ field list (f-list) + ; + + f-list = midi-field-element *("." midi-field-element) + + midi-field-element = midi-field / midi-field-range + + midi-field-range = midi-field "-" midi-field + ; + ; Decimal value of left midi-field + ; MUST be strictly less than + ; decimal value of right midi-field. + + midi-field = four-octet + ; + ; Large range accommodates Chapter M + ; RPN (0-16383), NRPN (16384-32767) + ; parameters, and Chapter X octet sizes. + + ; + ; definitions for ch_ sysex-data + ; + + sysex-data = "__" h-list *("_" h-list) "__" + + h-list = hex-field-element *("." 
hex-field-element) + + hex-field-element = hex-octet / hex-field-range + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 156] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + hex-field-range = hex-octet "-" hex-octet + ; + ; Hexadecimal value of left hex-octet + ; MUST be strictly less than hexadecimal + ; value of right hex-octet. + + hex-octet = %x30-37 U-HEXDIG + ; + ; Rewritten special case of hex-octet in + ; [RFC2045] (page 23). + ; Note that a-f are not permitted, only A-F. + ; hex-octet values MUST NOT exceed 0x7F. + + ; + ; definitions for rinit parameter + ; + + mime-type = "audio" / "application" + + mime-subtype = subtype-name + ; + ; See Appendix C.6.2 for registration + ; requirements for rinit type/subtypes. + ; + ; subtype-name is defined in [RFC4288], + ; Section 4.2. + + ; + ; Definitions for base64 encoding + ; copied from [RFC4566] + ; changes from [RFC4566] to improve automatic syntax checking. + ; + + base-64-block = *base64-unit [base64-pad] + + base64-unit = 4(base64-char) + + base64-pad = (2(base64-char) "==") / (3(base64-char) "=") + + base64-char = %x41-5A / %x61-7A / %x30-39 / "+" / "/" + ; A-Z, a-z, 0-9, "+" and "/" + + ; + ; generic rules + ; + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 157] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + ietf-extension = token + ; + ; may only be defined in Standards-Track RFCs + + extension = token + ; + ; may be defined + ; by filing a registration with IANA + + nonzero-four-octet = (NZ-DIGIT 0*8(DIGIT)) + / (%x31-33 9(DIGIT)) + / ("4" %x30-31 8(DIGIT)) + / ("42" %x30-38 7(DIGIT)) + / ("429" %x30-33 6(DIGIT)) + / ("4294" %x30-38 5(DIGIT)) + / ("42949" %x30-35 4(DIGIT)) + / ("429496" %x30-36 3(DIGIT)) + / ("4294967" %x30-31 2(DIGIT)) + / ("42949672" %x30-38 (DIGIT)) + / ("429496729" %x30-35) + ; + ; unsigned encoding of non-zero 32-bit value: + ; 1 .. 4294967295 + + four-octet = "0" / nonzero-four-octet + ; + ; unsigned encoding of 32-bit value: + ; 0 ..
4294967295 + + uri-element = URI-reference + ; as defined in [RFC3986] + + token = 1*token-char + ; copied from [RFC4566] + + token-char = %x21 / %x23-27 / %x2A-2B / %x2D-2E / + %x30-39 / %x41-5A / %x5E-7E + ; copied from [RFC4566] + + cid-block = 1*cid-char + + cid-char = token-char + cid-char =/ "@" + cid-char =/ "," + cid-char =/ ";" + cid-char =/ ":" + cid-char =/ "\" + cid-char =/ "/" + + + +Lazzaro & Wawrzynek Standards Track [Page 158] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + cid-char =/ "[" + cid-char =/ "]" + cid-char =/ "?" + cid-char =/ "=" + ; + ; - Add back in the tspecials [RFC2045], except + ; for DQUOTE and the non-email safe ( ) < >. + ; - Note that the definitions above ensure that + ; cid-block is always enclosed with DQUOTEs. + + A = %x41 ; Uppercase-only letters used above. + B = %x42 + C = %x43 + D = %x44 + E = %x45 + F = %x46 + G = %x47 + H = %x48 + J = %x4A + K = %x4B + M = %x4D + N = %x4E + P = %x50 + Q = %x51 + T = %x54 + V = %x56 + W = %x57 + X = %x58 + Y = %x59 + Z = %x5A + + NZ-DIGIT = %x31-39 ; non-zero decimal digit + + U-HEXDIG = DIGIT / A / B / C / D / E / F + ; variant of HEXDIG [RFC5234] : + ; hexadecimal digit using uppercase A-F only + + ; The rules below are from the Core Rules from [RFC5234]. + + BIT = "0" / "1" + + DQUOTE = %x22 ; " (Double Quote) + + DIGIT = %x30-39 ; 0-9 + + + ; external references + ; URI-reference: from [RFC3986] + + + +Lazzaro & Wawrzynek Standards Track [Page 159] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + ; subtype-name: from [RFC4288] + + ; + ; End of ABNF + + The mpeg4-generic RTP payload [RFC3640] defines a mode parameter that + signals the type of MPEG stream in use. We add a new mode value, + rtp-midi, using the ABNF rule below: + + ; + ; mpeg4-generic mode parameter extension + ; + + mode =/ "rtp-midi" + ; as described in Section 6.2 of this memo + +Appendix E. 
A MIDI Overview for Networking Specialists + + This appendix presents an overview of the MIDI standard for the + benefit of networking specialists new to musical applications. + Implementors should consult [MIDI] for a normative description of + MIDI. + + Musicians make music by performing a controlled sequence of physical + movements. For example, a pianist plays by coordinating a series of + key presses, key releases, and pedal actions. MIDI represents a + musical performance by encoding these physical gestures as a sequence + of MIDI commands. This high-level musical representation is compact + but fragile: one lost command may be catastrophic to the performance. + + MIDI commands have much in common with the machine instructions of a + microprocessor. MIDI commands are defined as binary elements. + Bitfields within a MIDI command have a regular structure and a + specialized purpose. For example, the upper nibble of the first + command octet (the opcode field) codes the command type. MIDI + commands may consist of an arbitrary number of complete octets, but + most MIDI commands are 1, 2, or 3 octets in length. 
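To make the opcode-field idea concrete, the sketch below decodes the first octet of a MIDI command. It is a minimal illustration rather than a MIDI parser: the names CHANNEL_VOICE and decode_first_octet are hypothetical, and the command names, bit patterns, and command lengths are taken from Figures E.1 and E.2 of this appendix. The channel is returned as the raw value of the cccc bitfield.

```python
# Upper nibble (opcode field) -> (command name, total length in octets),
# following the bitfield patterns of Figure E.1.
CHANNEL_VOICE = {
    0x8: ("NoteOff", 3),
    0x9: ("NoteOn", 3),
    0xA: ("PTouch", 3),
    0xB: ("CControl", 3),
    0xC: ("PChange", 2),
    0xD: ("CTouch", 2),
    0xE: ("PWheel", 3),
}


def decode_first_octet(octet):
    """Classify the first octet of a MIDI command.

    Returns (name, channel, length). For System commands, channel and
    length are None: lengths vary by command (see Figure E.2).
    """
    if not 0x80 <= octet <= 0xFF:
        raise ValueError("first command octet must have its top bit set")
    opcode = octet >> 4          # upper nibble codes the command type
    if opcode == 0xF:
        kind = "System Common" if octet <= 0xF7 else "System Real-Time"
        return (kind, None, None)
    name, length = CHANNEL_VOICE[opcode]
    return (name, octet & 0x0F, length)  # lower nibble is the cccc field


if __name__ == "__main__":
    for first in (0x91, 0xC0, 0xF0, 0xF8):
        print(hex(first), decode_first_octet(first))
```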
+
+   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+   | Channel Voice Messages         |      Bitfield Pattern      |
+   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+   | NoteOff (end a note)           | 1000cccc 0nnnnnnn 0vvvvvvv |
+   |-------------------------------------------------------------|
+   | NoteOn (start a note)          | 1001cccc 0nnnnnnn 0vvvvvvv |
+   |-------------------------------------------------------------|
+   | PTouch (Polyphonic Aftertouch) | 1010cccc 0nnnnnnn 0aaaaaaa |
+   |-------------------------------------------------------------|
+   | CControl (Controller Change)   | 1011cccc 0xxxxxxx 0yyyyyyy |
+   |-------------------------------------------------------------|
+   | PChange (Program Change)       | 1100cccc 0ppppppp          |
+   |-------------------------------------------------------------|
+   | CTouch (Channel Aftertouch)    | 1101cccc 0aaaaaaa          |
+   |-------------------------------------------------------------|
+   | PWheel (Pitch Wheel)           | 1110cccc 0xxxxxxx 0yyyyyyy |
+    -------------------------------------------------------------
+
+                Figure E.1 -- MIDI Channel Messages
+
+
+   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+   | System Common Messages         |      Bitfield Pattern      |
+   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+   | System Exclusive               | 11110000, followed by a    |
+   |                                | list of 0xxxxxxx octets,   |
+   |                                | followed by 11110111       |
+   |-------------------------------------------------------------|
+   | MIDI Time Code Quarter Frame   | 11110001 0xxxxxxx          |
+   |-------------------------------------------------------------|
+   | Song Position Pointer          | 11110010 0xxxxxxx 0yyyyyyy |
+   |-------------------------------------------------------------|
+   | Song Select                    | 11110011 0xxxxxxx          |
+   |-------------------------------------------------------------|
+   | Undefined                      | 11110100                   |
+   |-------------------------------------------------------------|
+   | Undefined                      | 11110101                   |
+   |-------------------------------------------------------------|
+   | Tune Request                   | 11110110                   |
+   |-------------------------------------------------------------|
+   | System Exclusive End Marker    | 11110111                   |
+    -------------------------------------------------------------
+
+   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+   | System Real-Time Messages      |      Bitfield Pattern      |
+   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+   | Clock                          | 11111000                   |
+   |-------------------------------------------------------------|
+   | Undefined                      | 11111001                   |
+   |-------------------------------------------------------------|
+   | Start                          | 11111010                   |
+   |-------------------------------------------------------------|
+   | Continue                       | 11111011                   |
+   |-------------------------------------------------------------|
+   | Stop                           | 11111100                   |
+   |-------------------------------------------------------------|
+   | Undefined                      | 11111101                   |
+   |-------------------------------------------------------------|
+   | Active Sense                   | 11111110                   |
+   |-------------------------------------------------------------|
+   | System Reset                   | 11111111                   |
+    -------------------------------------------------------------
+
+                Figure E.2 -- MIDI System Messages
+
+   Figures E.1 and E.2 show the MIDI command family. There are three
+   major classes of commands: voice commands (opcode field values in the
+   range 0x8 through 0xE), System Common commands (opcode field 0xF,
+   commands 0xF0 through 0xF7), and System Real-Time commands (opcode
+   field 0xF, commands 0xF8 through 0xFF). Voice commands code the
+   musical gestures for each timbre in a composition. System commands
+   perform functions that usually affect all voice channels, such as
+   System Reset (0xFF).
+
+E.1. Command Types
+
+   A voice command executes on one of 16 MIDI channels, as coded by its
+   4-bit channel field (field cccc in Figure E.1). In most
+   applications, notes for different timbres are assigned to different
+   channels. To support applications that require more than 16
+   channels, MIDI systems use several MIDI command streams in parallel
+   to yield 32, 48, or 64 MIDI channels.
+
+   As an example of a voice command, consider a NoteOn command (opcode
+   0x9), with binary encoding 1001cccc 0nnnnnnn 0vvvvvvv. This command
+   signals the start of a musical note on MIDI channel cccc. The note
+   has a pitch coded by the note number nnnnnnn, and an onset amplitude
+   coded by note velocity vvvvvvv.
+
+   Other voice commands signal the end of notes (NoteOff, opcode 0x8),
+   map a specific timbre to a MIDI channel (PChange, opcode 0xC), or set
+   the value of parameters that modulate the timbral quality (all other
+   voice commands). The exact meaning of most voice channel commands
+   depends on the rendering algorithms the MIDI receiver uses to
+   generate sound. In most applications, a MIDI sender has a model (in
+   some sense) of the rendering method used by the receiver.
+
+   System commands perform a variety of global tasks in the stream,
+   including "sequencer" playback control of pre-recorded MIDI commands
+   (the Song Position Pointer, Song Select, Clock, Start, Continue, and
+   Stop messages), SMPTE time code (the MIDI Time Code Quarter Frame
+   command), and the communication of device-specific data (the System
+   Exclusive messages).
+
+E.2. Running Status
+
+   All MIDI command bitfields share a special structure: the leading bit
+   of the first octet is set to 1, and the leading bit of all subsequent
+   octets is set to 0. This structure supports a data compression
+   system, called running status [MIDI], that improves the coding
+   efficiency of MIDI.
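
The leading-bit structure described above is enough to find command boundaries in an octet stream. The sketch below is a simplified illustration, not a complete MIDI parser: the function names are hypothetical, and it assumes a stream in which every command carries its own first octet (no running status) and in which no command is interleaved inside another.

```python
def is_first_octet(octet):
    """True if the leading bit is 1, i.e. the octet starts a command."""
    return (octet & 0x80) != 0


def split_commands(stream):
    """Group each first octet with the data octets that follow it.

    Assumes every command carries its own first octet and that
    commands are not interleaved; both are simplifying assumptions.
    """
    commands = []
    current = bytearray()
    for octet in stream:
        if is_first_octet(octet):
            if current:
                commands.append(bytes(current))
            current = bytearray([octet])
        elif current:
            current.append(octet)
        else:
            raise ValueError("data octet seen before any first octet")
    if current:
        commands.append(bytes(current))
    return commands


# NoteOn and NoteOff on channel 0 for note number 0x3C, then a Clock.
for command in split_commands(bytes([0x90, 0x3C, 0x40, 0x80, 0x3C, 0x00, 0xF8])):
    print(command.hex())
```

Because data octets never have the leading bit set, a receiver that joins a stream mid-command can resynchronize at the next octet whose leading bit is 1.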
+
+   In running status coding, the first octet of a MIDI voice command may
+   be dropped if it is identical to the first octet of the previous MIDI
+   voice command. This rule, in combination with the convention that a
+   NoteOn command whose third octet (the velocity) is zero is treated as
+   a NoteOff command, supports the coding of note sequences using two
+   octets per command.
+
+   Running status coding is used only for voice commands. The presence
+   of a System Common message in the stream cancels running status mode
+   for the next voice command. However, System Real-Time messages do
+   not cancel running status mode.
+
+E.3. Command Timing
+
+   The bitfield formats in Figures E.1 and E.2 do not encode the
+   execution time for a command. Timing information is not a part of
+   the MIDI command syntax itself; different applications of the MIDI
+   command language use different methods to encode timing.
+
+   For example, the MIDI command set acts as the transport layer for
+   MIDI 1.0 DIN cables [MIDI]. MIDI cables are short asynchronous
+   serial lines that facilitate the remote operation of musical
+   instruments and audio equipment. Timestamps are not sent over a MIDI
+   1.0 DIN cable. Instead, the standard uses an implicit "time of
+   arrival" code. Receivers execute MIDI commands at the moment of
+   arrival.
+
+   In contrast, Standard MIDI Files (SMFs, [MIDI]), a file format for
+   representing complete musical performances, add an explicit timestamp
+   to each MIDI command, using a delta encoding scheme that is optimized
+   for statistics of musical performance. SMF timestamps usually code
+   timing using the metric notation of a musical score. SMF meta-events
+   are used to add a tempo map to the file so that score beats may be
+   accurately converted into units of seconds during rendering.
+
+E.4. AudioSpecificConfig Templates for MMA Renderers
+
+   In Section 6.2 and Appendix C.6.5, we describe how session
+   descriptions include an AudioSpecificConfig data block to specify a
+   MIDI rendering algorithm for mpeg4-generic RTP MIDI streams.
+
+   The bitfield format of AudioSpecificConfig is defined in [MPEGAUDIO].
+   StructuredAudioSpecificConfig, a key data structure coded in
+   AudioSpecificConfig, is defined in [MPEGSA].
+
+   For implementors wishing to specify Structured Audio renderers, a
+   full understanding of [MPEGSA] and [MPEGAUDIO] is essential.
+   However, many implementors will limit their rendering options to the
+   two MIDI Manufacturers Association (MMA) renderers that may be
+   specified in AudioSpecificConfig: General MIDI (GM, [MIDI]) and
+   Downloadable Sounds 2 (DLS 2, [DLS2]).
+
+   To aid these implementors, we reproduce the AudioSpecificConfig
+   bitfield formats for a GM renderer and a DLS 2 renderer below. We
+   have checked these bitfields carefully and believe they are correct.
+   However, we stress that the material below is informative and that
+   [MPEGAUDIO] and [MPEGSA] are the normative definitions for
+   AudioSpecificConfig.
+
+   As described in Section 6.2, a minimal mpeg4-generic session
+   description encodes the AudioSpecificConfig binary bitfield as a
+   hexadecimal string (whose format is defined in [RFC3640]) that is
+   assigned to the "config" parameter. As described in Appendix C.6.3,
+   a session description that uses the render parameter encodes the
+   AudioSpecificConfig binary bitfield as a Base64-encoded string
+   assigned to the inline parameter or in the body of an HTTP URL
+   assigned to the url parameter.
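
The two textual encodings described above carry the same bitfield, so one can be converted to the other. The sketch below uses the General MIDI constant that this appendix derives in Figure E.7. The function name is hypothetical, and padding an odd-length hexadecimal string with a trailing zero nibble is an assumption made here to reach a whole number of octets; [RFC3640] remains the authority on the "config" string format.

```python
import base64

# The "config" value derived at the end of this appendix (Figure E.7).
GM_CONFIG_HEX = (
    "7A0A0000001A4D546864000000060000000100604D54726B0000000600FF2F000"
)


def config_hex_to_base64(config_hex):
    """Re-express a hexadecimal "config" string as a Base64 string."""
    if len(config_hex) % 2:
        # Assumption: pad the final nibble with zero bits so the
        # bitfield fills a whole number of octets.
        config_hex += "0"
    return base64.b64encode(bytes.fromhex(config_hex)).decode("ascii")


print(config_hex_to_base64(GM_CONFIG_HEX))
```

Under that padding assumption, the result matches the Base64 string this appendix assigns to the inline parameter.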
+
+   Below, we show a simplified binary AudioSpecificConfig bitfield
+   format, suitable for sending and receiving GM and DLS 2 data:
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   | AOTYPE  |FREQIDX|CHANNEL|SACNK|  FILE_BLK 1 (required) ...    |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |1|SACNK|              FILE_BLK 2 (optional) ...                |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |  ...                  |1|SACNK| FILE_BLK N (optional) ...     |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |0|0| (first "0" bit terminates FILE_BLK list)
+   +-+-+
+
+             Figure E.3 -- Simplified AudioSpecificConfig
+
+   The 5-bit AOTYPE field specifies the Audio Object Type as an unsigned
+   integer. The legal values for use with mpeg4-generic RTP MIDI
+   streams are "15" (General MIDI), "14" (DLS 2), and "13" (Structured
+   Audio). Thus, receivers that do not support all three mpeg4-generic
+   renderers may parse the first 5 bits of an AudioSpecificConfig coded
+   in a session description and reject sessions that specify unsupported
+   renderers.
+
+   The 4-bit FREQIDX field specifies the sampling rate of the renderer.
+   We show the mapping of FREQIDX values to sampling rates in Figure
+   E.4. Senders MUST specify a sampling frequency that matches the RTP
+   clock rate, if possible; if not, senders MUST specify the escape
+   value. Receivers MUST consult the RTP clock parameter for the true
+   sampling rate if the escape value is specified.
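
A receiver-side check of the leading fields can be sketched as follows. This is an illustration of the simplified layout in Figure E.3 only, not of the full [MPEGAUDIO] syntax; the function names and the default set of supported renderers are hypothetical, and the sampling-rate table restates Figure E.4.

```python
# FREQIDX values and sampling rates, as listed in Figure E.4.
FREQIDX_HZ = {
    0x0: 96000, 0x1: 88200, 0x2: 64000, 0x3: 48000,
    0x4: 44100, 0x5: 32000, 0x6: 24000, 0x7: 22050,
    0x8: 16000, 0x9: 12000, 0xA: 11025, 0xB: 8000,
}
FREQIDX_ESCAPE = 0xF

RENDERERS = {15: "General MIDI", 14: "DLS 2", 13: "Structured Audio"}


def parse_leading_fields(config):
    """Return (AOTYPE, FREQIDX, CHANNEL) from the first two octets."""
    if len(config) < 2:
        raise ValueError("need at least two octets")
    word = int.from_bytes(config[:2], "big")
    aotype = word >> 11            # bits 0-4
    freqidx = (word >> 7) & 0xF    # bits 5-8
    channel = (word >> 3) & 0xF    # bits 9-12
    return aotype, freqidx, channel


def renderer_supported(config, supported=frozenset({15})):
    """Decide from AOTYPE alone whether to accept the session."""
    return parse_leading_fields(config)[0] in supported


aotype, freqidx, channel = parse_leading_fields(bytes.fromhex("7A0A"))
print(RENDERERS[aotype], FREQIDX_HZ.get(freqidx), channel)
```

When FREQIDX equals FREQIDX_ESCAPE, the table lookup is skipped and the sampling rate is taken from the RTP clock parameter, as the text above requires.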
+
+      FREQIDX    Sampling Frequency
+
+        0x0            96000
+        0x1            88200
+        0x2            64000
+        0x3            48000
+        0x4            44100
+        0x5            32000
+        0x6            24000
+        0x7            22050
+        0x8            16000
+        0x9            12000
+        0xa            11025
+        0xb             8000
+        0xc            reserved
+        0xd            reserved
+        0xe            reserved
+        0xf            escape value
+
+             Figure E.4 -- FreqIdx Encoding
+
+   The 4-bit CHANNEL field specifies the number of audio channels for
+   the renderer. The values 0x1 to 0x5 specify 1 to 5 audio channels;
+   the value 0x6 specifies 5+1 surround sound; and the value 0x7
+   specifies 7+1 surround sound. If the rtpmap line in the session
+   description specifies one of these formats, CHANNEL MUST be set to
+   the corresponding value. Otherwise, CHANNEL MUST be set to 0x0.
+
+   The CHANNEL field is followed by a list of one or more binary file
+   data blocks. The 3-bit SACNK field (the chunk_type field in class
+   StructuredAudioSpecificConfig, defined in [MPEGSA]) specifies the
+   type of each data block.
+
+   For General MIDI, only Standard MIDI Files may appear in the list
+   (SACNK field value 2). For DLS 2, only Standard MIDI Files and DLS 2
+   RIFF files (SACNK field value 4) may appear. For both of these file
+   types, the FILE_BLK field has the format shown in Figure E.5: a
+   32-bit unsigned integer value (FILE_LEN) coding the number of bytes
+   in the SMF or RIFF file, followed by FILE_LEN bytes coding the file
+   data.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |    FILE_LEN (32-bit, a byte count of the SMF or RIFF file)    |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |    FILE_DATA (file contents, a list of FILE_LEN bytes) ...    |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+               Figure E.5 -- The FILE_BLK Field Format
+
+   Note that several files may follow the CHANNEL field. The "1"
+   constant fields in Figure E.3 code the presence of another file; the
+   "0" constant field codes the end of the list. The final "0" bit in
+   Figure E.3 codes the absence of special coding tools (see [MPEGAUDIO]
+   for details). Senders not using these tools MUST append this "0"
+   bit; receivers that do not understand these coding tools MUST ignore
+   all data following a "1" in this position.
+
+   The StructuredAudioSpecificConfig bitfield structure requires the
+   presence of one FILE_BLK. For mpeg4-generic RTP MIDI use of DLS 2,
+   FILE_BLKs MUST code RIFF files or SMF files. For mpeg4-generic RTP
+   MIDI use of General MIDI, FILE_BLKs MUST code SMF files. By default,
+   this SMF will be ignored (Appendix C.6.4.1). In this default case, a
+   GM StructuredAudioSpecificConfig bitfield SHOULD code a FILE_BLK
+   whose FILE_LEN is 0 and whose FILE_DATA is empty.
+
+   To complete this appendix, we derive the
+   StructuredAudioSpecificConfig that we use in the General MIDI session
+   examples in this memo. Referring to Figure E.3, we note that for GM,
+   AOTYPE = 15. Our examples use a 44,100 Hz sample rate (FREQIDX = 4)
+   and are in mono (CHANNEL = 1). For GM, a single SMF is encoded
+   (SACNK = 2), using the SMF shown in Figure E.6 (a 26-byte file).
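
The field values chosen above can be packed into the first six octets of the bitfield as a quick arithmetic check. The sketch below is illustrative: pack_prefix is a hypothetical helper that covers only the AOTYPE, FREQIDX, CHANNEL, and SACNK fields of Figure E.3 and the FILE_LEN field of Figure E.5, and it relies on those first four fields filling exactly 16 bits.

```python
def pack_prefix(aotype, freqidx, channel, sacnk, file_len):
    """Pack AOTYPE, FREQIDX, CHANNEL, SACNK, and FILE_LEN into 6 octets."""
    for value, bits in ((aotype, 5), (freqidx, 4), (channel, 4),
                        (sacnk, 3), (file_len, 32)):
        if not 0 <= value < (1 << bits):
            raise ValueError("field value does not fit in its bitfield")
    # 5 + 4 + 4 + 3 = 16 bits, so FILE_LEN starts on an octet boundary.
    header = (aotype << 11) | (freqidx << 7) | (channel << 3) | sacnk
    return header.to_bytes(2, "big") + file_len.to_bytes(4, "big")


# GM renderer, 44,100 Hz, mono, one SMF of 26 bytes.
print(pack_prefix(15, 4, 1, 2, 26).hex().upper())
```

The six octets produced for these values are 7A 0A 00 00 00 1A, which is how the hexadecimal form of the GM constant begins.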
+ + -------------------------------------------- + | MIDI File = <Header Chunk> <Track Chunk> | + -------------------------------------------- + + <Header Chunk> = <chunk type> <length> <format> <ntrks> <divsn> + 4D 54 68 64 00 00 00 06 00 00 00 01 00 60 + + <Track Chunk> = <chunk type> <length> <delta-time> <end-event> + 4D 54 72 6B 00 00 00 04 00 FF 2F 00 + + Figure E.6 -- SMF File Encoded in the Example + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 167] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + + Placing these constants in binary format into the data structure + shown in Figure E.3 yields the constant shown in Figure E.7. + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |0 1 1 1 1|0 1 0 0|0 0 0 1|0 1 0|0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0| + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0|0 1 0 0|1 1 0 1|0 1 0 1|0 1 0 0| + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |0 1 1 0|1 0 0 0|0 1 1 0|0 1 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0| + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |0 0 0 0|0 0 0 0|0 0 0 0|0 1 1 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0| + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 1|0 0 0 0|0 0 0 0|0 1 1 0|0 0 0 0| + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |0 1 0 0|1 1 0 1|0 1 0 1|0 1 0 0|0 1 1 1|0 0 1 0|0 1 1 0|1 0 1 1| + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 1 1 0| + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |0 0 0 0|0 0 0 0|1 1 1 1|1 1 1 1|0 0 1 0|1 1 1 1|0 0 0 0|0 0 0 0| + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |0|0| + +-+-+ + + Figure E.7 -- AudioSpecificConfig Used in GM Examples + + Expressing this bitfield as an 
ASCII hexadecimal string yields: + + 7A0A0000001A4D546864000000060000000100604D54726B0000000600FF2F000 + + This string is assigned to the "config" parameter in the minimal + mpeg4-generic General MIDI examples in this memo (such as the example + in Section 6.2). Expressing this string in Base64 [RFC2045] yields: + + egoAAAAaTVRoZAAAAAYAAAABAGBNVHJrAAAABgD/LwAA + + This string is assigned to the inline parameter in the General MIDI + example shown in Appendix C.6.5. + + + + + + + + + + + + +Lazzaro & Wawrzynek Standards Track [Page 168] + +RFC 6295 RTP Payload Format for MIDI June 2011 + + +References + +Normative References + + [MIDI] MIDI Manufacturers Association. "The Complete MIDI 1.0 + Detailed Specification", 1996. + + [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. + Jacobson, "RTP: A Transport Protocol for Real-Time + Applications", STD 64, RFC 3550, July 2003. + + [RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and + Video Conferences with Minimal Control", STD 65, RFC + 3551, July 2003. + + [RFC3640] van der Meer, J., Mackie, D., Swaminathan, V., Singer, + D., and P. Gentric, "RTP Payload Format for Transport of + MPEG-4 Elementary Streams", RFC 3640, November 2003. + + [MPEGSA] International Standards Organization. "ISO/IEC 14496 + MPEG-4", Part 3 (Audio), Subpart 5 (Structured Audio), + 2001. + + [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session + Description Protocol", RFC 4566, July 2006. + + [MPEGAUDIO] International Standards Organization. "ISO 14496 MPEG- + 4", Part 3 (Audio), 2001. + + [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail + Extensions (MIME) Part One: Format of Internet Message + Bodies", RFC 2045, November 1996. + + [DLS2] MIDI Manufacturers Association. "The MIDI Downloadable + Sounds Specification", v98.2, 1998. + + [RFC5234] Crocker, D., Ed., and P. Overell, "Augmented BNF for + Syntax Specifications: ABNF", STD 68, RFC 5234, January + 2008. 
+
+   [RFC2119]   Bradner, S., "Key words for use in RFCs to Indicate
+               Requirement Levels", BCP 14, RFC 2119, March 1997.
+
+   [RFC3711]   Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
+               Norrman, "The Secure Real-time Transport Protocol
+               (SRTP)", RFC 3711, March 2004.
+
+   [RFC3264]   Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
+               with Session Description Protocol (SDP)", RFC 3264, June
+               2002.
+
+   [RFC3986]   Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
+               Resource Identifier (URI): Generic Syntax", STD 66, RFC
+               3986, January 2005.
+
+   [RFC2616]   Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
+               Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
+               Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.
+
+   [RFC5888]   Camarillo, G. and H. Schulzrinne, "The Session
+               Description Protocol (SDP) Grouping Framework", RFC 5888,
+               June 2010.
+
+   [RFC2818]   Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000.
+
+   [RP015]     MIDI Manufacturers Association. "Recommended Practice
+               015 (RP-015): Response to Reset All Controllers", 11/98.
+
+   [RFC4288]   Freed, N. and J. Klensin, "Media Type Specifications and
+               Registration Procedures", BCP 13, RFC 4288, December
+               2005.
+
+   [RFC4855]   Casner, S., "Media Type Registration of RTP Payload
+               Formats", RFC 4855, February 2007.
+
+Informative References
+
+   [NMP]       Lazzaro, J. and J. Wawrzynek. "A Case for Network
+               Musical Performance", 11th International Workshop on
+               Network and Operating Systems Support for Digital Audio
+               and Video (NOSSDAV 2001) June 25-26, 2001, Port
+               Jefferson, New York.
+
+   [GRAME]     Fober, D., Orlarey, Y., and S. Letz. "Real Time Musical
+               Events Streaming over Internet", Proceedings of the
+               International Conference on WEB Delivering of Music 2001,
+               pages 147-154.
+
+   [RFC3261]   Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
+               A., Peterson, J., Sparks, R., Handley, M., and E.
+               Schooler, "SIP: Session Initiation Protocol", RFC 3261,
+               June 2002.
+
+   [RFC2326]   Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time
+               Streaming Protocol (RTSP)", RFC 2326, April 1998.
+
+   [ALF]       Clark, D.D. and D.L. Tennenhouse. "Architectural
+               Considerations for a New Generation of Protocols",
+               SIGCOMM Symposium on Communications Architectures and
+               Protocols, (Philadelphia, Pennsylvania), pp. 200-208,
+               ACM, Sept. 1990.
+
+   [RFC4695]   Lazzaro, J. and J. Wawrzynek, "RTP Payload Format for
+               MIDI", RFC 4695, November 2006.
+
+   [RFC4696]   Lazzaro, J. and J. Wawrzynek, "An Implementation Guide
+               for RTP MIDI", RFC 4696, November 2006.
+
+   [RFC2205]   Braden, R., Ed., Zhang, L., Berson, S., Herzog, S., and
+               S. Jamin, "Resource ReSerVation Protocol (RSVP) --
+               Version 1 Functional Specification", RFC 2205, September
+               1997.
+
+   [RFC4571]   Lazzaro, J., "Framing Real-time Transport Protocol (RTP)
+               and RTP Control Protocol (RTCP) Packets over
+               Connection-Oriented Transport", RFC 4571, July 2006.
+
+   [SPMIDI]    MIDI Manufacturers Association. "Scalable Polyphony
+               MIDI, Specification and Device Profiles", Document
+               Version 1.0a, 2002.
+
+   [LCP]       Apple Computer. "Logic 7 Dedicated Control Surface
+               Support", Appendix B. Product manual available from
+               www.apple.com.
+
+Authors' Addresses
+
+   John Lazzaro (corresponding author)
+   UC Berkeley
+   CS Division
+   315 Soda Hall
+   Berkeley, CA 94720-1776
+   EMail: lazzaro@cs.berkeley.edu
+
+   John Wawrzynek
+   UC Berkeley
+   CS Division
+   631 Soda Hall
+   Berkeley, CA 94720-1776
+   EMail: johnw@cs.berkeley.edu