Internet Engineering Task Force (IETF) J. Skoglund
Request for Comments: 8486 Google LLC
Updates: 7845 M. Graczyk
Category: Standards Track October 2018
ISSN: 2070-1721
Ambisonics in an Ogg Opus Container
Abstract
This document defines an extension to the Opus audio codec to
encapsulate coded Ambisonics using the Ogg format. It also contains
updates to RFC 7845 to reflect necessary changes in the description
of channel mapping families.
Status of This Memo
This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by the
Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 7841.
Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
https://www.rfc-editor.org/info/rfc8486.
Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Skoglund & Graczyk Standards Track [Page 1]
RFC 8486 Opus Ambisonics October 2018
Table of Contents
1.  Introduction
2.  Terminology
3.  Ambisonics with Ogg Opus
  3.1.  Channel Mapping Family 2
  3.2.  Channel Mapping Family 3
  3.3.  Allowed Numbers of Channels
4.  Downmixing
5.  Updates to RFC 7845
  5.1.  Format of the Channel Mapping Table
  5.2.  Unknown Mapping Families
6.  Experimental Mapping Families
7.  Security Considerations
8.  IANA Considerations
9.  References
  9.1.  Normative References
  9.2.  Informative References
Acknowledgments
Authors' Addresses
1. Introduction
Ambisonics is a representation format for three-dimensional sound
fields that can be used for surround sound and immersive virtual-
reality playback. See [fellgett75] and [daniel04] for technical
details on the Ambisonics format. For the purposes of this
document, Ambisonics can be considered a multichannel audio stream.
A separate stereo stream can be used alongside the Ambisonics in a
head-tracked virtual reality experience to provide so-called non-
diegetic audio -- that is, audio that should remain unchanged by
rotation of the listener's head, such as narration or stereo music.
Ogg is a general-purpose container, supporting audio, video, and
other media. It can be used to encapsulate audio streams coded using
the Opus codec. See [RFC6716] and [RFC7845] for technical details on
the Opus codec and its encapsulation in the Ogg container,
respectively.
This document extends the Ogg Opus format by defining two new channel
mapping families for encoding Ambisonics. The Ogg Opus format is
extended indirectly by adding items with values 2 and 3 to the "Opus
Channel Mapping Families" IANA registry. When 2 or 3 is used as the
Channel Mapping Family Number in an Ogg stream, the semantic meaning
of the channels in the multichannel Opus stream is one of the
Ambisonics layouts defined in this document. This mapping can also
be used in other contexts that make use of the channel mappings
defined by the "Opus Channel Mapping Families" registry.
Furthermore, mapping families 240 through 254 (inclusive) are
reserved for experimental use.
2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
3. Ambisonics with Ogg Opus
Ambisonics can be encapsulated in the Ogg format by encoding with the
Opus codec and setting the channel mapping family value to 2 or 3 in
the Ogg identification (ID) header. A demuxer implementation
encountering channel mapping family 2 or 3 MUST interpret the Opus
stream as containing Ambisonics with the format described in Sections
3.1 or 3.2, respectively.
3.1. Channel Mapping Family 2
This channel mapping uses the same channel mapping table format used
by channel mapping family 1. The output channels are Ambisonic
components ordered in Ambisonic Channel Number (ACN) order (which is
defined in Figure 1) followed by two optional channels of non-
diegetic stereo indexed (left, right). The terms "order" and
"degree" are defined according to [ambix].
ACN = n * (n + 1) + m,
for order n and degree m.
Figure 1: Ambisonic Channel Number (ACN)
For the Ambisonic channels, the channel index k equals the ACN, that
is, k = ACN. Conversely, the order and degree of the Ambisonic
channel with index k can be computed as follows:
order n = floor(sqrt(k)),
degree m = k - n * (n + 1).
Figure 2: Ambisonic Degree and Order from ACN
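
The two conversions in Figures 1 and 2 can be sketched in a few lines
of Python. The function names below are illustrative only and are not
defined by this document; the range check on the degree (-n <= m <= n)
follows the usual Ambisonics convention.

```python
import math


def acn_from_order_degree(n, m):
    """Figure 1: ACN = n * (n + 1) + m for order n and degree m."""
    if n < 0 or abs(m) > n:
        raise ValueError("require n >= 0 and -n <= m <= n")
    return n * (n + 1) + m


def order_degree_from_acn(k):
    """Figure 2: recover (order n, degree m) from channel index k = ACN."""
    if k < 0:
        raise ValueError("ACN must be non-negative")
    n = math.isqrt(k)      # integer floor(sqrt(k)), no floating-point error
    m = k - n * (n + 1)
    return n, m
```

For example, the first-order components (n = 1, m = -1, 0, 1) occupy
channel indices 1, 2, and 3, and index 0 is the single zeroth-order
component.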
Note that channel mapping family 2 allows for so-called mixed-order
Ambisonic representation, in which only a subset of the full
Ambisonic order number of channels is encoded. By specifying the
full number in the channel count field, the inactive ACNs can then be
indicated in the channel mapping field using the index 255.
Ambisonic channels are normalized with Schmidt Semi-Normalization
(SN3D). The interpretation of the Ambisonics signal as well as
detailed definitions of ACN channel ordering and SN3D normalization
are described in [ambix], Section 2.1.
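
As an illustration of the mixed-order case, the following Python
sketch builds a channel mapping field in which inactive ACNs carry the
index 255. The helper name and the choice to assign decoded channel
indices to the active components in ascending ACN order are
assumptions made for this example; an encoder is free to order its
decoded channels differently.

```python
SILENT = 255  # index that marks an inactive ACN in the channel mapping field


def build_mixed_order_mapping(full_order, active_acns):
    """Return one mapping entry per ACN of the full order.

    Hypothetical helper: active ACNs are assigned decoded channel
    indices 0, 1, 2, ... in ascending ACN order (an assumption), and
    every inactive ACN is mapped to 255.
    """
    if not 0 <= full_order <= 14:
        raise ValueError("Ambisonic order must be in 0..14")
    channel_count = (full_order + 1) ** 2
    active = sorted(set(active_acns))
    if any(k < 0 or k >= channel_count for k in active):
        raise ValueError("active ACN outside the full-order range")
    mapping = [SILENT] * channel_count
    for decoded_index, acn in enumerate(active):
        mapping[acn] = decoded_index
    return mapping
```

With a full order of 2 (channel count 9) and only ACNs 0, 1, 2, 3, 4,
and 8 active, the sketch yields the mapping 0, 1, 2, 3, 4, 255, 255,
255, 5.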
3.2. Channel Mapping Family 3
In this mapping, C output channels (the channel count) are generated
at the decoder by multiplying K = N + M decoded channels with a
designated demixing matrix, D, having C rows and K columns (C and K
do not have to be equal). Here, N denotes the number of streams
encoded, and M is the number of these encoded streams that are
coupled to produce two channels. As for channel mapping family 2,
this mapping family also allows for the encoding and decoding of
full-order Ambisonics and mixed-order Ambisonics, as well as non-
diegetic stereo channels. Furthermore, it has the added flexibility
of mixing channels. Let X denote a column vector containing K
decoded channels X1, X2, ..., XK (from N streams), and let S denote a
column vector containing C output streams S1, S2, ..., SC. Then, S =
D X, as shown in Figure 3.
/     \   /                   \ /     \
| S1  |   | D11  D12  ... D1K | | X1  |
| S2  |   | D21  D22  ... D2K | | X2  |
| ... | = | ...  ...  ... ... | | ... |
| SC  |   | DC1  DC2  ... DCK | | XK  |
\     /   \                   / \     /
Figure 3: Demixing in Channel Mapping Family 3
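
The product S = D X is applied per sample. A minimal pure-Python
sketch follows, assuming (for this example only) that D is given as a
list of C rows of K coefficients and that X is a list of K decoded
channels, each a list of samples of equal length.

```python
def demix(D, X):
    """Compute S = D X sample by sample.

    D: C rows, each with K coefficients.
    X: K decoded channels, each a list of samples.
    Returns C output channels, each a list of samples.
    """
    K = len(X)
    if any(len(row) != K for row in D):
        raise ValueError("every row of D must have K columns")
    num_samples = len(X[0]) if X else 0
    if any(len(channel) != num_samples for channel in X):
        raise ValueError("all decoded channels must have the same length")
    return [
        [sum(row[k] * X[k][t] for k in range(K)) for t in range(num_samples)]
        for row in D
    ]
```

Because C and K need not be equal, the same routine covers the case in
which fewer channels are coded than are output, including all-zero
rows for silent output channels.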
The matrix MUST be provided in the channel mapping table part of the
identification header; see Section 5.1.1 of [RFC7845]. The matrix
replaces the need for a channel mapping field; for channel mapping
family 3, the mapping table has the following layout:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                                +-+-+-+-+-+-+-+-+
                                                | Stream Count  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Coupled Count | Demixing Matrix                               :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 4: Channel Mapping Table for Channel Mapping Family 3
The fields in the channel mapping table have the following meaning:
1. Stream Count "N" (8 bits, unsigned):
This is the total number of streams encoded in each Ogg packet.
2. Coupled Stream Count "M" (8 bits, unsigned):
This is the number of the N streams whose decoders are to be
configured to produce two channels (stereo).
3. Demixing Matrix (16*K*C bits, signed):
The coefficients of the demixing matrix stored in column-major
order as 16-bit, signed, two's complement fixed-point values with
15 fractional bits (Q15), little endian. If needed, the output
gain field can be used for a normalization scale. For mixed-
order Ambisonic representations, the silent ACN channels are
indicated by all zeros in the corresponding rows of the demixing
matrix. This also allows for mixed order with non-diegetic
stereo as the number of columns implies the presence of non-
diegetic channels.
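
The field layout above could be read as in the following Python
sketch. The function name is hypothetical, and the table is assumed to
be passed as the octets that follow the channel mapping family octet,
together with the output channel count C taken from the identification
header. The length is checked before any coefficient is read, in the
spirit of Section 7.

```python
import struct


def parse_family3_mapping_table(table, channel_count):
    """Parse stream count N, coupled count M, and the C x K demixing matrix.

    Returns (N, M, D) where D is a list of C rows of K floats.
    """
    if len(table) < 2:
        raise ValueError("mapping table too short")
    stream_count, coupled_count = table[0], table[1]
    if coupled_count > stream_count:
        raise ValueError("coupled count exceeds stream count")
    K = stream_count + coupled_count
    C = channel_count
    if len(table) < 2 + 2 * K * C:
        raise ValueError("demixing matrix is truncated")
    # 16-bit signed little-endian values, stored column by column.
    raw = struct.unpack_from("<%dh" % (K * C), table, 2)
    # Column-major: the entry in row c, column k is at index k * C + c.
    # Q15: 15 fractional bits, so divide by 2**15.
    D = [[raw[k * C + c] / 32768.0 for k in range(K)] for c in range(C)]
    return stream_count, coupled_count, D
```

Note that Q15 covers the range -1.0 to 32767/32768, so a coefficient
of exactly 1.0 cannot be stored; this is one reason the output gain
field may be needed as a normalization scale.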
Note that [RFC7845] specifies that the identification header cannot
exceed one "page", which is 65,025 octets. This limits the Ambisonic
order, which then MUST be lower than 12, if full order is utilized
and the number of coded streams is the same as the Ambisonic order
plus the two non-diegetic channels. The total output channel number,
C, MUST be set in the third field of the identification header.
3.3. Allowed Numbers of Channels
For both channel mapping families 2 and 3, the allowed numbers of
channels are (1 + n)^2 + 2j for n = 0, 1, ..., 14 and j = 0 or 1,
where n denotes the (highest) Ambisonic order and j denotes whether
or not there is a separate non-diegetic stereo stream. This
corresponds to periphonic Ambisonics from zeroth to fourteenth order
plus potentially two channels of non-diegetic stereo. Explicitly,
the allowed numbers of channels are 1, 3, 4, 6, 9, 11, 16, 18, 25, 27,
36, 38, 49, 51, 64, 66, 81, 83, 100, 102, 121, 123, 144, 146, 169,
171, 196, 198, 225, and 227. Note again that if full Ambisonic order
is used and the number of coded streams is the same as the Ambisonic
order plus the two non-diegetic channels, the order must then be
lower than 12, due to the identification header length limit.
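
The list above follows directly from the formula (1 + n)^2 + 2j. The
Python sketch below regenerates it and splits a given channel count
into its Ambisonic order and non-diegetic flag; both function names
are illustrative only.

```python
import math

MAX_ORDER = 14  # highest Ambisonic order permitted by this section


def allowed_channel_counts():
    """All values of (1 + n)^2 + 2j for n = 0..14 and j = 0 or 1."""
    return sorted((n + 1) ** 2 + 2 * j
                  for n in range(MAX_ORDER + 1) for j in (0, 1))


def split_channel_count(count):
    """Return (order n, has_non_diegetic_stereo) for an allowed count."""
    for j in (0, 1):
        ambisonic = count - 2 * j
        if ambisonic >= 1:
            root = math.isqrt(ambisonic)
            if root * root == ambisonic and root - 1 <= MAX_ORDER:
                return root - 1, bool(j)
    raise ValueError("channel count not allowed for families 2 and 3")
```

The split is unambiguous because no two perfect squares differ by
exactly 2.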
4. Downmixing
The downmixing matrices in this section are only examples known to
give acceptable results for stereo downmixing from Ambisonics; other
mixing strategies are also allowed, e.g., to emphasize a certain
panning.
An Ogg Opus player MAY use the matrix in Figure 5 to implement
downmixing from multichannel files using channel mapping families 2
and 3 when there is no non-diegetic stereo. The first and second
Ambisonic channels are known as "W" and "Y", respectively. The
omitted coefficients in the matrix in the figure have the value 0.0.
/   \   /                  \ /     \
| L |   | 0.5  0.5 0.0 ... | |  W  |
| R | = | 0.5 -0.5 0.0 ... | |  Y  |
\   /   \                  / | ... |
                             \     /
Figure 5: Stereo Downmixing Matrix for Channel Mapping Families 2 and
3 - Only Ambisonic Channels
The first Ambisonic channel (W) is a mono audio stream that
represents the average audio signal over all directions. Since W is
not directional, Ogg Opus players MAY use W directly for mono
playback.
If a non-diegetic stereo track is present, the player MAY use the
matrix in Figure 6 for downmixing. Ls and Rs denote the two non-
diegetic stereo channels.
/   \   /                            \ /     \
| L |   | 0.25  0.25 0.0 ... 0.5 0.0 | |  W  |
| R | = | 0.25 -0.25 0.0 ... 0.0 0.5 | |  Y  |
\   /   \                            / | ... |
                                       | Ls  |
                                       | Rs  |
                                       \     /
Figure 6: Stereo Downmixing Matrix for Channel Mapping Families 2 and
3 - Ambisonic Channels Plus a Non-Diegetic Stereo Stream
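
The example matrices in Figures 5 and 6 touch only W, Y, and (when
present) the two non-diegetic channels, so a player does not need a
full matrix product. A Python sketch is given below; the function name
and the list-of-channels input layout are assumptions for this
example, as is the treatment of a missing Y channel (zeroth order) as
silence.

```python
def downmix_to_stereo(channels, has_non_diegetic):
    """Downmix output channels (ACN order, then optional Ls, Rs) to stereo.

    channels: list of per-channel sample lists.
    Returns (left, right) sample lists.
    """
    ambisonic = channels[:-2] if has_non_diegetic else channels
    w = ambisonic[0]
    # Assumption: with zeroth-order input there is no Y; treat it as zeros.
    y = ambisonic[1] if len(ambisonic) > 1 else [0.0] * len(w)
    if has_non_diegetic:
        ls, rs = channels[-2], channels[-1]
        # Figure 6 coefficients.
        left = [0.25 * a + 0.25 * b + 0.5 * c for a, b, c in zip(w, y, ls)]
        right = [0.25 * a - 0.25 * b + 0.5 * c for a, b, c in zip(w, y, rs)]
    else:
        # Figure 5 coefficients.
        left = [0.5 * a + 0.5 * b for a, b in zip(w, y)]
        right = [0.5 * a - 0.5 * b for a, b in zip(w, y)]
    return left, right
```

For mono playback, the W channel (the first entry of the input) can be
used as is.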
5. Updates to RFC 7845
5.1. Format of the Channel Mapping Table
The language in Section 5.1.1 of [RFC7845] (copied below) implies
that the channel mapping table, when present, has a fixed format for
all channel mapping families:
   The order and meaning of these channels are defined by a channel
   mapping, which consists of the 'channel mapping family' octet and,
   for channel mapping families other than family 0, a 'channel
   mapping table', as illustrated in Figure 3.
This document updates [RFC7845] to clarify that the format of the
channel mapping table may depend on the channel mapping family:
   The order and meaning of these channels are defined by a channel
   mapping, which consists of the 'channel mapping family' octet and
   for channel mapping families other than family 0, a 'channel
   mapping table'.

   The format of the channel mapping table depends on the channel
   mapping family. Unless the channel mapping family requires a
   custom format for its channel mapping table, the RECOMMENDED
   channel mapping table format for new mapping families is
   illustrated in Figure 3.
The change above is not meant to change how families 1 and 255
currently work. To ensure that, the first paragraph of
Section 5.1.1.2 is changed from:
   Allowed numbers of channels: 1...8. Vorbis channel order (see
   below).
to:
   Allowed numbers of channels: 1...8, with the mapping specified
   according to Figure 3. Vorbis channel order (see below).
Similarly, the first paragraph of Section 5.1.1.3 is changed from:
   Allowed numbers of channels: 1...255. No defined channel meaning.
to:
   Allowed numbers of channels: 1...255, with the mapping specified
   according to Figure 3. No defined channel meaning.
5.2. Unknown Mapping Families
The treatment of unknown mapping families is changed slightly.
Section 5.1.1.4 of [RFC7845] states:
   The remaining channel mapping families (2...254) are reserved. A
   demuxer implementation encountering a reserved 'channel mapping
   family' value SHOULD act as though the value is 255.
This is changed to:
   The remaining channel mapping families (2...254) are reserved. A
   demuxer implementation encountering a 'channel mapping family'
   value that it does not recognize SHOULD NOT attempt to decode the
   packets and SHOULD NOT use any information except for the first 19
   octets of the ID header packet (Figure 2) and the comment header
   (Figure 10).
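
One way a demuxer might follow the revised rule is sketched below in
Python. The first 19 octets of the ID header are laid out as in
Figure 2 of [RFC7845] (magic signature, version, channel count,
pre-skip, input sample rate, output gain, mapping family). The
function names and the set of recognized families are assumptions for
a hypothetical demuxer that implements families 0, 1, 2, 3, and 255.

```python
import struct

# Assumption: the families this hypothetical demuxer implements.
RECOGNIZED_FAMILIES = {0, 1, 2, 3, 255}


def read_id_header_prefix(packet):
    """Parse only the first 19 octets of an Ogg Opus ID header packet."""
    if len(packet) < 19 or packet[:8] != b"OpusHead":
        raise ValueError("not an Ogg Opus ID header")
    version, channels, pre_skip, rate, gain, family = struct.unpack_from(
        "<BBHIhB", packet, 8)
    return {
        "version": version,
        "channel_count": channels,
        "pre_skip": pre_skip,
        "input_sample_rate": rate,
        "output_gain": gain,        # signed Q7.8 value, left unconverted
        "mapping_family": family,
    }


def should_attempt_decode(prefix):
    """False for an unrecognized family: do not decode the audio packets."""
    return prefix["mapping_family"] in RECOGNIZED_FAMILIES
```

When should_attempt_decode returns False, the sketch leaves the rest
of the ID header untouched; the prefix fields and the comment header
remain available, for instance to display metadata.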
6. Experimental Mapping Families
To make development of new mapping families easier while reducing the
risk of creating compatibility issues with non-final versions of
mapping families, mapping families 240 through 254 (inclusive) are
now reserved for experiments and implementations of in-development
families. Note that these mapping-family experiments are not
restricted to Ambisonics. Implementers SHOULD attempt to use
experimental family numbers that have not recently been used and
SHOULD advertise what experimental numbers they use (e.g., for
Internet-Drafts).
The Ambisonics mapping experiments that led to this document used
experimental family 254 for family 2 and experimental family 253 for
family 3.
7. Security Considerations
Implementations of the Ogg container need to take appropriate
security considerations into account, as outlined in Section 8 of
[RFC7845]. The extension defined in this document requires that
semantic meaning be assigned to more channels than the existing Ogg
format requires. Since more allocations will be required to encode
and decode these semantically meaningful channels, care should be
taken in any new allocation paths. Implementations MUST NOT overrun
their allocated memory nor read from uninitialized memory when
managing the Ambisonic channel mapping.
8. IANA Considerations
IANA has added 17 new assignments to the "Opus Channel Mapping
Families" registry.
+---------+----------------------+----------------------------------+
| Value | Description | Reference |
+---------+----------------------+----------------------------------+
| 0 | Mono, L/R stereo | Section 5.1.1.1 of [RFC7845], |
| | | Section 5 of this document |
| | | |
| 1 | 1-8 channel surround | Section 5.1.1.2 of [RFC7845], |
| | | Section 5 of this document |
| | | |
| 2 | Ambisonics as | Section 3.1 of this document |
| | individual channels | |
| | | |
| 3 | Ambisonics with | Section 3.2 of this document |
| | demixing matrix | |
| | | |
| 240-254 | Experimental use | Section 6 of this document |
| | | |
| 255 | Discrete channels | Section 5.1.1.3 of [RFC7845], |
| | | Section 5 of this document |
+---------+----------------------+----------------------------------+
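
For implementers, the registry contents above can be mirrored in a
small lookup, sketched here in Python. The function name is
illustrative, and the label "Unassigned" for values that do not appear
in the table is an assumption of this example rather than registry
text.

```python
ASSIGNED = {
    0: "Mono, L/R stereo",
    1: "1-8 channel surround",
    2: "Ambisonics as individual channels",
    3: "Ambisonics with demixing matrix",
    255: "Discrete channels",
}


def describe_mapping_family(value):
    """Return the registry description for a channel mapping family octet."""
    if not 0 <= value <= 255:
        raise ValueError("mapping family is a single octet")
    if value in ASSIGNED:
        return ASSIGNED[value]
    if 240 <= value <= 254:
        return "Experimental use"
    return "Unassigned"  # assumption: label for values not in the table
```

Counting values 2, 3, and 240 through 254 gives the 17 new assignments
mentioned above.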
9. References
9.1. Normative References
[ambix] Nachbar, C., Zotter, F., Deleflie, E., and A. Sontacchi,
"AMBIX - A SUGGESTED AMBISONICS FORMAT",
Ambisonics Symposium, June 2011,
<http://iem.kug.ac.at/fileadmin/media/iem/projects/2011/
ambisonics11_nachbar_zotter_sontacchi_deleflie.pdf>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC6716] Valin, JM., Vos, K., and T. Terriberry, "Definition of the
Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716,
September 2012, <https://www.rfc-editor.org/info/rfc6716>.
[RFC7845] Terriberry, T., Lee, R., and R. Giles, "Ogg Encapsulation
for the Opus Audio Codec", RFC 7845, DOI 10.17487/RFC7845,
April 2016, <https://www.rfc-editor.org/info/rfc7845>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>.
9.2. Informative References
[daniel04] Daniel, J. and S. Moreau, "Further Study of Sound Field
Coding with Higher Order Ambisonics", Audio Engineering
Society Convention Paper, May 2004,
<https://www.researchgate.net/publication/
277841868_Further_Study_of_Sound_Field_Coding
_with_Higher_Order_Ambisonics>.
[fellgett75]
Fellgett, P., "Ambisonics. Part one: General system
description", Studio Sound vol. 17, no. 8, pp. 20-22,
August 1975,
<http://www.michaelgerzonphotos.org.uk/articles/
Ambisonics%201.pdf>.
Acknowledgments
Thanks to Timothy Terriberry, Jean-Marc Valin, Mark Harris, Marcin
Gorzel, and Andrew Allen for their guidance and valuable
contributions to this document.
Authors' Addresses
Jan Skoglund
Google LLC
345 Spear Street
San Francisco, CA 94105
United States of America
Email: jks@google.com
Michael Graczyk
Email: michael@mgraczyk.com