Network Working Group                                      H. Schulzrinne
Request for Comments: 2833                            Columbia University
Category: Standards Track                                      S. Petrack
                                                                  MetaTel
                                                                 May 2000


   RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2000).  All Rights Reserved.

Abstract

   This memo describes how to carry dual-tone multifrequency (DTMF)
   signaling, other tone signals and telephony events in RTP packets.

1 Introduction

   This memo defines two RTP [1] payload formats: one for carrying
   dual-tone multifrequency (DTMF) digits and other line and trunk
   signals (Section 3), and a second one for general multi-frequency
   tones (Section 4). Separate RTP payload formats are desirable since
   low-rate voice codecs cannot be guaranteed to reproduce these tone
   signals accurately enough for automatic recognition. Defining
   separate payload formats also permits higher redundancy while
   maintaining a low bit rate.

   The payload formats described here may be useful in at least three
   applications: DTMF handling for gateways and end systems, as well as
   "RTP trunks". In the first application, the Internet telephony
   gateway detects DTMF on the incoming circuits and sends the RTP
   payload described here instead of regular audio packets. The gateway
   likely has the necessary digital signal processors and algorithms, as
   it often needs to detect DTMF, e.g., for two-stage dialing. Having
   the gateway detect tones relieves the receiving Internet end system
   of this work and also keeps low bit-rate codecs like G.723.1 from
   rendering DTMF tones unintelligible. Secondly, an Internet




Schulzrinne & Petrack       Standards Track                     [Page 1]
^L
RFC 2833                         Tones                          May 2000


   end system such as an "Internet phone" can emulate DTMF functionality
   without concerning itself with generating precise tone pairs and
   without imposing the burden of tone recognition on the receiver.

   In the "RTP trunk" application, RTP is used to replace a normal
   circuit-switched trunk between two nodes. This is particularly of
   interest in a telephone network that is still mostly circuit-
   switched.  In this case, each end of the RTP trunk encodes audio
   channels into the appropriate encoding, such as G.723.1 or G.729.
   However, this encoding process destroys in-band signaling information
   which is carried using the least-significant bit ("robbed bit
   signaling") and may also interfere with in-band signaling tones, such
   as the MF digit tones. In addition, tone properties such as the phase
   reversals in the ANSam tone will not survive speech coding. Thus,
   the gateway needs to remove the in-band signaling information from
   the bit stream. It can now either carry it out-of-band in a signaling
   transport mechanism yet to be defined, or it can use the mechanism
   described in this memorandum. (If the two trunk end points are within
   reach of the same media gateway controller, the media gateway
   controller can also handle the signaling.)  Carrying it in-band may
   simplify the time synchronization between audio packets and the tone
   or signal information. This is particularly relevant where duration
   and timing matter, as in the carriage of DTMF signals.

1.1 Terminology

   In this document, the key words "MUST", "MUST NOT", "REQUIRED",
   "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
   and "OPTIONAL" are to be interpreted as described in RFC 2119 [2] and
   indicate requirement levels for compliant implementations.

2 Events vs. Tones

   A gateway has two options for handling DTMF digits and events. First,
   it can simply measure the frequency components of the voice band
   signals and transmit this information to the RTP receiver (Section
   4). In this mode, the gateway makes no attempt to discern the meaning
   of the tones, but simply distinguishes tones from speech signals.

   All tone signals in use in the PSTN and meant for human consumption
   are sequences of simple combinations of sine waves, either added or
   modulated. (There is at least one tone, the ANSam tone [3] used for
   indicating data transmission over voice lines, that makes use of
   periodic phase reversals.)

   As a second option, a gateway can recognize the tones and translate
   them into a name, such as ringing or busy tone. The receiver then
   produces a tone signal or other indication appropriate to the signal.
   Generally, since the recognition of signals often depends on their
   on/off pattern or the sequence of several tones, this recognition can
   take several seconds. On the other hand, the gateway may have access
   to the actual signaling information that generates the tones and thus
   can generate the RTP packet immediately, without the detour through
   acoustic signals.

   In the phone network, tones are generated at different places,
   depending on the switching technology and the nature of the tone.
   This determines, for example, whether a person making a call to a
   foreign country hears the local tones she is familiar with or the
   tones as used in the country called.

   For analog lines, dial tone is always generated by the local switch.
   ISDN terminals may generate dial tone locally and then send a Q.931
   SETUP message containing the dialed digits. If the terminal just
   sends a SETUP message without any Called Party digits, then the
   switch does digit collection, provided by the terminal as KEYPAD
   messages, and provides dial tone over the B-channel. The terminal can
   either use the audio signal on the B-channel or can use the Q.931
   messages to trigger locally generated dial tone.

   Ringing tone (also called ringback tone) is generated by the local
   switch at the callee, with a one-way voice path opened up as soon as
   the callee's phone rings. (This reduces the chance of clipping the
   called party's response just after answer. It also permits pre-answer
   announcements or in-band call-progress indications to reach the
   caller before or in lieu of a ringing tone.) Congestion tone and
   special information tones can be generated by any of the switches
   along the way, and may be generated by the caller's switch based on
   ISUP messages received. Busy tone is generated by the caller's
   switch, triggered by the appropriate ISUP message, for analog
   instruments, or the ISDN terminal.

   Gateways which send signaling events via RTP MAY send both named
   signals (Section 3) and the tone representation (Section 4) as a
   single RTP session, using the redundancy mechanism defined in Section
   3.7 to interleave the two representations. It is generally a good
   idea to send both, since it allows the receiver to choose the
   appropriate rendering.

   If a gateway cannot present a tone representation, it SHOULD send the
   audio tones as regular RTP audio packets (e.g., as payload format
   PCMU), in addition to the named signals.

3 RTP Payload Format for Named Telephone Events

3.1 Introduction

   The payload format for named telephone events described below is
   suitable for both gateway and end-to-end scenarios. In the gateway
   scenario, an Internet telephony gateway connecting a packet voice
   network to the PSTN recreates the DTMF tones or other telephony
   events and injects them into the PSTN. Since, for example, DTMF digit
   recognition takes several tens of milliseconds, the first few
   milliseconds of a digit will arrive as regular audio packets. Thus,
   careful time and power (volume) alignment between the audio samples
   and the events is needed to avoid generating spurious digits at the
   receiver.

   DTMF digits and named telephone events are carried as part of the
   audio stream, and MUST use the same sequence number and time-stamp
   base as the regular audio channel to simplify the generation of audio
   waveforms at a gateway. The default clock frequency is 8,000 Hz, but
   the clock frequency can be redefined when assigning the dynamic
   payload type.

   The payload format described here achieves a higher redundancy even
   in the case of sustained packet loss than the method proposed for the
   Voice over Frame Relay Implementation Agreement [4].

   If an end system is directly connected to the Internet and does not
   need to generate tone signals again, time alignment and power levels
   are not relevant. These systems rely on PSTN gateways or Internet end
   systems to generate DTMF events and do not perform their own audio
   waveform analysis. An example of such a system is an Internet
   interactive voice-response (IVR) system.

   In circumstances where exact timing alignment between the audio
   stream and the DTMF digits or other events is not important and data
   is sent unicast, such as the IVR example mentioned earlier, it may be
   preferable to use a reliable control protocol rather than RTP
   packets. In those circumstances, this payload format would not be
   used.

3.2 Simultaneous Generation of Audio and Events

   A source MAY send events and coded audio packets for the same time
   instants, using events as the redundant encoding for the audio
   stream, or it MAY block outgoing audio while event tones are active
   and only send named events as both the primary and redundant
   encodings.

   Note that a period covered by an encoded tone may overlap in time
   with a period of audio encoded by other means. This is likely to
   occur at the onset of a tone and is necessary to avoid possible
   errors in the interpretation of the reproduced tone at the remote
   end.  Implementations supporting this payload format must be prepared
   to handle the overlap. It is RECOMMENDED that gateways only render
   the encoded tone since the audio may contain spurious tones
   introduced by the audio compression algorithm. However, it is
   anticipated that these extra tones in general should not interfere
   with recognition at the far end.

3.3 Event Types

   This payload format is used for five different types of signals:

      o  DTMF tones (Section 3.10);

      o  fax-related tones (Section 3.11);

      o  standard subscriber line tones (Section 3.12);

      o  country-specific subscriber line tones (Section 3.13); and

      o  trunk events (Section 3.14).

   A compliant implementation MUST support the events listed in Table 1
   with the exception of "flash". If it uses some other, out-of-band
   mechanism for signaling line conditions, it does not have to
   implement the other events.

   In some cases, an implementation may simply ignore certain events,
   such as fax tones, that do not make sense in a particular
   environment.  Section 3.9 specifies how an implementation can use the
   SDP "fmtp" parameter within an SDP description to indicate its
   inability to understand a particular event or range of events.

   Depending on the available user interfaces, an implementation MAY
   render all tones in Table 5 the same or, preferably, use the tones
   conveyed by the concurrent "tone" payload or other RTP audio payload.
   Alternatively, it could provide a textual representation.

   Note that end systems that emulate telephones only need to support
   the events described in Sections 3.10 and 3.12, while systems that
   receive trunk signaling need to implement those in Sections 3.10,
   3.11, 3.12 and 3.14, since MF trunks also carry most of the "line"
   signals. Systems that do not support fax or modem functionality do
   not need to render fax-related events described in Section 3.11.

   The RTP payload format is designated as "telephone-event", the MIME
   type as "audio/telephone-event". The default timestamp rate is 8000
   Hz, but other rates may be defined. In accordance with current
   practice, this payload format does not have a static payload type
   number, but uses an RTP payload type number established dynamically
   and out-of-band.

3.4 Use of RTP Header Fields

      Timestamp: The RTP timestamp reflects the measurement point for
           the current packet. The event duration described in Section
           3.5 extends forwards from that time. The receiver calculates
           jitter for RTCP receiver reports based on all packets with a
           given timestamp. Note: The jitter value should primarily be
           used as a means for comparing the reception quality between
           two users or two time-periods, not as an absolute measure.

      Marker bit: The RTP marker bit indicates the beginning of a new
           event.

3.5 Payload Format

   The payload format is shown in Fig. 1.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     event     |E|R| volume    |          duration             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 1: Payload Format for Named Events

      event: The event is encoded as shown in Sections 3.10 through
           3.14.

      volume: For DTMF digits and other events representable as tones,
           this field describes the power level of the tone, expressed
           in dBm0 after dropping the sign. Power levels range from 0 to
           -63 dBm0. The range of valid DTMF is from 0 to -36 dBm0 (must
           accept); lower than -55 dBm0 must be rejected (TR-TSY-000181,
           ITU-T Q.24A). Thus, larger values denote lower volume. This
           value is defined only for DTMF digits. For other events, it
           is set to zero by the sender and is ignored by the receiver.

      duration: Duration of this digit, in timestamp units. Thus, the
           event began at the instant identified by the RTP timestamp
           and has so far lasted as long as indicated by this parameter.
           The event may or may not have ended.

           For a sampling rate of 8000 Hz, this field is sufficient to
           express event durations of up to approximately 8 seconds.

      E: If set to a value of one, the "end" bit indicates that this
           packet contains the end of the event. Thus, the duration
           parameter above measures the complete duration of the event.

           A sender MAY delay setting the end bit until retransmitting
           the last packet for a tone, rather than on its first
           transmission. This avoids having to wait to detect whether
           the tone has indeed ended.

           Receiver implementations MAY use different algorithms to
           create tones, including the two described here. In the first,
           the receiver simply places a tone of the given duration in
           the audio playout buffer at the location indicated by the
           timestamp. As additional packets are received that extend the
           same tone, the waveform in the playout buffer is extended
           accordingly. (Care has to be taken if audio is mixed, i.e.,
           summed, in the playout buffer rather than simply copied.)
           Thus, if a packet in a tone lasting longer than the packet
           interarrival time gets lost and the playout delay is short, a
           gap in the tone may occur.  Alternatively, the receiver can
           start a tone and play it until it receives a packet with the
           "E" bit set, receives the next tone (distinguished by a
           different timestamp value), or a given time period elapses.
           This is more robust against packet loss, but may extend the
           tone if all retransmissions of the last packet in an event
           are lost. Limiting the period over which the tone is extended
           is necessary to keep a tone from getting "stuck". Regardless
           of the algorithm used, the tone SHOULD NOT be extended by
           more than three packet interarrival times. A slight extension
           of tone durations and shortening of pauses is generally
           harmless.

      R: This field is reserved for future use. The sender MUST set it
           to zero, and the receiver MUST ignore it.
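
   As an illustration only, and not as part of the payload
   specification, the four-octet layout of Figure 1 can be packed and
   parsed as in the Python sketch below. The function names are
   hypothetical, and network byte order is assumed, as is usual for
   RTP payloads. The last line works out the "approximately 8 seconds"
   limit of the 16-bit duration field at 8000 Hz.

```python
import struct


def pack_event(event, end, volume, duration):
    """Pack one named-event payload (Figure 1) into four octets."""
    if not 0 <= event <= 0xFF:
        raise ValueError("event must fit in 8 bits")
    if not 0 <= volume <= 63:
        raise ValueError("volume must be 0..63 (0 to -63 dBm0, sign dropped)")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    # Second octet: E bit (most significant), R bit (sent as zero),
    # then the 6-bit volume.
    flags_volume = (0x80 if end else 0x00) | volume
    return struct.pack("!BBH", event, flags_volume, duration)


def unpack_event(payload):
    """Parse the first four octets of a named-event payload."""
    event, flags_volume, duration = struct.unpack("!BBH", payload[:4])
    return {
        "event": event,
        "end": bool(flags_volume & 0x80),
        # The R bit (0x40) is deliberately ignored on receipt.
        "volume": flags_volume & 0x3F,
        "duration": duration,
    }


# Largest duration the 16-bit field can express at 8000 Hz (about 8.19 s).
MAX_DURATION_SECONDS = 0xFFFF / 8000
```

   For instance, pack_event(9, True, 7, 1600) yields the octets
   09 87 06 40 (hexadecimal), which is the first digit of the example
   in Section 3.8.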

3.6 Sending Event Packets

   An audio source SHOULD start transmitting event packets as soon as it
   recognizes an event and continue every 50 ms thereafter, or at the
   packet interval of the audio codec used for this session, if known.
   (The sender does not need to maintain precise time intervals between
   event packets in order to maintain precise inter-event times, since
   the timing information is contained in the timestamp.)

      Q.24 [5], Table A-1, indicates that all administrations surveyed
      use a minimum signal duration of 40 ms, with signaling velocity
      (tone and pause) of no less than 93 ms.

   If an event continues for more than one period, the source generating
   the events should send a new event packet with the RTP timestamp
   value corresponding to the beginning of the event and the duration of
   the event increased correspondingly. (The RTP sequence number is
   incremented by one for each packet.) If there has been no new event
   in the last interval, the event SHOULD be retransmitted three times
   or until the next event is recognized. This ensures that the duration
   of the event can be recognized correctly even if the last packet for
   an event is lost.

      DTMF digits and events are sent incrementally to avoid having the
      receiver wait for the completion of the event.  Since some tones
      are two seconds long, this would incur a substantial delay. The
      transmitter does not know if event length is important and thus
      needs to transmit immediately and incrementally. If the receiver
      application does not care about event length, the incremental
      transmission mechanism avoids delay. Some applications, such as
      gateways into the PSTN, care about both delays and event duration.

3.7 Reliability

   During an event, the RTP event payload format provides incremental
   updates on the event. The error resiliency depends on the playout
   delay at the receiver. For example, for a playout delay of 120 ms and
   a packet gap of 50 ms, two packets in a row can get lost without
   causing a gap in the tones generated at the receiver.

   The audio redundancy mechanism described in RFC 2198 [6] MAY be used
   to recover from packet loss across events. The effective data rate is
   r times 64 bits (32 bits for the redundancy header and 32 bits for
   the telephone-event payload) every 50 ms or r times 1280 bits/second,
   where r is the number of redundant events carried in each packet. The
   value of r is an implementation trade-off, with a value of 5
   suggested.

      The timestamp offset in this redundancy scheme has 14 bits, so
      that it allows a single packet to "cover" 2.048 seconds of
      telephone events at a sampling rate of 8000 Hz.  Including the
      starting time of previous events allows precise reconstruction of
      the tone sequence at a gateway.  The scheme is resilient to
      consecutive packet losses spanning this interval of 2.048 seconds
      or r digits, whichever is less. Note that for previous digits,
      only an average loudness can be represented.
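
   The two figures quoted above (1280 bits/second per redundant event
   and 2.048 seconds of coverage) follow from simple arithmetic, shown
   here as a small Python sketch. The constant and function names are
   hypothetical; the numeric values are those stated in the text.

```python
REDUNDANCY_HEADER_BITS = 32   # per redundant block (RFC 2198 header)
EVENT_PAYLOAD_BITS = 32       # one telephone-event payload
TIMESTAMP_OFFSET_BITS = 14    # width of the RFC 2198 timestamp offset


def redundancy_bit_rate(r, packet_interval_ms=50):
    """Bits per second spent on r redundant events per packet."""
    bits_per_packet = r * (REDUNDANCY_HEADER_BITS + EVENT_PAYLOAD_BITS)
    return bits_per_packet * 1000 // packet_interval_ms


def offset_coverage_seconds(clock_rate_hz=8000):
    """Time span that the timestamp offset field can express."""
    return (1 << TIMESTAMP_OFFSET_BITS) / clock_rate_hz
```

   With the suggested value r = 5, redundancy_bit_rate(5) gives 6400
   bits/second.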

   An encoder MAY treat the event payload as a highly-compressed version
   of the current audio frame. In that mode, each RTP packet during an
   event would contain the current audio codec rendition (say, G.723.1
   or G.729) of this digit as well as the representation described in
   Section 3.5, plus any previous events seen earlier.

      This approach allows dumb gateways that do not understand this
      format to function. See also the discussion in Section 1.
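
   One way to picture the RFC 2198 [6] encapsulation is to assemble,
   octet by octet, the packet used in the example of Section 3.8. The
   Python sketch below does this under stated assumptions: the helper
   names are hypothetical, the redundancy header layout (F bit, 7-bit
   block payload type, 14-bit timestamp offset, 10-bit block length) is
   the one defined in RFC 2198, and all field values are taken from
   the example.

```python
import struct

CLOCK_HZ = 8000


def ms_to_units(ms):
    """Convert milliseconds to RTP timestamp units at CLOCK_HZ."""
    return ms * CLOCK_HZ // 1000


def event_payload(event, end, volume, duration):
    """Four-octet named-event payload (event, E/R/volume, duration)."""
    flags_volume = (0x80 if end else 0x00) | (volume & 0x3F)
    return struct.pack("!BBH", event, flags_volume, duration)


def redundant_header(block_pt, ts_offset, block_len):
    """RFC 2198 header for a redundant block (F bit set)."""
    word = (1 << 31) | (block_pt << 24) | (ts_offset << 10) | block_len
    return struct.pack("!I", word)


def build_911_packet():
    sent_at = ms_to_units(1450)                 # 11600
    starts = [ms_to_units(t) for t in (0, 800, 1400)]
    ts = starts[2]                              # 11200, start of third digit
    # Fixed RTP header: V=2, P=0, X=0, CC=0 | M=0, PT=96 | seq 28.
    rtp = struct.pack("!BBHII", 0x80, 96, 28, ts, 0x5234A8)
    headers = (redundant_header(97, ts - starts[0], 4)
               + redundant_header(97, ts - starts[1], 4)
               + bytes([97]))                   # primary block: F=0, PT=97
    payloads = (event_payload(9, True, 7, ms_to_units(200))
                + event_payload(1, True, 10, ms_to_units(250))
                + event_payload(1, False, 20, sent_at - ts))
    return rtp + headers + payloads
```

   The resulting packet is 33 octets long: 12 for the RTP header, 9 for
   the three block headers and 12 for the three event payloads.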

3.8 Example

   Figure 2 shows a typical RTP packet, sent while the user is dialing
   the last digit of the DTMF sequence "911". The first digit was
   200 ms long (1600 timestamp units) and started at time 0; the second
   digit lasted 250 ms (2000 timestamp units) and started at time 800 ms
   (6400 timestamp units); the third digit was pressed at time 1.4 s
   (11,200 timestamp units), and the packet shown was sent at 1.45 s
   (11,600 timestamp units).  The frame duration is 50 ms. To make the
   parts recognizable, the figure ignores byte alignment. Timestamp and
   sequence number are assumed to have been zero at the beginning of the
   first digit. In this example, the dynamic payload types 96 and 97
   have been assigned for the redundancy mechanism and the telephone
   event payload, respectively.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         |
   | 2 |0|0|   0   |0|     96      |              28               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           timestamp                           |
   |                             11200                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   |                            0x5234a8                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|   block PT  |     timestamp offset      |   block length    |
   |1|     97      |            11200          |         4         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|   block PT  |     timestamp offset      |   block length    |
   |1|     97      |   11200 - 6400 = 4800     |         4         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|   block PT  |
   |0|     97      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     digit     |E R| volume    |          duration             |
   |       9       |1 0|     7     |             1600              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     digit     |E R| volume    |          duration             |
   |       1       |1 0|    10     |             2000              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     digit     |E R| volume    |          duration             |
   |       1       |0 0|    20     |              400              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          Figure 2: Example RTP packet after dialing "911"

3.9 Indication of Receiver Capabilities using SDP

   Receivers MAY indicate which named events they can handle, for
   example, by using the Session Description Protocol (RFC 2327 [7]).
   The payload formats use the following fmtp format to list the event
   values that they can receive:

   a=fmtp:<format> <list of values>

   The list of values consists of comma-separated elements, which can be
   either a single decimal number or two decimal numbers separated by a
   hyphen (dash), where the second number is larger than the first. No
   whitespace is allowed between numbers or hyphens. The list does not
   have to be sorted.

   For example, if the payload format uses the payload type number 100,
   and the implementation can handle the DTMF tones (events 0 through
   15) and the dial and ringing tones, it would include the following
   description in its SDP message:

   a=fmtp:100 0-15,66,70

   Since all implementations MUST be able to receive events 0 through
   15, listing these events in the a=fmtp line is OPTIONAL.
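
   A receiver of such a description might expand the list of values as
   in the following Python sketch. This is only an illustration of the
   syntax rules given above; the names parse_event_list and _number
   are hypothetical.

```python
def _number(text):
    """Convert a decimal number, rejecting whitespace and empty strings."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a decimal number: {text!r}")
    return int(text)

def parse_event_list(value):
    """Expand an fmtp event list such as "0-15,66,70" into a set of ints."""
    events = set()
    for element in value.split(","):
        first, sep, last = element.partition("-")
        if sep:
            low, high = _number(first), _number(last)
            if high <= low:
                raise ValueError("second number must be larger than first")
            events.update(range(low, high + 1))
        else:
            events.add(_number(first))
    return events

supported = parse_event_list("0-15,66,70")
```

   Because the list need not be sorted, the sketch collects the values
   in a set rather than relying on their order.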

   The corresponding MIME parameter is "events", so that the following
   sample media type definition corresponds to the SDP example above:

   audio/telephone-event;events="0-15,66,70";rate="8000"

3.10 DTMF Events

   Table 1 summarizes the DTMF-related named events within the
   telephone-event payload format.

                     Event  encoding (decimal)
                     _________________________
                     0--9                0--9
                     *                     10
                     #                     11
                     A--D              12--15
                     Flash                 16

                     Table 1: DTMF named events
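
   For illustration, Table 1 can be expressed as a lookup table, as in
   the Python sketch below. The names DTMF_EVENTS, FLASH_EVENT and
   events_for_digits are hypothetical and chosen for this example only.

```python
# Event codes from Table 1. "Flash" is not a keypad character, so it is
# kept as a separate constant.
DTMF_EVENTS = {ch: code for code, ch in enumerate("0123456789*#ABCD")}
FLASH_EVENT = 16

def events_for_digits(digits):
    """Translate a dialed string such as "911" into event codes."""
    return [DTMF_EVENTS[ch] for ch in digits.upper()]
```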

3.11 Data Modem and Fax Events

   Table 3 summarizes the events and tones that can appear on a
   subscriber line serving a fax machine or modem. The tones are
   described below, with additional detail in Table 7.

      ANS: This 2100 +/- 15 Hz tone is used to disable echo
           suppression for data transmission [8,9]. For fax machines,
           Recommendation T.30 [9] refers to this tone as the called
           terminal identification (CED) answer tone.

      /ANS: This is the same signal as ANS, except that it reverses
           phase at an interval of 450 +/- 25 ms. It disables both
           echo cancellers and echo suppressors. (In the ITU
           Recommendation V.25 [8], this signal is rendered as ANS
           with a bar on top.)

      ANSam: The modified answer tone (ANSam) [3] is a sinewave signal
           at 2100 +/- 1 Hz without phase reversals, amplitude-modulated
           by a sinewave at 15 +/- 0.1 Hz. This tone is sent by modems
           if network echo canceller disabling is not required.

      /ANSam: The modified answer tone with phase reversals (/ANSam) [3]
           is a sinewave signal at 2100 +/- 1 Hz with phase reversals at
           intervals of 450 +/- 25 ms, amplitude-modulated by a sinewave
           at 15 +/- 0.1 Hz. This tone [10,8] is sent by modems [11] and
           faxes to disable echo suppressors.

      CNG: After dialing the called fax machine's telephone number (and
           before it answers), the calling Group 3 fax machine
           (optionally) begins sending a CalliNG tone (CNG) consisting
           of an interrupted tone of 1100 Hz [9].

      CRdi: Capabilities Request (CRd), initiating side, [12] is a
           dual-tone signal with tones at 1375 Hz and 2002 Hz for 400
           ms, followed by a single tone at 1900 Hz for 100 ms. "This
           signal requests the remote station transition from telephony
           mode to an information transfer mode and requests the
           transmission of a capabilities list message by the remote
           station. In particular, CRdi is sent by the initiating
           station during the course of a call, or by the calling
           station at call establishment in response to a CRe or MRe."

      CRdr: CRdr is the response tone to CRdi (see above). It consists
           of a dual-tone signal with tones at 1529 Hz and 2225 Hz for
           400 ms, followed by a single tone at 1900 Hz for 100 ms.

      CRe: Capabilities Request (CRe) [12] is a dual-tone signal with
           tones at 1375 Hz and 2002 Hz for 400 ms, followed by
           a single tone at 400 Hz for 100 ms. "This signal requests the
           remote station transition from telephony mode to an
           information transfer mode and requests the transmission of a
           capabilities list message by the remote station. In
           particular, CRe is sent by an automatic answering station at
           call establishment."

      CT: "The calling tone [8] consists of a series of interrupted
           bursts of binary 1 signal or 1300 Hz, on for a duration of
           not less than 0.5 s and not more than 0.7 s and off for a
           duration of not less than 1.5 s and not more than 2.0 s."
           Modems not starting with the V.8 call initiation tone often
           use this tone.

      ESi: Escape Signal (ESi) [12] is a dual-tone signal with tones at
           1375 Hz and 2002 Hz for 400 ms, followed by a single tone at
           980 Hz for 100 ms. "This signal requests the remote station
           transition from telephony mode to an information transfer
           mode. Signal ESi is sent by the initiating station."

      ESr: Escape Signal (ESr) [12] is a dual-tone signal with tones at
           1529 Hz and 2225 Hz for 400 ms, followed by a single tone at
           1650 Hz for 100 ms. Same as ESi, but sent by the responding
           station.

      MRdi: Mode Request (MRd), initiating side, [12] is a dual-tone
           signal with tones at 1375 Hz and 2002 Hz for 400 ms followed
           by a single tone at 1150 Hz for 100 ms. "This signal requests
           the remote station transition from telephony mode to an
           information transfer mode and requests the transmission of a
           mode select message by the remote station. In particular,
           signal MRd is sent by the initiating station during the
           course of a call, or by the calling station at call
           establishment in response to an MRe." [12]

      MRdr: MRdr is the response tone to MRdi (see above). It consists
           of a dual-tone signal with tones at 1529 Hz and 2225 Hz for
           400 ms, followed by a single tone at 1150 Hz for 100 ms.

      MRe: Mode Request (MRe) [12] is a dual-tone signal with tones at
           1375 Hz and 2002 Hz for 400 ms, followed by a single tone at
           650 Hz for 100 ms. "This signal requests the remote station
           transition from telephony mode to an information transfer
           mode and requests the transmission of a mode select message
           by the remote station. In particular, signal MRe is sent by
           an automatic answering station at call establishment." [12]

      V.21: V.21 describes a 300 b/s full-duplex modem that employs
           frequency shift keying (FSK). It is used by Group 3 fax
           machines to exchange T.30 information. The calling modem
           transmits on channel 1 and receives on channel 2; the
           answering modem transmits on channel 2 and receives on
           channel 1. Each bit value has a distinct tone, so that V.21
           signaling comprises a total of four distinct tones.

   In summary, V.25 and V.8 use the answer tone indications listed in
   Table 2.

           Procedure                      indications
           ___________________________________________________
           V.25 and V.8                   ANS
           V.25, echo canceller disabled  ANS, /ANS, ANS, /ANS
           V.8                            ANSam
           V.8, echo canceller disabled   /ANSam

      Table 2: Use of ANS, ANSam and /ANSam in V.x recommendations


           Event                    encoding (decimal)
           ___________________________________________________
           Answer tone (ANS)                        32
           /ANS                                     33
           ANSam                                    34
           /ANSam                                   35
           Calling tone (CNG)                       36
           V.21 channel 1, "0" bit                  37
           V.21 channel 1, "1" bit                  38
           V.21 channel 2, "0" bit                  39
           V.21 channel 2, "1" bit                  40
           CRdi                                     41
           CRdr                                     42
           CRe                                      43
           ESi                                      44
           ESr                                      45
           MRdi                                     46
           MRdr                                     47
           MRe                                      48
           CT                                       49

                Table 3: Data and fax named events
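
   The CRd, CRe, ES and MR signals described in this section all share
   one structure: 400 ms of a dual tone followed by 100 ms of a single
   tone. The Python sketch below collects the frequencies quoted in the
   descriptions above into a table. It is an illustration only; the
   names LOW_PAIR, HIGH_PAIR, SIGNALS and segments are hypothetical.

```python
# Dual-tone pairs in Hz, as quoted in the signal descriptions.
LOW_PAIR = (1375, 2002)
HIGH_PAIR = (1529, 2225)

# Signal name -> (dual-tone pair, single tone in Hz).
SIGNALS = {
    "CRdi": (LOW_PAIR, 1900),
    "CRdr": (HIGH_PAIR, 1900),
    "CRe":  (LOW_PAIR, 400),
    "ESi":  (LOW_PAIR, 980),
    "ESr":  (HIGH_PAIR, 1650),
    "MRdi": (LOW_PAIR, 1150),
    "MRdr": (HIGH_PAIR, 1150),
    "MRe":  (LOW_PAIR, 650),
}

def segments(name):
    """Return [(frequencies_hz, duration_ms), ...] for one signal."""
    pair, single = SIGNALS[name]
    return [(pair, 400), ((single,), 100)]
```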

3.12 Line Events

   Table 4 summarizes the events and tones that can appear on a
   subscriber line.

   ITU Recommendation E.182 [13] defines when certain tones should be
   used. It defines the following standard tones that are heard by the
   caller:

      Dial tone: The exchange is ready to receive address information.

      PABX internal dial tone: The PABX is ready to receive address
           information.

      Special dial tone: Same as dial tone, but the caller's line is
           subject to a specific condition, such as call diversion or
           waiting voice mail (e.g., "stutter dial tone").

      Second dial tone: The network has accepted the address
           information, but additional information is required.

      Ring: This named signal event causes the recipient to generate an
           alerting signal ("ring"). The actual tone or other indication
           used to render this named event is left up to the receiver.
           (This differs from the ringing tone, below, heard by the
           caller.)

      Ringing tone: The call has been placed to the callee and a calling
           signal (ringing) is being transmitted to the callee. This
           tone is also called "ringback".

      Special ringing tone: A special service, such as call forwarding
           or call waiting, is active at the called number.

      Busy tone: The called telephone number is busy.

      Congestion tone: Facilities necessary for the call are temporarily
           unavailable.

      Calling card service tone: The calling card service tone consists
           of 60 ms of the sum of 941 Hz and 1477 Hz tones (DTMF '#'),
           followed by 940 ms of 350 Hz and 440 Hz (U.S.  dial tone),
           decaying exponentially with a time constant of 200 ms.

      Special information tone: The callee cannot be reached, but the
           reason is neither "busy" nor "congestion". This tone should
           be used before all call failure announcements, for the
           benefit of automatic equipment.

      Comfort tone: The call is being processed. This tone may be used
           during long post-dial delays, e.g., in international
           connections.

      Hold tone: The caller has been placed on hold.

      Record tone: The caller has been connected to an automatic
           answering device and is requested to begin speaking.

      Caller waiting tone: The called station is busy, but has call
           waiting service.

      Pay tone: The caller, at a payphone, is reminded to deposit
           additional coins.

      Positive indication tone: The supplementary service has been
           activated.

      Negative indication tone: The supplementary service could not be
           activated.

      Off-hook warning tone: The caller has left the instrument off-hook
           for an extended period of time.

   The following tones can be heard by either the calling or the called
   party during a conversation:

      Call waiting tone: Another party wants to reach the subscriber.

      Warning tone: The call is being recorded. This tone is not
           required in all jurisdictions.

      Intrusion tone: The call is being monitored, e.g., by an operator.

      CPE alerting signal: A tone used to alert a device to an arriving
           in-band FSK data transmission. A CPE alerting signal is a
           combined 2130 and 2750 Hz tone, both with tolerances of 0.5%
           and a duration of 80 ms. The CPE alerting signal is used
           with ADSI services and Call Waiting ID services [14].

   The following tones are heard by operators:

      Payphone recognition tone: The person making the call or being
           called is using a payphone (and thus it is ill-advised to
           allow collect calls to such a person).

          Event                      encoding (decimal)
          _____________________________________________
          Off Hook                                  64
          On Hook                                   65
          Dial tone                                 66
          PABX internal dial tone                   67
          Special dial tone                         68
          Second dial tone                          69
          Ringing tone                              70
          Special ringing tone                      71
          Busy tone                                 72
          Congestion tone                           73
          Special information tone                  74
          Comfort tone                              75
          Hold tone                                 76
          Record tone                               77
          Caller waiting tone                       78
          Call waiting tone                         79
          Pay tone                                  80
          Positive indication tone                  81
          Negative indication tone                  82
          Warning tone                              83
          Intrusion tone                            84
          Calling card service tone                 85
          Payphone recognition tone                 86
          CPE alerting signal (CAS)                 87
          Off-hook warning tone                     88
          Ring                                      89

                   Table 4: E.182 line events

3.13 Extended Line Events

   Table 5 summarizes country-specific events and tones that can appear
   on a subscriber line.

3.14 Trunk Events

   Table 6 summarizes the events and tones that can appear on a trunk.
   Note that trunks can also carry line events (Section 3.12), as MF
   signaling does not include backward signals [15].

      ABCD transitional: 4-bit signaling used by digital trunks. For
           N-state signaling, the first N values are used.

       Event                            encoding (decimal)
       ___________________________________________________
       Acceptance tone                                  96
       Confirmation tone                                97
       Dial tone, recall                                98
       End of three party service tone                  99
       Facilities tone                                 100
       Line lockout tone                               101
       Number unobtainable tone                        102
       Offering tone                                   103
       Permanent signal tone                           104
       Preemption tone                                 105
       Queue tone                                      106
       Refusal tone                                    107
       Route tone                                      108
       Valid tone                                      109
       Waiting tone                                    110
       Warning tone (end of period)                    111
       Warning Tone (PIP tone)                         112

            Table 5: Country-specific Line events

           The T1 ESF (extended super frame format) allows 2, 4, and 16
           state signaling bit options. These signaling bits are named
           A, B, C, and D.  Signaling information is sent as robbed bits
           in frames 6, 12, 18, and 24 when using ESF T1 framing. A D4
           superframe only transmits 4-state signaling with A and B
           bits. On the CEPT E1 frame, all signaling is carried in
           timeslot 16, and two channels of 16-state (ABCD) signaling
           are sent per frame.

           Since this information is a state rather than a changing
           signal, implementations SHOULD use the following triple-
           redundancy mechanism, similar to the one specified in ITU-T
           Rec. I.366.2 [16], Annex L. At the time of a transition, the
           same ABCD information is sent 3 times at an interval of 5 ms.
           If another transition occurs during this time, then this
           continues. After a period of no change, the ABCD information
           is sent every 5 seconds.

      Wink: A brief transition, typically 120-290 ms, from on-hook
           (unseized) to off-hook (seized) and back to on-hook, used by
           the incoming exchange to signal that the call address
           signaling can proceed.

      Incoming seizure: Incoming indication of call attempt (off-hook).

       Event                           encoding (decimal)
       __________________________________________________
       MF 0... 9                                128...137
       MF K0 or KP (start-of-pulsing)                 138
       MF K1                                          139
       MF K2                                          140
       MF S0 to ST (end-of-pulsing)                   141
       MF S1... S3                              142...143
       ABCD signaling (see above)               144...159
       Wink                                           160
       Wink off                                       161
       Incoming seizure                               162
       Seizure                                        163
       Unseize circuit                                164
       Continuity test                                165
       Default continuity tone                        166
       Continuity tone (single tone)                  167
       Continuity test send                           168
       Continuity verified                            170
       Loopback                                       171
       Old milliwatt tone (1000 Hz)                   172
       New milliwatt tone (1004 Hz)                   173

                     Table 6: Trunk events
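
   The event codes in Tables 1 and 3 through 6 fall into disjoint
   numeric ranges, so a receiver can classify an event by range before
   looking it up. The Python sketch below illustrates this; the names
   EVENT_GROUPS and event_group are hypothetical. Note that the ranges
   are coarse: a code inside a range is not necessarily listed in the
   corresponding table.

```python
# (first code, last code, group), taken from the tables in Section 3.
EVENT_GROUPS = [
    (0, 16, "DTMF"),
    (32, 49, "data modem and fax"),
    (64, 89, "line"),
    (96, 112, "extended line"),
    (128, 173, "trunk"),
]

def event_group(code):
    """Return the group whose range contains code, or None if none does."""
    for first, last, group in EVENT_GROUPS:
        if first <= code <= last:
            return group
    return None
```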

      Seizure: Seizure by answering exchange, in response to outgoing
           seizure.

      Unseize circuit: Transition of circuit from off-hook to on-hook at
           the end of a call.

      Wink off: A brief transition, typically 100-350 ms, from off-hook
           (seized) to on-hook (unseized) and back to off-hook (seized).
           Used in operator services trunks.

      Continuity tone send: A tone of 2010 Hz.

      Continuity tone detect: A tone of 2010 Hz.

      Continuity test send: A tone of 1780 Hz is sent by the calling
           exchange. If received by the called exchange, it returns a
           "continuity verified" tone.

      Continuity verified: A tone of 2010 Hz. This is a response tone,
           used in dual-tone procedures.
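
   The triple-redundancy mechanism recommended earlier in this section
   for ABCD signaling can be sketched as a transmission schedule. The
   Python code below is an illustration under stated assumptions, not a
   normative algorithm: the three copies are taken to be sent at t,
   t + 5 ms and t + 10 ms; a new transition is assumed to cut the
   current burst short and start its own; and the 5-second refresh is
   assumed to be counted from the last copy of the burst. The function
   name abcd_send_times is hypothetical.

```python
def abcd_send_times(transitions, until_ms, copies=3, gap_ms=5,
                    refresh_ms=5000):
    """Return (time_ms, state) pairs at which ABCD events are sent.

    transitions is a list of (time_ms, state) pairs sorted by time.
    """
    sends = []
    for i, (start, state) in enumerate(transitions):
        # The state lasts until the next transition or the end of the run.
        if i + 1 < len(transitions):
            end = transitions[i + 1][0]
        else:
            end = until_ms
        if start >= end:
            continue
        # Burst of copies, cut short if a new transition intervenes.
        burst = [start + k * gap_ms for k in range(copies)]
        burst = [t for t in burst if t < end]
        sends.extend((t, state) for t in burst)
        # Periodic refresh while the state does not change.
        t = burst[-1] + refresh_ms
        while t < end:
            sends.append((t, state))
            t += refresh_ms
    return sends
```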







4 RTP Payload Format for Telephony Tones

4.1 Introduction

   As an alternative to describing tones and events by name, as
   described in Section 3, it is sometimes preferable to describe them
   by their waveform properties. In particular, recognition is faster
   than for named events, since it does not depend on recognizing
   durations or pauses.

   There is no single international standard for telephone tones such as
   dial tone, ringing (ringback), busy, congestion ("fast-busy"),
   special announcement tones or some of the other special tones, such
   as payphone recognition, call waiting or record tone. However, across
   all countries, these tones share a number of characteristics [17]:

      o  Telephony tones consist of either a single tone, the addition
         of two or three tones or the modulation of two tones. (Almost
         all tones use two frequencies; only the Hungarian "special dial
         tone" has three.) Tones that are mixed have the same amplitude
         and do not decay.

      o  Tones for telephony events range from 25 Hz (ringing tone in
         Angola) to 1800 Hz; the CED tone, at 2100 Hz, is the highest
         tone in use. The telephone frequency range is limited to
         3,400 Hz. (The piano has a range from 27.5 to 4186 Hz.)

      o  Modulation frequencies range from 15 Hz (ANSam tone) to 480 Hz
         (Jamaica). Non-integer frequencies are used only for
         frequencies of 16 2/3 and 33 1/3 Hz. (These fractional
         frequencies appear to be derived from older AC power grid
         frequencies.)

      o  Tones that are not continuous have durations of less than four
         seconds.

      o  ITU Recommendation E.180 [18] notes that different telephone
         companies require a tone accuracy of between 0.5 and 1.5%.  The
         Recommendation suggests a frequency tolerance of 1%.
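
   As a small worked example of the tolerance figure in the last item,
   a detector might accept a measured frequency as in the following
   Python sketch. The function name within_tolerance is hypothetical,
   and the default of 1% is the value suggested by the Recommendation.

```python
def within_tolerance(measured_hz, nominal_hz, tolerance=0.01):
    """True if measured_hz lies within the relative tolerance of nominal_hz."""
    return abs(measured_hz - nominal_hz) <= tolerance * nominal_hz
```

   With a 1% tolerance, a nominal 425 Hz tone is accepted between
   420.75 Hz and 429.25 Hz.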

4.2 Examples of Common Telephone Tone Signals

   As an aid to the implementor, Table 7 summarizes some common tones.
   The rows labeled "ITU ..." refer to the general recommendation of
   Recommendation E.180 [18]. Note that there are no specific guidelines
   for these tones. In the table, the symbol "+" indicates addition of
   the tones, without modulation, while "*" indicates amplitude
   modulation. The meaning of some of the tones is described in Section
   3.12 or Section 3.11 (for V.21).

     Tone name             frequency  on period  off period
     ______________________________________________________
     CNG                        1100        0.5         3.0
     V.25 CT                    1300        0.5         2.0
     CED                        2100        3.3          --
     ANS                        2100        3.3          --
     ANSam                   2100*15        3.3          --
     V.21 "0" bit, ch. 1        1180    0.00333
     V.21 "1" bit, ch. 1         980    0.00333
     V.21 "0" bit, ch. 2        1850    0.00333
     V.21 "1" bit, ch. 2        1650    0.00333
     ITU dial tone               425         --          --
     U.S. dial tone          350+440         --          --
     ______________________________________________________
     ITU ringing tone            425  0.67--1.5        3--5
     U.S. ringing tone       440+480        2.0         4.0
     ITU busy tone               425
     U.S. busy tone          480+620        0.5         0.5
     ______________________________________________________
     ITU congestion tone         425
     U.S. congestion tone    480+620       0.25        0.25

             Table 7: Examples of telephony tones
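
   To make the "+" notation and the on/off periods of Table 7 concrete,
   the Python sketch below generates samples for an added tone and
   evaluates a cadence. It is an illustration only: the function names,
   the peak level of 0.5 and the use of floating-point samples are
   assumptions of this example, and amplitude modulation ("*") is not
   covered.

```python
import math

def tone_samples(frequencies, seconds, rate=8000, peak=0.5):
    """Sum equal-amplitude sinusoids; samples stay within [-peak, peak]."""
    count = int(seconds * rate)
    scale = peak / len(frequencies)
    return [scale * sum(math.sin(2 * math.pi * f * i / rate)
                        for f in frequencies)
            for i in range(count)]

def is_on(t, on_period, off_period):
    """True if a cadenced tone is sounding t seconds after it started."""
    return (t % (on_period + off_period)) < on_period

# U.S. ringing tone from Table 7: 440+480 Hz, 2.0 s on, 4.0 s off.
ring = tone_samples((440, 480), 2.0)
```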

4.3 Use of RTP Header Fields

      Timestamp: The RTP timestamp reflects the measurement point for
           the current packet. The event duration described in Section
           3.5 extends forwards from that time.

4.4 Payload Format

   Based on the characteristics described above, this document defines
   an RTP payload format called "tone" that can represent tones
   consisting of one or more frequencies. (The corresponding MIME type
   is "audio/tone".) The default timestamp rate is 8,000 Hz, but other
   rates may be defined. Note that the timestamp rate does not affect
   the interpretation of the frequency, just the durations.

   In accordance with current practice, this payload format does not
   have a static payload type number, but uses an RTP payload type
   number established dynamically and out-of-band.

   The payload format is shown in Fig. 3.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |    modulation   |T|  volume   |          duration             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |R R R R|       frequency       |R R R R|       frequency       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |R R R R|       frequency       |R R R R|       frequency       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    ......

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |R R R R|       frequency       |R R R R|      frequency        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 3: Payload format for tones

   The payload contains the following fields:

      modulation: The modulation frequency, in Hz. The field is a 9-bit
           unsigned integer, allowing modulation frequencies up to 511
           Hz. If there is no modulation, this field has a value of
           zero.

      T: If the "T" bit is set (one), the modulation frequency is to be
           divided by three. Otherwise, the modulation frequency is
           taken as is.

           This bit allows frequencies accurate to 1/3 Hz, since
           modulation frequencies such as 16 2/3 Hz are in practical
           use.

      volume: The power level of the tone, expressed in dBm0 after
           dropping the sign, with a range from 0 to -63 dBm0. (Note: A
           preferred level range for digital tone generators is -8 dBm0
           to -3 dBm0.)

      duration: The duration of the tone, measured in timestamp units.
           The tone begins at the instant identified by the RTP
           timestamp and lasts for the duration value.

           The definition of duration corresponds to that for sample-
           based codecs, where the timestamp represents the sampling
           point for the first sample.

      frequency: The frequencies of the tones to be added, measured in
           Hz and represented as a 12-bit unsigned integer. The field
           size is sufficient to represent frequencies up to 4095 Hz,
           which exceeds the range of telephone systems. A value of zero
           indicates silence. A single tone can contain any number of
           frequencies.

      R: This field is reserved for future use. The sender MUST set it
           to zero; the receiver MUST ignore it.
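
   The field definitions above can be illustrated with the following
   Python sketch, which packs a tone payload as laid out in Fig. 3. The
   function names are hypothetical. The sketch assumes that the
   modulation frequency is given as an integer or as a
   fractions.Fraction, and it emits 16 bits per frequency without
   considering any padding for an odd number of frequencies.

```python
import struct
from fractions import Fraction

def encode_modulation(freq_hz):
    """Return (modulation, T) for a modulation frequency in Hz."""
    freq = Fraction(freq_hz)
    if freq.denominator == 1:
        modulation, t_bit = int(freq), 0       # taken as is
    elif freq.denominator == 3:
        modulation, t_bit = int(freq * 3), 1   # receiver divides by three
    else:
        raise ValueError("only multiples of 1/3 Hz can be represented")
    if not 0 <= modulation <= 511:
        raise ValueError("modulation field is 9 bits wide")
    return modulation, t_bit

def pack_tone(modulation_hz, volume, duration, frequencies):
    """Pack one 32-bit header word followed by 16 bits per frequency."""
    modulation, t_bit = encode_modulation(modulation_hz)
    if not (0 <= volume <= 63 and 0 <= duration <= 0xFFFF):
        raise ValueError("field out of range")
    if any(not 0 <= f <= 4095 for f in frequencies):
        raise ValueError("frequency field is 12 bits wide")
    header = (modulation << 23) | (t_bit << 22) | (volume << 16) | duration
    # The four reserved bits above each 12-bit frequency stay zero.
    return struct.pack(f"!I{len(frequencies)}H", header, *frequencies)
```

   For example, a modulation frequency of 16 2/3 Hz, written as
   Fraction(50, 3), is carried as modulation=50 with the T bit set.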

4.5 Reliability

   This payload format uses the reliability mechanism described in
   Section 3.7.

5 Combining Tones and Named Events

   The payload formats in Sections 3 and 4 can be combined into a single
   payload using the method specified in RFC 2198. Fig. 4 shows an
   example. In that example, the RTP packet combines two "tone" and one
   "telephone-event" payloads.  The payload types are chosen arbitrarily
   as 97 and 98, respectively, with a sample rate of 8000 Hz. Here, the
   redundancy format has the dynamic payload type 96.

   The packet represents a snapshot of U.S. ringing tone, 1.5 seconds
   (12,000 timestamp units) into the second "on" part of the 2.0/4.0
   second cadence, i.e., a total of 7.5 seconds (60,000 timestamp units)
   into the ring cycle. The 440 + 480 Hz tone of this second cadence
   started at RTP timestamp 48,000. Four seconds of silence preceded it,
   but since RFC 2198 only has a fourteen-bit offset, only 2.05 seconds
   (16383 timestamp units) can be represented. Even though the tone
   sequence is not complete, the sender was able to determine that this
   is indeed ringback, and thus includes the corresponding named event.
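
   The arithmetic in this example, and the RFC 2198 block headers that
   appear in Fig. 4, can be reproduced with the Python sketch below. It
   is an illustration only; the names are hypothetical, and the header
   layout (F bit, 7-bit block PT, 14-bit timestamp offset, 10-bit block
   length) is taken from the figure.

```python
import struct

RATE = 8000                  # timestamp units per second
MAX_OFFSET = (1 << 14) - 1   # largest 14-bit timestamp offset (16383)

def pack_block_header(block_pt, offset, length):
    """Pack one 32-bit redundant block header with the F bit set."""
    if not (0 <= block_pt <= 127 and 0 <= offset <= MAX_OFFSET
            and 0 <= length <= 1023):
        raise ValueError("field out of range")
    word = (1 << 31) | (block_pt << 24) | (offset << 10) | length
    return struct.pack("!I", word)

on_start = int((2.0 + 4.0) * RATE)           # second "on" period begins
elapsed = int(1.5 * RATE)                    # time into that period
silence = min(int(4.0 * RATE), MAX_OFFSET)   # 32000 units, clipped
event_duration = silence + elapsed           # duration of the named event

headers = (pack_block_header(98, silence, 4)    # named event block
           + pack_block_header(97, silence, 8)  # silent tone block
           + bytes([97]))                       # primary block: F=0, PT=97
```

   The computed values are 48000 for the start of the second "on"
   period, 60000 for the snapshot time, 16383 for the clipped silence
   and 28383 for the event duration, matching the text and Fig. 4.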

6 MIME Registration

6.1 audio/telephone-event

      MIME media type name: audio

      MIME subtype name: telephone-event

      Required parameters: none.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | V |P|X|  CC   |M|     PT      |       sequence number         |
    | 2 |0|0|   0   |0|     96      |              31               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                           timestamp                           |
    |                             48000                             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |           synchronization source (SSRC) identifier            |
    |                            0x5234a8                           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |F|   block PT  |     timestamp offset      |   block length    |
    |1|     98      |            16383          |         4         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |F|   block PT  |     timestamp offset      |   block length    |
    |1|     97      |            16383          |         8         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |F|   Block PT  |
    |0|     97      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  event=ring   |0|0| volume=0  |     duration=28383            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | modulation=0    |0| volume=63 |     duration=16383            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |0 0 0 0|     frequency=0       |0 0 0 0|    frequency=0        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | modulation=0    |0| volume=5  |     duration=12000            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |0 0 0 0|     frequency=440     |0 0 0 0|    frequency=480      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

       Figure 4: Combining tones and events in a single RTP packet
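
   As an illustration only, the following Python sketch splits a payload
   laid out as in Figure 4 into its blocks. The header layout (F bit,
   7-bit block PT, 14-bit timestamp offset, 10-bit block length, and a
   final one-octet header with F=0) is read off the figure. The names
   parse_redundancy_headers and redundant_header and the example octets
   are assumptions made for this sketch and are not defined by this
   specification; the event code 0 is a placeholder, not the registered
   code for "ring". No bounds checking of truncated input is attempted.

```python
import struct


def parse_redundancy_headers(payload):
    """Split a payload laid out as in Figure 4 into a list of
    (block PT, timestamp offset, block data) tuples."""
    headers = []
    pos = 0
    while True:
        first = payload[pos]
        block_pt = first & 0x7F
        if not first & 0x80:
            # F=0: final header, one octet, no offset or length field.
            headers.append((block_pt, 0, None))
            pos += 1
            break
        # F=1: 32-bit header = F(1) | block PT(7) | offset(14) | length(10)
        (word,) = struct.unpack_from("!I", payload, pos)
        headers.append((block_pt, (word >> 10) & 0x3FFF, word & 0x3FF))
        pos += 4

    blocks = []
    for block_pt, ts_offset, length in headers:
        # The last block has no length field; it runs to the end.
        end = len(payload) if length is None else pos + length
        blocks.append((block_pt, ts_offset, payload[pos:end]))
        pos = end
    return blocks


def redundant_header(block_pt, ts_offset, length):
    """Pack one 32-bit block header with F=1 (hypothetical helper)."""
    word = (1 << 31) | (block_pt << 24) | (ts_offset << 10) | length
    return struct.pack("!I", word)


# Example octets mirroring the field values shown in Figure 4.
EVENT = struct.pack("!BBH", 0, 0, 28383)          # event code 0: placeholder
TONE_OLD = struct.pack("!HHHH", 63, 16383, 0, 0)  # modulation=0, T=0, volume=63
TONE_NEW = struct.pack("!HHHH", 5, 12000, 440, 480)
EXAMPLE = (redundant_header(98, 16383, 4) + redundant_header(97, 16383, 8)
           + bytes([97]) + EVENT + TONE_OLD + TONE_NEW)

for block_pt, ts_offset, data in parse_redundancy_headers(EXAMPLE):
    print(block_pt, ts_offset, len(data))
```

   The two redundant blocks carry their own timestamp offsets, while
   the final (primary) block uses the timestamp of the RTP header, so
   the sketch reports an offset of zero for it.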

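6 MIME Registration

6.1 audio/telephone-event

      MIME media type name: audio

      MIME subtype name: telephone-event

      Required parameters: none

      Optional parameters: The "events" parameter lists the events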
           supported by the implementation. Events are listed as one or
           more comma-separated elements. Each element can either be a
           single integer or two integers separated by a hyphen.  No
           white space is allowed in the argument. The integers
           designate the event numbers supported by the implementation.
           All implementations MUST support events 0 through 15, so that
           the parameter can be omitted if the implementation only
           supports these events.

           The "rate" parameter describes the sampling rate, in Hertz.
           The number is written as a floating point number or as an
           integer. If omitted, the default value is 8000 Hz.
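
   The syntax of the two optional parameters can be sketched in Python
   as follows. This is a minimal illustration, not a normative parser:
   the function names parse_events and parse_rate are hypothetical,
   rejecting a descending range such as "15-0" is an assumption (the
   text above does not address it), and the upper bound of 255 is
   inferred from the 8-bit event field shown in Figure 4.

```python
import re

DEFAULT_EVENTS = "0-15"   # assumed when the "events" parameter is omitted
DEFAULT_RATE = 8000.0     # Hz, used when the "rate" parameter is omitted

# One element: a single integer, or two integers separated by a hyphen.
_ELEMENT = re.compile(r"([0-9]+)(?:-([0-9]+))?")


def parse_events(value=None):
    """Expand an "events" value such as "0-15,32,64-66" into a set."""
    if value is None:
        value = DEFAULT_EVENTS
    events = set()
    for element in value.split(","):
        # fullmatch rejects white space and empty elements.
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError("malformed events element: %r" % element)
        first = int(match.group(1))
        last = first if match.group(2) is None else int(match.group(2))
        # Assumptions: ranges ascend, and events fit the 8-bit field.
        if last < first or last > 255:
            raise ValueError("bad event range: %r" % element)
        events.update(range(first, last + 1))
    return events


def parse_rate(value=None):
    """Return the sampling rate in Hz; integer or floating point text."""
    return DEFAULT_RATE if value is None else float(value)
```

   For instance, parse_events("0-15,32,64-66") yields events 0 through
   15 together with 32, 64, 65 and 66, and parse_rate() with no argument
   yields the default of 8000 Hz.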

      Encoding considerations: This type is only defined for transfer
           via RTP [1].

      Security considerations: See the "Security Considerations"
           (Section 7) in this document.

      Interoperability considerations: none

      Published specification: This document.

      Applications which use this media: The telephone-event audio
           subtype supports the transport of events occurring in
           telephone systems over the Internet.

      Additional information:

           1. Magic number(s): N/A

           2. File extension(s): N/A

           3. Macintosh file type code: N/A

6.2 audio/tone

      MIME media type name: audio

      MIME subtype name: tone

      Required parameters: none

      Optional parameters: The "rate" parameter describes the sampling
           rate, in Hertz. The number is written as a floating point
           number or as an integer. If omitted, the default value is
           8000 Hz.

      Encoding considerations: This type is only defined for transfer
           via RTP [1].

      Security considerations: See the "Security Considerations"
           (Section 7) in this document.

      Interoperability considerations: none

      Published specification: This document.

      Applications which use this media: The tone audio subtype supports
           the transport of pure composite tones, for example those
           commonly used in the current telephone system to signal call
           progress.

      Additional information:

           1. Magic number(s): N/A

           2. File extension(s): N/A

           3. Macintosh file type code: N/A

7 Security Considerations

   RTP packets using the payload format defined in this specification
   are subject to the security considerations discussed in the RTP
   specification (RFC 1889 [1]), and any appropriate RTP profile (for
   example RFC 1890 [19]). This implies that confidentiality of the
   media streams is achieved by encryption. Because the data compression
   used with this payload format is applied end-to-end, encryption may
   be performed after compression so there is no conflict between the
   two operations.

   This payload type does not exhibit any significant non-uniformity in
   receiver-side computational complexity for packet processing, and
   thus is unlikely to pose a denial-of-service threat.

   In older networks employing in-band signaling and lacking appropriate
   tone filters, the tones in Section 3.14 may be used to commit toll
   fraud.

   Additional security considerations are described in RFC 2198 [6].

8 IANA Considerations

   This document defines two new RTP payload formats, named telephone-
   event and tone, and associated Internet media (MIME) types,
   audio/telephone-event and audio/tone.

   Within the audio/telephone-event type, additional events MUST be
   registered with IANA. Registrations are subject to approval by the
   current chair of the IETF audio/video transport working group, or by
   an expert designated by the transport area director if the AVT group
   has closed.

   The meaning of new events MUST be documented either as an RFC or an
   equivalent standards document produced by another standardization
   body, such as ITU-T.

9 Acknowledgements

   The suggestions of the Megaco working group are gratefully
   acknowledged.  Detailed advice and comments were provided by Fred
   Burg, Steve Casner, Fatih Erdin, Bill Foster, Mike Fox, Gunnar
   Hellstrom, Terry Lyons, Steve Magnell, Vern Paxson and Colin Perkins.

10 Authors' Addresses

   Henning Schulzrinne
   Dept. of Computer Science
   Columbia University
   1214 Amsterdam Avenue
   New York, NY 10027
   USA

   EMail:  schulzrinne@cs.columbia.edu


   Scott Petrack
   MetaTel
   45 Rumford Avenue
   Waltham, MA 02453
   USA

   EMail:  scott.petrack@metatel.com

11 Bibliography

   [1]  Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson,
        "RTP:  A Transport Protocol for Real-Time Applications", RFC
        1889, January 1996.

   [2]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

   [3]  International Telecommunication Union, "Procedures for starting
        sessions of data transmission over the public switched telephone
        network," Recommendation V.8, Telecommunication Standardization
        Sector of ITU, Geneva, Switzerland, Feb. 1998.

   [4]  R. Kocen and T. Hatala, "Voice over frame relay implementation
        agreement", Implementation Agreement FRF.11, Frame Relay Forum,
        Foster City, California, Jan. 1997.

   [5]  International Telecommunication Union, "Multifrequency push-
        button signal reception," Recommendation Q.24, Telecommunication
        Standardization Sector of ITU, Geneva, Switzerland, 1988.

   [6]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V., Handley, M.,
        Bolot, J., Vega-Garcia, A. and S. Fosse-Parisis, "RTP Payload
        for Redundant Audio Data", RFC 2198, September 1997.

   [7]  Handley, M. and V. Jacobson, "SDP: Session Description Protocol",
        RFC 2327, April 1998.

   [8]  International Telecommunication Union, "Automatic answering
        equipment and general procedures for automatic calling equipment
        on the general switched telephone network including procedures
        for disabling of echo control devices for both manually and
        automatically established calls," Recommendation V.25,
        Telecommunication Standardization Sector of ITU, Geneva,
        Switzerland, Oct. 1996.

   [9]  International Telecommunication Union, "Procedures for document
        facsimile transmission in the general switched telephone
        network," Recommendation T.30, Telecommunication Standardization
        Sector of ITU, Geneva, Switzerland, July 1996.

   [10] International Telecommunication Union, "Echo cancellers,"
        Recommendation G.165, Telecommunication Standardization Sector
        of ITU, Geneva, Switzerland, Mar. 1993.

   [11] International Telecommunication Union, "A modem operating at
        data signaling rates of up to 33 600 bit/s for use on the
        general switched telephone network and on leased point-to-point
        2-wire telephone-type circuits," Recommendation V.34,
        Telecommunication Standardization Sector of ITU, Geneva,
        Switzerland, Feb. 1998.

   [12] International Telecommunication Union, "Procedures for the
        identification and selection of common modes of operation
        between data circuit-terminating equipments (DCEs) and between
        data terminal equipments (DTEs) over the public switched
        telephone network and on leased point-to-point telephone-type
        circuits," Recommendation V.8bis, Telecommunication
        Standardization Sector of ITU, Geneva, Switzerland, Sept. 1998.

   [13] International Telecommunication Union, "Application of tones and
        recorded announcements in telephone services," Recommendation
        E.182, Telecommunication Standardization Sector of ITU, Geneva,
        Switzerland, Mar. 1998.

   [14] Bellcore, "Functional criteria for digital loop carrier
        systems," Technical Requirement TR-NWT-000057, Telcordia
        (formerly Bellcore), Morristown, New Jersey, Jan. 1993.

   [15] J. G. van Bosse, Signaling in Telecommunications Networks.
        Telecommunications and Signal Processing, New York, New York:
        Wiley, 1998.

   [16] International Telecommunication Union, "AAL type 2 service
        specific convergence sublayer for trunking," Recommendation
        I.366.2, Telecommunication Standardization Sector of ITU,
        Geneva, Switzerland, Feb. 1999.

   [17] International Telecommunication Union, "Various tones used in
        national networks," Recommendation Supplement 2 to
        Recommendation E.180, Telecommunication Standardization Sector
        of ITU, Geneva, Switzerland, Jan. 1994.

   [18] International Telecommunication Union, "Technical
        characteristics of tones for telephone service," Recommendation
        Supplement 2 to Recommendation E.180, Telecommunication
        Standardization Sector of ITU, Geneva, Switzerland, Jan. 1994.

   [19] Schulzrinne, H., "RTP Profile for Audio and Video Conferences
        with Minimal Control", RFC 1890, January 1996.

12 Full Copyright Statement

   Copyright (C) The Internet Society (2000).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.