Network Working Group D. Thaler
Request for Comments: 3913 Microsoft
Category: Informational September 2004
Border Gateway Multicast Protocol (BGMP):
Protocol Specification
Status of this Memo
This memo provides information for the Internet community. It does
not specify an Internet standard of any kind. Distribution of this
memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (2004).
Abstract
This document describes the Border Gateway Multicast Protocol (BGMP),
a protocol for inter-domain multicast routing. BGMP builds shared
trees for active multicast groups, and optionally allows receiver
domains to build source-specific, inter-domain, distribution branches
where needed. BGMP natively supports "source-specific multicast"
(SSM). To also support "any-source multicast" (ASM), BGMP requires
that each multicast group be associated with a single root (in BGMP
it is referred to as the root domain). It requires that different
ranges of the multicast address space are associated (e.g., with
Unicast-Prefix-Based Multicast addressing) with different domains.
Each of these domains then becomes the root of the shared domain-
trees for all groups in its range. Multicast participants will
generally receive better multicast service if the session initiator's
address allocator selects addresses from its own domain's part of the
space, thereby causing the root domain to be local to at least one of
the session participants.
Thaler Informational [Page 1]
RFC 3913 BGMP: Protocol Specification September 2004
Table of Contents
1. Purpose
2. Terminology
3. Protocol Overview
   3.1. Design Rationale
4. Protocol Details
   4.1. Interaction with the EGP
   4.2. Multicast Data Packet Processing
   4.3. BGMP processing of Join and Prune messages and notifications
        4.3.1. Receiving Joins
        4.3.2. Receiving Prune Notifications
        4.3.3. Receiving Route Change Notifications
        4.3.4. Receiving (S,G) Poison-Reverse messages
   4.4. Interaction with M-IGP components
        4.4.1. Interaction with DVMRP and PIM-DM
        4.4.2. Interaction with PIM-SM
        4.4.3. Interaction with CBT
        4.4.4. Interaction with MOSPF
   4.5. Operation over Multi-access Networks
   4.6. Interaction between (S,G) state and G-routes
5. Message Formats
   5.1. Message Header Format
   5.2. OPEN Message Format
   5.3. UPDATE Message Format
   5.4. Encoding examples
   5.5. KEEPALIVE Message Format
   5.6. NOTIFICATION Message Format
6. BGMP Error Handling
   6.1. Message Header error handling
   6.2. OPEN message error handling
   6.3. UPDATE message error handling
   6.4. NOTIFICATION message error handling
   6.5. Hold Timer Expired error handling
   6.6. Finite State Machine error handling
   6.7. Cease
   6.8. Connection collision detection
7. BGMP Version Negotiation
   7.1. BGMP Capability Negotiation
8. BGMP Finite State machine
9. Security Considerations
10. Acknowledgements
11. References
    11.1. Normative References
    11.2. Informative References
Author's Address
Full Copyright Statement
1. Purpose
It has been suggested that inter-domain "any-source" multicast is
better supported with a rendezvous mechanism whereby members receive
sources' data packets without any sort of global broadcast (e.g.,
MSDP broadcasts source information, PIM-DM [PIMDM] and DVMRP [DVMRP]
broadcast initial data packets, and MOSPF [MOSPF] broadcasts
membership information). PIM-SM [PIMSM] and CBT [CBT] use a shared
group-tree, to which all members join and thereby hear from all
sources (and to which non-members do not join and thereby hear from
no sources).
This document describes BGMP, a protocol for inter-domain multicast
routing. BGMP natively supports "source-specific multicast" (SSM).
To also support "any-source multicast" (ASM), BGMP builds shared
trees for active multicast groups, and allows domains to build
source-specific, inter-domain, distribution branches where needed.
Building upon concepts from PIM-SM and CBT, BGMP requires that each
global multicast group be associated with a single root. However, in
BGMP, the root is an entire exchange or domain, rather than a single
router.
For non-source-specific groups, BGMP assumes that ranges of the
multicast address space have been associated (e.g., with Unicast-
Prefix-Based Multicast [V4PREFIX,V6PREFIX] addressing) with selected
domains. Each such domain then becomes the root of the shared
domain-trees for all groups in its range. An address allocator will
generally achieve better distribution trees if it takes its multicast
addresses from its own domain's part of the space, thereby causing
the root domain to be local.
BGMP uses TCP as its transport protocol. This eliminates the need to
implement message fragmentation, retransmission, acknowledgement, and
sequencing. BGMP uses TCP port 264 for establishing its connections.
This port is distinct from BGP's port to provide protocol
independence, and to facilitate distinguishing between protocol
packets (e.g., by packet classifiers, diagnostic utilities, etc.).
Two BGMP peers form a TCP connection between one another, and
exchange messages to open and confirm the connection parameters.
They then send incremental Join/Prune Updates as group memberships
change. BGMP does not require periodic refresh of individual
entries. KeepAlive messages are sent periodically to ensure the
liveness of the connection. Notification messages are sent in
response to errors or special conditions. If a connection encounters
an error condition, a notification message is sent and the connection
is closed if the error is a fatal one.
2. Terminology
This document uses the following technical terms:
Domain:
A set of one or more contiguous links and zero or more routers
surrounded by one or more multicast border routers. Note that
this loose definition of domain also applies to an external link
between two domains, as well as an exchange.
Root Domain:
When constructing a shared tree of domains for some group, one
domain will be the "root" of the tree. The root domain receives
data from each sender to the group, and functions as a rendezvous
domain toward which member domains can send inter-domain joins,
and to which sender domains can send data.
Multicast RIB:
The Routing Information Base, or routing table, used to calculate
the "next-hop" towards a particular address for multicast traffic.
Multicast IGP (M-IGP):
A generic term for any multicast routing protocol used for tree
construction within a domain. Typical examples of M-IGPs are:
PIM-SM, PIM-DM, DVMRP, MOSPF, and CBT.
EGP:
A generic term for the interdomain unicast routing protocol in use.
Typically, this will be some version of BGP which can support a
Multicast RIB, such as MBGP [MBGP], containing both unicast and
multicast address prefixes.
Component:
The portion of a border router associated with (and logically
inside) a particular domain that runs the multicast IGP (M-IGP)
for that domain, if any. Each border router thus has zero or more
components inside routing domains. In addition, each border
router with external links that do not fall inside any routing
domain will have an inter-domain component that runs BGMP.
External peer:
A border router in another multicast AS (autonomous system, as
used in BGP), to which a BGMP TCP-connection is open. If BGP is
being used as the EGP, a separate "eBGP" TCP-connection will also
be open to the same peer.
Internal peer:
Another border router of the same multicast AS. If BGP is being
used as the EGP, the border router either speaks iBGP ("internal"
BGP) directly to internal peers in a full mesh, or indirectly
through a route reflector [REFLECT].
Next-hop peer:
The next-hop peer towards a given IP address is the next EGP
router on the path to the given address, according to multicast
RIB routes in the EGP's routing table (e.g., in MBGP, routes whose
Subsequent Address Family Identifier field indicates that the
route is valid for multicast traffic).
Target:
Either an EGP peer, or an M-IGP component.
Tree State Table:
This is a table of (S-prefix,G) and (*,G-prefix) entries that have
been explicitly joined by a set of targets. Each entry has, in
addition to the source and group addresses and masks, a list of
targets that have explicitly requested data (on behalf of directly
connected hosts or downstream routers). (S,G) entries also have
an "SPT" bit.
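As a rough illustration of the Tree State Table defined above, the
following Python sketch models entries keyed by a source prefix (or a
wildcard) and a group prefix. The names TreeStateTable, Entry, join,
and lookup are hypothetical and are not defined by this document.
Targets are represented as plain strings, and only exact-match lookup
is shown.

```python
from dataclasses import dataclass, field
from ipaddress import ip_network
from typing import Dict, Optional, Set, Tuple

Key = Tuple[Optional[str], str]


def _norm(prefix: Optional[str]) -> Optional[str]:
    """Canonicalize an address or prefix string; None stands for the '*' wildcard."""
    return None if prefix is None else str(ip_network(prefix, strict=False))


@dataclass
class Entry:
    source: Optional[str]   # S-prefix, or None for a (*,G-prefix) entry
    group: str              # G-prefix
    targets: Set[str] = field(default_factory=set)  # EGP peers or M-IGP components
    spt: bool = False       # "SPT" bit; only meaningful for (S,G) entries


class TreeStateTable:
    def __init__(self) -> None:
        self._entries: Dict[Key, Entry] = {}

    def join(self, source: Optional[str], group: str, target: str) -> Entry:
        """Record that `target` explicitly joined the entry, creating it if needed."""
        key = (_norm(source), _norm(group))
        entry = self._entries.setdefault(key, Entry(key[0], key[1]))
        entry.targets.add(target)
        return entry

    def lookup(self, source: Optional[str], group: str) -> Optional[Entry]:
        """Exact-match lookup only; longest-prefix matching is left out of this sketch."""
        return self._entries.get((_norm(source), _norm(group)))
```

A set is used for the target list so that a repeated Join from the
same target leaves the entry unchanged.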
The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY"
in this document are to be interpreted as described in [RFC2119].
3. Protocol Overview
BGMP maintains group-prefix state in response to messages from BGMP
peers and notifications from M-IGP components. Group-shared trees
are rooted at the domain advertising the group prefix covering those
groups. When a receiver joins a specific group address, the border
router towards the root domain generates a group-specific Join
message, which is then forwarded Border-Router-by-Border-Router
towards the root domain (see Figure 1). BGMP Join and Prune messages
are sent over TCP connections between BGMP peers, and BGMP protocol
state is refreshed by KEEPALIVE messages periodically sent over TCP.
BGMP routers build group-specific bidirectional forwarding state as
they process the BGMP Join messages. Bidirectional forwarding state
means that packets received from any target are forwarded to all
other targets in the target list without any RPF checks. No group-
specific state or traffic exists in parts of the network where there
are no members of that group.
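The bidirectional forwarding rule described above can be sketched in
a few lines of Python. The function name and the use of strings for
targets are assumptions made for illustration only.

```python
from typing import List


def forward_bidirectional(targets: List[str], arrived_from: str) -> List[str]:
    """Return the targets that receive a copy of the packet.

    Every target in the list except the one the packet arrived from
    gets a copy; no RPF check is applied.
    """
    return [t for t in targets if t != arrived_from]
```

For example, with a target list of three peers, a packet arriving
from any one of them is copied to the other two.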
BGMP routers optionally build source-specific unidirectional
forwarding state, only where needed, to be compatible with source-
specific trees (SPTs) used by some M-IGPs (e.g., DVMRP, PIM-DM, or
PIM-SM), or to construct trees for source-specific groups. A domain
that uses an SPT-based M-IGP may need to inject multicast packets
from external sources via different border routers (to be compatible
with the M-IGP RPF checks) which thus act as "surrogates". For
example, in the Transit_1 domain, data from Src_A arrives at BR12,
but must be injected by BR11. A surrogate router may create a
source-specific BGMP branch if no shared tree state exists. Note:
stub domains with a single border router, such as Rcvr_Stub_7 in
Figure 1, receive all multicast data packets through that router, to
which all RPF checks point. Therefore, stub domains never build
source-specific state.
          Root_Domain
            [BR91]--------------------------\
               |                             |
            [BR32]                        [BR41]
           Transit_3                     Transit_4
            [BR31]                   [BR42]   [BR43]
               |                        |        |
            [BR22]                   [BR52]   [BR53]
           Transit_2                     Transit_5
            [BR21]                        [BR51]
               |                             |
            [BR12]                        [BR61]
   Transit_1[BR11]----------[BR62]Stub_6
            [BR13]                      (Src_A)
               |                        (Rcvr_D)
      -------------------
      |                 |
   [BR71]            [BR81]
 Rcvr_Stub_7     Src_only_Stub_8
  (Rcvr_C)           (Src_B)
Figure 1: Example inter-domain topology. [BRxy] represents a BGMP
border router. Transit_X is a transit domain network. *_Stub_X is a
stub domain network.
Data packets are forwarded based on a combination of BGMP and M-IGP
rules. The router forwards to a set of targets according to a
matching (S,G) BGMP tree state entry if it exists. If not found, the
router checks for a matching (*,G) BGMP tree state entry. If neither
is found, then the packet is sent natively to the next-hop EGP peer
for G, according to the Multicast RIB (for example, in the case of a
non-member sender such as Src_B in Figure 1). If a matching entry
was found, the packet is forwarded to all other targets in the target
list. In this way BGMP trees forward data in a bidirectional manner.
If a target is an M-IGP component then forwarding is subject to the
rules of that M-IGP protocol.
3.1. Design Rationale
Several other protocols, or protocol proposals, build shared trees
within domains [PIMSM, CBT]. The design choices made for BGMP result
from our focus on Inter-Domain multicast in particular. The design
choices made by PIM-SM and CBT are better suited to the wide-area
intra-domain case. There are three major differences between BGMP
and other shared-tree protocols:
(1) Unidirectional vs. Bidirectional trees
Bidirectional trees (using bidirectional forwarding state as
described above) minimize third party dependence which is essential
in the inter-domain context. For example, in Figure 1, stub domains
7 and 8 would like to exchange multicast packets without being
dependent on the quality of connectivity of the root domain.
However, unidirectional shared trees (i.e., those using RPF checks)
have more aggressive loop prevention and share the same processing
rules as source-specific entries which are inherently unidirectional.
The lack of third party dependence concerns in the INTRA domain case
reduces the incentive to employ bidirectional trees. BGMP supports
bidirectional trees because it has to, and because it can without
excessive cost.
(2) Source-specific distribution trees/branches
In a departure from other shared tree protocols, source-specific BGMP
state is built ONLY where (a) it is needed to pull the multicast
traffic down to a BGMP router that has source-specific (S,G) state,
and (b) that router is NOT already on the shared tree (i.e., has no
(*,G) state), and (c) that router does not want to receive packets
via encapsulation from a router which is on the shared tree. BGMP
provides source-specific branches because most M-IGP protocols in use
today build source-specific trees. BGMP's source-specific branches
eliminate the unnecessary overhead of encapsulations for high data
rate sources from the shared tree's ingress router to the surrogate
injector (e.g., from BR12 to BR11 in Figure 1). Moreover, cases in
which shared paths are significantly longer than SPT paths will also
benefit.
However, except for source-specific group distribution trees, we do
not build source-specific inter-domain trees in general because (a)
inter-domain connectivity is generally less rich than intra-domain
connectivity, so shared distribution trees should have more
acceptable path length and traffic concentration properties in the
inter-domain context, than in the intra-domain case, and (b) by
having the shared tree state always take precedence over source-
specific tree state, we avoid ambiguities that can otherwise arise.
In summary, BGMP trees are, in a sense, a hybrid between PIM-SM and
CBT trees.
(3) Method of choosing root of group shared tree
The choice of a group's shared-tree-root has implications for
performance and policy. In the intra-domain case it is sometimes
assumed that all potential shared-tree roots (RPs/Cores) within the
domain are equally suited to be the root for a group that is
initiated within that domain. In the INTER-domain case, there is far
more opportunity for unacceptably poor locality, and a greater need
for administrative control of a group's shared-tree root. Therefore,
in the intra-domain case, other protocols sometimes treat all
candidate roots (RPs or Cores) as equivalent and emphasize load
sharing and stability to maximize performance. In the inter-domain
case, not all roots are equivalent, and we adopt an approach whereby
a group's root domain is not random but is subject to administrative
control.
4. Protocol Details
In this section, we describe the detailed protocol that border
routers perform. We assume that each border router conforms to the
component-based model described in [INTEROP], modulo one correction
to section 3.2 ("BGMP" Dispatcher), as follows:
The iif owner of a (*,G) entry is the component owning the next-hop
interface towards the nominal root of G, in the multicast RIB.
4.1. Interaction with the EGP
The fundamental requirements imposed by BGMP are that:
(1) For a given source-specific group and source, BGMP must be able
to look up the next-hop towards the source in the Multicast
RIB, and
(2) For a given non-source-specific group, BGMP will map the group
address to a nominal "root" address, and must be able to look
up the next-hop towards that address in the Multicast RIB.
BGMP determines the nominal "root" address as follows. If the
multicast address is a Unicast-Prefix-based Multicast address, then
the nominal root address is the embedded unicast prefix, padded with
a suffix of 0 bits to form a full address.
For example, if the IPv6 group address is
ff2e:0100:1234:5678:9abc:def0::123, then the unicast prefix is
1234:5678:9abc:def0/64, and the nominal root address would be
1234:5678:9abc:def0::. (This address is in fact the subnet router
anycast address [IPv6AA].)
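The derivation of the nominal root address can be sketched in Python
as follows. This is a minimal sketch that assumes the Unicast-Prefix-
based layout in which the 64-bit network prefix field occupies the
bits between the first 32 bits and the final 32-bit group ID. The
function name nominal_root is hypothetical, and the prefix length is
passed in explicitly rather than parsed from the address.

```python
import ipaddress


def nominal_root(group: str, prefix_len: int = 64) -> ipaddress.IPv6Address:
    """Return the embedded unicast prefix padded with 0 bits to a full address."""
    if not 0 <= prefix_len <= 64:
        raise ValueError("prefix_len must be between 0 and 64")
    g = int(ipaddress.IPv6Address(group))
    # Drop the 32-bit group ID, then keep the 64-bit network prefix field.
    prefix_field = (g >> 32) & ((1 << 64) - 1)
    # Zero any bits of the field beyond the prefix length.
    unused = 64 - prefix_len
    prefix_field = (prefix_field >> unused) << unused
    # The prefix becomes the high-order bits; the suffix is all 0 bits.
    return ipaddress.IPv6Address(prefix_field << 64)
```

Called with the group address from the example above and a prefix
length of 64, the function yields 1234:5678:9abc:def0::.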
Support for any-source-multicast using any address other than a
Unicast-prefix-based Multicast Address is outside the scope of this
document.
4.2. Multicast Data Packet Processing
For BGMP rules to be applied, an incoming packet must first be
"accepted":
o If the packet arrived on an interface owned by an M-IGP, the M-IGP
component determines whether the packet should be accepted or
dropped according to its rules. If the packet is accepted, the
packet is forwarded (or not forwarded) out any other interfaces
owned by the same component, as specified by the M-IGP.
o If the packet was received over a point-to-point interface owned
by BGMP, the packet is accepted.
o If the packet arrived on a multiaccess network interface owned by
BGMP, the packet is accepted if the router is receiving data on a
source-specific branch, or if it is the designated forwarder for
the longest matching route for S or for the longest matching route
for the nominal root of G.
If the packet is accepted, then the router checks the tree state
table for a matching (S,G) entry. If one is found, but the packet
was not received from the next hop target towards S (if the entry's
SPT bit is True), or was not received from the next hop target
towards G (if the entry's SPT bit is False) then the packet is
dropped and no further actions are taken. If no (S,G) entry was
found, the router then checks for a matching (*,G) entry.
If neither is found, then the packet is forwarded towards the next-
hop peer for the nominal root of G, according to the Multicast RIB.
If a matching entry was found, the packet is forwarded to all other
targets in the target list.
Forwarding to a target which is an M-IGP component means that the
packet is forwarded out any interfaces owned by that component
according to that component's multicast forwarding rules.
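The lookup order applied to an accepted packet can be sketched as
below. This is an illustrative sketch only: the Entry class, the
forward function, and the dictionary keyed by (S,G) or ("*",G) are
assumptions, exact-match keys stand in for prefix matching, and the
next-hop targets are supplied by the caller instead of being looked
up in the Multicast RIB.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Entry:
    targets: List[str]
    spt: bool = False  # only meaningful for (S,G) entries


def forward(table: Dict[Tuple[str, str], Entry], s: str, g: str,
            arrived_from: str, next_hop_s: str, next_hop_root: str) -> List[str]:
    """Return the targets for an accepted packet; an empty list means no copies."""
    entry = table.get((s, g))
    if entry is not None:
        # (S,G) entry: the packet must come from the next hop towards S
        # when the SPT bit is set, otherwise from the next hop towards G.
        expected = next_hop_s if entry.spt else next_hop_root
        if arrived_from != expected:
            return []  # dropped, no further action
    else:
        entry = table.get(("*", g))
    if entry is None:
        # Neither (S,G) nor (*,G): send towards the nominal root of G.
        return [next_hop_root]
    # Matching entry: forward to all other targets in the target list.
    return [t for t in entry.targets if t != arrived_from]
```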
4.3. BGMP processing of Join and Prune messages and notifications
4.3.1. Receiving Joins
When the BGMP component receives a (*,G) or (S,G) Join alert from
another component, or a BGMP (S,G) or (*,G) Join message from an
external peer, it searches the tree state table for a matching entry.
If an entry is found, and that peer is already listed in the target
list, then no further actions are taken.
Otherwise, if no (*,G) or (S,G) entry was found, one is created. In
the case of a (*,G), the target list is initialized to contain the
next-hop peer towards the nominal root of G, if it is an external
peer. If the peer is internal, the target list is initialized to
contain the M-IGP component owning the next-hop interface. If there
is no next-hop peer (because the nominal root of G is inside the
domain), then the target list is initialized to contain the next-hop
component. If an (S,G) entry exists for the same G for which the
(*,G) Join is being processed, and the next-hop peers toward S and
the nominal root of G are different, the BGMP router must first send
a (S,G) Prune message toward the source and clear the SPT bit on the
(S,G) entry, before activating the (*,G) entry.
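The three cases for initializing the target list of a new (*,G)
entry can be sketched as follows. The NextHop class is a hypothetical
stand-in for the result of a Multicast RIB lookup towards the nominal
root of G; its field names are assumptions, not part of this
specification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NextHop:
    peer: Optional[str]  # next-hop peer, or None if the root is inside the domain
    external: bool       # True if `peer` is an external peer
    component: str       # M-IGP component owning the next-hop interface


def initial_star_g_targets(nh: NextHop) -> List[str]:
    """Return the initial target list for a newly created (*,G) entry."""
    if nh.peer is not None and nh.external:
        return [nh.peer]
    # Internal next-hop peer, or no next-hop peer at all: in both
    # cases the list starts with the next-hop M-IGP component.
    return [nh.component]
```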
When creating (S,G) state, if the source is internal to the BGMP
speaker's domain, a "Poison-Reverse" bit (PR-bit) is set. This bit
indicates that the router may receive packets matching (S,G) anyway
due to the BGMP speaker being a member of a domain on the path
between S and the root domain. (Depending on the M-IGP protocol, it
may in fact receive such packets anyway only if it is the best exit
for the nominal root of G.)
The target from which the Join was received is then added to the
target list. The router then looks up S or the nominal root of G in
the Multicast RIB to find the next-hop EGP peer. If the target list,
not including the next-hop target towards G for a (*,G) entry,
becomes non-null as a result, the next-hop EGP peer must be notified
as follows:
a) If the next-hop peer towards the nominal root of G (for a (*,G)
entry) is an external peer, a BGMP (*,G) Join message is unicast
to the external peer. If the next-hop peer towards S (for an
(S,G) entry) is an external peer, and the router does NOT have any
active (*,G) state for that group address G, a BGMP (S,G) Join
message is unicast to the external peer. A BGMP (S,G) Join
message is never sent to an external peer by a router that also
contains active (*,G) state for the same group. If the next-hop
peer towards S (for an (S,G) entry) is an external peer and the
router DOES have active (*,G) state for that group G, the SPT bit
is always set to False.
b) If the next-hop peer is an internal peer, a (*,G) or (S,G) Join
alert is sent to the M-IGP component owning the next-hop
interface.
c) If there is no next-hop peer, a (*,G) or (S,G) Join alert is sent
to the M-IGP component owning the next-hop interface.
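The decision in rules a) through c) can be sketched as follows. This
is an illustrative Python sketch only, not part of the protocol; the
names NextHop and upstream_join_action are hypothetical, and the
returned strings merely label the action the text describes.

```python
from enum import Enum


class NextHop(Enum):
    """Where the Multicast RIB says the next hop lies (hypothetical names)."""
    EXTERNAL_PEER = "external peer"
    INTERNAL_PEER = "internal peer"
    NO_PEER = "no next-hop peer"


def upstream_join_action(is_star_g: bool, next_hop: NextHop,
                         has_active_star_g: bool = False) -> str:
    """Return the upstream notification for a newly non-null target list."""
    kind = "(*,G)" if is_star_g else "(S,G)"
    if next_hop is NextHop.EXTERNAL_PEER:
        if is_star_g:
            return "unicast BGMP (*,G) Join message"      # rule a)
        if has_active_star_g:
            # Rule a): an (S,G) Join is never sent externally while (*,G)
            # state is active; the SPT bit is set to False instead.
            return "no message; SPT bit set to False"
        return "unicast BGMP (S,G) Join message"          # rule a)
    # Rules b) and c): internal next hop, or no next-hop peer at all.
    return f"send {kind} Join alert to M-IGP component"
```

The has_active_star_g argument only matters for (S,G) entries whose
next hop is an external peer.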
Finally, if an (S,G) Join is received from an internal peer, the peer
should be stored with the M-IGP component target. If (S,G) state
exists with the PR-bit set, and the next-hop towards the nominal root
for G is through the M-IGP component, an (S,G) Poison-Reverse message
is immediately sent to the internal peer.
If an (S,G) Join is received from an external peer, and (S,G) state
exists with the PR-bit set, and the local BGMP speaker is the best
exit for the nominal root of G, and the next-hop towards the nominal
root for G is through the interface towards the external peer, an
(S,G) Poison-Reverse message is immediately sent to the external
peer.
4.3.2. Receiving Prune Notifications
When the BGMP component receives a (*,G) or (S,G) Prune alert from
another component, or a BGMP (*,G) or (S,G) Prune message from an
external peer, it searches the tree state table for a matching entry.
If no (S,G) entry was found for an (S,G) Prune, but (*,G) state
exists, an (S,G) entry is created, with the target list copied from
the (*,G) entry. If no matching entry exists, or if the component or
peer is not listed in the target list, no further actions are taken.
Otherwise, the component or peer is removed from the target list. If
the target list becomes null as a result, the next-hop peer towards
the nominal root of G (for a (*,G) entry), or towards S (for an (S,G)
entry if and only if the BGMP router does NOT have any corresponding
(*,G) entry), must be notified as follows.
a) If the peer is an external peer, a BGMP (*,G) or (S,G) Prune
message is unicast to it.
b) If the next-hop peer is an internal peer, a (*,G) or (S,G) Prune
alert is sent to the M-IGP component owning the next-hop
interface.
c) If there is no next-hop peer, a (*,G) or (S,G) Prune alert is sent
to the M-IGP component owning the next-hop interface.
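The Prune handling above can be sketched as follows. This is a
minimal Python illustration under stated assumptions: the tree state
table is modelled as a dictionary from (source, group) to a set of
downstream targets, "*" stands for the wildcard source, and the
function name handle_prune is hypothetical. The return value only
says whether the next-hop peer must be notified; choosing between a
Prune message and a Prune alert follows rules a) through c).

```python
STAR = "*"  # wildcard source used for (*,G) entries (assumed encoding)


def handle_prune(table: dict, source: str, group: str, target: str) -> bool:
    """Apply a Prune from `target`; return True if upstream must be notified."""
    key = (source, group)
    # An (S,G) Prune with no (S,G) entry but existing (*,G) state creates
    # an (S,G) entry whose target list is copied from the (*,G) entry.
    if source != STAR and key not in table and (STAR, group) in table:
        table[key] = set(table[(STAR, group)])
    targets = table.get(key)
    if targets is None or target not in targets:
        return False                      # no further actions are taken
    targets.remove(target)
    if targets:
        return False                      # target list is still non-null
    if source == STAR:
        return True                       # notify towards the root of G
    # (S,G): notify towards S only if no corresponding (*,G) entry exists.
    return (STAR, group) not in table
```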
4.3.3. Receiving Route Change Notifications
When a border router receives a route for a new prefix in the
multicast RIB, or an existing route for a prefix is withdrawn, a route
change notification for that prefix must be sent to the BGMP
component. In addition, when the next hop peer (according to the
multicast RIB) changes, a route change notification for that prefix
must be sent to the BGMP component.
In addition, in IPv4 (only), an internal route for each class-D
prefix associated with the domain (if any) MUST be injected into the
multicast RIB in the EGP by the domain's border routers.
When a route for a new group prefix is learned, or an existing route
for a group prefix is withdrawn, or the next-hop peer for a group
prefix changes, a BGMP router updates all affected (*,G) target
lists. The router sends a (*,G) Join to the new next-hop target, and
a (*,G) Prune to the old next-hop target, as appropriate. In
addition, if any (S,G) state exists with the PR-bit set:
o If the BGMP speaker has just become the best exit for the nominal
root of G, an (S,G) Poison Reverse message with the PR-bit set is
sent as noted below.
o If the BGMP speaker was the best exit for the nominal root of G
and is no longer, an (S,G) Poison Reverse message with the PR-bit
clear is sent as noted below.
The (S,G) Poison-Reverse messages are sent to all external peers on
the next-hop interface towards the nominal root of G from which (S,G)
Joins have been received.
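The Poison-Reverse rule for group-route changes can be sketched as
follows. This Python fragment is illustrative only; the function
names and the representation of peers as strings are assumptions.

```python
from typing import Iterable, List, Optional


def pr_bit_to_send(was_best_exit: bool, is_best_exit: bool) -> Optional[bool]:
    """Return the PR-bit to advertise, or None if no message is needed."""
    if is_best_exit and not was_best_exit:
        return True       # just became the best exit: PR-bit set
    if was_best_exit and not is_best_exit:
        return False      # no longer the best exit: PR-bit clear
    return None


def pr_recipients(joined_peers: Iterable[str],
                  peers_on_next_hop_interface: Iterable[str]) -> List[str]:
    """External peers on the next-hop interface that sent (S,G) Joins."""
    return sorted(set(joined_peers) & set(peers_on_next_hop_interface))
```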
When an existing route for a source prefix is withdrawn, or the
next-hop peer for a source prefix changes, a BGMP router updates all
affected (S,G) target lists. The router sends a (S,G) Join to the
new next-hop target, and a (S,G) Prune to the old next-hop target, as
appropriate.
4.3.4. Receiving (S,G) Poison-Reverse messages
When a BGMP speaker receives an (S,G) Poison-Reverse message from a
peer, it sets the PR-bit on the (S,G) state to match the PR-bit in
the message, and looks up the next-hop towards the nominal root of G.
If the next-hop target is an M-IGP component, it forwards the (S,G)
Poison Reverse message to all internal peers of that component from
which it has received (S,G) Joins. If the next-hop target is an
external peer on a given interface, it forwards the (S,G) Poison
Reverse message to all external peers on that interface.
When a BGMP speaker receives an (S,G) Poison-Reverse message from an
external peer, with the PR-bit set, and the speaker has received no
(S,G) Joins from any other peers (e.g., only from the M-IGP, or has
(S,G) state due to encapsulation as described in 4.4.1), it knows
that its own (S,G) Join is unnecessary, and should send an (S,G)
Prune.
When a BGMP speaker receives an (S,G) Poison-Reverse message from an
internal peer, with the PR-bit set, and the speaker is the best exit
for the nominal root of G, and has (S,G) prune state, an (S,G) Join
message is sent to cancel the prune state and the state is deleted.
4.4. Interaction with M-IGP components
When an M-IGP component on a border router first learns that there
are internally-reached members for a group G (whose scope is larger
than that domain), a (*,G) Join alert is sent to the BGMP component.
Similarly, when an M-IGP component on a border router learns that
there are no longer internally-reached members for a group G (whose
scope is larger than a single domain), a (*,G) Prune alert is sent to
the BGMP component.
At any time, any M-IGP domain MAY decide to join a source-specific
branch for some external source S and group G. When the M-IGP
component in the border router that is the next-hop router for a
particular source S learns that a receiver wishes to receive data
from S on a source-specific path, an (S,G) Join alert is sent to the
BGMP component. When it is learned that such receivers no longer
exist, an (S,G) Prune alert is sent to the BGMP component. Recall
that the BGMP component will generate external source-specific Joins
only where the source-specific branch does not coincide with the
shared distribution tree for that group.
Finally, we will require that the border router that is the next-hop
internal peer for a particular address S or the nominal root of G be
able to forward data for a matching tree state table entry to all
members within the domain. This requirement has implications on
specific M-IGPs as follows.
4.4.1. Interaction with DVMRP and PIM-DM
DVMRP and PIM-DM are both "broadcast and prune" protocols in which
every data packet must pass an RPF check against the packet's source
address, or be dropped. If the border router receiving packets from
an external source is the only BR to inject the route for the source
into the domain, then there are no problems. For example, this will
always be true for stub domains with a single border router (see
Figure 1). Otherwise, the border router receiving packets externally
is responsible for encapsulating the data to any other border routers
that must inject the data into the domain for RPF checks to succeed.
When an intended border router injector for a source receives
encapsulated packets from another border router in its domain, it
should create source-specific (S,G) BGMP state. Note that the border
router may be configured to do this on a data-rate triggered basis so
that the state is not created for very low data-rate/intermittent
sources. If source-specific state is created, then its incoming
interface points to the virtual encapsulation interface from the
border router that forwarded the packet, and it has an SPT flag that
is initialized to be False.
When the (S,G) BGMP state is created, the BGMP component will in turn
send a BGMP (S,G) Join message to the next-hop external peer towards
S if there is no (*,G) state for that same group, G. The (S,G) BGMP
state will have the SPT bit set to False if (*,G) BGMP state is
present.
When the first data packet from S arrives from the external peer and
matches on the BGMP (S,G) state, and IF there is no (*,G) state, the
router sets the SPT flag to True, resets the incoming interface to
point to the external peer, and sends a BGMP (S,G) Prune message to
the border router that was encapsulating the packets (e.g., in Figure
1, BR11 sends the (Src_A,G) Prune to BR12). When the border router
with (*,G) state receives the prune for (S,G), it then deletes that
border router from its list of targets.
If the decapsulator receives a (S,G) Poison Reverse message with the
PR-bit set, it will forward it to the encapsulator (which may again
forward it up the shared tree according to normal BGMP rules), and
both will delete their BGMP (S,G) state.
PIM-DM and DVMRP present an additional problem, i.e., no protocol
mechanism exists for joining and pruning entire groups; only joins
and prunes for individual sources are available. As a result, BGMP
does not currently support such protocols being used in a transit
domain.
4.4.2. Interaction with PIM-SM
Protocols such as PIM-SM build unidirectional shared and source-
specific trees. As with DVMRP and PIM-DM, every data packet must
pass an RPF check against some group-specific or source-specific
address.
The fewest encapsulations/decapsulations will be done when the
intra-domain tree is rooted at the next-hop internal peer (which
becomes the RP) towards the nominal root of G, since in general that
router will receive the most packets from external sources. To
achieve this, each BGMP border router to a PIM-SM domain should send
Candidate-RP-Advertisements within the domain for those groups for
which it is the shared-domain tree ingress router. When the border
router that is the RP for a group G receives an external data packet,
it forwards the packet according to the M-IGP (i.e., PIM-SM) shared-
tree outgoing interface list.
Other border routers will receive data packets from external sources
that are farther down the bidirectional tree of domains. When a
border router that is not the RP receives an external packet for
which it does not have a source-specific entry, the border router
treats it like a local source by creating (S,G) state with a Register
flag set, based on normal PIM-SM rules; the border router then
encapsulates the data packets in PIM-SM Registers and unicasts them
to the RP for the group. As explained above, the RP for the inter-
domain group will be one of the other border routers of the domain.
If a source's data rate is high enough, DRs within the PIM-SM domain
may switch to the shortest path tree. If the shortest path to an
external source is via the group's ingress router for the shared
tree, the new (S,G) state in the BGMP border router will not cause
BGMP (S,G) Joins because that border router will already have (*,G)
state. If however, the shortest path to an external source is via
some other border router, that border router will create (S,G) BGMP
state in response to the M-IGP (S,G) Join alert. In this case,
because there is no local (*,G) state to suppress it, the border
router will send a BGMP (S,G) Join to the next-hop external peer
towards S, in order to pull the data down directly. (See BR11 in
Figure 1). As in normal PIM-SM operation, those PIM-SM routers that
have (*,G) and (S,G) state pointing to different incoming interfaces
will prune that source off the shared tree. Therefore, all internal
interfaces may be eventually pruned off the internal shared tree.
After the border router sends a BGMP (S,G) Join, if its (S,G) state
has the PR-bit clear, an (S,G) Poison-Reverse message (with the
PR-bit clear) is sent to the ingress router for G. The ingress router
then creates (S,G) state if it does not already exist, and removes the
next hop towards the nominal root of G from the target list.
If the border router later receives an (S,G) Poison-Reverse message
with the PR-bit set, the Poison-Reverse message is forwarded to the
ingress router for G. The best-exit router then creates (S,G) state
if it does not already exist, and puts the next hop towards the
nominal root of G in the target list if not already present.
4.4.3. Interaction with CBT
CBT builds bidirectional shared trees but must address two points of
compatibility with BGMP. First, CBT cannot accommodate more than
one border router injecting a packet. Therefore, if a CBT domain
does have multiple external connections, the M-IGP components of the
border routers are responsible for ensuring that only one of them
will inject data from any given source.
Second, CBT cannot process source-specific Joins or Prunes. Two
options thus exist for each CBT domain:
Option A:
The CBT component interprets a (S,G) Join alert as if it were an
(*,G) Join alert, as described in [INTEROP]. That is, if it is
not already on the core-tree for G, then it sends a CBT (*,G)
JOIN-REQUEST message towards the core for G. Similarly, when the
CBT component receives an (S,G) Prune alert, and the child
interface list for a group is NULL, then it sends a (*,G)
QUIT_NOTIFICATION towards the core for G. This option has the
disadvantage of pulling all data for the group G down to the CBT
domain when no members exist.
Option B:
The CBT domain does not propagate any routes to their external
peers for the Multicast RIB unless it is known that no other path
exists to that prefix (e.g., routes for prefixes internal to the
domain or in a singly-homed customer's domain may be propagated).
This ensures that source-specific joins are never received unless
the source's data already passes through the domain on the shared
tree, in which case the (S,G) Join need not be propagated anyway.
BGMP border routers will only send source-specific Joins or Prunes
to an external peer if that external peer advertises source-
prefixes in the EGP. If a BGMP-CBT border router does receive an
(S,G) Join or Prune, that border router should ignore the message.
To minimize en/de-capsulations, CBTv2 BRs may follow the same scheme
as described under PIM-SM above, in which each BR sends Candidate-Core
advertisements for those groups for which it is the shared-tree
ingress router.
4.4.4. Interaction with MOSPF
As with CBTv2, MOSPF cannot process source-specific Joins or Prunes,
and the same two options are available. Therefore, an MOSPF domain
may either:
Option A:
send a Group-Membership-LSA for all of G in response to a (S,G)
Join alert, and "prematurely age" it out (when no other downstream
members exist) in response to an (S,G) Prune alert, OR
Option B:
not propagate any routes to their external peers for the Multicast
RIB unless it is known that no other path exists to that prefix
(e.g., routes for prefixes internal to the domain or in a singly-
homed customer's domain may be propagated).
4.5. Operation over Multi-access Networks
Multiaccess links require special handling to prevent duplicates.
The following mechanism enables BGMP to operate over multiaccess
links which do not run an M-IGP. This avoids broadcast-and-prune
behavior and does not require (S,G) state.
To elect a designated forwarder per prefix, BGMP uses a FWDR_PREF
message to exchange "forwarder preference" values for each prefix.
The peer with the highest forwarder preference becomes the designated
forwarder, with ties broken by lowest BGMP Identifier. The
designated forwarder is the router responsible for forwarding packets
up the tree, and is the peer to which joins will be sent.
When BGMP first learns that a route exists in the multicast RIB whose
next-hop interface is NOT the multiaccess link, the BGMP router sends
a BGMP FWDR_PREF message for the prefix, to all BGMP peers on the
LAN. The FWDR_PREF message contains a "forwarder preference value"
for the local router, and the same value MUST be sent to all peers on
the LAN. Likewise, when the prefix is no longer reachable, a
FWDR_PREF of 0 is sent to all peers on the LAN.
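The election rule can be sketched as follows. This is an illustrative
Python sketch, not part of the protocol. It assumes BGMP Identifiers
are compared as unsigned integers, and it treats a preference of 0 as
"not a candidate", which is an interpretation of the withdrawal rule
above rather than something the text states outright. The function
name elect_forwarder is hypothetical.

```python
from typing import Dict, Optional


def elect_forwarder(prefs: Dict[int, int]) -> Optional[int]:
    """Pick the designated forwarder for one prefix.

    `prefs` maps each peer's BGMP Identifier to the forwarder preference
    it advertised in FWDR_PREF for this prefix.
    """
    # Assumption: a preference of 0 means the prefix is unreachable via
    # that peer, so it is excluded from the election.
    candidates = {ident: p for ident, p in prefs.items() if p > 0}
    if not candidates:
        return None
    # Highest preference wins; ties are broken by lowest BGMP Identifier.
    return min(candidates, key=lambda ident: (-candidates[ident], ident))
```

For example, with preferences {10: 5, 20: 7, 30: 7}, peers 20 and 30
tie on preference and peer 20 wins on the lower identifier.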
Whenever a BGMP router calculates the next-hop peer towards a
particular address, and that peer is reached over a BGMP-owned
multiaccess LAN, the designated forwarder is used instead.
When a BGMP router receives a FWDR_PREF message from a peer, it looks
up the matching route in its multicast RIB, and calculates the new
designated forwarder. If the router has tree state entries whose
parent target was the old forwarder, it sends Joins to the new
forwarder and Prunes to the old forwarder.
When a BGMP router which is NOT the designated forwarder receives a
packet on the multiaccess link, it is silently dropped.
Finally, this mechanism prevents duplicates where full peering exists
on a "logical" link. Where full peering does not exist, steps must
be taken (outside of BGMP) to present separate logical interfaces to
BGMP, each of which is a link with full peering. This might entail,
for example, using different link-layer address mappings, doing
encapsulation, or changing the physical media.
4.6. Interaction between (S,G) state and G-routes
As discussed earlier, routers with (*,G) state will not propagate
(S,G) joins. However, a special case occurs when (S,G) state
coincides with the G-route (or route towards the nominal root of G).
When this occurs, care must be taken so that the data will reach the
root domain without causing duplicates or black holes. For this
reason, (S,G) state on the path between the source and the root
domain is annotated as being "poison-reversed". A PR-bit is kept for
this purpose, which is updated by (UN)POISON_REVERSE messages.
The PR-bit indicates to BGMP nodes whether they need to forward
packets up towards the root domain. For example, in a case where an
(S,G) branch exists, a transit domain may get packets along the (S,G)
branch, and needs to know whether to (also) forward them up towards
the root domain. If the domain in question is on the path between S
and the root domain, then the answer is yes (and the PR-bit will be
set on the (S,G) state). If the domain in question is not on the path
between S and the root domain, then the answer is no (and the PR-bit
will be clear on the (S,G) state).
5. Message Formats
This section describes message formats used by BGMP.
Messages are sent over a reliable transport protocol connection. A
message is processed only after it is entirely received. The maximum
message size is 4096 octets. All implementations are required to
support this maximum message size.
All fields labelled "Reserved" below must be transmitted as 0, and
ignored upon receipt.
5.1. Message Header Format
Each message has a fixed-size (4-byte) header. There may or may not
be a data portion following the header, depending on the message
type. The layout of these fields is shown below:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Length             |      Type     |   Reserved    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Length:
This 2-octet unsigned integer indicates the total length of the
message, including the header, in octets. Thus, e.g., it allows
one to locate in the transport-level stream the start of the next
message. The value of the Length field must always be at least 4
and no greater than 4096, and may be further constrained,
depending on the message type. No "padding" of extra data after
the message is allowed, so the Length field must have the smallest
value required given the rest of the message.
Type:
This 1-octet unsigned integer indicates the type code of the
message. The following type codes are defined:
1 - OPEN
2 - UPDATE
3 - NOTIFICATION
4 - KEEPALIVE
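Because the Length field delimits messages in the transport stream,
a receiver can frame messages as sketched below. This Python sketch
is illustrative only. It assumes network (big-endian) byte order for
the Length field, and the names parse_header and split_messages are
hypothetical.

```python
import struct

HEADER = struct.Struct("!HBB")  # Length (2 octets), Type (1), Reserved (1)
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


def parse_header(data: bytes):
    """Return (length, type) from the 4-byte header, validating Length."""
    if len(data) < HEADER.size:
        raise ValueError("need at least 4 octets for a header")
    length, msg_type, _reserved = HEADER.unpack_from(data)  # Reserved ignored
    if not 4 <= length <= 4096:
        raise ValueError(f"bad Length {length}")
    return length, msg_type


def split_messages(stream: bytes):
    """Split a byte stream into (type, body) pairs plus unconsumed bytes."""
    messages, offset = [], 0
    while len(stream) - offset >= HEADER.size:
        length, msg_type = parse_header(stream[offset:])
        if len(stream) - offset < length:
            break  # message not yet entirely received
        messages.append((msg_type, stream[offset + HEADER.size:offset + length]))
        offset += length
    return messages, stream[offset:]
```

A message is handed on only once all of its octets are present; any
trailing partial message is returned to the caller to be completed by
later reads.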
5.2. OPEN Message Format
After a transport protocol connection is established, the first
message sent by each side is an OPEN message. If the OPEN message is
acceptable, a KEEPALIVE message confirming the OPEN is sent back.
Once the OPEN is confirmed, UPDATE, KEEPALIVE, and NOTIFICATION
messages may be exchanged.
In addition to the fixed-size BGMP header, the OPEN message contains
the following fields:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Version    | Rsvd| AddrFam |           Hold Time           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               BGMP Identifier (variable length)               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                     (Optional Parameters)                     |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Version:
This 1-octet unsigned integer indicates the protocol version
number of the message. The current BGMP version number is 1.
AddrFam:
The IANA-assigned address family number of the BGMP Identifier.
These include (among others):
Number    Description
------    -----------
   1      IP (IP version 4)
   2      IPv6 (IP version 6)
Hold Time:
This 2-octet unsigned integer indicates the number of seconds that
the sender proposes for the value of the Hold Timer. Upon receipt
of an OPEN message, a BGMP speaker MUST calculate the value of the
Hold Timer by using the smaller of its configured Hold Time and
the Hold Time received in the OPEN message. The Hold Time MUST be
either zero or at least three seconds. An implementation may
reject connections on the basis of the Hold Time. The calculated
value indicates the maximum number of seconds that may elapse
between the receipt of successive KEEPALIVE and/or UPDATE messages
by the sender.
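The Hold Timer calculation can be sketched as follows. This Python
fragment is illustrative only. The function name is hypothetical, and
rejecting a non-zero value below three seconds by raising an error is
one possible reaction, not one the text mandates.

```python
def negotiate_hold_time(configured: int, received: int) -> int:
    """Return the Hold Timer value: the smaller of the two Hold Times."""
    for value in (configured, received):
        if not 0 <= value <= 0xFFFF:
            raise ValueError("Hold Time must fit in 2 octets")
        if value != 0 and value < 3:
            raise ValueError("Hold Time must be zero or at least 3 seconds")
    return min(configured, received)
```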
BGMP Identifier:
This 4-octet (for IPv4) or 16-octet (IPv6) unsigned integer
indicates the BGMP Identifier of the sender. A given BGMP speaker
sets the value of its BGMP Identifier to a globally-unique value
assigned to that BGMP speaker (e.g., an IPv4 address). The value
of the BGMP Identifier is determined on startup and is the same
for every BGMP session opened.
Optional Parameters:
This field may contain a list of optional parameters, where each
parameter is encoded as a <Parameter Type, Parameter Length,
Parameter Value> triplet. The combined length of all optional
parameters can be derived from the Length field in the message
header.
 0                   1
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...
|  Parm. Type   | Parm. Length  |  Parameter Value (variable)
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...
Parameter Type is a one octet field that unambiguously identifies
individual parameters. Parameter Length is a one octet field that
contains the length of the Parameter Value field in octets.
Parameter Value is a variable length field that is interpreted
according to the value of the Parameter Type field.
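A receiver might walk the Optional Parameters field as sketched
below. This Python sketch is illustrative only. It follows the field
order shown in the diagram (type octet first, then length octet), and
the function name parse_optional_parameters is hypothetical.

```python
def parse_optional_parameters(data: bytes):
    """Return a list of (parameter_type, parameter_value) pairs."""
    params, offset = [], 0
    while offset < len(data):
        if len(data) - offset < 2:
            raise ValueError("truncated parameter header")
        ptype, plen = data[offset], data[offset + 1]
        value = data[offset + 2:offset + 2 + plen]
        if len(value) != plen:
            raise ValueError("truncated parameter value")
        params.append((ptype, value))
        offset += 2 + plen  # Parameter Length counts the value octets only
    return params
```

The input is the part of the OPEN message that follows the BGMP
Identifier, whose extent is derived from the header Length field.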
This document defines the following Optional Parameters:
a) Authentication Information (Parameter Type 1): This optional
parameter may be used to authenticate a BGMP peer. The Parameter
Value field contains a 1-octet Authentication Code followed by a
variable length Authentication Data.
 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|  Auth. Code   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                     |
|                 Authentication Data                 |
|                                                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Authentication Code:
This 1-octet unsigned integer indicates the authentication
mechanism being used. Whenever an authentication mechanism is
specified for use within BGMP, two things must be included in
the specification:
- the value of the Authentication Code which indicates use of the
mechanism, and
- the form and meaning of the Authentication Data.
Note that a separate authentication mechanism may be used in
establishing the transport level connection.
Authentication Data:
This is a variable-length field whose form and meaning depend on
the Authentication Code.
The minimum length of the OPEN message is 12 octets (including
message header).
b) Capability Information (Parameter Type 2): This is an Optional
Parameter that is used by a BGMP-speaker to convey to its peer the
list of capabilities supported by the speaker. The parameter
contains one or more triples <Capability Code, Capability Length,
Capability Value>, where each triple is encoded as shown below:
+------------------------------+
| Capability Code (1 octet) |
+------------------------------+
| Capability Length (1 octet) |
+------------------------------+
| Capability Value (variable) |
+------------------------------+
Capability Code:
Capability Code is a one octet field that unambiguously identifies
individual capabilities.
Capability Length:
Capability Length is a one octet field that contains the length of
the Capability Value field in octets.
Capability Value:
Capability Value is a variable length field that is interpreted
according to the value of the Capability Code field.
A particular capability, as identified by its Capability Code, may
occur more than once within the Optional Parameter.
This document reserves Capability Code 0, and reserves Capability
Codes 128-255 for vendor-specific applications.
Capability Codes (other than those reserved for vendor specific use)
are assigned only by the IETF consensus process and IESG approval.
5.3. UPDATE Message Format
UPDATE messages are used to transfer Join/Prune/FwdrPref information
between BGMP peers. The UPDATE message always includes the fixed-
size BGMP header, and one or more attributes as described below.
The message format below allows compact encoding of (*,G) Joins and
Prunes, while allowing the flexibility needed to do other updates
such as (S,G) Joins and Prunes towards sources as well as on the
shared tree. In the discussion below, an Encoded-Address-Prefix is
of the form:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+
|EnTyp| AddrFam |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Address (variable length)                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Mask (variable length)                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
EnTyp:
0 - All 1's Mask. The Mask field is 0 bytes long.
1 - Mask length included. The Mask field is 4 bytes long, and
contains the mask length, in bits.
2 - Full Mask included. The Mask field is the same length
as the Address field, and contains the full bitmask.
AddrFam:
The IANA-assigned address family number of the encoded prefix.
Address:
The address associated with the given prefix to be encoded. The
length is determined based on the Address Family.
Mask:
The mask associated with the given prefix. The format (or absence)
of this field is determined by the EnTyp field.
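The encoding can be sketched as follows. This Python sketch is
illustrative only and makes several assumptions: EnTyp occupies the
three high-order bits and AddrFam the five low-order bits of the first
octet (as the diagram suggests), the 4-byte mask length is big-endian,
a full mask (EnTyp 2) is contiguous, and only address families 1 and 2
are handled. The function names are hypothetical.

```python
import ipaddress

ADDR_LEN = {1: 4, 2: 16}  # address family number -> address length in octets


def encode_prefix(network: str) -> bytes:
    """Encode a prefix, using EnTyp 0 for host prefixes and EnTyp 1 otherwise."""
    net = ipaddress.ip_network(network)
    addr_fam = 1 if net.version == 4 else 2
    if net.prefixlen == net.max_prefixlen:
        en_typ, mask = 0, b""                             # all 1's mask
    else:
        en_typ, mask = 1, net.prefixlen.to_bytes(4, "big")  # mask length in bits
    return bytes([(en_typ << 5) | addr_fam]) + net.network_address.packed + mask


def decode_prefix(data: bytes):
    """Return (network, octets_consumed) for an Encoded-Address-Prefix."""
    en_typ, addr_fam = data[0] >> 5, data[0] & 0x1F
    alen = ADDR_LEN[addr_fam]
    addr = ipaddress.ip_address(data[1:1 + alen])
    if en_typ == 0:
        prefixlen, used = alen * 8, 1 + alen
    elif en_typ == 1:
        prefixlen = int.from_bytes(data[1 + alen:5 + alen], "big")
        used = 5 + alen
    elif en_typ == 2:
        mask = int.from_bytes(data[1 + alen:1 + 2 * alen], "big")
        prefixlen, used = bin(mask).count("1"), 1 + 2 * alen  # contiguous mask assumed
    else:
        raise ValueError(f"unknown EnTyp {en_typ}")
    return ipaddress.ip_network(f"{addr}/{prefixlen}", strict=False), used
```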
Each attribute is of the form:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Length             |      Type     |  Data ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
All attributes are 4-byte aligned.
Length:
The Length is the length of the entire attribute, including the
length, type, and data fields. If other attributes are nested
within the data field, the length includes the size of all such
nested attributes.
Type:
Types 128-255 are reserved for "optional" attributes. If a
required attribute is unrecognized, a NOTIFICATION will be sent and
the connection will be closed if the error is a fatal one.
Unrecognized optional attributes are simply ignored.
0 - JOIN
1 - PRUNE
2 - GROUP
3 - SOURCE
4 - FWDR_PREF
5 - POISON_REVERSE
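Walking a sequence of attributes can be sketched as follows. This
Python sketch is illustrative only. It assumes network byte order for
the Length field and treats everything after the 3-octet Length/Type
prefix as the attribute's data; the names iter_attributes and
classify are hypothetical.

```python
import struct

ATTR_TYPES = {0: "JOIN", 1: "PRUNE", 2: "GROUP", 3: "SOURCE",
              4: "FWDR_PREF", 5: "POISON_REVERSE"}


def iter_attributes(data: bytes):
    """Yield (type, data) for each attribute at one nesting level."""
    offset = 0
    while offset < len(data):
        if len(data) - offset < 3:
            raise ValueError("truncated attribute header")
        length, atype = struct.unpack_from("!HB", data, offset)
        # Length covers the whole attribute, including nested attributes.
        if length < 3 or offset + length > len(data):
            raise ValueError(f"bad attribute Length {length}")
        yield atype, data[offset + 3:offset + length]
        offset += length


def classify(atype: int) -> str:
    """Name a known type, or say how an unknown one is to be treated."""
    if atype in ATTR_TYPES:
        return ATTR_TYPES[atype]
    return "optional (ignored)" if atype >= 128 else "unrecognized required"
```

Nested attributes can be walked by calling iter_attributes again on
the data of a containing attribute, after skipping that attribute's
own fixed fields.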
a) JOIN (Type Code 0)
The JOIN attribute indicates that all GROUP or SOURCE attributes
nested immediately within the JOIN attribute should be joined.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Length             |    Type=0     |   Reserved    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Nested Attributes ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
No JOIN, PRUNE, or FWDR_PREF attributes may be immediately nested
within a JOIN attribute.
b) PRUNE (Type Code 1)
The PRUNE attribute indicates that all GROUP or SOURCE attributes
nested immediately within the PRUNE attribute should be pruned.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Length             |    Type=1     |   Reserved    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Nested Attributes ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
No JOIN, PRUNE, or FWDR_PREF attributes may be immediately nested
within a PRUNE attribute.
c) GROUP (Type Code 2)
The GROUP attribute identifies a given group-prefix. In addition,
any attributes nested immediately within the GROUP attribute also
apply to the given group-prefix.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Length             |    Type=2     |               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
|                                                               |
|                    Encoded-Address-Prefix                     |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Nested Attributes (optional) ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Encoded-Address-Prefix:
The multicast group prefix to be joined or pruned, in the format
described above.
Nested Attributes:
No GROUP, SOURCE, or FWDR_PREF attributes may be immediately nested
within a GROUP attribute.
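As an illustration, a (*,G) Join or Prune for a single group address
could be assembled as sketched below. This Python sketch is
illustrative only. It assumes network byte order, uses EnTyp 0 (all
1's mask) for a single group address, and places the Encoded-Address-
Prefix immediately after the Type octet as the GROUP diagram shows.
The function names are hypothetical.

```python
import ipaddress
import struct

JOIN, PRUNE, GROUP = 0, 1, 2  # attribute type codes from the list above


def group_attribute(group: str) -> bytes:
    """Build a GROUP attribute for one group address (EnTyp 0)."""
    addr = ipaddress.ip_address(group)
    addr_fam = 1 if addr.version == 4 else 2
    # EnTyp 0 in the three high-order bits leaves just AddrFam in the octet.
    prefix = bytes([addr_fam]) + addr.packed
    return struct.pack("!HB", 3 + len(prefix), GROUP) + prefix


def join_or_prune(kind: int, nested: bytes) -> bytes:
    """Wrap nested attributes in a JOIN or PRUNE attribute."""
    if kind not in (JOIN, PRUNE):
        raise ValueError("kind must be JOIN or PRUNE")
    # Length, Type, Reserved (sent as 0), then the nested attributes.
    return struct.pack("!HBB", 4 + len(nested), kind, 0) + nested
```

Under these assumptions, a (*,G) Join for 224.0.1.1 is 12 octets
long: a 4-octet JOIN header followed by an 8-octet GROUP attribute.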
d) SOURCE (Type Code 3):
The SOURCE attribute identifies a given source-prefix. In
addition, any attributes nested immediately within the SOURCE
attribute also apply to the given source-prefix.
The SOURCE attribute has the following format:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length | Type=3 | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +
| |
| Encoded-Address-Prefix |
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Nested Attributes (optional) ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Encoded-Address-Prefix The Source-prefix in the format described
above.
Nested Attributes No GROUP, SOURCE, or FWDR_PREF attributes may
be immediately nested within a SOURCE
attribute.
e) FWDR_PREF (Type Code 4)
The FWDR_PREF attribute provides a forwarder preference value for
all GROUP or SOURCE attributes nested immediately within the
FWDR_PREF attribute. It is used by a BGMP speaker to inform other
BGMP speakers of the originating speaker's degree of preference for
a given group or source prefix. Usage of this attribute is
described in 5.5.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length | Type=4 | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Preference Value |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Nested Attributes ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Preference Value A 32-bit non-negative integer.
Nested Attributes No JOIN, PRUNE, or FWDR_PREF attributes may be
immediately nested within a FWDR_PREF attribute.
f) POISON_REVERSE (Type Code 5)
The POISON_REVERSE attribute provides a "poison-reverse" (PR-bit)
value for all SOURCE attributes nested immediately within the
POISON_REVERSE attribute. It is used by a BGMP speaker to inform
other BGMP speakers from which it has received (S,G) Joins that
they are on the path of domains between the source and the root
domain.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length | Type=5 | Reserved |P|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Nested Attributes ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
P The PR-bit value.
Nested Attributes No attributes in the document other than SOURCE
may be immediately nested within a POISON_REVERSE
attribute.
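As an illustration of the attribute layouts above, the following
Python sketch packs attributes into octets. It is not part of the
specification. It assumes network byte order, that the Length field
counts the whole attribute (Length, Type, data, and any nested
attributes), that JOIN uses Type Code 0, and that the
Encoded-Address-Prefix is supplied already encoded as opaque octets.
All function names are hypothetical.

```python
import struct

# Type codes; JOIN = 0 is an assumption, the others are given above.
JOIN, PRUNE, GROUP, SOURCE, FWDR_PREF, POISON_REVERSE = range(6)


def _attr(type_code: int, body: bytes) -> bytes:
    # Assumption: Length covers the 2-octet Length, the 1-octet Type,
    # and everything that follows, including nested attributes.
    return struct.pack("!HB", 3 + len(body), type_code) + body


def join(nested: bytes) -> bytes:
    return _attr(JOIN, b"\x00" + nested)          # one Reserved octet


def prune(nested: bytes) -> bytes:
    return _attr(PRUNE, b"\x00" + nested)         # one Reserved octet


def group(encoded_prefix: bytes, nested: bytes = b"") -> bytes:
    return _attr(GROUP, encoded_prefix + nested)  # prefix follows Type


def source(encoded_prefix: bytes, nested: bytes = b"") -> bytes:
    return _attr(SOURCE, encoded_prefix + nested)


def fwdr_pref(preference: int, nested: bytes) -> bytes:
    # Reserved octet, then the 32-bit Preference Value.
    return _attr(FWDR_PREF, b"\x00" + struct.pack("!I", preference) + nested)


def poison_reverse(pr_bit: bool, nested: bytes) -> bytes:
    # The P bit is the last bit of the 8-bit Reserved field.
    return _attr(POISON_REVERSE, bytes([1 if pr_bit else 0]) + nested)
```

With these helpers, a (*,G-prefix) Join would be built as
join(group(prefix)), where prefix holds the already encoded group
prefix.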
5.4. Encoding examples
Below are enumerated examples of how various updates are built using
nested attributes, where A ( B ) denotes that attribute B is nested
within attribute A.
(*,G-prefix) Join: JOIN ( GROUP )
(*,G-prefix) Prune: PRUNE ( GROUP )
(S,G) Join towards S: GROUP ( JOIN ( SOURCE ) )
(S,G) Join cancelling prune towards root of G: GROUP ( JOIN ( SOURCE ) )
(S,G) Prune towards S: GROUP ( PRUNE ( SOURCE ) )
(S,G) Prune towards root of G: GROUP ( PRUNE ( SOURCE ) )
Switch from (*,G) to (S,G): PRUNE ( GROUP ( JOIN ( SOURCE ) ) )
Switch from (S,G) to (*,G): JOIN ( GROUP )
Initial (*,G) Join with S pruned: JOIN ( GROUP ( PRUNE ( SOURCE ) ) )
Forwarder preference announcement for G-prefix: FWDR_PREF ( GROUP )
Forwarder preference announcement for S-prefix: FWDR_PREF ( SOURCE )
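The nesting restrictions stated for each attribute can be checked
mechanically against the examples above. The following Python sketch
is illustrative only: it handles a single chain written in the
A ( B ( C ) ) notation, and the names FORBIDDEN, parse, and nesting_ok
are hypothetical.

```python
# Attributes that may NOT be immediately nested within each parent,
# taken from the "Nested Attributes" rules for each attribute.
FORBIDDEN = {
    "JOIN": {"JOIN", "PRUNE", "FWDR_PREF"},
    "PRUNE": {"JOIN", "PRUNE", "FWDR_PREF"},
    "GROUP": {"GROUP", "SOURCE", "FWDR_PREF"},
    "SOURCE": {"GROUP", "SOURCE", "FWDR_PREF"},
    "FWDR_PREF": {"JOIN", "PRUNE", "FWDR_PREF"},
}
ALL = set(FORBIDDEN) | {"POISON_REVERSE"}
# POISON_REVERSE admits only SOURCE as an immediate child.
FORBIDDEN["POISON_REVERSE"] = ALL - {"SOURCE"}


def parse(text: str) -> list:
    """Turn 'A ( B ( C ) )' into the chain ['A', 'B', 'C']."""
    return text.replace("(", " ").replace(")", " ").split()


def nesting_ok(text: str) -> bool:
    """True if every parent/child pair in the chain is permitted."""
    chain = parse(text)
    if any(name not in ALL for name in chain):
        return False
    return all(child not in FORBIDDEN[parent]
               for parent, child in zip(chain, chain[1:]))
```

Each encoding listed in this section passes the check, while a chain
such as JOIN ( PRUNE ) or GROUP ( SOURCE ) is rejected.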
5.5. KEEPALIVE Message Format
BGMP does not use any transport protocol-based keep-alive mechanism
to determine if peers are reachable. Instead, KEEPALIVE messages are
exchanged between peers often enough as not to cause the Hold Timer
to expire. A reasonable maximum time between the last KEEPALIVE or
UPDATE message sent, and the time at which a KEEPALIVE message is
sent, would be one third of the Hold Time interval. KEEPALIVE
messages MUST NOT be sent more frequently than one per second. An
implementation MAY adjust the rate at which it sends KEEPALIVE
messages as a function of the Hold Time interval.
If the negotiated Hold Time interval is zero, then periodic KEEPALIVE
messages MUST NOT be sent.
A KEEPALIVE message consists of only a message header, and has a
length of 4 octets.
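The timing guidance above can be summarized in a few lines of Python.
This is a sketch of one reasonable policy, not a required algorithm;
the function name keepalive_interval is hypothetical.

```python
from typing import Optional


def keepalive_interval(hold_time: int) -> Optional[float]:
    """Suggested seconds between KEEPALIVEs, or None if none are sent."""
    if hold_time == 0:
        return None                 # periodic KEEPALIVEs MUST NOT be sent
    # One third of the Hold Time, but never more often than once a second.
    return max(1.0, hold_time / 3)
```

For example, a negotiated Hold Time of 90 seconds gives an interval
of 30 seconds.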
5.6. NOTIFICATION Message Format
A NOTIFICATION message is sent when an error condition is detected.
The BGMP connection is closed immediately after sending it if the
error is a fatal one.
In addition to the fixed-size BGMP header, the NOTIFICATION message
contains the following fields:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|O| Error code | Error subcode | Data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
O-bit:
Open-bit. If clear, the connection will be closed. If set,
indicates the error is not fatal.
Error Code:
This 1-octet unsigned integer indicates the type of NOTIFICATION.
The following Error Codes have been defined:
Error Code   Symbolic Name                Reference
1            Message Header Error         Section 6.1
2            OPEN Message Error           Section 6.2
3            UPDATE Message Error         Section 6.3
4            Hold Timer Expired           Section 6.5
5            Finite State Machine Error   Section 6.6
6            Cease                        Section 6.7
Error subcode:
This 1-octet unsigned integer provides more specific information
about the nature of the reported error. Each Error Code may have
one or more Error Subcodes associated with it. If no appropriate
Error Subcode is defined, then a zero (Unspecific) value is used
for the Error Subcode field. The notation (MC) below indicates
the error is a fatal one and the O-bit must be clear. Non-fatal
subcodes SHOULD be sent with the O-bit set.
Message Header Error subcodes:
2 - Bad Message Length (MC)
3 - Bad Message Type (MC)
OPEN Message Error subcodes:
1 - Unsupported Version (MC)
4 - Unsupported Optional Parameter
5 - Authentication Failure (MC)
6 - Unacceptable Hold Time (MC)
7 - Unsupported Capability (MC)
UPDATE Message Error subcodes:
1 - Malformed Attribute List (MC)
2 - Unrecognized Attribute Type
5 - Attribute Length Error (MC)
10 - Invalid Address
11 - Invalid Mask
13 - Unrecognized Address Family
Data:
This variable-length field is used to diagnose the reason for the
NOTIFICATION. The contents of the Data field depend upon the
Error Code and Error Subcode. See Section 6 below for more
details.
Note that the length of the Data field can be determined from the
message Length field by the formula:
Message Length = 6 + Data Length
The minimum length of the NOTIFICATION message is 6 octets
(including message header).
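The fields that follow the BGMP header in a NOTIFICATION message can
be packed as in the Python sketch below. It is illustrative only. It
assumes the O-bit is the most significant bit of the first octet with
the Error Code in the remaining seven bits, as the diagram suggests;
the function names are hypothetical.

```python
import struct

HEADER_LEN = 4  # a KEEPALIVE is header-only and is 4 octets long


def encode_notification_body(code: int, subcode: int = 0,
                             data: bytes = b"", fatal: bool = True) -> bytes:
    # Assumption: O-bit = top bit of octet 0, Error Code = low 7 bits.
    if not 0 <= code < 128 or not 0 <= subcode < 256:
        raise ValueError("code or subcode out of range")
    first = code | (0 if fatal else 0x80)   # O-bit clear means fatal
    return struct.pack("!BB", first, subcode) + data


def data_length(message_length: int) -> int:
    # Message Length = 6 + Data Length
    if message_length < 6:
        raise ValueError("NOTIFICATION shorter than 6 octets")
    return message_length - 6
```

A Bad Message Type report (Error Code 1, subcode 3) carrying one
octet of Data would therefore have a message Length of 7.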
6. BGMP Error Handling
This section describes actions to be taken when errors are detected
while processing BGMP messages. BGMP Error Handling is similar to
that of BGP [BGP].
When any of the conditions described here are detected, a
NOTIFICATION message with the indicated Error Code, Error Subcode,
and Data fields is sent, and the BGMP connection is closed if the
error is a fatal one. If no Error Subcode is specified, then a zero
must be used.
The phrase "the BGMP connection is closed" means that the transport
protocol connection has been closed and that all resources for that
BGMP connection have been deallocated. The remote peer is removed
from the target list of all tree state entries.
Unless specified explicitly, the Data field of the NOTIFICATION
message that is sent to indicate an error is empty.
6.1. Message Header error handling
All errors detected while processing the Message Header are indicated
by sending the NOTIFICATION message with Error Code Message Header
Error. The Error Subcode elaborates on the specific nature of the
error.
If the Length field of the message header is less than 4 or greater
than 4096, or if the Length field of an OPEN message is less than
the minimum length of the OPEN message, or if the Length field of an
UPDATE message is less than the minimum length of the UPDATE message,
or if the Length field of a KEEPALIVE message is not equal to 4, then
the Error Subcode is set to Bad Message Length. The Data field
contains the erroneous Length field.
If the Type field of the message header is not recognized, then the
Error Subcode is set to Bad Message Type. The Data field contains
the erroneous Type field.
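The header checks above might be implemented along the following
lines. This Python sketch is illustrative: message types are
represented by hypothetical string names, the order of the checks is
a choice made here, and the per-type minimum lengths are passed in by
the caller because they are defined elsewhere in this document.

```python
MAX_LEN = 4096
HEADER_LEN = 4
BAD_MESSAGE_LENGTH = 2   # Message Header Error subcodes
BAD_MESSAGE_TYPE = 3


def check_header(length, msg_type, min_lengths):
    """Return None if acceptable, else (subcode, offending_value).

    min_lengths maps each recognized type name to its minimum length.
    """
    if length < HEADER_LEN or length > MAX_LEN:
        return (BAD_MESSAGE_LENGTH, length)
    if msg_type not in min_lengths:
        return (BAD_MESSAGE_TYPE, msg_type)
    if msg_type == "KEEPALIVE":
        # A KEEPALIVE must be exactly one header long.
        if length != HEADER_LEN:
            return (BAD_MESSAGE_LENGTH, length)
    elif length < min_lengths[msg_type]:
        return (BAD_MESSAGE_LENGTH, length)
    return None
```

The offending value returned alongside the subcode is what would be
placed in the Data field of the NOTIFICATION.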
6.2. OPEN message error handling
All errors detected while processing the OPEN message are indicated
by sending the NOTIFICATION message with Error Code OPEN Message
Error. The Error Subcode elaborates on the specific nature of the
error.
If the version number contained in the Version field of the received
OPEN message is not supported, then the Error Subcode is set to
Unsupported Version Number. The Data field is a 2-octet unsigned
integer, which indicates the largest locally supported version number
less than the version the remote BGMP peer bid (as indicated in the
received OPEN message).
If the Hold Time field of the OPEN message is unacceptable, then the
Error Subcode MUST be set to Unacceptable Hold Time. An
implementation MUST reject Hold Time values of one or two seconds.
An implementation MAY reject any proposed Hold Time. An
implementation which accepts a Hold Time MUST use the negotiated
value for the Hold Time.
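The Hold Time rules above reduce to a small predicate. In this Python
sketch the local_policy callback is a hypothetical hook standing in
for whatever additional limits an implementation chooses to apply.

```python
def hold_time_acceptable(proposed: int,
                         local_policy=lambda seconds: True) -> bool:
    """False means: report OPEN Message Error, Unacceptable Hold Time."""
    if proposed in (1, 2):
        return False                      # MUST be rejected
    return bool(local_policy(proposed))   # MAY reject any other value
```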
If one of the Optional Parameters in the OPEN message is not
recognized, then the Error Subcode is set to Unsupported Optional
Parameter.
If the OPEN message carries Authentication Information (as an
Optional Parameter), then the corresponding authentication procedure
is invoked. If the authentication procedure (based on Authentication
Code and Authentication Data) fails, then the Error Subcode is set to
Authentication Failure.
If the OPEN message indicates that the peer does not support a
capability which the receiver requires, the receiver may send a
NOTIFICATION message to the peer, and terminate peering. The Error
Subcode in the message is set to Unsupported Capability. The Data
field in the NOTIFICATION message lists the set of capabilities that
cause the speaker to send the message. Each such capability is
encoded the same way as it was encoded in the received OPEN message.
6.3. UPDATE message error handling
All errors detected while processing the UPDATE message are indicated
by sending the NOTIFICATION message with Error Code UPDATE Message
Error. The error subcode elaborates on the specific nature of the
error.
If any recognized attribute has Attribute Length that conflicts with
the expected length (based on the attribute type code), then the
Error Subcode is set to Attribute Length Error. The Data field
contains the erroneous attribute (type, length and value).
If the Encoded-Address-Prefix field in some attribute is
syntactically incorrect, then the Error Subcode is set to Invalid
Prefix Field.
If any other error is encountered when processing attributes (such as
invalid nestings), then the Error Subcode is set to Malformed
Attribute List, and the problematic attribute is included in the Data
field.
6.4. NOTIFICATION message error handling
If a peer sends a NOTIFICATION message, and there is an error in that
message, there is unfortunately no means of reporting this error via
a subsequent NOTIFICATION message. Any such error, such as an
unrecognized Error Code or Error Subcode, should be noticed, logged
locally, and brought to the attention of the administration of the
peer. The means to do this, however, lies outside the scope of this
document.
6.5. Hold Timer Expired error handling
If a system does not receive successive KEEPALIVE and/or UPDATE
and/or NOTIFICATION messages within the period specified in the Hold
Time field of the OPEN message, then the NOTIFICATION message with
Hold Timer Expired Error Code must be sent and the BGMP connection
closed.
6.6. Finite State Machine error handling
Any error detected by the BGMP Finite State Machine (e.g., receipt of
an unexpected event) is indicated by sending the NOTIFICATION message
with Error Code Finite State Machine Error.
6.7. Cease
In the absence of any fatal errors (that are indicated in this section),
a BGMP peer may choose at any given time to close its BGMP connection
by sending the NOTIFICATION message with Error Code Cease. However,
the Cease NOTIFICATION message must not be used when a fatal error
indicated by this section does exist.
6.8. Connection collision detection
If a pair of BGMP speakers try simultaneously to establish a TCP
connection to each other, then two parallel connections between this
pair of speakers might well be formed. We refer to this situation as
connection collision. Clearly, one of these connections must be
closed.
Based on the value of the BGMP Identifier a convention is established
for detecting which BGMP connection is to be preserved when a
collision does occur. The convention is to compare the BGMP
Identifiers of the peers involved in the collision and to retain only
the connection initiated by the BGMP speaker with the higher-valued
BGMP Identifier.
Upon receipt of an OPEN message, the local system must examine all of
its connections that are in the OpenConfirm state. A BGMP speaker
may also examine connections in an OpenSent state if it knows the
BGMP Identifier of the peer by means outside of the protocol. If
among these connections there is a connection to a remote BGMP
speaker whose BGMP Identifier equals the one in the OPEN message,
then the local system performs the following collision resolution
procedure:
1. The BGMP Identifier of the local system is compared to the BGMP
Identifier of the remote system (as specified in the OPEN
message).
2. If the value of the local BGMP Identifier is less than the remote
one, the local system closes the BGMP connection that already
exists (the one that is already in the OpenConfirm state), and
accepts the BGMP connection initiated by the remote system.
3. Otherwise, the local system closes the newly created BGMP
connection (the one associated with the newly received OPEN
message), and continues to use the existing one (the one that is
already in the OpenConfirm state).
Comparing BGMP Identifiers is done by treating them as (4-octet long)
unsigned integers.
A connection collision with an existing BGMP connection that is in
the Established state causes unconditional closing of the newly
created connection. Note that a connection collision cannot be
detected with connections that are in the Idle, Connect, or Active
states.
Closing the BGMP connection (that results from the collision
resolution procedure) is accomplished by sending the NOTIFICATION
message with the Error Code Cease.
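Steps 1 to 3 of the collision resolution procedure can be sketched in
Python as follows. Writing the 4-octet BGMP Identifier in dotted-quad
form and the name connection_to_close are assumptions made for this
illustration.

```python
import ipaddress


def connection_to_close(local_id: str, remote_id: str) -> str:
    """Return 'existing' or 'new' per steps 1-3 of the procedure.

    'existing' is the connection already in OpenConfirm; 'new' is the
    one on which the OPEN message just arrived.
    """
    # Identifiers are compared as 4-octet unsigned integers.
    local = int(ipaddress.IPv4Address(local_id))
    remote = int(ipaddress.IPv4Address(remote_id))
    return "existing" if local < remote else "new"
```

Converting to integers matters: a textual comparison would order
"9.0.0.1" after "10.0.0.1", whereas the numeric comparison required
here orders it before.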
7. BGMP Version Negotiation
BGMP speakers may negotiate the version of the protocol by making
multiple attempts to open a BGMP connection, starting with the
highest version number each supports. If an open attempt fails with
an Error Code OPEN Message Error, and an Error Subcode Unsupported
Version Number, then the BGMP speaker has available the version
number it tried, the version number its peer tried, the version
number passed by its peer in the NOTIFICATION message, and the
version numbers that it supports. If the two peers do support one or
more common versions, then this will allow them to rapidly determine
the highest common version. In order to support BGMP version
negotiation, future versions of BGMP must retain the format of the
OPEN and NOTIFICATION messages.
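One way a speaker might choose its next bid after an Unsupported
Version Number error is sketched below in Python. This document does
not prescribe a selection algorithm; the function name and the
strategy shown are assumptions made for illustration.

```python
from typing import Iterable, Optional


def next_version_to_try(supported: Iterable[int], tried: int,
                        peer_hint: Optional[int] = None) -> Optional[int]:
    """Pick the next version to bid after an Unsupported Version error.

    peer_hint is the version carried in the NOTIFICATION Data field,
    if one was present.
    """
    # Never bid the failed version again, and never exceed the hint.
    ceiling = tried - 1 if peer_hint is None else min(tried - 1, peer_hint)
    candidates = [v for v in supported if v <= ceiling]
    return max(candidates) if candidates else None
```

A result of None means the two speakers share no version that this
strategy can find, and the peering attempt should be abandoned.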
7.1. BGMP Capability Negotiation
When a BGMP speaker sends an OPEN message to its BGMP peer, the
message may include an Optional Parameter, called Capabilities. The
parameter lists the capabilities supported by the speaker.
A BGMP speaker may use a particular capability when peering with
another speaker only if both speakers support that capability. A
BGMP speaker determines the capabilities supported by its peer by
examining the list of capabilities present in the Capabilities
Optional Parameter carried by the OPEN message that the speaker
receives from the peer.
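Treating each speaker's capabilities as a set makes the rule above a
simple intersection. The Python sketch below is illustrative; how a
capability is identified (here, any hashable value) and both function
names are assumptions.

```python
def usable_capabilities(local: set, peer: set) -> set:
    # A capability may be used only if both speakers support it.
    return local & peer


def missing_required(required: set, peer: set) -> set:
    # Non-empty result: the receiver may send Unsupported Capability,
    # listing these capabilities in the Data field (see Section 6.2).
    return required - peer
```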
8. BGMP Finite State Machine
This section specifies BGMP operation in terms of a Finite State
Machine (FSM). Following is a brief summary and overview of BGMP
operations by state as determined by this FSM.
Initially BGMP is in the Idle state.
Idle state:
In this state BGMP refuses all incoming BGMP connections. No
resources are allocated to the peer. In response to the Start
event (initiated by either system or operator) the local system
initializes all BGMP resources, starts the ConnectRetry timer,
initiates a transport connection to the other BGMP peer, while
listening for a connection that may be initiated by the remote
BGMP peer, and changes its state to Connect. The exact value of
the ConnectRetry timer is a local matter, but should be
sufficiently large to allow TCP initialization.
If a BGMP speaker detects an error, it shuts down the connection
and changes its state to Idle. Getting out of the Idle state
requires generation of the Start event. If such an event is
generated automatically, then persistent BGMP errors may result in
persistent flapping of the speaker. To avoid such a condition it
is recommended that Start events should not be generated
immediately for a peer that was previously transitioned to Idle
due to an error. For a peer that was previously transitioned to
Idle due to an error, the time between consecutive generation of
Start events, if such events are generated automatically, shall
exponentially increase. The value of the initial timer shall be
60 seconds. The time shall be doubled for each consecutive retry.
Any other event received in the Idle state is ignored.
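The retry schedule for automatically generated Start events can be
written as a one-line formula. In this Python sketch the function
name and the way retries are counted are assumptions; note that the
text above gives no upper bound on the delay.

```python
def start_delay(consecutive_errors: int, initial: int = 60) -> int:
    """Seconds to wait before the next automatic Start event.

    consecutive_errors counts retries since the peer last went to Idle
    because of an error (1 for the first retry).
    """
    if consecutive_errors < 1:
        return 0                # no preceding error, no enforced delay
    return initial * 2 ** (consecutive_errors - 1)
```

Successive retries thus wait 60, 120, 240, 480 seconds, and so on.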
Connect state:
In this state BGMP is waiting for the transport protocol
connection to be completed.
If the transport protocol connection succeeds, the local system
clears the ConnectRetry timer, completes initialization, sends an
OPEN message to its peer, and changes its state to OpenSent. If
the transport protocol connect fails (e.g., retransmission
timeout), the local system restarts the ConnectRetry timer,
continues to listen for a connection that may be initiated by the
remote BGMP peer, and changes its state to Active state.
In response to the ConnectRetry timer expired event, the local
system restarts the ConnectRetry timer, initiates a transport
connection to the other BGMP peer, continues to listen for a
connection that may be initiated by the remote BGMP peer, and
stays in the Connect state.
The Start event is ignored in the Connect state.
In response to any other event (initiated by either system or
operator), the local system releases all BGMP resources associated
with this connection and changes its state to Idle.
Active state:
In this state BGMP is trying to acquire a peer by listening for an
incoming transport protocol connection.
If the transport protocol connection succeeds, the local system
clears the ConnectRetry timer, completes initialization, sends an
OPEN message to its peer, sets its Hold Timer to a large value,
and changes its state to OpenSent. A Hold Timer value of 4
minutes is suggested.
In response to the ConnectRetry timer expired event, the local
system restarts the ConnectRetry timer, initiates a transport
connection to the other BGMP peer, continues to listen for a
connection that may be initiated by the remote BGMP peer, and
changes its state to Connect.
If the local system detects that a remote peer is trying to
establish a BGMP connection to it, and the IP address of the remote
peer is not an expected one, the local system restarts the
ConnectRetry timer, rejects the attempted connection, continues to
listen for a connection that may be initiated by the remote BGMP
peer, and stays in the Active state.
The Start event is ignored in the Active state.
In response to any other event (initiated by either system or
operator), the local system releases all BGMP resources associated
with this connection and changes its state to Idle.
OpenSent state:
In this state BGMP waits for an OPEN message from its peer. When
an OPEN message is received, all fields are checked for
correctness. If the BGMP message header checking or OPEN message
checking detects an error (see Section 6.2), or a connection
collision (see Section 6.8) the local system sends a NOTIFICATION
message and changes its state to Idle.
If there are no errors in the OPEN message, BGMP sends a KEEPALIVE
message and sets a KeepAlive timer. The Hold Timer, which was
originally set to a large value (see above), is replaced with the
negotiated Hold Time value (see section 4.2). If the negotiated
Hold Time value is zero, then the Hold Timer and KeepAlive timer
are not started. If the configured remote Autonomous
System value for this peering is the same as the local Autonomous
System number, then the connection is an "internal" connection;
otherwise, it is "external". Finally, the state is changed to
OpenConfirm.
If a disconnect notification is received from the underlying
transport protocol, the local system closes the BGMP connection,
restarts the ConnectRetry timer, continues to listen for a
connection that may be initiated by the remote BGMP peer, and goes
into the Active state.
If the Hold Timer expires, the local system sends a NOTIFICATION
message with Error Code Hold Timer Expired and changes its state
to Idle.
In response to the Stop event (initiated by either system or
operator), the local system sends a NOTIFICATION message with
Error Code Cease and changes its state to Idle.
The Start event is ignored in the OpenSent state.
In response to any other event, the local system sends a
NOTIFICATION message with Error Code Finite State Machine Error
and changes its state to Idle.
Whenever BGMP changes its state from OpenSent to Idle, it closes
the BGMP (and transport-level) connection and releases all
resources associated with that connection.
OpenConfirm state:
In this state BGMP waits for a KEEPALIVE or NOTIFICATION message.
If the local system receives a KEEPALIVE message, it changes its
state to Established.
If the Hold Timer expires before a KEEPALIVE message is received,
the local system sends a NOTIFICATION message with Error Code
Hold Timer Expired and changes its state to Idle.
If the local system receives a NOTIFICATION message, it changes
its state to Idle.
If the KeepAlive timer expires, the local system sends a KEEPALIVE
message and restarts its KeepAlive timer.
If a disconnect notification is received from the underlying
transport protocol, the local system changes its state to Idle.
In response to the Stop event (initiated by either system or
operator), the local system sends a NOTIFICATION message with
Error Code Cease and changes its state to Idle.
The Start event is ignored in the OpenConfirm state.
In response to any other event, the local system sends a
NOTIFICATION message with Error Code Finite State Machine Error
and changes its state to Idle.
Whenever BGMP changes its state from OpenConfirm to Idle, it
closes the BGMP (and transport-level) connection and releases all
resources associated with that connection.
Established state:
In the Established state BGMP can exchange UPDATE, NOTIFICATION,
and KEEPALIVE messages with its peer.
If the local system receives an UPDATE or KEEPALIVE message, it
restarts its Hold Timer, if the negotiated Hold Time value is
non-zero.
If the local system receives a NOTIFICATION message, it changes
its state to Idle.
If the local system receives an UPDATE message and the UPDATE
message error handling procedure (see Section 6.3) detects an
error, the local system sends a NOTIFICATION message and changes
its state to Idle.
If a disconnect notification is received from the underlying
transport protocol, the local system changes its state to Idle.
If the Hold Timer expires, the local system sends a NOTIFICATION
message with Error Code Hold Timer Expired and changes its state
to Idle.
If the KeepAlive timer expires, the local system sends a KEEPALIVE
message and restarts its KeepAlive timer.
Each time the local system sends a KEEPALIVE or UPDATE message, it
restarts its KeepAlive timer, unless the negotiated Hold Time
value is zero.
In response to the Stop event (initiated by either system or
operator), the local system sends a NOTIFICATION message with
Error Code Cease and changes its state to Idle.
The Start event is ignored in the Established state.
In response to any other event, the local system sends a
NOTIFICATION message with Error Code Finite State Machine Error
and changes its state to Idle.
Whenever BGMP changes its state from Established to Idle, it
closes the BGMP (and transport-level) connection, releases all
resources associated with that connection, and deletes all routes
derived from that connection.
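The state changes described in this section can be collected into a
transition table. The Python sketch below records only the resulting
state, not the associated actions (sending messages, starting timers,
releasing resources). The event names are hypothetical labels chosen
for this illustration.

```python
IDLE, CONNECT, ACTIVE = "Idle", "Connect", "Active"
OPEN_SENT, OPEN_CONFIRM, ESTABLISHED = "OpenSent", "OpenConfirm", "Established"

# (state, event) -> next state, for transitions that do not lead to Idle.
TRANSITIONS = {
    (IDLE, "Start"): CONNECT,
    (CONNECT, "TransportOpen"): OPEN_SENT,
    (CONNECT, "TransportFailed"): ACTIVE,
    (CONNECT, "ConnectRetryExpired"): CONNECT,
    (ACTIVE, "TransportOpen"): OPEN_SENT,
    (ACTIVE, "ConnectRetryExpired"): CONNECT,
    (ACTIVE, "UnexpectedPeer"): ACTIVE,
    (OPEN_SENT, "OpenValid"): OPEN_CONFIRM,
    (OPEN_SENT, "TransportClosed"): ACTIVE,
    (OPEN_CONFIRM, "KeepaliveReceived"): ESTABLISHED,
    (OPEN_CONFIRM, "KeepAliveTimerExpired"): OPEN_CONFIRM,
    (ESTABLISHED, "KeepaliveReceived"): ESTABLISHED,
    (ESTABLISHED, "UpdateValid"): ESTABLISHED,
    (ESTABLISHED, "KeepAliveTimerExpired"): ESTABLISHED,
}


def next_state(state: str, event: str) -> str:
    if event == "Start" and state != IDLE:
        return state            # Start is ignored outside Idle
    if state == IDLE and event != "Start":
        return IDLE             # other events are ignored in Idle
    # Every transition not listed above leads back to Idle.
    return TRANSITIONS.get((state, event), IDLE)
```

Feeding the events Start, TransportOpen, OpenValid, and
KeepaliveReceived to next_state in turn, starting from Idle, walks a
connection through Connect, OpenSent, and OpenConfirm to Established.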
9. Security Considerations
If a BGMP speaker accepts unauthorized or altered BGMP messages,
denial of service due to excess bandwidth consumption or lack of
multicast connectivity can result. Authentication of BGMP messages
can protect against this behavior.
A BGMP implementation MUST implement Keyed MD5 [RFC2385] to secure
control messages, and MUST be capable of interoperating with peers
that do not support it. However, if one side of the connection is
configured with Keyed MD5 and the other side is not, the connection
SHOULD NOT be established.
This provides a weak security mechanism, as it is still possible for
denial of service to occur as a result of messages relayed through a
trusted peer. However, this model is the same as the currently
practiced security mechanism for BGP. It is anticipated that future
work will provide different, stronger mechanisms for dealing with
these issues in routing protocols.
10. Acknowledgements
In addition to the editor, the following individuals have contributed
to the design of BGMP: Cengiz Alaettinoglu, Tony Ballardie, Steve
Casner, Steve Deering, Deborah Estrin, Dino Farinacci, Bill Fenner,
Mark Handley, Ahmed Helmy, Van Jacobson, Dave Meyer, and Satish
Kumar.
This document is the product of the IETF BGMP Working Group with Dave
Thaler as editor.
Rusty Eddy, Isidor Kouvelas, and Pavlin Radoslavov also provided
valuable feedback on this document.
11. References
11.1. Normative References
[INTEROP] Thaler, D., "Interoperability Rules for Multicast Routing
Protocols", RFC 2715, October 1999.
[RFC2385] Heffernan, A., "Protection of BGP sessions via the TCP MD5
Signature Option", RFC 2385, August 1998.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[V6PREFIX] Haberman, B. and D. Thaler, "Unicast-Prefix-based IPv6
Multicast Addresses", RFC 3306, August 2002.
11.2. Informative References
[BGP] Rekhter, Y. and T. Li, "A Border Gateway Protocol 4
(BGP-4)", RFC 1771, March 1995.
[MBGP] Bates, T., Rekhter, Y., Chandra, R., and D. Katz,
"Multiprotocol Extensions for BGP-4", RFC 2858, June 2000.
[CBT] Ballardie, A., "Core Based Trees (CBT version 2) Multicast
Routing -- Protocol Specification", RFC 2189, September
1997.
[DVMRP] Pusateri, T., "Distance Vector Multicast Routing
Protocol", Work in Progress, October 2003.
[IPv6AA] Hinden, R. and S. Deering, "Internet Protocol Version 6
(IPv6) Addressing Architecture", RFC 3513, April 2003.
[MOSPF] Moy, J., "Multicast Extensions to OSPF", RFC 1584, March
1994.
[PIMDM] Adams, A., Nicholas, J. and W. Siadak, "Protocol
Independent Multicast - Dense Mode (PIM-DM): Protocol
Specification (Revised)", Work in Progress, September
2003.
[PIMSM] Estrin, D., Farinacci, D., Helmy, A., Thaler, D., Deering,
S., Handley, M., Jacobson, V., Liu, C., Sharma, P., and L.
Wei, "Protocol Independent Multicast-Sparse Mode (PIM-SM):
Protocol Specification", RFC 2362, June 1998.
[REFLECT] Bates, T. and R. Chandra, "BGP Route Reflection: An
alternative to full mesh IBGP", RFC 1966, June 1996.
[V4PREFIX] Thaler, D., "Unicast-Prefix-based IPv4 Multicast
Addresses", Work in Progress, August 2004.
Authors' Address
Dave Thaler
Microsoft
One Microsoft Way
Redmond, WA 98052
EMail: dthaler@microsoft.com
Full Copyright Statement
Copyright (C) The Internet Society (2004).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the IETF's procedures with respect to rights in IETF Documents can
be found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Acknowledgement
Funding for the RFC Editor function is currently provided by the
Internet Society.