1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
1253
1254
1255
1256
1257
1258
1259
1260
1261
1262
1263
1264
1265
1266
1267
1268
1269
1270
1271
1272
1273
1274
1275
1276
1277
1278
1279
1280
1281
1282
1283
1284
1285
1286
1287
1288
1289
1290
1291
1292
1293
1294
1295
1296
1297
1298
1299
1300
1301
1302
1303
1304
1305
1306
1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
1319
1320
1321
1322
1323
1324
1325
1326
1327
1328
1329
1330
1331
1332
1333
1334
1335
1336
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391
1392
1393
1394
1395
1396
1397
1398
1399
1400
1401
1402
1403
1404
1405
1406
1407
1408
1409
1410
1411
1412
1413
1414
1415
1416
1417
1418
1419
1420
1421
1422
1423
1424
1425
1426
1427
1428
1429
1430
1431
1432
1433
1434
1435
1436
1437
1438
1439
1440
1441
1442
1443
1444
1445
1446
1447
1448
1449
1450
1451
1452
1453
1454
1455
1456
1457
1458
1459
1460
1461
1462
1463
1464
1465
1466
1467
1468
1469
1470
1471
1472
1473
1474
1475
1476
1477
1478
1479
1480
1481
1482
1483
1484
1485
1486
1487
1488
1489
1490
1491
1492
1493
1494
1495
1496
1497
1498
1499
1500
1501
1502
1503
1504
1505
1506
1507
1508
1509
1510
1511
1512
1513
1514
1515
1516
1517
1518
1519
1520
1521
1522
1523
1524
1525
1526
1527
1528
1529
1530
1531
1532
1533
1534
1535
1536
1537
1538
1539
1540
1541
1542
1543
1544
1545
1546
1547
1548
1549
1550
1551
1552
1553
1554
1555
1556
1557
1558
1559
1560
1561
1562
1563
1564
1565
1566
1567
1568
1569
1570
1571
1572
1573
1574
1575
1576
1577
1578
1579
1580
1581
1582
1583
1584
1585
1586
1587
1588
1589
1590
1591
1592
1593
1594
1595
1596
1597
1598
1599
1600
1601
1602
1603
1604
1605
1606
1607
1608
1609
1610
1611
1612
1613
1614
1615
1616
1617
1618
1619
1620
1621
1622
1623
1624
1625
1626
1627
1628
1629
1630
1631
1632
1633
1634
1635
1636
1637
1638
1639
1640
1641
1642
1643
1644
1645
1646
1647
1648
1649
1650
1651
1652
1653
1654
1655
1656
1657
1658
1659
1660
1661
1662
1663
1664
1665
1666
1667
1668
1669
1670
1671
1672
1673
1674
1675
1676
1677
1678
1679
1680
1681
1682
1683
1684
1685
1686
1687
1688
1689
1690
1691
1692
1693
1694
1695
1696
1697
1698
1699
1700
1701
1702
1703
1704
1705
1706
1707
1708
1709
1710
1711
1712
1713
1714
1715
1716
1717
1718
1719
1720
1721
1722
1723
1724
1725
1726
1727
1728
1729
1730
1731
1732
1733
1734
1735
1736
1737
1738
1739
1740
1741
1742
1743
1744
1745
1746
1747
1748
1749
1750
1751
1752
1753
1754
1755
1756
1757
1758
1759
1760
1761
1762
1763
1764
1765
1766
1767
1768
1769
1770
1771
1772
1773
1774
1775
1776
1777
1778
1779
1780
1781
1782
1783
1784
1785
1786
1787
1788
1789
1790
1791
1792
1793
1794
1795
1796
1797
1798
1799
1800
1801
1802
1803
1804
1805
1806
1807
1808
1809
1810
1811
1812
1813
1814
1815
1816
1817
1818
1819
1820
1821
1822
1823
1824
1825
1826
1827
1828
1829
1830
1831
1832
1833
1834
1835
1836
1837
1838
1839
1840
1841
1842
1843
1844
1845
1846
1847
1848
1849
1850
1851
1852
1853
1854
1855
1856
1857
1858
1859
1860
1861
1862
1863
1864
1865
1866
1867
1868
1869
1870
1871
1872
1873
1874
1875
1876
1877
1878
1879
1880
1881
1882
1883
1884
1885
1886
1887
1888
1889
1890
1891
1892
1893
1894
1895
1896
1897
1898
1899
1900
1901
1902
1903
1904
1905
1906
1907
1908
1909
1910
1911
1912
1913
1914
1915
1916
1917
1918
1919
1920
1921
1922
1923
1924
1925
1926
1927
1928
1929
1930
1931
1932
1933
1934
1935
1936
1937
1938
1939
1940
1941
1942
1943
1944
1945
1946
1947
1948
1949
1950
1951
1952
1953
1954
1955
1956
1957
1958
1959
1960
1961
1962
1963
1964
1965
1966
1967
1968
1969
1970
1971
1972
1973
1974
1975
1976
1977
1978
1979
1980
1981
1982
1983
1984
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
2026
2027
2028
2029
2030
2031
2032
2033
2034
2035
2036
2037
2038
2039
2040
2041
2042
2043
2044
2045
2046
2047
2048
2049
|
Internet Research Task Force (IRTF) F. Gont
Request for Comments: 9415 SI6 Networks
Category: Informational I. Arce
ISSN: 2070-1721 Quarkslab
July 2023
On the Generation of Transient Numeric Identifiers
Abstract
This document performs an analysis of the security and privacy
implications of different types of "transient numeric identifiers"
used in IETF protocols and tries to categorize them based on their
interoperability requirements and their associated failure severity
when such requirements are not met. Subsequently, it provides advice
on possible algorithms that could be employed to satisfy the
interoperability requirements of each identifier category while
minimizing the negative security and privacy implications, thus
providing guidance to protocol designers and protocol implementers.
Finally, it describes a number of algorithms that have been employed
in real implementations to generate transient numeric identifiers and
analyzes their security and privacy properties. This document is a
product of the Privacy Enhancements and Assessments Research Group
(PEARG) in the IRTF.
Status of This Memo
This document is not an Internet Standards Track specification; it is
published for informational purposes.
This document is a product of the Internet Research Task Force
(IRTF). The IRTF publishes the results of Internet-related research
and development activities. These results might not be suitable for
deployment. This RFC represents the consensus of the Privacy
Enhancements and Assessments Research Group of the Internet Research
Task Force (IRTF). Documents approved for publication by the IRSG
are not candidates for any level of Internet Standard; see Section 2
of RFC 7841.
Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
https://www.rfc-editor.org/info/rfc9415.
Copyright Notice
Copyright (c) 2023 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.
Table of Contents
1. Introduction
2. Terminology
3. Threat Model
4. Issues with the Specification of Transient Numeric Identifiers
5. Protocol Failure Severity
6. Categorizing Transient Numeric Identifiers
7. Common Algorithms for Transient Numeric Identifier Generation
7.1. Category #1: Uniqueness (Soft Failure)
7.2. Category #2: Uniqueness (Hard Failure)
7.3. Category #3: Uniqueness, Stable within Context (Soft
Failure)
7.4. Category #4: Uniqueness, Monotonically Increasing within
Context (Hard Failure)
8. Common Vulnerabilities Associated with Transient Numeric
Identifiers
8.1. Network Activity Correlation
8.2. Information Leakage
8.3. Fingerprinting
8.4. Exploitation of the Semantics of Transient Numeric
Identifiers
8.5. Exploitation of Collisions of Transient Numeric Identifiers
8.6. Exploitation of Predictable Transient Numeric Identifiers
for Injection Attacks
8.7. Cryptanalysis
9. Vulnerability Assessment of Transient Numeric Identifiers
9.1. Category #1: Uniqueness (Soft Failure)
9.2. Category #2: Uniqueness (Hard Failure)
9.3. Category #3: Uniqueness, Stable within Context (Soft
Failure)
9.4. Category #4: Uniqueness, Monotonically Increasing within
Context (Hard Failure)
10. IANA Considerations
11. Security Considerations
12. References
12.1. Normative References
12.2. Informative References
Appendix A. Algorithms and Techniques with Known Issues
A.1. Predictable Linear Identifiers Algorithm
A.2. Random-Increments Algorithm
A.3. Reusing Identifiers Across Different Contexts
Acknowledgements
Authors' Addresses
1. Introduction
Networking protocols employ a variety of transient numeric
identifiers for different protocol objects, such as IPv4 and IPv6
Identification values [RFC0791] [RFC8200], IPv6 Interface Identifiers
(IIDs) [RFC4291], transport-protocol ephemeral port numbers
[RFC6056], TCP Initial Sequence Numbers (ISNs) [RFC9293], NTP
Reference IDs (REFIDs) [RFC5905], and DNS IDs [RFC1035]. These
identifiers typically have specific requirements (e.g., uniqueness
during a specified period of time) that must be satisfied such that
they do not result in negative interoperability implications and an
associated failure severity when such requirements are not met.
| NOTE: Some documents refer to the DNS ID as the DNS "Query ID"
| or "TxID".
For more than 30 years, a large number of implementations of IETF
protocols have been subject to a variety of attacks, with effects
ranging from Denial of Service (DoS) or data injection to information
leakages that could be exploited for pervasive monitoring [RFC7258].
The root cause of these issues has been, in many cases, the poor
selection of transient numeric identifiers in such protocols, usually
as a result of insufficient or misleading specifications. While it
is generally trivial to identify an algorithm that can satisfy the
interoperability requirements of a given transient numeric
identifier, empirical evidence exists that doing so without
negatively affecting the security and/or privacy properties of the
aforementioned protocols is prone to error [RFC9414].
For example, implementations have been subject to security and/or
privacy issues resulting from:
* predictable IPv4 or IPv6 Identification values (e.g., see
[Sanfilippo1998a], [RFC6274], and [RFC7739]),
* predictable IPv6 IIDs (e.g., see [RFC7217], [RFC7707], and
[RFC7721]),
* predictable transport-protocol ephemeral port numbers (e.g., see
[RFC6056] and [Silbersack2005]),
* predictable TCP Initial Sequence Numbers (ISNs) (e.g., see
[Morris1985], [Bellovin1989], and [RFC6528]),
* predictable initial timestamps in TCP timestamps options (e.g.,
see [TCPT-uptime] and [RFC7323]), and
* predictable DNS IDs (see, e.g., [Schuba1993] and [Klein2007]).
Recent history indicates that, when new protocols are standardized or
new protocol implementations are produced, the security and privacy
properties of the associated transient numeric identifiers tend to be
overlooked, and inappropriate algorithms to generate such identifiers
are either suggested in the specifications or selected by
implementers. As a result, advice in this area is warranted.
We note that the use of cryptographic techniques may readily mitigate
some of the issues arising from predictable transient numeric
identifiers. For example, cryptographic authentication can readily
mitigate data injection attacks even in the presence of predictable
transient numeric identifiers (such as "sequence numbers"). However,
use of flawed algorithms (such as global counters) for generating
transient numeric identifiers could still result in information
leakages even when cryptographic techniques are employed.
This document contains a non-exhaustive survey of transient numeric
identifiers employed in various IETF protocols and aims to categorize
such identifiers based on their interoperability requirements and the
associated failure severity when such requirements are not met.
Subsequently, it provides advice on possible algorithms that could be
employed to satisfy the interoperability requirements of each
category while minimizing negative security and privacy implications.
Finally, it analyzes several algorithms that have been employed in
real implementations to meet such requirements and analyzes their
security and privacy properties.
This document represents the consensus of the Privacy Enhancements
and Assessments Research Group (PEARG).
2. Terminology
Transient Numeric Identifier:
A data object in a protocol specification that can be used to
definitely distinguish a protocol object (a datagram, network
interface, transport-protocol endpoint, session, etc.) from all
other objects of the same type, in a given context. Transient
numeric identifiers are usually defined as a series of bits and
represented using integer values. These identifiers are typically
dynamically selected, as opposed to statically assigned numeric
identifiers (see, e.g., [IANA-PROT]). We note that different
transient numeric identifiers may have additional requirements or
properties depending on their specific use in a protocol. We use
the term "transient numeric identifier" (or simply "numeric
identifier" or "identifier" as short forms) as a generic term to
refer to any data object in a protocol specification that
satisfies the identification property stated above.
Failure Severity:
The interoperability consequences of a failure to comply with the
interoperability requirements of a given identifier. Severity
considers the worst potential consequence of a failure, determined
by the system damage and/or time lost to repair the failure. In
this document, we define two types of failure severity: "soft
failure" and "hard failure".
Soft Failure:
A recoverable condition in which a protocol does not operate in
the prescribed manner but normal operation can be resumed
automatically in a short period of time. For example, a simple
packet-loss event that is subsequently recovered with a packet
retransmission can be considered a soft failure.
Hard Failure:
A non-recoverable condition in which a protocol does not operate
in the prescribed manner or it operates with excessive degradation
of service. For example, an established TCP connection that is
aborted due to an error condition constitutes, from the point of
view of the transport protocol, a hard failure, since it enters a
state from which normal operation cannot be resumed.
3. Threat Model
Throughout this document, we do not consider on-path attacks. That
is, we assume the attacker does not have physical or logical access
to the system(s) being attacked and that the attacker can only
observe traffic explicitly directed to the attacker. Similarly, an
attacker cannot observe traffic transferred between the sender and
the receiver(s) of a target protocol but may be able to interact with
any of these entities, including by, e.g., sending any traffic to
them to sample transient numeric identifiers employed by the target
hosts when communicating with the attacker.
For example, when analyzing vulnerabilities associated with TCP
Initial Sequence Numbers (ISNs), we consider the attacker is unable
to capture network traffic corresponding to a TCP connection between
two other hosts. However, we consider the attacker is able to
communicate with any of these hosts (e.g., establish a TCP connection
with any of them) to, e.g., sample the TCP ISNs employed by these
hosts when communicating with the attacker.
Similarly, when considering host-tracking attacks based on IPv6
Interface Identifiers, we consider an attacker may learn the IPv6
address employed by a victim host if, e.g., the address becomes
exposed as a result of the victim host communicating with an
attacker-operated server. Subsequently, an attacker may perform
host-tracking by probing a set of target addresses composed by a set
of target prefixes and the IPv6 Interface Identifier originally
learned by the attacker. Alternatively, an attacker may perform
host-tracking if, e.g., the victim host communicates with an
attacker-operated server as it moves from one location to another,
thereby exposing its configured addresses. We note that none of
these scenarios require the attacker observe traffic not explicitly
directed to the attacker.
4. Issues with the Specification of Transient Numeric Identifiers
While assessing IETF protocol specifications regarding the use of
transient numeric identifiers, we have found that most of the issues
discussed in this document arise as a result of one of the following
conditions:
* protocol specifications that under specify their transient numeric
identifiers
* protocol specifications that over specify their transient numeric
identifiers
* protocol implementations that simply fail to comply with the
specified requirements
A number of IETF protocol specifications under specified their
transient numeric identifiers, thus leading to implementations that
were vulnerable to numerous off-path attacks. Examples of them are
the specification of TCP local ports in [RFC0793] or the
specification of the DNS ID in [RFC1035].
| NOTE: The TCP local port in an active OPEN request is commonly
| known as the "ephemeral port" of the corresponding TCP
| connection [RFC6056].
On the other hand, there are a number of IETF protocol specifications
that over specify some of their associated transient numeric
identifiers. For example, [RFC4291] essentially overloads the
semantics of IPv6 Interface Identifiers (IIDs) by embedding link-
layer addresses in the IPv6 IIDs when the interoperability
requirement of uniqueness could be achieved in other ways that do not
result in negative security and privacy implications [RFC7721].
Similarly, [RFC2460] suggests the use of a global counter for the
generation of Identification values when the interoperability
requirement of uniqueness per {IPv6 Source Address, IPv6 Destination
Address} could be achieved with other algorithms that do not result
in negative security and privacy implications [RFC7739].
Finally, there are protocol implementations that simply fail to
comply with existing protocol specifications. For example, some
popular operating systems still fail to implement transport-protocol
ephemeral port randomization, as recommended in [RFC6056], or TCP
Initial Sequence Number randomization, as recommended in [RFC9293].
5. Protocol Failure Severity
Section 2 defines the concept of "failure severity", along with two
types of failure severities that we employ throughout this document:
soft and hard.
Our analysis of the severity of a failure is performed from the point
of view of the protocol in question. However, the corresponding
severity on the upper protocol (or application) might not be the same
as that of the protocol in question. For example, a TCP connection
that is aborted might or might not result in a hard failure of the
upper application, i.e., if the upper application can establish a new
TCP connection without any impact on the application, a hard failure
at the TCP protocol may have no severity at the application layer.
On the other hand, if a hard failure of a TCP connection results in
excessive degradation of service at the application layer, it will
also result in a hard failure at the application.
6. Categorizing Transient Numeric Identifiers
This section includes a non-exhaustive survey of transient numeric
identifiers, which are representative of all the possible
combinations of interoperability requirements and failure severities
found in popular protocols of different layers. Additionally, it
proposes a number of categories that can accommodate these
identifiers based on their interoperability requirements and their
associated failure severity (soft or hard).
| NOTE: All other transient numeric identifiers that were
| analyzed as part of this effort could be accommodated into one
| of the existing categories from Table 1.
+===============+===============================+==================+
| Identifier | Interoperability Requirements | Failure Severity |
+===============+===============================+==================+
| IPv6 ID | Uniqueness (for IPv6 address | Soft/Hard (1) |
| | pair) | |
+---------------+-------------------------------+------------------+
| IPv6 IID | Uniqueness (and stable within | Soft (3) |
| | IPv6 prefix) (2) | |
+---------------+-------------------------------+------------------+
| TCP ISN | Monotonically increasing (4) | Hard (4) |
+---------------+-------------------------------+------------------+
| TCP initial | Monotonically increasing (5) | Hard (5) |
| timestamp | | |
+---------------+-------------------------------+------------------+
| TCP ephemeral | Uniqueness (for connection | Hard |
| port | ID) | |
+---------------+-------------------------------+------------------+
| IPv6 Flow | Uniqueness | None (6) |
| Label | | |
+---------------+-------------------------------+------------------+
| DNS ID | Uniqueness | None (7) |
+---------------+-------------------------------+------------------+
Table 1: Survey of Transient Numeric Identifiers
NOTE:
(1) While a single collision of IPv6 Identification (ID) values
would simply lead to a single packet drop (and hence, a "soft"
failure), repeated collisions at high data rates might result in
self-propagating collisions of IPv6 IDs, thus possibly leading
to a hard failure [RFC4963].
(2) While the interoperability requirements are simply that the
Interface Identifier results in a unique IPv6 address, for
operational reasons, it is typically desirable that the
resulting IPv6 address (and hence, the corresponding Interface
Identifier) be stable within each network [RFC7217] [RFC8064].
(3) While IPv6 Interface Identifiers must result in unique IPv6
addresses, IPv6 Duplicate Address Detection (DAD) [RFC4862]
allows for the detection of duplicate addresses, and hence, such
Interface Identifier collisions can be recovered.
(4) In theory, there are no interoperability requirements for TCP
Initial Sequence Numbers (ISNs), since the TIME-WAIT state and
TCP's "quiet time" concept take care of old segments from
previous incarnations of a connection. However, a widespread
optimization allows for a new incarnation of a previous
connection to be created if the ISN of the incoming SYN is
larger than the last sequence number seen in that direction for
the previous incarnation of the connection. Thus, monotonically
increasing TCP ISNs allow for such optimization to work as
expected [RFC6528] and can help avoid connection-establishment
failures.
(5) Strictly speaking, there are no interoperability requirements
for the *initial* TCP timestamp employed by a TCP instance
(i.e., the TS Value (TSval) in a segment with the SYN bit set).
However, some TCP implementations allow a new incarnation of a
previous connection to be created if the TSval of the incoming
SYN is larger than the last TSval seen in that direction for the
previous incarnation of the connection (please see [RFC6191]).
Thus, monotonically increasing TCP initial timestamps (across
connections to the same endpoint) allow for such optimization to
work as expected [RFC6191] and can help avoid connection-
establishment failures.
(6) The IPv6 Flow Label [RFC6437], along with the IPv6 Source
Address and the IPv6 Destination Address, is typically employed
for load sharing [RFC7098]. Reuse of a Flow Label value for the
same set {Source Address, Destination Address} would typically
cause both flows to be multiplexed onto the same link. However,
as long as this does not occur deterministically, it will not
result in any negative implications.
(7) DNS IDs are employed, together with the IP Source Address, the
IP Destination Address, the transport-protocol Source Port, and
the transport-protocol Destination Port, to match DNS requests
and responses. However, since an implementation knows which DNS
requests were sent for that set of {IP Source Address, IP
Destination Address, transport-protocol Source Port, transport-
protocol Destination Port, DNS ID}, a collision of DNS IDs would
result, if anything, in a small performance penalty (the
response would nevertheless be discarded when it is found that
it does not answer the query sent in the corresponding DNS
query).
Based on the survey above, we can categorize identifiers as follows:
+=======+======================================+====================+
| Cat # | Category | Sample Numeric |
| | | IDs |
+=======+======================================+====================+
| 1 | Uniqueness (soft failure) | IPv6 Flow L., DNS |
| | | ID |
+-------+--------------------------------------+--------------------+
| 2 | Uniqueness (hard failure) | IPv6 ID, TCP |
| | | ephemeral port |
+-------+--------------------------------------+--------------------+
| 3 | Uniqueness, stable within context | IPv6 IID |
| | (soft failure) | |
+-------+--------------------------------------+--------------------+
| 4 | Uniqueness, monotonically increasing | TCP ISN, TCP |
| | within context (hard failure) | initial timestamp |
+-------+--------------------------------------+--------------------+
Table 2: Identifier Categories
We note that Category #4 could be considered a generalized case of
Category #3, in which a monotonically increasing element is added to
a stable (within context) element, such that the resulting
identifiers are monotonically increasing within a specified context.
That is, the same algorithm could be employed for both #3 and #4,
given appropriate parameters.
7. Common Algorithms for Transient Numeric Identifier Generation
The following subsections describe some sample algorithms that can be
employed for generating transient numeric identifiers for each of the
categories above while mitigating the vulnerabilities analyzed in
Section 8 of this document.
All of the variables employed in the algorithms of the following
subsections are of "unsigned integer" type, except for the "retry"
variable, which is of (signed) "integer" type.
7.1. Category #1: Uniqueness (Soft Failure)
The requirement of uniqueness with a soft failure severity can be
complied with a Pseudorandom Number Generator (PRNG).
| NOTE: Please see [RFC4086] regarding randomness requirements
| for security.
While most systems provide access to a PRNG, many of such PRNG
implementations are not cryptographically secure and therefore might
be statistically biased or subject to adversarial influence. For
example, ISO C [C11] rand(3) implementations are not
cryptographically secure.
| NOTE: Section 7.1 ("Uniform Deviates") of [Press1992] discusses
| the underlying issues affecting ISO C [C11] rand(3)
| implementations.
On the other hand, a number of systems provide an interface to a
Cryptographically Secure PRNG (CSPRNG) [RFC4086] [RFC8937], which
guarantees high entropy, unpredictability, and good statistical
distribution of the random values generated. For example, GNU/
Linux's CSPRNG implementation is available via the getentropy(3)
interface [GETENTROPY], while OpenBSD's CSPRNG implementation is
available via the arc4random(3) and arc4random_uniform(3) interfaces
[ARC4RANDOM]. Where available, these CSPRNGs should be preferred
over, e.g., POSIX [POSIX] random(3) or ISO C [C11] rand(3)
implementations.
In scenarios where a CSPRNG is not readily available to select
transient numeric identifiers of Category #1, a security and privacy
assessment of employing a regular PRNG should be performed,
supporting the implementation decision.
| NOTE: [Aumasson2018], [Press1992], and [Knuth1983] discuss
| theoretical and practical aspects of pseudorandom number
| generation and provide guidance on how to evaluate PRNGs.
We note that, since the premise is that collisions of transient
numeric identifiers of this category only lead to soft failures, in
many cases, the algorithm might not need to check the suitability of
a selected identifier (i.e., the suitable_id() function, described
below, could always return "true").
In scenarios where, e.g., simultaneous use of a given numeric
identifier is undesirable and an implementation detects such
condition, the implementation may opt to select the next available
identifier in the same sequence or select another random number.
Section 7.1.1 is an implementation of the former strategy, while
Section 7.1.2 is an implementation of the latter. Typically, the
algorithm in Section 7.1.2 results in a more uniform distribution of
the generated transient numeric identifiers. However, for transient
numeric identifiers where an implementation typically keeps local
state about unsuitable/used identifiers, the algorithm in
Section 7.1.2 may require many more iterations than the algorithm in
Section 7.1.1 to generate a suitable transient numeric identifier.
This will usually be affected by the current usage ratio of transient
numeric identifiers (i.e., the number of numeric identifiers
considered suitable / total number of numeric identifiers) and other
parameters. Therefore, in such cases, many implementations tend to
prefer the algorithm in Section 7.1.1 over the algorithm in
Section 7.1.2.
7.1.1. Simple Randomization Algorithm
/* Transient Numeric ID selection function */
id_range = max_id - min_id + 1;
next_id = min_id + (random() % id_range);
retry = id_range;
do {
if (suitable_id(next_id)) {
return next_id;
}
if (next_id == max_id) {
next_id = min_id;
} else {
next_id++;
}
retry--;
} while (retry > 0);
return ERROR;
NOTE:
random() is a PRNG that returns a pseudorandom unsigned integer
number of appropriate size. Beware that "adapting" the length of
the output of random() with a modulo operator (e.g., C language's
"%") may change the distribution of the PRNG. To preserve a
uniform distribution, the rejection sampling technique
[Romailler2020] can be used.
suitable_id() is a function that checks, if possible and
desirable, whether a candidate numeric identifier is suitable
(e.g., whether it is in use or has been recently employed).
Depending on how/where the numeric identifier is used, it may or
may not be possible (or even desirable) to check whether the
numeric identifier is suitable.
All the variables (in this algorithm and all the others algorithms
discussed in this document) are unsigned integers.
When an identifier is found to be unsuitable, this algorithm selects
the next available numeric identifier in sequence. Thus, even when
this algorithm selects numeric identifiers randomly, it is biased
towards the first available numeric identifier after a sequence of
unavailable numeric identifiers. For example, if this algorithm is
employed for transport-protocol ephemeral port randomization
[RFC6056] and the local list of unsuitable port numbers (e.g.,
registered port numbers that should not be used for ephemeral ports)
is significant, an attacker may actually have a significantly better
chance of guessing an ephemeral port number.
Assuming the randomness requirements for the PRNG are met (see
[RFC4086]), this algorithm does not suffer from any of the issues
discussed in Section 8.
7.1.2. Another Simple Randomization Algorithm
The following pseudocode illustrates another algorithm for selecting
a random transient numeric identifier where, in the event a selected
identifier is found to be unsuitable (e.g., already in use), another
identifier is randomly selected:
/* Transient Numeric ID selection function */
id_range = max_id - min_id + 1;
retry = id_range;
do {
next_id = min_id + (random() % id_range);
if (suitable_id(next_id)) {
return next_id;
}
retry--;
} while (retry > 0);
return ERROR;
NOTE:
random() is a PRNG that returns a pseudorandom unsigned integer
number of appropriate size. Beware that "adapting" the length of
the output of random() with a modulo operator (e.g., C language's
"%") may change the distribution of the PRNG. To preserve a
uniform distribution, the rejection sampling technique
[Romailler2020] can be used.
suitable_id() is a function that checks, if possible and
desirable, whether a candidate numeric identifier is suitable
(e.g., if it is not already in use). Depending on how/where the
numeric identifier is used, it may or may not be possible (or even
desirable) to check whether the numeric identifier is in use (or
whether it has been recently employed).
When an identifier is found to be unsuitable, this algorithm selects
another random numeric identifier. Thus, this algorithm might be
unable to select a transient numeric identifier (i.e., return
"ERROR"), even if there are suitable identifiers available, in cases
where a large number of identifiers are found to be unsuitable (e.g.,
"in use").
Assuming the randomness requirements for the PRNG are met (see
[RFC4086]), this algorithm does not suffer from any of the issues
discussed in Section 8.
7.2. Category #2: Uniqueness (Hard Failure)
One of the most trivial approaches for generating a unique transient
numeric identifier (with a hard failure severity) is to reduce the
identifier reuse frequency by generating the numeric identifiers with
a monotonically increasing function (e.g., linear). As a result, any
of the algorithms described in Section 7.4 ("Category #4: Uniqueness,
Monotonically Increasing within Context (Hard Failure)") can be
readily employed for complying with the requirements of this
transient numeric identifier category.
In cases where suitability (e.g., uniqueness) of the selected
identifiers can be definitely assessed by the local system, any of
the algorithms described in Section 7.1 ("Category #1: Uniqueness
(Soft Failure)") can be readily employed for complying with the
requirements of this numeric identifier category.
| NOTE: In the case of, e.g., TCP ephemeral ports or TCP ISNs, a
| transient numeric identifier that might seem suitable from the
| perspective of the local system might actually be unsuitable
| from the perspective of the remote system (e.g., because there
| is state associated with the selected identifier at the remote
| system). Therefore, in such cases, it is not possible to
| employ the algorithms from Section 7.1 ("Category #1:
| Uniqueness (Soft Failure)").
7.3. Category #3: Uniqueness, Stable within Context (Soft Failure)
The goal of the following algorithm is to produce identifiers that
are stable for a given context (identified by "CONTEXT") but that
change when the aforementioned context changes.
In order to avoid storing the transient numeric identifiers computed
for each CONTEXT in memory, the following algorithm employs a
calculated technique (as opposed to keeping state in memory) to
generate a stable transient numeric identifier for each given
context.
/* Transient Numeric ID selection function */
id_range = max_id - min_id + 1;
retry = 0;
do {
offset = F(CONTEXT, retry, secret_key);
next_id = min_id + (offset % id_range);
if (suitable_id(next_id)) {
return next_id;
}
retry++;
} while (retry <= MAX_RETRIES);
return ERROR;
NOTE:
CONTEXT is the concatenation of all the elements that define a
given context.
F() is a pseudorandom function (PRF). It must not be computable
from the outside (without knowledge of the secret key). F() must
also be difficult to reverse, such that it resists attempts to
obtain the secret key, even when given samples of the output of
F() and knowledge or control of the other input parameters. F()
should produce an output of at least as many bits as required for
the transient numeric identifier. SipHash-2-4 (128-bit key,
64-bit output) [SipHash] and BLAKE3 (256-bit key, arbitrary-length
output) [BLAKE3] are two possible options for F(). Alternatively,
F() could be implemented with a keyed hash message authentication
code (HMAC) [RFC2104]. HMAC-SHA-256 [FIPS-SHS] would be one
possible option for such implementation alternative. Note: Use of
HMAC-MD5 [RFC1321] or HMAC-SHA1 [FIPS-SHS] are not recommended for
F() [RFC6151] [RFC6194]. The result of F() is no more secure than
the secret key, and therefore, "secret_key" must be unknown to the
attacker and must be of a reasonable length. "secret_key" must
remain stable for a given CONTEXT, since otherwise, the numeric
identifiers generated by this algorithm would not have the desired
stability properties (i.e., stable for a given CONTEXT). In most
cases, "secret_key" should be selected with a PRNG (see [RFC4086]
for recommendations on choosing secrets) at an appropriate time
and stored in stable or volatile storage (as necessary) for future
use.
suitable_id() checks whether a candidate numeric identifier has
suitable uniqueness properties.
In this algorithm, the function F() provides a stateless and stable
per-CONTEXT offset, where CONTEXT is the concatenation of all the
elements that define the given context.
For example, if this algorithm is expected to produce IPv6 IIDs that
are unique per network interface and Stateless Address
Autoconfiguration (SLAAC) prefix, CONTEXT should be the concatenation
of, e.g., the network interface index and the SLAAC autoconfiguration
prefix (please see [RFC7217] for an implementation of this algorithm
for generation of stable IPv6 addresses).
The result of F() is stored in the variable "offset", which may take
any value within the storage type range, since we are restricting the
resulting identifier to be in the range [min_id, max_id] in a similar
way as in the algorithm described in Section 7.1.1.
As noted above, suitable_id() checks whether a candidate numeric
identifier has suitable uniqueness properties. Collisions (i.e., an
identifier that is not unique) are recovered by incrementing the
"retry" variable and recomputing F(), up to a maximum of MAX_RETRIES
times. However, recovering from collisions will usually result in
identifiers that fail to remain constant for the specified context.
This is normally acceptable when the probability of collisions is
small, as in the case of, e.g., IPv6 IIDs resulting from SLAAC
[RFC7217] [RFC8981].
For obvious reasons, the transient numeric identifiers generated with
this algorithm allow for network activity correlation and
fingerprinting within "CONTEXT". However, this is essentially a
design goal of this category of transient numeric identifiers.
7.4. Category #4: Uniqueness, Monotonically Increasing within Context
(Hard Failure)
7.4.1. Per-Context Counter Algorithm
One possible way of selecting unique monotonically increasing
identifiers (per context) is to employ a per-context counter. Such
an algorithm could be described as follows:
/* Transient Numeric ID selection function */
id_range = max_id - min_id + 1;
retry = id_range;
id_inc = increment() % id_range;
if( (next_id = lookup_counter(CONTEXT)) == ERROR){
next_id = min_id + random() % id_range;
}
do {
if ( (max_id - next_id) >= id_inc){
next_id = next_id + id_inc;
}
else {
next_id = min_id + id_inc - (max_id - next_id);
}
if (suitable_id(next_id)){
store_counter(CONTEXT, next_id);
return next_id;
}
retry = retry - id_inc;
} while (retry > 0);
return ERROR;
NOTE:
CONTEXT is the concatenation of all the elements that define a
given context.
increment() returns a small integer that is employed to increment
the current counter value to obtain the next transient numeric
identifier. This value must be larger than or equal to 1, and
much smaller than the number of possible values for the numeric
identifiers (i.e., "id_range"). Most implementations of this
algorithm employ a constant increment of 1. Using a value other
than 1 can help mitigate some information leakages (please see
below) at the expense of a possible increase in the numeric
identifier reuse frequency. The code above makes sure that the
increment employed in the algorithm (id_inc) is always smaller
than the number of possible values for the numeric identifiers
(i.e., "max_id - min_d + 1"). However, as noted above, this value
must also be much smaller than the number of possible values for
the numeric identifiers.
lookup_counter() is a function that returns the current counter
for a given context or an error condition if that counter does not
exist.
random() is a PRNG that returns a pseudorandom unsigned integer
number of appropriate size. Beware that "adapting" the length of
the output of random() with a modulo operator (e.g., C language's
"%") may change the distribution of the PRNG. To preserve a
uniform distribution, the rejection sampling technique
[Romailler2020] can be used.
store_counter() is a function that saves a counter value for a
given context.
suitable_id() checks whether a candidate numeric identifier has
suitable uniqueness properties.
Essentially, whenever a new identifier is to be selected, the
algorithm checks whether a counter for the corresponding context
exists. If it does, the value of such counter is incremented to
obtain the new transient numeric identifier, and the counter is
updated. If no counter exists for such context, a new counter is
created and initialized to a random value and used as the selected
transient numeric identifier. This algorithm produces a per-context
counter, which results in one monotonically increasing function for
each context. Since each counter is initialized to a random value,
the resulting values are unpredictable by an off-path attacker.
The choice of id_inc has implications on both the security and
privacy properties of the resulting identifiers and also on the
corresponding interoperability properties. On one hand, minimizing
the increments generally minimizes the identifier reuse frequency,
albeit at increased predictability. On the other hand, if the
increments are randomized, predictability of the resulting
identifiers is reduced, and the information leakage produced by
global constant increments is mitigated. However, using larger
increments than necessary can result in higher numeric identifier
reuse frequency.
This algorithm has the following drawbacks:
* It requires an implementation to store each per-context counter in
memory. If, as a result of resource management, the counter for a
given context must be removed, the last transient numeric
identifier value used for that context will be lost. Thus, if an
identifier subsequently needs to be generated for the same
context, the corresponding counter will need to be recreated and
reinitialized to a random value, thus possibly leading to reuse/
collision of numeric identifiers.
* Keeping one counter for each possible "context" may in some cases
be considered too onerous in terms of memory requirements.
Otherwise, the identifiers produced by this algorithm do not suffer
from the other issues discussed in Section 8.
7.4.2. Simple PRF-Based Algorithm
The goal of this algorithm is to produce monotonically increasing
transient numeric identifiers (for each given context) with a
randomized initial value. For example, if the identifiers being
generated must be monotonically increasing for each {Source Address,
Destination Address} set, then each possible combination of {Source
Address, Destination Address} should have a separate monotonically
increasing sequence that starts at a different random value.
Instead of maintaining a per-context counter (as in the algorithm
from Section 7.4.1), the following algorithm employs a calculated
technique to maintain a random offset for each possible context.
/* Initialization code */
counter = 0;
/* Transient Numeric ID selection function */
id_range = max_id - min_id + 1;
id_inc = increment() % id_range;
offset = F(CONTEXT, secret_key);
retry = id_range;
do {
next_id = min_id + (offset + counter) % id_range;
counter = counter + id_inc;
if (suitable_id(next_id)) {
return next_id;
}
retry = retry - id_inc;
} while (retry > 0);
return ERROR;
NOTE:
CONTEXT is the concatenation of all the elements that define a
given context. For example, if this algorithm is expected to
produce identifiers that are monotonically increasing for each set
{Source Address, Destination Address}, CONTEXT should be the
concatenation of Source Address and Destination Address.
increment() has the same properties and requirements as those
specified for increment() in Section 7.4.1.
F() is a PRF, with the same properties as those specified for F()
in Section 7.3.
suitable_id() checks whether a candidate numeric identifier has
suitable uniqueness properties.
In the algorithm above, the function F() provides a stateless,
stable, and unpredictable offset for each given context (as
identified by "CONTEXT"). Both the "offset" and "counter" variables
may take any value within the storage type range since we are
restricting the resulting identifier to be in the range [min_id,
max_id] in a similar way as in the algorithm described in
Section 7.1.1. This allows us to simply increment the "counter"
variable and rely on the unsigned integer to wrap around.
The result of F() is no more secure than the secret key, and
therefore, "secret_key" must be unknown to the attacker and must be
of a reasonable length. "secret_key" must remain stable for a given
CONTEXT, since otherwise, the numeric identifiers generated by this
algorithm would not have the desired properties (i.e., monotonically
increasing for a given CONTEXT). In most cases, "secret_key" should
be selected with a PRNG (see [RFC4086] for recommendations on
choosing secrets) at an appropriate time and stored in stable or
volatile storage (as necessary) for future use.
It should be noted that, since this algorithm uses a global counter
("counter") for selecting identifiers (i.e., all counters share the
same increment space), this algorithm results in an information
leakage (as described in Section 8.2). For example, if this
algorithm was used for selecting TCP ephemeral ports and an attacker
could force a client to periodically establish a new TCP connection
to an attacker-controlled system (or through an attacker-observable
routing path), the attacker could subtract consecutive Source Port
values to obtain the number of outgoing TCP connections established
globally by the victim host within that time period (up to wrap-
around issues and five-tuple collisions, of course). This
information leakage could be partially mitigated by employing small
random values for the increments (i.e., increment() function),
instead of having increment() return the constant "1".
We nevertheless note that an improved mitigation of this information
leakage could be more successfully achieved by employing the
algorithm from Section 7.4.3, instead.
7.4.3. Double-PRF Algorithm
A trade-off between maintaining a single global "counter" variable
and maintaining 2**N "counter" variables (where N is the width of the
result of F()) could be achieved as follows. The system would keep
an array of TABLE_LENGTH values, which would provide a separation of
the increment space into multiple buckets. This improvement could be
incorporated into the algorithm from Section 7.4.2 as follows:
/* Initialization code */
for(i = 0; i < TABLE_LENGTH; i++) {
table[i] = random();
}
/* Transient Numeric ID selection function */
id_range = max_id - min_id + 1;
id_inc = increment() % id_range;
offset = F(CONTEXT, secret_key1);
index = G(CONTEXT, secret_key2) % TABLE_LENGTH;
retry = id_range;
do {
next_id = min_id + (offset + table[index]) % id_range;
table[index] = table[index] + id_inc;
if (suitable_id(next_id)) {
return next_id;
}
retry = retry - id_inc;
} while (retry > 0);
return ERROR;
NOTE:
increment() has the same properties and requirements as those
specified for increment() in Section 7.4.1.
Both F() and G() are PRFs, with the same properties as those
required for F() in Section 7.3. The results of F() and G() are
no more secure than their respective secret keys ("secret_key1"
and "secret_key2", respectively), and therefore, both secret keys
must be unknown to the attacker and must be of a reasonable
length. Both secret keys must remain stable for the given
CONTEXT, since otherwise, the transient numeric identifiers
generated by this algorithm would not have the desired properties
(i.e., monotonically increasing for a given CONTEXT). In most
cases, both secret keys should be selected with a PRNG (see
[RFC4086] for recommendations on choosing secrets) at an
appropriate time and stored in stable or volatile storage (as
necessary) for future use.
"table[]" could be initialized with random values, as indicated by
the initialization code in the pseudocode above.
The "table[]" array assures that successive transient numeric
identifiers for a given context will be monotonically increasing.
Since the increment space is separated into TABLE_LENGTH different
spaces, the identifier reuse frequency will be (probabilistically)
lower than that of the algorithm in Section 7.4.2. That is, the
generation of an identifier for one given context will not
necessarily result in increments in the identifier sequence of other
contexts. It is interesting to note that the size of "table[]" does
not limit the number of different identifier sequences but rather
separates the *increment space* into TABLE_LENGTH different spaces.
The selected transient numeric identifier sequence will be obtained
by adding the corresponding entry from "table[]" to the value in the
"offset" variable, which selects the actual identifier sequence space
(as in the algorithm from Section 7.4.2).
An attacker can perform traffic analysis for any "increment space"
(i.e., context) into which the attacker has "visibility" -- namely,
the attacker can force a system to generate identifiers for
G(CONTEXT, secret_key2), where the result of G() identifies the
target "increment space". However, the attacker's ability to perform
traffic analysis is very reduced when compared to the simple PRF-
based identifiers (described in Section 7.4.2) and the predictable
linear identifiers (described in Appendix A.1). Additionally, an
implementation can further limit the attacker's ability to perform
traffic analysis by further separating the increment space (that is,
using a larger value for TABLE_LENGTH) and/or by randomizing the
increments (i.e., increment() returning a small random number as
opposed to the constant "1").
Otherwise, this algorithm does not suffer from the issues discussed
in Section 8.
8. Common Vulnerabilities Associated with Transient Numeric Identifiers
8.1. Network Activity Correlation
An identifier that is predictable within a given context allows for
network activity correlation within that context.
For example, a stable IPv6 Interface Identifier allows for network
activity to be correlated within the context in which the Interface
Identifier is stable [RFC7721]. A stable per-network IPv6 Interface
Identifier (as in [RFC7217]) allows for network activity correlation
within a network, whereas a constant IPv6 Interface Identifier (which
remains constant across networks) allows not only network activity
correlation within the same network but also across networks ("host-
tracking").
Similarly, an implementation that generates TCP ISNs with a global
counter could allow for fingerprinting and network activity
correlation across networks, since an attacker could passively infer
the identity of the victim based on the TCP ISNs employed for
subsequent communication instances. Similarly, an implementation
that generates predictable IPv6 Identification values could be
subject to fingerprinting attacks (see, e.g., [Bellovin2002]).
8.2. Information Leakage
Transient numeric identifiers that result in specific patterns can
produce an information leakage to other communicating entities. For
example, it is common to generate transient numeric identifiers with
an algorithm such as:
ID = offset(CONTEXT) + mono(CONTEXT);
This generic expression generates identifiers by adding a
monotonically increasing function (e.g., linear) to a randomized
offset. offset() is constant within a given context, whereas mono()
produces a monotonically increasing sequence for the given context.
Identifiers generated with this expression will generally be
predictable within CONTEXT.
The predictability of mono(), irrespective of the predictability of
offset(), can leak information that may be of use to attackers. For
example, a node that selects transport-protocol ephemeral port
numbers, as in:
ephemeral_port = offset(IP_Dst_Addr) + mono()
that is, with a per-destination offset but a global mono() function
(e.g., a global counter), will leak information about the total
number of outgoing connections that have been issued by the
vulnerable implementation.
Similarly, a node that generates IPv6 Identification values as in:
ID = offset(IP_Src_Addr, IP_Dst_Addr) + mono()
will leak out information about the total number of fragmented
packets that have been transmitted by the vulnerable implementation.
The vulnerabilities described in [Sanfilippo1998a],
[Sanfilippo1998b], and [Sanfilippo1999] are all associated with the
use of a global mono() function (i.e., with a global and constant
"CONTEXT") -- particularly when it is a linear function (constant
increments of 1).
Predicting transient numeric identifiers can be of help for other
types of attacks. For example, predictable TCP ISNs can open the
door to trivial connection-reset and data injection attacks (see
Section 8.6).
8.3. Fingerprinting
Fingerprinting is the capability of an attacker to identify or
reidentify a visiting user, user agent, or device via configuration
settings or other observable characteristics. Observable protocol
objects and characteristics can be employed to identify/reidentify
various entities. These entities can range from the underlying
hardware or operating system (OS) (vendor, type, and version) to the
user. [EFF] illustrates web-browser-based fingerprinting, but
similar techniques can be applied at other layers and protocols,
whether alternatively or in conjunction with it.
Transient numeric identifiers are one of the observable protocol
components that could be leveraged for fingerprinting purposes. That
is, an attacker could sample transient numeric identifiers to infer
the algorithm (and its associated parameters, if any) for generating
such identifiers, possibly revealing the underlying OS vendor, type,
and version. This information could possibly be further leveraged in
conjunction with other fingerprinting techniques and sources.
Evasion of protocol-stack fingerprinting can prove to be a very
difficult task, i.e., most systems make use of a wide variety of
protocols, each of which have a large number of parameters that can
be set to arbitrary values or generated with a variety of algorithms
with multiple parameters.
| NOTE: General protocol-based fingerprinting is discussed in
| [RFC6973], along with guidelines to mitigate the associated
| vulnerability. [Fyodor1998] and [Fyodor2006] are classic
| references on OS detection via TCP/IP stack fingerprinting.
| Network Mapper [nmap] is probably the most popular tool for
| remote OS identification via active TCP/IP stack
| fingerprinting. p0f [Zalewski2012], on the other hand, is a
| tool for performing remote OS detection via passive TCP/IP
| stack fingerprinting. Finally, [TBIT] is a TCP fingerprinting
| tool that aims at characterizing the behavior of a remote TCP
| peer based on active probes, which has been widely used in the
| research community.
Algorithms that, from the perspective of an observer (e.g., the
legitimate communicating peer), result in specific values or patterns
will allow for at least some level of fingerprinting. For example,
the algorithm from Section 7.3 will typically allow fingerprinting
within the context where the resulting identifiers are stable.
Similarly, the algorithms from Section 7.4 will result in
monotonically increasing sequences within a given context, thus
allowing for at least some level of fingerprinting (when the other
communicating entity can correlate different sampled identifiers as
belonging to the same monotonically increasing sequence).
Thus, where possible, algorithms from Section 7.1 should be preferred
over algorithms that result in specific values or patterns.
8.4. Exploitation of the Semantics of Transient Numeric Identifiers
Identifiers that are not semantically opaque tend to be more
predictable than semantically opaque identifiers. For example, a
Media Access Control (MAC) address contains an Organizationally
Unique Identifier (OUI), which may identify the vendor that
manufactured the corresponding network interface card. This can be
leveraged by an attacker trying to "guess" MAC addresses, who has
some knowledge about the possible Network Interface Card (NIC)
vendor.
[RFC7707] discusses a number of techniques to reduce the search space
when performing IPv6 address-scanning attacks by leveraging the
semantics of IPv6 IIDs.
8.5. Exploitation of Collisions of Transient Numeric Identifiers
In many cases, the collision of transient network identifiers can
have a hard failure severity (or result in a hard failure severity if
an attacker can cause multiple collisions deterministically, one
after another). For example, predictable IP Identification values
open the door to Denial of Service (DoS) attacks (see, e.g.,
[RFC5722].).
8.6. Exploitation of Predictable Transient Numeric Identifiers for
Injection Attacks
Some protocols rely on "sequence numbers" for the validation of
incoming packets. For example, TCP employs sequence numbers for
reassembling TCP segments, while IPv4 and IPv6 employ Identification
values for reassembling IPv4 and IPv6 fragments (respectively).
Lacking built-in cryptographic mechanisms for validating packets,
these protocols are therefore vulnerable to on-path data (see, e.g.,
[Joncheray1995]) and/or control-information (see, e.g., [RFC4953] and
[RFC5927]) injection attacks. The extent to which these protocols
may resist off-path (i.e., "blind") injection attacks depends on
whether the associated "sequence numbers" are predictable and the
effort required to successfully predict a valid "sequence number"
(see, e.g., [RFC4953] and [RFC5927]).
We note that the use of unpredictable "sequence numbers" is a
completely ineffective mitigation for on-path injection attacks and
also a mostly ineffective mitigation for off-path (i.e., "blind")
injection attacks. However, many legacy protocols (such as TCP) do
not incorporate cryptographic mitigations as part of the core
protocol but rather as optional features (see, e.g., [RFC5925]), if
available at all. Additionally, ad hoc use of cryptographic
mitigations might not be sufficient to relieve a protocol
implementation of generating appropriate transient numeric
identifiers. For example, use of the Transport Layer Security (TLS)
protocol [RFC8446] with TCP will protect the application protocol but
will not help to mitigate, e.g., TCP-based connection-reset attacks
(see, e.g., [RFC4953]). Similarly, use of SEcure Neighbor Discovery
(SEND) [RFC3971] will still imply reliance on the successful
reassembly of IPv6 fragments in those cases where SEND packets do not
fit into the link Maximum Transmission Unit (MTU) (see [RFC6980]).
8.7. Cryptanalysis
A number of algorithms discussed in this document (such as those
described in Sections 7.4.2 and 7.4.3) rely on PRFs. Implementations
that employ weak PRFs or keys of inappropriate size can be subject to
cryptanalysis, where an attacker can obtain the secret key employed
for the PRF, predict numeric identifiers, etc.
Furthermore, an implementation that overloads the semantics of the
secret key can result in more trivial cryptanalysis, possibly
resulting in the leakage of the value employed for the secret key.
| NOTE: [IPID-DEV] describes two vulnerable transient numeric
| identifier generators that employ cryptographically weak hash
| functions. Additionally, one of such implementations employs
| 32 bits of a kernel address as the secret key for a hash
| function, and therefore, successful cryptanalysis leaks the
| aforementioned kernel address, allowing for Kernel Address
| Space Layout Randomization (KASLR) [KASLR] bypass.
9. Vulnerability Assessment of Transient Numeric Identifiers
The following subsections analyze possible vulnerabilities associated
with the algorithms described in Section 7.
9.1. Category #1: Uniqueness (Soft Failure)
Possible vulnerabilities associated with the algorithms from
Section 7.1 include the following:
* use of flawed PRNGs (please see, e.g., [Zalewski2001],
[Zalewski2002], [Klein2007], and [CVEs])
* inadvertently affecting the distribution of an otherwise suitable
PRNG (please see, e.g., [Romailler2020])
Where available, CSPRNGs should be preferred over regular PRNGs, such
as, e.g., POSIX random(3) implementations. In scenarios where a
CSPRNG is not readily available, a security and privacy assessment of
employing a regular PRNG should be performed, supporting the
implementation decision.
| NOTE: Please see [RFC4086] regarding randomness requirements
| for security. [Aumasson2018], [Press1992], and [Knuth1983]
| discuss theoretical and practical aspects of random number
| generation and provide guidance on how to evaluate PRNGs.
When employing a PRNG, many implementations "adapt" the length of its
output with a modulo operator (e.g., C language's "%"), possibly
changing the distribution of the output of the PRNG.
For example, consider an implementation that employs the following
code:
id = random() % 50000;
This example implementation means to obtain a transient numeric
identifier in the range 0-49999. If random() produces, e.g., a
pseudorandom number of 16 bits (with uniform distribution), the
selected transient numeric identifier will have a nonuniform
distribution with the numbers in the range 0-15535 having double
frequency than the numbers in the range 15536-49999.
| NOTE: For example, in our sample code, both an output of 10 and
| output of 50010 from the random() function will result in an
| "id" value of 10.
This effect is reduced if the PRNG produces an output that is much
longer than the length implied by the modulo operation. We note that
to preserve a uniform distribution, the rejection sampling technique
[Romailler2020] can be used.
Use of algorithms other than PRNGs for generating identifiers of this
category is discouraged.
9.2. Category #2: Uniqueness (Hard Failure)
As noted in Section 7.2, this category can employ the same algorithms
as Category #4, since a monotonically increasing sequence tends to
minimize the transient numeric identifier reuse frequency.
Therefore, the vulnerability analysis in Section 9.4 also applies to
this category.
Additionally, as noted in Section 7.2, some transient numeric
identifiers of this category might be able to use the algorithms from
Section 7.1, in which case the same considerations as in Section 9.1
would apply.
9.3. Category #3: Uniqueness, Stable within Context (Soft Failure)
Possible vulnerabilities associated with the algorithms from
Section 7.3 are the following:
* Use of weak PRFs or inappropriate secret keys (whether
inappropriate selection or inappropriate size) could allow for
cryptanalysis, which could eventually be exploited by an attacker
to predict future transient numeric identifiers.
* Since the algorithm generates a unique and stable identifier
within a specified context, it may allow for network activity
correlation and fingerprinting within the specified context.
9.4. Category #4: Uniqueness, Monotonically Increasing within Context
(Hard Failure)
The algorithm described in Section 7.4.1 for generating identifiers
of Category #4 will result in an identifiable pattern (i.e., a
monotonically increasing sequence) for the transient numeric
identifiers generated for each CONTEXT, and thus will allow for
fingerprinting and network activity correlation within each CONTEXT.
On the other hand, a simple way to generalize and analyze the
algorithms described in Sections 7.4.2 and 7.4.3 for generating
identifiers of Category #4 is as follows:
/* Transient Numeric ID selection function */
id_range = max_id - min_id + 1;
retry = id_range;
id_inc = increment() % id_range;
do {
update_mono(CONTEXT, id_inc);
next_id = min_id + (offset(CONTEXT) + \
mono(CONTEXT)) % id_range;
if (suitable_id(next_id)) {
return next_id;
}
retry = retry - id_inc;
} while (retry > 0);
return ERROR;
NOTE:
increment() returns a small integer that is employed to generate a
monotonically increasing function. Most implementations employ a
constant value for "increment()" (usually 1). The value returned
by increment() must be much smaller than the value computed for
"id_range".
update_mono(CONTEXT, id_inc) increments the counter corresponding
to CONTEXT by "id_inc".
mono(CONTEXT) reads the counter corresponding to CONTEXT.
Essentially, an identifier (next_id) is generated by adding a
monotonically increasing function (mono()) to an offset value, which
is unknown to the attacker and stable for given context (CONTEXT).
The following aspects of the algorithm should be considered:
* For the most part, it is the offset() function that results in
identifiers that are unpredictable by an off-patch attacker.
While the resulting sequence is known to be monotonically
increasing, the use of a randomized offset value makes the
resulting values unknown to the attacker.
* The most straightforward "stateless" implementation of offset() is
with a PRF that takes the values that identify the context and a
secret key (not shown in the figure above) as arguments.
* One possible implementation of mono() would be to have mono()
internally employ a single counter (as in the algorithm from
Section 7.4.2) or map the increments for different contexts into a
number of counters/buckets, such that the number of counters that
need to be maintained in memory is reduced (as in the "Double-PRF
Algorithm" from Section 7.4.3).
* In all cases, a monotonically increasing function is implemented
by incrementing the previous value of a counter by increment()
units. In the most trivial case, increment() could return the
constant "1". But increment() could also be implemented to return
small random integers such that the increments are unpredictable
(see Appendix A.2 of this document). This represents a trade-off
between the unpredictability of the resulting transient numeric
identifiers and the transient numeric identifier reuse frequency.
Considering the generic algorithm illustrated above, we can identify
the following possible vulnerabilities:
* Since the algorithms for this category are similar to those of
Section 9.3, with the addition of a monotonically increasing
function, all the issues discussed in Section 9.3 ("Category #3:
Uniqueness, Stable within Context (Soft Failure)") also apply to
this case.
* mono() can be correlated to the number of identifiers generated
for a given context (CONTEXT). Thus, if mono() spans more than
the necessary context, the "increments" could be leaked to other
parties, thus disclosing information about the number of
identifiers that have been generated by the algorithm for all
contexts. This information disclosure becomes more evident when
an implementation employs a constant increment of 1. For example,
an implementation where mono() is actually a single global counter
will unnecessarily leak information about the number of
identifiers that have been generated by the algorithm (globally,
for all contexts). [Fyodor2003] describes one example of how such
information leakages can be exploited. We note that limiting the
span of the increment space will require a larger number of
counters to be stored in memory (i.e., a larger value for the
TABLE_LENGTH parameter of the algorithm in Section 7.4.3).
* Transient numeric identifiers generated with the algorithms
described in Sections 7.4.2 and 7.4.3 will normally allow for
fingerprinting within CONTEXT since, for such context, the
resulting identifiers will have an identifiable pattern (i.e., a
monotonically increasing sequence).
10. IANA Considerations
This document has no IANA actions.
11. Security Considerations
This entire document is about the security and privacy implications
of transient numeric identifiers. [RFC9416] recommends that protocol
specifications specify the interoperability requirements of their
transient numeric identifiers, perform a vulnerability assessment of
their transient numeric identifiers, and recommend an algorithm for
generating each of their transient numeric identifiers. This
document analyzes possible algorithms (and their implications) that
could be employed to comply with the interoperability requirements of
the most common categories of transient numeric identifiers while
minimizing the associated negative security and privacy implications.
12. References
12.1. Normative References
[RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791,
DOI 10.17487/RFC0791, September 1981,
<https://www.rfc-editor.org/info/rfc791>.
[RFC0793] Postel, J., "Transmission Control Protocol", RFC 793,
DOI 10.17487/RFC0793, September 1981,
<https://www.rfc-editor.org/info/rfc793>.
[RFC1035] Mockapetris, P., "Domain names - implementation and
specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
November 1987, <https://www.rfc-editor.org/info/rfc1035>.
[RFC1321] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321,
DOI 10.17487/RFC1321, April 1992,
<https://www.rfc-editor.org/info/rfc1321>.
[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6
(IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460,
December 1998, <https://www.rfc-editor.org/info/rfc2460>.
[RFC4086] Eastlake 3rd, D., Schiller, J., and S. Crocker,
"Randomness Requirements for Security", BCP 106, RFC 4086,
DOI 10.17487/RFC4086, June 2005,
<https://www.rfc-editor.org/info/rfc4086>.
[RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing
Architecture", RFC 4291, DOI 10.17487/RFC4291, February
2006, <https://www.rfc-editor.org/info/rfc4291>.
[RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless
Address Autoconfiguration", RFC 4862,
DOI 10.17487/RFC4862, September 2007,
<https://www.rfc-editor.org/info/rfc4862>.
[RFC5722] Krishnan, S., "Handling of Overlapping IPv6 Fragments",
RFC 5722, DOI 10.17487/RFC5722, December 2009,
<https://www.rfc-editor.org/info/rfc5722>.
[RFC5905] Mills, D., Martin, J., Ed., Burbank, J., and W. Kasch,
"Network Time Protocol Version 4: Protocol and Algorithms
Specification", RFC 5905, DOI 10.17487/RFC5905, June 2010,
<https://www.rfc-editor.org/info/rfc5905>.
[RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP
Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
June 2010, <https://www.rfc-editor.org/info/rfc5925>.
[RFC6056] Larsen, M. and F. Gont, "Recommendations for Transport-
Protocol Port Randomization", BCP 156, RFC 6056,
DOI 10.17487/RFC6056, January 2011,
<https://www.rfc-editor.org/info/rfc6056>.
[RFC6151] Turner, S. and L. Chen, "Updated Security Considerations
for the MD5 Message-Digest and the HMAC-MD5 Algorithms",
RFC 6151, DOI 10.17487/RFC6151, March 2011,
<https://www.rfc-editor.org/info/rfc6151>.
[RFC6191] Gont, F., "Reducing the TIME-WAIT State Using TCP
Timestamps", BCP 159, RFC 6191, DOI 10.17487/RFC6191,
April 2011, <https://www.rfc-editor.org/info/rfc6191>.
[RFC6437] Amante, S., Carpenter, B., Jiang, S., and J. Rajahalme,
"IPv6 Flow Label Specification", RFC 6437,
DOI 10.17487/RFC6437, November 2011,
<https://www.rfc-editor.org/info/rfc6437>.
[RFC6528] Gont, F. and S. Bellovin, "Defending against Sequence
Number Attacks", RFC 6528, DOI 10.17487/RFC6528, February
2012, <https://www.rfc-editor.org/info/rfc6528>.
[RFC7217] Gont, F., "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address
Autoconfiguration (SLAAC)", RFC 7217,
DOI 10.17487/RFC7217, April 2014,
<https://www.rfc-editor.org/info/rfc7217>.
[RFC7323] Borman, D., Braden, B., Jacobson, V., and R.
Scheffenegger, Ed., "TCP Extensions for High Performance",
RFC 7323, DOI 10.17487/RFC7323, September 2014,
<https://www.rfc-editor.org/info/rfc7323>.
[RFC8064] Gont, F., Cooper, A., Thaler, D., and W. Liu,
"Recommendation on Stable IPv6 Interface Identifiers",
RFC 8064, DOI 10.17487/RFC8064, February 2017,
<https://www.rfc-editor.org/info/rfc8064>.
[RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6
(IPv6) Specification", STD 86, RFC 8200,
DOI 10.17487/RFC8200, July 2017,
<https://www.rfc-editor.org/info/rfc8200>.
[RFC8981] Gont, F., Krishnan, S., Narten, T., and R. Draves,
"Temporary Address Extensions for Stateless Address
Autoconfiguration in IPv6", RFC 8981,
DOI 10.17487/RFC8981, February 2021,
<https://www.rfc-editor.org/info/rfc8981>.
[RFC9293] Eddy, W., Ed., "Transmission Control Protocol (TCP)",
STD 7, RFC 9293, DOI 10.17487/RFC9293, August 2022,
<https://www.rfc-editor.org/info/rfc9293>.
12.2. Informative References
[ARC4RANDOM]
OpenBSD, "arc4random(3)", Library Functions Manual,
September 2019, <https://man.openbsd.org/arc4random>.
[Aumasson2018]
Aumasson, J-P., "Serious Cryptography: A Practical
Introduction to Modern Encryption", No Starch Press, Inc.,
ISBN-10 1-59327-826-8, November 2017.
[Bellovin1989]
Bellovin, S., "Security Problems in the TCP/IP Protocol
Suite", Computer Communications Review, Vol. 19, No. 2,
pp. 32-48, April 1989,
<https://www.cs.columbia.edu/~smb/papers/ipext.pdf>.
[Bellovin2002]
Bellovin, S., "A Technique for Counting NATted Hosts",
IMW'02, Marseille, France, ISBN 1-58113-603-X/02/0011,
November 2002,
<https://www.cs.columbia.edu/~smb/papers/fnat.pdf>.
[BLAKE3] "BLAKE3: one function, fast everywhere", September 2022,
<https://blake3.io/>.
[C11] ISO/IEC, "Information technology - Programming languages -
C", ISO/IEC 9899:2018, June 2018.
[CPNI-TCP] Centre for the Protection of National Infrastructure
(CPNI), "Security Assessment of the Transmission Control
Protocol (TCP)", CPNI Technical Note 3/2009, February
2009, <https://www.si6networks.com/files/publications/tn-
03-09-security-assessment-TCP.pdf>.
[CVEs] NVD, "Vulnerability Advisories for PRNGs",
<https://www.gont.com.ar/miscellanea/prng-cves/>.
[EFF] EFF, "Cover your tracks: See how trackers view your
browser", <https://coveryourtracks.eff.org/>.
[FIPS-SHS] NIST, "Secure Hash Standard (SHS)", FIPS PUB 180-4,
DOI 10.6028/NIST.FIPS.180-4, August 2015,
<https://nvlpubs.nist.gov/nistpubs/FIPS/
NIST.FIPS.180-4.pdf>.
[Fyodor1998]
Fyodor, "Remote OS detection via TCP/IP Stack
FingerPrinting", Phrack Magazine, Volume 8, Issue 54,
December 1998,
<http://www.phrack.org/archives/issues/54/9.txt>.
[Fyodor2003]
Fyodor, "Idle Scanning and related IPID games", 2003,
<https://nmap.org/presentations/CanSecWest03/CD_Content/
idlescan_paper/idlescan.html>.
[Fyodor2006]
Lyon, G., "Chapter 8. Remote OS Detection", January 2009,
<https://nmap.org/book/osdetect.html>.
[GETENTROPY]
Linux, "getentropy(3)", Linux Programmer's Manual, March
2021,
<https://man7.org/linux/man-pages/man3/getentropy.3.html>.
[IANA-PROT]
IANA, "Protocol Registries",
<https://www.iana.org/protocols>.
[IPID-DEV] Klein, A. and B. Pinkas, "From IP ID to Device ID and
KASLR Bypass (Extended Version)",
DOI 10.48550/arXiv.1906.10478, October 2019,
<https://arxiv.org/pdf/1906.10478.pdf>.
[Joncheray1995]
Joncheray, L., "Simple Active Attack Against TCP",
Proceedings of the Fifth USENIX UNIX Security Symposium,
June 1995, <https://www.usenix.org/legacy/publications/lib
rary/proceedings/security95/full_papers/joncheray.pdf>.
[KASLR] PaX Team, "Address Space Layout Randomization",
<https://pax.grsecurity.net/docs/aslr.txt>.
[Klein2007]
Klein, A., "OpenBSD DNS Cache Poisoning and Multiple O/S
Predictable IP ID Vulnerability", November 2007,
<https://dl.packetstormsecurity.net/papers/attack/OpenBSD_
DNS_Cache_Poisoning_and_Multiple_OS_Predictable_IP_ID_Vuln
erability.pdf>.
[Knuth1983]
Knuth, D., "The Art of Computer Programming", Volume 2
(Seminumerical Algorithms), 2nd Ed., Reading,
Massachusetts, Addison-Wesley Publishing Company, January
1981.
[Morris1985]
Morris, R., "A Weakness in the 4.2BSD UNIX TCP/IP
Software", CSTR 117, AT&T Bell Laboratories, Murray Hill,
NJ, February 1985,
<https://pdos.csail.mit.edu/~rtm/papers/117.pdf>.
[nmap] nmap, "Nmap: Free Security Scanner For Network Exploration
and Audit", 2020, <https://nmap.org/>.
[POSIX] IEEE, "IEEE Standard for Information Technology --
Portable Operating System Interface (POSIX(TM)) Base
Specifications, Issue 7", IEEE Std 1003.1-2017,
DOI 10.1109/IEEESTD.2018.8277153, January 2018,
<https://doi.org/10.1109/IEEESTD.2018.8277153>.
[Press1992]
Press, W., Teukolsky, S., Vetterling, W., and B. Flannery,
"Numerical Recipes in C: The Art of Scientific Computing",
2nd Ed., Cambridge University Press, ISBN 0-521-43108-5,
December 1992.
[RFC2104] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed-
Hashing for Message Authentication", RFC 2104,
DOI 10.17487/RFC2104, February 1997,
<https://www.rfc-editor.org/info/rfc2104>.
[RFC3971] Arkko, J., Ed., Kempf, J., Zill, B., and P. Nikander,
"SEcure Neighbor Discovery (SEND)", RFC 3971,
DOI 10.17487/RFC3971, March 2005,
<https://www.rfc-editor.org/info/rfc3971>.
[RFC4953] Touch, J., "Defending TCP Against Spoofing Attacks",
RFC 4953, DOI 10.17487/RFC4953, July 2007,
<https://www.rfc-editor.org/info/rfc4953>.
[RFC4963] Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
Errors at High Data Rates", RFC 4963,
DOI 10.17487/RFC4963, July 2007,
<https://www.rfc-editor.org/info/rfc4963>.
[RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927,
DOI 10.17487/RFC5927, July 2010,
<https://www.rfc-editor.org/info/rfc5927>.
[RFC6194] Polk, T., Chen, L., Turner, S., and P. Hoffman, "Security
Considerations for the SHA-0 and SHA-1 Message-Digest
Algorithms", RFC 6194, DOI 10.17487/RFC6194, March 2011,
<https://www.rfc-editor.org/info/rfc6194>.
[RFC6274] Gont, F., "Security Assessment of the Internet Protocol
Version 4", RFC 6274, DOI 10.17487/RFC6274, July 2011,
<https://www.rfc-editor.org/info/rfc6274>.
[RFC6973] Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
Morris, J., Hansen, M., and R. Smith, "Privacy
Considerations for Internet Protocols", RFC 6973,
DOI 10.17487/RFC6973, July 2013,
<https://www.rfc-editor.org/info/rfc6973>.
[RFC6980] Gont, F., "Security Implications of IPv6 Fragmentation
with IPv6 Neighbor Discovery", RFC 6980,
DOI 10.17487/RFC6980, August 2013,
<https://www.rfc-editor.org/info/rfc6980>.
[RFC7098] Carpenter, B., Jiang, S., and W. Tarreau, "Using the IPv6
Flow Label for Load Balancing in Server Farms", RFC 7098,
DOI 10.17487/RFC7098, January 2014,
<https://www.rfc-editor.org/info/rfc7098>.
[RFC7258] Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an
Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May
2014, <https://www.rfc-editor.org/info/rfc7258>.
[RFC7707] Gont, F. and T. Chown, "Network Reconnaissance in IPv6
Networks", RFC 7707, DOI 10.17487/RFC7707, March 2016,
<https://www.rfc-editor.org/info/rfc7707>.
[RFC7721] Cooper, A., Gont, F., and D. Thaler, "Security and Privacy
Considerations for IPv6 Address Generation Mechanisms",
RFC 7721, DOI 10.17487/RFC7721, March 2016,
<https://www.rfc-editor.org/info/rfc7721>.
[RFC7739] Gont, F., "Security Implications of Predictable Fragment
Identification Values", RFC 7739, DOI 10.17487/RFC7739,
February 2016, <https://www.rfc-editor.org/info/rfc7739>.
[RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol
Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018,
<https://www.rfc-editor.org/info/rfc8446>.
[RFC8937] Cremers, C., Garratt, L., Smyshlyaev, S., Sullivan, N.,
and C. Wood, "Randomness Improvements for Security
Protocols", RFC 8937, DOI 10.17487/RFC8937, October 2020,
<https://www.rfc-editor.org/info/rfc8937>.
[RFC9414] Gont, F. and I. Arce, "Unfortunate History of Transient
Numeric Identifiers", RFC 9414, DOI 10.17487/RFC9414, July
2023, <https://www.rfc-editor.org/info/rfc9414>.
[RFC9416] Gont, F. and I. Arce, "Security Considerations for
Transient Numeric Identifiers Employed in Network
Protocols", BCP 72, RFC 9416, DOI 10.17487/RFC9416, July
2023, <https://www.rfc-editor.org/info/rfc9416>.
[Romailler2020]
Romailler, Y., "The Definitive Guide to "Modulo Bias and
How to Avoid It"!", Kudelski Security Research, July 2020,
<https://research.kudelskisecurity.com/2020/07/28/the-
definitive-guide-to-modulo-bias-and-how-to-avoid-it/>.
[Sanfilippo1998a]
Sanfilippo, S., "about the ip header id", message to the
Bugtraq mailing list, December 1998,
<http://seclists.org/bugtraq/1998/Dec/48>.
[Sanfilippo1998b]
Sanfilippo, S., "new tcp scan method", message to the
Bugtraq mailing list, 18 December 1998,
<https://seclists.org/bugtraq/1998/Dec/79>.
[Sanfilippo1999]
Sanfilippo, S., "more ip id", message to the Bugtraq
mailing list, November 1999,
<https://github.com/antirez/hping/raw/master/docs/MORE-
FUN-WITH-IPID>.
[Schuba1993]
Schuba, C., "Addressing Weakness in the Domain Name System
Protocol", August 1993,
<http://ftp.cerias.purdue.edu/pub/papers/christoph-schuba/
schuba-DNS-msthesis.pdf>.
[Shimomura1995]
Shimomura, T., "Technical details of the attack described
by Markoff in NYT", message to the USENET
comp.security.misc newsgroup, 25 January 1995,
<https://www.gont.com.ar/files/post-shimomura-usenet.txt>.
[Silbersack2005]
Silbersack, M., "Improving TCP/IP security through
randomization without sacrificing interoperability",
EuroBSDCon 2005 Conference,
<https://www.silby.com/eurobsdcon05/
eurobsdcon_silbersack.pdf>.
[SipHash] "SipHash: a fast short-input PRF", February 2023,
<https://github.com/veorq/SipHash>.
[TBIT] TBIT, "TBIT, the TCP Behavior Inference Tool", 2001,
<https://www.icir.org/tbit/>.
[TCPT-uptime]
McDanel, B., "TCP Timestamping - Obtaining System Uptime
Remotely", message to the Bugtraq mailing list, March
2001, <https://seclists.org/bugtraq/2001/Mar/182>.
[Zalewski2001]
Zalewski, M., "Strange Attractors and TCP/IP Sequence
Number Analysis", April 2001,
<https://lcamtuf.coredump.cx/oldtcp/tcpseq.html>.
[Zalewski2002]
Zalewski, M., "Strange Attractors and TCP/IP Sequence
Number Analysis - One Year Later (2002)",
<https://lcamtuf.coredump.cx/newtcp/>.
[Zalewski2012]
Zalewski, M., "p0f v3 (3.09b)",
<https://lcamtuf.coredump.cx/p0f.shtml>.
Appendix A. Algorithms and Techniques with Known Issues
The following subsections discuss algorithms and techniques with
known negative security and privacy implications.
| NOTE: As discussed in Section 1, the use of cryptographic
| techniques might allow for the safe use of some of these
| algorithms and techniques. However, this should be evaluated
| on a case-by-case basis.
A.1. Predictable Linear Identifiers Algorithm
One of the most trivial ways to achieve uniqueness with a low
identifier reuse frequency is to produce a linear sequence. This
type of algorithm has been employed in the past to generate
identifiers of Categories #1, #2, and #4 (please see Section 6 for an
analysis of these categories).
For example, the following algorithm has been employed (see, e.g.,
[Morris1985], [Shimomura1995], [Silbersack2005], and [CPNI-TCP]) in a
number of operating systems for selecting IP IDs, TCP ephemeral port
numbers, etc.:
/* Initialization code */
next_id = min_id;
id_inc= 1;
/* Transient Numeric ID selection function */
id_range = max_id - min_id + 1;
retry = id_range;
do {
if (next_id == max_id) {
next_id = min_id;
}
else {
next_id = next_id + id_inc;
}
if (suitable_id(next_id)) {
return next_id;
}
retry--;
} while (retry > 0);
return ERROR;
NOTE:
suitable_id() checks whether a candidate numeric identifier is
suitable (e.g., whether it is unique or not).
For obvious reasons, this algorithm results in predictable sequences.
Since a global counter is used to generate the transient numeric
identifiers ("next_id" in the example above), an entity that learns
one numeric identifier can infer past numeric identifiers and predict
future values to be generated by the same algorithm. Since the value
employed for the increments is known (such as "1" in this case), an
attacker can sample two values and learn the number of identifiers
that were generated in between the two sampled values. Furthermore,
if the counter is initialized, to some known value (e.g., when the
system is bootstrapped), the algorithm will leak additional
information, such as the number of transmitted fragmented datagrams
in the case of an IP ID generator [Sanfilippo1998a] or the system
uptime in the case of TCP timestamps [TCPT-uptime].
A.2. Random-Increments Algorithm
This algorithm offers a middle ground between the algorithms that
generate randomized transient numeric identifiers (such as those
described in Sections 7.1.1 and 7.1.2) and those that generate
identifiers with a predictable monotonically increasing function (see
Appendix A.1).
/* Initialization code */
next_id = random(); /* Initialization value */
id_rinc = 500; /* Determines the trade-off */
/* Transient Numeric ID selection function */
id_range = max_id - min_id + 1;
retry = id_range;
do {
/* Random increment */
id_inc = (random() % id_rinc) + 1;
if ( (max_id - next_id) >= id_inc){
next_id = next_id + id_inc;
}
else {
next_id = min_id + id_inc - (max_id - next_id);
}
if (suitable_id(next_id)) {
return next_id;
}
retry = retry - id_inc;
} while (retry > 0);
return ERROR;
NOTE:
random() is a PRNG that returns a pseudorandom unsigned integer
number of appropriate size. Beware that "adapting" the length of
the output of random() with a modulo operator (e.g., C language's
"%") may change the distribution of the PRNG. To preserve a
uniform distribution, the rejection sampling technique
[Romailler2020] can be used.
suitable_id() is a function that checks whether a candidate
identifier is suitable (e.g., whether it is unique or not).
This algorithm aims at producing a global monotonically increasing
sequence of transient numeric identifiers while avoiding the use of
fixed increments, which would lead to trivially predictable
sequences. The value "id_rinc" allows for direct control of the
trade-off between unpredictability and identifier reuse frequency.
The smaller the value of "id_rinc", the more similar this algorithm
is to a predicable, global linear identifier generation algorithm (as
the one in Appendix A.1). The larger the value of "id_rinc", the
more similar this algorithm is to the algorithm described in
Section 7.1.1 of this document.
When the identifiers wrap, there is a risk of collisions of transient
numeric identifiers (i.e., identifier reuse). Therefore, "id_rinc"
should be selected according to the following criteria:
* It should maximize the wrapping time of the identifier space.
* It should minimize identifier reuse frequency.
* It should maximize unpredictability.
Clearly, these are competing goals, and the decision of which value
of "id_rinc" to use is a trade-off. Therefore, the value of
"id_rinc" is at times a configurable parameter so that system
administrators can make the trade-off for themselves. We note that
the alternative algorithms discussed throughout this document offer
better interoperability, security, and privacy properties than this
algorithm, and hence, implementation of this algorithm is
discouraged.
A.3. Reusing Identifiers Across Different Contexts
Employing the same identifier across contexts in which stability is
not required (i.e., overloading the semantics of transient numeric
identifiers) usually has negative security and privacy implications.
For example, in order to generate transient numeric identifiers of
Category #2 or #3, an implementation or specification might be
tempted to employ a source for the numeric identifiers that is known
to provide unique values but that may also be predictable or leak
information related to the entity generating the identifier. This
technique has been employed in the past for, e.g., generating IPv6
IIDs by reusing the MAC address of the underlying network interface
card. However, as noted in [RFC7721] and [RFC7707], embedding link-
layer addresses in IPv6 IIDs not only results in predictable values
but also leaks information about the manufacturer of the underlying
network interface card, allows for network activity correlation, and
makes address-based scanning attacks feasible.
Acknowledgements
The authors would like to thank (in alphabetical order) Bernard
Aboba, Jean-Philippe Aumasson, Steven Bellovin, Luis León Cárdenas
Graide, Spencer Dawkins, Theo de Raadt, Guillermo Gont, Joseph
Lorenzo Hall, Gre Norcie, Colin Perkins, Vincent Roca, Shivan Sahib,
Rich Salz, Martin Thomson, and Michael Tüxen for providing valuable
comments on earlier draft versions of this document.
The authors would like to thank Shivan Sahib and Christopher Wood for
their guidance during the publication process of this document.
The authors would like to thank Jean-Philippe Aumasson and Mathew
D. Green (John Hopkins University) for kindly answering a number of
questions.
The authors would like to thank Diego Armando Maradona for his magic
and inspiration.
Authors' Addresses
Fernando Gont
SI6 Networks
Segurola y Habana 4310 7mo piso
Ciudad Autonoma de Buenos Aires
Argentina
Email: fgont@si6networks.com
URI: https://www.si6networks.com
Ivan Arce
Quarkslab
Segurola y Habana 4310 7mo piso
Ciudad Autonoma de Buenos Aires
Argentina
Email: iarce@quarkslab.com
URI: https://www.quarkslab.com
|