Internet Engineering Task Force (IETF)                           H. Shah
Request for Comments: 7306                          Broadcom Corporation
Category: Standards Track                                       F. Marti
ISSN: 2070-1721                                            W. Noureddine
                                                            A. Eiriksson
                                            Chelsio Communications, Inc.
                                                                R. Sharp
                                                       Intel Corporation
                                                               June 2014
Remote Direct Memory Access (RDMA) Protocol Extensions
Abstract
This document specifies extensions to the IETF Remote Direct Memory
Access Protocol (RDMAP) as specified in RFC 5040. RDMAP provides
read and write services directly to applications and enables data to
be transferred directly into Upper-Layer Protocol (ULP) Buffers
without intermediate data copies. The extensions specified in this
document provide the following capabilities and/or improvements:
Atomic Operations and Immediate Data.
Status of This Memo
This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by the
Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 5741.
Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
http://www.rfc-editor.org/info/rfc7306.
Shah, et al. Standards Track [Page 1]
RFC 7306 RDMA Protocol Extensions June 2014
Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction ....................................................4
1.1. Discovery of RDMAP Extensions ..............................5
2. Requirements Language ...........................................5
3. Glossary ........................................................6
4. Header Format Extensions ........................................7
4.1. RDMAP Control and Invalidate STag Fields ...................7
4.2. RDMA Message Definitions ...................................9
5. Atomic Operations ...............................................9
5.1. Atomic Operation Details ..................................10
5.1.1. FetchAdd ...........................................10
5.1.2. CmpSwap ............................................12
5.2. Atomic Operations .........................................13
5.2.1. Atomic Operation Request Message ...................14
5.2.2. Atomic Operation Response Message ..................17
5.3. Atomicity Guarantees ......................................18
5.4. Atomic Operations Ordering and Completion Rules ...........18
6. Immediate Data .................................................20
6.1. RDMAP Interactions with ULP for Immediate Data ............20
6.2. Immediate Data Header Format ..............................21
6.3. Immediate Data or Immediate Data with SE Message ..........21
6.4. Ordering and Completions ..................................22
7. Ordering and Completions Table .................................22
8. Error Processing ...............................................25
8.1. Errors Detected at the Local Peer .........................25
8.2. Errors Detected at the Remote Peer ........................26
9. Security Considerations ........................................26
10. IANA Considerations ...........................................27
10.1. RDMAP Message Atomic Operation Subcodes ..................27
10.2. RDMAP Queue Numbers ......................................28
11. References ....................................................29
11.1. Normative References .....................................29
11.2. Informative References ...................................29
12. Acknowledgments ...............................................30
Appendix A. DDP Segment Formats for RDMA Messages .................31
A.1. DDP Segment for Atomic Operation Request ..................32
A.2. DDP Segment for Atomic Response ...........................33
A.3. DDP Segment for Immediate Data and Immediate Data with SE .33
1. Introduction
The RDMA Protocol [RFC5040] provides capabilities for zero-copy data
communications that preserve memory protection semantics, enabling
more efficient network protocol implementations. The RDMA Protocol
is part of the iWARP family of specifications, which also includes RFC
5041 [RFC5041], RFC 5044 [RFC5044], and RFC 6581 [RFC6581]. This
document specifies the following extensions to the RDMA Protocol
(RDMAP):
o Atomic Operations can be performed on remote memory locations.
Support for Atomic Operations enhances the usability of RDMAP in
distributed shared-memory environments.
o Immediate Data messages allow the ULP at the sender to provide a
small amount of data to the Data Sink. When an Immediate Data
message is sent following an RDMA Write Message, the combination of
the two messages implements the RDMA Write with Immediate message
found in other RDMA transport protocols.
Other RDMA transport protocols already define the functionality added
by these extensions, which leads to differences in RDMA applications
and/or Upper-Layer Protocols. Removing these differences in the
transport protocols simplifies these applications and ULPs, and that
is the main motivation for the extensions specified in this document.
RSockets [RSOCKETS] is an example of RDMA-enabled middleware that
provides a socket interface as the upper-edge interface and utilizes
RDMA to provide more efficient networking for socket-based
applications. RSockets is aware of Immediate Data support in
InfiniBand [IB], but when running over iWARP it cannot use the
equivalent of InfiniBand's RDMA Write with Immediate Data operation.
The addition of the Immediate Data operation specified in this
document removes this difference between RSockets running on
InfiniBand and RSockets running on iWARP.
Structured high-performance computing applications based on the
Message-Passing Interface [MPI] may use Atomic Operations defined in
this specification. DAT Atomics [DAT_ATOMICS] is an example of RDMA-
enabled middleware that provides a portable RDMA programming
interface for various RDMA transport protocols. DAT Atomics includes
a primitive for InfiniBand that is not supported by iWARP RDMA-enabled
Network Interface Controllers (RNICs). The addition of Atomic
Operations as specified in this document will allow Atomic Operations
in DAT Atomics to work over both InfiniBand and iWARP RNICs
interchangeably.
For more background on RDMA Protocol applicability, see
"Applicability of Remote Direct Memory Access Protocol (RDMA) and
Direct Data Placement Protocol (DDP)" [RFC5045].
1.1. Discovery of RDMAP Extensions
Some RDMA applications and/or ULPs are already aware of the existence
of Atomic and Immediate Data operations for RDMA transports such as
InfiniBand and for application programming interfaces such as Open
Fabrics Verbs [OFAVERBS]. Today, these applications need to be aware
that RDMAP does not support some of these operations.
Typically, the availability of these capabilities is exposed to the
applications through adapter query interfaces in software.
Applications then have to decide whether to use Immediate Data or
Atomic Operations based on the results of the query interfaces. Such
query interfaces typically return the scope of atomicity guarantees,
not the individual Atomic Operations supported. Therefore, this
specification requires all Atomic Operations defined within to be
supported if an RNIC supports any Atomic Operations.
In cases where heterogeneous hardware, with differing support for
Atomic Operations and Immediate Data Operations, is deployed for use
by RDMA applications and/or ULPs, applications are either statically
configured to use or not use optional features or use application-
specific negotiation mechanisms. For the extensions covered by this
document, it is RECOMMENDED that RDMA applications and/or ULPs
negotiate at the application or ULP level the usage of these
extensions. The definition of such application-specific mechanisms
is outside the scope of this specification. For backward
compatibility, existing applications and/or ULPs should not assume
that these extensions are supported.
In the absence of application-specific negotiation of the features
defined within this specification, the new operations can be
attempted, and reported errors can be used to determine a remote
peer's capabilities. In the case of Atomics, a FetchAdd operation
with "Add Data" set to 0 can safely be used to determine the
existence of Atomic Operations without modifying the content of a
remote peer's memory. The remote peer reports a Remote Operation
Error or Unexpected OpCode error if it receives an Immediate Data or
Atomic Operation that it does not support.
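The probing approach described above can be sketched as follows. This is only an illustration under stated assumptions: `post_fetch_add` is a hypothetical callable supplied by the ULP that issues a FetchAdd and returns the Original Remote Data Value, and `RemoteOperationError` is a hypothetical exception standing in for whatever error the local RDMA interface reports. Neither name is defined by this specification.

```python
class RemoteOperationError(Exception):
    """Hypothetical error raised when the remote peer rejects an operation."""


def peer_supports_atomics(post_fetch_add, stag, tagged_offset):
    """Probe for Atomic Operation support without changing remote memory.

    post_fetch_add is a hypothetical callable taking
    (stag, tagged_offset, add_data, add_mask); it returns the Original
    Remote Data Value or raises RemoteOperationError.
    """
    try:
        # Add Data = 0 leaves the remote 64-bit value unchanged.
        post_fetch_add(stag, tagged_offset, 0, 0)
    except RemoteOperationError:
        return False
    return True
```

A ULP would typically run such a probe once per connection and cache the result, preferring explicit application-level negotiation where it is available.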
2. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
3. Glossary
This document is an extension of RFC 5040, and key words are defined
in the glossary of that document.
Atomic Operation - an operation that results in an execution of a
memory operation at a specific ULP Buffer address on a remote node
using the Tagged Buffer data transfer model. The consumer can use
Atomic Operations to read, modify, and write memory at the
destination ULP Buffer address, while at the same time
guaranteeing that no other Atomic Operation read or write accesses
to the ULP Buffer address targeted by the Atomic Operation will
occur across any other RDMAP Streams on an RNIC at the Responder.
Atomic Operation Request - an RDMA Message used by the Data Source to
perform an Atomic Operation at the Responder.
Atomic Operation Response - an RDMA Message used by the Responder to
describe the completion of an Atomic Operation at the Responder.
CmpSwap - an Atomic Operation that is used to compare and swap a
value at a specific address on a remote node.
FetchAdd - an Atomic Operation that is used to atomically increment a
value at a specific ULP Buffer address on a remote node.
Immediate Data - a small fixed-size portion of data sent from the
Data Source to a Data Sink.
Immediate Data Message - an RDMA Message used by the Data Source to
send Immediate Data to the Data Sink.
Immediate Data with Solicited Event (SE) Message - an RDMA Message
used by the Data Source to send Immediate Data with Solicited
Event to the Data Sink.
iWARP - a suite of wire protocols comprised of RFC 5040, RFC 5041,
RFC 5044, and RFC 6581.
Requester - the sender of an RDMA Atomic Operation request.
Responder - the receiver of an RDMA Atomic Operation request.
RNIC - RDMA-enabled Network Interface Controller. In this context,
this would be a network I/O adapter or embedded controller with
iWARP functionality.
ULP - Upper-Layer Protocol. The protocol layer above the one
currently being referenced. The ULP for RFC 5040 / RFC 5041 is
expected to be an OS, Application, adaptation layer, or
proprietary device. The RFC 5040 / RFC 5041 documents do not
specify a ULP -- they provide a set of semantics that allow a ULP
to be designed to utilize RFC 5040 / RFC 5041.
4. Header Format Extensions
The control information of RDMA Messages is included in header fields
defined in RFC 5041, the Direct Data Placement (DDP) protocol. RFC
5040 defines the RDMAP header formats layered on the DDP header
definition. This specification extends RFC 5040 with the following
new formats:
o Four new RDMA Messages carry additional RDMAP headers. The
Immediate Data operation and Immediate Data with Solicited Event
operation each include 8 bytes of data following the RDMAP header.
Atomic Operations include Atomic Request or Atomic Response
headers following the RDMAP header. The RDMAP header for Atomic
Request messages is 52 bytes long as specified in Figure 4. The
RDMAP header for Atomic Response Messages is 32 bytes long as
specified in Figure 5.
o Introduction of a new queue for untagged Buffers (QN=3) used for
Atomic Response tracking.
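As an illustration of the 52-byte figure, the Atomic Request header can be sketched with Python's `struct` module. Only the first four fields and the total length are taken from this document's text; the order of the data and mask fields after the Remote Tagged Offset, the zero-filled reserved bits, and the function name `pack_atomic_request` are assumptions made for this sketch.

```python
import struct

# Big-endian, no padding: three 32-bit words followed by five 64-bit fields.
# The order of the last four 64-bit fields is an assumption of this sketch.
ATOMIC_REQUEST_FORMAT = ">IIIQQQQQ"


def pack_atomic_request(aopcode, request_id, stag, tagged_offset,
                        data, mask, compare_data=0, compare_mask=0):
    """Pack a hypothetical Atomic Request header into 52 bytes."""
    # AOpCode occupies the low 4 bits of the first word; the upper
    # 28 reserved bits are assumed to be sent as zero.
    first_word = aopcode & 0xF
    return struct.pack(ATOMIC_REQUEST_FORMAT, first_word, request_id, stag,
                       tagged_offset, data, mask, compare_data, compare_mask)
```

The sizes add up as 3 x 4 + 5 x 8 = 52 bytes, which matches the length stated above.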
4.1. RDMAP Control and Invalidate STag Fields
For reference, Figure 1 depicts the format of the DDP Control and
RDMAP Control Fields, in the style and convention of RFC 5040:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                |T|L| Resrv | DV| RV|Rsv| Opcode|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Invalidate STag                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 1: DDP Control and RDMAP Control Fields
The DDP Control Field consists of the T (Tagged), L (Last), Resrv,
and DV (DDP protocol Version) fields [RFC5041]. The RDMAP Control
Field consists of the RV (RDMA Version), Rsv, and Opcode fields
[RFC5040].
This specification adds values for the RDMA Opcode field to those
specified in RFC 5040. Figure 2 defines the new values of the RDMA
Opcode field that are used for the RDMA Messages defined in this
specification.
As shown in Figure 2, STag (Steering Tag) and Tagged Offset are not
applicable for the RDMA Messages defined in this specification.
Figure 2 also shows the appropriate Queue Number for each Opcode.
All RDMA Messages defined in this specification MUST have:
The RDMA Version (RV) field: 01b.
Opcode field: Set to one of the values in Figure 2.
Invalidate STag: Set to zero by the sender, ignored by the receiver.
-------+-----------+-------+------+-------+---------+-------------
RDMA   | Message   | Tagged| STag | Queue | In-     | Message
Opcode | Type      | Flag  | and  | Number| validate| Length
       |           |       | TO   |       | STag    | Communicated
       |           |       |      |       |         | between DDP
       |           |       |      |       |         | and RDMAP
-------+-----------+-------+------+-------+---------+-------------
1000b  | Immediate | 0     | N/A  | 0     | N/A     | Yes
       | Data      |       |      |       |         |
-------+-----------+----------------------------------------------
1001b  | Immediate | 0     | N/A  | 0     | N/A     | Yes
       | Data with |       |      |       |         |
       | SE        |       |      |       |         |
-------+-----------+----------------------------------------------
1010b  | Atomic    | 0     | N/A  | 1     | N/A     | Yes
       | Request   |       |      |       |         |
-------+-----------+----------------------------------------------
1011b  | Atomic    | 0     | N/A  | 3     | N/A     | Yes
       | Response  |       |      |       |         |
-------+-----------+----------------------------------------------
Figure 2: Additional RDMA Usage of DDP Fields
Note: N/A means Not Applicable.
This extension defines RDMAP use of Queue Number 3 for Untagged
Buffers for Atomic Responses. This queue is used for tracking
outstanding Atomic Requests.
All other DDP and RDMAP Control Fields are set as described in RFC
5040.
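The opcode and Queue Number assignments from Figure 2 can be captured in a small lookup, shown here as a hedged sketch. The names `ExtOpcode`, `QUEUE_NUMBER`, and `rdmap_control_byte` are illustrative and not defined by this specification, and the sketch assumes the 2-bit Rsv field of the RDMAP Control Field is sent as zero.

```python
from enum import IntEnum


class ExtOpcode(IntEnum):
    """RDMA Opcode values added by this specification (Figure 2)."""
    IMMEDIATE_DATA = 0b1000
    IMMEDIATE_DATA_SE = 0b1001
    ATOMIC_REQUEST = 0b1010
    ATOMIC_RESPONSE = 0b1011


# Queue Number used by each new message type (Figure 2).
QUEUE_NUMBER = {
    ExtOpcode.IMMEDIATE_DATA: 0,
    ExtOpcode.IMMEDIATE_DATA_SE: 0,
    ExtOpcode.ATOMIC_REQUEST: 1,
    ExtOpcode.ATOMIC_RESPONSE: 3,
}

RDMA_VERSION = 0b01  # RV field value required for these messages


def rdmap_control_byte(opcode):
    """Build the RDMAP Control byte: RV (2 bits), Rsv (2 bits), Opcode (4 bits)."""
    # Rsv is assumed to be zero in this sketch.
    return (RDMA_VERSION << 6) | int(opcode)
```

For example, an Atomic Request combines RV=01b with Opcode 1010b, giving the byte 0x4A.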
4.2. RDMA Message Definitions
The following figure defines which RDMA Headers are used on each new
RDMA Message and which new RDMA Messages are allowed to carry ULP
payload.
-------+-----------+-------------------+-------------------------
RDMA   | Message   | RDMA Header Used  | ULP Message allowed in
Message| Type      |                   | the RDMA Message
OpCode |           |                   |
       |           |                   |
-------+-----------+-------------------+-------------------------
1000b  | Immediate | Immediate Data    | No
       | Data      | Header            |
-------+-----------+-------------------+-------------------------
1001b  | Immediate | Immediate Data    | No
       | Data with | Header            |
       | SE        |                   |
-------+-----------+-------------------+-------------------------
1010b  | Atomic    | Atomic Request    | No
       | Request   | Header            |
-------+-----------+-------------------+-------------------------
1011b  | Atomic    | Atomic Response   | No
       | Response  | Header            |
-------+-----------+-------------------+-------------------------
Figure 3: RDMA Message Definitions
5. Atomic Operations
The RDMA Protocol Specification in RFC 5040 does not include support
for Atomic Operations, which are an important building block for
implementing distributed shared memory.
This document extends the RDMA Protocol specification with a set of
basic Atomic Operations and specifies their resource and ordering
rules. The Atomic Operations specified in this document provide
equivalent functionality to the InfiniBand RDMA transport as well as
extended Atomic Operations defined in Open Fabrics Verbs, to allow
applications that use these primitives to work interchangeably over
iWARP. Other operations are left for future consideration.
Atomic Operations as specified in this document execute a 64-bit
memory operation at a specified destination ULP Buffer address on a
Responder node using the Tagged Buffer data transfer model. The
operations atomically read, modify, and write back the contents of
the destination ULP Buffer address and guarantee that Atomic
Operations on this ULP Buffer address by other RDMAP Streams on the
same RNIC do not occur between the read and the write caused by the
Atomic Operation. Therefore, the Responder RNIC MUST implement
mechanisms to prevent other Atomic Operations on memory registered
for Atomic Operations while an Atomic Operation targeting that memory
is in progress. The Requester of an Atomic Operation cannot rely on
Atomic Operation behavior at the Responder across multiple RNICs or
with respect to other applications/ULPs running at the Responder that
can access the ULP Buffer. It is OPTIONAL for an RNIC to provide
such behavior when implementing the Atomic Operations specified in
this document. An RNIC that supports Atomic Operations as specified
in this document MUST implement both the FetchAdd operation as
specified in Section 5.1.1 and the CmpSwap operation as specified in
Section 5.1.2. The advertisement of Tagged Buffer information for
Atomic Operations is outside the scope of this specification and is
handled by the ULPs.
Implementation note: It is RECOMMENDED that applications not use the
ULP Buffer addresses used for Atomic Operations for other RDMA
operations, because atomicity is guaranteed only among Atomic
Operations and not with respect to other operations.
Implementation note: Errors related to the alignment in the following
sections cover Atomic Operations targeted at a ULP Buffer address
that is not aligned to a 64-bit boundary.
Atomic Operation Request Messages use the same remote addressing
mechanism as RDMA Reads and Writes. The ULP Buffer address specified
in the request is in the address space of the Remote Peer to which
the Atomic Operation is targeted.
Atomic Operation Response Messages MUST use the Untagged Buffer model
with QN=3. Queue number 3 will be used to track outstanding Atomic
Operation Request messages at the Requester. When the Atomic
Operation Response message is received, the Message Sequence Number
(MSN) will be used to locate the corresponding Atomic Operation
request in order to complete the Atomic Operation request.
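The MSN-based matching described above can be sketched as a small Requester-side table. This is a minimal illustration, not a definitive design: the class and method names are hypothetical, the sketch assumes that the MSN on QN=3 starts at 1 and increases by one per Atomic Response, and it ignores 32-bit MSN wraparound.

```python
class AtomicRequestTracker:
    """Hypothetical Requester-side table of outstanding Atomic Requests."""

    def __init__(self):
        # Assumption: the first Atomic Response on QN=3 carries MSN 1.
        self._next_msn = 1
        self._outstanding = {}

    def post(self, request):
        """Record a request and return the MSN expected in its response."""
        msn = self._next_msn
        self._next_msn += 1
        self._outstanding[msn] = request
        return msn

    def complete(self, msn, original_value):
        """Locate the request matching a response MSN and complete it."""
        request = self._outstanding.pop(msn)
        return request, original_value
```

A response carrying an MSN that is not in the table raises `KeyError` in this sketch; a real implementation would treat that as a protocol error.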
5.1. Atomic Operation Details
The following subsections describe the Atomic Operations in more
detail.
5.1.1. FetchAdd
The FetchAdd Atomic Operation requests the Responder to read a 64-bit
Original Remote Data Value at a 64-bit aligned ULP Buffer address in
the Responder's memory, perform the FetchAdd operation on multiple
fields of selectable length specified by 64-bit "Add Mask", and write
the result back to the same ULP Buffer address. The Atomic addition
is performed independently on each of these fields. Bits set in the
Add Mask field specify the field boundaries: for each field, a bit is
set at its most significant bit position, causing any carry out of
that bit position to be discarded when the addition is performed.
FetchAdd Atomic Operations MUST target ULP Buffer addresses that are
64-bit aligned. If a FetchAdd Atomic Operation targets a ULP Buffer
address that is not 64-bit aligned, the Responder's memory MUST NOT
be modified, the result MUST be surfaced as an error, and a terminate
message MUST be generated. Setting the Add Mask field to
0x0000000000000000 results in a plain 64-bit Atomic Add of the
Original Remote Data Value and the "Add Data".
The pseudocode below describes a masked FetchAdd Atomic Operation.
bit_location = 1
carry = 0
Remote Data Value = 0
for bit = 0 to 63
{
   if (bit != 0) bit_location = bit_location << 1
   val1 = (Original Remote Data Value & bit_location) >> bit
   val2 = (Add Data & bit_location) >> bit
   sum = carry + val1 + val2
   carry = (sum & 2) >> 1
   sum = sum & 1
   if (sum)
      Remote Data Value |= bit_location
   carry = ((carry) && (!(Add Mask & bit_location)))
}
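The pseudocode above can be rendered as runnable Python, shown here as a sketch that follows the same bit-by-bit structure. The function name `masked_fetch_add` and the constant `MASK64` are illustrative only; the function computes the new Remote Data Value and leaves reading and writing Responder memory to the caller.

```python
MASK64 = (1 << 64) - 1


def masked_fetch_add(original, add_data, add_mask):
    """Return the new 64-bit Remote Data Value for a masked FetchAdd."""
    result = 0
    carry = 0
    for bit in range(64):
        bit_location = 1 << bit
        val1 = (original >> bit) & 1
        val2 = (add_data >> bit) & 1
        total = carry + val1 + val2
        carry = total >> 1
        if total & 1:
            result |= bit_location
        # A set Add Mask bit marks the top of a field: discard the carry there.
        if add_mask & bit_location:
            carry = 0
    return result
```

With an Add Mask of 0x8080 the carry out of the low byte is discarded, so adding 0x0101 to 0x01FF yields 0x0200, whereas an Add Mask of zero gives the ordinary sum 0x0300.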
The FetchAdd operation is performed in the endian format of the
target memory. The "Original Remote Data Value" is converted from
the endian format of the target memory and returned to the Requester.
The fields are in big-endian format on the wire.
The Requester specifies:
o Remote STag
o Remote Tagged Offset
o Add Data
o Add Mask
The Responder returns:
o Original Remote Data
5.1.2. CmpSwap
The CmpSwap Atomic Operation requires the Responder to read a 64-bit
value at a 64-bit aligned ULP Buffer address in the Responder's
memory, to AND that value with the 64-bit Compare Mask field in the
Atomic Operation Request header, and then to compare the result with
the logical AND of the Compare Mask and Compare Data fields in the
header. If the two masked values are equal, the Responder is
required to replace the bits selected by the Swap Mask at the same
ULP Buffer address with the corresponding bits of the Swap Data. If
the two masked values are not equal, the contents of the Responder's
memory are not changed. In either case, the original value read from
the ULP Buffer address is converted from the endian format of the
target memory and returned to the Requester. The fields are in
big-endian format on the wire.
The Requester specifies:
o Remote STag
o Remote Tagged Offset
o Swap Data
o Swap Mask
o Compare Data
o Compare Mask
The Responder returns:
o Original Remote Data Value
The following pseudocode describes the masked CmpSwap operation
result.
if (!((Compare Data ^ Original Remote Data Value) &
      Compare Mask))
then
   Remote Data Value =
      (Original Remote Data Value & ~(Swap Mask))
      | (Swap Data & Swap Mask)
else
   Remote Data Value = Original Remote Data Value
After the operation, the remote data Buffer MUST contain the
"Original Remote Data Value" (if the comparison did not match) or the
"Original Remote Data Value" with the bits selected by the Swap Mask
replaced by the "Swap Data" (if the comparison did match). CmpSwap
Atomic Operations MUST target ULP Buffer addresses that are 64-bit
aligned.
If a CmpSwap Atomic Operation is attempted on a target ULP Buffer
address that is not 64-bit aligned:
o The operation MUST NOT be performed,
o The Responder's memory MUST NOT be modified,
o The result MUST be surfaced as an error, and
o A terminate message MUST be generated. (See Section 8.2 for the
contents of the terminate message.)
5.2. Atomic Operations
The Atomic Operation Request and Response are RDMA Messages. An
Atomic Operation makes use of the DDP Untagged Buffer Model. Atomic
Operation Request messages MUST use the same Queue Number as RDMA
Read Requests (QN=1). Reusing the same Queue Number for Atomic
Request messages allows the Atomic Operations to reuse the same
infrastructure (e.g., Outbound and Inbound RDMA Read Queue Depth
(ORD/IRD) flow control) as defined for RDMA Read Requests. Atomic
Operation Response messages MUST set Queue Number (QN) to 3 in the
DDP header.
The RDMA Message OpCode for an Atomic Request Message is 1010b. The
RDMA Message OpCode for an Atomic Response Message is 1011b.
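The OpCode and Queue Number assignments above can be summarized in a
small lookup table. This is only a sketch: the names RdmapOpcode and
QUEUE_NUMBER are hypothetical and chosen for illustration.

```python
from enum import IntEnum


class RdmapOpcode(IntEnum):
    """RDMA Message OpCodes of the two Atomic messages (4-bit values)."""
    ATOMIC_REQUEST = 0b1010
    ATOMIC_RESPONSE = 0b1011


# DDP Untagged Queue Number that each message is sent on.
QUEUE_NUMBER = {
    RdmapOpcode.ATOMIC_REQUEST: 1,   # shared with RDMA Read Requests
    RdmapOpcode.ATOMIC_RESPONSE: 3,  # queue used by Atomic Responses
}
```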
5.2.1. Atomic Operation Request Message
The Atomic Operation Request Message carries an Atomic Operation
Header that describes the ULP Buffer address in the Responder's
memory. The Atomic Operation Request header immediately follows the
DDP header. The RDMAP layer passes to the DDP layer an RDMAP Control
Field. The following figure depicts the Atomic Operation Request
Header that is used for all Atomic Operation Request Messages:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  Reserved (Not Used)                  |AOpCode|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Request Identifier                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Remote STag                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Remote Tagged Offset                      |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Add or Swap Data                        |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Add or Swap Mask                        |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Compare Data                          |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Compare Mask                          |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 4: Atomic Operation Request Header
Reserved (Not Used): 28 bits
This field is set to zero on transmit, ignored on receive.
Atomic Operation Code (AOpCode): 4 bits.
See Figure 5. All Atomic Operation Codes from Figure 5 MUST be
implemented by an RNIC that supports Atomic Operations.
Request Identifier: 32 bits.
The Request Identifier specifies a number that is used to
identify the Atomic Operation Request Message. The value used
in this field is selected by the RNIC that sends the message,
and it is reflected back to the Local Peer in the Atomic
Operation Response message.
Remote STag: 32 bits.
The Remote STag identifies the Remote Peer's Tagged Buffer
targeted by the Atomic Operation. The Remote STag is
associated with the RDMAP Stream through a mechanism that is
outside the scope of the RDMAP specification.
Remote Tagged Offset: 64 bits.
The Remote Tagged Offset specifies the starting offset, in
octets, from the base of the Remote Peer's Tagged Buffer
targeted by the Atomic Operation. The Remote Tagged Offset MAY
start at an arbitrary offset but MUST represent a ULP Buffer
address that is 64-bit aligned.
Add or Swap Data: 64 bits.
The Add or Swap Data field specifies the 64-bit "Add Data"
value in an Atomic FetchAdd Operation or the 64-bit "Swap Data"
value in an Atomic Swap or CmpSwap Operation.
Add or Swap Mask: 64 bits
This field is used in masked Atomic Operations (FetchAdd and
CmpSwap) to perform a bitwise logical AND operation as
specified in the definition of these operations. For non-
masked Atomic Operations (Swap), this field is set to
ffffffffffffffffh on transmit and ignored by the receiver.
Compare Data: 64 bits.
The Compare Data field specifies the 64-bit "Compare Data"
value in an Atomic CmpSwap Operation. For Atomic Operations
FetchAdd and Atomic Swap, the Compare Data field is set to zero
on transmit and ignored by the receiver.
Compare Mask: 64 bits.
This field is used in the masked CmpSwap Atomic Operation to
perform a bitwise logical AND operation as specified in the
definition of that operation. For the FetchAdd and Swap Atomic
Operations, this field is set to ffffffffffffffffh on transmit
and ignored by the receiver.
---------+-----------+----------+----------+---------+---------
Atomic   | Atomic    | Add or   | Add or   | Compare | Compare
Operation| Operation | Swap     | Swap     | Data    | Mask
Code     |           | Data     | Mask     |         |
---------+-----------+----------+----------+---------+---------
0000b    | FetchAdd  | Add Data | Add Mask | N/A     | N/A
---------+-----------+----------+----------+---------+---------
0010b    | CmpSwap   | Swap Data| Swap Mask| Valid   | Valid
---------+-----------+----------+----------+---------+---------
Figure 5: Atomic Operation Message Definitions
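The header layout in Figure 4 and the codes in Figure 5 could be
serialized along the following lines. This is a minimal sketch using
Python's struct module; the function name pack_atomic_request, the
constant names, and the keyword defaults are assumptions made for
illustration, not an interface defined by this document.

```python
import struct

# One 32-bit word holding Reserved (28 bits) and AOpCode (4 bits), then
# Request Identifier and Remote STag (32 bits each), then five 64-bit
# fields. ">" selects big-endian byte order with no padding.
ATOMIC_REQUEST_FORMAT = ">IIIQQQQQ"
ALL_ONES = 0xFFFFFFFFFFFFFFFF
AOPCODE_FETCHADD = 0b0000
AOPCODE_CMPSWAP = 0b0010


def pack_atomic_request(aopcode, request_id, stag, tagged_offset,
                        data, mask, compare_data=0, compare_mask=ALL_ONES):
    """Serialize a 52-octet Atomic Operation Request Header (sketch)."""
    if aopcode not in (AOPCODE_FETCHADD, AOPCODE_CMPSWAP):
        raise ValueError("unsupported Atomic Operation Code")
    if aopcode == AOPCODE_FETCHADD:
        # FetchAdd does not use the compare fields: zero data, all-ones mask.
        compare_data, compare_mask = 0, ALL_ONES
    # AOpCode occupies the low-order 4 bits; the 28 reserved bits stay zero.
    first_word = aopcode & 0xF
    return struct.pack(ATOMIC_REQUEST_FORMAT, first_word, request_id, stag,
                       tagged_offset, data, mask, compare_data, compare_mask)
```

The sketch does not check alignment, because the Remote Tagged Offset
alone does not determine whether the resulting ULP Buffer address is
64-bit aligned.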
The Atomic Operation Request Message has the following semantics:
1. An Atomic Operation Request Message MUST reference an Untagged
Buffer. That is, the Local Peer's RDMAP layer MUST request that
the DDP mark the Message as Untagged.
2. One Atomic Operation Request Message MUST consume one Untagged
Buffer.
3. The Responder's RDMAP layer MUST process an Atomic Operation
Request Message. A valid Atomic Operation Request Message MUST
NOT be delivered to the Responder's ULP (i.e., it is processed by
the RDMAP layer).
4. At the Responder, an error MUST be surfaced in response to
delivery to the Remote Peer's RDMAP layer of an Atomic Operation
Request Message with an Atomic Operation Code that the RNIC does
not support.
5. An Atomic Operation Request Message MUST reference the RDMA Read
Request Queue. That is, the Requester's RDMAP layer MUST request
that the DDP layer set the Queue Number field to one.
6. The Requester MUST pass to the DDP layer Atomic Operation Request
Messages in the order they were submitted by the ULP.
7. The Responder MUST process the Atomic Operation Request Messages
in the order they were sent.
8. If the Responder receives a valid Atomic Operation Request
Message, it MUST respond with a valid Atomic Operation Response
Message.
5.2.2. Atomic Operation Response Message
The Atomic Operation Response Message carries an Atomic Operation
Response Header that contains the "Original Request Identifier" and
"Original Remote Data Value". The Atomic Operation Response Header
immediately follows the DDP header. The RDMAP layer passes to the
DDP layer an RDMAP Control Field. The following figure depicts the
Atomic Operation Response header that is used for all Atomic
Operation Response Messages:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  Original Request Identifier                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  Original Remote Data Value                   |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 6: Atomic Operation Response Header
Original Request Identifier: 32 bits.
The Original Request Identifier is set to the value specified in
the Request Identifier field that was originally provided in the
corresponding Atomic Operation Request Message.
Original Remote Data Value: 64 bits.
The Original Remote Data Value specifies the original 64-bit value
stored at the ULP Buffer address targeted by the Atomic Operation.
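The 12-octet response header, together with the conversion from the
target memory's endian format to the big-endian wire format, might
look as follows. This is a sketch under stated assumptions: the
function names and the target_byteorder parameter are hypothetical.

```python
import struct

ATOMIC_RESPONSE_FORMAT = ">IQ"  # 32-bit identifier, 64-bit value, big-endian


def pack_atomic_response(original_request_id, target_memory, target_byteorder):
    """Build the header from the 8 octets read at the target address."""
    # Interpret the stored octets in the target memory's own endian format,
    # then emit the resulting value in big-endian (wire) format.
    original_value = int.from_bytes(target_memory, target_byteorder)
    return struct.pack(ATOMIC_RESPONSE_FORMAT, original_request_id,
                       original_value)


def unpack_atomic_response(header):
    """Return (Original Request Identifier, Original Remote Data Value)."""
    if len(header) != struct.calcsize(ATOMIC_RESPONSE_FORMAT):
        raise ValueError("Atomic Operation Response header must be 12 octets")
    return struct.unpack(ATOMIC_RESPONSE_FORMAT, header)
```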
The Atomic Operation Response Message has the following semantics:
1. The Atomic Operation Response Message for the associated Atomic
Operation Request Message travels in the opposite direction.
2. An Atomic Operation Response Message MUST consume an Untagged
Buffer. That is, the Responder RDMAP layer MUST request that the
DDP mark the Message as Untagged.
3. An Atomic Operation Response Message MUST reference the Queue
Number 3. That is, the Responder's RDMAP layer MUST request that
the DDP layer set the Queue Number field to 3.
4. The Responder MUST ensure that a sufficient number of Untagged
Buffers are available on the RDMA Read Request Queue (Queue with
DDP Queue Number 1) to support the maximum number of Atomic
Operation Requests negotiated by the ULP in addition to the
maximum number of RDMA Read Requests negotiated by the ULP.
5. The Requester MUST ensure that a sufficient number of Untagged
Buffers are available on the RDMA Atomic Response Queue (Queue
with DDP Queue Number 3) to support the maximum number of Atomic
Operation Requests negotiated by the ULP.
6. The RDMAP layer MUST Deliver the Atomic Operation Response Message
to the ULP.
7. At the Requester, when an invalid Atomic Operation Response
Message is delivered to the Remote Peer's RDMAP layer, an error is
surfaced.
8. When the Responder receives Atomic Operation Request messages, the
Responder RDMAP layer MUST pass Atomic Operation Response Messages
to the DDP layer, in the order that the Atomic Operation Request
Messages were received by the RDMAP layer, at the Responder.
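The in-order request processing and response generation described in
the semantics above can be modeled with a simple first-in, first-out
queue. The class ResponderQueue and its execute callback are
hypothetical names used only for this sketch; error handling and the
DDP layer itself are omitted.

```python
from collections import deque


class ResponderQueue:
    """Toy model of in-order Atomic Request processing at the Responder."""

    def __init__(self, execute):
        self._execute = execute  # callable: request -> original 64-bit value
        self._pending = deque()  # requests in the order they were received

    def deliver(self, request_id, request):
        """Record a request handed up to the RDMAP layer."""
        self._pending.append((request_id, request))

    def process_all(self):
        """Process requests oldest first; return responses in that order."""
        responses = []
        while self._pending:
            request_id, request = self._pending.popleft()
            original = self._execute(request)
            # The identifier is echoed so the Requester can match the reply.
            responses.append((request_id, original))
        return responses
```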
5.3. Atomicity Guarantees
Atomicity of the Read-Modify-Write (RMW) on the Responder's node by
the Atomic Operation MUST be assured in the context of concurrent
atomic accesses by other RDMAP Streams on the same RNIC.
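One way to picture this guarantee is a single lock that serializes
every Read-Modify-Write issued through one RNIC, regardless of the
RDMAP Stream it arrived on. The sketch below is a software analogy
only; the class RnicMemory and its methods are hypothetical, and real
RNICs provide atomicity in hardware.

```python
import threading

MASK64 = 0xFFFFFFFFFFFFFFFF


class RnicMemory:
    """Hypothetical model: one lock covers all Atomic Operations on an RNIC."""

    def __init__(self):
        self._lock = threading.Lock()
        self._words = {}  # 64-bit-aligned address -> 64-bit value

    def atomic_rmw(self, address, modify):
        # Read, modify, and write back under the lock so that no Atomic
        # Operation from another stream can interleave with this one.
        with self._lock:
            original = self._words.get(address, 0)
            self._words[address] = modify(original) & MASK64
            return original

    def read(self, address):
        with self._lock:
            return self._words.get(address, 0)
```

Accesses that bypass atomic_rmw in this model (the analogue of local
processor access or access via another RNIC) would not be covered by
the lock.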
5.4. Atomic Operations Ordering and Completion Rules
In addition to the ordering and completion rules described in RFC
5040, the following rules apply to implementations of the Atomic
Operations.
1. For an Atomic Operation, the Requester MUST NOT consider the
contents of the Tagged Buffer at the Responder to be modified by
that specific Atomic Operation until the Atomic Operation Response
Message has been Delivered to RDMAP at the Requester.
2. Atomicity guarantees MUST be provided within the scope of a single
RNIC.
Implementation Note: This requirement for atomicity among
operations is limited to the scope of a single RNIC. Atomicity
guarantees are OPTIONAL with respect to access to the Tagged
Buffer by any other method than an Atomic Operation via the same
RNIC. Examples of such accesses that may not be atomic with
respect to an Atomic Operation include accesses via other RNICs
and local processor memory access to the Tagged Buffer.
3. Atomic Operation Request Messages MUST NOT start processing at the
Responder until they have been Delivered to RDMAP by DDP.
4. Atomic Operation Response Messages MAY be generated at the
Responder after subsequent RDMA Write Messages or Send Messages
have been Placed or Delivered.
5. Atomic Operation Response Message processing at the Responder MUST
be started only after the Atomic Operation Request Message has
been Delivered by the DDP layer (thus, all previous RDMA Messages
on that DDP Stream have been Delivered).
6. Send Messages MAY be Completed at the Responder before prior
incoming Atomic Operation Request Messages have completed their
response processing.
7. An Atomic Operation MUST NOT be Completed at the Requester until
the DDP layer Delivers the associated incoming Atomic Operation
Response Message.
8. If more than one outstanding Atomic Request Message is supported
by both peers, the Atomic Operation Request Messages MUST be
processed in the order they were delivered by the DDP layer on the
Responder. Atomic Operation Response Messages MUST be submitted
to the DDP layer on the Responder in the order the Atomic
Operation Request Messages were Delivered by DDP.
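On the Requester side, rules 1 and 7 imply that an operation stays
outstanding until its response arrives. A minimal bookkeeping sketch,
assuming a hypothetical RequesterTracker class that allocates Request
Identifiers sequentially (the specification leaves the choice of
identifier to the RNIC):

```python
class RequesterTracker:
    """Track Atomic Operations until their responses are Delivered."""

    def __init__(self):
        self._next_id = 0
        self._outstanding = {}  # Request Identifier -> ULP context
        self.completed = []     # (ULP context, Original Remote Data Value)

    def submit(self, context):
        """Allocate a Request Identifier for a new Atomic Operation."""
        request_id = self._next_id
        self._next_id = (self._next_id + 1) & 0xFFFFFFFF  # 32-bit field
        self._outstanding[request_id] = context
        return request_id

    def on_response_delivered(self, original_request_id, original_value):
        """Complete the operation only once its response has arrived."""
        if original_request_id not in self._outstanding:
            raise ValueError("invalid Atomic Operation Response")
        context = self._outstanding.pop(original_request_id)
        self.completed.append((context, original_value))
```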
6. Immediate Data
The Immediate Data operation is typically used in conjunction with an
RDMA Write Operation to improve ULP processing efficiency. The
efficiency is gained by causing an RDMA Completion to be generated
immediately following the RDMA Write operation. This RDMA Completion
delivers 8 bytes of Immediate Data at the Remote Peer. The
combination of an RDMA Write Message followed by an Immediate Data
Operation has the same behavior as the RDMA Write with Immediate Data
operation found in InfiniBand. An Immediate Data operation that is
not preceded by an RDMA Write operation causes an RDMA Completion.
6.1. RDMAP Interactions with ULP for Immediate Data
For Immediate Data operations, the following are the interactions
between the RDMAP Layer and the ULP:
o At the Data Source:
- The ULP passes to the RDMAP Layer the following:
* 8 bytes of ULP Immediate Data
- When the Immediate Data operation Completes, the RDMAP Layer
passes to the ULP an indication of the Completion results.
o At the Data Sink:
- If the Immediate Data operation is Completed successfully, the
RDMAP Layer passes the following information to the ULP Layer:
* 8 bytes of Immediate Data
* An Event, if the Data Sink is configured to generate an
Event.
- If the Immediate Data operation is Completed in error, the Data
Sink RDMAP Layer will pass up the corresponding error
information to the Data Sink ULP and send a Terminate Message
to the Data Source RDMAP Layer. The Data Source RDMAP Layer
will then pass up the Terminate Message to the ULP.
6.2. Immediate Data Header Format
The Immediate Data and Immediate Data with SE Messages carry
Immediate Data as shown in Figure 7. The RDMAP layer passes to the
DDP layer an RDMAP Control Field and 8 bytes of Immediate Data. The
first 8 bytes of the data following the DDP header contain the
Immediate Data. See Appendix A.3 for the DDP segment format of an
Immediate Data or Immediate Data with SE Message.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Immediate Data                         |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 7: Immediate Data or Immediate Data with SE Message Header
Immediate Data: 64 bits.
8 bytes of data transferred from the Data Source to an untagged
Buffer at the Data Sink.
6.3. Immediate Data or Immediate Data with SE Message
The Immediate Data or Immediate Data with SE Message uses the DDP
Untagged Buffer Model to transfer Immediate Data from the Data Source
to the Data Sink.
o An Immediate Data or Immediate Data with SE Message MUST reference
an Untagged Buffer. That is, the Local Peer's RDMAP Layer MUST
request that the DDP layer mark the Message as Untagged.
o One Immediate Data or Immediate Data with SE Message MUST consume
one Untagged Buffer.
o At the Remote Peer, the Immediate Data and Immediate Data with SE
Messages MUST be Delivered to the Remote Peer's ULP in the order
they were sent.
o For an Immediate Data or Immediate Data with SE Message, the Local
Peer's RDMAP Layer MUST request that the DDP layer set the Queue
Number field to zero.
o For an Immediate Data or Immediate Data with SE Message, the Local
Peer's RDMAP Layer MUST request that the DDP layer transmit 8
bytes of data.
o The Local Peer MUST issue Immediate Data and Immediate Data with
SE Messages in the order they were submitted by the ULP.
o The Remote Peer MUST check that Immediate Data and Immediate Data
with SE Messages include exactly 8 bytes of data from the DDP
layer. The DDP header carries the length field that is reported
by the DDP layer.
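The length check in the last rule above could be expressed as shown
below. The constant and function names are hypothetical, and the
payload is treated as opaque octets because this document does not
assign any internal structure to the Immediate Data.

```python
IMMEDIATE_DATA_OCTETS = 8
IMMEDIATE_DATA_QUEUE_NUMBER = 0  # Untagged queue used for these Messages


def check_immediate_data(ddp_payload):
    """Return the Immediate Data, or raise if it is not exactly 8 octets."""
    if len(ddp_payload) != IMMEDIATE_DATA_OCTETS:
        raise ValueError("Immediate Data Message must carry exactly 8 octets")
    return bytes(ddp_payload)
```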
6.4. Ordering and Completions
Ordering and completion rules for Immediate Data are the same as
those for a Send operation as described in Section 5.5 of RFC 5040.
7. Ordering and Completions Table
The following table summarizes the ordering relationships for Atomic
and Immediate Data operations from the standpoint of the Local Peer
issuing the Operations. Note that in the table that follows, Send
includes Send, Send with Invalidate, Send with Solicited Event, and
Send with Solicited Event and Invalidate. Also note that in the
table below, Immediate Data includes Immediate Data and Immediate
Data with Solicited Event.
---------+----------+-------------+-------------+------------------
First    | Second   | Placement   | Placement   | Ordering
Operation| Operation| Guarantee at| Guarantee at| Guarantee at
         |          | Remote Peer | Local Peer  | Remote Peer
---------+----------+-------------+-------------+------------------
Immediate| Send     | No Placement| Not         | Completed in
Data     |          | Guarantee   | Applicable  | Order
         |          | between Send|             |
         |          | Payload and |             |
         |          | Immediate   |             |
         |          | Data        |             |
---------+----------+-------------+-------------+------------------
Immediate| RDMA     | No Placement| Not         | Not
Data     | Write    | Guarantee   | Applicable  | Applicable
         |          | between RDMA|             |
         |          | Write       |             |
         |          | Payload and |             |
         |          | Immediate   |             |
         |          | Data        |             |
---------+----------+-------------+-------------+------------------
Immediate| RDMA     | No Placement| RDMA Read   | RDMA Read
Data     | Read     | Guarantee   | Response    | Response
         |          | between     | will not be | Message will
         |          | Immediate   | Placed until| not be
         |          | Data and    | Immediate   | generated
         |          | RDMA Read   | Data is     | until
         |          | Request     | Placed at   | Immediate Data
         |          |             | Remote Peer | has been
         |          |             |             | Completed
---------+----------+-------------+-------------+------------------
Immediate| Atomic   | No Placement| Atomic      | Atomic
Data     |          | Guarantee   | Response    | Response
         |          | between     | will not be | Message will
         |          | Immediate   | Placed until| not be
         |          | Data and    | Immediate   | generated
         |          | Atomic      | Data is     | until
         |          | Request     | Placed at   | Immediate Data
         |          |             | Remote Peer | has been
         |          |             |             | Completed
---------+----------+-------------+-------------+------------------
Immediate| Immediate| No Placement| Not         | Completed in
Data or  | Data     | Guarantee   | Applicable  | Order
Send     |          |             |             |
---------+----------+-------------+-------------+------------------
RDMA     | Immediate| No Placement| Not         | Immediate Data
Write    | Data     | Guarantee   | Applicable  | is Completed
         |          |             |             | after RDMA
         |          |             |             | Write is Placed
         |          |             |             | and Delivered
---------+----------+-------------+-------------+------------------
RDMA Read| Immediate| No Placement| Immediate   | Not Applicable
         | Data     | Guarantee   | Data MAY be |
         |          | between     | Placed      |
         |          | Immediate   | before      |
         |          | Data and    | RDMA Read   |
         |          | RDMA Read   | Response is |
         |          | Request     | generated   |
---------+----------+-------------+-------------+------------------
Atomic   | Immediate| No Placement| Immediate   | Not Applicable
         | Data     | Guarantee   | Data MAY be |
         |          | between     | Placed      |
         |          | Immediate   | before      |
         |          | Data and    | Atomic      |
         |          | Atomic      | Response is |
         |          | Request     | generated   |
---------+----------+-------------+-------------+------------------
Atomic   | Send     | No Placement| Send Payload| Not Applicable
         |          | Guarantee   | MAY be      |
         |          | between Send| Placed      |
         |          | Payload and | before      |
         |          | Atomic      | Atomic      |
         |          | Request     | Response is |
         |          |             | generated   |
---------+----------+-------------+-------------+------------------
Atomic   | RDMA     | No Placement| RDMA Write  | Not
         | Write    | Guarantee   | Payload MAY | Applicable
         |          | between RDMA| be Placed   |
         |          | Write       | before      |
         |          | Payload and | Atomic      |
         |          | Atomic      | Response is |
         |          | Request     | generated   |
---------+----------+-------------+-------------+------------------
Atomic   | RDMA     | No Placement| No Placement| RDMA Read
         | Read     | Guarantee   | Guarantee   | Response
         |          | between     | between     | Message will
         |          | Atomic      | Atomic      | not be
         |          | Request and | Response    | generated
         |          | RDMA Read   | and RDMA    | until Atomic
         |          | Request     | Read        | Response Message
         |          |             | Response    | has been
         |          |             |             | generated
---------+----------+-------------+-------------+------------------
Atomic   | Atomic   | Placed in   | No Placement| Second Atomic
         |          | order       | Guarantee   | Request
         |          |             | between two | Message will
         |          |             | Atomic      | not be
         |          |             | Responses   | processed
         |          |             |             | until first
         |          |             |             | Atomic Response
         |          |             |             | has been
         |          |             |             | generated
---------+----------+-------------+-------------+------------------
Send     | Atomic   | No Placement| Atomic      | Atomic Response
         |          | Guarantee   | Response    | Message will not
         |          | between Send| will not be | be generated
         |          | Payload and | Placed at   | until Send has
         |          | Atomic      | the Local   | been Completed
         |          | Request     | Peer until  |
         |          |             | Send Payload|
         |          |             | is Placed   |
         |          |             | at the      |
         |          |             | Remote Peer |
---------+----------+-------------+-------------+------------------
RDMA     | Atomic   | No Placement| Atomic      | Not
Write    |          | Guarantee   | Response    | Applicable
         |          | between RDMA| will not be |
         |          | Write       | Placed at   |
         |          | Payload and | the Local   |
         |          | Atomic      | Peer until  |
         |          | Request     | RDMA Write  |
         |          |             | Payload     |
         |          |             | is Placed   |
         |          |             | at the      |
         |          |             | Remote Peer |
---------+----------+-------------+-------------+------------------
RDMA     | Atomic   | No Placement| No Placement| Atomic Response
Read     |          | Guarantee   | Guarantee   | Message will
         |          | between     | between     | not be generated
         |          | Atomic      | Atomic      | until RDMA
         |          | Request and | Response    | Read Response
         |          | RDMA Read   | and RDMA    | has been
         |          | Request     | Read        | generated
         |          |             | Response    |
---------+----------+-------------+-------------+------------------
8. Error Processing
In addition to the error processing described in Section 7 of RFC
5040, the following rules apply for the new RDMA Messages defined in
this specification.
8.1. Errors Detected at the Local Peer
The Local Peer MUST send a Terminate Message for each of the
following cases:
1. For errors detected while creating an Atomic Request, Atomic
Response, Immediate Data, or Immediate Data with SE Message, or
other reasons not directly associated with an incoming Message,
the Terminate Message and Error code are sent instead of the
Message. In this case, the Error Type and Error Code fields are
included in the Terminate Message, but the Terminated DDP Header
and Terminated RDMA Header fields are set to zero.
2. For errors detected on an incoming Atomic Request, Atomic
Response, Immediate Data, or Immediate Data with SE (after the
Message has been Delivered by DDP), the Terminate Message is sent
at the earliest possible opportunity, preferably in the next
outgoing RDMA Message. In this case, the Error Type, Error Code,
and Terminated DDP Header fields are included in the Terminate
Message, but the Terminated RDMA Header field is set to zero.
8.2. Errors Detected at the Remote Peer
On incoming Atomic Requests, Atomic Responses, Immediate Data, and
Immediate Data with Solicited Event, the following MUST be validated:
o The DDP layer MUST validate all DDP Segment fields.
o The RDMA OpCode MUST be valid.
o The RDMA Version MUST be valid.
On incoming Atomic Requests, the following additional validation
MUST be performed:
o The RDMAP layer MUST validate that the Remote Peer's Tagged ULP
Buffer address references a ULP Buffer address that is 64-bit
aligned. In the case of an error, the RDMAP layer MUST generate a
Terminate Message indicating RDMA Layer Remote Operation Error
with Error Code Name "Catastrophic error, localized to RDMAP
Stream" as described in Section 4.8 of RFC 5040. Implementation
Note: A ULP implementation can avoid this error by having the
target ULP Buffer of an Atomic Operation 64-bit aligned.
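The alignment validation might be sketched as follows. The exception
class TerminateRequired, the function name, and the assumption that
the target address is the Tagged Buffer's base address plus the
Remote Tagged Offset are illustrative only; how an STag resolves to a
base address is outside this sketch.

```python
class TerminateRequired(Exception):
    """Signals that a Terminate Message has to be generated (hypothetical)."""


def validate_atomic_target(buffer_base_address, tagged_offset):
    """Return the target address, or raise if it is not 64-bit aligned."""
    address = buffer_base_address + tagged_offset
    if address % 8 != 0:
        # Reported as an RDMA Layer Remote Operation Error.
        raise TerminateRequired("Catastrophic error, localized to RDMAP Stream")
    return address
```

Note that the offset itself may be arbitrary; only the resulting
address is checked.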
9. Security Considerations
This document specifies extensions to the RDMA Protocol specification
in RFC 5040, and as such the Security Considerations discussed in
Section 8 of RFC 5040 apply. In particular, Atomic Operations use
ULP Buffer addresses for the Remote Peer Buffer addressing used in
RFC 5040 as required by the security model described in RFC 5042
[RFC5042].
RDMAP and related protocols may be used by applications that exhibit
distinctive traffic characteristics such as message timing, source,
destination, and size patterns. Examples include structured high-
performance computing applications based on the MPI interface. For
such applications, analysis of encrypted traffic could reveal
sensitive information, e.g., the nature of the application, size of
data set being used, and information about the application's rate of
progress. Such information can be hidden from passive observation
via use of Encapsulating Security Payload version 3 (ESPv3) Traffic
Flow Confidentiality [RFC4303] to obfuscate the encrypted traffic's
characteristics. ESPv3 implementation requirements for RDMAP are
specified in [RFC7146].
10. IANA Considerations
IANA has added the following entries to the "RDMAP Message Operation
Codes" registry within the "Remote Direct Data Placement (RDDP)"
registry:
0x8, Immediate Data, this specification
0x9, Immediate Data with Solicited Event, this specification
0xA, Atomic Request, this specification
0xB, Atomic Response, this specification
In addition, the following registry has been added to the "Remote
Direct Data Placement (RDDP)" registry. The following section
specifies the registry, its initial contents, and the administration
policy in more detail.
10.1. RDMAP Message Atomic Operation Subcodes
Name of the registry: "RDMAP Message Atomic Operation Subcodes"
Namespace details: RDMAP Message Atomic Operation Subcodes are 4-bit
values.
Information that must be provided to assign a new value: An IESG-
approved Standards Track specification defining the semantics and
interoperability requirements of the proposed new value and the
fields to be recorded in the registry.
Fields to record in the registry: RDMAP Message Atomic Operation
Subcode, Atomic Operation, RFC Reference.
Initial registry contents:
0x0, FetchAdd, this specification
0x1, Reserved, this specification
0x2, CmpSwap, this specification
Note: An experimental RDMAP Message Operation Code has already been
allocated; hence, there is no need for an experimental RDMAP Message
Atomic Operation Subcode.
All other values are Unassigned and available to IANA for assignment.
New RDMAP Message Atomic Operation Subcodes should be assigned
sequentially in order to better support implementations that process
RDMAP Message Atomic Operations in hardware.
Allocation Policy: Standards Action [RFC5226]
10.2. RDMAP Queue Numbers
Name of the registry: "RDMAP DDP Untagged Queue Numbers"
Namespace details: RDMAP DDP Untagged Queue numbers are 32-bit
values.
Information that must be provided to assign a new value: An IESG-
approved Standards Track specification defining the semantics and
interoperability requirements of the proposed new value and the
fields to be recorded in the registry.
Fields to record in the registry: RDMAP DDP Untagged Queue Numbers,
Queue Usage Description, RFC Reference.
Initial registry contents:
0x00000000, Queue 0 (Send operation Variants), [RFC5040]
0x00000001, Queue 1 (RDMA Read Request operations), [RFC5040]
0x00000002, Queue 2 (Terminate operations), [RFC5040]
0x00000003, Queue 3 (Atomic Response operations), this specification
Note: An experimental RDMAP Message Operation Code has already been
allocated; hence, there is no need for an experimental RDMAP DDP
Untagged Queue Number.
All other values are Unassigned and available to IANA for assignment.
New RDMAP queue numbers should be assigned sequentially in order to
better support implementations that perform RDMAP queue selection in
hardware.
Allocation Policy: Standards Action [RFC5226]
11. References
11.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC4303] Kent, S., "IP Encapsulating Security Payload (ESP)", RFC
4303, December 2005.
[RFC5040] Recio, R., Metzler, B., Culley, P., Hilland, J., and D.
Garcia, "A Remote Direct Memory Access Protocol
Specification", RFC 5040, October 2007.
[RFC5041] Shah, H., Pinkerton, J., Recio, R., and P. Culley, "Direct
Data Placement over Reliable Transports", RFC 5041,
October 2007.
[RFC5042] Pinkerton, J. and E. Deleganes, "Direct Data Placement
Protocol (DDP) / Remote Direct Memory Access Protocol
(RDMAP) Security", RFC 5042, October 2007.
[RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an
IANA Considerations Section in RFCs", BCP 26, RFC 5226,
May 2008.
[RFC7146] Black, D. and P. Koning, "Securing Block Storage Protocols
over IP: RFC 3723 Requirements Update for IPsec v3", RFC
7146, April 2014.
11.2. Informative References
[DAT_ATOMICS]
DAT Collaborative, "IB Transport Specific Extensions for
DAT 2.0", User Direct Access Programming Library,
<http://www.datcollaborative.org/DAT_IB_Extensions.pdf>.
[IB] InfiniBand Trade Association, "InfiniBand Architecture
Specification Volumes 1 and 2", Release 1.1, November
2002, <http://www.infinibandta.org/specs>.
[MPI] Message Passing Interface Forum, "MPI: A Message-Passing
Interface Standard, Version 3.0", September 2012,
<http://www.mpi-forum.org/docs/mpi-3.0/mpi30-report.pdf>.
[OFAVERBS] Rosenstock, H., "Subject: Re: [PATCH 0/2] Add support for
enhanced atomic operations", message to the linux-rdma
mailing list,
<http://www.spinics.net/lists/linux-rdma/msg02405.html>.
[RFC5044] Culley, P., Elzur, U., Recio, R., Bailey, S., and J.
Carrier, "Marker PDU Aligned Framing for TCP
Specification", RFC 5044, October 2007.
[RFC5045] Bestler, C., Ed., and L. Coene, "Applicability of Remote
Direct Memory Access Protocol (RDMA) and Direct Data
Placement (DDP)", RFC 5045, October 2007.
[RFC6581] Kanevsky, A., Ed., Bestler, C., Ed., Sharp, R., and S.
Wise, "Enhanced Remote Direct Memory Access (RDMA)
Connection Establishment", RFC 6581, April 2012.
[RSOCKETS] Hefty, S., "RDMA CM - RDMA enabled Sockets library for
Open Fabrics", <http://git.openfabrics.org/?p=~shefty/
librdmacm.git;a=summary>.
12. Acknowledgments
The authors would like to acknowledge the following individuals who
provided valuable comments and suggestions.
o David Black
o Arkady Kanevsky
o Bernard Metzler
o Jim Pinkerton
o Tom Talpey
o Steve Wise
o Don Wood
Appendix A. DDP Segment Formats for RDMA Messages
This appendix is for information only and is NOT part of the
standard. It simply depicts the DDP Segment format for the various
RDMA Messages.
A.1. DDP Segment for Atomic Operation Request
The following figure depicts an Atomic Operation Request, DDP
Segment:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                |  DDP Control  | RDMA Control  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Reserved (Not Used)                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          DDP (Atomic Operation Request) Queue Number          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    DDP (Atomic Operation Request) Message Sequence Number     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         DDP (Atomic Operation Request) Message Offset         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  Reserved (Not Used)                  |AOpCode|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Request Identifier                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Remote STag                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Remote Tagged Offset                      |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Add or Swap Data                        |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Add or Swap Mask                        |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Compare Data                          |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Compare Mask                          |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
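As an informal illustration of the layout in the figure above, the
following Python sketch serializes the fields in the order shown. It is
not part of this specification. The helper name pack_atomic_request, the
use of big-endian (network) byte order, and the choice to transmit
reserved bits as zero are assumptions made for this example; the control
bytes and the AOpCode value are passed in by the caller rather than
defined here.

```python
import struct

# Field widths read off the figure: two 8-bit control fields, seven
# 32-bit words, then five 64-bit fields. Big-endian order is assumed.
ATOMIC_REQUEST = struct.Struct(">BBIIIIIIIQQQQQ")


def pack_atomic_request(ddp_control, rdma_control, queue_number, msn,
                        message_offset, aopcode, request_id, remote_stag,
                        remote_offset, add_or_swap_data, add_or_swap_mask,
                        compare_data, compare_mask):
    """Serialize one Atomic Operation Request segment (hypothetical helper)."""
    if not 0 <= aopcode <= 0xF:
        raise ValueError("AOpCode is a 4-bit field")
    return ATOMIC_REQUEST.pack(
        ddp_control, rdma_control,
        0,          # Reserved (Not Used), sent as zero by assumption
        queue_number, msn, message_offset,
        aopcode,    # upper 28 bits reserved (zero), low 4 bits AOpCode
        request_id, remote_stag, remote_offset,
        add_or_swap_data, add_or_swap_mask, compare_data, compare_mask)
```

Under these assumptions the packed segment is 70 octets long: 18 octets
of DDP and RDMA header fields followed by 52 octets of Atomic Operation
Request fields.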
A.2. DDP Segment for Atomic Response
The following figure depicts the DDP Segment for an Atomic Operation
Response:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                |  DDP Control  | RDMA Control  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Reserved (Not Used)                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          DDP (Atomic Operation Request) Queue Number          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    DDP (Atomic Operation Request) Message Sequence Number     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         DDP (Atomic Operation Request) Message Offset         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  Original Request Identifier                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Original Remote Value                     |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
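The receive side of the figure above can be sketched in the same
informal way. The following Python fragment, which is not part of this
specification, splits a received segment into the fields shown. The
names AtomicResponse and parse_atomic_response are hypothetical, and
big-endian (network) byte order is assumed.

```python
import struct
from collections import namedtuple

# Two 8-bit control fields, five 32-bit words, one 64-bit value.
ATOMIC_RESPONSE = struct.Struct(">BBIIIIIQ")

AtomicResponse = namedtuple("AtomicResponse", [
    "ddp_control", "rdma_control", "reserved", "queue_number", "msn",
    "message_offset", "original_request_id", "original_remote_value"])


def parse_atomic_response(segment):
    """Split an Atomic Operation Response segment into named fields."""
    if len(segment) < ATOMIC_RESPONSE.size:
        raise ValueError("segment shorter than the Atomic Response layout")
    # unpack_from ignores any octets after the fields in the figure.
    return AtomicResponse._make(ATOMIC_RESPONSE.unpack_from(segment))
```

A requester could then use original_request_id to match the response
against the Request Identifier it placed in the corresponding request.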
A.3. DDP Segment for Immediate Data and Immediate Data with SE
The following figure depicts the DDP Segment for Immediate Data or
Immediate Data with SE:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                |  DDP Control  | RDMA Control  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Reserved (Not Used)                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    DDP (Send) Queue Number                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              DDP (Send) Message Sequence Number               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      DDP Message Offset                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Immediate Data                         |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
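Octet offsets for the figure above can be derived from the field widths
alone. The following Python sketch, which is not part of this
specification, lists the widths as drawn and computes where each field
starts. The names IMMEDIATE_DATA_FIELDS, field_offsets, and
extract_immediate_data are hypothetical.

```python
# (field name, width in bits), in the order drawn in the figure.
IMMEDIATE_DATA_FIELDS = [
    ("ddp_control", 8),
    ("rdma_control", 8),
    ("reserved", 32),
    ("queue_number", 32),
    ("msn", 32),
    ("message_offset", 32),
    ("immediate_data", 64),
]


def field_offsets(fields):
    """Return ({name: (octet_offset, octet_length)}, total_octets)."""
    offsets, bit = {}, 0
    for name, width_bits in fields:
        offsets[name] = (bit // 8, width_bits // 8)
        bit += width_bits
    return offsets, bit // 8


def extract_immediate_data(segment):
    """Return the 8 octets of Immediate Data from a received segment."""
    offsets, total = field_offsets(IMMEDIATE_DATA_FIELDS)
    if len(segment) < total:
        raise ValueError("segment shorter than the Immediate Data layout")
    start, length = offsets["immediate_data"]
    return segment[start:start + length]
```

With the widths above, the Immediate Data field begins at octet 18 and
the whole segment is 26 octets long.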
Authors' Addresses
Hemal Shah
Broadcom Corporation
5300 California Avenue
Irvine, CA 92617
US
Phone: 1-949-926-6941
EMail: hemal@broadcom.com
Felix Marti
Chelsio Communications, Inc.
370 San Aleso Ave.
Sunnyvale, CA 94085
US
Phone: 1-408-962-3600
EMail: felix@chelsio.com
Asgeir Eiriksson
Chelsio Communications, Inc.
370 San Aleso Ave.
Sunnyvale, CA 94085
US
Phone: 1-408-962-3600
EMail: asgeir@chelsio.com
Wael Noureddine
Chelsio Communications, Inc.
370 San Aleso Ave.
Sunnyvale, CA 94085
US
Phone: 1-408-962-3600
EMail: wael@chelsio.com
Robert Sharp
Intel Corporation
1300 South Mopac Expy, Mailstop: AN4-4B
Austin, TX 78746
US
Phone: 1-512-362-1407
EMail: robert.o.sharp@intel.com