doc/rfc/rfc3453.txt


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011

Network Working Group                                            M. Luby
Request for Comments: 3453                              Digital Fountain
Category: Informational                                      L. Vicisano
                                                                   Cisco
                                                              J. Gemmell
                                                               Microsoft
                                                                L. Rizzo
                                                              Univ. Pisa
                                                              M. Handley
                                                                    ICIR
                                                            J. Crowcroft
                                                         Cambridge Univ.
                                                           December 2002


    The Use of Forward Error Correction (FEC) in Reliable Multicast

Status of this Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard of any kind.  Distribution of this
   memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2002).  All Rights Reserved.

Abstract

   This memo describes the use of Forward Error Correction (FEC) codes
   to efficiently provide and/or augment reliability for one-to-many
   reliable data transport using IP multicast.  One of the key
   properties of FEC codes in this context is the ability to use the
   same packets containing FEC data to simultaneously repair different
   packet loss patterns at multiple receivers.  Different classes of FEC
   codes and some of their basic properties are described and
   terminology relevant to implementing FEC in a reliable multicast
   protocol is introduced.  Examples are provided of possible abstract
   formats for packets carrying FEC.


Luby, et. al.                Informational                      [Page 1]
^L
RFC 3453               FEC in Reliable Multicast           December 2002


Table of Contents

   1. Rationale and Overview . . . . . . . . . . . . . . . . . . . .   2
     1.1. Application of FEC codes . . . . . . . . . . . . . . . . .   5
   2. FEC Codes. . . . . . . . . . . . . . . . . . . . . . . . . . .   6
     2.1. Simple codes . . . . . . . . . . . . . . . . . . . . . . .   6
     2.2. Small block FEC codes. . . . . . . . . . . . . . . . . . .   8
     2.3. Large block FEC codes. . . . . . . . . . . . . . . . . . .  10
     2.4. Expandable FEC codes . . . . . . . . . . . . . . . . . . .  11
     2.5. Source blocks with variable length source symbols. . . . .  13
   3. Security Considerations. . . . . . . . . . . . . . . . . . . .  14
   4. Intellectual Property Disclosure . . . . . . . . . . . . . . .  14
   5. Acknowledgments. . . . . . . . . . . . . . . . . . . . . . . .  15
   6. References . . . . . . . . . . . . . . . . . . . . . . . . . .  15
   7. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . .  17
   8. Full Copyright Statement . . . . . . . . . . . . . . . . . . .  18

1.  Rationale and Overview

   There are many ways to provide reliability for transmission
   protocols.  A common method is to use ARQ, automatic request for
   retransmission.  With ARQ, receivers use a back channel to the sender
   to send requests for retransmission of lost packets.  ARQ works well
   for one-to-one reliable protocols, as evidenced by the pervasive
   success of TCP/IP.  ARQ has also been an effective reliability tool
   for one-to-many reliability protocols, and in particular for some
   reliable IP multicast protocols.  However, for one-to-very-many
   reliability protocols, ARQ has limitations, including the feedback
   implosion problem because many receivers are transmitting back to the
   sender, and the need for a back channel to send these requests from
   the receiver.  Another limitation is that receivers may experience
   different loss patterns of packets, and thus receivers may be delayed
   by retransmission of packets that other receivers have lost that but
   they have already received.  This may also cause wasteful use of
   bandwidth used to retransmit packets that have already been received
   by many of the receivers.

   In environments where ARQ is either costly or impossible because
   there is either a very limited capacity back channel or no back
   channel at all, such as satellite transmission, a Data Carousel
   approach to reliability is sometimes used [1].  With a Data Carousel,
   the sender partitions the object into equal length pieces of data,
   which we hereafter call source symbols, places them into packets, and
   then continually cycles through and sends these packets.  Receivers
   continually receive packets until they have received a copy of each
   packet.  Data Carousel has the advantage that it requires no back
   channel because there is no data that flows from receivers to the
   sender.  However, Data Carousel also has limitations.  For example,


Luby, et. al.                Informational                      [Page 2]
^L
RFC 3453               FEC in Reliable Multicast           December 2002


   if a receiver loses a packet in one round of transmission it must
   wait an entire round before it has a chance to receive that packet
   again.  This may also cause wasteful use of bandwidth, as the sender
   continually cycles through and transmits the packets until no
   receiver is missing a packet.

   Forward Error Correction (FEC) codes provide a reliability method
   that can be used to augment or replace other reliability methods,
   especially for one-to-many reliability protocols such as reliable IP
   multicast.  We first briefly review some of the basic properties and
   types of FEC codes before reviewing their uses in the context of
   reliable IP multicast.  Later, we provide a more detailed description
   of some of FEC codes.

   In the general literature, FEC refers to the ability to overcome both
   erasures (losses) and bit-level corruption.  However, in the case of
   an IP multicast protocol, the network layers will detect corrupted
   packets and discard them or the transport layers can use packet
   authentication to discard corrupted packets.  Therefore the primary
   application of FEC codes to IP multicast protocols is as an erasure
   code.  The payloads are generated and processed using an FEC erasure
   encoder and objects are reassembled from reception of packets
   containing the generated encoding using the corresponding FEC erasure
   decoder.

   The input to an FEC encoder is some number k of equal length source
   symbols.  The FEC encoder generates some number of encoding symbols
   that are of the same length as the source symbols.  The chosen length
   of the symbols can vary upon each application of the FEC encoder, or
   it can be fixed.  These encoding symbols are placed into packets for
   transmission.  The number of encoding symbols placed into each packet
   can vary on a per packet basis, or a fixed number of symbols (often
   one) can be placed into each packet.  Also, in each packet is placed
   enough information to identify the particular encoding symbols
   carried in that packet.  Upon receipt of packets containing encoding
   symbols, the receiver feeds these encoding symbols into the
   corresponding FEC decoder to recreate an exact copy of the k source
   symbols.  Ideally, the FEC decoder can recreate an exact copy from
   any k of the encoding symbols.

   In a later section, we describe a technique for using FEC codes as
   described above to handle blocks with variable length source symbols.

   Block FEC codes work as follows.  The input to a block FEC encoder is
   k source symbols and a number n.  The encoder generates a total of n
   encoding symbols.  The encoder is systematic if it generates n-k
   redundant symbols yielding an encoding block of n encoding symbols in
   total composed of the k source symbols and the n-k redundant symbols.


Luby, et. al.                Informational                      [Page 3]
^L
RFC 3453               FEC in Reliable Multicast           December 2002


   A block FEC decoder has the property that any k of the n encoding
   symbols in the encoding block is sufficient to reconstruct the
   original k source symbols.

   Expandable FEC codes work as follows.  An expandable FEC encoder
   takes as input k source symbols and generates as many unique encoding
   symbols as requested on demand, where the amount of time for
   generating each encoding symbol is the same independent of how many
   encoding symbols are generated.  An expandable FEC decoder has the
   property that any k of the unique encoding symbols is sufficient to
   reconstruct the original k source symbols.

   The above definitions explain the ideal situation when the reception
   of any k encoding symbols is sufficient to recover the k source
   symbols, in which case the reception overhead is 0%.  For some
   practical FEC codes, slightly more than k encoding symbols are needed
   to recover the k source symbols.  If k*(1+ep) encoding symbols are
   needed, we say the reception overhead is ep*100%, e.g., if k*1.05
   encoding symbols are needed then the reception overhead is 5%.

   Along a different dimension, we classify FEC codes loosely as being
   either small or large.  A small FEC code is efficient in terms of
   processing time requirements for encoding and decoding for small
   values of k, and a large FEC code is efficient for encoding and
   decoding for much large values of k.  There are implementations of
   block FEC codes that have encoding times proportional to n-k times
   the length of the k source symbols, and decoding times proportional
   to l times the length of the k source symbols, where l is the number
   of missing source symbols among the k received encoding symbols and l
   can be as large as k.  Because of the growth rate of the encoding and
   decoding times as a product of k and n-k, these are typically
   considered to be small block FEC codes.  There are block FEC codes
   with a small reception overhead that can generate n encoding symbols
   and can decode the k source symbols in time proportional to the
   length of the n encoding symbols.  These codes are considered to be
   large block FEC codes.  There are expandable FEC codes with a small
   reception overhead that can generate each encoding symbol in time
   roughly proportional to its length, and can decode all k source
   symbols in time roughly proportional to their length.  These are
   considered to be large expandable FEC codes.  We describe examples of
   all of these types of codes later.

   Ideally, FEC codes in the context of IP multicast can be used to
   generate encoding symbols that are transmitted in packets in such a
   way that each received packet is fully useful to a receiver to
   reassemble the object regardless of previous packet reception
   patterns.  Thus, if some packets are lost in transit between the
   sender and the receiver, instead of asking for specific


Luby, et. al.                Informational                      [Page 4]
^L
RFC 3453               FEC in Reliable Multicast           December 2002


   retransmission of the lost packets or waiting till the packets are
   resent using Data Carousel, the receiver can use any other subsequent
   equal number of packets that arrive to reassemble the object.  These
   packets can either be proactively transmitted or they can be
   explicitly requested by receivers.  This implies that the same packet
   is fully useful to all receivers to reassemble the object, even
   though the receivers may have previously experienced different packet
   loss patterns.  This property can reduce or even eliminate the
   problems mentioned above associated with ARQ and Data Carousel and
   thereby dramatically increase the scalability of the protocol to
   orders of magnitude more receivers.

1.1.  Application of FEC codes

   For some reliable IP multicast protocols, FEC codes are used in
   conjunction with ARQ to provide reliability.  For example, a large
   object could be partitioned into a number of source blocks consisting
   of a small number of source symbols each, and in a first round of
   transmission all of the source symbols for all the source blocks
   could be transmitted.  Then, receivers could report back to the
   sender the number of source symbols they are missing from each source
   block.  The sender could then compute the maximum number of missing
   source symbols from each source block among all receivers.  Based on
   this, a small block FEC encoder could be used to generate for each
   source block a number of redundant symbols equal to the computed
   maximum number of missing source symbols from the block among all
   receivers, as long as this maximum maximum for each block does not
   exceed the number of redundant symbols that can be generated
   efficiently.  In a second round of transmission, the server would
   then send all of these redundant symbols for all blocks.  In this
   example, if there are no losses in the second round of transmission
   then all receivers will be able to recreate an exact copy of each
   original block.  In this case, even if different receivers are
   missing different symbols in different blocks, transmitted redundant
   symbols for a given block are useful to all receivers missing symbols
   from that block in the second round.

   For other reliable IP multicast protocols, FEC codes are used in a
   Data Carousel fashion to provide reliability, which we call an FEC
   Data Carousel.  For example, an FEC Data Carousel using a large block
   FEC code could work as follows.  The large block FEC encoder produces
   n encoding symbols considering all the k source symbols of an object
   as one block.  The sender cycles through and transmits the n encoding
   symbols in packets in the same order in each round.  An FEC Data
   Carousel can have much better protection against packet loss than a
   Data Carousel.  For example, a receiver can join the transmission at
   any point in time, and, as long as the receiver receives at least k
   encoding symbols during the transmission of the next n encoding


Luby, et. al.                Informational                      [Page 5]
^L
RFC 3453               FEC in Reliable Multicast           December 2002


   symbols, the receiver can completely recover the object.  This is
   true even if the receiver starts receiving packets in the middle of a
   pass through the encoding symbols.  This method can also be used when
   the object is partitioned into blocks and a short block FEC code is
   applied to each block separately.  In this case, as we explain in
   more detail below, it is useful to interleave the symbols from the
   different blocks when they are transmitted.

   Since any number of encoding symbols can be generated using an
   expandable FEC encoder, reliable IP multicast protocols that use
   expandable FEC codes generally rely solely on these codes for
   reliability.  For example, when an expandable FEC code is used in a
   FEC Data Carousel application, the encoding packets never repeat, and
   thus any k of the encoding symbols in the potentially unbounded
   number of encoding symbols are sufficient to recover the original k
   source symbols.

   For additional reliable IP multicast protocols, the method to obtain
   reliability is to generate enough encoding symbols so that each
   encoding symbol is transmitted only once (at most).  For example, the
   sender can decide a priori how many encoding symbols it will
   transmit, use an FEC code to generate that number of encoding symbols
   from the object, and then transmit the encoding symbols to all
   receivers.  This method is applicable to streaming protocols, for
   example, where the stream is partitioned into objects, the source
   symbols for each object are encoded into encoding symbols using an
   FEC code, and then the sets of encoding symbols for each object are
   transmitted one after the other using IP multicast.

2.  FEC Codes

2.1.  Simple codes

   There are some very simple codes that are effective for repairing
   packet loss under very low loss conditions.  For example, to provide
   protection from a single loss is to partition the object into fixed
   size source symbols and then add a redundant symbol that is the
   parity (XOR) of all the source symbols.  The size of a source symbol
   is chosen so that it fits perfectly into the payload of a packet,
   i.e. if the packet payload is 512 bytes then each source symbol is
   512 bytes.  The header of each packet contains enough information to
   identify the payload.  This is an example of encoding symbol ID.  The
   encoding symbol IDs can consist of two parts in this example.  The
   first part is an encoding flag that is equal to 1 if the encoding
   symbol is a source symbol and is equal to 0 if the encoding symbol is
   a redundant symbol.  The second part of the encoding symbol ID is a
   source symbol ID if the encoding flag is 1 and a redundant symbol ID
   if the encoding flag is 0.  The source symbol IDs can be numbered


Luby, et. al.                Informational                      [Page 6]
^L
RFC 3453               FEC in Reliable Multicast           December 2002


   from 0 to k-1 and the redundant symbol ID can be 0.  For example, if
   the object consists of four source symbols that have values a, b, c
   and d, then the value of the redundant symbol is e = a XOR b XOR c
   XOR d.  Then, the packets carrying these symbols look like:

            (1, 0: a), (1, 1: b), (1, 2: c), (1, 3: d), (0, 0: e).

   In this example, the encoding symbol ID consists of the first two
   values, where the first value is the encoding flag and the second
   value is either a source symbol ID or the redundant symbol ID.  The
   portion of the packet after the colon is the value of the encoding
   symbol.  Any single source symbol of the object can be recovered as
   the parity of all the other symbols.  For example, if packets

                  (1, 0: a), (1, 1: b), (1, 3: d), (0, 0: e)

   are received then the missing source symbol value with source symbol
   ID = 2 can be recovered by computing a XOR b XOR d XOR e = c.

   Another way of forming the encoding symbol ID is to let values
   0,...,k-1 correspond to the k source symbols and value k correspond
   to the redundant symbol that is the XOR of the k source symbols.

   When the number of source symbols in the object is large, a simple
   block code variant of the above can be used.  In this case, the
   source symbols are grouped together into source blocks of some number
   k of consecutive symbols each, where k may be different for different
   blocks.  If a block consists of k source symbols then a redundant
   symbol is added to form an encoding block consisting of k+1 encoding
   symbols.  Then, a source block consisting of k source symbols can be
   recovered from any k of the k+1 encoding symbols from the associated
   encoding block.

   Slightly more sophisticated ways of adding redundant symbols using
   parity can also be used.  For example, one can group a block
   consisting of k source symbols in an object into a p x p square
   matrix, where p = sqrt(k).  Then, for each row a redundant symbol is
   added that is the parity of all the source symbols in the row.
   Similarly, for each column a redundant symbol is added that is the
   parity of all the source symbols in the column.  Then, any row of the
   matrix can be recovered from any p of the p+1 symbols in the row, and
   similarly for any column.  Higher dimensional product codes using
   this technique can also be used.  However, one must be wary of using
   these constructions without some thought towards the possible loss
   patterns of symbols.  Ideally, the property that one would like to
   obtain is that if k source symbols are encoded into n encoding
   symbols (the encoding symbols consist of the source symbols and the
   redundant symbols) then the k source symbols can be recovered from


Luby, et. al.                Informational                      [Page 7]
^L
RFC 3453               FEC in Reliable Multicast           December 2002


   any k of the n encoding symbols.  Using the simple constructions
   described above does not yield codes that come close to obtaining
   this ideal behavior.

2.2.  Small block FEC codes

   Reliable IP multicast protocols may use a block (n, k) FEC code [2].
   For such codes, k source symbols are encoded into n > k encoding
   symbols, such that any k of the encoding symbols can be used to
   reassemble the original k source symbols.  Thus, these codes have no
   reception overhead when used to encode the entire object directly.
   Block codes are usually systematic, which means that the n encoding
   symbols consist of the k source symbols and n-k redundant symbols
   generated from these k source symbols, where the size of a redundant
   symbol is the same as that for a source symbol.  For example, the
   first simple code (XOR) described in the previous subsection is a
   (k+1, k) code.  In general, the freedom to choose n larger than k+1
   is desirable, as this can provide much better protection against
   losses.  A popular example of these types of codes is a class of
   Reed-Solomon codes, which are based on algebraic methods using finite
   fields.  Implementations of (n, k) FEC erasure codes are efficient
   enough to be used by personal computers [16].  For example, [15]
   describes an implementation where the encoding and decoding speeds
   decay as C/j, where the constant C is on the order of 10 to 80
   Mbytes/second for Pentium class machines of various vintages and j is
   upper bounded by min(k, n-k).

   In practice, the values of k and n must be small (for example below
   256) for such FEC codes as large values make encoding and decoding
   prohibitively expensive.  As many objects are longer than k symbols
   for reasonable values of k and the symbol length (e.g. larger than 16
   kilobyte for k = 16 using 1 kilobyte symbols), they can be divided
   into a number of source blocks.  Each source block consists of some
   number k of source symbols, where k may vary between different source
   blocks.  The FEC encoder is used to encode a k source symbol source
   block into a n encoding symbol encoding block, where the number n of
   encoding symbols in the encoding block may vary for each source
   block.  For a receiver to completely recover the object, for each
   source block consisting of k source symbols, k distinct encoding
   symbols (i.e., with different encoding symbol IDs) must be received
   from the corresponding encoding block.  For some encoding blocks,
   more encoding symbols may be received than there are source symbols
   in the corresponding source block, in which case the excess encoding
   symbols are discarded.  An example encoding structure is shown in
   Figure 1.


Luby, et. al.                Informational                      [Page 8]
^L
RFC 3453               FEC in Reliable Multicast           December 2002


       |   source symbol IDs   |   source symbols IDs  |
       |   of source block 0   |   of source block 1   |
                    |                          |
                    v                          v
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
       |0 |1 |2 |3 |4 |5 |6 |7 |0 |1 |2 |3 | 4|5 |6 |7 |
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
                               |
                           FEC encoder
                               |
                               v
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |0 |1 |2 |3 | 4| 5| 6| 7| 8| 9| 0| 1| 2| 3| 4| 5| 6| 7| 8| 9|
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
                  ^                             ^
                  |                             |
   |  encoding symbol IDs        | encoding symbol IDs         |
   |  of encoding block 0        | of encoding block 1         |

   Figure 1. The encoding structure for an object divided into two
   source blocks consisting of 8 source symbols each, and the FEC
   encoder is used to generate 2 additional redundant symbols (10
   encoding symbols in total) for each of the two source blocks.

   In many cases, an object is partitioned into equal length source
   blocks each consisting of k contiguous source symbols of the object,
   i.e., block c consists of the range of source symbols [ck, (c+1)k-1].
   This ensures that the FEC encoder can be optimized to handle a
   particular number k of source symbols.  This also ensures that memory
   references are local when the sender reads source symbols to encode,
   and when the receiver reads encoding symbols to decode.  Locality of
   reference is particularly important when the object is stored on
   disk, as it reduces the disk seeks required.  The block number and
   the source symbol ID within that block can be used to uniquely
   specify a source symbol within the object. If the size of the object
   is not a multiple of k source symbols, then the last source block
   will contain less than k symbols.

   The block numbers can be numbered consecutively starting from zero.
   Encoding symbols within a block can be uniquely identified by an
   encoding symbol ID.  One way of identifying encoding symbols within a
   block is to use the combination of an encoding flag that identifies
   the symbol as either a source symbol or a redundant symbol together
   with either a source symbol ID or a redundant symbol ID.  For
   example, an encoding flag value of 1 can indicate that the encoding
   symbol is a source symbol and 0 can indicate that it is a redundant
   symbol.  The source symbol IDs can be numbered from 0 to k-1 and the
   redundant symbol IDs can be numbered from 0 to n-k-1.


Luby, et. al.                Informational                      [Page 9]
^L
RFC 3453               FEC in Reliable Multicast           December 2002


   For example, if the object consists 10 source symbols with values a,
   b, c, d, e, f, g, h, i, and j, and k = 5 and n = 8, then there are
   two source blocks consisting of 5 symbols each, and there are two
   encoding blocks consisting of 8 symbols each.  Let p, q and r be the
   values of the redundant symbols for the first encoding block, and let
   x, y and z be the values of the redundant symbols for the second
   encoding block.  Then the encoding symbols together with their
   identifiers are

   (0, 1, 0: a), (0, 1, 1: b), (0, 1, 2: c), (0, 1, 3: d), (0, 1, 4: e),
   (0, 0, 0: p), (0, 0, 1: q), (0, 0, 2: r),
   (1, 1, 0: f), (1, 1, 1: g), (1, 1, 2: h), (1, 1, 3: i), (1, 1, 4: j),
   (1, 0, 0: x), (1, 0, 1: y), (1, 0, 2: z).

   In this example, the first value identifies the block number and the
   second two values together identify the encoding symbol within the
   block, i.e, the encoding symbol ID consists of the encoding flag
   together with either the source symbol ID or the redundant symbol ID
   depending on the value of the encoding flag.  The value of the
   encoding symbol is written after the colon.  Each block can be
   recovered from any 5 of the 8 encoding symbols associated with that
   block.  For example, reception of

    (0, 1, 1: b), (0, 1, 2: c), (0, 1, 3: d), (0, 0, 0: p), (0, 0, 1: q)

   is sufficient to recover the first source block, and reception of

    (1, 1, 0: f), (1, 1, 1: g), (1, 0, 0: x), (1, 0, 1: y), (1, 0, 2: z)

   is sufficient to recover the second source block.

   Another way of uniquely identifying encoding symbols within a block
   is to let the encoding symbol IDs for source symbols be 0,...,k-1 and
   to let the encoding symbol IDs for redundant symbols be k,...,n-1.

2.3.  Large block FEC codes

   Tornado codes [12], [13], [10], [11], [9] are large block FEC codes
   that provide an alternative to small block FEC codes.  An (n, k)
   Tornado code requires slightly more than k out of n encoding symbols
   to recover k source symbols, i.e., there is a small reception
   overhead.  The benefit of the small cost of non-zero reception
   overhead is that the value of k may be on the order of tens of
   thousands and still the encoding and decoding are efficient.  Because
   of memory considerations, in practice the value of n is restricted to
   be a small multiple of k, e.g., n = 2k.  For example, [4] describes
   an implementation of Tornado codes where the encoding and decoding
   speeds are tens of megabytes per second for Pentium class machines of


Luby, et. al.                Informational                     [Page 10]
^L
RFC 3453               FEC in Reliable Multicast           December 2002


   various vintages when k is in the tens of thousands and n = 2k.  The
   reception overhead for such values of k and n is in the 5-10% range.
   Tornado codes require a large amount of out of band information to be
   communicated to all senders and receivers for each different object
   length, and require an amount of memory on the encoder and decoder
   which is proportional to the object length times 2n/k.

   Tornado codes are designed to have low reception overhead on average
   with respect to reception of a random portion of the encoding
   packets.  Thus, to ensure that a receiver can reassemble the object
   with low reception overhead, the packets are permuted into a random
   order before transmission.

2.4.  Expandable FEC codes

   All of the FEC codes described up to this point are block codes.
   There is a different type of FEC codes that we call expandable FEC
   codes.  Like block codes, an expandable FEC encoder operates on an
   object of known size that is partitioned into equal length source
   symbols.  Unlike block codes, there is no predetermined number of
   encoding symbols that can be generated for expandable FEC codes.
   Instead, an expandable FEC encoder can generate as few or as many
   unique encoding symbols as required on demand.

   LT codes [6], [7], [8], [5] are an example of large expandable FEC
   codes with a small reception overhead.  An LT encoder uses
   randomization to generate each encoding symbol randomly and
   independently of all other encoding symbols.  Like Tornado codes, the
   number of source symbols k may be very large for LT codes, i.e., on
   the order of tens to hundreds of thousands, and the encoder and
   decoder run efficiently in software.  For example the encoding and
   decoding speeds for LT codes are in the range 3-20 megabytes per
   second for Pentium class machines of various vintages when k is in
   the high tens of thousands.  An LT encoder can generate as few or as
   many encoding symbols as required on demand.  When a new encoding
   symbol is to be generated by an LT encoder, it is based on a randomly
   chosen encoding symbol ID that uniquely describes how the encoding
   symbol is to be generated from the source symbols. In general, each
   encoding symbol ID value corresponds to a unique encoding symbol, and
   thus the space of possible encoding symbols is approximately four
   billion if for example the encoding symbol ID is 4 bytes.  Thus, the
   chance that a particular encoding symbol is the same as any other
   particular encoding symbol is inversely proportional to the number of
   possible encoding symbol IDs.  An LT decoder has the property that
   with very high probability the receipt of any set of slightly more
   than k randomly and independently generated encoding symbols is
   sufficient to reassemble the k source symbols.  For example, when k


Luby, et. al.                Informational                     [Page 11]
^L
RFC 3453               FEC in Reliable Multicast           December 2002


   is on the order of tens to hundreds of thousands the reception
   overhead is less than 5% with no failures in hundreds of millions of
   trials under any loss conditions.

   Because encoding symbols are randomly and independently generated by
   choosing random encoding symbol IDs, LT codes have the property that
   encoding symbols for the same k source symbols can be generated and
   transmitted from multiple senders and received by a receiver and the
   reception overhead and the decoding time is the same as if though all
   the encoding symbols were generated by a single sender.  The only
   requirement is that the senders choose their encoding symbol IDs
   randomly and independently of one another.

   There is a weak tradeoff between the number of source symbols and the
   reception overhead for LT codes, and the larger the number of source
   symbols the smaller the reception overhead.  Thus, for shorter
   objects, it is sometimes advantageous to partition the object into
   many short source symbols and include multiple encoding symbols in
   each packet.  In this case, a single encoding symbol ID is used to
   identify the multiple encoding symbols contained in a single packet.

   There are a couple of factors for choosing the appropriate symbol
   length/ number of source symbols tradeoff. The primary consideration
   is that there is a fixed overhead per symbol in the overall
   processing requirements of the encoding and decoding, independent of
   the number of source symbols.  Thus, using shorter symbols means that
   this fixed overhead processing per symbol will be a larger component
   of the overall processing requirements, leading to larger overall
   processing requirements.  A second much less important consideration
   is that there is a component of the processing per symbol that
   depends logarithmically on the number of source symbols, and thus for
   this reason there is a slight preference towards fewer source
   symbols.

   Like small block codes, there is a point when the object is large
   enough that it makes sense to partition it into blocks when using LT
   codes.  Generally the object is partitioned into blocks whenever the
   number of source symbols times the packet payload length is less than
   the size of the object.  Thus, if the packet payload is 1024 bytes
   and the maximum number of source symbols is 128,000 then any object
   over 128 megabytes will be partitioned into more than one block.  One
   can choose the maximum number of source symbols to use, depending on
   the desired encoding and decoding speed versus reception overhead
   tradeoff desired.  Encoding symbols can be uniquely identified by a
   block number (when the object is large enough to be partitioned into
   more than one block) and an encoding symbol ID.  The block numbers,
   if they are used, are generally numbered consecutively starting from
   zero within the object.  The block number and the encoding symbol ID


Luby, et. al.                Informational                     [Page 12]
^L
RFC 3453               FEC in Reliable Multicast           December 2002


   are both chosen uniformly and randomly from their range when an
   encoding symbol is to be transmitted.  For example, suppose the
   number of source symbols is 128,000 and the number of blocks is 2.
   Then, each packet generated by the LT encoder could be of the form
   (b, x: y).  In this example, the first value identifies the block
   number and the second value identifies the encoding symbol within the
   block.  In this example, the block number b is either 0 or 1, and the
   encoding symbol ID x might be a 32-bit value.  The value y after the
   colon is the value of the encoding symbol.

2.5.  Source blocks with variable length source symbols

   For all the FEC codes described above, all the source symbols in the
   same source block are all of the same length.  In this section, we
   describe a general technique to handle the case when it is desirable
   to use source symbols of varying lengths in a single source block.
   This technique is applicable to block FEC codes.

   Let l_1, l_2, ... , l_k be the lengths in bytes of k varying length
   source symbols to be considered part of the same source block.  Let
   lmax be the maximum over i = 1, ... , k of l_i.  To prepare the
   source block for the FEC encoder, pad each source symbol i out to
   length lmax with a suffix of lmax-l_i zeroes, and then prepend to the
   beginning of this the value l_i.  Thus, each padded source symbol is
   of length x+lmax, assuming that it takes x bytes to store an integer
   with possible values 0,...,lmax, where x is a protocol constant known
   to both the encoder and the decoder.  These padded source symbols,
   each of length x+lmax, are the input to the encoder, together with
   the value n.  The encoder then generates n-k redundant symbols, each
   of length x+lmax.

   The encoding symbols that are placed into packets consist of the
   original k varying length source symbols and n-k redundant symbols,
   each of length x+lmax.  From any k of the received encoding symbols,
   the FEC decoder recreates the k original source symbols as follows.
   If all k original source symbols are received, then no decoding is
   necessary.  Otherwise, at least one redundant symbol is received,
   from which the receiver can easily determine if the block is composed
   of variable- length source symbols: if the redundant symbol(s) is
   longer than the source symbols then the source symbols are variable-
   length.  Note that in a variable-length block the redundant symbols
   are always longer than the longest source symbol, due to the presence
   of the prepended symbol- length.  The receiver can determine the
   value of lmax by subtracting x from the length of a received
   redundant symbol.  For each of the received original source symbols,
   the receiver can generate the corresponding padded source symbol as
   described above.  Then, the input to the FEC decoder is the set of
   received redundant symbols, together with the set of padded source


Luby, et. al.                Informational                     [Page 13]
^L
RFC 3453               FEC in Reliable Multicast           December 2002


   symbols constructed from the received original symbols.  The FEC
   decoder then produces the set of k padded source symbols.  Once the k
   padded source symbols have been recovered, the length l_i of original
   source symbol i can be recovered from the first x bytes of the ith
   padded source symbol, and then original source symbol i is obtained
   by taking the next l_i bytes following the x bytes of the length
   field.

3.  Security Considerations

   The use of FEC, in and of itself, imposes no additional security
   considerations versus sending the same information without FEC.
   However, just like for any transmission system, a malicious sender
   may try to inject packets carrying corrupted encoding symbols.  If a
   receiver accepts one or more corrupted encoding symbol, in place of
   authentic ones, then such a receiver may reconstruct a corrupted
   object.

   Application-level transmission object authentication can detect the
   corrupted transfer, but the receiver must discard the transferred
   object.  By injecting corrupted encoding symbols, they are accepted
   as valid encoding symbols by a receiver, which at the very least, is
   an effective denial of service attack.

   In light of this possibility, FEC receivers may screen the source
   address of a received symbol against a list of authentic transmitter
   addresses.  Since source addresses may be spoofed, transport
   protocols using FEC may provide mechanisms for robust source
   authentication of each encoding symbol.  Multicast routers along the
   path of a FEC transfer may provide the capability of discarding
   multicast packets that originated on that subnet, and whose source IP
   address does not correspond with that subnet.

   It is recommended that a packet authentication scheme such as TESLA
   [14] be used in conjunction with FEC codes.  Then, packets that
   cannot be authenticated can be discarded and the object can be
   reliably recovered from the received authenticated packets.

4.  Intellectual Property Disclosure

   The IETF has been notified of intellectual property rights claimed in
   regard to some or all of the specification contained in this
   document.  For more information consult the online list of claimed
   rights.


Luby, et. al.                Informational                     [Page 14]
^L
RFC 3453               FEC in Reliable Multicast           December 2002


5.  Acknowledgments

   Thanks to Vincent Roca and Hayder Radha for their detailed comments
   on this document.

6.  References

   [1]  Acharya, S., Franklin, M. and S. Zdonik, "Dissemination - Based
        Data Delivery Using Broadcast Disks", IEEE Personal
        Communications, pp.50-60, Dec 1995.

   [2]  Blahut, R.E., "Theory and Practice of Error Control Codes",
        Addison Wesley, MA, 1984.

   [3]  Bradner, S., "The Internet Standards Process -- Revision 3", BCP
        9, RFC 2026, October 1996.

   [4]  Byers, J.W., Luby, M., Mitzenmacher, M. and A. Rege, "A Digital
        Fountain Approach to Reliable Distribution of Bulk Data",
        Proceedings ACM SIGCOMM '98, Vancouver, Canada, Sept 1998.

   [5]  Haken, A., Luby, M., Horn, G., Hernek, D., Byers, J. and M.
        Mitzenmacher, "Generating high weight encoding symbols using a
        basis", U.S. Patent No. 6,411,223, June 25, 2002.

   [6]  Luby, M., "Information Additive Code Generator and Decoder for
        Communication Systems", U.S. Patent No. 6,307,487, October 23,
        2001.

   [7]  Luby, M., "Information Additive Group Code Generator and Decoder
        for Communication Systems", U.S. Patent No. 6,320,520, November
        20, 2001.

   [8]  Luby, M., "Information Additive Code Generator and Decoder for
        Communication Systems", U.S. Patent No. 6,373,406, April 16,
        2002.

   [9]  Luby, M. and M. Mitzenmacher, "Loss resilient code with double
        heavy tailed series of redundant layers", U.S. Patent No.
        6,195,777, February 27, 2001.

   [10] Luby, M., Mitzenmacher, M., Shokrollahi, A., Spielman, D. and V.
        Stemann, "Message encoding with irregular graphing", U.S. Patent
        No. 6,163,870, December 19, 2000.


Luby, et. al.                Informational                     [Page 15]
^L
RFC 3453               FEC in Reliable Multicast           December 2002


   [11] Luby, M., Mitzenmacher, M., Shokrollahi, A. and D. Spielman,
        "Efficient Erasure Correcting Codes", IEEE Transactions on
        Information Theory, Special Issue: Codes on Graphs and Iterative
        Algorithms, pp.  569-584, Vol. 47, No. 2, February 2001.

   [12] Luby, M., Shokrollahi, A., Stemann, V., Mitzenmacher, M. and D.
        Spielman, "Loss resilient decoding technique", U.S. Patent No.
        6,073,250, June 6, 2000.

   [13] Luby, M., Shokrollahi, A., Stemann, V., Mitzenmacher, M. and D.
        Spielman, "Irregularly graphed encoding technique", U.S. Patent
        No.  6,081,909, June 27, 2000.

   [14] Perrig, A., Canetti, R., Song, D. and J.D. Tygar, "Efficient and
        Secure Source Authentication for Multicast", Network and
        Distributed System Security Symposium, NDSS 2001, pp. 35-46,
        February 2001.

   [15] Rizzo, L., "Effective Erasure Codes for Reliable Computer
        Communication Protocols", ACM SIGCOMM Computer Communication
        Review, Vol.27, No.2, pp.24-36, Apr 1997.

   [16] Rizzo, L., "On the Feasibility of Software FEC", DEIT Tech
        Report, http://www.iet.unipi.it/~luigi/softfec.ps, Jan 1997.


Luby, et. al.                Informational                     [Page 16]
^L
RFC 3453               FEC in Reliable Multicast           December 2002


7.  Authors' Addresses

   Michael Luby
   Digital Fountain
   39141 Civic Center Drive
   Suite 300
   Fremont, CA  94538

   EMail: luby@digitalfountain.com

   Lorenzo Vicisano
   Cisco Systems, Inc.
   170 West Tasman Dr.,
   San Jose, CA, USA, 95134

   EMail: lorenzo@cisco.com

   Jim Gemmell
   Microsoft Research
   455 Market St. #1690
   San Francisco, CA, 94105

   EMail: jgemmell@microsoft.com

   Luigi Rizzo
   Dip. di Ing. dell'Informazione
   Universita` di Pisa
   via Diotisalvi 2, 56126 Pisa, Italy

   EMail:luigi@iet.unipi.it

   Mark Handley
   ICSI Center for Internet Research
   1947 Center St.
   Berkeley CA, USA, 94704

   EMail: mjh@icir.org

   Jon Crowcroft
   Marconi Professor of Communications Systems
   University of Cambridge
   Computer Laboratory
   William Gates Building
   J J Thomson Avenue
   Cambridge
   CB3 0FD

   EMail: Jon.Crowcroft@cl.cam.ac.uk


Luby, et. al.                Informational                     [Page 17]
^L
RFC 3453               FEC in Reliable Multicast           December 2002


8.  Full Copyright Statement

   Copyright (C) The Internet Society (2002).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.


Luby, et. al.                Informational                     [Page 18]
^L