Diffstat (limited to 'doc/rfc/rfc4997.txt')
-rw-r--r-- | doc/rfc/rfc4997.txt | 3475 |
1 files changed, 3475 insertions, 0 deletions
diff --git a/doc/rfc/rfc4997.txt b/doc/rfc/rfc4997.txt
new file mode 100644
index 0000000..08b73ce
--- /dev/null
+++ b/doc/rfc/rfc4997.txt
@@ -0,0 +1,3475 @@
+Network Working Group                                         R. Finking
+Request for Comments: 4997                   Siemens/Roke Manor Research
+Category: Standards Track                                   G. Pelletier
+                                                                Ericsson
+                                                               July 2007
+
+        Formal Notation for RObust Header Compression (ROHC-FN)
+
+Status of This Memo
+
+   This document specifies an Internet standards track protocol for the
+   Internet community, and requests discussion and suggestions for
+   improvements. Please refer to the current edition of the "Internet
+   Official Protocol Standards" (STD 1) for the standardization state
+   and status of this protocol. Distribution of this memo is unlimited.
+
+Copyright Notice
+
+   Copyright (C) The IETF Trust (2007).
+
+Abstract
+
+   This document defines Robust Header Compression - Formal Notation
+   (ROHC-FN), a formal notation to specify field encodings for
+   compressed formats when defining new profiles within the ROHC
+   framework. ROHC-FN offers a library of encoding methods that are
+   often used in ROHC profiles and can thereby help to simplify future
+   profile development work.
+
+Table of Contents
+
+   1.  Introduction
+   2.  Terminology
+   3.  Overview of ROHC-FN
+     3.1.  Scope of the Formal Notation
+     3.2.  Fundamentals of the Formal Notation
+       3.2.1.  Fields and Encodings
+       3.2.2.  Formats and Encoding Methods
+     3.3.  Example Using IPv4
+   4.  Normative Definition of ROHC-FN
+     4.1.  Structure of a Specification
+     4.2.  Identifiers
+     4.3.  Constant Definitions
+     4.4.  Fields
+       4.4.1.  Attribute References
+       4.4.2.  Representation of Field Values
+     4.5.  Grouping of Fields
+     4.6.  "THIS"
+     4.7.  Expressions
+       4.7.1.  Integer Literals
+       4.7.2.  Integer Operators
+       4.7.3.  Boolean Literals
+       4.7.4.  Boolean Operators
+       4.7.5.  Comparison Operators
+     4.8.  Comments
+     4.9.  "ENFORCE" Statements
+     4.10. Formal Specification of Field Lengths
+     4.11. Library of Encoding Methods
+       4.11.1.  uncompressed_value
+       4.11.2.  compressed_value
+       4.11.3.  irregular
+       4.11.4.  static
+       4.11.5.  lsb
+       4.11.6.  crc
+     4.12. Definition of Encoding Methods
+       4.12.1.  Structure
+       4.12.2.  Arguments
+       4.12.3.  Multiple Formats
+     4.13. Profile-Specific Encoding Methods
+   5.  Security Considerations
+   6.  Contributors
+   7.  Acknowledgements
+   8.  References
+     8.1.  Normative References
+     8.2.  Informative References
+   Appendix A.  Formal Syntax of ROHC-FN
+   Appendix B.  Bit-level Worked Example
+     B.1.  Example Packet Format
+     B.2.  Initial Encoding
+     B.3.  Basic Compression
+     B.4.  Inter-Packet Compression
+     B.5.  Specifying Initial Values
+     B.6.  Multiple Packet Formats
+     B.7.  Variable Length Discriminators
+     B.8.  Default Encoding
+     B.9.  Control Fields
+     B.10. Use of "ENFORCE" Statements as Conditionals
+
+1. Introduction
+
+   Robust Header Compression - Formal Notation (ROHC-FN) is a formal
+   notation designed to help with the definition of ROHC [RFC4995]
+   header compression profiles. Previous header compression profiles
+   have so far been specified using a combination of English text
+   together with ASCII Box notation. Unfortunately, this was sometimes
+   unclear and ambiguous, revealing the limitations of defining complex
+   structures and encodings for compressed formats this way. The
+   primary objective of the Formal Notation is to provide a more
+   rigorous means to define header formats -- compressed and
+   uncompressed -- as well as the relationships between them.
+   No other formal notation exists that meets these requirements, so
+   ROHC-FN aims to meet them.
+
+   In addition, ROHC-FN offers a library of encoding methods that are
+   often used in ROHC profiles, so that the specification of new
+   profiles using the formal notation can be achieved without having to
+   redefine this library from scratch. Informally, an encoding method
+   defines a two-way mapping between uncompressed data and compressed
+   data.
+
+2. Terminology
+
+   o  Compressed format
+
+      A compressed format consists of a list of fields that provides
+      bindings between encodings and the fields it compresses. One or
+      more compressed formats can be combined to represent an entire
+      compressed header format.
+
+   o  Context
+
+      Context is information about the current (de)compression state of
+      the flow. Specifically, a context for a specific field can be
+      either uninitialised, or it can include a set of one or more
+      values for the field's attributes defined by the compression
+      algorithm, where a value may come from the field's attributes
+      corresponding to a previous packet. See also a more generalized
+      definition in Section 2.2 of [RFC4995].
+
+   o  Control field
+
+      Control fields are transmitted from a ROHC compressor to a ROHC
+      decompressor, but are not part of the uncompressed header itself.
+
+   o  Encoding method, encodings
+
+      Encoding methods are two-way relations that can be applied to
+      compress and decompress fields of a protocol header.
+
+   o  Field
+
+      The protocol header is divided into a set of contiguous bit
+      patterns known as fields. Each field is defined by a collection
+      of attributes that indicate its value and length in bits for both
+      the compressed and uncompressed headers.
+      The way the header is divided into fields is specific to the
+      definition of a profile, and it is not necessary for the field
+      divisions to be identical to the ones given by the
+      specification(s) for the protocol header being compressed.
+
+   o  Library of encoding methods
+
+      The library of encoding methods contains a number of commonly used
+      encoding methods for compressing header fields.
+
+   o  Profile
+
+      A ROHC [RFC4995] profile is a description of how to compress a
+      certain protocol stack. Each profile consists of a set of formats
+      (for example, uncompressed and compressed formats) along with a
+      set of rules that control compressor and decompressor behaviour.
+
+   o  ROHC-FN specification
+
+      The specification of the set of formats of a ROHC profile using
+      ROHC-FN.
+
+   o  Uncompressed format
+
+      An uncompressed format consists of a list of fields that provides
+      the order of the fields to be compressed for a contiguous set of
+      bits whose bit layout corresponds to the protocol header being
+      compressed.
+
+3. Overview of ROHC-FN
+
+   This section gives an overview of ROHC-FN. It also explains how
+   ROHC-FN can be used to specify the compression of header fields as
+   part of a ROHC profile.
+
+3.1. Scope of the Formal Notation
+
+   This section explains how the formal notation relates to the ROHC
+   framework and to specifications of ROHC profiles.
+
+   The ROHC framework [RFC4995] provides the general principles for
+   performing robust header compression. It defines the concept of a
+   profile, which makes ROHC a general platform for different
+   compression schemes. It sets link layer requirements, and in
+   particular negotiation requirements, for all ROHC profiles. It
+   defines a set of common functions such as Context Identifiers (CIDs),
+   padding, and segmentation.
+   It also defines common formats (IR, IR-DYN, Feedback, Add-CID,
+   etc.), and finally it defines a generic, profile-independent
+   feedback mechanism.
+
+   A ROHC profile is a description of how to compress a certain protocol
+   stack. For example, ROHC profiles are available for RTP/UDP/IP and
+   many other protocol stacks.
+
+   At a high level, each ROHC profile consists of a set of formats
+   (defining the bits to be transmitted) along with a set of rules that
+   control compressor and decompressor behaviour. The purpose of the
+   formats is to define how to compress and decompress headers. The
+   formats define one or more compressed versions of each uncompressed
+   header, and simultaneously define the inverse: how to relate a
+   compressed header back to the original uncompressed header.
+
+   The set of formats will typically define compression of headers
+   relative to a context of field values from previous headers in a
+   flow, improving the overall compression by taking into account
+   redundancies between headers of successive packets. Therefore, in
+   addition to defining the formats, a profile has to:
+
+   o  specify how to manage the context for both the compressor and the
+      decompressor,
+
+   o  define when and what to send in feedback messages, if any, from
+      decompressor to compressor,
+
+   o  outline compression principles to make the profile robust against
+      bit errors and dropped packets.
+
+   All this is needed to ensure that the compressor and decompressor
+   contexts are kept consistent with each other, while still
+   facilitating the best possible compression performance.
+
+   The ROHC-FN is designed to help in the specification of compressed
+   formats that, when put together based on the profile definition, make
+   up the formats used in a ROHC profile.
+   It offers a library of encoding methods for compressing fields, and
+   a mechanism for combining these encoding methods to create
+   compressed formats tailored to a specific protocol stack.
+
+   The scope of ROHC-FN is limited to specifying the relationship
+   between the compressed and uncompressed formats. To form a complete
+   profile specification, the control logic for the profile behaviour
+   needs to be defined by other means.
+
+3.2. Fundamentals of the Formal Notation
+
+   There are two fundamental elements to the formal notation:
+
+   1.  Fields and their encodings, which define the mapping between a
+       header's uncompressed and compressed forms.
+
+   2.  Encoding methods, which define the way headers are broken down
+       into fields. Encoding methods define lists of uncompressed
+       fields and the lists of compressed fields they map onto.
+
+   These two fundamental elements are at the core of the notation and
+   are outlined below.
+
+3.2.1. Fields and Encodings
+
+   Headers are made up of fields. For example, version number, header
+   length, and sequence number are all fields used in real protocols.
+
+   Fields have attributes. Attributes describe various things about the
+   field. For example:
+
+     field.ULENGTH
+
+   The above indicates the uncompressed length of the field. A field is
+   said to have a value attribute, i.e., a compressed value or an
+   uncompressed value, if the corresponding length attribute is greater
+   than zero. See Section 4.4 for more details on field attributes.
+
+   The relationship between the compressed and uncompressed attributes
+   of a field is specified with encoding methods, using the following
+   notation:
+
+     field =:= encoding_method;
+
+   In the field definition above, the symbol "=:=" means "is encoded
+   by". This field definition does not represent an assignment
+   operation from the right-hand side to the left-hand side.
+   Instead, it is a two-way mapping between the compressed and
+   uncompressed attributes of the field. It represents both the
+   compression and the decompression operations in a single field
+   definition, through a process of two-way matching.
+
+   Two-way matching is a binary operation that attempts to make the
+   operands (i.e., the compressed and uncompressed attributes) match.
+   This is similar to the unification process in logic. The operands
+   represent one unspecified data object and one specified object.
+   Values can be matched from either operand.
+
+   During compression, the uncompressed attributes of the field are
+   already defined. The given encoding matches the compressed
+   attributes against them. During decompression, the compressed
+   attributes of the field are already defined, so the uncompressed
+   attributes are matched to the compressed attributes using the given
+   encoding method. Thus, both compression and decompression are
+   defined by a single field definition.
+
+   Therefore, an encoding method (including any parameters specified)
+   creates a reversible binding between the attributes of a field. At
+   the compressor, a format can be used if a set of bindings that is
+   successful for all the attributes in all its fields can be found. At
+   the decompressor, the operation is reversed using the same bindings
+   and the attributes in each field are filled according to the
+   specified bindings; decoding fails if the binding for an attribute
+   fails.
+
+   For example, the "static" encoding method creates a binding between
+   the attribute corresponding to the uncompressed value of the field
+   and the corresponding value of the field in the context.
+
+   o  For the compressor, the "static" binding is successful when both
+      the context value and the uncompressed value are the same. If the
+      two values differ then the binding fails.
+
+   o  For the decompressor, the "static" binding succeeds only if a
+      valid context entry containing the value of the uncompressed field
+      exists. Otherwise, the binding will fail.
+
+   Both the compressed and uncompressed forms of each field are
+   represented as a string of bits, most significant bit first, of
+   the length specified by the length attribute. The bit string is the
+   binary representation of the value attribute of the field, modulo
+   "2^length", where "length" is the length attribute of the field.
+   However, this is only the representation of the bits exchanged
+   between the compressor and the decompressor, designed to allow
+   maximum compression efficiency. The FN itself uses the full range of
+   integers. See Section 4.4.2 for further details.
+
+3.2.2. Formats and Encoding Methods
+
+   The ROHC-FN provides a library of commonly used encoding methods.
+   Encoding methods can be defined using plain English, or using a
+   formal definition consisting of, for example, a collection of
+   expressions (Section 4.7) and "ENFORCE" statements (Section 4.9).
+
+   ROHC-FN also provides mechanisms for combining fields and their
+   encoding methods into higher level encoding methods following a
+   well-defined structure. This is similar to the definition of
+   functions and procedures in an ordinary programming language. It
+   allows complexity to be handled by being broken down into manageable
+   parts. New encoding methods are defined at the top level of a
+   profile. These can then be used in the definition of other higher
+   level encoding methods, and so on.
+
+     new_encoding_method         // This block is an encoding method
+     {
+       UNCOMPRESSED {            // This block is an uncompressed format
+         field_1   [ 16 ];
+         field_2   [ 32 ];
+         field_3   [ 48 ];
+       }
+
+       CONTROL {                 // This block defines control fields
+         ctrl_field_1;
+         ctrl_field_2;
+       }
+
+       DEFAULT {                 // This block defines default encodings
+                                 // for specified fields
+         ctrl_field_2 =:= encoding_method_2;
+         field_1      =:= encoding_method_1;
+       }
+
+       COMPRESSED format_0 {     // This block is a compressed format
+         field_1;
+         field_2      =:= encoding_method_2;
+         field_3      =:= encoding_method_3;
+         ctrl_field_1 =:= encoding_method_4;
+         ctrl_field_2;
+       }
+
+       COMPRESSED format_1 {     // This block is a compressed format
+         field_1;
+         field_2      =:= encoding_method_3;
+         field_3      =:= encoding_method_4;
+         ctrl_field_2 =:= encoding_method_5;
+         ctrl_field_3 =:= encoding_method_6;  // This is a control field
+                                              // with no uncompressed value
+       }
+     }
+
+   In the example above, the encoding method being defined is called
+   "new_encoding_method". The section headed "UNCOMPRESSED" indicates
+   the order of fields in the uncompressed header, i.e., the
+   uncompressed header format. The number of bits in each of the fields
+   is indicated in square brackets. After this is another section,
+   "CONTROL", which defines two control fields. Following this is the
+   "DEFAULT" section which defines default encoding methods for two of
+   the fields (see below). Finally, two alternative compressed formats
+   follow, each defined in sections headed "COMPRESSED". The fields
+   that occur in the compressed formats are either:
+
+   o  fields that occur in the uncompressed format; or
+
+   o  control fields that have an uncompressed value and that occur in
+      the CONTROL section; or
+
+   o  control fields that do not have an uncompressed value and thus are
+      defined as part of the compressed format.
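
The three cases listed above can be mirrored in a short sketch. The Python below is only an illustration and is not part of ROHC-FN: the names `classify_field` and `classify_format` are hypothetical, while the field names are taken from the "new_encoding_method" example (compressed format "format_1").

```python
# Field lists copied from the "new_encoding_method" example.
UNCOMPRESSED = ["field_1", "field_2", "field_3"]
CONTROL = ["ctrl_field_1", "ctrl_field_2"]
FORMAT_1 = ["field_1", "field_2", "field_3", "ctrl_field_2", "ctrl_field_3"]


def classify_field(field, uncompressed, control):
    """Return which of the three cases a compressed-format field falls into.

    Hypothetical helper: it only inspects the declared field lists.
    """
    if field in uncompressed:
        return "uncompressed field"
    if field in control:
        return "control field with an uncompressed value"
    # Not declared in UNCOMPRESSED or CONTROL: it exists only in the
    # compressed format, so it has no uncompressed value.
    return "control field without an uncompressed value"


def classify_format(fields, uncompressed=UNCOMPRESSED, control=CONTROL):
    """Classify every field of one compressed format, keeping field order."""
    return {f: classify_field(f, uncompressed, control) for f in fields}


if __name__ == "__main__":
    for name, kind in classify_format(FORMAT_1).items():
        print(name, "->", kind)
```

Under these assumptions, "ctrl_field_3" is the only field of "format_1" that lands in the third case, since it is declared in neither the "UNCOMPRESSED" nor the "CONTROL" section.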
+
+   Central to each of these formats is a "field list", which defines the
+   fields contained in the format and also the order that those fields
+   appear in that format. For the "DEFAULT" and "CONTROL" sections, the
+   field order is not significant.
+
+   In addition to specifying field order, the field list may also
+   specify bindings for any or all of the fields it contains. Fields
+   that have no bindings defined for them are bound using the default
+   bindings specified in the "DEFAULT" section (see Section 4.12.1.5).
+
+   Fields from the compressed format have the same name as they do in
+   the uncompressed format. If there are any fields that are present
+   exclusively in the compressed format, but that do have an
+   uncompressed value, they must be declared in the "CONTROL" section of
+   the definition of the encoding method (see Section 4.12.1.3 for more
+   details on defining control fields).
+
+   Fields that have no uncompressed value do not appear in an
+   "UNCOMPRESSED" field list and do not have to appear in the "CONTROL"
+   field list either. Instead, they are only declared in the compressed
+   field lists where they are used.
+
+   In the example above, all the fields that appear in the compressed
+   format are also found in the uncompressed format, or the control
+   field list, except for ctrl_field_3; this is possible because
+   ctrl_field_3 has no "uncompressed" value at all. Fields such as a
+   checksum on the compressed information fall into this category.
+
+3.3. Example Using IPv4
+
+   This section gives an overview of how the notation is used by means
+   of an example. The example will develop the formal notation for an
+   encoding method capable of compressing a single, well-known header:
+   the IPv4 header [RFC791].
+
+   The first step is to specify the overall structure of the IPv4
+   header. To do this, we use an encoding method that we will call
+   "ipv4_header".
+   More details on definitions of encoding methods can be found in
+   Section 4.12. This is notated as follows:
+
+     ipv4_header
+     {
+
+   The fragment of notation above declares the encoding method
+   "ipv4_header"; the definition follows the opening brace (see
+   Section 4.12).
+
+   Definitions within the pair of braces are local to "ipv4_header".
+   This scoping mechanism helps to clarify which fields belong to which
+   formats; it is also useful when compressing complex protocol stacks
+   with several headers, often with the same field names occurring in
+   multiple headers (see Section 4.2).
+
+   The next step is to specify the fields contained in the uncompressed
+   IPv4 header to represent the uncompressed format for which the
+   encoding method will define one or more compressed formats. This is
+   accomplished using ROHC-FN as follows:
+
+       UNCOMPRESSED {
+         version        [  4 ];
+         header_length  [  4 ];
+         dscp           [  6 ];
+         ecn            [  2 ];
+         length         [ 16 ];
+         id             [ 16 ];
+         reserved       [  1 ];
+         dont_frag      [  1 ];
+         more_fragments [  1 ];
+         offset         [ 13 ];
+         ttl            [  8 ];
+         protocol       [  8 ];
+         checksum       [ 16 ];
+         src_addr       [ 32 ];
+         dest_addr      [ 32 ];
+       }
+
+   The width of each field is indicated in square brackets. This part
+   of the notation is used in the example for illustration to help the
+   reader's understanding. However, indicating the field lengths in
+   this way is optional since the width of each field can also normally
+   be derived from the encoding that is used to compress/decompress it
+   for a specific format. This part of the notation is formally defined
+   in Section 4.10.
+
+   The next step is to specify the compressed format. This includes the
+   encodings for each field that map between the compressed and
+   uncompressed forms of the field. In the example, these encoding
+   methods are mainly taken from the ROHC-FN library (see Section 4.11).
+   Since the intention here is to illustrate the use of the notation,
+   rather than to describe the optimum method of compressing IPv4
+   headers, this example uses only three encoding methods.
+
+   The "uncompressed_value" encoding method (defined in Section 4.11.1)
+   can compress any field whose uncompressed length and value are fixed,
+   or can be calculated using an expression. No compressed bits need to
+   be sent because the uncompressed field can be reconstructed using its
+   known size and value. The "uncompressed_value" encoding method is
+   used to compress five fields in the IPv4 header, as described below:
+
+       COMPRESSED {
+         header_length  =:= uncompressed_value(4, 5);
+         version        =:= uncompressed_value(4, 4);
+         reserved       =:= uncompressed_value(1, 0);
+         offset         =:= uncompressed_value(13, 0);
+         more_fragments =:= uncompressed_value(1, 0);
+
+   The first parameter indicates the length of the uncompressed field in
+   bits, and the second parameter gives its integer value.
+
+   Note that the order of the fields in the compressed format is
+   independent of the order of the fields in the uncompressed format.
+
+   The "irregular" encoding method (defined in Section 4.11.3) can be
+   used to encode any field for which both uncompressed attributes
+   (ULENGTH and UVALUE) are defined, and whose ULENGTH attribute is
+   either fixed or can be calculated using an expression. It is a
+   fail-safe encoding method that can be used for such fields in the
+   case where no other encoding method applies. All of the bits in the
+   uncompressed form of the field are present in the compressed form as
+   well; hence this encoding does not achieve any compression.
+
+         src_addr       =:= irregular(32);
+         dest_addr      =:= irregular(32);
+         length         =:= irregular(16);
+         id             =:= irregular(16);
+         ttl            =:= irregular(8);
+         protocol       =:= irregular(8);
+         dscp           =:= irregular(6);
+         ecn            =:= irregular(2);
+         dont_frag      =:= irregular(1);
+
+   Finally, the third encoding method is specific only to the
+   uncompressed format defined above for the IPv4 header,
+   "inferred_ip_v4_header_checksum":
+
+         checksum       =:= inferred_ip_v4_header_checksum [ 0 ];
+       }
+     }
+
+   The "inferred_ip_v4_header_checksum" encoding method is different
+   from the other two encoding methods in that it is not defined in the
+   ROHC-FN library of encoding methods. Its definition could be given
+   either by using the formal notation as part of the profile definition
+   itself (see Section 4.12) or by using plain English text (see
+   Section 4.13).
+
+   In our example, the "inferred_ip_v4_header_checksum" is a specific
+   encoding method that calculates the IP checksum from the rest of the
+   header values. Like the "uncompressed_value" encoding method, no
+   compressed bits need to be sent, since the field value can be
+   reconstructed at the decompressor. This is notated explicitly by
+   specifying, in square brackets, a length of 0 for the checksum field
+   in the compressed format. Again, this notation is optional since the
+   encoding method itself would be defined as sending zero compressed
+   bits; however, it is useful to the reader to include such notation
+   (see Section 4.10 for details on this part of the notation).
+
+   Finally, the definition of the format is terminated with a closing
+   brace. At this point, the above example has defined a compressed
+   format that can be used to represent the entire compressed IPv4
+   header, and provides enough information to allow an implementation to
+   construct the compressed format from an uncompressed format
+   (compression) and vice versa (decompression).
+
+4. Normative Definition of ROHC-FN
+
+   This section gives the normative definition of ROHC-FN. ROHC-FN is a
+   declarative language that is referentially transparent, with no side
+   effects. This means that whenever an expression is evaluated, there
+   are no other effects from obtaining the value of the expression; the
+   same expression is thus guaranteed to have the same value wherever it
+   appears in the notation, and it can always be interchanged with its
+   value in any of the formats it appears in (subject to the scope rules
+   of identifiers of Section 4.2).
+
+   The formal notation describes the structure of the formats and the
+   relationships between their uncompressed and compressed forms, rather
+   than describing how compression and decompression are performed.
+
+   In various places within this section, text inside angle brackets has
+   been used as a descriptive placeholder. The use of angle brackets in
+   this way is solely for the benefit of the reader of this document.
+   Neither the angle brackets nor their contents form a part of the
+   notation.
+
+4.1. Structure of a Specification
+
+   The specification of the compressed formats of a ROHC profile using
+   ROHC-FN is called a ROHC-FN specification. ROHC-FN specifications
+   are case sensitive and are written in the 7-bit ASCII character set
+   (as defined in [RFC2822]) and consist of a sequence of zero or more
+   constant definitions (Section 4.3), an optional global control field
+   list (Section 4.12.1.3) and one or more encoding method definitions
+   (Section 4.12).
+
+   Encoding methods can be defined using the formal notation or can be
+   predefined encoding methods.
+
+   Encoding methods are defined using the formal notation by giving one
+   or more uncompressed formats to represent the uncompressed header and
+   one or more compressed formats.
+   These formats are related to each other by "fields", each of which
+   describes a certain part of an uncompressed and/or a compressed
+   header. In addition to the formats, each encoding method may
+   contain control fields, initial values, and default field encodings
+   sections. The attributes of a field are bound by using an encoding
+   method for it and/or by using "ENFORCE" statements (Section 4.9)
+   within the formats. Each of these is terminated by a semicolon.
+
+   Predefined encoding methods are not defined in the formal notation.
+   Instead they are defined by giving a short textual reference
+   explaining where the encoding method is defined. It is not necessary
+   to define the library of encoding methods contained in this document
+   in this way; their definition is implicit to the usage of the formal
+   notation.
+
+4.2. Identifiers
+
+   In ROHC-FN, identifiers are used for any of the following:
+
+   o  encoding methods
+
+   o  formats
+
+   o  fields
+
+   o  parameters
+
+   o  constants
+
+   All identifiers may be of any length and may contain any combination
+   of alphanumeric characters and underscores, within the restrictions
+   defined in this section.
+
+   All identifiers must start with an alphabetic character.
+
+   It is illegal to have two or more identifiers that differ from each
+   other only in capitalisation, in the same scope.
+
+   All letters in identifiers for constants must be upper case.
+
+   It is illegal to use any of the following as identifiers (including
+   alternative capitalisations):
+
+   o  "false", "true"
+
+   o  "ENFORCE", "THIS", "VARIABLE"
+
+   o  "ULENGTH", "UVALUE"
+
+   o  "CLENGTH", "CVALUE"
+
+   o  "UNCOMPRESSED", "COMPRESSED", "CONTROL", "INITIAL", or "DEFAULT"
+
+   Format names cannot be referred to in the notation, although they are
+   considered to be identifiers.
+   (See Section 4.12.3.1 for more details on format names.)
+
+   All identifiers used in ROHC-FN have a "scope". The scope of an
+   identifier defines the parts of the specification where that
+   identifier applies and from which it can be referred to. If an
+   identifier has a "global" scope, then it applies throughout the
+   specification that contains it and can be referred to from anywhere
+   within it. If an identifier has a "local" scope, then it only
+   applies to the encoding method in which it is defined; it cannot be
+   referenced from outside the local scope of that encoding method. If
+   an identifier has a local scope, that identifier can therefore be
+   used in multiple different local scopes to refer to different items.
+
+   All instances of an identifier within its scope refer to the same
+   item. It is not possible to have different items referred to by a
+   single identifier within any given scope. For this reason, if there
+   is an identifier that has global scope it cannot be used separately
+   in a local scope, since a globally-scoped identifier is already
+   applicable in all local scopes.
+
+   The identifiers for each encoding method and each constant all have a
+   global scope. Each format and field also has an identifier. The
+   scope of format and field identifiers is local, with the exception of
+   global control fields, which have a global scope. Therefore it is
+   illegal for a format or field to have the same identifier as another
+   format or field within the same scope, or as an encoding method or a
+   constant (since they have global scope).
+
+   Note that although format names (see Section 4.12.3.1) are considered
+   to be identifiers, they are not referred to in the notation, but are
+   primarily for the benefit of the reader.
+
+4.3. Constant Definitions
+
+   Constant values can be defined using the "=" operator. Identifiers
+   for constants must be all upper case.
For example: + + SOME_CONSTANT = 3; + + Constants are defined by an expression (see Section 4.7) on the + right-hand side of the "=" operator. The expression must yield a + constant value. That is, the expression must be one whose terms are + + + +Finking & Pelletier Standards Track [Page 15] + +RFC 4997 ROHC-FN July 2007 + + + all either constants or literals and must not vary depending on the + header being compressed. + + Constants have a global scope. Constants must be defined at the top + level, outside any encoding method definition. Constants are + entirely equivalent to the value they refer to, and are completely + interchangeable with that value. Unlike field attributes, which may + change from packet to packet, constants have the same value for all + packets. + +4.4. Fields + + Fields are the basic building blocks of a ROHC-FN specification. + Fields are the units into which headers are divided. Each field may + have two forms: a compressed form and an uncompressed form. Both + forms are represented as bits exchanged between the compressor and + the decompressor in the same way, as an unsigned string of bits; the + most significant bit first. + + The properties of the compressed form of a field are defined by an + encoding method and/or "ENFORCE" statements. This entirely + characterises the relationship between the uncompressed and + compressed forms of that field. This is achieved by specifying the + relationships between the field's attributes. + + The notation defines four field attributes, two for the uncompressed + form and a corresponding two for the compressed form. The attributes + available for each field are: + + uncompressed attributes of a field: + + o "UVALUE" and "ULENGTH", + + compressed attributes of a field: + + o "CVALUE" and "CLENGTH". 
+ + The two value attributes contain the respective numerical values of + the field, i.e., "UVALUE" gives the numerical value of the + uncompressed form of the field, and the attribute "CVALUE" gives the + numerical value of the compressed form of the field. The numerical + values are derived by interpreting the bit-string representations of + the field as bit strings; the most significant bit first. + + The two length attributes indicate the length in bits of the + associated bit string; "ULENGTH" for the uncompressed form, and + "CLENGTH" for the compressed form. + + + + +Finking & Pelletier Standards Track [Page 16] + +RFC 4997 ROHC-FN July 2007 + + + Attributes are undefined unless they are bound to a value, in which + case they become defined. If two conflicting bindings are given for + a field attribute then the bindings fail along with the (combination + of) formats in which those bindings were defined. + + Uncompressed attributes do not always reflect an aspect of the + uncompressed header. Some fields do not originate from the + uncompressed header, but are control fields. + +4.4.1. Attribute References + + Attributes of a particular field are formally referred to by using + the field's name followed by a "." and the attribute's identifier. + + For example: + + rtp_seq_number.UVALUE + + The above gives the uncompressed value of the rtp_seq_number field. + The primary reason for referencing attributes is for use in + expressions, which are explained in Section 4.7. + +4.4.2. Representation of Field Values + + Fields are represented as bit strings. The bit string is calculated + using the value attribute ("val") and the length attribute ("len"). + The bit string is the binary representation of "val % (2 ^ len)". + + For example, if a field's "CLENGTH" attribute was 8, and its "CVALUE" + attribute was -1, the compressed representation of the field would be + "-1 % (2 ^ 8)", which equals "-1 % 256", which equals 255, 11111111 + in binary. 
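The rule "val % (2 ^ len)" can be illustrated with a short Python sketch. This is not part of the notation; the function name "field_bits" is hypothetical and chosen only for this illustration. Python's "%" operator returns a non-negative result for a positive modulus, which agrees with the ROHC-FN definition of modulo in Section 4.7.2, so negative values wrap in the same way as in the example above.

```python
def field_bits(val: int, length: int) -> str:
    """Return the bit string for a field, most significant bit first.

    Implements "val % (2 ^ len)" from the text. The name `field_bits`
    is illustrative only; it is not defined by ROHC-FN.
    """
    if length < 0:
        raise ValueError("a length attribute cannot be negative")
    if length == 0:
        return ""  # a zero-length field contributes no bits
    wrapped = val % (2 ** length)  # always within 0 .. 2^length - 1
    return format(wrapped, "0{}b".format(length))


print(field_bits(-1, 8))  # CVALUE = -1, CLENGTH = 8 -> 11111111
print(field_bits(13, 5))  # -> 01101
```

Values that do not fit in the given length are truncated to their low-order bits by the same rule; for instance, a value of 256 with a length of 8 yields eight zero bits.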
+ + ROHC-FN supports the full range of integers for use in expressions + (see Section 4.7), but the representation of the formats (i.e., the + bits exchanged between the compressor and the decompressor) is in the + above form. + +4.5. Grouping of Fields + + Since the order of fields in a "COMPRESSED" field list + (Section 4.12.1.2) does not have to be the same as the order of fields + in an "UNCOMPRESSED" field list (Section 4.12.1.1), it is possible to + group together any number of fields that are contiguous in a + "COMPRESSED" format, to allow them all to be encoded using a single + encoding method. The group of fields is specified immediately to the + left of "=:=" in place of a single field name. + + The group is notated by giving a colon-separated list of the fields + to be grouped together. For example, there may be two non-contiguous + fields in an uncompressed header that are two halves of what is + effectively a single sequence number: + + grouping_example + { + UNCOMPRESSED { + minor_seq_num; // 12 bits + other_field; // 8 bits + major_seq_num; // 4 bits + } + + COMPRESSED { + other_field =:= irregular(8); + major_seq_num + : minor_seq_num =:= lsb(3, 0); + } + } + + The group of fields is presented to the encoding method as a + contiguous group of bits, assembled by the concatenation of the + fields in the order they are given in the group. The most + significant bit of the combined field is the most significant bit of + the first field in the list, and the least significant bit of the + combined field is the least significant bit of the last field in the + list. + + Finally, the length attributes of the combined field are equal to the + sum of the corresponding length attributes for all the fields in the + group. + +4.6.
"THIS" + + Within the definition of an encoding method, it is possible to refer + to the field (i.e., the group of contiguous bits) the method is + encoding, using the keyword "THIS". + + This is useful for gaining access to the attributes of the field + being encoded. For example it is often useful to know the total + uncompressed length of the uncompressed format that is being encoded: + + THIS.ULENGTH + + + + + + + + +Finking & Pelletier Standards Track [Page 18] + +RFC 4997 ROHC-FN July 2007 + + +4.7. Expressions + + ROHC-FN includes the usual infix style of expressions, with + parentheses "(" and ")" used for grouping. Expressions can be made + up of any of the components described in the following subsections. + + The semantics of expressions are generally similar to the expressions + in the ANSI-C programming language [C90]. The definitive list of + expressions in ROHC-FN follows in the next subsections; the list + below provides some examples of the difference between expressions in + ANSI-C and expressions in ROHC-FN: + + o There is no limit on the range of integers. + + o "x ^ y" evaluates to x raised to the power of y. This has a + precedence higher than *, / and %, but lower than unary - and is + right to left associative. + + o There is no comma operator. + + o There are no "modify" operators (no assignment operators and no + increment or decrement). + + o There are no bitwise operators. + + Expressions may refer to any of the attributes of a field (as + described in Section 4.4), to any defined constant (see Section 4.3) + and also to encoding method parameters, if any are in scope (see + Section 4.12). + + If any of the attributes, constants, or parameters used in the + expression are undefined, the value of the expression is undefined. + Undefined expressions cause the environment (for example, the + compressed format) in which they are used to fail if a defined value + is required. 
Defined values are required for all compressed + attributes of fields that appear in the compressed format. Defined + values are not required for all uncompressed attributes of fields + which appear in the uncompressed format. It is up to the profile + creator to define what happens to the unbound field attributes in + this case. It should be noted that in such a case, transparency of + the compression process will be lost; i.e., it will not be possible + for the decompressor to reproduce the original header. + + Expressions cannot be used as encoding methods directly because they + do not completely characterise a field. Expressions only specify a + single value whereas a field is made up of several values: its + attributes. For example, the following is illegal: + + + + +Finking & Pelletier Standards Track [Page 19] + +RFC 4997 ROHC-FN July 2007 + + + tcp_list_length =:= (data_offset + 20) / 4; + + There is only enough information here to define a single attribute of + "tcp_list_length". Although this makes no sense formally, this could + intuitively be read as defining the "UVALUE" attribute. However, + that would still leave the length of the uncompressed field undefined + at the decompressor. Such usage is therefore prohibited. + +4.7.1. Integer Literals + + Integers can be expressed as decimal values, binary values (prefixed + by "0b"), or hexadecimal values (prefixed by "0x"). Negative + integers are prefixed by a "-" sign. For example "10", "0b1010", and + "-0x0a" are all valid integer literals, having the values 10, 10, and + -10 respectively. + +4.7.2. Integer Operators + + The following "integer" operators are available, which take integer + arguments and return an integer result: + + o ^, for exponentiation. "x ^ y" returns the value of "x" to the + power of "y". + + o *, / for multiplication and division. "x * y" returns the product + of "x" and "y". 
"x / y" returns the quotient, rounded down to the + next integer (the next one towards negative infinity). + + o +, - for addition and subtraction. "x + y" returns the sum of "x" + and "y". "x - y" returns the difference. + + o % for modulo. "x % y" returns "x" modulo "y"; x - y * (x / y). + +4.7.3. Boolean Literals + + The boolean literals are "false", and "true". + +4.7.4. Boolean Operators + + The following "boolean" operators are available, which take boolean + arguments and return a boolean result: + + o &&, for logical "and". Returns true if both arguments are true. + Returns false otherwise. + + o ||, for logical "or". Returns true if at least one argument is + true. Returns false otherwise. + + + + +Finking & Pelletier Standards Track [Page 20] + +RFC 4997 ROHC-FN July 2007 + + + o !, for logical "not". Returns true if its argument is false. + Returns false otherwise. + +4.7.5. Comparison Operators + + The following "comparison" operators are available, which take + integer arguments and return a boolean result: + + o ==, !=, for equality and its negative. "x == y" returns true if x + is equal to y. Returns false otherwise. "x != y" returns true if + x is not equal to y. Returns false otherwise. + + o <, >, for less than and greater than. "x < y" returns true if x is + less than y. Returns false otherwise. "x > y" returns true if x + is greater than y. Returns false otherwise. + + o >=, <=, for greater than or equal and less than or equal, the + inverse functions of <, >. "x >= y" returns false if x is less + than y. Returns true otherwise. "x <= y" returns false if x is + greater than y. Returns true otherwise. + +4.8. Comments + + Free English text can be inserted into a ROHC-FN specification to + explain why something has been done a particular way, to clarify the + intended meaning of the notation, or to elaborate on some point. + + The FN uses an end of line comment style, which makes use of the "//" + comment marker. 
Any text between the "//" marker and the end of the + line has no formal meaning. For example: + + //----------------------------------------------------------------- + // IR-REPLICATE header formats + //----------------------------------------------------------------- + + // The following fields are included in all of the IR-REPLICATE + // header formats: + // + UNCOMPRESSED { + discriminator; // 8 bits + tcp_seq_number; // 32 bits + tcp_flags_ecn; // 2 bits + + Comments do not affect the formal meaning of what is notated, but can + be used to improve readability. Their use is optional. + + Comments may help to provide clarifications to the reader, and serve + different purposes to implementers. Comments should thus not be + + + +Finking & Pelletier Standards Track [Page 21] + +RFC 4997 ROHC-FN July 2007 + + + considered of lesser importance when inserting them into a ROHC-FN + specification; they should be consistent with the normative part of + the specification. + +4.9. "ENFORCE" Statements + + The "ENFORCE" statement provides a way to add predicates to a format, + all of which must be fulfilled for the format to succeed. An + "ENFORCE" statement shares some similarities with an encoding method. + Specifically, whereas an encoding method binds several field + attributes at once, an "ENFORCE" statement typically binds just one + of them. In fact, all the bindings that encoding methods create can + be expressed in terms of a collection of "ENFORCE" statements. Here + is an example "ENFORCE" statement which binds the "UVALUE" attribute + of a field to 5. + + ENFORCE(field.UVALUE == 5); + + An "ENFORCE" statement must only be used inside a field list (see + Section 4.12). It attempts to force the expression given to be true + for the format that it belongs to. + + An abbreviated form of an "ENFORCE" statement is available for + binding length attributes using "[" and "]", see Section 4.10. 
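The dual role of an "ENFORCE" statement, as a binding and as a predicate, can be pictured with the following Python sketch. It is an illustration only and covers just the simple case "attribute == constant". The names "FormatFails" and "enforce_equal", and the use of a dictionary to hold attribute bindings, are assumptions made for this sketch and are not defined by ROHC-FN.

```python
from typing import Dict


class FormatFails(Exception):
    """Signals that a predicate cannot hold, so the format is unusable."""


def enforce_equal(bindings: Dict[str, int], attr: str, value: int) -> None:
    """Model ENFORCE(attr == value) for a constant right-hand side.

    An attribute missing from `bindings` is treated as undefined.
    """
    if attr not in bindings:
        bindings[attr] = value  # undefined: the statement acts as a binding
    elif bindings[attr] != value:
        # conflicting binding: the enclosing format fails
        raise FormatFails("{} is already bound to {}".format(attr, bindings[attr]))
    # otherwise the attribute already has this value and the predicate holds


bindings: Dict[str, int] = {}
enforce_equal(bindings, "field.UVALUE", 5)  # binds UVALUE to 5
enforce_equal(bindings, "field.UVALUE", 5)  # predicate holds, nothing changes
try:
    enforce_equal(bindings, "field.UVALUE", 6)
except FormatFails as exc:
    print("format fails:", exc)
```

A full implementation would have to evaluate arbitrary boolean expressions; the sketch only shows how one statement can either bind an undefined attribute or reject a format.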
+ + Like an encoding method, an "ENFORCE" statement can only be + successfully used in a format if the binding it describes is + achievable. A format containing the example "ENFORCE" statement + above would not be usable if the field had also been bound within + that same format with "uncompressed_value" encoding, which gave it a + "UVALUE" other than 5. + + An "ENFORCE" statement takes a boolean expression as a parameter. It + can be used to assert that the expression is true, in order to choose + a particular format from a list of possible formats specified in an + encoding method (see Section 4.12), or just to bind an expression as + in the example above. The general form of an "ENFORCE" statement is + therefore: + + ENFORCE(<boolean expression>); + + There are three possible conditions that the expression may be in: + + 1. The boolean expression evaluates to false, in which case the + local scope of the format that contains the "ENFORCE" statement + cannot be used. + + + + + +Finking & Pelletier Standards Track [Page 22] + +RFC 4997 ROHC-FN July 2007 + + + 2. The boolean expression evaluates to true, in which case the + binding is created and successful. + + 3. The value of the boolean expression is undefined. In this case, + the binding is also created and successful. + + In all three cases, any undefined term becomes bound by the + expression. Generally speaking, an "ENFORCE" statement is either + being used as an assignment (condition 3 above) or being used to test + if a particular format is usable, as is the case with conditions 1 + and 2. + +4.10. Formal Specification of Field Lengths + + In many of the examples each field has been followed by a comment + indicating the length of the field. Indicating the length of a field + like this is optional, but can be very helpful for the reader. + However, whilst useful to the reader, comments have no formal + meaning. 
+ + One of the most common uses for "ENFORCE" statements (see + Section 4.9) is to explicitly define the length of a field within a + header. Using "ENFORCE" statements for this purpose has formal + meaning but is not so easy to read. Therefore, an abbreviated form + is provided for this use of "ENFORCE", which is both easy to read and + has formal meaning. + + An expression defining the length of a field can be specified in + square brackets after the appearance of that field in a format. If + the field can take several alternative lengths, then the expressions + defining those lengths can be enumerated as a comma separated list + within the square brackets. For example: + + field_1 [ 4 ]; + field_2 [ a+b, 2 ]; + field_3 =:= lsb(16, 16) [ 26 ]; + + The actual length attribute, which is bound by this notation, depends + on whether it appears in a "COMPRESSED", "UNCOMPRESSED", or "CONTROL" + field list (see Section 4.12.1 and its subsections). In a + "COMPRESSED" field list, the field's "CLENGTH" attribute is bound. + In "UNCOMPRESSED" and "CONTROL" field lists, the field's "ULENGTH" + attribute is bound. Abbreviated "ENFORCE" statements are not allowed + in "DEFAULT" sections (see Section 4.12.1.5). Therefore, the above + notation would not be allowed to appear in a "DEFAULT" section. + However, if the above appeared in an "UNCOMPRESSED" or "CONTROL" + section, it would be equivalent to: + + + + +Finking & Pelletier Standards Track [Page 23] + +RFC 4997 ROHC-FN July 2007 + + + field_1; ENFORCE(field_1.ULENGTH == 4); + field_2; ENFORCE((field_2.ULENGTH == 2) + || (field_2.ULENGTH == a+b)); + field_3 =:= lsb(16, 16); ENFORCE(field_3.ULENGTH == 26); + + A special case exists for fields that have a variable length that the + notator does not wish, or is not able to, define using an expression. 
+ The keyword "VARIABLE" can be used in the following case: + + variable_length_field [ VARIABLE ]; + + Formally, this provides no restrictions on the field length, but maps + onto any positive integer or to a value of zero. It will therefore + be necessary to define the length of the field elsewhere (see the + final paragraphs of Section 4.12.1.1 and Section 4.12.1.2). This may + either be in the notation or in the English text of the profile + within which the FN is contained. Within the square brackets, the + keyword "VARIABLE" may be used as a term in an expression, just like + any other term that normally appears in an expression. For example: + + field [ 8 * (5 + VARIABLE) ]; + + This defines a field whose length is a whole number of octets and at + least 40 bits (5 octets). + +4.11. Library of Encoding Methods + + A number of common techniques for compressing header fields are + defined as part of the ROHC-FN library so that they can be reused + when creating new ROHC-FN specifications. Their notation is + described below. + + As an alternative, or a complement, to this library of encoding + methods, a ROHC-FN specification can define its own set of encoding + methods, using the formal notation (see Section 4.12) or using a + textual definition (see Section 4.13). + +4.11.1. uncompressed_value + + The "uncompressed_value" encoding method is used to encode header + fields for which the uncompressed value can be defined using a + mathematical expression (including constant values). 
This encoding + method is defined as follows: + + + + + + + + +Finking & Pelletier Standards Track [Page 24] + +RFC 4997 ROHC-FN July 2007 + + + uncompressed_value(len, val) { + UNCOMPRESSED { + field; + ENFORCE(field.ULENGTH == len); + ENFORCE(field.UVALUE == val); + } + COMPRESSED { + field; + ENFORCE(field.CLENGTH == 0); + } + } + + To exemplify the usage of "uncompressed_value" encoding, the IPv6 + header version number is a 4-bit field that always has the value 6: + + version =:= uncompressed_value(4, 6); + + Here is another example of value encoding, using an expression to + calculate the length: + + padding =:= uncompressed_value(nbits - 8, 0); + + The expression above uses an encoding method parameter, "nbits", that + in this example specifies how many significant bits there are in the + data to calculate how many pad bits to use. See Section 4.12.2 for + more information on encoding method parameters. + +4.11.2. compressed_value + + The "compressed_value" encoding method is used to define fields in + compressed formats for which there is no counterpart in the + uncompressed format (i.e., control fields). It can be used to + specify compressed fields whose value can be defined using a + mathematical expression (including constant values). This encoding + method is defined as follows: + + compressed_value(len, val) { + UNCOMPRESSED { + field; + ENFORCE(field.ULENGTH == 0); + } + COMPRESSED { + field; + ENFORCE(field.CLENGTH == len); + ENFORCE(field.CVALUE == val); + } + } + + + + +Finking & Pelletier Standards Track [Page 25] + +RFC 4997 ROHC-FN July 2007 + + + One possible use of this encoding method is to define padding in a + compressed format: + + pad_to_octet_boundary =:= compressed_value(3, 0); + + A more common use is to define a discriminator field to make it + possible to differentiate between different compressed formats within + an encoding method (see Section 4.12). 
For convenience, the notation + provides syntax for specifying "compressed_value" encoding in the + form of a binary string. The binary string to be encoded is simply + given in single quotes; the "CLENGTH" attribute of the field binds + with the number of bits in the string, while its "CVALUE" attribute + binds with the value given by the string. For example: + + discriminator =:= '01101'; + + This has exactly the same meaning as: + + discriminator =:= compressed_value(5, 13); + +4.11.3. irregular + + The "irregular" encoding method is used to encode a field in the + compressed format with a bit pattern identical to the uncompressed + field. This encoding method is defined as follows: + + irregular(len) { + UNCOMPRESSED { + field; + ENFORCE(field.ULENGTH == len); + } + COMPRESSED { + field; + ENFORCE(field.CLENGTH == len); + ENFORCE(field.CVALUE == field.UVALUE); + } + } + + For example, the checksum field of the TCP header is a 16-bit field + that does not follow any predictable pattern from one header to + another (and so it cannot be compressed): + + tcp_checksum =:= irregular(16); + + Note that the length does not have to be constant, for example, an + expression can be used to derive the length of the field from the + value of another field. + + + + +Finking & Pelletier Standards Track [Page 26] + +RFC 4997 ROHC-FN July 2007 + + +4.11.4. static + + The "static" encoding method compresses a field whose length and + value are the same as for a previous header in the flow, i.e., where + the field completely matches an existing entry in the context: + + field =:= static; + + The field's "UVALUE" and "ULENGTH" attributes bind with their + respective values in the context and the "CLENGTH" attribute is bound + to zero. + + Since the field value is the same as a previous field value, the + entire field can be reconstructed from the context, so it is + compressed to zero bits and does not appear in the compressed format. 
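A minimal Python sketch of the behaviour described for "static" follows. The context is modelled as a dictionary from field name to a (value, length) pair; this representation and the function names are assumptions made for illustration and do not come from the notation.

```python
from typing import Dict, Optional, Tuple

# Hypothetical context model: field name -> (UVALUE, ULENGTH)
Context = Dict[str, Tuple[int, int]]


def static_compress(context: Context, name: str,
                    uvalue: int, ulength: int) -> Optional[str]:
    """Return the compressed bits for a static field, or None on failure."""
    if context.get(name) != (uvalue, ulength):
        return None  # the field does not match the context: encoding fails
    return ""  # CLENGTH is zero, so no bits are sent


def static_decompress(context: Context, name: str) -> Tuple[int, int]:
    """Recover (UVALUE, ULENGTH) for a static field from the context."""
    return context[name]


ctx: Context = {"src_port": (5004, 16)}
print(repr(static_compress(ctx, "src_port", 5004, 16)))
print(static_compress(ctx, "src_port", 5006, 16))
print(static_decompress(ctx, "src_port"))
```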
+ + For example, the source port of the TCP header is a field whose value + does not change from one packet to the next for a given flow: + + src_port =:= static; + +4.11.5. lsb + + The least significant bits encoding method, "lsb", compresses a field + whose value differs by a small amount from the value stored in the + context. The least significant bits of the field value are + transmitted instead of the original field value. + + field =:= lsb(<num_lsbs_param>, <offset_param>); + + Here, "num_lsbs_param" is the number of least significant bits to + use, and "offset_param" is the interpretation interval offset as + defined below. + + The parameter "num_lsbs_param" binds with the "CLENGTH" attribute; + the "UVALUE" attribute binds to the value within the interval whose + least significant bits match the "CVALUE" attribute. The value of + the "ULENGTH" can be derived from the information stored in the + context. + + For example, the TCP sequence number: + + tcp_sequence_number =:= lsb(14, 8192); + + This takes up 14 bits, and can communicate any value from 8192 below + the value of the field stored in the context to 8191 above it. + + The interpretation interval can be described as a function of a value + stored in the context, ref_value, and of num_lsbs_param: + + f(ref_value, num_lsbs_param) = [ref_value - offset_param, + ref_value + (2^num_lsbs_param - 1) - offset_param] + + where offset_param is an integer. + + <-- interpretation interval (size is 2^num_lsbs_param) --> + |---------------------------+----------------------------| + lower ref_value upper + bound bound + + where: + + lower bound = ref_value - offset_param + upper bound = ref_value + (2^num_lsbs_param-1) - offset_param + + The "lsb" encoding method can therefore compress a field whose value + lies between the lower and the upper bounds, inclusively, of the + interpretation interval.
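The interval arithmetic above can be checked with a small Python sketch. It assumes that the compressor and the decompressor use the same single reference value, which is a simplification, and the function names are hypothetical. Decompression picks the one value inside the interpretation interval whose least significant bits equal the received bits.

```python
from typing import Optional, Tuple


def lsb_interval(ref_value: int, num_lsbs: int, offset: int) -> Tuple[int, int]:
    """Return (lower bound, upper bound) of the interpretation interval."""
    lower = ref_value - offset
    upper = ref_value + (2 ** num_lsbs - 1) - offset
    return lower, upper


def lsb_compress(value: int, ref_value: int,
                 num_lsbs: int, offset: int) -> Optional[int]:
    """Return the CVALUE for `value`, or None if it lies outside the interval."""
    lower, upper = lsb_interval(ref_value, num_lsbs, offset)
    if not lower <= value <= upper:
        return None  # the encoding fails for this format
    return value % (2 ** num_lsbs)  # keep the least significant bits only


def lsb_decompress(cvalue: int, ref_value: int,
                   num_lsbs: int, offset: int) -> int:
    """Return the unique value in the interval whose LSBs equal `cvalue`."""
    lower, _ = lsb_interval(ref_value, num_lsbs, offset)
    size = 2 ** num_lsbs
    return lower + (cvalue - lower) % size


# lsb(14, 8192) with an arbitrary reference value chosen for the example
ref = 100000
print(lsb_interval(ref, 14, 8192))
bits = lsb_compress(100005, ref, 14, 8192)
print(bits, lsb_decompress(bits, ref, 14, 8192))
```

Because the interval contains exactly 2^num_lsbs_param consecutive integers, each pattern of least significant bits occurs once within it, which is why the decompressed value is unambiguous.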
In particular, if offset_param = 0, then + the field value can only stay the same or increase relative to the + reference value ref_value. If offset_param = -1, then it can only + increase, whereas if offset_param = 2^num_lsbs_param, then it can + only decrease. + + The compressed field takes up the specified number of bits in the + compressed format (i.e., num_lsbs_param). + + The compressor may not be able to determine the exact reference value + stored in the decompressor context and that will be used by the + decompressor, since some packets that would have updated the context + may have been lost or damaged. However, from feedback received or by + making assumptions, the compressor can limit the candidate set of + values. The compressor can then select a format that uses "lsb" + encoding, defined with suitable values for its parameters + num_lsbs_param and offset_param, such that no matter which context + value in the candidate set the decompressor uses, the resulting + decompression is correct. If that is not possible, the "lsb" + encoding method fails (which typically results in a less efficient + compressed format being chosen by the compressor). How the + compressor determines what reference values it stores and maintains + in its set of candidate references is outside the scope of the + notation. + + + + + + + +Finking & Pelletier Standards Track [Page 28] + +RFC 4997 ROHC-FN July 2007 + + +4.11.6. crc + + The "crc" encoding method provides a CRC calculated over a block of + data. The algorithm used to calculate the CRC is the one specified + in [RFC4995]. 
The "crc" method takes a number of parameters: + + o the number of bits for the CRC (num_bits), + + o the bit-pattern for the polynomial (bit_pattern), + + o the initial value for the CRC register (initial_value), + + o the value of the block of data, represented using either the + "UVALUE" or "CVALUE" attribute of a field (block_data_value), and + + o the size in octets of the block of data (block_data_length). + + That is: + + field =:= crc(<num_bits>, <bit_pattern>, <initial_value>, + <block_data_value>, <block_data_length>); + + When specifying the bit pattern for the polynomial, each bit + represents the coefficient for the corresponding term in the + polynomial. Note that the highest order term is always present (by + definition) and therefore does not need specifying in the bit + pattern. Therefore, a CRC polynomial with n terms in it is + represented by a bit pattern with n-1 bits set. + + The CRC is calculated in least significant bit (LSB) order. + + For example: + + // 3 bit CRC, C(x) = x^0 + x^1 + x^3 + crc_field =:= crc(3, 0x6, 0xF, THIS.CVALUE, THIS.CLENGTH); + + Usage of the "THIS" keyword (see Section 4.6), as shown above, is + typical when using "crc" encoding. For example, when used in the + encoding method for an entire header, it causes the CRC to be + calculated over all fields in the header. + +4.12. Definition of Encoding Methods + + New encoding methods can be defined in a formal specification. These + compose groups of individual fields into a contiguous block. + + Encoding methods have names and may have parameters; they can also be + used in the same way as any other encoding method from the library of + encoding methods. Since they can contain references to other + encoding methods, complicated formats can be broken down into + manageable pieces in a hierarchical fashion.
+ + This section describes the various features used to define new + encoding methods. + +4.12.1. Structure + + The simplest form of defining an encoding method is to specify a + single encoding. For example: + + compound_encoding_method + { + UNCOMPRESSED { + field_1; // 4 bits + field_2; // 12 bits + } + + COMPRESSED { + field_2 =:= uncompressed_value(12, 9); // 0 bits + field_1 =:= irregular(4); // 4 bits + } + } + + The above begins with the new method's identifier, + "compound_encoding_method". The definition of the method then + follows inside curly brackets, "{" and "}". The first item in the + definition is the "UNCOMPRESSED" field list, which gives the order of + the fields in the uncompressed format. This is followed by the + compressed format field list ("COMPRESSED"). This list gives the + order of fields in the compressed format and also gives the encoding + method for each field. + + In the example, both formats list each field exactly once. + However, sometimes it is necessary to specify more than one binding + for a given field, which means it appears more than once in the field + list. In this case, it is the first occurrence of the field in the + list that indicates its position in the field order. The subsequent + occurrences of the field only specify binding information, not field + order information. + + The different components of this example are described in more detail + below. Other components that can be used in the definition of + encoding methods are also defined thereafter. + +4.12.1.1. Uncompressed Format - "UNCOMPRESSED" + + The uncompressed field list is defined by "UNCOMPRESSED", which + specifies the fields of the uncompressed format in the order that + they appear in the uncompressed header. The sum of the lengths of + the individual uncompressed fields in the list must be equal to the + length of the field being encoded.
Finally, the representation of + the uncompressed format described using the list of fields in the + "UNCOMPRESSED" section, for which compressed formats are being + defined, always consists of one single contiguous block of bits. + + In the example above in Section 4.12.1, the uncompressed field list + is "field_1", followed by "field_2". This means that a field being + encoded by this method is divided into two subfields, "field_1" and + "field_2". The total uncompressed length of these two fields + therefore equals the length of the field being encoded: + + field_1.ULENGTH + field_2.ULENGTH == THIS.ULENGTH + + In the example, there are only two fields, but any number of fields + may be used. This relationship applies to however many fields are + actually used. Any arrangement of fields that efficiently describes + the content of the uncompressed header may be chosen -- this need not + be the same as the one described in the specifications for the + protocol header being compressed. + + For example, there may be a protocol whose header contains a 16-bit + sequence number, but whose sessions tend to be short-lived. This + would mean that the high bits of the sequence number are almost + always constant. The "UNCOMPRESSED" format could reflect this by + splitting the original uncompressed field into two fields, one field + to represent the almost-always-zero part of the sequence number, and + a second field to represent the salient part. + + An "UNCOMPRESSED" field list may specify encoding methods in the same + way as the "COMPRESSED" field list in the example. Encoding methods + specified therein are used whenever a packet with that uncompressed + format is being encoded. The encoding of a packet with a given + uncompressed format can only succeed if all of its encoding methods + and "ENFORCE" statements succeed (see Section 4.9). + + The total length of each uncompressed format must always be defined. 
+ The length of each of the fields in an uncompressed format must also + be defined. This means that the bindings in the "UNCOMPRESSED", + "COMPRESSED" (see Section 4.12.1.2 below), "CONTROL" (see + Section 4.12.1.3 below), "INITIAL" (see Section 4.12.1.4 below), and + "DEFAULT" (see Section 4.12.1.5 below) field lists must, between + them, define the "ULENGTH" attribute of every field in an + + + +Finking & Pelletier Standards Track [Page 31] + +RFC 4997 ROHC-FN July 2007 + + + uncompressed format so that there is an unambiguous mapping from the + bits in the uncompressed format to the fields listed in the + "UNCOMPRESSED" field list. + +4.12.1.2. Compressed Format - "COMPRESSED" + + Similar to the uncompressed field list, the fields in the compressed + header will appear in the order specified by the compressed field + list given for a compressed format. Each individual field is encoded + in the manner given for that field. The total length of the + compressed data will be the sum of the compressed lengths of all the + individual fields. In the example from Section 4.12.1, the encoding + methods used for these fields indicate that they are zero and 4 bits + long, making a total of 4 bits. + + The order of the fields specified in a "COMPRESSED" field list does + not have to match the order they appear in the "UNCOMPRESSED" field + list. It may be desirable to reorder the fields in the compressed + format to align the compressed header to the octet boundary, or for + other reasons. In the above example, the order is in fact the + opposite of that in the uncompressed format. + + The compressed field list specifies that the encoding for "field_1" + is "irregular", and takes up 4 bits in both the compressed format and + uncompressed format. The encoding for "field_2" is + "uncompressed_value", which means that the field has a fixed value, + so it can be compressed to zero bits. The value it takes is 9, and + it is 12 bits wide in the uncompressed format. 
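As an illustration of what the "compound_encoding_method" example from Section 4.12.1 describes, the Python sketch below compresses and decompresses a 16-bit block. It is a hand-written approximation, not output generated from the notation; the function names and the choice to represent headers as integers are assumptions made for this sketch.

```python
from typing import Optional

FIELD_2_VALUE = 9  # fixed value given by uncompressed_value(12, 9)


def compress(header: int) -> Optional[int]:
    """Return the 4-bit compressed form of a 16-bit header, or None."""
    if not 0 <= header < 2 ** 16:
        raise ValueError("expected a 16-bit uncompressed block")
    field_1 = header >> 12    # first 4 bits of the uncompressed format
    field_2 = header & 0xFFF  # remaining 12 bits
    if field_2 != FIELD_2_VALUE:
        return None  # uncompressed_value(12, 9) cannot be satisfied
    return field_1  # field_2 takes zero bits; field_1 is sent as-is


def decompress(compressed: int) -> int:
    """Rebuild the 16-bit header from the 4-bit compressed form."""
    if not 0 <= compressed < 2 ** 4:
        raise ValueError("expected a 4-bit compressed block")
    return (compressed << 12) | FIELD_2_VALUE


print(hex(compress(0xA009)))
print(hex(decompress(0xA)))
print(compress(0xA00A))
```

The reordering of the two fields in the "COMPRESSED" list has no visible effect here because "field_2" contributes no bits to the compressed format.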
+ + Fields like "field_2", which compress to zero bits in length, may + appear anywhere in the field list without changing the compressed + format because their position in the list is not significant. In + fact, if the encoding method for this field were defined elsewhere + (for example, in the "UNCOMPRESSED" section), this field could be + omitted from the "COMPRESSED" section altogether: + + compound_encoding_method + { + UNCOMPRESSED { + field_1; // 4 bits + field_2 =:= uncompressed_value(12, 9); // 12 bits + } + + COMPRESSED { + field_1 =:= irregular(4); // 4 bits + } + } + + + + +Finking & Pelletier Standards Track [Page 32] + +RFC 4997 ROHC-FN July 2007 + + + The total length of each compressed format must always be defined. + The length of each of the fields in a compressed format must also be + defined. This means that the bindings in the "UNCOMPRESSED", + "COMPRESSED", "CONTROL" (see Section 4.12.1.3 below), "INITIAL" (see + Section 4.12.1.4 below), and "DEFAULT" (see Section 4.12.1.5 below) + field lists must between them define the "CLENGTH" attribute of every + field in a compressed format so that there is an unambiguous mapping + from the bits in the compressed format to the fields listed in the + "COMPRESSED" field list. + +4.12.1.3. Control Fields - "CONTROL" + + Control fields are defined using the "CONTROL" field list. The + control field list specifies all fields that do not appear in the + uncompressed format, but that have an uncompressed value + (specifically those with an "ULENGTH" greater than zero). Such + fields may be used to help compress fields from the uncompressed + format more efficiently. A control field could be used to improve + efficiency by representing some commonality between a number of the + uncompressed fields, or by representing some information about the + flow that is not explicitly contained in the protocol headers. 
+ + For example in IPv4, the behaviour of the IP-ID field in a flow + varies depending on how the endpoints handle IP-IDs. Sometimes the + behaviour is effectively random and sometimes the IP-ID follows a + predictable sequence. The type of IP-ID behaviour is information + that is never communicated explicitly in the uncompressed header. + + However, a profile can still be designed to identify the behaviour + and adjust the compression strategy according to the identified + behaviour, thereby improving the compression performance. To do so, + the ROHC-FN specification can introduce an explicit field to + communicate the IP-ID behaviour in compressed format -- this is done + by introducing a control field: + + ipv4 + { + UNCOMPRESSED { + version; // 4 bits + hdr_length; // 4 bits + protocol; // 8 bits + dscp; // 6 bits + ip_ecn_flags; // 2 bits + ttl_hopl; // 8 bits + df; // 1 bit + mf; // 1 bit + rf; // 1 bit + frag_offset; // 13 bits + + + +Finking & Pelletier Standards Track [Page 33] + +RFC 4997 ROHC-FN July 2007 + + + ip_id; // 16 bits + src_addr; // 32 bits + dst_addr; // 32 bits + checksum; // 16 bits + length; // 16 bits + } + + CONTROL { + ip_id_behavior; // 1 bit + : + : + + The "CONTROL" field list is equivalent to the "UNCOMPRESSED" field + list for fields that do not appear in the uncompressed format. It + defines a field that has the same properties (the same defined + attributes, etc.) as fields appearing in the uncompressed format. + + Control fields are initialised by using the appropriate encoding + methods and/or by using "ENFORCE" statements. This may be done + inside the "CONTROL" field list. 
+ + For example: + + example_encoding_method_definition + { + UNCOMPRESSED { + field_1 =:= some_encoding; + } + + CONTROL { + scaled_field; + ENFORCE(scaled_field.UVALUE == field_1.UVALUE / 8); + ENFORCE(scaled_field.ULENGTH == field_1.ULENGTH - 3); + } + + COMPRESSED { + scaled_field =:= lsb(4, 0); + } + } + + This control field is used to scale down a field in the uncompressed + format by a factor of 8 before encoding it with the "lsb" encoding + method. Scaling it down makes the "lsb" encoding more efficient. + + Control fields may also be used with a global scope. In this case, + their declaration must be outside of any encoding method definition. + They are then visible within any encoding method, thus allowing + information to be shared between encoding methods directly. + + + +Finking & Pelletier Standards Track [Page 34] + +RFC 4997 ROHC-FN July 2007 + + +4.12.1.4. Initial Values - "INITIAL" + + In order to allow fields in the very first usage of a specific format + to be compressed with "static", "lsb", or other encoding methods that + depend on the context, it is possible to specify initial bindings for + such fields. This is done using "INITIAL", for example: + + INITIAL { + field =:= uncompressed_value(4, 6); + } + + This initialises the "UVALUE" of "field" to 6 and initialises its + "ULENGTH" to 4. Unlike all other bindings specified in the formal + notation, these bindings are applied to the context of the field, if + the field's context is undefined. This is particularly useful when + using encoding methods that rely on context being present, such as + "static" or "lsb", with the first packet in a flow. + + Because the "INITIAL" field list is used to bind the context alone, + it makes no sense to specify initial bindings that themselves rely on + the context, for example, "lsb". Such usage is not allowed. + +4.12.1.5. Default Field Bindings - "DEFAULT" + + Default bindings may be specified for each field or attribute. 
The + default encoding methods specify the encoding method to use for a + field if no binding is given elsewhere for the value of that field. + This is helpful to keep the definition of the formats concise, as the + same encoding method need not be repeated for every format, when + defining multiple formats (see Section 4.12.3). + + Default bindings are optional and may be given for any combination of + fields and attributes which are in scope. + + The syntax for specifying default bindings is similar to that used to + specify a compressed or uncompressed format. However, the order of + the fields in the field list does not affect the order of the fields + in either the compressed or uncompressed format. This is because the + field order is specified individually for each "COMPRESSED" format + and "UNCOMPRESSED" format. + + Here is an example: + + DEFAULT { + field_1 =:= uncompressed_value(4, 1); + field_2 =:= uncompressed_value(4, 2); + field_3 =:= lsb(3, -1); + ENFORCE(field_4.ULENGTH == 4); + + + +Finking & Pelletier Standards Track [Page 35] + +RFC 4997 ROHC-FN July 2007 + + + } + + Here default bindings are specified for fields 1 to 3. A default + binding for the "ULENGTH" attribute of field_4 is also specified. + + Fields for which there is a default encoding method do not need their + bindings to be specified in the field list of any format that uses + the default encoding method for that field. Any format that does not + use the default encoding method must explicitly specify a binding for + the value of that field's attributes. + + If elsewhere a binding is not specified for the attributes of a + field, the default encoding method is used. If the default encoding + method always compresses the field down to zero bits, the field can + be omitted from the compressed format's field list. Like any other + zero-bit field, its position in the field list is not significant. 
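The field-level fallback described above can be sketched informally in Python. This is an illustration only: the dictionary representation and the name "effective_bindings" are assumptions of this sketch, encoding methods are kept as plain strings, and attribute-level defaults given with "ENFORCE" are deliberately not modelled.

```python
# Hypothetical model of field-level default lookup; not part of ROHC-FN.
# The entries mirror the "DEFAULT" example above.
DEFAULTS = {
    "field_1": "uncompressed_value(4, 1)",
    "field_2": "uncompressed_value(4, 2)",
    "field_3": "lsb(3, -1)",
}


def effective_bindings(format_bindings, defaults=DEFAULTS):
    """Return the binding used for each field in one format.

    A binding given explicitly in the format takes precedence; any field
    the format leaves unbound falls back to its default binding.
    """
    resolved = dict(defaults)          # start from the defaults
    resolved.update(format_bindings)   # explicit bindings override them
    return resolved
```

For instance, a format that binds only "field_3" to "irregular(3)" would keep the default bindings for "field_1" and "field_2" under this model.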
+ + The "DEFAULT" field list may contain default bindings for individual + attributes by using "ENFORCE" statements. A default binding for an + individual attribute will only be used if elsewhere there is no + binding given for that attribute or the field to which it belongs. + If elsewhere there is an "ENFORCE" statement binding that attribute, + or an encoding method binding the field to which it belongs, the + default binding for the attribute will not be used. This applies + even if the specified encoding method does not bind the particular + attribute given in the "DEFAULT" section. However, an "ENFORCE" + statement elsewhere that only binds the length of the field still + allows the default bindings to be used, except for default "ENFORCE" + statements which bind nothing but the field's length. + + To clarify, assuming the default bindings given in the example above, + the first three of the following four compressed formats would not + use the default binding for "field_4.ULENGTH": + + COMPRESSED format1 { + ENFORCE(field_4.ULENGTH == 3); // set ULENGTH to 3 + ENFORCE(field_4.UVALUE == 7); // set UVALUE to 7 + } + + COMPRESSED format2 { + field_4 =:= irregular(3); // set ULENGTH to 3 + } + + COMPRESSED format3 { + field_4 =:= '1010'; // set ULENGTH to zero + } + + + + + +Finking & Pelletier Standards Track [Page 36] + +RFC 4997 ROHC-FN July 2007 + + + COMPRESSED format4 { + ENFORCE(field_4.UVALUE == 12); // use default ULENGTH + } + + The fourth format is the only one that uses the default binding for + "field_4.ULENGTH". + + In summary, the default bindings of an encoding method are only used + for formats that do not already specify a binding for the value of + all of their fields. For the formats that do use default bindings, + only those fields and attributes whose bindings are not specified are + looked up in the "DEFAULT" field list. + +4.12.2. 
Arguments + + Encoding methods may take arguments that control the mapping between + compressed and uncompressed fields. These are specified immediately + after the method's name, in parentheses, as a comma-separated list. + + For example: + + poor_mans_lsb(variable_length) + { + UNCOMPRESSED { + constant_bits; + variable_bits; + } + + COMPRESSED { + variable_bits =:= irregular(variable_length); + constant_bits =:= static; + } + } + + As with any encoding method, all arguments take individual values, + such as an integer literal or a field attribute, rather than entire + fields. Although entire fields cannot be passed as arguments, it is + possible to pass each of their attributes instead, which is + equivalent. + + Recall that all bindings are two-way, so that rather than the + arguments acting as "inputs" to the encoding method, the result of an + encoding method may be to bind the parameters passed to it. + + + + + + + + +Finking & Pelletier Standards Track [Page 37] + +RFC 4997 ROHC-FN July 2007 + + + For example: + + set_to_double(arg1, arg2) + { + CONTROL { + ENFORCE(arg1 == 2 * arg2); + } + } + + This encoding method will attempt to bind the first argument to twice + the value of the second. In fact this "encoding" method is + pathological. Since it defines no fields, it does not do any actual + encoding at all. "CONTROL" sections are more appropriate to use for + this purpose than "UNCOMPRESSED". + +4.12.3. Multiple Formats + + Encoding methods can also define multiple formats for a given header. + This allows different compression methods to be used depending on + what is the most efficient way of compressing a particular header. + + For example, a field may have a fixed value most of the time, but the + value may occasionally change. Using a single format for the + encoding, this field would have to be encoded using "irregular" (see + Section 4.11.3), even though the value only changes rarely. 
However,
+   by defining multiple formats, we can provide two alternative
+   encodings: one for when the value remains fixed and another for when
+   the value changes.
+
+   This is the topic of the following sub-sections.
+
+4.12.3.1.  Naming Convention
+
+   When compressed formats are defined, they must be defined using the
+   reserved word "COMPRESSED".  Similarly, uncompressed formats must be
+   defined using the reserved word "UNCOMPRESSED".  After each of these
+   keywords, a name may be given for the format.  If no name is given to
+   the format, the name of the format is empty.
+
+   Format names, except for the case where the name is empty, follow the
+   syntactic rules of identifiers as described in Section 4.2.
+
+   Format names must be unique within the scope of the encoding method
+   to which they belong, except for the empty name, which may be used
+   for one "COMPRESSED" and one "UNCOMPRESSED" format.
+
+4.12.3.2.  Format Discrimination
+
+   Each of the compressed formats has its own field list.  A compressor
+   may pick any of these alternative formats to compress a header, as
+   long as the field bindings it employs can be used with the
+   uncompressed format.  For example, the compressor could not choose to
+   use a compressed format that had a "static" encoding for a field
+   whose "UVALUE" attribute differs from its corresponding value in the
+   context.
+
+   More formally, the compressor can choose any combination of an
+   uncompressed format and a compressed format for which no binding of
+   any field attribute "fails", i.e., all the encoding methods and
+   "ENFORCE" statements (see Section 4.9) that bind the compressed
+   attributes of the fields succeed.  If there are multiple successful
+   combinations, the compressor can choose any one.  Otherwise, if there
+   are no successful combinations, the encoding method "fails".
A format will + never fail due to it not defining the "UVALUE" attribute of a field. + A format only fails if it fails to define one of the compressed + attributes of one of the fields in the compressed format, or leaves + the length of the uncompressed format undefined. + + Because the compressor has a choice, it must be possible for the + decompressor to discriminate between the different compressed formats + that the compressor could have chosen. A simple approach to this + problem is for each compressed format to include a "discriminator" + that uniquely identifies that particular "COMPRESSED" format. A + discriminator is a control field; it is not derived from any of the + uncompressed field values (see Section 4.11.2). + +4.12.3.3. Example of Multiple Formats + + Putting this all together, here is a complete example of the + definition of an encoding method with multiple compressed formats: + + example_multiple_formats + { + UNCOMPRESSED { + field_1; // 4 bits + field_2; // 4 bits + field_3; // 24 bits + } + + DEFAULT { + field_1 =:= static; + field_2 =:= uncompressed_value(4, 2); + field_3 =:= lsb(4, 0); + } + + + +Finking & Pelletier Standards Track [Page 39] + +RFC 4997 ROHC-FN July 2007 + + + COMPRESSED format0 { + discriminator =:= '0'; // 1 bit + field_3; // 4 bits + } + + COMPRESSED format1 { + discriminator =:= '1'; // 1 bit + field_1 =:= irregular(4); // 4 bits + field_3 =:= irregular(24); // 24 bits + } + } + + Note the following: + + o "field_1" and "field_3" both have default encoding methods + specified for them, which are used in "format0", but are + overridden in "format1"; the default encoding method of "field_2" + however, is not overridden. + + o "field_1" and "field_2" have default encoding methods that + compress to zero bits. When these are used in "format0", the + field names do not appear in the field list. 
+ + o "field_3" has an encoding method that does not compress to zero + bits, so whilst "field_3" has no encoding specified for it in the + field list of "format0", it still needs to appear in the field + list to specify where it goes in the compressed format. + + o In the example, all the fields in the uncompressed format have + default encoding methods specified for them, but this is not a + requirement. Default encodings can be specified for only some or + even none of the fields of the uncompressed format. + + o In the example, all the default encoding methods are on fields + from the uncompressed format, but this is not a requirement. + Default encoding methods can be specified for control fields. + +4.13. Profile-Specific Encoding Methods + + The library of encoding methods defined by ROHC-FN in Section 4.11 + provides a basic and generic set of field encoding methods. When + using a ROHC-FN specification in a ROHC profile, some additional + encodings specific to the particular protocol header being compressed + may, however, be needed, such as methods that infer the value of a + field from other values. + + These methods are specific to the properties of the protocol being + compressed and will thus have to be defined within the profile + + + +Finking & Pelletier Standards Track [Page 40] + +RFC 4997 ROHC-FN July 2007 + + + specification itself. Such profile-specific encoding methods, + defined either in ROHC-FN syntax or rigorously in plain text, can be + referred to in the ROHC-FN specification of the profile's formats in + the same way as any method in the ROHC-FN library. + + Encoding methods that are not defined in the formal notation are + specified by giving their name, followed by a short description of + where they are defined, in double quotes, and a semi-colon. + + For example: + + inferred_ip_v4_header_checksum "defined in RFCxxxx Section 6.4.1"; + +5. 
Security Considerations + + This document describes a formal notation similar to ABNF [RFC4234], + and hence is not believed to raise any security issues (note that + ABNF has a completely separate purpose to the ROHC formal notation). + +6. Contributors + + Richard Price did much of the foundational work on the formal + notation. He authored the initial document describing a formal + notation on which this document is based. + + Kristofer Sandlund contributed to this work by applying new ideas to + the ROHC-TCP profile, by providing feedback, and by helping resolve + different issues during the entire development of the notation. + + Carsten Bormann provided the translation of the formal notation + syntax using ABNF in Appendix A, and also contributed with feedback + and reviews to validate the completeness and correctness of the + notation. + +7. Acknowledgements + + A number of important concepts and ideas have been borrowed from ROHC + [RFC3095]. + + Thanks to Mark West, Eilert Brinkmann, Alan Ford, and Lars-Erik + Jonsson for their contributions, reviews, and feedback that led to + significant improvements to the readability, completeness, and + overall quality of the notation. + + Thanks to Stewart Sadler, Caroline Daniels, Alan Finney, and David + Findlay for their reviews and comments. Thanks to Rob Hancock and + Stephen McCann for their early work on the formal notation. The + + + + +Finking & Pelletier Standards Track [Page 41] + +RFC 4997 ROHC-FN July 2007 + + + authors would also like to thank Christian Schmidt, Qian Zhang, + Hongbin Liao, and Max Riegel for their comments and valuable input. + + Additional thanks: this document was reviewed during working group + last-call by committed reviewers Mark West, Carsten Bormann, and Joe + Touch, as well as by Sally Floyd who provided a review at the request + of the Transport Area Directors. Thanks also to Magnus Westerlund + for his feedback in preparation for the IESG review. + +8. References + +8.1. 
Normative References + + [C90] ISO/IEC, "ISO/IEC 9899:1990 Information technology -- + Programming Language C", ISO 9899:1990, April 1990. + + [RFC2822] Resnick, P., Ed., "STANDARD FOR THE FORMAT OF ARPA + INTERNET TEXT MESSAGES", RFC 2822, April 2001. + + [RFC4234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax + Specifications: ABNF", RFC 4234, October 2005. + + [RFC4995] Jonsson, L-E., Pelletier, G., and K. Sandlund, "The RObust + Header Compression (ROHC) Framework", RFC 4995, July 2007. + +8.2. Informative References + + [RFC3095] Bormann, C., Burmeister, C., Degermark, M., Fukushima, H., + Hannu, H., Jonsson, L-E., Hakenberg, R., Koren, T., Le, + K., Liu, Z., Martensson, A., Miyazaki, A., Svanbro, K., + Wiebke, T., Yoshimura, T., and H. Zheng, "RObust Header + Compression (ROHC): Framework and four profiles: RTP, UDP, + ESP, and uncompressed", RFC 3095, July 2001. + + [RFC791] University of Southern California, "DARPA INTERNET PROGRAM + PROTOCOL SPECIFICATION", RFC 791, September 1981. + + + + + + + + + + + + + + + +Finking & Pelletier Standards Track [Page 42] + +RFC 4997 ROHC-FN July 2007 + + +Appendix A. Formal Syntax of ROHC-FN + + This section gives a definition of the syntax of ROHC-FN in ABNF + [RFC4234], using "fnspec" as the start rule. 
+ + ; overall structure + fnspec = S *(constdef S) [globctl S] 1*(methdef S) + constdef = constname S "=" S expn S ";" + globctl = CONTROL S formbody + methdef = id S [parmlist S] "{" S 1*(formatdef S) "}" + / id S [parmlist S] STRQ *STRCHAR STRQ S ";" + parmlist = "(" S id S *( "," S id S ) ")" + formatdef = formhead S formbody + formhead = UNCOMPRESSED [ 1*WS id ] + / COMPRESSED [ 1*WS id ] + / CONTROL / INITIAL / DEFAULT + formbody = "{" S *((fielddef/enforcer) S) "}" + fielddef = fieldgroup S ["=:=" S encspec S] [lenspec S] ";" + fieldgroup = fieldname *( S ":" S fieldname ) + fieldname = id + encspec = "'" *("0"/"1") "'" + / id [ S "(" S expn S *( "," S expn S ) ")"] + lenspec = "[" S expn S *("," S expn S) "]" + enforcer = ENFORCE S "(" S expn S ")" S ";" + + + ; expressions + expn = *(expnb S "||" S) expnb + expnb = *(expna S "&&" S) expna + expna = *(expn7 S ("=="/"!=") S) expn7 + expn7 = *(expn6 S ("<"/"<="/">"/">=") S) expn6 + expn6 = *(expn4 S ("+"/"-") S) expn4 + expn4 = *(expn3 S ("*"/"/"/"%") S) expn3 + expn3 = expn2 [S "^" S expn3] + expn2 = ["!" S] expn1 + expn1 = expn0 / attref / constname / litval / id + expn0 = "(" S expn S ")" / VARIABLE + attref = fieldnameref "." 
attname + fieldnameref = fieldname / THIS + attname = ( U / C ) ( LENGTH / VALUE ) + litval = ["-"] "0b" 1*("0"/"1") + / ["-"] "0x" 1*(DIGIT/"a"/"b"/"c"/"d"/"e"/"f") + / ["-"] 1*DIGIT + / false / true + + + + + + + +Finking & Pelletier Standards Track [Page 43] + +RFC 4997 ROHC-FN July 2007 + + + ; lexical categories + constname = UPCASE *(UPCASE / DIGIT / "_") + id = ALPHA *(ALPHA / DIGIT / "_") + ALPHA = %x41-5A / %x61-7A + UPCASE = %x41-5A + DIGIT = %x30-39 + COMMENT = "//" *(SP / HTAB / VCHAR) CRLF + SP = %x20 + HTAB = %x09 + VCHAR = %x21-7E + CRLF = %x0A / %x0D.0A + NL = COMMENT / CRLF + WS = SP / HTAB / NL + S = *WS + STRCHAR = SP / HTAB / %x21 / %x23-7E + STRQ = %x22 + + + ; case-sensitive literals + C = %d67 + COMPRESSED = %d67.79.77.80.82.69.83.83.69.68 + CONTROL = %d67.79.78.84.82.79.76 + DEFAULT = %d68.69.70.65.85.76.84 + ENFORCE = %d69.78.70.79.82.67.69 + INITIAL = %d73.78.73.84.73.65.76 + LENGTH = %d76.69.78.71.84.72 + THIS = %d84.72.73.83 + U = %d85 + UNCOMPRESSED = %d85.78.67.79.77.80.82.69.83.83.69.68 + VALUE = %d86.65.76.85.69 + VARIABLE = %d86.65.82.73.65.66.76.69 + false = %d102.97.108.115.101 + true = %d116.114.117.101 + + + + + + + + + + + + + + + + + + +Finking & Pelletier Standards Track [Page 44] + +RFC 4997 ROHC-FN July 2007 + + +Appendix B. Bit-level Worked Example + + This section gives a worked example at the bit level, showing how a + simple ROHC-FN specification describes the compression of real data + from an imaginary protocol header. The example used has been kept + fairly simple, whilst still aiming to illustrate some of the + intricacies that arise in use of the notation. In particular, fields + have been kept short to make it possible to read the binary + representation of the headers without too much difficulty. + +B.1. Example Packet Format + + Our imaginary header is just 16 bits long, and consists of the + following fields: + + 1. version number -- 2 bits + + 2. type -- 2 bits + + 3. flow id -- 4 bits + + 4. 
sequence number -- 4 bits + + 5. flag bits -- 4 bits + + So for example 0101000100010000 indicates a header with a version + number of one, a type of one, a flow id of one, a sequence number of + one, and all flag bits set to zero. + + Here is an ASCII box notation diagram of the imaginary header: + + 0 1 2 3 4 5 6 7 + +---+---+---+---+---+---+---+---+ + |version| type | flow_id | + +---+---+---+---+---+---+---+---+ + | sequence_no | flag_bits | + +---+---+---+---+---+---+---+---+ + + + + + + + + + + + + + + +Finking & Pelletier Standards Track [Page 45] + +RFC 4997 ROHC-FN July 2007 + + +B.2. Initial Encoding + + An initial definition based solely on the above information is as + follows: + + eg_header + { + UNCOMPRESSED { + version_no [ 2 ]; + type [ 2 ]; + flow_id [ 4 ]; + sequence_no [ 4 ]; + flag_bits [ 4 ]; + } + + COMPRESSED initial_definition { + version_no =:= irregular(2); + type =:= irregular(2); + flow_id =:= irregular(4); + sequence_no =:= irregular(4); + flag_bits =:= irregular(4); + } + } + + This defines the format nicely, but doesn't actually offer any + compression. If we use it to encode the above header, we get: + + Uncompressed header: 0101000100010000 + Compressed header: 0101000100010000 + + This is because we have stated that all fields are "irregular" -- + i.e., we haven't specified anything about their behaviour. + + Note that since we have only one compressed format and one + uncompressed format, it makes no difference whether the encoding + methods for each field are specified in the compressed or + uncompressed format. 
It would make no difference at all if we wrote
+   the following instead:
+
+     eg_header
+     {
+       UNCOMPRESSED {
+         version_no  =:= irregular(2);
+         type        =:= irregular(2);
+         flow_id     =:= irregular(4);
+         sequence_no =:= irregular(4);
+         flag_bits   =:= irregular(4);
+       }
+
+       COMPRESSED initial_definition {
+         version_no  [ 2 ];
+         type        [ 2 ];
+         flow_id     [ 4 ];
+         sequence_no [ 4 ];
+         flag_bits   [ 4 ];
+       }
+     }
+
+B.3.  Basic Compression
+
+   In order to achieve any compression we need to notate more knowledge
+   about the header and its behaviour in a flow.  For example, we may
+   know the following facts about the header:
+
+   1.  version number -- indicates which version of the protocol this
+       is: always one for this version of the protocol.
+
+   2.  type -- may take any value.
+
+   3.  flow id -- may take any value.
+
+   4.  sequence number -- may take any value.
+
+   5.  flag bits -- contains three flags, a, b, and c, each of which may
+       be set or clear, and a reserved flag bit, which is always clear
+       (i.e., zero).
+
+   We could notate this knowledge as follows:
+
+     eg_header
+     {
+       UNCOMPRESSED {
+         version_no    [ 2 ];
+         type          [ 2 ];
+         flow_id       [ 4 ];
+         sequence_no   [ 4 ];
+         abc_flag_bits [ 3 ];
+         reserved_flag [ 1 ];
+       }
+
+       COMPRESSED basic {
+         version_no    =:= uncompressed_value(2, 1) [ 0 ];
+         type          =:= irregular(2)             [ 2 ];
+         flow_id       =:= irregular(4)             [ 4 ];
+         sequence_no   =:= irregular(4)             [ 4 ];
+         abc_flag_bits =:= irregular(3)             [ 3 ];
+         reserved_flag =:= uncompressed_value(1, 0) [ 0 ];
+       }
+     }
+
+   Using this simple scheme, we have successfully encoded the fact that
+   one of the fields has a permanently fixed value of one, and therefore
+   contains no useful information.  We have also encoded the fact that
+   the final flag bit is always zero, which again contains no useful
+   information.
Both of these facts have been notated using the + "uncompressed_value" encoding method (see Section 4.11.1). + + Using this new encoding on the above header, we get: + + Uncompressed header: 0101000100010000 + Compressed header: 0100010001000 + + This reduces the amount of data we need to transmit by roughly 20%. + However, this encoding fails to take advantage of relationships + between values of a field in one packet and its value in subsequent + packets. For example, every header in the following sequence is + compressed by the same amount despite the similarities between them: + + Uncompressed header: 0101000100010000 + Compressed header: 0100010001000 + + + Uncompressed header: 0101000101000000 + Compressed header: 0100010100000 + + + Uncompressed header: 0110000101110000 + Compressed header: 1000010111000 + +B.4. Inter-Packet Compression + + The profile we have defined so far has not compressed the sequence + number or flow ID fields at all, since they can take any value. + However the value of each of these fields in one header has a very + simple relationship to their values in previous headers: + + o the sequence number is unusual -- it increases by three each time, + + o the flow_id stays the same -- it always has the same value that it + did in the previous header in the flow, + + o the abc_flag_bits stay the same most of the time -- they usually + have the same value that they did in the previous header in the + flow. 
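These relationships can be checked informally against the three example headers from Appendix B.3. The following Python sketch is an illustration only: the names are hypothetical, and treating the 4-bit sequence number as wrapping modulo 16 is an assumption of the sketch.

```python
# Hypothetical check of the inter-packet relationships listed above,
# using the three example headers from Appendix B.3.
HEADERS = [
    "0101000100010000",
    "0101000101000000",
    "0110000101110000",
]


def fields(header):
    """Extract the fields of interest from a 16-bit header string."""
    return {
        "flow_id": int(header[4:8], 2),
        "sequence_no": int(header[8:12], 2),
        "abc_flag_bits": int(header[12:15], 2),
    }


parsed = [fields(h) for h in HEADERS]

# Difference between consecutive sequence numbers; modulo 16 is assumed
# because the field is only 4 bits wide.
seq_deltas = [
    (cur["sequence_no"] - prev["sequence_no"]) % 16
    for prev, cur in zip(parsed, parsed[1:])
]

# Collect the distinct values seen for the fields expected to stay fixed.
flow_ids = {p["flow_id"] for p in parsed}
flag_values = {p["abc_flag_bits"] for p in parsed}
```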
+ + + + +Finking & Pelletier Standards Track [Page 48] + +RFC 4997 ROHC-FN July 2007 + + + An obvious way of notating this is as follows: + + // This obvious encoding will not work (correct encoding below) + eg_header + { + UNCOMPRESSED { + version_no [ 2 ]; + type [ 2 ]; + flow_id [ 4 ]; + sequence_no [ 4 ]; + abc_flag_bits [ 3 ]; + reserved_flag [ 1 ]; + } + + COMPRESSED obvious { + version_no =:= uncompressed_value(2, 1); + type =:= irregular(2); + flow_id =:= static; + sequence_no =:= lsb(0, -3); + abc_flag_bits =:= irregular(3); + reserved_flag =:= uncompressed_value(1, 0); + } + } + + The dependency on previous packets is notated using the "static" and + "lsb" encoding methods (see Section 4.11.4 and Section 4.11.5 + respectively). However there are a few problems with the above + notation. + + Firstly, and most importantly, the "flow_id" field is notated as + "static", which means that it doesn't change from packet to packet. + However, the notation does not indicate how to communicate the value + of the field initially. There is no point saying "it's the same + value as last time" if there has not been a first time where we + define what that value is, so that it can be referred back to. The + above notation provides no way of communicating that. Similarly with + the sequence number -- there needs to be a way of communicating its + initial value. In fact, except for the explicit notation indicating + their lengths, even the lengths of these two fields would be left + undefined. This problem will be solved below, in Appendix B.5. + + Secondly, the sequence number field is communicated very efficiently + in zero bits, but it is not at all robust against packet loss. If a + packet is lost then there is no way to handle the missing sequence + number. When communicating sequence numbers, or any other field + encoded with "lsb" encoding, a very important consideration for the + notator is how robust against packet loss the compressed protocol + should be. 
This will vary a lot from protocol stack to protocol + + + +Finking & Pelletier Standards Track [Page 49] + +RFC 4997 ROHC-FN July 2007 + + + stack. For the example protocol we'll assume short, low overhead + flows and say we need to be robust to the loss of just one packet, + which we can achieve with two bits of "lsb" encoding (one bit isn't + enough since the sequence number increases by three each time -- see + Section 4.11.5). This will be addressed below in Appendix B.5. + + Finally, although the flag bits are usually the same as in the + previous header in the flow, the profile doesn't make any use of this + fact; since they are sometimes not the same as those in the previous + header, it is not safe to say that they are always the same, so + "static" encoding can't be used exclusively. This problem will be + solved later through the use of multiple formats in Appendix B.6. + +B.5. Specifying Initial Values + + To communicate initial values for fields compressed with a context + dependent encoding such as "static" or "lsb" we use an "INITIAL" + field list. This can help with fields whose start value is fixed and + known. 
For example, if we knew that at the start of the flow
+   "flow_id" would always be 1 and "sequence_no" would always be 0, we
+   could notate that like this:
+
+     // This encoding will not work either (correct encoding below)
+     eg_header
+     {
+       UNCOMPRESSED {
+         version_no    [ 2 ];
+         type          [ 2 ];
+         flow_id       [ 4 ];
+         sequence_no   [ 4 ];
+         abc_flag_bits [ 3 ];
+         reserved_flag [ 1 ];
+       }
+
+       INITIAL {
+         // set initial values of fields before flow starts
+         flow_id     =:= uncompressed_value(4, 1);
+         sequence_no =:= uncompressed_value(4, 0);
+       }
+
+       COMPRESSED obvious {
+         version_no    =:= uncompressed_value(2, 1);
+         type          =:= irregular(2);
+         flow_id       =:= static;
+         sequence_no   =:= lsb(2, -3);
+         abc_flag_bits =:= irregular(3);
+         reserved_flag =:= uncompressed_value(1, 0);
+       }
+     }
+
+   However, this use of "INITIAL" is no good since the initial values of
+   both "flow_id" and "sequence_no" vary from flow to flow.  "INITIAL"
+   is only applicable where the initial value of a field is fixed, as is
+   often the case with control fields.
+
+B.6.  Multiple Packet Formats
+
+   To communicate initial values for the sequence number and flow ID
+   fields correctly, and to take advantage of the fact that the flag
+   bits are usually the same as in the previous header, we need to
+   depart from the single format encoding we are currently using and
+   instead use multiple formats.  Here, we have expressed the encodings
+   for two of the fields in the uncompressed format, since they will
+   always be true for uncompressed headers of that format.  The
+   remaining fields, whose encoding method may depend on exactly how the
+   header is being compressed, have their encodings specified in the
+   compressed formats.

     eg_header
     {
       UNCOMPRESSED {
         version_no    =:= uncompressed_value(2, 1) [ 2 ];
         type                                       [ 2 ];
         flow_id                                    [ 4 ];
         sequence_no                                [ 4 ];
         abc_flag_bits                              [ 3 ];
         reserved_flag =:= uncompressed_value(1, 0) [ 1 ];
       }

       COMPRESSED irregular_format {
         discriminator =:= '0'                      [ 1 ];
         version_no                                 [ 0 ];
         type          =:= irregular(2)             [ 2 ];
         flow_id       =:= irregular(4)             [ 4 ];
         sequence_no   =:= irregular(4)             [ 4 ];
         abc_flag_bits =:= irregular(3)             [ 3 ];
         reserved_flag                              [ 0 ];
       }

       COMPRESSED compressed_format {
         discriminator =:= '1'                      [ 1 ];
         version_no                                 [ 0 ];
         type          =:= irregular(2)             [ 2 ];
         flow_id       =:= static                   [ 0 ];
         sequence_no   =:= lsb(2, -3)               [ 2 ];
         abc_flag_bits =:= static                   [ 0 ];
         reserved_flag                              [ 0 ];
       }
     }

   Note that we have added a discriminator field, so that the
   decompressor can tell which format has been used by the compressor.
   The format with a "static" flow ID and "lsb" encoded sequence number
   is now 5 bits long. Note that despite having to add the
   discriminator field, this format is still the same size as the
   original incorrect "obvious" format because it takes advantage of the
   fact that the abc flag bits rarely change.

   However, the original "basic" format has also grown by one bit due to
   the addition of the discriminator ("irregular_format"). An important
   consideration when creating multiple formats is whether each format
   occurs frequently enough that the average compressed header length is
   shorter as a result of its usage. For example, if in fact the flag
   bits always changed between packets, the "compressed_format" encoding
   could never be used; all we would have achieved is lengthening the
   "basic" format by one bit.
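
   As a rough illustration of this trade-off, the following Python
   sketch computes the mean compressed header length for the two
   formats notated above. The 13-bit baseline (the "basic" format with
   no discriminator), the constant names, and the function names are
   assumptions made for this example only; none of them are part of
   ROHC-FN.

```python
from fractions import Fraction

# Sizes in bits, taken from the field lengths in the two formats above.
IRREGULAR_BITS = 14      # 1 + 2 + 4 + 4 + 3 (with discriminator)
COMPRESSED_BITS = 5      # 1 + 2 + 2 (with discriminator)
SINGLE_FORMAT_BITS = 13  # assumed baseline: irregular fields, no discriminator


def mean_header_bits(p_compressed):
    """Mean header size when a fraction p_compressed of headers
    can be sent using "compressed_format"."""
    p = Fraction(p_compressed)
    return p * COMPRESSED_BITS + (1 - p) * IRREGULAR_BITS


def break_even_fraction():
    """Fraction of headers that must use "compressed_format" for the
    two-format scheme to match the single 13-bit format."""
    return Fraction(IRREGULAR_BITS - SINGLE_FORMAT_BITS,
                    IRREGULAR_BITS - COMPRESSED_BITS)


if __name__ == "__main__":
    print(break_even_fraction())
    print(mean_header_bits(Fraction(9, 10)))
```

   Under these assumptions, "compressed_format" has to be usable for
   more than one header in nine before the extra discriminator bit pays
   for itself.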

   Using the above notation, we now get:

     Uncompressed header: 0101000100010000
     Compressed header:   00100010001000

     Uncompressed header: 0101000101000000
     Compressed header:   10100 ; 00100010100000

     Uncompressed header: 0110000101110000
     Compressed header:   11011 ; 01000010111000

   The first header in the stream is compressed the same way as before,
   except that it now has the extra 1-bit discriminator at the start
   (0). When a second header arrives with the same flow ID as the first
   and its sequence number three higher, it can be compressed in two
   possible ways: either by using "compressed_format" or, in the same
   way as previously, by using "irregular_format".

   Note that we show all theoretically possible encodings of a header as
   defined by the ROHC-FN specification, separated by semi-colons.
   Either of the above encodings for each header could be produced by a
   valid implementation, although a good implementation would always aim
   to pick the encoding that leads to the best compression. A good
   implementation would also take robustness into account and therefore
   probably wouldn't assume on the second packet that the decompressor
   had available the context necessary to decompress the shorter
   "compressed_format" form.

   Finally, note that the fields whose encoding methods are specified in
   the uncompressed format have zero length when compressed. This means
   their position in the compressed format is not significant. In this
   case, there is no need to notate them when defining the compressed
   formats. In the next part of the example we will see that they have
   been removed from the compressed formats altogether.

B.7.
Variable Length Discriminators

   Suppose we do some analysis on flows of our example protocol and
   discover that whilst it is usual for successive packets to have the
   same flags, on the occasions when they don't, the packet is almost
   always a "flags set" packet in which all three of the abc flags are
   set. To encode the flow more efficiently a format needs to be
   written to reflect this.

   This now gives a total of three formats, which means we need three
   discriminators to differentiate between them. The obvious solution
   here is to increase the number of bits in the discriminator from one
   to two and use discriminators 00, 01, and 10, for example. However,
   we can do slightly better than this.

   Any uniquely identifiable discriminator will suffice, so we can use
   00, 01, and 1. If the discriminator starts with 1, that's the whole
   thing. If it starts with 0, the decompressor knows it has to check
   one more bit to determine the kind of format.

   Note that care must be taken when using variable length
   discriminators. For example, it would be erroneous to use 0, 01, and
   10 as discriminators since after reading an initial 0, the
   decompressor would have no way of knowing if the next bit was a
   second bit of discriminator, or the first bit of the next field in
   the format. However, 0, 10, and 11 would be correct, as the first
   bit again indicates whether or not there are further discriminator
   bits to follow.
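
   The condition described above amounts to requiring that no
   discriminator is a prefix of another. A minimal Python sketch of
   such a check follows; the function names are hypothetical and are
   not defined by ROHC-FN.

```python
from itertools import permutations


def is_prefix_free(discriminators):
    """Return True if no discriminator is a prefix of a different one,
    so a decompressor can always tell where the discriminator ends."""
    return not any(b.startswith(a)
                   for a, b in permutations(discriminators, 2))


def read_discriminator(bits, discriminators):
    """Return the discriminator found at the start of the bit string
    `bits`, or None if none matches.  Only unambiguous when the set
    is prefix free."""
    for d in discriminators:
        if bits.startswith(d):
            return d
    return None


if __name__ == "__main__":
    for candidate in (["00", "01", "1"], ["0", "01", "10"],
                      ["0", "10", "11"]):
        print(candidate, is_prefix_free(candidate))
```

   The sets 00, 01, 1 and 0, 10, 11 pass this check, whereas 0, 01, 10
   fails because 0 is a prefix of 01.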

   This gives us the following:

     eg_header
     {
       UNCOMPRESSED {
         version_no    =:= uncompressed_value(2, 1) [ 2 ];
         type                                       [ 2 ];
         flow_id                                    [ 4 ];
         sequence_no                                [ 4 ];
         abc_flag_bits                              [ 3 ];
         reserved_flag =:= uncompressed_value(1, 0) [ 1 ];
       }

       COMPRESSED irregular_format {
         discriminator =:= '00'                     [ 2 ];
         type          =:= irregular(2)             [ 2 ];
         flow_id       =:= irregular(4)             [ 4 ];
         sequence_no   =:= irregular(4)             [ 4 ];
         abc_flag_bits =:= irregular(3)             [ 3 ];
       }

       COMPRESSED flags_set {
         discriminator =:= '01'                     [ 2 ];
         type          =:= irregular(2)             [ 2 ];
         flow_id       =:= static                   [ 0 ];
         sequence_no   =:= lsb(2, -3)               [ 2 ];
         abc_flag_bits =:= uncompressed_value(3, 7) [ 0 ];
       }

       COMPRESSED flags_static {
         discriminator =:= '1'                      [ 1 ];
         type          =:= irregular(2)             [ 2 ];
         flow_id       =:= static                   [ 0 ];
         sequence_no   =:= lsb(2, -3)               [ 2 ];
         abc_flag_bits =:= static                   [ 0 ];
       }
     }

   Here is some example output:

     Uncompressed header: 0101000100010000
     Compressed header:   000100010001000

     Uncompressed header: 0101000101000000
     Compressed header:   10100 ; 000100010100000

     Uncompressed header: 0110000101110000
     Compressed header:   11011 ; 001000010111000

     Uncompressed header: 0111000110101110
     Compressed header:   011110 ; 001100011010111

   Here we have a very similar sequence to last time, except that there
   is now an extra message on the end that has the flag bits set. The
   encoding for the first message in the stream is now one bit larger,
   but the encoding for the next two messages is the same as before:
   thanks to the use of variable length discriminators, that format has
   not grown. Finally, the packet that comes through with all the
   flag bits set can be encoded in just six bits, only one bit more than
   the most common format.
   Without the extra format, this last packet
   would have to be encoded using the longest format and would have
   taken up 14 bits.

B.8. Default Encoding

   Some of the common encoding methods used so far have been "factored
   out" into the definition of the uncompressed format, meaning that
   they don't need to be defined for every compressed format. However,
   there is still some redundancy in the notation. For a number of
   fields, the same encoding method is used several times in different
   formats (though not necessarily in all of them), but the field
   encoding is redefined explicitly each time. If the encoding for any
   of these fields changed in the future, then every format that uses
   that encoding would have to be modified to reflect this change.

   This problem can be avoided by specifying default encoding methods
   for these fields. Doing so can also lead to a more concisely notated
   profile:

     eg_header
     {
       UNCOMPRESSED {
         version_no    =:= uncompressed_value(2, 1) [ 2 ];
         type                                       [ 2 ];
         flow_id                                    [ 4 ];
         sequence_no                                [ 4 ];
         abc_flag_bits                              [ 3 ];
         reserved_flag =:= uncompressed_value(1, 0) [ 1 ];
       }

       DEFAULT {
         type        =:= irregular(2);
         flow_id     =:= static;
         sequence_no =:= lsb(2, -3);
       }

       COMPRESSED irregular_format {
         discriminator =:= '00'         [ 2 ];
         type                           [ 2 ]; // Uses default
         flow_id       =:= irregular(4) [ 4 ]; // Overrides default
         sequence_no   =:= irregular(4) [ 4 ]; // Overrides default
         abc_flag_bits =:= irregular(3) [ 3 ];
       }

       COMPRESSED flags_set {
         discriminator =:= '01'         [ 2 ];
         type                           [ 2 ]; // Uses default
         sequence_no                    [ 2 ]; // Uses default
         abc_flag_bits =:= uncompressed_value(3, 7);
       }

       COMPRESSED flags_static {
         discriminator =:= '1'          [ 1 ];
         type                           [ 2 ]; // Uses default
         sequence_no                    [ 2 ]; // Uses default
         abc_flag_bits =:= static;
       }
     }

   The above profile behaves in exactly the same way as the one notated
   previously, since it has the same meaning. Note that the purpose
   behind the different formats becomes clearer with the default
   encoding methods factored out: all that remains are the encodings
   that are specific to each format. Note also that default encoding
   methods that compress down to zero bits have become completely
   implicit. For example, the compressed formats using the default
   encoding for "flow_id" don't mention it (the default is "static"
   encoding that compresses to zero bits).

B.9. Control Fields

   One inefficiency in the compression scheme we have produced thus far
   is that it uses two bits to provide the "lsb" encoded sequence number
   with robustness for the loss of just one packet. In theory, only one
   bit should be needed. The root of the problem is the unusual
   sequence number that the protocol uses -- it counts up in increments
   of three. In order to encode it at maximum efficiency we need to
   translate this into a field that increments by one each time. We do
   this using a control field.

   A control field is extra data that is communicated in the compressed
   format, but which is not a direct encoding of part of the
   uncompressed header. Control fields can be used to communicate extra
   information in the compressed format that allows other fields to be
   compressed more efficiently.

   The control field that we introduce scales the sequence number down
   by a factor of three.
   Instead of encoding the original sequence
   number in the compressed packet, we encode the scaled sequence
   number, allowing us to have robustness to the loss of one packet by
   using just one bit of "lsb" encoding:

     eg_header
     {
       UNCOMPRESSED {
         version_no    =:= uncompressed_value(2, 1) [ 2 ];
         type                                       [ 2 ];
         flow_id                                    [ 4 ];
         sequence_no                                [ 4 ];
         abc_flag_bits                              [ 3 ];
         reserved_flag =:= uncompressed_value(1, 0) [ 1 ];
       }

       CONTROL {
         // need modulo maths to calculate scaling correctly,
         // due to 4 bit wrap around
         scaled_seq_no [ 4 ];
         ENFORCE(sequence_no.UVALUE
                   == (scaled_seq_no.UVALUE * 3) % 16);
       }

       DEFAULT {
         type          =:= irregular(2);
         flow_id       =:= static;
         scaled_seq_no =:= lsb(1, -1);
       }

       COMPRESSED irregular_format {
         discriminator =:= '00'         [ 2 ];
         type                           [ 2 ];
         flow_id       =:= irregular(4) [ 4 ];
         scaled_seq_no =:= irregular(4) [ 4 ]; // Overrides default
         abc_flag_bits =:= irregular(3) [ 3 ];
       }

       COMPRESSED flags_set {
         discriminator =:= '01'         [ 2 ];
         type                           [ 2 ];
         scaled_seq_no                  [ 1 ]; // Uses default
         abc_flag_bits =:= uncompressed_value(3, 7);
       }

       COMPRESSED flags_static {
         discriminator =:= '1'          [ 1 ];
         type                           [ 2 ];
         scaled_seq_no                  [ 1 ]; // Uses default
         abc_flag_bits =:= static;
       }
     }

   Normally, the encoding method(s) used to encode a field specifies the
   length of the field. In the above notation, since there is no
   encoding method using "sequence_no" directly, its length needs to be
   defined explicitly using an "ENFORCE" statement. This is done using
   the abbreviated syntax, both for consistency and also for ease of
   readability. Note that this is unusual: whereas the majority of
   field length indications are redundant (and thus optional), this one
   isn't. If it were removed from the above notation, the length of the
   "sequence_no" field would be undefined.
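
   To see how the "ENFORCE" constraint in the "CONTROL" list pins down
   "scaled_seq_no", the following Python sketch searches the 4-bit
   value space for values that satisfy it and then assembles two of the
   compressed formats as bit strings. The helper names are
   hypothetical, and the sketch only models the bits that are sent; it
   does not model the context that "lsb" decompression relies on.

```python
SEQ_BITS = 4
MODULUS = 1 << SEQ_BITS  # the 4-bit sequence number wraps at 16


def scaled_values(sequence_no):
    """All 4-bit scaled_seq_no values that satisfy
    sequence_no == (scaled_seq_no * 3) % 16."""
    return [s for s in range(MODULUS)
            if (s * 3) % MODULUS == sequence_no]


def bits(value, width):
    """Render value as a fixed-width binary string."""
    return format(value, "0{}b".format(width))


def irregular_format(type_, flow_id, sequence_no, abc_flag_bits):
    """Bit string for "irregular_format": every field sent in full."""
    (scaled,) = scaled_values(sequence_no)  # exactly one solution
    return ("00" + bits(type_, 2) + bits(flow_id, 4)
            + bits(scaled, 4) + bits(abc_flag_bits, 3))


def flags_static(type_, sequence_no):
    """Bit string for "flags_static": type plus one lsb bit of
    scaled_seq_no."""
    (scaled,) = scaled_values(sequence_no)
    return "1" + bits(type_, 2) + bits(scaled & 1, 1)


if __name__ == "__main__":
    for seq in (1, 4, 7, 10):
        print(seq, scaled_values(seq))
```

   Because 3 and 16 share no common factor, each sequence number has
   exactly one scaled value; for sequence numbers 1, 4, 7, and 10 these
   are 11, 12, 13, and 14, which increase by one per packet.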

   Here is some example output:

     Uncompressed header: 0101000100010000
     Compressed header:   000100011011000

     Uncompressed header: 0101000101000000
     Compressed header:   1010 ; 000100011100000

     Uncompressed header: 0110000101110000
     Compressed header:   1101 ; 001000011101000

     Uncompressed header: 0111000110101110
     Compressed header:   01110 ; 001100011110111

   In this form, we see that this gives us a saving of a further bit in
   most packets. Assuming the bulk of a flow is made up of
   "flags_static" headers, the mean size of the headers in a compressed
   flow is now just over a quarter of their size in an uncompressed
   flow.

B.10. Use of "ENFORCE" Statements as Conditionals

   Earlier, we created a new format "flags_set" to handle packets with
   all three of the flag bits set. As it happens, these three flags are
   always all set for "type 3" packets, and are never all set for other
   packet types (a "type 3" packet is one where the type field is set to
   three).

   This allows extra efficiency in encoding such packets. We know the
   type is three, so we don't need to encode the type field in the
   compressed header. The type field was previously encoded as
   "irregular(2)", which is two bits long. Removing this reduces the
   size of the "flags_set" format from five bits to three, making it the
   smallest format in the encoding method definition.

   In order to notate that the "flags_set" format should only be used
   for "type 3" headers, and the "flags_static" format only when the
   type isn't three, it is necessary to state these conditions inside
   each format.
   This can be done with an "ENFORCE" statement:

     eg_header
     {
       UNCOMPRESSED {
         version_no    =:= uncompressed_value(2, 1) [ 2 ];
         type                                       [ 2 ];
         flow_id                                    [ 4 ];
         sequence_no                                [ 4 ];
         abc_flag_bits                              [ 3 ];
         reserved_flag =:= uncompressed_value(1, 0) [ 1 ];
       }

       CONTROL {
         // need modulo maths to calculate scaling correctly,
         // due to 4 bit wrap around
         scaled_seq_no [ 4 ];
         ENFORCE(sequence_no.UVALUE
                   == (scaled_seq_no.UVALUE * 3) % 16);
       }

       DEFAULT {
         type          =:= irregular(2);
         scaled_seq_no =:= lsb(1, -1);
         flow_id       =:= static;
       }

       COMPRESSED irregular_format {
         discriminator =:= '00'                     [ 2 ];
         type                                       [ 2 ];
         flow_id       =:= irregular(4)             [ 4 ];
         scaled_seq_no =:= irregular(4)             [ 4 ];
         abc_flag_bits =:= irregular(3)             [ 3 ];
       }

       COMPRESSED flags_set {
         ENFORCE(type.UVALUE == 3); // redundant condition
         discriminator =:= '01'                     [ 2 ];
         type          =:= uncompressed_value(2, 3) [ 0 ];
         scaled_seq_no                              [ 1 ];
         abc_flag_bits =:= uncompressed_value(3, 7) [ 0 ];
       }

       COMPRESSED flags_static {
         ENFORCE(type.UVALUE != 3);
         discriminator =:= '1'                      [ 1 ];
         type                                       [ 2 ];
         scaled_seq_no                              [ 1 ];
         abc_flag_bits =:= static                   [ 0 ];
       }
     }

   The two "ENFORCE" statements in the last two formats act as "guards".
   Guards prevent formats from being used under the wrong circumstances.
   In fact, the "ENFORCE" statement in "flags_set" is redundant. The
   condition it guards for is already enforced by the new encoding
   method used for the "type" field. The encoding method
   "uncompressed_value(2,3)" binds the "UVALUE" attribute to three.
   This is exactly what the "ENFORCE" statement does, so it can be
   removed without any change in meaning. The "uncompressed_value"
   encoding method on the other hand is not redundant. It specifies
   other bindings on the type field in addition to the one that the
   "ENFORCE" statement specifies.
   Therefore it would not be possible to
   remove the encoding method and leave just the "ENFORCE" statement.

   Note that a guard is solely preventative. A guard can never force a
   format to be chosen by the compressor. A format can only be
   guaranteed to be chosen in a given situation if there are no other
   formats that can be used instead. This is demonstrated in the
   example output below. The compressor can still choose
   "irregular_format" if it wishes:

     Uncompressed header: 0101000100010000
     Compressed header:   000100011011000

     Uncompressed header: 0101000101000000
     Compressed header:   1010 ; 000100011100000

     Uncompressed header: 0110000101110000
     Compressed header:   1101 ; 001000011101000

     Uncompressed header: 0111000110101110
     Compressed header:   010 ; 001100011110111

   This saves just two extra bits (a 7% saving) in the example flow.

Authors' Addresses

   Robert Finking
   Siemens/Roke Manor Research
   Old Salisbury Lane
   Romsey, Hampshire SO51 0ZN
   UK

   Phone: +44 (0)1794 833189
   EMail: robert.finking@roke.co.uk
   URI:   http://www.roke.co.uk


   Ghyslain Pelletier
   Ericsson
   Box 920
   Lulea SE-971 28
   Sweden

   Phone: +46 (0) 8 404 29 43
   EMail: ghyslain.pelletier@ericsson.com

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.