Internet-Draft                                  University of Tennessee
Expires: 21 May 2002                                   21 November 2001

          The Binary Low-Overhead Block Presentation Protocol

                    draft-moore-rescap-blob-01.txt

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

This document is being submitted as a contribution to the IETF rescap working group. Comments regarding this internet-draft should be sent to the rescap mailing list at rescap@cs.utk.edu, or to the author at the address listed below. Requests to subscribe to the rescap mailing list should be sent to rescap-REQUEST@cs.utk.edu. Please include the document identifier draft-moore-rescap-blob-01.txt in any comments.

Known errata of this specification, as well as sample code, will be made available at http://www.cs.utk.edu/~moore/blob/

This Internet-Draft will expire on 21 May 2002.

ABSTRACT

This memo describes the Binary Low-Overhead Block (BLOB) protocol for on-the-wire presentation of data in the context of higher-level protocols.
BLOB is designed to encode and decode data with low overhead on most CPUs, to be reasonably space-efficient, and for its representation to be sufficiently precise that it is suitable as a canonical format for digital signatures.

Moore                     Expires 21 May 2002                  [Page 1]

BLOB Protocol              Internet-Draft              21 November 2001

1. Introduction

When designing application-layer protocols there is sometimes a need to have an efficient means of encoding protocol elements or protocol data units. Existing solutions in this space may be deemed inadequate, for various reasons. For example:

- ASN.1 [2] and BER [3] are baroque both in terms of the abstract syntax and available on-the-wire representations, and complex to implement.

- ONC XDR [4] requires a stub generator and support libraries which are not easily available on all platforms, and there are subtle differences between the APIs provided by different implementations. XDR is large enough that it's not usually feasible to write your own implementation, and it's difficult to write portable code that can work with the various implementations that are deployed. Many XDR implementations have significant unnecessary processing overhead. This impairs performance of applications based on XDR and gives the protocol itself a worse reputation than it otherwise deserves.

- The design of MIME [5] was heavily influenced by the need to be able to operate over existing text-based mail systems which imposed a number of constraints. This worked out well for email, but for other applications, MIME is neither efficient in terms of storage density nor easy to parse.

- XML [6] is easier to parse than MIME, but still requires significant processing overhead. There is also a large and growing body of "culture" regarding how XML should be used, which paradoxically imposes a significant barrier to use of XML. (To be fair, MIME also has a fair amount of "culture" associated with it.)
  Finally, for small and regular data structures XML imposes a lot of overhead.

BLOB was designed to serve as an alternative to these presentation layers for use in representing relatively simple structures, consisting of a limited set of primitive data types, and where the structures can reasonably be contained within a single protocol data unit.

BLOB is designed with the following considerations:

- It should be easy and efficient to generate the encoded form.

- The encoded form should require minimal processing to decode, ideally being usable in-place (without allocating memory or copying) on most platforms.

- It should be easy to write programs which manipulate and exchange BLOBs, without needing significant external support in the form of libraries or stub generators.

- The structure should be easy and efficient to verify for internal consistency.

- For any structure to be represented there should be a unique (canonical) on-the-wire encoding which is always used.

- It should be reasonably space-efficient. However, this is secondary to minimizing processing overhead.

The BLOB approach is more feasible now than in years past because data representations have become more uniform across different computing platforms. Essentially all widely-used computers now support 32-bit integers, can address 32-bit integers which are not aligned on any larger boundary, use word sizes which are a multiple of 8 bits, and can directly address strings of 8-bit characters which are not aligned on any boundary larger than an octet. Such computers are termed "well-behaved" with respect to BLOB. BLOB is designed to be usable on machines which do not have these characteristics, but such machines will necessarily incur more data conversion overhead.

1.1. Notation

The word BLOB in upper case letters is used to refer to the protocol; that is, the algorithm used to define the encoding and decoding of data structures defined in this memo. The word "blob" in lower case letters refers to a data structure (sequence of octets) that has been produced by, or can be decoded by, the BLOB protocol.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document, when spelled entirely in upper case letters, are to be interpreted as described in [1].

2. BLOB Overview

A "blob" is a linear (octet-stream) encoding of some data structure, which is used as a protocol data unit within some application. The structure encoded by a blob is a collection of "components". Each of the components of a blob is either a "scalar" (meaning that the component consists of exactly one instance of that data type) or an "array" (meaning that the component consists of a sequence of zero or more "elements" of a uniform data type).

The data types which can appear as components of a blob are: unsigned integer (32 bits in length), string (a variable-length sequence of octets with arbitrary values), and blob. Any of these types can occur as a scalar or in an array.

Since one blob can contain other blobs, complex nesting of structures is possible. However the blob encoder and decoder treat "embedded" blobs (blobs which occur as components of an outer blob) as opaque structures. For example, embedded blobs are not automatically decoded along with outer blobs, and a formatting error in an embedded blob does not create a formatting error for any blob that contains it.

"Variable-length" here means that the lengths of arrays need not be pre-determined by the protocol using BLOB. The maximum lengths of strings and arrays are constrained by the use of a 32-bit unsigned integer for the length of the blob, and the representation of offsets of data relative to the start of the blob as 32-bit unsigned integers. Lengths may be further constrained by the higher-level protocol's choice of transmission medium - for instance, if the blob must fit into a UDP datagram. The number of arrays is limited to 255 for each data type, but this should be adequate for most data structures needed in network protocols.

2.1 Use of Data Types Not Supported by BLOB

The primitive types (unsigned 32-bit integer and octet string) were chosen because they represent the majority of data types used in network protocols, they are directly supported by most computer hardware, and because data types outside of this set are often specific to the higher-level protocol anyway. Having a small set of data types allows BLOB to be a compact yet self-describing encoding, which is efficient to decode and which does not require separate marshaling routines for each protocol data unit used by an application.

A few additional types (in particular, single- and double-precision floating point) are being considered for future versions of BLOB. The BLOB protocol is intended to allow new primitive types to be added without changing the format of blobs that do not include these types.

When a higher-level protocol needs to use a data type that is not directly supported by BLOB, such data must be represented in terms of the available types. The higher-level protocol specification must define the representation of such data in terms of types supported by BLOB, and the conversion between the blob representation and the native format must be explicitly managed by the applications. For instance:

- A signed 32-bit integer may be transmitted as an unsigned 32-bit integer by encoding the signed integer in twos-complement format. On most modern machines no conversion will be necessary; however on machines for which the smallest integer representation is larger than 32 bits it will be necessary for the application to sign-extend the result.

- A 64-bit integer may be transmitted as two consecutive 32-bit integers (with the most significant word first), which would require that the receiving application arrange those two integers according to its native byte ordering. Alternatively a 64-bit integer may be transmitted as eight consecutive octets within a string (most significant byte first), which would require that the receiving application re-arrange those octets according to its local byte ordering.

- A multi-dimensional array may be represented as a single-dimensional array with the dimensions of the array passed as separate integer components.

- In the current version of BLOB, floating point numbers may be encoded in IEEE format and transmitted as either integers (modulo sign-extension issues) or strings (modulo alignment issues). Future versions of BLOB may support floating point numbers directly.

- A small dense set may be represented as bits within a scalar integer. A larger dense set may be encoded using individual bits of the elements of an integer array.

3. BLOB Organization

At the most basic level, the blob consists of an integer portion followed by an opaque portion. The integer portion is a sequence of unsigned 32-bit (4-octet) quantities, represented on-the-wire in network byte ("big-endian") order. The opaque portion is a sequence of 8-bit (1-octet) quantities. The blob is separated into opaque and integer portions in order to facilitate efficient decoding on little-endian machines, or on any machine with a word size other than 32 bits.
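
As an illustration of the integer portion described above, the following Python sketch converts a run of on-the-wire octets into local integers in one pass. The function name decode_integer_portion and its argument are hypothetical and are not defined by this memo; the sketch assumes the caller has already isolated the integer portion of the blob.

```python
import struct


def decode_integer_portion(octets: bytes) -> list:
    """Convert big-endian unsigned 32-bit quantities to local integers.

    `octets` is assumed to hold only the integer portion of a blob
    (a hypothetical input for this sketch), so its length must be a
    multiple of 4.
    """
    if len(octets) % 4 != 0:
        raise ValueError("integer portion must be a multiple of 4 octets")
    count = len(octets) // 4
    # ">" selects network (big-endian) order; "I" is an unsigned 32-bit int.
    return list(struct.unpack(f">{count}I", octets))
```

Because every integer is converted in a single call, the opaque octets that follow the integer portion are never touched.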
Having all of the integers within a blob co-located in a contiguous area allows an implementation to efficiently convert all of the integers to local format at the same time. Strings of octets are assumed to have the same representation on all platforms, so conversion is unlikely to be needed for the opaque portion.

The integer portion of a blob is further divided into a header, a list of array bases, and an integer pool. The header is used to store various data needed to decode the blob and check it for consistency.
The array bases portion contains the offsets (positions relative to the start of the blob) of each of the arrays in the blob (including the arrays used to store scalar components). The integer pool is used for storing integer data as well as the offsets of embedded blobs and strings.

The opaque portion is divided into a blob pool and a string pool. The blob pool is used to store embedded blobs; the string pool is used to store strings. The blob pool occurs immediately following the integer pool in order to ensure that embedded blobs are always aligned on a four-octet boundary (relative to the start of the blob).

Each embedded blob is padded with 0-3 zero octets until its length is an exact multiple of 4 octets. This ensures that all embedded blobs are aligned to 4-octet boundaries, allowing the blob decoder to assume (if the outer blob is on an aligned boundary) that each of the embedded blobs is also aligned.
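
The padding rule for embedded blobs can be sketched in a few lines of Python. The helper name pad_embedded_blob is hypothetical and is used here only for illustration; it is not part of the BLOB specification.

```python
def pad_embedded_blob(blob: bytes) -> bytes:
    """Append 0-3 zero octets so the result is a multiple of 4 octets long."""
    # (-n) % 4 is the distance from n up to the next multiple of 4.
    padding = (-len(blob)) % 4
    return blob + b"\x00" * padding
```

For example, a 21-octet embedded blob would occupy 24 octets of the blob pool, while a 24-octet blob would be stored unchanged.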

Each string is padded with a single octet with a value of zero, which is not part of the string. This is for convenience when strings are used to store character data, with programming languages that use a zero-valued octet as a string terminator.

Embedded blobs are opaque to their enclosing blob and are NOT automatically parsed or decoded when the outer blob is decoded. If the receiving application wishes to examine the contents of an inner blob, it must decode it separately from the enclosing blob.

A blob can have both scalar and array components. For simplicity in decoding and to eliminate some edge cases, all of the scalar integers of a blob are stored in a "scalar integer array" which immediately follows the last integer array component of the blob. Similarly, all of the scalar (embedded) blob parameters are stored in a "scalar blob array" which immediately follows the last blob array component, and all of the scalar string parameters are stored in a "scalar string array" which follows the last string array component.

3.1 Representation of data types

In general, all components of a blob are elements of an array. A distinguished array of each type is used to store scalar components of that type. The base of any array (whether it is a numbered array component or an array used to hold scalar components) can be determined by decoding the array_counts_and_flags field of the blob header.

Since strings (and blobs) can be of varying length, an array of strings (or blobs) is represented internally by an array of integers. Each of these integers indicates the storage location (within the blob) of the contents of the string or blob. These integers are consecutive; the offset of element 2 of an array immediately follows the offset of element 1. Similarly, the array elements occupy consecutive storage - the storage occupied by string 3 of an array immediately follows that occupied by string 2.

This allows the size of array N to be computed by subtracting its offset from that of the following array; this works for any numbered array. It also allows the length of element M to be computed by subtracting its offset from that of the following element; this works for elements (within bounds) of numbered arrays. The last scalar blob or string is a boundary case; these require an explicit test to correctly determine their length.
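
The subtraction rule described above can be illustrated with a short Python sketch. The function names and the list-based inputs are hypothetical. The sketch assumes that the bases and offsets have already been converted to local integers, that each array entry occupies 4 octets, and that a following entry exists (it does not handle the boundary case of the last scalar blob or string).

```python
def array_element_count(array_bases, n):
    """Number of 32-bit entries in numbered array n.

    array_bases is assumed to be the decoded list of array bases
    (octet offsets from the start of the blob). Entry n + 1 must
    exist, i.e. n must refer to a numbered array.
    """
    # The difference is in octets; each entry is assumed to be 4 octets.
    return (array_bases[n + 1] - array_bases[n]) // 4


def element_length(offsets, m):
    """Octets between the start of element m and the following element.

    For strings this span includes the trailing zero octet, so the
    string itself is one octet shorter.
    """
    return offsets[m + 1] - offsets[m]
```

With bases [32, 40, 40, 52], array 0 has two entries, array 1 is empty, and array 2 has three entries.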

The individual components of a blob are encoded as follows:

3.1.1 Integers and integer arrays

An unsigned integer is represented as a 32-bit quantity in big-endian format. All integer components appear in the integer_pool section of a blob.

An integer array is represented as zero or more contiguous 32-bit integers that are stored within the integer_pool section of the blob. The location (or "base") of the array relative to the start of the blob is stored as a 32-bit integer offset. The base of this array is stored in the array_bases portion of the blob.

Scalar integer components of a blob are encoded in a scalar integer array. The storage for the elements of this array is in the integer pool, and immediately follows the storage used by the last numbered integer array. The offset of the scalar integer array appears in the array_bases portion of the blob.

3.1.2 (Embedded) blobs and blob arrays

An embedded blob component is represented as a series of octets which is an integral multiple of four octets long. The storage for embedded blobs is taken from the blob pool of the enclosing blob. An integer offset (relative to the beginning of the blob) indicates the starting location of the embedded blob.
For scalar embedded blob components these offsets are encoded in a scalar blob array. This array (of blob offsets) is stored in the integer pool and immediately follows the offsets of the numbered blob arrays.

A blob array is represented as an integer base (stored in array_bases) which points to an array of integers (stored in the integer pool), each element of which is the offset of a blob (within the blob pool).

Each embedded blob (within the blob pool) is followed by 0-3 octets with the value zero, so that any subsequent blob will be aligned on a four-octet boundary. These padding octets are not considered part of the blob; however, the length of the inner blob (as seen from the enclosing blob) will include any padding.

3.1.3 Strings and string arrays

A string is represented as a sequence of octets; these octets may have arbitrary values. The contents of strings are stored in the string_pool. An integer offset (stored in integer_pool) indicates the location of the contents of the string.

A string array is represented as an integer base (stored in array_bases) which points to an array of integers (stored in the integer pool), each element of which indicates the offset of a string (stored in the string pool). Each string is followed in the string_pool by a zero octet which is not part of the string. Thus the length of any string (other than the last scalar string component) can be calculated by subtracting its offset from the offset of the subsequent string, minus 1. Strings can be of zero length, in which case the corresponding offset points to a zero octet which is immediately followed by the next string in the string_pool.

3.2 Structure of a blob

The structure of a blob is as follows:

        octet offset                                     name

                  0 +--------------------------------+ \
                    |          blob_length           | |
                  4 +--------------------------------+ |
                    |      integer_pool_offset       | |
                  8 +--------------------------------+ |
                    |        blob_pool_offset        | |
                 12 +--------------------------------+ |
                    |       string_pool_offset       | |
                 16 +--------------------------------+ |
                    |     array_counts_and_flags     | |
                 20 +--------------------------------+ + integer portion
                    :                                : |
                    :          array_bases           : |
                    :                                : |
integer_pool_offset +--------------------------------+ |
                    :                                : |
                    :          integer_pool          : |
                    :                                : /
   blob_pool_offset +--------------------------------+ \
                    :                                : |
                    :           blob_pool            : |
                    :                                : |
 string_pool_offset +--------------------------------+ + opaque portion
                    :                                : |
                    :          string_pool           : |
                    :                                : |
        blob_length +--------------------------------+ /

For this version of the BLOB protocol, the integer portion begins at offset 0 and is blob_pool_offset octets in length. The opaque portion begins at blob_pool_offset and is (blob_length - blob_pool_offset) octets in length.

Future versions of the BLOB protocol may add additional pools for other data types, and therefore may change these formulas. BLOB decoder implementations MUST therefore decode 'array_counts_and_flags' (see below) and verify that the flags portion of this field is equal to zero, before translating the remainder of the integer portion to the format used by the local machine.

The following paragraphs describe the fields within a blob:

blob_length

The blob_length is the length of the entire blob in octets. The length includes the space occupied by blob_length. blob_length does not include any padding which is added to make an embedded blob a multiple of four octets long.

integer_pool_offset

The integer_pool_offset is the octet offset (relative to the start of the blob) of the integer_pool field of the blob. integer_pool_offset MUST be a multiple of four, greater than or equal to 24, and less than or equal to blob_pool_offset. If the length of integer_pool is zero, integer_pool_offset will be equal to blob_pool_offset.

blob_pool_offset

The blob_pool_offset is the offset (relative to the start of the blob) of the blob_pool field of the blob. blob_pool_offset MUST be a multiple of four, greater than or equal to integer_pool_offset, and less than or equal to string_pool_offset. If the length of the blob_pool is zero, blob_pool_offset will be equal to string_pool_offset.

string_pool_offset

The string_pool_offset is the offset (relative to the start of the blob) of the string_pool portion of the blob. It MUST be a multiple of four, greater than or equal to blob_pool_offset, and less than or equal to blob_length. If the length of the string_pool is zero, string_pool_offset will be equal to blob_length.

array_counts_and_flags

The array_counts_and_flags field indicates how many arrays of each kind are contained within the blob.
This field is calculated as follows:

   array_counts_and_flags = (num_int_arrays) +
                            (num_blob_arrays << 8) +
                            (num_string_arrays << 16) +
                            (flags << 24)

where num_xxx_arrays is the number of array arguments of type xxx.

The "flags" portion of this field is used to indicate extensions to this format. Blobs that do not use these extensions will have a flags field of zero. For this version of the BLOB protocol, the flags field MUST be zero.

array_bases

The array_bases field contains the bases (offsets relative to the start of the blob) of each of the arrays in the blob, including those arrays which contain the scalar components of the blob (using separate arrays for scalar integer, struct, and string components). Specifically, the array_bases field contains, in order:

1. The base of each integer array. There are num_int_arrays (possibly zero) of these.

2. The base of the scalar integer array. This base is always present, even if there are no scalar integer components. If there are no scalar integer components of the blob, the scalar integer array base will be the same as the base of blob array 0. (If there are no blob arrays in the blob, the base of the scalar integer array will be the same as the base of the scalar blob array.)

3. The base of each blob array. There are num_blob_arrays (possibly zero) of these.

4. The base of the scalar blob array. This base is always present. If there are no embedded scalar blob components in the blob, the scalar blob array base will have the same value as the base of string array 0. (If there are no string arrays in this blob, this offset will be the same as the base of the scalar string array.)

5. The base of each string array. There are num_string_arrays (possibly zero) of these.

6. The base of the scalar string array. If there are no scalar string components of the blob, the base of the scalar string array will be equal to blob_length.

7. Any additional bases of arrays, or offsets of scalar components, which might be defined by future versions of this protocol. The presence of additional data types not supported in this version of the BLOB protocol will be indicated by a nonzero value in the flags portion of the array_counts_and_flags field.

integer_pool

The integer_pool contains 32-bit integers, assumed to be unsigned. These may be either scalar integers, elements of integer arrays, offsets of scalar blobs or strings, or bases of blob or string arrays.

The integers within the integer_pool MUST appear in the following order:

1. The elements of integer arrays. The integer array components appear in order, and within each array, the elements appear in order.
   The arrays and their elements are numbered from zero. Thus the 0th element of the 1st integer array immediately follows the last element of the 0th integer array.

2. The elements of the scalar integer array. Thus integer scalar component 0 immediately follows the last element of the last integer array, followed by integer scalar component 1, etc. (If there are no integer arrays, the offset of integer scalar 0 is integer_pool_offset.)

3. The offsets of elements of blob arrays. Each blob offset MUST be an integral multiple of four, and each blob offset MUST point into the blob_pool. The offset of element 0 of blob array 0 MUST be equal to blob_pool_offset. Each subsequent element of a blob array MUST have an offset equal to the offset of the preceding blob plus the declared length of the preceding blob (after padding).

   NOTE: The data within an embedded blob is considered opaque to the enclosing blob; the only reason for separating blobs from strings is to ensure padding of blobs to 4-octet boundaries. Blob encoders SHOULD NOT insist that the length field of an embedded blob is consistent with the length declared for that blob, and blob decoders SHOULD NOT check the length fields of embedded blobs when decoding the enclosing blob.

4. The offsets of elements of the scalar blob array. Each blob offset MUST be an integral multiple of four, and MUST point into the blob_pool. The offset of scalar blob component 0 MUST immediately follow the last element of the last blob array. (If there are no blob arrays, the offset of scalar blob component 0 is blob_pool_offset.) Each subsequent scalar blob component MUST have an offset equal to the offset of the preceding blob plus the length of the preceding blob (after padding).

5. The offsets of elements of string arrays. These offsets MUST point into the string_pool. Element 0 of string array 0 MUST have an offset equal to string_pool_offset, and each subsequent string MUST have an offset equal to the preceding string's offset, plus the length of the preceding string, plus 1 (for the trailing zero octet).

6. The offsets of elements of the scalar string array. These offsets MUST point into the string_pool. The scalar string component 0 MUST have an offset equal to the offset of the preceding string, plus the length of the preceding string, plus 1 (for the trailing zero octet). (If there are no string arrays, the offset of scalar string 0 is string_pool_offset.)

blob_pool

The blob_pool contains structures which are encoded in blob format.
These structures may be scalar blob components of the outer blob, or elements of blob arrays of the outer blob.

The contents of blob_pool appear in the following order:

1. The contents of each element of each blob array. Element 0 of blob array 0 appears first, followed by element 1 of blob array 0, etc.

2. The contents of each element of the scalar blob array, used to store scalar (embedded) blob components of the outer blob.

Each blob in the blob pool MUST be padded with from zero to three octets, each with a value of zero, so that the length of each blob is an exact multiple of four octets.

string_pool

The string_pool contains unaligned strings of arbitrary octets. These strings may be used for character data or for any other data which can be represented as a string of octets. BLOB makes no assumptions regarding the format of data (character encoding scheme, etc.) that is stored in strings.

The contents of the string_pool appear in the following order:

1. The contents of each element of each string array of the blob.

2. The contents of each element of the scalar string array.

For compatibility with programming languages which terminate strings with a zero octet, a zero octet is automatically appended to each string in the string_pool. This zero octet is not part of the string. Since zero octets MAY appear within BLOB strings, the zero octet that is appended to each string MUST NOT be used as a string terminator except when the higher-level protocol has specified that it may be used in this way.

4. Use of blobs by higher-level protocols

Higher-level protocols using BLOB as an encoding mechanism need to define their protocol data units in terms of blobs.

Since BLOB groups all similarly-typed data together within the blob (for ease of conversion), and since BLOB rigidly defines the order in which data must appear, applications generally cannot refer to protocol elements within a blob by a fixed offset. Instead, the application code references protocol elements in terms of "the second scalar string component", "the third scalar integer component" or "the second element of the fourth integer array component". Macros or functions which allow these elements to be accessed from a decoded blob structure are easily constructed.
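As an illustration of such access routines, the following is a minimal sketch in C and not the sample implementation mentioned in this document. It assumes that the integer portion of the blob (header, array_bases and integer_pool) has already been converted to local byte order, that the blob has passed its consistency checks, and that array_counts_and_flags sits at octet 16 with array_bases starting at octet 20, as in the dump in Appendix A. All function and constant names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed header layout (taken from the Appendix A dump):
 * array_counts_and_flags at octet 16, array_bases from octet 20. */
enum { BLOB_COUNTS_OFF = 16, BLOB_BASES_OFF = 20 };

/* Read the 32-bit integer at octet offset 'off'.  Assumes the integer
 * portion of the blob is already in local byte order. */
static uint32_t blob_word(const unsigned char *blob, uint32_t off)
{
    uint32_t w;
    assert(blob != NULL);
    memcpy(&w, blob + off, sizeof w);   /* no alignment assumptions */
    return w;
}

/* Array counts, unpacked from array_counts_and_flags. */
static uint32_t blob_num_int_arrays(const unsigned char *blob)
{
    return blob_word(blob, BLOB_COUNTS_OFF) & 0xff;
}

static uint32_t blob_num_blob_arrays(const unsigned char *blob)
{
    return (blob_word(blob, BLOB_COUNTS_OFF) >> 8) & 0xff;
}

static uint32_t blob_num_string_arrays(const unsigned char *blob)
{
    return (blob_word(blob, BLOB_COUNTS_OFF) >> 16) & 0xff;
}

/* The k-th entry of array_bases, counting from zero. */
static uint32_t blob_base(const unsigned char *blob, uint32_t k)
{
    return blob_word(blob, BLOB_BASES_OFF + 4 * k);
}

/* "The n-th scalar integer component": the scalar integer array base
 * follows the num_int_arrays integer array bases. */
static uint32_t blob_scalar_int(const unsigned char *blob, uint32_t n)
{
    uint32_t base = blob_base(blob, blob_num_int_arrays(blob));
    return blob_word(blob, base + 4 * n);
}

/* "Element k of integer array a". */
static uint32_t blob_int_array_elem(const unsigned char *blob,
                                    uint32_t a, uint32_t k)
{
    return blob_word(blob, blob_base(blob, a) + 4 * k);
}

/* "The n-th scalar string component": the scalar string array base is
 * the last base, after the integer, blob and string array bases and
 * the two other scalar array bases. */
static const char *blob_scalar_string(const unsigned char *blob, uint32_t n)
{
    uint32_t k = blob_num_int_arrays(blob) + 1 +
                 blob_num_blob_arrays(blob) + 1 +
                 blob_num_string_arrays(blob);
    uint32_t off = blob_word(blob, blob_base(blob, k) + 4 * n);
    return (const char *)blob + off;
}
```

An application would typically wrap these calls in macros named after its own protocol elements, so that the component index is written only once.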
It is possible to design a simple specification language which allows the elements of a blob to be specified in the order that makes the most sense to an application, and which produces a list of macros which map from protocol data element names to routines which can access those data elements. This hides the details of BLOB's reordering from the application without significantly impairing efficiency. An example of such a language is given in Appendix B.

If higher-level protocols employ data types other than the BLOB primitive data types, they must define how the application-specific data types are represented as one or more BLOB primitive types, and implementations of the protocol will be responsible for conversion. Applications which require a canonical form (say for signing) should specify the conversion from application data types to BLOB types so that there is exactly one possible representation of each application data type within BLOB.

Since each blob is self-contained with its own header, embedded blobs add a bit of overhead. Protocol designers should avoid unnecessary nesting of structures. For instance, what is conceptually an array of structures to an application might be better represented within BLOB as several parallel arrays. However, nesting of blobs is useful when it is desired that an inner blob be opaque to the layer of a protocol that decodes the outer blob.

4.1. Encoding Issues

Most blobs will contain at least one variable-length data structure. This implies that the offsets of the components within the blob will not be known in advance, and a program that encodes a blob will usually be unable to generate the elements of a blob in-place. The encoder routine will generally need to copy the elements of a blob from their various locations into a contiguous area of memory, in the order prescribed by the BLOB specification.

4.2. Decoding Issues

On "well-behaved" machines it should be possible to use blobs in-place after converting the integer portion of the blob to the local byte order. The protocol elements within the blob can then be accessed with macros.

It is necessary to check the blob for consistency before using it. In particular:

- The blob_length must be consistent with the length of the PDU or buffer in which the blob was received. (For instance, it must not be less than the length of data received).

- The blob_length must be at least 32 (which would be the length of an empty blob with no arguments).

- The 'flags' portion of array_counts_and_flags MUST be zero.

- The integer_pool_offset must be equal to the number of arguments (decoded from array_counts_and_flags) multiplied by 4, plus 20.

- The blob_pool_offset must be greater than or equal to integer_pool_offset.

- The string_pool_offset must be greater than or equal to blob_pool_offset.

- The string_pool_offset must be less than or equal to blob_length.

- The base of each integer array and each blob array must be an integral multiple of 4.

- The base of the first integer array (if any) must be equal to integer_pool_offset.

- Each subsequent integer array base must be greater than or equal to the previous integer array base, and less than or equal to blob_pool_offset.

- The offset of element 0 of the first blob array (if any) must be equal to blob_pool_offset.

- Each subsequent blob offset must be greater than the previous blob offset.

- The last blob offset must be less than string_pool_offset.

- The first string component must have an offset equal to string_pool_offset.

- The offset of each subsequent string must be greater than the offset of the first element of the previous string.

- Except for the first string, there must be a zero octet preceding each offset of each string component or string array element.

- The last octet in the string_pool must be a zero.

4.3. Encoding and Decoding Code

A free software sample blob encoder and decoder have been written and will be made available at the location listed in Appendix C.

5. Security Considerations

It is believed that the BLOB encoding is unique and can serve as a useful 'canonical form' for a data structure. However, if higher-level protocols encode non-native data types as BLOB primitive types, they must also define a unique representation for each quantity to be stored in that data type.

In order to prevent possible attacks by transmission of blobs containing bogus offsets, it is essential to perform the bounds checks listed in section 4.2 while decoding blobs. While such attacks could not easily overwrite memory with data chosen by an attacker, they could cause a server to malfunction.

6. Author's Address

Keith Moore
University of Tennessee
1122 Volunteer Blvd, Suite 203
Knoxville TN 37996-3450
email: moore@cs.utk.edu

7. References

[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", RFC 2119, March 1997.

[2] "Specification of Basic Encoding Rules for Abstract Syntax Notation One (ASN.1)", CCITT Recommendation X.209, January 1988.

[3] "Specification of ASN.1 encoding rules: Basic, Canonical, and Distinguished Encoding Rules", ITU-T X.690, January 1994.

[4] Srinivasan, R., "XDR: External Data Representation Standard", RFC 1832, August 1995.

[5] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, November 1996.

[6] "Extensible Markup Language (XML) 1.0 (Second Edition)", W3C Recommendation, October 2000, <http://www.w3.org/TR/2000/REC-xml-20001006>.

[7] Crocker, D. (ed.) and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", RFC 2234, November 1997.
Appendix A. ASCII-Art Picture of a blob

This diagram attempts to illustrate the ordering of the various elements of a blob and the relationship of the offsets to the elements to which they point.

The following is a dump, in an assembler-like notation, of a blob which encodes:

   2 scalar integers with values 10, 20 (decimal)
   1 integer array, with elements { 1 2 3 4 }
   0 scalar blobs
   0 blob arrays
   1 scalar string with the value "string"
   2 string arrays, with elements { "a" "b" } and { "cc" "dd" "ee" }.

"label" denotes the name assigned to a particular offset; "xx" gives the offset in hexadecimal; "contents" gives the value of the octet or octets which appear at that offset; and "description" gives a description of the value that appears in that location.
   label                    xx  contents  description
   ------------------------:--:---------:------------------------
                           :00: 00000070: blob_length
                           :04: 0000002c: integer_pool
                           :08: 0000005c: blob_pool
                           :0c: 0000005c: string_pool
                           :10: 00020001: array_counts_and_flags
                           :14: 0000002c: int_array_base_0
                           :18: 0000003c: scalar_int_array_base
                           :1c: 00000044: scalar_blob_array_base
                           :20: 00000044: string_array_base_0
                           :24: 0000004c: string_array_base_1
                           :28: 00000058: scalar_string_array_base
   integer_pool:
   int_array_base_0        :2c: 00000001:
                           :30: 00000002:
                           :34: 00000003:
                           :38: 00000004:
   scalar_int_array_base   :3c: 0000000a: (10 decimal)
                           :40: 00000014: (20 decimal)
   scalar_blob_array_base:
   string_array_base_0     :44: 0000005c: ptr_to_str[0,0]
                           :48: 0000005e: ptr_to_str[0,1]
   string_array_base_1     :4c: 00000060: ptr_to_str[1,0]
                           :50: 00000063: ptr_to_str[1,1]
                           :54: 00000066: ptr_to_str[1,2]
   scalar_string_array_base:58: 00000069: ptr_to_scalar_str[0]
   blob_pool:
   string_pool:
   ptr_to_str[0,0]         :5c: 61: 'a'
                           :5d: 00:
   ptr_to_str[0,1]         :5e: 62: 'b'
                           :5f: 00:
   ptr_to_str[1,0]         :60: 63: 'c'
                           :61: 63: 'c'
                           :62: 00:
   ptr_to_str[1,1]         :63: 64: 'd'
                           :64: 64: 'd'
                           :65: 00:
   ptr_to_str[1,2]         :66: 65: 'e'
                           :67: 65: 'e'
                           :68: 00:
   ptr_to_scalar_str[0]    :69: 73: 's'
                           :6a: 74: 't'
                           :6b: 72: 'r'
                           :6c: 69: 'i'
                           :6d: 6e: 'n'
                           :6e: 67: 'g'
                           :6f: 00:
   blob_length             :70:

Appendix B. Example Abstract Syntax

The syntax used to describe BLOB structures is given below, using the ABNF notation from [7]:

   file = *(block / comment-line)
   block = "BEGIN" 1*space id [ 1*space comment ] CRLF
           *element
           "END" [ comment ] CRLF
   element = "int" 1*space identifier [ comment ] CRLF
           / "string" 1*space identifier [ comment ] CRLF
           / "int<>" 1*space identifier [ comment ] CRLF
           / "string<>" 1*space identifier [ comment ] CRLF
           / "struct" 1*space identifier [ comment ] CRLF
           / "struct<>" 1*space identifier [ comment ] CRLF
   comment = *space "#" *char
   comment-line = comment CRLF
   id = letter *(letter / digit / "_")
   letter = "A".."Z" / "a".."z"
   digit = "0".."9"
   space = %20 / %09
   char = %01..%09 / %0B / %0C / %0E..%FF
   CRLF = 0*1%0D 0*1%0A

Here is a simple awk program to interpret this syntax and produce a list of C #define macros. The macros are of the form

   #define structname_element_type number

where 'structname' is the name of the structure, 'element' is the name of the element, and 'type' is a suffix indicating the type of the element (i = int, b = blob, s = string, ia = integer array, ba = blob array, sa = string array) for ease in visual type checking. This program is quite simplistic and performs no error checking.
   #!/bin/sh
   # the sed line deletes comments
   sed -e 's/[ ]*#.*//' | awk '
   $1 == "BEGIN" {
       current_id = $2;
       nint = nblob = nstr = ninta = nbloba = nstra = 0;
   }
   $1 == "int" { inames[nint] = $2; nint++; next; }
   $1 == "string" { snames[nstr] = $2; nstr++; next; }
   $1 == "struct" { bnames[nblob] = $2; nblob++; next; }
   $1 == "int<>" { ianames[ninta] = $2; ninta++; next; }
   $1 == "string<>" { sanames[nstra] = $2; nstra++; next; }
   $1 == "struct<>" { banames[nbloba] = $2; nbloba++; next; }
   $1 == "END" {
       for (i = 0; i < nint; ++i)
           printf ("#define %s_%s_i %d\n", current_id, inames[i], i);
       for (i = 0; i < nblob; ++i)
           printf ("#define %s_%s_b %d\n", current_id, bnames[i], i);
       for (i = 0; i < nstr; ++i)
           printf ("#define %s_%s_s %d\n", current_id, snames[i], i);
       for (i = 0; i < ninta; ++i)
           printf ("#define %s_%s_ia %d\n", current_id, ianames[i], i);
       for (i = 0; i < nbloba; ++i)
           printf ("#define %s_%s_ba %d\n", current_id, banames[i], i);
       for (i = 0; i < nstra; ++i)
           printf ("#define %s_%s_sa %d\n", current_id, sanames[i], i);
       next;
   }'

Appendix C. Example Encoding and Decoding Code

Check http://www.cs.utk.edu/~moore/blob for the latest version.
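For readers who want something concrete in the meantime, the header-level checks from section 4.2 might be sketched in C as follows. This is an illustration only, not the sample implementation referred to above. It assumes that the buffer holds exactly one blob (so "consistent with the length of the buffer" is read as equality), that the header integers are already in local byte order, and that the header fields sit at octets 0, 4, 8, 12 and 16 as in the Appendix A dump. The names blob_check_header and word_at are hypothetical, and the per-array and per-string checks are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Minimum values stated in the text: an empty blob is 32 octets long,
 * and integer_pool_offset must be at least 24. */
enum { BLOB_MIN_LENGTH = 32, BLOB_MIN_INT_POOL = 24 };

/* Read the 32-bit integer at octet offset 'off' (local byte order). */
static uint32_t word_at(const unsigned char *buf, size_t off)
{
    uint32_t w;
    memcpy(&w, buf + off, sizeof w);
    return w;
}

/* Return 0 if the header passes the checks below, -1 otherwise. */
static int blob_check_header(const unsigned char *buf, size_t buflen)
{
    uint32_t blob_length, ipool, bpool, spool, counts;

    if (buf == NULL || buflen < BLOB_MIN_LENGTH)
        return -1;

    blob_length = word_at(buf, 0);
    ipool       = word_at(buf, 4);    /* integer_pool_offset */
    bpool       = word_at(buf, 8);    /* blob_pool_offset */
    spool       = word_at(buf, 12);   /* string_pool_offset */
    counts      = word_at(buf, 16);   /* array_counts_and_flags */

    /* blob_length: at least 32, and equal to the buffer length
     * (assumption: the buffer contains this blob and nothing else). */
    if (blob_length < BLOB_MIN_LENGTH || blob_length != buflen)
        return -1;

    /* The flags portion (top 8 bits) MUST be zero. */
    if ((counts >> 24) != 0)
        return -1;

    /* Each pool offset MUST be a multiple of four. */
    if (ipool % 4 != 0 || bpool % 4 != 0 || spool % 4 != 0)
        return -1;

    /* 24 <= integer_pool_offset <= blob_pool_offset
     *    <= string_pool_offset <= blob_length */
    if (ipool < BLOB_MIN_INT_POOL || ipool > bpool ||
        bpool > spool || spool > blob_length)
        return -1;

    /* A non-empty string_pool must end with a zero octet. */
    if (spool < blob_length && buf[blob_length - 1] != 0)
        return -1;

    return 0;
}
```

Rejecting a blob as soon as one check fails keeps a decoder from following offsets that point outside the received buffer.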