draft-moore-rescap-blob-00.txt  -->   draft-moore-rescap-blob-01.txt

Internet-Draft                                   University of Tennessee
Expires: 21 May 2002                                    21 November 2001


          The Binary Low-Overhead Block Presentation Protocol

                     draft-moore-rescap-blob-01.txt

Status of this Memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups.  Note that other groups
may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html

This document is being submitted as a contribution to the IETF rescap
working group.  Comments regarding this internet-draft should be sent to
the rescap mailing list at rescap@cs.utk.edu, or to the author at the
address listed below.  Requests to subscribe to the rescap mailing list
should be sent to rescap-REQUEST@cs.utk.edu.  Please include the
document identifier draft-moore-rescap-blob-01.txt  in any comments.

Known errata of this specification, as well as sample code, will be made
available at http://www.cs.utk.edu/~moore/blob/

This Internet-Draft will expire on 21 May 2002.

ABSTRACT

This memo describes the Binary Low-Overhead Block (BLOB) protocol for
on-the-wire presentation of data in the context of higher-level
protocols.  BLOB is designed to encode and decode data with low overhead
on most CPUs, to be reasonably space-efficient, and for its
representation to be sufficiently precise that it is suitable as a
canonical format for digital signatures.



Moore                      Expires 21 May 2002                  [Page 1]
BLOB Protocol                Internet-Draft             21 November 2001

1. Introduction

When designing applications-layer protocols there is sometimes a need to
have an efficient means of encoding protocol elements or protocol data
units.  Existing solutions in this space may be deemed inadequate, for
various reasons.  For example:

-    ASN.1 [2] and BER [3] are baroque both in terms of the abstract
     syntax and available on-the-wire representations, and complex to
     implement.

-    ONC XDR [4] requires a stub generator and support libraries which
     are not easily available on all platforms, and there are subtle
     differences between the APIs provided by different implementations.
     XDR is large enough that it's not usually feasible to write your
     own implementation, and it's difficult to write portable code that
     can work with the various implementations that are deployed.  Many
     XDR implementations have significant unnecessary processing
     overhead.  This impairs performance of applications based on XDR
     and gives the protocol itself a worse reputation than it otherwise
     deserves.

-    The design of MIME [5] was heavily influenced by the need to be
     able to operate over existing text-based mail systems which imposed
     a number of constraints.  This worked out well for email, but for
     other applications, MIME is neither efficient in terms of storage
     density nor easy to parse.

-    XML [6] is easier to parse than MIME, but still requires
     significant processing overhead.  There is also a large and growing
     body of "culture" regarding how XML should be used, which
     paradoxically imposes a significant barrier to use of XML.  (To be
     fair, MIME also has a fair amount of "culture" associated with it.)
     Finally, for small and regular data structures XML imposes a lot of
     overhead.

BLOB was designed to serve as an alternative to these presentation
layers for use in representing relatively simple structures, consisting
of a limited set of primitive data types, and where the structures can
reasonably be contained within a single protocol data unit.

BLOB is designed with the following considerations:

-    It should be easy and efficient to generate the encoded form.





-    The encoded form should require minimal processing to decode,
     ideally being usable in-place (without allocating memory or
     copying) on most platforms.



-    It should be easy to write programs which manipulate and exchange
     BLOBs, without needing significant external support in the form of
     libraries or stub generators.

-    The structure should be easy and efficient to verify for internal
     consistency.

-    For any structure to be represented there should be a unique
     (canonical) on-the-wire encoding which is always used.

-    It should be reasonably space-efficient.  However, this is
     secondary to minimizing processing overhead.

The BLOB approach is more feasible now than in years past because data
representations have become more uniform across different computing
platforms.  Essentially all widely-used computers now support 32-bit
integers, can address 32-bit integers which are not aligned on any
larger boundary, use word sizes which are a multiple of 8 bits, and can
directly address strings of 8-bit characters which are not aligned on
any boundary larger than an octet.  Such computers are termed "well-
behaved" with respect to BLOB.  BLOB is designed to be usable on
machines which do not have these characteristics, but such machines will
necessarily incur more data conversion overhead.

1.1. Notation

The word BLOB in upper case letters is used to refer to the protocol;
that is, the algorithm used to define the encoding and decoding of data
structures defined in this memo.  The word "blob" in lower case letters
refers to a data structure (sequence of octets) that has been produced
by, or can be decoded by, the BLOB protocol.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document, when spelled entirely in upper case letters, are to be
interpreted as described in [1].

2. BLOB Overview

A "blob" is a linear (octet-stream) encoding of some data structure,
which is used as a protocol data unit within some application.  The
structure which is encoded by a blob is a collection of "components".
Each of the components of a blob is either a "scalar" (meaning that the
component consists of exactly one instance of that data type) or an
"array" (meaning that the component consists of a sequence of zero or
more "elements" of a uniform data type).

The data types which can appear as components of a blob are: unsigned
integer (32 bits in length), string (a variable-length sequence of
octets with arbitrary values), or blob.  Any of these types can occur as
a scalar or in an array.

Since one blob can contain other blobs, complex nesting of structures is
possible.  However, the blob encoder and decoder treat "embedded" blobs
(blobs which occur as components of an outer blob) as opaque structures.
For example, embedded blobs are not automatically decoded along with
outer blobs, and a formatting error in an embedded blob does not create
a formatting error for any blob that contains it.

"Variable-length" here means that the lengths of arrays need not be
predetermined by the protocol using BLOB.  The maximum lengths of
strings and arrays are constrained by the use of a 32-bit unsigned
integer for the length of the blob, and the representation of offsets of
data relative to the start of the blob as 32-bit unsigned integers.
Lengths may be further constrained by the higher-level protocol's choice
of transmission medium - for instance, if the blob must fit into a UDP
datagram.  The number of arrays is limited to 255 of each data type, but
this should be adequate for most data structures needed in network
protocols.

2.1 Use of Data Types Not Supported by BLOB

The primitive types (unsigned 32-bit integer and octet string) were
chosen because they represent the majority of data types used in network
protocols, they are directly supported by most computer hardware, and
because data types outside of this set are often specific to the
higher-level protocol anyway.  Having a small set of data types allows
BLOB to be a compact yet self-describing encoding, which is efficient to
decode and which does not require separate marshaling routines for each
protocol data unit used by an application.  A few additional types (in
particular, single- and double-precision floating point) are being
considered for future versions of BLOB.  The BLOB protocol is intended
to allow new primitive types to be added without changing the format of
blobs that do not include these types.

When a higher-level protocol needs to use a data type that is not
directly supported by BLOB, such data must be represented in terms of
the available types.  The higher-level protocol specification must
define the representation of such data in terms of types supported by
BLOB, and the conversion between the blob representation and the native
format must be explicitly managed by the applications.  For instance:

-    A signed 32-bit integer may be transmitted as an unsigned 32-bit
     integer by encoding the signed integer in twos-complement format.
     On most modern machines no conversion will be necessary; however on
     machines for which the smallest integer representation is larger
     than 32 bits it will be necessary for the application to sign-
     extend the result.

-    A 64-bit integer may be transmitted as two consecutive 32-bit
     integers (with the most significant word first), which would
     require that the receiving application arrange those two integers
     according to its native byte ordering.  Alternatively a 64-bit
     integer may be transmitted as eight consecutive octets within a
     string (most significant byte first), which would require that the
     receiving application re-arrange those octets according to its
     local byte ordering.

-    A multi-dimensional array may be represented as a single-
     dimensional array with the dimensions of the array passed as
     separate integer components.

-    In the current version of BLOB, floating point numbers may be
     encoded in IEEE format and transmitted as either integers (modulo
     sign-extension issues) or strings (modulo alignment issues).
     Future versions of BLOB may support floating point numbers
     directly.

-    A small dense set may be represented as bits within a scalar
     integer.  A larger dense set may be encoded using individual bits
     of the elements of an integer array.  Larger or sparse sets may be
     represented by encoding them in a string.
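
The first two conversions in the list above can be sketched in C as
follows.  This is only an illustration: the helper names are
hypothetical and are not defined by BLOB, and the higher-level protocol
would still have to state which representation it uses.

```c
#include <stdint.h>

/* Hypothetical helpers; BLOB itself only carries unsigned 32-bit words. */

/* Carry a signed 32-bit value as its twos-complement bit pattern. */
static uint32_t blob_from_int32(int32_t v)
{
    return (uint32_t)v;   /* well-defined: conversion is modulo 2^32 */
}

/* Recover the signed value without relying on implementation-defined
 * unsigned-to-signed conversion. */
static int32_t blob_to_int32(uint32_t u)
{
    if (u <= 0x7FFFFFFFu)
        return (int32_t)u;
    return (int32_t)(u - 0x80000000u) + INT32_MIN;
}

/* Split a 64-bit value into two 32-bit words, most significant first. */
static void blob_split_u64(uint64_t v, uint32_t *hi, uint32_t *lo)
{
    *hi = (uint32_t)(v >> 32);
    *lo = (uint32_t)(v & 0xFFFFFFFFu);
}

/* Reassemble the 64-bit value from the two words on the receiving side. */
static uint64_t blob_join_u64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}
```

Because the words are handled as integers rather than octets, these
helpers do not depend on the byte order of the local machine.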







3. BLOB Organization

At the most basic level, the blob consists of an integer portion
followed by an opaque portion.  The integer portion is a sequence of
unsigned 32-bit (4-octet) quantities, represented on-the-wire in network
byte ("big-endian") order.  The opaque portion is a sequence of 8-bit
(1-octet) quantities.

The blob is separated into opaque and integer portions in order to
facilitate efficient decoding on little-endian machines, or on any
machine with a word size other than 32 bits.  Having all of the integers
within a blob co-located in a contiguous area allows an implementation
to efficiently convert all of the integers to local format at the same
time.  Strings of octets are assumed to have the same representation on
all platforms, so conversion is unlikely to be needed for the opaque
portion.
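
As an illustration of converting the integer portion in a single pass,
the following C sketch reads consecutive big-endian words into host
order.  The function names are hypothetical, and the caller is assumed
to already know how many words the integer portion contains.

```c
#include <stddef.h>
#include <stdint.h>

/* Read one big-endian (network byte order) 32-bit integer from four
 * octets.  Works regardless of the host's byte order or alignment
 * requirements. */
static uint32_t blob_get_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Convert `count` consecutive wire-format integers, starting at `wire`,
 * into host-order words stored in `out`. */
static void blob_ints_to_host(const unsigned char *wire, uint32_t *out,
                              size_t count)
{
    for (size_t i = 0; i < count; i++)
        out[i] = blob_get_be32(wire + 4 * i);
}
```

This version copies into a separate buffer for clarity; an
implementation on a well-behaved machine might instead convert the words
in place.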





The integer portion of a blob is further divided into a header, a list
of array bases, and an integer pool.  The header is used to store
various data needed to decode the blob and check it for consistency.
The array bases portion contains the offsets (positions relative to the
start of the blob) of each of the arrays in the blob (including the
arrays used to store scalar components).  The integer pool is used for
storing integer data as well as the offsets of embedded blobs and
strings.

The opaque portion is divided into a blob pool and a string pool.  The
blob pool is used to store embedded blobs; the string pool is used to
store strings.  The blob pool occurs immediately following the integer
pool in order to ensure that embedded blobs are always aligned on a
four-octet boundary (relative to the start of the blob).

Each embedded blob is padded with 0-3 zero octets until its length is an
exact multiple of 4 octets.  This ensures that all embedded blobs are
aligned to 4-octet boundaries, allowing the blob decoder to assume (if
the outer blob is on an aligned boundary) that each of the embedded
blobs is also aligned.
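
The padding rule for embedded blobs reduces to a small piece of
arithmetic.  The sketch below uses hypothetical function names and
assumes the unpadded length is small enough that adding up to three
octets does not overflow 32 bits.

```c
#include <stdint.h>

/* Number of zero octets (0-3) needed to bring `length` up to the next
 * multiple of 4. */
static uint32_t blob_pad_octets(uint32_t length)
{
    return (4u - (length & 3u)) & 3u;
}

/* Space an embedded blob of `length` octets occupies in the blob pool,
 * including its padding. */
static uint32_t blob_padded_length(uint32_t length)
{
    return length + blob_pad_octets(length);
}
```

For example, an embedded blob of 22 octets is followed by 2 zero octets
and occupies 24 octets of the enclosing blob's blob pool.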

Each string is padded with a single octet with a value of zero, which is
not part of the string.  This is for convenience when strings are used
to store character data, with programming languages that use a
zero-valued octet as a string terminator.

Embedded blobs are opaque to their enclosing blob and are NOT
automatically parsed or decoded when the outer blob is decoded.  If the
receiving application wishes to examine the contents of an inner blob,
it must decode it separately from the enclosing blob.

A blob can have both scalar and array components.  For simplicity in
decoding and to eliminate some edge cases, all of the scalar integers of
a blob are stored in a "scalar integer array" which immediately follows
the last integer array component of the blob.  Similarly, all of the
scalar (embedded) blob parameters are stored in a "scalar blob array"
which immediately follows the last blob array component, and all of the
scalar string parameters are stored in a "scalar string array" which
follows the last string array component.

3.1 Representation of data types

In general, all components of a blob are elements of an array.  A
distinguished array of each type is used to store scalar components of
that type.  The base of any array (whether it is a numbered array
component or an array used to hold scalar components) can be determined
by decoding the array_count_and_flags field of the blob header.


Since strings (and blobs) can be of varying length, an array of strings
(or blobs) is represented internally by an array of integers.  Each of
these integers indicates the storage location (within the blob) of the
contents of the string or blob.  These integers are consecutive; the
offset of element 2 of an array immediately follows the offset of
element 1.  Similarly, the array elements occupy consecutive storage -
the storage occupied by string 3 of an array immediately follows that
occupied by string 2.  This allows the size of array N to be computed by
subtracting its offset from that of the following array; this works for
any numbered array.  It also allows the length of element M to be
computed by subtracting its offset from that of the following element;
this works for elements (within bounds) of numbered arrays.  The last
scalar blob or string is a boundary case; these require an explicit test
to correctly determine their length.

The individual components of a blob are encoded as follows:

3.1.1 integers and integer arrays

An unsigned integer is represented as a 32-bit quantity in big-endian
format.  All integer components appear in the integer_pool section of a
blob.

An integer array is represented as zero or more contiguous 32-bit
integers that are stored within the integer_pool section of the blob.
The location (or "base") of the array relative to the start of the blob
is stored as a 32-bit integer offset.  The base of this array is stored
in the array_bases portion of the blob.

Scalar integer components of a blob are encoded in a scalar integer
array.  The storage for the elements of this array is in the integer
pool, and immediately follows the storage used by the last numbered
integer array.  The offset of the scalar integer array appears in the
array_bases portion of the blob.

3.1.2 (embedded) blobs and blob arrays

An embedded blob component is represented as a series of octets which is
an integral multiple of four octets long.  The storage for embedded
blobs is taken from the blob pool of the enclosing blob.  An integer
offset (relative to the beginning of the blob) indicates the starting
location of the embedded blob.  For scalar embedded blob components
these offsets are encoded in a scalar blob array.  This array (of blob
offsets) is stored in the integer pool and immediately follows the
offsets of any numbered blob arrays.

A blob array is represented as an integer base (stored in array_bases)
which points to an array of integers (stored in the integer pool), each
element of which is the offset of a blob (within the blob pool).

Each embedded blob (within the blob pool) is followed by 0-3 octets with
the value zero, so that any subsequent blob will be aligned on a
four-octet boundary.  These padding octets are not considered part of
the blob; however, the length of the inner blob (as seen from the
enclosing blob) will include any padding.

3.1.3 strings and string arrays

A string is represented as a sequence of octets; these octets may have
arbitrary values.  The contents of the strings are stored in the
string_pool.  An integer offset (stored in integer_pool) indicates the
location of the contents of the string.

A string array is represented as an integer base (stored in array_bases)
which points to an array of integers (stored in the integer pool), each
element of which indicates the offset of a string (stored in the string
pool).

Each string is followed in the string_pool by a zero octet which is not
part of the string.  Thus the length of any string (other than the last
scalar string component) can be calculated by subtracting its offset
from the offset of the subsequent string, minus 1.

Strings can be of zero length, in which case the corresponding offset
points to a zero octet which is immediately followed by the next string
in the string_pool.
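
The length calculation for strings can be sketched as below.  The
function name is hypothetical, the offsets are assumed to have already
been converted to host order, and the caller is assumed to guarantee
that entry i + 1 exists and refers to the string stored immediately
after string i.

```c
#include <stddef.h>
#include <stdint.h>

/* Length of string i, given consecutive string offsets in host order.
 * The zero octet that follows every string is not part of the string,
 * hence the "minus 1".  Not valid for the last scalar string, which
 * has no following offset and needs an explicit boundary test. */
static uint32_t blob_string_length(const uint32_t *offsets, size_t i)
{
    return offsets[i + 1] - offsets[i] - 1u;
}
```

For instance, if three strings are stored at offsets 100, 104, and 105,
the first is 3 octets long and the second is a zero-length string whose
offset points directly at its terminating zero octet.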























3.2 Structure of a blob

The structure of a blob is as follows:

       octet offset                name

                  0 +--------------------------------+ \
                    |          blob_length           | |
                  4 +--------------------------------+ |
                    |      integer_pool_offset       | |
                  8 +--------------------------------+ |
                    |        blob_pool_offset        | |
                 12 +--------------------------------+ |
                    |      string_pool_offset        | |
                 16 +--------------------------------+ |
                    |     array_count_and_flags      | |
                 20 +--------------------------------+ + integer portion
                    :                                : |
                    :          array_bases           : |
                    :                                : |
integer_pool_offset +--------------------------------+ |
                    :                                : |
                    :          integer_pool          : |
                    :                                : /
   blob_pool_offset +--------------------------------+ \
                    :                                : |
                    :            blob_pool           : |
                    :                                : |
 string_pool_offset +--------------------------------+ + opaque portion
                    :                                : |
                    :           string_pool          : |
                    :                                : |
        blob_length +--------------------------------+ /


For this version of the BLOB protocol, the integer portion begins at
offset 0 and is blob_pool_offset octets in length.  The opaque portion
begins at blob_pool_offset and is (blob_length - blob_pool_offset)
octets in length.

Future versions of the BLOB protocol may add additional pools for other
data types, and therefore may change these formulas.  BLOB decoder
implementations MUST therefore decode 'array_counts_and_flags' (see
below) and verify that the flags portion of this field is equal to zero
before translating the remainder of the integer portion to the format
used by the local machine.
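
The two length formulas and the flags test can be sketched in C as
follows.  This is only an illustration: the structure and function
names are hypothetical, and the sketch assumes that the five fixed
header words have already been obtained in host byte order and that
the flags occupy the most significant eight bits of the counts-and-
flags word, as in the formula given for that field.

```c
#include <stdint.h>

/* Hypothetical host-order copy of the five fixed header words. */
struct blob_header {
    uint32_t blob_length;
    uint32_t integer_pool_offset;
    uint32_t blob_pool_offset;
    uint32_t string_pool_offset;
    uint32_t array_counts_and_flags;
};

/* The flags are the top eight bits of array_counts_and_flags. */
static uint32_t blob_flags(const struct blob_header *h)
{
    return h->array_counts_and_flags >> 24;
}

/* Integer portion: from offset 0 up to blob_pool_offset. */
static uint32_t blob_integer_portion_length(const struct blob_header *h)
{
    return h->blob_pool_offset;
}

/* Opaque portion: from blob_pool_offset up to blob_length. */
static uint32_t blob_opaque_portion_length(const struct blob_header *h)
{
    return h->blob_length - h->blob_pool_offset;
}

/* Returns 1 when the flags are zero, so the formulas above apply. */
static int blob_portions_known(const struct blob_header *h)
{
    return blob_flags(h) == 0;
}
```

A decoder written along these lines would call blob_portions_known()
first and refuse to translate the integer portion when it returns 0.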





Moore                      Expires 21 May 2002                  [Page 9]
BLOB Protocol                Internet-Draft             21 November 2001


The following paragraphs describe the fields within a blob:

blob_length
     The blob_length is the length of the entire blob in octets.  The
     length includes the space occupied by blob_length.  blob_length
     does not include any padding which is added to make an embedded
     blob a multiple of four octets long.

integer_pool_offset
     The integer_pool_offset is the octet offset (relative to the start
     of the blob) of the integer_pool field of the blob.
     integer_pool_offset MUST be a multiple of four, greater than or
     equal to 24, and less than or equal to blob_pool_offset.  If the
     length of integer_pool is zero, integer_pool_offset will be equal
     to blob_pool_offset.

blob_pool_offset
     The blob_pool_offset is exactly one possible representation the offset (relative to the start of each application data
type within BLOB.

Since a single blobs cannot encode arbitrarily complex structures, and
since nesting blobs add a bit the
     blob) of overhead, protocol designers should
avoid deep nesting the blob_pool field of structures.  For instance, what to the application
is conceptually an array of structs may blob.  blob_pool_offset MUST be better represented within
BLOB as
     a set of parallel arrays.  At the same time, nesting multiple of structs
is useful when it is desired that an inner blob be opaque four, greater than or equal to integer_pool_offset,
     and less than or equal to string_pool_offset.  If the layer length of a protocol that decodes the outer blob.

5. Encoding Issues

Most blobs will contain at least one variable-length data structure.
This implies that a program that encodes a blob
     blob_pool is zero, blob_pool_offset will usually be unable equal to generate the elements of a blob in-place. Instead,
     string_pool_offset.

string_pool_offset
     The string_pool_offset is the offset (relative to the start of the
     blob) of the string_pool portion of the blob.  It MUST be a
     multiple of four, greater than or equal to blob_pool_offset, and
     less than or equal to blob_length.  If the length of string_pool is
     zero, string_pool_offset will be equal to blob_length.

array_counts_and_flags
     The array_counts_and_flags field indicates how many arrays of each
     kind are contained within the blob.  This field is calculated as
     follows:

          array_counts_and_flags = (num_int_arrays) +
                                   (num_blob_arrays << 8) +
                                   (num_string_arrays << 16) +
                                   (flags << 24)

     where num_xxx_arrays is the number of arrays of type xxx.

     The "flags" portion of this field is used to indicate extensions to
     this format.  Blobs that do not use these extensions will have a
     flags field of zero.  For this version of the BLOB protocol, the
     flags field MUST be zero.
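
As an illustration, the formula above can be written as a pair of C
helpers.  The structure and function names are hypothetical.  The
sketch assumes that each count fits in eight bits, which is implied by
the shift distances in the formula but not stated separately.

```c
#include <stdint.h>

/* Hypothetical unpacked form of array_counts_and_flags. */
struct blob_counts {
    uint8_t num_int_arrays;
    uint8_t num_blob_arrays;
    uint8_t num_string_arrays;
    uint8_t flags;
};

/* Combine the three counts and the flags into one 32-bit word. */
static uint32_t blob_pack_counts(struct blob_counts c)
{
    return (uint32_t)c.num_int_arrays
         | ((uint32_t)c.num_blob_arrays << 8)
         | ((uint32_t)c.num_string_arrays << 16)
         | ((uint32_t)c.flags << 24);
}

/* Split a 32-bit word back into its four eight-bit parts. */
static struct blob_counts blob_unpack_counts(uint32_t word)
{
    struct blob_counts c;
    c.num_int_arrays    = (uint8_t)(word & 0xff);
    c.num_blob_arrays   = (uint8_t)((word >> 8) & 0xff);
    c.num_string_arrays = (uint8_t)((word >> 16) & 0xff);
    c.flags             = (uint8_t)((word >> 24) & 0xff);
    return c;
}
```

For example, a blob with two integer arrays, one blob array, three
string arrays and no flags would carry the value 0x00030102.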



array_bases
     The array_bases field contains the bases (offsets relative to the
     start of the blob) of each of the arrays in the blob, including
     those arrays which contain the scalar components of the blob (using
     separate arrays for scalar integer, struct, and string components).
     Specifically the array_bases field contains, in order:

     1.   The base of each integer array.  There are num_int_arrays
          (possibly zero) of these.

     2.   The base of the scalar integer array.  This base is always
          present, even if there are no scalar integer components.  If
          there are no scalar integer components of the blob, the scalar
          integer array base will be the same as the base of blob array
          0.  (If there are no blob arrays in the blob, the base of the
          scalar integer array will be the same as the base of the
          scalar blob array.)

     3.   The base of each blob array.  There are num_blob_arrays
          (possibly zero) of these.

     4.   The base of the scalar blob array.  This base is always
          present.  If there are no embedded scalar blob components in
          the blob, the scalar blob array base will have the same value
          as the base of string array 0.  (If there are no string arrays
          in this blob, this offset will be the same as the base of the
          scalar string array.)

     5.   The base of each string array.  There are num_string_arrays
          (possibly zero) of these.

     6.   The base of the scalar string array.  If there are no scalar
          string components of the blob, the base of the scalar string
          array will be equal to blob_length.

     7.   Any additional bases of arrays, or offsets of scalar
          components, which might be defined by future versions of this
          protocol.  The presence of additional data types not supported
          in this version of the BLOB protocol will be indicated by a
          nonzero value in the flags portion of the
          array_counts_and_flags field.
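
The ordering above fixes the position of every base within
array_bases.  The following C sketch computes those positions.  All
names are hypothetical.  It assumes that the three scalar bases are
present in every blob of this version (the text says so explicitly for
the scalar integer and scalar blob bases, and it is taken here to hold
for the scalar string base as well), that each base occupies four
octets, and that array_bases begins at octet offset 20.

```c
#include <stdint.h>

/* Octets occupied by the five fixed header words. */
enum { BLOB_FIXED_HEADER_OCTETS = 20 };

/* Hypothetical holder for the three array counts. */
struct blob_array_counts {
    uint32_t n_int;
    uint32_t n_blob;
    uint32_t n_string;
};

/* Index, within array_bases, of the base of integer array k. */
static uint32_t idx_int_array(struct blob_array_counts c, uint32_t k)
{
    (void)c;
    return k;
}

/* The scalar integer base follows all integer array bases. */
static uint32_t idx_scalar_int(struct blob_array_counts c)
{
    return c.n_int;
}

static uint32_t idx_blob_array(struct blob_array_counts c, uint32_t k)
{
    return c.n_int + 1 + k;
}

static uint32_t idx_scalar_blob(struct blob_array_counts c)
{
    return c.n_int + 1 + c.n_blob;
}

static uint32_t idx_string_array(struct blob_array_counts c, uint32_t k)
{
    return c.n_int + c.n_blob + 2 + k;
}

static uint32_t idx_scalar_string(struct blob_array_counts c)
{
    return c.n_int + c.n_blob + 2 + c.n_string;
}

/* Total number of bases when no future extensions are present. */
static uint32_t base_count(struct blob_array_counts c)
{
    return c.n_int + c.n_blob + c.n_string + 3;
}

/* Octet offset, from the start of the blob, of base number index. */
static uint32_t base_octet_offset(uint32_t index)
{
    return BLOB_FIXED_HEADER_OCTETS + 4 * index;
}
```

Under these assumptions, base_octet_offset(base_count(c)) is the first
octet after array_bases.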

integer_pool
     The integer_pool contains 32-bit integers, assumed to be unsigned.
     These may be scalar integers, elements of integer arrays, offsets
     of scalar blobs or strings, or bases of blob or string arrays.  The
     integers within the integer_pool MUST appear in the following
     order:

     1.   The elements of integer arrays.  The integer array components
          appear in order, and within each array, the elements appear in
          order.  The arrays and their elements are numbered from zero.
          Thus the 0th element of the 1st integer array immediately
          follows the last element of the 0th integer array.

     2.   The elements of the scalar integer array.  Thus integer scalar
          component 0 immediately follows the last element of the last
          integer array, followed by integer scalar component 1, etc.
          (If there are no integer arrays, the offset of integer scalar
          0 is integer_pool_offset.)

     3.   The offsets of elements of blob arrays.  Each blob offset MUST
          be an integral multiple of four, and each blob offset MUST
          point into the blob_pool.  The offset of element 0 of blob
          array 0 MUST be equal to blob_pool_offset.  Each subsequent
          element of a blob array MUST have an offset equal to the
          offset of the preceding blob plus the declared length of the
          preceding blob (after padding).

          NOTE: The data within an embedded blob is considered opaque to
          the enclosing blob; the only reason for separating blobs from
          strings is to ensure padding of blobs to 4-octet boundaries.
          Blob encoders SHOULD NOT insist that the length field of an
          embedded blob is consistent with the length declared for that
          blob, and blob decoders SHOULD NOT check the length fields of
          embedded blobs when decoding the enclosing blob.

     4.   The offsets of elements of the scalar blob array.  Each blob
          offset MUST be an integral multiple of four, and MUST point
          into the blob_pool.  Scalar blob component 0 MUST immediately
          follow the last element of the last blob array.  (If there are
          no blob arrays, the offset of scalar blob component 0 is
          blob_pool_offset.)  Each subsequent scalar blob component MUST
          have an offset equal to the offset of the preceding blob plus
          the length of the preceding blob (after padding).

     5.   The offsets of elements of string arrays.  These offsets MUST
          point into the string_pool.  Element 0 of string array 0 MUST
          have an offset equal to string_pool_offset, and each
          subsequent string MUST have an offset equal to the preceding
          string's offset, plus the length of the preceding string, plus
          1 (for the trailing zero octet).

     6.   The offsets of elements of the scalar string array.  These
          offsets MUST point into the string_pool.  The scalar string
          component 0 MUST have an offset equal to the offset of the
          preceding string, plus the length of the preceding string,
          plus 1 (for the trailing zero octet).  (If there are no string
          arrays, the offset of scalar string 0 is string_pool_offset.)

blob_pool
     The blob_pool contains structures which are encoded in blob format.
     These structures may be scalar blob components of the outer blob,
     or elements of blob arrays of the outer blob.  The contents of
     blob_pool appear in the following order:

     1.   The contents of each element of each blob array.  Element 0 of
          blob array 0 appears first, followed by element 1 of blob
          array 0, etc.

     2.   The contents of each element of the scalar blob array, used to
          store scalar (embedded) blob components of the outer blob.

     Each blob in the blob pool MUST be padded with from zero to three
     octets, each with a value of zero, so that the length of each blob
     is an exact multiple of four octets.
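
The padding rule amounts to rounding each embedded blob up to the next
multiple of four octets.  A minimal C sketch follows; the function
names are illustrative only and are not defined by this specification.

```c
#include <stddef.h>

/* Number of zero octets (0 to 3) needed after a blob of len octets. */
static size_t blob_pad_count(size_t len)
{
    return (4 - (len % 4)) % 4;
}

/* Length of the blob once padding has been added. */
static size_t blob_padded_length(size_t len)
{
    return len + blob_pad_count(len);
}

/* Offset of the blob that follows one at prev_offset of prev_len. */
static size_t blob_next_offset(size_t prev_offset, size_t prev_len)
{
    return prev_offset + blob_padded_length(prev_len);
}
```

For instance, an embedded blob of 13 octets placed at offset 40 is
followed by three zero octets, and the next blob begins at offset 56.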

string_pool
     The string_pool contains unaligned strings of arbitrary octets.
     These strings may be used for character data or for any other data
     which can be represented as a string of octets.  BLOB makes no
     assumptions regarding the format of data (character encoding
     scheme, etc.) that is stored in strings.

     The contents of the string_pool appear in the following order:

     1.   The contents of each element of each string array of the blob.

     2.   The contents of each element of the scalar string array.

     For compatibility with programming languages which terminate
     strings with a zero octet, a zero octet is automatically appended
     to each string in the string_pool.  This zero octet is not part of
     the string.  Since zero octets MAY appear within BLOB strings, the
     zero octet that is appended to each string MUST NOT be used as a
     string terminator except when the higher-level protocol has
     specified that they may be used in this way.
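
Because every string is followed by exactly one zero octet, the offset
of each string follows from the offset and length of the one before
it.  The C sketch below lays out a run of strings this way.  The
function name and its parameters are hypothetical; lengths are given
explicitly since strings may themselves contain zero octets.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the offset of each of n strings stored back to back,
 * starting at string_pool_offset.  lengths[i] excludes the appended
 * zero octet.  Returns the offset just past the last zero octet.
 */
static uint32_t blob_layout_strings(uint32_t string_pool_offset,
                                    const uint32_t *lengths, size_t n,
                                    uint32_t *offsets)
{
    uint32_t next = string_pool_offset;
    size_t i;

    for (i = 0; i < n; i++) {
        offsets[i] = next;
        next += lengths[i] + 1;   /* string octets plus zero octet */
    }
    return next;
}
```

With a string pool starting at offset 52 and strings of 6, 13 and 0
octets, the offsets are 52, 59 and 73, and the pool ends at offset 74.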

4. Use of blobs by higher-level protocols

Higher-level protocols using BLOB as an encoding mechanism need to
define their protocol data units in terms of blobs.  Since BLOB groups
all similarly-typed data together within the blob (for ease of
conversion), and since BLOB rigidly defines the order in which data must
appear, applications generally cannot refer to protocol elements within
a blob by a fixed offset.  Instead, the application code references
protocol elements in terms of "the second scalar string component", "the
third scalar integer component" or "the second element of the fourth
integer array component".  Macros or functions which allow these
elements to be accessed from a decoded blob structure are easily
constructed.

It is possible to design a simple specification language which allows
the elements of a blob to be specified in the order that makes the most
sense to an application, and which produces a list of macros which map
from protocol data element names to routines which can access those data
elements.  This hides the details of BLOB's reordering from the
application without significantly impairing efficiency.  An example of
such a language is given in Appendix B.

If higher-level protocols employ data types other than the BLOB
primitive data types, they must define how the application-specific data
types are represented as one or more BLOB primitive types, and
implementations of the protocol will be responsible for conversion.
Applications which require a canonical form (say for signing) should
specify the conversion from application data types to BLOB types so that
there is exactly one possible representation of each application data
type within BLOB.

Since each blob is self-contained with its own header, embedded blobs
add a bit of overhead.  Protocol designers should avoid unnecessary
nesting of structures.  For instance, what is conceptually an array of
structures to an application might be better represented within BLOB as
several parallel arrays.  However, nesting of blobs is useful when it is
desired that an inner blob be opaque to the layer of a protocol that
decodes the outer blob.

4.1. Encoding Issues

Most blobs will contain at least one variable-length data structure.
This implies that the offsets of the components within the blob will not
be known in advance, and a program that encodes a blob will usually be
unable to generate the elements of a blob in-place. The encoder routine
will generally need to copy the elements of a blob from their various
locations into a contiguous area of memory, in the order prescribed by
the BLOB specification.

4.2. Decoding Issues

On "well-behaved" machines it should be possible to use blobs in-place
after converting the integer portion of the blob to the local byte
order.  The protocol elements within the blob can then be accessed with
macros.

It is necessary to check the blob for consistency before using it.  In
particular:

-    The blob_length must be consistent with the length of the PDU or
     buffer in which the blob was received.  (For instance, it must not
     be greater than the length of data received.)

-    The blob_length must be at least 32 (which would be the length of
     an empty blob with no arguments).

-    The 'flags' portion of array_counts_and_flags MUST be zero.

-    The integer_pool_offset must be equal to the number of
     arguments (decoded from array_counts_and_flags) multiplied by 4,
     plus 20.

-    The blob_pool_offset must be greater than or equal to
     integer_pool_offset.

-    The string_pool_offset must be greater than or equal to
     blob_pool_offset.

-    The string_pool_offset must be less than or equal to blob_length.

-    The base of each integer array and each blob array must be an
     integral multiple of 4.

-    The base of the first integer array (if any) must be equal to
     integer_pool_offset.

-    Each subsequent integer array base must be greater than or equal to
     the previous integer array base, and less than or equal to
     blob_pool_offset.

-    The offset of element 0 of the first blob array (if any) must be
     equal to blob_pool_offset.

-    Each subsequent blob offset must be greater than the previous blob
     offset.

-    The last blob offset must be less than string_pool_offset.

-    The first string component must have an offset equal to
     string_pool_offset.

-    The offset of each subsequent string must be greater than the
     offset of the previous string.

-    Except for the first string, there must be a zero octet preceding
     the offset of each string component or string array element.

-    The last octet in the string_pool must be a zero.
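
The header-level checks in this list can be sketched in C as shown
below.  This is not the sample decoder mentioned in this document, and
all names are hypothetical.  The sketch makes three assumptions: the
header words are already in host byte order; the "number of arguments"
is read as the number of entries in array_bases (the three array
counts plus three scalar bases); and a blob_length larger than the
number of octets received is treated as inconsistent.  Checks on
individual array bases and string offsets are omitted.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-order copy of the five fixed header words. */
struct blob_fixed_header {
    uint32_t blob_length;
    uint32_t integer_pool_offset;
    uint32_t blob_pool_offset;
    uint32_t string_pool_offset;
    uint32_t array_counts_and_flags;
};

enum blob_check {
    BLOB_OK = 0,
    BLOB_BAD_LENGTH,
    BLOB_BAD_FLAGS,
    BLOB_BAD_INT_POOL,
    BLOB_BAD_BLOB_POOL,
    BLOB_BAD_STRING_POOL
};

/* Apply the header checks; received is the octet count in hand. */
static enum blob_check
blob_check_header(const struct blob_fixed_header *h, size_t received)
{
    uint32_t w = h->array_counts_and_flags;
    /* Assumed: three array counts plus three scalar bases. */
    uint32_t n_bases = (w & 0xff) + ((w >> 8) & 0xff)
                     + ((w >> 16) & 0xff) + 3;

    if (h->blob_length < 32 || h->blob_length > received)
        return BLOB_BAD_LENGTH;
    if ((w >> 24) != 0)
        return BLOB_BAD_FLAGS;
    if (h->integer_pool_offset != 20 + 4 * n_bases)
        return BLOB_BAD_INT_POOL;
    if (h->blob_pool_offset < h->integer_pool_offset
        || h->blob_pool_offset % 4 != 0)
        return BLOB_BAD_BLOB_POOL;
    if (h->string_pool_offset < h->blob_pool_offset
        || h->string_pool_offset > h->blob_length)
        return BLOB_BAD_STRING_POOL;
    return BLOB_OK;
}
```

Returning a distinct code for each failed check makes it easier to log
why a blob was rejected without inspecting its contents further.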

4.3. Encoding and Decoding Code

A free software sample blob encoder and decoder have been written and
will be made available at the location listed in Appendix C.

5. Security Considerations

It is believed that the BLOB encoding is unique and can serve as a
useful 'canonical form' for a data structure.  However, if higher-level
protocols encode non-native data types as BLOB primitive types, they
must also define a unique representation for each quantity to be stored
in that data-type.

In order to prevent possible attacks by transmission of blobs containing
bogus offsets, it is essential to perform the bounds checks listed in
section 4.2 while decoding blobs.  While such attacks could not easily
overwrite memory with data chosen by an attacker, they could cause a
server to malfunction.

6. Author's Address

Keith Moore
University of Tennessee
1122 Volunteer Blvd, Suite 203
Knoxville TN 37996-3450
email: moore@cs.utk.edu


7. References

[1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
     Levels", RFC 2119, March 1997.

[2]  "Specification of Basic Encoding Rules for Abstract Syntax Notation
     One (ASN.1)", CCITT Recommendation X.209, January 1988.

[3]  "Specification of ASN.1 encoding rules: Basic, Canonical, and
     Distinguished Encoding Rules", ITU-T X.690, January 1994.

[4]  Srinivasan, R., "XDR: External Data Representation Standard", RFC
     1832, August 1995.

[5]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions
     (MIME) Part One: Format of Internet Message Bodies", RFC 2045,
     November 1996.

[6]  "Extensible Markup Language (XML) 1.0 (Second Edition)", W3C
     Recommendation, October 2000,
     <http://www.w3.org/TR/2000/REC-xml-20001006>.

[7]  Crocker, D. (ed.), Overell, P. "Augmented BNF for Syntax
     Specifications: ABNF.".  RFC 2234, November 1997.






































Moore                      Expires January 12, 21 May 2002                 [Page 17]
BLOB Protocol                Internet-Draft                 12 July             21 November 2001


/* set the value of the nth integer parameter to x */
#define blob_set_int (p, n, x) \
        (p).i_args[n] = (x)


/* set the number


Appendix A. ASCII-Art Picture of string parameters in a blob */
#define blob_set_nstr (p, n) \
        (p).ns_args = (n);


/* set

This diagram attempts to illustrate the number ordering of integer array parameters in a blob */
#define blob_set_ninta (p, n) \
        (p).nia_args = (n);


/* set the number various elements
of string array parameters in a blob */
#define blob_set_nstra (p, n) \
        (p).nsa_args = (n);


/* set a blob and the value relationship of the nth string parameter offsets to 'str'
   where 'str' the elements to which
they point.

The following is NUL-terminated */
#define blob_set_str0 (p, n, str) \
        do a dump, in an assembler-like notation, of a blob which
encodes:

     2 scalar integers with values 10, 20 (decimal)
     1 integer array, with elements { \
             (p).s_args[n] = (str); \
             (p).ls_args[n] = strlen(str); \ 1 2 3 4 } while (0)


/* set
     0 scalar blobs
     0 blob arrays
     1 scalar string with the value of the nth "string"
     2 string parameter to 'str'
   where 'str' is 'len' bytes long */
#define blob_set_strl (p, n, str, len) \
        do arrays, with elements { \
             (p).s_args[n] = (str); \
             (p).ls_args[n] = (len); \ "a" "b" } while (0)


/* set and { "cc" "dd" "ee" }.

"label" denotes the name assigned to a particular offset; "xx" gives the
offset in hexadecimal; "contents" gives the value of the nth integer array to the
   in-core integer array starting octet or octets
which appear at 'base' that offset; and
   containing 'nelem' elements */
#define blob_set_int_array (p, n, base, nelem) \
        do { \
             (p).ia_args[n] = (base); \
             (p).lia_args[n] = (nelem); \
        } while (0) "description" gives a description of
the value that appears in that location.

                        label xx  contents description
     ------------------------:--:---------:------------------------
                             :00: 00000070: blob_length
                             :04: 0000002c: integer_pool
                             :08: 0000005c: blob_pool
                             :0c: 0000005c: string_pool
                             :10: 00020002: array_count_and_flags
                             :14: 0000002c: int_array_base_0
                             :18: 0000003c: scalar_int_array_base
                             :1c: 00000044: scalar_blob_array_base
                             :20: 00000044: string_array_base_0
                             :24: 0000004c: string_array_base_1
                             :28: 00000058: scalar_string_array_base
                 integer_pool:
             int_array_base_0:2c: 00000001:
                             :30: 00000002:
                             :34: 00000003:
                             :38: 00000004:
        scalar_int_array_base:3c: 0000000a: (10 decimal)
                             :40: 00000014: (20 decimal)
       scalar_blob_array_base:
          string_array_base_0:44: 0000005c: ptr_to_str[0,0]
                             :48: 0000005e: ptr_to_str[0,1]
          string_array_base_1:4c: 00000060: ptr_to_str[1,0]
                             :50: 00000063: ptr_to_str[1,1]
                             :54: 00000066: ptr_to_str[1,2]
     scalar_string_array_base:58: 00000069: ptr_to_scalar_str[0]



Moore                      Expires January 12, 21 May 2002                 [Page 18]
BLOB Protocol                Internet-Draft                 12 July             21 November 2001


/* set the value of the nth string array


                    blob_pool:
                  string_pool:
              ptr_to_str[0,0]:5c: 61: 'a'
                             :5d: 00:
              ptr_to_str[0,1]:5e: 62: 'b'
                             :5f: 00:
              ptr_to_str[0,0]:60: 63: 'c'
                             :61: 63: 'c'
                             :62: 00:
              ptr_to_str[0,0]:63: 64: 'd'
                             :64: 64: 'd'
                             :65: 00:
              ptr_to_str[0,0]:66: 65: 'e'
                             :67: 65: 'e'
                             :68: 00:
         ptr_to_scalar_str[0]:69: 73: 's'
                             :6a: 74: 't'
                             :6b: 72: 'r'
                             :6c: 69: 'i'
                             :6d: 6e: 'n'
                             :6e: 67: 'g'
                             :6f: 00:
                  blob_length:70:




























Moore                      Expires 21 May 2002                 [Page 19]
BLOB Protocol                Internet-Draft             21 November 2001


Appendix B. Example Abstract Syntax

This syntax used to the
   in-core string array starting at 'bases'
   and containing 'nelem' strings, where each
   string describe BLOB structures is NUL-terminated */
#define blob_set_str0_array (p, n, bases, nelem) \
        do { \
             (p).sa_args[n] described below using
the ABNF syntax from [7]:

     file = (bases); \
             (p).lsa_args[n] *(block / comment-line)

     block = NULL; \
             (p).nlsa_args[n] "BEGIN" 1*space id [ 1*space comment ] CRLF
             *element
             END [ comment ] CRLF

     element = (nelem); \
        } while (0)


/*
 * set the value of the nth string array to the
 * in-core string array starting at 'bases'
 * with the lengths stored in integer array 'lengths'
 * where each array is 'nelem' long
 */
#define blob_set_strl_array (p, n, bases, lengths, nelem) \
        do { \
             (p).sa_args[n] "int" 1*space identifier [ comment ] CRLF /
               "string" 1*space identifier [ comment ] CRLF /
               "int<>" 1*space identifier [ comment ] CRLF /
               "string<>" 1*space identifier [ comment ] CRLF /
               "struct" 1*space identifier [ comment ] CRLF
               "struct<>" 1*space identifier [ comment ] CRLF

     comment = (bases); \
             (p).lsa_args[n] *space "#" *char

     comment-line = (lengths); \
             (p).nlsa_args[n] comment CRLF

     id = (nelem); \
        } while (0)


/*
 * encode an int 'x' in big-endian format at ptr 'p'.
 * this is designed to be portable, there are certainly more
 * efficient ways to do this on any specific machine
 *
 * it should be okay to assume that 'ptr' is aligned on a 4-byte
 * boundary.
 */
#define ENCODE_INT(ptr, x) \
        do { \
            *ptr++ letter *(letter / digit / "_")

     letter = ((x) >> 24) & 0xff; \
            *ptr++ "A".."Z" / "a".."z"

     digit = ((x) >> 16) & 0xff; \
            *ptr++ "0".."9"

     space = ((x) >> 8) & 0xff; \
            *ptr++ %20 / %09

     char = (x) & 0xff; \
        } while (0)










Moore                   Expires January 12, 2002               [Page 19]
BLOB Protocol                Internet-Draft                 12 July 2001


/*
 * this routine encodes %01..%09 / %0B / %0C / %0E..%FF

     CRLF = 0*1%0D 0*1%0A


Here is a blob pointed simple awk program to by 'p'
 * interpret this syntax and leaves produce a list
of C #define macros.  The macros are of the result at p->blob
 * with form

     #define structname_element_type number

where 'structname' is the size in p->blobsize
 */

int
blob_encode (struct preblob *p)
{
    int i;
    int size = 0;
    int ipoolsize = 0;
    int spoolsize = 0;
    int nargs;
    unsigned int argcounts;
    char *ptr;
    char *iptr;
    char *sptr;

    if ((p->ni_args > 255) || (p->nia_args > 255) ||
        (p->ns_args > 255) || (p->nsa_args > 255))
        return -1;  /* too many arguments */

    /*
     * calculate name of the amount structure, 'element' is the name
of space needed
     */
    nargs the element, and 'type' is a suffix indicating the type of the
element (i = p->ni_args + p->nia_args + p->ns_args + p->nsa_args;
    argcounts int, b = p->ni_args +
                (p->nia_args << 8) +
                (p->ns_args << 16) +
                (p->nsa_args << 24);
    size blob, s = string, ia = 16 + (4 * nargs);

    /* size of integer array arguments */
    for (i array, ba = 0; i < p->nia_args; ++i)
        ipoolsize += p->lia_args[i] * 4;

    /* size of string arguments */
    for (i blob
array, sa = 0; i < p->ns_args; ++i) {
        if (p->s_args[i] != 0)
            spoolsize += p->ls_args[i] + 1;
    }

    /* size of string array arguments */ array) for (i = 0; i < p->nsa_args; ++i) {
        int j;
        int *lengths = p->lsa_args[i]; ease in visual type checking.

This program is quite simplistic and performs no error checking.





Moore                      Expires January 12, 21 May 2002                 [Page 20]
BLOB Protocol                Internet-Draft                 12 July             21 November 2001


        ipoolsize += p->nlsa_args[i] * 4;
        for (j


#!/bin/sh
# the sed line deletes comments
sed -e 's/[ ]*#.*//' | awk '
$1 == "BEGIN" {
        current_id = $2;
        nint = nblob = nstr = ninta = nbloba = nstra = 0; j < p->nlsa_args[i]; ++j) {
            if (p->sa_args[i][j] != 0) {
                if (lengths)
                    spoolsize += lengths[j] + 1;
                else
                    spoolsize += strlen (p->sa_args[i][j]) + 1;
            }
}
    }
    size = size + ipoolsize + spoolsize;

    /*
     * make sure there's enough space allocated
     */
    if (p->blobsize
$1 == 0) "int" {
        p->blob = (char *) malloc (size);
        p->blobsize
        inames[nint] = size; $2;
        nint++;
        next;
}
    else
$1 == "string" {
        p->blob
        snames[nstr] = (char *) realloc (p->blob, size);
        p->blobsize $2;
        nstr++;
        next;
}
$1 == "struct" {
        bnames[nblob] = size; $2;
        nblob++;
        next;
}

    /*
     * now, encode things
     */
    ptr
$1 == "int<>" {
        ianames[ninta] = p->blob;
    iptr $2;
        ninta++;
        next;
}
$1 == "string<>" {
        sanames[nstra] = p->blob + 16 + (nargs * 4);
    sptr $2;
        nstra++;
        next;
}
$1 == "struct<>" {
        banames[nbloba] = p->blob + 16 + (nargs * 4) + ipoolsize;

    /* header */
    ENCODE_INT (ptr, size);
    ENCODE_INT (ptr, 16 + (nargs * 4));
    ENCODE_INT (ptr, 16 + (nargs * 4) + ipoolsize);
    ENCODE_INT (ptr, argcounts);

    /* int arguments */ $2;
        nbloba++;
        next;
}
$1 == "END" {
        for (i = 0; i < p->ni_args; nint; ++i)
        ENCODE_INT (ptr, p->i_args[i]);

    /* int array arguments */
                printf ("#define %s_%s_i %d\n", current_id, inames[i], i);
        for (i = 0; i < p->nia_args; nblob; ++i) {
        int j;

        ENCODE_INT (ptr, iptr - p->blob);
                printf ("#define %s_%s_b %d\n", current_id, bnames[i], i);
        for (j (i = 0; j i < p->lia_args[i]; ++j)
            ENCODE_INT (iptr, p->ia_args[i][j]);



Moore                   Expires January 12, 2002               [Page 21]
BLOB Protocol                Internet-Draft                 12 July 2001


    }

    /* string arguments */ nstr; ++i)
                printf ("#define %s_%s_s %d\n", current_id, snames[i], i);
        for (i = 0; i < p->ns_args; ninta; ++i) {
        if (p->s_args[i] != 0) {
            ENCODE_INT (ptr, sptr - p->blob);
            memcpy (sptr, p->s_args[i], p->ls_args[i]);
            sptr[p->ls_args[i]] = '\0';
            sptr += p->ls_args[i] + 1;
        }
        else
            ENCODE_INT (ptr, 0);
    }

    /* string array arguments */
                printf ("#define %s_%s_ia %d\n", current_id, ianames[i], i);
        for (i = 0; i < p->nsa_args; nbloba; ++i) {
        int j;

        ENCODE_INT (ptr, iptr - p->blob);
                printf ("#define %s_%s_ba %d\n", current_id, banames[i], i);



Moore                      Expires 21 May 2002                 [Page 21]
BLOB Protocol                Internet-Draft             21 November 2001


        for (j (i = 0; j i < p->nlsa_args[i]; ++j) {
            if (p->sa_args[i][j] != 0) {
                ENCODE_INT (iptr, sptr - p->blob);
                if (p->lsa_args[i]) {
                    memcpy (sptr, p->sa_args[i][j], p->lsa_args[i][j]);
                    sptr += p->lsa_args[i][j];
                    *sptr++ = '\0';
                }
                else {
                    char *src = p->sa_args[i][j];

                    while (*sptr++ = *src++);
                }
            }
            else
                ENCODE_INT (iptr, 0);
        }
    }
} nstra; ++i)
                printf ("#define %s_%s_sa %d\n", current_id, sanames[i], i);
        next;
}'
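
The ENCODE_INT macro in the encoding code stores a 32-bit quantity most
significant octet first.  As a minimal sketch of the same idea written
as functions, together with the matching read operation, one might use
something like the following.  The names put_be32 and get_be32 are
hypothetical and do not appear in the author's code.

```c
#include <stdint.h>

/*
 * Hypothetical helper: store 'x' at 'ptr' in big-endian order and
 * return the address just past the four octets written.
 */
static unsigned char *
put_be32 (unsigned char *ptr, uint32_t x)
{
    *ptr++ = (unsigned char) ((x >> 24) & 0xff);
    *ptr++ = (unsigned char) ((x >> 16) & 0xff);
    *ptr++ = (unsigned char) ((x >> 8) & 0xff);
    *ptr++ = (unsigned char) (x & 0xff);
    return ptr;
}

/*
 * Hypothetical helper: read the big-endian 32-bit quantity stored in
 * the four octets starting at 'ptr'.
 */
static uint32_t
get_be32 (const unsigned char *ptr)
{
    return ((uint32_t) ptr[0] << 24) |
           ((uint32_t) ptr[1] << 16) |
           ((uint32_t) ptr[2] << 8) |
           (uint32_t) ptr[3];
}
```

Working on unsigned char and uint32_t avoids depending on the width or
signedness of int and char, which the macro version leaves to the
platform.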


Appendix D. Example Decoding Code

This code will be supplied in a later version of this document.
Check http://www.cs.utk.edu/~moore/blob for availability.
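
In the meantime, the following is a minimal decoding sketch, not the
author's code.  It rebuilds the example blob dumped in Appendix A and
reads strings back out of it with bounds checks.  All function names
are hypothetical.  The sketch also assumes, from the layout of that
dump only, that the pointer table of a string array ends where the next
base in the header begins.

```c
#include <stdint.h>
#include <string.h>

/* Words at offsets 0x00..0x58 of the Appendix A dump, in order. */
static const uint32_t sample_words[23] = {
    0x00000070, 0x0000002c, 0x0000005c, 0x0000005c, 0x00020002,
    0x0000002c, 0x0000003c, 0x00000044, 0x00000044, 0x0000004c,
    0x00000058, 0x00000001, 0x00000002, 0x00000003, 0x00000004,
    0x0000000a, 0x00000014, 0x0000005c, 0x0000005e, 0x00000060,
    0x00000063, 0x00000066, 0x00000069
};

/* String pool at offsets 0x5c..0x6f (20 octets with the final NUL). */
static const char sample_strings[20] = "a\0b\0cc\0dd\0ee\0string";

/* Decode one big-endian 32-bit word. */
static uint32_t
get_be32 (const unsigned char *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* Fill 'blob' (at least 0x70 octets) with the Appendix A example. */
static void
build_sample (unsigned char *blob)
{
    size_t i;

    for (i = 0; i < 23; ++i) {
        blob[4 * i]     = (unsigned char) (sample_words[i] >> 24);
        blob[4 * i + 1] = (unsigned char) (sample_words[i] >> 16);
        blob[4 * i + 2] = (unsigned char) (sample_words[i] >> 8);
        blob[4 * i + 3] = (unsigned char) sample_words[i];
    }
    memcpy (blob + 0x5c, sample_strings, sizeof sample_strings);
}

/* Return the string at offset 'off', or NULL if the offset or the
   terminating NUL would fall outside the blob. */
static const char *
string_at (const unsigned char *blob, uint32_t blob_len, uint32_t off)
{
    if (off >= blob_len ||
        memchr (blob + off, '\0', blob_len - off) == NULL)
        return NULL;
    return (const char *) (blob + off);
}

/* Return element 'idx' of the string array whose pointer table
   occupies [base, end), or NULL on any bounds failure.  Taking 'end'
   from the next base in the header is an assumption (see above). */
static const char *
string_array_elem (const unsigned char *blob, uint32_t blob_len,
                   uint32_t base, uint32_t end, uint32_t idx)
{
    if (base % 4 != 0 || end % 4 != 0 || base > end || end > blob_len)
        return NULL;
    if (idx >= (end - base) / 4)
        return NULL;
    return string_at (blob, blob_len, get_be32 (blob + base + 4 * idx));
}
```

With the sample blob, string_array_elem (blob, 0x70, 0x4c, 0x58, 0)
yields "cc", and asking for index 3 of the same array yields NULL
because the table holds only three pointers.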











































----