INTERNET-DRAFT                                                   S. Legg
draft-legg-xed-rxer-03.txt                                       eB2Bcom
Intended Category: Standards Track                             D. Prager
                                                            July 5, 2005

              Robust XML Encoding Rules (RXER) for
              Abstract Syntax Notation One (ASN.1)

Copyright (C) The Internet Society (2005).

Status of this Memo

By submitting this Internet-draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

By submitting this Internet-draft, I accept the provisions of Section 3 of BCP 78.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress".

The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

Technical discussion of this document should take place on the XED developers mailing list <xeddev@eb2bcom.com>. Please send editorial comments directly to the editor <steven.legg@eb2bcom.com>. Further information is available on the XED website: www.xmled.info.

This Internet-Draft expires on 5 January 2006.
Abstract

This document defines a set of Abstract Syntax Notation One (ASN.1) encoding rules, called the Robust XML Encoding Rules or RXER, that produce an Extensible Markup Language (XML) representation for values of any given ASN.1 data type. Rules for producing a canonical RXER encoding are also defined.

Legg & Prager            Expires 5 January 2006            [Page 1]

Table of Contents

   1. Introduction
   2. Conventions
   3. Definitions
   4. Additional Basic Types
      4.1. The AnyType Type
         4.1.1. Self-Containment
         4.1.2. Normalization for Canonical Encoding Rules
      4.2. The AnyURI Type
      4.3. The NCName Type
      4.4. The Name Type
      4.5. The QName Type
   5. Encoding Rules
      5.1. Definitions and Common Constructs
         5.1.1. Qualified Reference Names
         5.1.2. Identifiers
      5.2. Encapsulating RXER Encodings
      5.3. Component Encodings
         5.3.1. Element Components
            5.3.1.1. Namespace Properties for Elements
            5.3.1.2. Namespace Prefixes for Element Names
         5.3.2. Attribute Components
            5.3.2.1. Namespace Prefixes for Attribute Names
         5.3.3. Unencapsulated Components
         5.3.4. Examples
      5.4. Type Referencing Notations
      5.5. TypeWithConstraint and SEQUENCE OF Type
      5.6. Character Data Translations
         5.6.1. Restricted Character String Types
         5.6.2. BIT STRING
         5.6.3. BOOLEAN
         5.6.4. ENUMERATED
         5.6.5. GeneralizedTime
         5.6.6. INTEGER
         5.6.7. NULL
         5.6.8. ObjectDescriptor
         5.6.9. OBJECT IDENTIFIER and RELATIVE-OID
         5.6.10. OCTET STRING
         5.6.11. QName
            5.6.11.1. Namespace Prefixes for Qualified Names
         5.6.12. REAL
         5.6.13. UTCTime
         5.6.14. CHOICE as UNION
         5.6.15. SEQUENCE OF as LIST
      5.7. Combining Types
         5.7.1. CHARACTER STRING
         5.7.2. CHOICE
         5.7.3. EMBEDDED PDV
         5.7.4. EXTERNAL
         5.7.5. INSTANCE OF
         5.7.6. SEQUENCE and SET
         5.7.7. SEQUENCE OF and SET OF
      5.8. Open Type
      5.9. AnyType
      5.10. Namespace Prefixes for CRXER
      5.11. Serialization
         5.11.1. Non-canonical Serialization
         5.11.2. Canonical Serialization
         5.11.3. Unicode Normalization in XML Version 1.1
      5.12. Syntax-Based Canonicalization
   6. Transfer Syntax Identifiers
      6.1. RXER Transfer Syntax
      6.2. CRXER Transfer Syntax
   7. Relationship to XER
   8. Security Considerations
   9. Acknowledgements
   10. IANA Considerations
   Appendix A. Additional Basic Definitions Module
   Normative References
   Informative References
   Authors' Addresses
   Full Copyright Statement

1. Introduction

This document defines a set of Abstract Syntax Notation One (ASN.1) [X.680] encoding rules, called the Robust XML Encoding Rules or RXER, that produce an Extensible Markup Language (XML) [XML10][XML11] representation of ASN.1 values of any given arbitrary ASN.1 type.

An ASN.1 value is regarded as analogous to the content of an XML element, or in some cases an XML attribute value. The RXER encoding of an ASN.1 value is the well-formed and valid content of an element, or attribute value, in an XML document [XML10][XML11] conforming to XML namespaces [XMLNS10][XMLNS11]. Simple ASN.1 data types such as PrintableString, INTEGER, BOOLEAN, define character data content or attribute values, while the ASN.1 combining types (i.e., SET, SEQUENCE, SET OF, SEQUENCE OF, and CHOICE) define element content and attributes. The element and attribute names are generally provided by the identifiers of the components in combining type definitions (i.e., elements and attributes correspond to the NamedType notation).

RXER leaves some formatting details to the discretion of the encoder, so there is not a single unique RXER encoding for an ASN.1 value. However, this document also defines a restriction of RXER, called the Canonical Robust XML Encoding Rules (CRXER), which does produce a single unique encoding for an ASN.1 value.
Obviously, the CRXER encoding of a value is also a valid RXER encoding of that value. The restrictions on RXER to produce the CRXER encoding are interspersed with the description of the rules for RXER.

Note that "ASN.1 value" does not mean a Basic Encoding Rules (BER) [X.690] encoded value. The ASN.1 value is an abstract concept that is independent of any particular encoding. BER is just one possible way to encode an ASN.1 value. This document defines an alternative way to encode an ASN.1 value.

A separate document [RXEREI] defines encoding instructions [X.680-1] that may be used in an ASN.1 specification to modify how values are encoded in RXER, for example, to encode a component of a combining ASN.1 type as an attribute rather than as a child element. A pre-existing ASN.1 specification will not have RXER encoding instructions, so any mention of encoding instructions in this document can be ignored when dealing with such specifications. Encoding instructions for other encoding rules have no effect on RXER encodings.

2. Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED" and "MAY" in this document are to be interpreted as described in BCP 14, RFC 2119 [BCP14]. The key word "OPTIONAL" is exclusively used with its ASN.1 meaning.

A reference to an ASN.1 production [X.680] (e.g., Type, NamedType) is a reference to the text in an ASN.1 specification corresponding to that production.

The specification of RXER makes use of definitions from the XML Information Set (Infoset) [ISET]. In particular, information item property names are presented per the Infoset, e.g., [local name]. Literal values of Infoset properties are enclosed in double quotes; however, the double quotes are not part of the property values.
In the sections that follow, "information item" will be abbreviated to "item", e.g., "element information item" is abbreviated to "element item". The term "element" or "attribute" (without the "item") refers to an element or attribute in an XML document rather than an information item.

Literal character strings to be used in the RXER encoding appear within double quotes; however, the double quotes are not part of the literal value and do not appear in the encoding.

This document uses the namespace prefix [XMLNS10][XMLNS11] "xsi:" to stand for the namespace name "http://www.w3.org/2001/XMLSchema-instance", though in practice any valid namespace prefix is permitted in non-canonical RXER encodings (namespace prefixes are deterministically generated for CRXER).

The encoding instructions [X.680-1] referenced by name in this specification are encoding instructions for RXER [RXEREI].

Throughout this document, references to the AnyType, AnyURI, NCName, Name and QName ASN.1 types are references to the types described in Section 4 and consolidated in the AdditionalBasicDefinitions module in Appendix A. Any provisions associated with the reference do not apply to types defined in other ASN.1 modules that happen to have these same names.

Code points for characters [UCS][UNICODE] are expressed using the Unicode convention U+n, where n is four to six hexadecimal digits, e.g., the space character is U+0020.

3. Definitions

Definition: A white space character is a space (U+0020), tab (U+0009), carriage return (U+000D) or line feed (U+000A) character.

Definition: White space is a sequence of one or more white space characters.

Definition: A namespace declaration attribute item is declaring the default namespace if the [prefix] of the attribute item has no value, the [local name] of the attribute item is "xmlns" and the [normalized value] is not empty.
Definition: A namespace declaration attribute item is undeclaring the default namespace if the [prefix] of the attribute item has no value, the [local name] of the attribute item is "xmlns" and the [normalized value] is empty (i.e., xmlns="").

4. Additional Basic Types

This section defines an ASN.1 type for representing markup in abstract values, as well as basic types that are useful in encoding instructions [RXEREI] and other related specifications [ASN.X].

The ASN.1 definitions in this section are consolidated in the AdditionalBasicDefinitions ASN.1 module in Appendix A.

4.1. The AnyType Type

A value of the AnyType ASN.1 type holds the [prefix], [attributes], [namespace attributes] and [children] of an element item, i.e., the content of an element.

RXER has special provisions for encoding values of AnyType (see Section 5.9). For other encoding rules, a value of AnyType is encoded according to the following ASN.1 type definition (with AUTOMATIC TAGS):

   AnyType ::= CHOICE {
       text  SEQUENCE {
           prolog      UTF8String (SIZE(1..MAX)) OPTIONAL,
           prefix      NCName OPTIONAL,
           attributes  UTF8String (SIZE(1..MAX)) OPTIONAL,
           content     UTF8String (SIZE(1..MAX)) OPTIONAL
       }
   }

The text alternative of the AnyType CHOICE type provides for the [prefix], [attributes], [namespace attributes] and [children] of an element item to be represented as serialized XML using the UTF-8 character encoding [UTF8].

   ASIDE: The CHOICE allows for one or more alternative compact representations of the content of elements to be supported in a future specification.

Definition: A line break is any sequence of characters that is normalized to a line feed by XML End-of-Line Handling [XML10][XML11].

Definition: Serialized white space is a sequence of one or more white space characters and/or line breaks.
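The two definitions above can be read as a simple character-level test. The following Python sketch is illustrative only and is not part of RXER: the function name and the xml11 flag are assumptions of this example, and the set of extra line break characters reflects the End-of-Line Handling rules of XML 1.1 as understood here rather than anything stated in this document.

```python
# Illustrative sketch only; all names here are hypothetical.

# White space characters as defined in Section 3.
WHITE_SPACE = {"\u0020", "\u0009", "\u000D", "\u000A"}

# Assumption: XML 1.1 End-of-Line Handling additionally normalizes
# NEL (U+0085) and LINE SEPARATOR (U+2028) to a line feed.
XML11_LINE_BREAK_EXTRAS = {"\u0085", "\u2028"}


def is_serialized_white_space(text: str, xml11: bool = False) -> bool:
    """Return True if text consists of one or more white space
    characters and/or line breaks."""
    allowed = WHITE_SPACE | (XML11_LINE_BREAK_EXTRAS if xml11 else set())
    # Every line break sequence is built only from allowed characters,
    # so a per-character check is sufficient.
    return len(text) > 0 and all(ch in allowed for ch in text)
```

Under these assumptions a string such as a space followed by CR LF and a tab qualifies in either XML version, whereas U+0085 qualifies only when xml11 is set.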
With respect to some element item whose content is represented by a value of the text alternative of the AnyType type:

   - the prolog component of the value contains text that, after line break normalization, conforms to the XML prolog production [XML10][XML11],

   - the prefix component is absent if the [prefix] of the element item has no value, otherwise the prefix component contains the [prefix] of the element item,

   - the attributes component of the value contains an XML serialization of the [attributes] and [namespace attributes] of the element item, if any, with each attribute separated from the next by serialized white space,

   - the content component is absent if the [children] property of the element item is empty, otherwise the content component of the value contains an XML serialization of the [children] of the element item.

All the components of a value of AnyType MUST use the same version of XML, either version 1.0 [XML10] or version 1.1 [XML11]. If XML version 1.1 is used then the prolog component MUST be present and MUST have an XMLDecl for version 1.1. If the prolog component is absent then XML version 1.0 is assumed.

If the prefix component is present then there MUST be a namespace declaration attribute in the attributes component that defines that namespace prefix (since an element whose content is described by a value of AnyType is required to be self-contained, see Section 4.1.1).

Note that the prefix component is critically related to the NamedType which has AnyType as its type. If an AnyType value is extracted from one enclosing abstract value and embedded in another enclosing abstract value (i.e., becomes associated with a different NamedType) then the prefix may no longer be appropriate, in which case it will need to be revised. It may also be necessary to add another namespace declaration attribute to the attributes component so as to declare a new namespace prefix.
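The requirement that a present prefix component be declared in the attributes component can be checked mechanically. The sketch below is one possible approach in Python, not a procedure defined by this document: the function names and the throwaway wrapper element "x" are hypothetical, and the sketch assumes XML 1.0 text with no entity references other than the predefined ones (otherwise the prolog would also be needed).

```python
import io
import xml.etree.ElementTree as ET


def declared_prefixes(attributes: str) -> dict:
    """Map each namespace prefix declared in an attributes component
    to its namespace name ("" stands for the default namespace)."""
    # Wrap the attributes in a throwaway element so they can be parsed;
    # "x" is an arbitrary, hypothetical element name.
    wrapped = "<x " + attributes + "/>"
    source = io.BytesIO(wrapped.encode("utf-8"))
    # "start-ns" events carry a (prefix, namespace name) pair.
    return {prefix: uri
            for _event, (prefix, uri)
            in ET.iterparse(source, events=("start-ns",))}


def prefix_is_declared(prefix: str, attributes: str) -> bool:
    """Return True if the attributes component declares the prefix."""
    return prefix in declared_prefixes(attributes)
```

For an attributes component that declares xmlns:ns, prefix_is_declared("ns", ...) holds, while any other prefix would need a further namespace declaration attribute to be added.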
Leading and/or trailing serialized white space is permitted in the attributes component. A value of the attributes component consisting only of serialized white space (i.e., no actual attributes) is permitted.

The attributes and content components MAY contain entity references [XML10][XML11]. If any entity references are used (other than references to the predefined entities) then the prolog component MUST be present and MUST contain entity declarations for those entities in the internal or external subset of the document type declaration.

Example

Given the following ASN.1 module:

   MyModule DEFINITIONS
   AUTOMATIC TAGS ::= BEGIN

   Message ::= SEQUENCE {
       messageType   INTEGER,
       messageValue  AnyType
   }

   ENCODING-CONTROL RXER

       TARGET-NAMESPACE "http://example.com/ns/MyModule"

       COMPONENT message Message  -- a top level NamedType

   END

Consider the following XML document:

   <?xml version='1.0'?>
   <!DOCTYPE message [
    <!ENTITY TRUE 'true'>
   ]>
   <message>
    <messageType>1</messageType>
    <messageValue xmlns:ns="http://www.example.com/ABD"
     ns:foo="1" bar="0">
     <this>&TRUE;</this>
     <that/>
    </messageValue>
   </message>

An AnyType value corresponding to the content of the <messageValue> element is, in ASN.1 value notation [X.680] (where lf represents the line feed character):

   text:{
       prolog {
           "<?xml version='1.0'?>", lf,
           "<!DOCTYPE root [", lf,
           " <!ENTITY TRUE 'true'>", lf,
           "]>", lf
       },
       attributes {
           " xmlns:ns=""http://www.example.com/ABD""", lf,
           " ns:foo=""1"" bar=""0"""
       },
       content {
           lf,
           " <this>&TRUE;</this>", lf,
           " <that/>", lf,
           " "
       }
   }

The following AnyType value is an equivalent representation of the content of the <messageValue> element:

   text:{
       attributes {
           "bar=""0"" ns:foo=""1"" ",
           "xmlns:ns=""http://www.example.com/ABD"""
       },
       content {
           lf,
           " <this>true</this>", lf,
           " <that/>", lf,
           " "
       }
   }

By itself the
AnyType ASN.1 type imposes no datatype restriction on the markup contained by its values, and is therefore analogous to the XML Schema anyType [XSD1]. There is no ASN.1 basic notation that can directly impose the constraint that the markup represented by a value of AnyType must conform to the markup allowed by a specific type definition. However, certain encoding instructions (i.e., the reference encoding instructions [RXEREI]) have been defined to have this effect.

4.1.1. Self-Containment

An element and its contents, including descendant elements, may contain qualified names [XMLNS10][XMLNS11] as the names of elements and attributes, in the values of attributes, and as character data content of elements. The binding between namespace prefix and namespace name for these qualified names is potentially determined by the namespace declaration attributes of ancestor elements (which in the Infoset representation are inherited as namespace items in the [in-scope namespaces]).

In the absence of complete knowledge of the data type of an element item whose content is described by a value of AnyType, it is not possible to determine with absolute certainty which of the namespace items inherited from the [in-scope namespaces] of the [parent] element item are significant in interpreting that content. The safe and easy option would be to assume that all the namespace items from the [in-scope namespaces] of the [parent] element item are significant and need to be retained within the AnyType value. When the AnyType value is re-encoded, any of the retained namespace items that do not appear in the [in-scope namespaces] of the enclosing element item in the new encoding could be made to appear by outputting corresponding namespace declaration attribute items in the [namespace attributes] of the enclosing element item.
From the perspective of the receiver of the new encoding, this enlarges the set of attribute items in the [namespace attributes] represented by the AnyType value. In addition, there is no guarantee that the sender of the new encoding has recreated the original namespace declaration attributes on the ancestor elements, so the [in-scope namespaces] of the enclosing element item is likely to have new namespace declarations that the receiver will retain and pass on in the [namespace attributes] when it in turn re-encodes the AnyType value. This unbounded growth in the set of attribute items in the [namespace attributes] defeats any attempt to produce a canonical encoding.

To avoid this problem, the principle of self-containment is introduced. An element item (the subject element item) is self-contained if the constraints of Namespaces in XML [XMLNS10] are satisfied (i.e., that prefixes are properly declared) and none of the following bindings are determined by a namespace declaration attribute item in the [namespace attributes] of an ancestor element item of the subject element item:

   (1) the binding between the [prefix] and [namespace name] of the subject element item,

   (2) the binding between the [prefix] and [namespace name] of any descendant element item of the subject element item,

   (3) the binding between the [prefix] and [namespace name] of any attribute item in the [attributes] of the subject element item or the [attributes] of any descendant element item of the subject element item,

   (4) the binding between the namespace prefix and namespace name of any qualified name [XMLNS10] in the [normalized value] of any attribute item in the [attributes] of the subject element item or the [attributes] of any descendant element item of the subject element item,

   (5) the binding between the namespace prefix and namespace name of any qualified name represented by a series of character items (ignoring processing instruction and comment items) in the [children] of the subject element item or the [children] of any descendant element item of the subject element item.

   ASIDE: If an element is self-contained then separating the element from its parent does not change the semantic interpretation of its name and any names in its content.

A supposedly self-contained element in a received RXER encoding that is in fact not self-contained SHALL be treated as an ASN.1 constraint violation.

   ASIDE: ASN.1 does not require an encoding with a constraint violation to be immediately rejected; however, the constraint violation must be reported at some point, possibly in a separate validation step.

Implementors should note that an RXER decoder will be able to detect some, but not all, violations of self-containment. For example, it can detect element and attribute names that depend on namespace declarations appearing in the ancestors of a supposedly self-contained element. Similarly, where type information is available, it can detect qualified names in character data that depend on the namespace declarations of ancestor elements. However, type information is not always available, so some qualified names will escape constraint checking. Thus the onus is on the creator of the original encoding to ensure that element items required to be self-contained really are completely self-contained.

An element item whose content is described by a value of AnyType MUST be self-contained.

ASN.1 extensibility [X.680] creates situations where a decoder may not be aware that it is dealing with a value of AnyType. In order to protect the integrity of AnyType values in these situations certain other element items are required to be self-contained. The particular cases are called out in later parts of this specification.

   ASIDE: The rationale is the same in each case.
   If a decoder receives an element item enclosing an unknown extension then it is obliged to save at least the [prefix], [attributes], [namespace attributes] and [children] of the element item for possible later re-encoding. If such element items are always self-contained then the application does not need to recreate exactly the same [in-scope namespaces] when re-encoding the extension (the namespace items corresponding to the [namespace attributes] are sufficient), and no new namespace declarations need be added to the [namespace attributes]. This is critical if the extension happens to be a value of AnyType.

One further case related to the ELEMENT-REF encoding instruction and top level NamedType notation [RXEREI] is addressed in Section 5.3.1.

   ASIDE: The encoding procedures in Section 5, particularly Section 5.3.1.1, take account of the requirements for self-containment (for element items where the content is not described by a value of AnyType) so that an RXER encoder following these procedures will not create violations of self-containment.

4.1.2. Normalization for Canonical Encoding Rules

Implementations are given some latitude in how the content of an element is represented as an abstract value of the AnyType type, in part because an Infoset can have different equivalent serializations. For example, the order of attributes and the amount and kind of white space characters between attributes are irrelevant to the Infoset representation. The content can also include one or more elements corresponding to an ASN.1 top level NamedType from a module with a TARGET-NAMESPACE encoding instruction. It is only necessary to preserve the abstract value for such elements and a particular abstract value can have different Infoset representations.
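The point about attribute order and white space can be illustrated with a short Python sketch. It uses xml.etree.ElementTree as a rough stand-in for an Infoset view, which is an assumption of this example (ElementTree discards prefixes and namespace declarations, so it models only part of the Infoset). The two serializations reuse the attributes of the <messageValue> element from the example in Section 4.1; the helper name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Two serializations that differ only in attribute order and in the
# white space between attributes.
first = ('<messageValue xmlns:ns="http://www.example.com/ABD"\n'
         '    ns:foo="1" bar="0"/>')
second = ('<messageValue bar="0" ns:foo="1" '
          'xmlns:ns="http://www.example.com/ABD"/>')


def attribute_view(xml_text: str) -> dict:
    """Return the attributes keyed by expanded name ({namespace}local)."""
    return dict(ET.fromstring(xml_text).attrib)


# True when both serializations carry the same set of attributes.
same = attribute_view(first) == attribute_view(second)
```

Both strings yield the same attribute mapping even though they differ character for character, which is why a character-level comparison of the attributes component is not a reliable equality test without normalization.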
These two characteristics mean that when an RXER encoded value of AnyType is decoded the components of the recovered AnyType value may not be exactly the same, character for character, as the original value that was encoded, though the recovered value will be semantically equivalent.

However, canonical ASN.1 encoding rules such as the Distinguished Encoding Rules (DER) and the Canonical Encoding Rules (CER) [X.690], which encode AnyType values according to the ASN.1 definition of AnyType, depend on character for character preservation of string values. This requirement can be accommodated if values of AnyType are normalized when they are encoded according to a set of canonical encoding rules.

   ASIDE: The RXER encoding and decoding of an AnyType value might change the character string components of the value from the perspective of BER, but there will be a single, repeatable encoding for DER.

A value of AnyType will appear as the content of an element in a CRXER encoding. When the value is encoded using a set of ASN.1 canonical encoding rules other than CRXER the components of the text alternative of the value MUST be normalized, by reference to this element, as follows:

   (1) The value of the prolog component SHALL be the XMLDecl <?xml version="1.1"?> with no other leading or trailing characters.

   (2) If the element's name is unprefixed in the CRXER encoding then the prefix component SHALL be absent, otherwise the value of the prefix component SHALL be the prefix of the element's name in the CRXER encoding.

   (3) Take the character string representing the element's attributes, including namespace declarations, in the CRXER encoding. If the first attribute is a namespace declaration that undeclares the default namespace (i.e., xmlns="") then remove it. Remove any leading space characters. If the resulting character string is empty then the attributes component SHALL be absent, otherwise the value of the attributes component SHALL be the resulting character string.
       ASIDE: Note that the attributes of an element can change if an RXER encoding is re-encoded in CRXER.

   (4) If the element has no characters between the start-tag and end-tag [XML11] in the CRXER encoding then the content component SHALL be absent, otherwise the value of the content component SHALL be identical to the character string in the CRXER encoding bounded by the element's start-tag and end-tag.

   ASIDE: A consequence of invoking the CRXER encoding is that any nested element corresponding to an ASN.1 top level NamedType from a module with a TARGET-NAMESPACE encoding instruction, or indeed the element itself, will be normalized according to its ASN.1 value rather than its Infoset representation.

   ASIDE: It is only through values of AnyType that PIs and comments can appear in CRXER encodings.

If an application uses DER but has no knowledge of RXER then it will not know to normalize values of AnyType. If RXER is deployed into an environment containing such applications then AnyType values SHOULD be normalized even when encoding using non-canonical encoding rules.

4.2. The AnyURI Type

A value of the AnyURI ASN.1 type is a character string conforming to the format of a Uniform Resource Identifier (URI) [URI].

   AnyURI ::= UTF8String (CONSTRAINED BY
                 { -- conforms to the format of a URI -- })

4.3. The NCName Type

A value of the NCName ASN.1 type is a character string conforming to the NCName production of Namespaces in XML [XMLNS10].

   NCName ::= UTF8String (CONSTRAINED BY
                 { -- conforms to the NCName production of
                   -- Namespaces in XML -- })

4.4. The Name Type

A value of the Name ASN.1 type is a character string conforming to the Name production of XML version 1.0 [XML10].

   Name ::= UTF8String (CONSTRAINED BY
               { -- conforms to the Name production of XML -- })

4.5. The QName Type

A value of the QName ASN.1 type describes a qualified name [XMLNS10].
   ASIDE: In the terminology of Namespaces in XML 1.1 [XMLNS11], a QName value describes an expanded name.

RXER has special provisions for encoding values of the QName type (see Section 5.6.11). For other encoding rules, a value of QName is encoded according to the following ASN.1 type definition (with AUTOMATIC TAGS):

   QName ::= SEQUENCE {
       prefix          NCName OPTIONAL,
       namespace-name  AnyURI OPTIONAL,
       local-name      NCName
   }

The prefix component MAY hold the namespace prefix part of the qualified name. Implementations are not required to retain the actual prefix of a decoded qualified name and are not required to use a retained prefix in an RXER encoding of the qualified name. The prefix component SHALL be omitted when a QName value is encoded using a set of canonical encoding rules other than CRXER (e.g., DER or CER [X.690]).

The namespace-name component holds the namespace name associated with the qualified name, if any.

   ASIDE: A namespace name can be associated with ASN.1 types and top level NamedType instances by using the TARGET-NAMESPACE encoding instruction.

The local-name component holds the local part of the qualified name.

5. Encoding Rules

ASN.1 abstract values are uniformly regarded as analogous to the content of an element or attribute, not complete elements or attributes in their own right.

The RXER encoding of an abstract value is described as a translation into a synthetic Infoset which is then serialized as XML. This separation has been chosen for descriptive convenience and is not intended to impose any particular architecture on RXER implementations. An RXER encoder is free to encode an abstract value directly to XML provided the result is equivalent to following the two stage procedure described in this document.

An RXER decoder is also an XML processor [XML10][XML11].
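The two stage description (translate to an Infoset, then serialize) can be pictured with a deliberately simplified Python sketch. This is a toy mapping under stated assumptions, not the RXER translation defined in the following sections: it uses xml.etree.ElementTree as a stand-in for the synthetic Infoset, handles only dictionaries (standing in for a SEQUENCE value), integers and strings, and ignores namespaces and attributes. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def to_tree(name: str, value) -> ET.Element:
    """Stage 1: translate a value into an element tree.

    A dict becomes child elements named by its keys (playing the role
    of component identifiers); an int or str becomes character data.
    """
    element = ET.Element(name)
    if isinstance(value, dict):
        for identifier, component in value.items():
            element.append(to_tree(identifier, component))
    else:
        element.text = str(value)
    return element


def serialize(element: ET.Element) -> str:
    """Stage 2: serialize the tree as XML text."""
    return ET.tostring(element, encoding="unicode")


encoded = serialize(to_tree("message", {"messageType": 1}))
```

An encoder that wrote the same text directly, without building the intermediate tree, would be equally acceptable as long as the output is equivalent.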
The process of translating an abstract value into an Infoset is described as producing either:

   (1) a string of characters that either becomes part of the [normalized value] of an attribute item or becomes character items among the [children] of an enclosing element item, or

   (2) a collection of zero or more attribute items contributing to the [attributes] of an enclosing element item plus a series of zero or more character, element, processing instruction (PI) or comment items contributing to the [children] of the enclosing element item.

NamedType notation in the ASN.1 specification controls whether the translation of an abstract value is encapsulated in an element item or in an attribute item. Section 5.3 describes the general case and Section 5.2 describes special cases for the root of the encoding.

Sections 5.4 to 5.9 describe the translation of abstract values into an Infoset for each of the ASN.1 type notations.

Section 5.10 describes post-processing of namespace prefixes for CRXER encodings.

Section 5.11 specifies how the Infoset translation is serialized as XML.

This specification assumes that the COMPONENTS OF transformation specified in X.680, Clause 24.4 [X.680] has already been applied to all relevant types.

Examples of RXER encodings in the following sections use a <value> start-tag and </value> end-tag to delimit the content. These start and end tags are for illustration only and are not part of the encoding of an abstract value. In normal use, the name of the enclosing element is provided by the context of the type of the abstract value, e.g., an enclosing SEQUENCE type.

5.1.
Definitions and Common Constructs

For convenience, a CHOICE type where the ChoiceType is subject to a UNION encoding instruction will be referred to as a UNION type, and a SEQUENCE OF type where the SequenceOfType is subject to a LIST encoding instruction will be referred to as a LIST type.

Definition: A canonical namespace prefix is an NCName [XMLNS10] beginning with the letter "n" (U+006E) followed by a non-negative number string. A non-negative number string is either the digit character "0" (U+0030), or a non-zero decimal digit character (U+0031-U+0039) followed by zero, one or more of the decimal digit characters "0" to "9" (U+0030-U+0039).

Definition: A NamedType belongs to an extension if it is in a ComponentType in a ComponentTypeList in an ExtensionAdditionGroup, or in a ComponentType in an ExtensionAddition, or in an AlternativeTypeList in an ExtensionAdditionAlternativesGroup, or in an ExtensionAdditionAlternative.

It is not uncommon for extension markers to be neglected in specifications traditionally using only BER since extension markers do not alter BER encodings. Consequently, it is not immediately obvious in later versions of the specification which instances of NamedType belong to extensions of the original base specification. When using RXER with such specifications, implementors MUST either obtain the base specification and identify the extensions by comparison, or else be extremely conservative and assume that all known components are extensions when encoding and that none of the known components are extensions when decoding.

5.1.1. Qualified Reference Names

A Qualified Reference Name is a qualified name [XMLNS10] that uniquely identifies a particular type definition. Not all type definitions have a Qualified Reference Name.
A Type has a Qualified Reference Name if one of the following applies:

   (1) the Type comprises (i.e., in a DefinedType in a ReferencedType) a typereference (not a DummyReference) or an ExternalTypeReference and the ASN.1 module in which the referenced type is defined has a TARGET-NAMESPACE encoding instruction and the referenced type is not directly or indirectly an open type [X.681] and the referenced type is not directly or indirectly AnyType (Section 4.1), or

   (2) the Type comprises one of the productions in Table 1 of the specification for Abstract Syntax Notation X (ASN.X) [ASN.X].

In case (1), the Qualified Reference Name is a qualified name where the local part is the typereference and the namespace name is the one assigned by the TARGET-NAMESPACE encoding instruction in the module in which the referenced type is defined.

In case (2), the Qualified Reference Name is a qualified name with the namespace name "http://xmled.info/ns/ASN.1" and the local part as indicated in Table 1.

Note that the Qualified Reference Name is the same qualified name that would be used to reference the corresponding type in the ASN.X representation of the ASN.1 specification, or the XML Schema translation [CXSD] of the ASN.1 specification.

5.1.2. Identifiers

An identifier, as defined in ASN.1 notation (Clause 11.3 of X.680 [X.680]), is a character string that begins with a latin lowercase letter (U+0061-U+007A) and is followed by zero, one or more latin letters (U+0041-U+005A, U+0061-U+007A), decimal digits (U+0030-U+0039), and hyphens (U+002D). A hyphen is not permitted to be the last character and a hyphen is not permitted to be followed by another hyphen. The case of letters in an identifier is always significant.

ASN.1 identifiers are used for the [local name] of attribute and element items, and may also appear in the character data content of elements or attributes. RXER encoding instructions can be used to substitute an NCName [XMLNS10] for an identifier.

5.2.
Encapsulating RXER Encodings

The RXER encoding of some abstract value generates only the content of an element (which may include attributes and child elements) or the value of an attribute. It is the responsibility of the specification invoking RXER to determine the context of the enclosing element or attribute for this content (i.e., the [local name] and [namespace name] for the corresponding information item) by either:

   (1) for an attribute, explicitly stating the [local name] and [namespace name] to be used, or

   (2) for an element, explicitly stating the [local name] and [namespace name] to be used, or

   (3) for an element, invoking a Standalone RXER Encoding, or

   (4) for an element, invoking a Standalone CRXER Encoding, or

   (5) nominating a top level NamedType [RXEREI] (which will correspond to either an element or an attribute by its definition).

   ASIDE: The ASN.1 basic notation does not have a concept analogous to a global element or attribute definition. That is, the basic notation does not allow a NamedType to appear on its own, outside of an enclosing combining type. However, an ASN.1 specification may use an RXER encoding control section [RXEREI] to define global elements and attributes using the NamedType notation.

A CRXER encoding MAY be requested in case (1), (2) or (5). Case (4) is always a CRXER encoding.

Case (1) SHALL NOT be used if the type of the abstract value would not be acceptable as the Type in a NamedType subject to an ATTRIBUTE encoding instruction.

The element in case (2), (3), (4) or (5) is not necessarily the document element [XML10][XML11] of an XML document.
In case (5), the abstract value is translated as a value of the NamedType, as specified in Section 5.3, and then serialized, as specified in Section 5.11.

   ASIDE: Whilst ordinarily one speaks of the encoding of an abstract value of an ASN.1 type, Section 5.3 introduces the notion of the value of an ASN.1 NamedType. This allows case (5) to be more conveniently described as the RXER encoding of a value of the nominated top level NamedType.

The remainder of this section is intended to make cases (1), (2), (3) and (4) consistent with the effects of case (5).

In case (3) or (4), the [local name] SHALL be "value", the [namespace name] SHALL have no value, and the [prefix] SHALL have no value.

In case (2), (3) or (4), if the type of the abstract value is directly or indirectly AnyType then the [in-scope namespaces] and [namespace attributes] of the element item are constructed as specified in Section 5.9, otherwise the [in-scope namespaces] and [namespace attributes] of the element item are constructed as specified in Section 5.3.1.1.

In case (1), the [prefix] of the attribute item is determined as specified in Section 5.3.2.1.

In case (2), if the [namespace name] of the element item has no value then the [prefix] of the element item has no value, otherwise if the type of the abstract value is directly or indirectly AnyType then the [prefix] is determined by the AnyType value as specified in Section 5.9, otherwise the [prefix] of the element item is determined as specified in Section 5.3.1.2.

In case (1), the [normalized value] of the attribute item is generated by the normal application of the rules in Section 5.6 to the abstract value being encoded.

In case (2), (3) or (4), the [attributes] and [children] of the element item (i.e., its content), are generated by the normal application of the rules in Sections 5.4 to 5.9 to the abstract value being encoded.
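As an illustration of case (3), the following Python sketch wraps an already translated character data value in the element required for a Standalone RXER Encoding: [local name] "value", no [namespace name] and no [prefix]. It relies on the standard xml.etree.ElementTree module. The function name standalone_rxer and the restriction to character data content are assumptions made for brevity; the sketch is not a complete encoder and is not part of this specification.

```python
import xml.etree.ElementTree as ET


def standalone_rxer(character_data: str) -> str:
    """Wrap a character data translation in a <value> element.

    The element has the local name "value" and is not namespace
    qualified, as required for cases (3) and (4) of Section 5.2.
    """
    element = ET.Element("value")   # unqualified name, hence no prefix
    element.text = character_data   # character items among the [children]
    return ET.tostring(element, encoding="unicode")


print(standalone_rxer("true"))
```

With the BOOLEAN value TRUE translated to the character string "true", the sketch prints <value>true</value>. Characters such as "<" and "&" in the content are escaped by the serializer.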
In case (2) or (3) for a non-canonical RXER encoding, if the type of the abstract value is not directly or indirectly AnyType then PI and comment items MAY be added to the [children] of the element item (before or after any other items). The element item becomes the [parent] for each PI and comment item. These particular PI and comment items in a received RXER encoding MAY be discarded by an application.

   ASIDE: There is no provision for representing comments and PIs in ASN.1 abstract values of types other than AnyType. These items will be lost if the abstract value is re-encoded using a different set of encoding rules.

In case (2) or (3), if the ASN.1 type of the value being encoded has a Qualified Reference Name (see Section 5.1.1) then the [attributes] of the element item MAY contain an attribute item with the [local name] "type" and the [namespace name] "http://www.w3.org/2001/XMLSchema-instance" (i.e., an xsi:type attribute). The [prefix] of this attribute item is determined as specified in Section 5.3.2.1. The [normalized value] of this attribute item is the Qualified Reference Name with the namespace prefix determined as specified in Section 5.6.11.1. The element item is the [owner element] for the attribute item.

   ASIDE: The xsi:type attribute indicates to an XML Schema validator which type definition in the XML Schema translation of the ASN.1 specification [CXSD] it should use for validating the RXER encoding.

In case (2) or (3), attribute items with the [local name] "schemaLocation" or "noNamespaceSchemaLocation" and the [namespace name] "http://www.w3.org/2001/XMLSchema-instance" [XSD1] MAY be added to the [attributes] of the element item. The [prefix] for each of these attribute items is determined as specified in Section 5.3.2.1.
The [normalized value] of these attribute items MUST reference a compatible XML Schema translation of the ASN.1 specification [CXSD]. The element item is the [owner element] for the attribute items.

5.3. Component Encodings

Strictly speaking, ASN.1 encoding rules are used to encode abstract values, each of which has a specific ASN.1 type. There is no conceptual basis in ASN.1 for talking about the value of a NamedType, or its encoding. However, elements are the fundamental discrete structures of an XML document. The content of an element or attribute, which is analogous to an abstract value, cannot exist on its own in XML. Since elements and attributes in an RXER encoding are defined by NamedType notation the notion of a NamedType having a value that can be encoded is useful for descriptive purposes (particularly for describing the RXER encoding of values of the ASN.1 combining types). Consequently, the following terminology is introduced.

An abstract value of the Type in a NamedType is also a value of that NamedType. The RXER encoding (or translation) of the value of a NamedType is the RXER encoding (or translation) of the abstract value of the Type encapsulated according to the definition of that NamedType. The remainder of this section specifies the form of this encapsulation.

5.3.1. Element Components

A value of a NamedType that is not subject to an ATTRIBUTE, ATTRIBUTE-REF or CONTENT encoding instruction is translated as an element item, either as a child element item added to the [children] of the enclosing element item or as the document element item added to the [children] and [document element] of the document item. If the element item is a child element item then the [parent] is the enclosing element item, otherwise the [parent] is the document item.

If the NamedType is a top level NamedType from a module with a TARGET-NAMESPACE encoding instruction then the element item MUST be self-contained (see Section 4.1.1).
   ASIDE: A top level NamedType from a module with a TARGET-NAMESPACE encoding instruction can only be referenced from within an ASN.1 specification by using an ELEMENT-REF encoding instruction prefixing an AnyType. In such cases the AnyType value can optionally be regarded as the value of the type of the top level NamedType (see Section 5.9). For consistency, the requirement for self-containment is still assumed to apply.

   A top level NamedType might also be referenced through means outside the scope of this document. Whether self-containment should apply in these cases is an open question. For the sake of simplicity, all element items corresponding to a top level NamedType from a module with a TARGET-NAMESPACE encoding instruction are required to be self-contained.

   The proviso on the module having a TARGET-NAMESPACE encoding instruction is because an element corresponding to a top level NamedType can be unambiguously recognized even if the type of the surrounding context is unknown. A top level NamedType from a module that does not have a TARGET-NAMESPACE encoding instruction could be confused with a lower level NamedType.

If the NamedType belongs to an extension (see Section 5.1) then the element item MUST be self-contained.

For a CRXER encoding, if the NamedType is a NamedType in a ComponentType in a ComponentTypeList in a RootComponentTypeList or a NamedType in an AlternativeTypeList in a RootAlternativeTypeList and the NamedType appears in a module that does not have an EXTENSIONS-MARKED encoding instruction then the element item MUST be self-contained.

   ASIDE: If a module does not have an EXTENSIONS-MARKED encoding instruction then extension markers, or the lack thereof, cannot be relied upon. In such cases CRXER assumes every NamedType in a CHOICE, SEQUENCE or SET type is an extension.
The element item may also be required to be self-contained as specified in Sections 5.3.3, 5.8 and 5.9.

The [local name] of the element item is the value of the local-name component of the effective name of the NamedType.

   ASIDE: If there are no NAME, ATTRIBUTE-REF, ELEMENT-REF or REF-AS-ELEMENT encoding instructions then the value of the local-name component of the effective name of a NamedType is the same as the identifier of the NamedType.

If the namespace-name component of the effective name is absent then the [namespace name] of the element item has no value (i.e., the element's name is not namespace qualified), otherwise the [namespace name] is the value of the namespace-name component of the effective name.

If the type of the NamedType is directly or indirectly AnyType then the [in-scope namespaces] and [namespace attributes] of the element item are constructed as specified in Section 5.9, otherwise the [in-scope namespaces] and [namespace attributes] of the element item are constructed as specified in Section 5.3.1.1.

If the [namespace name] of the element item has no value then the [prefix] of the element item has no value, otherwise if the type of the NamedType is not directly or indirectly AnyType then the [prefix] of the element item is determined as specified in Section 5.3.1.2, otherwise the [prefix] is determined by the AnyType value as specified in Section 5.9.

The element item becomes the enclosing element item for the translation of the value of the Type in the NamedType.

For a non-canonical RXER encoding, if the type of the NamedType is not directly or indirectly AnyType then PI and comment items MAY be added to the [children] of the element item (before or after any other items). The element item becomes the [parent] for each PI and comment item. These particular PI and comment items in a received RXER encoding MAY be discarded by an application.
   ASIDE: There is no provision for representing comments and PIs in ASN.1 abstract values of types other than AnyType.

For a non-canonical RXER encoding, an attribute item with the [local name] "type" and the [namespace name] "http://www.w3.org/2001/XMLSchema-instance" (i.e., xsi:type) SHOULD be added to the [attributes] of the element item if the corresponding NamedType is subject to a TYPE-AS-VERSION encoding instruction and MAY be added to the [attributes] of the element item if the Type of the corresponding NamedType has a Qualified Reference Name (see Section 5.1.1). The [prefix] of this attribute item is determined as specified in Section 5.3.2.1. The [normalized value] of this attribute item is the Qualified Reference Name with the namespace prefix determined as specified in Section 5.6.11.1.

For a non-canonical RXER encoding, attribute items with the [local name] "schemaLocation" or "noNamespaceSchemaLocation" and the [namespace name] "http://www.w3.org/2001/XMLSchema-instance" [XSD1] MAY be added to the [attributes] of the element item. The [prefix] for each of these attribute items is determined as specified in Section 5.3.2.1. The [normalized value] of these attribute items MUST reference a compatible XML Schema translation of the ASN.1 specification [CXSD]. The element item is the [owner element] for the attribute items.

5.3.1.1. Namespace Properties for Elements

This section describes how the [in-scope namespaces] and [namespace attributes] of an element item are constructed when the content of the element item is not described by a value of AnyType (otherwise see Section 5.9).

The [in-scope namespaces] property of the element item initially contains only the mandatory namespace item for the "xml" prefix [INFOSET].
For a CRXER encoding, if the element item is not the [document element] of the document item and the [in-scope namespaces] property of the element item's [parent] contains a namespace item for the default namespace then a namespace declaration attribute item that undeclares the default namespace (see Section 3) SHALL be added to the element item's [namespace attributes].

Definition: With respect to an element item, the default namespace is restricted if:

   (1) the [namespace name] of the element item has no value (i.e., the element's name is not namespace qualified), or

   (2) the element item is the enclosing element item for a value of the UNION type where the member attribute will be required (see Section 5.6.14), or

   (3) the element item is the enclosing element item for a value of the QName type where the namespace-name component is absent (see Section 5.6.11). This includes the case where the translation of the QName value is contained in the [normalized value] of an attribute item in the [attributes] of the element item.

For a non-canonical RXER encoding, if the element item is not the [document element] of the document item and the [in-scope namespaces] property of the element item's [parent] contains a namespace item for the default namespace then either:

   (1) that item is copied to the [in-scope namespaces] of the element item, or

   (2) a namespace declaration attribute item that declares the default namespace is added to the element item's [namespace attributes] (the namespace name is the encoder's choice) and an equivalent namespace item is added to the [in-scope namespaces] of the element item, or

   (3) a namespace declaration attribute item that undeclares the default namespace is added to the element item's [namespace attributes].

Options (1) and (2) SHALL NOT be used if the default namespace is restricted with respect to the element item.
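The interaction between the definition of a restricted default namespace and options (1) to (3) can be summarized in a short Python sketch. All names used here (DefaultNamespaceOption, default_namespace_restricted, permitted_options and their parameters) are assumptions introduced for illustration; the three boolean-like inputs stand for conditions (1) to (3) of the definition and would be computed elsewhere by a real encoder.

```python
from enum import Enum


class DefaultNamespaceOption(Enum):
    INHERIT = 1    # option (1): copy the parent's default namespace item
    REDECLARE = 2  # option (2): declare a default namespace of the encoder's choice
    UNDECLARE = 3  # option (3): undeclare the default namespace


def default_namespace_restricted(element_namespace_name,
                                 union_member_attribute_required,
                                 qname_without_namespace_name):
    """Apply conditions (1), (2) and (3) of the definition."""
    return (element_namespace_name is None
            or union_member_attribute_required
            or qname_without_namespace_name)


def permitted_options(restricted):
    """Options open to a non-canonical encoder for a non-root element
    item whose [parent] has a default namespace in scope."""
    if restricted:
        # Options (1) and (2) SHALL NOT be used in this case.
        return {DefaultNamespaceOption.UNDECLARE}
    return set(DefaultNamespaceOption)
```

For an element item whose name is not namespace qualified, default_namespace_restricted(None, False, False) is true, so only the undeclaring option remains.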
For a CRXER encoding, if the element item is not the [document element] of the document item and the element item is not required to be self-contained then all the namespace items in the [in-scope namespaces] of the [parent], excluding the namespace item for the "xml" prefix and any namespace item for the default namespace, are copied to the [in-scope namespaces] of the element item.

For a non-canonical RXER encoding, if the element item is not the [document element] of the document item and the element item is not required to be self-contained then any subset (including none or all) of the namespace items in the [in-scope namespaces] of the [parent], excluding the namespace item for the "xml" prefix and any namespace item for the default namespace, is copied to the [in-scope namespaces] of the element item.

   ASIDE: The descriptive approach used by this document only allows a namespace prefix to be used by a new namespace item if it is not currently used by another namespace item in the [in-scope namespaces]. By not inheriting a namespace item the prefix of that namespace is again available for reuse without fear of breaking an existing dependency on the prefix.

Element items required to be self-contained inherit none of the namespace items in the [in-scope namespaces] of the [parent].

Definition: A namespace prefix is unused if it does not match the [prefix] of any namespace item in the [in-scope namespaces] of the element item.

For a non-canonical RXER encoding, if the type of the NamedType is not directly or indirectly AnyType then additional namespace declaration attribute items for currently unused namespace prefixes MAY be added to the [namespace attributes] of the element item. An equivalent namespace item MUST be added to the [in-scope namespaces] of the element item for each such namespace declaration attribute item.
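The notion of an unused namespace prefix lends itself to a small helper. The Python sketch below models the [in-scope namespaces] of an element item as a dictionary from prefix to namespace name; that model, the function names and the "ns1", "ns2", ... naming scheme for generated prefixes are all assumptions of this illustration rather than requirements of RXER.

```python
from itertools import count


def is_unused(prefix, in_scope):
    """A prefix is unused if no namespace item in scope carries it."""
    return prefix not in in_scope


def declare_unused_prefix(in_scope, namespace_name, stem="ns"):
    """Pick an unused prefix, add the equivalent namespace item to
    in_scope, and return the prefix.

    The caller is expected to add the matching namespace declaration
    attribute item to the [namespace attributes] of the element item.
    """
    for n in count(1):
        prefix = f"{stem}{n}"
        if is_unused(prefix, in_scope):
            in_scope[prefix] = namespace_name
            return prefix
```

Because the namespace item is recorded as soon as the prefix is chosen, a second call for a different namespace name cannot return the same prefix.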
For a non-canonical RXER encoding, if the type of the NamedType is not directly or indirectly AnyType and the [in-scope namespaces] of the element item does not contain a namespace item for the default namespace and the default namespace is not restricted with respect to the element item then a namespace declaration attribute item for the default namespace MAY be added to the [namespace attributes] of the element item, in which case an equivalent namespace item MUST be added to the [in-scope namespaces] of the element item.

5.3.1.2. Namespace Prefixes for Element Names

This section describes how the [prefix] of an element item is determined when the element item has a value for its [namespace name] and the content of the element item is not described by a value of AnyType (otherwise see Section 5.9).

For a CRXER encoding, if the [namespace name] of the element item has a value then if there is a namespace item in the [in-scope namespaces] with the same [namespace name] then the [prefix] of the element item SHALL be the same as the [prefix] of that namespace item, otherwise the [prefix] of the element item is any unused non-canonical namespace prefix.

   ASIDE: These prefixes will be rewritten to canonical namespace prefixes during the final step in producing the Infoset translation (see Section 5.10). Canonical namespace prefixes are not used here in the first instance because canonicalization depends on knowing the final set of [namespace attributes] produced by encoding the abstract value of the type of the NamedType. If an implementation looks ahead to determine this final set prior to translating the abstract value then it can assign the appropriate canonical namespace prefix in this step and skip the rewriting step.
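The CRXER rule for element prefixes can be sketched in Python as follows. The [in-scope namespaces] are modeled as a dictionary from prefix to namespace name, and temporary prefixes of the form "t1", "t2", ... stand in for "any unused non-canonical namespace prefix"; both choices, and the function names, are assumptions of this illustration. The test for a canonical namespace prefix follows the definition in Section 5.1. The later rewriting to canonical prefixes (Section 5.10) is not shown.

```python
import re
from itertools import count

# Canonical namespace prefix: "n" followed by "0", or by a non-zero
# digit and then zero or more decimal digits (no leading zeros).
_CANONICAL = re.compile(r"n(?:0|[1-9][0-9]*)")


def is_canonical_prefix(prefix):
    return _CANONICAL.fullmatch(prefix) is not None


def crxer_element_prefix(namespace_name, in_scope):
    """Return the [prefix] for a namespace qualified element item."""
    for prefix, name in in_scope.items():
        if name == namespace_name:
            return prefix            # reuse the in-scope namespace item
    for n in count(1):
        candidate = f"t{n}"          # unused and not of canonical form
        if candidate not in in_scope and not is_canonical_prefix(candidate):
            return candidate
```

When a fresh prefix is returned, the caller still has to add the corresponding namespace declaration attribute item and namespace item to the element item.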
For a non-canonical RXER encoding, if the [namespace name] has a value then if there is a namespace item in the [in-scope namespaces] with the same [namespace name] then the [prefix] of the element item is either the same as the [prefix] of that namespace item (which in the case of a namespace item for the default namespace has no value) or is any unused namespace prefix, otherwise the [prefix] of the element item is any unused namespace prefix.

If the [prefix] of the element item is an unused namespace prefix then a namespace declaration attribute item associating the namespace prefix with the namespace name MUST be added to the [namespace attributes] of the element item, and a corresponding namespace item MUST be added to the [in-scope namespaces] of the element item.

   ASIDE: The [local name] of the namespace declaration attribute item is the same as the [prefix] of the element item, the [namespace name] of the attribute item is "http://www.w3.org/2000/xmlns/" and the [normalized value] of the attribute item is the same as the [namespace name] of the element item. The namespace item has the same [prefix] and [namespace name] as the element item.

5.3.2. Attribute Components

A value of a NamedType subject to an ATTRIBUTE or ATTRIBUTE-REF encoding instruction is translated as an attribute item added to the [attributes] of the enclosing element item (which becomes the [owner element] of the attribute item).

The [local name] of the attribute item is the value of the local-name component of the effective name [RXEREI] of the NamedType.

If the namespace-name component of the effective name is absent then the [namespace name] of the attribute item has no value, otherwise the [namespace name] is the value of the namespace-name component of the effective name.

If the [namespace name] has a value then the [prefix] of the attribute item is determined as specified in Section 5.3.2.1, otherwise the [prefix] of the attribute item has no value.
The [normalized value] of the attribute item is the translation of the value of the Type in the NamedType.

   ASIDE: An RXER decoder might have no knowledge of the NamedType if the NamedType belongs to an extension. Under the ASN.1 model of extensibility, the decoder must be prepared to re-encode (in RXER) any extension it receives, including those represented as attribute items. The [normalized value] of an attribute item may contain qualified names that depend on the [in-scope namespaces] of the [owner element] for interpretation. Therefore semantically faithful re-encoding of the extension may require reproduction of at least some part of the [in-scope namespaces] of the [owner element]. The simplest approach is to retain all the namespace items from the [in-scope namespaces] of the [owner element] and output them as namespace declaration attribute items in the [namespace attributes] of the [owner element] when re-encoding the extension. To avoid a proliferation of unnecessary namespace declarations, an application could examine the [normalized value] of an attribute item belonging to an unknown extension looking for character strings that resemble qualified names and retaining only those namespace items from the [in-scope namespaces] of the [owner element] that define the namespace prefixes of the putative qualified names.

   The concerns about the proliferation of namespace declarations raised in Section 4.1.1 do not apply here since the type of a NamedType subject to an ATTRIBUTE or ATTRIBUTE-REF encoding instruction cannot be AnyType.

For completeness, the [specified] property is set to true and the [attribute type] and [references] properties have no value.

5.3.2.1. Namespace Prefixes for Attribute Names

This section applies when an attribute item with a value for its [namespace name] is added to the [attributes] of an element item.
For a CRXER encoding, if there is a namespace item, excluding a namespace item for the default namespace, with the same [namespace name] in the [in-scope namespaces] of the [owner element] then the [prefix] of the attribute item SHALL be the same as the [prefix] of that namespace item, otherwise the [prefix] of the attribute item is any unused non-canonical namespace prefix.

For a non-canonical RXER encoding, if there is a namespace item, excluding a namespace item for the default namespace, with the same [namespace name] in the [in-scope namespaces] of the [owner element] then the [prefix] of the attribute item is either the same as the [prefix] of that namespace item or is any unused namespace prefix, otherwise the [prefix] of the attribute item is any unused namespace prefix.

If the [prefix] of the attribute item is an unused namespace prefix then a namespace declaration attribute item associating the namespace prefix with the namespace name MUST be added to the [namespace attributes] of the [owner element], and a corresponding namespace item MUST be added to the [in-scope namespaces] of the [owner element].

5.3.3. Unencapsulated Components

A value of a NamedType subject to a CONTENT encoding instruction is translated as the value of the Type in the NamedType, i.e., without encapsulation in an element item or attribute item. Consequently, the enclosing element item for the translation of the value of the NamedType is also the enclosing element item for the translation of the value of the Type in the NamedType.

If the NamedType belongs to an extension then the element items in the [children] of the enclosing element resulting from the translation of the value of the Type MUST be self-contained.
   ASIDE: A value of a NamedType subject to a CONTENT encoding instruction doesn't produce an element item so a requirement for self-containment is instead inherited by the immediate child element items that the translation of the value produces.

For a CRXER encoding, if the NamedType is a NamedType in a ComponentType in a ComponentTypeList in a RootComponentTypeList or a NamedType in an AlternativeTypeList in a RootAlternativeTypeList and the NamedType appears in a module that does not have an EXTENSIONS-MARKED encoding instruction then the element items in the [children] of the enclosing element resulting from the translation of the value of the Type MUST be self-contained.

5.3.4. Examples

Consider this type definition:

   CHOICE {
       one    [0] BOOLEAN,
       two    [1] [RXER:ATTRIBUTE] INTEGER,
       three  [2] [RXER:NAME AS "THREE"] OBJECT IDENTIFIER,
       four   [3] [RXER:ATTRIBUTE-REF {
                      namespace-name "http://www.example.com",
                      local-name     "foo" }] UTF8String,
       five   [4] [RXER:ELEMENT-REF {
                      namespace-name "http://www.example.com",
                      local-name     "bar" }] AnyType,
       six    [5] [RXER:CONTENT] SEQUENCE {
           seven  [0] [RXER:ATTRIBUTE] INTEGER,
           eight  [1] INTEGER
       }
   }

The content of each of the following <value> elements is the RXER encoding of a value of the above type:

   <value>
    <one>true</one>
   </value>

   <value two="100"/>

   <value>
    <THREE>2.5.4.3</THREE>
   </value>

   <value xmlns:ex="http://www.example.com" ex:foo="a string"/>

   <value>
    <ex:bar xmlns:ex="http://www.example.com">another string</ex:bar>
   </value>

   <value seven="200">
    <eight>300</eight>
   </value>

5.4. Type Referencing Notations

A value of a type with a defined type name is translated according to the type definition on the right hand side of the type assignment for the type name.
A value of a type denoted by the use of a parameterized type with actual parameters is translated according to the parameterized type with the DummyReferences [X.683] substituted with the actual parameters.

A value of a constrained type is translated as a value of the type without the constraint. See X.680 [X.680] and X.682 [X.682] for the details of ASN.1 constraint notation.

A prefixed type [X.680-1] associates an encoding instruction with a type. A value of a prefixed type is translated as a value of the type without the prefix.

   ASIDE: This does not mean that RXER encoding instructions are ignored. It is simply easier to describe their effects in relation to specific built-in types, rather than as the translation of a value of a prefixed type.

A tagged type is a special case of a prefixed type. A value of a tagged type is translated as a value of the type without the tag. ASN.1 tags do not appear in the XML encodings defined by this document.

A value of a fixed type denoted by an ObjectClassFieldType is translated according to that fixed type (see Section 5.8 for the case of an ObjectClassFieldType denoting an open type).

A value of a selection type is translated according to the type referenced by the selection type. Note that component encoding instructions are not inherited by the type referenced by a selection type [RXEREI].

A value of a type described by TypeFromObject notation [X.681] is translated according to the denoted type.

A value of a type described by ValueSetFromObjects notation [X.681] is translated according to the governing type.

5.5. TypeWithConstraint and SEQUENCE OF Type

For the purposes of this document, a TypeWithConstraint is treated as if it were the parent type [X.680] (either a SEQUENCE OF or SET OF type).
For example,

   SEQUENCE SIZE(1..MAX) OF SomeType

is treated like

   SEQUENCE OF SomeType

Additionally, a "SEQUENCE OF Type" (including the case where it is the parent type for a TypeWithConstraint) is treated as if it were a "SEQUENCE OF NamedType", where the identifier of the NamedType is assumed to be "item". Similarly, a "SET OF Type" (including the case where it is the parent type for a TypeWithConstraint) is treated as if it were a "SET OF NamedType", where the identifier of the NamedType is assumed to be "item".

For example,

   SEQUENCE SIZE(1..MAX) OF SomeType

is ultimately treated like

   SEQUENCE OF item SomeType

5.6. Character Data Translations

For the majority of ASN.1 built-in types, encodings of values of those types never have element content. The encoding of a value of an ASN.1 combining type (except a UNION or LIST type) typically has element content.

For those types that do not produce element content, the translation of an abstract value is described as a character string of ISO 10646 characters [UCS]. This character data translation will either be destined to become part of the [normalized value] of an attribute item or a series of character items in the [children] of an element item (which becomes the [parent] for the character items). The case that applies is determined in accordance with Sections 5.2 and 5.3.

For a non-canonical RXER encoding, if the type of the abstract value is not directly or indirectly a restricted character string type, the NULL type or a UNION type then leading and/or trailing white space characters MAY be added to the character data translation.

   ASIDE: White space characters are significant in the encoding of a value of a restricted character string type and a restricted character string type can be a member type of a UNION type. The encoding of a NULL value produces no character data.
Optional white space characters are not permitted in a CRXER encoding.

For a non-canonical RXER encoding, if the type of the abstract value is directly or indirectly the AnyURI, NCName or Name type then leading and trailing white space characters MAY be added to the character data translation.

   ASIDE: These types are indirectly a restricted character string type (UTF8String), however their definitions exclude white space characters, so any white space characters appearing in an encoding are not part of the abstract value and can be safely ignored. This exception does not apply to other subtypes of a restricted character string type that happen to exclude white space characters.

5.6.1. Restricted Character String Types

The character data translation of a value of a restricted character string type is the sequence of characters in the string.

Depending on the ASN.1 string type, and an application's internal representation of that string type, a character may need to be translated to or from the equivalent ISO 10646 character code [UCS]. The NumericString, PrintableString, IA5String, VisibleString (ISO646String), BMPString, UniversalString and UTF8String character encodings use the same character codes as ISO 10646. For the remaining string types (GeneralString, GraphicString, TeletexString, T61String and VideotexString) see X.680 [X.680].

The NUL character (U+0000) is not a legal character for XML. It is omitted from the character data translation of a string value.

Certain other control characters are legal for XML version 1.1, but not for version 1.0. If any string value contains these characters then the RXER encoding must use XML version 1.1 (see Section 5.11).

All white space characters in the RXER encoding of a value of a restricted character string type (excluding the AnyURI, NCName and Name subtypes) are significant, i.e., part of the abstract value.
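The NUL and control character rules above can be illustrated with a minimal sketch. The function name translate_restricted_string is hypothetical, and the set of characters treated as XML 1.1-only (U+0001 to U+001F other than tab, line feed and carriage return) is this sketch's reading of the XML specifications rather than something stated in this document. Escaping of markup characters is a separate serialization step and is not shown.

```python
from typing import Tuple


def translate_restricted_string(value: str) -> Tuple[str, str]:
    """Return (character data, minimum XML version) for a string value.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    # NUL (U+0000) is not a legal XML character, so it is omitted.
    chars = value.replace("\x00", "")
    # Assumption: C0 controls other than tab, LF and CR need XML 1.1.
    needs_xml11 = any(ord(c) < 0x20 and c not in "\t\n\r" for c in chars)
    return chars, "1.1" if needs_xml11 else "1.0"
```

White space is passed through untouched because, for these types, it is part of the abstract value.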
Examples

The content of each of the following <value> elements is the RXER encoding of an IA5String value:

   <value> Don't run with scissors! </value>

   <value>Markup (e.g., &lt;value&gt;) has to be escaped.</value>

   <value>Markup (e.g., <![CDATA[<value>]]>) has to be escaped. </value>

5.6.2. BIT STRING

The character data translation of a value of the BIT STRING type is either a binary digit string, a hexadecimal digit string or a list of bit names.

A binary digit string is a sequence of zero, one or more of the binary digit characters "0" and "1" (i.e., U+0030 and U+0031). Each bit in the BIT STRING value is encoded as a binary digit in order from the first bit to the last bit.

For a non-canonical RXER encoding, if the BIT STRING type has a NamedBitList then trailing zero bits MAY be omitted from a binary digit string.

A hexadecimal digit string is permitted if and only if the number of bits in the BIT STRING value is a multiple of eight and the BIT STRING type is not directly or indirectly the component type of a LIST type and the character data translation is destined for the [children] of an element item.

A hexadecimal digit string is a sequence of zero, one or more pairs of the hexadecimal digit characters "0"-"9", "A"-"F" and "a"-"f" (i.e., U+0030-U+0039, U+0041-U+0046 and U+0061-U+0066). Each group of eight bits in the BIT STRING value is encoded as a pair of hexadecimal digits where the first bit is the most significant. An odd number of hexadecimal digits is not permitted. The characters "a"-"f" (i.e., U+0061-U+0066) SHALL NOT be used in the CRXER encoding of a BIT STRING value. If a hexadecimal digit string is used then the enclosing element's [attributes] MUST contain an attribute item with the [local name] "format", no value for the [namespace name], and the [normalized value] "hex" (i.e., format="hex").
   ASIDE: The hexadecimal digit string is intended to conform to the lexical representation of the XML Schema [XSD2] hexBinary datatype.

For a non-canonical RXER encoding, if the preconditions for using a hexadecimal digit string are satisfied then a hexadecimal digit string MAY be used.

A list of bit names is permitted if and only if the BIT STRING type has a NamedBitList and each "1" bit in the BIT STRING value has a corresponding identifier in the NamedBitList.

   ASIDE: ASN.1 does not require that an identifier be assigned for every bit.

A list of bit names is a sequence of names for the "1" bits in the BIT STRING value, in any order, each separated from the next by at least one white space character. If the BitStringType is not subject to a VALUES encoding instruction then each "1" bit in the BIT STRING value is represented by its corresponding identifier from the NamedBitList. If the BitStringType is subject to a VALUES encoding instruction then each "1" bit in the BIT STRING value is represented by the replacement name [RXEREI] for its corresponding identifier from the NamedBitList.

For a CRXER encoding, if the BIT STRING type has a NamedBitList then a binary digit string MUST be used and trailing zero bits MUST be omitted from the binary digit string, otherwise if the number of bits in the BIT STRING value is greater than zero and the preconditions for using a hexadecimal digit string are satisfied then a hexadecimal digit string MUST be used, otherwise a binary digit string MUST be used.
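The CRXER selection rule for BIT STRING can be sketched as follows. This is only an illustration under stated assumptions: the function name crxer_bit_string and the flags in_list and in_element (standing for the "component type of a LIST type" and "destined for the [children] of an element item" preconditions) are hypothetical, and the caller is assumed to add the format="hex" attribute item whenever the second result is true.

```python
from typing import List, Tuple


def crxer_bit_string(bits: List[int], has_named_bits: bool,
                     in_list: bool = False,
                     in_element: bool = True) -> Tuple[str, bool]:
    """Return (character data, uses_hex) for a CRXER BIT STRING encoding.

    Hypothetical helper; the parameter names are illustrative only.
    """
    binary = "".join("1" if b else "0" for b in bits)
    if has_named_bits:
        # NamedBitList present: binary digits, trailing zero bits omitted.
        return binary.rstrip("0"), False
    hex_allowed = len(bits) % 8 == 0 and not in_list and in_element
    if bits and hex_allowed:
        # Each group of eight bits becomes one octet, first bit most
        # significant; CRXER forbids the lowercase digits "a"-"f".
        octets = bytes(int(binary[i:i + 8], 2)
                       for i in range(0, len(binary), 8))
        return octets.hex().upper(), True
    return binary, False
```

For the eight-bit value 00101001, the sketch yields the binary digit string when the type has a NamedBitList and the hexadecimal digit string "29" when it does not.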
Examples

Consider this type definition:

   BIT STRING { black(0), red(1), orange(2), yellow(3),
       green(4), blue(5), indigo(6), violet(7) }

The content of each of the following <value> elements is an RXER encoding of the same abstract value:

   <value> green violet orange</value>

   <value> 001<!--Orange-->01001 </value>

   <value format="hex"> 29 </value>

   <value>00101001</value>

The final case contains the CRXER encoding of the abstract value.

5.6.3. BOOLEAN

For a non-canonical RXER encoding, the character data translation of the BOOLEAN value TRUE is the string "true" or "1", at the encoder's option. For a CRXER encoding, the character data translation of the BOOLEAN value TRUE is the string "true".

For a non-canonical RXER encoding, the character data translation of the BOOLEAN value FALSE is the string "false" or "0", at the encoder's option. For a CRXER encoding, the character data translation of the BOOLEAN value FALSE is the string "false".

   ASIDE: The RXER encoding of BOOLEAN values is intended to conform to the lexical representation of the XML Schema [XSD2] boolean datatype.

Examples

The content of each of the following <value> elements is the RXER encoding of a BOOLEAN value:

   <value>1</value>

   <value> false </value>

   <value> fal<!-- a pesky comment -->se </value>

5.6.4. ENUMERATED

The character data translation of a value of an ENUMERATED type where the EnumeratedType is not subject to a VALUES encoding instruction is the identifier corresponding to the actual value.
Examples

Consider this type definition:

   ENUMERATED { sunday, monday, tuesday,
       wednesday, thursday, friday, saturday }

The content of both of the following <value> elements is the RXER encoding of a value of the above type:

   <value>monday</value>

   <value> thursday </value>

The character data translation of a value of an ENUMERATED type where the EnumeratedType is subject to a VALUES encoding instruction is the replacement name [RXEREI] for the identifier corresponding to the actual value.

Examples

Consider this type definition:

   [RXER:VALUES ALL CAPITALIZED sunday AS "SUNDAY",
       saturday AS "SATURDAY"]
       ENUMERATED { sunday, monday, tuesday,
           wednesday, thursday, friday, saturday }

The content of each of the following <value> elements is the RXER encoding of a value of the above type:

   <value>SUNDAY</value>

   <value> Monday </value>

   <value> Tuesday </value>

5.6.5. GeneralizedTime

The character data translation of a value of the GeneralizedTime type is a date, the letter "T", a time of day, optional fractional seconds and an optional time zone.

The date is two decimal digits representing the century, followed by two decimal digits representing the year, "-" (U+002D), two decimal digits representing the month, "-" (U+002D), and two decimal digits representing the day.

The time of day is two decimal digits representing the hour, followed by ":" (U+003A), two decimal digits representing the minutes, ":" (U+003A), and two decimal digits representing the seconds. Note that the hours value 24 is disallowed [X.680].

A GeneralizedTime value with fractional hours or minutes is first converted to the equivalent time with whole minutes and seconds and, if necessary, fractional seconds.

The minutes are encoded as "00" if the GeneralizedTime value omits minutes. The seconds are encoded as "00" if the GeneralizedTime value omits seconds.

The fractional seconds is a period "." (U+002E) followed by zero, one or more decimal digits (U+0030-U+0039). For a CRXER encoding, trailing zero digits (U+0030) in the fractional seconds SHALL be omitted and the period SHALL be omitted if there are no following digits.

The time zone, if present, is either the letter "Z" (U+005A) to indicate Coordinated Universal Time, a "+" (U+002B) followed by a time zone differential, or a "-" (U+002D) followed by a time zone differential.

A time zone differential indicates the difference between local time (the time specified by the preceding date and time of day) and Coordinated Universal Time. Coordinated Universal Time can be calculated from the local time by subtracting the differential.

For a CRXER encoding, a GeneralizedTime value with a time zone differential SHALL be encoded as the equivalent Coordinated Universal Time, i.e., the time zone will be "Z".

A local time GeneralizedTime value is not converted to Coordinated Universal Time for a CRXER encoding. Other canonical ASN.1 encoding rules specify that local times must be encoded as Coordinated Universal Time but do not specify a method to convert a local time to a Coordinated Universal Time. Consequently, canonicalization of local time values is unreliable and applications SHOULD NOT use local time.

A time zone differential is encoded as two decimal digits representing hours, the character ":" (U+003A), and two decimal digits representing minutes. The minutes are encoded as "00" if the GeneralizedTime value omits minutes from the time zone differential.

   ASIDE: The RXER encoding of GeneralizedTime values is intended to conform to the lexical representation of the XML Schema [XSD2] dateTime datatype.
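As a rough illustration of the CRXER rules for GeneralizedTime, the sketch below rewrites a non-canonical character data translation into its canonical form. The function name crxer_generalized_time and the regular expression are assumptions of this sketch; the fractional seconds are handled as text so that digits beyond microsecond precision are preserved, and a local time (no time zone) is passed through unconverted, as the text above requires.

```python
import re
from datetime import datetime, timedelta

# Hypothetical pattern for the character data described in this section.
_TIME = re.compile(
    r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})"  # date, "T", time of day
    r"(?:\.(\d*))?"                            # optional fractional seconds
    r"(Z|[+-]\d{2}:\d{2})?"                    # optional time zone
)


def crxer_generalized_time(text: str) -> str:
    """Canonicalize a GeneralizedTime character data translation."""
    match = _TIME.fullmatch(text.strip())
    if match is None:
        raise ValueError("not a GeneralizedTime character data translation")
    stamp, fraction, zone = match.groups()
    when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S")
    if zone not in (None, "Z"):
        offset = timedelta(hours=int(zone[1:3]), minutes=int(zone[4:6]))
        # UTC is the local time minus the (signed) differential.
        when = when - offset if zone[0] == "+" else when + offset
        zone = "Z"
    # Trailing zero digits are dropped; the period goes if nothing is left.
    fraction = (fraction or "").rstrip("0")
    result = when.isoformat()
    if fraction:
        result += "." + fraction
    return result + (zone or "")
```

For instance, the sketch maps "2004-06-15T02:00:00+10:00" to "2004-06-14T16:00:00Z", because subtracting a ten hour differential crosses midnight.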
Examples

The content of each of the following <value> elements is the RXER encoding of a GeneralizedTime value:

   <value>2004-06-15T12:00:00Z</value>

   <value> 2004-06-15T02:00:00+10:00 </value>

   <value> 2004-06-15T12:00:00.5 </value>

5.6.6. INTEGER

For a CRXER encoding, the character data translation of a value of an IntegerType is a canonical number string representing the integer value.

A canonical number string is either the digit character "0" (U+0030), or an optional minus sign (U+002D) followed by a non-zero decimal digit character (U+0031-U+0039) followed by zero, one or more of the decimal digit characters "0" to "9" (U+0030-U+0039).

For a non-canonical RXER encoding, the character data translation of a value of the IntegerType without a NamedNumberList is a number string representing the integer value.

A number string is a sequence of one or more of the decimal digit characters "0" to "9" (U+0030-U+0039), with an optional leading sign, either "+" (U+002B) or "-" (U+002D). Leading zero digits are permitted in a number string for a non-canonical RXER encoding.

   ASIDE: The RXER encoding of values of the IntegerType without a NamedNumberList is intended to conform to the lexical representation of the XML Schema [XSD2] integer datatype.

For a non-canonical RXER encoding, if the IntegerType has a NamedNumberList and the NamedNumberList defines an identifier for the actual value and the IntegerType is not subject to a VALUES encoding instruction then the character data translation of the value is either a number string or the identifier.

Examples

Consider this type definition:

   INTEGER { zero(0), one(1) }

The content of each of the following <value> elements is the RXER encoding of a value of the above type:

   <value>0</value>

   <value> zero </value>

   <value> 2 <!-- This number doesn't have a name.
   --> </value>

   <value>00167</value>

For a non-canonical RXER encoding, if the IntegerType is subject to a VALUES encoding instruction (it necessarily must have a NamedNumberList) and the NamedNumberList defines an identifier for the actual value then the character data translation of the value is either a number string or the replacement name [RXEREI] for the identifier.

Examples

Consider this type definition:

   [RXER:VALUES ALL UPPERCASED] INTEGER { zero(0), one(1) }

The content of both of the following <value> elements is the RXER encoding of a value of the above type:

   <value>0</value>

   <value> ZERO </value>

5.6.7. NULL

The character data translation of a value of the NULL type is an empty character string.

Examples

   <value/>

   <value><!-- Comments don't matter. --></value>

   <value></value>

The final case is the CRXER encoding.

5.6.8. ObjectDescriptor

A value of the ObjectDescriptor type is translated according to the GraphicString type.

5.6.9. OBJECT IDENTIFIER and RELATIVE-OID

The character data translation of a value of the OBJECT IDENTIFIER type or RELATIVE-OID type is a "." (U+002E) separated list of the object identifier components of the value.

Each object identifier component is translated as a non-negative number string. A non-negative number string is either the digit character "0" (U+0030), or a non-zero decimal digit character (U+0031-U+0039) followed by zero, one or more of the decimal digit characters "0" to "9" (U+0030-U+0039).

Examples

The content of each of the following <value> elements is the RXER encoding of an OBJECT IDENTIFIER value:

   <value>2.5.6.0</value>

   <value> 2.5.4.10 </value>

   <value> 2.5.4.3 <!-- commonName --> </value>

5.6.10. OCTET STRING

The character data translation of a value of the OCTET STRING type is the hexadecimal digit string representation of the octets.

The octets are encoded in order from the first octet to the last octet. Each octet is encoded as a pair of the hexadecimal digit characters "0"-"9", "A"-"F" and "a"-"f" (i.e., U+0030-U+0039, U+0041-U+0046 and U+0061-U+0066) where the first digit in the pair corresponds to the four most significant bits of the octet. An odd number of hexadecimal digits is not permitted. The characters "a"-"f" (i.e., U+0061-U+0066) SHALL NOT be used in the CRXER encoding of an OCTET STRING value.

   ASIDE: The RXER encoding of OCTET STRING values is intended to conform to the lexical representation of the XML Schema [XSD2] hexBinary datatype.

Examples

The content of each of the following <value> elements is the RXER encoding of an OCTET STRING value:

   <value>27F69A0300</value>

   <value> efA03bFF </value>

5.6.11. QName

The character data translation of a value of the QName type (Section 4.5) is a qualified name conforming to the QName production of Namespaces in XML [XMLNS10].

The local part (i.e., LocalPart) of the qualified name SHALL be the value of the local-name component of the QName value.

If the namespace-name component of the QName value is absent then the namespace prefix (i.e., Prefix) of the qualified name SHALL be absent, otherwise the namespace prefix is determined as specified in Section 5.6.11.1 using the value of the namespace-name component of the QName value as the namespace name.

An RXER encoder is free to ignore the value of the prefix component of the QName value. When decoding a value of the QName type, an RXER decoder MAY set the prefix component of the value to the Prefix actually used in the encoding.

5.6.11.1. Namespace Prefixes for Qualified Names

This section describes how the namespace prefix of a qualified name is determined given the namespace name to which the namespace prefix must map.

For a CRXER encoding, if the namespace name matches the [namespace name] of a namespace item in the [in-scope namespaces] of the enclosing element item then the namespace prefix of the qualified name SHALL be the same as the [prefix] of that namespace item, otherwise the namespace prefix of the qualified name is any unused non-canonical namespace prefix.

   ASIDE: If the qualified name appears in the [normalized value] of an attribute item then the enclosing element item is the [owner element] for that attribute item.

For a non-canonical RXER encoding, if the namespace name matches the [namespace name] of a namespace item in the [in-scope namespaces] of the enclosing element item then the namespace prefix of the qualified name is either the same as the [prefix] of that namespace item (which in the case of a namespace item for the default namespace has no value) or is any unused namespace prefix, otherwise the namespace prefix of the qualified name is any unused namespace prefix.

If the namespace prefix of the qualified name is an unused namespace prefix then a namespace declaration attribute item associating the namespace prefix with the namespace name MUST be added to the [namespace attributes] of the enclosing element item, and a corresponding namespace item MUST be added to the [in-scope namespaces] of the enclosing element item.

5.6.12. REAL

The character data translation of a value of the REAL type is the character string "0" if the value is positive zero, the character string "-0" if the value is negative zero, the character string "INF" if the value is positive infinity, the character string "-INF" if the value is negative infinity, the character string "NaN" if the value is not a number, or a real number otherwise.

A real number is the mantissa followed by either "E" (U+0045) or "e" (U+0065) and the exponent. The character "e" SHALL NOT be used for a CRXER encoding. If the exponent is zero then the "E" or "e" and exponent MAY be omitted for a non-canonical RXER encoding.

The mantissa is a decimal number with an optional leading sign, either "+" (U+002B) or "-" (U+002D). A decimal number is a sequence of one or more of the decimal digit characters "0" to "9" (U+0030-U+0039) optionally partitioned by a single period character (U+002E) representing the decimal point. Multiple leading zero digits are permitted for a non-canonical RXER encoding.

The exponent is encoded as a number string (see Section 5.6.6).

   ASIDE: The RXER encoding of REAL values is intended to be compatible with the lexical representation of the XML Schema [XSD2] double datatype, but allows real values outside the range permitted by double.
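A decoder for the REAL character data described above might look like the following sketch. The function name parse_real is hypothetical, and two choices are assumptions rather than requirements of this document: Python's Decimal is used so that values outside the range of an IEEE double survive, and a period is accepted only with digits on both sides, which is one reading of "partitioned".

```python
import re
from decimal import Decimal

_SPECIAL = {"INF", "-INF", "NaN"}
# Mantissa with optional sign and optional fraction, then optional exponent.
# Assumption: digits are required on both sides of the period.
_REAL = re.compile(r"[+-]?\d+(?:\.\d+)?(?:[Ee][+-]?\d+)?")


def parse_real(text: str) -> Decimal:
    """Decode a non-canonical REAL character data translation."""
    token = text.strip()  # optional white space is allowed for REAL
    if token in _SPECIAL:
        return Decimal(token)
    if _REAL.fullmatch(token) is None:
        raise ValueError("not a REAL character data translation")
    return Decimal(token)
```

The explicit pattern check matters because Decimal on its own accepts spellings, such as "Infinity", that are not among the strings listed above.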
For a CRXER encoding:

(1) The real number MUST be normalized so that the mantissa has a single, non-zero digit immediately to the left of the decimal point.

(2) Leading zero digits SHALL NOT be used.

(3) A leading plus sign SHALL NOT be used in the mantissa or the exponent.

(4) The fractional part of the mantissa (i.e., that part following the decimal point) MUST have at least one digit (which may be "0") and MUST NOT have any trailing zeroes after the first digit.

(5) The exponent SHALL be present and SHALL be a canonical number string (see Section 5.6.6).

Examples

The content of each of the following <value> elements is the RXER encoding of a REAL value:

   <value>3.14159<!-- PI --></value>

   <value> 1.0e6 </value>

   <value> INF </value>

   <value> -01e-06 </value>

5.6.13. UTCTime

The character data translation of a value of the UTCTime type is a date, the letter "T", a time of day and a time zone.

The date is two decimal digits representing the year (no century), "-" (U+002D), two decimal digits representing the month, "-" (U+002D), and two decimal digits representing the day.

The time of day is two decimal digits representing the hour, followed by ":" (U+003A), two decimal digits representing the minutes, ":" (U+003A), and two decimal digits representing the seconds. Note that the hours value 24 is disallowed [X.680].
The seconds are encoded as "00" if the UTCTime value omits seconds.

The time zone is either the letter "Z" (U+005A) to indicate Coordinated Universal Time, a "+" (U+002B) followed by a time zone differential, or a "-" (U+002D) followed by a time zone differential.

A time zone differential indicates the difference between local time (the time specified by the preceding date and time of day) and Coordinated Universal Time. Coordinated Universal Time can be calculated from the local time by subtracting the differential.

For a CRXER encoding, a UTCTime value with a time zone differential SHALL be encoded as the equivalent Coordinated Universal Time, i.e., the time zone will be "Z".

A time zone differential is encoded as two decimal digits representing hours, the character ":" (U+003A), and two decimal digits representing minutes.

5.6.14. CHOICE as UNION

The chosen alternative of a value of a UNION type corresponds to some NamedType in the UNION type definition (a ChoiceType).
The character data translation of a value of a UNION type is the character data translation of the value of the type of the chosen alternative, i.e., without any kind of encapsulation.

Leading and trailing white space characters are not permitted to be added to the character data translation of a value of a UNION type (see Section 5.6), however this does not preclude such white space being added to the character data translation of the value of the chosen alternative.

The character data translation of a value of a UNION type is necessarily destined for the [children] of an enclosing element item.

   ASIDE: This is because the ATTRIBUTE encoding instruction cannot be applied to a NamedType where the type is a UNION type.
The chosen alternative can be identified by a member attribute item, i.e., an attribute item with the [local name] "member" and no value for the [namespace name] added to the [attributes] of the enclosing element item. The [normalized value] of this attribute item is the RXER encoding of the effective name (a QName) of the NamedType corresponding to the chosen alternative.

ASIDE: It is not possible to associate a namespace name with a NamedType in a UNION type using the current specification for RXER encoding instructions. Consequently, the [normalized value] of the member attribute item will always contain a qualified name without a namespace prefix.
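As an illustration only, the following Python sketch builds the enclosing element for a UNION value and optionally adds the member attribute item described above. The helper name encode_union and its parameters are hypothetical, and the sketch assumes that the effective name of each alternative is simply its ASN.1 identifier.

```python
import xml.etree.ElementTree as ET


def encode_union(element_name, alternative_name, text, use_member=False):
    """Hypothetical helper: build the enclosing element for a UNION value.

    Assumes the effective name of the alternative equals its identifier.
    """
    elem = ET.Element(element_name)
    # The character data translation of the chosen alternative becomes the
    # element content directly, with no child element wrapped around it.
    elem.text = text
    if use_member:
        # The member attribute has no namespace name, and its value is an
        # unprefixed name identifying the chosen alternative.
        elem.set("member", alternative_name)
    return elem


print(ET.tostring(encode_union("value", "name", "Bob"), encoding="unicode"))
# <value>Bob</value>
print(ET.tostring(encode_union("value", "name", "Alice", use_member=True),
                  encoding="unicode"))
# <value member="name">Alice</value>
```

The attribute is set with a plain, unqualified name because a NamedType in a UNION type cannot be given a namespace name.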
For a CRXER encoding, the member attribute item MUST be used and the [normalized value] of the attribute item MUST be the CRXER translation of the effective name.

In the absence of a member attribute item, an RXER decoder MUST determine the chosen alternative by considering the alternatives of the choice in the order prescribed below and accepting the first alternative for which the encoding is valid.

If the UNION encoding instruction has a PrecedenceList then the alternatives of the ChoiceType referenced by the PrecedenceList are considered in the order identified by that PrecedenceList, then the remaining alternatives are considered in the order of their definition in the ChoiceType. If the UNION encoding instruction does not have a PrecedenceList then all the alternatives of the ChoiceType are considered in the order of their definition in the ChoiceType.

A non-canonical RXER encoder MUST use the member attribute item if an RXER decoder would determine the chosen alternative to be something other than the chosen alternative of the CHOICE value being translated, otherwise the member attribute item MAY be used.

Examples

Consider this type definition:

   [RXER:UNION PRECEDENCE serialNumber] CHOICE {
       name          [0] IA5String,
       serialNumber  [1] INTEGER
   }

In the
absence of a member attribute an RXER decoder would first consider whether the received encoding was a valid serialNumber (an INTEGER) before considering whether it was a valid name (an IA5String).

The content of each of the following <value> elements is the RXER encoding of a value of the above type:

   <value>Bob</value>
   <value member="name">Alice</value>
   <value>
    <!-- Don't have a name for this one! -->
    344
   </value>
   <value member="name"><!-- A strange name. -->100</value>

The member attribute is required in the final case to prevent the value being interpreted as a serialNumber.

If the UNION (i.e., CHOICE) type is extensible [X.680] then an application MUST be capable of accepting, and if necessary, re-encoding a member attribute where the value references an unknown alternative, on the assumption that the sender is using a more recent definition of the UNION type.

ASIDE: The character data translation of the
value of the type of an unknown alternative may contain qualified names that depend on the [in-scope namespaces] of the enclosing element item for interpretation. Therefore semantically faithful re-encoding of the extension may require reproduction of at least some part of the [in-scope namespaces] of the enclosing element item. The simplest approach is to retain all the namespace items from the [in-scope namespaces] of the enclosing element item and output them as namespace declaration attribute items in the [namespace attributes] of the enclosing element item when re-encoding the extension.

To avoid a proliferation of unnecessary namespace declarations, an application could examine the character data looking for character strings that resemble qualified names and retain only those namespace items from the [in-scope namespaces] of the enclosing element item that define the namespace prefixes of the putative qualified names. The concerns about the proliferation of namespace declarations raised in Section 4.1.1 do not apply here since the type of a NamedType in a UNION type cannot be AnyType.

5.6.15. SEQUENCE OF as LIST

The character data translation of a value of a LIST type (a SEQUENCE OF NamedType) is the concatenation of the character data translations of the component values, i.e., the abstract
values of the type of the NamedType, each separated from the next by at least one white space character. For a CRXER encoding, separating white space MUST be exactly one space character (U+0020).

Example

Consider this type definition:

   [LIST] SEQUENCE OF timeStamp GeneralizedTime

The content of the following <value> element is the RXER encoding of a value of the above type:

   <value>
     2004-06-15T12:14:56Z
     2004-06-15T12:18:13Z
     2004-06-15T01:00:25Z
   </value>

5.7. Combining Types

The encoding of a value of an ASN.1 combining type (except UNION and LIST types) typically has element content.

The Infoset translation of a value of a specific ASN.1 combining type (excluding UNION and LIST types) contains zero or more attribute items to be added to the [attributes] of the enclosing element item and zero or more element items to be added to the
[children] of the enclosing element item. These translations are described in Sections 5.7.1 to 5.7.7.

For a non-canonical RXER encoding, white space character items MAY be added to the [children] of the enclosing element item (before or after any other items).

For a CRXER encoding, a character item with the [character code] U+000A (a line feed) MUST be inserted immediately before each element item in the [children] of the enclosing element item. No other white space character items are permitted to be added to the [children] of the enclosing element item.

ASIDE: Without the single line feed character before each child element, a typical CRXER encoding would be a single, very long line.

5.7.1. CHARACTER STRING

A value of the unrestricted CHARACTER STRING type is translated according to the corresponding SEQUENCE type defined in Clause 40.5 of X.680 [X.680].

5.7.2. CHOICE

The chosen alternative of a value of a CHOICE type corresponds to, and is a value of (see Section 5.3), some NamedType in the
CHOICE type definition.

The translation of a value of a CHOICE type other than the AnyType type (see Section 4.1) or a UNION type (see Section 5.6.14) is the translation of the value of the NamedType corresponding to the actual chosen alternative.

Examples

Consider this type definition:

   CHOICE {
       name          [0] IA5String,
       serialNumber  [1] INTEGER
   }

The content of each of the following <value> elements is the RXER encoding of a value of the above type:

   <value><name>Bob</name></value>
   <value>
    <name>Alice</name>
   </value>
   <value>
    <!-- Don't have a name for this one! -->
    <serialNumber>
     344
    </serialNumber>
   </value>
   <value>
    <!-- A strange name. -->
    <name>100</name>
   </value>

If the CHOICE type is extensible [X.680] then an application MUST be capable of accepting, and if necessary, re-encoding any attribute or child element with a name that is not recognised, on the assumption that the sender is using a more recent definition of the CHOICE type.

ASIDE: The outermost elements in extensions are required to be self-contained (see Sections 4.1.1 and 5.3.1), which allows such elements to be faithfully relayed despite a lack of knowledge of their corresponding NamedType definitions.

5.7.3. EMBEDDED PDV

A value of the EMBEDDED PDV type is translated according to the corresponding SEQUENCE type defined in Clause 33.5 of X.680 [X.680].

5.7.4. EXTERNAL

A value of the EXTERNAL type is translated according to the corresponding SEQUENCE type defined in Clause 8.18.1 of X.690 [X.690].

5.7.5. INSTANCE OF

A value of the INSTANCE OF type is translated according to the corresponding SEQUENCE type defined in Annex C of X.681 [X.681].

5.7.6. SEQUENCE and SET

Each component value of a value of a SEQUENCE or SET type corresponds to, and is a value of (see Section 5.3), some NamedType in the SEQUENCE or SET type definition.

A value of a SEQUENCE or SET type, other than the QName type (Section 4.5), is translated by translating in turn each component value actually present in the SEQUENCE or SET value and adding the resulting attribute items and/or element items to the [attributes] and/or [children] of the enclosing element item. Attribute items may be added to the [attributes] of the enclosing element item in any order. Element items resulting from the translating of component values MUST be appended to the [children] of the enclosing element
item in the order of the component values' corresponding NamedType definitions in the SEQUENCE or SET type definition.

ASIDE: In the case of the SET type, this is a deliberate departure from BER [X.690] where the components of a SET can be encoded in any order.

If a DEFAULT value is defined for a NamedType and the value of the NamedType is the same as the default value then the translation of the value of the NamedType SHALL be omitted for a CRXER encoding and MAY be omitted for a non-canonical RXER encoding.

Examples

Consider this type definition:

   SEQUENCE {
       name        [0] IA5String OPTIONAL,
       partNumber  [1] INTEGER,
       quantity    [2] INTEGER DEFAULT 0
   }

The content of each of the following <value> elements is the RXER encoding of a value of the above type:

   <value>
    <partNumber>23</partNumber>
    <!-- The quantity defaults to zero. -->
   </value>
   <value>
    <name>chisel</name>
    <partNumber>37</partNumber>
    <quantity>0</quantity>
   </value>
   <value>
    <!-- The name component is optional. -->
    <partNumber>1543</partNumber>
    <quantity>29</quantity>
   </value>

If the SEQUENCE or SET type is extensible [X.680] then an application MUST be capable of accepting, and if necessary, re-encoding any attribute or child element with a name that is
not recognised, on the assumption that the sender is using a more recent definition of the SEQUENCE or SET type.

ASIDE: Elements in extensions are required to be self-contained (see Sections 4.1.1 and 5.3.1), which allows such elements to be faithfully relayed despite a lack of knowledge of their corresponding NamedType definitions.

5.7.7. SEQUENCE OF and SET OF

Each component value of a value of a type that is a SET OF NamedType or a SEQUENCE OF NamedType corresponds to, and is a value of (see Section 5.3), the NamedType in the type definition.

A value of a type that is a SET OF NamedType, or a SEQUENCE OF NamedType other than a LIST type (see Section 5.6.15), is translated by adding the translation of each value of the NamedType to the [children] of the enclosing element item.

ASIDE: An ATTRIBUTE encoding instruction cannot appear in the component type for a SEQUENCE OF or SET OF type so there are no attribute items to add to the [attributes] of the enclosing element item.
If the type is a SEQUENCE OF NamedType then the values of the NamedType are translated in the order in which they appear in the value of the SEQUENCE OF type.

For a non-canonical RXER encoding, if the type is a SET OF NamedType then the values of the NamedType may be translated in any order.

For a CRXER encoding, if the type is a SET OF NamedType then the values of the NamedType MUST be translated in ascending order where the order is determined by comparing the octets of their CRXER encodings. A shorter encoding is ordered before a longer encoding that is identical up to the length of the shorter encoding.

Examples

Consider this type definition:

   SEQUENCE OF timeStamp GeneralizedTime

The content of the following <value> element is the RXER encoding of a value of the above type:
   <value>
    <timeStamp>2004-06-15T12:14:56Z</timeStamp>
    <timeStamp>2004-06-15T12:18:13Z</timeStamp>
    <timeStamp> 2004-06-15T01:00:25Z </timeStamp>
   </value>

Consider this type definition:

   SEQUENCE OF INTEGER

The content of the following <value> element is the RXER encoding of a value of the above type:

   <value>
    <item>12</item>
    <item>
     9
    </item>
    <item> 7 <!-- A prime number. --></item>
   </value>

5.8. Open Type

A value of an open type denoted by an ObjectClassFieldType [X.681] is translated according to the specific Type of the value.

If the ObjectClassFieldType is not constrained by a TableConstraint, or is constrained by a TableConstraint where the constraining object set is extensible, then the enclosing element item for the translation of the value MUST be self-contained.
If the translation of the value does not generate an attribute item with the [local name] "type" and the [namespace name] "http://www.w3.org/2001/XMLSchema-instance" (i.e., xsi:type) and the specific Type of the value has a Qualified Reference Name (see Section 5.1.1) then an attribute item with the [local name] "type" and the [namespace name] "http://www.w3.org/2001/XMLSchema-instance" (i.e., xsi:type) MAY be added to the [attributes] of the enclosing element item. The [normalized value] of this attribute item is the Qualified Reference Name with the namespace prefix determined as specified in Section 5.6.11.1.

ASIDE: The xsi:type attribute is added by RXER encoders for the benefit of XML Schema validators. This attribute tells an XML Schema validator which type definition in the XML Schema translation of the ASN.1 specification [CXSD] it should use for validating the content of the enclosing element. For an RXER decoder, the actual type in an open type value is generally determined by an associated component relation constraint [X.682], so the xsi:type attribute can be ignored.

Examples

The content of the following <value> element is the RXER encoding of an open type value containing a BOOLEAN value:

   <value xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xmlns:asn1="http://xmled.info/ns/ASN.1"
          xsi:type="asn1:BOOLEAN"> true </value>

5.9. AnyType

Conceptually, a value of AnyType holds the [prefix], [attributes], [namespace attributes] and [children] of an element item. The Infoset translation of a value of AnyType initially simply sets the [prefix], [attributes], [namespace attributes] and [children] of the enclosing element item to the corresponding properties represented by the AnyType value. Recall that the enclosing element item for the translation of an AnyType value is required to be self-contained (Section 4.1.1).

If the enclosing element item is not the [document element] of the document item and the [in-scope namespaces] property of the enclosing element item's [parent] contains a namespace item for the default namespace and the [namespace attributes] property represented by the AnyType value does not contain a namespace item declaring or undeclaring the default namespace then a namespace declaration attribute item that undeclares the default namespace SHALL be added to the enclosing element item's [namespace attributes].

It is not necessary to populate the [in-scope namespaces] of the enclosing element item for encoding purposes (though it may be warranted for other purposes).

An element item nested in the [children] is potentially the
Infoset translation of a value of a top level NamedType, and the entire AnyType value can represent the content of an element item that is the translation of a value of a top level NamedType.

ASIDE: The latter case arises when an ELEMENT-REF encoding instruction references a top level NamedType.

For a non-canonical RXER encoding, any element item, at any level of nesting (including the enclosing element item itself), that corresponds to the value of a top level NamedType from a module with a TARGET-NAMESPACE encoding instruction MAY be replaced with any valid translation of that value according to the top level NamedType (see Section 5.3).

For a CRXER encoding, any element item, at any level of nesting (including the enclosing element item itself), that corresponds to the value of a top level NamedType from a module with a TARGET-NAMESPACE encoding instruction MUST be replaced with the CRXER translation of that value according to the top level NamedType.

5.10. Namespace Prefixes for CRXER

The final step in translating the value of a top level NamedType for a CRXER encoding, or an abstract value for a Standalone CRXER encoding, is the replacement of the arbitrarily chosen namespace prefixes with algorithmically determined canonical namespace prefixes. This procedure for prefix replacement applies to each element item where the [namespace attributes] have been constructed according to Section 5.3.1.1. This includes any element item corresponding to a value of a top level NamedType (from a module with a TARGET-NAMESPACE encoding instruction) that is nested in a value of AnyType.

For each element item where prefix replacement applies, the following sequence of steps is repeated until there are no more eligible attribute items to select in step (1):

(1) Select the attribute item with the least [normalized value] from amongst the [namespace attributes] for which the [local name] is not a canonical namespace prefix (i.e., select from the namespace declaration attribute items that have not already been processed). A [normalized value] is less than another [normalized value] if the former appears before the latter in an ordering of the
The componentvaluesare encoded indetermined by comparing theorderISO 10646 code points [ISO10646] of theircorresponding NamedType definitions in the SEQUENCE or SET type definition. In the casecharacters, from first to last. A shorter string ofthe SET type, thischaracters is ordered before adeliberate departure from BER wherelonger string of characters that is identical up to thecomponentslength of the shorter string. ASIDE: Note that when aSET can be encodednamespace declaration (other than for the default namespace) is represented as an attribute item inany order. IftheSEQUENCE or SET type[namespace attributes], the attribute's [prefix] is "xmlns", its [local name] is the namespace prefix, and its [normalized value] is the namespace name. (2) A canonical namespace prefix isextensible [X680] thenunused if it is not currently theRXER decoder must be capable[prefix] ofskipping overanychildnamespace item in the [in-scope namespaces] of the element item. Replace the [local name] of the selected attribute item witha name that is not recognised, ontheassumptionunused canonical namespace prefix that has thesendernon-negative number string with the least integer value (e.g., n2 isusingless than n10). (3) The selected attribute item has amore recent definitioncorresponding namespace item in the [in-scope namespaces] of theSEQUENCE or SET type. Examples Considerelement. Replace the [prefix] of thistype definition: SEQUENCE { name [0] IA5String OPTIONAL, partNumber [1] INTEGER, quantity [2] INTEGER DEFAULT 0 }corresponding namespace item with the canonical namespace prefix determined in step (2). (4) Thecontent of each ofelement item and its [attributes], and descendent element items and their [attributes], may depend on thefollowing <value> elements isselected attribute item to determine theRXER encodingbinding between their [prefix] and [namespace name]. 
Replace the [prefix] of any such dependent element items and attribute items with the canonical namespace prefix determined in step (2). Note that avaluenamespace prefix can be redeclared (reused). Replacement of theabove type: <value> <partNumber>23</partNumber> <!-- The quantity defaultsprefix does not apply tozero. --> </value> <value> <name>chisel</name> <partNumber>37</partNumber> <quantity>0</quantity> </value> <value> <!-- The name component is optional. -->an element item Legg & Prager Expires16 December 20045 January 2006 [Page17]47] INTERNET-DRAFT Robust XML Encoding RulesJune 16, 2004 <partNumber>1543</partNumber> <quantity>29</quantity> </value> 6.20. SEQUENCE OFJuly 5, 2005 wherein the prefix is redeclared, or to the descendants of such an element item. (5) The character data translations for values of the QName ASN.1 type may depend on the selected attribute to determine the binding between their namespace prefix andSET OF A valuenamespace name. Replace the namespace prefix of any such dependent character data translation with the canonical namespace prefix determined in step (2). Note that a character data translation can appear in the [normalized value] ofa SEQUENCE OFan attribute item, orSET OF ASN.1 typeas a sequence of character items in the [children] of an element item. 5.11. Serialization The final RXER encoding isencodedproduced by serializing the Infoset translation as an XML document. 
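Looking back at the prefix replacement procedure of Section 5.10, steps (1) and (2) can be sketched in Python as follows. This is only an illustration under stated assumptions, not part of the specification: the function name and the dictionary model of the [namespace attributes] are hypothetical, and a canonical namespace prefix is assumed to be "n" followed by a non-negative number string with no leading zeros (as the n2/n10 example suggests). The default namespace declaration and the dependent updates of steps (3) to (5) are left out.

```python
import re

# Assumption: canonical prefixes look like n0, n1, n2, ... (not confirmed
# by this section, only suggested by its "n2 is less than n10" example).
CANONICAL_PREFIX = re.compile(r"n(0|[1-9][0-9]*)")


def assign_canonical_prefixes(declarations, in_scope_prefixes=()):
    """Map each arbitrary prefix to a canonical one for a single element.

    declarations: dict of namespace prefix -> namespace name, a simplified
        stand-in for the element's [namespace attributes].
    in_scope_prefixes: prefixes already bound in [in-scope namespaces].
    """
    used = set(in_scope_prefixes) | set(declarations)
    pending = {prefix: name for prefix, name in declarations.items()
               if not CANONICAL_PREFIX.fullmatch(prefix)}
    mapping = {}
    while pending:
        # Step (1): pick the declaration with the least namespace name.
        # Python compares str values code point by code point, and a
        # shorter string sorts before a longer one it is a prefix of.
        selected = min(pending, key=lambda prefix: pending[prefix])
        # Step (2): pick the unused canonical prefix with the least
        # integer value, so n2 is tried before n10.
        number = 0
        while f"n{number}" in used:
            number += 1
        canonical = f"n{number}"
        mapping[selected] = canonical
        used.add(canonical)
        del pending[selected]
    return mapping
```

Note that the numeric comparison in step (2) differs from the lexicographic attribute ordering used later for canonical serialization, which is why the sketch counts integers rather than sorting prefix strings.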
An implementation is free to serialize the Infoset translation as an XML document in any way such that the Infoset of the resulting XML document matches the Infoset translation, after ignoring the following properties:

(1) all properties of the document item except the [document element],

(2) the [base URI] of any item,

(3) the [element content whitespace] of character items,

(4) the [notation] of processing instruction items,

(5) the [in-scope namespaces] of element items.

   ASIDE: The [in-scope namespaces] of a parent element item are only selectively inherited by its child element items in the Infoset translations of abstract values. This means that the Infoset reconstructed by parsing the XML document serialization of the original Infoset will generally have more namespace items in its [in-scope namespaces] but these extra namespace items will not be significant.

   ASIDE: A consequence of case (1) is that comments and PIs before and after the document element are permitted.

In general there is more than one possible serialization for any given Infoset translation. Section 5.11.1 highlights some important considerations in producing a correct serialization and discusses some of the serialization options. Section 5.11.2 applies to CRXER encodings and limits the serialization options so that each distinct Infoset has only one possible serialization.

5.11.1. Non-canonical Serialization

This section discusses aspects of Infoset serialization for non-canonical RXER encodings, but is not an exhaustive list of the options for non-canonical serialization.

If one or more character items have a [character code] in the range U+0001 to U+0008, U+000B to U+000C or U+000E to U+001F, or one or more characters in any attribute's [normalized value] are in the range U+0001 to U+0008, U+000B to U+000C or U+000E to U+001F then the Infoset translation MUST be serialized as an XML version 1.1 document, otherwise the Infoset translation is serialized as either an XML version 1.0 or version 1.1 document.

A non-canonical RXER encoding may use any of the allowed character encoding schemes for XML. RXER encoders and decoders MUST support the UTF-8 character encoding.

An element item may be serialized as an empty-element tag if it has no items in its [children].

Attributes of an element can appear in any order since the [attributes] and [namespace attributes] of an element item are unordered.

Ampersand ('&', U+0026) and open angle bracket ('<', U+003C) characters in the [normalized value] of an attribute item must be escaped appropriately [XML10][XML11] (with a character reference or a predefined entity reference). Double quote (U+0022) and single quote (U+0027) characters in an attribute item's [normalized value] may also need to be escaped.
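The XML version selection rule stated earlier in this section lends itself to a small check. The following Python sketch is illustrative only; the function name and the idea of passing all character data and attribute values as a list of strings are assumptions made for the example, not something the specification defines.

```python
def requires_xml_1_1(strings):
    """Return True if any string forces an XML version 1.1 document.

    strings: an iterable holding the character data and the attribute
        [normalized value]s of the Infoset translation (hypothetical
        flattened view used only for this sketch).
    """
    def is_restricted(character):
        code = ord(character)
        # U+0001 to U+0008, U+000B to U+000C, or U+000E to U+001F.
        return (0x01 <= code <= 0x08
                or 0x0B <= code <= 0x0C
                or 0x0E <= code <= 0x1F)

    return any(is_restricted(character)
               for string in strings
               for character in string)
```

Tab (U+0009), line feed (U+000A) and carriage return (U+000D) fall outside the listed ranges, so their presence alone leaves the choice between version 1.0 and version 1.1 open.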
Character items with the [character code] U+0026 or U+003C must be escaped appropriately (with a character reference, a predefined entity reference or a CDATA section).

Line break normalization by XML processors allows some freedom in how a character item for a line feed character (U+000A) is serialized:

(1) If XML version 1.0 is selected then a character item with the [character code] U+000A is serialized as either a U+000A character, a U+000D character followed by a U+000A character, or a U+000D character provided the next item is not a character item that is serialized as a U+000A character.

(2) If XML version 1.1 is selected then a character item with the [character code] U+000A is serialized as either a U+000A character, a U+0085 character, a U+2028 character, a U+000D character followed by a U+000A character, a U+000D character followed by a U+0085 character, or a U+000D character provided the next item is not a character item that is serialized as a U+000A or U+0085 character.

   ASIDE: All these sequences will be normalized to U+000A during decoding.

A character item with the [character code] U+000D, U+0085 or U+2028 must be serialized as a character reference to protect the character from line break normalization during decoding.

The attribute value normalization performed by XML processors allows some freedom in how a space character (U+0020) is serialized:

(1) If XML version 1.0 is selected then a space character (U+0020) in an attribute item's [normalized value] is serialized as either a U+0020 character, a U+0009 character, a U+000D character, a U+000A character, a U+000D character followed by a U+000A character, or a U+000D character provided the next character in the [normalized value] is not serialized as a U+000A character.

(2) If XML version 1.1 is selected then a space character (U+0020) in an attribute item's [normalized value] is serialized as either a U+0020 character, a U+0009 character, a U+000D character, a U+000A character, a U+0085 character, a U+2028 character, a U+000D character followed by a U+000A character, a U+000D character followed by a U+0085 character, or a U+000D character provided the next character in the [normalized value] is not serialized as a U+000A or U+0085 character.

   ASIDE: All these sequences will be normalized to U+0020 during decoding through a combination of line break normalization and attribute value normalization.

Each tab (U+0009), line feed (U+000A) or carriage return (U+000D) character in an attribute item's [normalized value] must be serialized as a character reference to protect the character from attribute value normalization during decoding. In addition, if XML version 1.1 is selected then each U+0085 or U+2028 character must be serialized as a character reference.

Parsed entity references may be used (unless the environment in which the RXER encoding is used disallows entity references). If entity references to other than the predefined entities are used then the XML document containing the RXER encoding must necessarily contain a document type declaration and the internal or external subset of the document type declaration must contain entity declarations for those entities.

5.11.2. Canonical Serialization

This section discusses Infoset serialization for CRXER encodings. The serialization of an Infoset for a CRXER encoding is restricted so that each distinct Infoset has only one possible serialization as an XML document.

   ASIDE: These restrictions have been chosen so as to be consistent with Canonical XML [CXML] where possible.

The document SHALL be encoded in UTF-8 without a leading Byte Order Mark [ISO10646].

The XMLDecl of the document SHALL be <?xml version="1.1"?>.
A document type declaration (doctypedecl) SHALL NOT be used.

   ASIDE: This has the effect of excluding entity references except those for the predefined entities (e.g., &amp;).

A single line feed character (U+000A) SHALL be inserted immediately before the document element. No other white space characters are permitted before or after the document element.

There SHALL NOT be any PIs or comments before or after the document element.

An element item SHALL NOT be serialized as an empty-element tag.

   ASIDE: If an element item has no items in its [children] then it is serialized as a start-tag followed by an end-tag.

There SHALL NOT be any white space characters immediately before the closing '>' of an element's start-tag and end-tag. The white space preceding each attribute MUST be exactly one space character (U+0020). There SHALL NOT be any white space characters immediately before or after the equals sign (U+003D) in an attribute.

The delimiter for attribute values MUST be the double quote character (U+0022).

Namespace declaration attributes MUST appear before any other attributes of an element. A namespace declaration for the default namespace, if present, MUST appear as the first attribute. The remaining namespace declaration attributes MUST appear in lexicographic order based on [local name].

   ASIDE: In particular, this means that xmlns:n10 comes before xmlns:n2.

The attributes that are not namespace declarations are lexicographically ordered on [namespace name] as the primary key and [local name] as the secondary key.

CDATA sections SHALL NOT be used.

Each ampersand character ('&', U+0026) in an attribute item's [normalized value] MUST be serialized as the entity reference &amp;. Each open angle bracket character ('<', U+003C) in an attribute item's [normalized value] MUST be serialized as the entity reference &lt;. Each double quote character (U+0022) in an attribute item's [normalized value] MUST be serialized as the entity reference &quot;. Each character in the range U+0001 to U+001F or U+007F to U+009F in an attribute item's [normalized value] MUST be serialized as a character reference. No other character in a [normalized value] is permitted to be serialized as an entity reference or character reference.

Each character item with the [character code] U+0026 (the ampersand character) MUST be serialized as the entity reference &amp;. Each character item with the [character code] U+003C (the open angle bracket character) MUST be serialized as the entity reference &lt;. Each character item with the [character code] U+003E (the closing angle bracket character) MUST be serialized as the entity reference &gt;. Each character item with a [character code] in the range U+0001 to U+0008, U+000B to U+001F or U+007F to U+009F MUST be serialized as a character reference. No other character item is permitted to be serialized as an entity reference or character reference.

Character references, where they are permitted, MUST use uppercase hexadecimal with no leading zeroes. For example, the carriage return character is represented as &#xD;.

A space character (U+0020) in an attribute item's [normalized value] MUST be serialized as a single U+0020 character.

A character item with the [character code] U+000A MUST be serialized as a single U+000A character.

The white space separating the [target] and [content] in the serialization of a processing instruction item SHALL be exactly one space character (U+0020).

   ASIDE: A processing instruction or comment can only appear in a CRXER encoding if it is embedded in an AnyType value.

5.11.3. Unicode Normalization in XML Version 1.1

XML Version 1.1 recommends, but does not absolutely require, that text be normalized according to Unicode Normalization Form C [UNICODE]. ASN.1 has no similar requirement on abstract values of string types, and ASN.1 canonical encoding rules depend on the code points of characters being preserved.

To accommodate both requirements, applications SHOULD normalize abstract values of ASN.1 character string types according to Unicode Normalization Form C at the time the values are created, but MUST NOT normalize a previously decoded abstract value of an ASN.1 character string type prior to re-encoding it. An application may of course normalize a decoded abstract value for other purposes such as display to a user.

5.12. Syntax-Based Canonicalization

ASN.1 encoding rules are designed to preserve abstract values, but not to preserve every detail of each transfer syntax that is used. In the case of RXER this means that the Infoset representation of an abstract value is not necessarily preserved when the abstract value is decoded and re-encoded (regardless of the encoding rules used). However, syntax-based canonicalization for XML documents (e.g., Canonical XML [CXML]) depends on the Infoset of an XML document being preserved. The Infoset representation of an XML document containing the RXER encoding of a value of a top level NamedType potentially changes if that value is decoded and re-encoded, disrupting the Canonical XML representation. Extra normalization is required if RXER is to be usefully deployed in environments where syntax-based canonicalization is used.

Prior to applying syntax-based canonicalization to an XML document, any elements in the XML document that correspond to the value of an ASN.1 top level NamedType from a module with a TARGET-NAMESPACE encoding instruction MUST be re-encoded according to CRXER.
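Re-encoding according to CRXER brings the character escaping rules of Section 5.11.2 into play. As a rough illustration, those rules for attribute values and character data might be sketched in Python as below. The function names are hypothetical, and the sketch covers only escaping; attribute ordering, prefix replacement and the other canonical serialization rules are not addressed.

```python
def character_reference(character):
    """Uppercase hexadecimal, no leading zeros, e.g. "&#xD;" for U+000D."""
    return "&#x%X;" % ord(character)


def crxer_escape_attribute(value):
    """Escape an attribute item's [normalized value] for CRXER."""
    pieces = []
    for character in value:
        code = ord(character)
        if character == "&":
            pieces.append("&amp;")
        elif character == "<":
            pieces.append("&lt;")
        elif character == '"':
            pieces.append("&quot;")
        elif 0x01 <= code <= 0x1F or 0x7F <= code <= 0x9F:
            pieces.append(character_reference(character))
        else:
            pieces.append(character)
    return "".join(pieces)


def crxer_escape_character_data(text):
    """Escape a run of character items (element content) for CRXER."""
    pieces = []
    for character in text:
        code = ord(character)
        if character == "&":
            pieces.append("&amp;")
        elif character == "<":
            pieces.append("&lt;")
        elif character == ">":
            pieces.append("&gt;")
        elif (0x01 <= code <= 0x08 or 0x0B <= code <= 0x1F
              or 0x7F <= code <= 0x9F):
            pieces.append(character_reference(character))
        else:
            # Tab (U+0009) and line feed (U+000A) pass through unchanged.
            pieces.append(character)
    return "".join(pieces)
```

The two functions differ in exactly the places the text does: tab and line feed are escaped in attribute values but not in character data, the double quote is escaped only in attribute values, and the closing angle bracket only in character data.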
If an application uses Canonical XML but has no knowledge of RXER then it will not know to normalize RXER encodings. If RXER is deployed into an environment containing such applications then all RXER encodings SHOULD be CRXER encodings.

6. Transfer Syntax Identifiers

6.1. RXER Transfer Syntax

The following OBJECT IDENTIFIER has been assigned by xmled.org to identify the Robust XML Encoding Rules, under an arc assigned to xmled.org by IANA:

   { iso(1) identified-organization(3) dod(6)
     internet(1) private(4) enterprise(1)
     xmled(21472) asn1(1) encoding(1) rxer(0) }

This OBJECT IDENTIFIER would be used, for example, to describe the transfer syntax for an RXER encoded data-value in an EMBEDDED PDV value.

6.2. CRXER Transfer Syntax

The following OBJECT IDENTIFIER has been assigned by xmled.org to identify the Canonical Robust XML Encoding Rules, under an arc assigned to xmled.org by IANA:

   { iso(1) identified-organization(3) dod(6)
     internet(1) private(4) enterprise(1)
     xmled(21472) asn1(1) encoding(1) crxer(1) }

This OBJECT IDENTIFIER would be used, for example, to describe the transfer syntax for a CRXER encoded data-value in an EMBEDDED PDV value.

7. Relationship to XER

RXER and XER [X.693] are separate, distinctly different and incompatible ASN.1 encoding rules for producing XML markup from ASN.1 abstract values. RXER is therefore unrelated to the XML ASN.1 Value Notation of X.680 [X.680].

There is usually a requirement on applications specified in ASN.1 to maintain backward compatibility with the encodings generated by previous versions. The encodings in question are typically BER. Even with the backward compatibility constraint there is still considerable leeway for specification writers to rewrite the earlier specification. Examples include renaming types, factoring out an in-line type definition as a defined type (or the reverse), or replacing a type definition with an equivalent parameterized reference. These changes produce no change to BER, DER, CER [X.690], Packed Encoding Rules (PER) [X.691], or Generic String Encoding Rules (GSER) [RFC3641] encodings (so specification writers have felt free to make such changes to improve their specification), but can change the names of elements in the XER encoding.
The RXER encoding is immune to this problem, thus RXER encodings are more stable than XER encodings over successive revisions of an ASN.1 specification (which explains the first 'R' in RXER). That has an obvious benefit for interoperability.

8. Security Considerations

RXER does not necessarily enable the exact BER octet encoding of values of the TeletexString, VideotexString, GraphicString or GeneralString types to be reconstructed, so a transformation from DER to RXER and back to DER may not reproduce the original DER encoding. This is a result of inadequate normalization of values of these string types in DER. A character in a TeletexString value (for example) that corresponds to a specific ISO 10646 character can be encoded for BER in a variety of ways that are indistinguishable in an RXER re-encoding of the TeletexString value. DER does not mandate one of these possible character encodings in preference to all others.

Because of the above, RXER MUST NOT be used to re-encode, whether for storage or transmission, ASN.1 abstract values whose original DER or CER encoding must be recoverable, and whose type definitions involve the TeletexString, VideotexString, GraphicString or GeneralString type. Such recovery is needed for the verification of digital signatures. In such cases, protocols ought to use DER or a DER-reversible encoding. In other cases where ASN.1 canonical encoding rules are used, values of AnyType must be normalized as described in Section 4.1.2 and values of QName must be normalized as described in Section 4.5.

A transformation from CRXER to BER and back to CRXER does reproduce the original CRXER encoding, therefore it is safe to use BER, DER or CER to re-encode ASN.1 abstract values whose original CRXER encoding must be recoverable.

Digital signatures may also be calculated on the Canonical XML representation of an XML document. If RXER encodings appear in such documents then applications must normalize the encodings as described in Section 5.12.

The NUL character (U+0000) cannot be represented in XML and hence cannot be transmitted in an RXER encoding. NUL characters in abstract values of ASN.1 string types will be dropped if the values are RXER encoded, therefore RXER MUST NOT be used by applications that attach significance to the NUL character.

When interpreting security-sensitive fields, and in particular fields used to grant or deny access, implementations MUST ensure that any comparisons are done on the underlying abstract value, regardless of the particular encoding used. Comparisons of AnyType values MUST operate as though the values have been normalized as specified in Section 4.1.2. Comparisons of QName values MUST operate as though the values have been normalized as specified in Section 4.5.

9. Acknowledgements

This document and the technology it describes are a product of a joint research project between Adacel Technologies Limited and Deakin University on leveraging existing directory technology to produce an XML-based directory service.

10. IANA Considerations

This document has no actions for IANA.

Appendix A. Additional Basic Definitions Module

This appendix is normative.

   AdditionalBasicDefinitions
       { iso(1) identified-organization(3) dod(6)
         internet(1) private(4) enterprise(1)
         xmled(21472) asn1(1) module(0) basic(0) }

   -- Copyright (C) The Internet Society (2005). This version of
   -- this ASN.1 module is part of RFC XXXX; see the RFC itself
   -- for full legal notices.

   DEFINITIONS
   AUTOMATIC TAGS
   EXTENSIBILITY IMPLIED ::= BEGIN

   AnyType ::= CHOICE {
       text  SEQUENCE {
           prolog      UTF8String (SIZE(1..MAX)) OPTIONAL,
           prefix      NCName OPTIONAL,
           attributes  UTF8String (SIZE(1..MAX)) OPTIONAL,
           content     UTF8String (SIZE(1..MAX)) OPTIONAL
       }
   }

   AnyURI ::= UTF8String (CONSTRAINED BY
                  { -- conforms to the format of a URI -- })

   NCName ::= UTF8String (CONSTRAINED BY
                  { -- conforms to the NCName production of
                    -- Namespaces in XML -- })

   Name ::= UTF8String (CONSTRAINED BY
                { -- conforms to the Name production of XML -- })

   QName ::= SEQUENCE {
       prefix          NCName OPTIONAL,
       namespace-name  AnyURI OPTIONAL,
       local-name      NCName
   }

   ENCODING-CONTROL RXER

       TARGET-NAMESPACE "http://xmled.info/ns/ASN.1"

       EXTENSIONS-MARKED

   END

Normative References

[BCP14] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[URI] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986, January 2005.
[RXEREI] Legg, S., "Encoding Instructions for the Robust XML Encoding RulesJune 16, 2004 [XEDNS] Legg, S. and D. Prager, "The XML Enabled Directory: IANA Considerations", draft-legg-xed-iana-xx.txt,(RXER)", draft-legg-xed-rxer-ei-xx.txt, a work in progress,to be published. [GLUE]July 2005. [ASN.X] Legg,S. and D. Prager, "The XML Enabled Directory: Schema Language Integration", draft-legg-xed-glue-xx.txt,S., "Abstract Syntax Notation X (ASN.X)", draft-legg-xed-asd-xx.txt, a work in progress,June 2004. [X680]July 2005. [X.680] ITU-T Recommendation X.680 (07/02) | ISO/IEC 8824-1, Information technology - Abstract Syntax Notation One (ASN.1): Specification of basic notation.[X681][X.680-1] Draft Amendment 1 (to ITU-T Rec. X.680 | ISO/IEC 8824-1) Support for EXTENDED-XER. [X.681] ITU-T Recommendation X.681 (07/02) | ISO/IEC 8824-2, Information technology - Abstract Syntax Notation One (ASN.1): Information object specification.[X682][X.682] ITU-T Recommendation X.682 (07/02) | ISO/IEC 8824-3, Information technology - Abstract Syntax Notation One (ASN.1): Constraint specification.[X683][X.683] ITU-T Recommendation X.683 (07/02) | ISO/IEC 8824-4, Information technology - Abstract Syntax Notation One (ASN.1): Parameterization of ASN.1 specifications.[X690]Legg & Prager Expires 5 January 2006 [Page 56] INTERNET-DRAFT Robust XML Encoding Rules July 5, 2005 [X.690] ITU-T Recommendation X.690 (07/02) | ISO/IEC 8825-1, Information technology - ASN.1 encoding rules: Specification of Basic Encoding Rules (BER), Canonical Encoding Rules (CER) and Distinguished Encoding Rules (DER). [UCS] ISO/IEC 10646-1:2000, Information technology - Universal Multiple-Octet Coded Character Set (UCS) - Part 1: Architecture and Basic Multilingual Plane.[XML][UNICODE] The Unicode Consortium, "The Unicode Standard, Version 4.0", Boston, MA, Addison-Wesley Developers Press, 2003. ISBN 0-321-18578-1. [XML10] Bray, T., Paoli, J., Sperberg-McQueen, C., Maler, E. and F. 
              Yergeau, "Extensible Markup Language (XML) 1.0 (Third
              Edition)", W3C Recommendation,
              http://www.w3.org/TR/2004/REC-xml-20040204, February 2004.

   [XML11]    Bray, T., Paoli, J., Sperberg-McQueen, C., Maler, E.,
              Yergeau, F., and J. Cowan, "Extensible Markup Language
              (XML) 1.1", W3C Recommendation,
              http://www.w3.org/TR/2004/REC-xml11-20040204, February
              2004.

   [XMLNS10]  Bray, T., Hollander, D. and A. Layman, "Namespaces in
              XML", http://www.w3.org/TR/1999/REC-xml-names-19990114,
              January 1999.

   [XMLNS11]  Bray, T., Hollander, D., Layman, A. and R. Tobin,
              "Namespaces in XML 1.1",
              http://www.w3.org/TR/2004/REC-xml-names11-20040204,
              February 2004.

   [ISET]     Cowan, J. and R. Tobin, "XML Information Set", W3C
              Recommendation,
              http://www.w3.org/TR/2001/REC-xml-infoset-20011024,
              October 2001.

   [XSD1]     Thompson, H., Beech, D., Maloney, M. and N. Mendelsohn,
              "XML Schema Part 1: Structures", W3C Recommendation,
              http://www.w3.org/TR/2001/REC-xmlschema-1-20010502, May
              2001.

Informative References

   [RFC3641]  Legg, S., "Generic String Encoding Rules (GSER) for ASN.1
              Types", RFC 3641, October 2003.

   [X.691]    ITU-T Recommendation X.691 (07/02) | ISO/IEC 8825-2:2002,
              Information technology - ASN.1 encoding rules:
              Specification of Packed Encoding Rules (PER).

   [X.693]    ITU-T Recommendation X.693 (12/01) | ISO/IEC 8825-4:2002,
              Information technology - ASN.1 encoding rules: XML
              encoding rules (XER).

   [XSD2]     Biron, P.V. and A.
              Malhotra, "XML Schema Part 2: Datatypes", W3C
              Recommendation,
              http://www.w3.org/TR/2001/REC-xmlschema-2-20010502, May
              2001.

   [CXML]     Boyer, J., "Canonical XML", W3C Recommendation,
              http://www.w3.org/TR/2001/REC-xml-c14n-20010315, March
              2001.

Authors' Addresses

   Dr. Steven Legg
   eB2Bcom
   Suite 3, Woodhouse Corporate Centre
   935 Station Street
   Box Hill North, Victoria 3129
   AUSTRALIA

   Phone: +61 3 9896 7830
   Fax:   +61 3 9896 7801
   EMail: steven.legg@eb2bcom.com

   Dr. Daniel Prager

   EMail: dan@layabout.net

Full Copyright Statement

   Copyright (C) The Internet Society (2005).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
   INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

Changes in Draft 00

   The Directory XML Encoding Rules (DXER) have been renamed to the
   Robust XML Encoding Rules (RXER). The previous file name for this
   draft was draft-legg-xed-dxer-00.txt.

   The rules for forming the [local name] and [namespace name] of the
   [document element] of a Standalone DXER Encoding have been changed
   to remove any dependency on type reference names.

Changes in Draft 01

   The namespace name for the ASN.1 namespace has been shortened.
   Additional insignificant leading and trailing white space is
   permitted in the encodings for some of the simple ASN.1 types in
   order to align them fully with their analogous XML Schema types.

Changes in Draft 02

   The AnyType ASN.1 type from [GLUE] has been revised to be a CHOICE
   whose only alternative is the previous SEQUENCE type. The
   description of the RXER encoding of values of AnyType has been
   revised to account for the change.

   Examples of RXER encodings have been added to the specification.

Changes in Draft 03

   Descriptions of the effects of RXER encoding instructions on RXER
   encodings have been added. Rules for a canonical variant of RXER
   (CRXER) have been added. Both of these changes have forced a
   radical reorganization of the document.

   The OBJECT IDENTIFIER identifying RXER (Section 6.1) has been
   replaced. An OBJECT IDENTIFIER identifying CRXER (Section 6.2) has
   been allocated.

   This draft incorporates the SchemaLanguageIntegration module and
   associated descriptions from "The XML Enabled Directory: Schema
   Language Integration" draft (draft-legg-xed-glue-02.txt). Changes
   to the incorporated material are described here.

   The mechanism of constraining values of AnyType using user defined
   constraint notation with specially assigned object identifiers has
   been discarded in favour of RXER reference encoding instructions
   [RXEREI]. The parts of the SchemaLanguageIntegration module
   pertaining to this old mechanism have been stripped out.

   The OBJECT IDENTIFIER for the SchemaLanguageIntegration module has
   been replaced. The SchemaLanguageIntegration module has been
   renamed to AdditionalBasicDefinitions.

   The QName ASN.1 type has been introduced into the
   AdditionalBasicDefinitions module.

   The century pad digits for a UTCTime value have been removed.
   The pad digits were there to allow UTCTime to be translated into
   XML Schema dateTime, but a forthcoming time types amendment will
   add more time types that don't have a natural counterpart in XML
   Schema. Forcing UTCTime into dateTime will then seem rather
   arbitrary.

   Use of the xsi:type attribute to identify BIT STRING values encoded
   in hexadecimal has been discarded in favour of the format
   attribute.

   The provisions for ChoiceOfString types have been subsumed by the
   UNION encoding instruction. Use of the xsi:type attribute to
   identify the alternative in a UNION/ChoiceOfStrings type has been
   discarded in favour of the member attribute.