Network Working Group                                          M. Murata
                                          Fuji Xerox Information Systems
Internet-Draft                                             S. St.Laurent
Expires: September 30, 2000                                      D. Kohn
                                                              April 2000

                            XML Media Types
                        draft-murata-xml-03.txt

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on September 30, 2000.

Copyright Notice

Copyright (C) The Internet Society (2000). All Rights Reserved.

Abstract

This document standardizes five new media types, text/xml, application/xml, text/xml-external-parsed-entity, application/xml-external-parsed-entity, and application/xml-dtd, for use in exchanging network entities which are related to the Extensible Markup Language (XML). This document also standardizes a convention (using the suffix '|xml') for naming media types outside of these five types when those media types represent XML entities. XML MIME entities are currently exchanged via the HyperText Transfer Protocol on the World Wide Web, are an integral part of the WebDAV protocol for remote web authoring, and are expected to have utility in many domains.

Murata, et. al.
Expires September 30, 2000 [Page 1]

Internet-Draft XML Media Types April 2000

[This document is intended to be a standards-track replacement for RFC 2376. It could also be specified as updating RFC 2048.]

Table of Contents

1. Introduction
2. Editor's Notes
3. Notational Conventions
4. XML Media Types
   4.1 Text/xml Registration
   4.2 Application/xml Registration
   4.3 Text/xml-external-parsed-entity Registration
   4.4 Application/xml-external-parsed-entity Registration
   4.5 Application/xml-dtd Registration
   4.6 Summary
   4.7 Referencing
5. The Byte Order Mark (BOM) and Conversions to/from the UTF-16 Charset
6. Fragment Identifiers
7. The Base URI
8. A Naming Convention for XML-Based Media Types
9. Examples
   9.1 Text/xml with UTF-8 Charset
   9.2 Text/xml with UTF-16 Charset
   9.3 Text/xml with UTF-16BE Charset
   9.4 Text/xml with ISO-2022-KR Charset
   9.5 Text/xml with Omitted Charset
   9.6 Application/xml with UTF-16 Charset
   9.7 Application/xml with UTF-16BE Charset
   9.8 Application/xml with ISO-2022-KR Charset
   9.9 Application/xml with Omitted Charset and UTF-16 XML MIME Entity
   9.10 Application/xml with Omitted Charset and UTF-8 Entity
   9.11 Application/xml with Omitted Charset and Internal Encoding Declaration
   9.12 Text/xml-external-parsed-entity with UTF-8 Charset
   9.13 Application/xml-external-parsed-entity with UTF-16 Charset
   9.14 Application/xml-external-parsed-entity with UTF-16BE Charset
   9.15 Application/xml-dtd
   9.16 Application/mathml|xml
   9.17 Application/xslt|xml
   9.18 Application/rdf|xml
   9.19 Image/svg|xml
10. Security Considerations
References
Authors' Addresses
A. Why Use the '|xml' Suffix for XML-Based MIME Types?
   A.1 Why not just use text/xml or application/xml and let the XML processor dispatch to the correct application based on the referenced DTD?
   A.2 Why not create a new subtree (e.g., image/xml.svg) to represent XML MIME types?
   A.3 Why not create a new top-level MIME type for XML-based media types?
   A.4 Why not just have the MIME processor 'sniff' the content to determine whether it is XML?
   A.5 Why not use a MIME parameter to specify that a media type uses XML syntax?
   A.6 How about labeling with parameters in the other direction (e.g., application/xml; Content-Feature=iotp)?
   A.7 How about a new superclass MIME parameter that is defined to apply to all MIME types (e.g., Content-Type: application/iotp; $superclass=xml)?
   A.8 What about adding a new parameter to the Content-Disposition header or creating a new Content-Structure header to indicate XML syntax?
   A.9 How about a new Alternative-Content-Type header?
   A.10 How about using a conneg tag instead (e.g., accept-features: (syntax=xml))?
   A.11 How about a third-level content-type, such as text/xml/rdf?
   A.12 What is the semantic difference between application/foo and application/foo|xml?
   A.13 What happens when an even better markup language (e.g., EBML) is defined, or a new category of data?
   A.14 Why must I use the '|xml' suffix for my new XML-based media type?
B. Acknowledgements
C. Revision History
Full Copyright Statement

1. Introduction

The World Wide Web Consortium has issued Extensible Markup Language (XML), version 1.0[XML]. To enable the exchange of XML network entities, this document standardizes five new media types, text/xml, application/xml, text/xml-external-parsed-entity, application/xml-external-parsed-entity, and application/xml-dtd as well as a naming convention for identifying XML-based MIME media types.

XML entities are currently exchanged on the World Wide Web, and XML is also used for property values and parameter marshalling by the WebDAV[RFC2518] protocol for remote web authoring. Thus, there is a need for a media type to properly label the exchange of XML network entities.
(Note that, as sometimes happens between two communities, both MIME and XML have defined the term entity, with different meanings.)

Although XML is a subset of the Standard Generalized Markup Language (SGML) ISO 8879[SGML], and currently is assigned the media types text/sgml and application/sgml, there are several reasons why use of text/sgml or application/sgml to label XML is inappropriate. First, there exist many applications which can process XML, but which cannot process SGML, due to SGML's larger feature set. Second, SGML applications cannot always process XML entities, because XML uses features of recent technical corrigenda to SGML. Third, the definition of text/sgml and application/sgml in [RFC1874] includes parameters for SGML bit combination transformation format (SGML-bctf), and SGML boot attribute (SGML-boot). Since XML does not use these parameters, it would be ambiguous if such parameters were given for an XML MIME entity. For these reasons, the best approach for labeling XML network entities is to provide new media types for XML.

Since XML is an integral part of the WebDAV Distributed Authoring Protocol, and since World Wide Web Consortium Recommendations have conventionally been assigned IETF tree media types, and since similar media types (HTML, SGML) have been assigned IETF tree media types, the XML media types also belong in the IETF media types tree.

Similarly, XML will be used as a foundation for other media types, including types in every branch of the IETF media types tree. To facilitate the processing of such types, media types based on XML, but which are not identified using text/xml or application/xml, should be named using a suffix of '|xml' as described in Section 8. This will allow XML-based tools -- browsers, editors, search engines, and other processors -- to work with all XML-based media types.

2. Editor's Notes

In the final version of this document, this section will be removed. It provides a listing of all the Editor's Notes appearing in this document. Notes still appear in the document in the section noted.

Section 4.1 - [Editor's note: should we mandate this parameter? US-ASCII is not a good default, since it is not international. ISO-8859-1, which is used by HTTP 1.1, is not international either. UTF-8 is international, but is not currently used by MIME or HTML specifications and implementations. By mandating this parameter, we do not have to choose one of these unsatisfactory possibilities and we can very strongly encourage the use of this parameter. On the other hand, those users who only need US-ASCII characters will be forced to specify the charset parameter.]

Section 4.1 - [Editor's note: should we say anything about dispatching based on namespace URIs in this document?]

Section 7 - [Editor's note: one of the authors has sent a comment to the XML Linking WG and requested not to allow default values for xml:base.]

Section 8 - [Editor's note: the use of non-XPointer fragment identifiers by XML vocabularies like SVG and SMIL requires further discussion.]

Section 10 - [Editor's note: some applications of XML may open up new security considerations. This issue needs further consideration.]

3. Notational Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

As defined in [RFC2781], the three charsets "utf-16", "utf-16le", and "utf-16be" are used to label UTF-16 text. In this document, "the UTF-16 family" refers to those three charsets. By contrast, the phrases "utf-16" or UTF-16 in this document refer specifically to the single charset "utf-16".

4. XML Media Types

This document introduces five new media types for XML MIME entities, text/xml, application/xml, text/xml-external-parsed-entity, application/xml-external-parsed-entity, and application/xml-dtd. Registration information for these media types is described in the sections below.

Within the XML specification, XML MIME entities can be classified into four types. In the XML terminology, they are called "document entities", "external DTD subsets", "external parsed entities", and "external parameter entities". The media types text/xml and application/xml can be used for "document entities", while text/xml-external-parsed-entity or application/xml-external-parsed-entity are appropriate for "external parsed entities". The media type application/xml-dtd can be used for "external DTD subsets" or "external parameter entities". For backward compatibility, application/xml and text/xml can also be used for "external parsed entities", "external DTD subsets", and "external parameter entities".

Neither external DTD subsets nor external parameter entities parse as XML documents, and while some XML document entities may be used as external parsed entities and vice versa, there are many cases where the two are not interchangeable. XML also has unparsed entities, internal parsed entities, and internal parameter entities, but they are not XML MIME entities.

If an XML document -- that is, the unprocessed, source XML document -- is readable by casual users, text/xml is preferable to application/xml. MIME user agents (and web user agents) that do not have explicit support for text/xml will treat it as text/plain, for example, by displaying the XML entity as plain text. Application/xml is preferable when the XML MIME entity is unreadable by casual users. Similarly, text/xml-external-parsed-entity is preferable when an external parsed entity is readable by casual users, but application/xml-external-parsed-entity is preferable when a plain text display is inappropriate.

NOTE: Users are in general not used to text containing tags such as <price>, and often find such tags quite disorienting or annoying. If one is not sure, the conservative principle would suggest using application/* instead of text/* so as not to put information in front of users that they will quite likely not understand.

The top-level media type "text" has some restrictions on MIME entities and they are described in [RFC2045] and [RFC2046]. In particular, the UTF-16 family, UCS-4, and UTF-32 are not allowed (except over HTTP[RFC2616], which uses a MIME-like mechanism). Thus, if an XML document or external parsed entity is encoded in such character encoding schemes, it cannot be labeled as text/xml or text/xml-external-parsed-entity (except for HTTP).

Text/xml and application/xml behave differently when the charset parameter is not explicitly specified. If the default charset (i.e., US-ASCII) for text/xml is inconvenient for some reason (e.g., bad WWW servers), application/xml provides an alternative (see "Optional parameters" of application/xml registration in Section 4.2). The same rules apply to the distinction between text/xml-external-parsed-entity and application/xml-external-parsed-entity.

XML provides a general framework for defining sequences of structured data. In some cases, it may be desirable to define new media types that use XML but define a specific application of XML, perhaps due to domain-specific security considerations or runtime information. Furthermore, such media types may allow UTF-8 or UTF-16 only and prohibit other charsets. This document does not prohibit such media types and in fact expects them to proliferate. However, developers of such media types are STRONGLY RECOMMENDED to use this document as a basis for their registration. In particular, the charset parameter SHOULD be used in the same manner in order to enhance interoperability.

4.1 Text/xml Registration

MIME media type name: text

MIME subtype name: xml

Mandatory parameters: none

Optional parameters: charset

   [Editor's note: should we mandate this parameter? US-ASCII is not a good default, since it is not international. ISO-8859-1, which is used by HTTP 1.1, is not international either. UTF-8 is international and is the IESG recommended charset for new protocols, but it is not currently used by MIME or HTML specifications and implementations. By mandating this parameter, we do not have to choose one of these unsatisfactory possibilities and we can very strongly encourage the use of this parameter. On the other hand, those users who only need US-ASCII characters would be forced to specify the charset parameter.]

   Although listed as an optional parameter, the use of the charset parameter is STRONGLY RECOMMENDED, since this information can be used by XML processors to determine authoritatively the character encoding of the XML MIME entity. The charset parameter can also be used to provide protocol-specific operations, such as charset-based content negotiation in HTTP. "utf-8" (see [RFC2279]) is the recommended value, representing the UTF-8 charset. UTF-8 is supported by all conforming processors of [XML].

   If the XML MIME entity is transmitted via HTTP, which uses a MIME-like mechanism that is exempt from the restrictions on the text top-level type (see section 19.4.1 of [RFC2616]), "utf-16" ([RFC2781]) is also recommended. UTF-16 is supported by all conforming processors of [XML].
   Since the handling of CR, LF and NUL for text types in most MIME applications would cause undesired transformations of individual octets in UTF-16 multi-octet characters, gateways from HTTP to these MIME applications MUST transform the XML MIME entity from text/xml; charset="utf-16" to application/xml; charset="utf-16".

   Conformant with [RFC2046], if a text/xml entity is received with the charset parameter omitted, MIME processors and XML processors MUST use the default charset value of "us-ascii". In cases where the XML MIME entity is transmitted via HTTP, the default charset value is still "us-ascii". (Note: There is an inconsistency between this specification and HTTP/1.1, which uses ISO-8859-1 as the default for a historical reason. Since XML is a new format, a new default should be chosen for better I18N. US-ASCII was chosen, since it is the intersection of UTF-8 and ISO-8859-1 and since it is already used by MIME.)

   There are several reasons that the charset parameter is authoritative. First, some MIME processing engines do transcoding of MIME bodies of the top-level media type "text" without reference to any of the internal content. Thus, it is possible that some agent might change a text/xml;charset="iso-2022-jp" to text/xml;charset="utf-8" without modifying the encoding declaration of an XML document. Second, text/xml must be compatible with text/plain, since MIME agents that do not understand text/xml will fall back to handling it as text/plain. If the charset parameter for text/xml were not authoritative, such fallback would cause data corruption. Third, recent WWW servers have been improved so that users can specify the charset parameter. Fourth, [RFC2130] specifies that the recommended specification scheme is the "charset" parameter.

   Since the charset parameter is authoritative, the charset is not always declared within an XML encoding declaration. Thus, special care is needed when the recipient strips the MIME header and provides persistent storage of the received XML MIME entity (e.g., in a file system). Unless the charset is UTF-8 or UTF-16, the recipient SHOULD also persistently store information about the charset, perhaps by embedding a correct XML encoding declaration within the XML MIME entity.

Encoding considerations:

   This media type MAY be encoded as appropriate for the charset and the capabilities of the underlying MIME transport. For 7-bit transports, data in either UTF-8 or UTF-16 MUST be encoded in quoted-printable or base64. For 8-bit clean transport (e.g., 8BITMIME[RFC1652] ESMTP or NNTP[RFC0977]), UTF-8 is not encoded, but UTF-16 MUST be encoded in base64. For binary clean transports (e.g., HTTP[RFC2616]), no content-transfer-encoding is necessary.

Security considerations: See Section 10.

Interoperability considerations:

   XML has proven to be interoperable across WebDAV clients and servers, and for import and export from multiple XML authoring tools.

Published specification: Extensible Markup Language (XML) 1.0[XML]

Applications which use this media type:

   XML is device-, platform-, and vendor-neutral and is supported by a wide range of Web user agents, WebDAV[RFC2518] clients and servers, as well as XML authoring tools.

   [Editor's note: should we say anything about dispatching based on namespace URIs in this document?]

Additional information:

   Magic number(s): None.

      Although no byte sequences can be counted on to always be present, XML MIME entities in ASCII-compatible charsets (including UTF-8) often begin with hexadecimal 3C 3F 78 6D 6C ("<?xml"), and those in UTF-16 often begin with hexadecimal FE FF 00 3C 00 3F 00 78 00 6D or FF FE 3C 00 3F 00 78 00 6D 00 (the Byte Order Mark (BOM) followed by "<?xml"). For more information, see Annex F of [XML].

   File extension(s): .xml

   Macintosh File Type Code(s): "TEXT"

Person and email address for further information:

   Murata Makoto (Family Given) <mura034@attglobal.net>
   Simon St.Laurent <simonstl@simonstl.com>
   Daniel Kohn <dan@dankohn.com>

Intended usage: COMMON

Author/Change controller:

   The XML specification is a work product of the World Wide Web Consortium's XML Working Group, and was edited by:

   Tim Bray <tbray@textuality.com>
   Jean Paoli <jeanpa@microsoft.com>
   C. M. Sperberg-McQueen <cmsmcq@uic.edu>

   The W3C, and the W3C XML Core Working Group, have change control over the XML specification.

4.2 Application/xml Registration

MIME media type name: application

MIME subtype name: xml

Mandatory parameters: none

Optional parameters: charset

   Although listed as an optional parameter, the use of the charset parameter is STRONGLY RECOMMENDED, since this information can be used by XML processors to determine authoritatively the charset of the XML MIME entity. The charset parameter can also be used to provide protocol-specific operations, such as charset-based content negotiation in HTTP.

   "utf-8" (see [RFC2279]) and "utf-16" ([RFC2781]) are the recommended values, representing the UTF-8 and UTF-16 charsets, respectively. These charsets are preferred since they are supported by all conforming processors of [XML].

   If an application/xml entity is received where the charset parameter is omitted, no information is being provided about the charset by the MIME Content-Type header. Conforming XML processors MUST follow the requirements in section 4.3.3 of [XML] that directly address this contingency.
However, MIME processors that areto be interpreted as described in RFC 2119[8]. ,not XML processors should not assume a default charset if the charset parameter is omitted from an application/xml entity. Since the charset parameter is authoritative, the charset is not Murata, et. al. ExpiresMay 31,September 30, 2000 [Page5]11] Internet-Draft XML Media TypesDecember 1999 3. XML Media Types This document introduces five new media types forApril 2000 always declared within an XML encoding declaration. Thus, special care is needed when the recipient strips the MIMEentities, text/xml, application/xml, text/xml-external-parsed-entity, application/xml-external-parsed-entity,header andapplication/xml-dtd. Registration information for these media types are describedprovides persistent storage of the received XML MIME entity (e.g., in a file system). Unless thesections below. Withincharset is UTF-8 or UTF-16, the recipient SHOULD also persistently store information about the charset, perhaps by embedding a correct XMLspecification, XML MIME entities can be classified into four types. Inencoding declaration within the XMLterminology, they are called "document entities", "external DTD subsets", "external parsed entities", and "external parameter entities". The media types text/xml and application/xml can be usedMIME entity. Encoding considerations Same as those for"document entities", while "external parsed entities" require text/xml-external-parsed-entity or application/xml-external-parsed-entity. For backward compatibility, application/xml andtext/xmlcan also be used for "external parsed entities", "external DTD subsets",as described in Section 4.1. Security considerations: See Section 10. Interoperability considerations: Same as Section 4.1. Published specification: Same as Section 4.1. Applications which use this media type: Same as Section 4.1. Additional information: Same as Section 4.1. Person and"external parameter entities". 
Theemail address for further information: Same as Section 4.1. Intended usage: COMMON Author/Change controller: Same as Section 4.1. 4.3 Text/xml-external-parsed-entity Registration MIME media typeapplication/xml-dtd can be used for "external DTD subsets" or "externalname: text MIME subtype name: xml-external-parsed-entity Mandatory parameters: none Optional parameters: charset The charset parameterentities". Neither external DTD subsets norof text/xml-external-parsed-entity is handled the same as that of text/xml as described in Section 4.1. Encoding considerations: Same as Section 4.1. Security considerations: See Section 10. Interoperability considerations: XML externalparameterparsed entitiesparseare as interoperable as XML documents, though they have a less tightly constrained structure andwhile sometherefore need to be referenced by XMLdocument entities maydocuments for proper handling by XML processors. Similarly, XML Murata, et. al. Expires September 30, 2000 [Page 12] Internet-Draft XML Media Types April 2000 documents cannot be reliably used as external parsed entitiesand vice versa, there are many cases where the twobecause external parsed entities arenot interchangeable.prohibited from having standalone document declarations or DTDs. Identifying XMLalso has unparsed entities, internalexternal parsedentities,entities with their own content type should enhance interoperability of both XML documents andinternal parameter entities, but they are notXMLMIMEexternal parsed entities.If anWhen non-validating processors handle XMLdocument is readable by casual users, text/xml is preferable to application/xml. MIME user agents (and web user agents) thatdocuments, they do nothave explicit support for text/xml will treat italways read external parsed entities. Thus, interoperability is not guaranteed. Published specification: Same astext/plain,Section 4.1. Applications which use this media type: Same as Section 4.1. 
Additional information: Magic number(s): Same as Section 4.1. File extension(s): .xml or .ent Macintosh File Type Code(s): "TEXT" Person and email address forexample, by displaying the XML entityfurther information: Same asplain text. Application/xmlSection 4.1. Intended usage: COMMON Author/Change controller: Same as Section 4.1. 4.4 Application/xml-external-parsed-entity Registration MIME media type name: application MIME subtype name: xml-external-parsed-entity Mandatory parameters: none Optional parameters: charset The charset parameter of application/xml-external-parsed-entity ispreferable whenhandled the same as that of application/xml as described in Section 4.2. Encoding considerations: Same as Section 4.2. Security considerations: See Section 10. Interoperability considerations: Same as those for Murata, et. al. Expires September 30, 2000 [Page 13] Internet-Draft XMLMIME entity is unreadable by casual users. Similarly,Media Types April 2000 text/xml-external-parsed-entityis preferable when an external parsed entity is readable by casual users, but application/xml-external-parsed-entity is preferable when a plain text display is inappropriate. The top-levelas described in Section 4.3. Published specification: Same as text/xml as described in Section 4.1. Applications which use this media type: Same as Section 4.1. Additional information: Magic number(s): Same as Section 4.1. File extension(s): .xml or .ent Macintosh File Type Code(s): "TEXT" Person and email address for further information: Same as Section 4.1. Intended usage: COMMON Author/Change controller: Same as Section 4.1. 4.5 Application/xml-dtd Registration MIME media type"text" has some restrictions onname: application MIMEentities and they aresubtype name: xml-dtd Mandatory parameters: none Optional parameters: charset The charset parameter of application/xml-dtd is handled the same as that of application/xml as described inRFC 2045[5] and RFC 2046[6]. In particular, UTF-16, UCS-4,Section 4.2. 
   Encoding considerations: Same as Section 4.2.
   Security considerations: See Section 10.
   Interoperability considerations:

      XML DTDs have proven to be interoperable by DTD authoring tools and XML browsers, among others. Note, however, that some XML processors do not read external DTD subsets or external parameter entities. Thus, interoperability is not guaranteed.

   Published specification: Same as text/xml as described in Section 4.1.
   Applications which use this media type:

      DTD authoring tools handle external DTD subsets as well as external parameter entities. XML browsers may also access external DTD subsets and external parameter entities.

   Additional information:

      Magic number(s): Same as Section 4.1.
      File extension(s): .dtd or .mod
      Macintosh File Type Code(s): "TEXT"

   Person and email address for further information: Same as Section 4.1.
   Intended usage: COMMON
   Author/Change controller: Same as Section 4.1.
4.6 Summary

   The following list applies to text/xml, text/xml-external-parsed-entity, and XML-based media types under the top-level type "text" that define the charset parameter according to this specification:

   o  Charset parameter is strongly recommended.

   o  If the charset parameter is not specified, the default is "us-ascii". The default of "iso-8859-1" in HTTP is explicitly overridden.

   o  No error handling provisions.

   o  An encoding declaration, if present, is irrelevant, but when saving a received resource as a file, the correct encoding declaration should be inserted.
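   As a minimal sketch of the defaulting rule in the list above, the following Python function returns the charset a recipient would assume for an entity under the top-level type "text". The function name text_xml_charset is hypothetical and not part of this specification; the standard library class email.message.Message is used only to parse the Content-type value.

```python
from email.message import Message


def text_xml_charset(content_type: str) -> str:
    """Return the charset to assume for a text/* XML entity.

    Illustrative only: the charset parameter wins when present, and
    "us-ascii" is assumed otherwise, even when the entity arrived over HTTP.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the lowercased charset parameter,
    # or None when the parameter is absent.
    charset = msg.get_content_charset()
    return charset if charset else "us-ascii"
```

   Note that the sketch never inspects the entity body: for these types an encoding declaration inside the entity is irrelevant to the charset decision.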
   The next list applies to application/xml, application/xml-external-parsed-entity, application/xml-dtd, and XML-based media types under top-level types other than "text" that define the charset parameter according to this specification:

   o  Charset parameter is strongly recommended, and if present, it takes precedence.

   o  If the charset parameter is omitted, conforming XML processors MUST follow the requirements in section 4.3.3 of [XML].

4.7 Referencing

   New media type registrations under the top-level type "text" SHOULD, in specifying the charset parameter, define it as: "Same as charset parameter of text/xml as specified in RFC XXXX."

   New media type registrations under top-level types other than "text" SHOULD, in specifying the charset parameter, define it as: "Same as charset parameter of application/xml as specified in RFC XXXX."

5.
The Byte Order Mark (BOM) and Conversions to/from the UTF-16 Charset

   Section 4.3.3 of [XML] specifies that XML MIME entities in the charset "utf-16" must begin with a byte order mark (BOM), which is a hexadecimal octet sequence 0xFE 0xFF (or 0xFF 0xFE, depending on endianness). The XML Recommendation further states that the BOM is an encoding signature, and is not part of either the markup or the character data of the XML document.

   Due to the presence of the BOM, applications which convert XML from "utf-16" to a non-Unicode encoding MUST strip the BOM before conversion. Similarly, when converting from another encoding into "utf-16", the BOM MUST be added after conversion is complete.

   In addition to the charset "utf-16", [RFC2781] introduces "utf-16le" (little endian) and "utf-16be" (big endian) as well. The BOM is prohibited for these charsets. When an XML MIME entity is encoded in "utf-16le" or "utf-16be", it MUST NOT begin with the BOM but SHOULD contain an encoding declaration.
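   The BOM rules for "utf-16" can be sketched in Python as follows. This is an illustration under stated assumptions, not a normative procedure: the function names are hypothetical, and the sketch does not rewrite any encoding declaration inside the entity, which a real converter would also need to keep consistent.

```python
import codecs


def utf16_to_other(data: bytes, target: str) -> bytes:
    """Convert a "utf-16" entity to another encoding, stripping the BOM first."""
    if data.startswith(codecs.BOM_UTF16_BE):
        text = data[2:].decode("utf-16-be")
    elif data.startswith(codecs.BOM_UTF16_LE):
        text = data[2:].decode("utf-16-le")
    else:
        raise ValueError('a "utf-16" XML entity must begin with a BOM')
    return text.encode(target)


def to_utf16(text: str, big_endian: bool = True) -> bytes:
    """Encode text as "utf-16", adding the BOM once conversion is complete."""
    if big_endian:
        return codecs.BOM_UTF16_BE + text.encode("utf-16-be")
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
```

   The explicit "utf-16-be" and "utf-16-le" codecs are used so that the point at which the BOM is removed or added is visible in the code rather than hidden inside the codec.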
   Conversion from "utf-16" to "utf-16be" or "utf-16le" and conversion in the other direction MUST strip or add the BOM, respectively.

6. Fragment Identifiers

   [RFC2396] notes that the semantics of a fragment identifier (the part of a URI after a "#") is a property of the data resulting from a retrieval action, and that the format and interpretation of fragment identifiers is dependent on the media type of the retrieval result.

   For documents labeled as text/xml or application/xml, the fragment identifier is an escaped XPointer represented in US-ASCII. An XPointer in UTF-8 is constructed from this escaped XPointer by converting %HH to a byte of the hexadecimal value HH. XPointers are described in detail in [XPtr]; in particular, escaping (i.e., the use of %HH for non-allowed characters) is described in section 2.2.

   If an XML-based media type requires a fragment identifier format other than escaped XPointers, the media type SHOULD NOT follow the naming convention for XML-based media types (a suffix of '|xml').

7. The Base URI

   Section 5.1 of RFC 2396 [RFC2396] specifies that the semantics of a relative URI reference embedded in a MIME entity is dependent on the base URI.
   The base URI is either (1) the base URI embedded in the MIME entity, (2) the base URI of the encapsulating MIME entity, (3) the URI used to retrieve the MIME entity, or (4) the application-dependent default base URI, where (1) has the highest precedence. [RFC2396] further specifies that the mechanism for embedding the base URI is dependent on the media type.

   The media type dependent mechanism for embedding the base URI in a MIME entity of type text/xml, application/xml, text/xml-external-parsed-entity, or application/xml-external-parsed-entity is to use the xml:base attribute described in detail in [XBase].

   Note that the base URI may be embedded in a different MIME entity, since the default value for the xml:base attribute may be specified in an external DTD subset or external parameter entity, which is labeled as application/xml-dtd.

   [Editor's note: one of the authors has sent a comment to the XML Linking WG and requested not to allow default values for xml:base.]

8.
A Naming Convention for XML-Based Media Types

   This document recommends the use of a naming convention (a suffix of '|xml') for identifying XML-based MIME media types, whatever their particular content may represent. This allows the use of generic XML processors and technologies on a wide variety of different XML document types at a minimum cost, using existing frameworks for media type registration.

   Although the use of a suffix was not considered as part of the original MIME architecture, this choice is considered to provide the most functionality with the least potential for interoperability problems or lack of future extensibility. The alternatives to the '|xml' suffix and the reason for its selection are described in Appendix A.

   As XML development continues, new XML document types are appearing rapidly. Many of these XML document types would benefit from the
identification possibilities of a more specific MIME media type than text/xml or application/xml can provide, and it is likely that many new media types for XML-based document types will be registered in the near and ongoing future.

   While the benefits of specific MIME types for particular types of XML documents are significant, all XML documents share common structures and syntax that make possible common processing. Some areas where 'generic' processing is useful include:

   o  Browsing - An XML browser can display any XML document with a provided [CSS] or [XSLT] style sheet, whatever the vocabulary of that document.

   o  Editing - Any XML editor can read, modify, and save any XML document.

   o  Fragment identification - XPointers [XPtr] can work with any XML document, whatever vocabulary it uses and whether or not it uses XPointer for its own fragment identification. [Editor's note: the use of non-XPointer fragment identifiers by XML vocabularies like SVG and SMIL requires further discussion.]

   o  Hypertext Linking - [XLink] hypertext linking is designed to connect any XML documents, regardless of vocabulary.

   o  Searching - XML-oriented search engines, web crawlers, agents, and query tools should be able to read XML documents and extract the names and content of elements and attributes even if the tools are ignorant of the
      particular vocabulary used for elements and attributes.

   o  Storage - XML-oriented storage systems, which keep XML documents internally in a parsed form, should similarly be able to process, store, and recreate any XML document.

   o  Well-formedness and validity checking - An XML processor can confirm that any XML document is well-formed and that it is valid (i.e., conforms to its declared DTD or Schema).

   When a new media type is introduced for an XML-based format, the name of the media type SHOULD end with '|xml'. This convention will allow applications that can process XML generically to detect that the MIME entity is supposed to be an XML document, verify this assumption by invoking some XML processor, and then process the XML document accordingly. Applications may match for types that represent XML entities by comparing the subtype to the pattern '*/*|xml'. (Of course, 4 of the 5 media types defined in this document -- text/xml, application/xml, text/xml-external-parsed-entity, and application/xml-external-parsed-entity -- also represent XML entities while not conforming to the '*/*|xml' pattern.)
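   The matching rule in the preceding paragraph might be sketched as follows. The function name and the constant are hypothetical; the sketch assumes the media type string may carry parameters after a ";" and compares case-insensitively.

```python
# The four types defined in this document that represent XML entities
# without carrying the '|xml' suffix.
GENERIC_XML_TYPES = {
    "text/xml",
    "application/xml",
    "text/xml-external-parsed-entity",
    "application/xml-external-parsed-entity",
}


def represents_xml_entity(media_type: str) -> bool:
    """Return True if the media type names an XML entity under this convention."""
    # Drop any parameters (e.g. "; charset=...") and normalize case.
    essence = media_type.split(";", 1)[0].strip().lower()
    if essence in GENERIC_XML_TYPES:
        return True
    _, _, subtype = essence.partition("/")
    return subtype.endswith("|xml")
```

   Under this sketch application/xml-dtd is not matched, which is consistent with the parenthetical remark above listing only four of the five types.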
   NOTE: Section 14.1 of HTTP [RFC2616] does not support Accept headers of the form "Accept: */*|xml" and so this header MUST NOT be used in this way. Instead, content negotiation [RFC2703] could potentially be used if an XML-based MIME type is needed.

   XML generic processing is not always appropriate for XML-based media types. For example, authors of some such media types may wish that the types remain entirely opaque except to applications that are specifically designed to deal with that media type. By NOT following the naming convention '|xml', such media types can avoid XML-generic processing. Since generic processing will be useful in many cases, however -- including in some situations that are difficult to predict ahead of time -- those registering media types SHOULD use the '|xml' convention unless they have a particularly compelling reason not to.

   The registration process for these media types is described in [RFC2048]. The registrar for the IETF tree will encourage new XML-based media type registrations in the IETF tree to follow this guideline. Registrars for other trees SHOULD follow this convention in order to ensure maximum interoperability of their XML-based documents. Similarly, media subtypes that do not represent XML entities MUST NOT be allowed to register with a '|xml' suffix.
   The optional charset parameter SHOULD be used with XML-based media types and its use SHOULD be specified as described in Section 4.7.

   The use of the charset parameter is STRONGLY RECOMMENDED, since this information can be used by XML processors to determine authoritatively the charset of the XML MIME entity.

9. Examples

   The examples below give the value of the MIME Content-type header and the XML declaration (which includes the encoding declaration) inside the XML MIME entity. For UTF-16 examples, the Byte Order Mark character is denoted as "{BOM}", and the XML declaration is assumed to come at the beginning of the XML MIME entity, immediately following the BOM. Note that other MIME headers may be present, and the XML MIME entity may contain other data in addition to the XML declaration; the examples focus on the Content-type header and the encoding declaration for clarity.

9.1 Text/xml with UTF-8 Charset

      Content-type: text/xml; charset="utf-8"

      <?xml version="1.0" encoding="utf-8"?>

   This is the recommended charset value for use with text/xml.
   Since the charset parameter is provided, MIME and XML processors must treat the enclosed entity as UTF-8 encoded.

   If sent using a 7-bit transport (e.g., SMTP [RFC0821]), the XML entity must use a content-transfer-encoding of either quoted-printable or base64. For an 8-bit clean transport (e.g., 8BITMIME ESMTP or NNTP), or a binary clean transport (e.g., HTTP), no content-transfer-encoding is necessary.

9.2 Text/xml with UTF-16 Charset

      Content-type: text/xml; charset="utf-16"

      {BOM}<?xml version='1.0' encoding='utf-16'?>

   or

      {BOM}<?xml version='1.0'?>

   This is possible only when the XML MIME entity is transmitted via HTTP, which uses a MIME-like mechanism and is a binary-clean protocol, hence does not perform CR and LF transformations and allows NUL octets. As described in [RFC2781], utf-16 charset may not be used with media types under the top-level type "text" except over HTTP (see section 19.4.1 of [RFC2616] for details).

   Since HTTP is binary clean, no content-transfer-encoding is necessary.

9.3 Text/xml with UTF-16BE Charset

      Content-type: text/xml; charset="utf-16be"

      <?xml version='1.0' encoding='utf-16be'?>

   Observe that the
BOM does not exist. This is again possible only when the XML MIME entity is transmitted via HTTP.

9.4 Text/xml with ISO-2022-KR Charset

      Content-type: text/xml; charset="iso-2022-kr"

      <?xml version="1.0" encoding='iso-2022-kr'?>

   This example shows text/xml with a Korean charset (e.g., Hangul) encoded following the specification in [RFC1557]. Since the charset parameter is provided, MIME and XML processors must treat the enclosed entity as encoded per RFC 1557.

   Since ISO-2022-KR has been defined to use only 7 bits of data, no content-transfer-encoding is necessary with any transport.

9.5 Text/xml with Omitted Charset

      Content-type: text/xml

      {BOM}<?xml version="1.0" encoding="utf-16"?>

   or

      {BOM}<?xml version="1.0"?>

   This example shows text/xml with the charset parameter omitted. In this case, MIME and XML processors must assume the charset is "us-ascii", the default charset value for text media types specified in [RFC2046]. The default of "us-ascii" holds even if the text/xml entity is transported using HTTP.
   Omitting the charset parameter is NOT RECOMMENDED for text/xml. For example, even if the contents of the XML MIME entity are UTF-16 or UTF-8, or the XML MIME entity has an explicit encoding declaration, XML and MIME processors must assume the charset is "us-ascii".

9.6 Application/xml with UTF-16 Charset

      Content-type: application/xml; charset="utf-16"

      {BOM}<?xml version="1.0" encoding="utf-16"?>

   or

      {BOM}<?xml version="1.0"?>

   This is a recommended charset value for use with application/xml. Since the charset parameter is provided, MIME and XML processors must treat the enclosed entity as UTF-16 encoded.

   If sent using a 7-bit transport (e.g., SMTP) or an 8-bit clean transport (e.g., 8BITMIME ESMTP or NNTP), the XML MIME entity must be encoded in quoted-printable or base64. For a binary clean transport (e.g., HTTP), no content-transfer-encoding is necessary.

9.7 Application/xml with UTF-16BE Charset

      Content-type: application/xml; charset="utf-16be"

      <?xml version='1.0' encoding='utf-16be'?>

   Observe that the BOM does not exist.
   Since the charset parameter is provided, MIME and XML processors must treat the enclosed entity as UTF-16BE encoded.

9.8 Application/xml with ISO-2022-KR Charset

      Content-type: application/xml; charset="iso-2022-kr"

      <?xml version="1.0" encoding="iso-2022-kr"?>

   This example shows application/xml with a Korean charset (e.g., Hangul) encoded following the specification in [RFC1557]. Since the charset parameter is provided, MIME and XML processors must treat the enclosed entity as encoded per RFC 1557, independent of whether the XML MIME entity has an internal encoding declaration (this example does show such a declaration, which agrees with the charset parameter).

   Since ISO-2022-KR has been defined to use only 7 bits of data, no content-transfer-encoding is necessary with any transport.

9.9 Application/xml with Omitted Charset and UTF-16 XML MIME Entity

      Content-type: application/xml

      {BOM}<?xml version='1.0' encoding="utf-16"?>

   or

      {BOM}<?xml version='1.0'?>
   For this example, the XML MIME entity begins with a BOM. Since the charset has been omitted, a conforming XML processor follows the requirements of [XML], section 4.3.3. Specifically, the XML processor reads the BOM, and thus knows deterministically that the charset is UTF-16.

   An XML-unaware MIME processor should make no assumptions about the charset of the XML MIME entity.

9.10 Application/xml with Omitted Charset and UTF-8 Entity

      Content-type: application/xml

      <?xml version='1.0'?>

   In this example, the charset parameter has been omitted, and there is no BOM. Since there is no BOM, the XML processor follows the requirements in section 4.3.3, and optionally applies the mechanism described in appendix F (which is non-normative) of [XML] to determine the charset encoding of UTF-8. The XML entity does not contain an encoding declaration, but since the encoding is UTF-8, this is still a conforming XML MIME entity.

   An XML-unaware MIME processor should make no assumptions about the charset of the XML MIME entity.
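   The decision an XML processor makes for application/xml when the charset parameter may be absent can be sketched as below. This is a deliberately simplified illustration, not the full detection mechanism of [XML] appendix F: the function name is hypothetical, only the UTF-16 BOM is recognized, and the encoding declaration is read with a regular expression that works only for ASCII-compatible encodings.

```python
import codecs
import re
from typing import Optional

# Simplified pattern for an encoding declaration in an ASCII-compatible entity.
_ENCODING_DECL = re.compile(
    rb"<\?xml[^>]*\bencoding\s*=\s*[\"']([A-Za-z][A-Za-z0-9._\-]*)[\"']"
)


def application_xml_charset(charset_param: Optional[str], body: bytes) -> str:
    """Pick the charset for an application/xml entity (illustrative only)."""
    # A charset parameter, when present, takes precedence over the body.
    if charset_param:
        return charset_param.lower()
    # Omitted charset with a BOM: the entity is UTF-16.
    if body.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf-16"
    # No BOM: honor an encoding declaration if one can be read.
    match = _ENCODING_DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    # No BOM and no encoding declaration: the entity is UTF-8.
    return "utf-8"
```

   An XML-unaware MIME processor would not run logic like this at all, since it should make no assumptions about the charset when the parameter is omitted.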
9.11 Application/xml with Omitted Charset and Internal Encoding Declaration

      Content-type: application/xml

      <?xml version='1.0' encoding="iso-10646-ucs-4"?>

   In this example, the charset parameter has been omitted, and there is no BOM. However, the XML MIME entity does have an encoding declaration inside the XML MIME entity which specifies the entity's charset. Following the requirements in section 4.3.3, and optionally applying the mechanism described in appendix F (non-normative) of [XML], the XML processor determines the charset encoding of the XML MIME entity (in this example, UCS-4).

   An XML-unaware MIME processor should make no assumptions about the charset of the XML MIME entity.

9.12 Text/xml-external-parsed-entity with UTF-8 Charset

      Content-type: text/xml-external-parsed-entity; charset="utf-8"

      <?xml encoding="utf-8"?>
This is the recommended charset value for use with text/xml-external-parsed-entity. Since the charset parameter is provided, MIME and XML processors must treat the enclosed entity as UTF-8 encoded.

If sent using a 7-bit transport (e.g., SMTP), the XML entity must use a content-transfer-encoding of either quoted-printable or base64. For an 8-bit clean transport (e.g., 8BITMIME ESMTP or NNTP), or a binary clean transport (e.g., HTTP), no content-transfer-encoding is necessary.

9.13 Application/xml-external-parsed-entity with UTF-16 Charset

   Content-type: application/xml-external-parsed-entity; charset="utf-16"

   {BOM}<?xml encoding="utf-16"?>

   or

   {BOM}<?xml?>

This is a recommended charset value for use with application/xml-external-parsed-entity. Since the charset parameter is provided, MIME and XML processors must treat the enclosed entity as UTF-16 encoded.

If sent using a 7-bit transport (e.g., SMTP) or an 8-bit clean transport (e.g., 8BITMIME ESMTP or NNTP), the XML MIME entity must be encoded in quoted-printable or base64. For a binary clean transport (e.g., HTTP), no content-transfer-encoding is necessary.
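The transport rules stated in the two examples above reduce to a small lookup. The Python below is a sketch under stated assumptions: the function name needs_transfer_encoding and the transport labels "7bit", "8bit", and "binary" are hypothetical names chosen for illustration, and only the UTF-8 and UTF-16 charsets discussed above are handled.

```python
def needs_transfer_encoding(charset: str, transport: str) -> bool:
    """Return True when quoted-printable or base64 is required.

    transport is one of the illustrative labels "7bit" (e.g., SMTP),
    "8bit" (e.g., 8BITMIME ESMTP or NNTP), or "binary" (e.g., HTTP).
    """
    charset = charset.lower()
    if charset not in ("utf-8", "utf-16"):
        raise ValueError("only utf-8 and utf-16 are covered by this sketch")
    if transport == "binary":
        # A binary clean transport carries either charset unchanged.
        return False
    if transport == "8bit":
        # UTF-8 survives an 8-bit clean transport; UTF-16 does not.
        return charset == "utf-16"
    if transport == "7bit":
        # Neither charset is safe on a 7-bit transport.
        return True
    raise ValueError("unknown transport label: " + transport)
```

For instance, needs_transfer_encoding("utf-16", "8bit") is true, while the same call with "utf-8" is false.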
9.14 Application/xml-external-parsed-entity with UTF-16BE Charset

   Content-type: application/xml-external-parsed-entity; charset="utf-16be"

   <?xml encoding="utf-16be"?>

Since the charset parameter is provided, MIME and XML processors must treat the enclosed entity as UTF-16BE encoded.

9.15 Application/xml-dtd

   Content-type: application/xml-dtd; charset="utf-8"

   <?xml encoding="utf-8"?>

Charset "utf-8" is a recommended charset value for use with application/xml-dtd. Since the charset parameter is provided, MIME and XML processors must treat the enclosed entity as UTF-8 encoded.

9.16 Application/mathml|xml

   Content-type: application/mathml|xml

   <?xml version="1.0" ?>

MathML documents are XML documents whose content describes mathematical information, as defined by [MathML]. As a format based on XML, MathML documents should use the '|xml' suffix convention in their MIME content-type identifier.

9.17 Application/xslt|xml

   Content-type: application/xslt|xml

   <?xml version="1.0" ?>

Extensible Stylesheet Language (XSLT) documents are XML documents whose content describes stylesheets for other XML documents, as defined by [XSLT]. As a format based on XML, XSLT documents should use the '|xml' suffix convention in their MIME content-type identifier.
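A generic XML application can recognize media types that follow the '|xml' suffix convention with a simple string test. The Python below is a minimal sketch, not a normative algorithm: the function name is_xml_media_type is hypothetical, and treating text/xml and application/xml as matches alongside the suffix is an assumption made for convenience.

```python
def is_xml_media_type(content_type: str) -> bool:
    """Report whether a Content-type value names an XML-based media type."""
    # Drop parameters such as charset; media type names are case-insensitive.
    media_type = content_type.split(";", 1)[0].strip().lower()
    top, sep, subtype = media_type.partition("/")
    if not sep or not top or not subtype:
        return False
    if media_type in ("text/xml", "application/xml"):
        return True
    # The suffix must follow a non-empty subtype name, as in "svg|xml".
    return len(subtype) > len("|xml") and subtype.endswith("|xml")
```

With this test, "application/mathml|xml" and "application/xslt|xml" both match, while "image/png" does not.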
9.18 Application/rdf|xml

   Content-type: application/rdf|xml

   <?xml version="1.0" ?>

RDF documents identified using this MIME type are XML documents whose content describes metadata, as defined by [RDF]. As a format based on XML, RDF documents should use the '|xml' suffix convention in their MIME content-type identifier.

9.19 Image/svg|xml

   Content-type: image/svg|xml

   <?xml version="1.0" ?>

Scalable Vector Graphics (SVG) documents are XML documents whose content describes graphical information, as defined by [SVG]. As a format based on XML, SVG documents should use the '|xml' suffix convention in their MIME content-type identifier.

10. Security Considerations

XML, as a subset of SGML, has all of the same security considerations as specified in [RFC1874], and likely more, due to its expected ubiquitous deployment.

To paraphrase section 3 of RFC 1874, XML MIME entities contain information to be parsed and processed by the recipient's XML system. These entities may contain and such systems may permit explicit system level commands to be executed while processing the data. To the extent that an XML system will execute arbitrary command strings, recipients of XML MIME entities may be at risk. In general, it may be possible to specify commands that perform unauthorized file operations or make changes to the display processor's environment that affect subsequent operations.

In general, any information stored outside of the direct control of the user -- including CSS style sheets, XSL transformations, entity declarations, and DTDs -- can be a source of insecurity, by either obvious or subtle means. For example, a tiny "whiteout attack" modification made to a "master" style sheet could make words in critical locations disappear in user documents, without directly modifying the user document or the stylesheet it references. Thus, the security of any XML document is vitally dependent on all of the documents recursively referenced by that document.
The entity lists and DTDs for XHTML 1.0[XHTML], for instance, are likely to be a commonly used set of information. Many developers will use and trust them, few of whom will know much about the level of security on the W3C's servers, or on any similarly trusted repository.

The simplest attack involves adding declarations that break validation. Adding extraneous declarations to a list of character entities can effectively "break the contract" used by documents. A tiny change that produces a fatal error in a DTD could halt XML processing on a large scale. Extraneous declarations are fairly obvious, but more sophisticated tricks, like changing attributes from being optional to required, can be difficult to track down. Perhaps the most dangerous option available to crackers is redefining default values for attributes: e.g., if developers have relied on defaulted attributes for security, a relatively small change might expose enormous quantities of information.

Apart from the structural possibilities, another option, "entity spoofing," can be used to insert text into documents, vandalizing and perhaps conveying an unintended message. Because XML 1.0 permits multiple entity declarations, and the first declaration takes precedence, it's possible to insert malicious content where an entity is used, such as by inserting the Communist Manifesto in every occurrence of &mdash;.

Use of the digital signatures work currently underway by the xmldsig working group may eventually ameliorate the dangers of referencing external documents not under one's own control.

Use of XML is expected to be varied, and widespread. XML is under scrutiny by a wide range of communities for use as a common syntax for community-specific metadata. For example, the Dublin Core[RFC2413] group is using XML for document metadata, and a new effort has begun which is considering use of XML for medical information. Other groups view XML as a mechanism for marshalling parameters for remote procedure calls. More uses of XML will undoubtedly arise.

Security considerations will vary by domain of use.
For example, XML medical records will have much more stringent privacy and security considerations than XML library metadata. Similarly, use of XML as a parameter marshalling syntax necessitates a case by case security review.

XML may also have some of the same security concerns as plain text. Like plain text, XML can contain escape sequences which, when displayed, have the potential to change the display processor environment in ways that adversely affect subsequent operations. Possible effects include, but are not limited to, locking the keyboard, changing display parameters so subsequent displayed text is unreadable, or even changing display parameters to deliberately obscure or distort subsequent displayed material so that its meaning is lost or altered. Display processors should either filter such material from displayed text or else make sure to reset all important settings after a given display operation is complete.

Some terminal devices have keys whose output, when pressed, can be changed by sending the display processor a character sequence. If this is possible the display of a text object containing such character sequences could reprogram keys to perform some illicit or dangerous action when the key is subsequently pressed by the user. In some cases not only can keys be programmed, they can be triggered remotely, making it possible for a text display operation to directly perform some unwanted action.
As such, the ability to program keys should be blocked either by filtering or by disabling the ability to program keys entirely.

Note that it is also possible to construct XML documents which make use of what XML terms "entity references" (using the XML meaning of the term "entity", which differs from the MIME definition of this term), to construct repeated expansions of text. Recursive expansions are prohibited by [XML] and XML processors are required to detect them. However, even non-recursive expansions may cause problems with the finite computing resources of computers, if they are performed many times.

References

[CSS] Bos, B., Lie, H.W., Lilley, C. and I. Jacobs, "Cascading Style Sheets, level 2 (CSS2) Specification", World Wide Web Consortium Recommendation REC-CSS2, May 1998, <http://www.w3.org/TR/REC-CSS2/>.

[IOTP] Burdett, D., "Internet Open Trading Protocol - IOTP Version 1.0", draft-ietf-trade-iotp-v1.0-protocol-07.txt (work in progress).

[MathML] Ion, P. and R. Miner, "Mathematical Markup Language (MathML) 1.01", World Wide Web Consortium Recommendation REC-MathML, July 1999, <http://www.w3.org/TR/REC-MathML/>.

[PNG] Boutell, T., "PNG (Portable Network Graphics) Specification", World Wide Web Consortium Recommendation REC-png, October 1996, <http://www.w3.org/TR/REC-png>.

[RDF] Lassila, O. and R.R.
Swick, "Resource Description Framework (RDF) Model and Syntax Specification", World Wide Web Consortium Recommendation REC-rdf-syntax, February 1999, <http://www.w3.org/TR/REC-rdf-syntax/>.

[RFC0821] Postel, J., "Simple Mail Transfer Protocol", RFC 821, August 1982.

[RFC0977] Kantor, B. and P. Lapsley, "Network News Transfer Protocol", RFC 977, February 1986.

[RFC1557] Choi, U., Chon, K. and H. Park, "Korean Character Encoding for Internet Messages", RFC 1557, December 1993.

[RFC1652] Klensin, J., Freed, N., Rose, M., Stefferud, E. and D. Crocker, "SMTP Service Extension for 8bit-MIMEtransport", RFC 1652, July 1994.

[RFC1874] Levinson, E., "SGML Media Types", RFC 1874, December 1995.

[RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, November 1996.

[RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types", RFC 2046, November 1996.

[RFC2048] Freed, N., Klensin, J. and J. Postel, "Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures", RFC 2048, November 1996.

[RFC2060] Crispin, M., "Internet Message Access Protocol - Version 4rev1", RFC 2060, December 1996.

[RFC2077] Nelson, S.D., Parks, C. and Mitra, "The Model Primary Content Type for
Multipurpose Internet Mail Extensions", RFC 2077, January 1997.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2130] Weider, C., Preston, C., Simonsen, K., Alvestrand, H., Atkinson, R., Crispin, M. and P. Svanberg, "The Report of the IAB Character Set Workshop held 29 February - 1 March, 1996", RFC 2130, April 1997.

[RFC2279] Yergeau, F., "UTF-8, a transformation format of ISO 10646", RFC 2279, January 1998.

[RFC2396] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource Identifiers (URI): Generic Syntax", RFC 2396, August 1998.

[RFC2413] Weibel, S., Kunze, J., Lagoze, C. and M. Wolf, "Dublin Core Metadata for Resource Discovery", RFC 2413, September 1998.

[RFC2445] Dawson, F. and D. Stenerson, "Internet Calendaring and Scheduling Core Object Specification (iCalendar)", RFC 2445, November 1998.

[RFC2518] Goland, Y., Whitehead, E., Faizi, A., Carter, S. and D. Jensen, "HTTP Extensions for Distributed Authoring -- WEBDAV", RFC 2518, February 1999.

[RFC2616] Fielding, R., Gettys, J., Mogul, J., Nielsen, H., Masinter, L., Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

[RFC2703] Klyne, G., "Protocol-independent Content Negotiation Framework", RFC 2703, September 1999.

[RFC2781] Hoffman, P. and F. Yergeau, "UTF-16, an
encoding of ISO 10646", RFC 2781, February 2000.

[SGML] International Standard Organization, "Information Processing -- Text and Office Systems -- Standard Generalized Markup Language (SGML)", ISO 8879, October 1986.

[SVG] Ferraiolo, J., "Scalable Vector Graphics (SVG)", World Wide Web Consortium Working Draft SVG, August 1999, <http://www.w3.org/TR/SVG>.

[UML] Object Management Group, "OMG Unified Modeling Language Specification, Version 1.3", OMG Specification ad/99-06-08, June 1999, <http://www.omg.org/uml/>.

[XBase] Marsh, J., "XML Base (XBase)", World Wide Web Consortium Working Draft xmlbase, February 2000, <http://www.w3.org/TR/xmlbase>.

[XHTML] Pemberton, S. et al., "XHTML 1.0: The Extensible HyperText Markup Language", World Wide Web Consortium Recommendation xhtml1, December 1999, <http://www.w3.org/TR/xhtml1>.

[XLink] DeRose, S., Maler, E., Orchard, D. and B. Trafford, "XML Linking Language (XLink)", World Wide Web Consortium Working Draft xlink, July 1999, <http://www.w3.org/TR/xlink/>.

[XML] Bray, T., Paoli, J. and C.M. Sperberg-McQueen, "Extensible Markup Language (XML) 1.0", World Wide Web Consortium Recommendation REC-xml, February 1998, <http://www.w3.org/TR/REC-xml>.

[XPtr] DeRose, S., Daniel Jr., R. and E.
Maler, "XML Pointer Language (XPointer)", World Wide Web Consortium Working Draft xptr, July 1999, <http://www.w3.org/TR/xptr>.

[XSLT] Clark, J., "XSL Transformations (XSLT) Version 1.0", World Wide Web Consortium Recommendation xslt, November 1999, <http://www.w3.org/TR/xslt>.

Authors' Addresses

   MURATA Makoto (FAMILY Given)
   Bridge Takatsu 201, 7-23, Sakado 3-chome, Takatsu-ku
   Kawasaki-shi, Kanagawa-ken 213-0012
   Japan
   Phone: +81-44-833-5233
   EMail: mura034@attglobal.net

   Simon St.Laurent
   1259 Dryden Road
   Ithaca, New York 14850
   USA
   EMail: simonstl@simonstl.com
   URI: http://www.simonstl.com/

   Daniel Kohn
   1445 120th Avenue NE
   Bellevue, Washington 98005
   USA
   Phone: +1-425-602-6222
   EMail: dan@dankohn.com
   URI: http://www.dankohn.com/

Appendix A. Why Use the '|xml' Suffix for XML-Based MIME Types?

Although the use of a suffix was not considered as part of the original MIME architecture, this choice is considered to provide the most functionality with the least potential for
interoperability problems or lack of future extensibility. The alternatives to the '|xml' suffix and the reason for its selection are described below.

A.1 Why not just use text/xml or application/xml and let the XML processor dispatch to the correct application based on the referenced DTD?

text/xml and application/xml remain useful in many situations, especially for document-oriented applications that involve combining XML with a stylesheet in order to present the data.

However, XML is also used to define entirely new data types, and an XML-based format such as image/svg|xml fits the definition of a MIME media type exactly as well as image/png[PNG] does. Although extra functionality is available for MIME processors
that are also XML processors, XML-based media types -- even when treated as opaque, non-XML media types -- are just as useful as any other media type and should be treated as such.

Since MIME dispatchers work off of the MIME type, use of text/xml or application/xml to label discrete media types will hinder correct dispatching and general interoperability. Finally, many XML documents use neither DTDs nor namespaces, yet are perfectly legal XML.

A.2 Why not create a new subtree (e.g., image/xml.svg) to represent XML MIME types?

The subtree under which a media type is registered -- IETF, vendor (*/vnd.*), or personal (*/prs.*); see [RFC2048] for details -- is completely orthogonal from whether the media type uses XML syntax or not. The suffix approach allows XML document types to be identified within any subtree. The vendor subtree, for example, is likely to include a large number of XML-based document types. By using a suffix, rather than setting up a separate subtree, those types may remain in the same location in the tree of MIME types that they would have occupied had they not been based on XML.

A.3 Why not create a new top-level MIME type for XML-based media types?

The top-level MIME type (e.g., model/*[RFC2077]) determines what kind of content the type is, not what syntax it uses.
For example, agents using image/* to signal acceptance of any image format should certainly be given access to media type image/svg|xml, which is in all respects a standard image subtype. It just happens to use XML to describe its syntax. The two aspects of the media type are completely orthogonal.

XML-based data types will most likely be registered in ALL top-level categories (e.g., application/mathml|xml[MathML], model/uml|xml[UML], image/svg|xml[SVG]).

A.4 Why not just have the MIME processor 'sniff' the content to determine whether it is XML?

Rather than explicitly labeling XML-based media types, the processor could look inside each type and see whether or not it is XML. The processor could also cache a list of XML-based media types.

Although this method might work acceptably for some mail applications, it would fail completely in many other uses of MIME. For instance, an XML-based web crawler would have no way of determining whether a file is XML except to fetch it and check. The same issue applies in some IMAP4[RFC2060] mail applications, where the client first fetches the MIME type as part of the message structure and then decides whether to fetch the MIME entity.
Requiring these fetches just to determine whether the MIME type is XML could have significant bandwidth and latency disadvantages in many situations.

Sniffing XML also isn't as simple as it might seem. DOCTYPE declarations aren't required, and they can appear fairly deep into a document under certain unpreventable circumstances. (E.g., the XML declaration, comments, and processing instructions can occupy space before the DOCTYPE declaration.) Even sniffing the DOCTYPE isn't completely reliable, thanks to a variety of issues involving default values for namespaces within external DTDs and overrides inside the internal DTD. Finally, the variety in potential character encodings (something XML provides tools to deal with), also makes reliable sniffing less likely.

A.5 Why not use a MIME parameter to specify that a media type uses XML syntax?

For example, one could use "Content-Type: application/iotp; alternate-type=text/xml" or "Content-Type: application/iotp; syntax=xml".

Section 5 of [RFC2045] says that "Parameters are modifiers of the media subtype, and as such do not fundamentally affect the nature of the content". However, all XML-based media types are by their nature always XML. Parameters, as they have been defined in the MIME architecture, are never invariant across all instantiations of a media type.

More practically, very few if any MIME
dispatchers and other MIME agents support dispatching off of a parameter. While MIME agents on the receiving side will need to be updated in either case to support (or fall back to) generic XML processing, it has been suggested that it is easier to implement this functionality when acting off of the media type rather than a parameter. More important, sending agents require no update to properly tag an image as "image/svg|xml", but few if any sending agents currently support always tagging certain content types with a parameter.

A.6 How about labeling with parameters in the other direction (e.g., application/xml; Content-Feature=iotp)?

This proposal fails under the simplest case, of a user with neither knowledge of XML nor an XML-capable MIME dispatcher. In that case, the user's MIME dispatcher is likely to dispatch the content to an XML processing application when the correct default behavior should be to dispatch the content to the application responsible for the content type (e.g., an ecommerce engine for [IOTP]).

Note that even if the user had already installed the appropriate application (e.g., the ecommerce engine), and that installation had updated the MIME registry, many operating system level MIME registries such as .mailcap in Unix and HKEY_CLASSES_ROOT in Windows do not currently support dispatching off a
parameter, and cannot easily be upgraded to do so. And, even if the operating system were upgraded to support this, each MIME dispatcher would also separately need to be upgraded.

A.7 How about a new superclass MIME parameter that is defined to apply to all MIME types (e.g., Content-Type: application/iotp; $superclass=xml)?

This combines the problems of Appendix A.5 and Appendix A.6.

If the sender attaches an image/svg|xml file to a message and includes the instructions "Please copy the French text on the road sign", someone with an XML-aware MIME client and an XML browser but no support for SVG can still probably open the file and copy the text. By contrast, with superclasses, the sender must add superclass support to her existing mailer AND the receiver must add superclass support to his before this transaction can work correctly.

If the receiver comes to rely on the superclass tag being present and applications are deployed relying on that tag (as always seems to happen), then only upgraded senders will be able to interoperate with those receiving applications.
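The receiving-side fallback discussed in A.5 through A.7 can be sketched as follows. This is only an illustration under stated assumptions: the draft does not define a dispatch algorithm, and the names media_type_of, choose_handler, registry, and generic_xml_handler are hypothetical. The sketch dispatches on the media type alone, ignores parameters, and falls back to a generic XML handler when no type-specific handler is registered.

```python
# Hypothetical sketch: dispatch on the media type only, never on parameters.
GENERIC_XML_TYPES = {"text/xml", "application/xml"}


def media_type_of(content_type):
    """Return the bare, lowercased media type from a Content-Type value."""
    return content_type.partition(";")[0].strip().lower()


def choose_handler(content_type, registry, generic_xml_handler=None):
    """Pick a handler for content_type.

    registry maps media types to handlers (an assumed, simplified stand-in
    for something like .mailcap). A type-specific handler always wins; an
    XML-aware agent may fall back to generic XML processing for any type
    carrying the '|xml' suffix.
    """
    media_type = media_type_of(content_type)
    if media_type in registry:
        return registry[media_type]
    is_xml = media_type in GENERIC_XML_TYPES or media_type.endswith("|xml")
    if is_xml and generic_xml_handler is not None:
        return generic_xml_handler
    return None
```

With a registry that knows only application/iotp|xml, an image/svg|xml entity would still reach the generic XML handler on an XML-aware agent, while an XML-unaware agent (no generic handler) simply treats the type as unknown.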
A.8 What about adding a new parameter to the Content-Disposition header or creating a new Content-Structure header to indicate XML syntax?

This has nearly identical problems to Appendix A.7, in that it requires both senders and receivers to be upgraded, and few if any operating systems and MIME dispatchers support working off of anything other than the MIME type.

A.9 How about a new Alternative-Content-Type header?

This is better than Appendix A.8, in that no extra functionality needs to be added to a MIME registry to support dispatching of information other than standard content types. However, it still requires both sender and receiver to be upgraded, and it will also fail in many cases (e.g., web hosting to an outsourced server), where the user can set MIME types (often through implicit mapping to file extensions), but has no way of adding arbitrary HTTP headers.

A.10 How about using a conneg tag instead (e.g., accept-features: (syntax=xml))?

When the conneg protocol is fully defined, this may potentially be a reasonable thing to do. But given the limited current state of conneg [RFC2703] development, it is not a credible replacement for a MIME-based solution.

Also, note that adding a content-type parameter doesn't work with conneg either, since conneg only deals with media types, not their parameters. This is another illustration of the limits of parameters for MIME
dispatchers.

A.11 How about a third-level content-type, such as text/xml/rdf?

MIME explicitly defines two levels of content type, the top-level for the kind of content and the second-level for the specific media type. [RFC2048] extends this in an interoperable way by using prefixes to specify separate trees for IETF, vendor, and personal registrations. This specification also extends the two-level type by using the '|xml' suffix. In both cases, processors that are unaware of these later specifications treat them as opaque and continue to interoperate. By contrast, adding a third-level type would break the current MIME architecture and
cause numerous interoperability failures.

A.12 What is the semantic difference between application/foo and application/foo|xml?

MIME processors that are unaware of XML will treat the '|xml' suffix as completely opaque, so it is essential that no extra semantics be assigned to its presence. Therefore, application/foo and application/foo|xml SHOULD be treated as completely independent media types. Although, for example, text/calendar|xml could be an XML version of text/calendar [RFC2445], it is possible that this (hypothetical) new media type would include new semantics as well as new syntax, and
in any case, there would be many applications that support text/calendar but had not yet been upgraded to support text/calendar|xml.

A.13 What happens when an even better markup language (e.g., EBML) is defined, or a new category of data?

In the ten years that MIME has existed, XML is the first generic data format that has seemed to justify special treatment, so it is hoped that no further suffixes will be necessary. However, if some are later defined, and these documents were also XML, they would need to specify that the '|xml' suffix is always the outermost suffix (e.g., application/foo|ebml|xml not application/foo|xml|ebml). If they were not XML, then they would use a regular suffix (e.g., application/foo|ebml).

A.14 Why must I use the '|xml' suffix for my new XML-based media type?

You don't have to, but unless you have a good reason to explicitly disallow generic XML processing, you should use the suffix so as not to curtail the options of future users and developers.
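The outermost-suffix rule described in A.13 can be illustrated with a short Python sketch. The function names suffixes and is_xml_based are hypothetical and not part of this specification; the sketch assumes a bare media type with no parameters and simply treats the last '|'-separated suffix as the outermost one.

```python
def suffixes(media_type):
    """Return the list of '|'-separated suffixes on the subtype, in order."""
    _, _, subtype = media_type.lower().partition("/")
    return subtype.split("|")[1:]


def is_xml_based(media_type):
    """True when the outermost (last) suffix of the subtype is 'xml'."""
    found = suffixes(media_type)
    return bool(found) and found[-1] == "xml"
```

Under this reading, application/foo|ebml|xml is recognized as XML-based, while application/foo|xml|ebml and application/foo|ebml are not, which matches the ordering the text calls for.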
Whether the inventors of a media type, today, design it for dispatch to generic XML processing machinery (and most won't) is not the critical issue. The core notion is that the knowledge that some media type happens to use XML syntax opens the door to unanticipated kinds of processing beyond those envisioned by its inventors, and on this basis identifying such encoding is a good and useful thing.

Developers of new media types are often tightly focused on a particular type of processing that meets current needs. But there is no need to rule out generic processing as well, which could make your media type more valuable over time. It is believed that registering with the '|xml' suffix will cause no interoperability problems whatsoever, while it may enable significant new functionality and interoperability now and
in the future. So, the conservative approach is to include the '|xml' suffix.

Authors' Addresses

MURATA Makoto (FAMILY Given)
Fuji Xerox Information Systems
KSP 9A7, 2-1, Sakado 3-chome, Takatsu-ku
Kawasaki-shi, Kanagawa-ken 213-0012
Japan
Phone: +81-44-812-7230
Fax: +81-44-812-7231
EMail: mura034@attglobal.net
URI: http://www.fxis.co.jp/DMS/sgml/

Simon St.Laurent
126 Birchwood Drive #2
Ithaca, New York 14850
US
EMail: simonstl@simonstl.com
URI: http://www.simonstl.com/

Appendix B. Acknowledgements

Chris Newman and Yaron Y. Goland both contributed content to the security considerations section of this document. In particular, some text in the security considerations section is copied verbatim from work in progress, draft-newman-mime-textpara-00, by permission of the author. Chris Newman additionally contributed content to the encoding considerations sections. Dan Connolly contributed content discussing when to use text/xml. Discussions with Ned Freed and Dan Connolly helped refine the authors' understanding of the text media type; feedback from Larry Masinter was also very helpful in understanding media type registration issues.

Members of the W3C XML Working Group and XML Special Interest group have made significant contributions to this document, and the authors would like to specially recognize James Clark, Martin Duerst, Rick Jelliffe, and Gavin Nicol for their many thoughtful comments.

Ned Freed was particularly helpful in evaluating how the '|xml' suffix interacted with the MIME architecture as compared to alternative proposals.
David Megginson's presentation to XTech 2000 was instrumental in raising XML security concerns within the XML community. The description of the problems of relying on external documents is based on his examples.

Appendix C. Revision History

[To be deleted before publication.]

draft-murata-00: Application/xml-dtd, a naming convention (*/*|xml), and examples (application/mathml|xml, application/xslt|xml, application/rdf|xml, and image/svg|xml) are added.

draft-murata-01: When text/xml is more appropriate than application/xml and vice versa.

draft-murata-02: Replaced "(e.g., ESMTP, 8BITMIME, or NNTP)" with "(e.g., 8BITMIME ESMTP or NNTP)"; transcoding without revising encoding declarations is mentioned; the choice of "us-ascii" as the default is explained.

draft-murata-03: fragment identifiers for text/xml and application/xml are escaped XPointers (section 6); the base URI may be embedded in text/xml, application/xml, text/xml-external-parsed-entity, or application/xml-external-parsed-entity (section 7); utf-16le and utf-16be are mentioned (sections 5 and 9); appendix comparing alternatives to '|xml' suffix added; added security concerns; lots of minor editing.

Full Copyright Statement

Copyright (C) The Internet Society (2000). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works.
However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

Funding for the RFC editor function is currently provided by the Internet Society.