Dublin Core Workshop Series                                     S. Weibel
Internet-Draft                                                   J. Kunze
draft-kunze-dc-02.txt                                           C. Lagoze
10 February 1998
Expires in six months


          Dublin Core Metadata for Simple Resource Discovery


1. Status of this Document

This document is an Internet-Draft.  Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups.  Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as ``work in progress.''

To learn the current status of any Internet-Draft, please check the
``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).

Distribution of this document is unlimited.  Please send comments
to weibel@oclc.org, or to the discussion list meta2@mrrl.lut.ac.uk.


2. Introduction

Finding relevant information on the World Wide Web has become
increasingly problematic in proportion to the explosive growth of
networked resources.  Current Web indexing evolved rapidly to fill the
demand for resource discovery tools, but that indexing, while useful,
is a poor substitute for richer varieties of resource description.

An invitational workshop held in March of 1995 brought together
librarians, digital library researchers, and text-markup specialists
to address the problem of resource discovery for networked resources.
This activity evolved into a series of related workshops and ancillary
activities that have become known collectively as the Dublin Core Metadata
Workshop Series.  

The goals that motivate the Dublin Core effort are:

    - Simplicity of creation and maintenance
    - Commonly understood semantics
    - International scope and applicability
    - Extensibility
    - Interoperability among collections and indexing systems

These requirements work at cross purposes to some degree, but all are
desirable goals.  Much of the effort of the Workshop Series has been
directed at minimizing the tensions among these goals.

One of the primary deliverables of this effort is a set of elements
that are judged by the collective participants of these workshops
to be the core elements for cross-disciplinary resource discovery.
The term ``Dublin Core'' applies to this core of descriptive elements.

Early experience with Dublin Core deployment has made clear the need
to support additional qualification of elements for some applications.
Thus, Dublin Core elements may be expressed in simple unqualified ways
that minimal discovery and retrieval tools can use, or they may be
expressed with additional structure to support semantics-sharpening
qualifiers that minimal tools can safely ignore but that more complex
tools can employ to increase discovery precision.

The broad agreements about syntax and semantics that have emerged from
the workshop series will be expressed in a series of five Informational
RFCs, of which this document is the first.  These RFCs (currently they
are Internet-Drafts) will comprise the following documents.

2.1. Dublin Core Metadata for Simple Resource Discovery

An introduction to the Dublin Core and a description of the intended semantics
of the 15-element Dublin Core element set without qualifiers.
This is the present document.

2.2. Encoding Dublin Core Metadata in HTML  

A formal description of the convention for embedding unqualified Dublin
Core metadata in an HTML file.

2.3. Qualified Dublin Core Metadata for Simple Resource Discovery

The principles of element qualification and the semantics of Dublin Core
metadata when expressed with a recommended qualifier set known as the
Canberra Qualifiers.

2.4. Encoding Qualified Dublin Core Metadata in HTML 

A formal description of the convention for embedding qualified Dublin
Core metadata in an HTML file.

2.5. Dublin Core on the Web:  RDF Compliance and DC Extensions

A formal description for encoding Dublin Core metadata with qualifiers
in RDF (Resource Description Framework) [1] compliant metadata, and how
to extend the core element set.


3. Description of Dublin Core Elements  

The following is the reference definition of the Dublin Core Metadata
Element Set.  It is expected that practice will evolve to include
qualifiers for certain of the elements.  The evolving reference
description, including any defined qualifiers, resides at [2]:

        http://purl.org/metadata/dublin_core

In the element descriptions below, each element has a descriptive name
intended to convey a common semantic understanding of the element, as well
as a formal single-word label intended to make the syntactic specification
of elements simpler for encoding schemes.

Although some environments, such as HTML, are not case-sensitive, it is
recommended best practice always to adhere to the case conventions in the
element labels given below to avoid conflicts in the event that the
metadata is subsequently extracted or converted to a case-sensitive
environment, such as XML (Extensible Markup Language) [3].
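
The case recommendation above can be illustrated with a minimal Python
sketch.  It assumes the mixed-case labels given in the element
descriptions below; the names CANONICAL_LABELS and canonical_label are
hypothetical helpers introduced only for this illustration and are not
part of the element set.

```python
# The fifteen element labels, in the case used by the element descriptions.
CANONICAL_LABELS = (
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
)

# Lookup table keyed by the lowercased label (hypothetical helper).
_BY_LOWER = {label.lower(): label for label in CANONICAL_LABELS}


def canonical_label(label):
    """Return the label in its recommended case, or raise ValueError."""
    try:
        return _BY_LOWER[label.strip().lower()]
    except KeyError:
        raise ValueError("not a Dublin Core element label: %r" % label)
```

A tool that extracts metadata from a case-insensitive environment could
pass each label through such a function before writing it to a
case-sensitive one.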

Each element is optional and repeatable.  Furthermore, metadata elements may
appear in any order, and with no significance being attached to that order.
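
One way to picture these properties is the following Python sketch.
The representation of a record as a list of (label, value) pairs, the
sample values, and the helper names values and same_record are
assumptions made for illustration; this document does not prescribe any
data structure.

```python
from collections import Counter

# A record as a list of (label, value) pairs (an assumed representation).
# Creator is repeated; most other elements are simply absent.
record_a = [
    ("Title", "Dublin Core Metadata for Simple Resource Discovery"),
    ("Creator", "S. Weibel"),
    ("Creator", "J. Kunze"),
    ("Creator", "C. Lagoze"),
    ("Date", "1998-02-10"),
]

# The same statements in a different order.
record_b = list(reversed(record_a))


def values(record, label):
    """Return every value recorded for a label (possibly none)."""
    return [value for (name, value) in record if name == label]


def same_record(x, y):
    """Compare two records as multisets, so that order is ignored."""
    return Counter(x) == Counter(y)
```

Because each element is optional, a lookup such as
values(record_a, "Rights") yields an empty list rather than an error.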

To promote global interoperability, a number of the element descriptions
suggest a controlled vocabulary for the respective element values.  It is
assumed that other controlled vocabularies will be developed for
interoperability within certain local domains.

A metadata element's meaning is unaffected by whether or not the element
is embedded in the resource that it describes.

The metadata elements fall into three groups which roughly indicate the
class or scope of information stored in them: (1) elements related mainly
to the Content of the resource, (2) elements related mainly to the
resource when viewed as Intellectual Property, and (3) elements related
mainly to the Instantiation of the resource.

        Content          Intellectual Property       Instantiation
        -----------      ---------------------       -------------
        Title                 Creator                  Date
        Subject               Publisher                Type
        Description           Contributor              Format
        Source                Rights                   Identifier
        Language
        Relation
        Coverage
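
The grouping in the table can also be written as a lookup structure, as
in this Python sketch.  The names GROUPS and GROUP_OF are hypothetical;
the grouping itself is copied from the table and is only a rough
indication of scope.

```python
# Element labels by group, copied from the table above.
GROUPS = {
    "Content": (
        "Title", "Subject", "Description", "Source",
        "Language", "Relation", "Coverage",
    ),
    "Intellectual Property": (
        "Creator", "Publisher", "Contributor", "Rights",
    ),
    "Instantiation": (
        "Date", "Type", "Format", "Identifier",
    ),
}

# Inverse mapping from element label to group name.
GROUP_OF = {
    label: group
    for group, labels in GROUPS.items()
    for label in labels
}
```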


3.1.  Title                             Label: "Title"

     The name given to the resource, usually by the Creator or Publisher.

3.2.  Author or Creator                 Label: "Creator"

     The person or organization primarily responsible for creating
     the intellectual content of the resource.  Examples include authors
     in the case of written documents and artists, photographers, or
     illustrators in the case of visual resources.

3.3.  Subject and Keywords              Label: "Subject"

     The topic of the resource.  Typically, subject will be expressed
     as keywords or phrases that describe the subject or content of the
     resource.  The use of controlled vocabularies and formal
     classification schemes is encouraged.

3.4.  Description                       Label: "Description"

     A textual description of the content of the resource, including
     abstracts in the case of document-like objects or content
     descriptions in the case of visual resources. 

3.5.  Publisher                         Label: "Publisher"

     The entity responsible for making the resource available in its
     present form, such as a publishing house, a university department,
     or a corporate entity.   

3.6.  Other Contributor                 Label: "Contributor"

     A person or organization not specified in a Creator element who
     has made significant intellectual contributions to the resource
     but whose contribution is secondary to any person or organization
     specified in a Creator element (for example, editor, transcriber,
     and illustrator).

3.7.  Date                              Label: "Date"

     A date associated with the creation or availability of the resource.
     Such a date is not to be confused with one belonging in the Coverage
     element, which would be associated with the resource only insofar as
     the intellectual content is somehow about that date.  Recommended best
     practice is defined in a profile of ISO 8601 [4] that includes (among
     others) dates of the forms YYYY and YYYY-MM-DD.  In this scheme, for
     example, the date 1994-11-05 corresponds to November 5, 1994.  Many
     other schemes are possible, but if used, they should be identified in
     an unambiguous manner.
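
     As a rough illustration of the two date forms named above, the
     following Python sketch accepts only YYYY and YYYY-MM-DD.  The
     function name parse_dc_date and its return shape are assumptions
     made for this example; the profile includes other forms that this
     sketch does not attempt to handle.

```python
import re
from datetime import date


def parse_dc_date(text):
    """Parse 'YYYY' or 'YYYY-MM-DD' into a (year, month, day) tuple.

    Month and day are None for the year-only form.  Any other input
    raises ValueError.  Only these two forms are handled here.
    """
    if re.fullmatch(r"[0-9]{4}", text):
        return (int(text), None, None)
    match = re.fullmatch(r"([0-9]{4})-([0-9]{2})-([0-9]{2})", text)
    if match is None:
        raise ValueError("unsupported date form: %r" % text)
    year, month, day = (int(group) for group in match.groups())
    # Constructing a date rejects impossible values such as 1994-02-30.
    date(year, month, day)
    return (year, month, day)
```

     With this sketch, the value 1994-11-05 from the example above
     parses to the year 1994, month 11, and day 5.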

3.8.  Resource Type                     Label: "Type"

     The category of the resource, such as home page, novel, poem,
     working paper, technical report, essay, dictionary.  For the sake
     of interoperability, Type should be selected from an enumerated
     list that is currently under development in the workshop series.

3.9.  Format                            Label: "Format"

     The data format of the resource, used to identify the software
     and possibly hardware that might be needed to display or operate
     the resource.  For the sake of interoperability, Format should be
     selected from an enumerated list that is currently under development
     in the workshop series.

3.10. Resource Identifier               Label: "Identifier"

     A string or number used to uniquely identify the resource.  Examples
     for networked resources include URLs and URNs (when implemented).
     Other globally-unique identifiers, such as International Standard
     Book Numbers (ISBN) or other formal names, are also candidates
     for this element.

3.11. Source                            Label: "Source"

     Information about a second resource from which the present resource
     is derived.  While it is generally recommended that elements contain
     information about the present resource only, this element may contain
     a date, creator, format, identifier, or other metadata for the second
     resource when it is considered important for discovery of the present
     resource; recommended best practice is to use the Relation element
     instead.  For example, it is possible to use a Source date of 1603 in
     a description of a 1996 film adaptation of a Shakespearean play, but it
     is preferred instead to use Relation "IsBasedOn" with a reference to a
     separate resource whose description contains a Date of 1603.  Source
     is not applicable if the present resource is in its original form.

3.12. Language                          Label: "Language"

     The language of the intellectual content of the resource.
     Where practical, the content of this field should coincide with
     RFC 1766 [5]; examples include en, de, es, fi, fr, ja, th, and zh.
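
     A Language value can be checked against the tag syntax of RFC 1766
     (a primary tag and optional subtags, each of one to eight letters,
     separated by hyphens), as in the Python sketch below.  The function
     name is_rfc1766_tag is hypothetical, and the check covers syntax
     only; it does not establish that a tag is a registered or
     meaningful language code.

```python
import re

# RFC 1766 syntax: Primary-tag *( "-" Subtag ), each part 1 to 8 letters.
_TAG = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z]{1,8})*")


def is_rfc1766_tag(text):
    """Return True if text is syntactically an RFC 1766 language tag."""
    return _TAG.fullmatch(text) is not None
```

     All of the example codes listed above (en, de, es, fi, fr, ja, th,
     and zh) pass this check, while a free-text value such as
     "English language" does not.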

3.13. Relation                          Label: "Relation"

     An identifier of a second resource and its relationship to the present
     resource.  This element permits links between related resources and
     resource descriptions to be indicated.  Examples include an edition of
     a work (IsVersionOf), a translation of a work (IsBasedOn), a chapter
     of a book (IsPartOf), and a mechanical transformation of a dataset into
     an image (IsFormatOf).  For the sake of interoperability, relationships
     should be selected from an enumerated list that is currently under
     development in the workshop series.

3.14. Coverage                          Label: "Coverage"

     The spatial or temporal characteristics of the intellectual content
     of the resource.  Spatial coverage refers to a physical region (e.g.,
     celestial sector); use coordinates (e.g., longitude and latitude) or
     place names that are from a controlled list or are fully spelled out.
     Temporal coverage refers to what the resource is about rather than
     when it was created or made available (the latter belonging in the
     Date element); use the same date/time format (often a range) [4] as
     recommended for the Date element or time periods that are from a
     controlled list or are fully spelled out.

3.15. Rights Management                 Label: "Rights"

     A rights management statement, an identifier that links to a rights
     management statement, or an identifier that links to a service
     providing information about rights management for the resource.


4. Security Considerations

The Dublin Core element set poses no risk to computers and networks.
It poses minimal risk to searchers who obtain incorrect or private
information due to careless mapping from rich data descriptions to the
simple Dublin Core scheme.  No other security concerns are likely
to be raised by the element description consensus documented here.


5. References

   [1] Resource Description Framework (RDF) Model and Syntax,
       http://www.w3.org/TR/WD-rdf-syntax
       
   [2] Dublin Core Metadata Element Set: Reference Description,
       http://purl.org/metadata/dublin_core

   [3] Extensible Markup Language (XML),
       http://www.w3.org/TR/PR-xml

   [4] Date and Time Formats (based on ISO 8601), W3C Technical Note,
       http://www.w3.org/TR/NOTE-datetime

   [5] RFC 1766, Tags for the Identification of Languages,
       http://ds.internic.net/rfc/rfc1766.txt


6. Authors' Addresses

Stuart L. Weibel
OCLC Online Computer Library Center, Inc.
Office of Research
6565 Frantz Rd.
Dublin, Ohio, 43017, USA
Email: weibel@oclc.org
Voice: +1 614-764-6081
Fax:   +1 614-764-2344

John A. Kunze
Center for Knowledge Management
University of California, San Francisco
530 Parnassus Ave, Box 0840
San Francisco, CA  94143-0840, USA
Email: jak@ckm.ucsf.edu
Voice: +1 415-502-6660
Fax:   +1 415-476-4653

Carl Lagoze
Digital Library Research Group
Department of Computer Science
Cornell University
Ithaca, NY  14853, USA
Email: lagoze@cs.cornell.edu
Voice: +1-607-255-6046
Fax:   +1-607-255-4428

