Metadata in data archives

   Main points

When a data set is added to a data archive (ingest), it is checked to verify that it has been described adequately. The key question is: does a (future) user have sufficient information to understand what the data set contains and what it means?

Both the creator and the data manager can assign so-called metadata. Assigning metadata makes it easier to find, recognise and (re)use a source of information, or to link it to other sources of information.
(See also the section Data documentation and metadata.)

Various disciplines apply their own metadata schemes and standards (see box). Which metadata fields are mandatory or desirable may vary per file. Both DANS and 4TU.Centre for Research Data use Dublin Core, maintained by the Dublin Core Metadata Initiative (DCMI)(1), as the standard for metadata. Dublin Core is easy to use and is used worldwide. As a result, metadata can easily be linked to other files and searched automatically, which increases the visibility of the data. The underlying data itself, however, cannot be searched.

The metadata in 4TU.Centre for Research Data are available in RDF (Resource Description Framework) format. RDF is a general standard that makes it easy to connect data from several sources. It allows existing metadata schemes such as Dublin Core to be used and combined with other metadata schemes. Dublin Core supplies the actual metadata fields; RDF is used to make the connections.
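The division of labour between Dublin Core (the fields) and RDF (the connections) can be sketched in a few lines of Python. This is only an illustration using the standard library: the function name dublin_core_to_rdf, the example.org address and the field values are hypothetical, and the output is a generic RDF/XML record, not necessarily the exact serialisation that 4TU.Centre for Research Data publishes.

```python
import xml.etree.ElementTree as ET

# Namespace URIs of RDF and of the Dublin Core Metadata Element Set 1.1.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to write the conventional prefixes instead of ns0, ns1, ...
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def dublin_core_to_rdf(subject_uri, fields):
    """Serialise (element name, value) pairs as an RDF/XML description.

    Hypothetical helper for illustration; `fields` is a list of pairs so
    that an element such as `creator` may occur more than once.
    """
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root,
        f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": subject_uri},  # the resource being described
    )
    for name, value in fields:
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Made-up example record; none of these values come from a real archive.
record = [
    ("title", "Example data set"),
    ("creator", "Researcher, A."),
    ("creator", "Researcher, B."),
    ("date", "2015-01-01"),
    ("description", "Illustrative record only."),
]

rdf_xml = dublin_core_to_rdf("https://example.org/dataset/42", record)
print(rdf_xml)
```

Elements from another metadata scheme could be added to the same description by registering one more namespace, which is the kind of combination RDF is meant to allow.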

Researchers who plan to deposit their data in an archive are advised to define the metadata fields at an early stage. This saves them from having to add the documentation or metadata afterwards.

About metadata schemes and metadata standards

A metadata scheme is a set of individual metadata elements that you can use to describe data. Most schemes are developed and endorsed by particular communities. Each metadata element in a metadata scheme is assigned a name and a meaning. An example of a community-developed scheme is the Data Documentation Initiative(2) (DDI), an international standard for describing data from the social, behavioural and economic sciences.

If a standardisation body such as ISO(3) approves a metadata scheme, it is called a metadata standard. An example of a metadata standard is the Dublin Core Metadata Element Set(4), also known as ISO 15836:2009.(5)

     An in-depth look

  • There are many kinds of metadata schemes and standards, depending on the research community, its objective, the function and the field. The following illustration(6) gives a good impression of this diversity.

  • The UK-based Digital Curation Centre (DCC) provides a good overview(7) of the schemes and standards used in a number of disciplines.

Mandatory metadata fields for DANS and 4TU.Centre for Research Data

DANS            4TU.Centre for Research Data    Meaning
Creator         Creator                         The most important researchers involved in creating the data
Title           Title                           Name or title of the data set
Date created    Date created
Description     Description
Audience        -                               The audience for whom the data is interesting, described in terms of research fields
-               Publication year

These are only the most important mandatory metadata fields. The more fields are filled in, the easier the data set is to find.
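A check for missing mandatory fields before deposit might look like the sketch below. It assumes a record is a plain Python dictionary and uses only the four fields that the table lists for both archives; the field keys, the function name missing_fields and the sample record are hypothetical and do not reflect either archive's actual deposit interface.

```python
# Fields listed as mandatory for both DANS and 4TU.Centre for Research Data
# in the table above. The dictionary keys are an assumption made here.
REQUIRED_FIELDS = ("creator", "title", "date_created", "description")


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required field names that are absent or blank in `record`."""
    missing = []
    for name in required:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Made-up draft record with one blank and one absent field.
draft = {
    "creator": "Researcher, A.",
    "title": "Example data set",
    "description": "   ",
}

print(missing_fields(draft))
```

Running such a check while the data are still being collected fits the advice to define the metadata fields at an early stage.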

Another metadata element is the identifier. An identifier is generally a number or code linked to a data object. It should preferably be unique and persistent, so that the data set remains findable in the long term.

Some examples of identifiers:

  • ISBN - International Standard Book Number.
  • DOI - Digital Object Identifier, used worldwide for publications such as journal articles (via CrossRef(8)) and in recent years also for data sets (via DataCite(9)).
  • URN - Uniform Resource Name, a unique and persistent identifier.
  • URL - Uniform Resource Locator or web address (although persistence isn't always guaranteed).

Also see Persistent identifiers.
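The difference between a persistent identifier and an ordinary web address can be made concrete with a DOI. A DOI is turned into a link by prefixing it with the doi.org resolver, which then redirects to wherever the object currently lives. The sketch below assumes a simplified pattern ("10.", a numeric prefix, a slash, a suffix) that is a common heuristic and not the full DOI syntax; the function name doi_to_url and the DOI shown are hypothetical.

```python
import re

# Heuristic shape check only: "10.", 4 to 9 digits, "/", then a suffix
# without whitespace. This is a simplification, not the complete DOI syntax.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Common ways a DOI is written in references, all in lower case.
KNOWN_PREFIXES = ("https://doi.org/", "http://dx.doi.org/", "doi:")


def doi_to_url(identifier):
    """Normalise a DOI string and return its resolver URL."""
    doi = identifier.strip()
    for prefix in KNOWN_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"does not look like a DOI: {identifier!r}")
    return "https://doi.org/" + doi


# Made-up DOI, for illustration only.
print(doi_to_url("doi:10.1234/example-dataset"))
```

Citing the resolver link rather than the landing page's own URL means the reference keeps working if the archive later moves the data set.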

 

 

  Sources

  1. Dublin Core Metadata Initiative. Retrieved from dublincore.org
  2. Data Documentation Initiative. Retrieved from http://www.ddialliance.org/
  3. ISO. Retrieved from www.iso.org/iso/home.html
  4. Dublin Core Metadata Initiative. Dublin Core Metadata Element Set, Version 1.1. Retrieved from dublincore.org/documents/dces/
  5. ISO. ISO 15836:2009, Information and documentation - The Dublin Core metadata element set. Retrieved from http://www.iso.org/iso/catalogue_detail.htm?csnumber=52142
  6. Bargmeyer, B.; Gillman, D. (2000). Metadata standards and metadata registries: An overview. Retrieved from http://stats.bls.gov/ore/pdf/st000010.pdf
  7. DCC. Disciplinary metadata. Retrieved from www.dcc.ac.uk/resources/metadata-standards
  8. CrossRef. Retrieved from www.crossref.org
  9. DataCite. Retrieved from https://www.datacite.org/
