
Data documentation and metadata


"Scientific metadata provide the information necessary for investigators separated by time, space, institution or disciplinary norm to establish common ground." - Christine Borgman et al. (1)

   Main points

Data documentation describes the characteristics of a dataset. It takes place at various levels, such as:

  • A description of the process a researcher uses to collect data. This documentation is kept in, for instance, a codebook, lab journal, log or diary.
  • A description of the data itself (how much, what data format, what software to use to read the data).
  • A description of the changes to the dataset over time. This is used to create a historical record of all uses and edits of the research data over a period of time. In data jargon this is called data provenance. Making such a record also requires a description of the data collection process and of the data itself.

Proper data documentation ensures that research data are traceable and can be understood and used unambiguously by current and future users (including the researcher).

Due to the great diversity of datasets, the choices for documenting the data are not always obvious.

It is useful to know that metadata can sometimes be derived from the data itself. Certain data formats, such as digital photos, embed metadata in the file. When you store a photo, details about the circumstances in which you took the picture are stored automatically: aperture, lighting, etc.
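As a rough illustration of deriving metadata from the data itself, the sketch below collects what the file system and the file name already provide. It is a minimal sketch using only the Python standard library. The function name `derive_file_metadata`, the field names and the file `measurements.csv` are assumptions made for this example. Reading the embedded metadata of a photo would need an image library and is not shown here.

```python
import mimetypes
import os
import tempfile
from datetime import datetime, timezone


def derive_file_metadata(path):
    """Collect metadata that the file system and file name already provide."""
    info = os.stat(path)
    # guess_type looks only at the file extension, not at the file content.
    mime_type, _ = mimetypes.guess_type(path)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "file_name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified_utc": modified.isoformat(),
        "format": mime_type or "unknown",
    }


def demo():
    """Write a small hypothetical data file and derive its metadata."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "measurements.csv")
        with open(path, "w", encoding="utf-8", newline="") as handle:
            handle.write("sample,value\n1,0.5\n")
        return derive_file_metadata(path)


if __name__ == "__main__":
    print(demo())
```

Metadata derived this way covers only part of the description of the data itself (how much, what format). The data collection process still has to be documented by the researcher.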

The function of data documentation depends on the phase of the research lifecycle it is in. It is important for data archives to strive for a certain (international) standard for their documentation of data to be able to tie in with other archives. This will be discussed under Metadata for data archives.

"We don't know when data is metadata or just data. Metadata is data that is used to describe other data, so the usage turns it into metadata." - Bargmeyer and Gillman(2)

Metadata types 

Metadata are often called data about data, or information about information. There are metadata to describe content (descriptive metadata) and metadata to interpret the context (date of creation, instruments, etc.).

Without contextual metadata some data would appear to be no more than a random series of numbers, images or words. Without descriptive metadata it would be impossible to find relevant data in a data archive (also see Metadata for data archives).

The most common types of metadata:

| Type of metadata | Goal | Example |
|---|---|---|
| Descriptive metadata | The minimal metadata required to find a digital object. If additional contextual metadata are available, a user will have a better idea of how to use the data. | Author, title, abstract, date. Contextual metadata are, for example, location, time, data collection method (tools). |
| Structural metadata | These link the individual objects that make up a unit. | Links to related digital objects (e.g. the article written based on the linked research data). |
| Technical metadata | Information on the technical aspects of the dataset. | Data format, hardware/software used, calibration, version, authentication, encryption, metadata standard. |
| Administrative metadata | Metadata focusing on user rights and management of digital objects. | License, possible reasons for an embargo, waivers. Search logs, user tracking. |
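To make the four types in the table more concrete, the sketch below groups them in one record for a single dataset and checks whether the minimal descriptive metadata are present. It is a sketch under stated assumptions, not a metadata standard: every field name, every value and the helper `missing_descriptive_fields` are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical record for one dataset. Every field name and value below is
# an illustrative assumption, not a prescribed standard.
EXAMPLE_RECORD = {
    "descriptive": {
        "author": "A. Researcher",
        "title": "Water level measurements",
        "abstract": "Hourly water levels at one test site.",
        "date": "2015-03-01",
        # Contextual metadata help a user judge how to use the data.
        "context": {"location": "Test site A", "method": "pressure sensor"},
    },
    "structural": {"related_objects": ["article-based-on-this-dataset"]},
    "technical": {"data_format": "CSV", "software": "spreadsheet program", "version": "1.0"},
    "administrative": {"license": "CC BY 4.0", "embargo_reason": None},
}

# The minimal descriptive fields named in the table's example column.
REQUIRED_DESCRIPTIVE = ("author", "title", "abstract", "date")


def missing_descriptive_fields(record):
    """Return the required descriptive fields that are absent or empty."""
    descriptive = record.get("descriptive", {})
    return [name for name in REQUIRED_DESCRIPTIVE if not descriptive.get(name)]


if __name__ == "__main__":
    print(json.dumps(EXAMPLE_RECORD, indent=2))
    print(missing_descriptive_fields(EXAMPLE_RECORD))
```

Keeping the four groups apart in the record makes it easier to see which part serves finding the data (descriptive), linking it (structural), reading it (technical) and managing it (administrative).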



Data documentation can take place at various levels. The two cases below illustrate this: one about recording the data collection process and one about documenting version management.

In addition, metadata tools(3) are used in an increasing number of fields of study. These tools help fit the process of adding metadata into the workflow.

Open notebook science

An example of recording the data collection process is open notebook science (4, 5).



Version management

Eva Lantsoght, researcher at Delft University of Technology, describes how researchers copy tables from one sheet to another while analysing their research data. When they later want to write an article, they cannot help but wonder: what did I edit, and why? In her blog she describes a solution (6):

"Start by adding an extra 'version management' tab to a new spreadsheet. In this sheet, carefully write down a version name (name of the file, typically) in the first column, in the second column the date, and in a third column an explanation of all changes you made to the sheet. Carefully fill out this sheet every single time you move something around, or tinker with the sheet."

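The same three columns (version name, date, explanation of changes) can also be kept in a small log file next to the data. The sketch below is one possible way to do this in Python. It swaps the spreadsheet tab from the quote for a separate CSV file, and the names `log_change`, `read_log` and `version_log.csv`, as well as the example entries, are assumptions made for this example.

```python
import csv
import os
import tempfile
from datetime import date

LOG_COLUMNS = ["version", "date", "changes"]


def log_change(log_path, version, changes, when=None):
    """Append one row to the change log, writing the header for a new file."""
    is_new = not os.path.exists(log_path)
    when = when or date.today()
    with open(log_path, "a", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle)
        if is_new:
            writer.writerow(LOG_COLUMNS)
        writer.writerow([version, when.isoformat(), changes])


def read_log(log_path):
    """Return all logged changes as a list of dictionaries, oldest first."""
    with open(log_path, encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle))


def demo():
    """Log two hypothetical edits in a temporary folder and read them back."""
    with tempfile.TemporaryDirectory() as folder:
        log_path = os.path.join(folder, "version_log.csv")
        log_change(log_path, "analysis_v1.xlsx", "Imported raw measurements",
                   when=date(2013, 10, 10))
        log_change(log_path, "analysis_v2.xlsx", "Moved summary table to new sheet",
                   when=date(2013, 10, 11))
        return read_log(log_path)


if __name__ == "__main__":
    for row in demo():
        print(row["version"], row["date"], row["changes"])
```

Because rows are only ever appended, the log doubles as a simple record of the dataset's history, provided it is filled in every time the data change.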
   Sources and additional reading


  1. Edwards, P. (2011). Science Friction: Data, Metadata, Collaboration. Social Studies of Science, 41(5), 667-690. doi:10.1177/0306312711413314
  2. Bargmeyer, B.E. Metadata standards and metadata registries. Retrieved from
  3. Metadata tools. Retrieved from
  4. Stanford University Libraries. University of Southampton. Open Source malaria. Retrieved from
  5. Bohle, S. (2014, January 1). A four part series on open notebook science. [blog]. Retrieved from 
  6. Lantsoght, E. (2013, October 10). Keeping your spreadsheets under control. [blog]. Retrieved from



Annemiek Kuil - Coincidentally, I spoke yesterday with a researcher who told me that many lab journals are still written by hand. A researcher stands there wearing gloves, careful not to spill water (this concerned water management), but still wants to make a quick sketch of what is happening. This is hard to do with an e-lab journal. Sometimes such a handwritten (legible?) lab journal is scanned and stored digitally.
I thought this was a thing of the past, but apparently not.

3 years 8 months ago

peter verberne - A while ago, Jos Odekerken of our university drew up the UM metadata format. It is a handy instrument for checking whether any object that comes along has been given the right labels. It is stored in XML and, via stylesheets, comes out again as Dublin Core, MARC21 or even NL-DIDL.

3 years 7 months ago

Rahul Thorat - During my PhD research I kept a lab journal. It really helps to write down what your plan is, what you have done, and what problems came up during experimentation. My professor did not like the idea of keeping notes electronically, as he thought that you do not get the "feeling". It is also like a personal diary, where you can easily look back at a page and reflect on something you did.

3 months 2 weeks ago
