"Scientific metadata provide the information necessary for investigators separated by time, space, institution or disciplinary norm to establish common ground." - Christine Borgman et al.(1)
Data documentation describes the characteristics of a dataset. It can take place at various levels, such as:
- A description of the process a researcher uses to collect data. This is recorded in, for instance, a codebook, lab journal, log or diary.
- A description of the data itself (how much data there is, what format it is in, what software is needed to read it).
- A description of the changes to the dataset over time. This is used to create a historical record of all uses and edits of the research data over a period of time. In data jargon this is called data provenance. Making such a record also requires a description of the data collection process and of the data itself.
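A provenance record of the kind described in the last item can be sketched as an append-only log. The example below is a minimal illustration in Python, not a standard: the field names (`actor`, `action`, `sha256`, `previous_sha256`), the function names and the sample data are all assumptions made for this sketch. Each entry stores a checksum of the dataset at that moment, so a later reader can verify which exact version an edit produced.

```python
import hashlib
import json
from datetime import datetime, timezone


def checksum(data: bytes) -> str:
    """Return a SHA-256 hex digest that identifies one exact state of the data."""
    return hashlib.sha256(data).hexdigest()


def record_change(log: list, data: bytes, actor: str, action: str) -> dict:
    """Append one provenance entry: who did what to the dataset, and when."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,    # hypothetical field: person or script making the change
        "action": action,  # hypothetical field: free-text description of the change
        "sha256": checksum(data),
        # Point back to the previous state so the history reads as a chain.
        "previous_sha256": log[-1]["sha256"] if log else None,
    }
    log.append(entry)
    return entry


provenance_log = []

# Invented sample data: a raw file and a cleaned version of it.
raw = b"id,height_cm\n1,172\n2,-1\n"
record_change(provenance_log, raw, "researcher_a", "collected raw measurements")

cleaned = b"id,height_cm\n1,172\n"
record_change(provenance_log, cleaned, "researcher_a", "removed invalid row (height -1)")

print(json.dumps(provenance_log, indent=2))
```

In practice a version control system or a repository platform may already keep this kind of history; the sketch only shows which information such a record needs to hold.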
Proper data documentation ensures that research data are traceable and can be understood and used unambiguously by current and future users (including the researcher).
Due to the great diversity of datasets, the choices for documenting the data are not always obvious.
"Metadata is a Love Note to the Future - UK Higher Education Research Data Management (RDM) Survey http://t.co/J80ySXEsf5" - Mariette van Selm (@mvanselm), October 10, 2013
It is useful to know that metadata can sometimes be derived from the data itself. Certain file formats embed metadata alongside the data, e.g. digital photos. When a photo is stored, details about the circumstances in which it was taken are saved automatically: aperture, lighting conditions, etc.
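The same idea applies to other kinds of data. As a minimal sketch, the Python example below derives a few descriptive metadata fields (size, column names, number of records) directly from a small CSV file held in memory. The function name `derive_metadata`, the output field names and the sample data are assumptions made for this illustration and do not follow any particular metadata standard.

```python
import csv
import io


def derive_metadata(csv_text: str) -> dict:
    """Derive basic descriptive metadata from the content of a CSV file."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header = rows[0] if rows else []  # first row is assumed to hold column names
    records = rows[1:]
    return {
        "format": "text/csv",
        "size_bytes": len(csv_text.encode("utf-8")),
        "columns": header,
        "row_count": len(records),
    }


# Invented sample data for illustration only.
sample = "site,date,temperature_c\nA,2013-10-10,11.5\nB,2013-10-10,12.1\n"
metadata = derive_metadata(sample)
print(metadata)
```

Metadata derived this way covers only what the file itself reveals. Context such as why and how the data were collected still has to be documented by the researcher.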
The function of data documentation depends on the phase of the research lifecycle in which it is used. Data archives should aim to follow an (international) standard for documenting data, so that they can tie in with other archives. This is discussed under Metadata for data archives.
"We don't know when data is metadata or just data. Metadata is data that is used to describe other data, so the usage turns it into metadata." - Bargmeyer and Gillman(2)
Data documentation can take place at various levels. The two cases below illustrate this: one about the data collection process and one about documenting version management.
In addition, metadata tools(3) are used in an increasing number of fields of study. These tools help make adding metadata a regular part of the research workflow.