
Data documentation

In a lab notebook, the researcher records all kinds of project related information – from hypothesis to results of experiments. It often serves as the most important piece of documentation of the researchers’ work. Lab notebooks are crucial for ensuring accountability and reproducibility of research, and they are often needed to enable the re-usability of data. | Larsen, 2018

Clear and detailed data documentation improves data quality and increases the chance that others will understand the data, now and in the future. Data documentation is therefore essential to enable the reproducibility of research and the reuse of research data. In this section you will find a number of examples of data documentation.

Data documentation

To make data usable for researchers who have not worked with it before, the data documentation needs to be as complete and detailed as possible. Data documentation describes the characteristics of a dataset at various levels, such as:

  • A description of the data itself
    Think of a README file that lists all files in the dataset and describes the content of each file. It answers questions such as:
    • What is the data format?
    • Which software can you use to read the data?
    • Which codes and variables are used and what do they mean?
  • A description of the data collection process and the tools used
    Think of instruments such as a codebook, lab journal, logbook, diary, questionnaires, manuals, etc.   
  • A description of the changes of the dataset over time
    A historical record of where the research data have been and how they were processed over time is necessary to understand the origin of the data. In data jargon this is called data provenance. Such a record in turn relies on a description of the data collection process and of the data itself.

 

Because datasets are so diverse, there is no single standard way to document data.
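
The file overview described above can be partly generated. The following is a minimal sketch in Python, not a prescribed method: the function name build_readme, the file name README.txt and the three fields per file (format, software, content) are assumptions chosen to mirror the questions listed above.

```python
from pathlib import Path


def build_readme(dataset_dir, descriptions):
    """Return README text that lists every file found in dataset_dir.

    descriptions maps a relative file path to a (format, software, content)
    tuple. Files without an entry are marked TODO so that gaps stay visible.
    """
    root = Path(dataset_dir)
    lines = [f"README for dataset: {root.name}", "", "Files in this dataset:"]
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        if rel == "README.txt":
            # Do not list the README itself (hypothetical file name).
            continue
        fmt, software, content = descriptions.get(
            rel, ("TODO", "TODO", "TODO: describe this file")
        )
        lines.append(f"- {rel}")
        lines.append(f"    Format: {fmt}")
        lines.append(f"    Readable with: {software}")
        lines.append(f"    Content: {content}")
    return "\n".join(lines) + "\n"
```

Calling build_readme("my_dataset", {...}) and saving the result as README.txt gives a draft that still has to be completed by hand, for instance with the meaning of the codes and variables used.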

High-level documentation is very important. A good README file does part of the job, but documentation and a user manual are also important. Any information (e.g. equations, model) behind the software also needs to be shared. | Workshop software reproducibility, 2018

About metadata

Metadata is a special form of standardised data documentation, or 'data about data'. Because not only people but also computers can read, interpret and combine metadata, it is an important element in creating a FAIR data infrastructure. Metadata records, for example, when and where the data were collected, who created them and under which terms of use (licence) the research data are made available. You can learn more about this in the section 'Standardised metadata'.
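
To illustrate what machine-readable means here, the sketch below stores the elements mentioned above (time of collection, location, creators, licence) as a JSON record. The field names and example values are hypothetical and do not follow any particular metadata standard; in practice you would use the schema prescribed by your repository or discipline.

```python
import json

# Hypothetical minimal field set, based on the elements named in the text.
REQUIRED_FIELDS = ("title", "creators", "collected_on", "location", "licence")


def make_metadata_record(**fields):
    """Return the metadata as a JSON string, refusing incomplete records."""
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError("missing metadata fields: " + ", ".join(missing))
    return json.dumps(fields, indent=2, sort_keys=True, ensure_ascii=False)


# Illustrative values only.
record = make_metadata_record(
    title="Example measurements",
    creators=["A. Researcher"],
    collected_on="2018-06-01",
    location="Example field site",
    licence="CC BY 4.0",
)
```

Because the record is structured, software can check it for completeness and combine it with records of other datasets, which is much harder with free-text documentation.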

The boundary between data, data documentation and metadata is a grey area. For example, certain data formats embed metadata in the data files themselves. Think of digital photos: as soon as you save one, information about the circumstances under which you took the photo (aperture, lighting, etc.) is stored automatically along with the image.

Ultimately, it is not a question of whether something is called data, metadata or data documentation, but of focusing on the underlying goal: to describe the data in such detail that the chance of reproducibility and reuse increases.

In the spotlight


Documentation at project and dataset level (CESSDA)

The CESSDA Data Management Expert Guide provides an overview of the distinction between documentation at the project level and at the dataset level (CESSDA, 2017). 

Case Wageningen University & Research: Electronic lab notebooks as data documentation

Wageningen UR's Open Science Blog contains a case study of WUR Data Champions working with an electronic lab notebook (ELN) (Wageningen University & Research, 2018). The arguments for choosing an ELN are described in a blog post on OpenAIRE (Larsen, 2018).

 

Documenting versions in Excel: an example

Eva Lantsoght, researcher at Delft University of Technology, describes how researchers copy tables from one sheet to another when analysing their research data. Then, the moment they want to write an article, they scratch their heads: what edits have I made and why? On her blog she describes a solution (Lantsoght, 2013):

"Start by adding an extra 'version management' tab to a new spreadsheet. In this sheet, carefully write down a version name (name of the file, typically) in the first column, in the second column the date, and in a third column an explanation of all changes you made to the sheet. Carefully fill out this sheet every single time you move something around, or tinker with the sheet."
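
The same three columns (version name, date, explanation of changes) can also be kept in a plain CSV file next to the data, for instance when the analysis is not done in a spreadsheet. This is a minimal sketch of that variation, not Lantsoght's own method; the function name log_version and the log file name are assumptions.

```python
import csv
from datetime import date
from pathlib import Path

HEADER = ["version", "date", "changes"]


def log_version(log_path, version, changes, when=None):
    """Append one row to the version log, writing the header on first use."""
    log_file = Path(log_path)
    is_new = not log_file.exists()
    when = when or date.today()
    with log_file.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        if is_new:
            writer.writerow(HEADER)
        writer.writerow([version, when.isoformat(), changes])
```

A call such as log_version("version_log.csv", "analysis_v2.xlsx", "Moved summary table to its own sheet") adds one dated row. As with the spreadsheet tab, the log is only useful if it is filled in every time something changes.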

Guidelines for creating a README file (4TU.ResearchData)

4TU.Centre for Research Data has published guidelines for creating a README file (4TU.Centre for Research Data, 2017).

Tips for data documentation (Wageningen University & Research)

The Open Science blog of Wageningen University & Research contains a number of practical tips and tools for data documentation (Wageningen University & Research, 2017).


Sources

4TU.Centre for Research Data. (2017). Guidelines for creating a README file. https://researchdata.4tu.nl/fileadmin/user_upload/Documenten/Guidelines_for_creating_a_README_file.pdf

CESSDA (2017). Data Management Expert Guide. Documentation and metadata. https://www.cessda.eu/Training/Training-Resources/Library/Data-Management-Expert-Guide/2.-Organise-Document/Documentation-and-metadata

Cruz, M.J., Kurapati, S., der Velden, Y.T. (2018, July). Software Reproducibility: How to put it into practice? https://doi.org/10.31219/osf.io/z48cm

Lantsoght, E. (2013, October 10). Keeping your spreadsheets under control. [blog]. http://phdtalk.blogspot.nl/2013/10/keeping-your-spreadsheets-under-control.html

Larsen (2018). OpenAIRE. Electronic Lab Notebooks - should you go "e"? [blog]. https://www.openaire.eu/blogs/electronic-lab-notebooks-should-you-go-e-1

Wageningen University & Research (2018, 27 August). WUR Data Champions Katharina Hanika & Eliana Papoutsoglou: actively promoting good data management practices. OpenScience blog [blog]. https://weblog.wur.eu/openscience/wur-data-champions-electronic-lab-notebook/

Wageningen University & Research (2017, 8 September). Documenting your research data along the way: tips and tools. OpenScience blog [blog]. https://weblog.wur.eu/openscience/documenting-research-data-along-way-tips-tools/