Publishing data

Data sharing is a key part of the drive towards greater openness in scientific research, allowing readers to reproduce and confirm an article’s findings, or even reuse its data as part of a new study. | Federer, 2018

In this section, we describe two routes to publishing data and take a closer look at the data paper as a form of research output.

Ways to publish data

Publishing research data means making the (meta)data findable, citable and (re)usable under a licence that makes clear what may be done with the data. In the scientific literature, the publication of data is often referred to as data sharing.

There are roughly two ways to publish (about) research data: 

  • In a data archive as an independent, citable object;
  • As supplementary material to a journal article. 

Data that is only published as supplementary material is less findable and less FAIR than research data that is published in a data archive.

While sharing data as supplementary information is better than not sharing data at all, it is a sub-optimal solution. Data deposited in a repository is more findable and accessible. | Baynes, 2018

Data papers

In addition to publishing the data itself, it is also possible to publish a description of a dataset in a data paper. The data paper was created as a mechanism to promote data publication and to enable data paper citation (Chavan & Penev, 2011). Some argue that data papers will disappear once citing the underlying datasets has become standard practice. But even if data citation becomes standard, a data paper remains valuable (Callaghan, 2013). A data paper is in effect a form of extensive data documentation and therefore has value in its own right.

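Some examples of data journals in which data papers are published:

  • BMC Research Notes (BMC, n.d.);
  • Research Data Journal for the Humanities and Social Sciences (BRILL, n.d.);
  • Biodiversity Data Journal (Pensoft, n.d.);
  • Journal of Open Psychology Data (Ubiquity Press, n.d.a.);
  • Journal of Open Archaeology Data (Ubiquity Press, n.d.b.);
  • Journal of Open Health Data (Ubiquity Press, n.d.c.);
  • Brain and Behavior (Wiley, n.d.a.);
  • Geoscience Data Journal (Wiley, n.d.b.).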

The publication of the described data in a data archive is a precondition for being allowed to publish a data paper.

In the spotlight


GBIF: linking biodiversity data

A notable example is the way in which the Global Biodiversity Information Facility (GBIF, n.d.a.) encourages the creation of data papers.

If a researcher or research institute holds biodiversity data, in a database or another format, these data can be linked to the GBIF network using the GBIF Integrated Publishing Toolkit (IPT) (GBIF, n.d.b.). Through this toolkit, the metadata can also be downloaded as an RTF-formatted data paper manuscript, ready to be edited and submitted for peer review, according to normal procedures, to a selection of journals from Pensoft Publishers.

Want to know more? See the section on data papers on the GBIF website (GBIF, n.d.c.). 

Examples of data and journal articles linked

  • A small example of research information in context is the article 'The initial stages of template-controlled CaCO3 formation revealed by cryo-TEM' (Pouget, 2009), whose research data (Pouget, 2010) are included in 4TU.Centre for Research Data (n.d.). Article and research data are linked.
  • The underlying research data (Wols, 2010b) of Bastiaan Wols' thesis (Wols, 2010a) are included in 4TU.Centre for Research Data. The data consists of films of simulated currents.

Advantages and disadvantages of different data publication routes (CESSDA)

CESSDA has identified the advantages and disadvantages of various data publication routes (CESSDA, 2017). 


Sources

4TU.Centre for Research Data (n.d.). https://researchdata.4tu.nl/en/

Baynes, G. (2018, April 16th). We need more carrots: give academic researchers the support and incentives to share data. LSE Impact Blog [blog]. https://blogs.lse.ac.uk/impactofsocialsciences/2018/04/16/we-need-more-carrots-give-academic-researchers-support-and-incentives-to-share-data/

BMC (n.d.). BMC Research Notes. https://bmcresnotes.biomedcentral.com/about/introducing-data-notes

BRILL (n.d.) Research Data Journal for the Humanities and Social Sciences. https://brill.com/view/journals/rdj/rdj-overview.xml?lang=en

Callaghan, S. (2013, January 29th). Data journals - as soon-to-be-obsolete stepping stone to something better? Citing Bytes - Adventures in Data Citation [blog]. http://citingbytes.blogspot.co.uk/2013_01_01_archive.html

CESSDA (2017). Data Management Expert Guide. Data publishing routes. https://www.cessda.eu/Training/Training-Resources/Library/Data-Management-Expert-Guide/6.-Archive-Publish/Data-publishing-routes

Chavan, V., Penev, L. (2011). The data paper: a mechanism to incentivize data publishing in biodiversity science. BMC Bioinformatics, 12 Suppl. 15:S2. http://www.ncbi.nlm.nih.gov/pubmed/22373175

Federer, L. (2018, June 14th). Journal data sharing policies are moving the scientific community towards greater openness but clearly more work remains. LSE Impact Blog [blog]. https://blogs.lse.ac.uk/impactofsocialsciences/2018/06/14/journal-data-sharing-policies-are-moving-the-scientific-community-towards-greater-openness-but-clearly-more-work-remains/

GBIF (n.d.a.). Free and open access to biodiversity data. https://www.gbif.org/

GBIF (n.d.b.). IPT: The Integrated Publishing Toolkit. A free open source software tool used to publish and share biodiversity datasets through the GBIF network. https://www.gbif.org/en/ipt 

GBIF (n.d.c.). Data papers. Getting scholarly recognition for your datasets. https://www.gbif.org/data-papers

Knowledge Exchange. (2013). The value of research data. Metrics for datasets from a cultural and technical point of view. Retrieved from http://repository.jisc.ac.uk/6205/1/Value_of_Research_Data.pdf

DANS (n.d.). NARCIS. https://www.narcis.nl/ 

Pensoft (n.d.). Biodiversity Data Journal. https://bdj.pensoft.net/

Pouget, E.M., Bomans, P.H., Goos, J.A., Frederik, P.M., Sommerdijk, N.A. (2009). The initial stages of template-controlled CaCO3 formation revealed by cryo-TEM. Science, 323, 1455-1458. https://doi.org/10.1126/science.1169434

Pouget, E.M.(Emilie); Bomans, P.H.H.(Paul); Goos, J.A.C.M.(Jeroen); Frederik, P.M.(Peter); de With (Gijsbertus); Sommerdijk, N.A.J.M.(Nico) (2010). The Initial Stages of Template-Controlled CaCO3 Formation Revealed by Cryo-TEM. Eindhoven University of Technology. Dataset. https://doi.org/10.4121/uuid:29b1a9fa-e8b0-4585-8bb6-fccebc925b68

Sefton P, Lynch M. (2019). Packaging Research data with DataCrate - a cry for help! https://doi.org/10.6084/m9.figshare.8066936.v1 

Ubiquity Press (n.d.a.). Journal of Open Psychology Data. https://openpsychologydata.metajnl.com/

Ubiquity Press (n.d.b.). Journal of Open Archaeology Data. https://openarchaeologydata.metajnl.com/

Ubiquity Press (n.d.c.). Journal of Open Health Data. http://openhealthdata.metajnl.com/

Wiley (n.d.a.). Brain and Behavior. https://onlinelibrary.wiley.com/page/journal/21579032/homepage/data_set.htm

Wiley (n.d.b.) Geoscience Data Journal. https://rmets.onlinelibrary.wiley.com/journal/20496060

Wols, B. (2010a). CFD in drinking water treatment. Doctoral thesis. https://doi.org/10.4233/uuid:b1d4405e-a364-4105-ab03-21800b46df5b 

Wols, B.A. (Bas) (2010b) CFD in drinking water treatment. 4TU.Centre for Research Data. Dataset. https://doi.org/10.4121/uuid:c1ac7344-1419-4398-ba13-c757551c303f

Zenodo (n.d.) https://zenodo.org/