Publishing data

   Main points

As a data supporter, you need to know that the way researchers publish their work is changing.

Publications, data sets and other forms of research information are frequently stored and displayed as separate types of information, yet they have a lot in common. Linking related information creates a so-called Scientific Publication Package, a term coined by Jane Hunter.(2) Such a package can combine, for instance, the publication with its underlying data sets, images, models, visualisations, reports, documentation, appendices and links to related research.

There are roughly three ways to publish research data: 

  • In a data archive as an independent object. 
  • As a description of a data set in a data paper.
  • As a component in a web of relations. 

(Also read chapter 'Open Research Data: From Vision to Practice'(1) in the book 'Opening Science').

Until recently we referred to this as an 'enriched publication', but nowadays the term is rarely used. This demonstrates how rapidly the field in which you work as a data supporter is changing. What remains is the approach of purposefully linking various information sources. Dynamics are essential here, because the content of a publication is no longer linear: "It is a network of components and readers get to choose their own optimal path" (source in Dutch).(3)

The dynamic publication has irrevocably entered the playing field (also read the chapter 'Dynamic Publication Formats and Collaborative Authoring'(4) in the book 'Opening Science').

An important consideration when establishing relations between digital objects is that every related object must have a persistent identifier. You will learn more on this subject in the Data citation section.

To formulate relations between objects such as data, publication and researcher one could use specifications that are included in the Resource Description Framework (RDF). By recording digital objects as linked data you can create a web of relations. 
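
As a rough illustration of such a web of relations, the sketch below writes a handful of RDF triples by hand and serialises them as N-Triples. It is only a minimal example under stated assumptions: the DOIs and the ORCID iD are hypothetical placeholders, and the Dublin Core Terms properties (creator, references, isReferencedBy) are just one possible vocabulary choice, not one prescribed by this course.

```python
# Minimal sketch: relations between a publication, a data set and a researcher
# expressed as RDF triples and serialised by hand as N-Triples.
# All identifiers below are hypothetical placeholders, not real DOIs or ORCID iDs.

DCTERMS = "http://purl.org/dc/terms/"  # Dublin Core Terms namespace

ARTICLE = "https://doi.org/10.1234/example-article"  # hypothetical
DATASET = "https://doi.org/10.1234/example-dataset"  # hypothetical
AUTHOR = "https://orcid.org/0000-0000-0000-0000"     # hypothetical

# Each triple is (subject, predicate, object); all three are IRIs here.
triples = [
    (ARTICLE, DCTERMS + "creator", AUTHOR),
    (DATASET, DCTERMS + "creator", AUTHOR),
    (ARTICLE, DCTERMS + "references", DATASET),
    (DATASET, DCTERMS + "isReferencedBy", ARTICLE),
]


def to_ntriples(triples):
    """Serialise IRI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


def related_to(iri, triples):
    """Return every IRI directly linked to `iri`, in either direction."""
    linked = set()
    for s, _, o in triples:
        if s == iri:
            linked.add(o)
        elif o == iri:
            linked.add(s)
    return linked


if __name__ == "__main__":
    print(to_ntriples(triples))
```

Because every object is named by a persistent identifier, a call such as `related_to(DATASET, triples)` can find the article and the author connected to a data set without the three objects being stored in the same place.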

Data papers and data journals

Data papers were initially created as a mechanism(5) to promote data publication and enable data paper citation. Some now believe that data papers will disappear once citing the underlying data sets becomes the new standard. According to Sarah Callaghan,(6) however, data papers will continue to be valuable even if data citation becomes the standard. A data paper is in fact a very extensive form of data documentation, and that alone justifies its existence.

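Some examples of data journals: Brain and Behavior (Wiley)(7), Geoscience Data Journal (Wiley)(8), Biodiversity Data Journal (Pensoft)(9) and Journal of Open Psychology Data(11). Sarah Callaghan also maintains a list of data journals.(10)
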
  • A small example of research information in context is the article(12) 'The initial stages of template-controlled CaCO3 formation revealed by Cryo-TEM', of which the research data(13) have been included in the 4TU.Centre for Research Data. The article and the research data are linked.
  • The research data underlying the dissertation of Bastiaan Wols(14) have been incorporated(15) in the 4TU.Centre for Research Data and consist of short films of simulated flow.
  • In NARCIS(16) 'enriched publications' are included as a separate form of information. Richard Zijdeman's enriched publication 'Like my father before me: intergenerational occupational status transfer during industrialisation (Zeeland, 1811-1915)'(17) relates an article from the repository of Utrecht University to the associated data sets that are included in EASY.
    The enriched publications in NARCIS are an experiment in visualising a web of relations. This form of visualising research output resembles 'old school' publications: the work has a beginning and an end, a head and a tail, and everything seems to be gathered in one place. That is only how it appears, because the power of research information in context is precisely that related sources can be anywhere. Moreover, the dynamics described above mean that the relations can change over time.
  • OpenAIRE(18) aims to be an infrastructure that focuses on(19) linking publications, project information, underlying data sets, author information etc., resulting in a contextual presentation of research output.
    OpenAIRE uses the OAI-PMH protocol(20) to 'harvest' metadata. Supporting this protocol is an important condition for interoperability between data archives and repositories.
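
To give an impression of what 'harvesting' involves, the sketch below builds an OAI-PMH ListRecords request and reads identifiers and titles from a response in the oai_dc (Dublin Core) format. It is a simplified illustration, not OpenAIRE's actual harvester: the base URL and the sample response are hypothetical, and the response is embedded in the code instead of being fetched over the network. Real harvesters also have to deal with resumption tokens and error responses, which are left out here.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 responses and by Dublin Core elements.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def extract_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        records.append((identifier, title))
    return records


# Hypothetical, heavily shortened response used only for illustration.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example data set</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # The endpoint below is a placeholder, not a real repository.
    print(list_records_url("https://repository.example.org/oai"))
    print(extract_records(SAMPLE_RESPONSE))
```

Because every compliant archive answers the same request in the same way, a single harvester like this can collect metadata from many repositories, which is what makes the protocol useful for interoperability.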

  An in-depth look



  1. Pampel, H.; Dallmeier-Tiessen, S. (2014). Open Research Data: From Vision to Practice. In Opening Science (pp. 213-224). Springer Link. Retrieved from
  2. Hunter, J. (2006). Scientific Publication Packages - a selective Approach to the Communication and Archival of Scientific Output. The International Journal of Digital Curation, 1 (1), 33-52. Retrieved from 
  3. SURFshare. (2010). Verrijkte publicaties. Onderzoeksresultaten in samenhang. Retrieved from
  4. Heller, L.; The, R.; Bartling, S. (2014). Dynamic Publication Formats and Collaborative Authoring. In Opening Science (pp. 191-211). Springer Link. Retrieved from 
  5. Chavan, V.; Penev, L. (2011). The data paper: a mechanism to incentivize data publishing in biodiversity science. BMC Bioinformatics, 12 Suppl. 15:S2. Retrieved from
  6. Callaghan, S. (2013, January 29). Citing Bytes - Adventures in Data Citation. [blog]. Data journals - as soon-to-be-obsolete stepping stone to something better? Retrieved from 
  7. Brain and Behavior (Wiley). Retrieved from
  8. Geoscience data journal (Wiley). Retrieved from
  9. Biodiversity data journal (Pensoft). Retrieved from
  10. Callaghan, S. A list of data journals (in no particular order). Retrieved from
  11. Journal of open psychology data (upmetajournals). Retrieved from
  12. Sommerdijk, N.A.J.M., et al. (2009). The Initial Stages of Template-Controlled CaCO3 Formation Revealed by Cryo-TEM. Science, 323(5920), 1455-1458. Retrieved from
  13. Pouget, E.M.; Bomans, P.H.H.; Goos, J.A.C.M.; Frederik, P.M.; de With, G.; Sommerdijk, N.A.J.M. (2010). The Initial Stages of Template-Controlled CaCO3 Formation Revealed by Cryo-TEM. Eindhoven University of Technology. [dataset].
  14. Wols, B.A. (2010). CFD in drinking water treatment. [dissertation]. Retrieved from
  15. Wols, B.A. (2010). CFD in drinking water treatment. Delft University of Technology. [dataset].
  16. NARCIS. Retrieved from 
  17. Zijdeman, R.L. (2009). Like my father before me: intergenerational occupational status transfer during industrialization. [enriched publication]. Retrieved from
  18. OpenAIRE. Retrieved from
  19. OpenAIRE. OpenAIRE compatible data sources. Retrieved from
  20. Open Archives Initiative Protocol for Metadata Harvesting. Retrieved from
  21. Manghi, P., et al. (2012). OpenAIREplus: the European Scholarly Communication Data Infrastructure. D-Lib Magazine, 18 (9/10). Retrieved from
  22. Schirrwagen, J., et al. (2013). Data curation in the OpenAIRE scholarly communication. Information Standards Quarterly, 25(3), 13-19. Retrieved from

Additional reading

  Your additions

How can you use the knowledge gained from this section in a conversation with a researcher? Are there data papers in the field you support? Do you have anything else you would like to add to this page? If so, you can leave your remarks in the comments.