Archiving data

"Early interaction with the data repository was paramount to our success." - Purdue University(1)

  Main points

Archiving data in steps

The infrastructure that enables sharing research data is layered: 

  • First, a researcher saves the data locally or in a data lab. In most cases, the data are not yet available for others to view.
  • The institute will save the data for the short term, for example in an institutional repository.
  • After selection, the data will be saved in a data archive, for example 4TU.Centre for Research Data or EASY.
  • The data can be made available under various conditions: under embargo, closed access or open access. In all cases the result is a data publication. Open access is the preferred form of availability, because the data can then be used without restrictions.

To visualise the layered structure, the Edinburgh Research data blog discusses 'The four quadrants of research data curation systems'.(2)

Whereas research projects usually run for a limited time, data archives often focus on continuity. For instance:

  • Long-term storage.
    Data are kept for the long term so that they can be reused later.
  • Facilitating interoperability.
    In order to be able to compare and combine data sets with each other, it is important that metadata are assigned consistently and that metadata standards and data formats are used.
  • Facilitating a data citation network.
    Being able to cite data and link data and literature leads to more transparency in academic science and furthers scientific integrity. 
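
The point about consistent metadata can be illustrated with a small sketch. It is only an illustration under stated assumptions: the list of required fields is hypothetical (the names are loosely borrowed from Dublin Core elements), the two records are made up, and an actual data archive will prescribe its own metadata schema.

```python
# Hypothetical set of required metadata fields (an assumption for this
# illustration; a real archive defines its own schema).
REQUIRED_FIELDS = ("title", "creator", "date", "identifier", "format")


def missing_fields(record):
    """Return the required fields that are absent or empty in a metadata record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def comparable(records):
    """Treat data sets as comparable only if every record has all required fields."""
    return all(not missing_fields(record) for record in records)


# Two made-up records: the second lacks an identifier and a format.
complete = {
    "title": "Soil moisture measurements",
    "creator": "Example, A.",
    "date": "2014-05-01",
    "identifier": "doi:10.0000/example",
    "format": "text/csv",
}
incomplete = {"title": "Crop yield survey", "creator": "Example, B.", "date": "2013"}

print(missing_fields(complete))
print(missing_fields(incomplete))
print(comparable([complete, incomplete]))
```

A check like this only tests whether fields are present. Whether the values themselves follow a standard (for example a fixed date notation) would need further rules.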

The White Paper 'Sustaining Domain Repositories for Digital Data'(3) describes the roles of data archives in detail.


The list below gives a number of categories of data archives, with one or more examples per category. The section on searching for data discusses in more detail how to find an appropriate data archive.

Institutional data archives and repositories

  • 4TU.Centre for Research Data(4) was established as a data archive for the three technical universities in the Netherlands: TU Delft, TU Eindhoven and the University of Twente. It is now also open to other universities and has therefore outgrown its status as an institutional data repository.
  • Edinburgh Datashare(5) is a data repository intended for researchers’ data sets at The University of Edinburgh.
National/European data archives

  • EASY(6), the online archiving system of DANS, initially focused on the humanities and social sciences, but is now also open to other disciplines.


  • Zenodo(7) is a European repository for so-called 'orphan data': researchers who do not have access to an institutional or discipline-specific data archive can use this repository.
    Participants in the Open Research Data Pilot of Horizon 2020 who do not have access to a data archive can use Zenodo(8) as well.
Discipline-specific data archives

  • Life sciences general.
  • Life sciences specific: DNA sequences 
Research output-specific

Examples of data archives that focus on a certain type of research output are software repositories(13) such as:

General data archives

   Case Wageningen University

The library of Wageningen University partners with DANS and 4TU.Centre for Research Data (source in Dutch)(20) to transfer research data from the institution to the data archive.

A double interview(21) held with researcher Frits van Evert (Agrosystems Research) and library employee Annemarie Patist explains how that process generally works:

"The E-depot does now have data sets, but they won't stay there! We are working on sustainably archiving datasets, accompanied by a 'read me file' and a 'methodology file' in national archives like DANS (Data Archiving and Network Services) and 3TU Datacenter Delft. The data has to remain available over the long term. That’s why the data are converted into sustainable formats before they are sent to the national archives. The formats are independent of specific versions of software. After the datasets are checked by DANS or 3TU Datacenter employees, they are published. The persistent link received from the archive centre is then the only thing that is stored in the E-depot."

This is an example of what the front office - back office model of Research Data Netherlands may yield in practice. 

   Sources and additional reading


  1. Zilinski, L.D. (2013). Evaluation of data creation, Management, Publication, and Curation in the Research Process [conference paper]. Retrieved from
  2. Lewis, S. (2013, December 6). The four quadrants of research data curation systems. [blog]. Retrieved from
  3. Ember, C. (2013). Sustaining Domain Repositories for Digital Data: A White Paper. Retrieved from
  4. 4TU.Centre for Research Data. Retrieved from
  5. Edinburgh DataShare. Retrieved from
  6. EASY. Retrieved from
  7. Zenodo. Retrieved from 
  8. OpenAIRE. Open access to research data: the Open Research Data Pilot. Retrieved from
  9. CLARIN. Retrieved from 
  10. Search resources at CLARIN in Europe. Retrieved from
  11. Dryad data repository. Retrieved from
  12. Genbank. Retrieved from
  13. Khodiyar, V. (2013, October 11). Open access software: Our recent software repository collaborations. [blog]. Retrieved from
  14. Github. Retrieved from
  15. Github Education. Retrieved from
  16. Runmycode. Retrieved from 
  17. Research Compendia. Retrieved from
  18. Figshare. Retrieved from
  19. Maxmen, A. (2013, August 1). Preserving Research. The top online archives for storing your unpublished findings. The Scientist. Retrieved from
  20. DANS. WUR kiest voor duurzame data-opslag bij DANS en 3TU. Retrieved from
  21. WUR. Data archiving: a double interview. Retrieved from

Additional reading