Storing data

    Main points

Where and on which medium will a researcher store research data? How are backups and version control used? (see box) In this section the possibilities are outlined.

Storage media

Information requires an information carrier: a storage medium. Over time we have learned that storage media quickly become outdated (see infographics (1) and (2)). A researcher may think the best way to store data is a backup on a USB flash drive, but how long will such drives be around? Will it be possible to retrieve information from one in the future? Will laptops (or their equivalents) still have a USB port? Will the software available at that time be able to open data stored in a particular format?

Storage strategy

If you want to keep data readable and usable for a long period of time, you need to think carefully about a strategy. The UK Data Archive includes the following points(3) in its list of best practices for data storage:

  1. Store data in an open format that is not tied to a specific software supplier (see also preferred formats).
  2. Use a data storage strategy that involves two different types of storage media (for instance CD and hard disk), even for a short-term project.
  3. Copy or migrate data to new storage media every two to five years. Storage media degrade, and older media may become unreadable with the hardware and software of the future.
  4. Never overwrite an old backup with a new one. It is better to make an entirely new backup of changed files.
  5. Regularly check data integrity, for instance with a checksum checker.(4)
  6. Organize and document research data. Make digital versions of paper data documentation in PDF/A format (suitable for long-term storage).
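
The integrity check mentioned above can be illustrated with a short Python sketch. It is not the Checksum Checker tool cited in the sources; it only shows the underlying idea with the standard library: record a checksum for every file once, then recompute the checksums later and report any file that is missing or has changed. The function names `sha256_of`, `build_manifest` and `changed_files` are hypothetical, and SHA-256 is an assumed choice of algorithm.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or whose checksum no longer matches."""
    problems = []
    for name, expected in manifest.items():
        target = data_dir / name
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(name)
    return problems
```

In practice the manifest would be saved next to the data (for example as a text file) when a backup is made, and `changed_files` would be run against it at regular intervals. An empty result means every recorded file is still present and unchanged.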


Master file

Short-term storage

There are roughly three options available for short-term storage and backup: 

  • On own PC or laptop.

    If a researcher works on their own PC or laptop, then that is where the master file is: the file that is used when the data is entered. 
    The backup is the file that is used to restore data when the master file is lost, damaged, accidentally deleted or wrongly changed. 
    Make regular backups of your master file on a USB flash drive, DVD, CD or external hard disk (disk storage). 

    Researchers often have several workstations. They work on the lab PC or on their own laptop at home or on the road, and share their research data through cloud services such as Dropbox(5) or the more recent SURFdrive(6). Of course it is possible to copy files from one computer to another. However, this means copying them by hand, and it is then very easy to lose track of the latest version of a file (see also version management). File synchronization software (e.g. SyncBack(7)) offers a solution here.
     
  • Through central storage services (network storage) at the institute the researcher works for. 

    If a researcher uses the institute's network storage facilities, backups have often already been arranged. Often there are also restore facilities in place, offering the possibility to return to older versions of the data. 
    Some research groups install their own NAS server, which is in fact an external hard disk with network facilities. Such a NAS server can be connected to a computer network, and from that moment on every connected device has access to your files. All PCs share the same backup server. Setting up such a NAS server requires expert knowledge. 

  • Through cloud storage facilities with synchronisation to one or several PCs.

    With the emergence of cloud services the term 'master file' is gradually losing its meaning. A program like Dropbox can easily be installed on your PC. All changes you make are automatically stored online. If you change the online document from another computer, those changes are also synchronised to your PC the next time you turn it on (provided there is internet access). 
    You can also make your own cloud storage (Dutch).(8) The disadvantage of global services like Dropbox is that you do not know whether your data is safe or whether someone else can read your files. For this reason several Dutch research organisations prefer SURFdrive to Dropbox.
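
For the first option, backing up a master file to an external disk, the earlier advice never to overwrite an old backup can be followed by giving every backup copy its own name. The Python sketch below is one possible way to do that, using only the standard library. The function name `backup_master_file` and the timestamp format in the file name are assumptions chosen for illustration.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_master_file(master: Path, backup_dir: Path) -> Path:
    """Copy the master file to backup_dir under a new, timestamped name."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{master.stem}_{stamp}{master.suffix}"
    counter = 1
    # If a backup with this name already exists, add a counter
    # instead of overwriting the older copy.
    while target.exists():
        target = backup_dir / f"{master.stem}_{stamp}_{counter}{master.suffix}"
        counter += 1
    shutil.copy2(master, target)  # copy2 also keeps the file's timestamps
    return target
```

Calling `backup_master_file(Path("survey.csv"), Path("E:/backups"))` (both paths are made-up examples) adds a new copy each time and leaves every earlier backup in place.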


The table below lists the various options with their pros and cons. The table has been reproduced from the 'Data management plan template' by Wageningen University.(9)

| Storage solution | Advantages | Disadvantages | Suitable for |
|---|---|---|---|
| Personal computer & laptop | Always available; portable | Drive may fail; laptop may be stolen | Temporary storage |
| Networked drives (file servers managed by your university or research group, or facilities like a NAS server) | Regularly backed up; stored securely in a single place | Costs | Master copy of your data (if enough storage space is provided) |
| External storage devices (USB flash drive, DVD/CD, external hard drive) | Low cost; portability | Easily damaged or lost | Temporary storage |
| Cloud services | Automatic synchronization between folders and files; easy to access and use | It is not certain whether data security is taken care of; you have no direct influence on how often backups take place and by whom | Data sharing |

Version management

If data is worked on continually, it is useful to introduce some kind of version management to be able to properly follow the changes. The easiest way of version management is by adding a number at the end of a file after every important change, e.g. experiment_021213_v2.doc.
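
The naming scheme above can also be applied consistently by a small script. The Python sketch below is a minimal illustration: given a file name, it returns the name for the next version by incrementing a trailing `_vN` number. The function name `next_version_name` is hypothetical, and the choice to give a file without a version suffix the suffix `_v1` is an assumption.

```python
import re
from pathlib import Path

# Matches a stem that ends in _v followed by digits, e.g. "experiment_021213_v2".
VERSION_SUFFIX = re.compile(r"^(?P<base>.+)_v(?P<num>\d+)$")


def next_version_name(filename: str) -> str:
    """Return the file name with its _vN suffix incremented (or _v1 added)."""
    path = Path(filename)
    match = VERSION_SUFFIX.match(path.stem)
    if match:
        base, number = match["base"], int(match["num"]) + 1
    else:
        base, number = path.stem, 1  # assumption: start numbering at 1
    return f"{base}_v{number}{path.suffix}"
```

For example, `next_version_name("experiment_021213_v2.doc")` returns `"experiment_021213_v3.doc"`.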

Version management can also be used within a single file. In the section Data documentation you can read a case (see tab 'version management') in which a researcher includes version management in her data files by adding a 'version management' tab.

Some programs have their own automated version management. On the right is an example of the Dropbox program.  

If the research is not too complicated, the methods mentioned above are an excellent way to manage versions. If a researcher often works on data with other people and/or the same dataset is continuously edited, version management software such as Git(10) (also used in GitHub(11)) might be a solution.

   Sources

  1. Mashable. (2011). The history of digital storage. Mashable Infographics. Retrieved from http://mashable.com/2011/10/08/digital-storage-infographic/
  2. Mozy. (2011). The past, present and future of data storage. Retrieved from http://mozy.com/infographics/the-past-present-and-future-of-data-storage/
  3. UK Data Archive. (2011). Managing and sharing data. Retrieved from http://www.data-archive.ac.uk/media/2894/managingsharing.pdf
  4. National Archives of Australia. Checksum Checker. Retrieved from http://checksumchecker.sourceforge.net/ 
  5. Dropbox. Retrieved from https://www.dropbox.com/
  6. SURFdrive. Retrieved from https://www.surfdrive.nl/en
  7. 2BrightSparks. Syncback: backup software. Retrieved from http://www.2brightsparks.com/syncback/
  8. Vanderfeesten, M. Maak je eigen cloudopslag. Retrieved from https://www.surfspace.nl/artikel/1151-maak-je-eigen-cloudopslag/
  9. Wageningen Universiteit. Data Management Plans. Retrieved from http://www.wageningenur.nl/en/Expertise-Services/Data-Management-Support-Hub/Browse-by-Subject/Storage-solutions.htm (see the DMP Template)
  10. Git. Fast version control. Retrieved from https://git-scm.com/
  11. GitHub. Retrieved from https://github.com/

    Your additions

Do you have examples of reliable ways to store data and do backups? Do you have tips on how to use version management? Or do you have other comments on this section? Let us know and post a comment below.



Jan Heul - At home I use a double backup system.
First backup: weekly, to an extra hard disk in the PC.
Second backup: monthly, to an external hard disk.

Version management is not very important at home; it mainly concerns my photos.
Unedited photos keep their original file name; edited ones get a modified name.

2 years 9 months ago · 

Richard Visscher - At Inholland we use SharePoint. I often use its version management, so I do not need to include a version code in the document name. Looking back at an earlier version is easy.
For software development (web apps) I use Subversion. All changes are logged and documented. Ideal.

1 year 5 months ago · 

Richard Visscher - About backups (point 4 of the storage strategy): I miss information about the type of backup. You indeed do not overwrite a backup; that is what the incremental backup system is for. The first backup is a full backup and the following backups contain the changes; together they always form the current backup. If you combine that with point 3, nothing can go wrong :-)

1 year 5 months ago · 

Frans de Liagre Böhl - While developing a storage facility for datasets I come across two kinds of 'versions': versions of the files that are part of a dataset, but also versions of the set as a whole. In research practice it is very common for several 'versions' to be derived from the raw data. Those versions must be kept separately as 'independent' datasets, because each of them underlies different publications. So we version both files and datasets.

4 months 2 weeks ago · 

Frans de Liagre Böhl - What I am curious about, by the way, is whether anyone knows researchers who use GitHub for versioning in practice. It is a very powerful tool, but it can do so much that I quickly can no longer see the wood for the versions. It remains a search for the balance between usability and carefulness.

4 months 2 weeks ago · 

Narges Zarrabi - For Storage solutions, I am missing a solution for long-term storage, for example storing the data in an Archive facility.

4 months 2 weeks ago · 

Research Data Netherlands - @Narges: the course distinguishes storing data - during a project or a study - from archiving data - for the long term. In Chapter 4 you can find information about archiving, including references to certified data archives.

4 months 2 weeks ago · 

Harry Garst - You can make a backup on an external hard disk at home, but you can lose everything during a lightning strike. It happened to me: the external hard disk was not even switched on, but it was damaged beyond repair (and a lot of research data was lost).
The best thing is to unplug a device's power cord from the socket when you are not using it.

4 months 1 week ago · 