Preferred formats are file formats of which DANS is confident that they will offer the best long-term guarantees in terms of usability, accessibility and sustainability. | DANS, n.d.
Whether data can be reused depends on the data format and the software available to read the data. Data archives often use a list of preferred formats in which the data should be delivered. For data delivered in a preferred format, data archives guarantee that they can be stored for a long time and will remain legible and accessible in the future. You will learn more about this in this section.
Preferred formats and acceptable formats
Data archives usually sort the file formats in which they accept research data into two categories: preferred (or recommended) formats and acceptable formats.
Preferred formats roughly have the following qualities:
- The file format can be read in freely available software;
- The data format is well documented and the documentation is openly available;
- The file format is widely used, either in general or within a research discipline.
These qualities were already discussed in the paragraph 'Data formats'.
Differences between data archives
Different file formats for the same kind of data each have their own advantages and disadvantages. Take, for example, file formats for photos:
- JPEG is a universally used format that is easy to open in all kinds of applications;
- With the uncompressed TIFF format, you can preserve the image without compression, in the highest quality, but at the cost of a large file size;
- Although PNG is a high-quality format with a smaller file size than uncompressed TIFF, it is not able to store file properties such as the type of camera used. JPEG and TIFF can.
Different data archives have different considerations when it comes to specifying the preferred formats for photos:
- For example, the UK Data Service (n.d.) and the Australian Data Archive (n.d.) prefer uncompressed TIFF 6.0 files (.tif), or JPG (.jpg) if the photos were originally made in that format. PNG (.png) is on their list of acceptable formats;
- At 4TU.ResearchData (2019) and DANS (n.d.), JPEG, TIFF and PNG are all on the list of preferred formats.
While many organisations prefer converting all image formats to uncompressed TIFF because it retains the maximum quality, DANS and 4TU.ResearchData take the position that JPEG maintains sufficient quality and, in the long run, is more robust and better supported.
There is as yet no system in place to harmonise the guidelines, which leads to differences. These differences can result from new developments, from the value an organisation attaches to certain characteristics of a format, and from the experience an organisation has with the future-proof preservation of a certain data format.
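The differences between archives described above for photos can be made concrete with a small lookup. The following is a minimal sketch in Python, not an official tool of any archive: the dictionary layout, the function name `classify_photo` and the extension aliases are assumptions made for illustration, and the lists only reflect what is summarised in this section. Always check the archive's current guidelines before depositing.

```python
from pathlib import Path

# Photo format categories per archive, as summarised in this section.
# Illustrative only: a file extension cannot tell whether a TIFF is
# uncompressed, or whether a JPG was originally created in that format.
PHOTO_POLICIES = {
    "UK Data Service": {"preferred": {".tif", ".jpg"}, "acceptable": {".png"}},
    "Australian Data Archive": {"preferred": {".tif", ".jpg"}, "acceptable": {".png"}},
    "4TU.ResearchData": {"preferred": {".jpg", ".tif", ".png"}, "acceptable": set()},
    "DANS": {"preferred": {".jpg", ".tif", ".png"}, "acceptable": set()},
}

# Alternative spellings of the same extension (an assumption for convenience).
ALIASES = {".jpeg": ".jpg", ".tiff": ".tif"}


def classify_photo(filename: str, archive: str) -> str:
    """Return 'preferred', 'acceptable' or 'not listed' for a photo file."""
    ext = Path(filename).suffix.lower()
    ext = ALIASES.get(ext, ext)
    policy = PHOTO_POLICIES[archive]
    if ext in policy["preferred"]:
        return "preferred"
    if ext in policy["acceptable"]:
        return "acceptable"
    return "not listed"


if __name__ == "__main__":
    for archive in PHOTO_POLICIES:
        print(archive, "->", classify_photo("fieldwork_photo.png", archive))
```

Running the script shows how the same PNG file falls into a different category depending on the archive, which is exactly the lack of harmonisation described above.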
4TU.ResearchData (2019). Preferred file formats. https://data.4tu.nl/info/fileadmin/user_upload/Documenten/Preferred_File_Formats_2019.pdf
Australian Data Archive (n.d.). Preferred forms for depositing data with ADA. https://legacy.ada.edu.au/ada/preferred-formats
DANS (n.d.). File formats. https://dans.knaw.nl/en/about/services/easy/information-about-depositing-data/before-depositing/file-formats?set_language=en
UK Data Service (n.d.). Recommended Formats. https://ukdataservice.ac.uk/manage-data/format/recommended-formats.aspx