Research data comes in many shapes and sizes(1): text, numerical data, models, software and multimedia. There is also discipline-specific research data, and data whose form is determined by the instrument used to measure it.
A data format or file format is the way in which data is encoded. The information is encoded so that a program or application can recognize, read and use the data.
The history of digital storage(2) provides a wonderful insight into the limitations of information carriers. If the software or hardware is no longer in use, data can become unreadable. To prevent this, it is vital to choose an open format: a format that is not tied to a particular software supplier, unlike a proprietary format. For open formats all format details are public, so anyone can write (or rewrite) the software needed to read the data. If it is a standard open format, someone else has probably already written that software for you.
Data archives often publish a list of preferred formats in which researchers should supply their data. Archives prefer open formats because these enable them to keep research data usable for longer.
An in-depth look: MIME types
Data formats are often indicated by their MIME type. MIME stands for Multipurpose Internet Mail Extensions. A MIME type tells a web browser how to handle a file.
A MIME type is written as two parts separated by a slash (type/subtype). Example: text/plain is the MIME type for plain text.
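To make the type/subtype notation concrete, here is a minimal Python sketch that splits a bare MIME type into its two parts. The helper name split_mime is an illustrative assumption, not part of any library, and the check is deliberately simpler than the full rules for MIME type syntax.

```python
from typing import Tuple


def split_mime(mime: str) -> Tuple[str, str]:
    """Split a bare 'type/subtype' string into (type, subtype).

    Hypothetical helper for illustration only; it does not validate
    the allowed character set of a real MIME type.
    """
    main, sep, sub = mime.strip().lower().partition("/")
    if not sep or not main or not sub:
        raise ValueError(f"not in type/subtype form: {mime!r}")
    return main, sub


print(split_mime("text/plain"))       # ('text', 'plain')
print(split_mime("application/pdf"))  # ('application', 'pdf')
```

The first part names the broad category (text, image, audio, video, application and so on); the second names the specific format within it.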
Many people recognize data formats by their extension: the three or four letters that follow the dot at the end of the file name. A video on your computer, for instance, may have the extension .avi. The corresponding MIME type is commonly written video/x-msvideo. If the .avi video is on a website, the URL does not have to end in .avi. Nor is an extension always correct: a file can be renamed, for instance, so that the extension no longer refers to a data format. Someone can decide to use the extension .CH1 for 'chapter 1'. It is also possible for several formats to share the same extension, for instance .mid, which is used both for MIDI audio files and for MapInfo Interchange Format geographic data files.
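The difference between trusting an extension and inspecting the file itself can be sketched in Python. The example below uses the standard-library mimetypes module to guess from a file name, and a small hand-written check of the first bytes of a file (its signature) as an alternative. The function names guess_from_name and sniff are illustrative assumptions, and the signature list is deliberately incomplete.

```python
import mimetypes
from typing import Optional


def guess_from_name(filename: str) -> Optional[str]:
    """Guess a MIME type from the extension only; the content is never read."""
    return mimetypes.guess_type(filename)[0]


def sniff(data: bytes) -> Optional[str]:
    """Guess a MIME type from the leading bytes of a file.

    Only three well-known signatures are checked, for illustration.
    """
    if data[:4] == b"RIFF" and data[8:12] == b"AVI ":
        return "video/x-msvideo"
    if data[:4] == b"MThd":  # header chunk of a Standard MIDI File
        return "audio/midi"
    if data[:5] == b"%PDF-":
        return "application/pdf"
    return None


# The extension says "plain text", whatever the file really contains.
print(guess_from_name("notes.txt"))

# A MIDI header is recognized regardless of what the file is called.
print(sniff(b"MThd\x00\x00\x00\x06"))
```

A file named chapter.mid that starts with MThd is a MIDI file; one that does not is probably something else using the same extension. Dedicated format-identification tools apply the same idea with far larger signature databases.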
The advantage of MIME types is that the type can always be traced, for a web page for example through the page source. It is format information that is transferred behind the scenes together with the file, and it can be read by computers.
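On the web, this behind-the-scenes information is typically carried in the Content-Type header of the server's response. As a rough sketch of how a program reads it, the Python example below parses such a header value with the standard-library email.message module. The header value is made up for illustration, and the function name parse_content_type is an assumption.

```python
from email.message import Message
from typing import Optional, Tuple


def parse_content_type(header_value: str) -> Tuple[str, Optional[str]]:
    """Return (MIME type, charset or None) from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type(), msg.get_content_charset()


# Illustrative value, as a server might send it for an HTML page.
mime, charset = parse_content_type("text/html; charset=utf-8")
print(mime)     # text/html
print(charset)  # utf-8
```

Because the type is stated explicitly, the receiving program does not need to look at the file name or extension at all.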
More information on MIME types, with some examples.