Collecting data

"The first way of thinking about research data is where it comes from." (University of Southampton, 2016)

How do researchers get their research data? This section briefly discusses a number of methods.

Collecting data

How do researchers collect data? They do so in one of two ways (University of Southampton, 2016):

  1. By using existing data

    Existing data are:

    • Data collected by others, such as large institutions like Statistics Netherlands (CBS), the Land Registry (Kadaster), ministries and so forth. For example, CentERdata (n.d.) is an institute that collects, analyses, and makes (panel) data available for scientific research.
    • Research data deposited by other researchers for reuse in data archives. See the section 'Searching for data' to find reusable datasets. 
  2. By collecting their own research data

    When generating research data, a distinction is made between raw data and processed data, but in practice the dividing line is not so strict. Measuring devices are becoming increasingly sophisticated, and part of the data processing can already be done in the device itself. This means that the 'raw' data have in fact already been processed.

Five ways 

There are roughly five different ways to create research data (University of Southampton, 2016):

  • Through observation
    This type of data can generally be collected only once and is therefore unique. Examples include climate data, astronomical observations, archaeological excavations, opinion polls and surveys.
  • By experimenting 
    Data collected through experiments (with the aid of lab equipment). For example, the synthesis of new molecules, gene sequence analysis and psychological tests. In general, these experiments can be repeated. 
  • By simulation (test models)
    Climate models and economic models, for example. The results of simulations can usually be reproduced. It is therefore often more useful to store the model itself and its metadata than the data resulting from the simulations.
  • By data processing
    Combining, reprocessing or (re)grouping data that were created earlier.
  • By researching sources
    For example, data deriving from archive and literature research in order to compose texts, or series of 'measurable' data from archived material, manuscripts and (professional) publications. Specialist queries of large linguistic databases are also an example. 
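
To make the 'data processing' route above a little more concrete, the following is a minimal sketch in Python, not a prescribed method. It combines two small datasets and regroups them by month. The station names, the record fields (date, temp_c) and all values are made up for illustration only.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical observations from two illustrative sources (values are made up).
station_a = [
    {"date": "2016-01-01", "temp_c": 3.0},
    {"date": "2016-01-02", "temp_c": 5.0},
]
station_b = [
    {"date": "2016-01-01", "temp_c": 4.0},
    {"date": "2016-02-01", "temp_c": 7.0},
]


def combine(*sources):
    """Combine several datasets into one list of records."""
    combined = []
    for records in sources:
        combined.extend(records)
    return combined


def monthly_mean(records):
    """Regroup daily records by month (YYYY-MM) and average the temperatures."""
    by_month = defaultdict(list)
    for record in records:
        month = record["date"][:7]  # "2016-01-02" -> "2016-01"
        by_month[month].append(record["temp_c"])
    return {month: mean(values) for month, values in by_month.items()}


# The processed dataset is derived entirely from data created before.
processed = monthly_mean(combine(station_a, station_b))
```

The resulting monthly averages are a new dataset in their own right, even though no new measurement was made. Keeping the original records alongside the processing script makes it possible to trace how the processed data came about.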


Researchers collecting their own research data have to take a variety of factors into account. For instance, has the measuring equipment been properly calibrated? Does it measure what it is supposed to measure? Are the survey questions too leading? Questions like these are essential to data quality, but they are not covered in this course.

In the spotlight

The University of Melbourne University Library has made a number of videos about data management, including one about data collection (University of Melbourne University Library, 2017). Which of the five different ways to create research data do you recognise?


CentERdata (n.d.).

University of Melbourne University Library (2017). Collecting research data [video].

University of Southampton (2016). Introducing Research Data. 4th Edition.